WO2013125414A1

WO2013125414A1 - Mutual authentication system, mutual authentication server, mutual authentication method, and mutual authentication program

Info

Publication number: WO2013125414A1
Application number: PCT/JP2013/053414
Authority: WO
Inventors: 潤野田
Original assignee: 日本電気株式会社
Priority date: 2012-02-23
Filing date: 2013-02-13
Publication date: 2013-08-29
Also published as: JPWO2013125414A1

Abstract

[Problem] To provide a mutual authentication system for creating an ad-hoc connection between information devices. [Solution] The mutual authentication system (1) according to the present invention is configured by connecting a plurality of terminal devices (20) to a mutual authentication server (10) which generates and assigns an authentication key to the terminal devices. Each terminal device (20) is provided with a sound data transmission means (203) for transmitting the ambient sound of the surroundings to the mutual authentication server as time-series data representing the change in volume of sound over time. The mutual authentication server (10) is provided with a feature vector generation means (103) for subjecting the time-series data received from each terminal device to a fast Fourier transform (FFT) and generating feature vectors, a feature vector comparison means (104) for comparing the generated feature vectors for the terminal devices and determining whether there is a match, and a key sharing means (105) for generating and transmitting the authentication key to each terminal device when the feature vectors match.

Description

Mutual authentication system, mutual authentication server, mutual authentication method, and mutual authentication program

The present invention relates to a mutual authentication system, a mutual authentication server, a mutual authentication method, and a mutual authentication program, and more particularly to a mutual authentication system for constructing an ad hoc connection relationship between specific devices.

Data communication technology between information devices has advanced in recent years, and there are cases where it is desirable to establish an "ad hoc" (non-permanent, temporary) connection relationship between specific devices. For example, when a company employee and a vendor hold a business meeting, the attendees may want to share materials, minutes, and the like among themselves. Alternatively, people with no prior contact who happen to be in the same place (for example, a restaurant, or a concert or sports venue) may want to form a community.

Mutual authentication between devices is essential even when building such an ad hoc connection relationship. In this case, it is desirable that the mutual authentication can be performed through a simple operation, rather than a complicated one in which the user enters, for example, a long PIN (Personal Identification Number) or a password.

The following technical documents describe related technologies. Patent Document 1 describes a communication function that detects that buttons provided on the devices are pressed simultaneously, generates a unique group connection ID, and uses it as a common key (authentication key) for mutual authentication.

Non-Patent Document 1 outlines a technique in which devices equipped with non-contact IC readers are held over each other to share a key, which is then used as an authentication key for mutual authentication. Patent Documents 2 to 3 and Non-Patent Documents 2 to 4 describe techniques in which the same external movement is applied to two devices each equipped with an acceleration sensor, a common amount of change is detected, and an authentication key is thereby shared.

Patent Document 4 describes a mutual authentication system in which a mobile terminal reads a two-dimensional code generated by a web server and displayed on a terminal, thereby generating unique information for specifying a user. Patent Document 5 describes an authentication method including a plurality of terminals and a session management apparatus, and exchanging key information between terminals via an encrypted channel established by the terminal-session management apparatus.

Patent Documents:
Japanese Patent No. 3707660
JP 2008-31726 A
JP 2010-187282 A
JP 2009-124311 A
JP 2005-160005 A

Each of the above-described mutual authentication methods has problems. In the technique described in Patent Document 1, pressing a button is an action that a third party can easily perform, so there is a risk that an authentication key will be issued to such a third party. Furthermore, if the timing of the simultaneous button presses is off, even a legitimate user may fail to authenticate, so the convenience of this method is not high. The technique described in Non-Patent Document 1 has a cost problem because each device must be equipped with a non-contact IC reader.

In the techniques described in Patent Documents 2 to 3 and Non-Patent Documents 2 to 4, the user must perform an operation such as shaking the two devices together. Many devices cannot be handled in this way for physical reasons, such as being large, heavy, or vulnerable to impact, and these techniques cannot be applied to such devices.

The remaining Patent Documents 4 to 5 do not describe a technique that can solve these problems. The technique described in Patent Document 4 authenticates that the terminal and the mobile terminal belong to the same user; it does not construct an ad hoc connection relationship, so its object differs from the outset. The technique described in Patent Document 5 constructs a communication path for distributing an authentication key, and is not a technique for generating an authentication key.

An object of the present invention is to provide a mutual authentication system, a mutual authentication server, a mutual authentication method, and a mutual authentication program that do not require a complicated operation by the user, do not significantly increase costs, and can establish an ad hoc connection relationship between information devices with sufficient security.

In order to achieve the above object, a mutual authentication system according to the present invention is a mutual authentication system configured by connecting a plurality of terminal devices and a mutual authentication server that generates an authentication key and gives it to these terminal devices. Each terminal device includes voice data transmission means for transmitting the surrounding environmental sound to the mutual authentication server as time-series data representing the temporal change in sound volume. The mutual authentication server includes feature vector generation means for analyzing the frequency components of the time-series data received from each terminal device and generating a feature vector, feature vector comparison means for comparing the generated feature vectors between the terminal devices and determining whether they match, and key sharing means for generating an authentication key and transmitting it to each terminal device when the feature vectors match.

In order to achieve the above object, a mutual authentication server according to the present invention is a mutual authentication server that is connected to a plurality of terminal devices to constitute a mutual authentication system. It includes feature vector generation means for analyzing the frequency components of the time-series data received from each terminal device, which represents the temporal change in sound volume, and generating a feature vector; feature vector comparison means for comparing the generated feature vectors between the terminal devices and determining whether they match; and key sharing means for generating an authentication key and transmitting it to each terminal device when the feature vectors match.

In order to achieve the above object, a mutual authentication method according to the present invention is used in a mutual authentication system configured by connecting a plurality of terminal devices and a mutual authentication server that generates an authentication key and gives it to these terminal devices. The method is characterized in that the voice data transmission means of each terminal device transmits the surrounding environmental sound to the mutual authentication server as time-series data representing the temporal change in sound volume; the feature vector generation means of the mutual authentication server analyzes the frequency components of the time-series data received from each terminal device and generates a feature vector; the feature vector comparison means of the mutual authentication server compares the generated feature vectors between the terminal devices and determines whether they match; and the key sharing means of the mutual authentication server generates an authentication key and transmits it to each terminal device when the feature vectors match.

In order to achieve the above object, a mutual authentication program according to the present invention is used in a mutual authentication system configured by connecting a plurality of terminal devices and a mutual authentication server that generates an authentication key and gives it to these terminal devices. The program causes a computer provided in the mutual authentication server to execute a procedure for analyzing the frequency components of the time-series data received from each terminal device, which represents the temporal change in sound volume, and generating a feature vector; a procedure for comparing the generated feature vectors between the terminal devices and determining whether they match; and a procedure for generating an authentication key and transmitting it to each terminal device when the feature vectors match.

Since the present invention is configured, as described above, such that the mutual authentication server generates the authentication key when the feature vectors of the environmental sounds around the terminal devices match, mutual authentication can be performed using, as is, the voice input means with which many devices are already equipped. This makes it possible to provide a mutual authentication system, a mutual authentication server, a mutual authentication method, and a mutual authentication program having the excellent feature that an ad hoc connection relationship between information devices can be constructed with sufficient security, without requiring a complicated operation by the user and without significantly increasing costs.

FIG. 1 is an explanatory diagram showing the configuration of the mutual authentication system according to the first embodiment of the present invention.
FIG. 2 is an explanatory diagram showing in more detail the configuration of each of the data synchronization means through the key sharing means shown in FIG. 1.
FIG. 3 is an explanatory diagram showing an example of the operation by which the voice data transmission means shown in FIG. 1 obtains representative values of the environmental sound.
FIG. 4 is an explanatory diagram showing an example of the operation by which the data synchronization means shown in FIG. 1 adjusts the deviation of the time-series data in the time-axis direction.
FIG. 5 is an explanatory diagram showing the processing performed by the Fourier transform function of the feature vector generation means shown in FIG. 2.
FIG. 6 is an explanatory diagram showing the processing performed by the cut-off function of the feature vector generation means shown in FIG. 2.
FIG. 7 is an explanatory diagram showing the processing performed by the quantization function of the feature vector generation means shown in FIG. 2.
FIG. 8 is an explanatory diagram showing examples of the plurality of types of quantization patterns prepared when the quantization function of the feature vector generation means shown in FIG. 2 performs the processing shown in FIG. 7.
FIG. 9 is an explanatory diagram showing the processing performed by the feature vector comparison means shown in FIG. 1.
FIG. 10 is an explanatory diagram showing the processing performed by the key sharing means 105 shown in FIG. 1.
FIG. 11 is an explanatory diagram showing an example of an experimentally produced mutual authentication system.
FIG. 12 is a graph showing the change over measurement time of the total information amount calculated between the terminal devices in the mutual authentication system shown in FIG.
FIG. 13 is a flowchart showing the mutual authentication operation performed between the mutual authentication server and the terminal devices shown in FIG. 1.
FIG. 14 is an explanatory diagram showing the configuration of the mutual authentication system according to the second embodiment of the present invention.
FIG. 15 is a flowchart showing the mutual authentication operation performed between the mutual authentication server and the terminal devices shown in FIG. 14.

(First embodiment)
Hereinafter, the configuration of an embodiment of the present invention will be described with reference to the accompanying drawings.
First, the basic content of the present embodiment will be described, and then more specific content will be described.
The mutual authentication system 1 according to the present embodiment is a mutual authentication system configured by connecting a plurality of terminal devices 20 and a mutual authentication server 10 that generates an authentication key and gives it to these terminal devices. Each terminal device 20 includes voice data transmission means 203 that transmits the surrounding environmental sound to the mutual authentication server as time-series data representing the temporal change in sound volume. The mutual authentication server 10 includes feature vector generation means 103 that analyzes the frequency components of the time-series data received from each terminal device and generates a feature vector, feature vector comparison means 104 that compares the generated feature vectors between the terminal devices and determines whether they match, and key sharing means 105 that generates an authentication key and transmits it to each terminal device when the feature vectors match.

Each terminal device 20 includes sensing means 202 that collects the surrounding environmental sound as audio data, and the voice data transmission means 203 extracts a plurality of representative values from the collected audio data and transmits them as time-series data. Furthermore, the terminal devices 20 and the mutual authentication server 10 include time synchronization means 201 and 101, respectively, which synchronize their times with each other in advance.

The mutual authentication server 10 includes data synchronization means 102, which detects a predetermined number of extreme values from the time-series data received from each terminal device, corrects the deviation in the time-axis direction between the time-series data based on the timing at which these extreme values were detected, and outputs the result to the feature vector generation means 103.

The feature vector generation means 103 of the mutual authentication server 10 includes a Fourier transform function 103a that divides the time-series data into time windows of a constant time interval, performs an FFT (Fast Fourier Transform) on each time window, and outputs a power spectrum; a quantization function 103c that outputs a feature vector by collating the power level at each frequency of the output power spectrum with thresholds set in advance in a plurality of stages; and a cut-off function 103b that removes frequency components at or above a predetermined cut-off frequency from the power spectrum output by the Fourier transform function 103a and passes the result to the quantization function 103c.

The quantization function 103c of the mutual authentication server 10 treats a plurality of preset thresholds as one group, provides a plurality of such groups, and outputs a feature vector for each group. The feature vector comparison means 104 compares the plurality of feature vectors generated for the same time window between the terminal devices, and if even one feature vector matches between the terminal devices, determines that the feature vectors of the terminal devices match.
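
As a rough illustration of the any-match rule described above, the following Python sketch compares every feature vector from one terminal device with every feature vector from the other for a single time window. The function names and the representation of a feature vector as a tuple of region numbers are assumptions made for this example and do not appear in the specification.

```python
def matching_vectors(vectors_a, vectors_b):
    """Return the vectors of terminal A that also occur among those of terminal B.

    Every vector of A is compared with every vector of B (all-pairs comparison).
    """
    return [va for va in vectors_a if any(va == vb for vb in vectors_b)]


def windows_match(vectors_a, vectors_b):
    """True if at least one feature vector is common to both terminals."""
    return len(matching_vectors(vectors_a, vectors_b)) > 0


# Hypothetical feature vectors for one time window (one tuple per threshold group).
a = [(4, 3, 1), (4, 4, 1)]
b = [(3, 3, 1), (4, 4, 1)]
print(windows_match(a, b))
```

Because a single common vector is enough, small differences that push one terminal's power value across a threshold in some groups do not by themselves cause a mismatch.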

The key sharing means 105 of the mutual authentication server 10 then calculates the total information amount per unit time of the feature vectors that match within the target time range of the time-series data. Only when this total is equal to or greater than a predetermined value does it concatenate the matched feature vectors and generate a hash value of the concatenation as the authentication key.
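
The key generation step can be sketched in Python as follows. This is only an illustration under stated assumptions: the specification does not name a hash function, so SHA-256 is used here as a stand-in; the information amount is taken as an already computed input because its calculation is not defined at this point; and the function and parameter names are hypothetical.

```python
import hashlib


def derive_key(matched_vectors, info_total, info_min):
    """Return a hex key from the concatenated matched vectors, or None.

    matched_vectors: list of tuples of region numbers (small integers).
    info_total: total information amount of the matched vectors (assumed given).
    info_min: predetermined minimum total required before a key is issued.
    """
    if info_total < info_min:
        return None  # not enough matching information; no key is issued
    # Concatenate all region numbers into one byte string (assumed encoding).
    payload = bytes(v for vec in matched_vectors for v in vec)
    return hashlib.sha256(payload).hexdigest()


key = derive_key([(1, 2, 3), (4, 5)], info_total=10.0, info_min=8.0)
print(key is not None)
```

Because both terminals contribute the same matched vectors, the server derives one key per matching pair and can send the same value to each terminal device.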

With the above configuration, the mutual authentication system 1 can construct an ad hoc connection relationship between the terminal devices 20 with sufficient security, without requiring a complicated operation and without significantly increasing costs.
Hereinafter, this will be described in more detail.

FIG. 1 is an explanatory diagram showing the configuration of the mutual authentication system 1 according to the first embodiment of the present invention. The mutual authentication system 1 is configured by connecting, via a network 30, two terminal devices 20a and 20b (hereinafter collectively referred to as terminal devices 20) that are to be mutually authenticated and a mutual authentication server 10 that generates an authentication key and gives it to these terminal devices 20. The network 30 may be wired or wireless, and its connection form and protocol are not particularly limited.

The mutual authentication server 10 has the basic configuration of a computer device. That is, it includes a processor 11 that executes the computer program, a storage unit 12 that stores programs and data, and a communication unit 13 that performs data communication with each terminal device 20.

By running the mutual authentication program, the processor 11 of the mutual authentication server 10 functions as each of the following: time synchronization means 101 that synchronizes time with the terminal devices 20; data synchronization means 102 that absorbs the deviation in the time-axis direction of the time-series data received from each terminal device 20; feature vector generation means 103 that generates feature vectors from the time-series data; feature vector comparison means 104 that compares the generated feature vectors and determines whether they match; and key sharing means 105 that generates an authentication key and transmits it to each terminal device 20 when the feature vectors match. The generated authentication key 111 is stored in the storage unit 12.

FIG. 2 is an explanatory diagram showing in more detail the configuration of each of the data synchronization means 102 through the key sharing means 105 shown in FIG. 1. The feature vector generation means 103 is further divided into three functional parts: a Fourier transform function 103a, a cut-off function 103b, and a quantization function 103c.

The Fourier transform function 103a performs FFT (Fast Fourier Transform) on the input time series data and outputs a power spectrum. The cut-off function 103b cuts frequency components of the input power spectrum that are equal to or higher than a predetermined cut-off frequency. The quantization function 103c collates the output power spectrum with a plurality of threshold values given in advance, and outputs a feature vector having the quantized value of the power spectrum for each frequency as an element.

Referring back to FIG. 1, the terminal devices 20 (20a and 20b) have the same configuration, each with the basic configuration of a computer device. That is, each includes a processor 21 that executes the computer program, a storage unit 22 that stores programs and data, a communication unit 23 that performs data communication with the mutual authentication server 10 and the other terminal devices 20, and voice input means 24 that acquires and inputs voice data through a microphone.

By running the mutual authentication program, the processor 21 of the terminal device 20 functions as each of the following: time synchronization means 201 that synchronizes time with the mutual authentication server 10; sensing means 202 that uses the voice input means 24 to collect time-series data of changes in sound pressure (voice data); voice data transmission means 203 that reduces the data amount of the acquired voice data and transmits it to the mutual authentication server 10; and mutual authentication means 204 that receives an authentication key from the mutual authentication server 10 and performs mutual authentication. The storage unit 22 stores the authentication key 211 received from the mutual authentication server 10.

(Time synchronization means)
Next, the operation of each means of the mutual authentication server 10 and the terminal device 20 will be described. In the mutual authentication server 10 and the terminal device 20, the time synchronization means 101 and 201 perform time synchronization processing to align their clocks with each other. This processing may be performed when the mutual authentication server 10 and the terminal device 20 are turned on, or periodically at a specific cycle. NTP (Network Time Protocol) can be used for time synchronization.

(Sensing means)
The sensing means 202 of the terminal device 20 operates on the assumption that the time synchronization process has been performed. It uses a reference time t0 and a time interval w, both set in advance as common values between the mutual authentication server 10 and each terminal device 20. When the current time (elapsed time from the reference time) t satisfies t0 + (α−1)w < t < t0 + αw, where α is a natural number, the sensing means 202 acquires environmental sound for a duration w starting at time t0 + αw via the voice input means 24, samples it at a sampling rate f (unit: Hz) that is likewise set as a common value between the mutual authentication server 10 and each terminal device 20, and obtains the result as sound pressure data.

Here, the values of t0, w, and f may be set as common values during the time synchronization process. For example, with t0 = 0, w = 3, t = 10, and f = 8000 (8 kHz), α = 4, so the environmental sound from 12 to 15 seconds is collected as time-series audio data consisting of 8000 sound pressure values per second.
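
The window selection in this example can be checked with a short Python sketch. The function names are hypothetical, and the sketch assumes t does not fall exactly on a window boundary, since the condition above uses strict inequalities.

```python
import math


def sensing_window(t, t0, w):
    """Return (start, end) of the next capture window for the current time t.

    Chooses the natural number alpha with t0 + (alpha - 1) * w < t < t0 + alpha * w,
    then captures w seconds of sound starting at t0 + alpha * w.
    """
    alpha = math.floor((t - t0) / w) + 1
    start = t0 + alpha * w
    return start, start + w


def samples_in_window(w, f):
    """Number of sound pressure values collected in one window at rate f (Hz)."""
    return int(w * f)


print(sensing_window(t=10, t0=0, w=3))  # (12, 15)
print(samples_in_window(w=3, f=8000))   # 24000
```

Because every terminal derives the window from the shared values t0 and w, terminals that start sensing at slightly different moments still capture the same time range.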

(Voice data transmission means)
FIG. 3 is an explanatory diagram illustrating an example of the operation by which the voice data transmission means 203 shown in FIG. 1 obtains representative values of the environmental sound. The voice data transmission means 203 reduces the data amount of the sound pressure data 251 of the environmental sound collected by the sensing means 202 from f values per second to fc values per second (f > fc).

More specifically, the data is divided into sections of f / fc values each (the quotient of f divided by fc), and the maximum value in each section is taken as the representative value of that section, yielding time-series data 252 of fc values per unit time. The time-series data 252 is then transmitted to the mutual authentication server 10 together with a label 252b indicating the sensing time in the sensing means 202. Here, the label 252b indicates the time range in which the audio data was acquired, such as "t0 + αw to t0 + (α + 1)w". Instead of the maximum value, the average value in each section can also be used as the representative value. Hereinafter, this "time range in which the audio data was acquired" is referred to as the target time range.

In this example, with f = 8000 and fc = 100, the voice data transmission means 203 obtains the average value of every 80 sound pressure values, yielding time-series data 252 of 100 values per unit time, and transmits the obtained time-series data to the mutual authentication server 10. When w = 3, 300 values of time-series data are obtained every 3 seconds. This processing yields time-series data from which the high-frequency components contained in the audio data have been removed.
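
The reduction from f to fc values per second can be sketched in Python as below. The function name and the `use_mean` flag are assumptions for this example; the sketch also assumes that f is an integer multiple of fc and drops any incomplete trailing section.

```python
def representative_values(samples, f, fc, use_mean=False):
    """Reduce samples taken at f per second to fc representative values per second.

    Each section holds f // fc consecutive samples; its representative value is
    the maximum (default) or the average (use_mean=True) of the section.
    """
    section = f // fc
    out = []
    for i in range(0, len(samples) - section + 1, section):
        chunk = samples[i:i + section]
        out.append(sum(chunk) / section if use_mean else max(chunk))
    return out


# Three seconds of placeholder data at 8000 values per second.
raw = list(range(24000))
reduced = representative_values(raw, f=8000, fc=100)
print(len(reduced))  # 300
```

With f = 8000, fc = 100, and w = 3, each section covers 80 samples, which gives the 300 values per window stated in the text.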

(Data synchronization means)
In the mutual authentication server 10, the data synchronization means 102 waits for the time series data 252 to arrive from the two terminal devices 20a and 20b, and corrects the deviation in the time-axis direction between time series data that carry the same label indicating the sensing time. The data synchronization means 102 thus absorbs the time lag of the time series data between the terminal devices. However, when the terminal clocks can be synchronized accurately, such a time lag is considered not to occur at all, so this means may be omitted.

FIG. 4 is an explanatory diagram showing an example of an operation in which the data synchronization means 102 shown in FIG. 1 adjusts the time axis direction deviation of the time series data. The data synchronization means 102 obtains, from the time series data received from the terminal device 20, a point where the slope of the sound pressure fluctuation graph changes from positive to negative, that is, a point that is convex upward as an extreme value.

At this time, when the waveforms received from the respective terminal devices 20 are superimposed, the data synchronization means 102 considers the extreme values whose distance on the time axis is within a predetermined threshold and determines the pair with the smallest difference in value to be extreme values measured at the same timing. Then, among the pairs determined to be extreme values at the same timing, it takes a predetermined number (held in a variable p_num) in descending order of value, calculates the average of their distances in the time-axis direction, and shifts one of the data sets in the time-axis direction by that average.

In FIG. 4, the waveforms of the time series data received from the terminal devices 20a and 20b are shown as 261 and 262, respectively. For these waveforms 261 and 262, p_num = 4; that is, the average of the distances in the time-axis direction is calculated for the top four extreme values 271 to 274 in the waveforms within the target time range, and one waveform is shifted in the time-axis direction by that average.

At this time, the time series data of the terminal device that is ahead on the time axis may be padded with zeros at its head for the amount of time by which it is ahead (so-called zero padding). Alternatively, when data for the time period preceding the target time range can be obtained from that terminal device, the missing portion may be filled from that data. Alternatively, as described above, when the terminal clocks can be synchronized accurately, this processing by the data synchronization means 102 may be omitted, and the time-series data obtained from each terminal device 20 may be passed as is to the subsequent feature vector generation means 103.
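
The alignment procedure can be sketched in Python as follows. This is a simplified reading of the description, not the implementation in the specification: the function names, the use of sample indices as the time axis, and the sign convention (a positive result means the second series lags the first) are assumptions made for the example.

```python
def local_maxima(series):
    """Indices where the slope changes from positive to negative."""
    return [i for i in range(1, len(series) - 1)
            if series[i - 1] < series[i] and series[i] > series[i + 1]]


def estimate_offset(a, b, max_dist, p_num):
    """Average index offset (b minus a) over the p_num largest matched peaks."""
    peaks_b = local_maxima(b)
    pairs = []
    for i in local_maxima(a):
        near = [j for j in peaks_b if abs(j - i) <= max_dist]
        if near:
            # Among nearby peaks, pick the one closest in value.
            j = min(near, key=lambda k: abs(b[k] - a[i]))
            pairs.append((a[i], j - i))
    pairs.sort(key=lambda p: p[0], reverse=True)  # largest peaks first
    top = pairs[:p_num]
    return sum(d for _, d in top) / len(top) if top else 0.0


def shift_with_zero_padding(series, n):
    """Delay a series by n samples, zero-padding the head and keeping its length."""
    if n <= 0:
        return list(series)
    return [0] * n + list(series[:len(series) - n])


a = [0, 1, 0, 0, 5, 0, 0, 3, 0, 0, 0, 0]
b = [0, 0, 0, 1, 0, 0, 5, 0, 0, 3, 0, 0]  # same shape, two samples later
offset = estimate_offset(a, b, max_dist=2, p_num=2)
print(offset)  # 2.0
print(shift_with_zero_padding(a, round(offset)) == b)  # True
```

Here the series that is ahead on the time axis (a) is the one delayed with zero padding, matching the first of the options described above.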

(Feature vector generation means / Fourier transform function)
FIG. 5 is an explanatory diagram showing processing performed by the Fourier transform function 103a of the feature vector generation means 103 shown in FIG. The Fourier transform function 103a divides the time-series data whose time-axis deviation has been adjusted by the data synchronization means 102 into short intervals of a predetermined length. These intervals are referred to herein as time windows 301 to 303. The Fourier transform function 103a performs an FFT (Fast Fourier Transform) on each time window and outputs a power spectrum indicating its frequency characteristics. FIG. 5 shows the power spectrum 311 output for the time window 301.

The time-series data received from each terminal device 20 contains 100 sound pressure values per unit time, as described above. In the example shown in FIG. 5, 64 consecutive sound pressure values form one time window 301, and 50% of them (32 values) overlap with the next time window 302. The subsequent time window 303 likewise overlaps with the preceding time window 302 by 50% (32 values). The overlap ratio between consecutive time windows can be set arbitrarily. In this way, more time windows can be cut out as comparison targets from time series data in the same target time range.
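
The windowing and power spectrum steps can be sketched in Python as below. To stay self-contained, the sketch includes a small textbook radix-2 FFT in place of a library routine; the function names are assumptions for this example, and the power spectrum is taken here as the squared magnitude of each non-negative frequency bin.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def time_windows(series, size=64, overlap=0.5):
    """Cut a series into windows of `size` values overlapping by `overlap`."""
    step = max(1, int(size * (1 - overlap)))
    return [series[i:i + size] for i in range(0, len(series) - size + 1, step)]


def power_spectrum(window):
    """Squared magnitude of bins 0 .. N/2 of the FFT of one window."""
    n = len(window)
    return [abs(c) ** 2 for c in fft(window)[: n // 2 + 1]]


# 300 values (3 seconds at 100 values per second) give 8 half-overlapping windows.
print(len(time_windows(list(range(300)))))  # 8

# A sine placed exactly on bin 4 of a 64-point window peaks at index 4.
tone = [math.sin(2 * math.pi * 4 * n / 64) for n in range(64)]
spectrum = power_spectrum(tone)
print(max(range(len(spectrum)), key=spectrum.__getitem__))  # 4
```

With 64-value windows and 50% overlap, a 300-value series yields 8 windows instead of the 4 that non-overlapping windows would give.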

(Feature vector generation means / cut-off function)
FIG. 6 is an explanatory diagram showing processing performed by the cut-off function 103b of the feature vector generation means 103 shown in FIG. The cut-off function 103b removes, from the output power spectrum 311, the frequency components at or above a predetermined cut-off frequency fm. In the example shown in FIG. 6, fm = 10 Hz, but this cut-off frequency can be set arbitrarily. Alternatively, the cut-off function 103b may be realized in analog form by a so-called LPF (low-pass filter) placed before the Fourier transform function 103a.

However, fm must be a value less than or equal to half of the fc used by the voice data transmission means 203 to create time series data. In the example shown in this specification, since fc = 100 Hz, fm = 10 Hz sufficiently satisfies the condition.
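
A minimal Python sketch of the cut-off step is shown below. It assumes the power spectrum is a list indexed by FFT bin and that bin k corresponds to a frequency of k × fc / N, where N is the number of values in the time window; the function name and return format are hypothetical.

```python
def apply_cutoff(power, fc, n, fm):
    """Keep only the bins whose frequency is below the cut-off frequency fm.

    power: power spectrum indexed by FFT bin (bin k is at k * fc / n Hz).
    Returns a list of (frequency_hz, power) pairs.
    """
    if fm > fc / 2:
        raise ValueError("fm must be at most half of fc")
    step = fc / n
    return [(k * step, p) for k, p in enumerate(power) if k * step < fm]


# 33 bins from a 64-value window at fc = 100, with fm = 10 Hz.
kept = apply_cutoff([1.0] * 33, fc=100, n=64, fm=10)
print(len(kept))    # 7
print(kept[-1][0])  # 9.375
```

With fc = 100 and N = 64 the bin spacing is 1.5625 Hz, so a 10 Hz cut-off keeps bins 0 to 6.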

(Feature vector generation means / quantization function)
FIG. 7 is an explanatory diagram showing processing performed by the quantization function 103c of the feature vector generation means 103 shown in FIG. The quantization function 103c performs quantization by applying a quantization pattern to the power spectrum 311 output from the Fourier transform function 103a. A quantization pattern here is a pattern in which a plurality of threshold values are set in advance for each component frequency at or below the cut-off frequency fm. A plurality of quantization patterns are prepared in advance, as described later; FIG. 7 shows the processing for one quantization pattern.

In the example shown in FIG. 7, one quantization pattern includes four threshold values T1 to T4 for each component frequency of the power spectrum 311, so the power at each component frequency is classified into five levels. Here, the component frequencies are set in 1 Hz steps from 0 to 10 Hz. The power corresponding to a component frequency of, for example, "0 Hz" means the maximum power in the frequency range of 0 Hz or more and less than 1 Hz. The same applies to component frequencies of 1 Hz and higher. In actual operation, the step is obtained as fc / N from the number of data points fc per unit time of the data input to the FFT and the number N of data points in the time window.

The power corresponding to each component frequency is classified into region (5) if it is T4 or more, region (4) if it is T3 or more and less than T4, region (3) if it is T2 or more and less than T3, region (2) if it is T1 or more and less than T2, and region (1) if it is less than T1. The number and values of these thresholds and component frequencies can be set arbitrarily. In the text of this specification, circled numerals are written in parentheses; for example, a circled 5 is written as "(5)".
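
The region classification can be sketched in Python with a sorted threshold list. As a simplification, this example applies the same four thresholds to every component frequency, and the threshold values are placeholders rather than values from the specification.

```python
import bisect


def quantize(power_by_freq, thresholds):
    """Map each power value to a region number 1 .. len(thresholds) + 1.

    thresholds must be sorted in ascending order, e.g. [T1, T2, T3, T4].
    A value equal to a threshold falls into the higher region.
    """
    return [bisect.bisect_right(thresholds, p) + 1 for p in power_by_freq]


thresholds = [1, 10, 100, 1000]  # placeholder values for T1 .. T4
print(quantize([0.5, 1, 50, 1000, 5000], thresholds))  # [1, 2, 3, 5, 5]
```

The resulting list of region numbers, one per component frequency, is the feature vector for that time window and quantization pattern.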

As described above, the feature vector generation unit 103 outputs the feature vector 321 with respect to the power spectrum 311 shown in FIG. The feature vector here is an array of regions corresponding to the power at each frequency for each component frequency of 1 to 10 Hz.

FIG. 8 is an explanatory diagram showing examples of a plurality of quantization patterns prepared when the quantization function 103c of the feature vector generation unit 103 shown in FIG. 2 performs the processing shown in FIG. In the example shown in FIG. 8, four kinds of quantization patterns 331 to 334 are prepared, and each of them includes four threshold values as in the case shown in FIG.

The number of quantization patterns can also be set arbitrarily, but the number of threshold values is common among the quantization patterns. The threshold values included in each quantization pattern are values whose spacing widens exponentially, based on the maximum value of the power spectrum calculated from the time series data. The threshold values differ slightly between the quantization patterns.

As will be described in detail later, when the power is in the vicinity of a threshold, whether the obtained power value is higher or lower than the threshold may change if the acoustic characteristics or the position relative to the sound source differ even slightly. As a result, different feature vectors may be output even for time windows obtained from the same environmental sound. For this reason, the quantization function 103c prepares a plurality of quantization patterns with the threshold values shifted little by little, and outputs a plurality of feature vectors from one time window.
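As a non-limiting illustration, outputting several feature vectors from one time window can be sketched as follows. The exponential base of 2, the scale factors, and the sample spectrum are assumptions chosen only for illustration; the specification does not give concrete threshold values.

```python
from bisect import bisect_right

def make_pattern(max_power, scale, n_thresholds=4, base=2.0):
    """Hypothetical quantization pattern.

    Thresholds are spaced exponentially below the maximum power and
    multiplied by `scale` so that each pattern is shifted slightly.
    """
    return [scale * max_power / base ** k for k in range(n_thresholds, 0, -1)]

def feature_vectors(power_spectrum, scales=(0.85, 0.95, 1.05, 1.15)):
    """Return one feature vector per quantization pattern."""
    max_power = max(power_spectrum)
    vectors = []
    for scale in scales:
        thresholds = make_pattern(max_power, scale)
        vectors.append(
            tuple(bisect_right(thresholds, p) + 1 for p in power_spectrum)
        )
    return vectors

# Hypothetical powers at 0, 1, 2, 3 Hz; 1.0 and 4.3 lie near thresholds.
spectrum = [16.0, 1.0, 4.3, 0.5]
vectors = feature_vectors(spectrum)
```

In this sketch the powers near a threshold land in different regions depending on the pattern, which is the situation the plurality of patterns is intended to absorb.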

(Feature vector comparison means)
FIG. 9 is an explanatory diagram showing processing performed by the feature vector comparison unit 104 shown in FIG. The feature vector comparison unit 104 compares the plurality of feature vectors output from the quantization function 103c for the same time window of each of the terminal devices 20a and 20b, and determines whether or not the environmental sounds collected by the terminal devices are the same. At this time, assuming that c quantization patterns are prepared by the quantization function 103c, c feature vectors are output for each of the terminal devices 20a and 20b in the same time window. Here, the “same time window” means a time window observed and processed over the same time range in each of the terminal devices 20a and 20b.

The feature vector comparison means 104 compares, in a brute-force manner, the c feature vectors output for the same time window by each of the terminal devices 20a and 20b. If even one of them matches, it is determined that the time series data match between the terminal devices 20a and 20b.
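As a non-limiting illustration, the brute-force comparison for one time window can be sketched as follows. The function names and the sample feature vectors are hypothetical.

```python
from itertools import product

def matching_vectors(vectors_a, vectors_b):
    """Compare every vector of device A with every vector of device B
    and return the distinct vectors that appear on both sides."""
    matches = []
    for va, vb in product(vectors_a, vectors_b):
        if va == vb and va not in matches:
            matches.append(va)
    return matches

def windows_match(vectors_a, vectors_b):
    """A time window matches if even one feature vector is common."""
    return bool(matching_vectors(vectors_a, vectors_b))

# Hypothetical output for c = 4 quantization patterns in one time window.
vectors_20a = [(5, 2, 2, 1), (5, 1, 3, 1), (5, 1, 2, 1), (4, 1, 2, 1)]
vectors_20b = [(5, 1, 2, 1), (5, 1, 1, 1), (4, 1, 1, 1), (4, 1, 1, 2)]
matched = matching_vectors(vectors_20a, vectors_20b)
```

With c patterns per device the comparison costs c × c equality checks per time window, which stays small for the values of c discussed here.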

In the example shown in FIG. 9, feature vectors 341 and 342 are output from the terminal devices 20a and 20b, respectively, in the same time range. Since the number of quantization patterns is c = 4, there are four output feature vectors 341 and four output feature vectors 342. Each of the feature vectors 341 and 342 represents which region in FIGS. 7 to 8 corresponds to the power at each component frequency in increments of 1 Hz from 0 to 3 Hz.

In FIG. 9 and FIG. 10 to be described later, in order to simplify the description, only the power corresponding to each component frequency of “0 to 3 Hz in 1 Hz increments” is displayed for each feature vector. The definition of “power corresponding to each component frequency” is the same as in FIG. For example, when the component frequency is “0 Hz”, the power corresponding to each component frequency means the maximum value of power in the frequency range of “0 Hz or more and less than 1 Hz”. The same applies to component frequencies of 1 Hz or higher.

The feature vector comparison unit 104 finds that the feature vector “(5) (1) (2) (1)” appears both as the third from the top of the feature vector 331 and as the first from the top of the feature vector 332. Therefore, it is determined that the time series data match between the terminal devices 20a and 20b.

(Key sharing means)
FIG. 10 is an explanatory diagram showing processing performed by the key sharing unit 105 shown in FIG. The key sharing unit 105 receives the comparison result, generates an authentication key 111 for the terminal devices 20a and 20b, stores the authentication key 111 in the storage unit 12, and transmits the authentication key to the terminal devices 20a and 20b.

At this time, the key sharing means 105 calculates the information amount of each matched feature vector from its occurrence probability, and takes the sum of the information amounts of the matched feature vectors over the time windows as the key length. Only when this key length exceeds a certain threshold is the hash value of the concatenation of the matched feature vectors of the time windows set as the authentication key 111. The amount of information here (self-information, also called self-entropy) is a concept in information theory. If the probability of occurrence of a certain event is P, the amount of information I (bits) is calculated by the following Equation 1.

$I = -\log_2 P$ (Equation 1)

The key sharing means 105 counts the total number of matched feature vectors and the number of matching individual feature vectors in the time range to be compared, but this count is not reset in units of time windows. The feature vectors that match in other time windows within the target time range are also accumulated and counted, and the total amount of information (bits) of the matched feature vectors is obtained.

In the example illustrated in FIG. 10, a total of ten feature vectors 341 coincide between the terminal devices 20a and 20b: three of “(5) (1) (4) (1)”, one of “(5) (2) (4) (1)”, one of “(5) (1) (4) (2)”, three of “(5) (1) (3) (1)”, and two of “(5) (2) (4) (2)”. Each of these feature vectors was obtained from the time windows 301 or 262 of the time series data waveforms 261 and 262 obtained from the terminal devices 20a and 20b, and was observed to match by the feature vector comparison means.

In this case, for example, three of the ten matched feature vectors are “(5) (1) (4) (1)”, so applying the probability P = 3/10 to Equation 1 above gives an information amount of I = 1.737 bits. Calculating the information amount for the other feature vectors in the same way and taking the total as the key length gives 12.44 bits. In this example, the threshold value is set to 10 bits, so the calculated “12.44 bits” exceeds the threshold value.
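As a non-limiting illustration, the key length calculation for the example of FIG. 10 can be sketched as follows. The function name is hypothetical, and the sketch assumes that the information amount is summed once per distinct matched feature vector, which is the reading that reproduces the 12.44 bits stated above.

```python
import math
from collections import Counter

def key_length_bits(matched_vectors):
    """Sum the information amount -log2(P) of each distinct matched
    feature vector, where P is its share of all matched vectors."""
    counts = Counter(matched_vectors)
    total = sum(counts.values())
    return sum(-math.log2(n / total) for n in counts.values())

# The ten matched feature vectors of the FIG. 10 example.
matched = (
    [(5, 1, 4, 1)] * 3
    + [(5, 2, 4, 1)]
    + [(5, 1, 4, 2)]
    + [(5, 1, 3, 1)] * 3
    + [(5, 2, 4, 2)] * 2
)
key_length = key_length_bits(matched)
threshold_bits = 10.0
key_may_be_issued = key_length > threshold_bits
```

Rarely occurring feature vectors contribute more bits than frequent ones, so a run of identical matches adds little to the key length.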

FIG. 11 is an explanatory diagram showing an example of a mutual authentication system 401 created experimentally. This mutual authentication system 401 is configured by connecting three terminal devices 420a to 420c and a mutual authentication server 410 via a wireless network. Each of the terminal devices 420a to 420c has the same configuration as that of the terminal device 20 shown in FIGS. The mutual authentication server 410 has the same configuration as the mutual authentication server 10 shown in FIGS. The terminal devices 420a and 420b are installed in the same room 421, and the terminal device 420c is installed in the adjacent room 422.

Then, the sum of the information amounts of the matched feature vectors was calculated between the terminal devices 420a and 420b and between the terminal devices 420a and 420c by the method described up to FIG. FIG. 12 is a graph showing changes with respect to the measurement time of the total amount of information calculated between the terminal devices 420a and 420b and between the terminal devices 420a and 420c in the mutual authentication system 401 shown in FIG. The total amount of information calculated between the terminal devices 420a and 420b is represented as a graph 431, and the total amount of information calculated between the terminal devices 420a and 420c is represented as a graph 432.

In other words, the graph 431 represents the total amount of information calculated in the situation where the same environmental sound should be observed, whereas the graph 432 represents the total calculated in the situation where the same environmental sound should not be observed. As shown in FIG. 12, the graphs 431 and 432 clearly differ in the total amount of information. Therefore, by setting in advance an appropriate intermediate value as the threshold value 433 and comparing the threshold value corresponding to the length of the target time over which sensing was performed with the total amount of information obtained, it is possible to detect whether or not the environmental sounds are the same.

When it is determined that the environmental sounds are the same, the key sharing unit 105 generates a hash value of the concatenated feature vectors 341 as the authentication key 111, stores it in the storage unit 12, and transmits it to the terminal devices 20a and 20b. The hash value here is an output value obtained by inputting the concatenated feature vectors 341 into a one-way (irreversible) function.
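As a non-limiting illustration, deriving the authentication key from the concatenated feature vectors can be sketched as follows. The specification does not name a hash function or a serialization format; SHA-256 and the comma/semicolon encoding used here are assumptions made only for illustration.

```python
import hashlib

def derive_key(matched_vectors):
    """Concatenate the matched feature vectors and hash the result.

    Assumed encoding: region numbers joined by ',' within a vector,
    vectors joined by ';'. Assumed one-way function: SHA-256.
    """
    serialized = ";".join(
        ",".join(str(region) for region in vector) for vector in matched_vectors
    )
    return hashlib.sha256(serialized.encode("ascii")).hexdigest()

# Hypothetical matched feature vectors in time order.
key_111 = derive_key([(5, 1, 4, 1), (5, 2, 4, 1), (5, 1, 3, 1)])
```

Because the order of concatenation changes the hash, both sides of any reimplementation would need to agree on the ordering of the matched time windows.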

(Mutual authentication means)
In the terminal devices 20a and 20b, the mutual authentication unit 204 receives the authentication key 211 (111) and stores it in the storage unit 22, and the terminal devices 20a and 20b perform mutual authentication using the authentication key. The authentication key 111 (211) is preferably transmitted and received by a secure communication method such as SSL (Secure Sockets Layer). The mutual authentication operation can use a known technique such as challenge-response authentication.
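As a non-limiting illustration, one direction of a challenge-response exchange using the shared authentication key can be sketched as follows. The specification only refers to challenge-response authentication as a known technique; the use of HMAC-SHA256, the 16-byte challenge, and the function names are assumptions.

```python
import hashlib
import hmac
import secrets

def make_challenge():
    """Verifier side: generate a fresh random challenge."""
    return secrets.token_bytes(16)

def respond(auth_key, challenge):
    """Prover side: prove knowledge of the key without sending it."""
    return hmac.new(auth_key, challenge, hashlib.sha256).digest()

def verify(auth_key, challenge, response):
    """Verifier side: recompute the response and compare in constant time."""
    return hmac.compare_digest(respond(auth_key, challenge), response)

# Hypothetical key held by both terminal devices 20a and 20b.
shared_key = b"example-authentication-key"
challenge = make_challenge()
response_from_20b = respond(shared_key, challenge)
accepted = verify(shared_key, challenge, response_from_20b)
```

For mutual authentication the same exchange would be run a second time with the roles of the two terminal devices swapped.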

(flowchart)
FIG. 13 is a flowchart showing the mutual authentication operation performed between the mutual authentication server 10 and the terminal device 20 shown in FIG. First, in the mutual authentication server 10 and the terminal device 20, the respective time synchronization means 101 and 201 perform time synchronization processing to align their clocks with each other (steps S101 and S201).

Next, on the terminal device 20 side, the sensing means 202 acquires the environmental sound as voice data via the voice input means 24 (step S202). Then, the voice data transmission means 203 of the terminal device 20 creates time series data in which the high frequency components have been removed from the acquired voice data and the amount of data has been reduced, and transmits it to the mutual authentication server 10 (step S203).

In the mutual authentication server 10, the data synchronization means 102 waits for reception of time-series data from the terminal devices 20a and 20b, and corrects the time-axis direction deviation of the time-series data received from both terminal devices (step S102). Then, the feature vector generation unit 103 divides this time series data into time windows and applies c quantization patterns prepared in advance to each time window to generate c feature vectors (step S103).
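As a non-limiting illustration, the time-axis correction of step S102 can be sketched as follows, assuming that the deviation is estimated from the timing of a predetermined number of extreme values in each series, as described for the data synchronization means in the supplementary notes. The peak definition, the use of the median, and the sample data are assumptions.

```python
from statistics import median

def extreme_indices(series, count):
    """Indices of the `count` largest local maxima, in time order."""
    peaks = [
        i for i in range(1, len(series) - 1)
        if series[i - 1] < series[i] >= series[i + 1]
    ]
    largest = sorted(peaks, key=lambda i: series[i], reverse=True)[:count]
    return sorted(largest)

def estimate_offset(series_a, series_b, count=2):
    """Estimate how many samples series_b lags behind series_a."""
    idx_a = extreme_indices(series_a, count)
    idx_b = extreme_indices(series_b, count)
    return int(median(b - a for a, b in zip(idx_a, idx_b)))

def align(series_a, series_b, offset):
    """Drop the leading samples of the lagging series and trim both
    series to a common length."""
    if offset >= 0:
        series_b = series_b[offset:]
    else:
        series_a = series_a[-offset:]
    length = min(len(series_a), len(series_b))
    return series_a[:length], series_b[:length]

# Hypothetical volume series; device 20b lags device 20a by 3 samples.
data_20a = [0, 1, 0, 0, 5, 0, 0, 3, 0, 0, 0, 0]
data_20b = [0, 0, 0] + data_20a[:-3]
offset = estimate_offset(data_20a, data_20b)
aligned_a, aligned_b = align(data_20a, data_20b, offset)
```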

Then, the feature vector comparison means 104 compares the c feature vectors generated in the same time window for both terminal devices in a brute-force manner, and if even one feature vector is common, determines that the feature vectors match (step S104). If the feature vectors do not match, the process ends abnormally.

The key sharing unit 105 that has received the comparison result calculates the information amount of each matched feature vector from its occurrence probability and, only when the sum exceeds a given threshold value, transmits the hash value of the concatenated matched feature vectors to each terminal as an authentication key (steps S105 to S106). If the information amount of the feature vectors does not exceed the threshold, the process likewise ends abnormally. The mutual authentication means 204 of each of the terminal devices 20a and 20b performs mutual authentication using the received authentication key (step S204).

Through the processing described above, this mutual authentication system can generate and issue an effective authentication key with sufficient strength by voice input alone. No special operation is required for this, and since many devices are equipped in advance with hardware and software necessary for voice input, the installation cost is small.

(Overall operation of the first embodiment)
Next, the overall operation of the above embodiment will be described.
The mutual authentication method according to the present embodiment is used in a mutual authentication system in which a plurality of terminal devices 20 and a mutual authentication server 10 that generates and gives authentication keys to these terminal devices are connected to each other. The voice data transmitting means of each terminal device transmits the surrounding environmental sound to the mutual authentication server as time-series data representing the time change of the sound volume (FIG. 13, steps S202 to S203). The feature vector generating means of the mutual authentication server analyzes the frequency components of each of the time series data received from the terminal devices to generate feature vectors (FIG. 13, step S103). The feature vector comparing means of the mutual authentication server compares the generated feature vectors between the terminal devices to determine whether or not they match (FIG. 13, step S104). The key sharing means of the mutual authentication server generates an authentication key and transmits it to each terminal device when the feature vectors match (FIG. 13, steps S105 to S106).

Further, in the process of generating the feature vector by the feature vector generating means shown in step S103 of FIG. 13, the Fourier transform function divides the time series data into time windows having a constant time interval and performs an FFT (Fast Fourier Transform) on each time window to output a power spectrum, and the quantization function collates the output power spectrum with a plurality of thresholds given in advance to output a feature vector for each given frequency. Here, a feature vector is output for each of a plurality of patterns, given in advance, of combinations of the threshold values, so that a plurality of feature vectors are output.
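As a non-limiting illustration, computing the power spectrum of one time window and discarding the components above the cutoff frequency can be sketched as follows. A naive discrete Fourier transform is used here in place of an FFT so that the sketch needs only the standard library; it yields the same spectrum, only more slowly. The sampling rate, window length, cutoff, and test signal are hypothetical.

```python
import cmath
import math

def power_spectrum(window):
    """Power at each non-negative frequency bin of one time window
    (naive DFT standing in for the FFT)."""
    n = len(window)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(window)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def cut_off(spectrum, step_hz, cutoff_hz):
    """Keep only the bins at or below the cutoff frequency fm."""
    return [p for k, p in enumerate(spectrum) if k * step_hz <= cutoff_hz]

fc = 100   # data points per second (assumed)
N = 100    # data points per time window (assumed), so the step fc / N is 1 Hz
window = [math.sin(2 * math.pi * 3 * t / fc) for t in range(N)]  # 3 Hz tone
spectrum = power_spectrum(window)
low = cut_off(spectrum, fc / N, cutoff_hz=10)
peak_hz = max(range(len(low)), key=lambda k: low[k]) * fc / N
```

The retained bins would then be passed to the quantization step to obtain the feature vectors.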

Then, the process of comparing the feature vectors by the feature vector comparison unit shown in step S104 in FIG. 13 compares the plurality of feature vectors generated in the same time window between the terminal devices, and the feature between the terminal devices. If even one of the vectors matches, it is determined that the feature vectors of the terminal devices match.

Further, the process of generating and transmitting the authentication key by the key sharing means shown in steps S105 to S106 in FIG. 13 calculates the total amount of information of the feature vectors that match within the target time range of the time series data. Only when the calculated total amount of information is greater than or equal to a certain value, the hash value of the concatenated matched feature vectors is generated as the authentication key.

Here, each of the above-described operation steps may be implemented as a computer-executable program and executed by the processor 11 of the mutual authentication server 10, which directly executes each of the steps. The program may be recorded on a non-transitory recording medium, such as a DVD, a CD, or a flash memory. In this case, the program is read from the recording medium by a computer and executed.
By this operation, this embodiment has the following effects.

(Effect obtained by this embodiment)
Sound is also “vibration” in a broad sense. Therefore, this embodiment, which uses environmental sound for authentication, is certainly similar to the prior art described in Patent Documents 2 to 3 and Non-Patent Documents 2 to 4, in which the user “shakes two devices together”.

However, when environmental sound is used for authentication, the following three points are problematic.
The first point is the amount of data. The vibration that is the subject of authentication in the prior art occurs at most several times per second, whereas voice requires sampling at no less than 8 kHz and 8 bits, for example in the specification of a microphone for voice calls of a mobile phone terminal. If this data is transmitted as it is, a huge amount of communication (8 kilobytes / second) occurs during transmission, and the amount of processing at the time of determination also becomes huge.
The second point is that it is not always appropriate to share a key for mutual authentication even among users under similar environmental sounds.
The third point is that the observed environmental sound is greatly influenced by differences in observation locations and in acoustic characteristics between devices.
From the above three points, it is very difficult to replace “vibration” in the prior art with “environmental sound” as it is.

In this embodiment, when the environmental sound data collected by the terminal device 20 is transmitted to the mutual authentication server 10, the sound pressure data is compressed from 8000 values per second to 100 values per second before transmission. This not only reduces the amount of communication during transmission (corresponding to the first point), but also reduces the risk of erroneously issuing an authentication key to an unauthorized user, that is, the risk of a false positive. This will be described in more detail below.

Environmental sounds can be broadly divided into “steady sound system” sounds that occur periodically and “pulse system” sounds that occur suddenly. The same “steady sound system” environmental sound is often observed even in places that are not physically close to each other. Therefore, in order to apply environmental sound to the “mutual authentication” of the present application, it is necessary to judge the match not on “steady sound system” sounds but on “pulse system” environmental sounds, which are observed only at that time.

Environmental sound of the “steady sound system” often has a relatively high frequency, whereas environmental sound of the “pulse system” often has a relatively low frequency. Compressing the sound pressure data on the terminal device side before transmission has the same effect as passing it through a so-called low-pass filter, so the influence of “steady sound system” environmental sound having a relatively high frequency can be reduced.
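As a non-limiting illustration, compressing the sound pressure data from 8000 to 100 values per second can be sketched as follows. The specification does not state how each representative value is computed; the mean absolute amplitude of each block of 80 samples is an assumption, chosen because averaging over a block also smooths out high-frequency components.

```python
def compress(samples, in_rate=8000, out_rate=100):
    """Reduce `in_rate` samples per second to `out_rate` volume values
    per second by averaging the absolute amplitude of each block."""
    block = in_rate // out_rate  # 80 samples per output value
    return [
        sum(abs(s) for s in samples[i:i + block]) / block
        for i in range(0, len(samples) - block + 1, block)
    ]

# Hypothetical one second of 8 kHz audio: a rapidly alternating signal
# whose amplitude grows by one step every 80 samples.
samples = [(-1) ** i * (i // 80) for i in range(8000)]
time_series = compress(samples)
```

In this sketch the sample-to-sample alternation disappears from the output and only the slowly changing volume envelope remains.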

Also, by calculating the amount of information of the matched feature vectors, it is possible to control key sharing so that keys are shared more readily between users whose matched feature vectors appear rarely, arising mainly from the “pulse system”, than between users whose matched feature vectors appear frequently, arising from environmental sound of the “steady sound system”. This has the effect of reducing the risk of erroneously issuing an authentication key to an unauthorized user, that is, the risk of a false positive (corresponding to the second point).

Furthermore, by preparing multiple quantization patterns with threshold values changed little by little, and using them to generate and compare multiple feature vectors, it is also possible to reduce the influence of differences in observation locations and differences in acoustic characteristics between devices (corresponding to the third point).

Thus, according to the present embodiment, it is possible to realize reliable mutual authentication with less risk of erroneously issuing an authentication key to an unauthorized user. In that case, all that is required is to input the sound around the apparatus, and therefore, it can be implemented only with hardware and software that are provided as standard in many devices.

(Second Embodiment)
In the mutual authentication system 501 according to the second embodiment of the present invention, in addition to the configuration of the first embodiment, when the total amount of information is equal to or less than a predetermined value, the key sharing unit 605 of the mutual authentication server 510 temporarily stores the feature vector in a storage means provided in advance, concatenates the temporarily stored feature vector with the feature vector generated from time-series data subsequently sent from the same set of terminal devices, and generates the hash value of the result as the authentication key.

With this configuration, the same effect as that of the first embodiment can be obtained, and an authentication key can be generated even when a feature vector having a sufficient key length cannot be obtained.
Hereinafter, this will be described in more detail.

FIG. 14 is an explanatory diagram showing the configuration of the mutual authentication system 501 according to the second embodiment of the present invention. The mutual authentication system 501 is the mutual authentication system 1 shown in the first embodiment, in which the mutual authentication server 10 is replaced with another mutual authentication server 510. The terminal device 20 is the same as that in the first embodiment.

The mutual authentication server 510 has the same hardware configuration as the mutual authentication server 10. The software configuration that operates on the processor 11 is also the same, except that the key sharing unit 105 is replaced with another key sharing unit 605 and an area for storing the feature vector 612 is added to the storage means 12. Therefore, only the differences will be described here.

FIG. 15 is a flowchart showing the mutual authentication operation performed between the mutual authentication server 510 and the terminal device 20 shown in FIG. The operation of the terminal device 20 is, needless to say, the same as that of the first embodiment shown in FIG. 13, and steps S101 to S106 of the operation of the mutual authentication server 510 are also the same as the operation of the first embodiment.

The only operation different from that of the first embodiment is that the key sharing unit 605 stores the feature vector 612 in the storage unit 12 when the total amount of information (key length) of the matched feature vectors does not exceed a certain threshold (step S107). Then, when new time-series data is sent from the same set of terminal devices 20 in a subsequent operation, the key sharing means 605 concatenates the stored feature vector 612 with the feature vector generated from the new time-series data and transmits the hash value of the result to each terminal as the authentication key.

In operation, there may be cases where a feature vector with a sufficient key length cannot be obtained using only time-series data obtained during a specific period. In this embodiment, in such a case, time-series data obtained subsequently from the same set of terminal devices can also be used for the authentication key, so that the authentication key can be generated and used more reliably.
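As a non-limiting illustration, the accumulation performed by the key sharing unit 605 can be sketched as follows. The class and method names, the use of SHA-256, the serialization format, and the sample data are assumptions; the sketch also assumes that the information amount is recomputed over all accumulated feature vectors each time.

```python
import hashlib
import math
from collections import Counter

def info_bits(vectors):
    """Total information amount, summed once per distinct vector."""
    counts = Counter(vectors)
    total = sum(counts.values())
    return sum(-math.log2(n / total) for n in counts.values())

class KeySharing:
    """Hypothetical model of key sharing with temporary storage."""

    def __init__(self, threshold_bits):
        self.threshold_bits = threshold_bits
        self.stored = {}  # set of terminal devices -> stored feature vectors

    def submit(self, devices, matched_vectors):
        """Return an authentication key, or None if the accumulated
        key length does not yet exceed the threshold."""
        pair = frozenset(devices)
        pending = self.stored.pop(pair, []) + list(matched_vectors)
        if info_bits(pending) <= self.threshold_bits:
            self.stored[pair] = pending  # keep for the next attempt
            return None
        serialized = ";".join(",".join(map(str, v)) for v in pending)
        return hashlib.sha256(serialized.encode("ascii")).hexdigest()

sharing = KeySharing(threshold_bits=10.0)
first = sharing.submit(("20a", "20b"), [(5, 1, 4, 1)] * 3 + [(5, 2, 4, 1)])
second = sharing.submit(
    ("20b", "20a"),
    [(5, 1, 4, 2)] + [(5, 1, 3, 1)] * 3 + [(5, 2, 4, 2)] * 2,
)
```

Keying the store on the unordered set of devices lets later data from the same set of terminal devices find the earlier feature vectors regardless of the order in which the devices are named.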

(Extended embodiment)
Various extensions can be considered in the first and second embodiments described above without departing from the spirit of the first and second embodiments. This will be described below.
First, one mutual authentication system may include three or more terminal devices, and these terminal devices may perform mutual authentication with a set of three or more devices. One of the terminal devices included in one mutual authentication system may also have a function as a mutual authentication server.

Further, a configuration may be adopted in which the user designates a specific time range, and only the designated time range is treated as the target time range for authentication in the present invention or, conversely, is excluded from that target time range.

The present invention has been described with reference to the specific embodiments shown in the drawings. However, the present invention is not limited to the embodiments shown in the drawings, and any hitherto known configuration can be employed as long as the effects of the present invention are achieved.

The new technical contents of the above-described embodiments are summarized as follows. Although part or all of the embodiments can be summarized as the following novel techniques, the present invention is not necessarily limited to them.

(Supplementary Note 1) A mutual authentication system configured by connecting a plurality of terminal devices and a mutual authentication server that generates and gives an authentication key to these terminal devices, wherein
each terminal device includes voice data transmitting means for transmitting surrounding environmental sound to the mutual authentication server as time-series data representing a temporal change in sound volume, and
the mutual authentication server includes:
feature vector generation means for generating a feature vector by analyzing a frequency component for each of the time-series data received from each terminal device;
feature vector comparison means for comparing the generated feature vectors between the terminal devices and determining whether or not they match; and
key sharing means for generating and transmitting an authentication key to each of the terminal devices when the feature vectors match.

(Supplementary Note 2) The mutual authentication system according to Supplementary Note 1, wherein each terminal device includes sensing means for collecting surrounding environmental sound as voice data, and
the voice data transmitting means extracts a plurality of representative values from the collected voice data and transmits the extracted representative values as the time-series data.

(Supplementary note 3) The mutual authentication system according to supplementary note 1, characterized in that each of the terminal device and the mutual authentication server has a time synchronization means for adjusting the time between each other in advance.

(Supplementary Note 4) The mutual authentication system according to Supplementary Note 1, wherein the mutual authentication server has
data synchronization means that detects a predetermined number of extreme values from each of the time series data received from each terminal device, corrects a shift in the time axis direction between the time series data based on the timing at which the extreme values are detected, and outputs the corrected data to the feature vector generation means.

(Supplementary Note 5) The mutual authentication system according to Supplementary Note 1, wherein the feature vector generation means of the mutual authentication server has:
a Fourier transform function that divides the time-series data into time windows with a constant time interval and performs FFT (Fast Fourier Transform) on each time window to output a power spectrum; and
a quantization function that outputs a feature vector for each frequency by collating the power level for each frequency of the output power spectrum with thresholds set in advance in a plurality of stages.

(Supplementary Note 6) The mutual authentication system according to Supplementary Note 5, wherein the feature vector generation means of the mutual authentication server has
a cut-off function for removing frequency components equal to or higher than a predetermined cut-off frequency from the power spectrum obtained by the Fourier transform function and passing the result to the quantization function.

(Supplementary Note 7) The mutual authentication system according to Supplementary Note 5, wherein the quantization function of the mutual authentication server
treats a plurality of the thresholds for preset levels as one group, is provided with a plurality of the groups, and outputs the feature vector for each group.

(Supplementary Note 8) The mutual authentication system according to Supplementary Note 7, wherein the feature vector comparison unit of the mutual authentication server compares the plurality of feature vectors generated in the same time window between the terminal devices, and determines that the feature vectors of the terminal devices match if even one of the feature vectors matches between the terminal devices.

(Supplementary Note 9) The mutual authentication system according to Supplementary Note 8, wherein the key sharing unit of the mutual authentication server calculates a total value of information amounts per unit time of the feature vectors that match in the target time range of the time series data, and, only when the calculated total value of the information amounts is equal to or greater than a predetermined value, concatenates the matched feature vectors and generates a hash value of the concatenated matched feature vectors as the authentication key.

(Supplementary Note 10) The mutual authentication system according to Supplementary Note 8, wherein the key sharing means of the mutual authentication server calculates a total value of information amounts per unit time of the feature vectors that match in the target time range of the time series data; when the calculated total value of the information amounts is equal to or greater than a predetermined value, concatenates the matched feature vectors and generates a hash value of the concatenated feature vectors as the authentication key; and when the total value of the information amounts is equal to or less than the predetermined value, temporarily stores the feature vectors in a storage means provided in advance and generates, as the authentication key, a hash value of a concatenation of the temporarily stored feature vectors and the feature vectors generated from time-series data further sent from the same set of terminal devices.

(Supplementary Note 11) A mutual authentication server that is mutually connected to a plurality of terminal devices to constitute a mutual authentication system, comprising:
feature vector generating means for generating a feature vector by analyzing a frequency component for each of the time-series data representing a temporal change in the volume of sound received from each terminal device;
feature vector comparison means for comparing the generated feature vectors between the terminal devices and determining whether or not they match; and
key sharing means for generating and transmitting an authentication key to each of the terminal devices when the feature vectors match.

(Supplementary Note 12) The mutual authentication server according to Supplementary Note 11, wherein the feature vector generation means has:
a Fourier transform function that divides the time-series data into time windows with a constant time interval and performs FFT (Fast Fourier Transform) on each time window to output a power spectrum; and
a quantization function that outputs a feature vector for each frequency by collating the power level for each frequency of the output power spectrum with thresholds set in advance in a plurality of stages.

(Supplementary Note 13) The mutual authentication server according to Supplementary Note 12, wherein the quantization function
treats a plurality of the thresholds for preset levels as one group, is provided with a plurality of the groups, and outputs the feature vector for each group.

(Supplementary Note 14) The mutual authentication server according to Supplementary Note 13, wherein the feature vector comparison means compares the plurality of feature vectors generated in the same time window between the terminal devices, and determines that the feature vectors of the terminal devices match if even one of the feature vectors matches between the terminal devices.

(Supplementary Note 15) The mutual authentication server according to Supplementary Note 14, wherein the key sharing unit calculates a total value of information amounts per unit time of the feature vectors that match in the target time range of the time series data, and, only when the total value of the calculated information amounts is equal to or greater than a predetermined value given in advance, concatenates the matched feature vectors and generates a hash value of the concatenated matched feature vectors as the authentication key.

(Supplementary Note 16) A mutual authentication method for a mutual authentication system in which a plurality of terminal devices and a mutual authentication server that generates and gives authentication keys to these terminal devices are connected to each other, wherein:
the voice data transmitting means of each terminal device transmits surrounding environmental sound to the mutual authentication server as time-series data representing a temporal change in sound volume,
the feature vector generation means of the mutual authentication server generates a feature vector by analyzing a frequency component for each of the time-series data received from each terminal device,
the feature vector comparison means of the mutual authentication server compares the generated feature vectors between the terminal devices to determine whether or not they match, and
the key sharing means of the mutual authentication server generates and transmits an authentication key to each terminal device when the feature vectors match.

(Supplementary Note 17) The mutual authentication method, wherein the process in which the feature vector generation means generates the feature vector includes:
the Fourier transform function dividing the time-series data into time windows with a constant time interval, performing FFT (Fast Fourier Transform) on each time window, and outputting a power spectrum; and
the quantization function outputting a feature vector for each frequency by collating the power level for each frequency of the output power spectrum with thresholds set in advance in a plurality of stages.
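
The windowed FFT step can be sketched as follows. This is an illustrative sketch only: it uses a textbook recursive radix-2 FFT written with the standard library, assumes the window length is a power of two, drops a trailing partial window, and keeps only the lower half of the bins because the input is real-valued. None of these choices comes from the source.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def power_spectra(series, window_size):
    """Split `series` into fixed-length windows; return one power spectrum per window."""
    spectra = []
    for start in range(0, len(series) - window_size + 1, window_size):
        window = series[start:start + window_size]
        spectrum = fft(window)
        # Power = squared magnitude; upper half mirrors the lower half for real input.
        spectra.append([abs(c) ** 2 for c in spectrum[:window_size // 2]])
    return spectra


# Example: a tone with 2 cycles per 8-sample window, lasting two windows.
tone = [math.sin(2 * math.pi * 2 * t / 8) for t in range(16)]
spectra = power_spectra(tone, window_size=8)
```

Each inner list is the power spectrum of one time window; a quantization step would then collate every entry with the preset thresholds.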

(Supplementary Note 18) The mutual authentication method according to appendix 17, wherein, in the process in which the quantization function outputs the feature vector, a plurality of the thresholds for a preset level are set as one group, a plurality of such groups are provided, and the feature vector is output for each group.

(Supplementary Note 19) The mutual authentication method according to appendix 18, wherein, in the process in which the feature vector comparison means compares the feature vectors, the plurality of feature vectors generated in the same time window are compared between the terminal devices, and when even one of the feature vectors matches between the terminal devices, it is determined that the feature vectors of the terminal devices match.

(Supplementary Note 20) The mutual authentication method according to appendix 19, wherein, in the process in which the key sharing means generates and transmits the authentication key, a total value of the information amounts per unit time of the feature vectors that match in the target time range of the time-series data is calculated and, only when the calculated total value is equal to or greater than a predetermined value, the matched feature vectors are concatenated and a hash value of the concatenated feature vectors is generated as the authentication key.

(Supplementary Note 21) A mutual authentication program for a mutual authentication system in which a plurality of terminal devices and a mutual authentication server that generates and gives authentication keys to these terminal devices are connected to each other,
the program causing a computer provided in the mutual authentication server to execute:
a procedure for generating a feature vector by analyzing a frequency component for each of time-series data representing a temporal change in sound volume received from each terminal device,
a procedure for comparing the generated feature vectors between the terminal devices and determining whether or not they match, and
a procedure for generating and transmitting an authentication key to each of the terminal devices when the feature vectors match.

(Supplementary Note 22) The mutual authentication program, wherein the procedure for generating the feature vector includes:
a procedure of dividing the time-series data into time windows at regular time intervals and performing FFT (Fast Fourier Transform) on each time window to output a power spectrum; and
a procedure of collating the power level for each frequency of the output power spectrum with thresholds set in advance in a plurality of stages to output a feature vector for each frequency.

(Supplementary Note 23) The mutual authentication program according to appendix 22, wherein the procedure for outputting the feature vector includes a procedure in which a plurality of the thresholds for a preset level are set as one group, a plurality of such groups are provided, and the feature vector is output for each group.

(Supplementary Note 24) The mutual authentication program according to appendix 23, wherein the procedure for comparing the feature vectors includes a procedure of comparing, between the terminal devices, the plurality of feature vectors generated in the same time window and, when even one of the feature vectors matches between the terminal devices, determining that the feature vectors of the terminal devices match.

(Supplementary Note 25) The mutual authentication program according to appendix 24, wherein the procedure for generating and transmitting the authentication key includes a procedure of calculating a total value of the information amounts per unit time of the feature vectors that match in the target time range of the time-series data and, only when the calculated total value is equal to or greater than a predetermined value, concatenating the matched feature vectors and generating a hash value of the concatenated feature vectors as the authentication key.

This application claims priority based on Japanese Patent Application No. 2012-037361 filed on February 23, 2012, the entire disclosure of which is incorporated herein.

The present invention can be used in a mutual authentication system for constructing an ad hoc (non-permanent, temporary) connection relationship between specific devices.

1, 401, 501 Mutual authentication system
10, 410, 510 Mutual authentication server
11, 21 Processor
12, 22 Storage means
13, 23 Communication means
20, 20a, 20b, 420, 420a, 420b, 420c Terminal device
24 Voice input means
30 Network
101 Time synchronization means
102 Data synchronization means
103 Feature vector generation means
103a Fourier transform function
103b Cut-off function
103c Quantization function
104 Feature vector comparison means
105, 605 Key sharing means
111, 211 Authentication key
201 Time synchronization means
202 Sensing means
203 Voice data transmission means
204 Mutual authentication means
612 Feature vector

Claims

A mutual authentication system configured by connecting a plurality of terminal devices and a mutual authentication server that generates and gives an authentication key to these terminal devices, wherein
each terminal device includes voice data transmitting means for transmitting surrounding environmental sound to the mutual authentication server as time-series data representing a temporal change in sound volume, and
the mutual authentication server comprises:
feature vector generation means for generating a feature vector by analyzing a frequency component for each of the time-series data received from each terminal device;
feature vector comparison means for comparing the generated feature vectors between the terminal devices and determining whether or not they match; and
key sharing means for generating and transmitting an authentication key to each of the terminal devices when the feature vectors match.
The mutual authentication system according to claim 1, wherein each terminal device includes sensing means that collects ambient environmental sound as voice data, and
the voice data transmitting means extracts a plurality of representative values from the collected voice data and transmits the representative values as the time-series data.
The mutual authentication system according to claim 1, wherein each of the terminal devices and the mutual authentication server includes time synchronization means that synchronizes their times with each other in advance.
The mutual authentication system according to claim 1, wherein the mutual authentication server includes data synchronization means that detects a predetermined number of extreme values from each of the time-series data received from each terminal device, corrects a shift in the time axis direction between the time-series data based on the timing at which the extreme values are detected, and outputs the corrected data to the feature vector generation means.
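
One way to picture this extreme-value-based correction is the sketch below. It is only an illustration under stated assumptions: extreme values are taken to be the largest local maxima, the shift is estimated as the median difference between paired peak positions, and the function names and the default `count=3` are hypothetical.

```python
def peak_indices(series, count):
    """Indices of the `count` largest local maxima, returned in time order."""
    peaks = [i for i in range(1, len(series) - 1)
             if series[i - 1] < series[i] >= series[i + 1]]
    largest = sorted(peaks, key=lambda i: series[i], reverse=True)[:count]
    return sorted(largest)


def estimate_shift(series_a, series_b, count=3):
    """Estimate by how many samples series_b lags series_a, from peak timing."""
    diffs = sorted(b - a for a, b in zip(peak_indices(series_a, count),
                                         peak_indices(series_b, count)))
    if not diffs:
        return 0  # no peaks found; assume the series are already aligned
    return diffs[len(diffs) // 2]  # median tolerates one mispaired peak


def align(series_a, series_b, count=3):
    """Trim both series so that the same index refers to the same instant."""
    shift = estimate_shift(series_a, series_b, count)
    if shift >= 0:
        a, b = series_a, series_b[shift:]
    else:
        a, b = series_a[-shift:], series_b
    n = min(len(a), len(b))
    return a[:n], b[:n]


# Illustrative volume series: terminal B hears the same sound two samples later.
volume_a = [0, 1, 5, 1, 0, 2, 9, 2, 0, 1, 7, 1, 0, 0]
volume_b = [0, 0] + volume_a[:-2]
aligned_a, aligned_b = align(volume_a, volume_b)
```

After alignment, both series can be cut into the same time windows so that feature vectors are compared for the same instants.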
The mutual authentication system according to claim 1, wherein the feature vector generation means of the mutual authentication server includes:
a Fourier transform function that divides the time-series data into time windows with a constant time interval and performs FFT (Fast Fourier Transform) on each time window to output a power spectrum; and
a quantization function that outputs a feature vector for each frequency by comparing the power level for each frequency of the output power spectrum with thresholds set in advance in a plurality of stages.
The mutual authentication system according to claim 5, wherein the feature vector generation means of the mutual authentication server further includes a cut-off function that removes frequency components equal to or higher than a predetermined cut-off frequency from the power spectrum obtained by the Fourier transform function and passes the result to the quantization function.
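
The cut-off function can be sketched as a simple filter over the bins of the power spectrum. The sketch assumes bin k of a window of `window_size` samples corresponds to the frequency k * sample_rate_hz / window_size; the parameter names and the example values are hypothetical.

```python
def apply_cutoff(power_spectrum, sample_rate_hz, window_size, cutoff_hz):
    """Drop bins whose frequency is at or above cutoff_hz; keep the rest in order."""
    bin_width_hz = sample_rate_hz / window_size
    return [p for k, p in enumerate(power_spectrum) if k * bin_width_hz < cutoff_hz]


# Illustrative spectrum: 8 bins from a 16-sample window of a series sampled at 16 Hz,
# so each bin is 1 Hz wide and bins 5, 6 and 7 are removed by a 5 Hz cut-off.
spectrum = [float(k) for k in range(8)]
kept = apply_cutoff(spectrum, sample_rate_hz=16, window_size=16, cutoff_hz=5)
```

Only the retained bins would then be handed to the quantization step.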
The mutual authentication system according to claim 5, wherein, in the quantization function of the mutual authentication server, a plurality of the thresholds for a preset level are set as one group, a plurality of such groups are provided, and the feature vector is output for each group.
The mutual authentication system according to claim 7, wherein the feature vector comparison means of the mutual authentication server compares, between the terminal devices, the plurality of feature vectors generated in the same time window, and determines that the feature vectors of the terminal devices match when even one of the feature vectors matches between the terminal devices.
The mutual authentication system, wherein the key sharing means of the mutual authentication server calculates a total value of the information amounts per unit time of the feature vectors that match in the target time range of the time-series data and, only when the calculated total value is equal to or greater than a value given in advance, concatenates the matched feature vectors and generates a hash value of the concatenated feature vectors as the authentication key.
The mutual authentication system according to claim 8, wherein the key sharing means of the mutual authentication server calculates a total value of the information amounts per unit time of the feature vectors that match in the target time range of the time-series data; when the calculated total value is equal to or greater than a value given in advance, concatenates the matched feature vectors and generates a hash value of the concatenated feature vectors as the authentication key; and when the total value is less than or equal to the given value, temporarily stores the feature vectors in storage means provided in advance, concatenates the temporarily stored feature vectors with feature vectors generated from time-series data subsequently sent from the same set of terminal devices, and generates a hash value of the concatenated feature vectors as the authentication key.
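
The store-and-retry behaviour can be sketched as a small accumulator keyed by the set of terminals. This is a hedged illustration: the class name `KeyAccumulator`, the in-memory dictionary standing in for the storage means, the log2(levels)-bits-per-element estimate, and SHA-256 as the hash are all assumptions, and the sketch issues a key as soon as the accumulated total reaches the required amount.

```python
import hashlib
import math


class KeyAccumulator:
    """Collects matched feature vectors per terminal set until enough bits exist."""

    def __init__(self, levels, min_bits):
        self.levels = levels      # number of quantization stages per element
        self.min_bits = min_bits  # required total information amount
        self.pending = {}         # frozenset of terminal ids -> stored vectors

    def offer(self, terminal_ids, matched_vectors):
        """Add newly matched vectors; return a hex key once enough are stored."""
        group = frozenset(terminal_ids)
        stored = self.pending.pop(group, []) + list(matched_vectors)
        bits = sum(len(v) * math.log2(self.levels) for v in stored)
        if bits < self.min_bits:
            self.pending[group] = stored  # keep for the next transmission
            return None
        data = bytes(x for v in stored for x in v)
        return hashlib.sha256(data).hexdigest()


acc = KeyAccumulator(levels=4, min_bits=16)
first = acc.offer({"terminal-a", "terminal-b"}, [(0, 1, 3, 1)])   # 8 bits: stored
second = acc.offer({"terminal-b", "terminal-a"}, [(0, 1, 2, 0)])  # 16 bits: key
```

Keying the store by the set of terminal identifiers reflects the requirement that only vectors from the same set of terminal devices are concatenated.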
A mutual authentication server that is mutually connected to a plurality of terminal devices to constitute a mutual authentication system, the mutual authentication server comprising:
feature vector generation means for generating a feature vector by analyzing a frequency component for each of time-series data representing a temporal change in sound volume received from each terminal device;
feature vector comparison means for comparing the generated feature vectors between the terminal devices and determining whether or not they match; and
key sharing means for generating and transmitting an authentication key to each of the terminal devices when the feature vectors match.
The mutual authentication server according to claim 11, wherein the feature vector generation means includes:
a Fourier transform function that divides the time-series data into time windows with a constant time interval and performs FFT (Fast Fourier Transform) on each time window to output a power spectrum; and
a quantization function that outputs a feature vector for each frequency by comparing the power level for each frequency of the output power spectrum with thresholds set in advance in a plurality of stages.
The mutual authentication server according to claim 12, wherein, in the quantization function, a plurality of the thresholds for a preset level are set as one group, a plurality of such groups are provided, and the feature vector is output for each group.
The mutual authentication server according to claim 13, wherein the feature vector comparison means compares, between the terminal devices, the plurality of feature vectors generated in the same time window, and determines that the feature vectors of the terminal devices match when even one of the feature vectors matches between the terminal devices.
The mutual authentication server according to claim 14, wherein the key sharing means calculates a total value of the information amounts per unit time of the feature vectors that match in the target time range of the time-series data and, only when the calculated total value is equal to or greater than a predetermined value, concatenates the matched feature vectors and generates a hash value of the concatenated feature vectors as the authentication key.
A mutual authentication method for a mutual authentication system configured by mutually connecting a plurality of terminal devices and a mutual authentication server that generates and gives authentication keys to these terminal devices, wherein:
the voice data transmitting means of each terminal device transmits surrounding environmental sound to the mutual authentication server as time-series data representing a temporal change in sound volume,
the feature vector generation means of the mutual authentication server generates a feature vector by analyzing a frequency component for each of the time-series data received from each terminal device,
the feature vector comparison means of the mutual authentication server compares the generated feature vectors between the terminal devices to determine whether or not they match, and
the key sharing means of the mutual authentication server generates and transmits an authentication key to each terminal device when the feature vectors match.
The mutual authentication method, wherein the process in which the feature vector generation means generates the feature vector includes:
the Fourier transform function dividing the time-series data into time windows with a constant time interval, performing FFT (Fast Fourier Transform) on each time window, and outputting a power spectrum; and
the quantization function outputting a feature vector for each frequency by comparing the power level for each frequency of the output power spectrum with thresholds set in advance in a plurality of stages.
The mutual authentication method according to claim 17, wherein, in the process in which the quantization function outputs the feature vector, a plurality of the thresholds for a preset level are set as one group, a plurality of such groups are provided, and the feature vector is output for each group.
The mutual authentication method according to claim 18, wherein, in the process in which the feature vector comparison means compares the feature vectors, the plurality of feature vectors generated in the same time window are compared between the terminal devices, and when even one of the feature vectors matches between the terminal devices, it is determined that the feature vectors of the terminal devices match.
The mutual authentication method according to claim 19, wherein, in the process in which the key sharing means generates and transmits the authentication key, a total value of the information amounts per unit time of the feature vectors that match in the target time range of the time-series data is calculated and, only when the calculated total value is equal to or greater than a predetermined value, the matched feature vectors are concatenated and a hash value of the concatenated feature vectors is generated as the authentication key.
A mutual authentication program for a mutual authentication system configured by mutually connecting a plurality of terminal devices and a mutual authentication server that generates and gives authentication keys to these terminal devices,
the program causing a computer provided in the mutual authentication server to execute:
a procedure for generating a feature vector by analyzing a frequency component for each of time-series data representing a temporal change in sound volume received from each terminal device,
a procedure for comparing the generated feature vectors between the terminal devices and determining whether or not they match, and
a procedure for generating and transmitting an authentication key to each of the terminal devices when the feature vectors match.
The mutual authentication program, wherein the procedure for generating the feature vector comprises:
a procedure of dividing the time-series data into time windows at regular time intervals and performing FFT (Fast Fourier Transform) on each time window to output a power spectrum; and
a procedure of outputting a feature vector for each frequency by comparing the power level for each frequency of the output power spectrum with thresholds set in advance in a plurality of stages.
The mutual authentication program according to claim 22, wherein the procedure of outputting the feature vector comprises a procedure of setting a plurality of the thresholds for a preset level as one group, providing a plurality of such groups, and outputting the feature vector for each group.
The mutual authentication program according to claim 23, wherein the procedure for comparing the feature vectors comprises a procedure of comparing, between the terminal devices, the plurality of feature vectors generated in the same time window and, when even one of the feature vectors matches between the terminal devices, determining that the feature vectors of the terminal devices match.
The mutual authentication program according to claim 24, wherein the procedure for generating and transmitting the authentication key comprises a procedure of calculating a total value of the information amounts per unit time of the feature vectors that match in the target time range of the time-series data and, only when the calculated total value is equal to or greater than a predetermined value, concatenating the matched feature vectors and generating a hash value of the concatenated feature vectors as the authentication key.