WO2017217288A1 - Electronic apparatus, head-mounted display, processing method of electronic apparatus, and program therefor - Google Patents

Electronic apparatus, head-mounted display, processing method of electronic apparatus, and program therefor

Info

Publication number
WO2017217288A1
WO2017217288A1 (PCT/JP2017/021067)
Authority
WO
WIPO (PCT)
Prior art keywords
character
string
voice
number string
password
Prior art date
Application number
PCT/JP2017/021067
Other languages
French (fr)
Japanese (ja)
Inventor
軌行 石井
Original Assignee
Konica Minolta, Inc. (コニカミノルタ株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Konica Minolta, Inc.
Publication of WO2017217288A1 publication Critical patent/WO2017217288A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F 21/31 User authentication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F 21/31 User authentication
    • G06F 21/36 User authentication by graphic or iconic representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/16 Sound input; Sound output
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification
    • G10L 17/22 Interactive procedures; Man-machine interfaces
    • G10L 17/24 Interactive procedures; Man-machine interfaces the user being prompted to utter a password or a predefined phrase
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 1/00 Substation equipment, e.g. for use by subscribers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/64 Constructional details of receivers, e.g. cabinets or dust covers

Definitions

  • the present invention relates to an electronic device having a voice recognition function, a head mounted display, a processing method of the electronic device, and a program thereof.
  • a typical mobile terminal has a touch-panel screen that serves as both an image display and a user interface; by touching the screen, the user can perform the inputs needed to display a desired image or enter information.
  • biometric authentication is performed in which a biometric pattern such as a user's fingerprint, voiceprint, vein, retina, etc. is read for authentication.
  • biometric authentication requires a sensor for reading a biometric pattern and dedicated software for performing pattern matching processing, resulting in a problem that the system becomes complicated and costs increase.
  • when a plurality of people share a terminal, the biometric patterns of all of them must be stored, which is inconvenient.
  • some portable terminals have a voice recognition function to enable hands-free operation. It is therefore conceivable to unlock the mobile terminal without button operations by using the voice recognition function.
  • in voice recognition, however, the words spoken by the user must first be collected with a microphone before they can be converted into character strings; if a third party nearby overhears the utterance, there is a risk that the password will become known.
  • the same problem also occurs when it is desired to input information to be concealed, such as a telephone number, into a portable terminal using a voice recognition function.
  • in Patent Document 1, a keyword randomly selected from a plurality of pre-registered pairs of keywords and passwords is displayed on the display unit.
  • Patent Document 1 mentions a “correct answer rate”, which is evidence that users are expected to make mistakes, so there is concern about its usability.
  • the present invention has been made in view of the above circumstances, and an object thereof is to provide an electronic device, a head-mounted display, a processing method of an electronic device, and a program therefor that can suppress leakage of information to a third party who overhears the user's utterance while still using a voice recognition function.
  • an electronic device reflecting one aspect of the present invention comprises:
  • a storage device for storing patterns;
  • a display device for displaying a plurality of images;
  • a voice recognition device that acquires voice spoken by a user according to the image and converts the voice into a corresponding character / number string;
  • a processing device that permits a predetermined function when, upon connecting the plurality of images displayed on the display device in the order of the character/number string converted by the voice recognition device, the resulting locus matches the pattern stored in the storage device.
  • a storage device for storing a password consisting of a character / number string
  • a display device that displays at least letters / numbers constituting the letter / number string of the password in association with a plurality of images according to a predetermined relationship
  • a voice recognition device that acquires a voice spoken by a user according to the image associated with the letters / numbers and converts the voice into a first letter / number string
  • a conversion device for converting the password into a second character / number string in accordance with the predetermined relationship
  • a processing device that permits a predetermined function when the first character/number string converted by the voice recognition device matches the second character/number string converted by the conversion device.
  • a storage device for storing a password consisting of a character / number string
  • a display device that displays at least letters / numbers constituting the letter / number string of the password in association with a plurality of images according to a predetermined relationship
  • a voice recognition device that acquires a voice spoken by a user according to the image associated with the letters / numbers and converts the voice into a third letter / number string
  • a processing device that converts the third character/number string into a fourth character/number string in accordance with the predetermined relationship and permits a predetermined function when the fourth character/number string matches the password.
  • still another electronic device reflecting one aspect of the present invention comprises: a display device that displays the letters/numbers constituting a prescribed character/number string in association with a plurality of images according to a predetermined relationship; a voice recognition device that acquires the voice spoken by the user according to the image associated with the letters/numbers and converts it into a first character/number string; a conversion device that converts the first character/number string converted by the voice recognition device into a second character/number string in accordance with the predetermined relationship; and an input device that inputs the second character/number string as the prescribed character/number string.
  • still another electronic device reflecting one aspect of the present invention comprises: a display device for displaying a plurality of images; a microphone; a voice recognition device that analyzes the voice acquired through the microphone and recognizes the character/number string represented by the voice; and a processing unit that identifies the image corresponding to the character/number string among the plurality of images and performs processing based on the identified image.
  • a processing method of an electronic device reflecting one aspect of the present invention comprises: storing a pattern; displaying a plurality of images; acquiring the voice spoken by the user according to the images and converting it into the corresponding character/number string; and performing predetermined authentication when, upon connecting the plurality of displayed images in the order of the converted character/number string, the resulting locus coincides with the stored pattern.
  • another electronic device processing method comprises: storing a password consisting of a character/number string; displaying at least the letters/numbers constituting the password's character/number string in association with a plurality of images according to a predetermined relationship; acquiring the voice spoken by the user according to the image associated with the letters/numbers and converting it into a first character/number string; converting the password into a second character/number string according to the predetermined relationship; and performing predetermined authentication when the first character/number string matches the second character/number string.
  • still another electronic device processing method reflecting one aspect of the present invention comprises: storing a password consisting of a character/number string; displaying at least the letters/numbers constituting the password's character/number string in association with a plurality of images according to a predetermined relationship; acquiring the voice spoken by the user according to the image associated with the letters/numbers and converting it into a third character/number string; and converting the third character/number string into a fourth character/number string according to the predetermined relationship and performing predetermined authentication when the fourth character/number string matches the password.
  • still another electronic device processing method reflecting one aspect of the present invention comprises: displaying the letters/numbers constituting a prescribed character/number string in association with a plurality of images according to a predetermined relationship; acquiring the voice spoken by the user according to the image associated with the letters/numbers and converting it into a first character/number string; converting the first character/number string into a second character/number string in accordance with the predetermined relationship; and inputting the second character/number string as the prescribed character/number string.
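  • The input method above can be illustrated with a minimal Python sketch. It is not from the patent: the mapping `COLOR_TO_DIGIT`, the color words, and the function name are all hypothetical, and a one-to-one correspondence is assumed so that the conversion back to digits is unambiguous.

```python
# Assumed one-to-one digit-to-color table shown on screen for this
# session (regenerated each time); the user speaks color words and the
# device converts them back into the digit string to be input.
COLOR_TO_DIGIT = {
    "ao": "1", "aka": "2", "ki": "3", "midori": "4", "murasaki": "5",
    "shiro": "6", "kuro": "7", "momo": "8", "cha": "9", "hai": "0",
}

def colors_to_digits(spoken_colors, color_to_digit):
    """Convert recognized color words (the first string) into the digit
    string the user actually intends to input (the second string)."""
    return "".join(color_to_digit[c] for c in spoken_colors)
```

A bystander hears only color words, while the device recovers, e.g., `colors_to_digits(["ao", "aka", "ki"], COLOR_TO_DIGIT)` as the digits "123"; because input (unlike authentication) must be unambiguous, the displayed correspondence has to be one-to-one.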
  • according to the present invention, it is possible to provide an electronic device, a head-mounted display, a processing method of an electronic device, and a program therefor that can prevent information from leaking to a third party who overhears the user's utterance while using a voice recognition function.
  • FIG. 1 is a front view showing a head-mounted display (hereinafter, HMD) 100, an electronic device according to the present embodiment, as worn by a user.
  • FIG. 2 is a schematic cross-sectional view showing the configuration of the display unit 104.
  • FIG. 3 is a block diagram of the HMD 100 according to the present embodiment.
  • FIG. 4 is a diagram showing the pattern PT stored in the authentication code storage unit 114.
  • FIG. 5 is a diagram showing the numeric string NA generated by the processing unit 112.
  • FIG. 6 is a diagram showing a message displayed to confirm whether speech recognition was performed appropriately.
  • FIG. 7 is a diagram showing an example of the character string CA generated by the processing unit 112.
  • FIG. 8 is a diagram showing an arrangement GA of vegetable and fruit images generated by the processing unit 112.
  • FIG. 9(a) is a diagram showing the numeric string (here, “4, 9, 1, 3, 5”) stored in the authentication code storage unit 114, and FIG. 9(b) is a diagram showing a combination image NG of numbers and images generated by the processing unit 112, together with the explanation image RG.
  • FIGS. 10(a) to 10(c) are diagrams showing combination images NG of numbers and images generated by the processing unit 112, together with explanation images RG.
  • FIG. 1 is a front view showing the head-mounted display (hereinafter, HMD) 100, an electronic device according to the present embodiment, as worn by a user.
  • the right side and the left side of the HMD 100 refer to the right side and the left side for the user wearing the HMD 100.
  • the frame 101 to be mounted on the head of the user US has two spectacle lenses 102 arranged in front of the user US.
  • a cylindrical main body 103 is fixed on the upper part of the spectacle lens 102 on the right side (which may be on the left side according to the user's dominant eye).
  • the main body 103 is provided with a display unit 104.
  • a display drive control unit 104DR (see FIG. 3 described later) that controls the display of the display unit 104 is disposed in the main body unit 103. If necessary, display units may be arranged in front of both eyes.
  • FIG. 2 is a schematic cross-sectional view showing the configuration of the display unit 104.
  • the display unit 104 as a display device includes an image forming unit 104A and an image display unit 104B.
  • the image forming unit 104A is incorporated in the main body unit 103, and includes a light source 104a, a unidirectional diffuser 104b, a condenser lens 104c, and a display element 104d.
  • the image display unit 104B, which is a so-called see-through display member, is a flat plate disposed so as to extend downward from the main body unit 103, parallel to one eyeglass lens 102 (see FIG. 1), and includes an eyepiece prism 104f, a deflecting prism 104g, and a hologram optical element 104h.
  • the light emitted from the light source 104a is diffused by the unidirectional diffusion plate 104b, condensed by the condenser lens 104c, and enters the display element 104d.
  • the light incident on the display element 104d is modulated for each pixel based on the image data input from the display drive control unit 104DR, and is emitted as image light. As a result, a color image is displayed on the display element 104d.
  • Image light from the display element 104d enters the eyepiece prism 104f from its base end face PL1, is totally reflected a plurality of times by the inner side face PL2 and the outer side face PL3, and enters the hologram optical element 104h.
  • the light incident on the hologram optical element 104h is reflected there, passes through the inner side surface PL2, and reaches the pupil B.
  • the user can observe an enlarged virtual image of the image displayed on the display element 104d, and can visually recognize it as a screen formed on the image display unit 104B.
  • the hologram optical element 104h constitutes a screen, or it can be considered that a screen is formed on the inner surface PL2.
  • “screen” may refer to an image to be displayed.
  • the eyepiece prism 104f, the deflecting prism 104g, and the hologram optical element 104h transmit almost all of the external light, the user can observe an external field image (real image) through them. Therefore, the virtual image of the image displayed on the display element 104d is observed so as to overlap with a part of the external image. In this manner, the user of the HMD 100 can simultaneously observe the image provided from the display element 104d and the external image via the hologram optical element 104h. Note that when the display unit 104 is in the non-display state, the image display unit 104B is transparent, and only the external image can be observed.
  • in the present embodiment, the display unit is configured by combining a light source, a liquid crystal display element, and an optical system; however, instead of the light source and the liquid crystal display element, a self-luminous display element (for example, an organic EL display element) may be used, and a transmissive organic EL display panel that is transparent in the non-light-emitting state may also be used.
  • FIG. 3 is a block diagram of the HMD 100 according to the present embodiment, which is shown together with the user US.
  • the HMD 100 includes the above-described display unit 104, a microphone 105 that collects the voice spoken by the user US and converts it into a signal, a voice processing unit 106 that processes the signal output from the microphone 105 and outputs it as a voice signal, and a control unit 110 that receives the voice signal output from the voice processing unit 106.
  • the control unit 110 includes a voice recognition unit 113 that receives the voice signal output from the voice processing unit 106, recognizes the character/number string represented by the voice, and converts it into the corresponding characters/numbers; a processing unit (processing device) 112 that processes the characters/numbers output from the voice recognition unit 113; a display drive control unit 104DR that receives signals from the processing unit 112 and drives and controls the display unit 104; and an authentication code storage unit (storage device) 114 that stores an authentication code (here, a pattern or a password).
  • the microphone 105, the voice processing unit 106, and the voice recognition unit 113 constitute a voice recognition device.
  • in this specification, “letters/numbers” means at least one of letters and numbers, and a “letter/number string” means an arrangement of a plurality of letters and/or numbers (including strings consisting only of letters or only of numbers). An “image” includes letters/numbers, and a “word” includes a single character.
  • FIG. 4 is a diagram showing a pattern PT stored as authentication code information in the authentication code storage unit 114.
  • the pattern PT is formed by connecting a horizontal line from left to right, a vertical line from top to bottom, and a horizontal line from left to right in this order.
  • the pattern PT is assumed to be stored in advance by the user US.
  • here, the “pattern” refers to a traced geometric shape, for example one drawn with a single stroke.
  • FIG. 5 is a diagram schematically showing a numeric string NA generated by the processing unit 112.
  • the arrow AR is drawn for ease of understanding and is not actually displayed.
  • the numeric string NA consists, for example, of random numbers generated by the processing unit 112 and assigned to 3 rows and 3 columns.
  • the processing unit 112 transmits information on the numeric string NA to the display drive control unit 104DR. The display drive control unit 104DR then converts the information of the numeric string NA into an image signal and transmits it to the display unit 104, so that the display unit 104 can display the numeric string NA shown in FIG. 5.
  • the user US does not need to remember the arrangement order of individual elements to be displayed such as character strings, numeric strings, and images in advance. It is only necessary to utter the elements displayed along the pattern PT in order, which can be said to be highly convenient. In order to improve confidentiality, it is preferable to display individual elements at random. Elements to be displayed may be numbers, letters (alphabet, hiragana, katakana, kanji ...), colors, and patterns. These are collectively called images. Only one type (for example, only hiragana) may be displayed, or a plurality of types (such as katakana and numbers) may be mixed.
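  • The random arrangement of elements described above can be sketched as follows. This is an illustrative Python fragment, not from the patent; the function name and the choice of distinct digits 1 to 9 for a 3×3 grid are assumptions.

```python
import random

def generate_grid(size=3):
    """Randomly arrange the distinct digits 1..size*size into a
    size x size grid, as the processing unit might do when generating
    the displayed numeric string."""
    digits = random.sample(range(1, size * size + 1), size * size)
    return [digits[r * size:(r + 1) * size] for r in range(size)]
```

Because the arrangement is regenerated at random for every unlock attempt, the sequence of digits a bystander overhears is useless on the next attempt, which is the point made above about confidentiality.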
  • for picture elements, the reading (name) of each picture is registered in advance in the processing unit 112, and the user US also remembers each picture and its registered reading. A plurality of readings may be registered for one picture; however, it is desirable to avoid using multiple pictures that share the same reading (for example, hashi can mean both “bridge” and “chopsticks”).
  • at the time of authentication, the control unit 110 causes the display unit 104, via the display drive control unit 104DR, to display a request (not illustrated) to input an authentication code, and the numeric string NA shown in FIG. 5 is displayed.
  • since the user US remembers the pattern PT as the authentication code, when the stored pattern PT is applied to the numeric string NA displayed as shown in FIG. 5, the user can see that reading the numbers “5, 3, 6, 2, 9” in the order indicated by the arrow AR reproduces the pattern PT. Therefore, when the user US utters the numbers in the form “go, san, roku, ni, kyu”, the microphone 105 collects the sound, and the voice recognition unit 113, via the voice processing unit 106, converts it into the numeric string “5, 3, 6, 2, 9” and transmits it to the processing unit 112 as numeric string information.
  • the processing unit 112 may cause the display unit 104, via the display drive control unit 104DR, to show a display such as that in FIG. 6 to confirm whether the input number string is as the user US intended. If the user US speaks “No” while the display of FIG. 6 is shown, a voice signal is input from the microphone 105 via the voice processing unit 106 to the processing unit 112, and the processing unit 112 determines that the displayed number string is inappropriate and requests the user US to speak again. Otherwise, the processing unit 112 determines that the displayed number string is appropriate and performs the subsequent processing.
  • the processing unit 112, having received the number string information from the voice recognition unit 113, reads the pattern PT stored in the authentication code storage unit 114 and applies it to the numeric string NA that it holds. More specifically, when the numbers of the numeric string NA are connected in the order of the numeric string (“5, 3, 6, 2, 9”) converted by the speech recognition unit 113 and the resulting locus coincides with the pattern PT, the processing unit 112 determines that the authentication codes match and releases the screen lock of the HMD 100. After releasing the screen lock, the pattern PT stored in the authentication code storage unit 114 may be updated through similar authentication. On the other hand, if the locus does not match the pattern PT, the processing unit 112 determines that the authentication codes do not match and continues the screen lock of the HMD 100; at this time, input of a new authentication code may be requested.
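  • The trajectory check described above can be sketched in Python. Everything here is illustrative: the grid layout, the 5-cell pattern (top row left to right, then the right column top to bottom), and the function names are assumptions chosen so that reading “5, 3, 6, 2, 9” reproduces the pattern, not the layouts of the patent's figures.

```python
def trace_positions(grid, spoken_digits):
    """Find the (row, col) cell of each recognized digit in the
    displayed grid."""
    index = {value: (r, c)
             for r, row in enumerate(grid)
             for c, value in enumerate(row)}
    return [index[d] for d in spoken_digits]

def authenticate(grid, spoken_digits, stored_pattern):
    """Unlock only if connecting the spoken digits in order traces
    exactly the stored pattern of cells."""
    try:
        return trace_positions(grid, spoken_digits) == stored_pattern
    except KeyError:  # a spoken digit is not on the screen
        return False

# Hypothetical layout in which reading "5, 3, 6, 2, 9" traces the top
# row left-to-right and then the right column top-to-bottom:
GRID = [[5, 3, 6],
        [8, 1, 2],
        [4, 7, 9]]
PATTERN = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]
```

Note that the comparison is on cell positions, not on the digits themselves: the same stored pattern yields a different spoken sequence for every random grid.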
  • FIG. 7 is a diagram illustrating an example of a character string CA generated by the processing unit 112 instead of a numeric string.
  • the arrow AR is drawn for ease of understanding and is not actually displayed.
  • the user US stores the same pattern PT as the authentication code. When the stored pattern PT is applied to the character string CA displayed as shown in FIG. 7, reading the letters as “C, H, G, D, E” reproduces the pattern PT. Therefore, when the user US utters the alphabet in the form “shī, eichi, jī, dī, ī”, the microphone 105 collects the sound, a voice signal is input to the voice recognition unit 113 via the voice processing unit 106, converted into the character string “C, H, G, D, E”, and transmitted to the processing unit 112 as character string information.
  • thereafter, the processing unit 112 reads the pattern PT stored in the authentication code storage unit 114 and applies it to the character string CA that it holds to determine whether or not the authentication codes match.
  • Other configurations are the same as those in the above-described embodiment.
  • FIG. 8 is a diagram showing an arrangement GA of vegetable or fruit patterns generated by the processing unit 112 instead of a numeric string or a character string.
  • the arrow AR is drawn for ease of understanding and is not actually displayed.
  • the processing unit 112 registers the names of the images in advance in association with the individual images. The arrangement of the images is random.
  • the user US stores the same pattern PT as the authentication code. When the stored pattern PT is applied to the image arrangement GA displayed as shown in FIG. 8, connecting the images “tomato, corn, mandarin orange, persimmon, green pepper” reproduces the pattern PT. Therefore, when the user US utters the picture names in the form “tomato, corn, mikan, kaki, piman”, the microphone 105 collects the sound, and the voice recognition unit 113, via the voice processing unit 106, converts the words into the character string “tomato, corn, mikan, kaki, piman” and transmits it to the processing unit 112 as character information. In the utterance, it is preferable to leave a certain silent interval between words, since this makes it easier to convert each word appropriately during speech recognition.
  • the processing unit 112 selects the tomato image TO when the character information “tomato” matches the name of the displayed tomato image, selects the corn image CR when “corn” matches the name of the displayed corn image, selects the mandarin orange image when “mikan” matches the name of the displayed mandarin orange image, selects the persimmon image PR when “kaki” matches the name of the displayed persimmon image, and selects the green pepper image when “piman” matches the name of the displayed green pepper image, and then performs processing based on the selected images. In other words, the processing unit 112 identifies and selects the images corresponding to the character/number string from among the plurality of displayed images and performs processing based on the selected images.
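  • The name-to-image lookup above can be sketched as follows. The Python fragment is illustrative: the labels TO, CR, and PR appear in the description, while MI, PP, the romanized readings, and the function name are assumptions.

```python
# Hypothetical registration of picture readings to displayed images.
IMAGE_NAMES = {
    "tomato": "TO",   # tomato image (label from the description)
    "corn":   "CR",   # corn image (label from the description)
    "mikan":  "MI",   # mandarin orange image (assumed label)
    "kaki":   "PR",   # persimmon image (label from the description)
    "piman":  "PP",   # green pepper image (assumed label)
}

def select_images(recognized_words, image_names):
    """Identify the displayed image whose registered reading matches
    each recognized word; reject the whole utterance if any word has
    no matching image on screen."""
    selected = []
    for word in recognized_words:
        if word not in image_names:
            return None
        selected.append(image_names[word])
    return selected
```

Rejecting the utterance as soon as one word has no on-screen match mirrors the requirement that every spoken element correspond to a displayed image before the trajectory comparison proceeds.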
  • FIG. 9(a) is a diagram showing the number string (here, “4, 9, 1, 3, 5”) stored as authentication code information (a password consisting of a character/number string) in the authentication code storage unit 114.
  • FIG. 9B is a diagram showing a combination image NG of numbers and images generated by the processing unit 112 together with the explanation image RG.
  • the combined image NG shown in FIG. 9(b) is an image in which colored blocks are arranged in 3 rows and 3 columns, each block being associated with the number arranged at its center.
  • each color is represented by hatching or vertical and horizontal lines as shown in the adjacent explanatory image RG.
  • the numbers always include those constituting the number string stored in the authentication code storage unit 114.
  • the combination of numbers and colors is random.
  • the combination of each color and the corresponding number constitutes a predetermined relationship.
  • the explanation image RG is added to indicate the color corresponding to each block and is not actually displayed.
  • the processing unit 112 transmits information on the generated combination image NG to the display drive control unit 104DR. The display drive control unit 104DR then converts the information of the combination image NG into an image signal and transmits it to the display unit 104, so that the display unit 104 can display the combination image NG (excluding the explanation image RG) shown in FIG. 9(b).
  • at the time of authentication, the control unit 110 causes the display unit 104, via the display drive control unit 104DR, to display a request to input an authentication code, and the combined image NG shown in FIG. 9(b) is displayed.
  • since the user US remembers the numeric string (“4, 9, 1, 3, 5”) as the authentication code, when viewing the combined image NG as shown in FIG. 9(b), the user can see that the color of the block corresponding to the number “4” is yellow, the block for “9” is green, the block for “1” is blue, the block for “3” is yellow, and the block for “5” is blue. Therefore, when the user US utters the color words “ki, midori, ao, ki, ao” (yellow, green, blue, yellow, blue), the microphone 105 collects the sound, a voice signal is input to the voice recognition unit 113 via the voice processing unit 106, converted into the character string “ki, midori, ao, ki, ao” (referred to as the first character/number string), and transmitted to the processing unit 112 as character string information.
  • thereafter, the processing unit 112, which also serves as the conversion device, reads the numeric string (“4, 9, 1, 3, 5”) stored in the authentication code storage unit 114, applies it to the combined image NG that it generated, picks up the characters “ki, midori, ao, ki, ao” from the colors of the corresponding blocks, and converts the numeric string into a character string arranged in this order (referred to as the second character/number string). When the first character/number string and the second character/number string match, the processing unit 112 determines that the authentication codes match and releases the screen lock of the HMD 100.
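  • The conversion-and-compare step above can be sketched in Python. The fragment is illustrative: the mappings for digits 4, 9, 1, 3, and 5 follow the FIG. 9(b) example in the text, while the colors assigned to the remaining digits and the function names are assumptions.

```python
def password_to_colors(password_digits, digit_to_color):
    """Convert the stored password (digits) into the color-word string
    implied by the randomly generated display (the second string)."""
    return [digit_to_color[d] for d in password_digits]

def verify_color_password(spoken_colors, password_digits, digit_to_color):
    """Compare the recognized color words (first string) against the
    converted password (second string)."""
    return spoken_colors == password_to_colors(password_digits, digit_to_color)

# Session mapping matching the FIG. 9(b) example for the password
# digits; the colors for 2, 6, 7, and 8 are assumed fillers.
DIGIT_TO_COLOR = {4: "ki", 9: "midori", 1: "ao", 3: "ki", 5: "ao",
                  2: "aka", 6: "aka", 7: "midori", 8: "murasaki"}
```

Since several digits share a color and the mapping changes per session, the overheard color words “ki, midori, ao, ki, ao” do not reveal the underlying password “4, 9, 1, 3, 5”.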
  • on the other hand, if the first and second character/number strings do not match, the processing unit 112 determines that the authentication codes do not match and continues the screen lock of the HMD 100; at this time, input of a new authentication code may be requested.
  • Other configurations are the same as those in the above-described embodiment.
  • FIGS. 10(a) to 10(c) are diagrams showing combination images NG of numbers and images generated by the processing unit 112, together with explanation images RG.
  • the processing unit 112 displays the combined image NG illustrated in FIG. 10A on the display unit 104 via the display drive control unit 104DR.
  • the combination of numbers and colors is random.
  • the user US Since the user US stores a numeric string (“1, 2, 3, 4”) as an update code, when viewing the combined image NG as shown in FIG.
  • the color of the block corresponding to is blue, the color of the block corresponding to the number “2” is red, the color of the block corresponding to the number “3” is yellow, and the block corresponding to the number “4” It can be seen that the color is yellow. Therefore, when the user US utters a word of the color “Ao / Aka / Ki / Ki”, the microphone 105 collects the sound, and a voice signal is input to the voice recognition unit 113 via the voice processing unit 106. It is converted into a character string (referred to as a first character / number string) “Ao, Aka, Ki, Ki” and transmitted to the processing unit 112 as character string information.
  • The processing unit 112 reads the numeric string (“1, 2, 3, 4”) stored as the update code in the authentication code storage unit 114 and applies it to the combined image NG that it generated (FIG. 10(a)), picking up the color words “Ao, Aka, Ki, Ki” from the colors of the corresponding images and converting the numeric string into a character string (the second character/number string) arranged in this order. When the first character/number string and the second character/number string match, the processing unit 112 determines that the update codes match and permits the password update. Having permitted the update, the processing unit 112 generates a combined image NG in which the correspondence between numbers and colors is changed, as shown in FIG. 10(b), and displays it on the display unit 104 via the display drive control unit 104DR.
  • In FIG. 10(b), the color of the block corresponding to the number “9” is yellow, the color of the block corresponding to “8” is purple, the color of the block corresponding to “5” is green, and the color of the block corresponding to “6” is red.
  • When the user US utters the color words “Ki, Murasaki, Midori, Aka”, the microphone 105 collects the sound, a voice signal is input to the voice recognition unit 113 via the voice processing unit 106, the signal is converted into the update character string “Ki, Murasaki, Midori, Aka”, and the result is transmitted to the processing unit 112 as update character string information.
  • The processing unit 112 collates the update character string with the combined image NG shown in FIG. 10(b) and determines that the character “Ki” indicates yellow, so the corresponding numbers are “7, 9”; that “Murasaki” indicates purple, so the corresponding numbers are “4, 8”; that “Midori” indicates green, so the corresponding number is “5”; and that “Aka” indicates red, so the corresponding numbers are “1, 6”.
  • Accordingly, the processing unit 112 obtains a plurality of candidates for the password that the user US wishes to set: “7, 4, 5, 1”, “9, 4, 5, 1”, “7, 8, 5, 1”, and so on — eight numeric strings in total (2 × 2 × 1 × 2 combinations). The processing unit 112 stores these eight numeric strings as password candidates. As described above, the processing unit 112 identifies the numeric strings corresponding to the characters in a character string and performs processing based on the identified numeric strings.
  • Next, the processing unit 112 newly generates a combined image NG in which the correspondence between numbers and colors is changed again, as illustrated in FIG. 10(c), and displays it on the display unit 104 via the display drive control unit 104DR.
  • In FIG. 10(c), the user can see that the color of the block corresponding to the number “9” is blue, the color of the block corresponding to “8” is green, the color of the block corresponding to “5” is yellow, and the color of the block corresponding to “6” is green.
  • When the user US utters the color words “Ao, Midori, Ki, Midori”, the microphone 105 collects the sound, a voice signal is input to the voice recognition unit 113 via the voice processing unit 106, the signal is converted into the update character string “Ao, Midori, Ki, Midori”, and the result is transmitted to the processing unit 112 as update character string information.
  • The processing unit 112 collates the update character string with the combined image NG shown in FIG. 10(c) and determines that the character “Ao” indicates blue, so the corresponding numbers are “2, 9”; that “Midori” indicates green, so the corresponding numbers are “6, 8”; and that “Ki” indicates yellow, so the corresponding numbers are “1, 5”.
  • The only numeric string among the eight stored candidates that is also consistent with this second utterance is “9, 8, 5, 6”, so the processing unit 112 adopts it as the new password and updates the password in the authentication code storage unit 114. If a plurality of password candidates still remain after the second round, the processing unit 112 may display a further new combined image and obtain another utterance from the user US.
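The two-round narrowing described above — enumerate every digit string consistent with an ambiguous color utterance, then intersect with the next round's candidate set — can be sketched as follows (an illustration only; the function names are not from the patent):

```python
from itertools import product

def candidates_from_utterance(spoken_colors, color_to_digits):
    """All digit strings consistent with the spoken color words (Cartesian product)."""
    pools = [color_to_digits[c] for c in spoken_colors]
    return {"".join(p) for p in product(*pools)}

# Round 1 (FIG. 10(b)): yellow -> 7,9;  purple -> 4,8;  green -> 5;  red -> 1,6
round1 = candidates_from_utterance(
    ["ki", "murasaki", "midori", "aka"],
    {"ki": "79", "murasaki": "48", "midori": "5", "aka": "16"})

# Round 2 (FIG. 10(c)): blue -> 2,9;  green -> 6,8;  yellow -> 1,5
round2 = candidates_from_utterance(
    ["ao", "midori", "ki", "midori"],
    {"ao": "29", "midori": "68", "ki": "15"})

# The intended new password is the unique string consistent with both rounds.
print(len(round1), round1 & round2)  # 8 {'9856'}
```

With a second, independently randomized mapping, the intersection of the two candidate sets collapses to the single intended password, matching the “9, 8, 5, 6” result in the text.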
  • In this embodiment, the number of colors used for the combined image is limited to five so that they can be easily distinguished, so the user US has to speak more than once to set a password. If the images and the digits were matched one-to-one, a single utterance would suffice. Alternatively, by using characters or patterns as the combined image instead of color types, as in the following embodiments, the images and the digits can be matched one-to-one, and likewise a single utterance is sufficient.
  • FIG. 11 is a diagram showing a correspondence table CT in which characters, instead of colored blocks, are arranged in association with a predetermined relationship, corresponding to the numbers constituting the number string stored in the authentication code storage unit 114.
  • The correspondence table CT of FIG. 11 is displayed on the display unit 104.
  • Since the user US remembers the numeric string (“4, 9, 1, 3, 5”) as the authentication code, when viewing the correspondence table CT shown in FIG. 11, the user can see that the character corresponding to the number “4” is “ko”, the character corresponding to “9” is “ta”, the character corresponding to “1” is “no”, the character corresponding to “3” is “ka”, and the character corresponding to “5” is “ma”. Therefore, when the user US utters the hiragana “ko, ta, no, ka, ma”, the microphone 105 collects the sound, a voice signal is input to the voice recognition unit 113 via the voice processing unit 106, the signal is converted into the character string “ko, ta, no, ka, ma” (referred to as the first character/number string), and the result is transmitted to the processing unit 112 as character string information.
  • The processing unit 112 reads the numeric string (“4, 9, 1, 3, 5”) stored in the authentication code storage unit 114 and collates it with the correspondence table CT (FIG. 11) that it generated, picking up the characters corresponding to the numbers and converting the numeric string into a character string arranged in this order (referred to as the second character/number string). When the first character/number string and the second character/number string match, the processing unit 112 determines that the authentication codes match and releases the screen lock of the HMD 100.
  • Other configurations are the same as those in the above-described embodiment.
  • As another processing method, when the user US utters the hiragana “ko, ta, no, ka, ma”, the microphone 105 collects the sound, a voice signal is input to the voice recognition unit 113 via the voice processing unit 106, the signal is converted into a character string (referred to as the third character/number string), and the result is transmitted to the processing unit 112 as character string information.
  • The processing unit 112 collates the character string “ko, ta, no, ka, ma” with the correspondence table CT (FIG. 11) that it generated, converting each hiragana character into its corresponding number and obtaining the numeric string “4, 9, 1, 3, 5” arranged in this order (referred to as the fourth character/number string). When the fourth character/number string and the password match, the processing unit 112 determines that the authentication codes match and releases the screen lock of the HMD 100. Other configurations are the same as those in the above-described embodiment.
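The reverse-direction method just described — map the spoken characters back to digits through the table and compare with the stored password — can be sketched as follows (the table slice shown is the portion of FIG. 11's mapping that the text itself establishes; the function name is an illustrative assumption):

```python
def to_digits(spoken_chars, char_to_digit):
    """Fourth character/number string: spoken characters mapped to digits via the table."""
    return "".join(char_to_digit[c] for c in spoken_chars)

# Slice of the correspondence table CT (FIG. 11) given in the text:
table = {"ko": "4", "ta": "9", "no": "1", "ka": "3", "ma": "5"}

password = "49135"
spoken = ["ko", "ta", "no", "ka", "ma"]
print(to_digits(spoken, table) == password)  # True
```

Either direction of conversion (password to characters, or characters to digits) yields the same accept/reject decision; the choice only affects which side of the comparison is transformed.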
  • FIG. 12 shows a correspondence table CT in which patterns of vegetables or fruits, instead of colored blocks or characters, are arranged in association with a predetermined relationship, corresponding to the numbers constituting the number string stored in the authentication code storage unit 114.
  • In FIG. 12, the names of the vegetables or fruits are shown in association with the displayed patterns, but this is not always necessary.
  • The correspondence table CT shown in FIG. 12 is displayed on the display unit 104.
  • Since the user US remembers the numeric string (“4, 9, 1, 3, 5”) as the authentication code, when viewing the correspondence table CT shown in FIG. 12, the user can see that the pattern corresponding to the number “4” is a shiitake mushroom, the pattern corresponding to “9” is a green pepper, the pattern corresponding to “1” is a tomato, the pattern corresponding to “3” is a cherry, and the pattern corresponding to “5” is a mandarin orange. Therefore, when the user US utters the words “shiitake, piiman, tomato, sakuranbo, mikan”, the microphone 105 collects the sound, a voice signal is input to the voice recognition unit 113 via the voice processing unit 106, the signal is converted into the character string “shiitake, piiman, tomato, sakuranbo, mikan” (referred to as the first character/number string), and the result is transmitted to the processing unit 112 as character string information.
  • The processing unit 112 reads the numeric string (“4, 9, 1, 3, 5”) stored in the authentication code storage unit 114 and collates it with the correspondence table CT (FIG. 12) that it generated, picking up the words “shiitake, piiman, tomato, sakuranbo, mikan” corresponding to the numbers and converting the numeric string into a character string arranged in this order (the second character/number string). When the first character/number string and the second character/number string match, the processing unit 112 determines that the authentication codes match and releases the screen lock of the HMD 100. Other configurations are the same as those in the above-described embodiment.
  • As another processing method, the processing unit 112 collates the character string “shiitake, piiman, tomato, sakuranbo, mikan” with the correspondence table CT (FIG. 12) that it generated, converting each word into the number corresponding to its pattern and obtaining the numeric string “4, 9, 1, 3, 5” arranged in this order (referred to as the fourth character/number string). When the fourth character/number string and the password match, the processing unit 112 determines that the authentication codes match and releases the screen lock of the HMD 100. Other configurations are the same as those in the above-described embodiment.
  • FIG. 13 is a flowchart showing a control operation (except for step S102) of the processing unit 112 according to the above-described embodiment.
  • First, the processing unit 112 displays, as an image, a numeric string NA (FIG. 5), a character string CA (FIG. 7), an image arrangement GA (FIG. 8), a combined image NG (FIG. 9(b)), or a correspondence table CT (FIGS. 11 and 12).
  • When the user US who has seen the displayed image speaks in response (step S102), the voice recognition unit 113 recognizes the character/number string represented by the voice (step S103), and the processing unit 112 displays the voice recognition result in step S104 (see FIG. 6).
  • If the voice recognition result is inappropriate (NO in step S105), the flow returns to step S102 and the same processing is repeated.
  • If the voice recognition result is appropriate (YES in step S105), the processing unit 112 reads the authentication code in step S106 and collates the voice recognition result with the authentication code in step S107 to determine whether the two match.
  • If the two do not match, the processing unit 112 displays a message such as “authentication failed” on the display unit 104 in step S109 and continues the screen lock.
  • If the two match, the processing unit 112 releases the screen lock in step S108 because the authentication has succeeded.
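The control flow of FIG. 13 can be summarized as a sketch. The callables passed in are hypothetical stand-ins for the display unit, microphone, voice recognition unit, and storage unit described above, not an API defined by the patent:

```python
def unlock_flow(display, listen, recognize, confirm, read_code, convert):
    """Sketch of the FIG. 13 loop; each parameter is a hypothetical helper callable."""
    display()                               # S101: show numeric string / table / combined image
    while True:
        text = recognize(listen())          # S102-S103: collect the utterance, recognize it
        if confirm(text):                   # S104-S105: show the result; retry if inappropriate
            break
    expected = convert(read_code())         # S106: read the authentication code and convert it
    if text == expected:                    # S107: collate recognition result and code
        return "unlocked"                   # S108: authentication succeeded, release lock
    return "authentication failed"          # S109: keep the screen locked
```

A trial run with stub callables shows both branches: a matching utterance returns "unlocked", anything else returns "authentication failed".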
  • While biometric authentication can identify an individual user, it has the demerit that biometric information used as a release code is difficult to share among multiple people. Sharing is possible by registering biometric information for each person, but registration takes time, and a single electronic device is often expected to be shared by several people, particularly when electronic devices are used in a factory. In such cases there is a demand for the convenience of sharing a common password. According to the present embodiment, the device can be used with almost the same user burden as conventional password authentication. Moreover, speech recognition is a user interface that allows hands-free input and therefore has a high affinity with an HMD.
  • Since the HMD is mounted on the user's head, the image displayed on the display unit is difficult for others to see, so confidentiality can be kept high. Even if someone overhears and remembers the character string that the user utters, as long as a different image (or a different combination of character/number strings and images) is displayed each time, uttering the memorized character string at a later time will not unlock the electronic device, so strong security can be ensured. In addition, there is no need for the process, performed when entering a password on a conventional electronic device, of displaying the input characters as the hidden characters “******”. If the image displayed on the display unit is not visible to others, the code obtained by reverse conversion may be displayed when showing the result of voice recognition.
  • The HMD 100 of the present embodiment can also be used as an information input means.
  • For example, when the user US wants to place a call to the telephone number “030-1234-5678”, the processing unit 112 generates a correspondence table CT as shown in FIG. 11 in response to a request from the user US and displays it on the display unit 104.
  • Looking at the table, the user US utters the characters corresponding to the telephone number: “ke, ka, ke, no, ru, ka, ko, ma, mi, ni, i”.
  • The processing unit 112 collates the input character string “ke, ka, ke, no, ru, ka, ko, ma, mi, ni, i” with the correspondence table CT (FIG. 11) that it generated and converts it into the number string “0, 3, 0, 1, 2, 3, 4, 5, 6, 7, 8” (the second character/number string). If the processing unit 112 has a telephone function, it can place a call by entering the obtained numeric string as a telephone number; in such a case, the processing unit 112 also serves as an input device. As a result, the user US can place a hands-free call without the other party's telephone number becoming known to bystanders. The converted telephone number may be displayed on the display unit 104 so that the user US can confirm it before the call is placed.
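The input-device use can be sketched with the inverse lookup. The character-to-digit slice below is reconstructed by aligning the example utterance with the number “030-1234-5678” (it is consistent with the FIG. 11 entries quoted earlier, but the full table is not reproduced in the text):

```python
# Hypothetical slice of FIG. 11's table, inverted: spoken hiragana -> digit.
CHAR_TO_DIGIT = {"ke": "0", "ka": "3", "no": "1", "ru": "2", "ko": "4",
                 "ma": "5", "mi": "6", "ni": "7", "i": "8"}

def dial_string(spoken):
    """Convert the recognized utterance into the digit string to be dialed."""
    return "".join(CHAR_TO_DIGIT[s] for s in spoken)

spoken = ["ke", "ka", "ke", "no", "ru", "ka", "ko", "ma", "mi", "ni", "i"]
print(dial_string(spoken))  # 03012345678
```

A bystander hears only the hiragana words; without the currently displayed table, the digit string cannot be recovered from the utterance alone.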
  • The HMD 100 may be used for inputting not only a telephone number but also a My Number (Japanese individual number), a credit card number, and the like.
  • The present invention is not limited to the embodiments described in this specification; it is obvious to those skilled in the art, from the embodiments and technical ideas described herein, that other embodiments and modifications are also included. The description and the embodiments are for illustrative purposes only, and the scope of the present invention is indicated by the following claims.
  • In the above, the present invention has been described by taking an HMD as an example, but the present invention is not limited to the HMD and can be applied to any electronic device, such as a portable terminal. Further, some or all of the functions permitted by the authentication described above may be used.
  • In the above, releasing the screen lock has been described as an example of permitting a predetermined function by password input, but the invention is not limited to this. For example, with a login screen displayed on the display unit 104 when an application is started, the application can be started by entering the password as described above. It is also possible to perform authentication within the application. In such cases, it is desirable to switch to the authentication screen by a hands-free operation using the user's utterance, without requiring an operation such as turning on a switch.

Abstract

The present invention provides an electronic apparatus, a head-mounted display, a processing method of the electronic apparatus, and a program therefor with which it is possible to prevent information from leaking to a third party who hears the utterance of a user. The electronic apparatus has a storage device for storing a pattern, a display device for displaying a plurality of images, a speech recognition device for acquiring speech uttered by a user in response to the images and converting it into a corresponding character/numeric string, and a processing device for enabling a prescribed function when, upon connecting the plurality of images displayed on the display device in the order of the character/numeric string converted by the speech recognition device, the resulting locus matches the pattern stored in the storage device.

Description

Electronic apparatus, head-mounted display, processing method of electronic apparatus, and program therefor
 The present invention relates to an electronic device having a voice recognition function, a head-mounted display, a processing method of the electronic device, and a program therefor.
 In recent years, rapidly developed mobile terminals such as smartphones are often used to assist work in business and at home. A typical mobile terminal has a touch-panel screen that serves both as an image display and as a user interface; by touching this screen, the user can make the necessary inputs to display a desired image, enter information, and so on.
 Meanwhile, it is also common to lock the screen of a mobile terminal in order to prevent use by an unauthorized third party. To use a mobile terminal whose screen is locked, the lock must first be released. However, there are cases where the user wants to unlock the mobile terminal without operating buttons, such as when the user's hands are occupied.
 To release the lock without a button operation, biometric authentication is conceivable, in which a biometric pattern such as the user's fingerprint, voiceprint, veins, or retina is read for authentication. However, such biometric authentication requires a sensor for reading the biometric pattern and dedicated software for the pattern-matching process, so the system becomes complicated and costs increase. In addition, when a single mobile terminal is shared by several people, the biometric patterns of all of them must be stored, which is inconvenient.
 On the other hand, some mobile terminals have a voice recognition function for hands-free operation. One idea is therefore to unlock the mobile terminal without button operations by using the voice recognition function. However, when unlocking by voice recognition, the word the user utters must first be picked up by a microphone in order to be converted into a character string, so there is a risk that bystanders will learn the password the moment the user speaks it. The same problem arises when information to be kept secret, such as a telephone number, is to be entered into a mobile terminal or the like by voice recognition.
 In contrast, Patent Document 1 discloses a technique in which one keyword, randomly selected from a plurality of keyword/password pairs registered in advance, is displayed on a display unit; the user who sees the display utters the password paired with that keyword, and, based on the voice recognition result, the identity of the uttered password with the password paired with the displayed keyword is verified to authenticate the terminal user.
Patent Document 1: JP 2002-312318 A
 However, the technique of Patent Document 1 has the problem that the user must remember the keyword/password pairs, which places a comparatively large burden on the user. Moreover, both the keywords and the passwords must be registered, which is troublesome. If the number of pairs is reduced, the user's burden decreases accordingly, but the risk of the password becoming known increases instead. In particular, Patent Document 1 mentions a "correct answer rate", which is evidence that it presupposes that users will make mistakes; there is therefore a concern about its usability.
 The present invention has been made in view of the above circumstances, and its object is to provide an electronic device, a head-mounted display, a processing method of the electronic device, and a program therefor that can, while using a voice recognition function, suppress leakage of information to a third party who hears the user's utterance.
 In order to achieve at least one of the above objects, an electronic device reflecting one aspect of the present invention comprises:
 a storage device for storing a pattern;
 a display device for displaying a plurality of images;
 a voice recognition device that acquires voice uttered by a user in response to the images and converts it into a corresponding character/number string; and
 a processing device that permits a predetermined function when, upon connecting the plurality of images displayed on the display device in the order of the character/number string converted by the voice recognition device, the resulting locus matches the pattern stored in the storage device.
 In order to achieve at least one of the above objects, another electronic device reflecting one aspect of the present invention comprises:
 a storage device for storing a password consisting of a character/number string;
 a display device that displays at least the characters/numbers constituting the character/number string of the password in association with a plurality of images according to a predetermined relationship;
 a voice recognition device that acquires voice uttered by a user in response to the images associated with the characters/numbers and converts it into a first character/number string;
 a conversion device that converts the password into a second character/number string according to the predetermined relationship; and
 a processing device that permits a predetermined function when the first character/number string converted by the voice recognition device matches the second character/number string converted by the conversion device.
 In order to achieve at least one of the above objects, still another electronic device reflecting one aspect of the present invention comprises:
 a storage device for storing a password consisting of a character/number string;
 a display device that displays at least the characters/numbers constituting the character/number string of the password in association with a plurality of images according to a predetermined relationship;
 a voice recognition device that acquires voice uttered by a user in response to the images associated with the characters/numbers and converts it into a third character/number string; and
 a processing device that converts the third character/number string into a fourth character/number string according to the predetermined relationship and permits a predetermined function when the fourth character/number string matches the password.
 In order to achieve at least one of the above objects, still another electronic device reflecting one aspect of the present invention comprises:
 a display device that displays the characters/numbers constituting a prescribed character/number string in association with a plurality of images according to a predetermined relationship;
 a voice recognition device that acquires voice uttered by a user in response to the images associated with the characters/numbers and converts it into a first character/number string;
 a conversion device that converts the first character/number string converted by the voice recognition device into a second character/number string according to the predetermined relationship; and
 an input device for inputting the second character/number string as the prescribed character/number string.
 In order to achieve at least one of the above objects, still another electronic device reflecting one aspect of the present invention comprises:
 a display device for displaying a plurality of images;
 a microphone;
 a voice recognition device that analyzes voice acquired through the microphone and recognizes the character/number string represented by the voice; and
 a processing unit that identifies, among the plurality of images, the image corresponding to the character/number string and performs processing based on the identified image.
 In order to achieve at least one of the above objects, a processing method of an electronic device reflecting one aspect of the present invention comprises:
 storing a pattern;
 displaying a plurality of images;
 acquiring voice uttered by a user in response to the images and converting it into a corresponding character/number string; and
 performing predetermined authentication when, upon connecting the plurality of displayed images in the order of the converted character/number string, the resulting locus matches the stored pattern.
 In order to achieve at least one of the above objects, another processing method of an electronic device reflecting one aspect of the present invention comprises:
 storing a password consisting of a character/number string;
 displaying at least the characters/numbers constituting the character/number string of the password in association with a plurality of images according to a predetermined relationship;
 acquiring voice uttered by a user in response to the images associated with the characters/numbers and converting it into a first character/number string;
 converting the password into a second character/number string according to the predetermined relationship; and
 performing predetermined authentication when the first character/number string matches the second character/number string.
 In order to achieve at least one of the above objects, still another processing method of an electronic device reflecting one aspect of the present invention comprises:
 storing a password consisting of a character/number string;
 displaying at least the characters/numbers constituting the character/number string of the password in association with a plurality of images according to a predetermined relationship;
 acquiring voice uttered by a user in response to the images associated with the characters/numbers and converting it into a third character/number string; and
 converting the third character/number string into a fourth character/number string according to the predetermined relationship, and performing predetermined authentication when the fourth character/number string matches the password.
 In order to achieve at least one of the above objects, still another processing method of an electronic device reflecting one aspect of the present invention comprises:
 displaying the characters/numbers constituting a prescribed character/number string in association with a plurality of images according to a predetermined relationship;
 acquiring voice uttered by a user in response to the images associated with the characters/numbers and converting it into a first character/number string;
 converting the first character/number string into a second character/number string according to the predetermined relationship; and
 inputting the second character/number string as the prescribed character/number string.
 According to the present invention, it is possible to provide an electronic device, a head-mounted display, a processing method of the electronic device, and a program therefor that can, while using a voice recognition function, suppress leakage of information to a third party who hears the user's utterance.
FIG. 1 is a front view showing a head-mounted display (hereinafter, HMD) 100, an electronic device according to the present embodiment, as worn by a user.
FIG. 2 is a schematic cross-sectional view showing the configuration of the display unit 104.
FIG. 3 is a block diagram of the HMD 100 according to the present embodiment.
FIG. 4 is a diagram showing a pattern PT stored as authentication code information in the authentication code storage unit 114.
FIG. 5 is a diagram schematically showing a numeric string NA generated by the processing unit 112.
FIG. 6 is a diagram showing a message displayed to confirm whether voice recognition has been performed properly.
FIG. 7 is a diagram showing an example of a character string CA generated by the processing unit 112 instead of a numeric string.
FIG. 8 is a diagram showing an arrangement GA of vegetable or fruit patterns generated by the processing unit 112 instead of a numeric string or character string.
FIG. 9(a) is a diagram showing a numeric string (here, “4, 9, 1, 3, 5”) stored as authentication code information (password) in the authentication code storage unit 114, and FIG. 9(b) is a diagram showing a combined image NG of numbers and images generated by the processing unit 112, together with an explanation image RG.
FIG. 10 is a diagram showing combined images NG of numbers and images generated by the processing unit 112, together with an explanation image RG.
FIG. 11 is a diagram showing a correspondence table CT in which characters, instead of colored blocks, are arranged in association with the numbers constituting the numeric string stored in the authentication code storage unit 114.
FIG. 12 is a diagram showing a correspondence table CT in which vegetable or fruit patterns, instead of colored blocks or characters, are arranged in association with the numbers constituting the numeric string stored in the authentication code storage unit 114.
FIG. 13 is a flowchart showing the control operation (except step S102) of the processing unit 112 according to the present embodiment.
 Embodiments of the present invention will be described below with reference to the drawings. FIG. 1 is a front view showing a head-mounted display (hereinafter, HMD) 100, which is an electronic apparatus according to the present embodiment, as worn by a user. Hereinafter, the right side and left side of the HMD 100 refer to the right side and left side as seen by the user wearing the HMD 100.
 In the HMD 100 of the present embodiment shown in FIG. 1, a frame 101 worn on the head of the user US holds two spectacle lenses 102 in front of the eyes of the user US. A cylindrical main body 103 is fixed to the upper part of the right-side spectacle lens 102 (it may instead be the left side, for example according to the user's dominant eye). The main body 103 is provided with a display unit 104. A display drive control unit 104DR (see FIG. 3, described later), which governs display control of the display unit 104, is disposed inside the main body 103. If necessary, a display unit may be disposed in front of each eye.
 FIG. 2 is a schematic cross-sectional view showing the configuration of the display unit 104. The display unit 104, which is a display device, consists of an image forming unit 104A and an image display unit 104B. The image forming unit 104A is built into the main body 103 and includes a light source 104a, a unidirectional diffuser 104b, a condenser lens 104c, and a display element 104d. The image display unit 104B, a so-called see-through display member, is an overall plate-shaped part disposed so as to extend downward from the main body 103, parallel to one spectacle lens 102 (see FIG. 1), and includes an eyepiece prism 104f, a deflection prism 104g, and a hologram optical element 104h.
 Next, the operation of the display unit 104 will be described. Light emitted from the light source 104a is diffused by the unidirectional diffuser 104b, condensed by the condenser lens 104c, and enters the display element 104d. The light entering the display element 104d is modulated pixel by pixel based on image data input from the display drive control unit 104DR and emitted as image light, whereby a color image is displayed on the display element 104d.
 The image light from the display element 104d enters the eyepiece prism 104f through its base end face PL1, is totally reflected several times between the inner face PL2 and the outer face PL3, and enters the hologram optical element 104h. The light entering the hologram optical element 104h is reflected there, passes through the inner face PL2, and reaches the pupil B. At the position of the pupil B, the user can observe an enlarged virtual image of the image displayed on the display element 104d and perceive it as a screen formed on the image display unit 104B. In this case, the hologram optical element 104h can be regarded as constituting the screen, or the screen can be regarded as being formed on the inner face PL2. In this specification, "screen" may also refer to the displayed image itself.
 Meanwhile, since the eyepiece prism 104f, the deflection prism 104g, and the hologram optical element 104h transmit almost all external light, the user can observe the outside scene (real image) through them. The virtual image of the image displayed on the display element 104d is therefore observed overlapping part of the outside scene. In this way, the user of the HMD 100 can simultaneously observe, via the hologram optical element 104h, the image provided by the display element 104d and the outside scene. When the display unit 104 is in a non-display state, the image display unit 104B becomes transparent and only the outside scene can be observed. In this example, the display unit is configured by combining a light source, a liquid crystal display element, and an optical system, but a self-luminous display element (for example, an organic EL display element) may be used instead of the combination of a light source and a liquid crystal display element. Alternatively, a transmissive organic EL display panel that is transparent in its non-emitting state may be used in place of the combination of a light source, a liquid crystal display element, and an optical system. In any case, if the screen is arranged so as to fall within the visual field of the user's eye facing the image display unit 104B, preferably so that it at least partly overlaps the effective visual field, the user can easily view the image.
 FIG. 3 is a block diagram of the HMD 100 according to the present embodiment, shown together with the user US. The HMD 100 includes the above-described display unit 104, a microphone 105 that picks up the voice uttered by the user US and converts it into a signal, a voice processing unit 106 that processes the signal output from the microphone 105 and outputs it as a voice signal, and a control unit 110 that receives the voice signal output from the voice processing unit 106.
 The control unit 110 includes a voice recognition unit 113 that receives the voice signal output from the voice processing unit 106, recognizes the character/number string the voice represents, and converts it into the corresponding characters/numbers; a processing unit (processing device) 112 that processes the characters/numbers output from the voice recognition unit 113; the display drive control unit 104DR, which receives signals from the processing unit 112 and drives and controls the display unit 104; and an authentication code storage unit (storage device) 114 that stores an authentication code (here, a pattern or a password). The microphone 105, the voice processing unit 106, and the voice recognition unit 113 constitute a voice recognition device. In this specification, "character/number" means at least one of a character and a number, and "character/number string" means a sequence of characters or numbers (including a sequence of characters only or of numbers only); an "image" includes characters/numbers. A "word" includes a single character.
(First Embodiment)
 Next, embodiments of the present invention relating to a processing method of an electronic apparatus will be described. All of the processing methods described below can be realized by regarding the control unit 110 as a computer and executing a program incorporated in it. First, the operation of the HMD 100 according to the first embodiment will be described. In the following embodiments, the user US has memorized a pattern or password, and is assumed to know in advance, for example from a manual, the usage rule of the HMD 100: the words representing the plurality of images displayed on the display unit 104 are to be uttered in order along the pattern or password. FIG. 4 is a diagram showing a pattern PT stored as authentication code information in the authentication code storage unit 114. The pattern PT connects, in this order, a horizontal line running from left to right, a vertical line running from top to bottom, and another horizontal line running from left to right. The user US is assumed to have memorized the pattern PT in advance. Here, a "pattern" means a traced geometric shape, such as the trajectory of a single stroke.
 FIG. 5 is a diagram schematically showing a number string NA generated by the processing unit 112. In FIG. 5, the arrow AR is overlaid for ease of understanding, but it is not actually displayed. The number string NA is, for example, random numbers generated by the processing unit 112 and assigned to a 3-by-3 arrangement.
 The processing unit 112 transmits the information of the number string NA to the display drive control unit 104DR. The display drive control unit 104DR converts the information of the number string NA into an image signal and transmits it to the display unit 104, so that the display unit 104 can display the number string NA shown in FIG. 5.
 The user US does need to memorize the geometric pattern PT, but unlike a password, does not need to memorize in advance the arrangement order of the individual displayed elements such as character strings, number strings, or images; it suffices to utter, in order, the elements displayed along the memorized pattern PT, which is highly convenient. To increase confidentiality, the individual elements are preferably displayed at random. The displayed elements may be numbers, characters (alphabet, hiragana, katakana, kanji, and so on), colors, or pictures; these are collectively called images. A single kind of element (for example, hiragana only) may be displayed, or several kinds (such as katakana and numbers) may be mixed. However, since the elements are matched by voice recognition, their readings are preferably registered. In the case of pictures, each picture and its reading are preferably registered in advance in the processing unit 112, and the user US preferably also remembers each picture and its registered reading. Several readings may be registered for one picture; however, pictures that share the same reading (such as 橋 "bridge" and 箸 "chopsticks", both read "hashi") are preferably avoided.
 Before authentication, the screen of the HMD 100 is assumed to be locked. First, when the user US turns on a switch (not shown) of the HMD 100, the control unit 110 causes the display unit 104, via the display drive control unit 104DR, to show a display (not shown) requesting input of the authentication code, together with the number string NA shown in FIG. 5.
 Since the user US has memorized the pattern PT as the authentication code, when the memorized pattern PT is applied to the number string NA displayed as shown in FIG. 5, the user can see that reading the digits "5・3・6・2・9" in the order indicated by the arrow AR reproduces the pattern PT. When the user US then utters the digits along the pattern, "go, san, roku, ni, kyū", the microphone 105 picks this up, the voice signal is input to the voice recognition unit 113 via the voice processing unit 106, converted into the number string "5・3・6・2・9", and transmitted to the processing unit 112 as number string information.
 At this time, the processing unit 112 may cause the display unit 104, via the display drive control unit 104DR, to show a display such as that in FIG. 6 to confirm that the input number string is what the user US intended. If the user US utters "no" while the display of FIG. 6 is shown, the voice signal is input from the microphone 105 to the processing unit 112 via the voice processing unit 106, the processing unit 112 judges the displayed number string to be incorrect, and it requests the user US to speak again. If the user US utters "yes", the processing unit 112 judges the displayed number string to be correct and proceeds with the subsequent processing.
 The processing unit 112, having received the number string information from the voice recognition unit 113, reads the pattern PT stored in the authentication code storage unit 114 and applies it to the number string NA that it has stored. More specifically, when the cells of the number string NA are connected in the order of the recognized number string ("5・3・6・2・9") and the resulting trajectory matches the pattern PT, the processing unit 112 judges that the authentication codes match and unlocks the screen of the HMD 100. After the screen is unlocked, the pattern PT stored in the authentication code storage unit 114 may be updated through a similar authentication. If the trajectory does not match the pattern PT, the processing unit 112 judges that the authentication codes do not match and keeps the screen of the HMD 100 locked. In that case, input of a new authentication code may be requested.
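 As a minimal sketch of this check, the stored pattern PT can be modeled as a sequence of grid cells, and the recognized digits can be mapped back to their cells in the displayed grid. The cell coordinates, grid contents, and function names below are illustrative assumptions, not taken from the patent; in particular, the sketch assumes each digit appears only once in the 3-by-3 grid so that the mapping from digit to cell is unambiguous.

```python
import random

# Hypothetical encoding of the pattern PT of FIG. 4 as grid cells (row, col):
# left-to-right, then top-to-bottom, then left-to-right again.
PATTERN_PT = [(0, 0), (0, 1), (1, 1), (2, 1), (2, 2)]

def make_grid(rng=random):
    """Assign the digits 1-9 at random to a 3-by-3 grid (each digit once)."""
    digits = list(range(1, 10))
    rng.shuffle(digits)
    return [digits[i * 3:(i + 1) * 3] for i in range(3)]

def trajectory(grid, spoken_digits):
    """Map each recognized digit back to its cell in the displayed grid."""
    cell_of = {grid[r][c]: (r, c) for r in range(3) for c in range(3)}
    return [cell_of.get(d) for d in spoken_digits]

def authenticate(grid, spoken_digits, pattern=PATTERN_PT):
    """Unlock only if the spoken digits trace the stored pattern PT."""
    return trajectory(grid, spoken_digits) == pattern

# The grid of FIG. 5 is not given in this excerpt; this one is made up so
# that reading the pattern cells yields the digits 5, 3, 6, 2, 9.
grid = [[5, 3, 8],
        [1, 6, 4],
        [7, 2, 9]]
assert authenticate(grid, [5, 3, 6, 2, 9])      # trace matches -> unlock
assert not authenticate(grid, [5, 3, 6, 2, 4])  # wrong last cell -> locked
```

The same check works unchanged for the letter grid of FIG. 7: only the keys of `cell_of` change from digits to letters.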
 FIG. 7 is a diagram showing an example of a character string CA generated by the processing unit 112 in place of a number string. In FIG. 7, the arrow AR is overlaid for ease of understanding, but it is not actually displayed. Assuming the user US has memorized the same pattern PT as the authentication code, when the memorized pattern PT is applied to the character string CA displayed as shown in FIG. 7, the user can see that reading the letters "C・H・G・D・E" in the order indicated by the arrow AR reproduces the pattern PT. When the user US utters the letters along the pattern, "cee, aitch, gee, dee, ee", the microphone 105 picks this up, the voice signal is input to the voice recognition unit 113 via the voice processing unit 106, converted into the character string "C・H・G・D・E", and transmitted to the processing unit 112 as character string information. The processing unit 112 likewise reads the pattern PT stored in the authentication code storage unit 114, applies it to the character string CA that it has stored, and can thereby judge whether the authentication codes match. The rest of the configuration is the same as in the embodiment described above.
 FIG. 8 is a diagram showing an arrangement GA of vegetable and fruit pictures generated by the processing unit 112 in place of a number string or character string. In FIG. 8, the arrow AR is overlaid for ease of understanding, but it is not actually displayed. Here, the processing unit 112 registers in advance the name of each image in association with that image. The arrangement of the images is random.
 Assuming the user US has memorized the same pattern PT as the authentication code, when the memorized pattern PT is applied to the image arrangement GA displayed as shown in FIG. 8, the user can see that connecting the images "tomato, corn, mandarin orange, persimmon, green pepper" in the order indicated by the arrow AR reproduces the pattern PT. When the user US utters the picture words along the pattern, "tomato, tōmorokoshi, mikan, kaki, pīman", the microphone 105 picks this up, the voice signal is input to the voice recognition unit 113 via the voice processing unit 106, converted into the character string "tomato, tōmorokoshi, mikan, kaki, pīman", and transmitted to the processing unit 112 as character information. If the user leaves a short pause between words when speaking, the voice recognition can more easily convert the speech into the correct words, which is preferable.
 The processing unit 112 selects the tomato picture TO when the character information "tomato" matches the name of the tomato displayed as an image, selects the corn picture CR when "tōmorokoshi" matches the name of the corn, selects the mandarin orange picture OR when "mikan" matches the name of the mandarin orange, selects the persimmon picture PR when "kaki" matches the name of the persimmon, and selects the green pepper picture PM when "pīman" matches the name of the green pepper. When the selected images, connected in this order, produce a trajectory that matches the pattern PT, it judges that the authentication codes match. The rest of the configuration is the same as in the embodiment described above. In this way, the processing unit 112 identifies and selects, from the plurality of displayed images, the images corresponding to the character/number string, and performs processing based on the selected images.
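 The name matching described above can be sketched as a lookup from recognized readings to registered pictures. The registry below is a hypothetical illustration (romanized readings, invented identifiers); per the notes above, one picture may carry several registered readings, and a word that matches no picture, or more than one, is rejected.

```python
# Hypothetical reading registry: picture identifier -> registered readings.
READINGS = {
    "tomato":       {"tomato"},
    "corn":         {"toumorokoshi"},
    "mandarin":     {"mikan"},
    "persimmon":    {"kaki"},
    "green_pepper": {"piiman", "piman"},  # several readings for one picture
}

def pictures_for(spoken_words, registry=READINGS):
    """Resolve each recognized word to the unique picture it names;
    return None if any word is unregistered or ambiguous."""
    resolved = []
    for word in spoken_words:
        matches = [name for name, readings in registry.items()
                   if word in readings]
        if len(matches) != 1:
            return None
        resolved.append(matches[0])
    return resolved

spoken = ["tomato", "toumorokoshi", "mikan", "kaki", "piiman"]
assert pictures_for(spoken) == ["tomato", "corn", "mandarin",
                                "persimmon", "green_pepper"]
assert pictures_for(["hashi"]) is None  # unregistered reading -> rejected
```

The resolved picture sequence would then be checked against the pattern PT exactly as in the digit case.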
(Second Embodiment)
 Next, the operation of the HMD 100 according to the second embodiment will be described. FIG. 9(a) shows a number string (here, "4・9・1・3・5") stored as authentication code information (a password, which is a character/number string) in the authentication code storage unit 114. FIG. 9(b) shows a combined number-and-image NG generated by the processing unit 112, together with an explanatory image RG. The combined image NG shown in FIG. 9(b) consists of colored blocks arranged in 3 rows and 3 columns, each block associated with the digit placed at its center. In the figure, each color is represented by hatching or vertical/horizontal lines, as indicated in the adjacent explanatory image RG. The digits always include those constituting the number string stored in the authentication code storage unit 114, but the combination of digits and colors is random. The combination of each color and its corresponding digit constitutes a predetermined relationship. The explanatory image RG is added only to indicate the colors corresponding to the line drawing of the blocks and is not actually displayed.
 The processing unit 112 transmits the information of the generated combined image NG to the display drive control unit 104DR. The display drive control unit 104DR converts the information of the combined image NG into an image signal and transmits it to the display unit 104, so that the display unit 104 can display the combined image NG shown in FIG. 9(b) (excluding the explanatory image RG).
 Before authentication, the screen of the HMD 100 is assumed to be locked. First, when the user US turns on a switch (not shown) of the HMD 100, the control unit 110 causes the display unit 104, via the display drive control unit 104DR, to show a display requesting input of the authentication code, together with the combined image NG shown in FIG. 9(b).
 Since the user US has memorized the number string "4・9・1・3・5" as the authentication code, on seeing the combined image NG as shown in FIG. 9(b), the user can see that the block for the digit "4" is yellow, the block for "9" is green, the block for "1" is blue, the block for "3" is yellow, and the block for "5" is blue. When the user US then utters the color words "ki, midori, ao, ki, ao" (yellow, green, blue, yellow, blue), the microphone 105 picks this up, the voice signal is input to the voice recognition unit 113 via the voice processing unit 106, converted into the character string "ki, midori, ao, ki, ao" (referred to as the first character/number string), and transmitted to the processing unit 112 as character string information.
 In parallel with this, the processing unit 112, which also serves as a conversion device, reads the number string "4・9・1・3・5" stored in the authentication code storage unit 114, applies it to the combined image NG that it generated, picks out the characters "ki, midori, ao, ki, ao" from the colors of the corresponding images, and converts them into a character string in that order (referred to as the second character/number string). If the first character/number string and the second character/number string match, the processing unit 112 judges that the authentication codes match and unlocks the screen of the HMD 100. If they do not match, the processing unit 112 judges that the authentication codes do not match and keeps the screen of the HMD 100 locked. In that case, input of a new authentication code may be requested. The rest of the configuration is the same as in the embodiments described above.
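 A minimal sketch of this comparison follows. The digit-to-color mapping reproduces the values the text gives for FIG. 9(b) (4 and 3 yellow, 9 green, 1 and 5 blue); the colors assigned to the remaining digits, and all function names, are assumptions.

```python
# One displayed digit-to-color mapping (colors for 2, 6, 7, 8 are made up).
COLOR_OF = {1: "ao", 2: "aka", 3: "ki", 4: "ki", 5: "ao",
            6: "aka", 7: "ki", 8: "murasaki", 9: "midori"}

def expected_colors(password, color_of):
    """Second character string: the stored password converted to colors."""
    return [color_of[d] for d in password]

def verify(spoken_colors, password, color_of):
    """Unlock only if the recognized color words (first character string)
    match the colors of the stored password digits."""
    return spoken_colors == expected_colors(password, color_of)

password = [4, 9, 1, 3, 5]
assert expected_colors(password, COLOR_OF) == ["ki", "midori", "ao", "ki", "ao"]
assert verify(["ki", "midori", "ao", "ki", "ao"], password, COLOR_OF)
assert not verify(["ki", "midori", "ao", "ki", "aka"], password, COLOR_OF)
```

Because several digits share each color, a third party who overhears "ki, midori, ao, ki, ao" cannot uniquely recover the password, which is the point of speaking colors rather than digits.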
 Next, a method of updating the password stored in the authentication code storage unit 114 will be described. A user US who wishes to update the password unlocks the screen of the HMD 100, then selects password update from a settings screen (not shown) and performs the operations described below. Here, the password "1・2・3・4" stored in the authentication code storage unit 114 is used as the update code, and the password is updated to a new password "9・8・5・6".
 FIGS. 10(a) to 10(c) are diagrams showing combined number-and-image NGs generated by the processing unit 112, together with explanatory images RG. When the user US requests a password update, the processing unit 112 displays the combined image NG shown in FIG. 10(a) on the display unit 104 via the display drive control unit 104DR. The combination of digits and colors is random.
 Since the user US has memorized the number string "1・2・3・4" as the update code, on seeing the combined image NG as shown in FIG. 10(a), the user can see that the block for the digit "1" is blue, the block for "2" is red, and the blocks for "3" and "4" are yellow. When the user US utters the color words "ao, aka, ki, ki" (blue, red, yellow, yellow), the microphone 105 picks this up, the voice signal is input to the voice recognition unit 113 via the voice processing unit 106, converted into the character string "ao, aka, ki, ki" (referred to as the first character/number string), and transmitted to the processing unit 112 as character string information.
 In parallel with this, the processing unit 112 reads the number string "1・2・3・4" stored as the update code in the authentication code storage unit 114, applies it to the combined image NG that it generated (FIG. 10(a)), picks out the characters "ao, aka, ki, ki" from the colors of the corresponding images, and converts them into a character string in that order (referred to as the second character/number string). If the first character/number string and the second character/number string match, the processing unit 112 judges that the update codes match and permits the password update. More specifically, the processing unit 112, having permitted the password update, generates a combined image NG with a changed digit-color correspondence, as shown in FIG. 10(b), and displays it on the display unit 104 via the display drive control unit 104DR.
 The user US, who wishes to update to the new password "9・8・5・6", sees from the combined image NG shown in FIG. 10(b) that the block for the digit "9" is yellow, the block for "8" is purple, the block for "5" is green, and the block for "6" is red.
 Accordingly, when the user US utters the color words "ki, murasaki, midori, aka" (yellow, purple, green, red), the microphone 105 picks up the utterance, a voice signal is input to the voice recognition unit 113 via the voice processing unit 106, the utterance is converted into the update character string "ki, murasaki, midori, aka", and the string is transmitted to the processing unit 112 as update character string information.
 By collating against the combined image NG shown in FIG. 10(b), the processing unit 112 determines that the character "ki" in the update character string denotes yellow and corresponds to the numbers "7, 9", that "murasaki" denotes purple and corresponds to "4, 8", that "midori" denotes green and corresponds to "5", and that "aka" denotes red and corresponds to "1, 6". The processing unit 112 therefore determines that there are several candidates for the password the user US wishes to set, namely the eight strings "7-4-5-1", "9-4-5-1", "7-8-5-1", "9-8-5-1", "7-4-5-6", "9-4-5-6", "7-8-5-6", and "9-8-5-6". The processing unit 112 stores these eight numeric strings as password candidates. In this way, the processing unit 112 identifies the numeric strings corresponding to the characters in the character string and performs processing based on the identified numeric strings.
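The candidate enumeration described above amounts to a Cartesian product over the digit sets implied by each spoken color word. A minimal sketch follows; the color-to-digit mapping is the FIG. 10(b) example from the text, while the function and variable names are illustrative and not taken from the patent.

```python
from itertools import product

# FIG. 10(b) example mapping: each color word maps to the digits whose
# blocks are displayed in that color.
COLOR_TO_DIGITS = {
    "ki": ["7", "9"],        # yellow
    "murasaki": ["4", "8"],  # purple
    "midori": ["5"],         # green
    "aka": ["1", "6"],       # red
}

def password_candidates(spoken_words, color_to_digits):
    """Enumerate every digit string consistent with the spoken color words."""
    digit_sets = [color_to_digits[w] for w in spoken_words]
    return ["".join(digits) for digits in product(*digit_sets)]

candidates = password_candidates(["ki", "murasaki", "midori", "aka"],
                                 COLOR_TO_DIGITS)
print(len(candidates))  # 2 * 2 * 1 * 2 = 8 candidates, including "9856"
```

With five colors shared among ten digits, ambiguity is unavoidable, which is why the processing unit must store all eight strings rather than a single password.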
 In such a case, the processing unit 112 newly generates a combined image NG in which the correspondence between numbers and colors has again been changed, as shown in FIG. 10(c), and displays it once more on the display unit 104 via the display drive control unit 104DR. When the user US views the combined image NG shown in FIG. 10(c), the user can see that the block corresponding to the number "9" is blue, the block corresponding to the number "8" is green, the block corresponding to the number "5" is yellow, and the block corresponding to the number "6" is green.
 Accordingly, when the user US utters the color words "ao, midori, ki, midori" (blue, green, yellow, green), the microphone 105 picks up the utterance, a voice signal is input to the voice recognition unit 113 via the voice processing unit 106, the utterance is converted into the update character string "ao, midori, ki, midori", and the string is transmitted to the processing unit 112 as update character string information.
 By collating against the combined image NG shown in FIG. 10(c), the processing unit 112 determines that the character "ao" in the update character string denotes blue and corresponds to the numbers "2, 9", that "midori" denotes green and corresponds to "6, 8", and that "ki" denotes yellow and corresponds to "1, 5". Checking these against the stored password candidates leaves only one string, "9-8-5-6", so the processing unit 112 updates the password in the authentication code storage unit 114 with the numeric string "9-8-5-6" as the new password. If a plurality of password candidates still remain after the second pass, the processing unit 112 may display a further new combined image and request another utterance from the user US. In the embodiment above, the number of colors used in the combined image is limited to five so that they remain easy to distinguish, which is why the user US must speak more than once to enter the password; if the number of color types were matched one-to-one to the number of password digits, a single utterance would suffice. Alternatively, if characters or pictures are used as the combined images instead of color types, as in the embodiments below, the number of images can be matched one-to-one to the number of password digits, so a single utterance likewise suffices.
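The two-round narrowing described above can be sketched as intersecting the candidate sets produced under each round's mapping until one password remains. This is an illustrative sketch; the digit sets are the FIG. 10(b) and FIG. 10(c) values given in the text, and the helper names are not from the patent.

```python
from itertools import product

def candidates(spoken, table):
    # All digit strings consistent with the spoken color words under `table`.
    return {"".join(p) for p in product(*(table[w] for w in spoken))}

# Round 1: FIG. 10(b) mapping yields eight candidates.
round1 = candidates(["ki", "murasaki", "midori", "aka"],
                    {"ki": "79", "murasaki": "48", "midori": "5", "aka": "16"})

# Round 2: FIG. 10(c) mapping over the same (unknown to the device) password.
round2 = candidates(["ao", "midori", "ki", "midori"],
                    {"ao": "29", "midori": "68", "ki": "15"})

remaining = round1 & round2
print(remaining)  # {'9856'}: the unique new password after two rounds
```

Because the color assignment changes between rounds, an eavesdropper who hears both utterances without seeing either displayed image still cannot recover the password.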
 FIG. 11 shows a correspondence table CT in which, in place of colored blocks, characters are arranged in association with the numbers constituting the numeric string stored in the authentication code storage unit 114, according to a predetermined relationship. At the time of authentication, the correspondence table CT of FIG. 11 is displayed on the display unit 104.
 Here, since the user US remembers the numeric string ("4-9-1-3-5") as the authentication code, on viewing the correspondence table CT shown in FIG. 11 the user can see that the character corresponding to the number "4" is "ko", the character corresponding to "9" is "ta", the character corresponding to "1" is "no", the character corresponding to "3" is "ka", and the character corresponding to "5" is "ma". Accordingly, when the user US utters the hiragana "ko, ta, no, ka, ma", the microphone 105 picks up the utterance, a voice signal is input to the voice recognition unit 113 via the voice processing unit 106, the utterance is converted into the character string "ko, ta, no, ka, ma" (referred to as the first character/number string), and the string is transmitted to the processing unit 112 as character string information.
 In parallel with this, the processing unit 112 reads the numeric string ("4-9-1-3-5") stored in the authentication code storage unit 114, collates it against the correspondence table CT it generated (FIG. 11), picks out the characters "ko, ta, no, ka, ma" corresponding to the respective numbers, and converts them into a character string arranged in that order (referred to as the second character/number string). When the first character/number string and the second character/number string match, the processing unit 112 determines that the authentication codes match and releases the screen lock of the HMD 100. The remaining configuration is the same as in the embodiment described above.
 As a modification of the above, when the user US, who remembers the numeric string ("4-9-1-3-5") as the authentication code, views the correspondence table CT shown in FIG. 11 and utters the hiragana "ko, ta, no, ka, ma", the microphone 105 picks up the utterance, a voice signal is input to the voice recognition unit 113 via the voice processing unit 106, the utterance is converted into the character string "ko, ta, no, ka, ma" (referred to as the third character/number string), and the string is transmitted to the processing unit 112 as character string information.
 Further, by collating the character string "ko, ta, no, ka, ma" against the correspondence table CT it generated (FIG. 11), the processing unit 112 converts each hiragana character into its corresponding number and obtains the numeric string "4-9-1-3-5" arranged in that order (referred to as the fourth character/number string). When the fourth character/number string and the password match, the processing unit 112 determines that the authentication codes match and releases the screen lock of the HMD 100. The remaining configuration is the same as in the embodiment described above.
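The reverse conversion used in this modification, from the spoken characters back to digits via the displayed correspondence table, can be sketched as a dictionary lookup. The digit-to-character pairs below are the FIG. 11 example values quoted in the text; the function names are illustrative.

```python
# FIG. 11 example correspondence: digit -> displayed character.
TABLE = {"4": "ko", "9": "ta", "1": "no", "3": "ka", "5": "ma"}

# Invert the table so spoken characters map back to digits
# (each character is assumed to appear for exactly one digit).
CHAR_TO_DIGIT = {ch: d for d, ch in TABLE.items()}

def to_fourth_string(third_string):
    """Convert the spoken (third) character string into the fourth, numeric string."""
    return "".join(CHAR_TO_DIGIT[ch] for ch in third_string)

stored_password = "49135"
print(to_fourth_string(["ko", "ta", "no", "ka", "ma"]) == stored_password)  # True
```

Note the contrast with the first variant: there the stored password is converted forward into characters and compared with the utterance, while here the utterance is converted backward into digits and compared with the stored password; both comparisons decide the same unlock.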
 FIG. 12 shows a correspondence table CT in which, in place of colored blocks or characters, pictures of vegetables or fruits are arranged in association with the numbers constituting the numeric string stored in the authentication code storage unit 114, according to a predetermined relationship. In the correspondence table CT, the names of the vegetables or fruits are registered in association with the displayed pictures, but this is not strictly necessary. At the time of authentication, the correspondence table CT of FIG. 12 is displayed on the display unit 104.
 Here, since the user US remembers the numeric string ("4-9-1-3-5") as the authentication code, on viewing the correspondence table CT shown in FIG. 12 the user can see that the picture corresponding to the number "4" is a shiitake mushroom, the picture corresponding to "9" is a green pepper, the picture corresponding to "1" is a tomato, the picture corresponding to "3" is a cherry, and the picture corresponding to "5" is a mandarin orange. Accordingly, when the user US utters the picture words "shiitake, piiman, tomato, sakuranbo, mikan", the microphone 105 picks up the utterance, a voice signal is input to the voice recognition unit 113 via the voice processing unit 106, the utterance is converted into the character string "shiitake, piiman, tomato, sakuranbo, mikan" (referred to as the first character/number string), and the string is transmitted to the processing unit 112 as character string information.
 In parallel with this, the processing unit 112 reads the numeric string ("4-9-1-3-5") stored in the authentication code storage unit 114, collates it against the correspondence table CT it generated (FIG. 12), picks out the words "shiitake, piiman, tomato, sakuranbo, mikan" corresponding to the respective numbers, and converts them into a character string arranged in that order (referred to as the second character/number string). When the first character/number string and the second character/number string match, the processing unit 112 determines that the authentication codes match and releases the screen lock of the HMD 100. The remaining configuration is the same as in the embodiment described above.
 As a modification of the above, when the user US, who remembers the numeric string ("4-9-1-3-5") as the authentication code, views the correspondence table CT shown in FIG. 12 and utters the picture words "shiitake, piiman, tomato, sakuranbo, mikan", the microphone 105 picks up the utterance, a voice signal is input to the voice recognition unit 113 via the voice processing unit 106, the utterance is converted into the character string "shiitake, piiman, tomato, sakuranbo, mikan" (referred to as the third character/number string), and the string is transmitted to the processing unit 112 as character string information.
 Further, by collating the character string "shiitake, piiman, tomato, sakuranbo, mikan" against the correspondence table CT it generated (FIG. 12), the processing unit 112 converts each picture word into its corresponding number and obtains the numeric string "4-9-1-3-5" arranged in that order (referred to as the fourth character/number string). When the fourth character/number string and the password match, the processing unit 112 determines that the authentication codes match and releases the screen lock of the HMD 100. The remaining configuration is the same as in the embodiment described above.
 FIG. 13 is a flowchart showing the control operation of the processing unit 112 according to the embodiments described above (excluding step S102). In step S101, the processing unit 112 displays, as the image, the numeric string NA (FIG. 5), the character string CA (FIG. 7), the image arrangement GA (FIG. 8), the combined image NG (FIG. 9(b)), or the correspondence table CT (FIGS. 11 and 12).
 When the user US, viewing the displayed image, utters a corresponding response (step S102), the voice recognition unit 113 performs voice recognition by recognizing the character/number string represented by the voice (step S103), and the processing unit 112 displays the voice recognition result as shown in step S104 (see FIG. 6).
 If the voice recognition result is inappropriate here (NO in step S105), the flow returns to step S102 and the same processing is repeated. If, on the other hand, the voice recognition result is appropriate (YES in step S105), the processing unit 112 reads the authentication code in step S106, collates the voice recognition result against the authentication code in step S107, and determines whether the two match.
 If the collation determines that the voice recognition result and the authentication code do not match, the processing unit 112 displays a message such as "Authentication failed" on the display unit 104 in step S109 and keeps the screen locked.
 If, by contrast, it determines that the voice recognition result and the authentication code match, the processing unit 112 treats the authentication as successful and releases the screen lock in step S108.
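The control flow of FIG. 13 can be sketched as a simple loop. This is a schematic only: the step numbers follow the figure, and the callable arguments are illustrative stand-ins for the display unit, microphone, voice recognition unit, and authentication code storage unit described above.

```python
def authenticate(display_image, get_utterance, recognize, show, read_code):
    """Schematic of the FIG. 13 flow; each argument stands in for a device/unit."""
    display_image()                       # S101: show NA, CA, GA, NG, or CT
    while True:
        speech = get_utterance()          # S102: user speaks
        result = recognize(speech)        # S103: voice recognition
        show(result)                      # S104: display the recognition result
        if not result:                    # S105: inappropriate result -> retry
            continue
        code = read_code()                # S106: read the authentication code
        if result == code:                # S107: collate result and code
            return "unlock"               # S108: release the screen lock
        show("authentication failed")     # S109: keep the screen locked
        return "locked"
```

In the actual device the comparison at S107 may run on converted character/number strings rather than the raw recognition output, as in the variants above, but the branch structure is the same.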
 For example, general biometric authentication can identify an individual user, but it has the drawback that biometric information used as a release code is difficult to share among several people. Such information can in principle be shared by registering the biometrics of every user, but registration is laborious. In particular, when electronic devices are used in a factory, a single device is expected to be shared by several people, and in such cases there is a demand to use a common password so that sharing remains convenient. The present embodiment has the advantage that it can be used with roughly the same user burden as conventional password authentication. Voice recognition is also a hands-free input interface, and has the further advantage of a particularly high affinity with an HMD. Moreover, with an HMD worn on the user's head, the image displayed on the display unit is difficult for others to see, so confidentiality can be kept high. Even if another person should memorize the character string the user utters, displaying a different image (or a different combination of character/number strings and images) on each occasion ensures that, when that person utters the memorized string at some later time, the electronic device is not unlocked, so strong security is maintained. Furthermore, there is no need for the masking process performed when a password is entered on a conventional electronic device, namely displaying the entered characters as "******". If the image displayed on the display unit cannot be seen by others, the reverse-converted code may be displayed when the voice recognition result is shown.
(Third Embodiment)
 The HMD 100 of the present embodiment can also be used as a means for inputting information. For example, when the user US wishes to call the telephone number "030-1234-5678", the processing unit 112 generates, in response to a request from the user US, a correspondence table CT such as that shown in FIG. 11 and displays it on the display unit 104. When the user US, following the displayed correspondence table CT, utters the characters corresponding to the telephone number, "ke, ka, ke, no, ru, ka, ko, ma, mi, ni, i", the microphone 105 picks up the utterance, a voice signal is input to the voice recognition unit 113 via the voice processing unit 106, the utterance is converted into the character string "ke, ka, ke, no, ru, ka, ko, ma, mi, ni, i" (the first character/number string), and the string is transmitted to the processing unit 112 as character string information.
 Using the correspondence table CT it generated (FIG. 11), the processing unit 112 converts the input character string "ke, ka, ke, no, ru, ka, ko, ma, mi, ni, i" into the numeric string "0-3-0-1-2-3-4-5-6-7-8" (the second character/number string). Further, when the processing unit 112 has a telephone function, it can place the call by entering the obtained numeric string as the telephone number; in this case, the processing unit 112 also serves as an input device. This allows the user US to place a call hands-free without the other party's telephone number becoming known to bystanders. The converted telephone number may also be displayed on the display unit 104 so that the call is placed after confirmation by the user US. The HMD 100 may be used to input not only telephone numbers but also identification numbers such as My Number, credit card numbers, and the like.
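The character-to-digit conversion for hands-free number entry can be sketched the same way as the authentication variants. The digit-to-character pairs below are derived from the example utterance and number quoted in the text (e.g. "ke" stands for "0", "ka" for "3"); the full FIG. 11 table is not reproduced in this excerpt, so treat the table as an illustrative reconstruction.

```python
# Reconstructed fragment of the FIG. 11 correspondence: digit -> character.
TABLE = {"0": "ke", "3": "ka", "1": "no", "2": "ru", "4": "ko",
         "5": "ma", "6": "mi", "7": "ni", "8": "i"}
CHAR_TO_DIGIT = {ch: d for d, ch in TABLE.items()}

def spoken_to_number(words):
    """Convert spoken characters into the digit string to be dialed."""
    return "".join(CHAR_TO_DIGIT[w] for w in words)

spoken = ["ke", "ka", "ke", "no", "ru", "ka", "ko", "ma", "mi", "ni", "i"]
print(spoken_to_number(spoken))  # prints 03012345678
```

Since each digit is tied to a single displayed character, repeated digits simply repeat the same word ("ke" appears twice for the two zeros), and an eavesdropper without sight of the table learns nothing about the number.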
 The present invention is not limited to the embodiments described in the specification; it will be apparent to those skilled in the art, from the embodiments and technical ideas described herein, that other embodiments and modifications are also included. The description and the embodiments are for illustrative purposes only, and the scope of the present invention is indicated by the claims below. For example, although the present invention has been described above taking an HMD as an example, the present invention is not limited to HMDs and is applicable to electronic devices in general, such as portable terminals. Further, the functions permitted by the authentication described above may be some or all of the device's functions.
 In the embodiments described above, releasing the screen lock was given as the example of permitting a predetermined function through password entry, but entering a password can also grant permission to launch a specific application. More specifically, with a login screen or the like displayed on the display unit 104 at application startup, the application can be launched by entering the password appropriately as described above. Authentication can also be performed within a specific application after it has been launched; in such a case, it is desirable to transition to the authentication screen through a hands-free operation using the user's utterances, without requiring an operation such as turning on a switch.
DESCRIPTION OF SYMBOLS
101      Frame
102      Spectacle lens
103      Main body
104      Display unit
104A     Image forming unit
104B     Image display unit
104DR    Display drive control unit
104a     Light source
104b     Unidirectional diffusion plate
104c     Condenser lens
104d     Display element
104f     Eyepiece prism
104g     Deflection prism
104h     Hologram optical element
105      Microphone
106      Voice processing unit
110      Control unit
112      Processing unit
113      Voice recognition unit
114      Authentication code storage unit
CA       Character string
CT       Correspondence table
NA       Numeric string
NG       Combined image
PT       Pattern
US       User

Claims (16)

  1.  An electronic apparatus comprising:
     a storage device that stores a pattern;
     a display device that displays a plurality of images;
     a voice recognition device that acquires a voice uttered by a user in response to the images and converts the voice into a corresponding character/number string; and
     a processing device that permits a predetermined function when, upon connecting the plurality of images displayed on the display device in the order of the character/number string converted by the voice recognition device, the resulting trajectory matches the pattern stored in the storage device.
  2.  The electronic apparatus according to claim 1, wherein the display device displays different images each time it performs display.
  3.  The electronic apparatus according to claim 1 or 2, wherein the predetermined function is updating of the pattern, and when the processing device permits updating of the pattern, the pattern stored in the storage device is updated in response to input of a new pattern.
  4.  An electronic apparatus comprising:
     a storage device that stores a password consisting of a character/number string;
     a display device that displays at least the characters/numbers constituting the character/number string of the password in association with a plurality of images according to a predetermined relationship;
     a voice recognition device that acquires a voice uttered by a user in response to the images associated with the characters/numbers and converts the voice into a first character/number string;
     a conversion device that converts the password into a second character/number string according to the predetermined relationship; and
     a processing device that permits a predetermined function when the first character/number string converted by the voice recognition device matches the second character/number string converted by the conversion device.
  5.  An electronic apparatus comprising:
     a storage device that stores a password consisting of a character/number string;
     a display device that displays at least the characters/numbers constituting the character/number string of the password in association with a plurality of images according to a predetermined relationship;
     a voice recognition device that acquires a voice uttered by a user in response to the images associated with the characters/numbers and converts the voice into a third character/number string; and
     a processing device that converts the third character/number string into a fourth character/number string according to the predetermined relationship and permits a predetermined function when the fourth character/number string matches the password.
  6.  The electronic apparatus according to claim 4 or 5, wherein the display device changes the predetermined relationship associating the images with the characters/numbers each time it performs display.
  7.  The electronic apparatus according to any one of claims 4 to 6, wherein the predetermined function is updating of the password, and when the processing device permits updating of the password, the password stored in the storage device is updated in response to input of a new password.
  8.  An electronic apparatus comprising:
     a display device that displays the characters/numbers constituting a prescribed character/number string in association with a plurality of images according to a predetermined relationship;
     a voice recognition device that acquires a voice uttered by a user in response to the images associated with the characters/numbers and converts the voice into a first character/number string;
     a conversion device that converts the first character/number string converted by the voice recognition device into a second character/number string according to the predetermined relationship; and
     an input device that inputs the second character/number string as the prescribed character/number string.
  9.  The electronic apparatus according to claim 8, wherein the display device changes the predetermined relationship associating the images with the characters/numbers each time it performs display.
  10.  An electronic apparatus comprising:
     a display device that displays a plurality of images;
     a microphone;
     a voice recognition device that analyzes a voice acquired through the microphone and recognizes a character/number string represented by the voice; and
     a processing unit that identifies, among the plurality of images, an image corresponding to the character/number string and performs processing based on the identified image.
  11.  A head-mounted display comprising the electronic apparatus according to any one of claims 1 to 10.
  12.  A processing method for an electronic apparatus, comprising:
     storing a pattern;
     displaying a plurality of images;
     acquiring a voice uttered by a user in response to the images and converting the voice into a corresponding character/number string; and
     performing predetermined authentication when, upon connecting the displayed plurality of images in the order of the converted character/number string, the resulting trajectory matches the stored pattern.
  13.  A processing method for an electronic apparatus, comprising:
     storing a password consisting of a character/number string;
     displaying at least the characters/numbers constituting the character/number string of the password in association with a plurality of images according to a predetermined relationship;
     acquiring a voice uttered by a user in response to the images associated with the characters/numbers and converting the voice into a first character/number string;
     converting the password into a second character/number string according to the predetermined relationship; and
     performing predetermined authentication when the first character/number string matches the second character/number string.
  14.  A processing method of an electronic device, comprising:
     storing a password consisting of a character/number string;
     displaying at least the characters/numbers constituting the character/number string of the password in association with a plurality of images according to a predetermined relationship;
     acquiring voice uttered by a user in response to the images associated with the characters/numbers and converting it into a third character/number string;
     converting the third character/number string into a fourth character/number string according to the predetermined relationship; and
     performing predetermined authentication when the fourth character/number string matches the password.
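Claim 14 runs the conversion in the opposite direction to claim 13: the recognized words (third string) are mapped back into the password alphabet (fourth string) and compared with the stored password itself. A minimal sketch under the same illustrative lookup-table assumption:

```python
# Hypothetical sketch of claim 14: invert the displayed relationship, convert
# the recognized words back to digits, and compare with the stored password.

relationship = {"1": "red", "2": "blue", "3": "green", "4": "yellow"}
inverse = {word: digit for digit, word in relationship.items()}

stored_password = "312"        # password held by the device (illustrative)

def to_fourth_string(third_string):
    """Fourth string: recognized words converted back to the password alphabet."""
    return "".join(inverse[word] for word in third_string)

def authenticate(third_string):
    return to_fourth_string(third_string) == stored_password

print(authenticate(["green", "red", "blue"]))   # True
```

Claims 13 and 14 reach the same accept/reject decision; they differ only in which side of the comparison is converted.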
  15.  A processing method of an electronic device, comprising:
     displaying the characters/numbers constituting a prescribed character/number string in association with a plurality of images according to a predetermined relationship;
     acquiring voice uttered by a user in response to the images associated with the characters/numbers and converting it into a first character/number string;
     converting the first character/number string converted by the voice recognition device into a second character/number string according to the predetermined relationship; and
     inputting the second character/number string as the prescribed character/number string.
  16.  A program for causing a computer to execute the processing method of an electronic device according to any one of claims 12 to 15.
PCT/JP2017/021067 2016-06-14 2017-06-07 Electronic apparatus, head-mounted display, processing method of electronic apparatus, and program therefor WO2017217288A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2016118046 2016-06-14
JP2016-118046 2016-06-14

Publications (1)

Publication Number Publication Date
WO2017217288A1 true WO2017217288A1 (en) 2017-12-21

Family

ID=60663453

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/021067 WO2017217288A1 (en) 2016-06-14 2017-06-07 Electronic apparatus, head-mounted display, processing method of electronic apparatus, and program therefor

Country Status (1)

Country Link
WO (1) WO2017217288A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008257701A (en) * 2007-03-12 2008-10-23 Yahoo Japan Corp Authentication system
JP2010009544A (en) * 2008-06-30 2010-01-14 Toppan Forms Co Ltd Personal identification system and personal identification method
JP2014092941A (en) * 2012-11-02 2014-05-19 Sony Corp Information processor and information processing method and computer program
CN104468522A (en) * 2014-11-07 2015-03-25 百度在线网络技术(北京)有限公司 Voiceprint authentication method and device

Similar Documents

Publication Publication Date Title
US10360412B2 (en) Contextual contemporaneous gesture and keyboard entry authentication
US9503800B2 (en) Glass-type terminal and method of controlling the same
US8090201B2 (en) Image-based code
US9275213B2 (en) Method and system for securing the entry of data to a device
EP2851831B1 (en) Mobile Information Gateway for Home Healthcare
EP2851832B1 (en) Mobile information gateway for use by medical personnel
US8873147B1 (en) Chord authentication via a multi-touch interface
KR102393892B1 (en) Terminal device and method for performing user authentication using biometric information
US9336779B1 (en) Dynamic image-based voice entry of unlock sequence
JP2014092940A (en) Image display device and image display method and computer program
US9552471B1 (en) Personal familiarity authentication
JP2008241822A (en) Image display device
CN105900103A (en) Touch terminal and password generation method thereof
WO2017217288A1 (en) Electronic apparatus, head-mounted display, processing method of electronic apparatus, and program therefor
Saulynas et al. Towards the use of brain–computer interface and gestural technologies as a potential alternative to PIN authentication
US20160224808A1 (en) Information input method having confidentiality
WO2016200084A1 (en) Iris recognition usb device using otp function and method of controlling same
RU2751095C2 (en) Providing access to structured stored data
Tkauc et al. Cloud-Based Face and Speech Recognition for Access Control Applications
KR20050090102A (en) Number fade password input system
TW200841204A (en) Hand cryptographic device
TW201617951A (en) Device and method for password input
TWI644232B (en) Method and apparatus for password entering
KR20020076487A (en) A method for authentication of a person using motion picture information
WO2015093221A1 (en) Electronic device and program

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 17813186; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 17813186; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)