KR101681944B1 - Korean pronunciation display device and method for any of the input speech - Google Patents


Info

Publication number
KR101681944B1
Authority
KR
South Korea
Prior art keywords
korean
pronunciation
speech signal
voice
words
Prior art date
Application number
KR1020150097541A
Other languages
Korean (ko)
Inventor
이기남
이문호
Original Assignee
(주)신명시스템즈
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by (주)신명시스템즈 filed Critical (주)신명시스템즈
Priority to KR1020150097541A priority Critical patent/KR101681944B1/en
Application granted granted Critical
Publication of KR101681944B1 publication Critical patent/KR101681944B1/en

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00: Speech recognition
    • G10L 15/28: Constructional details of speech recognition systems
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00: Speech recognition
    • G10L 15/08: Speech classification or search
    • G10L 15/14: Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Probability & Statistics with Applications (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

Provided are a device and method for displaying the Korean pronunciation of an arbitrary input voice. According to an embodiment of the present invention, the device maintains a voice signal database, which stores, for a plurality of languages, a plurality of words per language together with a voice signal pattern for the pronunciation of each word, and a Korean pronunciation database, which stores Korean pronunciation characters for the words of each language. When a user voice in a minority language not stored in the databases is input through a microphone, the voice signal pattern with the highest similarity to the pattern of the user voice is selected from the patterns stored in the voice signal database. The Korean pronunciation characters corresponding to the word associated with the selected pattern are then extracted from the Korean pronunciation database and shown on a display. Thus, even when a user voice in a language not stored in the databases is input, the device can output the Korean pronunciation characters most similar to that voice on the display.

Description

BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to an apparatus and method for displaying the Korean pronunciation of an arbitrary input voice.

Embodiments of the present invention are directed to a technology for recognizing an arbitrary input voice, such as a voice in a minority language, and displaying its pronunciation in Korean characters.

2. Description of the Related Art With the recent development of speech recognition technology, techniques for recognizing a user's voice and providing various services based on the recognized voice have appeared.

For example, there is a technique of controlling a mobile phone by recognizing a voice on the phone itself, and a technique of mounting a microphone for voice recognition on a TV remote controller and then controlling the TV after recognizing the user's voice through that microphone.

Speech recognition technology compares the pattern of an input user voice with stored voice patterns, determines the stored pattern with the highest similarity to the user voice pattern, and thereby recognizes the user's voice.

More specifically, when a user voice is input, the system removes noise from the voice, extracts a feature vector from the noise-removed voice, and finally determines which word or instruction the user voice corresponds to.

Methods for extracting the feature vector of the user voice include the linear predictive coefficient (LPC) technique, which uses the concept that the current signal can be predicted from a combination of previous signals, and the cepstrum technique, which helps maintain a robust recognition rate regardless of changes in the speaker.
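
As an illustration (not part of the patent text), the linear-prediction idea above can be sketched in a few lines: each sample is modeled as a weighted sum of the previous samples, and the weights are estimated by least squares. The signal and model order here are toy assumptions.

```python
import numpy as np

def lpc_coefficients(signal, order):
    """Estimate linear-prediction coefficients by least squares:
    each sample is approximated as a weighted combination of the
    previous `order` samples, as described in the text above."""
    # Matrix of past samples (most recent first) and vector of targets.
    rows = [signal[i - order:i][::-1] for i in range(order, len(signal))]
    X = np.array(rows)
    y = signal[order:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

# Synthetic noise-free AR(2) signal: s[n] = 0.6*s[n-1] - 0.2*s[n-2]
s = np.zeros(200)
s[0], s[1] = 1.0, 0.5
for n in range(2, 200):
    s[n] = 0.6 * s[n - 1] - 0.2 * s[n - 2]

a = lpc_coefficients(s, 2)
print(np.round(a, 3))  # ≈ [0.6, -0.2], the generating coefficients
```

Because the toy signal exactly satisfies the recurrence, the estimated coefficients recover the generating ones; on real speech the coefficients instead summarize the vocal-tract filter over a short frame.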

As methods of measuring the similarity between the feature vector of the user voice and previously stored speech patterns, there are vector quantization, in which the feature vector is compared with a codebook made up of a set of reference vectors and matched to the code with the maximum similarity in the codebook, and the Hidden Markov Model (HMM), in which statistical information on the patterns corresponding to speech units is stored as a probability model; when an unknown input pattern arrives, the probability that each model could produce the pattern is calculated, and the speech unit best fitting the pattern is found.
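
The vector-quantization step described above reduces to a nearest-neighbor lookup in the codebook. A minimal sketch (the codebook entries and feature vectors are illustrative assumptions, not data from the patent):

```python
import numpy as np

# Toy codebook: each entry is a reference feature vector for one speech unit.
codebook = {
    "a": np.array([1.0, 0.0, 0.0]),
    "i": np.array([0.0, 1.0, 0.0]),
    "u": np.array([0.0, 0.0, 1.0]),
}

def nearest_code(feature):
    """Vector quantization: return the codebook entry at the smallest
    Euclidean distance from the input, i.e. the maximum similarity."""
    return min(codebook, key=lambda k: np.linalg.norm(codebook[k] - feature))

matched = nearest_code(np.array([0.9, 0.1, 0.0]))
print(matched)  # "a"
```

An HMM-based matcher would instead score the input against a probability model per speech unit; the selection step (take the maximum score) is the same shape.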

As speech recognition technology has developed, services have appeared that display the pronunciation of a spoken English word in Korean characters when a user speaks that word into a microphone. For example, when a user says the English word "school" into a microphone, such a service recognizes the user's voice and displays the pronunciation of the English word "school" in Korean characters.

These services are built not only for English but also for various languages such as Japanese, Chinese, and Spanish, and can be used effectively when Korean users want to learn those languages.

However, since these services usually hold voice data only for widely used languages such as English, Japanese, Chinese, and Spanish, there are many cases where no Korean pronunciation characters are displayed for a word pronounced in a minority language.

Therefore, research is needed on technology that, even when a user inputs a voice in a minority language for which no voice data has been built, uses the voice data already built for English, Japanese, Chinese, Spanish, and so on to display a Korean pronunciation for that voice.

A Korean pronunciation display apparatus and method for an arbitrary input speech according to an embodiment of the present invention maintains a speech signal database storing, for a plurality of languages, a plurality of words per language and a speech signal pattern for the pronunciation of each word, and a Korean pronunciation database storing Korean pronunciation characters for those words. When a user voice in a minority language not stored in the databases is input through the microphone, the speech signal pattern with the highest similarity to the pattern of the user voice is selected from the patterns stored in the speech signal database, and the Korean pronunciation characters corresponding to the word of the language associated with the selected pattern are extracted from the Korean pronunciation database and shown on a display. Thus, even if a user voice in a language not stored in the databases is input, the Korean pronunciation characters most similar to that voice can be output on the display.

The Korean pronunciation display apparatus for an arbitrary input speech according to an embodiment of the present invention includes: a speech signal database in which, for each of a plurality of languages, a plurality of words and a predetermined speech signal pattern for the pronunciation of each word are stored in correspondence with each other; a Korean pronunciation database in which the plurality of words for each language and Korean pronunciation characters representing the pronunciation of each word in Korean letters are stored in correspondence with each other; a similarity measuring unit that, when a user voice is input through a microphone, recognizes the user voice and measures the similarity between the speech signal pattern of the user voice and the predetermined speech signal patterns stored in the speech signal database; a pattern selection unit that selects, from the predetermined speech signal patterns stored in the speech signal database, a first speech signal pattern having the maximum similarity; a word extracting unit that extracts from the speech signal database a first word stored in correspondence with the first speech signal pattern; a Korean pronunciation extracting unit that extracts from the Korean pronunciation database a first Korean pronunciation character corresponding to the first word; and a Korean pronunciation output unit that outputs the extracted first Korean pronunciation character through a display as the Korean pronunciation corresponding to the user voice.

Further, a Korean pronunciation display method for an arbitrary input speech according to an embodiment of the present invention includes the steps of: maintaining a speech signal database in which, for each of a plurality of languages, a plurality of words and a predetermined speech signal pattern for the pronunciation of each word are stored in correspondence with each other; maintaining a Korean pronunciation database in which the plurality of words for each language and Korean pronunciation characters representing the pronunciation of each word in Korean letters are stored in correspondence with each other; when a user voice is input through a microphone, recognizing the user voice and measuring the similarity between the speech signal pattern of the user voice and the predetermined speech signal patterns stored in the speech signal database; selecting, from the predetermined speech signal patterns stored in the speech signal database, a first speech signal pattern having the maximum similarity; extracting from the speech signal database a first word corresponding to the first speech signal pattern; extracting from the Korean pronunciation database a first Korean pronunciation character corresponding to the first word; and outputting the extracted first Korean pronunciation character through a display as the Korean pronunciation corresponding to the user voice.

A Korean pronunciation display apparatus and method for an arbitrary input speech according to an embodiment of the present invention maintains a speech signal database storing, for a plurality of languages, a plurality of words per language and a speech signal pattern for the pronunciation of each word, and a Korean pronunciation database storing Korean pronunciation characters for those words. When a user voice in a minority language not stored in the databases is input through the microphone, the speech signal pattern with the highest similarity to the pattern of the user voice is selected from the patterns stored in the speech signal database, and the Korean pronunciation characters corresponding to the word of the language associated with the selected pattern are extracted from the Korean pronunciation database and shown on a display. Thus, even if a user voice in a language not stored in the databases is input, it is possible to ensure that the Korean pronunciation characters most similar to the user voice are displayed on the display.

FIG. 1 is a diagram illustrating the structure of a Korean pronunciation display apparatus for an arbitrary input speech according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a method of displaying the Korean pronunciation of an arbitrary input voice according to an embodiment of the present invention.

While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It is to be understood, however, that the invention is not to be limited to the specific embodiments, but includes all modifications, equivalents, and alternatives falling within the spirit and scope of the invention. Like reference numerals are used for like elements in describing each drawing.

It is to be understood that when an element is referred to as being "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intervening elements may be present. On the other hand, when an element is referred to as being "directly connected" or "directly coupled" to another element, it should be understood that there are no intervening elements.

The terminology used in this application is used only to describe a specific embodiment and is not intended to limit the invention. The singular expressions include plural expressions unless the context clearly dictates otherwise. In the present application, the terms "comprises" or "having" and the like are used to specify the presence of the features, numbers, steps, operations, elements, components, or combinations thereof described in the specification, and do not preclude the presence or addition of one or more other features, numbers, steps, operations, elements, components, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical or scientific terms, have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Terms such as those defined in commonly used dictionaries are to be interpreted as having meanings consistent with their contextual meanings in the related art, and are not to be interpreted as having ideal or overly formal meanings unless expressly so defined in the present application.

Hereinafter, embodiments according to the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram illustrating the structure of a Korean pronunciation display apparatus for an arbitrary input speech according to an embodiment of the present invention.

Referring to FIG. 1, a Korean pronunciation display apparatus 110 for an arbitrary input speech according to an embodiment of the present invention includes a speech signal database 111, a Korean pronunciation database 112, a similarity measuring unit 113, a pattern selection unit 114, a word extracting unit 115, a Korean pronunciation extracting unit 116, and a Korean pronunciation output unit 117.

In the voice signal database 111, for each of a plurality of languages, a plurality of words and a predetermined voice signal pattern for the pronunciation of each word are stored in correspondence with each other.

For example, information may be stored in the voice signal database 111 as shown in Table 1 below.

Language | Word | Predetermined voice signal pattern
English | Word 1 | Voice signal pattern 1
English | Word 2 | Voice signal pattern 2
English | ... | ...
Chinese | Word A | Voice signal pattern A
Chinese | Word B | Voice signal pattern B
Chinese | ... | ...
Japanese | Word ... | Voice signal pattern ...
... | ... | ...
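
As an aside not found in the patent, a structure like Table 1 maps naturally onto a nested dictionary keyed by language and word; the patterns below are placeholder lists, and all names are illustrative assumptions.

```python
# Hypothetical in-memory form of Table 1: for each language, each word is
# mapped to its stored reference voice signal pattern (here, plain floats).
voice_signal_db = {
    "English": {"word 1": [0.1, 0.4, 0.2], "word 2": [0.3, 0.3, 0.1]},
    "Chinese": {"word A": [0.7, 0.1, 0.5], "word B": [0.2, 0.9, 0.4]},
}

def patterns(db):
    """Iterate over (language, word, pattern) triples for similarity matching."""
    for lang, words in db.items():
        for word, pattern in words.items():
            yield lang, word, pattern

count = sum(1 for _ in patterns(voice_signal_db))
print(count)  # 4 stored (language, word, pattern) entries
```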

The Korean pronunciation database 112 stores, for each language, the plurality of words and Korean pronunciation characters in which the pronunciation of each word is represented in Korean letters, in correspondence with each other.

For example, information may be stored in the Korean pronunciation database 112 as shown in Table 2 below.

Language | Word | Korean pronunciation letter
English | Word 1 | Korean pronunciation letter 1
English | Word 2 | Korean pronunciation letter 2
English | ... | ...
Chinese | Word A | Korean pronunciation letter A
Chinese | Word B | Korean pronunciation letter B
Chinese | ... | ...
Japanese | Word ... | Korean pronunciation letter ...
... | ... | ...

When a user voice is input through a microphone, the similarity measuring unit 113 recognizes the user voice and measures the similarity between the predetermined voice signal patterns, stored in the voice signal database 111 for the pronunciation of each word of each language, and the voice signal pattern of the user voice.

In this case, according to an embodiment of the present invention, the similarity measuring unit 113 may extract a feature vector from the user voice through a linear predictive coefficient technique or a cepstrum technique, and then measure, through a vector quantization technique or a Hidden Markov Model (HMM), the similarity between the predetermined speech signal patterns stored in the speech signal database 111 and the speech signal pattern of the user voice.

The pattern selection unit 114 selects, from the predetermined speech signal patterns stored in the speech signal database 111, a first speech signal pattern having the maximum similarity.

The word extracting unit 115 extracts a first word stored in the speech signal database 111 in correspondence with the first speech signal pattern.

The Korean pronunciation extraction unit 116 extracts the first Korean pronunciation character corresponding to the first word from the Korean pronunciation database 112.

The Korean pronunciation output unit 117 outputs the extracted first Korean pronunciation character through the display as the Korean pronunciation corresponding to the user's voice.
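
Taken together, units 113 to 117 form a lookup pipeline: measure similarity, pick the best pattern, find its word, and emit that word's Korean pronunciation. A minimal sketch of that flow (the toy databases, the cosine-similarity choice, and the example patterns are all illustrative assumptions, not the patent's data):

```python
import numpy as np

# Illustrative stand-ins for the voice signal database 111 and the
# Korean pronunciation database 112.
voice_signal_db = {
    ("English", "school"): np.array([0.9, 0.1, 0.0]),
    ("Chinese", "xuexiao"): np.array([0.1, 0.8, 0.3]),
}
korean_pron_db = {"school": "스쿨", "xuexiao": "쉬에샤오"}

def cosine(a, b):
    """One possible similarity measure between two patterns."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def display_korean_pronunciation(user_pattern):
    """Pipeline of units 113-117: select the stored pattern most similar
    to the input, extract its word, and return the Korean pronunciation."""
    (lang, word), _ = max(voice_signal_db.items(),
                          key=lambda kv: cosine(kv[1], user_pattern))
    return korean_pron_db[word]

result = display_korean_pronunciation(np.array([0.8, 0.2, 0.1]))
print(result)  # 스쿨
```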

Hereinafter, the operation of the Korean pronunciation display apparatus 110 for an arbitrary input speech will be described in detail through an example.

If the user inputs a specific user voice in a minority language through the microphone, the similarity measuring unit 113 may recognize the user voice and measure the similarity between the predetermined voice signal patterns, stored in the voice signal database 111 for the pronunciation of each word of each language, and the voice signal pattern of the user voice.

If the speech signal pattern having the highest similarity to the pattern of the user voice, among the patterns previously designated for the pronunciation of each word of each language stored in the speech signal database 111, is "voice signal pattern A" in Table 1, the pattern selection unit 114 may select "voice signal pattern A", whose similarity is the maximum, from the speech signal database 111.

Then, the word extracting unit 115 may extract "word A", the Chinese word stored in correspondence with "voice signal pattern A", from the speech signal database 111.

Then, the Korean pronunciation extracting unit 116 may extract "Korean pronunciation letter A", the Korean pronunciation character stored in correspondence with "word A", from the Korean pronunciation database 112.

At this time, the Korean pronunciation output unit 117 may output the extracted "Korean pronunciation letter A" through the display as the Korean pronunciation corresponding to the user voice input in the minority language.

As a result, the Korean pronunciation display apparatus 110 for an arbitrary input speech according to an embodiment of the present invention maintains the speech signal database 111, which stores a plurality of words for each language and a speech signal pattern for the pronunciation of each word, and the Korean pronunciation database 112, which stores Korean pronunciation characters for the words of each language. When a user voice in a minority language not stored in the databases is input to the microphone, the speech signal pattern with the highest similarity to the pattern of the user voice is selected from the patterns stored in the speech signal database 111, the Korean pronunciation character corresponding to the word of the language associated with the selected pattern is extracted from the Korean pronunciation database 112, and that character is displayed. Thus, even if a user voice in a language not stored in the databases is input, the Korean pronunciation character most similar to the user voice is shown on the display, supporting the user with a Korean pronunciation for the minority language.

According to an embodiment of the present invention, the pattern selection unit 114 may select the first voice signal pattern and, in addition, further select, from the voice signal patterns of the plurality of words for each language stored in the voice signal database 111, at least one second voice signal pattern, excluding the first voice signal pattern, whose similarity exceeds a predetermined first similarity.

At this time, the word extracting unit 115 may extract the first word stored in correspondence with the first speech signal pattern from the speech signal database 111 and, at the same time, extract at least one second word stored in correspondence with the at least one second speech signal pattern.

At this time, the Korean pronunciation extracting unit 116 may extract the first Korean pronunciation character stored in correspondence with the first word from the Korean pronunciation database 112 and, at the same time, extract at least one second Korean pronunciation character stored in correspondence with the at least one second word.

The Korean pronunciation output unit 117 may output the extracted first Korean pronunciation character to a selected first region of the display as the Korean pronunciation corresponding to the user voice, and output the at least one second Korean pronunciation character to a selected second region of the display as candidate Korean pronunciations corresponding to the user voice.
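
The selection rule described here (take the maximum, then every other entry above a first similarity threshold as candidates) can be sketched as follows; the similarity scores and threshold are toy assumptions for illustration.

```python
def select_candidates(similarities, first_threshold):
    """Return (best, others): the maximum-similarity word, plus every
    other word whose similarity exceeds the first threshold, mirroring
    the first/second speech signal pattern selection described above."""
    best = max(similarities, key=similarities.get)
    others = [w for w, s in similarities.items()
              if w != best and s > first_threshold]
    return best, others

sims = {"word 1": 0.95, "word A": 0.80, "word 가": 0.78, "word B": 0.40}
best, cands = select_candidates(sims, 0.7)
print(best, cands)  # word 1 ['word A', 'word 가']
```

The best entry would then go to the first display region and the candidate list to the second region.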

For example, suppose that when the user inputs a user voice in a minority language to the microphone, the similarity measuring unit 113 measures the similarity between the predetermined voice signal patterns stored in the voice signal database 111 and the pattern of the user voice, finds that the pattern with the maximum similarity is "voice signal pattern 1" in Table 1, and finds that the patterns whose similarity exceeds the predetermined first similarity are "voice signal pattern A" and "voice signal pattern ..." in Table 1.

At this time, the pattern selection unit 114 may select "voice signal pattern 1", which has the maximum similarity, from the voice signal database 111, and may further select "voice signal pattern A" and "voice signal pattern ...", the patterns exceeding the predetermined first similarity.

The word extracting unit 115 may extract "word 1", stored in correspondence with "voice signal pattern 1", from the speech signal database 111, and may further extract "word A" and "word ...", stored in correspondence with "voice signal pattern A" and "voice signal pattern ...".

Then, the Korean pronunciation extracting unit 116 may extract "Korean pronunciation letter 1", stored in correspondence with "word 1", from the Korean pronunciation database 112, and may further extract "Korean pronunciation letter A" and "Korean pronunciation letter ...", stored in correspondence with "word A" and "word ...".

At this time, the Korean pronunciation output unit 117 may output "Korean pronunciation letter 1" to the selected first region of the display as the Korean pronunciation corresponding to the user voice, and may output "Korean pronunciation letter A" and "Korean pronunciation letter ..." to the selected second region of the display as candidate Korean pronunciations corresponding to the user voice.

Accordingly, the user can see the Korean pronunciation characters most similar to the user voice and, additionally, can see candidate Korean pronunciations close to the user voice.

According to an embodiment of the present invention, the plurality of languages may include Korean, and the speech signal database 111 may store a plurality of Korean words and predetermined speech signal patterns for the pronunciation of each of the plurality of Korean words in correspondence with each other.

At this time, the Korean pronunciation database 112 may store the plurality of Korean words and Korean pronunciation characters, in which the pronunciation of each Korean word is expressed in Korean letters, in correspondence with each other.

The similarity measuring unit 113 may include a primary measurement unit 118, a similarity determination unit 119, and a secondary measurement unit 120.

When the user voice is input through the microphone, the primary measurement unit 118 may recognize the user voice and primarily measure the similarity between the predetermined voice signal patterns, stored in the voice signal database 111 for the pronunciation of each of the plurality of Korean words, and the voice signal pattern of the user voice.

The similarity determination unit 119 may determine whether, among the predetermined speech signal patterns for the pronunciation of each of the plurality of Korean words stored in the speech signal database 111, there exists a speech signal pattern whose primarily measured similarity exceeds a predetermined second similarity.

If no speech signal pattern among those for the plurality of Korean words stored in the voice signal database 111 has a primarily measured similarity exceeding the predetermined second similarity, the secondary measurement unit 120 may secondarily measure the similarity between the predetermined speech signal patterns for the pronunciation of each word of every language stored in the speech signal database 111 and the speech signal pattern of the user voice.

At this time, when the secondary measurement of the similarity is completed, the pattern selection unit 114 may select, as the first speech signal pattern, the pattern with the maximum secondarily measured similarity among the predetermined speech signal patterns stored in the speech signal database 111.

That is, the voice signal database 111 and the Korean pronunciation database 112 contain not only data for foreign languages such as English, Japanese, and Chinese but also Korean data. When the user voice is input, the primary measurement unit 118 compares the voice signal patterns for the Korean words with the voice signal pattern of the user voice; if it is determined that no voice signal pattern for a Korean word exceeds the second similarity, the secondary measurement unit 120 compares the voice signal patterns for the words of all languages stored in the voice signal database 111 with the pattern of the user voice, so that the process of displaying the Korean pronunciation characters corresponding to the user voice can be performed.
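
The two-stage flow of units 118 to 120 can be sketched as a Korean-first match with a fallback to the full multilingual database. In this sketch the "patterns" are single numbers and the similarity function is a toy; both are illustrative assumptions, not the patent's representation.

```python
def match(user_pattern, korean_db, full_db, similarity, second_threshold):
    """Two-stage matching: compare against Korean words first (units
    118/119); only if no Korean pattern exceeds the second similarity
    threshold, fall back to the full multilingual database (unit 120)."""
    korean_scores = {w: similarity(p, user_pattern)
                     for w, p in korean_db.items()}
    best_kr = max(korean_scores, key=korean_scores.get)
    if korean_scores[best_kr] > second_threshold:
        return ("korean", best_kr)          # primary match succeeded
    full_scores = {w: similarity(p, user_pattern)
                   for w, p in full_db.items()}
    return ("full", max(full_scores, key=full_scores.get))

sim = lambda a, b: 1.0 - abs(a - b)          # toy similarity on scalars
kr = {"학교": 0.9}
full = {"school": 0.3, "학교": 0.9}

print(match(0.31, kr, full, sim, 0.8))  # ('full', 'school')
print(match(0.88, kr, full, sim, 0.8))  # ('korean', '학교')
```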

In this case, according to an embodiment of the present invention, if a speech signal pattern whose primarily measured similarity exceeds the predetermined second similarity exists among the predetermined speech signal patterns for the pronunciation of the plurality of Korean words, the pattern selection unit 114 may select a third speech signal pattern whose primarily measured similarity is the maximum among the speech signal patterns for the plurality of Korean words stored in the speech signal database 111.

When the third voice signal pattern is selected, the word extracting unit 115 may extract, from the plurality of Korean words stored in the voice signal database 111, a first Korean word stored in correspondence with the third voice signal pattern.

The Korean pronunciation extraction unit 116 may extract a third Korean pronunciation character corresponding to the first Korean word from the Korean pronunciation database 112.

Then, the Korean pronunciation output unit 117 may output the extracted third Korean pronunciation character through the display as the Korean pronunciation corresponding to the user voice.

That is, the voice signal database 111 and the Korean pronunciation database 112 contain not only data for foreign languages such as English, Japanese, and Chinese but also Korean data. When the user voice is input, the primary measurement unit 118 compares the voice signal patterns for the Korean words with the voice signal pattern of the user voice; if it is determined that a voice signal pattern for a Korean word exceeding the second similarity exists, the pattern selection unit 114 and the word extracting unit 115 select that pattern and extract the Korean word most similar to the pattern of the user voice from the voice signal patterns for the Korean words stored in the voice signal database 111, and the Korean pronunciation extracting unit 116 extracts from the Korean pronunciation database 112 the Korean pronunciation character corresponding to the extracted Korean word. In this way, the process of displaying the Korean pronunciation characters corresponding to the user voice can be performed preferentially on the basis of the Korean data stored in the voice signal database 111 and the Korean pronunciation database 112.

FIG. 2 is a flowchart illustrating a method of displaying Korean pronunciation of an arbitrary input voice according to an embodiment of the present invention.

In step S210, a speech signal database is maintained in which, for each of a plurality of languages, a plurality of words and a predetermined speech signal pattern for the pronunciation of each word are stored in correspondence with each other.

In step S220, a Korean pronunciation database is maintained in which the plurality of words for each language and Korean pronunciation characters representing the pronunciation of each word in Korean letters are stored in correspondence with each other.

In step S230, when the user voice is input through the microphone, the user voice is recognized, and the similarity between the predetermined speech signal patterns stored in the speech signal database and the speech signal pattern of the user voice is measured.

In step S240, a first speech signal pattern having the maximum similarity among speech signal patterns previously designated for pronunciation of each of a plurality of words for each language stored in the speech signal database is selected.

In step S250, a first word stored in association with the first speech signal pattern is extracted from the speech signal database.

In step S260, the first Korean pronunciation character corresponding to the first word is extracted from the Korean pronunciation database.

In step S270, the extracted first Korean pronunciation character is output through a display as the Korean pronunciation corresponding to the user's voice.

At this time, according to an embodiment of the present invention, in step S240, the first voice signal pattern having the maximum similarity may be selected from among the predetermined voice signal patterns for the pronunciation of each of the plurality of words for each language stored in the voice signal database, and at least one second voice signal pattern, excluding the first voice signal pattern, whose similarity exceeds a predetermined first similarity may be further selected.

At this time, in step S250, the first word stored in correspondence with the first voice signal pattern is extracted from the voice signal database, and at the same time at least one second word corresponding to the at least one second voice signal pattern is extracted; in step S260, the first Korean pronunciation character stored in correspondence with the first word is extracted from the Korean pronunciation database, and at the same time at least one second Korean pronunciation character corresponding to the at least one second word is extracted; and in step S270, the extracted first Korean pronunciation character may be output to a selected first region of the display as the Korean pronunciation corresponding to the user's voice, while the at least one second Korean pronunciation character may be output to a selected second region of the display as a Korean candidate-group pronunciation corresponding to the user's voice.
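The candidate-group variant can be sketched as follows: keep the single best match for the first region of the display and collect every other word whose similarity still clears the first threshold for the second region. Function name, threshold value, and data layout are illustrative assumptions.

```python
def select_candidates(user_pattern, voice_signal_db, similarity,
                      first_similarity=0.8):
    """Sketch of the S240-S270 variant: best match plus a candidate group."""
    # Score every stored pattern against the user's voice (S230).
    scored = []
    for words in voice_signal_db.values():
        for word, pattern in words.items():
            scored.append((similarity(pattern, user_pattern), word))
    scored.sort(reverse=True)
    # The top-scoring word corresponds to the first voice signal pattern;
    # the remaining words above the first similarity threshold form the
    # Korean candidate group shown in the second display region.
    best_word = scored[0][1]
    candidates = [word for score, word in scored[1:] if score > first_similarity]
    return best_word, candidates
```

Presenting near-miss candidates alongside the best match lets the user correct a misrecognition without speaking again.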

According to an embodiment of the present invention, the voice signal database stores, for the Korean language among the plurality of languages, a plurality of Korean words and a predetermined voice signal pattern for the pronunciation of each of the plurality of Korean words in correspondence with each other, and the Korean pronunciation database stores the plurality of Korean words and Korean pronunciation characters, in which the pronunciation of each of the plurality of Korean words is represented by Korean characters, in correspondence with each other.

At this time, step S230 may include: a step of, when the user's voice is input from the user through the microphone, recognizing the user's voice and primarily measuring the similarity between the predetermined voice signal pattern for the pronunciation of each of the plurality of Korean words stored in the voice signal database and the voice signal pattern for the user's voice; a step of determining whether a voice signal pattern whose primarily measured similarity exceeds a predetermined second similarity exists among the predetermined voice signal patterns for the pronunciation of each of the plurality of Korean words; and a step of, when it is determined that no such voice signal pattern exists, secondarily measuring the similarity between the predetermined voice signal pattern for the pronunciation of each of the plurality of words for each language stored in the voice signal database and the voice signal pattern for the user's voice.

At this time, in step S240, when the secondary measurement of the similarity is completed, the first voice signal pattern having the maximum secondarily measured similarity may be selected from among the predetermined voice signal patterns for the pronunciation of each of the plurality of words for each language stored in the voice signal database.

In this case, the Korean pronunciation display method for an arbitrary input voice according to an embodiment of the present invention may further include: a step of, when it is determined that a voice signal pattern whose primarily measured similarity exceeds the predetermined second similarity exists among the predetermined voice signal patterns for the pronunciation of each of the plurality of Korean words, selecting a third voice signal pattern having the maximum primarily measured similarity from among the predetermined voice signal patterns stored in the voice signal database for the pronunciation of each of the plurality of Korean words; a step of extracting, from among the plurality of Korean words stored in the voice signal database, a first Korean word stored in correspondence with the third voice signal pattern; a step of extracting, from the Korean pronunciation database, a third Korean pronunciation character stored in correspondence with the first Korean word; and a step of outputting the extracted third Korean pronunciation character through the display as the Korean pronunciation corresponding to the user's voice.

The Korean pronunciation display method for an arbitrary input voice according to an embodiment of the present invention has been described above with reference to FIG. 2. Here, this method may correspond to the configuration and operation of the Korean pronunciation display apparatus 110 for an arbitrary input voice described with reference to FIG. 1, and a detailed description thereof will therefore be omitted.

The Korean pronunciation display method for an arbitrary input voice according to an embodiment of the present invention can be implemented as a computer program stored in a storage medium for execution in combination with a computer.

Also, the Korean pronunciation display method for an arbitrary input voice according to an embodiment of the present invention may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be those specially designed and constructed for the present invention, or they may be those known and available to those skilled in the art of computer software. Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine language code, such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

As described above, the present invention has been described with reference to particular embodiments, specific elements, and the accompanying drawings. However, it should be understood that the present invention is not limited to the above-described embodiments, and various modifications and changes may be made by those skilled in the art to which the present invention pertains.

Accordingly, the spirit of the present invention should not be construed as being limited to the embodiments described; not only the following claims but also all equivalents and equivalent modifications of the claims belong to the scope of the present invention.

110: Korean pronunciation display device for arbitrary input speech
111: voice signal database 112: Korean pronunciation database
113: similarity measuring unit 114: pattern selector
115: word extraction unit 116: Korean pronunciation extraction unit
117: Korean pronunciation output unit 118: Primary measurement unit
119: degree of similarity determination unit 120: secondary measurement unit

Claims (10)

A Korean pronunciation display device for an arbitrary input voice, the device comprising:
A voice signal database in which, for each of a plurality of languages, a plurality of words for each language and a predetermined voice signal pattern for the pronunciation of each of those words are stored in correspondence with each other;
A Korean pronunciation database in which the plurality of words for each language and Korean pronunciation characters, which represent the pronunciation of each of those words in Korean characters, are stored in correspondence with each other;
A similarity measuring unit which, when a user's voice is input from the user through a microphone, recognizes the user's voice and measures the similarity between the predetermined voice signal pattern for the pronunciation of each of the plurality of words for each language stored in the voice signal database and the voice signal pattern for the user's voice;
A pattern selection unit which selects a first voice signal pattern having the maximum measured similarity from among the predetermined voice signal patterns stored in the voice signal database for the pronunciation of each of the plurality of words for each language, and further selects at least one second voice signal pattern, excluding the first voice signal pattern, whose similarity exceeds a predetermined first similarity;
A word extraction unit which extracts, from the voice signal database, a first word stored in correspondence with the first voice signal pattern and at least one second word stored in correspondence with the at least one second voice signal pattern;
A Korean pronunciation extraction unit which extracts, from the Korean pronunciation database, a first Korean pronunciation character corresponding to the first word and at least one second Korean pronunciation character corresponding to the at least one second word; and
A Korean pronunciation output unit which outputs the extracted first Korean pronunciation character to a selected first region of a display as the Korean pronunciation corresponding to the user's voice, while outputting the at least one second Korean pronunciation character to a selected second region of the display as a Korean candidate-group pronunciation corresponding to the user's voice,
wherein:
the voice signal database stores, for the Korean language among the plurality of languages, a plurality of Korean words and a predetermined voice signal pattern for the pronunciation of each of the plurality of Korean words in correspondence with each other,
the Korean pronunciation database stores the plurality of Korean words and Korean pronunciation characters, in which the pronunciation of each of the plurality of Korean words is represented by Korean characters, in correspondence with each other,
the similarity measuring unit includes:
A primary measurement unit which, when the user's voice is input from the user through the microphone, recognizes the user's voice and primarily measures the similarity between the predetermined voice signal pattern for the pronunciation of each of the plurality of Korean words stored in the voice signal database and the voice signal pattern for the user's voice;
A similarity determination unit which determines whether a voice signal pattern whose primarily measured similarity exceeds a predetermined second similarity exists among the predetermined voice signal patterns for the pronunciation of each of the plurality of Korean words stored in the voice signal database; and
A secondary measurement unit which, when it is determined that no voice signal pattern whose primarily measured similarity exceeds the predetermined second similarity exists among the predetermined voice signal patterns for the pronunciation of each of the plurality of Korean words stored in the voice signal database, secondarily measures the similarity between the predetermined voice signal pattern for the pronunciation of each of the plurality of words for each language stored in the voice signal database and the voice signal pattern for the user's voice,
wherein the pattern selection unit, when the secondary measurement of the similarity is completed, selects the first voice signal pattern having the maximum secondarily measured similarity from among the predetermined voice signal patterns stored in the voice signal database for the pronunciation of each of the plurality of words for each language, and selects, as the at least one second voice signal pattern, the patterns, excluding the first voice signal pattern, whose secondarily measured similarity exceeds the predetermined first similarity.
delete

delete

The device according to claim 1, wherein
the pattern selection unit, when it is determined that a voice signal pattern whose primarily measured similarity exceeds the predetermined second similarity exists among the predetermined voice signal patterns for the pronunciation of each of the plurality of Korean words, selects a third voice signal pattern having the maximum primarily measured similarity from among the predetermined voice signal patterns stored in the voice signal database for the pronunciation of each of the plurality of Korean words,
the word extraction unit, when the third voice signal pattern is selected, extracts, from among the plurality of Korean words stored in the voice signal database, a first Korean word stored in correspondence with the third voice signal pattern,
the Korean pronunciation extraction unit extracts, from the Korean pronunciation database, a third Korean pronunciation character stored in correspondence with the first Korean word, and
the Korean pronunciation output unit outputs the extracted third Korean pronunciation character through the display as the Korean pronunciation corresponding to the user's voice.
A Korean pronunciation display method for an arbitrary input voice, the method comprising:
Maintaining a voice signal database in which, for each of a plurality of languages, a plurality of words for each language and a predetermined voice signal pattern for the pronunciation of each of those words are stored in correspondence with each other;
Maintaining a Korean pronunciation database in which the plurality of words for each language and Korean pronunciation characters, which represent the pronunciation of each of those words in Korean characters, are stored in correspondence with each other;
When a user's voice is input from the user through a microphone, recognizing the user's voice and measuring the similarity between the predetermined voice signal pattern for the pronunciation of each of the plurality of words for each language stored in the voice signal database and the voice signal pattern for the user's voice;
Selecting a first voice signal pattern having the maximum measured similarity from among the predetermined voice signal patterns stored in the voice signal database for the pronunciation of each of the plurality of words for each language, and further selecting at least one second voice signal pattern, excluding the first voice signal pattern, whose similarity exceeds a predetermined first similarity;
Extracting, from the voice signal database, a first word stored in correspondence with the first voice signal pattern and at least one second word stored in correspondence with the at least one second voice signal pattern;
Extracting, from the Korean pronunciation database, a first Korean pronunciation character corresponding to the first word and at least one second Korean pronunciation character corresponding to the at least one second word; and
Outputting the extracted first Korean pronunciation character to a selected first region of a display as the Korean pronunciation corresponding to the user's voice, while outputting the at least one second Korean pronunciation character to a selected second region of the display as a Korean candidate-group pronunciation corresponding to the user's voice,
wherein:
the voice signal database stores, for the Korean language among the plurality of languages, a plurality of Korean words and a predetermined voice signal pattern for the pronunciation of each of the plurality of Korean words in correspondence with each other,
the Korean pronunciation database stores the plurality of Korean words and Korean pronunciation characters, in which the pronunciation of each of the plurality of Korean words is represented by Korean characters, in correspondence with each other,
the step of measuring the similarity includes:
When the user's voice is input from the user through the microphone, recognizing the user's voice and primarily measuring the similarity between the predetermined voice signal pattern for the pronunciation of each of the plurality of Korean words stored in the voice signal database and the voice signal pattern for the user's voice;
Determining whether a voice signal pattern whose primarily measured similarity exceeds a predetermined second similarity exists among the predetermined voice signal patterns for the pronunciation of each of the plurality of Korean words stored in the voice signal database; and
When it is determined that no voice signal pattern whose primarily measured similarity exceeds the predetermined second similarity exists among the predetermined voice signal patterns for the pronunciation of each of the plurality of Korean words stored in the voice signal database, secondarily measuring the similarity between the predetermined voice signal pattern for the pronunciation of each of the plurality of words for each language stored in the voice signal database and the voice signal pattern for the user's voice,
wherein, in the step of selecting the at least one second voice signal pattern, when the secondary measurement of the similarity is completed, the first voice signal pattern having the maximum secondarily measured similarity is selected from among the predetermined voice signal patterns stored in the voice signal database for the pronunciation of each of the plurality of words for each language, and the at least one second voice signal pattern, excluding the first voice signal pattern, whose secondarily measured similarity exceeds the predetermined first similarity is selected.
delete

delete

The method according to claim 5, further comprising:
When it is determined that a voice signal pattern whose primarily measured similarity exceeds the predetermined second similarity exists among the predetermined voice signal patterns for the pronunciation of each of the plurality of Korean words, selecting a third voice signal pattern having the maximum primarily measured similarity from among the predetermined voice signal patterns stored in the voice signal database for the pronunciation of each of the plurality of Korean words;
When the third voice signal pattern is selected, extracting, from among the plurality of Korean words stored in the voice signal database, a first Korean word stored in correspondence with the third voice signal pattern;
Extracting, from the Korean pronunciation database, a third Korean pronunciation character stored in correspondence with the first Korean word; and
Outputting the extracted third Korean pronunciation character through the display as the Korean pronunciation corresponding to the user's voice.
A computer-readable recording medium on which a program for performing the method according to any one of claims 5 to 8 is recorded.

A computer program stored in a storage medium for executing the method according to any one of claims 5 to 8 in combination with a computer.
KR1020150097541A 2015-07-09 2015-07-09 Korean pronunciation display device and method for any of the input speech KR101681944B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020150097541A KR101681944B1 (en) 2015-07-09 2015-07-09 Korean pronunciation display device and method for any of the input speech


Publications (1)

Publication Number Publication Date
KR101681944B1 true KR101681944B1 (en) 2016-12-02

Family

ID=57571675

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020150097541A KR101681944B1 (en) 2015-07-09 2015-07-09 Korean pronunciation display device and method for any of the input speech

Country Status (1)

Country Link
KR (1) KR101681944B1 (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20010106696A (en) * 2000-05-23 2001-12-07 김경징 Voice recognition/synthesis systems based on standard pronunciation analysis methodology and methods therefor
KR20150005027A (en) * 2013-07-04 2015-01-14 삼성전자주식회사 device for recognizing voice and method for recognizing voice
KR20150018357A (en) * 2013-08-08 2015-02-23 신부용 Apparatus and method for voice recognition of multilingual vocabulary


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20200030354A (en) * 2018-09-12 2020-03-20 주식회사 한글과컴퓨터 Voice recognition processing device for performing a correction process of the voice recognition result based on the user-defined words and operating method thereof
KR102144345B1 (en) * 2018-09-12 2020-08-13 주식회사 한글과컴퓨터 Voice recognition processing device for performing a correction process of the voice recognition result based on the user-defined words and operating method thereof
KR102375575B1 (en) * 2020-12-23 2022-03-17 주식회사 한글과컴퓨터 Electronic apparatus that can perform user authentication using voice characteristics and operating method thereof
KR20220160991A (en) 2021-05-28 2022-12-06 한국전자통신연구원 Method and apparatus for recognizing speech for code switching sentence based on korean-english

Similar Documents

Publication Publication Date Title
US11145292B2 (en) Method and device for updating language model and performing speech recognition based on language model
EP3039531B1 (en) Display apparatus and controlling method thereof
US9947324B2 (en) Speaker identification method and speaker identification device
CN107086040B (en) Voice recognition capability test method and device
US9466289B2 (en) Keyword detection with international phonetic alphabet by foreground model and background model
US8731926B2 (en) Spoken term detection apparatus, method, program, and storage medium
JP5480760B2 (en) Terminal device, voice recognition method and voice recognition program
US8751230B2 (en) Method and device for generating vocabulary entry from acoustic data
EP2685452A1 (en) Method of recognizing speech and electronic device thereof
KR100679042B1 (en) Method and apparatus for speech recognition, and navigation system using for the same
US9177545B2 (en) Recognition dictionary creating device, voice recognition device, and voice synthesizer
US9437187B2 (en) Voice search device, voice search method, and non-transitory recording medium
KR101681944B1 (en) Korean pronunciation display device and method for any of the input speech
CN113113024A (en) Voice recognition method and device, electronic equipment and storage medium
KR101483947B1 (en) Apparatus for discriminative training acoustic model considering error of phonemes in keyword and computer recordable medium storing the method thereof
KR101905827B1 (en) Apparatus and method for recognizing continuous speech
KR101424496B1 (en) Apparatus for learning Acoustic Model and computer recordable medium storing the method thereof
KR100554442B1 (en) Mobile Communication Terminal with Voice Recognition function, Phoneme Modeling Method and Voice Recognition Method for the same
KR102199445B1 (en) Method and apparatus for discriminative training acoustic model based on class, and speech recognition apparatus using the same
KR102299269B1 (en) Method and apparatus for building voice database by aligning voice and script
US11990136B2 (en) Speech recognition device, search device, speech recognition method, search method, and program
KR102662571B1 (en) Electronic apparatus, controlling method and computer-readable medium
JP4282354B2 (en) Voice recognition device
KR101066472B1 (en) Apparatus and method speech recognition based initial sound
KR102392992B1 (en) User interfacing device and method for setting wake-up word activating speech recognition

Legal Events

Date Code Title Description
E902 Notification of reason for refusal
E701 Decision to grant or registration of patent right
GRNT Written decision to grant
FPAY Annual fee payment

Payment date: 20191127

Year of fee payment: 4