WO2015163684A1 - Method, apparatus, and computer-readable recording medium for improving a set of at least one semantic unit - Google Patents
Method, apparatus, and computer-readable recording medium for improving a set of at least one semantic unit
- Publication number
- WO2015163684A1 (PCT/KR2015/004010)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- semantic unit
- improvement
- semantic
- unit set
- captured
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/04—Segmentation; Word boundary detection
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1815—Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L2015/025—Phonemes, fenemes or fenones being the recognition units
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
- G10L2015/0635—Training updating or merging of old and new templates; Mean values; Weighting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/086—Recognition of spelled words
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/06—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being correlation coefficients
Definitions
- the present invention relates to a method, an apparatus, and a computer-readable recording medium for improving a set of at least one semantic unit.
- the semantic unit set may be output as a digital voice corresponding to a specific language or expressed as text of a specific language.
- in the former case, the semantic unit set may be digital speech resulting from the analysis of analog speech (speech recognition in the broad sense), and in the latter case, it may be speech recognition text resulting from speech recognition in the narrow sense.
- the semantic unit set obtained by such attempts has certain limitations in terms of quality. For example, the myriad of different speech habits, the indistinct pronunciation of some speakers, the use of non-standard vocabulary or dialects, and ambient noise create practical difficulties in deriving a semantic unit set that depends on speech recognition technology.
- the present inventors have developed a new technique for improving the set of at least one semantic unit using an improvement voice (i.e., a voice uttered separately for improvement) so that a higher quality semantic unit set is obtained as a result.
- the semantic unit set to be improved by the improvement voice is not necessarily limited to that obtained by the speech recognition technology.
- the set of semantic units to be improved may be originally input by the user as text (ie, may not be obtained by speech recognition technology).
- alternatively, even when the semantic unit set to be improved is obtained by speech recognition technology, an improvement text may be used instead of an improvement voice in order to improve it.
- the present inventors also present new techniques that can be used in many of the above cases.
- the present invention aims to solve all of the above-mentioned problems of the prior art.
- Another object of the present invention is to improve the set of at least one semantic unit by using voice or text.
- the set comprising at least one semantic unit may be digital voice or text resulting from the recognition of a person's analog voice through a predetermined electronic device (not shown), or may be input directly through such a device (not shown).
- TTS (Text To Speech)
- according to the present invention, such a set may be improved by an improvement voice. Aspects of such improvement include correction of a speech recognition result and correction of typos in input and displayed text.
- a set including at least one semantic unit that is digital voice or text resulting from the recognition of a person's analog voice through a predetermined electronic device (not shown) may, according to the present invention described below, be improved by an improvement text. One aspect of such improvement is correction of a speech recognition result.
- a method for improving a set comprising at least one semantic unit, wherein the set comprising the at least one semantic unit is a captured semantic unit set, the method comprising: receiving an improvement voice according to the user's utterance; specifying an improvement semantic unit set based on the improvement voice; specifying, based on an association with the improvement semantic unit set, a semantic unit set that is the object of actual improvement within the captured semantic unit set as a matched semantic unit set; and replacing the matched semantic unit set within the captured semantic unit set with the improvement semantic unit set.
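The claimed steps can be sketched in miniature. The following Python sketch is an illustration only, not the patented implementation: the semantic unit sets are represented simply as word lists, and `naive_match` is a hypothetical stand-in for the association-score matching described later (here, plain spelling similarity).

```python
from difflib import SequenceMatcher

def naive_match(captured, improvement):
    """Hypothetical matcher: slide a window the size of the improvement
    set over the captured set and return the span of the most similar
    window by spelling (a stand-in for association-score matching)."""
    n = len(improvement)
    target = " ".join(improvement)
    best = max(range(len(captured) - n + 1),
               key=lambda i: SequenceMatcher(
                   None, " ".join(captured[i:i + n]), target).ratio())
    return best, best + n

def improve(captured, improvement, match_fn=naive_match):
    """Replace the matched semantic unit set inside the captured
    semantic unit set with the improvement semantic unit set."""
    start, end = match_fn(captured, improvement)
    # Delete the matched part and insert the improvement set in its place.
    return captured[:start] + improvement + captured[end:]

captured = "I can corect tiping at all without backspace".split()
improvement = ["correct", "typing"]
print(" ".join(improve(captured, improvement)))
# prints: I can correct typing at all without backspace
```

The matcher here is deliberately crude; the description below replaces it with comparisons of digital-voice features, syllable spelling, semantic category, or keyboard adjacency.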
- there are further provided another method and another apparatus for implementing the present invention, and a computer-readable recording medium recording a computer program for executing the method.
- the set of at least one semantic unit can be improved by using voice or text.
- the result of the text input can be easily and accurately improved.
- FIG. 1 is a view schematically showing the appearance of a semantic unit improving device according to an embodiment of the present invention.
- FIG. 2 is a block diagram conceptually illustrating an internal configuration of a semantic unit improving apparatus according to an embodiment of the present invention.
- FIG. 3 is an exemplary flowchart of a matching method according to an embodiment of the present invention.
- FIG. 4 is an exemplary flowchart of a semantic unit improvement method according to an embodiment of the present invention.
- FIG. 5 is a flowchart illustrating a digital voice comparison method according to an embodiment of the present invention.
- FIG. 6 is a flowchart illustrating a text comparison method according to an embodiment of the present invention.
- FIG. 1 is a view schematically showing the appearance of a semantic unit improving device according to an embodiment of the present invention.
- the semantic unit improving apparatus 100 may include a display unit 110 (e.g., a display panel) that displays visual information about various semantic unit sets, a button unit 120 (e.g., the home button of a smartphone) that the user can press or touch to perform a predetermined function, a sound output unit 130 (e.g., a speaker) capable of outputting sound generated by the semantic unit improving apparatus 100, an acoustic sensor (in many cases, a microphone disposed near the bottom of the button unit 120) (not shown), and other known electric and electronic components (not shown).
- although the semantic unit improving apparatus 100 is illustrated in FIG. 1 as a smartphone, it is not limited thereto. Any digital device having memory means and equipped with a microprocessor for computing capability, such as a desktop computer, a notebook computer, a workstation, a PDA, a web pad, a mobile phone (other than a smartphone), or various smart wearable devices (e.g., smart watches, smart bands, smart glasses, smart rings), may be adopted as the semantic unit improving apparatus 100 according to the invention.
- the display 110 may further function as a known touch panel to receive a text input of a user.
- the text input of the user may be performed via a software keyboard (not shown) or keypad (not shown) displayed on the touch panel and operating in conjunction with it.
- the semantic unit improvement device 100 may include a separate hardware keyboard / keypad (not shown) to receive a text input of the user.
- FIG. 2 is a block diagram conceptually illustrating an internal configuration of a semantic unit improving apparatus according to an embodiment of the present invention.
- the semantic unit improving apparatus 100 may include a voice sensing unit 210, a speech processing unit 220, a semantic unit improving unit 230, a database 250, and The controller 260 may be included.
- at least some of the voice sensing unit 210, the voice processing unit 220, the semantic unit improving unit 230, the database 250, and the controller 260 may be program modules that perform predetermined operations or that manage, or communicate with, other hardware or software components. Such program modules may be included in the semantic unit improving apparatus 100 in the form of an operating system, application program modules, or other program modules, and may be physically stored in various known storage devices.
- program modules may be stored in a remote storage device (not shown) or even an external computing device (not shown) that can communicate with the semantic unit improvement device 100. Therefore, at least some of the functions of the semantic unit improving device 100 may be executed by an external computing device or the like according to the free choice of those skilled in the art.
- program modules include, but are not limited to, routines, subroutines, programs, objects, components, data structures, etc. that perform particular tasks or execute particular abstract data types, described below, in accordance with the present invention.
- the voice sensing unit 210 may perform a function of detecting a voice uttered by the user or the like, that is, a voice corresponding to a larger semantic unit set that includes the set of at least one semantic unit to be improved, as well as an improvement voice that may be further uttered by the user for such improvement.
- the voice sensing unit 210 may include the acoustic sensor as described above as a part thereof or at least communicate with the acoustic sensor.
- examples of such an acoustic sensor include not only a general acoustic sensor such as a microphone but also a noise sensor, a vibration sensor, an ultrasonic sensor, and the like capable of sensing a voice signal of small amplitude.
- the acoustic sensor may be disposed in at least one portion of the voice sensing unit 210, the chassis, the main board (not shown), the printed circuit board (PCB) (not shown), or the enclosure (not shown) of the semantic unit improving apparatus 100.
- the voice detector 210 may transmit the analog signal of the detected voice to the voice processor 220 as described later.
- the voice processing unit 220 may perform a function of converting an analog voice signal received from the voice sensing unit 210 into a digital signal.
- the voice processing unit 220 may include a known analog-to-digital converter. Accordingly, by performing at least one of sampling, quantization, and encoding, the voice processing unit 220 can convert the voice signal corresponding to the larger semantic unit set including the semantic unit set to be improved, or the improvement voice signal, from an analog signal into a digital signal.
- the voice processing unit 220 may amplify a voice signal, remove noise from the voice signal, selectively receive only a voice signal of a specific frequency band, or change a waveform of the voice signal as needed.
- to this end, the voice processing unit 220 may include a known amplifier, noise filter, bandpass/band-reject filter, Kalman filter, EMA filter, Savitzky-Golay filter, and the like.
- the speech processing unit 220 may perform a process of converting a speech signal in the time domain into a frequency domain or vice versa.
- the voice processor 220 may transmit the digital voice signal that is a result of the process to the semantic unit improver 230 as described later.
- the semantic unit improving unit 230 may capture, according to the digital voice signal received from the voice processing unit 220, a larger semantic unit set including the specific semantic unit set to be improved. Physically, this may be the digital voice itself or the result of converting it to text. For the latter case, or for other speech recognition needs described below, the semantic unit improving unit 230 may include or be linked to a known speech recognition module.
- the semantic unit set as described above will be referred to as "captured semantic unit set".
- the semantic unit improving unit 230 may also specify an improvement voice based on the digital voice signal received from the voice processing unit 220.
- the semantic unit set corresponding to the above-described improvement voice will be referred to as the "improvement semantic unit set". This may also physically be the digital voice itself or the result of conversion to text.
- the semantic unit set captured by the semantic unit improving unit 230 need not necessarily originate from a digital voice signal. That is, for example, a semantic unit set corresponding to text acquired by the user's key input, optical character reading, and the like, irrespective of the occurrence of any analog or digital voice, may also be a captured semantic unit set as described above.
- the predetermined improvement text may constitute a set of improvement semantic units.
- the text input by the user via the keyboard may be a set of improvement semantic units.
- the number of cases related to the correspondence between the semantic unit set and the semantic unit set for improvement according to the embodiments of the present invention may be as shown in Table 1 below.
- the semantic unit improving unit 230 compares the captured semantic unit set and the improvement semantic unit set, and can extract from the captured semantic unit set the semantic unit set that is the object of actual improvement, namely the one having a high correlation with the improvement semantic unit set. Such extraction may also be called "matching" between the semantic unit set to be improved and the improvement semantic unit set.
- the comparison performed for such matching may be a comparison between digital voices, a comparison between texts, or a comparison between a digital voice and a text (in the last case, one of the digital voice and the text may need to be pre-converted to the format of the other).
- for convenience, the portion of the captured semantic unit set so matched is referred to as the "matched semantic unit set". One such set may exist within a single captured semantic unit set, but a plurality may exist as well.
- the semantic unit improving unit 230 may further utilize information from a user's input (i.e., an input other than the utterance of the improvement voice or the entry of the improvement text) for matching. For example, if a plurality of candidate sets with relatively high correlation are extracted, the matched semantic unit set may be determined by allowing the user to manually select at least some of them.
- the semantic unit improving unit 230 may improve the captured semantic unit set by means of the improvement semantic unit set. That is, the matched semantic unit set may be replaced with the improvement semantic unit set. This replacement may consist of deleting the matched semantic unit set that existed within the captured semantic unit set and inserting the improvement semantic unit set in its place. The result is that the captured semantic unit set retains its physical form while its quality is improved. This result may physically be digital voice or text.
- the database 250 may store information about the semantic unit captured, the semantic unit for improvement, and the matching.
- although the database 250 is illustrated in FIG. 2 as being included in the semantic unit improving apparatus 100, according to the needs of those skilled in the art implementing the present invention, the database 250 may be configured separately from the semantic unit improving apparatus 100.
- the database 250 in the present invention is a concept that includes a computer-readable recording medium, and may be a database in the broad sense, including not only a database in the narrow sense but also data records based on a file system; even a simple set of logs can be the database 250 in the present invention as long as it can be searched to extract data.
- the controller 260 may perform a function of controlling the flow of data among the voice sensing unit 210, the voice processing unit 220, the semantic unit improving unit 230, and the database 250. That is, the controller 260 according to the present invention controls the data flow between the components of the semantic unit improving apparatus 100, thereby controlling the voice sensing unit 210, the voice processing unit 220, the semantic unit improving unit 230, and the database 250 each to perform its own function.
- FIG. 3 is an exemplary flowchart of a matching method according to an embodiment of the present invention.
- the semantic unit improving unit 230 may perform step S1 of specifying a set of improvement semantic units.
- the semantic unit improving unit 230 may specify, among the digital voice signals received from the voice processing unit 220, the digital voice received before or after (or immediately before or after) a predetermined instruction of the user as the improvement voice, that is, as the improvement semantic unit set (the instruction may be the utterance of a pre-arranged word by the user or the input of a pre-arranged key).
- for example, suppose the user utters "I can correct typing at all without backspace" and then, after a pause, utters "error" with the intention of making an improvement. The digital voice corresponding to "error", or the text converted from it by speech recognition, may be specified as the improvement semantic unit set. This specification may be based on the fact that the time interval between the speech portion corresponding to "error" (that is, the speech portion for improvement) and the speech portion preceding it is at or above a predetermined threshold. Meanwhile, in this example, the digital voice corresponding to "I can correct typing at all without backspace", or the text converted from it by speech recognition, may become the captured semantic unit set.
- the semantic unit improving unit 230 can also specify the improvement semantic unit set based on the corresponding digital voice even when the user, after seeing text (i.e., a captured semantic unit set) such as "I can correct typing at all without backspace" displayed on the screen, utters "error" before or after (or immediately before or after) a predetermined instruction with the intention of making a related improvement.
- similarly, the semantic unit improving unit 230 can specify the improvement semantic unit set based on an improvement text such as "error" entered via the keyboard, for example when the user, after uttering a voice corresponding to "I can correct typing at all without backspace" and seeing it displayed on the screen, inputs that text before or after (or immediately before or after) a predetermined instruction with the intention of making a related improvement.
- next, the semantic unit improving unit 230 may perform the step (S2) of specifying, based on the specified improvement semantic unit set, the semantic unit set that is the object of actual improvement within the captured semantic unit set.
- a larger set of semantic units may be captured that contain a specific set of semantic units that are subject to substantial improvement.
- This captured semantic unit set may be a semantic unit set of “I can correct typing at all without backspace”, as illustrated above.
- the semantic unit improving unit 230 may divide and expand the captured semantic unit set into parts based on the time intervals or spaces between units within it and/or the length of the improvement semantic unit set (for example, the duration of the corresponding digital voice signal, or the number of words, syllables, or letters of the corresponding text).
- for example, the captured semantic unit set "I can correct typing at all without backspace" may be divided and expanded into parts such as "I can", "can correct", "correct typing", "typing at", "at all", "all without", and "without backspace", and each part may be compared with the improvement semantic unit set. Of course, it may instead be divided into "I", "can", "correct", "typing", "at", "all", "without", and "backspace", or into sub-word parts such as "cor" rather than "correct", and each part, possibly together with one or more neighboring parts, may be compared with the improvement semantic unit set.
- although dividing or expanding the captured semantic unit set for comparison with the improvement semantic unit set is described above and below, any other means may be adopted as long as a part of the captured semantic unit set can be compared with the improvement semantic unit set; separation (division) into parts and expansion of parts (that is, arrangement of parts so as to overlap) are merely examples.
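The division and expansion above amount to enumerating overlapping windows over the units of the captured set. A minimal Python illustration; the function name and the choice of window sizes are assumptions made for this sketch:

```python
def divide_and_expand(units, sizes=(1, 2)):
    """Divide the captured semantic unit set (a list of units, e.g. words)
    into parts and expand them into overlapping windows of each size."""
    parts = []
    for n in sizes:
        for i in range(len(units) - n + 1):
            parts.append(units[i:i + n])
    return parts

# Two-word windows over a short example:
windows = divide_and_expand("I can correct typing".split(), sizes=(2,))
# windows == [['I', 'can'], ['can', 'correct'], ['correct', 'typing']]
```

The same enumeration applies whether the units are words, syllables, or sub-word fragments; only the segmentation of `units` changes.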
- This comparison may be a comparison of the characteristics of the digital voice in the time domain or the frequency domain when the comparison is between digital voices.
- the voice features may be feature points in the waveform of the digital voice signal. That is, the more common feature points are found between two digital voice signals within the same reproduction time interval, the more highly the two digital voices may be regarded as correlating with each other.
- the characteristics of the digital voice under consideration may include one or more of the following.
- LPCC (Linear Prediction-based Cepstral Coefficients)
- PLP (Perceptual Linear Prediction)
- the above comparison may be a comparison between texts.
- the texts may be compared with respect to at least one of word segments, words, syllables, and letters.
- one or more known text comparison algorithms may be employed for this comparison. For example, two texts having a high sequential similarity for each syllable (for example, similarity in phonetic value or spelling) may be defined as texts having a high correlation with each other.
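As a sketch of such a sequential comparison, Python's standard `difflib` can serve as one known text comparison algorithm. Per-syllable segmentation is language-dependent, so plain character-level comparison is used here as an approximation:

```python
from difflib import SequenceMatcher

def text_similarity(a, b):
    """Sequential spelling similarity between two texts, in [0.0, 1.0].
    An approximation of the per-syllable comparison: higher values mean
    a higher correlation between the two texts."""
    return SequenceMatcher(None, a, b).ratio()

# A misspelled part correlates more strongly with its intended form
# than an unrelated part does:
print(text_similarity("corect", "correct"))   # ~0.92
print(text_similarity("typing", "correct"))   # much lower
```

A production system would segment both texts into syllables (or word segments) first and compare those sequences instead of raw characters.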
- the semantic unit improving unit 230 may determine, as the matched semantic unit set, a portion of the captured semantic unit set that shows a high correlation with the improvement semantic unit set.
- in the example above, where the improvement voice corresponds to "error", the digital voice or text corresponding to the "at all" part may be determined as the matched semantic unit set.
- to this end, the semantic unit improving unit 230 ultimately performs either a comparison between digital voices or a comparison between texts: even when a comparison between a digital voice and a text is required, the formats of the two semantic unit sets are unified into one of digital voice and text before the full comparison.
- the semantic unit improving unit 230 may include or at least interwork with a known speech recognition module and / or a known TTS module.
- FIG. 5 is a flowchart illustrating a digital voice comparison method according to an embodiment of the present invention.
- the semantic unit improving unit 230 may measure the length of the digital voice corresponding to the set of improvement semantic units.
- the unit of this length can usually be seconds.
- the semantic unit improving unit 230 may divide and expand the captured semantic unit set into various parts according to that length, or according to that length with a predetermined margin added or subtracted. For example, if the captured semantic unit set is a digital voice with a reproduction time of 10 seconds and the improvement semantic unit set is a digital voice with a reproduction time of 1 second, the captured semantic unit set may be divided and expanded into parts whose reproduction time intervals are 0 to 1 second, 0.1 to 1.1 seconds, ..., 8.9 to 9.9 seconds, and 9 to 10 seconds. Depending on the performance of the semantic unit improving unit 230 or the semantic unit improving apparatus 100, the number of such parts may be adjusted appropriately.
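This partitioning can be sketched as enumerating overlapping reproduction-time intervals. A minimal illustration, where the 0.1-second hop is taken from the example above:

```python
def time_windows(total_sec, window_sec, hop_sec=0.1):
    """Enumerate overlapping reproduction-time intervals (start, end)
    that divide a captured digital voice of `total_sec` seconds into
    parts of `window_sec` seconds, stepped by `hop_sec` seconds."""
    windows, start = [], 0.0
    while start + window_sec <= total_sec + 1e-9:  # tolerate float drift
        windows.append((round(start, 1), round(start + window_sec, 1)))
        start += hop_sec
    return windows

# A 10-second captured voice and a 1-second improvement voice yield
# (0.0, 1.0), (0.1, 1.1), ..., (8.9, 9.9), (9.0, 10.0): 91 parts.
```

Each interval then selects the part of the captured digital voice to be compared against the improvement voice.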
- the semantic unit improving unit 230 may compare each part of the semantic unit set captured with the improvement semantic unit set.
- the nature of the comparison may be a comparison of the characteristics of the digital speech signal.
- such a comparison may preferably include a certain association score calculation. For example, within the corresponding time interval, the association score may be cumulatively increased whenever identical or nearly similar feature points are found between the two digital voice signals. The association score thus determined may be assigned to that part of the captured semantic unit set.
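A minimal sketch of such cumulative scoring, under the assumption that each part is abstracted to a sequence of per-frame feature values (a real system would compare vectors of coefficients such as LPCC or PLP per frame):

```python
def association_score(captured_part, improvement_part, tol=0.0):
    """Cumulative association score between two digital-voice parts.
    Each part is abstracted here as a sequence of per-frame feature
    values; the score is increased by one whenever corresponding frames
    carry identical or nearly similar features (within `tol`)."""
    score = 0
    for f_cap, f_imp in zip(captured_part, improvement_part):
        if abs(f_cap - f_imp) <= tol:
            score += 1
    return score

# Three of four frames match exactly:
print(association_score([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 0.0, 4.0]))  # prints 3
```

The part of the captured set receiving the highest such score across all windows becomes the candidate matched semantic unit set.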
- the correspondence or degree of correspondence of the various other characteristics as described above may be the basis of the correlation score calculation.
- This step 503 may be performed repeatedly as necessary.
- that is, the semantic unit improving unit 230 may repeat the association score calculation by digital voice comparison two or more times, over all the parts of the captured semantic unit set or over those parts to which a predetermined association score has already been assigned.
- the captured semantic unit set portion given the highest association score (cumulative score or average score) after iterative association score calculation may be determined as the matched semantic unit set.
- FIG. 6 is a flowchart illustrating a text comparison method according to an embodiment of the present invention.
- the semantic unit improving unit 230 may measure the length of text corresponding to the set of improvement semantic units.
- the length of the text may generally be expressed by the number of word segments, words, syllables, or letters.
- in the example described above, the text corresponding to the improvement semantic unit set may be "error", which has a length of two syllables.
- the semantic unit improving unit 230 may divide and expand the captured semantic unit set into various parts according to that length, or according to that length with a predetermined margin added or subtracted. For example, if the captured semantic unit set is text such as "I can correct typing at all without backspace", its divided and expanded parts may variously include "I", "can", "correct", "I can", "can correct", "I can correct", "cor", "rect", and the like. Since the improvement semantic unit set "error" has a length of two syllables, the parts most preferably obtained by division and expansion may be two-syllable parts such as "I can", "can cor", "correct", "rect ty", "typing", "ping at", "at all", "all with", "without", "out back", and "backspace".
- the number of the above parts may be appropriately adjusted.
- the semantic unit improving unit 230 may compare each part of the semantic unit set captured with the improvement semantic unit set.
- the comparison may be a sequential comparison between the texts with respect to at least one of word segments, words, syllables, and letters.
- such a comparison may preferably include a certain association score calculation. For example, at mutually corresponding positions, the association score may be cumulatively increased whenever identical or nearly similar syllables are found between the two texts. The association score thus determined may be assigned to that part of the captured semantic unit set.
- what is determined to be nearly similar between the two texts may be syllables with similar phonetic values as well as syllables with similar spellings.
- the comparison may be an overall comparison based on a semantic association between texts.
- such an association may be found depending on whether two words respectively corresponding to the two texts belong to the same category or have substantially similar meanings (reference to the categories or meanings of such words may be made through well-known linguistic libraries). For example, if the captured semantic unit set is the text "I can do it this Saturday" and the text of the improvement semantic unit set is "may" or "Friday", the improvement semantic unit set "may" can be identified as having a semantic association with the part "can" of the captured semantic unit set (e.g., both being English auxiliary verbs) even though the phonetic value and spelling differ, and the improvement semantic unit set "Friday" can likewise be identified as having a semantic association with the part "Saturday" of the captured semantic unit set.
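The category lookup can be sketched with a tiny in-memory lexicon standing in for a well-known linguistic library; the entries and category labels below are illustrative assumptions:

```python
# A tiny stand-in for a linguistic library: maps words to a category.
CATEGORIES = {
    "can": "auxiliary verb", "may": "auxiliary verb",
    "saturday": "day of the week", "friday": "day of the week",
}

def semantic_association(word_a, word_b, lexicon=CATEGORIES):
    """True if the two words belong to the same category, even though
    their phonetic values and spellings differ."""
    cat_a = lexicon.get(word_a.lower())
    return cat_a is not None and cat_a == lexicon.get(word_b.lower())

print(semantic_association("may", "can"))          # prints True
print(semantic_association("Friday", "Saturday"))  # prints True
print(semantic_association("may", "Saturday"))     # prints False
```

In practice the lexicon would be replaced by a real lexical resource covering synonym sets and semantic categories.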
- the comparison may be a comparison based on key position association between texts.
- this comparison sequentially compares the spelling of a part of text belonging to the captured semantic unit set with the spelling of the text of the improvement semantic unit set, and an association score may be given to that part not only when identical letters are found but also when the corresponding letters are determined to be adjacent on the keyboard. For example, on a QWERTY keyboard, "wyw", which may be a part of text within the captured semantic unit set, may be determined to have a high association score with respect to the text "eye" of the improvement semantic unit set, even though their phonetic values and spellings are completely different.
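The key-position association can be sketched with a QWERTY coordinate table. The row/column model below is a simplification (physical QWERTY rows are staggered), so adjacency here is approximate:

```python
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

# Map each letter to a (row, column) coordinate on the keyboard.
POS = {ch: (r, c)
       for r, row in enumerate(QWERTY_ROWS)
       for c, ch in enumerate(row)}

def keys_adjacent(a, b):
    """True if the two letters are the same key or neighbouring keys
    (within one row and one column; row staggering is ignored)."""
    (r1, c1), (r2, c2) = POS[a], POS[b]
    return abs(r1 - r2) <= 1 and abs(c1 - c2) <= 1

def adjacency_score(typed, intended):
    """Association score: one point per position where the typed letter
    is the intended letter or one of its keyboard neighbours."""
    return sum(keys_adjacent(t, i) for t, i in zip(typed, intended))

print(adjacency_score("wyw", "eye"))  # prints 3: w~e, y=y, w~e
print(adjacency_score("qqq", "eye"))  # prints 0
```

This reproduces the example: "wyw" scores highly against "eye" despite sharing no letters in the first and third positions, because "w" sits next to "e" on the keyboard.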
- This step 603 may be performed repeatedly as necessary.
- That is, the semantic unit improvement unit 230 may repeatedly perform the association score calculation by text comparison, two or more times, for all portions of the captured semantic unit set or for those portions to which a predetermined association score has already been assigned.
- Then, the portion of the captured semantic unit set given the highest association score (as a cumulative or average score) after the repeated association score calculation may be determined as the matched semantic unit set.
- Meanwhile, two or more of the association score calculation methods described above may be adopted together as necessary. In this case, the association score according to one method multiplied by a corresponding weight may be summed with the association score according to another method multiplied by its corresponding weight. The association score derived in this way may be a composite association score.
- One or more portions of the captured semantic unit set having a high composite association score may become the matched semantic unit set.
- The weight multiplied by the association score of a given method may be determined differently according to the environment in which the semantic unit improvement device 100 is located or the intention of the user. For example, when the user repeatedly utters the improvement voice to generate the improvement semantic unit set, a higher weight may be given to the association score from the digital voice comparison. Alternatively, when the user has entered the text corresponding to the captured semantic unit set on a small touch panel on which typing errors are likely, a higher weight may be given, among the association scores from text comparison, to the one that takes key adjacency on the keyboard into account.
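The composite association score might be computed as in the following sketch; the method names and weight values are illustrative assumptions rather than values from the specification.

```python
# Sketch of the composite association score: each method's score is
# multiplied by its weight and the products are summed.
def composite_score(scores, weights):
    """scores and weights are dicts keyed by comparison-method name."""
    return sum(scores[m] * weights.get(m, 0.0) for m in scores)

# Hypothetical weighting for input typed on a small touch panel,
# where key-adjacency evidence deserves more trust:
touch_panel_weights = {"voice": 0.2, "semantic": 0.3, "key_adjacency": 0.5}
scores = {"voice": 0.1, "semantic": 0.2, "key_adjacency": 0.9}
total = composite_score(scores, touch_panel_weights)
```
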
- FIG. 4 is an exemplary flowchart of a semantic unit improvement method according to an embodiment of the present invention.
- First, the semantic unit improvement unit 230 may perform a step (step T1) of replacing the matched semantic unit set with the improvement semantic unit set.
- The result of the replacement may be that the captured semantic unit set now includes the improvement semantic unit set in place of the matched semantic unit set.
- the result of this replacement may be an improved speech recognition result or an improved text.
- the improved speech recognition result or text may be "I can correct typing error without backspace". This may be the result that exactly matches the user's original intent.
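The replacement step T1 can be sketched as a simple token-level substitution; the function name and the misrecognized token "terror" below are illustrative assumptions (the specification does not name the original misrecognition here).

```python
# Minimal sketch of step T1: the matched semantic unit set inside the
# captured semantic unit set is replaced by the improvement semantic unit
# set. Word-level matching is an assumption for illustration.
def replace_matched(captured, matched, improvement):
    words = captured.split()
    target = matched.split()
    for i in range(len(words) - len(target) + 1):
        if words[i:i + len(target)] == target:
            # splice the improvement set in place of the matched set
            return " ".join(words[:i] + improvement.split() + words[i + len(target):])
    return captured  # no match found: leave the captured set unchanged

# Hypothetical misrecognition "terror", corrected by improvement set "error":
improved = replace_matched(
    "I can correct typing terror without backspace", "terror", "error")
```
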
- Then, the semantic unit improvement unit 230 may perform a step (step T2) of letting the user hear a digital voice corresponding to the captured semantic unit set after the replacement, or displaying the corresponding text to the user.
- what is heard or displayed to the user may be digital voice or text corresponding to a set of semantic units of "I can correct typing error without backspace".
- In some cases, however, the improvement voice may not guarantee sufficient improvement (in the case of improvement text, this problem rarely arises). This may be because the quality of the improvement voice is not high enough in the first place due to a problem inherent to the user (e.g., inaccurate pronunciation or use of dialect), so that it is mistaken for a different semantic unit set, or because of an environmental problem (e.g., an environment involving noise, or the low specification of the semantic unit improvement device 100), so that, although the quality of the improvement voice is not particularly low, it is nevertheless mistaken for a different semantic unit set in that particular instance.
- In such a case, if additional information is provided, the semantic unit improvement unit 230 can further refine the semantic unit set corresponding to the improvement voice based on it. Below, several examples of additional information that make this possible are examined.
- the user may further utter "e”, "r”, and “r” in addition to "error” corresponding to the improvement voice in the above example.
- In this case, the semantic unit improvement unit 230 may recognize, based on a preset setting for improvement (that is, a setting whereby, when a predetermined number of alphabet letters are uttered in succession followed by an improvement voice corresponding to an improvement semantic unit set whose front part sequentially matches those letters, the letters are all regarded as a partial spelling of the improvement semantic unit set) or on another machine learning technique, that "e", "r", and "r" actually correspond to a partial spelling for further refining the improvement semantic unit set. Obviously, this can ensure the precise specification of the improvement semantic unit set.
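The partial-spelling rule can be sketched as a prefix filter over candidate improvement semantic unit sets; the function and variable names are assumptions for illustration.

```python
# Sketch of the partial-spelling rule: a candidate improvement semantic
# unit set is kept only if its front part matches, in order, the run of
# single letters uttered before the improvement voice.
def matches_partial_spelling(letters, candidate):
    """True if the candidate starts with the uttered letters in sequence."""
    return candidate.lower().startswith("".join(letters).lower())

candidates = ["error", "eraser", "arrow"]  # hypothetical candidate sets
letters = ["e", "r", "r"]                  # the user's spelled-out prefix
refined = [c for c in candidates if matches_partial_spelling(letters, c)]
# Only "error" survives the filter.
```
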
- The user may further utter "echo", "romeo", and "romeo" in addition to "error" corresponding to the improvement voice in the above example.
- In this case, the semantic unit improvement unit 230 may recognize, based on a preset setting (that is, a setting whereby, when a predetermined number of military phonetic alphabet words are uttered in succession followed by an improvement voice corresponding to an improvement semantic unit set whose front part sequentially matches the corresponding alphabet letters, those letters are all regarded as a partial spelling of the improvement semantic unit set) or on another machine learning technique, that "echo", "romeo", and "romeo" actually correspond to a partial spelling for further refining the improvement semantic unit set.
- The partial spelling technique described above may also be carried out in Korean, by sequentially uttering the full names of the consonant and vowel letters of the improvement semantic unit set, or in Japanese, by uttering syllables in sequence so that an improvement semantic unit set (e.g., " ⁇ ⁇ " ( ⁇ ⁇ ⁇ )) is not mistaken for another (e.g., " ⁇ ⁇ " ( ⁇ ⁇ ⁇ ⁇ )).
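The military (NATO) phonetic alphabet decoding used in the "echo"/"romeo"/"romeo" example can be sketched as a lookup table feeding the same prefix check; the table here is deliberately partial and the names are assumptions.

```python
# Sketch of decoding military phonetic alphabet words into letters.
# Only a few entries of the standard alphabet are shown.
NATO = {"alfa": "a", "bravo": "b", "charlie": "c", "delta": "d",
        "echo": "e", "foxtrot": "f", "romeo": "r", "sierra": "s"}

def decode_phonetic(words):
    """Map phonetic-alphabet words to their letters, skipping unknowns."""
    return "".join(NATO[w.lower()] for w in words if w.lower() in NATO)

# "echo", "romeo", "romeo" decode to "err", the front part of "error",
# so they can serve as a partial spelling of the improvement set.
```
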
- In this case, the semantic unit improvement unit 230 may recognize, based on a preset setting (that is, a setting whereby, when "of" is uttered within the improvement voice, the word corresponding to the voice that follows it is regarded as a hint word) or on another machine learning technique, that "erroneous" actually corresponds to a hint word for more precisely specifying the improvement semantic unit set (i.e., a word whose spelling is at least partly identical or similar to that of the correct improvement semantic unit set). Obviously, this can ensure the precise specification of the improvement semantic unit set.
- "of" which may be regarded as a reserved word, may be replaced with another word that is easy for the user to understand and low in recognition rate, such as "like”.
- In this case, the semantic unit improvement unit 230 may correct the improvement semantic unit set to include the letter of the hint word. Accordingly, the improvement semantic unit set can be corrected to include the letter of the hint word "chair", that is, to become the correct set.
- In other embodiments, the reserved word "like" may be used together with the additional utterance of a word similar in meaning to the correct improvement semantic unit set (e.g., additionally uttering "mistake" so that "error" becomes the improvement semantic unit set), or of a superordinate word (e.g., additionally uttering "car company" so that "Kia" becomes the improvement semantic unit set), or of an association word (e.g., additionally uttering "dog" so that "dog house" becomes the improvement semantic unit set, or additionally uttering the association word "database" together with the reserved word "for" so that "queries" becomes the improvement semantic unit set).
- When the user additionally utters such a hint, the semantic unit improvement unit 230 interprets the result and can more precisely specify the improvement semantic unit set corresponding to the improvement voice.
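One way a hint word could be exploited is to prefer, among candidate improvement semantic unit sets, the one whose spelling overlaps most with the hint. This is a minimal sketch under assumed names; the patent does not prescribe a particular similarity measure.

```python
# Sketch of hint-word refinement (e.g. "error of erroneous" style hints):
# score candidates by spelling overlap with the hint word.
def spelling_overlap(a, b):
    """Length of the longest common prefix, a crude overlap measure."""
    n = 0
    for x, y in zip(a.lower(), b.lower()):
        if x != y:
            break
        n += 1
    return n

def refine_with_hint(candidates, hint):
    """Pick the candidate whose spelling best overlaps the hint word."""
    return max(candidates, key=lambda c: spelling_overlap(c, hint))

# With candidates ["error", "arrow"] and hint "erroneous", "error" wins.
```
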
- Embodiments according to the present invention described above can be implemented in the form of program instructions that can be executed by various computer components and recorded in a computer-readable recording medium.
- the computer-readable recording medium may include program instructions, data files, data structures, etc. alone or in combination.
- Program instructions recorded on the computer-readable recording medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the computer software arts.
- Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory.
- Examples of program instructions include not only machine code generated by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.
- The hardware device may be changed to one or more software modules in order to perform processing according to the present invention, and vice versa.
Abstract
Description
Claims (13)
- A method for improving a set including at least one semantic unit, the set including the at least one semantic unit being a captured semantic unit set, the method comprising: receiving an improvement voice according to a user's utterance; specifying an improvement semantic unit set based on the improvement voice; specifying, based on association with the improvement semantic unit set, a semantic unit set that is the actual target of improvement within the captured semantic unit set as a matched semantic unit set; and replacing the matched semantic unit set within the captured semantic unit set with the improvement semantic unit set.
- The method of claim 1, wherein specifying the improvement semantic unit set comprises measuring a length of the improvement semantic unit set.
- The method of claim 2, wherein specifying the matched semantic unit set comprises dividing the captured semantic unit set into a plurality of portions, and expanding them, based on intervals between units or the length of the improvement semantic unit set.
- The method of claim 3, wherein specifying the matched semantic unit set further comprises comparing each of the plurality of portions of the captured semantic unit set with the improvement semantic unit set.
- The method of claim 4, wherein the comparing comprises comparing characteristics of a digital voice signal corresponding to each of the plurality of portions of the captured semantic unit set with characteristics of a digital voice signal corresponding to the improvement semantic unit set.
- The method of claim 4, wherein the comparing comprises comparing text corresponding to each of the plurality of portions of the captured semantic unit set with text corresponding to the improvement semantic unit set.
- The method of claim 6, wherein the text comparing comprises sequentially comparing the two texts with respect to phonetic value or spelling.
- The method of claim 6, wherein the text comparing comprises comparing whether the two texts belong to the same category or represent similar meanings.
- The method of claim 6, wherein the text comparing comprises comparing the two texts based on a key position association.
- A method for improving a set including at least one semantic unit, the set including the at least one semantic unit being a semantic unit set captured from a voice according to a user's utterance, the method comprising: receiving an input of a user's improvement text; specifying an improvement semantic unit set based on the improvement text; specifying, based on association with the improvement semantic unit set, a semantic unit set that is the actual target of improvement within the captured semantic unit set as a matched semantic unit set; and replacing the matched semantic unit set within the captured semantic unit set with the improvement semantic unit set.
- A computer-readable recording medium recording a computer program for executing the method according to any one of claims 1 and 10.
- An apparatus for improving a set including at least one semantic unit, the set including the at least one semantic unit being a captured semantic unit set, the apparatus comprising: a voice sensing unit for receiving an improvement voice according to a user's utterance; and a semantic unit improvement unit for specifying an improvement semantic unit set based on the improvement voice, specifying, based on association with the improvement semantic unit set, a semantic unit set that is the actual target of improvement within the captured semantic unit set as a matched semantic unit set, and replacing the matched semantic unit set within the captured semantic unit set with the improvement semantic unit set.
- An apparatus for improving a set including at least one semantic unit, the set including the at least one semantic unit being a semantic unit set captured from a voice according to a user's utterance, the apparatus comprising: means for receiving an input of a user's improvement text; and a semantic unit improvement unit for specifying an improvement semantic unit set based on the improvement text, specifying, based on association with the improvement semantic unit set, a semantic unit set that is the actual target of improvement within the captured semantic unit set as a matched semantic unit set, and replacing the matched semantic unit set within the captured semantic unit set with the improvement semantic unit set.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911020246.4A CN110675866B (zh) | 2014-04-22 | 2015-04-22 | 用于改进至少一个语义单元集合的方法、设备及计算机可读记录介质 |
JP2016515299A JP2016521383A (ja) | 2014-04-22 | 2015-04-22 | 少なくとも一つの意味論的単位の集合を改善するための方法、装置およびコンピュータ読み取り可能な記録媒体 |
CN201580000567.1A CN105210147B (zh) | 2014-04-22 | 2015-04-22 | 用于改进至少一个语义单元集合的方法、设备及计算机可读记录介质 |
US14/779,037 US10395645B2 (en) | 2014-04-22 | 2015-04-22 | Method, apparatus, and computer-readable recording medium for improving at least one semantic unit set |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR10-2014-0048315 | 2014-04-22 | ||
KR1020140048315A KR101651909B1 (ko) | 2014-04-22 | 2014-04-22 | 음성 인식 텍스트 수정 방법 및 이 방법을 구현한 장치 |
KR1020140077056 | 2014-06-24 | ||
KR10-2014-0077056 | 2014-06-24 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015163684A1 true WO2015163684A1 (ko) | 2015-10-29 |
Family
ID=54332775
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2015/004010 WO2015163684A1 (ko) | 2014-04-22 | 2015-04-22 | 적어도 하나의 의미론적 유닛의 집합을 개선하기 위한 방법, 장치 및 컴퓨터 판독 가능한 기록 매체 |
Country Status (4)
Country | Link |
---|---|
US (1) | US10395645B2 (ko) |
JP (1) | JP2016521383A (ko) |
CN (2) | CN110675866B (ko) |
WO (1) | WO2015163684A1 (ko) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101704501B1 (ko) * | 2015-10-30 | 2017-02-09 | 주식회사 큐키 | 적어도 하나의 의미론적 유닛의 집합을 개선하기 위한 방법, 장치 및 컴퓨터 판독 가능한 기록 매체 |
KR101830210B1 (ko) * | 2016-04-28 | 2018-02-21 | 네이버 주식회사 | 적어도 하나의 의미론적 유닛의 집합을 개선하기 위한 방법, 장치 및 컴퓨터 판독 가능한 기록 매체 |
US20210280178A1 (en) * | 2016-07-27 | 2021-09-09 | Samsung Electronics Co., Ltd. | Electronic device and voice recognition method thereof |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102217292B1 (ko) * | 2015-02-26 | 2021-02-18 | 네이버 주식회사 | 적어도 하나의 의미론적 유닛의 집합을 음성을 이용하여 개선하기 위한 방법, 장치 및 컴퓨터 판독 가능한 기록 매체 |
US10503467B2 (en) * | 2017-07-13 | 2019-12-10 | International Business Machines Corporation | User interface sound emanation activity classification |
CN108962228B (zh) * | 2018-07-16 | 2022-03-15 | 北京百度网讯科技有限公司 | 模型训练方法和装置 |
CN110827799B (zh) * | 2019-11-21 | 2022-06-10 | 百度在线网络技术(北京)有限公司 | 用于处理语音信号的方法、装置、设备和介质 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000105597A (ja) * | 1998-09-29 | 2000-04-11 | Atr Interpreting Telecommunications Res Lab | 音声認識誤り訂正装置 |
KR20120110751A (ko) * | 2011-03-30 | 2012-10-10 | 포항공과대학교 산학협력단 | 음성 처리 장치 및 방법 |
KR20130008663A (ko) * | 2011-06-28 | 2013-01-23 | 엘지전자 주식회사 | 사용자 인터페이스 방법 및 장치 |
KR101381101B1 (ko) * | 2013-11-13 | 2014-04-02 | 주식회사 큐키 | 문자열 사이의 연관성 판단을 통한 오타 수정 방법 |
Family Cites Families (43)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3663758A (en) * | 1970-03-24 | 1972-05-16 | Teaching Complements Inc | Speech pattern recognition system |
JPH01237597A (ja) * | 1988-03-17 | 1989-09-22 | Fujitsu Ltd | 音声認識訂正装置 |
JP2000010586A (ja) * | 1998-06-22 | 2000-01-14 | Nec Corp | 音声認識応答装置及び認識結果確認方法 |
US6302698B1 (en) * | 1999-02-16 | 2001-10-16 | Discourse Technologies, Inc. | Method and apparatus for on-line teaching and learning |
US7310600B1 (en) * | 1999-10-28 | 2007-12-18 | Canon Kabushiki Kaisha | Language recognition using a similarity measure |
JP3689670B2 (ja) * | 1999-10-28 | 2005-08-31 | キヤノン株式会社 | パターン整合方法及び装置 |
US6868383B1 (en) * | 2001-07-12 | 2005-03-15 | At&T Corp. | Systems and methods for extracting meaning from multimodal inputs using finite-state devices |
CN1235188C (zh) * | 2001-09-17 | 2006-01-04 | 皇家飞利浦电子股份有限公司 | 通过比较所识别的文本中的语音学序列与手动输入的校正词的语音学转换来校正通过语音识别而识别的文本 |
JP3762327B2 (ja) * | 2002-04-24 | 2006-04-05 | 株式会社東芝 | 音声認識方法および音声認識装置および音声認識プログラム |
US8793127B2 (en) * | 2002-10-31 | 2014-07-29 | Promptu Systems Corporation | Method and apparatus for automatically determining speaker characteristics for speech-directed advertising or other enhancement of speech-controlled devices or services |
TWI226600B (en) * | 2003-03-12 | 2005-01-11 | Leadtek Research Inc | Nasal detection method and device thereof |
US20060229878A1 (en) * | 2003-05-27 | 2006-10-12 | Eric Scheirer | Waveform recognition method and apparatus |
US20050071170A1 (en) | 2003-09-30 | 2005-03-31 | Comerford Liam D. | Dissection of utterances into commands and voice data |
US20060004570A1 (en) | 2004-06-30 | 2006-01-05 | Microsoft Corporation | Transcribing speech data with dialog context and/or recognition alternative information |
JP4301102B2 (ja) * | 2004-07-22 | 2009-07-22 | ソニー株式会社 | 音声処理装置および音声処理方法、プログラム、並びに記録媒体 |
US20060057545A1 (en) * | 2004-09-14 | 2006-03-16 | Sensory, Incorporated | Pronunciation training method and apparatus |
JP4784120B2 (ja) * | 2005-03-23 | 2011-10-05 | 日本電気株式会社 | 音声書き起こし支援装置及びその方法ならびにプログラム |
US20060292531A1 (en) * | 2005-06-22 | 2006-12-28 | Gibson Kenneth H | Method for developing cognitive skills |
US20070016421A1 (en) * | 2005-07-12 | 2007-01-18 | Nokia Corporation | Correcting a pronunciation of a synthetically generated speech object |
JP4734155B2 (ja) * | 2006-03-24 | 2011-07-27 | 株式会社東芝 | 音声認識装置、音声認識方法および音声認識プログラム |
WO2008021512A2 (en) | 2006-08-17 | 2008-02-21 | Neustar, Inc. | System and method for handling jargon in communication systems |
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
US7904298B2 (en) * | 2006-11-17 | 2011-03-08 | Rao Ashwin P | Predictive speech-to-text input |
US20110060587A1 (en) * | 2007-03-07 | 2011-03-10 | Phillips Michael S | Command and control utilizing ancillary information in a mobile voice-to-speech application |
WO2009040790A2 (en) * | 2007-09-24 | 2009-04-02 | Robert Iakobashvili | Method and system for spell checking |
US8332212B2 (en) * | 2008-06-18 | 2012-12-11 | Cogi, Inc. | Method and system for efficient pacing of speech for transcription |
WO2009158581A2 (en) * | 2008-06-27 | 2009-12-30 | Adpassage, Inc. | System and method for spoken topic or criterion recognition in digital media and contextual advertising |
US8782556B2 (en) * | 2010-02-12 | 2014-07-15 | Microsoft Corporation | User-centric soft keyboard predictive technologies |
US10522133B2 (en) | 2011-05-23 | 2019-12-31 | Nuance Communications, Inc. | Methods and apparatus for correcting recognition errors |
US8645825B1 (en) | 2011-08-31 | 2014-02-04 | Google Inc. | Providing autocomplete suggestions |
US8515751B2 (en) * | 2011-09-28 | 2013-08-20 | Google Inc. | Selective feedback for text recognition systems |
US9715489B2 (en) | 2011-11-10 | 2017-07-25 | Blackberry Limited | Displaying a prediction candidate after a typing mistake |
US10134385B2 (en) | 2012-03-02 | 2018-11-20 | Apple Inc. | Systems and methods for name pronunciation |
KR20130135410A (ko) * | 2012-05-31 | 2013-12-11 | 삼성전자주식회사 | 음성 인식 기능을 제공하는 방법 및 그 전자 장치 |
US8606577B1 (en) | 2012-06-25 | 2013-12-10 | Google Inc. | Visual confirmation of voice recognized text input |
US8909526B2 (en) * | 2012-07-09 | 2014-12-09 | Nuance Communications, Inc. | Detecting potential significant errors in speech recognition results |
US9292621B1 (en) | 2012-09-12 | 2016-03-22 | Amazon Technologies, Inc. | Managing autocorrect actions |
WO2014042878A1 (en) * | 2012-09-12 | 2014-03-20 | Lingraphicare America Incorporated | Method, system, and apparatus for treating a communication disorder |
CN105027197B (zh) | 2013-03-15 | 2018-12-14 | 苹果公司 | 训练至少部分语音命令系统 |
US9489372B2 (en) | 2013-03-15 | 2016-11-08 | Apple Inc. | Web-based spell checker |
JP5893588B2 (ja) * | 2013-07-09 | 2016-03-23 | 京セラ株式会社 | 携帯端末、編集誘導プログラムおよび編集誘導方法 |
US9653073B2 (en) * | 2013-11-26 | 2017-05-16 | Lenovo (Singapore) Pte. Ltd. | Voice input correction |
CN103645876B (zh) * | 2013-12-06 | 2017-01-18 | 百度在线网络技术(北京)有限公司 | 语音输入方法和装置 |
2015
- 2015-04-22 CN CN201911020246.4A patent/CN110675866B/zh active Active
- 2015-04-22 JP JP2016515299A patent/JP2016521383A/ja active Pending
- 2015-04-22 WO PCT/KR2015/004010 patent/WO2015163684A1/ko active Application Filing
- 2015-04-22 US US14/779,037 patent/US10395645B2/en active Active
- 2015-04-22 CN CN201580000567.1A patent/CN105210147B/zh active Active
Also Published As
Publication number | Publication date |
---|---|
CN105210147B (zh) | 2020-02-07 |
JP2016521383A (ja) | 2016-07-21 |
US10395645B2 (en) | 2019-08-27 |
CN110675866B (zh) | 2023-09-29 |
CN110675866A (zh) | 2020-01-10 |
US20170032778A1 (en) | 2017-02-02 |
CN105210147A (zh) | 2015-12-30 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| WWE | Wipo information: entry into national phase | Ref document number: 14779037; Country of ref document: US
| ENP | Entry into the national phase | Ref document number: 2016515299; Country of ref document: JP; Kind code of ref document: A
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 15782945; Country of ref document: EP; Kind code of ref document: A1
| NENP | Non-entry into the national phase | Ref country code: DE
| 32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17/02/2017)
| 122 | Ep: pct application non-entry in european phase | Ref document number: 15782945; Country of ref document: EP; Kind code of ref document: A1