WO2023100236A1 - Speech recognition device and computer-readable storage medium - Google Patents
Speech recognition device and computer-readable storage medium
- Publication number
- WO2023100236A1 (application PCT/JP2021/043834)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- unit
- command
- speech
- voice
- recognition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/01—Assessment or evaluation of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Definitions
- the present disclosure relates to speech recognition devices and computer-readable storage media.
- A speech recognition device is known that recognizes an uttered voice, converts it into text, and calculates the reliability of the textualized information.
- In such a device, a confirmation process that asks the user to confirm the textualized information is executed according to the calculated reliability (for example, Patent Document 1). That is, when the reliability is low, the speech recognition device asks the user to confirm whether the information converted into text is correct, and when the reliability is high, the confirmation process is omitted.
- the purpose of the present disclosure is to provide a voice recognition device with improved convenience.
- A speech recognition device includes: a speech reception unit that receives speech information indicating one of a plurality of commands; a speech recognition unit that performs speech recognition of the one command based on the speech information received by the speech reception unit and calculates the reliability of the recognition result of the one command; a condition storage unit that stores, in association with the plurality of commands, a plurality of conditions used for determining whether or not to execute a confirmation step for the recognition result; a determination unit that determines whether or not to execute the confirmation step based on one of the stored conditions and the reliability calculated by the speech recognition unit; and an output unit that outputs the recognition result without executing the confirmation step when the determination unit determines not to execute the confirmation step.
- A computer-readable storage medium stores instructions that cause a computer to: receive voice information indicating one of a plurality of commands; perform voice recognition of the one command based on the received voice information and calculate the reliability of the recognition result of the one command; determine whether or not to execute a confirmation step for the recognition result based on the calculated reliability and one of a plurality of conditions that are stored in association with the plurality of commands and used for that determination; and output the recognition result without executing the confirmation step if it is determined not to execute the confirmation step.
- FIG. 2 is a block diagram showing an example of functions of a speech recognition device.
- FIG. 3 is a block diagram showing an example of functions of a speech recognition unit.
- FIG. 4 is a diagram showing an example of commands and conditions stored in a condition storage unit.
- FIG. 5 is a diagram showing an example of the confirmation screen displayed in a confirmation process.
- FIG. 6 is a flowchart showing an example of processing executed in a preparation stage.
- FIG. 7 is a flowchart showing an example of processing executed in an operation stage.
- FIG. 8 is a block diagram showing an example of functions of a speech recognition device having an updating unit.
- a speech recognition device is a device that performs speech recognition.
- Speech recognition is the process of converting uttered speech into text.
- the concept of speech recognition may also include converting spoken speech into computer understandable information.
- a speech recognition device is implemented, for example, in a numerical controller that controls industrial machinery.
- the speech recognition device may be implemented in a server, a PC (Personal Computer), or a portable tablet that is wired or wirelessly connected to the numerical controller.
- Industrial machinery includes machine tools, injection molding machines, wire electric discharge machines, and industrial robots.
- Machine tools are, for example, lathes, machining centers, drilling centers and multi-task machines.
- An embodiment in which a speech recognition device is implemented in a numerical controller that controls a machine tool will be described below.
- FIG. 1 is a block diagram showing an example of the hardware configuration of a machine tool equipped with a numerical controller.
- the machine tool 1 includes a numerical controller 2, an input/output device 3, a servo amplifier 4, a servo motor 5, a spindle amplifier 6, a spindle motor 7, an auxiliary device 8, and a microphone 9.
- the numerical controller 2 is a device that controls the machine tool 1 as a whole.
- the numerical controller 2 includes a hardware processor 201 , a bus 202 , a ROM (Read Only Memory) 203 , a RAM (Random Access Memory) 204 and a nonvolatile memory 205 .
- the hardware processor 201 is a processor that controls the entire numerical controller 2 according to the system program.
- a hardware processor 201 reads a system program stored in a ROM 203 via a bus 202 and performs various processes based on the system program.
- the hardware processor 201 controls the servomotor 5 and the spindle motor 7 based on the machining program.
- the hardware processor 201 is, for example, a CPU (Central Processing Unit) or an electronic circuit.
- the hardware processor 201 analyzes the machining program and outputs control commands to the servo motor 5 and the spindle motor 7 for each control cycle.
- a bus 202 is a communication path that connects each piece of hardware in the numerical controller 2 to each other. Each piece of hardware within the numerical controller 2 exchanges data via the bus 202 .
- the ROM 203 is a storage device that stores system programs and the like for controlling the numerical controller 2 as a whole.
- the ROM 203 may store a speech recognition program.
- a ROM 203 is a computer-readable storage medium.
- the RAM 204 is a storage device that temporarily stores various data.
- the RAM 204 functions as a work area for the hardware processor 201 to process various data.
- the nonvolatile memory 205 is a storage device that retains data even when the machine tool 1 is powered off and power is not supplied to the numerical controller 2 .
- the nonvolatile memory 205 stores, for example, machining programs and various parameters.
- Non-volatile memory 205 is a computer-readable storage medium.
- the non-volatile memory 205 is, for example, a memory backed up by a battery or an SSD (Solid State Drive).
- The numerical controller 2 further includes a first interface 206, an axis control circuit 207, a spindle control circuit 208, a PLC (Programmable Logic Controller) 209, an I/O unit 210, and a second interface 211.
- a first interface 206 is an interface that connects the bus 202 and the input/output device 3 .
- the first interface 206 sends various data processed by the hardware processor 201 to the input/output device 3, for example.
- the input/output device 3 is a device that receives various data via the first interface 206 and displays various data. Also, the input/output device 3 receives input of various data and sends the various data to the hardware processor 201 via the first interface 206, for example.
- the input/output device 3 is, for example, a touch panel.
- the input/output device 3 is, for example, a capacitive touch panel.
- the touch panel is not limited to the capacitive type, and may be a touch panel of another type.
- the input/output device 3 is installed on a control panel (not shown) in which the numerical control device 2 is stored.
- the axis control circuit 207 is a circuit that controls the servo motor 5 .
- the axis control circuit 207 receives a control command from the hardware processor 201 and outputs a command for driving the servo motor 5 to the servo amplifier 4 .
- the axis control circuit 207 sends a torque command for controlling the torque of the servo motor 5 to the servo amplifier 4, for example.
- the servo amplifier 4 receives a command from the axis control circuit 207 and supplies current to the servo motor 5 .
- the servo motor 5 is driven by being supplied with current from the servo amplifier 4 .
- the servomotor 5 is connected to, for example, a ball screw that drives the tool post.
- When the servo motor 5 is driven, a structure of the machine tool 1 such as the tool post moves in each control axis direction.
- the servomotor 5 incorporates an encoder (not shown) that detects the position of the control shaft and the feed speed. Position feedback information and speed feedback information indicating the position of the control axis detected by the encoder and the feed speed of the control axis, respectively, are fed back to the axis control circuit 207 .
- the axis control circuit 207 performs feedback control of the control axis.
- a spindle control circuit 208 is a circuit for controlling the spindle motor 7 .
- a spindle control circuit 208 receives a control command from the hardware processor 201 and sends a command for driving the spindle motor 7 to the spindle amplifier 6 .
- the spindle control circuit 208 sends, for example, a spindle speed command for controlling the rotational speed of the spindle motor 7 to the spindle amplifier 6 .
- the spindle amplifier 6 receives a command from the spindle control circuit 208 and supplies current to the spindle motor 7 .
- the spindle motor 7 is driven by being supplied with current from the spindle amplifier 6 .
- a spindle motor 7 is connected to the main shaft and rotates the main shaft.
- the PLC 209 is a device that executes the ladder program and controls the auxiliary equipment 8. PLC 209 sends commands to auxiliary equipment 8 via I/O unit 210 .
- the I/O unit 210 is an interface that connects the PLC 209 and the auxiliary device 8.
- the I/O unit 210 sends commands received from the PLC 209 to the auxiliary equipment 8 .
- the auxiliary device 8 is a device that is installed in the machine tool 1 and performs an auxiliary operation in the machine tool 1.
- the auxiliary equipment 8 operates based on commands received from the I/O unit 210 .
- the auxiliary equipment 8 may be equipment installed around the machine tool 1 .
- the auxiliary device 8 is, for example, a tool changer, a cutting fluid injection device, or an opening/closing door drive.
- a second interface 211 is an interface that connects the bus 202 and the microphone 9 .
- the second interface 211 sends audio information output from the microphone 9 to the hardware processor 201, for example.
- the microphone 9 is an acoustic device that acquires voice and converts the voice into voice information.
- audio information is an electrical signal.
- Microphone 9 sends audio information to hardware processor 201 via second interface 211 .
- FIG. 2 is a block diagram showing an example of functions of the speech recognition device 20 implemented in the numerical controller 2.
- the speech recognition device 20 includes a speech reception unit 21 , a speech recognition unit 22 , a condition storage unit 23 , a determination unit 24 , a confirmation execution unit 25 and an output unit 26 .
- the condition storage unit 23 is implemented by storing various data in the RAM 204 or the nonvolatile memory 205 .
- the voice reception unit 21 receives voice information of the voice uttered by the user.
- the voice uttered by the user includes, for example, commands for the numerical controller 2 .
- For example, the user utters a voice indicating one of the commands.
- the voice receiving unit 21 receives voice information indicating one of the commands.
- the audio reception unit 21 receives input of audio information from the microphone 9, for example.
- Speech information is, for example, an analog signal indicative of speech uttered by a speaker.
- the audio information may be a digital signal converted from an analog signal representing audio.
- the speech recognition unit 22 performs speech recognition of one command based on the speech information received by the speech reception unit 21, and calculates the reliability of the recognition result of the one command. That is, the voice recognition unit 22 recognizes what kind of command the voice information is.
- the details of the functions of the speech recognition unit 22 will be described.
- FIG. 3 is a block diagram showing an example of the functions of the speech recognition unit 22.
- the speech recognition unit 22 includes an acoustic model storage unit 221 , a dictionary storage unit 222 , a recognition processing unit 223 and a grammar storage unit 224 .
- the acoustic model storage unit 221 stores acoustic models for determining phonemes included in speech information.
- the acoustic model is used to extract features from the waveform of the voice uttered by the speaker and discriminate phonemes.
- the feature quantity is, for example, voice strength and frequency characteristics.
- An acoustic model is generated, for example, by performing machine learning using speech information of speech uttered by a speaker as training data.
- the acoustic model storage unit 221 may store a plurality of acoustic models corresponding to each language.
- the dictionary storage unit 222 stores dictionaries.
- the dictionary includes, for example, commands used when performing various operations or various settings on the numerical controller 2 .
- the dictionary may include technical terms used when various operations or various settings are performed on the numerical controller 2 .
- the recognition processing unit 223 uses an acoustic model to obtain the sequence of phonemes indicated by the speech information. For example, when a voice corresponding to "external interface" is uttered in Japanese, the recognition processing unit 223 obtains a sequence of Japanese phonemes "gaibuiNtafe:su" corresponding to "external interface". In addition, the recognition processing unit 223 uses the dictionary stored in the dictionary storage unit 222 to determine a character string matching the arrangement of phonemes and the arrangement of words. The recognition processing unit 223 determines, for example, that the phoneme sequence “gaibuiNtafe:su” matches the character string corresponding to the Japanese “external interface”.
- the grammar storage unit 224 stores a grammar model that defines rules for constructing sentences.
- a grammar model indicates the appearance probability of words in speech information. In other words, the grammar model describes the probability that one word will appear after another.
- a grammar model is used to evaluate whether a string of characters or a sequence of words is appropriate as a language.
- a grammar model is also referred to as a language model.
- the recognition processing unit 223 uses a dictionary and a grammar model to recognize speech information so that the speech information becomes a string of characters and words appropriate for the language. That is, the recognition processing unit 223 obtains candidates for suitable character strings and word sequences from the voice information. In other words, the recognition processing unit 223 performs voice recognition to recognize commands.
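The idea that a grammar model gives the probability of one word appearing after another can be sketched as a bigram model. This is only an illustrative sketch: the source does not say which kind of grammar model is used, and the function names `train_bigram` and `sequence_score` and the toy corpus are assumptions made here.

```python
from collections import Counter


def train_bigram(sentences):
    """Build prob(prev, cur) = P(cur | prev) from tokenized sentences."""
    prev_counts = Counter()
    pair_counts = Counter()
    for words in sentences:
        for prev, cur in zip(words, words[1:]):
            prev_counts[prev] += 1
            pair_counts[(prev, cur)] += 1

    def prob(prev, cur):
        # Unseen predecessor words get probability 0.0 (no smoothing).
        if prev_counts[prev] == 0:
            return 0.0
        return pair_counts[(prev, cur)] / prev_counts[prev]

    return prob


def sequence_score(prob, words):
    """Multiply the bigram probabilities along a candidate word sequence."""
    score = 1.0
    for prev, cur in zip(words, words[1:]):
        score *= prob(prev, cur)
    return score


# Hypothetical toy corpus built from the command names in the text.
corpus = [["home", "screen"], ["network", "screen"],
          ["automatic", "mode"], ["manual", "mode"]]
prob = train_bigram(corpus)
```

With this toy corpus, a candidate such as `["home", "mode"]` scores lower than `["home", "screen"]`, which is how a grammar model can reject word sequences that are not appropriate as language. A practical model would add smoothing so that unseen pairs do not score exactly zero.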
- the recognition processing unit 223 calculates the reliability of the obtained candidate.
- Reliability is a scale that indicates how reliable the obtained character strings and word sequences are. The reliability is obtained in a range of 0.0 or more and 1.0 or less. If the reliability is a small value, it means that many other candidates similar to the character string and word are obtained. On the other hand, when the reliability is a high value, it means that there are no or few other candidates similar to the character string and word. For example, the N-best method is used as a reliability calculation method.
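The source names the N-best method but does not give a formula. One common way to derive a 0.0 to 1.0 reliability from an N-best list, shown here purely as an assumption, is to normalize the score of the top candidate against all candidates. The function name `nbest_reliability` and the use of log scores as input are hypothetical.

```python
import math


def nbest_reliability(log_scores):
    """Return a reliability in (0.0, 1.0] for the best of N candidates.

    log_scores: log-domain scores of the N-best candidates (assumed input).
    """
    if not log_scores:
        raise ValueError("N-best list is empty")
    top = max(log_scores)
    # Subtract the maximum before exponentiating for numerical stability;
    # the top candidate then has weight exactly 1.0.
    weights = [math.exp(s - top) for s in log_scores]
    return 1.0 / sum(weights)
```

Under this sketch, a single candidate yields a reliability of 1.0, while several candidates with similar scores pull the value down, which matches the description that a low reliability means many similar competing candidates were obtained.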
- the condition storage unit 23 stores a plurality of conditions used for determining whether or not to execute the recognition result confirmation step, in association with a plurality of commands.
- the confirmation step is a step in which the user approves or denies the recognition result depending on whether the recognition result of the voice information is correct.
- a condition is, for example, a threshold. When the condition is a threshold value, the condition storage unit 23 stores a plurality of threshold values corresponding to each of the commands.
- FIG. 4 is a diagram showing an example of commands and conditions stored in the condition storage unit 23.
- the condition storage unit 23 stores transition commands, setting commands, drive commands, and approval commands, for example.
- a transition command is a command that transitions the display screen.
- Transition commands include home screen commands and network screen commands.
- the home screen command is a command for transitioning the display screen to the home screen.
- a network screen command is a command for transitioning a display screen to a network screen.
- a setting command is a command for setting a mode.
- Setting commands include automatic mode commands and manual mode commands.
- the automatic mode command is a command for setting the operation mode of the numerical controller 2 to the automatic mode.
- the manual mode command is a command for setting the operation mode of the numerical controller 2 to the manual mode.
- a drive command is a command for driving at least one of the main axis and each control axis.
- Drive commands include a start command and a stop command.
- the start command is a command to start driving at least one of the main axis and each control axis.
- the stop command is a command to stop driving at least one of the main axis and each control axis.
- the approval command is a command for approving or denying the confirmation items on the confirmation screen.
- The confirmation screen is a screen that displays confirmation items on the display screen. Approval commands include a Yes command and a No command.
- The Yes command is a command for approving the confirmation items.
- The No command is a command for denying the confirmation items.
- the condition storage unit 23 stores a plurality of conditions respectively corresponding to these commands.
- the condition stored in the condition storage unit 23 is, for example, a threshold to be compared with the reliability of the recognition result calculated by the speech recognition unit 22 .
- the condition storage unit 23 stores, for example, 0.6 as the condition corresponding to the transition command. Also, the condition storage unit 23 stores 0.7 as a condition corresponding to the setting command. Also, the condition storage unit 23 stores 0.8 as the condition corresponding to the drive command. Also, the condition storage unit 23 stores 0.9 as the condition corresponding to the approval command.
- The determination unit 24 determines whether or not to execute the confirmation process based on one of the conditions stored in the condition storage unit 23 and the reliability calculated by the voice recognition unit 22. For example, when the voice information is recognized as a transition command, the determination unit 24 compares the reliability calculated by the voice recognition unit 22 with the condition "0.6" stored in association with the transition command to determine whether or not to execute the confirmation step.
- the determination unit 24 determines not to perform the confirmation process when the reliability calculated by the speech recognition unit 22 is 0.6 or more. Further, the determination unit 24 determines to execute the confirmation process when the reliability calculated by the voice recognition unit 22 is less than 0.6.
- Similarly, when the command recognized by the voice recognition unit 22 is a setting command and the calculated reliability is 0.7 or more, the determination unit 24 determines not to execute the confirmation process. Further, when the command recognized by the voice recognition unit 22 is a setting command and the calculated reliability is less than 0.7, the determination unit 24 determines to execute the confirmation process.
- When the command recognized by the voice recognition unit 22 is a drive command and the calculated reliability is 0.8 or more, the determination unit 24 determines not to execute the confirmation process. Further, when the command recognized by the voice recognition unit 22 is a drive command and the calculated reliability is less than 0.8, the determination unit 24 determines to execute the confirmation process.
- When the command recognized by the voice recognition unit 22 is an approval command and the calculated reliability is 0.9 or more, the determination unit 24 determines not to execute the confirmation process. Further, when the command recognized by the voice recognition unit 22 is an approval command and the calculated reliability is less than 0.9, the determination unit 24 determines to execute the confirmation process.
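The per-command thresholds and the comparison performed by the determination unit 24 can be sketched as a lookup table and a single comparison. The threshold values come from the text; the names `CONDITIONS` and `needs_confirmation` and the string keys for the command categories are assumptions for illustration.

```python
# Thresholds stored per command category (values taken from the description).
CONDITIONS = {
    "transition": 0.6,
    "setting": 0.7,
    "drive": 0.8,
    "approval": 0.9,
}


def needs_confirmation(command_type, reliability, conditions=CONDITIONS):
    """Return True when the confirmation step should be executed.

    The confirmation step is skipped when the reliability is equal to or
    greater than the threshold stored for the recognized command category.
    """
    return reliability < conditions[command_type]
```

For example, a reliability of 0.75 skips confirmation for a transition or setting command but still triggers it for a drive command, so commands with a larger effect on the machine are confirmed more often.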
- the confirmation execution unit 25 executes the confirmation process when the determination unit 24 determines to execute the confirmation process.
- FIG. 5 is a diagram showing an example of a confirmation screen displayed in the confirmation process.
- FIG. 5 shows an example of a confirmation screen when the voice recognition unit 22 recognizes that the voice information indicates the "home screen".
- the user enters the approval information when the confirmation screen is displayed.
- Approval information is information indicating approval or disapproval of the recognition result. For example, the user utters "yes" to indicate approval or "no" to indicate denial. These utterances are input as approval information.
- the voice receiving unit 21 receives approval information indicating approval or disapproval of the recognition result of voice information.
- the speech recognition unit 22 performs speech recognition of the approval information received by the speech reception unit 21 and calculates the reliability of the recognition result of the approval information. That is, the speech recognition unit 22 recognizes the approval information as "yes” or "no" and calculates the reliability of this recognition result.
- the approval information may be the same as the approval command stored in the condition storage unit 23 .
- Based on the condition stored in association with the approval command in the condition storage unit 23 and the reliability of the approval information calculated by the voice recognition unit 22, the determination unit 24 determines whether or not to output the recognition result of the voice information recognized before the confirmation step. In other words, the determination unit 24 determines whether or not to output the recognition result of the command uttered by the user according to the recognition result of the approval information and whether or not the reliability of the approval information satisfies the condition.
- When the approval information is recognized as "yes" and its reliability satisfies the condition, the determination unit 24 determines to output the recognition result of the command uttered by the user. In this case, the user has confirmed that the recognition result of the command recognized by the voice recognition unit 22 is correct.
- When the approval information is recognized as "no" and its reliability satisfies the condition, the determination unit 24 determines not to output the recognition result of the voice information. In this case, the user has confirmed that the recognition result of the command recognized by the voice recognition unit 22 is erroneous.
- When the reliability of the approval information does not satisfy the condition, whether the approval information is recognized as "yes" or "no", the determination unit 24 determines not to output the recognition result. In these cases, it is not reliably clear whether the recognition result of the approval information is correct or incorrect.
- the output unit 26 outputs the recognition result when the determination unit 24 determines to output the recognition result of the command.
- the output unit 26 outputs the recognition result to a control unit (not shown) of the numerical control device 2, for example. Thereby, the control unit can execute the command indicated by the recognition result. Also, the output unit 26 may display the command indicated by the recognition result on the display screen of the input/output device 3 .
- When the determination unit 24 determines not to output the recognition result, the voice reception unit 21 may again accept voice information indicating the command. As a result, even if the speech recognition device 20 once fails in speech recognition, the speech recognition device 20 can perform speech recognition of the command again.
- FIG. 6 is a flowchart showing an example of processing executed in the preparation stage of the speech recognition device 20.
- In the preparation stage, first, grammar models, dictionaries, and acoustic models generated according to a plurality of commands are installed in the speech recognition device 20 (step SA1). That is, the acoustic model storage unit 221 stores acoustic models, the dictionary storage unit 222 stores a dictionary, and the grammar storage unit 224 stores grammar models. The commands are designed according to, for example, the machine tool 1, the factory where the machine tool 1 is installed, and the workers.
- Next, a plurality of conditions used for determining whether or not to execute the confirmation process are implemented in the speech recognition device 20 (step SA2).
- the condition storage unit 23 stores a plurality of conditions used for determining whether or not to execute the confirmation process in association with a plurality of commands. This completes the preparation stage processing.
- FIG. 7 is a flowchart showing an example of processing executed in the operation stage of the speech recognition device 20.
- the voice reception unit 21 receives voice information indicating one of a plurality of commands (step SB1).
- the voice recognition unit 22 performs voice recognition of one command and calculates the reliability of the recognition result of the one command (step SB2).
- the determination unit 24 determines whether or not to execute the confirmation step (step SB3).
- If the determination unit 24 determines not to execute the confirmation step, the output unit 26 outputs the recognition result (step SB4), and the process ends.
- If the determination unit 24 determines to execute the confirmation step, the confirmation execution unit 25 executes the confirmation step.
- Specifically, the confirmation execution unit 25 displays the recognition result on the display screen (step SB5).
- Next, the voice reception unit 21 receives the approval information (step SB6).
- The determination unit 24 then evaluates the approval information (step SB7). If the approval information indicates "yes" and the reliability of the approval information satisfies the condition (Yes in step SB7), the output unit 26 outputs the recognition result (step SB4), and the process ends.
- Otherwise (No in step SB7), the voice reception unit 21 accepts voice information again.
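The operation-stage flow of steps SB1 to SB7 can be sketched as one function. This is a minimal sketch under stated assumptions: `recognize` and `get_approval` are hypothetical callables standing in for the speech recognition unit 22 and the confirmation step, and returning `None` stands for "do not output, accept voice information again".

```python
def operation_stage(voice_info, recognize, get_approval, conditions):
    """Run one pass of the operation stage and return the command or None.

    recognize(voice_info) -> (command, command_type, reliability)  # assumed
    get_approval(command) -> (answer, answer_reliability)          # assumed
    conditions: mapping from command category to threshold
    """
    # SB2: recognize the command and calculate its reliability.
    command, command_type, reliability = recognize(voice_info)

    # SB3: skip confirmation when the reliability meets the stored condition.
    if reliability >= conditions[command_type]:
        return command  # SB4: output the recognition result

    # SB5/SB6: show the result and receive the approval information.
    answer, answer_reliability = get_approval(command)

    # SB7: output only for a reliably recognized "yes".
    if answer == "yes" and answer_reliability >= conditions["approval"]:
        return command  # SB4
    return None  # caller accepts voice information again


conditions = {"transition": 0.6, "setting": 0.7, "drive": 0.8, "approval": 0.9}
result = operation_stage(
    b"audio",
    lambda v: ("start", "drive", 0.75),
    lambda c: ("yes", 0.95),
    conditions,
)
```

Passing the recognizer and the confirmation dialog as callables keeps the sketch independent of any particular speech engine or display.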
- As described above, the speech recognition device 20 includes: the speech reception unit 21 that receives speech information indicating one of a plurality of commands; the speech recognition unit 22 that performs speech recognition of the one command based on the speech information received by the speech reception unit 21 and calculates the reliability of the recognition result of the one command; the condition storage unit 23 that stores, in association with each of the plurality of commands, a plurality of conditions used for determining whether or not to execute the confirmation step for the recognition result; the determination unit 24 that determines whether or not to execute the confirmation step based on one of the plurality of conditions stored in the condition storage unit 23 and the reliability calculated by the speech recognition unit 22; and the output unit 26 that outputs the recognition result without performing the confirmation step when the determination unit 24 determines not to perform the confirmation step.
- the speech recognition device 20 can improve user convenience. Specifically, the confirmation process is reduced according to the reliability of the command recognition result. In other words, the number of times the user operates the speech recognition device 20 is reduced.
- the speech recognition device 20 may further include an updating unit that updates the conditions stored in the condition storage unit 23.
- FIG. 8 is a block diagram showing an example of the functions of the speech recognition device 20 provided with an updating unit.
- functions different from the functions of the speech recognition device 20 explained using FIG. 2 will be explained, and explanations of the same functions as those of the speech recognition device 20 of FIG. 2 will be omitted.
- the update unit 27 updates one condition corresponding to one command based on the approval information received by the voice reception unit 21 in the confirmation process.
- For example, suppose the speech recognition unit 22 recognizes a setting command with a reliability that does not satisfy the stored condition, so that the confirmation step is executed. If approval information indicating "yes" is input in the confirmation step, the recognition result of the setting command by the speech recognition unit 22 is correct. In other words, even if the value of the condition stored in correspondence with the setting command is changed to "0.6", which is less than the calculated reliability, no problem will occur. Therefore, when the voice receiving unit 21 receives approval information indicating approval in the confirmation step, the updating unit 27 can reduce the numerical value indicated by the condition stored corresponding to the setting command. Because the update unit 27 updates the condition in this way, the confirmation process can be omitted the next time the speech recognition unit 22 recognizes the setting command.
- the update unit 27 reduces the value of the condition by, for example, a predetermined numerical value in one update.
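The update performed by the update unit 27 can be sketched as lowering a stored threshold by a fixed step whenever the user approves a recognition result. The source only says "a predetermined numerical value", so the step of 0.05, the lower bound `floor`, and the function name `update_condition` are assumptions.

```python
def update_condition(conditions, command_type, approved, step=0.05, floor=0.0):
    """Lower the stored threshold by `step` after an approval; return it.

    conditions: mutable mapping from command category to threshold
    approved:   True when the approval information indicated approval
    step, floor: assumed values; the source does not specify them
    """
    if approved:
        lowered = conditions[command_type] - step
        # Round to suppress floating-point noise and never go below floor.
        conditions[command_type] = max(floor, round(lowered, 10))
    return conditions[command_type]
```

A lower bound is included in this sketch so that repeated approvals cannot reduce a threshold to the point where the confirmation step is never executed; whether the original design has such a bound is not stated.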
- the updating unit 27 may further update the one condition based on the acceptance history of the approval information.
- the acceptance history of approval information is a past record of acceptance of approval information indicating approval or disapproval in the confirmation process by the voice receiving unit 21 .
- the update unit 27 updates the condition stored corresponding to the drive command.
- the updating unit 27 can update so as to decrease the value indicated by the condition stored corresponding to the drive command.
- the update unit 27 may further update one condition based on the system information.
- the system information is, for example, time information that the speech recognition device 20 has as system information. For example, during the daytime, there is a high possibility that a managerial person such as a factory manager is on duty. In other words, there is a high possibility that a person who can take responsibility for changing the conditions stored in the condition storage unit 23 is on duty. Therefore, for example, between 9:00 am and 5:00 pm, the updating unit 27 can change the conditions stored in the condition storage unit 23 .
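Gating condition updates on time information can be sketched as a simple window check. The 9:00 am to 5:00 pm window comes from the text; the function name `updates_allowed` and the choice to treat both ends of the window as inclusive are assumptions.

```python
from datetime import time


def updates_allowed(now, start=time(9, 0), end=time(17, 0)):
    """Return True when `now` (a datetime.time) falls inside the window."""
    return start <= now <= end
```

An update unit built along these lines would call `updates_allowed` with the current system time before changing any stored condition, so that thresholds only change while a responsible person is likely to be on duty.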
- The voice receiving unit 21 receives approval information indicating approval or disapproval as voice information in the confirmation process.
- The approval information may be received through an operation on the display screen of the input/output device 3.
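The update behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the class and function names (`ConditionStore`, `UpdateUnit`, `needs_confirmation`), the step of 0.1, the lower bound of 0.0, and the rule that confirmation is requested when the reliability is below the stored value are all hypothetical. Only the idea of lowering the stored value after an approval, and of permitting updates between 9:00 a.m. and 5:00 p.m., comes from the description. Updates based on the acceptance history are not modeled.

```python
from dataclasses import dataclass, field
from datetime import time


@dataclass
class ConditionStore:
    """Hypothetical stand-in for the condition storage unit 23.

    Maps a command name to the reliability value stored as its condition.
    """
    thresholds: dict[str, float] = field(default_factory=dict)


def needs_confirmation(store: ConditionStore, command: str, reliability: float) -> bool:
    """Assumed rule: confirm when the reliability is below the stored value."""
    return reliability < store.thresholds[command]


@dataclass
class UpdateUnit:
    """Hypothetical stand-in for the update unit 27."""
    store: ConditionStore
    step: float = 0.1                 # assumed "predetermined numerical value"
    floor: float = 0.0                # assumed lower bound, not from the source
    window_start: time = time(9, 0)   # example window from the description
    window_end: time = time(17, 0)

    def in_update_window(self, now: time) -> bool:
        """Return True if the system time allows the condition to be changed."""
        return self.window_start <= now < self.window_end

    def on_approval(self, command: str, approved: bool, now: time) -> float:
        """Lower the stored value after an approval received inside the window.

        Returns the value stored for the command after the call.
        """
        current = self.store.thresholds[command]
        if approved and self.in_update_window(now):
            # Rounding avoids floating-point drift over repeated updates.
            current = max(self.floor, round(current - self.step, 10))
            self.store.thresholds[command] = current
        return current
```

With an illustrative stored value of 0.7 and a calculated reliability of 0.65, `needs_confirmation` requests confirmation. After one approval received at 10:00 the stored value drops to 0.6, so the same reliability no longer triggers the confirmation process. An approval received at 20:00 leaves the stored value unchanged.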
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Numerical Control (AREA)
- User Interface Of Digital Computer (AREA)
Priority Applications (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2023564294A JP7820405B2 (ja) | 2021-11-30 | 2021-11-30 | Speech recognition device and computer-readable storage medium |
| CN202180104465.XA CN118302807A (zh) | 2021-11-30 | 2021-11-30 | Speech recognition device and computer-readable storage medium |
| PCT/JP2021/043834 WO2023100236A1 (ja) | 2021-11-30 | 2021-11-30 | Speech recognition device and computer-readable storage medium |
| DE112021008175.6T DE112021008175T5 (de) | 2021-11-30 | 2021-11-30 | Speech recognition device and computer-readable storage medium |
| US18/709,812 US20250014576A1 (en) | 2021-11-30 | 2021-11-30 | Speech recognition device and computer-readable storage medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2021/043834 WO2023100236A1 (ja) | 2021-11-30 | 2021-11-30 | Speech recognition device and computer-readable storage medium |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| WO2023100236A1 (ja) | 2023-06-08 |
| WO2023100236A9 (ja) | 2024-03-14 |
Family
ID=86611709
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2021/043834 Ceased WO2023100236A1 (ja) | Speech recognition device and computer-readable storage medium | 2021-11-30 | 2021-11-30 |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US20250014576A1 |
| JP (1) | JP7820405B2 |
| CN (1) | CN118302807A |
| DE (1) | DE112021008175T5 |
| WO (1) | WO2023100236A1 |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2005181386A (ja) * | 2003-12-16 | 2005-07-07 | Mitsubishi Electric Corp | Spoken dialogue processing device, spoken dialogue processing method, and program |
| JP2007041319A (ja) * | 2005-08-03 | 2007-02-15 | Matsushita Electric Ind Co Ltd | Speech recognition device and speech recognition method |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5566272A (en) * | 1993-10-27 | 1996-10-15 | Lucent Technologies Inc. | Automatic speech recognition (ASR) processing using confidence measures |
| US5717826A (en) * | 1995-08-11 | 1998-02-10 | Lucent Technologies Inc. | Utterance verification using word based minimum verification error training for recognizing a keyboard string |
| DE19842405A1 (de) * | 1998-09-16 | 2000-03-23 | Philips Corp Intellectual Pty | Speech recognition method with confidence measure evaluation |
| US20060074664A1 (en) * | 2000-01-10 | 2006-04-06 | Lam Kwok L | System and method for utterance verification of chinese long and short keywords |
| JP5725028B2 (ja) * | 2010-08-10 | 2015-05-27 | 日本電気株式会社 | Speech segment determination device, speech segment determination method, and speech segment determination program |
-
2021
- 2021-11-30 DE DE112021008175.6T patent/DE112021008175T5/de active Pending
- 2021-11-30 US US18/709,812 patent/US20250014576A1/en active Pending
- 2021-11-30 CN CN202180104465.XA patent/CN118302807A/zh active Pending
- 2021-11-30 WO PCT/JP2021/043834 patent/WO2023100236A1/ja not_active Ceased
- 2021-11-30 JP JP2023564294A patent/JP7820405B2/ja active Active
Also Published As
| Publication number | Publication date |
|---|---|
| DE112021008175T5 (de) | 2024-08-08 |
| JP7820405B2 (ja) | 2026-02-25 |
| US20250014576A1 (en) | 2025-01-09 |
| CN118302807A (zh) | 2024-07-05 |
| WO2023100236A9 (ja) | 2024-03-14 |
| JPWO2023100236A1 | 2023-06-08 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP4657736B2 (ja) | | System and method for automatic speech recognition learning using user correction |
| EP2309489B1 (en) | | Methods and systems for considering information about an expected response when performing speech recognition |
| JP5093963B2 (ja) | | Speech recognition method with replacement command |
| US10068566B2 (en) | | Method and system for considering information about an expected response when performing speech recognition |
| US6934682B2 (en) | | Processing speech recognition errors in an embedded speech recognition system |
| CN111844085B (zh) | | Robot teaching device |
| EP1376537B1 (en) | | Apparatus, method, and computer-readable recording medium for recognition of keywords from spontaneous speech |
| US20080177542A1 (en) | | Voice Recognition Program |
| JP7750950B2 (ja) | | Speech recognition device |
| JP7820405B2 (ja) | | Speech recognition device and computer-readable storage medium |
| US20250124917A1 (en) | | Voice recognition device and computer-readable recording medium |
| JP2017161581A (ja) | | Speech recognition device and speech recognition program |
| KR20210074649A (ko) | | Speech recognition method that uses acoustic information and text information to determine whether to respond to a natural-language sentence |
| JP7791215B2 (ja) | | Grammar creation support device and computer-readable storage medium |
| EP0138166A1 (en) | | Pattern matching apparatus |
| WO2023139769A1 (ja) | | Grammar adjustment device and computer-readable storage medium |
| WO2025191655A1 (ja) | | Language switching device and computer-readable storage medium |
| JPH064264A (ja) | | Voice input/output system |
| JP7542826B2 (ja) | | Speech recognition program and speech recognition device |
| TWI735168B (zh) | | Voice-controlled robot |
| JP2018537734A (ja) | | Factory automation system and remote server |
| WO2023139771A1 (ja) | | Information generation device and computer-readable storage medium |
| JPH09212190A (ja) | | Speech recognition device and sentence recognition device |
| WO2025196891A1 (ja) | | Speech processing device and computer-readable recording medium |
| JP2000267691A (ja) | | Method for selecting a recognition dictionary in a speech recognition system |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21966322; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | WIPO information: entry into national phase | Ref document number: 2023564294; Country of ref document: JP |
| | WWE | WIPO information: entry into national phase | Ref document number: 112021008175; Country of ref document: DE |
| | WWE | WIPO information: entry into national phase | Ref document number: 18709812; Country of ref document: US |
| | WWE | WIPO information: entry into national phase | Ref document number: 202180104465.X; Country of ref document: CN |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 21966322; Country of ref document: EP; Kind code of ref document: A1 |