MXPA00000098A

MXPA00000098A - Digital cellular phone with voice recognition function and method for controlling the same

Info

Publication number: MXPA00000098A
Application number: MXPA/A/2000/000098A
Authority: MX
Inventors: Seo Yong Chin; Jang Ki Shin; Joung Kyou Park
Original assignee: Samsung Electronics Co Ltd
Priority date: 1997-07-04
Filing date: 2000-01-03
Publication date: 2000-09-08

Abstract

A digital cellular phone with a voice recognition function recognizes a voice signal using components included therein. A vocoder compresses a voice signal input from a microphone to output packet data. A nonvolatile memory stores the packet data and feature data corresponding thereto. A voice recognition device extracts the feature data from the packet data output from the vocoder, and compares the feature data with feature data registered in the nonvolatile memory to detect the registered feature data similar to the input feature data and a difference value therebetween to determine whether an input voice signal is recognized successfully depending on the difference value.

Description

DIGITAL CELLULAR TELEPHONE WITH VOICE RECOGNITION FUNCTION AND METHOD FOR CONTROLLING THE SAME

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a digital cellular phone, and in particular to a digital cellular phone having a speech recognition capability and a method for controlling the same.

2. Description of the Related Art

In general, a speech recognition apparatus extracts characteristics, such as a frequency characteristic, from an input speech signal in order to recognize the input speech. Such an apparatus requires significant processing power to process a large number of voice signals, and the amount of processing power needed could overload a normal digital cellular phone. The conventional speech recognition apparatus is therefore unsuitable for a conventional digital cellular phone.

A known voice recognition method that addresses the overload problem of the digital cellular phone uses a hands-free device with a speech recognition function. The hands-free equipment includes a digital signal processor (DSP) and a non-volatile memory (for example, a flash memory or an EEPROM (Electrically Erasable Programmable Read-Only Memory)). The DSP in the hands-free equipment processes the compressed voice signal or the original voice signal to recognize the input voice, and provides the recognized voice signal to the cellular phone. In this way, the hands-free equipment recognizes the voice for a telephone number desired by the user, and the cellular phone dials the telephone number according to the recognized voice signal provided from the hands-free equipment.

Figure 1 shows a block diagram of a conventional speech recognition apparatus which can be installed in the hands-free equipment. As illustrated, an analog signal input from a microphone 30 is converted into a digital PCM (Pulse Code Modulation) signal by an analog-to-digital (A/D) converter 20 and provided to a processor 10, which performs the speech recognition function. The processor 10 can be realized by an 80186 chip or a DSP chip.

This conventional speech recognition apparatus has several drawbacks: (1) it requires significant processing power, making it unfit for the digital cellular phone; (2) its processing requirement places a severe load on the cellular phone and can obstruct the operation of the phone; (3) it requires a separate memory for the speech recognition function, so the hands-free equipment needs an independent non-volatile memory such as an EEPROM; (4) it requires a separate processor such as a DSP to perform the speech recognition function; and (5) if the apparatus is installed in the hands-free equipment, speech recognition can only be carried out through the hands-free equipment, so the cellular phone cannot recognize the voice when it is separated from the hands-free equipment.

BRIEF DESCRIPTION OF THE INVENTION

It is therefore an object of the present invention to provide a digital cellular phone with a voice recognition function, capable of recognizing a voice signal using hardware included therein, and a method for controlling the same.

To achieve the above object, the present invention provides a cellular phone with a voice recognition function having a vocoder (voice coder) for compressing a voice signal received from a microphone to output a data packet. In the cellular phone, a non-volatile memory stores the data packet and the characteristic data corresponding thereto. A voice recognition device extracts the characteristic data from the data packet output from the vocoder, and compares the characteristic data with the characteristic data registered in the non-volatile memory to detect the registered characteristic data similar to the input characteristic data and a difference value between them. A microprocessor stores the data packet and the characteristic data in the non-volatile memory in the voice recording mode, and receives an index for the similar characteristic data and a difference value from the voice recognition device in the voice recognition mode to determine whether an input speech signal is recognized successfully.

BRIEF DESCRIPTION OF THE DRAWINGS

The above and other objects, features and advantages of the present invention will become clearer from the following detailed description when taken together with the accompanying drawings, in which: Figure 1 is a block diagram of a conventional speech recognition apparatus; Figure 2 is a block diagram of a digital cellular phone with a speech recognition function according to an embodiment of the present invention; Figure 3 is a diagram illustrating a memory map of a first memory (60) of Figure 2; and Figure 4 is a flow diagram for recording and recognizing a speech signal according to an embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

A preferred embodiment of the present invention will be described in detail below with reference to the accompanying drawings. For a comprehensive understanding, the present invention is described by way of illustration, confined to the specific embodiment.

It should be noted that a person skilled in the art can carry out the present invention on the basis of this description. In the following description, well-known functions or constructions that would obscure the present invention in unnecessary detail are not described in detail.

Figure 2 illustrates a digital cellular phone with a speech recognition function according to one embodiment of the present invention. An RF (Radio Frequency) circuit and a DTMF (Dual Tone Multi-Frequency) circuit could be included in Figure 2; however, they are not shown because they do not relate to the essence of the present invention.

With reference to Figure 2, an analog speech signal from a microphone 30 is converted to a digital PCM signal by an A/D converter 20. A vocoder 45 compresses the PCM signal output from the A/D converter 20 and outputs a data packet PKT. In a CDMA cellular phone, the vocoder 45 can be realized by an 8-Kbps QCELP (Qualcomm Code Excited Linear Prediction) encoder, a 13-Kbps QCELP encoder, or an 8-Kbps EVRC (Enhanced Variable Rate Codec) encoder. In a GSM (Global System for Mobile Communications) cellular phone, the vocoder 45 can be realized by an RPE-LTP (Regular Pulse Excitation with Long Term Prediction) encoder. The data packet PKT output from the vocoder 45 is applied to a microprocessor 50, which controls the overall operation of the cellular phone.

A first memory 60 is a non-volatile memory (e.g., a flash memory or EEPROM) that stores data and software programs, including a control program and initial service data. A second memory 65 is a RAM (Random Access Memory) for temporarily storing data, including the data packet for a voice signal to be recorded or recognized and various data generated during the operation of the cellular phone.

A voice recognition device 85 extracts the characteristic data from the input speech signal and outputs the characteristic data, preferably at a rate of several tens to several hundred bytes per second. The characteristic data includes the frequency characteristic and the intensity of the input speech signal. The voice recognition device 85 may be implemented in hardware or in software. If it is implemented in software, the program implementing the voice recognition device 85 can be stored in the first memory 60.

The microprocessor 50 passes the data packet PKT output from the vocoder 45 to the voice recognition device 85. The voice recognition device 85 generates the characteristic data and outputs it to the microprocessor 50. The microprocessor 50 reads the reference characteristic data previously registered in the first memory 60 and compares it with the characteristic data from the voice recognition device 85. Based on the comparison, the microprocessor 50 selects the reference characteristic data and dials the corresponding telephone number. Preferably, the decision is based on a difference value between the two sets of characteristic data.

Additionally, the microprocessor 50 stores the data packet output from the vocoder 45 in a specific storage area of the first memory 60, and reads it from the first memory 60 when it informs the user that voice recognition is complete. For convenience, the data packet read out is referred to as voice reproduction data VP. The vocoder 45 converts the voice reproduction data VP to a PCM signal and applies it to a digital-to-analog (D/A) converter 75, which converts the input PCM signal to an analog signal and outputs the converted analog signal to a speaker 80. In place of the voice reproduction data VP, a voice message announcing the completion of the voice recognition may also be stored in the first memory 60.

A hands-free equipment connector 500 connects the hands-free equipment to the cellular phone to transfer a signal received from a microphone of the hands-free equipment to the vocoder 45 by way of the A/D converter 20. Additionally, when connected to the hands-free equipment, the hands-free equipment connector 500 cuts the signal path between the microphone of the cellular phone and the vocoder 45.

Figure 3 shows a memory map of the first memory 60 according to one embodiment of the present invention. The first memory 60 is divided into a first storage area SA1 for the control program, a second storage area SA2 for the characteristic data, a third storage area SA3 for the voice reproduction data, a fourth storage area SA4 for the telephone number, and a fifth storage area SA5 for the voice message. A reference character ADD denotes an address signal received from the microprocessor 50.
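The memory map of Figure 3 can be pictured with a small data-structure sketch. The following Python is illustrative only and is not part of the disclosed embodiment: the class names `VoiceTagEntry` and `FirstMemory`, the `register` method, and the choice to group the SA2, SA3 and SA4 records of one registered name under a shared index are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class VoiceTagEntry:
    """Hypothetical record for one registered name (assumed shared index)."""
    feature_data: List[float]  # SA2: characteristic data from device 85
    playback_packets: bytes    # SA3: vocoder packets kept as reproduction data VP
    phone_number: str          # SA4: telephone number to dial


@dataclass
class FirstMemory:
    """Hypothetical model of the non-volatile first memory 60."""
    control_program: bytes = b""                                # SA1
    entries: List[VoiceTagEntry] = field(default_factory=list)  # SA2 to SA4
    voice_message: bytes = b""                                  # SA5

    def register(self, features: Sequence[float], packets: bytes,
                 number: str) -> int:
        """Store one voice tag and return the index assigned to it."""
        self.entries.append(VoiceTagEntry(list(features), bytes(packets), number))
        return len(self.entries) - 1
```

In this sketch the index returned by `register` plays the role of the index that the voice recognition device reports for similar characteristic data, so a single value selects the playback data and the telephone number of the same entry.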

Figure 4 is a flow diagram for recording and recognizing a speech signal according to an embodiment of the present invention. To dial a phone number by voice, the user of the cellular phone presses a voice tag key. Upon detecting the key data for the voice tag key, the microprocessor 50 enters a speech recognition mode in step 4a. After pressing the voice tag key, the user presses either a voice registration key, to register a name not yet registered in the first memory 60, or a voice recognition key, to voice dial the telephone number of a registered name that the user wants to call. The microprocessor 50 then determines in step 4b which of these keys the user has pressed. If the user has pressed the voice registration key, the microprocessor 50 verifies in step 4c whether a valid data packet for the user's voice is received from the vocoder 45. If a valid data packet is received, the microprocessor 50 provides the input data packet to the speech recognition device 85 in step 4d, and stores the data packet in the third storage area SA3 of the first memory 60 as the voice reproduction data VP in step 4e.

After this, the microprocessor 50 verifies in step 4f whether the characteristic data for the input voice is received from the speech recognition device 85. If the characteristic data is received, the microprocessor 50 stores the input characteristic data in the second storage area SA2 of the first memory 60. It should be noted that the sequence of steps 4e and 4f can be reversed, or these two steps can be performed in parallel.

If the user has pressed the voice recognition key in step 4b, the microprocessor 50 verifies in step 4h whether a valid data packet for the user's voice is received from the vocoder 45. If a valid data packet is received, the microprocessor 50 provides the input data packet to the speech recognition device 85 in step 4i. After this, the microprocessor 50 verifies in step 4j whether the characteristic data for the input voice is received from the speech recognition device 85. Upon receiving the characteristic data, the microprocessor 50 temporarily stores it in the second memory 65. Additionally, in step 4j, the microprocessor 50 checks whether an index for the similar characteristic data and a difference value are received from the speech recognition device 85. Here, the index for the similar characteristic data refers to an index for the characteristic data registered in the first memory 60 which is similar to the characteristic data for the current input voice, and the difference value refers to the difference between the registered characteristic data and the characteristic data from the speech recognition device 85. Upon receiving the index and the difference value, the microprocessor 50 verifies in step 4k whether the difference value is smaller than a threshold value or an allowable error range.

If the difference value is smaller than the threshold value, the microprocessor 50 judges that the input voice is correctly recognized and outputs the voice reproduction data to the speaker 80 according to the index in step 4l. However, if the difference value is equal to or greater than the threshold value, the microprocessor 50 reads from the fifth storage area SA5 of the first memory 60 a voice message informing the user that the input voice is not registered in the cellular phone, and provides the voice message read out to the vocoder 45 in step 4m. The voice message read from the first memory 60 is then processed by the vocoder 45, converted to an analog signal by the D/A converter 75, and output to the speaker 80.

In addition, during the voice registration process, the corresponding telephone number is also registered in the fourth storage area SA4 of the first memory 60, so that the microprocessor 50 can read the registered telephone number and dial it by means of the DTMF circuit (not shown) when the registered voice is received from the user. Preferably, the speech recognition device 85 can extract two or more sets of characteristic data for the same voice and store them in the second storage area SA2 of the first memory 60, to improve the reliability of the speech recognition function.

As described above, the cellular phone of the invention uses the data packet of the vocoder so that it can recognize the voice with a simple operation. Additionally, the cellular phone uses its built-in vocoder and memory for speech recognition. Advantageously, the cellular phone has built-in speech recognition capabilities which can be constructed compactly. The external hands-free equipment can optionally be dispensed with.
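The recognition branch of Figure 4 (steps 4j to 4m) can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the description does not specify how the difference value is computed, so the sum of absolute differences used here is a stand-in, and the function names `difference`, `find_similar` and `recognize` are hypothetical. Characteristic data sets are assumed to have equal length.

```python
from typing import List, Optional, Sequence, Tuple


def difference(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed difference measure: sum of absolute element differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def find_similar(features: Sequence[float],
                 registered: List[Sequence[float]]) -> Tuple[int, float]:
    """Return (index, difference value) of the closest registered data set."""
    if not registered:
        raise ValueError("no characteristic data registered")
    diffs = [difference(features, ref) for ref in registered]
    index = min(range(len(diffs)), key=diffs.__getitem__)
    return index, diffs[index]


def recognize(features: Sequence[float],
              registered: List[Sequence[float]],
              numbers: List[str],
              threshold: float) -> Optional[str]:
    """Return the number to dial, or None when recognition fails."""
    index, diff = find_similar(features, registered)
    if diff < threshold:   # step 4k: strictly smaller than the threshold
        return numbers[index]  # success path (step 4l), number looked up by index
    return None            # failure path (step 4m), caller plays the SA5 message
```

For example, with registered data `[[1.0, 2.0], [5.0, 5.0]]` and input `[5.0, 4.5]`, `find_similar` selects index 1 with a difference value of 0.5, and `recognize` returns the second telephone number for any threshold greater than 0.5. A difference value equal to the threshold is treated as a failure, matching the "equal to or greater than" condition in the text.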
While this invention has been described in relation to what is considered at this time to be the most practical and preferred embodiment, it will be understood that the invention is not limited to the described embodiment; on the contrary, it is intended to cover various modifications within the spirit and scope of the appended claims.

It is noted that, as of this date, the best method known to the applicant for carrying out the aforementioned invention is the conventional one for the manufacture of the objects or products to which it refers. Having described the invention as above, the content of the following claims is claimed as property:

Claims

1. A digital cellular phone having a vocoder or voice coder for compressing a voice signal received through a microphone, characterized in that it comprises: a first means for receiving a data packet input from the vocoder and extracting the characteristic data from the data packet; and a second means for registering the extracted characteristic data in a memory, comparing the registered characteristic data with the characteristic data for an input speech signal, and recognizing the input speech signal if the registered characteristic data is similar to the characteristic data of the input speech signal.

2. The digital cellular phone according to claim 1, characterized in that it additionally comprises: a non-volatile memory for storing the data packet and the characteristic data corresponding to the data packet therein; and a user interface unit for selecting a voice recording mode or a speech recognition mode.

3. The digital cellular phone according to claim 2, characterized in that the first means is a speech recognition device for comparing the extracted characteristic data with the characteristic data recorded in the non-volatile memory to detect the registered characteristic data similar to the characteristic data extracted and a difference value between the extracted characteristic data and the registered characteristic data.

4. The digital cellular telephone according to claim 3, characterized in that the second means is a microprocessor for storing, in the non-volatile memory, the data packet and the characteristic data for the data packet in the voice recording mode, and receiving an index for similar characteristic data and a difference value from the speech recognition device in the speech recognition mode to determine whether an incoming speech signal is recognized successfully.

5. The digital cellular phone according to claim 2, characterized in that it additionally comprises a hands-free equipment connector for transferring a voice signal received from a microphone of a hands-free equipment to the vocoder, wherein, when connected to the hands-free equipment, the hands-free equipment connector cuts a signal path between a microphone of the cellular phone and the vocoder.

6. The digital cellular phone according to claim 5, characterized in that it additionally comprises: an analog-to-digital signal converter for converting the voice signal output from the microphone of the hands-free equipment and the microphone of the cellular telephone to a digital signal and outputting the converted digital signal to the vocoder; and a digital-to-analog signal converter for converting a digital signal output from the vocoder to an analog signal and outputting the converted analog signal to a speaker.

7. The digital cellular phone according to claim 4, characterized in that the non-volatile memory stores the telephone number data corresponding to the data packet for the incoming speech signal.

8. The digital cellular phone according to claim 7, characterized in that the microprocessor controls the dialing of a telephone number corresponding to the data of the telephone number, if the incoming speech signal is recognized successfully.

9. The digital cellular phone according to claim 4, characterized in that the microprocessor reads the reproduced voice data from the non-volatile memory according to the index for the similar characteristic data and provides the reproduced voice data read out to the vocoder for reproduction of the input voice signal through a speaker.

10. The digital cellular phone according to claim 4, characterized in that the microprocessor reads, from the non-volatile memory, a voice message that reports the success or failure of the voice recognition and provides the voice message read out to the vocoder for output of the voice message through a speaker.

11. A method of speech recognition in a digital cellular telephone having a memory and a vocoder, characterized in that it comprises the steps of: extracting the characteristic data from the data packet output from the vocoder; registering the extracted characteristic data in the memory; comparing the extracted characteristic data with the characteristic data previously registered in the memory; and determining that the input speech signal is recognized successfully if the registered characteristic data is similar to the characteristic data for the input speech signal.

12. A method for controlling a cellular telephone with a voice recognition function, characterized in that it comprises the steps of: changing an operational mode of the cellular telephone to a voice recognition mode; verifying whether a user presses a voice registration key or a voice recognition key; if the user presses the voice registration key, providing the data packet for an input speech signal from a vocoder to a speech recognition device, storing the reproduced voice data in a non-volatile memory, and registering the characteristic data for the data packet received from the speech recognition device in the non-volatile memory; and if the user presses the voice recognition key, providing the data packet for the input speech signal to the speech recognition device, extracting the characteristic data for the data packet from the speech recognition device, reading the registered characteristic data similar to the characteristic data for the data packet and a difference value between them from the non-volatile memory, and determining whether or not the input speech signal is recognized successfully depending on the difference value.

13. The method according to claim 12, characterized in that it additionally comprises the steps of: registering a telephone number corresponding to the input voice signal in the non-volatile memory in the registration mode; and dialing the telephone number registered in the non-volatile memory in the recognition mode, if the input speech signal is recognized successfully.

14. The method according to claim 12, characterized in that it additionally comprises the step of changing the operational mode to an inactive mode if the speech recognition fails.

15. A cell phone that has a vocoder for compressing a voice signal received from a microphone into an output data packet and a microprocessor for controlling an operation of the cell phone, characterized in that it comprises: a non-volatile memory for storing the data packet and the characteristic data corresponding thereto; a user interface unit with which a user chooses a voice recording mode and a voice recognition mode; a speech recognition device for extracting the characteristic data from the data packet output from the vocoder in the voice recording mode or the voice recognition mode, and comparing the characteristic data with the characteristic data registered in the non-volatile memory to detect the registered characteristic data similar to the extracted characteristic data and a difference value between them in the speech recognition mode; and a microprocessor for storing the data packet and the characteristic data for the data packet in the non-volatile memory in the voice recording mode, and receiving an index for the similar characteristic data and a difference value from the speech recognition device in the speech recognition mode to determine whether an incoming speech signal is recognized successfully.

16. The cellular telephone according to claim 15, characterized in that the microprocessor determines that the input speech signal is successfully recognized in the speech recognition mode if the difference value detected in the speech recognition device is less than a critical value.

17. A cell phone having a vocoder for compressing a voice signal received from a microphone into an output data packet, characterized in that it comprises: a non-volatile memory for storing the data packet and the characteristic data corresponding thereto; a user interface unit with which a user chooses a voice recording mode and a voice recognition mode; a speech recognition device for extracting characteristic data from the data packet output from the vocoder in the voice recording mode or the voice recognition mode, and comparing the characteristic data with the characteristic data registered in the non-volatile memory to detect registered characteristic data similar to the input characteristic data in the voice recognition mode; and a microprocessor for storing the data packet and the characteristic data in the non-volatile memory in the voice recording mode, and determining whether or not an input speech signal is recognized successfully depending on whether or not the similar characteristic data detected in the speech recognition device is within an error range.