CN1247347A - Speech input memorandum

Info

Publication number: CN1247347A
Application number: CN 98117693
Authority: CN
Inventors: 刘迎建; 马梁
Original assignee: ZHONGZI HANGWANG SCIENCE AND TECHNOLOGY Co BEIJING
Current assignee: Hanwang Science and Technology Co., Ltd., Beijing
Priority date: 1998-09-09
Filing date: 1998-09-09
Publication date: 2000-03-15
Anticipated expiration: 2018-09-09
Also published as: CN1101577C

Abstract

A speech input memorandum is a speech-based personal digital assistant (PDA). Names and telephone numbers can be recorded by voice; on lookup, when a name is spoken to the device, the telephone number is displayed directly and the name and the address or organization are spoken aloud. The device can also store 30 minutes of digitized sound in segments and play each segment back at a preset time. The "schedule" and "event record" functions can likewise be performed by voice.

Description

Speech input memorandum

The invention belongs to the field of the personal digital assistant (PDA), commonly known as the electronic notebook.

At present, PDA products fall into two broad classes:

The first class is low-end PDAs, led by the domestic "Wenquxing" and the Hong Kong "electricity is translated logical". They use keyboard input with an encoding scheme such as pinyin or Wubi, converting the keystrokes to GB-encoded Chinese characters for storage or lookup. They are simple in structure, use little memory, and place low demands on the CPU, so their price is very low. However, the keyboard is small and awkward to operate. In addition, the display area is small and pinyin produces many candidates, so pinyin input is difficult. Many people know neither Wubi nor a pinyin input method and cannot use these devices at all.

The second class is high-end PDAs, represented by the Taiwan "instant translator", "translate logically well", and a Hong Kong brand. They use handwriting input and offer many special functions, such as an infrared interface, digital recording, and pager reception. To support handwriting input, these products all have a larger LCD, a pressure-sensitive touch screen, more than 128K of RAM, and 1M to 4M of ROM. They also place high demands on the CPU: processing power equals or exceeds that of a 16 MHz 286. Their prices are therefore very high. Handwriting recognition is a more natural input method, but it demands a great deal of CPU power and storage; the system requirements of "intelligent pen", for example, are a Pentium 75 or better and 8M of memory. Because PDA processing power is limited, writing is still not very convenient: characters must be written carefully and neatly, and the recognition rate, adaptability, and speed of handwriting input all fall far short of PC recognition software. Many people find it inconvenient.

The object of the invention is to design a PDA with the following characteristics:

1. When a name is entered in the business card function, the user reads the name aloud. The speech recognition module analyzes the speech and extracts its features, which are stored in the recognition region of the memory. At the same time, the compression module compresses the speech and stores it in the recording region of the memory.

2. On retrieval, the speech recognition module extracts the features of the spoken name and compares them with the features in memory to find the corresponding entry.

3. Digits and English letters are entered by speaker-dependent, small-vocabulary speech recognition: the speech recognition module compares the feature values of the spoken digit or letter with feature values extracted in advance and identifies the corresponding digit or letter.

4. When an entry is consulted, the speech for the name and address is decompressed from the recording region of the memory and played back, rather than being displayed as Chinese characters in the usual way.

5. When an entry is consulted, the telephone number is displayed as characters and can also be read aloud by voice.

The object of the invention is achieved in the following manner:

As shown in Fig. 1, the microphone is connected to an amplifier-filter; the amplifier-filter is connected to a high-precision A/D converter; the A/D converter is connected to both the speech recognizer and the speech compressor. The speech recognizer is connected to the main control microprocessor by a data bus and control signal lines, as is the speech compressor. The microprocessor is connected to the memory.

The function of this circuit is to convert the analog signal of the speech input into a digital stream. The feature extraction module of the speech recognizer extracts speech features from the speech data and passes them to the microprocessor over the data bus. The speech compressor compresses the speech data and likewise passes it to the microprocessor over the data bus. Once the processor has the speech data and the speech features, it stores them in separate regions of the memory for use in recognition and playback.
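The two-path storage described above can be sketched in software as follows. This is only an illustration under stated assumptions: the patent does not specify the feature extractor or the speech codec, so `extract_features` (mean absolute amplitude per frame) and `compress` (zlib over 8-bit samples) are hypothetical stand-ins, as are the names `Memory` and `store_utterance`.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Memory:
    # Two separate regions, as in the description (names are illustrative).
    recognition_region: dict = field(default_factory=dict)  # entry id -> features
    recording_region: dict = field(default_factory=dict)    # entry id -> compressed audio


def extract_features(samples, frame=4):
    """Stand-in feature extractor: mean absolute amplitude of each full frame."""
    return [sum(abs(s) for s in samples[i:i + frame]) / frame
            for i in range(0, len(samples) - frame + 1, frame)]


def compress(samples):
    """Stand-in compressor: zlib over 8-bit samples (not a real speech codec)."""
    return zlib.compress(bytes(samples))


def store_utterance(memory, entry_id, samples):
    """Send the same digitized speech down both paths and store each result."""
    memory.recognition_region[entry_id] = extract_features(samples)
    memory.recording_region[entry_id] = compress(samples)


mem = Memory()
store_utterance(mem, "card-1", [10, 20, 30, 40, 50, 60, 70, 80])
```

Keeping features and compressed audio in separate regions means retrieval only needs to scan the compact feature data, while playback only touches the audio.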

The microprocessor is also connected to the decompressor, the decompressor to a D/A converter, the D/A converter to an amplifier, and the amplifier to the loudspeaker.

The function of this circuit is to restore the compressed speech data and play it.

The microprocessor is also connected to the recognition module of the speech recognizer by the data bus and control signal lines. The speech recognizer is in turn connected directly to the memory.

The function of this circuit is for the microprocessor to pass the speech features to the recognition module of the speech recognizer, which compares them with the feature values in memory and passes the result back to the microprocessor.
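The comparison step can be illustrated with a nearest-template search. The patent does not say how features are compared, so the Euclidean distance used here is an assumption, and the function name `recognize` and the sample feature vectors are hypothetical. A practical recognizer would also have to cope with utterances of different lengths, which this sketch ignores.

```python
import math


def recognize(features, stored):
    """Return the key of the stored template closest to `features`.

    Assumption: all feature vectors have the same length and are compared
    by Euclidean distance.
    """
    return min(stored, key=lambda key: math.dist(features, stored[key]))


stored = {"card-1": [25.0, 65.0], "card-2": [5.0, 90.0]}
best = recognize([24.0, 66.0], stored)
```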

The appearance of the invention is shown in Fig. 2. The notepad can be held in the palm and operated with one hand. It is fitted with a microphone, a confirm key, a cancel key, a knob, a display screen, and a loudspeaker.

Compared with existing PDAs, the invention has the following advantages:

A voice memo does not need speech to be converted to text, so system requirements are low and there is no misrecognition problem. The ten digits are entered by speech recognition; because the vocabulary is small, the recognition rate is very high and recognition is fast. Because the device learns from a specific speaker, it is not restricted by accent or language. Voice input is therefore currently the most convenient input method. All input, lookup, and read-back are performed by voice, and every operation has a voice prompt, so blind or visually impaired people can also use the device easily. The technology is mature, and the price is close to that of a low-end PDA.

Description of drawings:

Fig. 1 is a system diagram of the speech input memorandum.

Fig. 2 is an external view of the speech input memorandum:

1 - line output jack; 2 - confirm key; 3 - cancel key; 4 - loudspeaker; 5 - display screen; 6 - knob; 7 - microphone.

Fig. 3 is the speech retrieval flowchart.

Fig. 4 is the voice input flowchart.

Fig. 5 is the flowchart for voice input of digits.

Fig. 6 is the voice output flowchart.

Embodiment:

The invention can run on two AAA batteries for more than 30 days. The appearance of the product is shown in Fig. 2: the casing is streamlined, elegant, compact, and easy to operate. The front display is a 16×80 dot-matrix LCD with 16 prominent icons.

The knob and the confirm/cancel key are mounted on the left and right sides of the product respectively; the line output jack and the microphone are at the top, and the loudspeaker is at the bottom.

Adjusting knob: turning it up increments and turning it down decrements, making it easy to select the digits 0-9, the letters A-Z, and each menu option.
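The increment/decrement behavior of the knob can be sketched as an index into a list of options. The class name `Knob` is hypothetical, and wrapping around at the ends of the list is an assumption; the text does not say what happens when the knob is turned past the last option.

```python
import string

# Digits 0-9 followed by letters A-Z, the symbols the knob is said to select.
SYMBOLS = list(string.digits + string.ascii_uppercase)


class Knob:
    def __init__(self, options):
        self.options = options
        self.index = 0

    def turn(self, steps):
        """Positive steps turn up (increment); negative steps turn down (decrement)."""
        # Assumption: the selection wraps around at either end.
        self.index = (self.index + steps) % len(self.options)
        return self.options[self.index]


knob = Knob(SYMBOLS)
```

The same class works for menu options by passing a list of menu labels instead of `SYMBOLS`.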

Confirm/cancel key: a long strip in appearance, with a switch at each end inside. Pressing the upper part confirms; pressing the lower part cancels.

Recorded content of the notepad:

It can hold 200 business cards or 30 minutes of voice. As business cards are added, the available recording time decreases.

Each business card has a name recording of at most 2 seconds and a personal information recording of at most 5 seconds, plus up to 5 telephone numbers: phone 1, phone 2, fax, home phone, and pager (BP).

Digital recordings can be divided into several segments totaling at most 30 minutes. A playback time can be set for each segment, at which it starts playing automatically, serving as an appointment or schedule reminder.
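Segmented recording with per-segment playback times can be modeled as below. This is a rough sketch: the names `Segment`, `can_add`, and `due_segments` are hypothetical, the dates are arbitrary sample values, and the 30-minute limit is applied on its own, ignoring the memory shared with business cards.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

MAX_TOTAL_SECONDS = 30 * 60  # total recording limit stated in the text


@dataclass
class Segment:
    seconds: int
    play_at: Optional[datetime] = None  # None means no scheduled playback


def can_add(segments, seconds):
    """True if a new segment of this length still fits within the total limit."""
    return sum(s.seconds for s in segments) + seconds <= MAX_TOTAL_SECONDS


def due_segments(segments, now):
    """Segments whose scheduled playback time has been reached."""
    return [s for s in segments if s.play_at is not None and s.play_at <= now]


segments = [
    Segment(600, datetime(1998, 9, 9, 9, 0)),
    Segment(900),
    Segment(120, datetime(1998, 9, 9, 18, 0)),
]
now = datetime(1998, 9, 9, 12, 0)
```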

Line output and microphone interface:

Through this interface, the user can connect their own earphone and microphone. Recordings can also be transferred to a computer through the line output for speech recognition.

Function and operation:

The notepad works as follows (see Fig. 1):

Voice input: the analog speech signal from the microphone passes through the amplifier-filter and is converted to a digital stream by the A/D converter, which feeds the speech recognizer and the speech compressor simultaneously. The feature extraction module of the speech recognizer extracts speech features from the speech data and passes them to the microprocessor over the data bus. The speech compressor compresses the speech data and likewise passes it to the microprocessor over the data bus. Once the processor has the speech data and the speech features, it stores them in separate regions of the memory.

On retrieval, the processor passes the speech features to the recognition module of the speech recognizer, which compares them with the feature values in memory and passes the result back to the microprocessor.

Voice output: as shown in Fig. 6, the processor reads the compressed speech data from memory and passes it to the speech decompressor. The decompressor sends the decompressed data stream to the D/A converter, where it is converted to an analog signal and, after amplification, output through the loudspeaker.

1. Main menu:

The main menu appears after power-on. Turning the knob moves the cursor among options such as "retrieve business card", "browse business cards", "review recordings", "enter business card", "record", "defaults", "memory status", "time setting", "voice training", and "games"; the dot-matrix LCD shows an icon for each function. The cursor moves as the knob is turned, and after it rests for 1 second the device speaks the name of the function. Pressing the confirm key enters the function; pressing "cancel" returns to the previous menu.
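The rule that a function name is spoken once the cursor has rested for 1 second can be sketched with a small timer-driven class. The class `MainMenu`, its method names, and the paraphrased menu labels are illustrative assumptions; time is passed in explicitly as seconds so the logic can be followed without a real clock.

```python
# Menu labels paraphrased from the description.
MENU = ["retrieve card", "browse cards", "review recordings", "enter card",
        "record", "defaults", "memory status", "time setting",
        "voice training", "games"]
DWELL_SECONDS = 1.0  # rest time before the function name is spoken


class MainMenu:
    def __init__(self, items):
        self.items = items
        self.cursor = 0
        self.last_move = 0.0
        self.announced = False

    def turn(self, steps, now):
        """Move the cursor and restart the dwell timer."""
        self.cursor = (self.cursor + steps) % len(self.items)
        self.last_move = now
        self.announced = False

    def tick(self, now):
        """Return the name to speak once per rest period, otherwise None."""
        if not self.announced and now - self.last_move >= DWELL_SECONDS:
            self.announced = True
            return self.items[self.cursor]
        return None


menu = MainMenu(MENU)
menu.turn(2, now=0.0)
spoken = [menu.tick(t) for t in (0.5, 1.0, 1.5)]
```

The `announced` flag keeps the name from being repeated on every tick while the cursor stays put.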

2. Business card input

On entering the "business card input" function, the cursor moves among "name input", "personal information", "phone input", and "name pinyin input".

1) The name input process is shown in Fig. 3.

First move the cursor to "name". After one second with no action, the voice prompt "please read the name" is given. When "confirm" is pressed and held, the microprocessor initializes the A/D converter, the speech compressor, and the speech recognizer, and recording begins. Recording ends when the confirm key is released; if it exceeds 2 seconds, a timeout warning is given. Recording again overwrites the previous content. The speech signal picked up by the microphone is amplified and filtered, then converted by the A/D converter. The speech recognizer extracts the speech features and stores the feature data in memory via the control chip. At the same time, the speech compressor compresses the data and stores it in memory via the control chip.

2) Personal information input: first move the cursor to "personal information". After one second with no action, the voice prompt "personal information" is given. Press and hold "confirm" to start recording and release to finish; if the recording exceeds 5 seconds, a timeout warning is given. Recording again overwrites the previous content. During recording, the speech is only digitized, compressed, and stored; the recognizer is not used to extract speech features.
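The hold-to-record rules for the name and personal information fields (a length limit with a timeout warning, and overwrite on re-recording) can be sketched as follows. The function `record_take` is hypothetical. The text does not say whether an over-length take is truncated or discarded; this sketch assumes it is not stored.

```python
NAME_LIMIT_SECONDS = 2.0  # stated limit for the name recording
INFO_LIMIT_SECONDS = 5.0  # stated limit for the personal information recording


def record_take(card, slot, held_seconds, limit):
    """Store a take in `slot`, overwriting any earlier one; return True on timeout.

    Assumption: a take longer than `limit` only raises the warning
    and leaves the previously stored content untouched.
    """
    if held_seconds > limit:
        return True
    card[slot] = held_seconds  # stand-in for the compressed audio of this take
    return False


card = {}
warn1 = record_take(card, "name", 1.5, NAME_LIMIT_SECONDS)
warn2 = record_take(card, "name", 1.2, NAME_LIMIT_SECONDS)  # overwrites the first take
warn3 = record_take(card, "name", 2.5, NAME_LIMIT_SECONDS)  # too long, warning only
warn4 = record_take(card, "info", 4.0, INFO_LIMIT_SECONDS)
```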

3) Telephone number input:

Numbers can be entered with the knob or by voice; the voice input process is shown in Fig. 5.

Use the knob to move the cursor to any of phone 1, phone 2, home phone, fax, or pager (BP), and press "confirm" to enter. Turning the knob selects among voice input, the digits 0-9, space, backspace, and end. When a digit 0-9 is displayed, pressing "confirm" enters it and moves to the next digit; choosing "backspace" deletes one digit, and holding backspace for 2 seconds clears the whole number. Voice input and backspace are shown as icons. With voice input selected, press and hold "confirm", read one digit, and release; recognition then takes place. The microprocessor first initializes the A/D converter and the recognizer; the speech signal picked up by the microphone is amplified and filtered, then converted by the A/D converter. The recognition module extracts the speech features, compares them with the stored features of the digits 0-9 in memory, and passes the result to the microprocessor. The microprocessor makes the display show the corresponding digit 0-9 and then moves automatically to the next position. Pressing "cancel" stores the number and returns to the previous level.
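The editing rules for a telephone number (append a digit or space, delete one character, hold backspace for 2 seconds to clear) can be sketched as a small editor. `NumberEditor` and its methods are hypothetical names, and the digits used are arbitrary sample input; whether a digit comes from the knob or from speech recognition makes no difference to this part.

```python
CLEAR_HOLD_SECONDS = 2.0  # holding backspace this long clears the whole number


class NumberEditor:
    def __init__(self):
        self.chars = []

    def enter(self, ch):
        """Append one digit or a space."""
        if len(ch) != 1 or ch not in "0123456789 ":
            raise ValueError("only single digits or a space can be entered")
        self.chars.append(ch)

    def backspace(self, held_seconds=0.0):
        """Delete one character, or everything after a long hold."""
        if held_seconds >= CLEAR_HOLD_SECONDS:
            self.chars.clear()
        elif self.chars:
            self.chars.pop()

    @property
    def text(self):
        return "".join(self.chars)


editor = NumberEditor()
for ch in "6299":
    editor.enter(ch)
editor.backspace()
after_delete = editor.text
editor.backspace(held_seconds=2.0)
after_clear = editor.text
```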

4) Pressing "cancel" prompts whether to save; after the user chooses, the device returns to the main menu.

3. Business card retrieval:

There are two modes: browsing and speech retrieval.

1) Speech retrieval:

The retrieval process is shown in Fig. 4.

Select business card retrieval from the main menu; the LCD and a voice prompt tell the user to begin recording. The user presses and holds "confirm" while reading the name, then releases. The microprocessor starts the A/D converter and the speech recognizer. The speech signal picked up by the microphone is amplified and filtered, then converted by the A/D converter. The recognition module extracts the speech features, compares them with the features of the names stored in memory, and passes the result to the microprocessor. The microprocessor displays the corresponding record according to the result and reads out the name, address, and phone in order. Pressing "confirm" stops playback and enters the editing state. When there are several candidates with similar pronunciation, turning the knob switches between the candidate cards; the display shows the sequence number and the first telephone number. After a 1-second pause, the name, personal information, and phone are read out in order. Pressing "confirm" stops playback and enters the editing state; "cancel" returns to the main menu.
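One way to produce the list of similar-sounding candidates that the knob cycles through is to rank stored names by distance and keep those close to the best match. The patent gives no criterion for "similar", so the Euclidean distance, the additive `margin`, the function name `candidates`, and the sample feature vectors are all assumptions made for illustration.

```python
import math


def candidates(query, stored, margin=0.5):
    """Return stored keys ranked by distance, keeping those near the best match.

    Assumption: a candidate is "similar" if its distance exceeds the best
    distance by no more than `margin`.
    """
    scored = sorted((math.dist(query, feats), key) for key, feats in stored.items())
    best_distance = scored[0][0]
    return [key for d, key in scored if d - best_distance <= margin]


stored = {
    "card-1": [1.0, 2.0],
    "card-2": [1.0, 2.5],
    "card-3": [9.0, 9.0],
}
ranked = candidates([1.0, 2.1], stored)
```

With a single clear match the list has one entry and the record can be shown directly; with several entries, the list order gives a natural sequence number for each candidate.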

In the editing state, the knob moves the cursor among name, remarks, phone, and the "delete business card" icon; when the cursor rests on the name or address for one second, the item is read aloud.

Pressing "confirm" enters the selected item for modification; "cancel" returns to the previous level.

4. digital recording

1) recording

Select the recording function and press and hold the confirm key to begin recording.

After recording, the device asks whether to schedule playback; if "yes" is selected, it goes on to ask for the time to be set. The recording can then be played.

2) retrieval

Recordings can only be reviewed in sequence. During review, the recording length and the set playback time are displayed.

Playback begins after a 1-second pause. After entering with "confirm", the scheduled playback time can be changed or the recording deleted.

5. System settings

The recognition parameters, the volume, and whether voice prompts are on or off can be set.

Every operation has a Chinese-character or graphic prompt as well as a voice prompt; the voice prompt can be turned off.

6. Learning the digits 0-9

Each digit is read twice. After the speech is captured and digitized, the recognizer extracts the features and saves them in memory. Once learning succeeds, digits can be entered by voice.
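The training step can be sketched as building one template per digit from its two readings. The text only says that each digit is read twice and the features are saved; averaging the two feature vectors, as done here, is an assumption (storing both readings separately would be another option). The function name `enroll` and the sample vectors are hypothetical.

```python
def enroll(readings):
    """Average equal-length feature vectors from repeated readings of one word."""
    count = len(readings)
    return [sum(values) / count for values in zip(*readings)]


templates = {}
# Two readings of the digit "3", as illustrative feature vectors.
templates["3"] = enroll([[2.0, 4.0, 6.0], [4.0, 6.0, 8.0]])
```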

Claims

1. A speech input memorandum consisting of a microphone, an amplifier-filter, an A/D converter, a D/A converter, a speech recognizer, a speech compressor, a speech decompressor, a data bus, control signal lines, a main control microprocessor, an amplifier, a loudspeaker, a memory, keys, a knob, and a display screen, characterized in that: the microphone is connected to the amplifier-filter; the amplifier-filter is connected to a high-precision A/D converter; the A/D converter is connected to both the speech recognizer and the speech compressor; the speech recognizer is connected to the main control microprocessor by the data bus and control signal lines, and the speech compressor is connected to the main control microprocessor by the data bus and control signal lines; the microprocessor is connected to the memory; the microprocessor is also connected to the speech decompressor, the speech decompressor is connected to the D/A converter, the D/A converter is connected to the amplifier, and the amplifier is connected to the loudspeaker; the microprocessor is also connected to the recognition module of the speech recognizer by the data bus and control signal lines, and the speech recognizer is in turn connected directly to the memory.

2. The speech input memorandum according to claim 1, characterized in that the casing of the notepad is fitted with a microphone, a confirm key, a cancel key, a knob, a display screen, and a loudspeaker.