CN104778865A - Method for conducting spoken language correction through speech recognition technology and language learning machine

Info

Publication number: CN104778865A
Application number: CN201410023709.3A
Authority: CN
Inventors: 王萍丽
Original assignee: Individual
Current assignee: Individual
Priority date: 2014-01-14
Filing date: 2014-01-14
Publication date: 2015-07-15

Abstract

The invention relates to a method for correcting spoken language using speech recognition technology: the user's spoken pronunciation is recognized by speech recognition so that it can be corrected. The invention further relates to a language learning machine. An audio input/output module is connected to a speech recognition module and a central processing unit, and the speech recognition module, a storage unit and a display module are each connected to the central processing unit. A speech recognition database is stored in the storage unit, and the speech recognition module calls the speech recognition database to analyze and process speech signals input through the audio input/output module. The processed data are transmitted to the central processing unit, which forwards the data received from the speech recognition module to the display module, where they are output as an image. Using speech recognition technology, the user's pronunciation is compared with the original standard pronunciation in both image and sound, so that pronunciation accuracy can be judged accurately and corrected promptly, rapidly improving both the user's spoken pronunciation and the user's listening level.

Description

A method for spoken-language correction using speech recognition technology, and a language learning machine

Technical field

The invention belongs to the field of teaching aids, and specifically relates to a method for spoken-language correction using speech recognition technology and to a language learning machine.

Background art

At present, the domestic environment offers few opportunities to learn and practice a foreign language, and listening and speaking are the difficult points of foreign language teaching. Most school English teaching still emphasizes theoretical knowledge of the language, so the foreign language that students learn remains largely "dumb English": their theoretical knowledge is relatively strong, but their listening and speaking abilities are very poor. There is currently no good method or tool for foreign language listening and speaking study.

Summary of the invention

The object of the invention is to provide a method for spoken-language correction using speech recognition technology that can rapidly improve listening and speaking ability in foreign language learning. The steps of the method are:

Step 1: the user pronounces the source information aloud into a microphone;

Step 2: the user's spoken pronunciation is recognized by speech recognition technology;

Step 3: the recognized pronunciation is displayed in text form;

Step 4: the user compares the recognized text with the source information to judge whether the pronunciation is standard and accurate. If the recognition result is inconsistent with the source information, the user corrects the pronunciation and repeats the above process; if it is consistent, the user proceeds to the next cycle.
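The four steps above can be sketched as a simple practice loop. The following Python is only an illustration under stated assumptions: `recognize` is a hypothetical stand-in for whatever speech recognizer the device uses, `normalize` and `practice` are names invented here, and the source information is assumed to be standard text. The patent leaves the comparison in Step 4 to the user; automating it as a string comparison is an assumption made for this sketch.

```python
import string
from typing import Callable, Iterable


def normalize(text: str) -> str:
    """Lower-case, drop ASCII punctuation, and collapse whitespace,
    so that only the words themselves are compared."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(text.lower().translate(table).split())


def practice(source_text: str,
             attempts: Iterable[bytes],
             recognize: Callable[[bytes], str]) -> int:
    """Run the recognize/display/compare cycle over successive attempts.

    Returns the 1-based number of the first attempt whose recognized text
    matches the source text, or 0 if no attempt matched.
    `recognize` is a hypothetical callable: audio bytes in, text out.
    """
    target = normalize(source_text)
    for n, audio in enumerate(attempts, start=1):   # Step 1: one spoken attempt
        recognized = recognize(audio)               # Step 2: speech recognition
        print(f"attempt {n}: {recognized}")         # Step 3: show result as text
        if normalize(recognized) == target:         # Step 4: compare with source
            return n                                # consistent: go to next item
    return 0                                        # never matched: keep practicing
```

For example, with a stub recognizer that simply decodes the bytes as text, `practice("Good morning!", [b"good mourning", b"Good morning"], lambda a: a.decode("utf-8"))` returns 2, because only the second attempt matches once case and punctuation are ignored.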

To achieve a better listening and speaking effect, the method further comprises, before the user pronounces in Step 1, playing the standard pronunciation by voice playback so that the user can read along after it.

The source information comprises standard text, standard pronunciation, or the like.

The invention also provides a language learning machine that can rapidly improve listening and speaking ability in foreign language learning. The device comprises a housing and a power supply, and further comprises a microphone, an audio input/output module, a speech recognition module, a central processing unit, a display module, a storage unit, and the like. The audio input/output module is connected to the speech recognition module and to the central processing unit; the speech recognition module, the storage unit and the display module are each connected to the central processing unit. The storage unit stores a speech recognition database. The speech recognition module calls the speech recognition database to analyze and process the speech signal input through the audio input/output module, and transmits the processed data to the central processing unit. The central processing unit transmits the data received from the speech recognition module to the display module, which outputs them as an image.

The display module comprises a liquid crystal display, which performs the image output.

The device further comprises a loudspeaker, which is connected to the audio input/output module and outputs, as sound, the signal transmitted from the audio input/output module.

The microphone and the audio input/output module, or the loudspeaker and the audio input/output module, communicate by wireless signal transmission.

The device further comprises a data input/output module connected to the central processing unit. The data input/output module comprises a card reader, a USB interface, Bluetooth, or the like, and connects the device to an external device. Through the data input/output module, information stored in the storage unit is transmitted to the external device, or information stored on the external device is transmitted to the language learning machine, where it is shown by the display module or played through the loudspeaker.

The device further comprises a network module, which comprises a Wi-Fi, GPRS, 3G or 4G wireless communication unit, a wired communication unit, or the like. The network module connects to a network by wireless or wired communication and uses network resources for signal processing, data upload and data download.

In operation, the central processing unit of the language learning machine accepts an instruction and transmits the language information stored in the storage unit to the display module, which displays it, and to the audio input/output module, which converts it to sound that is output through the loudspeaker. The user sees the displayed information, hears the speech played by the language learning machine, and reads along into the microphone. The microphone transmits the speech signal to the audio input/output module, which converts the microphone's analog signal into a digital speech signal and transmits the processed digital speech signal to the speech recognition module and to the central processing unit. The speech recognition module performs speech recognition using the speech recognition database in the storage unit or a network database, and transmits the recognized signal to the central processing unit. The central processing unit either transmits the digital speech signal input from the audio input/output module to the memory for storage, or returns it to the audio input/output module to be converted to sound and output through the loudspeaker. Likewise, the central processing unit either transmits the signal input from the speech recognition module to the memory for storage, or transmits it to the display module to be output as an image. The user compares the displayed recognition result with the original content, and compares the read-along sound with the original sound, to judge whether his or her own pronunciation is standard; through continual correction, spoken pronunciation improves.
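The signal path described above (microphone, audio input/output module, speech recognition module, central processing unit, then storage and display) can be sketched in a few lines of Python. This is a toy model under loud assumptions: all class and function names are hypothetical, the "speech recognition database" is reduced to an exact-match lookup table, and analog-to-digital conversion is reduced to uniform quantization. It shows only the routing between modules, not real speech recognition.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple

Digital = Tuple[int, ...]  # a digitized speech signal (hypothetical representation)


def digitize(analog: Sequence[float], levels: int = 256) -> Digital:
    """Audio I/O module: clamp analog samples to [-1.0, 1.0] and
    quantize them uniformly to integer codes 0 .. levels-1."""
    top = levels - 1
    return tuple(round((min(1.0, max(-1.0, s)) + 1.0) / 2.0 * top) for s in analog)


@dataclass
class SpeechRecognitionModule:
    # Toy stand-in for the speech recognition database held in the storage unit.
    database: Dict[Digital, str]

    def recognize(self, digital: Digital) -> str:
        """Look the digital signal up in the database (exact match only)."""
        return self.database.get(digital, "<unrecognized>")


@dataclass
class LearningMachine:
    recognizer: SpeechRecognitionModule
    storage: List[Digital] = field(default_factory=list)  # stored recordings
    display: List[str] = field(default_factory=list)      # lines shown on screen

    def follow_read(self, analog: Sequence[float]) -> str:
        """Route one read-along utterance through the modules."""
        digital = digitize(analog)                 # audio I/O: analog -> digital
        text = self.recognizer.recognize(digital)  # speech recognition module
        self.storage.append(digital)               # CPU: store the recording
        self.display.append(text)                  # CPU: send result to display
        return text
```

A real device would replace the lookup table with an acoustic and language model; the point of the sketch is that the central processing unit only routes data between the recognition, storage and display modules.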

By using speech recognition technology to compare the user's pronunciation with the original standard pronunciation in both image and sound, the device allows pronunciation accuracy to be judged accurately and corrected promptly, rapidly improving the user's spoken pronunciation as well as the user's listening level.

Brief description of the drawings

Fig. 1 is a flowchart of embodiment 1;

Fig. 2 is a flowchart of embodiment 2;

Fig. 3 is an overall block diagram of embodiment 3;

Fig. 4 is an overall block diagram of embodiment 4;

Fig. 5 shows the microphone of embodiment 4;

Fig. 6 shows the loudspeaker of embodiment 4;

Fig. 7 is an overall block diagram of embodiment 5.

Wherein: 1-housing, 2-power supply, 3-microphone, 4-loudspeaker, 5-audio input/output module, 6-speech recognition module, 7-central processing unit, 8-display module, 8.1-liquid crystal display, 9-storage unit, 10-data input/output module, 11-network module.

Embodiments

Embodiment 1

As shown in Fig. 1, a method for spoken-language correction using speech recognition technology, the steps of the method being:

Embodiment 2

As shown in Fig. 2, a method for spoken-language correction using speech recognition technology, the steps of the method being:

Step 1: the standard pronunciation is played by voice playback;

Step 2: the user pronounces the source information aloud into a microphone;

Step 3: the user's spoken pronunciation is recognized by speech recognition technology;

Step 4: the recognized pronunciation is displayed in text form;

Step 5: the user compares the recognized text with the source information to judge whether the pronunciation is standard and accurate. If the recognition result is inconsistent with the source information, the user corrects the pronunciation and repeats the above process; if it is consistent, the user proceeds to the next cycle.
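In the final step the user compares the recognized text with the source text by eye. As a hedged illustration of what that comparison surfaces, the sketch below lists the word spans where the two texts differ, using Python's standard `difflib.SequenceMatcher`. The function name `differing_words` is hypothetical, and highlighting differences automatically is an assumption of this sketch, not something the patent describes.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def differing_words(source_text: str, recognized_text: str) -> List[Tuple[str, str]]:
    """Return (expected, recognized) word-span pairs where the texts differ.

    An empty list means the recognized text is consistent with the source,
    ignoring letter case and spacing.
    """
    expected = source_text.lower().split()
    heard = recognized_text.lower().split()
    matcher = SequenceMatcher(a=expected, b=heard, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # 'replace', 'delete' or 'insert'
            pairs.append((" ".join(expected[i1:i2]), " ".join(heard[j1:j2])))
    return pairs
```

For the source text "I think it is three" and the recognized text "I sink it is tree", the function reports the pairs ("think", "sink") and ("three", "tree"), which are the words the user would need to practice again.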

Embodiment 3

As shown in Fig. 3, the language learning machine comprises a housing 1, a power supply 2, a microphone 3, a loudspeaker 4, an audio input/output module 5, a speech recognition module 6, a central processing unit 7, a display module 8 and a storage unit 9. The audio input/output module 5 is connected to the speech recognition module 6 and to the central processing unit 7; the speech recognition module 6, the storage unit 9 and the display module 8 are each connected to the central processing unit 7. The storage unit 9 stores a speech recognition database. The speech recognition module 6 calls the speech recognition database to analyze and process the speech signal input through the audio input/output module 5, and transmits the processed data to the central processing unit 7. The central processing unit 7 transmits the data received from the speech recognition module 6 to the display module 8, which outputs them as an image.

The display module 8 comprises a liquid crystal display 8.1, which performs the image output.

The loudspeaker 4 is connected to the audio input/output module 5 and outputs, as sound, the signal transmitted from the audio input/output module 5.

The microphone 3 and the audio input/output module 5, and the loudspeaker 4 and the audio input/output module 5, transmit signals over wired connections.

Embodiment 4

As shown in Figs. 4 to 6, the language learning machine comprises a housing 1, a power supply 2, a microphone 3, a loudspeaker 4, an audio input/output module 5, a speech recognition module 6, a central processing unit 7, a display module 8 and a storage unit 9. The audio input/output module 5 is connected to the speech recognition module 6 and to the central processing unit 7; the speech recognition module 6, the storage unit 9 and the display module 8 are each connected to the central processing unit 7. The storage unit 9 stores a speech recognition database. The speech recognition module 6 calls the speech recognition database to analyze and process the speech signal input through the audio input/output module 5, and transmits the processed data to the central processing unit 7. The central processing unit 7 transmits the data received from the speech recognition module 6 to the display module 8, which outputs them as an image.

The microphone 3 and the audio input/output module 5, and the loudspeaker 4 and the audio input/output module 5, communicate by wireless signal transmission.

The device further comprises a data input/output module 10 connected to the central processing unit 7. The data input/output module 10 comprises a card reader, a USB interface, Bluetooth, or the like, and connects the device to an external device. Through the data input/output module 10, information stored in the storage unit is transmitted to the external device, or information stored on the external device is transmitted to the language learning machine, where it is shown by the display module or played through the loudspeaker.

Embodiment 5

As shown in Fig. 7, the device further comprises a network module 11, which comprises a Wi-Fi, GPRS, 3G or 4G wireless communication unit or a wired communication unit. The network module 11 connects to a network by wireless or wired communication and uses network resources for signal processing, data upload and data download.

The rest is the same as in embodiment 4.

Claims

1. A method for spoken-language correction using speech recognition technology, the steps of the method being:

2. The method for spoken-language correction using speech recognition technology according to claim 1, further comprising: before the user pronounces in step 1, playing the standard pronunciation by voice playback, the user reading along after it.

3. A foreign language learning machine comprising a housing and a power supply, characterized in that it further comprises a microphone, an audio input/output module, a speech recognition module, a central processing unit, a display module and a storage unit; the audio input/output module is connected to the speech recognition module and to the central processing unit; the speech recognition module, the storage unit and the display module are each connected to the central processing unit; the storage unit stores a speech recognition database; the speech recognition module calls the speech recognition database to analyze and process the speech signal input through the audio input/output module and transmits the processed data to the central processing unit; and the central processing unit transmits the data received from the speech recognition module to the display module, which outputs them as an image.

4. The foreign language learning machine according to claim 3, characterized in that the display module comprises a liquid crystal display, which performs the image output.

5. The foreign language learning machine according to claim 3, characterized in that it further comprises a loudspeaker, which is connected to the audio input/output module and outputs, as sound, the signal transmitted from the audio input/output module.

6. The foreign language learning machine according to claim 3, characterized in that the microphone and the audio input/output module communicate by wireless signal transmission.

7. The foreign language learning machine according to claim 6, characterized in that the loudspeaker and the audio input/output module communicate by wireless signal transmission.

8. The foreign language learning machine according to claim 3, characterized in that it further comprises a data input/output module connected to the central processing unit.

9. The foreign language learning machine according to claim 3, characterized in that it further comprises a network module, through which a network connection is made.

10. The foreign language learning machine according to claim 3, characterized in that the speech recognition module may integrate the audio input/output module.