CN111429884A - Speech recognition rate analysis system - Google Patents

Publication number
CN111429884A
Authority
CN
China
Prior art keywords
module
microphone array
recognition rate
test
analysis system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010244371.XA
Other languages
Chinese (zh)
Other versions
CN111429884B (en)
Inventor
潘浩贤
蔡伟雄
严冬
冼佳莉
陈南洲
陈晓燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Foshan University
Original Assignee
Foshan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-03-31
Filing date
2020-03-31
Publication date
2020-07-17
Application filed by Foshan University
Priority to CN202010244371.XA
Publication of CN111429884A
Application granted
Publication of CN111429884B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/01 Assessment or evaluation of speech recognition systems

Abstract

The invention discloses a speech recognition rate analysis system, comprising: a first microphone array for acquiring test audio; a display module for displaying the function options of the system for the user to select by clicking; a wireless transmitting module and a wireless receiving module that cooperate to receive feedback information from the voice module under test; a test audio delivery module comprising a first loudspeaker and a second microphone array, wherein the first loudspeaker plays the test audio and the second microphone array collects the test audio signal played by the first loudspeaker; and a distance measurement module for measuring the distance between the processing module and the voice module under test. The invention makes the voice analysis system more intelligent and provides a good hardware environment for accurate testing of the speech recognition rate.

Description

Speech recognition rate analysis system
Technical Field
The invention relates to the technical field of voice recognition, in particular to a voice recognition rate analysis system.
Background
Speech recognition is currently a very popular technology and has been put to good use in many industries.
The recognition rates reported for existing voice products have low reference value, because the test process lacks analysis of the environmental parameters and functional parameters in the recognition state.
There are two main existing methods for testing the speech recognition rate: software simulation testing and manual testing. The former inputs audio signals to the voice module through software and obtains the recognition result on a computer. The latter has a large number of testers repeatedly test, record, upload data, and perform statistical analysis on site; it consumes a large amount of human resources, its operating steps are complicated, and its efficiency is low.
At present, a small number of voice recognition devices are tested with purpose-built hardware systems, but the parameters tested are of little value for judging, analyzing, and measuring the performance of a voice module, and the hardware architecture is complex.
In the few existing hardware test systems, most test objects are complete voice recognition products, so the tests are difficult to adapt to different devices and their universality is low.
Therefore, the market urgently needs a speech recognition rate analysis system that can be built from a simpler hardware structure and that provides a good hardware environment for accurate testing of the speech recognition rate.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a voice recognition rate analysis system, which can be built through a simpler hardware structure and can provide a good hardware environment for completing accurate test of the voice recognition rate.
The solution of the invention for solving the technical problem is as follows: a speech recognition rate analysis system comprising:
a first microphone array for conducting acquisition of test audio;
the display module is used for displaying the function options of the system and providing click selection for a user;
the wireless transmitting module and the wireless receiving module are used for receiving feedback information of the voice module of the test object in a matched manner;
the test audio delivery module comprises a first loudspeaker and a second microphone array, wherein the first loudspeaker is used for playing the test audio, and the second microphone array is used for collecting a test audio signal played by the first loudspeaker;
the distance measurement module is used for measuring the distance between the processing module and the tested voice module;
the processing module comprises:
an ambient noise measurement unit for measuring a degree of ambient noise by a sound pressure level;
the multimedia encoder is used for filtering the test audio collected by the first microphone array, performing A/D conversion on the filtered test audio and storing the converted test audio into a FLASH cache in the form of a WAV file;
and the multimedia decoder is used for performing D/A conversion on the WAV file when the WAV file is called by the serial port, performing power amplification and then broadcasting the WAV file by the first loudspeaker.
Further, the display module is embedded in the center of a box body, the first microphone array is arranged around the lower part of the box body, the distance measurement module and the wireless receiving module are arranged on the left and right sides of the center of the lower rear part of the box body, and the first loudspeaker is arranged at the center of the upper rear part of the box body.
Further, the wireless transmitting module and the wireless receiving module respectively comprise an infrared transmitting module and an infrared receiving module.
Further, the distance measurement module is specifically a laser ranging module comprising a laser emitter and a SPAD infrared receiver.
Further, the display module comprises a TFT LCD display screen.
Further, the processing module comprises a micro control module of STM32F10X series and peripheral circuits thereof.
Further, the second microphone array is arranged at a tested voice module, and a second loudspeaker is further arranged at the tested voice module.
The beneficial effects of the invention are as follows: the invention provides a speech recognition rate analysis system in which, once the voice module under test (the test object) is externally connected to the wireless transmitting module, the system interacts directly with the test object, replacing manual speaking and manual result recording. This makes the voice analysis system more intelligent and provides a good hardware environment for accurate testing of the speech recognition rate.
Drawings
In order to more clearly illustrate the technical solution in the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly described below. It is clear that the described figures are only some embodiments of the invention, not all embodiments, and that a person skilled in the art can also derive other designs and figures from them without inventive effort.
FIG. 1 is a schematic diagram of a front side of a display hardware portion of a speech recognition rate analysis system according to the present invention;
FIG. 2 is a schematic diagram of the rear side of the hardware portion of the display of a speech recognition rate analysis system of the present invention;
FIG. 3 is a system diagram of a speech recognition rate analysis system of the present invention;
FIG. 4 is a functional flow diagram of a speech recognition rate analysis system of the present invention;
FIG. 5 is a schematic diagram illustrating the generation and playing principle of the test audio of the speech recognition rate analysis system according to the present invention.
Detailed Description
The conception, the specific structure and the technical effects of the present invention will be clearly and completely described below in conjunction with the embodiments and the accompanying drawings to fully understand the objects, the features and the effects of the present invention. It is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments, and those skilled in the art can obtain other embodiments without inventive effort based on the embodiments of the present invention, and all embodiments are within the protection scope of the present invention. In addition, all the connection relations mentioned herein do not mean that the components are directly connected, but mean that a better connection structure can be formed by adding or reducing connection accessories according to the specific implementation situation. All technical characteristics in the invention can be interactively combined on the premise of not conflicting with each other.
Embodiment 1, referring to fig. 1, fig. 2, fig. 3, fig. 4, and fig. 5, a speech recognition rate analysis system includes:
a first microphone array 110, the first microphone array 110 for conducting an acquisition of test audio;
a display module 120 for displaying the function options of the system for the user to select by clicking;
a wireless transmitting module 131 and a wireless receiving module 132, which cooperate to receive feedback information from the voice module 200 under test;
the wireless transmitting module 131 is placed at the voice module 200 under test, and the wireless receiving module 132 is connected by pins to the processing module 160 and controlled by it;
a test audio delivery module, wherein the test audio delivery module includes a first speaker 141 and a second microphone array 142, the first speaker 141 is used for playing the test audio, and the second microphone array 142 is used for collecting a test audio signal played by the first speaker 141;
a ranging module 150, wherein the ranging module 150 is used for measuring the distance between the processing module 160 and the tested voice module 200;
the processing module 160 includes:
an ambient noise measurement unit for measuring a degree of ambient noise by a sound pressure level;
a multimedia encoder 161, wherein the multimedia encoder 161 is configured to filter the test audio collected by the first microphone array 110, perform A/D conversion on the filtered test audio, and store the converted test audio in WAV file form in a FLASH buffer;
and the multimedia decoder 162 is used for performing D/A conversion on the WAV file when the WAV file is called by a serial port, performing power amplification and then playing the WAV file by the first loudspeaker 141.
Specifically, the first microphone array 110 and the second microphone array 142 are used to record high-fidelity test audio on site, that is, to detect the voice recognized by the voice module 200, and the audio is played through the first speaker 141. Before testing, the spoken command voice is stored in the memory and is played back during testing to achieve the effect of a manual test, so that manual operation is replaced and testing is objective, scientific, and efficient.
When the display module 120 starts, it shows five virtual buttons as function options: "audio recording", "audio playing", "distance testing", "noise testing", and "automatic testing".
The automatic test repeats automatically and retains the test results; the other tests are single tests.
On entering the "automatic test" option, the user sets and selects the test audio, the number of tests, the audio playing interval, and the placement position, and clicks "test" to start the test.
First, the ambient sound signal is detected through the first microphone array 110, and the system processes it to obtain the sound pressure level. Next, the distance measurement module 150 measures the distance: when the sensor on the distance measurement module 150 receives the scattered laser light, the distance is obtained through data processing. The first speaker 141 then plays the test audio for the first time. When playback finishes, the micro control module starts an internal timer, and the infrared receiving module continuously waits for a signal from the infrared transmitting module. When the second microphone array 142 at the voice module 200 receives the command voice from the test system, the infrared transmitting module at the test port outputs an infrared signal. The micro control module stops the timer when it receives the infrared signal, processes and records the received signal, completes the first test, and shows the current test result on the display screen. The test repeats until the set number of tests is reached.
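The automatic test cycle described above can be sketched in Python as follows. This is only an illustrative sketch: the callbacks `play_audio` and `wait_for_feedback`, the `timeout_s` parameter, and the definition of the recognition rate as the fraction of correct trials are assumptions made for the example and are not stated in the patent text.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class TrialResult:
    expected: str                 # command that was played
    received: Optional[str]       # decoded feedback, None if nothing arrived in time
    response_s: Optional[float]   # time from end of playback to feedback, None on timeout

    @property
    def correct(self) -> bool:
        return self.received == self.expected


def run_automatic_test(
    play_audio: Callable[[str], None],                      # hypothetical: blocks until playback ends
    wait_for_feedback: Callable[[float], Optional[str]],    # hypothetical: returns decoded frame or None
    expected: str,
    trials: int,
    interval_s: float = 0.0,
    timeout_s: float = 2.0,
) -> List[TrialResult]:
    """Play the test audio `trials` times and record each feedback and response time."""
    results = []
    for _ in range(trials):
        play_audio(expected)
        start = time.monotonic()              # timer starts when playback has finished
        received = wait_for_feedback(timeout_s)
        elapsed = time.monotonic() - start    # timer stops when feedback arrives or times out
        results.append(
            TrialResult(expected, received, elapsed if received is not None else None)
        )
        time.sleep(interval_s)                # audio playing interval between trials
    return results


def recognition_rate(results: List[TrialResult]) -> float:
    """Assumed definition: fraction of trials whose feedback matched the played command."""
    if not results:
        return 0.0
    return sum(r.correct for r in results) / len(results)
```

On real hardware the two callbacks would wrap the loudspeaker driver and the infrared receiver; here they are left abstract so the loop can be exercised with simple stand-ins.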
The audio recording function is turned on through the touch display screen. The first microphone array immediately detects and receives sound signals, and the collected signals pass through a filter circuit into the multimedia encoder 161. The encoder integrates an analog-to-digital converter (ADC) with adjustable sampling frequency, which performs the analog-to-digital conversion; the encoder outputs the generated WAV (uncompressed audio format) file, which is stored in a memory such as FLASH (flash memory). When the WAV file is needed for a voice test, the upper computer sends an instruction, the voice data is extracted from FLASH and sent at high speed to the multimedia decoder 162 over the SPI (serial peripheral interface) protocol, decoded by a high-performance DAC, and played as test voice through a power amplifier circuit and the first loudspeaker 141.
In addition to serving as a recorder and player, the microphone array and the multimedia encoder 161 can be used to measure ambient sound pressure. After the ambient sound signal passes through the first microphone array, the ADC of the multimedia encoder 161 generates a WAV file, which completes the conversion of the sound signal into a voltage signal. The sound pressure is obtained by converting the voltage signal with the sensitivity parameter of the microphone array, and the ambient level in decibels is then obtained from the sound pressure level formula.
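The conversion from voltage to sound pressure level can be illustrated with a short Python sketch. It uses the standard definition SPL = 20·log10(p / p_ref) with p_ref = 20 µPa. The function names, the use of the RMS voltage, and the assumption that the sensitivity is given in volts per pascal with no extra amplifier gain are choices made for this example, not details taken from the patent.

```python
import math
from typing import Sequence

P_REF = 20e-6  # standard reference sound pressure in air, 20 micropascals


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square value of a block of voltage samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def sound_pressure_level(voltages: Sequence[float], sensitivity_v_per_pa: float) -> float:
    """Return the sound pressure level in dB SPL.

    voltages: microphone output samples in volts (assumed already scaled from ADC counts).
    sensitivity_v_per_pa: microphone sensitivity in volts per pascal (assumed unit).
    """
    if sensitivity_v_per_pa <= 0:
        raise ValueError("sensitivity must be positive")
    pressure = rms(voltages) / sensitivity_v_per_pa  # RMS sound pressure in pascals
    if pressure <= 0:
        return float("-inf")                         # digital silence
    return 20.0 * math.log10(pressure / P_REF)
```

For example, an RMS output of 10 mV from a microphone with a sensitivity of 10 mV/Pa corresponds to 1 Pa, which is about 94 dB SPL.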
In a preferred embodiment of the present invention, the display module 120 is embedded in the center of a case, the first microphone array 110 is disposed around the lower portion of the case, the distance measuring module 150 and the wireless receiving module 132 are disposed on the left and right sides of the center of the lower portion of the rear end of the case, and the first speaker 141 is disposed in the center of the upper portion of the rear end of the case.
In addition, a second microphone array 210 and a second speaker 220 are disposed at the tested voice module, wherein the second microphone array 210 is used for detecting and receiving the sound signal of the voice module, and the second speaker 220 is used for enhancing the sound emitted by the voice module.
In a preferred embodiment of the present invention, the wireless transmitting module 131 and the wireless receiving module 132 respectively include an infrared transmitting module and an infrared receiving module.
In the present embodiment, the DATA pin of the infrared receiving module is led out as "REMOTE_IN" and connected to PB9 of the STM32 of the processing module 160. In the infrared communication protocol used in the present system, the DATA bit is normally set, i.e. DATA remains connected to the 3.3 V high level so that REMOTE_IN is pulled up to the high level; when the line needs to be pulled down, DATA is cleared and REMOTE_IN is pulled down to the low level.
As a preferred embodiment of the present invention, the distance measuring module 150 is specifically a laser distance measuring module 150, and includes a laser transmitter and a SPAD infrared receiver.
In this embodiment, the core chip of the laser ranging module 150 is the VL53L0X, and an XC6206P282MR voltage regulator chip is adopted.
The recognition principle for the speech recognition rate is as follows. The data frame fed back by the voice module 200 consists of a start code, a user code, a data code, and the complement of the data code; the data code carries the core information. Because the voice module 200 feeds back different data frames for different voices, the receiving end of the device compares the decoded signal with the voice that was sent to judge whether the voice module 200 recognized it correctly. For example, if the system plays the voice test file "000" and the voice module 200 recognizes it and feeds back the corresponding, unique data frame "000", the recognition is correct; if the device plays the voice test file "000" and receives the feedback data frame "001", the recognition is wrong.
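A minimal Python sketch of this comparison is given below. The 8-bit field width, the `Frame` type, and the rule that a frame is accepted only when the data code and its complement XOR to 0xFF are assumptions made for illustration; the patent names the fields but does not give their widths or the validation rule. The start code is assumed to have been consumed already when the frame was detected.

```python
from typing import NamedTuple


class Frame(NamedTuple):
    """Decoded feedback frame (fields after the start code); 8-bit fields are assumed."""
    user_code: int
    data_code: int
    data_complement: int


def frame_is_valid(frame: Frame) -> bool:
    """A frame is consistent if every field fits in 8 bits and the complement matches."""
    fields_in_range = all(0 <= value <= 0xFF for value in frame)
    return fields_in_range and (frame.data_code ^ frame.data_complement) == 0xFF


def recognition_correct(frame: Frame, expected_data_code: int) -> bool:
    """Recognition counts as correct when a valid frame carries the expected data code."""
    return frame_is_valid(frame) and frame.data_code == expected_data_code
```

With this sketch, playing the file mapped to data code 0x00 and receiving `Frame(0x00, 0x00, 0xFF)` counts as a correct recognition, while receiving `Frame(0x00, 0x01, 0xFE)` counts as a wrong one.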
In this embodiment, the distance measurement module 150 measures the distance between the micro control module and the voice module 200 under test. The device uses laser ranging: the processing module 160 gives instructions directly to the distance measurement module 150, the distance is measured once the measuring mode is selected, and the measured value is displayed. Specifically, pulsed ranging is used, with a 940 nm laser that emits no visible red glow. When the upper computer sends the instruction to start ranging, it starts an internal timer at the same time, and the laser emitter radiates photons toward the target. As shown in FIG. 5, the photons scatter after striking the measured object; as soon as the sensor receives the returned photons, it sends an interrupt request to the upper computer and the timing ends. The round-trip distance is the product of the measured round-trip flight time of the photons and the speed of light, and half of that value is the actual distance.
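The time-of-flight relation in this paragraph, distance = (speed of light × round-trip time) / 2, can be written as a small Python sketch. The function name and the use of the vacuum speed of light are choices made for this example; a real ranging chip performs this calculation internally and reports the distance directly.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum


def distance_from_round_trip(round_trip_s: float) -> float:
    """Return the one-way distance in meters for a measured round-trip flight time."""
    if round_trip_s < 0:
        raise ValueError("round-trip time cannot be negative")
    round_trip_distance = SPEED_OF_LIGHT_M_PER_S * round_trip_s  # out and back
    return round_trip_distance / 2.0                             # half is the actual distance
```

A round-trip time of 10 ns corresponds to roughly 1.5 m, which shows why the timing resolution must be far below a nanosecond to resolve distances of a few centimeters.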
The display module 120 comprises a TFT LCD display screen. The system uses a 4.3-inch TFT LCD (also called a true-color LCD) with a touch screen. The module has a resolution of 800 × 480 and a color depth of 24 bits that can be displayed in 65,536 colors, is driven by an NT35510 driver chip, and its pins are connected to the FSMC of the STM32 of the processing module 160.
As a preferred embodiment of the present invention, the processing module 160 includes a micro control module of model STM32F10X series and its peripheral circuits.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that the present invention is not limited to the details of the embodiments shown and described, but is capable of numerous equivalents and substitutions without departing from the spirit of the invention as set forth in the claims appended hereto.

Claims (7)

1. A speech recognition rate analysis system, comprising:
a first microphone array for conducting acquisition of test audio;
the display module is used for displaying the function options of the system and providing click selection for a user;
the wireless transmitting module and the wireless receiving module are used for receiving feedback information of the voice module of the test object in a matched manner;
the test audio delivery module comprises a first loudspeaker and a second microphone array, wherein the first loudspeaker is used for playing the test audio, and the second microphone array is used for collecting a test audio signal played by the first loudspeaker;
the distance measurement module is used for measuring the distance between the processing module and the tested voice module;
the processing module comprises:
an ambient noise measurement unit for measuring a degree of ambient noise by a sound pressure level;
the multimedia encoder is used for filtering the test audio collected by the first microphone array, performing A/D conversion on the filtered test audio and storing the converted test audio into a FLASH cache in the form of a WAV file;
and the multimedia decoder is used for performing D/A conversion on the WAV file when the WAV file is called by the serial port, performing power amplification and then broadcasting the WAV file by the first loudspeaker.
2. A speech recognition rate analysis system according to claim 1, wherein: the display module is embedded in the center of a box body, the first microphone array is arranged around the lower part of the box body, the ranging module and the wireless receiving module are arranged on the left and right sides of the center of the lower rear part of the box body, and the first loudspeaker is arranged at the center of the upper rear part of the box body.
3. A speech recognition rate analysis system according to claim 2, wherein: the wireless transmitting module and the wireless receiving module respectively comprise an infrared transmitting module and an infrared receiving module.
4. A speech recognition rate analysis system according to claim 2, wherein: the ranging module is specifically a laser ranging module and comprises a laser transmitter and a SPAD infrared receiver.
5. A speech recognition rate analysis system according to claim 1, wherein: the display module comprises a TFT LCD display screen.
6. The speech recognition rate analysis system of claim 5, wherein: the processing module comprises a micro control module of STM32F10X series and a peripheral circuit thereof.
7. A speech recognition rate analysis system according to claim 1, wherein: the second microphone array is arranged at a tested voice module, and a second loudspeaker is further arranged at the tested voice module.
CN202010244371.XA 2020-03-31 2020-03-31 Speech recognition rate analysis system Active CN111429884B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010244371.XA CN111429884B (en) 2020-03-31 2020-03-31 Speech recognition rate analysis system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010244371.XA CN111429884B (en) 2020-03-31 2020-03-31 Speech recognition rate analysis system

Publications (2)

Publication Number Publication Date
CN111429884A (en) 2020-07-17
CN111429884B (en) 2023-03-28

Family

ID=71551956

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010244371.XA Active CN111429884B (en) 2020-03-31 2020-03-31 Speech recognition rate analysis system

Country Status (1)

Country Link
CN (1) CN111429884B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107221319A (en) * 2017-05-16 2017-09-29 厦门盈趣科技股份有限公司 A kind of speech recognition test system and method
CN206906990U (en) * 2017-06-19 2018-01-19 深圳市相位科技有限公司 A kind of blue tooth voice keyboard based on microphone array
CN109285543A (en) * 2018-09-07 2019-01-29 惠州市德赛西威汽车电子股份有限公司 A kind of vehicle-mounted multimedia navigating instrument voice automatization test system
CN208834732U (en) * 2018-08-27 2019-05-07 安徽筋斗云机器人科技股份有限公司 Speech recognition system and its marketing machine
CN109817209A (en) * 2019-01-16 2019-05-28 深圳市友杰智新科技有限公司 A kind of intelligent speech interactive system based on two-microphone array
CN110430492A (en) * 2019-08-07 2019-11-08 王家春 A kind of wireless microphone system and its implementation of the control of intelligent sound interactive voice

Also Published As

Publication number Publication date
CN111429884B (en) 2023-03-28

Similar Documents

Publication Publication Date Title
CN106548772A (en) Speech recognition test system and method
CN108806720B (en) Microphone, data processor, monitoring system and monitoring method
CN107221319A (en) A kind of speech recognition test system and method
CN111724782B (en) Response time testing system, method and equipment of vehicle-mounted voice interaction system
CN109346075A (en) Identify user speech with the method and system of controlling electronic devices by human body vibration
CN101437191A (en) Calibration method for audio MACSYM
CN112151029A (en) Voice awakening and recognition automatic test method, storage medium and test terminal
CN109547910A (en) Electronic equipment acoustic assembly performance test methods, device, equipment and storage medium
CN104287700A (en) Pulse wave detection system and method using audio port of smart phone
JP2010506206A (en) Method for measuring a person's stress state according to voice and apparatus for carrying out this method
CN108806666A (en) Without the speech recognition test device of interface, system and method
CN111429884B (en) Speech recognition rate analysis system
CN104796692B (en) The echo cancellor method of testing and its system of a kind of tv audio harvester
CN100538560C (en) Electronic beat sound-calibrator and control method thereof
CN104598016A (en) Method and device for controlling audio playing and motion sensing controller
CN105748099A (en) Android-based intelligent electronic auscultation system
CN108235215A (en) A kind of line control earphone test device and its test circuit
CN111968675A (en) Stringed instrument note comparison system based on hand recognition and use method thereof
CN112995882B (en) Intelligent equipment audio open loop test method
CN109979487A (en) Voice signal detection method and device
CN113409809B (en) Voice noise reduction method, device and equipment
CN114999457A (en) Voice system testing method and device, storage medium and electronic equipment
CN112908322A (en) Voice control method and device for toy vehicle
CN110787424A (en) Anti-cheating pull-up intelligent test system and method
CN114822589B (en) Indoor acoustic parameter determination method, model construction method, device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant