CN113782002B - Speech recognition testing method and system based on reverberation simulation - Google Patents

Speech recognition testing method and system based on reverberation simulation

Info

Publication number
CN113782002B
Authority
CN
China
Prior art keywords
reverberation
test
audio
sound source
test audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111022162.1A
Other languages
Chinese (zh)
Other versions
CN113782002A (en)
Inventor
邹凯文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shencong Semiconductor Zhuhai Co ltd
Shencong Semiconductor Jiangsu Co ltd
Original Assignee
Shencong Semiconductor Zhuhai Co ltd
Shencong Semiconductor Jiangsu Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shencong Semiconductor Zhuhai Co ltd and Shencong Semiconductor Jiangsu Co ltd
Priority to CN202111022162.1A
Publication of CN113782002A
Application granted
Publication of CN113782002B

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/01 Assessment or evaluation of speech recognition systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/305 Electronic adaptation of stereophonic audio signals to reverberation of the listening space
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/307 Frequency adjustment, e.g. tone control
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T90/00 Enabling technologies or technologies with a potential or indirect contribution to GHG emissions mitigation

Abstract

The invention provides a speech recognition testing method and system based on reverberation simulation. The method comprises the following steps: first, a first test scene is set up, in which reverberation parameter acquisition devices are arranged within a first closed boundary around a preset position to be tested and acquire the reverberation parameters of a first test audio emitted by a first sound source; next, a second test scene is set up, in which an audio generator generates a simulated reverberation test audio, a second sound source emits it, and the device to be tested receives it; finally, the device to be tested outputs a recognition result, which a processor judges. The system comprises the first test scene, the second test scene, the device to be tested, and the processor. Simulating real reverberation replaces the traditional approach of testing in a real environment, so testing is no longer limited by the site and is more convenient and quicker to carry out.

Description

Speech recognition testing method and system based on reverberation simulation
Technical Field
The invention relates to the technical field of reverberation simulation, in particular to a voice recognition test method and system based on reverberation simulation.
Background
With the rapid development of artificial intelligence, language is no longer only a means of communication between people; it has also become an important means of communication between people and machines. As the human-machine interface, artificial intelligence speech recognition has become a key technology for such communication. It opens up many possibilities in daily life and promotes innovation across fields: it can be applied in almost every field, including industry, household appliances, communications, automotive electronics, medical treatment, home services, and consumer electronics. Speech recognition technology therefore has great market potential.
Artificial intelligence speech recognition technology is applied in artificial intelligence speech devices, which are used in many scenes, such as bedrooms, restaurants, meeting rooms, balconies, kitchens, bathrooms, and concert halls. The spatial characteristics of these scenes generally differ, so the reverberation in each scene differs as well.
Because artificial intelligence speech devices are used in different reverberation scenes, they undergo speech recognition tests under different reverberation conditions before leaving the factory. Before such speech performance tests, various environments must be arranged to reproduce the different reverberation conditions, which consumes a great deal of manpower and material resources. Arranging real environments is also limited by the available site, so some complex scenes are very inconvenient to set up. Moreover, after one scene has been tested, the test equipment must be moved to another scene, so multiple scenes cannot be tested at the same time and efficiency is low.
Disclosure of Invention
In view of the defects in the prior art, the invention aims to provide a voice recognition testing method and system based on reverberation simulation.
In a first aspect, the present invention provides a method for testing speech recognition based on reverberation simulation, comprising the steps of:
step S1, setting a first test scene, wherein the first test scene comprises at least one first sound source, a first closed boundary and a plurality of reverberation parameter acquisition devices, wherein:
the first sound source is located within the first closed boundary;
the reverberation parameter acquisition devices are arranged within the first closed boundary, around a preset position to be tested in three-dimensional space;
step S2, emitting a first test audio from the first sound source, wherein the first test audio is reflected by the first closed boundary to form reverberant sound, and each reverberation parameter acquisition device performs reverberation acquisition on the reverberant sound received in its acquisition direction and generates a corresponding reverberation parameter;
step S3, generating simulated reverberation test audio according to the reverberation parameter and a second test audio, wherein the second test audio comprises a test instruction corpus representing a preset test instruction;
step S4, setting a second test scene, wherein the second test scene comprises a second closed boundary and a plurality of second sound sources, and the device to be tested is arranged in the second test scene, wherein:
the second closed boundary is used for realizing sound insulation between an internal closed environment and an external open environment and eliminating the reverberation possibly generated by the internal closed environment;
the plurality of second sound sources and the device to be tested are located within the second closed boundary, and the relative positional relationship between the device to be tested and each second sound source is consistent with that between the position to be tested and each reverberation parameter acquisition device;
step S5, emitting the simulated reverberation test audio from the second sound sources, wherein the device to be tested performs voice recognition on the received simulated reverberation test audio and generates a corresponding voice recognition result;
step S6, judging whether the voice recognition result is consistent with the preset test instruction, and recording the judgment result.
Optionally, the simulated reverberation test audio emitted by each second sound source is generated from the second test audio and the reverberation parameter produced by the reverberation parameter acquisition device whose relative position matches that of the second sound source.
Optionally, in the step S2, the reverberation collection includes:
step S21, sequentially extracting one first test audio from a first test audio set to be sent, wherein the first test audio set comprises a plurality of first test audio with different frequencies, and each first test audio has the same first duration;
step S22, the reverberation parameter acquisition equipment continuously acquires the audio signals received in the acquisition direction, and acquires second duration time and frequency change conditions of the audio signals;
repeating steps S21 to S22 until the first sound source has played all the first test audio in the first test audio set and none of the reverberation parameter acquisition devices can acquire an audio signal any longer.
Optionally, the reverberation parameter includes a reverberation duration and a frequency decay curve corresponding to a frequency of each of the first test audio;
the reverberation duration includes a difference between the second duration and the first duration at a corresponding frequency;
the frequency decay curve includes the frequency variation over the reverberation duration at a corresponding frequency.
Optionally, in the step S3, the generating the simulated reverberation test audio includes:
step S31, extracting a characteristic segment of the second test audio, and acquiring the average frequency of the characteristic segment;
step S32, selecting the corresponding reverberation parameter according to the average frequency, and generating reverberation superposition audio based on the selected reverberation parameter;
and step S33, superposing the reverberation superposition audio and the second test audio to generate the simulated reverberation test audio.
Optionally, the second test audio further includes an environmental noise corpus representing the preset test instruction, where the environmental noise corpus is used to provide a real environmental simulation for the speech recognition test.
Optionally, the simulated reverberation test audio includes test instruction reverberation audio and ambient noise reverberation audio;
the test instruction reverberation audio is generated according to the test instruction corpus and the reverberation parameter;
the ambient noise reverberant audio is generated according to the ambient noise corpus and the reverberation parameter.
Optionally, at least a portion of the plurality of second sound sources emit the test instruction reverberant audio;
at least a portion of the plurality of second sound sources play the ambient noise reverberant audio.
Optionally, the second sound source emitting the test instruction reverberant audio is on the same horizontal plane as the device under test.
In a second aspect, the present invention further provides a voice recognition test system based on reverberant sound simulation, which is applied to the voice recognition test method, and includes:
a first test scene for providing a reverberation parameter acquisition environment, including at least one first sound source, a first closed boundary, and a plurality of reverberation parameter acquisition devices, wherein:
the first sound source is positioned in the first closed boundary and is used for emitting first test audio, and the first test audio is reflected by the first closed boundary to form reverberant sound;
the reverberation parameter acquisition devices are arranged within the first closed boundary, around the position to be tested in three-dimensional space, and are used for performing reverberation acquisition on the reverberant sound received in their acquisition directions and generating corresponding reverberation parameters;
an audio generator for generating simulated reverberation test audio according to the reverberation parameters and a second test audio, the second test audio comprising a test instruction corpus representing a preset test instruction;
a second test scene for providing a voice recognition test environment for the device to be tested, the second test scene comprising a second closed boundary and a plurality of second sound sources, with the device to be tested arranged in the second test scene, wherein:
the second closed boundary is used for realizing sound insulation between an internal closed environment and an external open environment and eliminating the reverberation possibly generated by the internal closed environment;
the plurality of second sound sources and the device to be tested are all arranged within the second closed boundary, the relative positional relationship between the device to be tested and each second sound source is consistent with that between the position to be tested and each reverberation parameter acquisition device, and the second sound sources are used for playing the simulated reverberation test audio;
the device to be tested, for performing voice recognition on the received simulated reverberation test audio and generating a corresponding voice recognition result;
a processor for judging whether the voice recognition result is consistent with the preset test instruction and recording the judgment result.
Compared with the prior art, the invention has the following beneficial effects:
1. The voice recognition testing method and system based on reverberation simulation replace the traditional approach of testing in a real environment with simulation of real reverberation, so testing is no longer limited by the site and is more convenient and quicker to carry out.
2. The method and system can simulate a plurality of different scenes within the same second closed boundary, enabling simulated tests under different reverberation conditions and widening the range of application. After one scene has been tested, there is no need to move to another test environment to test the next scene, which greatly improves overall testing efficiency.
Drawings
Other features, objects and advantages of the present invention will become more apparent upon reading of the detailed description of non-limiting embodiments, given with reference to the accompanying drawings in which:
FIG. 1 is a flow chart of a speech recognition testing method based on reverberation simulation according to an embodiment of the present invention;
FIG. 2 is a block diagram of a speech recognition test system based on reverberation simulation according to an embodiment of the present invention;
FIG. 3 is a graph of reverberation parameters at 1000Hz of a speech recognition test method based on reverberation simulation according to an embodiment of the present invention;
FIG. 4 is a graph of reverberation parameters at 1100Hz of a voice recognition testing method based on reverberation simulation according to an embodiment of the present invention;
FIG. 5 is a reverberation parameter acquisition chart of a voice recognition testing method based on reverberation simulation according to an embodiment of the present invention;
In the figures:
1-input interface;
2-screener;
3-reverberation parameter acquisition device;
4-audio generator;
5-audio playing device;
6-memory;
7-processor;
8-device to be tested;
9-second closed boundary.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the present invention, but are not intended to limit the invention in any way. It should be noted that variations and modifications could be made by those skilled in the art without departing from the inventive concept. These are all within the scope of the present invention.
Before the embodiments of the present invention are explained, reverberation is briefly explained. When a sound wave propagates indoors, it is reflected by walls, the ceiling, the floor and other obstacles, and part of its energy is absorbed by the obstacle at each reflection. Therefore, after the sound source stops sounding, the sound waves are reflected and absorbed many times in the room before finally dying away, and for a period of time after the source stops, a listener perceives a mixture of several sound waves (that is, the sound persists in the room after the source has stopped). This phenomenon is called reverberation, and this period is called the reverberation time.
Examples
FIG. 1 is a flowchart of a voice recognition testing method based on reverberation simulation according to an embodiment of the present invention; FIG. 2 is a block diagram of a voice recognition testing system based on reverberation simulation according to an embodiment of the present invention; FIG. 3 is a graph of reverberation parameters at 1000Hz according to an embodiment of the present invention; FIG. 4 is a graph of reverberation parameters at 1100Hz according to an embodiment of the present invention; FIG. 5 is a reverberation parameter acquisition chart according to an embodiment of the present invention. Referring to FIGS. 1, 3, 4 and 5, the method in this embodiment includes:
Step S1, setting a first test scene, wherein the first test scene comprises at least one first sound source, a first closed boundary and a plurality of reverberation parameter acquisition devices, wherein:
the first sound source is located within the first closed boundary;
the reverberation parameter acquisition devices are arranged within the first closed boundary, around a preset position to be tested in three-dimensional space.
In this embodiment, the first closed boundary in step S1 may be a living room, a bedroom, or a conference room, which is not specifically limited in this application. The first sound source is generally located at a common sound-emitting point within the first closed boundary, such as a position where a person is usually located or a position where an electronic device is placed. When the reverberation parameter acquisition devices collect the reverberation parameters, one or several positions to be tested need to be preset. The preset position to be tested makes the coordinates of each reverberation parameter acquisition device relative to that position known, which facilitates setting the positions of the device to be tested and each second sound source in the subsequent step S4.
Step S2, a first sound source emits a first test audio, the first test audio is reflected by the first closed boundary to form reverberant sound, and each reverberation parameter acquisition device performs reverberation acquisition on the reverberant sound received in its acquisition direction and generates corresponding reverberation parameters.
In this embodiment, the reverberation parameter may include a pulse file (commonly known as an impulse response). Specifically, the pulse file is a kind of snapshot of how a physical space or audio system responds to an input signal and produces an output from it. In this embodiment, the pulse file may correspond to a reverberation characteristic curve representing the reverberation parameter.
In this embodiment, the first test audio in step S2 may be a sound made by a person or by an electronic device. The reverberation parameter acquisition devices may collect separately at six positions around the preset position to be tested (above, below, left, right, in front, and behind), or the number may be increased to 12 positions or more. Collecting reverberation parameters at multiple positions makes the subsequently simulated environment closer to the real one.
Step S3, generating a simulated reverberation test audio according to the reverberation parameter and a second test audio, wherein the second test audio comprises a test instruction corpus representing a preset test instruction.
In this embodiment, the second test audio generally includes a test instruction corpus and an environmental noise corpus; the test instruction corpus is the minimum needed to complete a voice performance test.
Step S4, setting a second test scene, wherein the second test scene comprises a second closed boundary and a plurality of second sound sources, and the device to be tested is arranged in the second test scene, wherein:
the second closed boundary is used for realizing sound insulation between the inner closed environment and the outer open environment and eliminating reverberation possibly generated by the inner closed environment;
the plurality of second sound sources and the device to be tested are all located within the second closed boundary, and the relative positional relationship between the device to be tested and each second sound source is consistent with that between the preset position to be tested and each reverberation parameter acquisition device; this arrangement reproduces the test scene of the first closed boundary.
In this embodiment, the device to be tested may be a mobile phone or an intelligent voice recognition robot in a shopping mall, which is not specifically limited in this application. Consistency of the relative positional relationship can be understood as follows: within the first closed boundary, a spatial rectangular coordinate system is established with the preset position to be tested as the origin, so that the position of a reverberation parameter acquisition device can be expressed as (X, Y, Z); within the second closed boundary, an identical spatial rectangular coordinate system is established with the position of the device to be tested as the origin, so that the position of the second sound source corresponding to that reverberation parameter acquisition device can likewise be expressed as (X, Y, Z), where X, Y and Z are the coordinates on the X, Y and Z axes of the coordinate system.
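The position rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patent's implementation: the function name `second_source_positions` and the sample coordinates are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def second_source_positions(
    preset_position: Point,
    device_positions: List[Point],
    dut_position: Point,
) -> List[Point]:
    """Place each second sound source at the same offset from the device
    to be tested as the matching acquisition device had from the preset
    position to be tested."""
    placed = []
    for dev in device_positions:
        # Offset (X, Y, Z) of the acquisition device in the first scene.
        offset = tuple(d - p for d, p in zip(dev, preset_position))
        # Apply the same offset around the device to be tested.
        placed.append(tuple(o + q for o, q in zip(offset, dut_position)))
    return placed


# Hypothetical layout: six devices 1 m from the preset position along each axis.
preset = (2.0, 3.0, 1.0)
devices = [
    (3.0, 3.0, 1.0), (1.0, 3.0, 1.0),
    (2.0, 4.0, 1.0), (2.0, 2.0, 1.0),
    (2.0, 3.0, 2.0), (2.0, 3.0, 0.0),
]
sources = second_source_positions(preset, devices, dut_position=(0.0, 0.0, 0.0))
```

With the device to be tested at the origin of the second scene, each entry of `sources` is simply the (X, Y, Z) offset measured in the first scene.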
Step S5, the second sound sources emit the simulated reverberation test audio, and the device to be tested performs voice recognition on the received simulated reverberation test audio and generates a corresponding voice recognition result.
In this embodiment, the speech recognition result may include specific text information.
Step S6, judging whether the voice recognition result is consistent with a preset test instruction, and recording the judgment result.
In this embodiment, the text of the voice recognition result is compared with the text of the preset test instruction. If they are consistent, the audio is marked as normal audio and a correct recognition is recorded in the test log; if they are inconsistent, the audio is marked as abnormal audio and a misrecognized word or an unrecognized word is recorded in the test log.
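The judgment in step S6 might be sketched as below. The log labels, the function name `judge`, and the rule that an empty result counts as "unrecognized" while any other mismatch counts as "misrecognized" are assumptions made for illustration; the source only states that one of the two is recorded.

```python
from typing import Dict, List


def judge(recognized_text: str, preset_instruction: str) -> Dict[str, str]:
    """Compare the recognition result with the preset test instruction."""
    if not recognized_text.strip():
        # Assumption: an empty result means nothing was recognized.
        return {"audio": "abnormal", "log": "unrecognized"}
    if recognized_text.strip() == preset_instruction.strip():
        return {"audio": "normal", "log": "correct recognition"}
    # Assumption: any other mismatch is logged as a misrecognized word.
    return {"audio": "abnormal", "log": "misrecognized"}


# Hypothetical test log with one entry per played test audio.
test_log: List[Dict[str, str]] = [
    judge("turn on the light", "turn on the light"),
    judge("turn on the night", "turn on the light"),
    judge("", "turn on the light"),
]
```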
In this embodiment, different test scenes can be simulated within the same second closed boundary. That is, when several different real scenes are to be simulated, each can be reproduced within the same second closed boundary by switching to the reverberation parameters that correspond to it. Specifically, for a given second sound source, the simulated reverberation test audio it emits can be generated by combining the second test audio with different pulse files, each corresponding to a different real scene, so that different test scenes are simulated within the same second closed boundary.
In an alternative embodiment, the simulated reverberation test audio emitted by each second sound source is generated from the second test audio and the reverberation parameter produced by the reverberation parameter acquisition device whose relative position matches that of the second sound source.
In this embodiment, the simulated reverberation test audio is generated using the reverberation parameters whose relative positional relationship is consistent, as described in step S4.
In an alternative embodiment, in step S2, the reverberation collection includes:
step S21, sequentially extracting one first test audio from a first test audio set to be emitted, wherein the first test audio set comprises a plurality of first test audio with different frequencies, and each first test audio has the same first duration;
step S22, the reverberation parameter acquisition device continuously acquires the audio signal received in its acquisition direction, and obtains the second duration and the frequency change of the audio signal;
repeating steps S21 to S22 until the first sound source has played all the first test audio in the first test audio set and none of the reverberation parameter acquisition devices can acquire an audio signal any longer.
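The repetition of steps S21 and S22 can be pictured as a simple loop for one acquisition device. In this sketch the callables `play_tone` and `record_until_silent` are hypothetical stand-ins for the first sound source and a reverberation parameter acquisition device; they are not named in the source.

```python
from typing import Callable, Dict, Iterable


def collect_durations(
    frequencies_hz: Iterable[int],
    play_tone: Callable[[int], None],
    record_until_silent: Callable[[], float],
) -> Dict[int, float]:
    """Run steps S21 and S22 once per test tone and return the second
    duration (seconds of audible signal) measured for each frequency."""
    second_durations = {}
    for freq in frequencies_hz:
        # Step S21: send one first test audio from the set.
        play_tone(freq)
        # Step S22: keep acquiring until no signal remains, then store
        # how long the signal was audible.
        second_durations[freq] = record_until_silent()
    return second_durations


# Stand-in hardware for demonstration only: every tone "rings" for 4.6 s.
played = []
result = collect_durations([1000, 1100], played.append, lambda: 4.6)
```

Waiting inside `record_until_silent` before the next tone is played mirrors the requirement that no device can still acquire a signal when the next first test audio is sent.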
In this embodiment, the first test audio set uses 100 Hz as the playing precision; that is, the frequency difference between any two audio items in the set is a natural-number multiple of 100 Hz. The set includes sounds with frequencies from 100 Hz to 20 kHz, and the first duration of the sound at each frequency may be 4 s. During collection, a blank time is reserved after the reverberation at one frequency has completely died away; the blank time is used to record the reverberation parameter for the environment, and collection at the next frequency begins once the recording is complete. In this embodiment, audio at 1000 Hz and 1100 Hz is collected: FIG. 3 shows the reverberation characteristic curve at 1000 Hz and FIG. 4 shows the reverberation characteristic curve at 1100 Hz, where a reverberation characteristic curve consists of the reverberation duration and the frequency decay curve.
Specifically, in the reverberation characteristic curves shown in FIGS. 3 and 4, the horizontal axis represents time in seconds and the vertical axis represents frequency in hertz. The first duration of the first test audio is 4 s at both 1000 Hz and 1100 Hz. In FIG. 3, the upper curve is the frequency curve of the first test audio at 1000 Hz, the lower curve is the frequency curve collected by the reverberation parameter acquisition device, and the portion within rectangular frame 300 is the frequency decay curve over the reverberation duration, i.e. the reverberation characteristic curve. Likewise, in FIG. 4, the upper curve is the frequency curve of the first test audio at 1100 Hz, the lower curve is the frequency curve collected by the reverberation parameter acquisition device, and the portion within rectangular frame 400 is the frequency decay curve over the reverberation duration.
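The first test audio set described in this embodiment (100 Hz steps from 100 Hz to 20 kHz, 4 s per tone) could be generated roughly as follows. The pure sine waveform and the 48 kHz sample rate are assumptions; the source does not specify either.

```python
import math
from typing import List

STEP_HZ = 100          # frequency spacing stated in the embodiment
MAX_HZ = 20_000        # upper end of the set (20 kHz)
FIRST_DURATION_S = 4   # first duration of each tone, in seconds


def tone_frequencies() -> List[int]:
    """Frequencies of the first test audio set: 100 Hz, 200 Hz, ..., 20 kHz."""
    return list(range(STEP_HZ, MAX_HZ + STEP_HZ, STEP_HZ))


def sine_tone(freq_hz: int, sample_rate: int = 48_000) -> List[float]:
    """Samples of a pure tone lasting FIRST_DURATION_S seconds.
    The sine waveform and the sample rate are illustrative assumptions."""
    n = FIRST_DURATION_S * sample_rate
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


freqs = tone_frequencies()
tone_1000 = sine_tone(1000)
```

The 1000 Hz and 1100 Hz tones shown in FIGS. 3 and 4 are two adjacent members of this set.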
In an alternative embodiment, the reverberation parameter includes a reverberation duration and a frequency decay curve corresponding to the frequency of each first test audio;
the reverberation duration includes a difference between the second duration and the first duration at the corresponding frequency;
the frequency decay curve includes the frequency change over the duration of the reverberation at the corresponding frequency.
In the present embodiment, the calculation formula of the reverberation duration is $T_0 = T_1 - T_2$,
where $T_0$ is the reverberation duration, $T_1$ is the second duration, and $T_2$ is the first duration; this formula follows directly from the definition of reverberation time given above.
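As a worked instance of this formula, the short sketch below computes the reverberation duration for one frequency. The function name and the sample value of 4.5 s for the second duration are hypothetical, and the range check is an added assumption rather than something the source requires.

```python
def reverberation_duration(second_duration_s: float, first_duration_s: float) -> float:
    """T0 = T1 - T2: how long the signal stays audible after the source stops."""
    if second_duration_s < first_duration_s:
        # Assumption: a recording shorter than the emitted tone is treated as invalid.
        raise ValueError("second duration cannot be shorter than first duration")
    return second_duration_s - first_duration_s


# Hypothetical measurement: a 4 s tone that remains audible for 4.5 s in total.
t0 = reverberation_duration(second_duration_s=4.5, first_duration_s=4.0)
```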
In an alternative embodiment, in step S3, the generation of the simulated reverberation test audio includes:
step S31, extracting a characteristic segment of the second test audio, and acquiring the average frequency of the characteristic segment;
step S32, selecting corresponding reverberation parameters according to the average frequency, and generating reverberation superposition audio based on the selected reverberation parameters;
and step S33, superposing the reverberation superposition audio and the second test audio to generate the simulated reverberation test audio.
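The selection in step S32 can be sketched as a lookup keyed by frequency. The source does not say how an average frequency that falls between two measured frequencies is handled; snapping to the nearest multiple of 100Hz is an assumption, and the table values below are hypothetical.

```python
def nearest_measured_frequency(avg_freq_hz, step_hz=100, min_hz=100, max_hz=20_000):
    """Snap an average frequency to the closest frequency in the first test audio set."""
    snapped = int(round(avg_freq_hz / step_hz)) * step_hz
    return min(max(snapped, min_hz), max_hz)


def select_reverb_parameter(avg_freq_hz, parameters):
    """Look up the reverberation parameter recorded for the nearest test frequency."""
    return parameters[nearest_measured_frequency(avg_freq_hz)]


# Hypothetical table: frequency (Hz) -> reverberation duration (s).
params = {1000: 0.42, 1100: 0.40}
print(select_reverb_parameter(1043.0, params))  # 0.42
```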
In this embodiment, the characteristic segment of the second test audio in step S31 is a frequency segment of the corresponding time taken from a section of the second test audio that excludes the peak areas and valley areas of the frequency. In step S32, generating the reverberation superposition audio based on the selected reverberation parameter may be understood as convolving the reverberation parameter with the second test audio. Those skilled in the art may perform this convolution by any calculation method conventional in the art, which is not limited herein.
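Because the source leaves the convolution method open, the following is only one possible sketch of steps S32 and S33. It assumes that the selected reverberation parameter is turned into a synthetic impulse response whose envelope decays by about 60 dB over the reverberation duration, and that the result is mixed with the dry audio at a fixed gain; the decay model, the `wet_gain` value and all function names are assumptions, not the patented method.

```python
def decay_impulse_response(reverb_duration_s, sample_rate):
    """Assumed model: an envelope that falls by about 60 dB over the reverberation duration."""
    n = max(1, int(reverb_duration_s * sample_rate))
    return [10 ** (-3.0 * i / n) for i in range(n)]


def convolve(signal, kernel):
    """Direct-form linear convolution; output length is len(signal) + len(kernel) - 1."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def simulate_reverb(dry, reverb_duration_s, sample_rate, wet_gain=0.5):
    """S32: build the reverberation superposition audio; S33: add it to the dry audio."""
    wet = convolve(dry, decay_impulse_response(reverb_duration_s, sample_rate))
    padded = dry + [0.0] * (len(wet) - len(dry))
    return [d + wet_gain * w for d, w in zip(padded, wet)]


out = simulate_reverb([1.0, 0.0, 0.0, 0.0], reverb_duration_s=0.5, sample_rate=8)
print(len(out))  # 7
```

The direct-form loop is quadratic in the signal length and is shown only for clarity; for audio of realistic length an FFT-based convolution would normally be preferred.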
In an alternative embodiment, the second test audio further includes an environmental noise corpus representing a preset test instruction, where the environmental noise corpus is used to simulate a real noise environment for the speech recognition test. Adding the environmental noise corpus allows the speech performance of the device under test to be tested more thoroughly.
In this embodiment, the environmental noise corpus may be a sound emitted by a person or a sound emitted by other electronic devices, that is, the environmental noise corpus essentially consists of sounds that may occur in a real environment.
In an alternative embodiment, the simulated reverberation test audio includes test instruction reverberation audio and ambient noise reverberation audio;
generating test instruction reverberation audio according to the test instruction corpus and the reverberation parameter;
the ambient noise reverberant audio is generated from the ambient noise corpus and the reverberation parameters.
In an alternative embodiment, at least a portion of the plurality of second sound sources emit test instruction reverberant audio;
at least a portion of the plurality of second sound sources plays the ambient noise reverberant audio.
In an alternative embodiment, the second sound source that emits the test instruction reverberant audio is on the same horizontal plane as the device under test. In real life, the sound source and the device under test are generally on the same horizontal plane, so the same arrangement is used in the simulation test; this brings the test closer to the real environment and makes the results more practically meaningful.
Referring to fig. 2, the present embodiment further provides a voice recognition test system based on reverberant sound simulation, which is applied to the above voice recognition test method, and includes:
a first test scenario for providing a reverberation parameter acquisition environment, comprising at least one first sound source, a first closed boundary and a plurality of reverberation parameter acquisition devices 3, wherein:
the first sound source is positioned in the first closed boundary and used for emitting first test audio, and the first test audio is reflected by the first closed boundary to form reverberant sound;
a plurality of reverberation parameter acquisition devices 3, which are placed in the first closed boundary and around the position to be measured in the three-dimensional space, and are used for performing reverberation acquisition according to the reverberation received in the acquisition direction and generating corresponding reverberation parameters;
the audio generator 4 is used for generating simulated reverberation test audio according to the reverberation parameter and second test audio, wherein the second test audio comprises a test instruction corpus representing a preset test instruction;
a second test scenario, configured to provide a speech recognition test environment for the device under test 8, where the second test scenario includes a second closed boundary 9 and a plurality of second sound sources, and the device under test 8 is in the second test scenario, where:
the second closed boundary is used for realizing sound insulation between the inner closed environment and the outer open environment and eliminating reverberation possibly generated by the inner closed environment;
the plurality of second sound sources, which are used for playing the simulated reverberation test audio, and the equipment to be measured 8 are all arranged in the second closed boundary 9, and the relative position relationship between the equipment to be measured 8 and each second sound source is consistent with the relative position relationship between the position to be measured 8 and each reverberation parameter acquisition equipment 3.
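The positional constraint between the two test scenarios can be illustrated with simple vector arithmetic. The sketch interprets "consistent relative position relationship" as identical three-dimensional offsets, which is an assumption; the coordinates and function names are hypothetical.

```python
def offsets_from(reference, points):
    """Offsets (dx, dy, dz) of each point relative to a reference position."""
    return [tuple(p[k] - reference[k] for k in range(3)) for p in points]


def place_second_sources(device_position, collector_offsets):
    """Place one second sound source at each collector offset, relative to the device."""
    return [tuple(device_position[k] + off[k] for k in range(3)) for off in collector_offsets]


# Hypothetical coordinates in metres.
position_to_measure = (2.0, 3.0, 1.0)
collectors = [(3.0, 3.0, 1.0), (2.0, 4.0, 1.0), (2.0, 3.0, 2.0)]
device_under_test = (0.5, 0.5, 0.5)

offsets = offsets_from(position_to_measure, collectors)
sources = place_second_sources(device_under_test, offsets)
print(sources)
```

Recomputing the offsets of the placed sources relative to the device gives back the collector offsets, which is the consistency condition this embodiment describes.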
In this embodiment, both the first sound source and the second sound source may be high-fidelity (hi-fi) loudspeakers.
In this embodiment, the second closed boundary 9 is a box body that isolates external sound and fully absorbs internal sound, thereby avoiding the generation of reverberation. The box body comprises a shell formed of six panels made of cold-rolled steel plate with a thickness of 2-3mm; the shell is made by stamping, welding, pickling and spraying, and one of the six panels can be opened and closed. 3-6 layers of composite sound insulation panels made of damping, sound insulation and sound absorption materials are fixed on the inner wall of the shell, so that sound is completely absorbed.
The device under test 5 is used for performing voice recognition on the received simulated reverberation test audio and generating a corresponding voice recognition result.
The processor 7 is used for judging whether the voice recognition result is consistent with the preset test instruction and recording the judgment result.
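The judgment performed by the processor can be sketched as a comparison followed by a log record. The normalisation rule (ignoring case and surrounding whitespace), the record layout and the example instructions are assumptions; the source only requires judging consistency and recording the result.

```python
def judge(recognition_result, preset_instruction):
    """Return a record stating whether the recognition result matches the instruction."""
    # Assumption: the comparison ignores case and surrounding whitespace.
    passed = recognition_result.strip().lower() == preset_instruction.strip().lower()
    return {"expected": preset_instruction, "recognized": recognition_result, "passed": passed}


# Hypothetical test run with one correct and one incorrect recognition.
log = [
    judge("turn on the light", "Turn on the light"),
    judge("turn off the light", "Turn on the light"),
]
pass_rate = sum(r["passed"] for r in log) / len(log)
print(pass_rate)  # 0.5
```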
In this embodiment, in a specific implementation, the system may further include, as shown in fig. 2, an input interface 1, provided on the filter 2, for importing specified entries;
the filter 2, used for filtering the test library corpus according to the imported specified entries to obtain the specified entries as the test instruction corpus;
and the memory 6, used for automatically storing the test log file and the abnormal audio file output by the processor.
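The filtering of the test library corpus by imported entries can be sketched as a simple membership test. The library layout (a list of records with `text` and `audio` fields), the file names and the function name are hypothetical and serve only to illustrate the idea.

```python
def filter_corpus(test_library, specified_entries):
    """Keep only the library items whose text is one of the specified entries."""
    wanted = set(specified_entries)
    return [item for item in test_library if item["text"] in wanted]


# Hypothetical test library; field names and file names are illustrative only.
library = [
    {"text": "turn on the light", "audio": "a001.wav"},
    {"text": "play music", "audio": "a002.wav"},
    {"text": "turn off the light", "audio": "a003.wav"},
]
selected = filter_corpus(library, ["play music", "turn off the light"])
print([item["audio"] for item in selected])  # ['a002.wav', 'a003.wav']
```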
By the above embodiment, the following effects can be achieved:
1. The real environment reverberation audio generation method, the voice performance test method and the system provided by the invention simulate real reverberation instead of testing in the real environment as in the traditional approach, so testing is no longer limited by site and is more convenient and quicker to operate.
2. With the real environment reverberation audio generation method, the voice performance test method and the system, a plurality of different scenes can be simulated within the same second closed boundary, so that simulation tests under different reverberation conditions are achieved and the application range is broadened. After the test of one scene is completed, there is no need to move to another test environment to test another scene, which greatly improves the overall test efficiency.
The foregoing describes specific embodiments of the present invention. It is to be understood that the invention is not limited to the particular embodiments described above, and that various changes or modifications may be made by those skilled in the art within the scope of the appended claims without affecting the spirit of the invention. The embodiments of the present application and features in the embodiments may be combined with each other arbitrarily without conflict.

Claims (10)

1. A voice recognition testing method based on reverberation simulation is characterized by comprising the following steps:
step S1, setting a first test scene, wherein the first test scene comprises at least one first sound source, a first closed boundary and a plurality of reverberation parameter acquisition devices, wherein:
the first sound source is located within the first closed boundary;
the reverberation parameter acquisition equipment is arranged in the first closed boundary and is arranged around a preset position to be detected in a three-dimensional space;
step S2, emitting a first test audio from the first sound source, wherein the first test audio is reflected by the first closed boundary to form a reverberant sound, and each reverberation parameter acquisition device executes reverberation acquisition according to the reverberant sound received in the acquisition direction and generates a corresponding reverberation parameter;
step S3, generating simulated reverberation test audio according to the reverberation parameter and a second test audio, wherein the second test audio comprises a test instruction corpus representing a preset test instruction;
step S4, setting a second test scene, wherein the second test scene comprises a second closed boundary and a plurality of second sound sources, and the equipment to be tested is arranged in the second test scene, wherein:
the second closed boundary is used for realizing sound insulation between an internal closed environment and an external open environment and eliminating the reverberation possibly generated by the internal closed environment;
the plurality of second sound sources and the equipment to be measured are positioned in the second closed boundary, and the relative position relationship between the equipment to be measured and each second sound source is consistent with the relative position relationship between the position to be measured and each reverberation parameter acquisition equipment;
step S5, emitting the simulated reverberation test audio from the second sound source, wherein the equipment to be tested carries out voice recognition according to the received simulated reverberation test audio and generates a corresponding voice recognition result;
step S6, judging whether the voice recognition result is consistent with the preset test instruction, and recording the judgment result.
2. The method according to claim 1, wherein the simulated reverberation test audio emitted from the second sound source is generated from the reverberation parameter generated by the reverberation parameter collection device having the same relative positional relationship and the second test audio.
3. The method according to claim 1, wherein in the step S2, the reverberation collection includes:
step S21, sequentially extracting one first test audio from a first test audio set to be sent, wherein the first test audio set comprises a plurality of first test audio with different frequencies, and each first test audio has the same first duration;
step S22, the reverberation parameter acquisition equipment continuously acquires the audio signals received in the acquisition direction, and acquires second duration time and frequency change conditions of the audio signals;
steps S21 to S22 are executed repeatedly, the next first test audio being extracted once no reverberation parameter acquisition device can acquire the audio signal any longer, until the first sound source has played all the first test audio in the first test audio set.
4. The method of claim 3, wherein the reverberation parameter includes a reverberation duration and a frequency decay curve corresponding to a frequency of each of the first test audio;
the reverberation duration includes a difference between the second duration and the first duration at a corresponding frequency;
the frequency decay curve includes the frequency variation over the reverberation duration at a corresponding frequency.
5. The method according to claim 4, wherein in the step S3, the generating of the simulated reverberation test audio includes:
step S31, extracting a characteristic segment of the second test audio, and acquiring the average frequency of the characteristic segment;
step S32, selecting the corresponding reverberation parameter according to the average frequency, and generating reverberation superposition audio based on the selected reverberation parameter;
and step S33, superposing the reverberation superposition audio and the second test audio to generate the simulated reverberation test audio.
6. The method according to any one of claims 1 to 5, wherein the second test audio further includes an ambient noise corpus representing the preset test instructions, the ambient noise corpus being used to provide a real environment simulation for the speech recognition test.
7. The speech recognition testing method of claim 6, wherein the simulated reverberation test audio comprises test instruction reverberation audio and ambient noise reverberation audio;
the test instruction reverberation audio is generated according to the test instruction corpus and the reverberation parameter;
the ambient noise reverberant audio is generated according to the ambient noise corpus and the reverberation parameter.
8. The method of claim 7, wherein at least a portion of the plurality of second sound sources emit the test instruction reverberant audio;
at least a portion of the plurality of second sound sources play the ambient noise reverberant audio.
9. The method of claim 8, wherein the second sound source that emits the test instruction reverberant audio is on the same horizontal plane as the device under test.
10. A speech recognition testing system based on reverberant sound simulation, applied to the speech recognition testing method of any one of claims 1 to 9, comprising:
a first test scenario for providing a reverberation parameter acquisition environment including at least one first sound source, a first closed boundary, and a plurality of reverberation acquisition devices, wherein:
the first sound source is positioned in the first closed boundary and is used for emitting first test audio, and the first test audio is reflected by the first closed boundary to form reverberant sound;
the reverberation parameter acquisition devices are arranged in the first closed boundary, are arranged around the position to be detected in the three-dimensional space, and are used for executing reverberation acquisition according to the reverberation received in the acquisition direction and generating corresponding reverberation parameters;
the audio generator is used for generating simulated reverberation test audio according to the reverberation parameter and second test audio, and the second test audio comprises a test instruction corpus representing a preset test instruction;
the second test scene is used for providing a voice recognition test environment for the equipment to be tested, the second test scene comprises a second closed boundary and a plurality of second sound sources, and the equipment to be tested is arranged in the second test scene, wherein:
the second closed boundary is used for realizing sound insulation between an internal closed environment and an external open environment and eliminating the reverberation possibly generated by the internal closed environment;
the plurality of second sound sources and the equipment to be tested are all arranged in the second closed boundary, and the relative position relationship between the equipment to be tested and each second sound source is consistent with the relative position relationship between the position to be tested and each reverberation parameter acquisition equipment, so as to play the simulated reverberation test audio;
the equipment to be tested is used for carrying out voice recognition on the received simulated reverberation test audio and generating a corresponding voice recognition result;
and the processor is used for judging whether the voice recognition result is consistent with the preset test instruction or not and recording the judgment result.
CN202111022162.1A 2021-09-01 2021-09-01 Speech recognition testing method and system based on reverberation simulation Active CN113782002B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111022162.1A CN113782002B (en) 2021-09-01 2021-09-01 Speech recognition testing method and system based on reverberation simulation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111022162.1A CN113782002B (en) 2021-09-01 2021-09-01 Speech recognition testing method and system based on reverberation simulation

Publications (2)

Publication Number Publication Date
CN113782002A CN113782002A (en) 2021-12-10
CN113782002B true CN113782002B (en) 2023-07-04

Family

ID=78840676

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111022162.1A Active CN113782002B (en) 2021-09-01 2021-09-01 Speech recognition testing method and system based on reverberation simulation

Country Status (1)

Country Link
CN (1) CN113782002B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115862665B (en) * 2023-02-27 2023-06-16 广州市迪声音响有限公司 Visual curve interface system of echo reverberation effect parameters

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107039050A (en) * 2016-02-04 2017-08-11 阿里巴巴集团控股有限公司 Treat the automatic test approach and device of tested speech identifying system
CN108242234A (en) * 2018-01-10 2018-07-03 腾讯科技(深圳)有限公司 Speech recognition modeling generation method and its equipment, storage medium, electronic equipment
JP2021001949A (en) * 2019-06-20 2021-01-07 学校法人立命館 Prediction system for voice recognition performance, structuring method for learning model, and prediction method for voice recognition performance
CN213028551U (en) * 2020-09-23 2021-04-20 上海深聪半导体有限责任公司 A noise elimination sound-proof box for acoustics test

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6261043B2 (en) * 2013-08-30 2018-01-17 本田技研工業株式会社 Audio processing apparatus, audio processing method, and audio processing program
US11823658B2 (en) * 2015-02-20 2023-11-21 Sri International Trial-based calibration for audio-based identification, recognition, and detection system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Improving robustness against reverberation for automatic speech recognition;Vikramjit Mitra;《2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)》;full text *
Application and improvement of speech recognition on air conditioners;桑亚超;《日用电器》;full text *

Also Published As

Publication number Publication date
CN113782002A (en) 2021-12-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Unit G4-202-059, Artificial Intelligence Industrial Park, No. 88 Jinjihu Avenue, Suzhou Industrial Park, Suzhou Area, China (Jiangsu) Pilot Free Trade Zone, Suzhou City, Jiangsu Province, 215124

Applicant after: Shencong Semiconductor (Jiangsu) Co.,Ltd.

Applicant after: Shencong semiconductor (Zhuhai) Co.,Ltd.

Address before: 200232 room 3712, 3 / F, 2879 Longteng Avenue, Xuhui District, Shanghai

Applicant before: Shanghai shencong Semiconductor Co.,Ltd.

Applicant before: Shencong semiconductor (Zhuhai) Co.,Ltd.

GR01 Patent grant