CN117956376A - Audio judgment method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN117956376A
Authority
CN
China
Prior art keywords
sound
arrival angle
audio
audio signal
sound arrival
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211336717.4A
Other languages
Chinese (zh)
Inventor
陈明良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Kaidelian Software Technology Co ltd
Original Assignee
Guangzhou Kaidelian Software Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Kaidelian Software Technology Co ltd
Priority to CN202211336717.4A
Publication of CN117956376A
Legal status: Pending

Landscapes

  • Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)

Abstract

The embodiments of the application provide an audio judgment method, an audio judgment device, electronic equipment and a storage medium. The method comprises the following steps: acquiring an audio signal collected in a classroom; obtaining, from the audio signal, sound arrival angle information and sound energy data of the audio signal; if the sound energy data exceeds a preset sound energy threshold, acquiring a preset interference sound arrival angle; excluding, according to the interference sound arrival angle, the sound arrival angles associated with it from the sound arrival angle information to obtain the sound arrival angle information after exclusion; and, if the number of sound arrival angles in the sound arrival angle information after exclusion exceeds a preset first sound source number threshold, determining that the audio signal is unison reading (students reading aloud together). The application thereby judges unison-reading audio accurately and at reduced cost.

Description

Audio judgment method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to an audio determining method, an audio determining apparatus, an electronic device, and a storage medium.
Background
In a remote teaching scenario, audio collected by a microphone must be denoised, typically with a deep-learning-based noise reduction method such as RNNoise. However, the audio characteristics of a signal in the unison reading state (many students reading aloud together) are very close to those of noise, so such a signal is often suppressed as noise. It is therefore necessary to judge whether the audio signal is currently in the unison reading state.
At present, the unison reading state can be judged by a deep-learning-based method. However, the trained model and the computation it requires are large, so it is difficult to run on low-performance embedded devices, which raises the cost of identifying unison-reading audio.
Disclosure of Invention
Based on the above, the application provides an audio judgment method, an audio judgment device, electronic equipment and a storage medium, which judge unison reading accurately and at reduced cost by calculating the sound arrival angles and the sound energy corresponding to an audio signal.
As a first aspect of an embodiment of the present application, there is provided an audio judging method including the steps of:
Acquiring an audio signal acquired in a classroom; according to the audio signal, obtaining sound arrival angle information of the audio signal and sound energy data of the audio signal, wherein the sound arrival angle information comprises a plurality of sound arrival angles;
if the sound energy data of the audio signal exceeds a preset sound energy threshold value, acquiring a preset interference sound arrival angle; removing the interference sound arrival angle in the sound arrival angle information according to the preset interference sound arrival angle, and obtaining removed sound arrival angle information;
If the number of sound arrival angles in the sound arrival angle information after exclusion exceeds a preset first sound source number threshold, determining that the audio signal is unison reading, wherein the preset first sound source number threshold indicates the number of sound sources corresponding to unison reading.
As a second aspect of an embodiment of the present application, there is provided an audio judging apparatus including:
the data acquisition module is used for acquiring audio signals acquired in a classroom; according to the audio signal, obtaining sound arrival angle information of the audio signal and sound energy data of the audio signal, wherein the sound arrival angle information comprises a plurality of sound arrival angles;
The sound arrival angle processing module is used for acquiring a preset interference sound arrival angle if the sound energy data of the audio signal exceeds a preset sound energy threshold value; removing the interference sound arrival angle in the sound arrival angle information according to the preset interference sound arrival angle, and obtaining removed sound arrival angle information;
The target audio signal determining module is configured to determine that the audio signal is unison reading if the number of sound arrival angles in the sound arrival angle information after exclusion exceeds a preset first sound source number threshold, where the preset first sound source number threshold indicates the number of sound sources corresponding to unison reading.
As a third aspect of the embodiments of the present application, there is provided an electronic apparatus, including: a processor, a memory and a computer program stored on the memory and executable on the processor, which when executed by the processor performs the steps of the audio decision method as described in the first aspect.
As a fourth aspect of embodiments of the present application, there is provided a storage medium storing a computer program which, when executed by a processor, implements the steps of the audio judgment method as described in the first aspect.
The embodiment of the application acquires an audio signal collected in a classroom; obtains, from the audio signal, sound arrival angle information and sound energy data of the audio signal, wherein the sound arrival angle information comprises a plurality of sound arrival angles; if the sound energy data exceeds a preset sound energy threshold, acquires a preset interference sound arrival angle; excludes the interference sound arrival angle from the sound arrival angle information according to the preset interference sound arrival angle to obtain the sound arrival angle information after exclusion; and, if the number of sound arrival angles in the sound arrival angle information after exclusion exceeds a preset first sound source number threshold, determines that the audio signal is unison reading, wherein the preset first sound source number threshold indicates the number of sound sources corresponding to unison reading. By calculating the sound arrival angles and the sound energy corresponding to the audio signal, unison-reading audio is judged accurately and at reduced cost.
For a better understanding and implementation, the present application is described in detail below with reference to the drawings.
Drawings
Fig. 1 is a flow chart of an audio judging method according to an embodiment of the application;
Fig. 2 is a schematic flow chart of S1 in an audio determining method according to an embodiment of the present application;
Fig. 3 is a schematic flow chart of S2 in the audio determining method according to an embodiment of the present application;
Fig. 4 is a schematic flow chart of S2 in the audio determining method according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of an audio determining apparatus according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, embodiments of the present application will be described in further detail below with reference to the accompanying drawings. Where the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated.
It should be understood that the embodiments described in the examples described below do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this disclosure, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Furthermore, in the description of the present application, unless otherwise indicated, "a plurality" means two or more. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items; for example, "A and/or B" may represent three cases: A exists alone, A and B exist together, and B exists alone. The character "/" generally indicates an "or" relationship between the objects before and after it.
It should be appreciated that, although the terms first, second, third, etc. may be used herein to describe various information, the information should not be limited by these terms; the terms are used only to distinguish between similar objects and do not necessarily describe a particular order or sequence or imply relative importance. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art according to the specific circumstances. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to a determination".
The application scene of the audio judging method of the embodiment of the application comprises recording and broadcasting equipment, audio acquisition equipment and a loudspeaker; the audio acquisition device is used for carrying out data transmission with the recording and broadcasting device, the audio acquisition device is used for acquiring audio signals sent by a loudspeaker and audio signals sent by a teacher or a student, the acquired audio signals are sent to the recording and broadcasting device, the recording and broadcasting device is used for receiving the audio signals, and noise reduction and storage are carried out on the audio signals.
The audio judging method can be executed by recording and broadcasting equipment, the recording and broadcasting equipment can realize the audio judging method in a mode of software and/or hardware, and the recording and broadcasting equipment can be formed by two or more physical entities or one physical entity. The hardware pointed by the recording and playing device essentially refers to a computer device, for example, the recording and playing device can be an intelligent device such as a computer, a mobile phone, a tablet or an intelligent interaction tablet.
The recording and playing equipment is provided with at least one type of operating system, wherein the operating system comprises, but is not limited to, an android system, a Linux system and a Windows system. In an embodiment, the recording device may install at least one application program based on the operating system, and in an embodiment, the exemplary description is given with respect to a conference reservation application program. The application may be an application hosted by the operating system or may be an application downloaded from a third party device or server. The user can implement the audio judgment method of the present application based on the conference reservation application.
Referring to fig. 1, fig. 1 is a flowchart of an audio determining method according to an embodiment of the application, the method includes the following steps:
S1: acquiring an audio signal acquired in a classroom; according to the audio signal, sound arrival angle information of the audio signal and sound energy data of the audio signal are obtained, wherein the sound arrival angle information comprises a plurality of sound arrival angles.
The recording and broadcasting equipment can acquire the audio signals sent by the audio acquisition equipment through data transmission with the audio acquisition equipment, and can also extract corresponding audio signals in a preset database, wherein the database comprises a plurality of audio signals at different moments acquired in advance in a class. In an alternative embodiment, the user can establish a connection between the recording and playing device and the server by constructing a database, so that the data transmission between the recording and playing device and the database of the server is realized, the recording and playing device can call the database, and corresponding audio signals in the database are extracted.
The recording and broadcasting device collects audio signals in a class through the audio collection device to obtain audio signals collected in the class, and obtains sound arrival angle information of the audio signals according to the audio signals, wherein the sound arrival angle information comprises a plurality of sound arrival angles (DOA, direction of arrival), and each sound arrival angle corresponds to one sound source.
The recording and playing device obtains sound energy data of the audio signal, and specifically, the recording and playing device can obtain sound energy corresponding to the audio signal by detecting amplitude generated by the audio acquisition device for the audio signal.
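As a rough illustration of deriving sound energy from sample amplitudes, the sketch below computes the mean-square energy of one frame in decibels. The function name `frame_energy_db`, the dB scale, and the frame length are assumptions for illustration; the application does not specify how energy is computed from amplitude.

```python
import math

def frame_energy_db(samples, eps=1e-12):
    """Mean-square energy of one frame in dB relative to full scale (1.0).

    Hypothetical helper: the dB scale and eps floor are illustrative choices.
    """
    if not samples:
        raise ValueError("frame must contain at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square + eps)

# Two synthetic 100 ms frames at 16 kHz: quiet speech-like vs. loud signal.
quiet = [0.01 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(1600)]
loud = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(1600)]

print(frame_energy_db(quiet), frame_energy_db(loud))
```

A value of this kind can then be compared against the preset sound energy threshold used in step S2.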
S2: if the sound energy data of the audio signal exceeds a preset sound energy threshold value, acquiring a preset interference sound arrival angle; and eliminating the interference sound arrival angle in the sound arrival angle information according to the preset interference sound arrival angle, and obtaining the eliminated sound arrival angle information.
In the unison reading state, the sound energy of the audio signal collected in the classroom is much larger than that of normal speech. An interference sound source may also be present: for example, when a teacher speaks through a loudspeaker fixedly installed in the classroom, the sound energy of the audio signal emitted by the loudspeaker is likewise much larger than that of normal speech. The recording and playing device is therefore preset with a sound energy threshold and uses it for an initial judgment: based on the sound energy of the audio signal currently collected by the audio acquisition device, it judges whether the signal may be an audio signal in the unison reading state or an audio signal emitted by the teacher through the loudspeaker. The preset sound energy threshold can be set according to the actual situation and is not limited here.
In this embodiment, the recording and playing device compares the sound energy corresponding to the audio signal with the preset sound energy threshold. If the sound energy does not exceed the threshold, the audio signal is judged to include neither an audio signal in the unison reading state nor an audio signal emitted by the teacher through the loudspeaker, so noise reduction processing can be performed on it directly.
If the sound energy data exceeds the preset sound energy threshold, the collected audio signal may include only an audio signal in the unison reading state, only an audio signal emitted by the teacher through the loudspeaker, or both. Judging by sound energy alone is therefore prone to misjudgment.
To judge accurately whether the audio signal includes unison reading, in this embodiment the recording and playing device acquires a preset interference sound arrival angle and, according to it, excludes the associated sound arrival angles from the sound arrival angle information to obtain the sound arrival angle information after exclusion. Excluding the interference sound source in this way reduces its influence on the unison reading judgment.
S3: if the number of sound arrival angles in the sound arrival angle information after exclusion exceeds a preset first sound source number threshold, determining that the audio signal is unison reading.
The first sound source number threshold is the number of sound sources corresponding to the unison reading state. Because an audio signal in the unison reading state generally comprises audio signals emitted by a plurality of student sound sources, whether the audio signal includes unison reading can be confirmed by comparing the number of sound arrival angles remaining after exclusion with the preset first sound source number threshold.
In this embodiment, the recording and playing device counts the sound arrival angles in the sound arrival angle information after exclusion and, if the count exceeds the preset first sound source number threshold, determines that the audio signal is unison reading.
Specifically, if the count does not exceed the preset first sound source number threshold, the audio signal is confirmed not to include an audio signal in the unison reading state, and noise reduction processing can be performed on it directly.
If the count exceeds the preset first sound source number threshold, the audio signal is confirmed to include an audio signal in the unison reading state; it is taken as a target audio signal and is not subjected to noise reduction processing.
The embodiment of the application acquires an audio signal collected in a classroom; obtains, from the audio signal, sound arrival angle information and sound energy data of the audio signal, wherein the sound arrival angle information comprises a plurality of sound arrival angles; if the sound energy data exceeds a preset sound energy threshold, acquires a preset interference sound arrival angle; excludes the interference sound arrival angle from the sound arrival angle information according to the preset interference sound arrival angle to obtain the sound arrival angle information after exclusion; and, if the number of sound arrival angles in the sound arrival angle information after exclusion exceeds a preset first sound source number threshold, determines that the audio signal is unison reading, wherein the preset first sound source number threshold indicates the number of sound sources corresponding to unison reading. By calculating the sound arrival angles and the sound energy corresponding to the audio signal, unison-reading audio is judged accurately and at reduced cost.
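The three-stage flow of S1 to S3 (energy gate, interference exclusion, source count) can be sketched as a single function. This is a minimal sketch under stated assumptions: the function name `is_unison_reading`, the representation of the interference region as a `(low, high)` interval in degrees, and the example values are all hypothetical and not taken from the application.

```python
def is_unison_reading(doa_angles, energy, energy_threshold,
                      interference_interval, source_count_threshold):
    """Return True when the frame is judged to be reading aloud in unison.

    doa_angles: sound arrival angles (degrees) detected for the frame.
    interference_interval: assumed (low, high) angle range of the loudspeaker.
    """
    # Step S2 gate: low-energy frames are neither unison reading nor loudspeaker.
    if energy <= energy_threshold:
        return False
    # Step S2 exclusion: drop angles that fall inside the interference interval.
    low, high = interference_interval
    remaining = [a for a in doa_angles if not (low <= a <= high)]
    # Step S3: many remaining sources indicate unison reading.
    return len(remaining) > source_count_threshold

# Loud frame, loudspeaker near 90 degrees, three student sources remain.
print(is_unison_reading([30.0, 60.0, 90.0, 120.0], energy=-10.0,
                        energy_threshold=-20.0,
                        interference_interval=(85.0, 95.0),
                        source_count_threshold=2))
```

A frame for which this returns True would be kept as the target audio signal and bypass noise reduction; any other frame would be denoised as usual.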
In an alternative embodiment, the recording and playing device may obtain a plurality of sound arrival angles by adopting a method based on a sound arrival time difference, referring to fig. 2, fig. 2 is a schematic flow chart of step S1 in the audio judging method provided by an embodiment of the present application, including steps S11 to S14, specifically as follows:
S11: acquiring the audio signals in the class through at least two audio acquisition devices to obtain the audio signal corresponding to each audio acquisition device.
In this embodiment, the recording and playing device collects audio signals in a classroom through at least two audio collection devices, and obtains audio signals corresponding to the audio collection devices.
Specifically, the user is respectively provided with an audio acquisition device, such as a microphone, at two preset positions, the audio acquisition device performs data transmission with the recording and playing device, and the two audio acquisition devices respectively acquire corresponding audio signals at the respective preset positions and send the corresponding audio signals to the recording and playing device, wherein the audio signals are specifically as follows:
$$x_1(t) = s(t - \tau_1) + n_1(t)$$
$$x_2(t) = s(t - \tau_2) + n_2(t)$$
where $x_1(t)$ is the first audio signal acquired by the first audio acquisition device, $s(t - \tau_1)$ is the sound source signal included in the first audio signal, and $n_1(t)$ is the additive noise signal included in the first audio signal; $x_2(t)$ is the second audio signal acquired by the second audio acquisition device, $s(t - \tau_2)$ is the sound source signal included in the second audio signal, and $n_2(t)$ is the additive noise signal included in the second audio signal; $t$ is the time parameter, and $\tau_1$ and $\tau_2$ are the delay parameters of the first and second audio acquisition devices respectively, reflecting the time taken by the audio signal emitted by the sound source to reach each device.
S12: performing a Fourier transform on the audio signal corresponding to each audio acquisition device to obtain the spectrum signal of that audio signal, and obtaining the cross-spectrum coefficient corresponding to the audio signals according to the spectrum signals and a preset cross-spectrum calculation algorithm.
The cross-spectrum calculation algorithm is as follows:
[formula image not reproduced in the source text]
where the result is the cross-spectrum coefficient, $X_1(\omega)$ is the spectrum signal of the first audio signal, and $X_2(\omega)$ is the spectrum signal of the second audio signal.
In this embodiment, the recording and playing device performs a Fourier transform on the audio signal corresponding to each audio acquisition device to obtain the spectrum signal of that audio signal, and obtains the cross-spectrum coefficient corresponding to the audio signals according to the spectrum signals and the preset cross-spectrum calculation algorithm.
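To make step S12 concrete, the sketch below computes a cross-spectrum as the product of one spectrum with the complex conjugate of the other, which is the conventional definition and is assumed here because the application's own formula is not available. The helper names `dft` and `cross_spectrum` are hypothetical, and the naive transform is chosen only to keep the example dependency-free.

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def cross_spectrum(x1, x2):
    """Assumed definition: X1(w) multiplied by the complex conjugate of X2(w)."""
    X1, X2 = dft(x1), dft(x2)
    return [a * b.conjugate() for a, b in zip(X1, X2)]

# x2 is x1 delayed by one sample, so the cross-spectrum phase encodes that delay.
G = cross_spectrum([1, 0, 0, 0], [0, 1, 0, 0])
print(G)
```

In practice a fast Fourier transform would replace the naive loop; the phase of each cross-spectrum bin is what carries the inter-microphone delay used in the next step.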
S13: calculating the cross-correlation coefficient corresponding to the audio signals according to the cross-spectrum coefficient, and solving the cross-correlation coefficient to obtain a delay parameter set.
Each delay parameter indicates the difference between the times at which the audio signal emitted by the same sound source arrives at the at least two audio acquisition devices.
The recording and broadcasting equipment calculates the cross-correlation coefficient corresponding to the audio signal according to the cross-spectrum coefficient corresponding to the audio signal, and calculates a set of the cross-correlation coefficient corresponding to the audio signal to obtain a time delay parameter set.
Specifically, the recording and playing device calculates a cross-correlation coefficient corresponding to the audio signal according to the cross-spectrum coefficient corresponding to the audio signal.
In an alternative embodiment, the recording and playing device obtains the cross-correlation coefficient corresponding to the audio signals according to the cross-spectrum coefficient and a preset frequency domain weighting function, as follows:
[formula image not reproduced in the source text]
where the result is the cross-correlation coefficient and $\psi_{12}$ is the frequency domain weighting function applied to the cross-spectrum coefficient.
The frequency domain weighting function may be, among others, the basic cross-correlation weighting, the smoothed coherence transform (SCOT), the Roth processor, the maximum likelihood (ML) weighting, or the phase transform (PHAT) weighting.
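For reference, the generalized cross-correlation formulation that steps S12 and S13 appear to follow is conventionally written as below. The symbols $G_{12}$, $G_{11}$, $G_{22}$, $R_{12}$ and the normalisation are assumptions taken from the standard textbook formulation, not recovered from the application's own formulas; the maximum likelihood weighting is omitted.

```latex
\begin{align}
G_{12}(\omega) &= X_1(\omega)\, X_2^{*}(\omega) \\
R_{12}(\tau) &= \frac{1}{2\pi} \int_{-\infty}^{\infty}
                \psi_{12}(\omega)\, G_{12}(\omega)\, e^{j\omega\tau}\, d\omega \\
\hat{\tau} &= \arg\max_{\tau} R_{12}(\tau)
\end{align}

\begin{align}
\text{basic cross-correlation:} \quad & \psi_{12}(\omega) = 1 \\
\text{Roth:} \quad & \psi_{12}(\omega) = \frac{1}{G_{11}(\omega)} \\
\text{SCOT:} \quad & \psi_{12}(\omega) = \frac{1}{\sqrt{G_{11}(\omega)\, G_{22}(\omega)}} \\
\text{PHAT:} \quad & \psi_{12}(\omega) = \frac{1}{\lvert G_{12}(\omega) \rvert}
\end{align}
```

With the signal model $x_i(t) = s(t - \tau_i) + n_i(t)$ and this sign convention, the peak of $R_{12}(\tau)$ falls at $\tau = \tau_1 - \tau_2$.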
In another optional embodiment, the recording and playing device obtains the cross-correlation coefficient corresponding to the audio signals according to the cross-spectrum coefficient and the entropy value of the absolute value of the cross-spectrum coefficient, as follows:
[formula image not reproduced in the source text]
The recording and playing device solves the cross-correlation coefficient corresponding to the audio signals according to a preset solving function, obtains the delay parameters at which the cross-correlation coefficient reaches its maxima, and combines them into a delay parameter set. Each delay parameter represents the difference in arrival time, at the audio acquisition devices, of the audio signal emitted by the corresponding sound source. The solving function is as follows:
[formula image not reproduced in the source text]
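One way to realise this delay search is a PHAT-weighted generalized cross-correlation followed by peak picking, sketched below. This is an illustration under assumptions, not the application's solving function: the PHAT weighting is only one of the listed options, and `gcc_phat`, `estimate_delays`, `max_lag` and `num_peaks` are hypothetical names. Delays are in samples and the correlation is circular.

```python
import cmath
import random

def dft(x, inverse=False):
    """Naive O(n^2) DFT; the inverse transform includes the 1/n factor."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def gcc_phat(x1, x2, eps=1e-12):
    """PHAT-weighted circular cross-correlation of two equal-length frames."""
    X1, X2 = dft(x1), dft(x2)
    cross = [a * b.conjugate() for a, b in zip(X1, X2)]
    weighted = [g / (abs(g) + eps) for g in cross]  # keep phase, drop magnitude
    return [v.real for v in dft(weighted, inverse=True)]

def estimate_delays(r, max_lag, num_peaks=1):
    """Return the lags (in samples) of the strongest local maxima of r.

    max_lag should stay below len(r) / 2 so that lags do not wrap around.
    """
    n = len(r)
    peaks = [lag for lag in range(-max_lag, max_lag + 1)
             if r[lag % n] > r[(lag - 1) % n] and r[lag % n] >= r[(lag + 1) % n]]
    peaks.sort(key=lambda lag: r[lag % n], reverse=True)
    return peaks[:num_peaks]

# Synthetic check: x2 is x1 circularly delayed by 3 samples (tau1 - tau2 = -3).
rng = random.Random(0)
s = [rng.uniform(-1.0, 1.0) for _ in range(64)]
x1 = s
x2 = [s[(t - 3) % 64] for t in range(64)]
delays = estimate_delays(gcc_phat(x1, x2), max_lag=8)
print(delays)
```

Raising `num_peaks` returns several candidate delays, one per detected source, which corresponds to the delay parameter set described above. Dividing a lag by the sampling rate converts it to seconds.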
S14: obtaining the sound arrival angles corresponding to the delay parameters according to the distance data between the audio acquisition devices, the delay parameter set and a preset sound arrival angle calculation algorithm.
The sound arrival angle calculation algorithm is as follows:
[formula image not reproduced in the source text]
where $L$ is the distance data, $\theta$ is the sound arrival angle, $v$ is the sound propagation speed, and the delay parameter is taken from the delay parameter set.
In this embodiment, the recording and playing device obtains sound arrival angles corresponding to a plurality of delay parameters according to distance data between the audio acquisition devices, the delay parameter set and a preset sound arrival angle calculation algorithm.
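The application's angle formula is not reproduced in this text, so the sketch below assumes the common far-field two-microphone relation cos θ = v·τ / L, with θ measured from the axis joining the microphones. The function name `doa_from_delay`, the default speed of sound of 343 m/s, and the clamping step are illustrative assumptions; the original formula may use a different angle convention.

```python
import math

def doa_from_delay(tau, mic_distance, speed_of_sound=343.0):
    """Sound arrival angle in degrees from a delay in seconds.

    Assumes a far-field source and the relation cos(theta) = v * tau / L.
    """
    ratio = speed_of_sound * tau / mic_distance
    # Clamp to [-1, 1] so measurement noise cannot push acos out of its domain.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.acos(ratio))

# Microphones 10 cm apart: zero delay means the source is broadside.
print(doa_from_delay(0.0, 0.1))
print(doa_from_delay(0.05 / 343.0, 0.1))
```

Applying the function to every delay in the delay parameter set yields one arrival angle per detected sound source.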
Referring to fig. 3, fig. 3 is a schematic flow chart of step S2 in the audio determining method according to an embodiment of the present application, and further includes steps S21 to S22, which specifically include the following steps:
s21: the method comprises the steps of collecting audio signals sent by an interference sound source in a classroom for multiple times, obtaining audio signals collected each time, and obtaining a plurality of sound arrival angles corresponding to the audio signals collected each time.
The audio signal sent by the interference sound source comprises an audio signal and an environment interference signal which are externally emitted by the loudspeaker, and because a teacher frequently speaks through the loudspeaker in a class, the sound energy corresponding to the audio signal sent by the loudspeaker is also much larger than the energy of normal speaking, and the environment surrounding possibly has the environment interference signal corresponding to the noise source.
In order to improve the judging efficiency and accuracy of the recording and broadcasting equipment on the audio signals, the recording and broadcasting equipment can acquire the audio signals sent by the interference sound sources on the class for multiple times according to the audio acquisition equipment so as to acquire the arrival angles of the interference sounds.
In order to quickly and accurately obtain a preset interference sound arrival angle, in this embodiment, the recording and broadcasting device collects audio signals sent by an interference sound source on a class for multiple times according to preset sampling times, obtains each collected audio signal, and obtains a plurality of sound arrival angles corresponding to each collected audio signal.
S22: if the number of the sound arrival angles corresponding to the audio signals collected each time exceeds a preset second sound source number threshold, the same sound arrival angle is obtained according to the number of the sound arrival angles corresponding to the audio signals collected each time and is used as an interference sound arrival angle.
The preset second sound source number threshold is used for indicating the number of sound sources corresponding to the interference sound source, if the number of the sound arrival angles corresponding to the audio signals collected each time exceeds the preset second sound source number threshold, the audio signals comprise the audio signals corresponding to the interference sound source, and because the positions of the interference sound sources, namely the loudspeaker sound sources, are fixed, the recording and playing equipment analyzes the sound arrival angles corresponding to the audio signals collected each time based on the sound arrival angles to obtain the same sound arrival angles as the interference sound arrival angles. Through a mode of multiple detection, the arrival angle of the interference sound is obtained rapidly and accurately.
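The repeated-detection idea in S21 and S22 can be sketched as a search for an angle that recurs in every collection. The matching tolerance, the averaging of the matched angles, and the name `find_interference_angle` are assumptions added for illustration; the application only states that "the same" sound arrival angle is taken as the interference sound arrival angle.

```python
def find_interference_angle(captures, source_count_threshold, tolerance=2.0):
    """Return an angle present in every capture, or None if there is none.

    captures: list of per-collection angle lists (degrees).
    tolerance: assumed matching window, since measured angles rarely coincide exactly.
    """
    # Every collection must contain more angles than the second threshold.
    if not captures or any(not c or len(c) <= source_count_threshold
                           for c in captures):
        return None
    for candidate in captures[0]:
        # Closest angle to the candidate in each collection.
        matches = [min(c, key=lambda a: abs(a - candidate)) for c in captures]
        if all(abs(m - candidate) <= tolerance for m in matches):
            return sum(matches) / len(matches)
    return None

# The loudspeaker stays near 45 degrees while other sources move around.
captures = [[45.0, 120.5], [44.0, 80.0], [46.0, 150.0]]
print(find_interference_angle(captures, source_count_threshold=0))
```

Because the loudspeaker is fixed in place, its angle is the one that survives across collections, while transient sources drop out.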
Specifically, referring to fig. 4, fig. 4 is a schematic flow chart of step S2 in the audio determining method according to an embodiment of the present application, including steps S23 to S24, specifically as follows:
S23: and constructing an interference sound arrival angle interval according to the interference sound arrival angle and a preset interference sound arrival angle threshold.
In order to exclude the influence of the interference sound source as much as possible, in this embodiment the recording and broadcasting device obtains a minimum interference sound arrival angle and a maximum interference sound arrival angle according to the interference sound arrival angle and the preset interference sound arrival angle threshold, and constructs the interference sound arrival angle interval.
Specifically, the recording and broadcasting device obtains the interference sound arrival angle threshold input in advance by a user, subtracts the interference sound arrival angle threshold from the interference sound arrival angle and takes the subtraction result as the minimum interference sound arrival angle, adds the interference sound arrival angle threshold to the interference sound arrival angle and takes the addition result as the maximum interference sound arrival angle, and constructs the interference sound arrival angle interval from the minimum and maximum interference sound arrival angles. The interference sound arrival angle threshold may be set according to actual situations and is not limited here.
S24: judging whether each sound arrival angle in the sound arrival angle information is in an interference sound arrival angle interval, marking the sound arrival angle in the interference sound arrival angle interval as an interference sound arrival angle, eliminating the interference sound arrival angle in the sound arrival angle information, and obtaining the eliminated sound arrival angle information.
In this embodiment, the recording and playing device determines whether each sound arrival angle in the sound arrival angle information is within an interference sound arrival angle interval, marks the sound arrival angle within the interference sound arrival angle interval as an interference sound arrival angle, excludes the interference sound arrival angle in the sound arrival angle information, and obtains the excluded sound arrival angle information.
By calculating the sound arrival angles corresponding to the audio signal and marking the plurality of sound arrival angles in the sound arrival angle information of the audio signal collected in the classroom, the interference sound arrival angles are accurately distinguished from the normal sound arrival angles, and the cost is reduced.
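Steps S23 and S24 can be sketched in Python as follows. The function names and the use of a closed interval (both endpoints count as inside) are assumptions for illustration; angles are taken to be in degrees.

```python
from typing import List, Tuple


def build_interference_interval(
    interference_angle: float, angle_threshold: float
) -> Tuple[float, float]:
    """S23: interval [angle - threshold, angle + threshold]."""
    return (interference_angle - angle_threshold,
            interference_angle + angle_threshold)


def exclude_interference(
    angles: List[float], interference_angle: float, angle_threshold: float
) -> List[float]:
    """S24: drop every sound arrival angle that falls inside the interval."""
    low, high = build_interference_interval(interference_angle, angle_threshold)
    # Angles inside the interval are treated as interference and removed.
    return [a for a in angles if not (low <= a <= high)]
```

For instance, with an interference sound arrival angle of 30° and a threshold of 5°, the interval is [25°, 35°], so angles of 28° and 33.5° would be excluded while 45° and 90° would be kept.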
Referring to fig. 5, fig. 5 is a schematic structural diagram of an audio determining apparatus according to an embodiment of the present application, where the apparatus may implement all or a part of an audio determining method by software, hardware or a combination of both, and the audio determining apparatus 5 includes:
A data acquisition module 51 for acquiring audio signals acquired in a classroom; according to the audio signal, obtaining sound arrival angle information of the audio signal and sound energy data of the audio signal, wherein the sound arrival angle information comprises a plurality of sound arrival angles;
The sound arrival angle processing module 52 is configured to obtain a preset interference sound arrival angle if the sound energy data of the audio signal exceeds a preset sound energy threshold; removing the interference sound arrival angle in the sound arrival angle information according to the preset interference sound arrival angle, and obtaining the removed sound arrival angle information;
The target audio signal determining module 53 is configured to determine that the audio signal is a multiple sound reading if the number of sound arrival angles of the excluded sound arrival angle information exceeds a preset first sound source number threshold, where the preset first sound source number threshold is used to indicate the number of sound sources corresponding to the multiple sound reading.
In an alternative embodiment, the data acquisition module 51 includes:
The first audio signal acquisition module is used for acquiring audio signals in a classroom through at least two audio acquisition devices to obtain audio signals corresponding to the audio acquisition devices;
The cross-spectrum coefficient calculation module is used for performing a Fourier transform on the audio signals corresponding to the audio acquisition devices to obtain the spectrum signals of the audio signals corresponding to the audio acquisition devices, and obtaining the cross-spectrum coefficient corresponding to the audio signal according to those spectrum signals and a preset cross-spectrum calculation algorithm;
The time delay parameter calculation module is used for calculating the cross-correlation coefficient corresponding to the audio signal according to the cross-spectrum coefficient corresponding to the audio signal, and solving the set of cross-correlation coefficients corresponding to the audio signal to obtain a time delay parameter set, wherein the time delay parameter set comprises a plurality of time delay parameters, and each time delay parameter indicates the difference between the times at which an audio signal emitted by the same sound source reaches the at least two audio acquisition devices;
The sound arrival angle calculation module is used for obtaining sound arrival angles corresponding to a plurality of time delay parameters according to distance data among the audio acquisition devices, the time delay parameter set and a preset sound arrival angle calculation algorithm.
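The chain of modules above (Fourier transform, cross-spectrum, cross-correlation, time delay, sound arrival angle) can be illustrated with the following Python sketch for two audio acquisition devices. It is a sketch under stated assumptions, not the application's algorithm: the frequency-domain weighting used here is the common PHAT weighting (dividing the cross-spectrum by its magnitude), the angle formula is the far-field relation theta = arcsin(c * tau / d) measured from the broadside of the microphone pair, the speed of sound is assumed to be 343 m/s, and only the single strongest correlation peak is returned rather than a set of delays. A naive DFT is used to keep the example dependency-free; it is only practical for short frames.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (illustration only)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def estimate_delay(sig_a, sig_b, sample_rate):
    """Delay in seconds of sig_b relative to sig_a (positive if sig_b lags)."""
    n = len(sig_a) + len(sig_b)  # zero-pad so the correlation does not wrap around
    spec_a = dft(list(sig_a) + [0.0] * (n - len(sig_a)))
    spec_b = dft(list(sig_b) + [0.0] * (n - len(sig_b)))
    # Cross-spectrum of the two channels.
    cross = [b * a.conjugate() for a, b in zip(spec_a, spec_b)]
    # PHAT weighting (assumed): keep only the phase of each frequency bin.
    weighted = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]
    # Inverse transform gives the (weighted) cross-correlation.
    corr = [v.real for v in dft(weighted, inverse=True)]
    peak = max(range(n), key=lambda i: corr[i])
    lag = peak if peak < n // 2 else peak - n  # map upper half to negative lags
    return lag / sample_rate


def delay_to_angle(delay_s, mic_distance_m, speed=SPEED_OF_SOUND):
    """Far-field sound arrival angle in degrees from a time delay."""
    ratio = max(-1.0, min(1.0, speed * delay_s / mic_distance_m))  # clamp for asin
    return math.degrees(math.asin(ratio))
```

In practice a fast Fourier transform would replace `dft`, and picking several peaks of the cross-correlation instead of only the largest would yield the plurality of time delay parameters, one per sound source, that the module description calls for.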
In an alternative embodiment, the sound arrival angle processing module 52 includes:
The second audio signal acquisition module is used for acquiring audio signals sent by the interference sound source on the classroom for multiple times to obtain audio signals acquired each time and a plurality of sound arrival angles corresponding to the audio signals acquired each time;
the interference sound arrival angle calculation module is used for obtaining the same sound arrival angle as the interference sound arrival angle according to the plurality of sound arrival angles corresponding to the audio signals collected each time if the number of the plurality of sound arrival angles corresponding to the audio signals collected each time exceeds a preset second sound source number threshold value.
In this embodiment, the data acquisition module acquires an audio signal collected in a classroom and, according to the audio signal, obtains sound arrival angle information of the audio signal and sound energy data of the audio signal, wherein the sound arrival angle information comprises a plurality of sound arrival angles. If the sound energy data of the audio signal exceeds a preset sound energy threshold, the sound arrival angle processing module acquires a preset interference sound arrival angle, removes the interference sound arrival angle from the sound arrival angle information according to the preset interference sound arrival angle, and obtains the excluded sound arrival angle information. If the number of sound arrival angles in the excluded sound arrival angle information exceeds a preset first sound source number threshold, the target audio signal determining module determines that the audio signal is a multiple sound reading, wherein the preset first sound source number threshold is used for indicating the number of sound sources corresponding to the multiple sound reading. By calculating the sound arrival angles corresponding to the audio signal and the sound energy corresponding to the audio signal, the audio of the multiple sound reading is judged accurately, and the cost is reduced.
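The overall decision made by the three modules can be summarized in a short Python sketch. The function name and parameter names are hypothetical, and returning `False` when the sound energy does not exceed the threshold is an assumption, since this section does not state what happens in that case.

```python
from typing import List


def is_multiple_sound_reading(
    angles: List[float],           # sound arrival angles of the audio signal, degrees
    energy: float,                 # sound energy data of the audio signal
    energy_threshold: float,       # preset sound energy threshold
    interference_angle: float,     # preset interference sound arrival angle
    angle_threshold: float,        # preset interference sound arrival angle threshold
    first_source_threshold: int,   # preset first sound source number threshold
) -> bool:
    # Energy gate: assumed to rule out multiple sound reading when not exceeded.
    if energy <= energy_threshold:
        return False
    # Exclude angles inside the interference sound arrival angle interval.
    low = interference_angle - angle_threshold
    high = interference_angle + angle_threshold
    remaining = [a for a in angles if not (low <= a <= high)]
    # Enough distinct remaining sound sources means multiple sound reading.
    return len(remaining) > first_source_threshold
```

The energy check runs first so that the angle exclusion and counting are only performed on loud frames, which matches the order of the modules described above.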
Referring to fig. 6, fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the application. The embodiment of the application also provides electronic equipment, which comprises: a processor 61, a memory 62, and a computer program 63 stored on the memory 62 and executable on the processor; the electronic device may store a plurality of instructions adapted to be loaded by the processor and to execute the method steps of the embodiments shown in fig. 1 to 4, and the specific execution process may refer to the specific description of the embodiments shown in fig. 1 to 4, which is not repeated herein.
The processor 61 may include one or more processing cores. The processor 61 connects various parts within the electronic device using various interfaces and lines, and performs various functions of the audio judging apparatus 5 and processes data by running or executing instructions, programs, code sets or instruction sets stored in the memory 62 and calling data in the memory 62. Optionally, the processor 61 may be implemented in at least one hardware form of digital signal processing (Digital Signal Processing, DSP), a field-programmable gate array (Field-Programmable Gate Array, FPGA), or a programmable logic array (Programmable Logic Array, PLA). The processor 61 may integrate one or a combination of several of a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), a modem, etc. The CPU mainly handles the operating system, the user interface, application programs and the like; the GPU is used for rendering and drawing the content to be displayed by the touch display screen; the modem is used to handle wireless communications. It will be appreciated that the modem may also not be integrated into the processor 61 and may instead be implemented by a separate chip.
The memory 62 may include a random access memory (Random Access Memory, RAM) or a read-only memory (Read-Only Memory, ROM). Optionally, the memory 62 includes a non-transitory computer-readable storage medium. The memory 62 may be used to store instructions, programs, code sets, or instruction sets. The memory 62 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for at least one function (such as touch instructions, etc.), instructions for implementing the various method embodiments described above, etc.; the stored data area may store the data referred to in the above method embodiments. Optionally, the memory 62 may also be at least one storage device located remotely from the aforementioned processor 61.
The embodiment of the present application further provides a storage medium, where the storage medium may store a plurality of instructions, where the instructions are suitable for being loaded by a processor and executed by the processor to perform the steps of the method shown in fig. 1 to fig. 4, and the specific execution process may refer to the specific description of the embodiment shown in fig. 1 to fig. 4, which is not repeated herein.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
In the foregoing embodiments, the description of each embodiment has its own emphasis. For parts that are not described or illustrated in detail in a particular embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other manners. For example, the apparatus/terminal device embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The present application is not limited to the above-described embodiments. If various modifications or variations of the present application do not depart from its spirit and scope, and fall within the scope of the claims and their equivalents, the present application is intended to include such modifications and variations.

Claims (10)

1. An audio judging method is characterized by comprising the following steps:
Acquiring an audio signal acquired in a classroom; according to the audio signal, obtaining sound arrival angle information of the audio signal and sound energy data of the audio signal, wherein the sound arrival angle information comprises a plurality of sound arrival angles;
if the sound energy data of the audio signal exceeds a preset sound energy threshold value, acquiring a preset interference sound arrival angle; removing the interference sound arrival angle in the sound arrival angle information according to the preset interference sound arrival angle, and obtaining removed sound arrival angle information;
If the sound arrival angle number of the excluded sound arrival angle information exceeds a preset first sound source number threshold, determining that the audio signal is the multiple sound reading, wherein the preset first sound source number threshold is used for indicating the number of sound sources corresponding to the multiple sound reading.
2. The audio judging method according to claim 1, wherein the step of acquiring a preset disturbing sound arrival angle includes the steps of:
Collecting, multiple times, the audio signals emitted by an interference sound source in the classroom, to obtain the audio signal collected each time and a plurality of sound arrival angles corresponding to the audio signal collected each time;
If the number of sound arrival angles corresponding to the audio signal collected each time exceeds a preset second sound source number threshold, obtaining, from the plurality of sound arrival angles corresponding to the audio signal collected each time, the sound arrival angle that is the same across the collections as the interference sound arrival angle, wherein the preset second sound source number threshold is used for indicating the number of sound sources corresponding to the interference sound source.
3. The audio judgment method according to claim 2, wherein: the audio signals emitted by the interference sound source comprise audio signals played by a loudspeaker and environmental interference signals.
4. The audio judging method according to claim 1, wherein the step of excluding the sound arrival angle associated with the disturbing sound arrival angle from the sound arrival angle information to obtain the excluded sound arrival angle information, includes the steps of:
constructing an interference sound arrival angle interval according to the interference sound arrival angle and a preset interference sound arrival angle threshold;
judging whether each sound arrival angle in the sound arrival angle information is in the interference sound arrival angle interval, marking the sound arrival angle in the interference sound arrival angle interval as an interference sound arrival angle, eliminating the interference sound arrival angle in the sound arrival angle information, and obtaining the eliminated sound arrival angle information.
5. The audio judgment method according to any one of claims 1 to 4, wherein the obtaining an audio signal collected in a class, obtaining sound arrival angle information of the audio signal and sound energy data of the audio signal from the audio signal, comprises the steps of:
Acquiring audio signals in a classroom through at least two audio acquisition devices to obtain audio signals corresponding to the audio acquisition devices;
Performing a Fourier transform on the audio signals corresponding to the audio acquisition devices to obtain spectrum signals of the audio signals corresponding to the audio acquisition devices, and obtaining a cross-spectrum coefficient corresponding to the audio signal according to the spectrum signals of the audio signals corresponding to the audio acquisition devices and a preset cross-spectrum calculation algorithm;
Calculating a cross-correlation coefficient corresponding to the audio signal according to the cross-spectrum coefficient corresponding to the audio signal, and solving the set of cross-correlation coefficients corresponding to the audio signal to obtain a time delay parameter set, wherein the time delay parameter set comprises a plurality of time delay parameters, and each time delay parameter indicates the difference between the times at which an audio signal emitted by the same sound source reaches the at least two audio acquisition devices;
And obtaining the sound arrival angles corresponding to the delay parameters according to the distance data and the delay parameter set between the audio acquisition devices.
6. The audio judging method according to claim 5, wherein the calculating the cross-correlation coefficient corresponding to the audio signal based on the cross-spectral coefficient corresponding to the audio signal comprises the steps of:
And obtaining the cross-correlation coefficient corresponding to the audio signal according to the cross-spectrum coefficient corresponding to the audio signal and a preset frequency domain weighting function.
7. The audio judging method according to claim 5, wherein the calculating the cross-correlation coefficient corresponding to the audio signal based on the cross-spectral coefficient corresponding to the audio signal comprises the steps of:
And obtaining the cross-correlation coefficient corresponding to the audio signal according to the cross-spectrum coefficient corresponding to the audio signal and the entropy value of the absolute value of the cross-spectrum coefficient corresponding to the audio signal.
8. An audio judgment device, comprising:
the data acquisition module is used for acquiring audio signals acquired in a classroom; according to the audio signal, obtaining sound arrival angle information of the audio signal and sound energy data of the audio signal, wherein the sound arrival angle information comprises a plurality of sound arrival angles;
The sound arrival angle processing module is used for acquiring a preset interference sound arrival angle if the sound energy data of the audio signal exceeds a preset sound energy threshold value; removing the interference sound arrival angle in the sound arrival angle information according to the preset interference sound arrival angle, and obtaining removed sound arrival angle information;
The target audio signal determining module is configured to determine that the audio signal is a multiple sound reading if the number of sound arrival angles of the excluded sound arrival angle information exceeds a preset first sound source number threshold, where the preset first sound source number threshold is used to indicate a number of sound sources corresponding to the multiple sound reading.
9. An electronic device, comprising: a processor, a memory, and a computer program stored on the memory and executable on the processor; the computer program, when executed by the processor, implements the steps of the audio judgment method according to any one of claims 1 to 7.
10. A storage medium, characterized by: the storage medium stores a computer program which, when executed by a processor, implements the steps of the audio judgment method according to any one of claims 1 to 7.
CN202211336717.4A 2022-10-28 2022-10-28 Audio judgment method and device, electronic equipment and storage medium Pending CN117956376A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211336717.4A CN117956376A (en) 2022-10-28 2022-10-28 Audio judgment method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211336717.4A CN117956376A (en) 2022-10-28 2022-10-28 Audio judgment method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117956376A true CN117956376A (en) 2024-04-30

Family

ID=90791202

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211336717.4A Pending CN117956376A (en) 2022-10-28 2022-10-28 Audio judgment method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117956376A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination