CN114120758A - Vocal music training auxiliary system based on intelligent wearable equipment - Google Patents

Vocal music training auxiliary system based on intelligent wearable equipment

Info

Publication number
CN114120758A
CN114120758A
Authority
CN
China
Prior art keywords
vocal
vibration
sound
signals
time
Prior art date
Legal status
Pending
Application number
CN202111196512.6A
Other languages
Chinese (zh)
Inventor
邹永攀
罗诚哲
陈思添
伍楷舜
Current Assignee
Shenzhen University
Original Assignee
Shenzhen University
Priority date
Filing date
Publication date
Application filed by Shenzhen University filed Critical Shenzhen University
Priority to CN202111196512.6A priority Critical patent/CN114120758A/en
Publication of CN114120758A publication Critical patent/CN114120758A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G09: EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B: EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B 15/00: Teaching music

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Telephone Function (AREA)

Abstract

The invention discloses a vocal music training auxiliary system based on an intelligent wearable device. The system comprises a wearable vocal cord vibration receptor and an intelligent device. The wearable vocal cord vibration receptor comprises an audio signal collector, a vibration signal collector, an audio processing module, a control unit and a communication unit. The audio signal collector collects the sound signal produced when the user vocalizes; the vibration signal collector collects the skin vibration signal caused by vibration at the vocal cord position; and the audio processing module processes the collected sound and skin vibration signals and, under the control of the control unit, transmits them to the intelligent device through the communication unit. The intelligent device processes the received sound and skin vibration signals into time-frequency graphs, inputs the graphs into a pre-trained machine learning classifier, and obtains a vocal cord closure state classification result. The system is low in cost, convenient to use, and high in recognition accuracy, and can recognize the vocal cord closure state in real time while the user practices vocal music.

Description

Vocal music training auxiliary system based on intelligent wearable equipment
Technical Field
The invention relates to the field of wearable technology, and in particular to a vocal music training auxiliary system based on an intelligent wearable device.
Background
The closure state of the vocal cords strongly influences both the quality of vocal music production and vocal cord health. In bel canto teaching, vocal technique is built on a foundation of normal vocal cord closure; as for vocal cord health, producing sound for prolonged periods with the vocal cords closed too tightly or too loosely can damage them.
In the prior art, monitoring of the vocal cord closure state mainly follows three technical routes. 1) Electroglottography-based detection captures bioelectrical signals from the skin surface near the vocal cords; it requires specialized medical equipment and is neither widely accessible nor mobile. 2) Detection based on vibration signals resists external noise effectively, but the accelerometer must be pressed tightly against the skin near the vocal cords, which causes discomfort, and its low resolution together with interference from body movement leads to low recognition accuracy. 3) Detection based on the sound signal alone is non-invasive, but is easily disturbed by external environmental noise.
In addition, current vocal cord closure monitoring methods are limited by the accuracy of signal feature extraction: they can identify only three states (over-closure, normal closure and under-closure), which does not tell the user how to adjust. Moreover, their accuracy drops sharply across different users, changing pitch and external environmental noise, so real-time detection cannot be achieved.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a vocal music training auxiliary system based on intelligent wearable equipment.
According to a first aspect of the invention, a vocal music training auxiliary system based on an intelligent wearable device is provided. The system comprises a wearable vocal cord vibroreceptor and a smart device, wherein:
the wearable vocal cord vibration receptor comprises an audio signal collector, a vibration signal collector, an audio processing module, a control unit and a communication unit, wherein the audio signal collector collects the sound signal produced when the user vocalizes, the vibration signal collector collects the skin vibration signal caused by vibration at the vocal cord position, and the audio processing module processes the collected sound and skin vibration signals and, under the control of the control unit, transmits them to the intelligent device through the communication unit;
the intelligent device respectively processes the received sound signals and the skin vibration signals into time-frequency graphs, inputs the time-frequency graphs into a pre-trained machine learning classifier, and obtains vocal cord closed state classification results.
According to a second aspect of the invention, a vocal music training assisting method based on an intelligent wearable device is provided. The method comprises the following steps:
collecting a sound signal generated by the sound production of a user by using an audio signal collector;
collecting, with a vibration signal collector, the skin vibration signal caused by vibration at the user's vocal cords, and amplifying and filtering it;
and processing the sound signals and the skin vibration signals into a time-frequency diagram, inputting the time-frequency diagram into a pre-trained machine learning classifier, and outputting a vocal cord closed state classification result.
Compared with the prior art, the vocal music training auxiliary system based on the intelligent wearable device has low hardware cost, is easy to carry and use, and can recognize the vocal cord closure state in real time and with high accuracy while the user trains, so that the user can adjust the vocal method in time or perform targeted vocal cord exercises.
Other features of the present invention and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.
Fig. 1 is a block diagram of a vocal training assistance system based on an intelligent wearable device according to an embodiment of the present invention;
fig. 2 is an overall process schematic diagram of a vocal music training assisting method based on an intelligent wearable device according to an embodiment of the invention;
FIG. 3 is a flow diagram of detecting a closed state of a vocal cord according to one embodiment of the present invention;
fig. 4 is a flow chart of processing a sound signal and a skin vibration signal according to one embodiment of the present invention.
Detailed Description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
The invention provides a vocal music training auxiliary system based on intelligent wearable equipment, which is used for detecting a vocal cord closed state when a user produces a voice in real time based on a skin vibration signal and a voice sensing signal so as to guide vocal music learning of the user.
The provided system generally comprises a wearable vocal cord vibration receptor and an intelligent terminal, where the wearable vocal cord vibration receptor collects the skin vibration signal near the vocal cords and the sound signal in the ear canal. For example, the wearable vocal cord vibration receptor comprises an audio signal collector, a vibration signal collector, an audio processing module, a control unit and a communication unit. The audio signal collector collects the sound signal produced by the user; the vibration signal collector collects the skin vibration signal caused by skin vibration at the vocal cord position; the audio processing module processes the collected sound and skin vibration signals and transmits them to the intelligent device through the communication unit; and the control unit coordinates the information flow among the other modules or units, for example starting and stopping signal acquisition and controlling data transmission between the wearable vocal cord vibration receptor and the intelligent terminal. The intelligent device processes the received sound and skin vibration signals into time-frequency graphs, inputs the graphs into a pre-trained machine learning classifier, and obtains the vocal cord closure state classification result.
In one embodiment, the audio signal collector is an in-ear microphone for collecting the sound signal in the ear canal, and the vibration signal collector is a piezoelectric ceramic wafer. Using an in-ear microphone shields environmental noise to a certain extent and yields a cleaner sound signal. The piezoelectric ceramic wafer captures the weak vibration of the vocal cords with high sensitivity, so the skin vibration signal caused by the user's phonation is accurately acquired.
In one embodiment, the wearable vocal cord vibration receptor is housed in a hardware enclosure that can be mounted on a neck strap, with the piezoelectric ceramic wafer attached close to the user's vocal cords. This protects the vocal cord vibration receptor and makes it portable.
In one embodiment, the intelligent device is an intelligent terminal or a wearable device; it may be any of several types of electronic device, such as a smartphone, a tablet, a desktop computer, or a vehicle-mounted device.
Referring to fig. 1, a wearing example of the intelligent vocal cord vibration receptor is shown. The wearable vocal cord vibration receptor comprises an in-ear microphone 1 for collecting ear canal sound, a piezoelectric ceramic wafer 2 for collecting skin vibration near the vocal cords, and an adjustable neck strap 3; the audio processing module, control unit and communication unit are not labeled for clarity. It should be understood that these modules or units may be disposed within the hardware enclosure, which may be fixed to the neck strap, with the piezoelectric ceramic wafer attached at the user's vocal cords. The side of the enclosure may further provide a USB Type-C charging port 4 for charging the wearable vocal cord vibration receptor, as well as a power key.
The following description takes as an example a smartphone terminal running the Android operating system. Referring to fig. 3 together with fig. 1 and 2, the detection process for the vocal cord closure state includes the following steps:
and S1, starting the intelligent vocal cord vibration receptor, wherein the auditory canal microphone collects the sound of the auditory canal of the user and the original vibration signal of the skin near the vocal cord of the piezoelectric ceramic piece.
For example, Wi-Fi communication is used between the wearable vocal cord vibration receptor and the mobile phone. After switching on the intelligent vocal cord vibration receptor, the user must ensure that the phone and the hardware are on the same Wi-Fi network, and then open the phone APP.
S2, the raw vibration signal is filtered and amplified by the band-pass filtering and amplifying circuit of the intelligent vocal cord vibration receptor.
For example, a piezoelectric ceramic wafer is preferred as the sensor for collecting the skin vibration signal, because this material is highly sensitive to vibration and produces voltage changes whose amplitude tracks the vibration amplitude. Since vocal cord vibration is relatively weak, the resulting voltage change is small; moreover, the metal layer on the piezoelectric ceramic wafer acts as an antenna and easily picks up noise. A band-pass filtering and amplifying circuit is therefore designed for the wafer: it filters out interference (including the 50 Hz mains component and various noise harmonics) and moderately amplifies the useful signal, so that the collected vocal cord skin vibration signal is relatively clean. An STM32 microcontroller serves as the control unit and samples the raw vibration data through a 12-bit ADC (analog-to-digital converter) at a sampling rate of 10 kHz. When collecting the raw vibration data, the user must wear the sensor at the side of the vocal cords so that the piezoelectric ceramic wafer is pressed closely against them.
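The band-pass filtering described above can be sketched in software as follows. This is a minimal illustration, not the patent's analog circuit: the 80-4000 Hz pass band and filter order are assumptions chosen to suppress the 50 Hz mains hum the text mentions; only the 10 kHz sampling rate comes from the text.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 10_000  # sampling rate stated in the text (10 kHz)

def bandpass(signal, low_hz=80.0, high_hz=4000.0, fs=FS, order=4):
    """Band-pass filter a raw vibration signal.

    The 80-4000 Hz pass band is an illustrative assumption: it sits above
    the 50 Hz mains interference mentioned in the text while keeping the
    vocal-fold vibration range.
    """
    nyq = fs / 2.0
    b, a = butter(order, [low_hz / nyq, high_hz / nyq], btype="band")
    return filtfilt(b, a, signal)  # zero-phase filtering

# Example: remove 50 Hz hum from a synthetic 200 Hz "vocal" tone.
t = np.arange(FS) / FS
raw = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 50 * t)
clean = bandpass(raw)
```

In the actual device this suppression is done in analog hardware before the ADC; the digital version above only illustrates the frequency behaviour.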
And S3, transmitting the collected ear canal sound signals and the amplified and filtered vibration signals to the mobile terminal in real time through the communication unit.
S4, the mobile terminal processes the received sound data and vibration signal data to obtain the vocal cord closure state judgment and a vocal practice suggestion, which are displayed in the APP.
In step S4, the intelligent terminal analyzes the received data to recognize the vocal cord closure state in real time. For example, the recognition result is divided into five levels: over-closure, normal closure, slight under-closure, moderate under-closure and severe under-closure; finally the result is fed back to the user in visual form.
Referring to fig. 4, step S4 specifically includes the following steps:
S41, the raw ear canal sound data and skin vibration signal data are framed and divided into a number of windows for processing.
The amplified and filtered skin vibration signal and the ear canal sound signal collected by the microphone are received first, and the signal is divided into windows to process the data of each window.
S42, the sound data is examined with a voice activity detection (VAD) algorithm, and the data frames corresponding to the user's utterances are extracted.
For example, window data is read in chronological order, and the start and end time points of a user utterance are detected on the data stream with the voice activity detection algorithm. When a start point is detected, the data stream after that point is passed to the next step; when an end point is detected, the transfer stops, detection of the next start point resumes, and the process repeats.
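A minimal energy-threshold VAD in the spirit of the step above might look like this. The text does not specify which VAD algorithm is used; the frame length and energy threshold below are illustrative assumptions.

```python
import numpy as np

def detect_utterances(signal, fs=10_000, frame_ms=20, threshold=0.01):
    """Return (start, end) sample indices of voiced segments.

    A deliberately simple energy-threshold VAD; frame_ms and threshold
    are illustrative assumptions, not values from the text.
    """
    frame = int(fs * frame_ms / 1000)
    n = len(signal) // frame
    energy = np.array([np.mean(signal[i * frame:(i + 1) * frame] ** 2)
                       for i in range(n)])
    active = energy > threshold
    segments, start = [], None
    for i, a in enumerate(active):
        if a and start is None:
            start = i * frame                     # start point detected
        elif not a and start is not None:
            segments.append((start, i * frame))   # end point detected
            start = None
    if start is not None:                         # utterance runs to the end
        segments.append((start, n * frame))
    return segments

# Silence, a 0.5 s tone, then silence again -> one detected segment.
fs = 10_000
t = np.arange(int(0.5 * fs)) / fs
sig = np.concatenate([np.zeros(fs), np.sin(2 * np.pi * 200 * t), np.zeros(fs)])
print(detect_utterances(sig, fs))  # → [(10000, 15000)]
```

A production system would add hangover smoothing and an adaptive threshold; the sketch only shows the start/end-point logic.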
S43, the data frames are converted into time-frequency images through the short-time Fourier transform (STFT).
Specifically, the incoming data stream is buffered; after a certain period has accumulated, the buffered data is taken out, and the sound data and skin vibration signal data are converted into time-frequency graphs by the short-time Fourier transform.
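The STFT conversion can be sketched with `scipy.signal.stft`. The window length (`nperseg=256`) and the log-magnitude scaling are assumptions for illustration; the text does not state the STFT parameters.

```python
import numpy as np
from scipy.signal import stft

def to_spectrogram(signal, fs=10_000, nperseg=256):
    """Convert a 1-D signal into a log-magnitude time-frequency image.

    nperseg=256 is an assumed window length; the text does not specify one.
    """
    f, t, Z = stft(signal, fs=fs, nperseg=nperseg)
    return f, t, 20 * np.log10(np.abs(Z) + 1e-10)  # dB scale, avoid log(0)

# A pure 440 Hz tone should concentrate energy near the 440 Hz bin.
fs = 10_000
time = np.arange(fs) / fs
f, t, img = to_spectrogram(np.sin(2 * np.pi * 440 * time), fs)
peak_hz = f[np.argmax(img.mean(axis=1))]
```

The resulting `img` array (frequency bins by time frames) is what would be fed to the classifier as a time-frequency graph.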
S44, the time-frequency images are analyzed with a machine learning or deep learning model to recognize the vocal cord closure state.
A classifier for detecting the vocal cord closure state is built using data analysis methods such as machine learning or deep learning. Because different degrees of vocal cord closure change the sound and the skin vibration, the different closure states are reflected in the captured ear canal sound and near-vocal-cord skin vibration signals, namely: over-closure, normal closure, slight under-closure, moderate under-closure and severe under-closure.
In one embodiment, the classifier is trained in a supervised deep learning manner, i.e., on a labeled data set. Specifically, the data set contains samples and labels. For example, the ear canal sound and skin vibration signals of professional vocal students are collected in different vocal cord closure states, in sounding scenarios including but not limited to vocal fry, long vowels, short vowels, strong tones, weak tones, staccato notes, and sounds spanning the various male and female vocal registers; the data of each single tone serves as one sample of the data set. For labeling, an electroglottograph analyzes the closure contact state of the vocal cords during the current utterance while the sound is collected, and the opinion of a professional vocal teacher is incorporated to obtain the true closure state (over-closure, normal closure, slight under-closure, moderate under-closure, severe under-closure, and so on), which serves as the label of the current sample.
Further, to facilitate learning the correspondence between the skin vibration signal, the ear canal sound signal and the vocal cord closure state, the data set is preprocessed as follows: the sound and skin vibration signals of each sample are cut into fixed-length segments with a sliding window, and each fixed-length segment is converted into a time-frequency graph by the short-time Fourier transform; these fixed-length time-frequency graphs can be used directly to train the classifier. Using the time-frequency graph as the classifier input allows deep time-domain and frequency-domain features of the skin vibration and sound signals to be extracted, improves the processing efficiency of the classifier, and suits deployment on a mobile terminal with relatively limited computing and storage capacity.
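The sliding-window segmentation step can be sketched as follows. The segment length and hop size are illustrative assumptions; the text only says the samples are cut into fixed-length pieces before the STFT.

```python
import numpy as np

def sliding_segments(signal, seg_len, hop):
    """Cut a signal into fixed-length, possibly overlapping segments.

    seg_len and hop are illustrative parameters, not values from the text.
    """
    starts = range(0, len(signal) - seg_len + 1, hop)
    return np.stack([signal[s:s + seg_len] for s in starts])

# 1000 samples, 400-sample window, 200-sample hop -> starts at 0, 200, 400, 600.
segs = sliding_segments(np.arange(1000, dtype=float), seg_len=400, hop=200)
```

Each row of `segs` would then be passed through the STFT to produce one fixed-size time-frequency graph for training.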
In one embodiment, the classifier may employ a naive Bayes classifier, an SVC (support vector classifier), a neural network model, or the like. The invention is not limited in this regard.
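As a sketch of the SVC option named above, the following trains a five-class support vector classifier on toy stand-in data. The feature dimension, class names and synthetic data are assumptions for illustration only; in the real system each sample would be a flattened time-frequency graph labeled by electroglottograph analysis.

```python
import numpy as np
from sklearn.svm import SVC

# Toy stand-in for the real data set: each "sample" mimics a flattened
# time-frequency image, each label one of the five closure classes.
rng = np.random.default_rng(0)
n_per_class, feat = 20, 64
classes = ["over", "normal", "slight", "moderate", "severe"]  # assumed names
X = np.vstack([rng.normal(loc=i, scale=0.3, size=(n_per_class, feat))
               for i in range(len(classes))])
y = np.repeat(np.arange(len(classes)), n_per_class)

clf = SVC(kernel="rbf")   # the text names SVC as one candidate classifier
clf.fit(X, y)
acc = clf.score(X, y)     # training accuracy on this easily separable toy data
```

With real spectrogram data one would of course hold out a test set and tune the kernel parameters; the sketch only shows the train-then-classify flow that would run offline before embedding the model in the mobile terminal.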
It should be understood that the pre-training process of the classifier can also be performed in an off-line manner in the server or the cloud, and the trained classifier is embedded into the mobile terminal, so that real-time detection of the closed state of the vocal cords can be realized.
S45, the result is fed back to the user and a relevant report is generated.
For example, the obtained vocal cord closure state result is displayed in real time on the APP interface of the mobile terminal and stored until the user finishes the exercise. During practice the user can adjust the vocal method in time according to the feedback; after the exercise, the system summarizes the vocal cord closure state of the whole session and guides the user toward corresponding vocal exercises.
It should be noted that those skilled in the art may appropriately change or modify the above-described embodiments without departing from the spirit and scope of the present invention; for example, the microphone of an in-ear earphone may be used to collect the sound signal. As another example, the control unit may be implemented with an existing microcontroller or a specially designed one.
In summary, the present invention provides a vocal cord closure state detection system that combines the ear canal sound signal with the skin vibration signal. Both signals largely resist external noise, ensuring that they originate mainly from vocal cord vibration. The invention refines the classification of vocal cord closure for the first time into five or more levels (over-closure, normal closure, slight under-closure, moderate under-closure and severe under-closure), so that the user clearly understands the current closure state and can determine the direction and degree of adjustment to vocal cord tightness. When the user practices vocal music with this system, the real-time closure state is judged from the features of the sound and vibration signals, so incorrect vocal habits are corrected in time; after practice, the system analyzes the closure state of the whole session and guides the user toward corresponding vocal exercises. In addition, the invention extends the functions of the earphone and the intelligent terminal, reducing hardware cost compared with a completely redesigned special-purpose vocal cord vibration receptor.
The present invention may be a system, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied therewith for causing a processor to implement various aspects of the present invention.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present invention may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++, Python, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present invention are implemented by personalizing an electronic circuit, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), with state information of computer-readable program instructions, which can execute the computer-readable program instructions.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, by software, and by a combination of software and hardware are equivalent.
Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.

Claims (10)

1. A vocal music training auxiliary system based on an intelligent wearable device, comprising a wearable vocal cord vibration receptor and a smart device, wherein:
the wearable vocal cord vibration receptor comprises an audio signal collector, a vibration signal collector, an audio processing module, a control unit and a communication unit; the audio signal collector is used for collecting sound signals generated by the user's vocalization, the vibration signal collector is used for collecting skin vibration signals caused by vibration at the vocal cord position, and the audio processing module is used for processing the collected sound signals and skin vibration signals and, under the control of the control unit, transmitting the processed signals to the smart device through the communication unit;
and the smart device processes the received sound signals and skin vibration signals into time-frequency diagrams, inputs the time-frequency diagrams into a pre-trained machine learning classifier, and obtains a vocal cord closure state classification result.
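To make the claimed pipeline concrete, here is a minimal Python sketch of the two-channel processing in claim 1: both signals are turned into magnitude spectrograms and stacked as one classifier input. All names (`stft_mag`, `classify_closure`) and parameters are hypothetical assumptions of this sketch, and the placeholder classifier merely reports the input shape; a real classifier would be trained on labeled data as described in claim 7.

```python
import numpy as np

def stft_mag(x, win=256, hop=128):
    """Magnitude short-time Fourier transform (minimal illustrative STFT)."""
    frames = [x[i:i + win] * np.hanning(win)
              for i in range(0, len(x) - win + 1, hop)]
    return np.abs(np.fft.rfft(np.asarray(frames), axis=1)).T  # (freq, time)

def classify_closure(sound, vibration, classifier):
    """Stack the two channels into one time-frequency image and classify it."""
    spec = np.stack([stft_mag(sound), stft_mag(vibration)])   # (2, freq, time)
    return classifier(spec)

# Toy usage with simulated signals and a shape-reporting stand-in classifier.
fs = 8000
t = np.arange(fs) / fs
sound = np.sin(2 * np.pi * 220 * t)       # simulated ear-canal audio
vibration = np.sin(2 * np.pi * 110 * t)   # simulated skin vibration
label = classify_closure(sound, vibration, lambda s: s.shape)
print(label)  # → (2, 129, 61)
```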
2. The system of claim 1, wherein the audio signal collector is an in-ear microphone for collecting sound signals within the ear canal, and the vibration signal collector is a piezoceramic wafer.
3. The system of claim 2, wherein the wearable vocal cord vibration receptor is disposed within a hardware housing that can be mounted on a neck strap, with the piezoceramic wafer affixed at the user's vocal cord location.
4. The system according to claim 1, wherein the smart device obtains the time-frequency diagrams corresponding to the sound signals and the skin vibration signals according to the following steps:
receiving the amplified and filtered skin vibration signals and sound signals, and dividing the signals into a plurality of windows so that the data of each window can be processed;
reading the window data in time order and detecting the start and end time points of the user's vocalization in the data stream: when a start point is detected, buffering the data stream from that point onward until an end point is detected, then stopping the buffering, resuming detection of the next start point, and repeating this process;
when the buffered data stream accumulates to a set duration, taking out the buffered data and converting the sound signal data and the skin vibration signal data into time-frequency diagrams through short-time Fourier transform, the time-frequency diagrams reflecting the time-domain and frequency-domain characteristics of the user's vocalization;
and inputting the obtained time-frequency diagrams into the pre-trained machine learning classifier and outputting a vocal cord closure state classification result, the possible results being excessive closure, normal closure, slight closure, moderate closure, and severe closure of the vocal cords.
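As a non-authoritative illustration of the segmentation step in claim 4, start/end point detection can be sketched with a short-time energy threshold; the window length, threshold value, and function name below are assumptions of this sketch, since the patent does not specify a particular detector.

```python
import numpy as np

def detect_utterances(stream, fs, win_s=0.02, thresh=0.01):
    """Energy-based start/end point detection over non-overlapping windows.

    Returns (start, end) sample indices for each detected utterance.
    (Illustrative sketch; the patent does not specify the detector.)
    """
    win = int(fs * win_s)
    n = len(stream) // win
    energy = np.array([np.mean(stream[i*win:(i+1)*win] ** 2) for i in range(n)])
    active = energy > thresh
    segments, start = [], None
    for i, a in enumerate(active):
        if a and start is None:
            start = i * win                    # start time point detected
        elif not a and start is not None:
            segments.append((start, i * win))  # end time point: stop buffering
            start = None
    if start is not None:                      # stream ended mid-utterance
        segments.append((start, n * win))
    return segments

# Toy stream: 1 s of silence, a 0.5 s tone, 1 s of silence.
fs = 8000
tone = np.sin(2 * np.pi * 200 * np.arange(int(0.5 * fs)) / fs)
stream = np.concatenate([np.zeros(fs), tone, np.zeros(fs)])
print(detect_utterances(stream, fs))  # → [(8000, 12000)]
```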
5. The system of claim 3, wherein a side of the hardware housing is provided with a Type-C charging port and a power button.
6. The system of claim 1, wherein the smart device comprises a smart terminal or a wearable device.
7. The system of claim 1, wherein the data set for pre-training the machine learning classifier is constructed as follows:
collecting, from students majoring in vocal performance, sound signals in the ear canal and skin vibration signals at the vocal cord position under different vocal cord closure states, the vocalization scenarios including bubble sounds (vocal fry), long vowels, short vowels, crescendo sounds, decrescendo sounds and staccato sounds, covering all male and female vocal registers, with the data of each single sound serving as one sample of the data set;
during sound collection, an electroglottograph is used to analyze the vocal cord contact state during vocalization, and the opinions of professional vocal music teachers are incorporated to obtain the true degree of vocal cord closure, which serves as the label of the sample.
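A data-set sample built this way can be sketched as a simple record pairing the two signals with an EGG-derived label; the field names and scenario strings below are illustrative assumptions, while the five closure labels follow the classes enumerated in claim 4.

```python
from dataclasses import dataclass

# Closure-state labels as enumerated in claim 4 of the patent.
LABELS = ["excessive", "normal", "slight", "moderate", "severe"]

@dataclass
class Sample:
    """One single-sound recording paired with its EGG-derived label.

    Field names are illustrative; the patent does not define a schema.
    """
    sound: list          # ear-canal sound signal
    vibration: list      # skin vibration signal at the vocal cord position
    scenario: str        # e.g. "bubble", "long_vowel", "staccato"
    label: str           # closure state agreed by EGG analysis + teacher review

def make_sample(sound, vibration, scenario, label):
    # Reject labels outside the five closure states of claim 4.
    if label not in LABELS:
        raise ValueError(f"unknown closure label: {label}")
    return Sample(sound, vibration, scenario, label)

s = make_sample([0.0] * 8, [0.0] * 8, "long_vowel", "normal")
print(s.label)  # → normal
```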
8. The system of claim 1, wherein the smart device further displays the obtained vocal cord closure state classification result in real time on an APP interface, and stores the vocal cord closure states recorded during vocal exercises to guide the user in performing corresponding vocal production exercises.
9. A vocal music training auxiliary method based on an intelligent wearable device, comprising the following steps:
collecting sound signals generated by the user's vocalization with an audio signal collector;
collecting skin vibration signals caused by vibration at the user's vocal cord position with a vibration signal collector, and amplifying and filtering them;
and processing the sound signals and the skin vibration signals into time-frequency diagrams, inputting the time-frequency diagrams into a pre-trained machine learning classifier, and outputting a vocal cord closure state classification result.
10. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method of claim 9.
CN202111196512.6A 2021-10-14 2021-10-14 Vocal music training auxiliary system based on intelligent wearable equipment Pending CN114120758A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111196512.6A CN114120758A (en) 2021-10-14 2021-10-14 Vocal music training auxiliary system based on intelligent wearable equipment


Publications (1)

Publication Number Publication Date
CN114120758A true CN114120758A (en) 2022-03-01

Family

ID=80375845

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111196512.6A Pending CN114120758A (en) 2021-10-14 2021-10-14 Vocal music training auxiliary system based on intelligent wearable equipment

Country Status (1)

Country Link
CN (1) CN114120758A (en)


Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007068847A (en) * 2005-09-08 2007-03-22 Advanced Telecommunication Research Institute International Glottal closure region detecting apparatus and method
KR101144948B1 (en) * 2011-02-25 2012-05-11 (주)피지오랩 A tune correction device using an EGG
CN103584859A (en) * 2012-08-13 2014-02-19 泰亿格电子(上海)有限公司 Electroglottography device
US20140093093A1 (en) * 2012-09-28 2014-04-03 Apple Inc. System and method of detecting a user's voice activity using an accelerometer
CN103169505A (en) * 2013-03-19 2013-06-26 何宗彦 Method and device for Doppler ultrasound pickup analysis processing
WO2014146552A1 (en) * 2013-03-19 2014-09-25 北京银河之舟环保科技有限公司 Doppler ultrasonic pickup analyzing and processing method and device
US20150305920A1 (en) * 2014-04-29 2015-10-29 Meditab Software Inc. Methods and system to reduce stuttering using vibration detection
CN108682427A (en) * 2018-05-23 2018-10-19 北京航空航天大学 A kind of portable electric glottis graphic language sound harvester for far field human-computer interaction
CN110557696A (en) * 2019-08-13 2019-12-10 力当高(上海)智能科技有限公司 Be used for interactive earphone of neck wearing pronunciation
KR102225288B1 (en) * 2019-09-09 2021-03-10 공효원 Method for providing bigdata based vocalization guidance service using comparative analysis of vocal cord vibration pattern

Similar Documents

Publication Publication Date Title
CN100583909C (en) Apparatus for multi-sensory speech enhancement on a mobile device
Lu et al. Speakersense: Energy efficient unobtrusive speaker identification on mobile phones
CN101887728B (en) Method for multi-sensory speech enhancement
US20220319538A1 (en) Voice interactive wakeup electronic device and method based on microphone signal, and medium
CN107799126A (en) Sound end detecting method and device based on Supervised machine learning
CN109346075A (en) Identify user speech with the method and system of controlling electronic devices by human body vibration
CN108346425A (en) A kind of method and apparatus of voice activity detection, the method and apparatus of speech recognition
US10339960B2 (en) Personal device for hearing degradation monitoring
US20220084543A1 (en) Cognitive Assistant for Real-Time Emotion Detection from Human Speech
US20220238118A1 (en) Apparatus for processing an audio signal for the generation of a multimedia file with speech transcription
US20220303688A1 (en) Activity Detection On Devices With Multi-Modal Sensing
JP2004199053A (en) Method for processing speech signal by using absolute loudness
CN109994129B (en) Speech processing system, method and device
Dupont et al. Combined use of close-talk and throat microphones for improved speech recognition under non-stationary background noise
Mishra et al. Optimization of stammering in speech recognition applications
Bouserhal et al. Classification of nonverbal human produced audio events: a pilot study
JP2023502697A (en) Methods and systems for monitoring and analyzing cough
CN114120758A (en) Vocal music training auxiliary system based on intelligent wearable equipment
CN116312561A (en) Method, system and device for voice print recognition, authentication, noise reduction and voice enhancement of personnel in power dispatching system
Kurcan Isolated word recognition from in-ear microphone data using hidden markov models (HMM)
Dov et al. Voice activity detection in presence of transients using the scattering transform
McLoughlin The use of low-frequency ultrasound for voice activity detection
Guan et al. FaceInput: a hand-free and secure text entry system through facial vibration
Singh et al. Automatic articulation error detection tool for Punjabi language with aid for hearing impaired people
CN111768800B (en) Voice signal processing method, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220301