WO2023206788A1 - Hearing protection method and apparatus, terminal device, and storage medium - Google Patents

Hearing protection method and apparatus, terminal device, and storage medium

Info

Publication number
WO2023206788A1
WO2023206788A1 (PCT/CN2022/102134)
Authority
WO
WIPO (PCT)
Prior art keywords
age interval
volume control
age
control mode
interval
Prior art date
Application number
PCT/CN2022/102134
Other languages
English (en)
French (fr)
Inventor
杨洁 (Yang Jie)
刘际滨 (Liu Jibin)
王奉宝 (Wang Fengbao)
Original Assignee
歌尔股份有限公司 (Goertek Inc.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 歌尔股份有限公司 (Goertek Inc.)
Publication of WO2023206788A1 publication Critical patent/WO2023206788A1/zh

Classifications

    • H ELECTRICITY
      • H04 ELECTRIC COMMUNICATION TECHNIQUE
        • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
          • H04R1/00 Details of transducers, loudspeakers or microphones
            • H04R1/10 Earpieces; Attachments therefor; Earphones; Monophonic headphones
              • H04R1/1041 Mechanical or electronic switches, or control elements
            • H04R1/20 Arrangements for obtaining desired frequency or directional characteristics
              • H04R1/22 Arrangements for obtaining desired frequency characteristic only
          • H04R2430/00 Signal processing covered by H04R, not provided for in its groups
            • H04R2430/01 Aspects of volume control, not necessarily automatic, in sound systems
    • G PHYSICS
      • G10 MUSICAL INSTRUMENTS; ACOUSTICS
        • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
          • G10L17/00 Speaker identification or verification techniques
            • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
            • G10L17/04 Training, enrolment or model building
            • G10L17/18 Artificial neural networks; Connectionist approaches
            • G10L17/26 Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
          • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
            • G10L25/27 Speech or voice analysis techniques characterised by the analysis technique
              • G10L25/30 Speech or voice analysis techniques using neural networks
            • G10L25/48 Speech or voice analysis techniques specially adapted for particular use
              • G10L25/51 Speech or voice analysis techniques for comparison or discrimination

Definitions

  • the present application relates to the field of earphones, and in particular, to a hearing protection method, device, terminal equipment and computer-readable storage medium.
  • current headphone devices are aimed mainly at young listeners who favor strong bass, so emphasis is placed on improving the low-frequency performance of the headset.
  • as a result, the low-frequency part of the sound signal output by the headphone device can resonate with human organs, so the headphone device may cause greater damage to the user's hearing during use.
  • embodiments of the present application aim to set different volume control modes for users of different age ranges, so as to fully protect the hearing of earphone device users.
  • Embodiments of the present application provide a hearing protection method, which is applied to earphone devices.
  • the method includes the following steps:
  • when the first voice signal is collected in real time, extract first voice feature data from the first voice signal, wherein the first voice signal is generated by the wearer of the headset device;
  • the first age interval includes an adult age interval
  • the step of determining the volume control mode corresponding to the first age interval data includes:
  • the volume control mode corresponding to the first age interval is determined to be the first volume control mode
  • the step of operating the volume control mode to protect the wearer's hearing includes:
  • Run the first volume control mode to output a sound signal according to a first frequency value and a first volume value; wherein the first frequency value is within a preset frequency standard interval, and the first volume value is within a preset volume standard interval.
  • the first age interval also includes an age interval for minors
  • the step of determining the volume control mode corresponding to the first age interval data also includes:
  • the volume control mode corresponding to the first age interval is determined to be the second volume control mode
  • the step of operating the volume control mode to protect the wearer's hearing also includes:
  • the first age interval also includes an age interval for the elderly
  • the step of determining the volume control mode corresponding to the first age interval data also includes:
  • the volume control mode corresponding to the first age interval data is determined to be the third volume control mode
  • the step of operating the volume control mode to protect the wearer's hearing also includes:
  • the method also includes:
  • Obtain a preset voice database, wherein the voice database is constructed based on the second voice feature data of each second voice signal and each second age interval collected in advance;
  • a training set is constructed in the speech database, and neural network training is performed through the training set to obtain the classification model.
  • the training set includes: each second voice feature data, and the second age interval corresponding to each second voice feature data.
  • the second age interval includes: the adult age interval, the minor age interval and the elderly age interval; the step of performing neural network training through the training set to obtain the classification model includes:
  • Each second voice feature data is used as the input of the preset initial neural network model, and the adult age interval, the minor age interval or the elderly age interval corresponding to each second voice feature data is used as the output of the preset initial neural network model, so that the initial neural network model is trained into the classification model based on the mapping relationship between each second voice feature data and its corresponding age interval.
  • step of performing neural network training through the training set to obtain the classification model also includes:
  • after the initial neural network model is trained to obtain the model to be confirmed, input the standard speech feature data in the verification set into the model to be confirmed;
  • this application also provides a hearing protection device, which includes:
  • Collection and extraction module used to extract first voice feature data from the first voice signal when the first voice signal is collected in real time, wherein the first voice signal is generated by the wearer of the headset device;
  • Classification determination module used to call a preset classification model to determine the first age range to which the wearer belongs based on the first voice feature data;
  • Mode operation module used to determine the volume control mode corresponding to the first age interval, and run the volume control mode to protect the wearer's hearing.
  • the present application also provides a terminal device, which includes: a memory, a processor, and a hearing protection program stored on the memory and executable on the processor.
  • a terminal device which includes: a memory, a processor, and a hearing protection program stored on the memory and executable on the processor.
  • the present application also provides a computer-readable storage medium.
  • a hearing protection program is stored on the computer-readable storage medium.
  • when the hearing protection program is executed by a processor, the hearing protection method described above is implemented.
  • the hearing protection method provided by the embodiments of the present application includes: when the first voice signal is collected in real time, extracting the first voice feature data from the first voice signal, wherein the first voice signal is generated by the wearer of the headphone device; calling the preset classification model to determine the first age interval to which the wearer belongs based on the first voice feature data; determining the volume control mode corresponding to the first age interval; and running the volume control mode to protect the wearer's hearing.
  • the terminal device collects the voice signal emitted by the wearer in real time and extracts the voice feature data corresponding to the voice signal. The terminal device then inputs the voice feature data into the preset classification model to determine the age interval corresponding to the voice signal. Finally, the terminal device determines the preset volume control mode corresponding to that age interval and runs the volume control mode to protect the wearer's hearing.
  • this application obtains the user's voice signal and determines the user's age range from it, then matches the corresponding volume control mode to that age range, so that different volume control modes are set for users of different age ranges and the hearing of headphone device users is fully protected.
  • Figure 1 is a schematic structural diagram of a terminal device of the hardware operating environment involved in the embodiment of the present application
  • Figure 2 is a schematic flow chart of an embodiment of the hearing protection method of the present application.
  • Figure 3 is a schematic diagram of the application flow involved in one embodiment of the hearing protection method of the present application.
  • Figure 4 is a schematic diagram of the classification model training process involved in one embodiment of the hearing protection method of the present application.
  • FIG. 5 is a schematic diagram of functional modules involved in an embodiment of the hearing protection method of the present application.
  • FIG. 1 is a schematic structural diagram of a terminal device of the hardware operating environment involved in the embodiment of the present application.
  • the terminal device involved in the embodiment of the present application may be a headset device.
  • the terminal device may also be a mobile terminal device such as a mobile phone, a tablet or a PC (Personal Computer), or a fixed terminal device.
  • the terminal device may include: a processor 1001, such as a central processing unit (Central Processing Unit, CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005.
  • the communication bus 1002 is used to realize connection communication between these components.
  • the user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard).
  • optionally, the user interface 1003 may also include a standard wired interface and a wireless interface.
  • the network interface 1004 may optionally include a standard wired interface or a wireless interface (such as a Wireless Fidelity (Wi-Fi) interface).
  • the memory 1005 can be a high-speed random access memory (RAM) or a stable non-volatile memory (NVM), such as a disk memory.
  • the memory 1005 may optionally be a storage device independent of the aforementioned processor 1001.
  • the structure shown in Figure 1 does not constitute a limitation on the terminal device; it may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
  • the network interface 1004 is mainly used for data communication with other devices; the user interface 1003 is mainly used for data interaction with the user; the processor 1001 and the memory 1005 of this application can be set in the terminal device, and the terminal device calls the hearing protection program stored in the memory 1005 through the processor 1001 and executes the hearing protection method provided by the embodiments of the present application.
  • FIG. 2 is a schematic flow chart of a first embodiment of a hearing protection method according to the present application.
  • the hearing protection method of the present application includes the following steps:
  • Step S10 When the first voice signal is collected in real time, extract the first voice feature data from the first voice signal, wherein the first voice signal is generated by the wearer of the headset device;
  • when the terminal device is running, the first voice signal generated by the user of the terminal device is collected in real time through the built-in collection device, and the first voice feature data in the first voice signal is extracted through the built-in classification device of the terminal device.
  • the chip device configured in the headset device calls the micro microphone built into the headset device to collect the first voice signal generated in real time by the wearer of the headset device, and extracts the first voice feature data in the first voice signal through the classifier built into the headset device.
  • the above-mentioned speech feature data includes data such as the power spectrum, Mel cepstral coefficients and gammatone filter coefficients.
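The feature types listed above can be illustrated in code. The following is a minimal NumPy sketch of computing a power spectrum and Mel cepstral coefficients for one audio frame; the frame size, filter count and sample rate are illustrative choices, not values taken from this publication.

```python
import numpy as np

def power_spectrum(frame, n_fft=512):
    """Power spectrum of a single Hamming-windowed frame."""
    windowed = frame * np.hamming(len(frame))
    spectrum = np.fft.rfft(windowed, n_fft)
    return (np.abs(spectrum) ** 2) / n_fft

def mel_filterbank(n_filters=26, n_fft=512, sr=16000):
    """Triangular filters spaced evenly on the mel scale."""
    def hz_to_mel(hz):
        return 2595.0 * np.log10(1.0 + hz / 700.0)
    def mel_to_hz(mel):
        return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for j in range(left, center):
            fbank[i - 1, j] = (j - left) / max(center - left, 1)
        for j in range(center, right):
            fbank[i - 1, j] = (right - j) / max(right - center, 1)
    return fbank

def mfcc(frame, n_coeffs=13, n_fft=512, sr=16000):
    """Mel cepstral coefficients of one frame: DCT-II of the log mel energies."""
    pspec = power_spectrum(frame, n_fft)
    energies = mel_filterbank(n_fft=n_fft, sr=sr) @ pspec
    log_e = np.log(energies + 1e-10)
    n = len(log_e)
    # DCT-II basis computed directly to avoid a SciPy dependency
    basis = np.cos(np.pi / n * (np.arange(n) + 0.5)[None, :] * np.arange(n_coeffs)[:, None])
    return basis @ log_e

# Example: features from a 32 ms frame of a synthetic 200 Hz tone
sr = 16000
t = np.arange(int(0.032 * sr)) / sr
frame = np.sin(2 * np.pi * 200 * t)
features = mfcc(frame)
print(features.shape)   # (13,)
```

Gammatone filter coefficients would follow the same pattern, with the triangular mel filters replaced by a gammatone filterbank.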
  • Step S20 Call a preset classification model to determine the first age range to which the wearer belongs based on the first voice feature data;
  • the terminal device calls the above classification device and inputs the obtained first voice feature data into the classification model in the classification device to determine the first age range to which the user of the terminal device belongs.
  • the headset device calls a built-in classifier and inputs the above-mentioned first voice feature data obtained by the headset device into a preset classification model in the classifier, to obtain the first age interval corresponding to the first voice feature data, which is determined as the first age interval to which the wearer of the headphone device belongs.
  • Step S30 Determine the volume control mode corresponding to the first age range, and run the volume control mode to protect the wearer's hearing.
  • the terminal device determines, according to the first age range to which the user of the terminal device belongs, the volume control mode corresponding to that first age range among the volume control modes preset by the user in the terminal device, and controls the frequency value and volume value in the sound signal output by the terminal device according to the volume control mode, thereby protecting the hearing of the user of the terminal device.
  • the headphone device determines, according to the first age range to which the wearer belongs, the volume control mode corresponding to that first age range among the volume control modes stored by the user in the memory of the headphone device. The integrated system in the headphone device then runs this volume control mode and controls the frequency value and volume value in the sound signal output by the headphone device according to it, thereby protecting the hearing of the wearer of the headphone device.
  • the preset volume control modes include a first volume control mode, a second volume control mode and a third volume control mode; these modes are stored locally by the terminal device in advance, for the terminal device to choose from when outputting sound signals. It should be understood that, based on the design needs of practical applications, in different feasible implementations the terminal device can of course also obtain other volume control modes not listed in this embodiment, either locally or by downloading from the cloud.
  • the hearing protection method is not limited to a specific type of volume control mode.
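The selection in step S30, mapping a classified age interval to its preset volume control mode, amounts to a lookup table. A minimal sketch, with hypothetical mode parameters (the publication specifies no concrete frequency or volume values):

```python
from enum import Enum

class AgeInterval(Enum):
    ADULT = "adult"
    MINOR = "minor"
    ELDERLY = "elderly"

class VolumeControlMode:
    def __init__(self, name, frequency_hz, volume_db):
        self.name = name
        self.frequency_hz = frequency_hz
        self.volume_db = volume_db

# Hypothetical parameter values, for illustration only.
MODE_TABLE = {
    AgeInterval.ADULT:   VolumeControlMode("first",  frequency_hz=1000, volume_db=70),
    AgeInterval.MINOR:   VolumeControlMode("second", frequency_hz=1200, volume_db=60),
    AgeInterval.ELDERLY: VolumeControlMode("third",  frequency_hz=1500, volume_db=70),
}

def select_mode(age_interval: AgeInterval) -> VolumeControlMode:
    """Step S30: map the classified age interval to its preset volume control mode."""
    return MODE_TABLE[age_interval]

print(select_mode(AgeInterval.MINOR).name)  # second
```

A real device would populate the table from its stored configuration, or from modes downloaded from the cloud as the paragraph above notes.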
  • the first age interval includes an adult age interval
  • the step of "determining the volume control mode corresponding to the first age interval" in the above step S30 specifically includes:
  • Step S301 If it is determined that the first age interval is the adult age interval, determine that the volume control mode corresponding to the first age interval is the first volume control mode;
  • the terminal device receives the age interval configuration instruction triggered by the user in advance, and configures the above age range according to the preset content encapsulated in the instruction, namely that the first age interval includes the adult age interval.
  • after that, the terminal device determines that the volume control mode corresponding to the adult age range is the first volume control mode among the volume control modes stored by the user in the terminal device.
  • the step of "operating the volume control mode to protect the wearer's hearing" in the above step S30 specifically includes:
  • Step S302 Run the first volume control mode to output a sound signal according to a first frequency value and a first volume value; wherein the first frequency value is within a preset frequency standard interval, and the first volume value is within a preset volume standard interval;
  • the terminal device receives the volume control mode configuration instruction triggered by the user in advance, and configures the above-mentioned first volume control mode according to the content encapsulated in the instruction, namely that when the terminal device outputs a sound signal, the first frequency value and the first volume value in the sound signal are within the preset frequency standard interval and volume standard interval respectively.
  • after determining that the volume control mode corresponding to the adult age range is the first volume control mode among the volume control modes stored by the user in the terminal device, the integrated system in the terminal device controls the terminal device to output the sound signal with the first frequency value and the first volume value of the first volume control mode.
  • the first age interval includes the age interval of minors
  • the step of "determining the volume control mode corresponding to the first age interval" in step S30 also includes:
  • Step S303 If it is determined that the first age interval is the age interval of minors, determine that the volume control mode corresponding to the first age interval is the second volume control mode;
  • the terminal device receives the age interval configuration instruction triggered by the user in advance, and configures the above age range according to the preset content encapsulated in the instruction, namely that the first age interval includes the age interval of minors.
  • after that, the terminal device determines that the volume control mode corresponding to the age range of minors is the second volume control mode among the volume control modes stored by the user in the terminal device.
  • step S30 also includes:
  • Step S304 Run the second volume control mode to output a sound signal according to a second frequency value and a second volume value; wherein the second frequency value is higher than the lowest value of the frequency standard interval, and the second volume value is lower than the highest value of the volume standard interval;
  • the terminal device receives the volume control mode configuration instruction triggered by the user in advance, and configures the above-mentioned second volume control mode according to the content encapsulated in the instruction, namely that when the terminal device outputs a sound signal, the second frequency value in the sound signal is higher than the lowest value of the frequency standard interval and the second volume value is lower than the highest value of the volume standard interval.
  • after determining that the volume control mode corresponding to the age range of minors is the second volume control mode among the volume control modes stored by the user in the terminal device, the integrated system in the terminal device controls the device to output a sound signal using the second frequency value and the second volume value of the second volume control mode.
  • the first age interval also includes the age interval of the elderly
  • the step of "determining the volume control mode corresponding to the first age interval" in step S30 also includes:
  • Step S305 If it is determined that the first age interval is the age interval for the elderly, determine that the volume control mode corresponding to the first age interval data is the third volume control mode;
  • the terminal device receives the age interval configuration instruction triggered by the user in advance, and configures the above age range according to the preset content encapsulated in the instruction, namely that the first age interval includes the age interval for the elderly.
  • after that, the terminal device determines that the volume control mode corresponding to the age interval for the elderly is the third volume control mode among the volume control modes stored by the user in the terminal device.
  • step S30 also includes:
  • Step S306 Run the third volume control mode to output a sound signal according to a third frequency value and a third volume value; wherein the third frequency value is higher than the highest value of the human body's resonance frequency range, and the third volume value is within the volume standard interval;
  • the terminal device receives the volume control mode configuration instruction triggered by the user in advance, and configures the above-mentioned third volume control mode according to the content encapsulated in the instruction, namely that when the terminal device outputs a sound signal, the third frequency value in the sound signal is higher than the highest value of the human body resonance frequency range and the third volume value is within the above-mentioned volume standard interval.
  • after determining that the volume control mode corresponding to the age range of the elderly is the third volume control mode among the volume control modes stored by the user in the terminal device, the integrated system in the terminal device controls the device to output the sound signal using the third frequency value and the third volume value of the third volume control mode.
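Taken together, steps S302, S304 and S306 constrain the output frequency and volume differently in each mode. A sketch of those constraints, with hypothetical interval bounds (the publication does not give concrete numbers):

```python
# Hypothetical bounds; the publication leaves the concrete values to the implementation.
FREQ_STANDARD = (500.0, 4000.0)      # preset frequency standard interval, Hz
VOLUME_STANDARD = (40.0, 80.0)       # preset volume standard interval, dB
BODY_RESONANCE_MAX = 100.0           # upper end of the human-body resonance range, Hz

def clamp(value, lo, hi):
    return max(lo, min(hi, value))

def constrain_output(mode, freq, vol):
    """Apply the per-mode output constraints of steps S302/S304/S306."""
    f_lo, f_hi = FREQ_STANDARD
    v_lo, v_hi = VOLUME_STANDARD
    if mode == "first":    # adults: both values inside the standard intervals
        return clamp(freq, f_lo, f_hi), clamp(vol, v_lo, v_hi)
    if mode == "second":   # minors: frequency above the interval minimum, volume below its maximum
        return max(freq, f_lo), min(vol, v_hi)
    if mode == "third":    # elderly: frequency above the body resonance range, volume inside the standard interval
        return max(freq, BODY_RESONANCE_MAX), clamp(vol, v_lo, v_hi)
    raise ValueError(f"unknown mode: {mode}")

print(constrain_output("third", 60.0, 85.0))   # (100.0, 80.0)
```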
  • the hearing protection method of the present application also includes:
  • Step A Obtain a preset voice database, wherein the voice database is constructed based on the pre-collected second voice feature data of each second voice signal and each second age interval;
  • Step B Construct a training set in the speech database, and perform neural network training through the training set to obtain the classification model.
  • the terminal device obtains a speech database constructed by the user from the pre-collected second speech feature data of each second speech signal and each second age interval. After that, the terminal device constructs a training set in the database and uses it to train the initial neural network model in the neural network device built into the terminal device, so as to establish a classification model through which the second age interval corresponding to the second voice feature data can be obtained.
  • the training set includes each second voice feature data and the second age interval corresponding to each second voice feature data, and the second age interval includes the adult age interval, the minor age interval and the elderly age interval. The above step B of "performing neural network training through the training set to obtain the classification model" specifically includes:
  • Step B01 Use each second voice feature data as the input of the preset initial neural network model, and use the adult age interval, the minor age interval or the elderly age interval corresponding to each second voice feature data as the output of the preset initial neural network model, so as to train the initial neural network model into the classification model based on the mapping relationship between each second voice feature data and its corresponding age interval;
  • the neural network device configured in the terminal device sets each of the above-mentioned second voice feature data as the input of the initial neural network model preset in the neural network device, and sets the adult age interval, minor age interval or elderly age interval corresponding to each second voice feature data as the output of the initial neural network model. The terminal device then trains the initial neural network using the nonlinear mapping relationship between the second voice feature data and its corresponding age interval, thereby training the initial neural network model into the above classification model.
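The training setup described above (voice feature data as input, age interval as output) can be illustrated with a minimal softmax classifier on synthetic data. The publication does not disclose a network architecture, so this linear model is only a stand-in for whatever neural network an implementation would use:

```python
import numpy as np

rng = np.random.default_rng(0)
N_FEATURES, N_CLASSES = 13, 3   # e.g. 13 MFCCs; classes: adult / minor / elderly

# Synthetic stand-in for the training set: second voice feature data
# labelled with the corresponding second age interval.
X = rng.normal(size=(300, N_FEATURES))
y = rng.integers(0, N_CLASSES, size=300)
X[y == 1] += 1.5     # give each class a separable offset
X[y == 2] -= 1.5

W = np.zeros((N_FEATURES, N_CLASSES))
b = np.zeros(N_CLASSES)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Learn the feature-to-age-interval mapping by gradient descent on cross-entropy.
onehot = np.eye(N_CLASSES)[y]
for _ in range(500):
    probs = softmax(X @ W + b)
    grad = probs - onehot
    W -= 0.1 * (X.T @ grad) / len(X)
    b -= 0.1 * grad.mean(axis=0)

accuracy = (softmax(X @ W + b).argmax(axis=1) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```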
  • the above step B of "performing neural network training through the training set to obtain the classification model" also includes:
  • Step B02 Construct a verification set in the speech database
  • Step B03 After the initial neural network model is trained into the model to be confirmed based on the mapping relationship between each second speech feature data and the corresponding adult age interval, minor age interval or elderly age interval, input the standard speech feature data in the verification set into the model to be confirmed;
  • Step B04 Detect whether the age interval output by the model to be confirmed is consistent with the standard age interval corresponding to the standard voice feature data in the verification set;
  • Step B05 If yes, use the model to be confirmed as the classification model
  • Step B06 If not, continue to perform neural network training on the model to be confirmed based on the training set to obtain the classification model.
  • the terminal device constructs, in the above-mentioned speech database, a verification set that contains the standard speech feature data of each standard speech signal collected in advance by the user and the standard age interval corresponding to each standard speech feature data. Based on the mapping relationship between each second voice feature data and its corresponding adult age interval, minor age interval or elderly age interval, the terminal device trains the above initial neural network to obtain the model to be confirmed.
  • the terminal device then inputs the standard voice feature data of the verification set into the model to be confirmed, and uses the integrated system in the terminal device to detect whether the age interval output by the model to be confirmed is consistent with the standard age interval corresponding to the standard voice feature data in the verification set.
  • if it is, the system uses the model to be confirmed as the above-mentioned classification model and stores it in the built-in memory of the terminal device.
  • if it is not, the system continues to call the above-mentioned neural network device to perform neural network training on the model to be confirmed based on the above-mentioned training set, so as to obtain the above classification model.
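Steps B02 to B06 describe a verify-then-retrain loop. A schematic version, where `train_fn` and `predict_fn` are placeholders for whatever training machinery an implementation uses:

```python
def validate_and_maybe_retrain(model, train_fn, predict_fn, val_X, val_y, max_rounds=5):
    """Steps B02-B06: accept the candidate model only if it reproduces the
    standard age intervals of the verification set; otherwise keep training."""
    for _ in range(max_rounds):
        predictions = predict_fn(model, val_X)
        if all(p == t for p, t in zip(predictions, val_y)):
            return model                     # step B05: accept as the classification model
        model = train_fn(model)              # step B06: continue neural network training
    return model

# Toy usage: a "model" that memorises one more verification label per training round.
val_y = ["adult", "minor", "elderly"]
val_X = [0, 1, 2]
predict = lambda m, xs: [m.get(x, "adult") for x in xs]
train = lambda m: {**m, len(m): val_y[len(m)]}
final = validate_and_maybe_retrain({}, train, predict, val_X, val_y)
print(predict(final, val_X))   # ['adult', 'minor', 'elderly']
```

A production loop would also cap training rounds or track a validation metric, as the `max_rounds` guard hints.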
  • the terminal device collects the first voice signal generated by the user of the terminal device through the built-in collection device, and extracts the first voice feature data from the first voice signal through the built-in classification device. The terminal device then inputs the obtained first voice feature data into the above classification device to obtain the age interval corresponding to the first voice feature data, which is determined as the first age interval to which the user of the terminal device belongs.
  • the terminal device determines, according to the first age interval to which the user belongs, the volume control mode corresponding to that first age interval among the volume control modes preset by the user in the terminal device, and controls the frequency value and volume value in the sound signal output by the terminal device according to this volume control mode, thereby protecting the hearing of the user of the terminal device.
  • By acquiring the user's voice signal and determining the user's age interval from that signal, this application matches the corresponding volume control mode to the age interval, thereby setting different volume control modes for users in different age intervals and fully protecting the hearing of users of the headset device.
  • FIG. 5 is a schematic diagram of the functional modules of an embodiment of the hearing protection apparatus of this application. As shown in FIG. 5, the hearing protection apparatus of this application includes:
  • Collection and extraction module: used to extract first voice feature data from the first voice signal when the first voice signal is collected in real time, wherein the first voice signal is produced by the wearer of the headset device;
  • Classification determination module: used to invoke a preset classification model to determine, based on the first voice feature data, the first age interval to which the wearer belongs;
  • Mode running module: used to determine the volume control mode corresponding to the first age interval, and run the volume control mode to protect the wearer's hearing.
  • classification determination module includes:
  • the first volume control module determining unit is configured to determine that the volume control mode corresponding to the first age interval is the first volume control mode if it is determined that the first age interval is the adult age interval;
  • the mode running module includes:
  • the first volume control module operating unit is configured to operate the first volume control mode to output a sound signal according to a first frequency value and a first volume value; wherein the first frequency value is within a preset frequency standard interval, and the first volume value is within a preset volume standard interval.
  • classification determination module also includes:
  • the second volume control module determining unit is configured to determine that the volume control mode corresponding to the first age interval is the second volume control mode if it is determined that the first age interval is the minor age interval;
  • the mode operation module also includes:
  • the second volume control module operating unit is configured to operate the second volume control mode to output a sound signal according to a second frequency value and a second volume value; wherein the second frequency value is higher than the lowest value of the frequency standard interval, and the second volume value is lower than the highest value of the volume standard interval;
  • classification determination module also includes:
  • a third volume control module determination unit configured to determine that the volume control mode corresponding to the first age interval is the third volume control mode if it is determined that the first age interval is the elderly age interval;
  • the mode operation module also includes:
  • the third volume control module operating unit is used to operate the third volume control mode to output a sound signal according to a third frequency value and a third volume value; wherein the third frequency value is higher than the highest value of the human-body resonance frequency interval, and the third volume value is within the volume standard interval;
  • classification determination module also includes:
  • Speech database acquisition unit: used to acquire a preset speech database, wherein the speech database is constructed from the second voice feature data of second voice signals collected in advance and from second age intervals;
  • Training set construction unit: used to construct a training set from the speech database, and perform neural network training with the training set to obtain the classification model.
  • the training set includes: each second voice feature data, and the second age interval corresponding to each second voice feature data; the second age interval includes: the adult age interval, the minor age interval, and the elderly age interval;
  • Training set building units include:
  • Classification model training subunit: used to take each second voice feature data as the input of the preset initial neural network model, and take the adult age interval, the minor age interval, or the elderly age interval corresponding to each second voice feature data as the output of the preset initial neural network model, so as to train the initial neural network model into the classification model based on the mapping relationship between each second voice feature data and its corresponding adult age interval, minor age interval, or elderly age interval;
  • classification determination module also includes:
  • Verification set construction unit: used to construct a verification set in the speech database;
  • To-be-confirmed model building unit: configured to input the standard voice feature data of the verification set into the model to be confirmed after the initial neural network model has been trained into the model to be confirmed based on the mapping relationship between each second voice feature data and its corresponding adult age interval, minor age interval, or elderly age interval;
  • To-be-confirmed model verification unit: used to detect whether the age interval output by the model to be confirmed is consistent with the standard age interval corresponding to the standard voice feature data in the verification set;
  • Classification model confirmation unit: used to take the model to be confirmed as the classification model if it is detected that the age interval output by the model to be confirmed is consistent with the standard age interval corresponding to the standard voice feature data in the verification set;
  • Classification model update unit: used to continue performing neural network training on the model to be confirmed based on the training set to obtain the classification model, if it is detected that the age interval output by the model to be confirmed is inconsistent with the standard age interval corresponding to the standard voice feature data in the verification set.
  • This application also provides a terminal device on which a hearing protection program runnable on a processor is stored. When the terminal device executes the hearing protection program, the steps of the hearing protection method described in any of the above embodiments are implemented.
  • The present application also provides a computer-readable storage medium. A hearing protection program is stored on the computer-readable storage medium. When the hearing protection program is executed by a processor, the steps of the hearing protection method described in any of the above embodiments are implemented.
  • The methods of the above embodiments can be implemented by means of software plus the necessary general-purpose hardware platform; of course, they can also be implemented by hardware, but in many cases the former is the better implementation.
  • The technical solution of the present application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium as mentioned above (such as a ROM/RAM, magnetic disk, or optical disk), and includes several instructions to cause a terminal device (which can be a mobile phone, computer, server, network device, etc.) to execute the methods described in the various embodiments of this application.


Abstract

This application discloses a hearing protection method and apparatus, a terminal device, and a computer-readable storage medium. The method includes: first, when a first voice signal is collected in real time, extracting first voice feature data from the first voice signal, where the first voice signal is produced by the wearer of a headset device; then invoking a preset classification model to determine, based on the first voice feature data, the first age interval to which the wearer belongs; and finally, determining the volume control mode corresponding to the first age interval and running that volume control mode to protect the wearer's hearing. By acquiring the user's voice signal, determining the user's age interval from it, and matching the corresponding volume control mode to that age interval, this application sets different volume control modes for users in different age intervals, thereby fully protecting the hearing of users of the headset device.

Description

Hearing protection method and apparatus, terminal device, and storage medium
This application claims priority to Chinese patent application No. 202210462397.0, filed with the Chinese Patent Office on April 28, 2022 and entitled "Hearing protection method and apparatus, terminal device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of headsets, and in particular to a hearing protection method and apparatus, a terminal device, and a computer-readable storage medium.
Background
With the rapid development of the consumer electronics industry, headset devices hold an ever-growing market share, and their users now range from minors to the elderly, with young people having become the main user group of headset devices.
Current headset devices, however, cater mainly to young users' preference for powerful bass, and low-frequency performance is emphasized when defining headset performance. As a result, the low-frequency portion of the sound signal output by the headset resonates with human organs, so that the headset causes considerable damage to the user's hearing during use.
Summary
By providing a hearing protection method and apparatus, a terminal device, and a computer-readable storage medium, the embodiments of this application aim to set different volume control modes for users in different age intervals, so as to fully protect the hearing of headset users.
An embodiment of this application provides a hearing protection method applied to a headset device, the method including the following steps:
when a first voice signal is collected in real time, extracting first voice feature data from the first voice signal, wherein the first voice signal is produced by a wearer of the headset device;
invoking a preset classification model to determine, based on the first voice feature data, a first age interval to which the wearer belongs; and
determining a volume control mode corresponding to the first age interval, and running the volume control mode to protect the wearer's hearing.
Further, the first age interval includes an adult age interval, and the step of determining the volume control mode corresponding to the first age interval includes:
if the first age interval is determined to be the adult age interval, determining that the volume control mode corresponding to the first age interval is a first volume control mode;
and the step of running the volume control mode to protect the wearer's hearing includes:
running the first volume control mode to output a sound signal at a first frequency value and a first volume value, wherein the first frequency value is within a preset frequency standard interval and the first volume value is within a preset volume standard interval.
Further, the first age interval also includes a minor age interval, and the step of determining the volume control mode corresponding to the first age interval further includes:
if the first age interval is determined to be the minor age interval, determining that the volume control mode corresponding to the first age interval is a second volume control mode;
and the step of running the volume control mode to protect the wearer's hearing further includes:
running the second volume control mode to output a sound signal at a second frequency value and a second volume value, wherein the second frequency value is higher than the lowest value of the frequency standard interval and the second volume value is lower than the highest value of the volume standard interval.
Further, the first age interval also includes an elderly age interval, and the step of determining the volume control mode corresponding to the first age interval further includes:
if the first age interval is determined to be the elderly age interval, determining that the volume control mode corresponding to the first age interval is a third volume control mode;
and the step of running the volume control mode to protect the wearer's hearing further includes:
running the third volume control mode to output a sound signal at a third frequency value and a third volume value, wherein the third frequency value is higher than the highest value of the human-body resonance frequency interval and the third volume value is within the volume standard interval.
Further, the method also includes:
obtaining a preset speech database, wherein the speech database is constructed from second voice feature data of second voice signals collected in advance and from second age intervals; and
constructing a training set from the speech database, and performing neural network training with the training set to obtain the classification model.
Further, the training set includes the items of second voice feature data and the second age interval corresponding to each item, the second age intervals including an adult age interval, a minor age interval, and an elderly age interval, and the step of performing neural network training with the training set to obtain the classification model includes:
using each item of second voice feature data as the input of a preset initial neural network model, and using the adult age interval, minor age interval, or elderly age interval corresponding to that item as the output of the preset initial neural network model, so as to train the initial neural network model into the classification model based on the mapping relationship between each item of second voice feature data and its corresponding adult, minor, or elderly age interval.
Further, the step of performing neural network training with the training set to obtain the classification model also includes:
constructing a verification set from the speech database;
after the initial neural network model has been trained into a model to be confirmed based on the mapping relationships between the items of second voice feature data and their corresponding adult, minor, or elderly age intervals, inputting the standard voice feature data of the verification set into the model to be confirmed;
detecting whether the age interval output by the model to be confirmed is consistent with the standard age interval corresponding to the standard voice feature data in the verification set;
if so, taking the model to be confirmed as the classification model; and
if not, continuing to perform neural network training on the model to be confirmed based on the training set to obtain the classification model.
In addition, to achieve the above objective, this application also provides a hearing protection apparatus, the apparatus including:
a collection and extraction module, configured to extract first voice feature data from a first voice signal when the first voice signal is collected in real time, wherein the first voice signal is produced by a wearer of the headset device;
a classification determination module, configured to invoke a preset classification model to determine, based on the first voice feature data, a first age interval to which the wearer belongs; and
a mode running module, configured to determine a volume control mode corresponding to the first age interval and run the volume control mode to protect the wearer's hearing.
In addition, to achieve the above objective, this application also provides a terminal device, the terminal device including a memory, a processor, and a hearing protection program stored in the memory and runnable on the processor, wherein the hearing protection program, when executed by the processor, implements the steps of the hearing protection method described above.
In addition, to achieve the above objective, this application also provides a computer-readable storage medium on which a hearing protection program is stored, wherein the hearing protection program, when executed by a processor, implements the steps of the hearing protection method described above.
The hearing protection method provided by the embodiments of this application includes: when a first voice signal is collected in real time, extracting first voice feature data from the first voice signal, wherein the first voice signal is produced by a wearer of the headset device; invoking a preset classification model to determine, based on the first voice feature data, a first age interval to which the wearer belongs; and determining a volume control mode corresponding to the first age interval and running the volume control mode to protect the wearer's hearing.
In this application, while the user is using the headset device, the terminal device collects in real time the voice signal produced by the wearer and extracts the voice feature data corresponding to that signal; the terminal device then inputs the voice feature data into a classification model preset in the terminal device to determine the age interval corresponding to the voice signal; finally, the terminal device determines, according to that age interval, the corresponding volume control mode preset in the terminal device and runs that volume control mode to protect the wearer's hearing.
In this way, compared with the existing approach of emphasizing the low-frequency performance of headset devices, this application acquires the user's voice signal and determines the user's age interval from it, so as to match the corresponding volume control mode to that age interval, thereby setting different volume control modes for users in different age intervals and fully protecting the hearing of users of the headset device.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of the terminal device of the hardware operating environment involved in the embodiments of this application;
FIG. 2 is a schematic flowchart of an embodiment of the hearing protection method of this application;
FIG. 3 is a schematic diagram of an application flow involved in an embodiment of the hearing protection method of this application;
FIG. 4 is a schematic diagram of a classification model training flow involved in an embodiment of the hearing protection method of this application;
FIG. 5 is a schematic diagram of functional modules involved in an embodiment of the hearing protection method of this application.
The realization of the objectives, functional features, and advantages of this application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described here are intended only to explain this application and are not intended to limit it.
Referring to FIG. 1, FIG. 1 is a schematic structural diagram of the terminal device of the hardware operating environment involved in the embodiments of this application.
The terminal device involved in the embodiments of this application may specifically be a headset device; of course, it may also be a mobile or fixed terminal device such as a mobile phone, a tablet, or a PC (Personal Computer).
As shown in FIG. 1, the terminal device may include: a processor 1001, such as a central processing unit (CPU); a communication bus 1002; a user interface 1003; a network interface 1004; and a memory 1005. The communication bus 1002 is used to implement connection and communication among these components. The user interface 1003 may include a display and an input unit such as a keyboard, and optionally may also include standard wired and wireless interfaces. The network interface 1004 may optionally include standard wired and wireless interfaces (such as a Wireless Fidelity (WI-FI) interface). The memory 1005 may be a high-speed random access memory (RAM) or a stable non-volatile memory (NVM) such as a magnetic disk memory, and optionally may also be a storage device independent of the aforementioned processor 1001.
Those skilled in the art will understand that the structure shown in FIG. 1 does not limit the terminal device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
As shown in FIG. 1, the memory 1005, as a storage medium, may include an operating system, a data storage module, a network communication module, a user interface module, and a hearing protection program.
In the terminal device shown in FIG. 1, the network interface 1004 is mainly used for data communication with other devices, and the user interface 1003 is mainly used for data interaction with the user. The processor 1001 and the memory 1005 of the terminal device of this application may be disposed in the terminal device; the terminal device calls, through the processor 1001, the hearing protection program stored in the memory 1005 and executes the hearing protection method provided by the embodiments of this application.
Based on the above terminal device, the various embodiments of the hearing protection of this application are provided.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of a first embodiment of a hearing protection method of this application. In this embodiment, the hearing protection method of this application includes the following steps:
Step S10: when a first voice signal is collected in real time, extracting first voice feature data from the first voice signal, wherein the first voice signal is produced by a wearer of the headset device;
In this embodiment, during operation the terminal device collects in real time, through a built-in collection apparatus, the first voice signal produced by the user of the terminal device, and extracts the first voice feature data from the first voice signal through the terminal device's built-in classification apparatus.
Illustratively, as shown in FIG. 3, during operation the headset device calls, through a chip apparatus configured in the headset device, a built-in micro microphone to collect the first voice signal produced in real time by the wearer of the headset device, and extracts the first voice feature data from the first voice signal through the headset device's built-in classifier.
It should be noted that, in this embodiment, the above voice feature data includes data such as the power spectrum, mel-frequency cepstral coefficients, and gammatone filter coefficients.
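Of the voice features named above, the power spectrum is the simplest to make concrete. The sketch below computes it for one speech frame with a plain DFT in pure Python; the frame length, the test tone, and the function names are illustrative assumptions, not taken from this application (a real implementation would use an FFT library, and MFCC or gammatone features would be derived from this spectrum).

```python
import cmath
import math

def power_spectrum(frame):
    """Power spectrum |X[k]|^2 of one speech frame via a plain DFT.

    Only the non-negative frequency bins are kept, since the input is real.
    """
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        xk = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                 for t in range(n))
        spectrum.append(abs(xk) ** 2)
    return spectrum

# A test tone that completes exactly 4 cycles within the 64-sample frame,
# so its energy lands entirely in bin 4.
n = 64
frame = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
ps = power_spectrum(frame)
peak_bin = max(range(len(ps)), key=ps.__getitem__)
```

For a frame of this kind, `peak_bin` identifies the dominant frequency component, which is the sort of quantity downstream features build on.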
Step S20: invoking a preset classification model to determine, based on the first voice feature data, a first age interval to which the wearer belongs;
In this embodiment, the terminal device calls the above classification apparatus and inputs the obtained first voice feature data into the classification model in that classification apparatus, so as to determine the first age interval to which the user of the terminal device belongs.
Illustratively, referring to FIG. 3, the headset device calls its built-in classifier and inputs the first voice feature data obtained by the headset device into the classification model preset in the classifier, so as to obtain the age interval corresponding to the first voice feature data, and determines that age interval as the first age interval to which the wearer of the headset device belongs.
Step S30: determining a volume control mode corresponding to the first age interval, and running the volume control mode to protect the wearer's hearing.
In this embodiment, according to the first age interval to which the user of the terminal device belongs, the terminal device determines, among the volume control modes preset by the user in the terminal device, the volume control mode corresponding to the first age interval, and controls, according to that volume control mode, the frequency value and volume value of the sound signal output by the terminal device, thereby protecting the hearing of the user of the terminal device.
Illustratively, referring to FIG. 3, according to the first age interval to which the wearer of the headset device belongs, the headset device determines, among the volume control modes stored by the user in the headset device's memory, the volume control mode corresponding to the first age interval; the integrated system in the headset device then runs that volume control mode and controls, according to it, the frequency value and volume value of the sound signal output by the headset device, thereby protecting the wearer's hearing.
It should be noted that, in this embodiment, the preset volume control modes include a first volume control mode, a second volume control mode, and a third volume control mode, which are stored locally in advance by the terminal device for selection when outputting sound signals. It should be understood that, based on the design needs of practical applications, in different feasible implementations the terminal device may of course also obtain, locally or by downloading from the cloud, other volume control modes not enumerated in this embodiment; the hearing protection method of this application does not limit the specific kinds of volume control modes.
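As a concrete illustration of how step S30 might match a wearer's age interval to one of the preset modes, the following minimal sketch uses a stubbed-out classifier and invented mode parameters; the threshold values, feature names, and mode contents are assumptions for illustration only and do not come from this application.

```python
# Illustrative age intervals and mode table; the threshold rule standing in
# for the trained classification model is invented for this sketch.
ADULT, MINOR, ELDERLY = "adult", "minor", "elderly"

VOLUME_MODES = {
    ADULT:   {"name": "mode_1", "frequency_hz": 1000.0, "volume_db": 70.0},
    MINOR:   {"name": "mode_2", "frequency_hz": 1200.0, "volume_db": 60.0},
    ELDERLY: {"name": "mode_3", "frequency_hz": 2500.0, "volume_db": 70.0},
}

def classify_age_interval(features):
    """Stand-in for the classification model: a trivial threshold rule
    over two assumed voice features (fundamental pitch and jitter)."""
    if features["pitch_hz"] > 220.0:
        return MINOR
    return ELDERLY if features["jitter"] > 0.02 else ADULT

def select_volume_mode(age_interval):
    """Look up the preset volume control mode for an age interval."""
    return VOLUME_MODES[age_interval]

wearer = {"pitch_hz": 150.0, "jitter": 0.01}
mode = select_volume_mode(classify_age_interval(wearer))  # adult -> mode_1
```

In a real device the lookup table would be replaced by the user-configured modes stored in memory, and the classifier by the trained model of step S20.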
Further, in a feasible embodiment, the first age interval includes an adult age interval, and the step of "determining the volume control mode corresponding to the first age interval" in the above step S30 specifically includes:
Step S301: if the first age interval is determined to be the adult age interval, determining that the volume control mode corresponding to the first age interval is a first volume control mode;
It should be noted that, in this embodiment, the terminal device receives in advance an age interval configuration instruction triggered by the user, and configures the above age interval according to the content encapsulated in that instruction presetting the first age interval to include the adult age interval.
In this embodiment, when the integrated system in the terminal device finds by comparison that the first age interval to which the user of the terminal device belongs is the same as the above adult age interval, the terminal device determines that the volume control mode corresponding to the adult age interval is the first volume control mode among the volume control modes stored by the user in the terminal device.
Further, in a feasible embodiment, the step of "running the volume control mode to protect the wearer's hearing" in the above step S30 specifically includes:
Step S302: running the first volume control mode to output a sound signal at a first frequency value and a first volume value, wherein the first frequency value is within a preset frequency standard interval and the first volume value is within a preset volume standard interval;
It should be noted that, in this embodiment, the terminal device receives in advance a volume control mode configuration instruction triggered by the user, and configures the above first volume control mode according to the instruction content encapsulated in that instruction, namely that when the terminal device outputs a sound signal, the first frequency value and first volume value of the sound signal are within the preset frequency standard interval and volume standard interval.
In this embodiment, after determining that the volume control mode corresponding to the adult age interval is the first volume control mode among the volume control modes stored by the user in the terminal device, the integrated system in the terminal device controls the terminal device to output a sound signal at the first frequency value and first volume value of the first volume control mode.
Further, in a feasible embodiment, the first age interval includes a minor age interval, and the step of "determining the volume control mode corresponding to the first age interval" in the above step S30 further includes:
Step S303: if the first age interval is determined to be the minor age interval, determining that the volume control mode corresponding to the first age interval is a second volume control mode;
It should be noted that, in this embodiment, the terminal device receives in advance an age interval configuration instruction triggered by the user, and configures the above age interval according to the content encapsulated in that instruction presetting the first age interval to include the minor age interval.
In this embodiment, when the integrated system in the terminal device finds by comparison that the first age interval to which the user of the terminal device belongs is the same as the above minor age interval, the terminal device determines that the volume control mode corresponding to the minor age interval is the second volume control mode among the volume control modes stored by the user in the terminal device.
Further, in a feasible embodiment, the step of "running the volume control mode to protect the wearer's hearing" in the above step S30 further includes:
Step S304: running the second volume control mode to output a sound signal at a second frequency value and a second volume value, wherein the second frequency value is higher than the lowest value of the frequency standard interval and the second volume value is lower than the highest value of the volume standard interval;
It should be noted that, in this embodiment, the terminal device receives in advance a volume control mode configuration instruction triggered by the user, and configures the above second volume control mode according to the instruction content encapsulated in that instruction, namely that when the terminal device outputs a sound signal, the second frequency value of the sound signal is higher than the lowest value of the frequency standard interval and the second volume value is lower than the highest value of the volume standard interval.
In this embodiment, after determining that the volume control mode corresponding to the minor age interval is the second volume control mode among the volume control modes stored by the user in the terminal device, the integrated system in the terminal device controls the terminal device to output a sound signal at the second frequency value and second volume value of the second volume control mode.
Further, in a feasible embodiment, the first age interval includes an elderly age interval, and the step of "determining the volume control mode corresponding to the first age interval" in the above step S30 further includes:
Step S305: if the first age interval is determined to be the elderly age interval, determining that the volume control mode corresponding to the first age interval is a third volume control mode;
It should be noted that, in this embodiment, the terminal device receives in advance an age interval configuration instruction triggered by the user, and configures the above age interval according to the content encapsulated in that instruction presetting the first age interval to include the elderly age interval.
In this embodiment, when the integrated system in the terminal device finds by comparison that the first age interval to which the user of the terminal device belongs is the same as the above elderly age interval, the terminal device determines that the volume control mode corresponding to the elderly age interval is the third volume control mode among the volume control modes stored by the user in the terminal device.
Further, in a feasible embodiment, the step of "running the volume control mode to protect the wearer's hearing" in the above step S30 further includes:
Step S306: running the third volume control mode to output a sound signal at a third frequency value and a third volume value, wherein the third frequency value is higher than the highest value of the human-body resonance frequency interval and the third volume value is within the volume standard interval;
It should be noted that, in this embodiment, the terminal device receives in advance a volume control mode configuration instruction triggered by the user, and configures the above third volume control mode according to the instruction content encapsulated in that instruction, namely that when the terminal device outputs a sound signal, the third frequency value of the sound signal is higher than the highest value of the human-body resonance frequency interval and the third volume value is within the above volume standard interval.
In this embodiment, after determining that the volume control mode corresponding to the elderly age interval is the third volume control mode among the volume control modes stored by the user in the terminal device, the integrated system in the terminal device controls the terminal device to output a sound signal at the third frequency value and third volume value of the third volume control mode.
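The constraints that steps S302, S304, and S306 place on the output signal can be summarized as interval checks. In the sketch below the concrete interval endpoints are assumptions: this application states the rules but not numeric values, and the human-body resonance ceiling used here is a commonly cited figure rather than a number quoted from the text.

```python
# Assumed interval endpoints -- illustrative only, not from this application.
FREQ_STANDARD = (20.0, 20000.0)   # Hz: "frequency standard interval"
VOL_STANDARD = (40.0, 85.0)       # dB: "volume standard interval"
BODY_RESONANCE_MAX = 16.0         # Hz: top of human-body resonance range

def mode_allows(mode, freq_hz, vol_db):
    """Check an output (frequency, volume) pair against one mode's rules."""
    if mode == 1:   # adults: both values inside the standard intervals
        return (FREQ_STANDARD[0] <= freq_hz <= FREQ_STANDARD[1]
                and VOL_STANDARD[0] <= vol_db <= VOL_STANDARD[1])
    if mode == 2:   # minors: frequency above the interval floor,
                    # volume below the interval ceiling
        return freq_hz > FREQ_STANDARD[0] and vol_db < VOL_STANDARD[1]
    if mode == 3:   # elderly: frequency above the body-resonance ceiling,
                    # volume inside the standard interval
        return (freq_hz > BODY_RESONANCE_MAX
                and VOL_STANDARD[0] <= vol_db <= VOL_STANDARD[1])
    raise ValueError("unknown mode: %r" % mode)
```

A firmware implementation would apply such a check (or clamp the values) each time the output parameters change.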
Further, in a feasible embodiment, the hearing protection method of this application also includes:
Step A: obtaining a preset speech database, wherein the speech database is constructed from second voice feature data of second voice signals collected in advance and from second age intervals;
Step B: constructing a training set from the speech database, and performing neural network training with the training set to obtain the classification model.
Illustratively, referring to FIG. 4, the terminal device obtains a speech database constructed from the second voice feature data of second voice signals collected in advance by the user and from second age intervals; the terminal device then constructs a training set from the speech database and uses it to train the initial neural network model in the terminal device's built-in neural network apparatus, so as to establish a classification model that can obtain, from the above second voice feature data, the second age interval corresponding to those second voice features.
Further, in a feasible embodiment, the training set includes the items of second voice feature data and the second age interval corresponding to each item, the second age intervals including an adult age interval, a minor age interval, and an elderly age interval, and the above step B of "performing neural network training with the training set to obtain the classification model" specifically includes:
Step B01: using each item of second voice feature data as the input of a preset initial neural network model, and using the adult age interval, minor age interval, or elderly age interval corresponding to that item as the output of the preset initial neural network model, so as to train the initial neural network model into the classification model based on the mapping relationship between each item of second voice feature data and its corresponding adult, minor, or elderly age interval;
Illustratively, the neural network apparatus configured in the terminal device sets the above items of second voice feature data as the input of the preset initial neural network model in that apparatus, and sets the adult, minor, or elderly age interval corresponding to each item as the output of the initial neural network model; the terminal device then trains the initial neural network based on the nonlinear mapping relationship between the items of second voice feature data and their corresponding adult, minor, or elderly age intervals, thereby training the initial neural network model into the above classification model.
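A minimal stand-in for the training described in step B01 is multinomial logistic regression, i.e. a one-layer network with a softmax output, trained by gradient descent on feature-to-age-interval pairs. The three-dimensional toy features below are illustrative placeholders for real voice feature vectors; nothing about the network architecture is specified by this application.

```python
import math

# The class labels mirror the three second age intervals.
CLASSES = ["adult", "minor", "elderly"]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def train(samples, dim, epochs=200, lr=0.1):
    """Gradient descent on softmax cross-entropy for a single linear layer."""
    w = [[0.0] * dim for _ in CLASSES]
    b = [0.0] * len(CLASSES)
    for _ in range(epochs):
        for x, y in samples:
            p = softmax([sum(wc * xc for wc, xc in zip(w[c], x)) + b[c]
                         for c in range(len(CLASSES))])
            for c in range(len(CLASSES)):
                g = p[c] - (1.0 if c == y else 0.0)  # dL/dlogit_c
                b[c] -= lr * g
                for i in range(dim):
                    w[c][i] -= lr * g * x[i]
    return w, b

def predict(w, b, x):
    scores = [sum(wc * xc for wc, xc in zip(w[c], x)) + b[c]
              for c in range(len(CLASSES))]
    return CLASSES[scores.index(max(scores))]

# Toy feature vectors standing in for real second voice feature data.
data = [([1.0, 0.0, 0.0], 0), ([0.0, 1.0, 0.0], 1), ([0.0, 0.0, 1.0], 2)]
w, b = train(data, dim=3)
```

A production system would use a deeper network and far more data, but the input/output pairing (features in, age interval out) is the same.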
Further, in a feasible embodiment, the above step B of "performing neural network training with the training set to obtain the classification model" also includes:
Step B02: constructing a verification set from the speech database;
Step B03: after the initial neural network model has been trained into a model to be confirmed based on the mapping relationships between the items of second voice feature data and their corresponding adult, minor, or elderly age intervals, inputting the standard voice feature data of the verification set into the model to be confirmed;
Step B04: detecting whether the age interval output by the model to be confirmed is consistent with the standard age interval corresponding to the standard voice feature data in the verification set;
Step B05: if so, taking the model to be confirmed as the classification model;
Step B06: if not, continuing to perform neural network training on the model to be confirmed based on the training set to obtain the classification model.
Illustratively, the terminal device constructs, in the above speech database, a verification set containing the standard voice feature data of standard voice signals collected in advance by the user and the standard age interval corresponding to each item of standard voice feature data, and obtains the model to be confirmed by training the above initial neural network based on the mapping relationships between the items of second voice feature data and their corresponding adult, minor, or elderly age intervals. The terminal device then inputs the standard voice feature data of the verification set into the model to be confirmed and detects, through its integrated system, whether the age interval output by the model to be confirmed is the standard age interval corresponding to that standard voice feature data in the verification set. If it is, the system takes the model to be confirmed as the above classification model and stores it in the terminal device's built-in classification apparatus; if it is not, the system continues to call the above neural network apparatus to perform further neural network training on the model to be confirmed based on the above training set, so as to obtain the above classification model.
In this embodiment, first, during operation the terminal device collects in real time, through a built-in collection apparatus, the first voice signal produced by the user of the terminal device and extracts the first voice feature data from it through the terminal device's built-in classification apparatus; the terminal device then inputs the obtained first voice feature data into the classification apparatus to obtain the corresponding age interval and determines that age interval as the first age interval to which the user of the terminal device belongs; finally, according to that first age interval, the terminal device determines, among the volume control modes preset by the user in the terminal device, the volume control mode corresponding to the first age interval and controls, according to that mode, the frequency value and volume value of the sound signal output by the terminal device, thereby protecting the hearing of the user of the terminal device.
Compared with the existing approach of emphasizing the low-frequency performance of headset devices, this application acquires the user's voice signal and determines the user's age interval from it, so as to match the corresponding volume control mode to that age interval, thereby setting different volume control modes for users in different age intervals and fully protecting the hearing of users of the headset device.
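The accept-or-keep-training procedure of steps B02 through B06 can be sketched as a loop that alternates a training pass with a check against the verification set, accepting the candidate model only when its outputs match the standard age intervals. The nearest-centroid "model" below is a deliberately simple stand-in for the neural network, and the data values are invented for illustration.

```python
def train_step(model, train_set, lr=0.5):
    """One pass of centroid updates; `model` maps age interval -> centroid."""
    for features, label in train_set:
        centroid = model.setdefault(label, list(features))
        for i, x in enumerate(features):
            centroid[i] += lr * (x - centroid[i])

def predict(model, features):
    """Return the age interval whose centroid is closest to the features."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(model[label], features))
    return min(model, key=dist)

def fit_until_verified(train_set, verify_set, max_rounds=20):
    """Train, then accept the candidate only once every verification
    sample maps to its standard age interval (steps B04-B06)."""
    model = {}
    for _ in range(max_rounds):
        train_step(model, train_set)
        if all(predict(model, f) == lbl for f, lbl in verify_set):
            return model  # accepted as the classification model (step B05)
    return model          # this sketch simply gives up after max_rounds

# Toy (feature vector, standard age interval) pairs.
train_data = [((120.0, 0.9), "adult"), ((250.0, 0.4), "minor"),
              ((110.0, 2.1), "elderly")]
verify_data = [((125.0, 1.0), "adult"), ((240.0, 0.5), "minor")]
model = fit_until_verified(train_data, verify_data)
```

The structure of the loop, rather than the toy model inside it, is what corresponds to the verification procedure described above.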
Further, this application also provides a hearing protection apparatus. Referring to FIG. 5, FIG. 5 is a schematic diagram of the functional modules of an embodiment of the hearing protection apparatus of this application. As shown in FIG. 5, the hearing protection apparatus of this application includes:
a collection and extraction module: configured to extract first voice feature data from a first voice signal when the first voice signal is collected in real time, wherein the first voice signal is produced by a wearer of the headset device;
a classification determination module: configured to invoke a preset classification model to determine, based on the first voice feature data, a first age interval to which the wearer belongs;
a mode running module: configured to determine a volume control mode corresponding to the first age interval, and run the volume control mode to protect the wearer's hearing.
Further, the classification determination module includes:
a first volume control mode determining unit: configured to determine, if the first age interval is determined to be the adult age interval, that the volume control mode corresponding to the first age interval is a first volume control mode;
Further, the mode running module includes:
a first volume control mode running unit: configured to run the first volume control mode to output a sound signal at a first frequency value and a first volume value, wherein the first frequency value is within a preset frequency standard interval and the first volume value is within a preset volume standard interval.
Further, the classification determination module also includes:
a second volume control mode determining unit: configured to determine, if the first age interval is determined to be the minor age interval, that the volume control mode corresponding to the first age interval is a second volume control mode;
Further, the mode running module also includes:
a second volume control mode running unit: configured to run the second volume control mode to output a sound signal at a second frequency value and a second volume value, wherein the second frequency value is higher than the lowest value of the frequency standard interval and the second volume value is lower than the highest value of the volume standard interval;
Further, the classification determination module also includes:
a third volume control mode determining unit: configured to determine, if the first age interval is determined to be the elderly age interval, that the volume control mode corresponding to the first age interval is a third volume control mode;
Further, the mode running module also includes:
a third volume control mode running unit: configured to run the third volume control mode to output a sound signal at a third frequency value and a third volume value, wherein the third frequency value is higher than the highest value of the human-body resonance frequency interval and the third volume value is within the volume standard interval;
Further, the classification determination module also includes:
a speech database obtaining unit: configured to obtain a preset speech database, wherein the speech database is constructed from second voice feature data of second voice signals collected in advance and from second age intervals;
a training set construction unit: configured to construct a training set from the speech database and perform neural network training with the training set to obtain the classification model.
Further, the training set includes the items of second voice feature data and the second age interval corresponding to each item, the second age intervals including an adult age interval, a minor age interval, and an elderly age interval;
the training set construction unit includes:
a classification model training subunit: configured to use each item of second voice feature data as the input of a preset initial neural network model and the adult age interval, minor age interval, or elderly age interval corresponding to that item as the output of the preset initial neural network model, so as to train the initial neural network model into the classification model based on the mapping relationship between each item of second voice feature data and its corresponding adult, minor, or elderly age interval;
Further, the classification determination module also includes:
a verification set construction unit: configured to construct a verification set from the speech database;
a to-be-confirmed model construction unit: configured to input the standard voice feature data of the verification set into the model to be confirmed after the initial neural network model has been trained into the model to be confirmed based on the mapping relationships between the items of second voice feature data and their corresponding adult, minor, or elderly age intervals;
a to-be-confirmed model verification unit: configured to detect whether the age interval output by the model to be confirmed is consistent with the standard age interval corresponding to the standard voice feature data in the verification set;
a classification model confirmation unit: configured to take the model to be confirmed as the classification model if it is detected that the age interval output by the model to be confirmed is consistent with the standard age interval corresponding to the standard voice feature data in the verification set;
a classification model update unit: configured to continue performing neural network training on the model to be confirmed based on the training set to obtain the classification model if it is detected that the age interval output by the model to be confirmed is inconsistent with the standard age interval corresponding to the standard voice feature data in the verification set.
It should be noted that, as used herein, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or system that includes a list of elements includes not only those elements but also other elements not explicitly listed, or also includes elements inherent to such a process, method, article, or system. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or system that includes that element.
The above serial numbers of the embodiments of this application are for description only and do not represent the relative merits of the embodiments.
Through the description of the above implementations, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by means of software plus the necessary general-purpose hardware platform; of course, they can also be implemented by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part contributing to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as a ROM/RAM, magnetic disk, or optical disk) and includes several instructions for causing a terminal device (which may be a mobile phone, computer, server, network device, or the like) to execute the methods described in the various embodiments of this application.
The above are only preferred embodiments of this application and do not therefore limit the scope of this application's patent. Any equivalent structural or process transformation made using the contents of the specification and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included within the patent protection scope of this application.

Claims (10)

  1. A hearing protection method, wherein the hearing protection method is applied to a headset device and the method comprises the following steps:
    when a first voice signal is collected in real time, extracting first voice feature data from the first voice signal, wherein the first voice signal is produced by a wearer of the headset device;
    invoking a preset classification model to determine, based on the first voice feature data, a first age interval to which the wearer belongs; and
    determining a volume control mode corresponding to the first age interval, and running the volume control mode to protect the wearer's hearing.
  2. The hearing protection method according to claim 1, wherein the first age interval comprises an adult age interval, and the step of determining the volume control mode corresponding to the first age interval comprises:
    if the first age interval is determined to be the adult age interval, determining that the volume control mode corresponding to the first age interval is a first volume control mode;
    and the step of running the volume control mode to protect the wearer's hearing comprises:
    running the first volume control mode to output a sound signal at a first frequency value and a first volume value, wherein the first frequency value is within a preset frequency standard interval and the first volume value is within a preset volume standard interval.
  3. The hearing protection method according to claim 2, wherein the first age interval further comprises a minor age interval, and the step of determining the volume control mode corresponding to the first age interval further comprises:
    if the first age interval is determined to be the minor age interval, determining that the volume control mode corresponding to the first age interval is a second volume control mode;
    and the step of running the volume control mode to protect the wearer's hearing further comprises:
    running the second volume control mode to output a sound signal at a second frequency value and a second volume value, wherein the second frequency value is higher than the lowest value of the frequency standard interval and the second volume value is lower than the highest value of the volume standard interval.
  4. The hearing protection method according to claim 2, wherein the first age interval further comprises an elderly age interval, and the step of determining the volume control mode corresponding to the first age interval further comprises:
    if the first age interval is determined to be the elderly age interval, determining that the volume control mode corresponding to the first age interval is a third volume control mode;
    and the step of running the volume control mode to protect the wearer's hearing further comprises:
    running the third volume control mode to output a sound signal at a third frequency value and a third volume value, wherein the third frequency value is higher than the highest value of the human-body resonance frequency interval and the third volume value is within the volume standard interval.
  5. The hearing protection method according to claim 1, wherein the method further comprises:
    obtaining a preset speech database, wherein the speech database is constructed from second voice feature data of second voice signals collected in advance and from second age intervals; and
    constructing a training set from the speech database, and performing neural network training with the training set to obtain the classification model.
  6. The hearing protection method according to claim 5, wherein the training set comprises the items of second voice feature data and the second age interval corresponding to each item of second voice feature data, the second age intervals comprising an adult age interval, a minor age interval, and an elderly age interval, and the step of performing neural network training with the training set to obtain the classification model comprises:
    using each item of second voice feature data as the input of a preset initial neural network model, and using the adult age interval, minor age interval, or elderly age interval corresponding to that item as the output of the preset initial neural network model, so as to train the initial neural network model into the classification model based on the mapping relationship between each item of second voice feature data and its corresponding adult age interval, minor age interval, or elderly age interval.
  7. The hearing protection method according to claim 6, wherein the step of performing neural network training with the training set to obtain the classification model further comprises:
    constructing a verification set from the speech database;
    after the initial neural network model has been trained into a model to be confirmed based on the mapping relationships between the items of second voice feature data and their corresponding adult, minor, or elderly age intervals, inputting the standard voice feature data of the verification set into the model to be confirmed;
    detecting whether the age interval output by the model to be confirmed is consistent with the standard age interval corresponding to the standard voice feature data in the verification set;
    if so, taking the model to be confirmed as the classification model; and
    if not, continuing to perform neural network training on the model to be confirmed based on the training set to obtain the classification model.
  8. A hearing protection apparatus, wherein the apparatus comprises:
    a collection and extraction module: configured to extract first voice feature data from a first voice signal when the first voice signal is collected in real time, wherein the first voice signal is produced by a wearer of the headset device;
    a classification determination module: configured to invoke a preset classification model to determine, based on the first voice feature data, a first age interval to which the wearer belongs; and
    a mode running module: configured to determine a volume control mode corresponding to the first age interval, and run the volume control mode to protect the wearer's hearing.
  9. A terminal device, wherein the terminal device comprises a memory, a processor, and a computer program stored in the memory and runnable on the processor, and the processor, when executing the computer program, implements the steps of the hearing protection method according to any one of claims 1 to 7.
  10. A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the hearing protection method according to any one of claims 1 to 7.
PCT/CN2022/102134 2022-04-28 2022-06-29 Hearing protection method and apparatus, terminal device, and storage medium WO2023206788A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210462397.0A CN114900767B (zh) 2022-04-28 2022-04-28 Hearing protection method and apparatus, terminal device, and storage medium
CN202210462397.0 2022-04-28

Publications (1)

Publication Number Publication Date
WO2023206788A1 true WO2023206788A1 (zh) 2023-11-02

Family

ID=82720537

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/102134 WO2023206788A1 (zh) 2022-04-28 2022-06-29 听力的保护方法、装置、终端设备及存储介质

Country Status (2)

Country Link
CN (1) CN114900767B (zh)
WO (1) WO2023206788A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012093470A1 (ja) * 2011-01-04 2012-07-12 富士通株式会社 音声制御装置、音声制御方法及び音声制御プログラム
US20180032014A1 (en) * 2016-07-28 2018-02-01 Kyocera Document Solutions Inc. Electronic device and image-forming apparatus
CN107656461A (zh) * 2016-07-26 2018-02-02 青岛海尔洗衣机有限公司 一种基于用户年龄调节语音的方法及洗衣机
CN108924687A (zh) * 2018-07-05 2018-11-30 Oppo(重庆)智能科技有限公司 一种音量设置方法和设备、及计算机存储介质
CN111179915A (zh) * 2019-12-30 2020-05-19 苏州思必驰信息科技有限公司 基于语音的年龄识别方法及装置

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004205624A (ja) * 2002-12-24 2004-07-22 Megachips System Solutions Inc 音声処理システム
CN103177750A (zh) * 2011-12-20 2013-06-26 富泰华工业(深圳)有限公司 音频播放装置及其控制方法
US9943253B2 (en) * 2015-03-20 2018-04-17 Innovo IP, LLC System and method for improved audio perception
CN105282345B (zh) * 2015-11-23 2019-03-15 小米科技有限责任公司 通话音量的调节方法和装置
CN106535044A (zh) * 2016-11-24 2017-03-22 深圳市傲洲科技有限公司 智能音响的播放控制方法以及音乐播放控制系统
CN108235204A (zh) * 2016-12-12 2018-06-29 塞舌尔商元鼎音讯股份有限公司 取得听力数据的电子装置及听力数据取得的方法
CN114071293A (zh) * 2020-08-05 2022-02-18 广东小天才科技有限公司 音量模式设置、调节方法、监护终端、智能设备和耳机
CN114257191B (zh) * 2020-09-24 2024-05-17 达发科技股份有限公司 均衡器调整方法和电子装置

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012093470A1 (ja) * 2011-01-04 2012-07-12 富士通株式会社 音声制御装置、音声制御方法及び音声制御プログラム
CN107656461A (zh) * 2016-07-26 2018-02-02 青岛海尔洗衣机有限公司 一种基于用户年龄调节语音的方法及洗衣机
US20180032014A1 (en) * 2016-07-28 2018-02-01 Kyocera Document Solutions Inc. Electronic device and image-forming apparatus
CN108924687A (zh) * 2018-07-05 2018-11-30 Oppo(重庆)智能科技有限公司 一种音量设置方法和设备、及计算机存储介质
CN111179915A (zh) * 2019-12-30 2020-05-19 苏州思必驰信息科技有限公司 基于语音的年龄识别方法及装置

Also Published As

Publication number Publication date
CN114900767A (zh) 2022-08-12
CN114900767B (zh) 2023-06-13

Similar Documents

Publication Publication Date Title
US11450337B2 (en) Multi-person speech separation method and apparatus using a generative adversarial network model
CN113709616B (zh) 耳朵接近度检测
US10460095B2 (en) Earpiece with biometric identifiers
US11386905B2 (en) Information processing method and device, multimedia device and storage medium
CN108346433A (zh) 一种音频处理方法、装置、设备及可读存储介质
CN108511002B (zh) 危险事件声音信号识别方法、终端和计算机可读存储介质
US20180233125A1 (en) Wearable audio device
CN112087701B (zh) 用于风检测的麦克风的扬声器仿真
CN110364156A (zh) 语音交互方法、系统、终端及可读存储介质
US10531178B2 (en) Annoyance noise suppression
CN109656511A (zh) 一种音频播放方法、终端及计算机可读存储介质
CN110070863A (zh) 一种语音控制方法及装置
US11218796B2 (en) Annoyance noise suppression
CN108540660B (zh) 语音信号处理方法和装置、可读存储介质、终端
CN106992008A (zh) 处理方法及电子设备
CN110097875A (zh) 基于麦克风信号的语音交互唤醒电子设备、方法和介质
CN110223711A (zh) 基于麦克风信号的语音交互唤醒电子设备、方法和介质
CN110111776A (zh) 基于麦克风信号的语音交互唤醒电子设备、方法和介质
WO2018000764A1 (zh) 一种声道自动匹配的方法、装置以及耳机
CN111385688A (zh) 一种基于深度学习的主动降噪方法、装置及系统
CN109545221A (zh) 参数调整方法、移动终端及计算机可读存储介质
CN110111795B (zh) 一种语音处理方法及终端设备
CN110232909A (zh) 一种音频处理方法、装置、设备及可读存储介质
JP3233390U (ja) 通知装置及びウェアラブル装置
CN109686359A (zh) 语音输出方法、终端及计算机可读存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22939617

Country of ref document: EP

Kind code of ref document: A1