CN110825446B - Parameter configuration method and device, storage medium and electronic equipment - Google Patents


Info

Publication number
CN110825446B
Authority
CN
China
Prior art keywords
scene
preset
electronic equipment
wake
audio data
Prior art date
Legal status
Active
Application number
CN201911032104.XA
Other languages
Chinese (zh)
Other versions
CN110825446A (en)
Inventor
陈喆
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201911032104.XA priority Critical patent/CN110825446B/en
Publication of CN110825446A publication Critical patent/CN110825446A/en
Application granted granted Critical
Publication of CN110825446B publication Critical patent/CN110825446B/en


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/4418 - Bootstrapping: Suspend and resume; Hibernate and awake
    • G06F 3/165 - Management of the audio stream, e.g. setting of volume, audio stream path
    • G06F 9/4451 - Configuring for program initiating: User profiles; Roaming
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/26 - Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • G10L 25/18 - Speech or voice analysis techniques characterised by the extracted parameters being spectral information of each sub-band
    • Y02D 30/70 - Reducing energy consumption in wireless communication networks

Abstract

The embodiments of the present application disclose a parameter configuration method, an apparatus, a storage medium, and an electronic device. Trigger information generated by a preset application adapted to run in a preset scene is detected. When the trigger information is detected, the electronic device is preliminarily judged to be in the preset scene; at this point, audio data of the scene where the electronic device is currently located is further collected, and whether the current scene is the preset scene is identified according to the collected audio data, that is, the preliminary judgment is verified. If so, configuration parameters for the preset scene are obtained to configure the electronic device, so that it can run better in the preset scene. In this way, the scene where the electronic device is located is recognized automatically and the device is configured automatically according to the recognized scene; the user does not need to configure the device manually, and the usability of the electronic device is improved.

Description

Parameter configuration method and device, storage medium and electronic equipment
Technical Field
The present application relates to the field of audio recognition technology, and in particular to a parameter configuration method, an apparatus, a storage medium, and an electronic device.
Background
At present, people can hardly do without electronic devices such as smartphones and tablet computers; through the rich functions these devices provide, people can work and entertain themselves anytime and anywhere. In the related art, the configuration of the electronic device can be changed manually by the user, so that the device better serves the user in the scene where it is actually located. However, because the user must perform the configuration manually, the ease of use of the electronic device is poor.
Disclosure of Invention
The embodiments of the present application provide a parameter configuration method, an apparatus, a storage medium, and an electronic device, which can improve the usability of the electronic device.
The embodiment of the application provides a parameter configuration method, which is applied to electronic equipment and comprises the following steps:
detecting trigger information generated by a preset application, wherein the preset application is suitable for running in a preset scene;
acquiring audio data of a scene where the electronic equipment is currently located according to the trigger information;
identifying whether the current scene is the preset scene according to the audio data;
and when the current scene is identified as the preset scene, acquiring configuration parameters corresponding to the preset scene to configure the electronic equipment.
The parameter configuration device provided by the embodiment of the application is applied to electronic equipment, and comprises:
the detection module is used for detecting trigger information generated by a preset application, and the preset application is suitable for running in a preset scene;
the acquisition module is used for acquiring audio data of a scene where the electronic equipment is currently located according to the trigger information;
the identification module is used for identifying whether the current scene is the preset scene or not according to the audio data;
and the configuration module is used for acquiring configuration parameters corresponding to the preset scene to configure the electronic equipment when the current scene is identified as the preset scene.
The storage medium provided by the embodiment of the application stores a computer program thereon, and when the computer program is loaded by a processor, the parameter configuration method provided by any embodiment of the application is executed.
The electronic device provided by the embodiment of the application comprises a processor and a memory, wherein the memory stores a computer program, and the processor is used for executing the parameter configuration method provided by any embodiment of the application by loading the computer program.
According to the method and the device, the scene where the electronic equipment is located is automatically identified, and the configuration is automatically carried out according to the identified scene, so that a user does not need to manually carry out configuration, and the aim of improving usability of the electronic equipment can be achieved.
Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present application; a person skilled in the art may obtain other drawings from them without inventive effort.
Fig. 1 is a flow chart of a parameter configuration method according to an embodiment of the present application.
Fig. 2 is a schematic diagram of frequency components corresponding to a subway scene in an embodiment of the present application.
Fig. 3 is a schematic diagram of non-overlapping framing of audio data according to an embodiment of the present application.
Fig. 4 is a schematic diagram of loading a primary wake-up model and a secondary wake-up model in an embodiment of the present application.
Fig. 5 is another flow chart of a parameter configuration method according to an embodiment of the present application.
Fig. 6 is a schematic structural diagram of a parameter configuration apparatus according to an embodiment of the present application.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Referring to the drawings, wherein like reference numbers refer to like elements throughout, the principles of the present application are illustrated in an appropriate computing environment. The following description is of illustrative embodiments of the application and should not be taken as limiting the application to other embodiments that are not described in detail herein.
The embodiments of the present application relate to a parameter configuration method and apparatus, a storage medium, and an electronic device. The execution subject of the parameter configuration method may be the parameter configuration apparatus provided in the embodiments of the present application, or an electronic device integrating the parameter configuration apparatus, where the apparatus may be implemented in hardware or in software. The electronic device may be a device configured with a processor and thus having processing capability, such as a smartphone, tablet computer, palmtop computer, notebook computer, or desktop computer.
Referring to fig. 1, fig. 1 is a flow chart of a parameter configuration method provided by an embodiment of the present application, and a specific flow of the parameter configuration method provided by the embodiment of the present application may be as follows:
s101, detecting trigger information generated by a preset application, wherein the preset application is suitable for running in a preset scene.
It should be noted that, in the embodiment of the present application, the usage scenario of the electronic device is categorized in advance, which includes but is not limited to a bus scenario, a subway scenario, a restaurant scenario, an office scenario, and the like. The preset application is configured to be adapted to run under a preset scenario, which may be any of the usage scenarios of the aforementioned categories.
For example, the preset application may be a bus taking application configured to be adapted to run in a bus scenario, e.g. the electronic device provides for a user to swipe a code for taking a bus by running the bus taking application.
The preset application may also be a subway riding application configured to be suitable for running in a subway scene, for example, the electronic device provides a user with a code to ride a subway by running the subway riding application, and the like.
The preset application may also be an ordering application configured to be adapted to run in a restaurant scenario, such as by the electronic device running the ordering application for online self-service ordering by the user.
The preset application may also be an office-like application configured to be adapted to run in an office scenario, e.g., the electronic device provides the user with an electronic office by running the office application.
It will be appreciated that when a preset application on the electronic device is running, this indicates that the electronic device may be in the corresponding preset scene. In the embodiments of the present application, specific information generated while the preset application runs is therefore used as trigger information for triggering scene recognition.
For example, when the preset application is a bus riding application suitable for running in a bus scene, the trigger information may be description information indicating that the bus riding application has scanned a code for riding a bus.
For another example, when the preset application is a subway riding application suitable for running in a subway scene, the trigger information may be description information indicating that the subway riding application has scanned a code for riding the subway.
For another example, when the preset application is an ordering application suitable for running in a restaurant scene, the trigger information may be description information indicating that the ordering application is being used for online ordering.
For another example, when the preset application is an office application suitable for running in an office scene, the trigger information may be description information indicating that the office application is performing an office operation.
S102, acquiring audio data of a scene where the electronic equipment is currently located according to the trigger information.
In the embodiments of the present application, when the trigger information generated by the preset application is detected, recognition of the scene where the electronic device is currently located is triggered, that is, identifying whether the current scene is the preset scene. First, audio data of the scene where the electronic device is currently located is acquired according to the trigger information.
It can be understood that the scene where the electronic device is currently located is unknown at this point; sound can be collected directly through a microphone of the electronic device, and the collected audio is used as the audio data of the current scene.
The microphone may be an internal microphone or an external microphone (wired or wireless).
If the microphone is an analog microphone, analog audio data is collected, and analog-to-digital conversion needs to be performed on it to obtain digitized audio data for subsequent processing. For example, after the analog audio data is collected by the microphone, it may be sampled at a sampling frequency of 16 kHz to obtain digitized audio data.
It will be appreciated by those of ordinary skill in the art that if the microphone included in the electronic device is a digital microphone, the digitized audio data will be directly collected without further analog-to-digital conversion.
S103, identifying whether the current scene is a preset scene or not according to the audio data.
After the audio data of the current scene of the electronic equipment is collected, whether the current scene is a preset scene or not can be identified according to the collected audio data and a preset scene identification strategy.
For example, it may be determined whether the audio data includes audio features of the preset scene; if so, the current scene is judged to be the preset scene.
For another example, the audio data may be compared directly with sample audio data of the preset scene collected in advance; when the collected audio data matches the sample audio data, the current scene is judged to be the preset scene.
And S104, when the current scene is identified as the preset scene, acquiring configuration parameters corresponding to the preset scene to configure the electronic equipment.
It should be noted that, in the embodiment of the present application, corresponding configuration parameters are preset for configuring related functions of the electronic device according to a preset scenario, so that the related functions of the electronic device can provide an optimal service effect in the preset scenario.
Correspondingly, when the current scene is identified as the preset scene, acquiring configuration parameters corresponding to the preset scene, and configuring related functions of the electronic equipment according to the acquired configuration parameters, wherein the related functions comprise but are not limited to an audio/video output function, a call function, a voice interaction function and the like.
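The lookup-and-apply step can be sketched as a table keyed by scene. The parameter names and values below are assumptions for illustration only; the patent does not specify concrete parameter keys.

```python
# Hypothetical scene -> configuration-parameter table; keys and values
# are illustrative, not taken from the patent.
SCENE_CONFIGS = {
    "subway":     {"noise_reduction_profile": "transit", "wake_model": "B"},
    "bus":        {"noise_reduction_profile": "transit", "wake_model": "A"},
    "restaurant": {"noise_reduction_profile": "babble",  "wake_model": "C"},
    "office":     {"noise_reduction_profile": "office",  "wake_model": "D"},
}

def configure_device(scene, apply_fn):
    """Fetch the parameters for the recognized scene and apply each one
    through apply_fn(name, value); returns the parameters used, or None
    if the scene is not a preset scene."""
    params = SCENE_CONFIGS.get(scene)
    if params is not None:
        for name, value in params.items():
            apply_fn(name, value)
    return params
```

Here `apply_fn` stands in for whatever device interface actually sets a noise-reduction mode or loads a wake-up model.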
As can be seen from the above, in the present application, trigger information generated by a preset application suitable for running in a preset scene is detected. When the trigger information is detected, the electronic device is preliminarily judged to be in the preset scene; audio data of the scene where the electronic device is currently located is then collected, and whether the current scene is the preset scene is identified according to the collected audio data, that is, the preliminary judgment is verified. If so, the configuration parameters of the preset scene are obtained to configure the electronic device, so that the electronic device can run better in the preset scene. In this way, the scene where the electronic device is located is recognized automatically, and the device is configured automatically according to the recognized scene; the user does not need to configure it manually, and the usability of the electronic device is improved.
In one embodiment, "identifying whether the currently located scene is a preset scene according to audio data" includes:
and identifying whether the audio data comprises frequency components corresponding to the preset scene or not, and if so, determining that the current scene is the preset scene.
It should be noted that for different preset scenarios, specific sound features exist.
For example, for a bus scene, there is a sound feature of a bus door opening and closing; for a subway scene, the sound characteristics of subway switch doors exist; for restaurant scenes, there are sound features of the serving ring tone; for office scenes, there are sound features that tap the keyboard, etc.
In the embodiment of the application, in order to more accurately identify the sound characteristics capable of representing different preset scenes, the identification is performed in the frequency dimension.
For example, taking the subway scene as an example, please refer to fig. 2, a spectrogram of a subway door opening and closing. It can be seen from the spectrogram that the sound of the subway door opening and closing is composed of multiple frequency components. Accordingly, a correspondence between the subway scene and its frequency components can be established in advance.
Similarly, the corresponding relation between the bus scene and the corresponding frequency component, the corresponding relation between the restaurant scene and the corresponding frequency component and the corresponding relation between the office scene and the corresponding frequency component can be established.
Therefore, when identifying whether the current scene is the preset scene according to the audio data, the acquired audio data of the scene where the electronic device is currently located can be analyzed in the frequency dimension to identify whether it includes the frequency components corresponding to the preset scene; if it does, it can be determined that the scene where the electronic device is currently located is the preset scene.
In one embodiment, "identifying whether frequency components corresponding to a preset scene are included in audio data" includes:
and identifying whether the duration of the frequency component corresponding to the preset scene in the audio data reaches the preset duration, and if so, determining that the audio data comprises the frequency component.
In the embodiments of the present application, when identifying whether the audio data includes the frequency component corresponding to the preset scene, it can be identified whether the duration of that frequency component in the acquired audio data reaches a preset duration; when it does, it is determined that the audio data includes the frequency component corresponding to the preset scene.
It should be noted that, in different preset scenarios, the corresponding frequency components are also different.
For example, taking the subway scene as an example, suppose the subway scene corresponds to 7 different frequency components, each with its own preset duration (the preset durations of different frequency components may differ). After trigger information generated by the subway riding application is detected and audio data of the scene where the electronic device is currently located has been acquired, the audio data is analyzed in the frequency dimension to identify whether the durations of the 7 frequency components corresponding to the subway scene in the audio data reach their respective preset durations. If they do, it can be determined that the audio data includes the frequency components corresponding to the subway scene, and the scene where the electronic device is currently located can be judged to be the subway scene.
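The all-components-must-persist rule can be written down compactly. The component frequencies and required durations below are invented for illustration; the patent states only that each component has its own preset duration.

```python
# Hypothetical component table for one scene: frequency (Hz) -> required
# persistence (seconds). Values are illustrative, not from the patent.
SUBWAY_COMPONENTS = {520.0: 0.30, 740.0: 0.25, 910.0: 0.20}

def scene_recognized(observed, required):
    """observed maps each frequency (Hz) to its measured persistence in
    seconds; the scene is recognized only if every required component
    persisted for at least its own preset duration."""
    return all(observed.get(f, 0.0) >= t for f, t in required.items())
```

A component that was never observed counts as zero seconds, so a missing component correctly fails the check.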
In an embodiment, "identifying whether the duration of the frequency component corresponding to the preset scene in the audio data reaches the preset duration" includes:
(1) Carrying out non-overlapping framing treatment on the audio data to obtain a plurality of audio frames;
(2) And carrying out Fourier transform on the audio frames obtained by framing, identifying whether the amplitudes of the frequency components corresponding to the preset scene in the continuous audio frames within the preset duration reach the preset amplitudes according to the Fourier transform result, and if so, determining that the duration of the frequency components in the audio data reaches the preset duration.
In the embodiment of the application, when the duration of the frequency component corresponding to the preset scene in the audio data is identified to reach the preset duration, firstly, non-overlapping framing processing is carried out on the acquired audio data to obtain a plurality of audio frames.
For example, referring to fig. 3, non-overlapping framing is performed on the collected audio data x(n), where each frame has a length of N; m audio frames are obtained in total, and each audio frame may be denoted x_m(n). Non-overlapping framing simply means that there is no overlapping portion between two adjacent audio frames obtained by the framing.
After a plurality of audio frames are obtained through framing, further carrying out Fourier transform on each audio frame obtained through framing, identifying whether the amplitudes of the frequency components corresponding to the preset scene in the continuous audio frames within the preset duration reach the preset amplitude according to the Fourier transform result, and if so, determining that the duration of the frequency components in the audio data reaches the preset duration.
For example, perform a fast Fourier transform on audio frame x_m(n) to obtain the corresponding Fourier transform result X_m(k) = FFT[x_m(n)]. The frequency resolution of the Fourier transform is f_s/N, where f_s is the sampling frequency of the audio data and N is the length of the audio frame. For any frequency component f, its position in the Fourier transform result X_m is the bin index i = N*f/f_s, so it can be expressed as X_m(i). Then, for the frequency component f' corresponding to the preset scene, its amplitude in each audio frame is abs[X_m(i')], where abs[ ] denotes taking the magnitude. If, in every one of the consecutive audio frames within the preset duration t, the amplitude abs[X_m(i')] of the frequency component f' reaches the preset amplitude α, it is determined that the duration of the frequency component f' in the audio data reaches the preset duration t.
In an embodiment, the configuration parameters include noise reduction parameters, "configure the electronic device according to the configuration parameters," including:
and configuring the noise reduction mode of the electronic equipment according to the noise reduction parameters so that the noise reduction mode of the electronic equipment is matched with a preset scene.
In the embodiment of the application, the configuration parameters comprise noise reduction parameters for configuring the noise reduction function of the electronic equipment. Correspondingly, in the embodiment of the application, corresponding noise reduction parameters are respectively set for different preset scenes in advance.
Therefore, when the electronic device is configured according to the configuration parameters, the noise reduction mode of the electronic device can be configured according to the noise reduction parameters, so that the noise reduction mode of the electronic device is matched with a preset scene to be suitable for noise reduction under the preset scene.
Taking a subway scene as an example, when the current scene of the electronic equipment is identified as the subway scene, the noise reduction parameters of the corresponding subway scene are acquired, and the noise reduction mode of the electronic equipment is configured according to the acquired noise reduction parameters, so that the noise reduction mode of the electronic equipment is matched with the subway scene. Thus, when the noise reduction function of the electronic equipment is enabled, the optimal noise reduction effect in the subway scene can be obtained. For example, the noise reduction function can be enabled when the electronic device performs voice call, so that the electronic device can provide clearer voice call service for users.
In an embodiment, the configuration parameters include a wake-up parameter, "configure the electronic device according to the configuration parameters", further comprising:
and configuring the wake-up strategy of the electronic equipment according to the wake-up parameters, so that the wake-up strategy of the electronic equipment is matched with a preset scene.
In the embodiment of the application, the configuration parameters comprise wake-up parameters for configuring the voice interaction function of the electronic equipment. Correspondingly, in the embodiment of the application, corresponding wake-up parameters are respectively set for different preset scenes in advance. It should be noted that the premise of enabling the voice interaction function of the electronic device is to wake the electronic device, and the wake-up parameters may be used to configure a wake-up policy of the wake-up electronic device.
Therefore, when configuring the electronic device according to the configuration parameters, the wake-up policy of the electronic device can be configured according to the wake-up parameters, so that the wake-up policy matches the preset scene and the device is suitable for being woken up in that scene. For example, the electronic device provides a voice interaction function through an installed voice interaction application; waking up the electronic device means waking up this voice interaction application, after which the electronic device can interact with the user by voice through the application.
Taking a subway scene as an example, when the current scene of the electronic equipment is identified as the subway scene, wake-up parameters corresponding to the subway scene are obtained, and wake-up strategies of the electronic equipment are configured according to the obtained wake-up parameters, so that the wake-up strategies of the electronic equipment are matched with the subway scene. Therefore, when the electronic equipment is awakened, the electronic equipment can be more accurately awakened. For example, after completing configuration of the wake-up policy, external audio data is collected in real time to serve as audio data to be checked, the audio data to be checked is checked according to the wake-up policy, and when the check is passed, the electronic equipment is awakened.
In an embodiment, an electronic device includes a dedicated speech recognition chip and a processor, and the configuring of a wake policy of the electronic device according to wake parameters includes:
and controlling the special voice recognition chip to load a primary wake-up model corresponding to the preset scene, and controlling the processor to load a secondary wake-up model corresponding to the preset scene.
It should be noted that, in the embodiment of the present application, the electronic device further includes a processor and a dedicated voice recognition chip, and the power consumption of the dedicated voice recognition chip is smaller than the power consumption of the processor.
The processor is a processor suitable for general processing tasks, such as an ARM architecture processor.
The dedicated speech recognition chip is a dedicated chip designed for speech recognition, such as a digital signal processing chip designed for speech recognition, an application specific integrated circuit chip designed for speech recognition, etc., which has lower power consumption than a general-purpose processor and is suitable for processing of speech recognition tasks. The special voice recognition chip, the processor and the microphone are connected through a communication bus (such as an I2C bus) to realize data interaction.
In addition, a primary wake-up model set and a secondary wake-up model set are preset in the electronic device. The primary wake-up model set includes a plurality of primary wake-up models trained in advance for different preset scenes, so that the dedicated voice recognition chip can load the model suited to each preset scene and perform primary verification on the collected audio data to be verified more flexibly and accurately. Likewise, the secondary wake-up model set includes a plurality of secondary wake-up models trained in advance for different preset scenes, so that the processor can load the model suited to each preset scene and perform secondary verification on the collected audio data to be verified.
Correspondingly, when the wake-up policy of the electronic device is configured according to the wake-up parameters, the dedicated voice recognition chip is controlled to load, from the primary wake-up model set, the primary wake-up model corresponding to the preset scene (that is, the primary wake-up model suited to wake-up verification in that scene), and the processor is controlled to load, from the secondary wake-up model set, the secondary wake-up model corresponding to the preset scene (that is, the secondary wake-up model suited to wake-up verification in that scene).
For example, referring to fig. 4, the primary wake-up model set includes four primary wake-up models, which are a primary wake-up model a adapted to perform audio verification in a bus scene, a primary wake-up model B adapted to perform audio verification in a subway scene, a primary wake-up model C adapted to perform audio verification in a restaurant scene, and a primary wake-up model D adapted to perform audio verification in an office scene. The secondary wake-up model set comprises four secondary wake-up models, namely a secondary wake-up model A suitable for carrying out audio verification in a bus scene, a secondary wake-up model B suitable for carrying out audio verification in a subway scene, a secondary wake-up model C suitable for carrying out audio verification in a restaurant scene and a secondary wake-up model D suitable for carrying out audio verification in an office scene. Assuming that the current scene is a subway scene, the acquired wake-up parameters indicate to load a primary wake-up model B and a secondary wake-up model B, and correspondingly, the electronic equipment loads the primary wake-up model B from the primary wake-up model set through a special voice recognition chip and loads the secondary wake-up model B from the secondary wake-up model set through a processor.
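The scene-to-model mapping described above can be sketched as a simple lookup. This is a minimal illustration, not the patent's implementation; the scene names and model identifiers (`primary_A`, `secondary_B`, etc.) are invented stand-ins for the models of fig. 4.

```python
# Hypothetical sketch: selecting the primary/secondary wake-up models by scene.
# Model identifiers are illustrative placeholders, not real model files.

PRIMARY_MODELS = {
    "bus": "primary_A", "subway": "primary_B",
    "restaurant": "primary_C", "office": "primary_D",
}
SECONDARY_MODELS = {
    "bus": "secondary_A", "subway": "secondary_B",
    "restaurant": "secondary_C", "office": "secondary_D",
}

def select_wake_models(scene):
    """Return the (primary, secondary) wake-up models for a recognized scene."""
    return PRIMARY_MODELS[scene], SECONDARY_MODELS[scene]

# In a subway scene, the dedicated speech-recognition chip would load the
# primary model and the general-purpose processor the secondary model.
primary, secondary = select_wake_models("subway")
print(primary, secondary)  # primary_B secondary_B
```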
After the loading of the primary and secondary wake-up models is completed, the collected audio data to be verified is first verified by the primary wake-up model loaded on the dedicated voice recognition chip. After the primary verification passes, it is verified by the secondary wake-up model loaded on the processor, and if the secondary verification also passes, the voice interaction application of the electronic device can be woken up for voice interaction with the user. It should be noted that, because the processing capacity of the dedicated voice recognition chip is lower than that of the processor, the size and accuracy of the secondary wake-up model for a given scene will be greater than those of the primary wake-up model. The primary wake-up model thus performs a coarse check on the collected audio data, and the secondary verification performed after it ensures the overall verification accuracy.
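The cascade just described can be sketched as follows. This is an illustrative sketch under assumed interfaces: the models are stand-in scoring functions and the thresholds are invented, since the patent does not specify scores or thresholds.

```python
# Hypothetical sketch of the two-stage wake-up check: a cheap primary model
# runs first (on the low-power chip); only audio that passes it is handed to
# the larger secondary model (on the processor). Scores and thresholds are
# invented for illustration.

def two_stage_verify(audio, primary_model, secondary_model,
                     primary_threshold=0.5, secondary_threshold=0.8):
    """Return True only if the audio passes both verification stages."""
    if primary_model(audio) < primary_threshold:
        return False  # coarse check failed; the processor never runs
    return secondary_model(audio) >= secondary_threshold

# Toy models: the "score" fields stand in for real wake-word likelihoods.
primary = lambda a: a["score_coarse"]
secondary = lambda a: a["score_fine"]

print(two_stage_verify({"score_coarse": 0.7, "score_fine": 0.9},
                       primary, secondary))  # True
print(two_stage_verify({"score_coarse": 0.3, "score_fine": 0.9},
                       primary, secondary))  # False (rejected at stage one)
```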
The primary verification of the collected audio data to be verified includes verification of text features and/or voiceprint features, and likewise the secondary verification of the collected audio data to be verified includes verification of text features and/or voiceprint features.
In plain terms, verifying the text feature of the audio data to be verified means checking whether the audio data includes a preset wake-up word; if it does, the verification passes. For example, suppose the collected audio data includes a preset wake-up word set by a preset user (for example, the owner of the electronic device, or another user authorized by the owner to use it), but the wake-up word is spoken by some user A rather than the preset user; the text-feature verification still passes in this case.
Verifying both the text features and the voiceprint features of the audio data to be verified means checking whether the audio data includes the preset wake-up word spoken by the preset user; if the collected audio data includes the preset wake-up word spoken by the preset user, the verification passes. For example, if the collected audio data includes the preset wake-up word and it was spoken by the preset user, the verification of the text and voiceprint features passes; if the preset wake-up word was spoken by a user other than the preset user, or if the audio data does not include the preset wake-up word at all, the verification of the text and voiceprint features fails.
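The two verification policies above can be sketched as follows. This is a toy illustration under assumed interfaces: the transcript, speaker identifier, and wake word are invented stand-ins for the outputs of real keyword-spotting and speaker-recognition models.

```python
# Illustrative sketch of the verification policies described above:
# text-only, and text plus voiceprint. The transcript/speaker inputs stand
# in for real model outputs; names here are hypothetical.

def verify_text(transcript, wake_word):
    """Text-feature check: does the audio contain the preset wake-up word?"""
    return wake_word in transcript

def verify_text_and_voiceprint(transcript, speaker_id, wake_word, preset_user):
    """Text + voiceprint check: the wake word must be spoken by the preset user."""
    return verify_text(transcript, wake_word) and speaker_id == preset_user

# User A says the wake word: the text check passes, the combined check does not.
print(verify_text("hey assistant open maps", "hey assistant"))       # True
print(verify_text_and_voiceprint("hey assistant open maps", "user_a",
                                 "hey assistant", "owner"))          # False
```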
It should be noted that, the primary check and the secondary check are only used to refer to the sequence of the check, and are not used to limit the check content, in other words, in the embodiment of the present application, the primary wake-up model and the secondary wake-up model corresponding to the same scene may be the same or different. For example, the primary wake-up model is a gaussian mixture model-based voice wake-up model, and the secondary wake-up model is a neural network-based voice wake-up model.
The parameter configuration method of the present application will be further described below taking a preset application as a subway riding application and a preset scene as a subway scene as an example on the basis of the method described in the above embodiment. Referring to fig. 5, the parameter configuration method is applied to an electronic device, and the parameter configuration method may include:
201, the electronic device detects trigger information generated by the subway riding application.
It should be noted that, in the embodiment of the present application, when the subway riding application on the electronic device is running, the electronic device may well be in a subway scene at that time. For example, a user may scan into the subway using the subway riding application running on the electronic device; correspondingly, in the embodiment of the present application, the description information describing this code-scanning action of the subway riding application may be used as the trigger information for triggering scene recognition.
202, the electronic equipment collects audio data of the scene where the electronic equipment is currently located according to the trigger information.
In the embodiment of the application, when the trigger information generated by the preset application is detected, identification of the scene where the electronic device is currently located is triggered, that is, identifying whether the current scene is a subway scene. To do so, a microphone provided on the electronic device first collects sound, and the collected audio data is used as the audio data of the current scene.
The microphone provided by the electronic device may be an internal microphone or an external microphone (may be a wired external microphone or a wireless external microphone).
If the microphone is an analog microphone, analog audio data is collected, and analog-to-digital conversion needs to be performed on it to obtain digitized audio data for subsequent processing. For example, after the analog audio data is collected by the microphone, it may be sampled at a frequency of 16 kHz to obtain digitized audio data.
It will be appreciated by those of ordinary skill in the art that if the microphone included in the electronic device is a digital microphone, the digitized audio data will be directly collected without further analog-to-digital conversion.
203, the electronic device performs non-overlapping framing processing on the acquired audio data to obtain a plurality of audio frames.
After the audio data of the scene where the electronic equipment is currently located is collected, whether the scene where the electronic equipment is currently located is a subway scene or not can be identified according to the collected audio data.
For example, taking a subway scene as an example, please refer to fig. 2, fig. 2 is a spectrogram of a subway door opening and closing, and according to the spectrogram, it can be seen that a sound of the subway door opening and closing is composed of a plurality of frequency components. Correspondingly, the corresponding relation between the subway scene and the corresponding frequency component can be pre-established.
When identifying whether the current scene is a subway scene, the electronic device can identify whether the audio data includes the frequency components corresponding to the subway scene; if it does, the current scene of the electronic device can be determined to be the subway scene.
When identifying whether the collected audio data comprises frequency components corresponding to the subway scene, firstly carrying out non-overlapping framing processing on the collected audio data to obtain a plurality of audio frames.
For example, referring to fig. 3, non-overlapping framing is performed on the collected audio data x(n). Each frame has a length of N, and m audio frames are obtained in total, each of which may be represented as x_m(n). Non-overlapping framing simply means that there is no overlapping portion between two adjacent audio frames obtained by framing.
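The non-overlapping framing step can be sketched in a few lines. This is a minimal illustration of the framing described above (trailing samples that do not fill a whole frame are simply dropped, an assumption the patent does not specify).

```python
import numpy as np

# Sketch of non-overlapping framing: split x(n) into m frames of length N
# with no overlap between adjacent frames. Trailing samples that do not
# fill a complete frame are discarded (an assumption for this sketch).

def frame_non_overlapping(x, frame_len):
    """Return an (m, N) array of non-overlapping frames x_m(n)."""
    m = len(x) // frame_len
    return x[: m * frame_len].reshape(m, frame_len)

x = np.arange(10)
frames = frame_non_overlapping(x, 4)
print(frames.shape)  # (2, 4): two frames of length 4, last 2 samples dropped
```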
204, the electronic device performs Fourier transform on the audio frames obtained by framing and identifies, according to the Fourier transform result, whether the amplitudes of the frequency components corresponding to the subway scene in consecutive audio frames within a preset duration reach a preset amplitude; if so, the current scene is determined to be the subway scene.
After a plurality of audio frames are obtained through framing, further carrying out Fourier transform on each audio frame obtained through framing, identifying whether the amplitudes of the frequency components corresponding to the subway scene in the continuous audio frames within the preset duration reach the preset amplitudes according to the Fourier transform result, and if so, determining that the collected audio data comprise the frequency components corresponding to the subway scene.
For example, a fast Fourier transform is performed on the audio frame x_m(n) to obtain the corresponding Fourier transform result X_m(k) = FFT[x_m(n)]. The frequency resolution of the Fourier transform is f_s/N, where f_s is the sampling frequency of the audio data and N is the length of the audio frame. For any frequency component f, its position in the Fourier transform result is the index i = N·f/f_s, so its value can be expressed as X_m(i). Then, for the frequency component f' corresponding to the subway scene, its amplitude in frame m is A_m(f') = abs[X_m(i')], where abs[ ] denotes taking the magnitude. If, for every frame of the consecutive audio frames within the preset time period t, the amplitude A_m(f') of the frequency component f' reaches the preset amplitude α, it is determined that the audio data includes the frequency component f' corresponding to the subway scene.
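The detection step above can be sketched end to end. This is an illustrative sketch, not the patent's implementation: the sampling rate, frame length, threshold α, and the use of a 1 kHz test tone as a stand-in for a subway-door frequency component are all assumptions.

```python
import numpy as np

# Sketch of the detection step: FFT each frame, read the bin nearest the
# target frequency f', and require its magnitude to reach the threshold
# alpha in every frame of a run covering the preset duration t.
# All parameter values below are illustrative.

def scene_component_present(frames, fs, f_target, alpha, duration_s):
    n = frames.shape[1]
    i = int(round(n * f_target / fs))              # bin index i = N*f/fs
    mags = np.abs(np.fft.rfft(frames, axis=1))[:, i]
    frames_needed = int(np.ceil(duration_s * fs / n))
    run = 0
    for m in mags:                                 # longest run of loud frames
        run = run + 1 if m >= alpha else 0
        if run >= frames_needed:
            return True
    return False

fs, n = 16000, 512
t = np.arange(n) / fs
# 40 identical frames (~1.28 s) containing a strong 1 kHz tone, standing in
# for audio that contains the subway-scene frequency component.
frames = np.tile(np.sin(2 * np.pi * 1000 * t), (40, 1))
print(scene_component_present(frames, fs, 1000, alpha=50.0, duration_s=1.0))  # True
```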
205, the electronic device obtains configuration parameters corresponding to the subway scene to perform configuration.
It should be noted that, in the embodiment of the present application, corresponding configuration parameters are preset for configuring related functions of the electronic device, so that the related functions of the electronic device can provide an optimal service effect in the subway scene.
Correspondingly, when the current scene is identified as the subway scene, the electronic equipment acquires configuration parameters corresponding to the subway scene, and configures related functions of the electronic equipment according to the acquired configuration parameters, wherein the related functions comprise but are not limited to an audio/video output function, a call function, a voice interaction function and the like.
In an embodiment, a parameter configuration apparatus is also provided. Referring to fig. 6, fig. 6 is a schematic structural diagram of a parameter configuration device according to an embodiment of the application. The parameter configuration device is applied to an electronic device, and the parameter configuration device includes a detection module 301, an acquisition module 302, an identification module 303, and a configuration module 304, as follows:
the detection module 301 is configured to detect trigger information generated by a preset application, where the preset application is adapted to run in a preset scene;
the acquisition module 302 is configured to acquire audio data of a scene where the electronic device is currently located according to the trigger information;
the identifying module 303 is configured to identify, according to the audio data, whether the current scene is a preset scene;
and the configuration module 304 is configured to obtain configuration parameters corresponding to a preset scene to configure the electronic device when the current scene is identified as the preset scene.
In one embodiment, when identifying whether the current scene is a preset scene according to the audio data, the identification module 303 is configured to:
and identifying whether the audio data comprises frequency components corresponding to the preset scene or not, and if so, determining that the current scene is the preset scene.
In one embodiment, in identifying whether the audio data includes a frequency component corresponding to a preset scene, the identifying module 303 is configured to:
and identifying whether the duration of the frequency component corresponding to the preset scene in the audio data reaches the preset duration, and if so, determining that the audio data comprises the frequency component.
In one embodiment, when identifying whether the duration of the frequency component corresponding to the preset scene in the audio data reaches the preset duration, the identification module 303 is configured to:
carrying out non-overlapping framing processing on the audio data to obtain a plurality of audio frames;
and carrying out Fourier transform on the audio frames obtained by framing, identifying whether the amplitudes of the frequency components corresponding to the preset scene in the continuous audio frames within the preset duration reach the preset amplitudes according to the Fourier transform result, and if so, determining that the duration of the frequency components in the audio data reaches the preset duration.
In one embodiment, the configuration parameters include noise reduction parameters, and when the electronic device is configured according to the configuration parameters, the configuration module 304 is configured to:
and configuring the noise reduction mode of the electronic equipment according to the noise reduction parameters so that the noise reduction mode of the electronic equipment is matched with a preset scene.
In an embodiment, the configuration parameters include wake-up parameters, and the configuration module 304 is further configured to, when configuring the electronic device according to the configuration parameters:
and configuring the wake-up strategy of the electronic equipment according to the wake-up parameters, so that the wake-up strategy of the electronic equipment is matched with a preset scene.
In one embodiment, the electronic device includes a dedicated voice recognition chip and a processor, and when the wake policy of the electronic device is configured according to the wake parameters, the configuration module 304 is configured to:
and controlling the special voice recognition chip to load a primary wake-up model corresponding to the preset scene, and controlling the processor to load a secondary wake-up model corresponding to the preset scene.
It should be noted that, the parameter configuration device provided in the embodiment of the present application and the parameter configuration method in the foregoing embodiment belong to the same concept, and any method provided in the parameter configuration method embodiment may be run on the parameter configuration device, and the specific implementation process is detailed in the foregoing embodiment and will not be repeated herein.
In an embodiment, referring to fig. 7, an electronic device is further provided, and the electronic device includes a processor 401 and a memory 402.
The processor 401 in the embodiment of the present application is a general-purpose processor such as an ARM architecture processor.
The memory 402 stores a computer program. The memory 402 may be a high-speed random access memory, or may be a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 with access to the computer program in the memory 402, implementing the following functions:
detecting trigger information generated by a preset application, wherein the preset application is suitable for running in a preset scene;
acquiring audio data of a scene where the electronic equipment is currently located according to the trigger information;
identifying whether the current scene is a preset scene or not according to the audio data;
when the current scene is identified as the preset scene, acquiring configuration parameters corresponding to the preset scene to configure the electronic equipment.
In one embodiment, when identifying whether the current scene is a preset scene according to the audio data, the processor 401 is configured to perform:
and identifying whether the audio data comprises frequency components corresponding to the preset scene or not, and if so, determining that the current scene is the preset scene.
In an embodiment, in identifying whether frequency components corresponding to a preset scene are included in the audio data, the processor 401 is configured to perform:
and identifying whether the duration of the frequency component corresponding to the preset scene in the audio data reaches the preset duration, and if so, determining that the audio data comprises the frequency component.
In one embodiment, when identifying whether the duration of the frequency component corresponding to the preset scene in the audio data reaches the preset duration, the processor 401 is configured to perform:
carrying out non-overlapping framing processing on the audio data to obtain a plurality of audio frames;
and carrying out Fourier transform on the audio frames obtained by framing, identifying whether the amplitudes of the frequency components corresponding to the preset scene in the continuous audio frames within the preset duration reach the preset amplitudes according to the Fourier transform result, and if so, determining that the duration of the frequency components in the audio data reaches the preset duration.
In an embodiment, the configuration parameters include noise reduction parameters, and the processor 401 is configured to perform, when configuring the electronic device according to the configuration parameters:
and configuring the noise reduction mode of the electronic equipment according to the noise reduction parameters so that the noise reduction mode of the electronic equipment is matched with a preset scene.
In an embodiment, the configuration parameters include wake-up parameters, and the processor 401 is configured to perform, when configuring the electronic device according to the configuration parameters:
and configuring the wake-up strategy of the electronic equipment according to the wake-up parameters, so that the wake-up strategy of the electronic equipment is matched with a preset scene.
In an embodiment, the electronic device further comprises a dedicated speech recognition chip, and the processor 401 is configured to execute, when configuring the wake policy of the electronic device according to the wake parameters:
the special voice recognition chip loads a primary wake-up model corresponding to a preset scene;
and loading a secondary wake-up model corresponding to the preset scene.
It should be noted that the electronic device provided in the embodiment of the present application and the parameter configuration method in the foregoing embodiments belong to the same concept; any method provided in the parameter configuration method embodiments may be run on the electronic device, and its specific implementation process is detailed in the foregoing embodiments and will not be repeated herein.
It should be noted that, for the parameter configuration method of the embodiment of the present application, it will be understood by those skilled in the art that all or part of the flow of implementing the parameter configuration method of the embodiment of the present application may be implemented by controlling related hardware by a computer program, where the computer program may be stored in a computer readable storage medium, such as a memory of an electronic device, and executed by a processor and/or a dedicated speech recognition chip in the electronic device, and the execution process may include the flow of the embodiment of the parameter configuration method. The storage medium may be a magnetic disk, an optical disk, a read-only memory, a random access memory, etc.
The foregoing describes in detail a parameter configuration method, apparatus, storage medium and electronic device provided in the embodiments of the present application, and specific examples are applied to illustrate the principles and embodiments of the present application, where the foregoing examples are only used to help understand the method and core idea of the present application; meanwhile, as those skilled in the art will have variations in the specific embodiments and application scope in light of the ideas of the present application, the present description should not be construed as limiting the present application.

Claims (7)

1. A parameter configuration method applied to an electronic device, comprising:
detecting trigger information generated by a preset application, wherein the preset application is suitable for running in a preset scene;
acquiring audio data of a scene where the electronic equipment is currently located according to the trigger information;
carrying out non-overlapping framing treatment on the audio data to obtain a plurality of audio frames;
performing Fourier transform on the audio frame;
identifying whether the amplitudes of the frequency components corresponding to the preset scene in the continuous audio frames within the preset duration reach preset amplitudes according to the Fourier transform result, and identifying whether the current scene is the preset scene according to the identification result;
and when the current scene is identified as the preset scene, acquiring configuration parameters corresponding to the preset scene to configure the electronic equipment.
2. The method according to claim 1, wherein the configuration parameters include noise reduction parameters, and the configuring the electronic device according to the configuration parameters includes:
and configuring the noise reduction mode of the electronic equipment according to the noise reduction parameters so that the noise reduction mode of the electronic equipment is matched with the preset scene.
3. The method for configuring parameters according to claim 1, wherein the configuration parameters include a wake-up parameter, and the configuring the electronic device according to the configuration parameters further includes:
and configuring the wake-up strategy of the electronic equipment according to the wake-up parameters, so that the wake-up strategy of the electronic equipment is matched with the preset scene.
4. A method for configuring parameters according to claim 3, wherein the electronic device comprises a dedicated speech recognition chip and a processor, and wherein configuring the wake-up policy of the electronic device according to the wake-up parameters comprises:
and controlling the special voice recognition chip to load a primary wake-up model corresponding to the preset scene, and controlling the processor to load a secondary wake-up model corresponding to the preset scene.
5. A parameter configuration apparatus applied to an electronic device, comprising:
the detection module is used for detecting trigger information generated by a preset application, and the preset application is suitable for running in a preset scene;
the acquisition module is used for acquiring audio data of a scene where the electronic equipment is currently located according to the trigger information;
the identification module is used for carrying out non-overlapping framing treatment on the audio data to obtain a plurality of audio frames;
performing Fourier transform on the audio frame; identifying whether the amplitudes of the frequency components corresponding to the preset scene in the continuous audio frames within the preset duration reach preset amplitudes according to the Fourier transform result, and identifying whether the current scene is the preset scene according to the identification result;
and the configuration module is used for acquiring configuration parameters corresponding to the preset scene to configure the electronic equipment when the current scene is identified as the preset scene.
6. A storage medium having stored thereon a computer program, wherein the parameter configuration method according to any one of claims 1 to 4 is performed when the computer program is loaded by a processor.
7. An electronic device comprising a processor and a memory, the memory storing a computer program, characterized in that the processor is adapted to execute the parameter configuration method according to any of claims 1 to 4 by loading the computer program.
CN201911032104.XA 2019-10-28 2019-10-28 Parameter configuration method and device, storage medium and electronic equipment Active CN110825446B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911032104.XA CN110825446B (en) 2019-10-28 2019-10-28 Parameter configuration method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911032104.XA CN110825446B (en) 2019-10-28 2019-10-28 Parameter configuration method and device, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN110825446A CN110825446A (en) 2020-02-21
CN110825446B true CN110825446B (en) 2023-12-08

Family

ID=69551238

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911032104.XA Active CN110825446B (en) 2019-10-28 2019-10-28 Parameter configuration method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN110825446B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111323783A (en) * 2020-02-27 2020-06-23 Oppo广东移动通信有限公司 Scene recognition method and device, storage medium and electronic equipment
CN111311889A (en) * 2020-02-27 2020-06-19 Oppo广东移动通信有限公司 Arrival reminding method and device, storage medium and electronic equipment
CN113395539B (en) * 2020-03-13 2023-07-07 北京字节跳动网络技术有限公司 Audio noise reduction method, device, computer readable medium and electronic equipment
CN111510814A (en) * 2020-04-29 2020-08-07 Oppo广东移动通信有限公司 Noise reduction mode control method and device, electronic equipment and storage medium
CN113873379B (en) * 2020-06-30 2023-05-02 华为技术有限公司 Mode control method and device and terminal equipment
CN112367429B (en) * 2020-11-06 2021-11-09 维沃移动通信有限公司 Parameter adjusting method and device, electronic equipment and readable storage medium
CN113132625B (en) * 2021-03-11 2023-05-12 宇龙计算机通信科技(深圳)有限公司 Scene image acquisition method, storage medium and equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106572411A (en) * 2016-09-29 2017-04-19 乐视控股(北京)有限公司 Noise cancelling control method and relevant device
WO2017206916A1 (en) * 2016-05-31 2017-12-07 广东欧珀移动通信有限公司 Method for determining kernel running configuration in processor and related product
CN108764304A (en) * 2018-05-11 2018-11-06 Oppo广东移动通信有限公司 scene recognition method, device, storage medium and electronic equipment
CN108831505A (en) * 2018-05-30 2018-11-16 百度在线网络技术(北京)有限公司 The method and apparatus for the usage scenario applied for identification
CN109036428A (en) * 2018-10-31 2018-12-18 广东小天才科技有限公司 A kind of voice wake-up device, method and computer readable storage medium
CN109977731A (en) * 2017-12-27 2019-07-05 深圳市优必选科技有限公司 A kind of recognition methods of scene, identification equipment and terminal device

Also Published As

Publication number Publication date
CN110825446A (en) 2020-02-21

Similar Documents

Publication Publication Date Title
CN110825446B (en) Parameter configuration method and device, storage medium and electronic equipment
US9779725B2 (en) Voice wakeup detecting device and method
US9775113B2 (en) Voice wakeup detecting device with digital microphone and associated method
CN111210021B (en) Audio signal processing method, model training method and related device
CN109087669B (en) Audio similarity detection method and device, storage medium and computer equipment
CN103811003B (en) A kind of speech recognition method and electronic device
JP2020515877A (en) Whisper voice conversion method, apparatus, device, and readable storage medium
CN110232933B (en) Audio detection method and device, storage medium and electronic equipment
CN109272991B (en) Voice interaction method, device, equipment and computer-readable storage medium
CN110223687B (en) Instruction execution method and device, storage medium and electronic equipment
CN110544468B (en) Application awakening method and device, storage medium and electronic equipment
CN113330511B (en) Voice recognition method, voice recognition device, storage medium and electronic equipment
US9633655B1 (en) Voice sensing and keyword analysis
CN112669822B (en) Audio processing method and device, electronic equipment and storage medium
US11626104B2 (en) User speech profile management
US11437022B2 (en) Performing speaker change detection and speaker recognition on a trigger phrase
CN113327620A (en) Voiceprint recognition method and device
CN108074581A (en) Control system for a human-computer interaction intelligent terminal
CN110580897B (en) Audio verification method and device, storage medium and electronic equipment
CN108600559B (en) Control method and device of mute mode, storage medium and electronic equipment
CN109377993A (en) Intelligent voice system, voice wake-up method thereof, and intelligent voice device
US20180091638A1 (en) Mobile device and method for determining its context
CN108989551B (en) Position prompting method and device, storage medium and electronic equipment
CN108922523B (en) Position prompting method and device, storage medium and electronic equipment
WO2021169711A1 (en) Instruction execution method and apparatus, storage medium, and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant