CN111081275A - Terminal processing method and device based on sound analysis, storage medium and terminal - Google Patents


Info

Publication number
CN111081275A
CN111081275A (application CN201911325074.1A)
Authority
CN
China
Prior art keywords
sound
type
current scene
terminal
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911325074.1A
Other languages
Chinese (zh)
Other versions
CN111081275B (en)
Inventor
Li Yan (李岩)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huizhou TCL Mobile Communication Co Ltd
Original Assignee
Huizhou TCL Mobile Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huizhou TCL Mobile Communication Co Ltd filed Critical Huizhou TCL Mobile Communication Co Ltd
Priority claimed from application CN201911325074.1A
Publication of CN111081275A
Application granted
Publication of CN111081275B
Status: Active

Classifications

    • G10L25/27: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00, characterised by the analysis technique
    • G10L25/30: Speech or voice analysis techniques characterised by the analysis technique using neural networks
    • G10L25/51: Speech or voice analysis techniques specially adapted for particular use, for comparison or discrimination
    • H04M1/72448: User interfaces specially adapted for cordless or mobile telephones, with means for adapting the functionality of the device according to specific conditions
    • H04M1/72454: User interfaces adapting the functionality of the device according to context-related or environment-related conditions
    • Y02D30/70: Reducing energy consumption in wireless communication networks

Abstract

An embodiment of the present application discloses a terminal processing method and device based on sound analysis, a storage medium, and a terminal. The method comprises the following steps: collecting sound information in the current scene; analyzing the sound information to obtain an analysis result; determining the type to which the current scene belongs based on the analysis result; and executing the operation instruction corresponding to that type. By intelligently identifying the current scene state and executing the operation instruction of the corresponding service, the scheme improves the intelligence of the terminal and brings convenience to the user.

Description

Terminal processing method and device based on sound analysis, storage medium and terminal
Technical Field
The present application relates to the field of terminal technologies, and in particular, to a terminal processing method and apparatus based on sound analysis, a storage medium, and a terminal.
Background
Artificial intelligence technology is increasingly being brought to intelligent terminal devices, giving consumers new intelligent experiences. Its core is enabling a machine to build intelligent decision-making capability by learning from large amounts of data, so that it can be applied in users' daily lives and bring them convenience.
Disclosure of Invention
The embodiment of the application provides a terminal processing method and device based on sound analysis, a storage medium and a terminal, and can improve the intelligence of the terminal.
In a first aspect, an embodiment of the present application provides a terminal processing method based on sound analysis, including:
collecting sound information under a current scene;
analyzing the sound information to obtain an analysis result;
determining the type of the current scene based on the analysis result;
and executing the operation instruction corresponding to the type of the current scene.
In some embodiments, the determining the type of the current scene based on the analysis result includes:
determining the type of the current scene from a plurality of sample scene types according to the analysis result, wherein the plurality of sample scene types at least comprise: noisy scenes, dangerous scenes, and quiet scenes.
In some embodiments, the executing the operation instruction corresponding to the type of the current scene includes:
if the type of the current scene is a noisy scene, acquiring a current geographic position;
determining whether the geographic location is within a preset area;
if so, adjusting the current information reminding mode to vibration reminding.
In some embodiments, after determining that the geographic location is within the preset area, the method further comprises:
and when detecting that the terminal plays the first audio signal, carrying out noise reduction processing on the first audio signal.
In some embodiments, the executing the operation instruction corresponding to the type of the current scene includes:
if the type of the current scene is a dangerous scene, acquiring current environment image information;
determining the position coverage range of the dangerous area based on the current geographic position and the environment image information;
generating route guidance information according to the position coverage range;
and performing information prompt on the user according to the route guidance information.
In some embodiments, the executing the operation instruction corresponding to the type of the current scene includes:
if the type of the current scene is a quiet scene, when a second audio signal to be played is detected, adjusting a volume parameter value of the second audio signal to be lower than the current volume parameter value.
In some embodiments, the analyzing the sound information to obtain an analysis result includes:
carrying out feature extraction on the sound information to obtain sound features;
identifying the sound information according to the sound characteristics;
determining the sound type, the number of sound types, the number of sound sources and the sound characteristic parameter contained in the sound information based on the identification result;
and generating the analysis result at least according to the sound type, the sound type quantity, the sound source quantity and the sound characteristic parameter.
In a second aspect, an embodiment of the present application provides a terminal processing apparatus based on sound analysis, including:
the acquisition unit is used for acquiring sound information under the current scene;
the analysis unit is used for analyzing the sound information to obtain an analysis result;
a determining unit, configured to determine a type to which a current scene belongs based on the analysis result;
and the processing unit is used for executing the operation instruction corresponding to the type of the current scene.
In some embodiments, the determining unit is to:
determining the type of the current scene from a plurality of sample scene types according to the analysis result, wherein the plurality of sample scene types at least comprise: noisy scenes, dangerous scenes, and quiet scenes.
In some embodiments, the processing unit is to:
if the type of the current scene is a noisy scene, acquiring a current geographic position;
determining whether the geographic location is within a preset area;
if so, adjusting the current information reminding mode to vibration reminding.
In some embodiments, the processing unit is further to:
after the geographic position is determined to be in the preset area, when a first audio signal played by the terminal is detected, carrying out noise reduction processing on the first audio signal.
In some embodiments, the processing unit is to:
if the type of the current scene is a dangerous scene, acquiring current environment image information;
determining the position coverage range of the dangerous area based on the current geographic position and the environment image information;
generating route guidance information according to the position coverage range;
and performing information prompt on the user according to the route guidance information.
In some embodiments, the processing unit is to:
if the type of the current scene is a quiet scene, when a second audio signal to be played is detected, adjusting a volume parameter value of the second audio signal to be lower than the current volume parameter value.
In some embodiments, the analysis unit comprises:
the extraction subunit is used for carrying out feature extraction on the sound information to obtain sound features;
the identification subunit is used for identifying the sound information according to the sound characteristics;
a determining subunit configured to determine, based on the recognition result, a sound type, a number of sound types, a number of sound sources, and a sound characteristic parameter included in the sound information;
and the generating subunit is used for generating the analysis result at least according to the sound type, the sound type quantity, the sound source quantity and the sound characteristic parameter.
In a third aspect, the present application further provides a computer-readable storage medium, where a plurality of instructions are stored, where the instructions are adapted to be loaded by a processor to execute the above terminal processing method based on sound analysis.
In a fourth aspect, an embodiment of the present application further provides a terminal, including a processor and a memory, where the processor is electrically connected to the memory, the memory is used to store instructions and data, and the processor is used to execute the terminal processing method based on sound analysis.
In the implementation of the application, sound information in the current scene is collected and analyzed to obtain an analysis result. The type to which the current scene belongs is then determined based on the analysis result, and the operation instruction corresponding to that type is executed. By intelligently identifying the current scene state and executing the operation instruction of the corresponding service, the scheme improves the intelligence of the terminal and brings convenience to the user.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic flowchart of a terminal processing method based on sound analysis according to an embodiment of the present application.
Fig. 2 is a scene schematic diagram of a terminal processing method based on sound analysis according to an embodiment of the present application.
Fig. 3 is a schematic structural diagram of a terminal processing device based on sound analysis according to an embodiment of the present application.
Fig. 4 is another schematic structural diagram of a terminal processing device based on sound analysis according to an embodiment of the present application.
Fig. 5 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Fig. 6 is another schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The embodiment of the application provides a terminal processing method and device based on sound analysis, a storage medium and a terminal. The details will be described below separately.
In an embodiment, a terminal processing method based on sound analysis is provided, applied to terminal devices such as smart phones, tablet computers, and notebook computers. Referring to fig. 1, the specific flow of the method may be as follows:
101. and collecting sound information under the current scene.
Sound is a wave produced by a vibrating object; it propagates through a medium (air, liquid, or solid) and can be perceived by the auditory organs of humans and animals. The vibrating object that produces the sound is called the sound source. The number of times an object vibrates in one second is called the frequency, measured in hertz (Hz). The human ear can hear sounds from roughly 20 Hz to 20,000 Hz and is most sensitive to sounds between 1,000 Hz and 3,000 Hz.
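The frequency notion above can be illustrated with a short sketch: the snippet below estimates the frequency of a generated 440 Hz tone by counting zero crossings. This is a simple stand-in for real spectral analysis; the function name, tone, and sample rate are chosen purely for this example.

```python
import math

def dominant_frequency(samples, sample_rate):
    """Estimate the frequency of a roughly periodic signal by
    counting zero crossings: each full cycle crosses zero twice."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    duration = len(samples) / sample_rate
    return crossings / (2 * duration)

# One second of a 440 Hz tone sampled at 8 kHz.
rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
freq = dominant_frequency(tone, rate)  # close to 440 Hz
```

A real analyzer would use an FFT or a learned model rather than zero crossings, but the sketch shows how frequency relates to vibrations per second.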
In the embodiment of the application, a microphone is arranged in the terminal device, and the terminal can specifically collect sound information in the external environment through the microphone.
In some embodiments, the microphone may also be an external device, which may establish a wireless link with a terminal device, and send the collected sound information to the terminal device through the wireless link, so that the terminal obtains the sound information in the current scene.
102. And analyzing the sound information to obtain an analysis result.
In this embodiment, artificial intelligence (AI) techniques are applied in advance to label and learn from common environmental sound data, so that the terminal device can recognize the sounds in its current environment.
Specifically, common sound-source data is first fed into an AI model for training: common sound classes are labelled, and the different sound sources producing each sound are identified. After extensive learning, the terminal device gains the ability to recognize a wide variety of sounds.
In practical applications, the sound information may be analyzed and processed by methods such as neural networks, hidden Markov models, VQ clustering, or polynomial classifiers. For example, in some embodiments, the step "analyzing the sound information to obtain the analysis result" may include the following processes:
(11) carrying out feature extraction on the sound information to obtain sound features;
(12) recognizing the sound information according to the sound characteristics;
(13) determining the sound type, the number of sound types, the number of sound sources and the sound characteristic parameter contained in the sound information based on the identification result;
(14) and generating an analysis result at least according to the sound type, the sound type quantity, the sound source quantity and the sound characteristic parameter.
Specifically, the collected sound information may be input into a trained AI model to extract features such as spectrum, rate, timbre, pitch, volume, and loudness. The sounds are then classified and recognized according to the extracted features, so that every sound in the current sound information is identified, yielding a recognition result. From this result, the sound types, the number of sound types, the number of sound sources, and the sound characteristic parameters contained in the sound information are determined, and the analysis result is generated from this information.
Sound types may include sounds produced by different sources, such as human speech, animal calls, wind, car horns, and ringtones. Sound characteristic parameters may include information such as timbre, volume, and loudness.
In specific implementation, before feature extraction is performed on the sound information, denoising processing may be performed on the sound information in advance to remove some atypical (i.e., non-classifiable) sounds, so as to improve the accuracy of the subsequent sound analysis result.
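The feature-extraction step (11) above can be sketched minimally. This toy extractor computes only RMS energy and a decibel-style level from raw samples; a real implementation would also extract the spectral, timbre, and pitch features mentioned in the text. The function name and the dB floor are assumptions of this sketch.

```python
import math

def extract_features(samples):
    """Toy feature extractor over raw samples in [-1, 1]: returns RMS
    energy and a decibel-style level for the later scene decision."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # Relative level; the 1e-12 floor avoids log10(0) on pure silence.
    level_db = 20 * math.log10(max(rms, 1e-12))
    return {"rms": rms, "level_db": level_db}

loud = extract_features([0.9, -0.9] * 100)
silence = extract_features([0.0] * 200)
```

Denoising, as noted above, would run before this step so that atypical sounds do not distort the extracted features.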
103. And determining the type of the current scene based on the analysis result.
Specifically, in some embodiments, the step "determining the type of the current scene based on the analysis result" may include the following steps:
determining the type of the current scene from a plurality of sample scene types according to the analysis result, wherein the plurality of sample scene types at least comprise: noisy scenes, dangerous scenes, and quiet scenes.
In this embodiment, a plurality of sample scenes must be preset; the types of sound they can contain differ, so the sound components of each sample scene need to be prepared in advance. The type to which the current scene belongs is then determined from these constructed sample scenes based on the analysis result.
In this embodiment, several typical scenes may be constructed in advance: a noisy scene (containing many different sound types and a certain number of sound sources, such as the sounds of a downtown district or a vegetable market), a quiet scene (low sound level in decibels), and a dangerous scene (containing special sounds from special sources, such as explosions).
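The scene determination in this step can be sketched as a rule-based classifier over the analysis result. The thresholds, field names, and danger-sound list below are invented for illustration; the patent itself leaves the exact matching criteria open (a trained model could replace the rules).

```python
def classify_scene(analysis):
    """Illustrative rules for the three sample scene types described
    in the text: dangerous, noisy, and quiet."""
    DANGER_SOUNDS = {"explosion", "alarm", "scream"}  # assumed list
    if DANGER_SOUNDS & set(analysis["sound_types"]):
        return "dangerous"
    if analysis["num_sources"] >= 4 and analysis["level_db"] > 70:
        return "noisy"
    if analysis["level_db"] < 40:
        return "quiet"
    return "unclassified"

street = classify_scene(
    {"sound_types": ["speech", "horn", "engine", "music"],
     "num_sources": 8, "level_db": 82})
library = classify_scene(
    {"sound_types": ["speech"], "num_sources": 1, "level_db": 30})
blast = classify_scene(
    {"sound_types": ["explosion"], "num_sources": 1, "level_db": 110})
```

Checking the danger condition first mirrors the text's priority: a special sound should override an otherwise noisy or quiet reading.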
104. And executing the operation instruction corresponding to the type of the current scene.
Specifically, with reference to fig. 1 and fig. 2, the operation instruction to be executed is different according to different scene types in the actual application.
In some embodiments, if the type to which the current scene belongs is a noisy scene, the step "execute an operation instruction corresponding to the type to which the current scene belongs" may include the following steps:
(21) acquiring a current geographic position;
(22) determining whether the geographic location is within a preset area;
(23) if so, adjusting the current information reminding mode to vibration reminding.
Specifically, a positioning device may be provided in the terminal, and the current geographic location can be obtained using positioning technologies such as GPS (Global Positioning System), Wi-Fi, or Bluetooth positioning. Combining the current geographic location with the current scene type, the terminal determines whether it is in a preset, typically noisy area such as a downtown district, a restaurant, or a vegetable market. If so, audible prompts such as the ringtone and message alerts are hard to perceive over the noise of the current scene, so the terminal's alert mode can be switched to vibration, making the terminal's state easier for the user to notice.
Of course, if the current alert mode is already vibration, no adjustment is needed.
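The noisy-scene handling above (locate, check the preset area, switch to vibration) can be sketched as follows. The circular-area representation, the haversine distance check, and the mode names are assumptions of this sketch, not details from the patent.

```python
import math

def within_preset_area(lat, lon, center, radius_m):
    """True if the current fix lies inside a preset circular area,
    using the haversine great-circle distance."""
    earth_r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat), math.radians(center[0])
    dp = math.radians(center[0] - lat)
    dl = math.radians(center[1] - lon)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * earth_r * math.asin(math.sqrt(a)) <= radius_m

def reminder_mode(current_mode, in_noisy_area):
    # Switch to vibration only when not already vibrating.
    return ("vibrate" if in_noisy_area and current_mode != "vibrate"
            else current_mode)

market = (22.0, 114.0)  # hypothetical preset noisy-area centre
inside = within_preset_area(22.0005, 114.0, market, 200.0)
mode = reminder_mode("ring", inside)
```

In practice the preset areas might come from a map service rather than hard-coded circles, but the decision flow is the same.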
In some embodiments, after the geographic location is determined to be within the preset area, if it is detected that the terminal plays the first audio signal, noise reduction processing may be further performed on the first audio signal, so that the user can hear specific content of the audio more clearly.
The first audio signal may specifically be an audio signal generated when the terminal performs a voice call or a video call with another terminal, or an audio signal carried in a voice message sent by a terminal of the other party and received by the terminal.
In some embodiments, if the type to which the current scene belongs is a dangerous scene, the step "execute an operation instruction corresponding to the type to which the current scene belongs" may include the following steps:
(31) acquiring current environment image information;
(32) determining the position coverage range of the dangerous area based on the current geographic position and the environmental image information;
(33) generating route guidance information according to the position coverage range;
(34) and performing information prompt on the user according to the route guidance information.
Specifically, a camera can be arranged in the terminal, and the terminal can acquire the current external environment image information by opening the camera. In addition, the current geographical position is determined through the positioning function of the terminal, and the position coverage of the dangerous area is determined by combining the acquired external environment image, the current geographical position and the analysis result (mainly sound loudness, sound type and the like).
After the coverage area of the dangerous area is determined, route guidance information can be generated by combining road condition information, building information and the like under the current environment so as to prompt the user to be guided to transfer to a safe area, and the safety of the user is guaranteed.
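The danger-area coverage estimate can be illustrated with a crude acoustic calculation: under a free-field, inverse-square assumption, the distance to a loud source can be inferred from how much its level has decayed relative to a reference. The reference level and the formula's applicability are assumptions for illustration only; the patent combines loudness with image and position data rather than relying on this alone.

```python
def estimated_source_distance_m(measured_db, source_db_at_1m):
    """Free-field estimate: sound pressure level falls by 20*log10(d)
    dB relative to a 1 m reference (inverse-square law)."""
    return 10 ** ((source_db_at_1m - measured_db) / 20)

# A source assumed to be 140 dB at 1 m, measured at 100 dB here:
d = estimated_source_distance_m(measured_db=100.0, source_db_at_1m=140.0)
```

Such an estimate could seed the radius of the danger-area coverage, which route guidance then steers around.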
In some embodiments, if the current scene is a quiet scene, the corresponding operation instruction is executed as follows: when a second audio signal to be played is detected, its volume parameter value is adjusted below the current volume parameter value. This reduces the playback volume, avoids producing excessive sound in a quiet environment, and spares the user from adjusting the volume manually.
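The quiet-scene volume adjustment can be sketched in one small function; the 50% reduction factor and the lower floor are illustrative values, not specified in the text.

```python
def quiet_scene_volume(current_volume, factor=0.5, floor=0.05):
    """Return a playback volume below the current setting for a quiet
    scene, but never fully muted (the floor keeps audio audible)."""
    return max(current_volume * factor, floor)

lowered = quiet_scene_volume(0.8)  # pending audio plays at 0.4
```

A refinement could scale the factor by how far the measured level sits below the quiet-scene threshold.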
As can be seen from the above, the terminal processing method based on sound analysis provided in this embodiment collects sound information in the current scene and analyzes it to obtain an analysis result. The type to which the current scene belongs is then determined based on the analysis result, and the operation instruction corresponding to that type is executed. By intelligently identifying the current scene state and executing the operation instruction of the corresponding service, the scheme improves the intelligence of the terminal and brings convenience to the user.
In another embodiment of the present application, a terminal processing apparatus based on sound analysis is further provided. The apparatus may be integrated in a terminal in the form of software or hardware, and the terminal may specifically be a mobile phone, tablet computer, notebook computer, or the like. As shown in fig. 3, the terminal processing device 300 based on sound analysis may include: an acquisition unit 301, an analysis unit 302, a determination unit 303 and a processing unit 304, wherein:
the acquisition unit 301 is configured to acquire sound information in a current scene;
an analysis unit 302, configured to analyze the sound information to obtain an analysis result;
a determining unit 303, configured to determine a type to which the current scene belongs based on the analysis result;
and the processing unit 304 is used for executing the operation instruction corresponding to the type of the current scene.
In some embodiments, the determining unit 303 may be configured to:
determining the type of the current scene from a plurality of sample scene types according to the analysis result, wherein the plurality of sample scene types at least comprise: noisy scenes, dangerous scenes, and quiet scenes.
In some embodiments, the processing unit 304 may be configured to:
if the type of the current scene is a noisy scene, acquiring a current geographic position;
determining whether the geographic location is within a preset area;
if so, adjusting the current information reminding mode to vibration reminding.
In some embodiments, the processing unit 304 may be further configured to:
after the geographic position is determined to be in the preset area, when a first audio signal played by the terminal is detected, noise reduction processing is carried out on the first audio signal.
In some embodiments, the processing unit 304 may be configured to:
if the type of the current scene is a dangerous scene, acquiring current environment image information;
determining the position coverage range of the dangerous area based on the current geographic position and the environment image information;
generating route guidance information according to the position coverage range;
and performing information prompt on the user according to the route guidance information.
In some embodiments, the processing unit 304 may be configured to:
if the type of the current scene is a quiet scene, when a second audio signal to be played is detected, adjusting a volume parameter value of the second audio signal to be lower than the current volume parameter value.
Referring to fig. 4, in some embodiments, the analysis unit 302 may include:
an extracting subunit 3021, configured to perform feature extraction on the sound information to obtain sound features;
an identifying subunit 3022, configured to identify the sound information according to the sound feature;
a determining subunit 3023 configured to determine the sound type, the number of sound types, the number of sound sources, and the sound characteristic parameter included in the sound information based on the recognition result;
a generating subunit 3024, configured to generate the analysis result according to at least the sound type, the number of sound types, the number of sound sources, and a sound characteristic parameter.
As can be seen from the above, the terminal processing device based on sound analysis provided in the embodiment of the present application collects sound information in the current scene and analyzes it to obtain an analysis result. The type to which the current scene belongs is then determined based on the analysis result, and the operation instruction corresponding to that type is executed. By intelligently identifying the current scene state and executing the operation instruction of the corresponding service, the scheme improves the intelligence of the terminal and brings convenience to the user.
In another embodiment of the present application, a terminal is further provided, where the terminal may be a terminal device such as a smart phone and a tablet computer. As shown in fig. 5, the terminal 400 includes a processor 401 and a memory 402. The processor 401 is electrically connected to the memory 402.
The processor 401 is a control center of the terminal 400, connects various parts of the entire terminal using various interfaces and lines, and performs various functions of the terminal and processes data by running or loading an application stored in the memory 402 and calling data stored in the memory 402, thereby performing overall monitoring of the terminal.
In this embodiment, the processor 401 in the terminal 400 loads instructions corresponding to the processes of one or more applications into the memory 402 according to the following steps, and runs the applications stored in the memory 402, thereby implementing various functions:
collecting sound information under a current scene;
analyzing the sound information to obtain an analysis result;
determining the type of the current scene based on the analysis result;
and executing the operation instruction corresponding to the type of the current scene.
In some embodiments, in determining the type of the current scene based on the analysis result, processor 401 further performs the steps of:
determining the type of the current scene from a plurality of sample scene types according to the analysis result, wherein the plurality of sample scene types at least comprise: noisy scenes, dangerous scenes, and quiet scenes.
In some embodiments, when executing the operation instruction corresponding to the type to which the current scene belongs, the processor 401 further performs the following steps:
if the type to which the current scene belongs is a noisy scene, acquiring the current geographic position;
determining whether the geographic position is within a preset area;
if so, switching the current information reminding mode to vibration reminding.
In some embodiments, after determining that the geographic location is within the preset area, processor 401 further performs the steps of:
when detecting that the terminal is playing a first audio signal, performing noise reduction processing on the first audio signal.
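A minimal sketch of this noisy-scene branch, assuming a rectangular geofence and terminal state held in a plain dict; the function names and state keys are invented for illustration and are not the patent's API.

```python
def in_preset_area(pos, area):
    """pos = (lat, lon); area = (lat_min, lat_max, lon_min, lon_max)."""
    lat, lon = pos
    lat_min, lat_max, lon_min, lon_max = area
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max

def handle_noisy(pos, area, terminal):
    """Inside the preset area: switch alerts to vibration and enable
    noise reduction on any audio the terminal is currently playing."""
    if in_preset_area(pos, area):
        terminal["alert_mode"] = "vibrate"
        if terminal.get("playing_audio"):
            terminal["noise_reduction"] = True
    return terminal
```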
In some embodiments, if the type of the current scene is a dangerous scene, the processor 401 further performs the following steps:
acquiring current environment image information;
determining the position coverage of the dangerous area based on the current geographic position and the environment image information;
generating route guidance information according to the position coverage;
prompting the user according to the route guidance information.
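To make the route-guidance idea concrete, here is a toy sketch that models the dangerous area as a circle in a local flat-ground coordinate system (metres) and tells the user which heading leads out of it. The circular model, the bearing convention, and all names are assumptions for illustration, not the patent's method.

```python
import math

def guidance_away(user, center, radius_m):
    """user/center are (x, y) in metres, x = east, y = north.
    Return a human-readable route prompt leading out of the circle."""
    dx, dy = user[0] - center[0], user[1] - center[1]
    dist = math.hypot(dx, dy)
    if dist >= radius_m:
        return "You are outside the danger area."
    # Head directly away from the centre; bearing clockwise from north.
    bearing = math.degrees(math.atan2(dx, dy)) % 360
    return f"Move heading {bearing:.0f} deg for at least {radius_m - dist:.0f} m."
```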
In some embodiments, if the type of the current scene is a quiet scene, when the second audio signal to be played is detected, the processor 401 further performs the following steps:
adjusting the volume parameter value of the second audio signal to be lower than the current volume parameter value.
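The quiet-scene adjustment amounts to scaling the pending audio's volume below the current setting. A one-function sketch; the name, the 0.5 factor, and the floor are illustrative choices, not values from the patent.

```python
def lower_volume(current_volume: int, factor: float = 0.5, floor: int = 1) -> int:
    """Return a volume for the second audio signal below the current value;
    the floor keeps playback from being muted entirely."""
    return max(floor, int(current_volume * factor))
```

For example, with the default factor, a current volume of 80 would be reduced to 40 before the second audio signal is played.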
In some embodiments, when analyzing the sound information to obtain an analysis result, the processor 401 further performs the following steps:
performing feature extraction on the sound information to obtain sound features;
identifying the sound information according to the sound features;
determining, based on the identification result, the sound types contained in the sound information, the number of sound types, the number of sound sources, and the sound characteristic parameters;
generating the analysis result according to at least the sound types, the number of sound types, the number of sound sources, and the sound characteristic parameters.
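As an illustration of the feature-extraction step, the sketch below computes two classic short-time audio features, RMS energy and zero-crossing rate, and labels the sound with a toy rule. The 0.3 threshold, the labels, and every name here are assumptions; the patent does not specify which features or classifier are used.

```python
import math

def extract_features(samples):
    """samples: a list of PCM amplitudes in [-1.0, 1.0]."""
    n = len(samples)
    rms = math.sqrt(sum(s * s for s in samples) / n)            # loudness proxy
    zcr = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0) / (n - 1)
    return {"rms": rms, "zcr": zcr}

def analyze(samples):
    """Produce a result in the shape the embodiment describes:
    sound types, their count, a source count, and the raw features."""
    f = extract_features(samples)
    sound_type = "noise-like" if f["zcr"] > 0.3 else "tonal"    # toy labelling
    return {"sound_types": [sound_type], "type_count": 1,
            "source_count": 1, "features": f}
```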
The memory 402 may be used to store applications and data. The applications stored in the memory 402 contain instructions executable by the processor and may constitute various functional modules. By running the applications stored in the memory 402, the processor 401 executes various functional applications and performs the terminal processing based on sound analysis.
In some embodiments, as shown in fig. 6, the terminal 400 further includes: a display 403, a control circuit 404, a radio frequency circuit 405, a microphone 406, a camera 407, a sensor 408, and a power supply 409. The processor 401 is electrically connected to the display 403, the control circuit 404, the radio frequency circuit 405, the microphone 406, the camera 407, the sensor 408, and the power supply 409.
The display screen 403 may be used to display information input by or provided to the user as well as various graphical user interfaces of the terminal, which may be constituted by images, text, icons, video, and any combination thereof.
The control circuit 404 is electrically connected to the display 403, and is configured to control the display 403 to display information.
The radio frequency circuit 405 is used for transmitting and receiving radio frequency signals so as to establish wireless communication with a network or other terminals, and for exchanging signals with a server or other terminals.
The microphone 406 is an energy conversion device used to detect sound signals in the external environment and convert them into electrical signals. According to the transduction principle, microphones can be classified into two types: electrodynamic (moving-coil) microphones and condenser microphones.
The camera 407 may be used to collect image information. The camera may be a single camera with one lens, or may have two or more lenses.
The sensor 408 is used to collect external environmental information. The sensors 408 may include ambient light sensors, acceleration sensors, light sensors, motion sensors, and other sensors.
The power supply 409 is used to power the various components of the terminal 400. In some embodiments, the power source 409 may be logically connected to the processor 401 through a power management system, so that functions of managing charging, discharging, and power consumption are implemented through the power management system.
Although not shown in fig. 6, the terminal 400 may further include a speaker, a bluetooth module, and the like, which will not be described in detail herein.
Therefore, the terminal provided in the embodiments of the present application collects sound information in the current scene and analyzes it to obtain an analysis result. The type to which the current scene belongs is then determined based on the analysis result, and an operation instruction corresponding to that type is executed. By intelligently identifying the state of the current scene and executing the operation instruction of the corresponding service, the scheme improves the intelligence of the terminal and brings convenience to the user.
In some embodiments, a computer-readable storage medium is further provided, in which a plurality of instructions are stored, the instructions being adapted to be loaded by a processor to perform any of the above terminal processing methods based on sound analysis.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing the associated hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
The terminal processing method and device based on sound analysis, the storage medium, and the terminal provided in the embodiments of the present application are described in detail above. Specific examples are used herein to explain the principles and implementation of the present application, and the description of the above embodiments is only intended to help understand the method and core ideas of the present application. Meanwhile, those skilled in the art may make changes to the specific embodiments and the application scope according to the ideas of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A terminal processing method based on sound analysis is characterized by comprising the following steps:
collecting sound information in a current scene;
analyzing the sound information to obtain an analysis result;
determining the type to which the current scene belongs based on the analysis result;
executing an operation instruction corresponding to the type to which the current scene belongs.
2. The method according to claim 1, wherein the determining the type of the current scene based on the analysis result comprises:
determining the type of the current scene from a plurality of sample scene types according to the analysis result, wherein the plurality of sample scene types at least comprise: noisy scenes, dangerous scenes, and quiet scenes.
3. The terminal processing method based on sound analysis according to claim 1, wherein the executing the operation instruction corresponding to the type of the current scene comprises:
if the type to which the current scene belongs is a noisy scene, acquiring a current geographic position;
determining whether the geographic position is within a preset area;
if so, switching a current information reminding mode to vibration reminding.
4. The terminal processing method based on sound analysis according to claim 3, further comprising, after determining that the geographical location is within a preset area:
when detecting that the terminal is playing a first audio signal, performing noise reduction processing on the first audio signal.
5. The terminal processing method based on sound analysis according to claim 1, wherein the executing the operation instruction corresponding to the type of the current scene comprises:
if the type to which the current scene belongs is a dangerous scene, acquiring current environment image information;
determining the position coverage of the dangerous area based on a current geographic position and the environment image information;
generating route guidance information according to the position coverage;
prompting the user according to the route guidance information.
6. The terminal processing method based on sound analysis according to claim 1, wherein the executing the operation instruction corresponding to the type of the current scene comprises:
if the type of the current scene is a quiet scene, when a second audio signal to be played is detected, adjusting a volume parameter value of the second audio signal to be lower than the current volume parameter value.
7. The terminal processing method based on sound analysis according to any one of claims 1 to 6, wherein the analyzing the sound information to obtain an analysis result comprises:
performing feature extraction on the sound information to obtain sound features;
identifying the sound information according to the sound features;
determining, based on the identification result, the sound types contained in the sound information, the number of sound types, the number of sound sources, and the sound characteristic parameters;
generating the analysis result according to at least the sound types, the number of sound types, the number of sound sources, and the sound characteristic parameters.
8. A terminal processing apparatus based on sound analysis, comprising:
the acquisition unit is used for acquiring sound information under the current scene;
the analysis unit is used for analyzing the sound information to obtain an analysis result;
a determining unit, configured to determine a type to which a current scene belongs based on the analysis result;
and the processing unit is used for executing the operation instruction corresponding to the type of the current scene.
9. A computer-readable storage medium, characterized in that a plurality of instructions are stored in the storage medium, said instructions being adapted to be loaded by a processor to perform the method for terminal processing based on sound analysis according to any of claims 1-7.
10. A terminal is characterized by comprising a processor and a memory, wherein the processor is electrically connected with the memory, and the memory is used for storing instructions and data; the processor is used for executing the terminal processing method based on the sound analysis of any one of claims 1-7.
CN201911325074.1A 2019-12-20 2019-12-20 Terminal processing method and device based on sound analysis, storage medium and terminal Active CN111081275B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911325074.1A CN111081275B (en) 2019-12-20 2019-12-20 Terminal processing method and device based on sound analysis, storage medium and terminal


Publications (2)

Publication Number Publication Date
CN111081275A true CN111081275A (en) 2020-04-28
CN111081275B CN111081275B (en) 2023-05-26

Family

ID=70316248

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911325074.1A Active CN111081275B (en) 2019-12-20 2019-12-20 Terminal processing method and device based on sound analysis, storage medium and terminal

Country Status (1)

Country Link
CN (1) CN111081275B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111601000A (en) * 2020-05-14 2020-08-28 支付宝(杭州)信息技术有限公司 Communication network fraud identification method and device and electronic equipment
CN112866480A (en) * 2021-01-05 2021-05-28 北京小米移动软件有限公司 Information processing method, information processing device, electronic equipment and storage medium
CN116320144A (en) * 2022-09-23 2023-06-23 荣耀终端有限公司 Audio playing method and electronic equipment

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103945062A (en) * 2014-04-16 2014-07-23 华为技术有限公司 User terminal volume adjusting method, device and terminal
CN107277260A (en) * 2017-07-07 2017-10-20 珠海格力电器股份有限公司 A kind of contextual model method of adjustment, device and mobile terminal
CN108924348A (en) * 2018-06-26 2018-11-30 努比亚技术有限公司 Terminal scene mode conversion method, device and computer readable storage medium
CN109040473A (en) * 2018-10-23 2018-12-18 珠海格力电器股份有限公司 terminal volume adjusting method, system and mobile phone
CN109151719A (en) * 2018-09-28 2019-01-04 北京小米移动软件有限公司 Safety guide method, device and storage medium
CN109995799A (en) * 2017-12-29 2019-07-09 广东欧珀移动通信有限公司 Information-pushing method, device, terminal and storage medium
CN110019931A (en) * 2017-12-05 2019-07-16 腾讯科技(深圳)有限公司 Audio frequency classification method, device, smart machine and storage medium
US20190227767A1 (en) * 2016-09-27 2019-07-25 Huawei Technologies Co., Ltd. Volume Adjustment Method and Terminal
CN110473566A (en) * 2019-07-25 2019-11-19 深圳壹账通智能科技有限公司 Audio separation method, device, electronic equipment and computer readable storage medium
CN110580914A (en) * 2019-07-24 2019-12-17 安克创新科技股份有限公司 Audio processing method and equipment and device with storage function

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103945062A (en) * 2014-04-16 2014-07-23 华为技术有限公司 User terminal volume adjusting method, device and terminal
US20170034362A1 (en) * 2014-04-16 2017-02-02 Huawei Technologies Co., Ltd. Method and Apparatus for Adjusting Volume of User Terminal, and Terminal
US20190227767A1 (en) * 2016-09-27 2019-07-25 Huawei Technologies Co., Ltd. Volume Adjustment Method and Terminal
CN107277260A (en) * 2017-07-07 2017-10-20 珠海格力电器股份有限公司 A kind of contextual model method of adjustment, device and mobile terminal
CN110019931A (en) * 2017-12-05 2019-07-16 腾讯科技(深圳)有限公司 Audio frequency classification method, device, smart machine and storage medium
CN109995799A (en) * 2017-12-29 2019-07-09 广东欧珀移动通信有限公司 Information-pushing method, device, terminal and storage medium
CN108924348A (en) * 2018-06-26 2018-11-30 努比亚技术有限公司 Terminal scene mode conversion method, device and computer readable storage medium
CN109151719A (en) * 2018-09-28 2019-01-04 北京小米移动软件有限公司 Safety guide method, device and storage medium
CN109040473A (en) * 2018-10-23 2018-12-18 珠海格力电器股份有限公司 terminal volume adjusting method, system and mobile phone
CN110580914A (en) * 2019-07-24 2019-12-17 安克创新科技股份有限公司 Audio processing method and equipment and device with storage function
CN110473566A (en) * 2019-07-25 2019-11-19 深圳壹账通智能科技有限公司 Audio separation method, device, electronic equipment and computer readable storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111601000A (en) * 2020-05-14 2020-08-28 支付宝(杭州)信息技术有限公司 Communication network fraud identification method and device and electronic equipment
CN112866480A (en) * 2021-01-05 2021-05-28 北京小米移动软件有限公司 Information processing method, information processing device, electronic equipment and storage medium
CN116320144A (en) * 2022-09-23 2023-06-23 荣耀终端有限公司 Audio playing method and electronic equipment
CN116320144B (en) * 2022-09-23 2023-11-14 荣耀终端有限公司 Audio playing method, electronic equipment and readable storage medium

Also Published As

Publication number Publication date
CN111081275B (en) 2023-05-26

Similar Documents

Publication Publication Date Title
CN109166593B (en) Audio data processing method, device and storage medium
CN111081275B (en) Terminal processing method and device based on sound analysis, storage medium and terminal
JP2021516786A (en) Methods, devices, and computer programs to separate the voices of multiple people
CN103945062A (en) User terminal volume adjusting method, device and terminal
CN110995933A (en) Volume adjusting method and device of mobile terminal, mobile terminal and storage medium
CN110364156A (en) Voice interactive method, system, terminal and readable storage medium storing program for executing
EP4191579A1 (en) Electronic device and speech recognition method therefor, and medium
KR101965313B1 (en) System for Providing Noise Map Based on Big Data Using Sound Collection Device Looked Like Earphone
CN110931000B (en) Method and device for speech recognition
CN111696532A (en) Speech recognition method, speech recognition device, electronic device and storage medium
CN111105788B (en) Sensitive word score detection method and device, electronic equipment and storage medium
CN107863110A (en) Safety prompt function method, intelligent earphone and storage medium based on intelligent earphone
CN111739517A (en) Speech recognition method, speech recognition device, computer equipment and medium
CN110992963A (en) Network communication method, device, computer equipment and storage medium
CN114299933A (en) Speech recognition model training method, device, equipment, storage medium and product
CN110910876A (en) Article sound searching device and control method, and voice control setting method and system
CN111613213A (en) Method, device, equipment and storage medium for audio classification
CN108600559B (en) Control method and device of mute mode, storage medium and electronic equipment
CN110944056A (en) Interaction method, mobile terminal and readable storage medium
CN114333774A (en) Speech recognition method, speech recognition device, computer equipment and storage medium
CN112735388B (en) Network model training method, voice recognition processing method and related equipment
CN113220590A (en) Automatic testing method, device, equipment and medium for voice interaction application
CN114360546A (en) Electronic equipment and awakening method thereof
CN112614507A (en) Method and apparatus for detecting noise
CN111081102B (en) Dictation result detection method and learning equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant