CN111081275B - Terminal processing method and device based on sound analysis, storage medium and terminal - Google Patents


Info

Publication number
CN111081275B
CN111081275B (application CN201911325074.1A)
Authority
CN
China
Prior art keywords
sound
information
scene
current
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911325074.1A
Other languages
Chinese (zh)
Other versions
CN111081275A (en)
Inventor
李岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huizhou TCL Mobile Communication Co Ltd
Original Assignee
Huizhou TCL Mobile Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huizhou TCL Mobile Communication Co Ltd filed Critical Huizhou TCL Mobile Communication Co Ltd
Priority to CN201911325074.1A priority Critical patent/CN111081275B/en
Publication of CN111081275A publication Critical patent/CN111081275A/en
Application granted granted Critical
Publication of CN111081275B publication Critical patent/CN111081275B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M1/00Substation equipment, e.g. for use by subscribers
    • H04M1/72Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72448User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions
    • H04M1/72454User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions according to context-related or environment-related conditions
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/70Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Environmental & Geological Engineering (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Telephone Function (AREA)
  • Emergency Alarm Devices (AREA)

Abstract

The embodiment of the application discloses a terminal processing method and device based on sound analysis, a storage medium and a terminal. The method comprises the following steps: collecting sound information in a current scene; analyzing the sound information to obtain an analysis result; determining the type of the current scene based on the analysis result; and executing an operation instruction corresponding to the type of the current scene. By intelligently identifying the current scene state and executing the operation instruction of the corresponding service, the scheme improves the intelligence of the terminal and brings convenience to the user.

Description

Terminal processing method and device based on sound analysis, storage medium and terminal
Technical Field
The present disclosure relates to the field of terminal technologies, and in particular, to a terminal processing method and apparatus based on sound analysis, a storage medium, and a terminal.
Background
Artificial intelligence technology is increasingly being deployed in intelligent terminal devices, bringing new intelligent experiences to consumers. The core of artificial intelligence technology is to enable a machine to build intelligent decision-making capability by learning from a large amount of data, so that the technology can be applied in users' daily lives and bring them convenience.
Disclosure of Invention
The embodiment of the application provides a terminal processing method and device based on sound analysis, a storage medium and a terminal, and the terminal intelligence can be improved.
In a first aspect, an embodiment of the present application provides a terminal processing method based on sound analysis, including:
collecting sound information in a current scene;
analyzing the sound information to obtain an analysis result;
determining the type of the current scene based on the analysis result;
and executing an operation instruction corresponding to the type of the current scene.
In some implementations, the determining the type of the current scene based on the analysis result includes:
determining the type of the current scene from a plurality of sample scene types according to the analysis result, wherein the plurality of sample scene types at least comprise: noisy scenes, dangerous scenes, and quiet scenes.
In some implementations, the executing the operation instruction corresponding to the type of the current scene includes:
if the type of the current scene is a noisy scene, acquiring the current geographic position;
determining whether the geographic position is in a preset area;
if yes, the current information reminding mode is adjusted to be vibration reminding.
In some embodiments, after determining that the geographic location is within a preset area, further comprising:
and when the terminal is detected to play the first audio signal, carrying out noise reduction processing on the first audio signal.
In some implementations, the executing the operation instruction corresponding to the type of the current scene includes:
if the type of the current scene is a dangerous scene, acquiring current environment image information;
determining the position coverage of the dangerous area based on the current geographic position and the environment image information;
generating route guidance information according to the position coverage area;
and carrying out information prompt on the user according to the route guidance information.
In some implementations, the executing the operation instruction corresponding to the type of the current scene includes:
if the current scene belongs to a quiet scene, when a second audio signal to be played is detected, adjusting the volume parameter value of the second audio signal to be lower than the current volume parameter value.
In some embodiments, the analyzing the sound information to obtain an analysis result includes:
extracting the characteristics of the sound information to obtain sound characteristics;
identifying the sound information according to the sound characteristics;
determining the sound types, the number of sound types, the number of sound sources, and the sound characteristic parameters contained in the sound information based on the identification result;
and generating the analysis result at least according to the sound types, the number of sound types, the number of sound sources, and the sound characteristic parameters.
In a second aspect, an embodiment of the present application provides a terminal processing device based on sound analysis, including:
the acquisition unit is used for acquiring sound information in the current scene;
the analysis unit is used for analyzing the sound information to obtain an analysis result;
a determining unit, configured to determine a type of the current scene based on the analysis result;
and the processing unit is used for executing the operation instruction corresponding to the type of the current scene.
In some embodiments, the determining unit is configured to:
determining the type of the current scene from a plurality of sample scene types according to the analysis result, wherein the plurality of sample scene types at least comprise: noisy scenes, dangerous scenes, and quiet scenes.
In some embodiments, the processing unit is to:
if the type of the current scene is a noisy scene, acquiring the current geographic position;
determining whether the geographic position is in a preset area;
if yes, the current information reminding mode is adjusted to be vibration reminding.
In some embodiments, the processing unit is further to:
after the geographic position is determined to be in a preset area, when the terminal is detected to play a first audio signal, noise reduction processing is carried out on the first audio signal.
In some embodiments, the processing unit is to:
if the type of the current scene is a dangerous scene, acquiring current environment image information;
determining the position coverage of the dangerous area based on the current geographic position and the environment image information;
generating route guidance information according to the position coverage area;
and carrying out information prompt on the user according to the route guidance information.
In some embodiments, the processing unit is to:
if the current scene belongs to a quiet scene, when a second audio signal to be played is detected, adjusting the volume parameter value of the second audio signal to be lower than the current volume parameter value.
In some embodiments, the analysis unit comprises:
the extraction subunit is used for extracting the characteristics of the sound information to obtain sound characteristics;
the identification subunit is used for identifying the sound information according to the sound features;
a determination subunit configured to determine, based on the recognition result, a sound type, a number of sound types, a number of sound sources, and a sound characteristic parameter included in the sound information;
and the generation subunit is used for generating the analysis result at least according to the sound type, the number of the sound types, the number of sound sources and the sound characteristic parameters.
In a third aspect, embodiments of the present application further provide a computer readable storage medium having stored therein a plurality of instructions adapted to be loaded by a processor to perform the above-described terminal processing method based on sound analysis.
In a fourth aspect, an embodiment of the present application further provides a terminal, including a processor and a memory, where the processor is electrically connected to the memory, the memory is configured to store instructions and data, and the processor is configured to execute the above terminal processing method based on sound analysis.
In the implementation of the method, sound information in the current scene is collected and analyzed to obtain an analysis result. Then, the type of the current scene is determined based on the analysis result, and an operation instruction corresponding to the type of the current scene is executed. By intelligently identifying the current scene state and executing the operation instruction of the corresponding service, the scheme improves the intelligence of the terminal and brings convenience to the user.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flow chart of a terminal processing method based on sound analysis according to an embodiment of the present application.
Fig. 2 is a schematic view of a terminal processing method based on sound analysis according to an embodiment of the present application.
Fig. 3 is a schematic structural diagram of a terminal processing device based on sound analysis according to an embodiment of the present application.
Fig. 4 is another schematic structural diagram of a terminal processing device based on sound analysis according to an embodiment of the present application.
Fig. 5 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Fig. 6 is another schematic structural diagram of a terminal provided in an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
The embodiment of the application provides a terminal processing method and device based on sound analysis, a storage medium and a terminal. Each of which will be described in detail below.
In an embodiment, a terminal processing method based on sound analysis is provided, which is applied to terminal devices such as smart phones, tablet computers, notebook computers and the like. Referring to fig. 1, the specific flow of the terminal processing method based on sound analysis may be as follows:
101. and collecting sound information in the current scene.
Sound is a wave generated by the vibration of an object; it propagates through a medium (air, solid, or liquid) and can be perceived by human or animal auditory organs. The vibrating object that produces the sound is called the sound source. The number of times an object vibrates in one second is called its frequency, measured in hertz (Hz). The human ear can hear sounds from 20 Hz to 20,000 Hz, and is most sensitive to sounds between 1,000 Hz and 3,000 Hz.
In the embodiment of the application, a microphone is arranged in the terminal device, and the terminal can specifically collect sound information in the external environment through the microphone.
In some embodiments, the microphone may also be an external device, which may establish a wireless link with the terminal device, and send the collected sound information to the terminal device through the wireless link, so that the terminal obtains the sound information in the current scene.
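Once collected, the raw audio can be characterized by its level before any further analysis. As a minimal illustrative sketch (not part of the patent), the following computes the RMS level of captured PCM samples in dBFS; the function name and float-sample representation are assumptions for the example:

```python
import math

def rms_dbfs(samples):
    """Return the RMS level of PCM samples in dBFS (decibels relative
    to full scale). `samples` are floats in the range [-1.0, 1.0]."""
    if not samples:
        return float("-inf")
    # Root-mean-square of the sample values.
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")
    return 20 * math.log10(rms)

# A full-scale square wave sits at 0 dBFS; a half-scale one at about -6 dBFS.
loud = rms_dbfs([1.0, -1.0] * 100)    # -> 0.0
quiet = rms_dbfs([0.5, -0.5] * 100)   # -> approx -6.02
```

A level measure like this could feed the "sound decibels are low" criterion used later when distinguishing quiet scenes.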
102. And analyzing the sound information to obtain an analysis result.
In this embodiment, AI (Artificial Intelligence) technology needs to be used in advance to label and learn common environmental sound data, so that the terminal device can identify the current environmental sound.
Specifically, firstly, common sound source data is input to an AI algorithm model for model training, common sound classifications are marked, and different sound sources for generating sound are identified. After a large amount of learning, the terminal equipment has the capability of recognizing various different sounds.
In practical application, the sound information can be analyzed and processed by adopting a neural network method, a hidden Markov model method, a VQ clustering method, a polynomial classifier method and the like. For example, in some embodiments, the step of analyzing the sound information to obtain an analysis result may include the following steps:
(11) Extracting the characteristics of the sound information to obtain sound characteristics;
(12) Identifying sound information according to sound characteristics;
(13) Determining the sound types, the number of sound types, the number of sound sources, and the sound characteristic parameters contained in the sound information based on the recognition result;
(14) And generating an analysis result at least according to the sound types, the number of sound types, the number of sound sources, and the sound characteristic parameters.
Specifically, the collected sound information may be input into a trained AI algorithm model to extract features such as spectrum, speed, timbre, tone, volume, loudness, etc. from the collected sound information. And then, carrying out voice classification and recognition according to the extracted features so as to recognize all voices from the current voice information, and obtaining a recognition result. Further, the sound type, the number of sound types, the number of sound sources, and the sound characteristic parameters included in the sound information are determined based on the recognition result, and an analysis result is generated based on the obtained information.
The sound types may include sounds generated by different sound sources, such as human speaking sounds, animal calling sounds, wind sounds, car whistling sounds, bell sounds, and the like, among others. The sound characteristic parameters may include parametric information of timbre, volume, loudness, etc.
In specific implementation, before extracting features of the sound information, denoising processing may be performed on the sound information in advance to exclude some atypical (i.e. unclassified) sounds, so as to improve accuracy of subsequent sound analysis results.
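Steps (13) and (14) above amount to aggregating per-source recognition output into a structured analysis result. The sketch below illustrates this aggregation; the input format, dictionary keys, and function name are assumptions made for the example, not structures specified in the patent:

```python
from collections import Counter

def build_analysis_result(recognized):
    """Assemble an analysis result from per-source recognition output.

    `recognized` is a list of (sound_type, loudness_db) pairs, one per
    detected sound source; this structure is illustrative only.
    """
    types = Counter(t for t, _ in recognized)
    loudness = [db for _, db in recognized]
    return {
        "sound_types": sorted(types),           # which kinds of sound occur
        "num_sound_types": len(types),          # how many distinct kinds
        "num_sound_sources": len(recognized),   # how many sources in total
        "max_loudness_db": max(loudness, default=0),
    }

result = build_analysis_result(
    [("speech", 65), ("speech", 60), ("car_horn", 80)]
)
# -> {'sound_types': ['car_horn', 'speech'], 'num_sound_types': 2,
#     'num_sound_sources': 3, 'max_loudness_db': 80}
```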
103. And determining the type of the current scene based on the analysis result.
Specifically, in some embodiments, the step of determining the type of the current scene based on the analysis result may include the following procedures:
determining the type of the current scene from a plurality of sample scene types according to the analysis result, wherein the plurality of sample scene types at least comprise: noisy scenes, dangerous scenes, and quiet scenes.
In this embodiment, a plurality of sample scenes need to be preset, and the types of sound that different scenes can contain are different. Therefore, the sound composition of the different sample scenes needs to be constructed in advance. The type of the current scene is then determined from the constructed sample scenes based on the analysis result.
In this embodiment, several typical scenes such as a noisy scene (containing a plurality of different kinds of sounds and a certain number of sound sources such as sounds generated in the environment of downtown, vegetable market, etc.), a quiet scene (sound decibels are low), and a dangerous scene (containing a special sound generated by a special sound source such as an explosion sound) may be constructed in advance.
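The mapping from an analysis result to one of these sample scene types can be sketched as a small set of rules. The thresholds, sound-type names, and dictionary keys below are made-up placeholders for illustration; the patent does not specify concrete values:

```python
# Illustrative special sound types that indicate a dangerous scene.
DANGEROUS_TYPES = {"explosion", "alarm", "glass_breaking"}

def classify_scene(analysis):
    """Map an analysis result onto one of the sample scene types.

    `analysis` is assumed to carry the sound types, the number of
    distinct types, and a loudness measure; thresholds are invented.
    """
    if DANGEROUS_TYPES & set(analysis["sound_types"]):
        return "dangerous"   # a special sound source was detected
    if analysis["num_sound_types"] >= 3 and analysis["max_loudness_db"] > 70:
        return "noisy"       # many kinds of loud sound, e.g. a market
    if analysis["max_loudness_db"] < 40:
        return "quiet"       # sound decibels are low
    return "normal"
```

For example, an analysis containing an explosion sound would classify as "dangerous" regardless of loudness, reflecting the special-sound-source criterion described above.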
104. And executing an operation instruction corresponding to the type of the current scene.
Specifically, with reference to fig. 1 and fig. 2, the operation instructions to be executed are different according to different scene types in practical applications.
In some embodiments, if the type of the current scene is a noisy scene, the step of executing the operation instruction corresponding to the type of the current scene may include the following steps:
(21) Acquiring a current geographic position;
(22) Determining whether the geographic position is in a preset area;
(23) If yes, the current information reminding mode is adjusted to be vibration reminding.
Specifically, a positioning device can be arranged in the terminal equipment, and the current geographic position information can be obtained using positioning technologies such as GPS (Global Positioning System) positioning, WiFi (Wireless Fidelity) positioning, and Bluetooth positioning. By combining the current geographic position and the current scene type, it is determined whether the terminal is in a preset area that is generally noisy, such as a downtown area, a restaurant, or a vegetable market. If the terminal is in such a preset area, the current scene is noisy, so audible prompts such as the incoming call ringtone and message alerts are difficult to perceive. Therefore, the information reminding mode of the terminal can be changed to a vibration reminding mode, so that the user can more easily perceive the state of the terminal.
Of course, if the current information reminding mode is already vibration reminding, no adjustment is needed.
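Steps (21) to (23) above, including the no-adjustment case, can be sketched as follows. The bounding-box test, area tuples, and mode strings are assumptions for illustration; a real implementation would use proper geodesic geometry and the platform's ringer-mode API:

```python
def point_in_area(pos, area):
    """True if (lat, lon) `pos` falls inside the axis-aligned bounding
    box `area` = (lat_min, lat_max, lon_min, lon_max)."""
    lat, lon = pos
    lat_min, lat_max, lon_min, lon_max = area
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max

def reminder_mode_for(pos, preset_areas, current_mode):
    """Switch to vibration inside a preset noisy area; if the mode is
    already vibration, leave it unchanged."""
    if current_mode != "vibrate" and any(point_in_area(pos, a) for a in preset_areas):
        return "vibrate"
    return current_mode

market = (23.08, 23.10, 114.40, 114.42)  # hypothetical preset area
mode = reminder_mode_for((23.09, 114.41), [market], "ring")  # -> 'vibrate'
```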
In some embodiments, after determining that the geographic location is within the preset area, if it is detected that the terminal plays the first audio signal, the noise reduction processing may be further performed on the first audio signal, so that the user can hear the specific content of the audio more clearly.
The first audio signal may specifically be an audio signal generated when the terminal performs a voice call or a video call with other terminals, or may be an audio signal carried in a voice message that is received by the terminal and sent by the other terminal.
In some embodiments, if the type of the current scene is a dangerous scene, the step of executing the operation instruction corresponding to the type of the current scene may include the following steps:
(31) Acquiring current environment image information;
(32) Determining the position coverage of the dangerous area based on the current geographic position and the environmental image information;
(33) Generating route guidance information according to the position coverage;
(34) And carrying out information prompt on the user according to the route guidance information.
Specifically, a camera can be arranged in the terminal, and the terminal can acquire the current external environment image information by starting the camera. In addition, the current geographic position is determined through the positioning function of the terminal, and the position coverage of the dangerous area is determined by combining the acquired external environment image, the current geographic position and the analysis results (mainly, sound loudness, sound types and the like).
After the coverage area of the dangerous area is determined, route guiding information can be generated by combining road condition information, building information and the like in the current environment so as to prompt and guide a user to transfer to the safe area, and the safety of the user is ensured.
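As a toy stand-in for the route guidance described above (which in the patent also draws on road condition and building information), the sketch below merely computes a bearing pointing directly away from the center of the danger area; all names and the simple flat-coordinate geometry are assumptions for the example:

```python
import math

def escape_bearing(pos, danger_center):
    """Bearing in degrees (0 = direction of increasing latitude)
    pointing directly away from the danger area's center."""
    dlat = pos[0] - danger_center[0]
    dlon = pos[1] - danger_center[1]
    return math.degrees(math.atan2(dlon, dlat)) % 360

def guidance(pos, danger_center, danger_radius_m, dist_m):
    """Produce a simple prompt guiding the user out of the danger area."""
    if dist_m > danger_radius_m:
        return "You are outside the danger area."
    bearing = escape_bearing(pos, danger_center)
    return f"Move away from the danger area, heading {bearing:.0f} degrees."
```

A production system would instead route along actual roads and around buildings; this only illustrates the generate-then-prompt structure of steps (33) and (34).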
In some embodiments, if the type of the current scene is a quiet scene, then when the operation instruction corresponding to the type of the current scene is executed and a second audio signal to be played is detected, the volume parameter value of the second audio signal is adjusted to be lower than the current volume parameter value. This reduces the volume of the audio to be played, avoids producing excessive sound in a quiet environment, and spares the user from manually adjusting the volume, which is simple and convenient.
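The quiet-scene volume adjustment can be sketched as a single scaling step. The `quiet_factor` knob and scene labels are assumed for illustration; the patent specifies only that the adjusted value must be lower than the current one:

```python
def adjust_playback_volume(scene, current_volume, quiet_factor=0.5):
    """In a quiet scene, scale the pending audio's volume below the
    current setting; otherwise leave it unchanged."""
    if scene == "quiet":
        return current_volume * quiet_factor
    return current_volume

adjust_playback_volume("quiet", 80)   # -> 40.0
adjust_playback_volume("noisy", 80)   # -> 80
```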
As can be seen from the above, the terminal processing method based on sound analysis provided in this embodiment collects sound information in the current scene and analyzes it to obtain an analysis result. Then, the type of the current scene is determined based on the analysis result, and an operation instruction corresponding to the type of the current scene is executed. By intelligently identifying the current scene state and executing the operation instruction of the corresponding service, the scheme improves the intelligence of the terminal and brings convenience to the user.
In still another embodiment of the present application, a terminal processing device based on sound analysis is provided, where the terminal processing device based on sound analysis may be integrated in a terminal in a form of software or hardware, and the terminal may specifically include a mobile phone, a tablet computer, a notebook computer, and other devices. As shown in fig. 3, the terminal processing apparatus 300 based on sound analysis may include: an acquisition unit 301, an analysis unit 302, a determination unit 303 and a processing unit 304, wherein:
the acquisition unit 301 is configured to acquire sound information in a current scene;
an analysis unit 302, configured to analyze the sound information to obtain an analysis result;
a determining unit 303, configured to determine a type of the current scene based on the analysis result;
the processing unit 304 is configured to execute an operation instruction corresponding to the type of the current scene.
In some embodiments, the determining unit 303 may be configured to:
determining the type of the current scene from a plurality of sample scene types according to the analysis result, wherein the plurality of sample scene types at least comprise: noisy scenes, dangerous scenes, and quiet scenes.
In some embodiments, the processing unit 304 may be configured to:
if the type of the current scene is a noisy scene, acquiring the current geographic position;
determining whether the geographic position is in a preset area;
if yes, the current information reminding mode is adjusted to be vibration reminding.
In some embodiments, the processing unit 304 may be further configured to:
after the geographic position is determined to be in a preset area, when the terminal is detected to play a first audio signal, noise reduction processing is carried out on the first audio signal.
In some embodiments, the processing unit 304 may be configured to:
if the type of the current scene is a dangerous scene, acquiring current environment image information;
determining the position coverage of the dangerous area based on the current geographic position and the environment image information;
generating route guidance information according to the position coverage area;
and carrying out information prompt on the user according to the route guidance information.
In some embodiments, the processing unit 304 may be configured to:
if the current scene belongs to a quiet scene, when a second audio signal to be played is detected, adjusting the volume parameter value of the second audio signal to be lower than the current volume parameter value.
Referring to fig. 4, in some embodiments, the analysis unit 302 may include:
an extracting subunit 3021, configured to perform feature extraction on the sound information to obtain a sound feature;
an identifying subunit 3022 configured to identify the sound information according to the sound feature;
a determination subunit 3023, configured to determine, based on the recognition result, the sound types, the number of sound types, the number of sound sources, and the sound characteristic parameters included in the sound information;
a generating subunit 3024 configured to generate the analysis result based on at least the sound type, the number of sound types, the number of sound sources, and the sound characteristic parameter.
As can be seen from the above, the terminal processing device based on sound analysis provided in the embodiments of the present application collects sound information in the current scene and analyzes it to obtain an analysis result. Then, the type of the current scene is determined based on the analysis result, and an operation instruction corresponding to the type of the current scene is executed. By intelligently identifying the current scene state and executing the operation instruction of the corresponding service, the scheme improves the intelligence of the terminal and brings convenience to the user.
In still another embodiment of the present application, a terminal is provided, where the terminal may be a terminal device such as a smart phone, a tablet computer, and the like. As shown in fig. 5, the terminal 400 includes a processor 401 and a memory 402. The processor 401 is electrically connected to the memory 402.
The processor 401 is a control center of the terminal 400, connects various parts of the entire terminal using various interfaces and lines, and performs various functions of the terminal and processes data by running or loading applications stored in the memory 402 and calling data stored in the memory 402, thereby performing overall monitoring of the terminal.
In this embodiment, the processor 401 in the terminal 400 loads instructions corresponding to the processes of one or more applications into the memory 402 according to the following steps, and the processor 401 executes the applications stored in the memory 402, so as to implement various functions:
collecting sound information in a current scene;
analyzing the sound information to obtain an analysis result;
determining the type of the current scene based on the analysis result;
and executing an operation instruction corresponding to the type of the current scene.
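The four steps above can be sketched end to end. The following is a minimal, illustrative Python sketch; the data fields, thresholds, and function names (`classify_scene`, `execute_instruction`) are assumptions made for illustration and are not specified by the patent.

```python
from dataclasses import dataclass

@dataclass
class Analysis:
    """Illustrative analysis result: fields chosen to mirror the patent's
    sound types / number of sources / characteristic parameters."""
    sound_types: list   # kinds of sound detected, e.g. ["speech", "traffic"]
    num_sources: int    # number of distinct sound sources
    avg_db: float       # average loudness in decibels

def classify_scene(a: Analysis) -> str:
    """Map an analysis result onto the three sample scene types.
    The thresholds (70 dB, 40 dB, counts > 2) are made-up examples."""
    if "siren" in a.sound_types or "alarm" in a.sound_types:
        return "dangerous"   # sound generated by a special sound source
    if len(a.sound_types) > 2 and a.num_sources > 2 and a.avg_db > 70:
        return "noisy"       # many kinds of sound, many sources
    if a.avg_db < 40:
        return "quiet"       # low decibel level
    return "normal"

def execute_instruction(scene: str) -> str:
    """Return the operation instruction matching the scene type."""
    actions = {
        "noisy": "switch to vibration alert",
        "dangerous": "generate route guidance",
        "quiet": "lower playback volume",
    }
    return actions.get(scene, "no action")
```

A usage example: an analysis with three sound types, four sources, and 75 dB average loudness would classify as `"noisy"` and trigger the vibration-alert instruction.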
In some embodiments, in determining the type of the current scene based on the analysis result, the processor 401 further performs the steps of:
determining the type of the current scene from a plurality of sample scene types according to the analysis result, wherein the plurality of sample scene types at least comprise: noisy scenes, dangerous scenes, and quiet scenes.
In some embodiments, when executing the operation instruction corresponding to the type to which the current scene belongs, the processor 401 further performs the steps of:
if the type of the current scene is a noisy scene, acquiring the current geographic position;
determining whether the geographic position is in a preset area;
if yes, the current information reminding mode is adjusted to be vibration reminding.
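A hedged sketch of this noisy-scene branch: the patent does not specify how the preset area is represented, so this example assumes a circular geofence (the centre coordinates and radius are made-up values) and uses the haversine formula for the containment check.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS-84 points."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical preset area: a circle of 200 m around an arbitrary point.
PRESET_CENTRE = (22.54, 114.05)
PRESET_RADIUS_M = 200.0

def alert_mode_for(lat, lon, scene):
    """If a noisy scene occurs inside the preset area, switch the
    information reminding mode to vibration; otherwise keep ringing."""
    if scene != "noisy":
        return "ring"
    inside = haversine_m(lat, lon, *PRESET_CENTRE) <= PRESET_RADIUS_M
    return "vibrate" if inside else "ring"
```

On a real device the geofence test would typically be replaced by the platform's geofencing API rather than a hand-rolled distance check.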
In some embodiments, after determining that the geographic location is within a preset area, the processor 401 further performs the steps of:
and when the terminal is detected to play the first audio signal, carrying out noise reduction processing on the first audio signal.
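The patent does not say which noise-reduction algorithm is applied to the playing audio signal. As a stand-in, the sketch below uses a simple moving-average low-pass filter; a real implementation would use a proper technique such as spectral subtraction.

```python
def moving_average(signal, k=3):
    """Crude low-pass smoothing as a stand-in for the unspecified
    noise-reduction processing of the first audio signal."""
    half = k // 2
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out
```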
In some embodiments, if the type of the current scene is a dangerous scene, the processor 401 further performs the following steps:
acquiring current environment image information;
determining the position coverage of the dangerous area based on the current geographic position and the environment image information;
generating route guidance information according to the position coverage area;
and carrying out information prompt on the user according to the route guidance information.
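A minimal sketch of the dangerous-scene guidance step, under the simplifying assumption that the danger area's position coverage is a circle in planar metre coordinates; the heading computation and the message format are illustrative only and not taken from the patent.

```python
import math

def guidance_message(user_pos, danger_centre, danger_radius_m):
    """Generate a simple escape hint pointing directly away from the
    centre of the danger area. Positions are (x, y) pairs in metres."""
    dx = user_pos[0] - danger_centre[0]
    dy = user_pos[1] - danger_centre[1]
    if math.hypot(dx, dy) > danger_radius_m:
        return "You are outside the danger area."
    heading = math.degrees(math.atan2(dy, dx)) % 360
    return f"Move away from the danger area, heading {heading:.0f} degrees."
```

In the claimed method the guidance would additionally consult road condition and building information; that data is omitted here for brevity.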
In some embodiments, if the type of the current scene is a quiet scene, when the second audio signal to be played is detected, the processor 401 further performs the steps of:
and adjusting the volume parameter value of the second audio signal to be lower than the current volume parameter value.
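The quiet-scene branch reduces to a volume clamp. In this sketch the 50% reduction and the minimum floor are arbitrary illustrative choices; the patent only requires the new volume parameter value to be lower than the current one.

```python
def adjust_quiet_volume(current_volume: int, floor: int = 10) -> int:
    """Return a playback volume for the second audio signal that is
    below the current value, but never below a small audible floor."""
    return max(floor, current_volume // 2)
```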
In some embodiments, when the sound information is analyzed to obtain an analysis result, the processor 401 further performs the following steps:
extracting the characteristics of the sound information to obtain sound characteristics;
identifying the sound information according to the sound characteristics;
determining sound types, sound sources and sound characteristic parameters contained in the sound information based on the identification result;
and generating the analysis result at least according to the sound type, the sound type number, the sound source number and the sound characteristic parameters.
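The analysis pipeline above (feature extraction, recognition, result generation) can be illustrated with toy stand-ins: RMS level as a volume feature and zero-crossing rate as a crude frequency proxy, with a threshold rule in place of the trained AI algorithm model. All thresholds and labels below are assumptions for illustration.

```python
import math

def extract_features(samples):
    """Tiny stand-in for the feature-extraction step: RMS level
    (volume) and zero-crossing rate (a rough frequency proxy)."""
    n = len(samples)
    rms = math.sqrt(sum(s * s for s in samples) / n)
    zcr = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    ) / (n - 1)
    return {"rms": rms, "zcr": zcr}

def recognise(features):
    """Toy recogniser standing in for the trained AI model; the real
    system would be a classifier trained on common sound-source data."""
    if features["rms"] < 0.01:
        return "silence"
    return "tonal" if features["zcr"] < 0.2 else "noise-like"
```

For example, a slow sine wave yields a low zero-crossing rate and is labelled tonal, while a rapidly alternating signal is labelled noise-like.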
The memory 402 may be used to store applications and data. The applications stored in the memory 402 contain instructions executable by the processor and may constitute various functional modules. The processor 401 implements the various functional applications and the sound-analysis-based terminal processing by running the applications stored in the memory 402.
In some embodiments, as shown in fig. 6, the terminal 400 further includes: a display screen 403, a control circuit 404, a radio frequency circuit 405, a microphone 406, a camera 407, a sensor 408, and a power supply 409. The processor 401 is electrically connected to the display screen 403, the control circuit 404, the radio frequency circuit 405, the microphone 406, the camera 407, the sensor 408, and the power supply 409, respectively.
The display screen 403 may be used to display information input by a user or information provided to the user and various graphical user interfaces of the terminal, which may be composed of images, text, icons, video and any combination thereof.
The control circuit 404 is electrically connected to the display screen 403, and is used for controlling the display screen 403 to display information.
The radio frequency circuit 405 is configured to receive and transmit radio frequency signals, so as to establish wireless communication with other terminals and to exchange signals with a server or other terminals.
The microphone 406 is an energy conversion device that detects sound signals in the external environment and converts them into electrical signals. By transduction principle, microphones can be divided into dynamic microphones and condenser microphones.
The camera 407 may be used to collect image information. The camera may be a single camera with one lens, or may have two or more lenses.
The sensor 408 is used to collect external environmental information. The sensors 408 may include ambient brightness sensors, acceleration sensors, light sensors, motion sensors, and other sensors.
A power supply 409 is used to power the various components of the terminal 400. In some embodiments, power supply 409 may be logically connected to processor 401 through a power management system, thereby performing functions such as managing charging, discharging, and power consumption through the power management system.
Although not shown in fig. 6, the terminal 400 may further include a speaker, a bluetooth module, etc., which will not be described herein.
From the above, the terminal provided in the embodiments of the present application collects sound information in the current scene and analyzes it to obtain an analysis result. The type of the current scene is then determined based on the analysis result, and an operation instruction corresponding to that type is executed. By intelligently identifying the state of the current scene and executing the operation instruction of the corresponding service, the scheme improves the intelligence of the terminal and brings convenience to the user.
In some embodiments, a computer readable storage medium is also provided, in which a plurality of instructions are stored, the instructions being adapted to be loaded by a processor to perform any of the sound analysis based terminal processing methods described above.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program instructing the relevant hardware. The program may be stored in a computer readable storage medium, which may include: read-only memory (ROM), random access memory (RAM), magnetic disk, optical disk, and the like.
The terminal processing method, device, storage medium and terminal based on sound analysis provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to illustrate the principles and implementation of the application, and the description of the above embodiments is only intended to help understand the method and its core idea. Meanwhile, those skilled in the art may make variations to the specific embodiments and application scope in light of the ideas of the present application. In view of the above, the contents of this description should not be construed as limiting the present application.

Claims (7)

1. A terminal processing method based on sound analysis, comprising:
collecting sound information in a current scene;
analyzing the sound information to obtain an analysis result;
denoising the sound information, and inputting the denoised sound information into a trained AI algorithm model to extract sound characteristics from the sound information, wherein the sound characteristics at least comprise frequency spectrum, speed, tone, volume and loudness, and the trained AI algorithm model is obtained by performing model training on the AI algorithm model according to common sound source data;
identifying the sound information according to the sound characteristics to obtain an identification result;
determining sound types, the number of sound sources and sound characteristic parameters contained in the sound information based on the identification result, wherein the sound types comprise sounds generated by different sound sources, and the sound characteristic parameters comprise tone, volume and loudness;
generating the analysis result at least according to the sound type, the sound type number, the sound source number and the sound characteristic parameters;
determining the type of the current scene from a plurality of sample scene types based on the analysis result, wherein the plurality of sample scene types at least comprise: a noisy scene including a plurality of different kinds of sounds and a plurality of sound sources, a dangerous scene including a sound generated by a special sound source, and a quiet scene including sounds with a low decibel level;
executing an operation instruction corresponding to the type of the current scene, including:
if the type of the current scene is a dangerous scene, acquiring current environment image information;
determining the position coverage of a dangerous area based on the current geographic position, the environment image information and the analysis result;
generating route guidance information according to the position coverage area, the road condition information in the current scene and the building information;
and carrying out information prompt on the user according to the route guidance information.
2. The sound analysis-based terminal processing method according to claim 1, wherein the executing the operation instruction corresponding to the type to which the current scene belongs comprises:
if the type of the current scene is a noisy scene, acquiring the current geographic position;
determining whether the geographic position is in a preset area;
if yes, the current information reminding mode is adjusted to be vibration reminding.
3. The sound analysis-based terminal processing method according to claim 2, further comprising, after determining that the geographic position is within a preset area:
and when the terminal is detected to play the first audio signal, carrying out noise reduction processing on the first audio signal.
4. The sound analysis-based terminal processing method according to claim 1, wherein the executing the operation instruction corresponding to the type to which the current scene belongs comprises:
if the current scene belongs to a quiet scene, when a second audio signal to be played is detected, adjusting the volume parameter value of the second audio signal to be lower than the current volume parameter value.
5. A terminal processing device based on sound analysis, characterized by comprising:
the acquisition unit is used for acquiring sound information in the current scene;
the analysis unit is used for denoising the sound information, inputting the denoised sound information into a trained AI algorithm model to extract sound characteristics from the sound information, wherein the sound characteristics at least comprise frequency spectrum, speed, tone, volume and loudness, and the trained AI algorithm model is obtained by carrying out model training on the AI algorithm model according to common sound source data; identifying the sound information according to the sound characteristics to obtain an identification result; determining sound types, the number of sound sources and sound characteristic parameters contained in the sound information based on the identification result, wherein the sound types comprise sounds generated by different sound sources, and the sound characteristic parameters comprise tone, volume and loudness; generating the analysis result at least according to the sound type, the sound type number, the sound source number and the sound characteristic parameters;
a determining unit, configured to determine the type of the current scene from a plurality of sample scene types based on the analysis result, wherein the plurality of sample scene types at least comprise: a noisy scene including a plurality of different kinds of sounds and a plurality of sound sources, a dangerous scene including a sound generated by a special sound source, and a quiet scene including sounds with a low decibel level;
the processing unit is used for executing operation instructions corresponding to the type of the current scene, and comprises the following steps: if the type of the current scene is a dangerous scene, acquiring current environment image information; determining the position coverage of a dangerous area based on the current geographic position, the environment image information and the analysis result; generating route guidance information according to the position coverage area, the road condition information in the current scene and the building information; and carrying out information prompt on the user according to the route guidance information.
6. A computer readable storage medium, characterized in that the storage medium has stored therein a plurality of instructions adapted to be loaded by a processor to perform the sound analysis based terminal processing method of any of claims 1-4.
7. A terminal, characterized by comprising a processor and a memory, the processor being electrically connected with the memory, the memory being used for storing instructions and data; the processor is configured to perform the sound analysis based terminal processing method of any one of claims 1-4.
CN201911325074.1A 2019-12-20 2019-12-20 Terminal processing method and device based on sound analysis, storage medium and terminal Active CN111081275B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911325074.1A CN111081275B (en) 2019-12-20 2019-12-20 Terminal processing method and device based on sound analysis, storage medium and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911325074.1A CN111081275B (en) 2019-12-20 2019-12-20 Terminal processing method and device based on sound analysis, storage medium and terminal

Publications (2)

Publication Number Publication Date
CN111081275A CN111081275A (en) 2020-04-28
CN111081275B true CN111081275B (en) 2023-05-26

Family

ID=70316248

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911325074.1A Active CN111081275B (en) 2019-12-20 2019-12-20 Terminal processing method and device based on sound analysis, storage medium and terminal

Country Status (1)

Country Link
CN (1) CN111081275B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111601000B (en) * 2020-05-14 2022-03-08 支付宝(杭州)信息技术有限公司 Communication network fraud identification method and device and electronic equipment
CN112866480B (en) * 2021-01-05 2023-07-18 北京小米移动软件有限公司 Information processing method, information processing device, electronic equipment and storage medium
CN116320144B (en) * 2022-09-23 2023-11-14 荣耀终端有限公司 Audio playing method, electronic equipment and readable storage medium

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103945062B (en) * 2014-04-16 2017-01-18 华为技术有限公司 User terminal volume adjusting method, device and terminal
CN108476256A (en) * 2016-09-27 2018-08-31 华为技术有限公司 A kind of volume adjusting method and terminal
CN107277260A (en) * 2017-07-07 2017-10-20 珠海格力电器股份有限公司 A kind of contextual model method of adjustment, device and mobile terminal
CN110019931B (en) * 2017-12-05 2023-01-24 腾讯科技(深圳)有限公司 Audio classification method and device, intelligent equipment and storage medium
CN109995799B (en) * 2017-12-29 2020-12-29 Oppo广东移动通信有限公司 Information pushing method and device, terminal and storage medium
CN108924348A (en) * 2018-06-26 2018-11-30 努比亚技术有限公司 Terminal scene mode conversion method, device and computer readable storage medium
CN109151719B (en) * 2018-09-28 2021-08-17 北京小米移动软件有限公司 Secure boot method, apparatus and storage medium
CN109040473B (en) * 2018-10-23 2020-01-03 珠海格力电器股份有限公司 Terminal volume adjusting method and system and mobile phone
CN110580914A (en) * 2019-07-24 2019-12-17 安克创新科技股份有限公司 Audio processing method and equipment and device with storage function
CN110473566A (en) * 2019-07-25 2019-11-19 深圳壹账通智能科技有限公司 Audio separation method, device, electronic equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN111081275A (en) 2020-04-28

Similar Documents

Publication Publication Date Title
CN109166593B (en) Audio data processing method, device and storage medium
CN108615526B (en) Method, device, terminal and storage medium for detecting keywords in voice signal
CN111081275B (en) Terminal processing method and device based on sound analysis, storage medium and terminal
US10224019B2 (en) Wearable audio device
KR101965313B1 (en) System for Providing Noise Map Based on Big Data Using Sound Collection Device Looked Like Earphone
EP4191579A1 (en) Electronic device and speech recognition method therefor, and medium
US20200251124A1 (en) Method and terminal for reconstructing speech signal, and computer storage medium
CN110364156A (en) Voice interactive method, system, terminal and readable storage medium storing program for executing
CN109885162B (en) Vibration method and mobile terminal
CN111105788B (en) Sensitive word score detection method and device, electronic equipment and storage medium
CN111696532A (en) Speech recognition method, speech recognition device, electronic device and storage medium
CN110992963A (en) Network communication method, device, computer equipment and storage medium
CN114299933A (en) Speech recognition model training method, device, equipment, storage medium and product
CN111739517A (en) Speech recognition method, speech recognition device, computer equipment and medium
CN110931000A (en) Method and device for speech recognition
CN110992927A (en) Audio generation method and device, computer readable storage medium and computing device
CN115881118A (en) Voice interaction method and related electronic equipment
CN111613213A (en) Method, device, equipment and storage medium for audio classification
CN112614507B (en) Method and device for detecting noise
CN114333774A (en) Speech recognition method, speech recognition device, computer equipment and storage medium
CN112735382B (en) Audio data processing method and device, electronic equipment and readable storage medium
CN113225624A (en) Time-consuming determination method and device for voice recognition
CN107154996B (en) Incoming call interception method and device, storage medium and terminal
CN115641867A (en) Voice processing method and terminal equipment
Chen et al. Audio-based early warning system of sound events on the road for improving the safety of hearing-impaired people

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant