CN109922397B - Intelligent audio processing method, storage medium, intelligent terminal and intelligent Bluetooth headset - Google Patents


Info

Publication number
CN109922397B
CN109922397B (application CN201910214499.9A)
Authority
CN
China
Prior art keywords
information
current
audio
personnel
intelligent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910214499.9A
Other languages
Chinese (zh)
Other versions
CN109922397A (en)
Inventor
尤广国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Tinglai Technology Co.,Ltd.
Original Assignee
Shenzhen Quchang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Quchang Technology Co ltd filed Critical Shenzhen Quchang Technology Co ltd
Priority to CN201910214499.9A priority Critical patent/CN109922397B/en
Publication of CN109922397A publication Critical patent/CN109922397A/en
Application granted granted Critical
Publication of CN109922397B publication Critical patent/CN109922397B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Telephone Function (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses an intelligent audio processing method, a storage medium, an intelligent terminal and an intelligent Bluetooth headset, solving the problem that sound in the environment cannot be recorded in real time. In the technical scheme, current audio information in the current environment is acquired and stored according to audio monitoring execution information; if no current audio information is acquired within a preset time period, the device enters a dormant state while continuing to monitor for current audio information in the current environment; after entering the dormant state, if current audio information is acquired, recording is restarted and the current audio information is stored.

Description

Intelligent audio processing method, storage medium, intelligent terminal and intelligent Bluetooth headset
Technical Field
The invention relates to an intelligent audio processing method, in particular to an intelligent audio processing method, a storage medium, an intelligent terminal and an intelligent Bluetooth headset.
Background
The Bluetooth headset applies Bluetooth technology to the hands-free earphone, freeing users from annoying cable tangles so they can talk easily in a variety of situations. Since their advent, Bluetooth headsets have been a good efficiency tool for the mobile business community.
Bluetooth is a low-cost, high-capacity short-range wireless communication specification. A Bluetooth notebook computer is a notebook computer with built-in Bluetooth wireless communication. The Bluetooth specification operates in the microwave frequency band at a transmission rate of 1 Mbit/s; the maximum transmission distance is 10 meters, which can be extended to 100 meters by increasing the transmit power. Bluetooth technology is open worldwide and has good global compatibility, and low-cost, invisible Bluetooth networks can link devices everywhere into a whole.
Publication No. CN105554612A discloses a Bluetooth headset capable of storing data, comprising a Bluetooth communication interface, a recognition processing module, a data storage module, a control module, a playing device and a voice recording module. The Bluetooth communication interface receives signals from external smart devices and transmits them to the recognition processing module, which analyzes and processes the signals, records the transmitted data on the data storage module, and simultaneously forwards the data to the playing device, so that the Bluetooth headset realizes the data storage function.
Publication No. CN109195049A discloses an intelligent memo Bluetooth headset and a method thereof. The headset comprises a shell, with a microphone, a loudspeaker, a recording button and a Bluetooth button arranged on the shell; it further comprises a battery module, a Bluetooth module and a recording module arranged inside the shell, together with a charging control board. The recording module, used for recording sound, comprises a processor and a storage module; the signal from the headset's built-in microphone is split into two paths through a voltage follower and supplied from its output to the recording module and the Bluetooth module respectively. The recording module starts and ends recording via the recording button, saving operation time so that a user wearing the headset can record important matters at the press of a button; when the Bluetooth function or the recording function is not in use, its power can be completely switched off with the corresponding switch to extend battery life.
Intelligent terminals such as the Bluetooth headsets above can perform recording, but the recording function must be triggered manually; they cannot capture all sounds in the surrounding environment in real time, so omissions often occur. The Bluetooth headsets in current use therefore leave room for improvement.
Disclosure of Invention
The invention aims to provide an intelligent audio processing method which can start recording audio in real time according to whether sound exists or not, and avoid omission.
The technical purpose of the invention is realized by the following technical scheme:
an audio intelligent processing method comprises the following steps:
acquiring current behavior information of a current user;
searching current instruction information corresponding to the current behavior information from a preset corresponding relationship between the behavior information and the instruction information; the behavior information comprises audio monitoring trigger information, and the instruction information comprises audio monitoring execution information corresponding to the audio monitoring trigger information;
acquiring current audio information in the current environment according to the audio monitoring execution information to store;
if the current audio information is not acquired within a preset time period, entering a dormant state and continuously acquiring the current audio information in the current environment; and after entering the dormant state, if the current audio information is acquired, restarting to store the current audio information.
By adopting this scheme, the real-time recording function is started according to the audio monitoring trigger information, the audio acquired in real time is stored, and the current environment is monitored for current audio information. If no current audio information is acquired throughout the preset time period, this indicates that there is currently no external audio that needs to be recorded and stored, so the device enters a dormant state; acquisition of current audio information nevertheless remains active, keeping the current situation monitored. Once current audio information appears again, recording is restarted immediately, ensuring the real-time recording function and avoiding omissions.
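The monitor/sleep/wake cycle described above can be sketched as follows. This is an illustrative sketch only: `read_frame` stands in for a real microphone driver, and the 3-minute timeout reflects the preferred value mentioned later in the description; none of these names come from the disclosure itself.

```python
import time

SILENCE_TIMEOUT = 3 * 60  # preset time period (the description suggests 3 minutes)

class AudioRecorder:
    """Sketch of the monitor/sleep/wake cycle: record while audio is
    present, sleep after a silent period, resume the moment audio returns."""

    def __init__(self, read_frame, clock=time.monotonic):
        self.read_frame = read_frame      # returns a frame, or None if silent
        self.clock = clock
        self.dormant = False
        self.stored = []                  # stands in for the local storage device
        self._last_audio = clock()

    def step(self):
        frame = self.read_frame()
        now = self.clock()
        if frame is not None:
            # Audio present: (re)start recording and store the frame.
            self.dormant = False
            self._last_audio = now
            self.stored.append(frame)
        elif now - self._last_audio >= SILENCE_TIMEOUT:
            # No audio for the whole preset period: sleep, but keep
            # acquisition running so recording can resume instantly.
            self.dormant = True
```

In use, `step` would be called once per captured frame; acquisition itself is never switched off, which is what allows the dormant device to wake on the first new frame.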
Preferably, the current audio information is stored as follows:
after acquiring the current audio information, storing the current audio information in a preset local storage device;
transmitting according to the communication association condition between the current equipment and the preset external storage device; and if the current equipment is in communication association with the external storage device, transmitting the current audio information stored in the local storage device of the current equipment to the external storage device for storage.
By adopting the scheme, the storage modes are various, under the condition that no external storage device is arranged, the data are directly stored through the local storage device, and once the data are associated with the external storage device, the data stored in the local storage device are transmitted to the external storage device, so that the backup function can be realized, and the subsequent calling and using are facilitated.
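The two-tier storage described above can be sketched as a simple sync step. The function and its arguments are assumptions for illustration; the disclosure does not prescribe an API.

```python
def sync_to_external(local_store, external_store, is_associated):
    """Sketch of the backup step: once a communication association with
    the external storage device exists, copy locally stored audio to it."""
    if not is_associated():
        return 0                          # no association yet: keep data local only
    moved = 0
    for clip in local_store:
        if clip not in external_store:
            external_store.append(clip)   # transmit for backup
            moved += 1
    return moved
```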
Preferably, the method for identifying and classifying the current audio information is as follows:
acquiring current audio information in a current environment and carrying out voiceprint recognition on the current audio information to form current voiceprint information;
searching current personnel information corresponding to the current voiceprint information from a corresponding relation between preset voiceprint information and prestored personnel information;
if the current personnel information corresponding to the current voiceprint information exists in the pre-stored personnel information, storing the current audio information into a personal memory corresponding to the current personnel information;
and if the current personnel information corresponding to the current voiceprint information does not exist in the pre-stored personnel information, storing the current audio information into a temporary storage.
By adopting this scheme, voiceprint recognition is performed on the current audio information and the result is compared with the voiceprint information of the pre-stored personnel. If an identical voiceprint exists, that voiceprint has been recorded before, so all audio information of that person is stored in the personal memory corresponding to the pre-stored personnel information, which is convenient for subsequent retrieval and realizes classification. If no identical voiceprint exists, the speaker is a stranger; classification by the user is then required later, so the audio is stored in the temporary storage.
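The identify-and-classify flow above can be sketched as follows. The dictionary layout mapping voiceprint to person is an assumed data structure, not part of the disclosure.

```python
def classify_clip(voiceprint, clip, known_people, personal_store, temp_store):
    """Sketch of the classification step: match the clip's voiceprint
    against pre-stored personnel; route the clip to that person's
    personal memory, or to temporary storage for a stranger."""
    person = known_people.get(voiceprint)
    if person is not None:
        # Known voiceprint: file under that person's personal memory.
        personal_store.setdefault(person, []).append(clip)
        return person
    # Unknown voiceprint: hold in temporary storage for later confirmation.
    temp_store.append((voiceprint, clip))
    return None
```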
Preferably, if the mobile terminal enters a dormant state, non-prestored personnel confirmation information is fed back;
acquiring current behavior information of a current user according to the non-prestored personnel confirmation information;
searching current instruction information corresponding to the current behavior information from a preset corresponding relationship between the behavior information and the instruction information; the behavior information comprises confirmation resolution information, and the instruction information comprises execution resolution information corresponding to the confirmation resolution information;
sequentially calling the audio information in the temporary storage memory according to the execution distinguishing information and playing the audio information for a preset playing time period;
acquiring personnel identity information corresponding to the current audio information fed back by the current user, and storing the personnel identity information into prestored personnel information;
and transferring the current audio information corresponding to the personnel identity information temporarily stored in the temporary storage memory to the personal storage memory corresponding to the personnel identity information.
By adopting this scheme, when the device enters the dormant state, the user is reminded to classify the audio held in the temporary storage from speakers not yet on record. If such audio exists, it is played back in sequence and the user's feedback is collected; once feedback is obtained, the person's information is stored in the pre-stored personnel information and the corresponding audio is transferred to that person's personal memory, realizing management of the audio and the related persons.
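The on-sleep confirmation flow can be sketched as follows, continuing the same assumed data layout. `ask_user(clip)`, which plays the clip and returns a name (or `None` if the user cannot identify the speaker), is a stand-in for the playback-and-feedback step.

```python
def confirm_strangers(temp_store, known_people, personal_store, ask_user):
    """Sketch of the confirmation flow entered on sleep: replay each clip
    held in temporary storage, ask the user who the speaker is, register
    the new person, and move the clip to that person's personal memory."""
    remaining = []
    for voiceprint, clip in temp_store:
        name = ask_user(clip)
        if name is None:
            remaining.append((voiceprint, clip))   # user could not identify
            continue
        known_people[voiceprint] = name            # add to pre-stored personnel
        personal_store.setdefault(name, []).append(clip)
    temp_store[:] = remaining                      # unidentified clips stay parked
```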
Preferably, acquiring current behavior information of a current user;
searching current instruction information corresponding to the current behavior information from a preset corresponding relationship between the behavior information and the instruction information; the behavior information comprises inquiry triggering information, and the instruction information comprises inquiry execution information corresponding to the inquiry triggering information;
acquiring current voiceprint information according to the query execution information;
searching current personnel information corresponding to the current voiceprint information from a corresponding relation between preset voiceprint information and prestored personnel information;
and if the current personnel information corresponding to the current voiceprint information exists in the pre-stored personnel information, feeding the current personnel information back to the current user.
By adopting the above scheme: because a user communicates with many people day to day, it often happens that someone looks familiar but their name cannot be recalled. Through the corresponding query trigger information, the current person's voiceprint information is acquired and compared with the voiceprint information in the pre-stored personnel information. If a match exists, the corresponding personnel information is fed back to the current user, letting the user quickly know who they are currently talking to and avoiding the awkwardness of a long conversation with someone whose identity remains unknown.
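The query flow above amounts to a voiceprint lookup. In this sketch voiceprints are feature vectors and the distance metric and threshold are assumptions; the disclosure does not specify a matching algorithm.

```python
def who_is_speaking(current, known_people, max_distance=0.25):
    """Sketch of the query flow: compare the current voiceprint (here a
    feature vector) against pre-stored personnel and feed the best match
    back to the user, or None if nobody is close enough."""
    best_name, best_dist = None, max_distance
    for vp, name in known_people.items():
        dist = sum((a - b) ** 2 for a, b in zip(vp, current)) ** 0.5
        if dist <= best_dist:
            best_name, best_dist = name, dist
    return best_name
```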
Preferably, acquiring current behavior information of a current user;
searching current instruction information corresponding to the current behavior information from a preset corresponding relationship between the behavior information and the instruction information; the behavior information comprises sharing request information, and the instruction information comprises sharing execution information corresponding to the sharing request information;
and calling the pre-selected current audio information according to the sharing execution information and sending the pre-selected current audio information to the preset associated personnel.
By adopting this scheme, during a business negotiation, if professional questions arise, a technical proposal is not understood, or other situations require assistance from others, a request can be initiated to pre-designated persons through the corresponding sharing request information. The audio of the negotiation is sent to the corresponding persons, who, after learning the specifics, feed their suggestions back to the negotiator, making business negotiation more convenient.
Preferably, the current audio information includes voice information and text information formed by text conversion of the voice information.
By adopting this scheme, different types of information can be sent according to the application environment: when voice is appropriate, the voice information is sent directly; when playing audio is unsuitable, the text information obtained by voice-to-text conversion is sent instead.
A second object of the present invention is to provide a computer-readable storage medium capable of storing a corresponding instruction set, which can start recording audio in real time according to whether sound is present, so as to avoid omissions.
The technical purpose of the invention is realized by the following technical scheme:
a computer-readable storage medium comprising a program which, when loaded and executed by a processor, implements the intelligent audio processing method as claimed in any preceding claim.
The third purpose of the invention is to provide an intelligent terminal, which can start recording audio in real time according to whether sound exists or not, so as to avoid omission.
The technical purpose of the invention is realized by the following technical scheme:
an intelligent terminal comprising a memory, a processor and a program stored on the memory and executable on the processor, the program being capable of being loaded and executed by the processor to implement the audio intelligent processing method as claimed in the preceding claims.
The fourth purpose of the invention is to provide an intelligent Bluetooth headset, which can start recording audio in real time according to whether sound exists or not, and avoid omission.
The technical purpose of the invention is realized by the following technical scheme:
an intelligent bluetooth headset comprising a memory, a processor and a program stored on the memory and executable on the processor, the program being capable of being loaded and executed by the processor to implement the audio intelligent processing method as claimed in the preceding claim.
In conclusion, the invention has the following beneficial effects: it can automatically judge whether sound exists in the current environment; if sound exists, the recording function is automatically started and the audio is stored, and if no sound exists, the device enters a dormant state.
Drawings
FIG. 1 is a flow diagram of an intelligent audio processing method;
FIG. 2 is a block flow diagram of a method of identifying and classifying current audio information;
FIG. 3 is a block flow diagram of a method of person identity verification;
FIG. 4 is a block flow diagram of a method of human query;
FIG. 5 is a block diagram of a method for sharing audio information;
FIG. 6 is a block flow diagram of a method of audio wakeup pre-recording;
fig. 7 is a flow chart diagram of a method of obtaining current audio information in a current environment.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
This embodiment is intended only to explain the present invention and does not limit it. After reading this specification, those skilled in the art may modify the embodiment as needed without inventive contribution, and such modifications remain protected by patent law within the scope of the claims of the present invention.
The embodiment of the invention provides an intelligent audio processing method, which comprises: acquiring current behavior information of a current user; searching current instruction information corresponding to the current behavior information from a preset corresponding relationship between behavior information and instruction information, wherein the behavior information comprises audio monitoring trigger information and the instruction information comprises audio monitoring execution information corresponding to the audio monitoring trigger information; acquiring and storing current audio information in the current environment according to the audio monitoring execution information; if no current audio information is acquired within a preset time period, entering a dormant state while continuing to acquire current audio information in the current environment; and after entering the dormant state, if current audio information is acquired, restarting to store the current audio information.
In the embodiment of the invention, the real-time recording function is started according to the audio monitoring trigger information, the audio acquired in real time is stored, and the current environment is monitored for current audio information. If no current audio information is acquired throughout the preset time period, this indicates that there is currently no external audio that needs to be recorded and stored, so the device enters a dormant state; acquisition of current audio information remains active and the current situation stays monitored. Once current audio information appears again, recording is restarted and continues, ensuring the real-time recording function and avoiding omissions.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In addition, the term "and/or" herein is only one kind of association relationship describing an associated object, and means that there may be three kinds of relationships, for example, a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship, unless otherwise specified.
The embodiments of the present invention will be described in further detail with reference to the drawings attached hereto.
Referring to fig. 1, an embodiment of the invention provides an audio intelligent processing method, and a main flow of the method is described as follows.
As shown in fig. 1:
step 1100: and acquiring the current behavior information of the current user.
The current behavior information can be acquired through a mechanical key trigger or a virtual key trigger. In the mechanical mode, it can be obtained automatically after power-on by pressing the power key, or by pressing a corresponding trigger key again after power-on. In the virtual mode, it is obtained by pressing the relevant virtual trigger key in the interface of the corresponding software.
Step 1200: searching current instruction information corresponding to the current behavior information from a preset corresponding relationship between the behavior information and the instruction information; the behavior information includes audio monitoring trigger information, and the instruction information includes audio monitoring execution information corresponding to the audio monitoring trigger information.
After the current behavior information is acquired, querying between preset behavior information and instruction information, and querying the instruction information corresponding to the current behavior information, namely if the acquired current behavior information is audio monitoring trigger information, then correspondingly querying the audio monitoring execution information.
Step 1300: and acquiring current audio information in the current environment according to the audio monitoring execution information for storage.
Current audio information in the current environment is acquired directly by a microphone or a similar sound pickup device. An audio intensity reference value is preset, and only audio information above this reference value is acquired; the reference value can be set according to actual conditions. After the current audio information is acquired, it is stored; the specific storage method is as follows:
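The intensity gate just described can be sketched as follows. Using RMS level as the intensity measure and a fixed frame length are assumptions of this sketch; the description only says that audio below a preset reference value is discarded.

```python
def gate_by_intensity(samples, reference, frame_len=160):
    """Sketch of the intensity gate in step 1300: only frames whose RMS
    level exceeds the preset audio intensity reference value are kept."""
    kept = []
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        rms = (sum(s * s for s in frame) / len(frame)) ** 0.5
        if rms > reference:
            kept.append(frame)        # loud enough: pass on for recording
    return kept
```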
step 1310: and storing the current audio information in a preset local storage device after the current audio information is acquired.
The local storage device may be a built-in SD card, a removable hard disk, a USB flash drive, an optical disc or another storage device; in this embodiment it may be one of these or a combination of several.
Step 1320: transmitting according to the communication association condition between the current equipment and the preset external storage device; and if the current equipment is in communication association with the external storage device, transmitting the current audio information stored in the local storage device of the current equipment to the external storage device for storage.
The external storage device may be an external SD card, a mobile hard disk, a USB flash drive, an optical disc or another storage device; an associated external terminal with a storage function, such as a mobile phone, a tablet or a computer, may also serve as the storage device; cloud storage may likewise be adopted.
In this embodiment, the external storage device may adopt one or more of the above combinations, and preferably adopts a mode that the external terminal and the cloud storage are combined with each other.
In one embodiment, an external terminal is used as the storage device. If the external terminal is a mobile phone, the phone and the local storage device are associated with each other so that the output of the local storage device is transmitted to the phone. Both carry the related APP software, ensuring that the two can operate in association; the association may use Bluetooth communication, wireless communication or the like.
In one embodiment, the local storage device itself has a function of communicating with the cloud, that is, the relevant APP software loaded by the local storage device may be associated with the cloud, and the association may be performed in a manner of logging in through an account password, and data on the local storage device is transmitted to the corresponding cloud after the association is determined.
In one embodiment, the local storage device does not itself have a cloud communication function, only the ability to associate with an external terminal. If the external terminal is a mobile phone, the local storage device is first associated with it so that its output is transmitted to the phone; both carry the related APP software, ensuring the two can operate in association, and the association may use Bluetooth communication, wireless communication or the like. The phone in turn has a cloud communication function: through its APP software it can be associated with the cloud, for example by logging in with an account and password, after which the data on the phone is transmitted to the corresponding cloud.
Step 1400: if the current audio information is not acquired within a preset time period, entering a dormant state and continuously acquiring the current audio information in the current environment; and after entering the dormant state, if the current audio information is acquired, restarting to store the current audio information.
The preset time period may be preferably 3 minutes, or may be other time periods, and is set according to actual conditions; in this embodiment, the sleep state is a state in which only the function of acquiring the current audio information is started, and other functions are all in an off state, so that power consumption is reduced. In addition, once the current audio information is acquired in the sleep state, the corresponding function, such as a recording function, is restarted.
In order to facilitate subsequent audio retrieval query and use, the stored audio is sorted and classified in a relevant manner, and as shown in fig. 2, a method for identifying and classifying the current audio information is as follows:
step 2100: and acquiring current audio information in the current environment and carrying out voiceprint recognition on the current audio information to form current voiceprint information.
A voiceprint is the sound spectrum, carrying speech information, displayed by electro-acoustic instruments. The production of human speech is a complex physiological and physical process between the language centers and the vocal organs, and the voiceprints of any two people differ because the vocal organs used in speaking, namely the tongue, teeth, larynx, lungs and nasal cavity, vary greatly between individuals in size and form. Each person's acoustic characteristics are relatively stable, yet also variable; the stability is relative, not absolute. Variation can arise from physiology, pathology, psychology, simulation and disguise, and is also related to environmental interference. Nevertheless, because each person's vocal organs differ, people can normally still distinguish different voices or judge whether two voices are the same.
Voiceprint recognition can be said to have two key problems, namely feature extraction and pattern matching (pattern recognition).
Regarding feature extraction: the task of feature extraction is to extract and select acoustic or linguistic features of the speaker's voiceprint that are strongly separable and highly stable. Unlike speech recognition, the features used for voiceprint recognition must be "personalized", whereas the features for speech recognition should be "generic" across speakers. Although most current voiceprint recognition systems use acoustic-level features, the features characterizing a person should be multi-level, including: (1) acoustic features related to the anatomy of the human vocal mechanism (such as spectrum, cepstrum, formants, fundamental frequency and reflection coefficients), as well as nasal sounds, deep breath sounds, hoarseness, laughter, etc.; (2) semantics, rhetoric, pronunciation and language habits, which are influenced by social and economic status, education level, place of birth and so on; (3) personal traits, or characteristics of rhythm, speed, intonation and volume, influenced by one's parents. From the perspective of mathematical modeling, features currently usable by automatic voiceprint recognition models include: (1) acoustic features (cepstrum); (2) lexical features (speaker-dependent word n-grams, phoneme n-grams); (3) prosodic features (pitch and energy "poses" described by n-grams); (4) language, dialect and accent information; (5) channel information (which channel is used); and so on.
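One of the acoustic features named above, the cepstrum, can be illustrated with a toy computation: the real cepstrum is the inverse Fourier transform of the log magnitude spectrum. This is a didactic sketch, not a production voiceprint front end.

```python
import numpy as np

def cepstrum(signal):
    """Real cepstrum of a 1-D signal: inverse FFT of the log magnitude
    spectrum. A small epsilon guards against log(0) on silent bins."""
    spectrum = np.fft.rfft(signal)
    log_mag = np.log(np.abs(spectrum) + 1e-12)
    return np.fft.irfft(log_mag)
```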
Voiceprint recognition also faces a problem of feature selection or feature selection according to different task requirements. For example, for "channel" information, in criminal investigation applications, it is desirable to not use, i.e., to weaken, the channel's impact on speaker recognition, since we want it to be recognizable regardless of what channel system the speaker uses; in bank transaction, it is desirable to use channel information, i.e. it is desirable that the channel has a large influence on speaker recognition, so that the influence caused by recording, simulation, etc. can be eliminated.
In short, better features should effectively distinguish different speakers while remaining relatively stable when the same speaker's voice changes; they should be hard to imitate, or at least make imitation easier to detect; and they should be robust to noise. Of course, these problems can also be addressed at the model level.
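As an illustrative sketch (not part of the patent's disclosure), the cepstrum listed among the acoustic features above can be computed from one windowed frame as the inverse FFT of the log magnitude spectrum; the function name, frame length, and signal below are assumptions:

```python
import numpy as np

def real_cepstrum(frame: np.ndarray) -> np.ndarray:
    """Real cepstrum of one windowed audio frame:
    inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(frame)
    log_mag = np.log(np.abs(spectrum) + 1e-10)  # epsilon avoids log(0)
    return np.fft.ifft(log_mag).real

# toy frame: a 100 Hz sine sampled at 8 kHz, Hann-windowed
t = np.arange(256) / 8000.0
frame = np.sin(2 * np.pi * 100 * t) * np.hanning(256)
ceps = real_cepstrum(frame)
print(ceps.shape)  # → (256,)
```

Real systems typically go further (mel filterbanks, liftering, delta features), but the cepstral transform itself is the common core.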
Regarding pattern recognition, there are several broad categories of methods:
(1) Template matching: training and test feature sequences are aligned using Dynamic Time Warping (DTW); mainly used for fixed-phrase (generally text-dependent) applications;
(2) Nearest neighbor: all feature vectors are retained during training; at recognition time, the K nearest training vectors are found for each test vector and the decision is made from them; both model storage and similarity computation are expensive;
(3) Neural networks: many forms exist, such as the multilayer perceptron and Radial Basis Function (RBF) networks; they can be trained explicitly to discriminate a speaker from background speakers, but the training cost is high and the models generalize poorly;
(4) Hidden Markov Model (HMM): usually a single-state HMM, i.e., a Gaussian Mixture Model (GMM), is used; this is a popular method with good results;
(5) VQ clustering (e.g., LBG): good results with low algorithmic complexity; combined with an HMM, even better results can be achieved;
(6) Polynomial classifiers: higher accuracy, but model storage and computation are large;
(7) ……
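The DTW alignment named in method (1) can be sketched in a few lines. This is a minimal illustration using an absolute-difference local cost on 1-D sequences, not the patent's implementation:

```python
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Dynamic Time Warping distance between two 1-D feature
    sequences, with absolute difference as the local cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)  # accumulated-cost matrix
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # allowed moves: insertion, deletion, match
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

# a stretched copy of a sequence still aligns with zero cost
print(dtw_distance(np.array([1., 2., 3.]), np.array([1., 1., 2., 3.])))  # → 0.0
```

In a real text-dependent system the sequences would be frames of cepstral vectors and the local cost a vector distance, but the recurrence is the same.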
Voiceprint recognition still has many key problems to solve, for example: the short-utterance problem, i.e., whether a model can be trained from short speech and recognition completed in a short time, which matters mainly for applications where speech samples are hard to obtain; the voice-imitation (or recording playback) problem, i.e., effectively distinguishing imitated or recorded voices from genuine ones; reliable detection of a target speaker when multiple speakers are present; eliminating or weakening the influence of voice changes (different languages, content, speaking styles, physical condition, time, age, and so on); and eliminating the influence of channel differences and background noise, which requires supporting techniques such as denoising and adaptive techniques.
Speaker verification also faces a selection dilemma. Two important parameters characterizing the performance of a speaker verification system are the False Rejection Rate (FRR), the error of rejecting a true speaker, and the False Acceptance Rate (FAR), the error of accepting a speaker outside the enrolled set. Both depend on the decision threshold, and the value at which they are equal is called the Equal Error Rate (EER). At the current state of the art the two cannot be minimized simultaneously, so the threshold must be tuned to the application: where "usability" matters, the false rejection rate is kept low at the cost of a higher false acceptance rate, reducing security; where "security" matters, the false acceptance rate is kept low at the cost of a higher false rejection rate, reducing usability. The former policy can be summarized as "better to accept wrongly than to reject wrongly", the latter as "better to reject wrongly than to accept wrongly". Adjusting the threshold in this way is called adjusting the "operating point", and a good system should allow the operating point to be adjusted freely.
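The FRR/FAR/EER trade-off described above can be made concrete with a small threshold sweep; the score values and function names below are illustrative assumptions, not data from the patent:

```python
def frr_far(genuine, impostor, threshold):
    """False rejection / false acceptance rates at one threshold;
    scores at or above the threshold are accepted."""
    frr = sum(s < threshold for s in genuine) / len(genuine)
    far = sum(s >= threshold for s in impostor) / len(impostor)
    return frr, far

def approx_eer(genuine, impostor, steps=1000):
    """Sweep thresholds and return the operating point where FRR
    and FAR are closest, approximating the Equal Error Rate."""
    lo, hi = min(genuine + impostor), max(genuine + impostor)
    best = None
    for k in range(steps + 1):
        t = lo + (hi - lo) * k / steps
        frr, far = frr_far(genuine, impostor, t)
        if best is None or abs(frr - far) < best[0]:
            best = (abs(frr - far), (frr + far) / 2)
    return best[1]

genuine = [0.9, 0.8, 0.5, 0.6]    # made-up scores of true speakers
impostor = [0.4, 0.7, 0.2, 0.55]  # made-up scores of impostors
print(approx_eer(genuine, impostor))  # → 0.25
```

Raising the threshold moves the operating point toward "security" (lower FAR, higher FRR); lowering it moves toward "usability".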
Step 2200: and searching the current personnel information corresponding to the current voiceprint information from the corresponding relationship between the preset voiceprint information and the pre-stored personnel information.
After the current voiceprint information is obtained, it is queried against the preset correspondence between voiceprint information and pre-stored personnel information. The query yields one of two results: either pre-stored personnel information corresponding to the current voiceprint information is found, or it is not.
Step 2300: and if the current personnel information corresponding to the current voiceprint information exists in the pre-stored personnel information, storing the current audio information into a personal memory corresponding to the current personnel information.
Step 2400: and if the current personnel information corresponding to the current voiceprint information does not exist in the pre-stored personnel information, storing the current audio information into a temporary storage.
The personal memory and the temporary memory may be the local storage device and/or external storage device disclosed above, i.e., they may exist as independent devices; alternatively, the local storage device and/or external storage device may itself contain both a personal memory and a temporary memory, with audio stored in the corresponding area according to the situation. The arrangement can be chosen according to actual conditions.
Voiceprint recognition is performed on the current audio information and the result is compared with the voiceprint information in the pre-stored personnel information. If a matching voiceprint exists, this person's voice has been recorded before, so all of that person's audio is stored in the personal memory corresponding to the pre-stored personnel information, which makes later retrieval convenient and keeps the audio classified. If no matching voiceprint exists, the speaker is a stranger; the audio is stored in the temporary memory and awaits later classification by the user.
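The routing of steps 2200-2400 can be sketched as a lookup over known voiceprints. The store names, identifiers, and the string-equality stand-in for real voiceprint matching are all assumptions for illustration:

```python
personal_store = {}   # person id -> list of audio clips
temp_store = []       # clips from speakers with no pre-stored record

# hypothetical pre-stored personnel information: person -> voiceprint
known_voiceprints = {"alice": "vp-alice", "bob": "vp-bob"}

def route_audio(voiceprint: str, clip: str) -> str:
    """Store the clip under the matching person (step 2300), or in
    the temporary store when the voiceprint is unknown (step 2400)."""
    for person, vp in known_voiceprints.items():
        if vp == voiceprint:  # stand-in for real voiceprint comparison
            personal_store.setdefault(person, []).append(clip)
            return person
    temp_store.append(clip)
    return "temporary"

print(route_audio("vp-alice", "clip-001"))  # → alice
print(route_audio("vp-carol", "clip-002"))  # → temporary
```

A production system would replace the equality test with a scored voiceprint comparison and a threshold, as discussed in the feature-extraction section above.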
For current voiceprint information that has no corresponding entry in the pre-stored personnel information, the user can confirm the identity while the device is idle and a corresponding record can be created. As shown in fig. 3, the specific personnel identity confirmation method is as follows:
step 3100: and if the mobile terminal enters the dormant state, feeding back non-prestored personnel confirmation information.
Once the device enters the dormant state, non-prestored-personnel confirmation information is automatically fed back to the user by voice broadcast, text message, or similar means, reminding the user to confirm the identities of the previously unrecognized speakers.
Step 3200: and acquiring the current behavior information of the current user according to the non-prestored personnel confirmation information.
The current behavior information can be acquired through a mechanical key trigger or a virtual key trigger: with a mechanical key, the current behavior information is obtained by pressing the corresponding trigger key; with a virtual key, it is obtained by pressing the relevant virtual trigger key in the interface of the corresponding software.
Step 3300: searching current instruction information corresponding to the current behavior information from a preset corresponding relationship between the behavior information and the instruction information; the behavior information includes confirmation resolution information, and the instruction information includes execution resolution information corresponding to the confirmation resolution information.
After the current behavior information is acquired, it is queried against the preset correspondence between behavior information and instruction information; if the acquired current behavior information is the confirmation resolution information, the query returns the corresponding execution resolution information.
Step 3400: and sequentially calling the audio information in the temporary storage memory according to the execution resolution information and playing the audio information for a preset playing time period.
The current audio information in the temporary memory is called in sequence, in chronological order from the earliest time node to the latest. Each time one piece of audio information is called, it is played for a preset playing period, preferably 5 seconds; if it was not heard clearly, the corresponding key can be pressed again to replay it.
Step 3500: and acquiring the personnel identity information corresponding to the current audio information fed back by the current user, and storing the personnel identity information into the prestored personnel information.
Step 3600: and transferring the current audio information corresponding to the personnel identity information temporarily stored in the temporary storage memory to the personal storage memory corresponding to the personnel identity information.
When the device enters the dormant state, the user can be reminded to classify the audio of unrecorded speakers held in the temporary memory. If the user agrees, the audio is played in sequence and the user's feedback is acquired; once feedback is received, the person's information is stored in the pre-stored personnel information and the corresponding audio is transferred to the personal memory, achieving management of both the audio and the related personnel.
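Steps 3500 and 3600 (recording a confirmed identity, then moving the audio from the temporary memory to a personal memory) can be sketched as follows; the data structures and names are assumptions:

```python
temp_store = ["clip-x", "clip-y"]  # unclassified clips awaiting confirmation
personal_store = {}                # person id -> list of clips
prestored_people = set()           # stand-in for pre-stored personnel info

def confirm_person(clip: str, person: str) -> None:
    """Step 3500: store the person's identity among pre-stored
    personnel; step 3600: move the clip from the temporary store
    into that person's personal memory."""
    prestored_people.add(person)
    temp_store.remove(clip)
    personal_store.setdefault(person, []).append(clip)

confirm_person("clip-x", "carol")
print(temp_store)      # → ['clip-y']
print(personal_store)  # → {'carol': ['clip-x']}
```

After confirmation, any future audio matching that person's voiceprint would be routed directly to the personal memory rather than the temporary one.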
In everyday communication one often meets people who seem familiar but whose names one cannot recall. Through the corresponding query trigger information, the current speaker can be looked up to check whether he or she has already been stored in the pre-stored personnel information, and the result is fed back. As shown in fig. 4, the specific personnel query method is as follows:
step 4100: and acquiring the current behavior information of the current user.
The current behavior information can be acquired through a mechanical key trigger or a virtual key trigger: with a mechanical key, the current behavior information is obtained by pressing the corresponding trigger key; with a virtual key, it is obtained by pressing the relevant virtual trigger key in the interface of the corresponding software.
Step 4200: searching current instruction information corresponding to the current behavior information from a preset corresponding relationship between the behavior information and the instruction information; the behavior information includes query trigger information, and the instruction information includes query execution information corresponding to the query trigger information.
After the current behavior information is acquired, it is queried against the preset correspondence between behavior information and instruction information; if the acquired current behavior information is the query trigger information, the query returns the corresponding query execution information.
Step 4300: and obtaining the current voiceprint information according to the query execution information.
Step 4400: and searching the current personnel information corresponding to the current voiceprint information from the corresponding relationship between the preset voiceprint information and the pre-stored personnel information.
Step 4500: and if the current personnel information corresponding to the current voiceprint information exists in the pre-stored personnel information, feeding the current personnel information back to the current user.
The current voiceprint information of the current speaker is acquired and compared with the voiceprint information in the pre-stored personnel information; if a match exists, the corresponding personnel information is fed back to the current user. This lets the user quickly learn who the current conversation partner is and avoids the awkwardness of talking for a long time without knowing who the other person is.
The stored audio information can be shared with others, as shown in fig. 5, a specific audio information sharing method is as follows:
step 5100: and acquiring the current behavior information of the current user.
The current behavior information can be acquired through a mechanical key trigger or a virtual key trigger: with a mechanical key, the current behavior information is obtained by pressing the corresponding trigger key; with a virtual key, it is obtained by pressing the relevant virtual trigger key in the interface of the corresponding software.
Step 5200: searching current instruction information corresponding to the current behavior information from a preset corresponding relationship between the behavior information and the instruction information; the behavior information includes sharing request information, and the instruction information includes sharing execution information corresponding to the sharing request information.
After the current behavior information is acquired, it is queried against the preset correspondence between behavior information and instruction information; if the acquired current behavior information is the sharing request information, the query returns the corresponding sharing execution information.
Step 5300: and calling the pre-selected current audio information according to the sharing execution information and sending the pre-selected current audio information to the preset associated personnel.
The current audio information includes the voice information and the text information formed by converting the voice into text. The voice information and/or text information can be selected and sent according to the actual situation. For example, after a meeting is concluded, the corresponding current audio information can be shared with the meeting participants. If professional problems or hard-to-understand technical points arise during a business negotiation and outside assistance is needed, a request can be sent through the corresponding sharing request information to several pre-designated people, together with the audio of the negotiation so far; once they understand the details, their suggestions can be fed back to the negotiator, making the negotiation process more convenient.
When current audio information is first detected, the pre-storage device and the main equipment are started simultaneously. Because starting the equipment takes a certain time, there is a time difference between the moment the equipment is triggered and the moment it actually begins recording, which can easily cause the beginning of the audio to be missed.
As shown in fig. 6:
step 6100: current audio information in a current environment is obtained.
Since there are many sounds around the user, there is a great deal of noise that would cause significant interference, so interference from external noise should be reduced as much as possible. As shown in fig. 7, the method for acquiring the current audio information in the current environment is as follows:
step 6110: and acquiring the current behavior information of the current user.
The current behavior information can be acquired through a mechanical key trigger or a virtual key trigger: with a mechanical key, the current behavior information is obtained by pressing the corresponding trigger key; with a virtual key, it is obtained by pressing the relevant virtual trigger key in the interface of the corresponding software.
Step 6120: searching current instruction information corresponding to the current behavior information from a preset corresponding relationship between the behavior information and the instruction information; the behavior information includes audio monitoring trigger information, and the instruction information includes audio monitoring execution information corresponding to the audio monitoring trigger information.
After the current behavior information is acquired, it is queried against the preset correspondence between behavior information and instruction information; if the acquired current behavior information is the audio monitoring trigger information, the query returns the corresponding audio monitoring execution information.
Step 6130: and monitoring current audio information in the current environment according to the audio monitoring execution information.
Step 6140: comparing the current audio information with preset sound intensity reference information.
Step 6150: and if the current audio information is greater than or equal to the sound intensity reference information, acquiring the current audio information in the current environment.
By comparing the detected current audio information with the sound intensity reference information, the current audio information in the current environment is actually acquired, and the pre-storage device and the current equipment are started, only when the detected sound intensity is greater than or equal to the preset reference intensity, reducing the influence of interference as much as possible. The sound intensity reference information can be set according to actual conditions; since it corresponds to the sensitivity, setting different reference values adjusts the sensitivity. This method for acquiring the current audio information is also applicable wherever current audio information is acquired in any of the method flows above.
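The intensity gate of steps 6140-6150 can be sketched as an RMS comparison. Treating the "sound intensity" as the RMS level of a sampled frame is an interpretation, and the names and values below are assumptions:

```python
import numpy as np

def should_record(samples: np.ndarray, reference_rms: float) -> bool:
    """Steps 6140-6150: record only when the measured intensity
    reaches the reference; a lower reference means higher sensitivity."""
    rms = float(np.sqrt(np.mean(samples ** 2)))
    return rms >= reference_rms

quiet = np.full(1000, 0.01)  # low-level background noise
loud = np.full(1000, 0.5)    # speech-level signal
print(should_record(quiet, 0.1))  # → False
print(should_record(loud, 0.1))   # → True
```

Raising `reference_rms` makes the recorder less sensitive (fewer false triggers from noise); lowering it makes it more sensitive.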
Step 6200: and starting the preset prestoring device to store the current audio information, and simultaneously starting the current equipment and the storage device which is mutually associated with the current equipment.
The pre-storage device may be a built-in storage device such as an SD card, a mobile hard disk, a USB flash drive, or an optical disk; in this embodiment it may be one of these devices or a combination of several. The storage device is the local storage device and/or external storage device disclosed above; the association between the current equipment and the storage device is detailed in step 1320 and is not repeated here.
The pre-storage device stores the current audio information for a preset pre-storage period in a stack-type (rolling) manner, where the pre-storage period is longer than the startup time of the storage device. Rolling storage keeps the storage requirement of the pre-storage device as small as possible while still meeting the need; the pre-storage period must exceed the startup time of the storage device so that everything captured while the storage device is starting up is preserved and nothing is missed.
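Interpreting the "stack-type" pre-storage as a rolling buffer that keeps only the most recent pre-storage period (an assumption, not stated verbatim in the patent), it can be sketched with a bounded deque; the rate and duration values are illustrative:

```python
from collections import deque

FRAME_RATE = 100        # audio frames per second (illustrative)
PRESTORE_SECONDS = 3    # must exceed the storage device's startup time
buffer = deque(maxlen=FRAME_RATE * PRESTORE_SECONDS)

# appending beyond maxlen silently discards the oldest frames,
# so the buffer always holds the latest PRESTORE_SECONDS of audio
for frame in range(1000):
    buffer.append(frame)

print(len(buffer))  # → 300
print(buffer[0])    # → 700 (oldest retained frame)
```

Once the main storage device reports its startup state (step 6300), the buffer's contents are flushed to it, which is the transfer of step 6500.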
Step 6300: and acquiring current working state information of the current storage device, wherein the working state information comprises starting state information.
Step 6400: and controlling the storage device to store the current audio information according to the current starting state information.
Step 6500: and transmitting the current audio information stored in the pre-storage device to a storage device associated with the current equipment.
When the current audio information is acquired, the pre-storage device and the equipment are started simultaneously, so that during the equipment's startup the current audio information is first stored in the pre-storage device; after the equipment and its associated storage device have started, the current audio information is stored in the storage device. The current audio information held in the pre-storage device is then transmitted to the storage device, avoiding the loss of audio that should have been recorded during the startup process. During transmission, the two portions of audio can either be stored separately or be spliced end to end to form one complete recording.
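The end-to-end splice mentioned above can be sketched as an overlap-trimming concatenation; the frame labels and the equality-based overlap detection are assumptions for illustration:

```python
prestore = ["f0", "f1", "f2", "f3"]  # frames captured during startup
main = ["f2", "f3", "f4", "f5"]      # frames from the main recorder

# find the longest suffix of the pre-store that matches a prefix of
# the main recording, then drop it so no frame is duplicated
overlap = 0
for k in range(min(len(prestore), len(main)), 0, -1):
    if prestore[-k:] == main[:k]:
        overlap = k
        break
joined = prestore + main[overlap:]
print(joined)  # → ['f0', 'f1', 'f2', 'f3', 'f4', 'f5']
```

With real audio the overlap would be found by timestamp or cross-correlation rather than exact frame equality, but the trimming step is the same.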
Embodiments of the present invention provide a computer-readable storage medium, which includes instructions that, when loaded and executed by a processor, implement the steps as described in the flowcharts of fig. 1-7.
The computer-readable storage medium includes, for example: various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Based on the same inventive concept, an embodiment of the present invention provides an intelligent terminal, which includes a memory, a processor, and a program stored in the memory and executable on the processor, where the program is capable of being loaded and executed by the processor to implement the processes shown in fig. 1 to 7.
Based on the same inventive concept, an embodiment of the present invention provides an intelligent bluetooth headset, which includes a memory, a processor, and a program stored in the memory and executable on the processor, and the program can be loaded and executed by the processor to implement the processes shown in fig. 1 to 7.
It will be clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be performed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules to perform all or part of the above described functions. For the specific working processes of the system, the apparatus and the unit described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described here again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, an optical disk, or any other medium capable of storing program code.
The above embodiments are intended only to describe the technical solutions of the present application in detail and to help readers understand the method and its core idea; they should not be construed as limiting the present invention. Those skilled in the art will readily conceive of various changes and substitutions within the technical scope of the present disclosure.

Claims (8)

1. An intelligent audio processing method is characterized by comprising the following steps:
acquiring current behavior information of a current user;
searching current instruction information corresponding to the current behavior information from a preset corresponding relationship between the behavior information and the instruction information; the behavior information comprises audio monitoring trigger information, and the instruction information comprises audio monitoring execution information corresponding to the audio monitoring trigger information;
acquiring current audio information in the current environment according to the audio monitoring execution information to store;
if the current audio information is not acquired within a preset time period, entering a dormant state and continuously acquiring the current audio information in the current environment; after entering a dormant state, if the current audio information is acquired, restarting to store the current audio information;
the method for identifying and classifying the current audio information comprises the following steps:
acquiring current audio information in a current environment and carrying out voiceprint recognition on the current audio information to form current voiceprint information;
searching current personnel information corresponding to the current voiceprint information from a corresponding relation between preset voiceprint information and prestored personnel information;
if the current personnel information corresponding to the current voiceprint information exists in the pre-stored personnel information, storing the current audio information into a personal memory corresponding to the current personnel information;
if the current personnel information corresponding to the current voiceprint information does not exist in the pre-stored personnel information, storing the current audio information into a temporary storage memory;
if the mobile terminal enters a dormant state, feeding back non-prestored personnel confirmation information;
acquiring current behavior information of a current user according to the non-prestored personnel confirmation information;
searching current instruction information corresponding to the current behavior information from a preset corresponding relationship between the behavior information and the instruction information; the behavior information comprises confirmation resolution information, and the instruction information comprises execution resolution information corresponding to the confirmation resolution information;
sequentially calling the audio information in the temporary storage memory according to the execution resolution information and playing the audio information for a preset playing time period;
acquiring personnel identity information corresponding to the current audio information fed back by the current user, and storing the personnel identity information into prestored personnel information;
and transferring the current audio information corresponding to the personnel identity information temporarily stored in the temporary storage memory to the personal storage memory corresponding to the personnel identity information.
2. The intelligent audio processing method according to claim 1, wherein the current audio information is stored by the following method:
after acquiring the current audio information, storing the current audio information in a preset local storage device;
transmitting according to the communication association condition between the current equipment and the preset external storage device; and if the current equipment is in communication association with the external storage device, transmitting the current audio information stored in the local storage device of the current equipment to the external storage device for storage.
3. The intelligent audio processing method according to claim 1, wherein:
acquiring current behavior information of a current user;
searching current instruction information corresponding to the current behavior information from a preset corresponding relationship between the behavior information and the instruction information; the behavior information comprises inquiry triggering information, and the instruction information comprises inquiry execution information corresponding to the inquiry triggering information;
acquiring current voiceprint information according to the query execution information;
searching current personnel information corresponding to the current voiceprint information from a corresponding relation between preset voiceprint information and prestored personnel information;
and if the current personnel information corresponding to the current voiceprint information exists in the pre-stored personnel information, feeding the current personnel information back to the current user.
4. The intelligent audio processing method according to claim 1, wherein:
acquiring current behavior information of a current user;
searching current instruction information corresponding to the current behavior information from a preset corresponding relationship between the behavior information and the instruction information; the behavior information comprises sharing request information, and the instruction information comprises sharing execution information corresponding to the sharing request information;
and calling the pre-selected current audio information according to the sharing execution information and sending the pre-selected current audio information to the preset associated personnel.
5. The intelligent audio processing method according to claim 1, wherein: the current audio information comprises voice information and text information formed by converting the voice information into text.
6. A computer-readable storage medium, comprising a program which is loadable by a processor and which, when executed, carries out an intelligent audio processing method as claimed in any one of claims 1 to 5.
7. An intelligent terminal, comprising a memory, a processor and a program stored in the memory and executable on the processor, wherein the program is capable of being loaded and executed by the processor to implement the intelligent audio processing method according to any one of claims 1 to 5.
8. The utility model provides an intelligence bluetooth headset, characterized by: comprising a memory, a processor and a program stored on said memory and executable on said processor, which program is capable of being loaded and executed by the processor to implement the audio intelligent processing method as claimed in any one of claims 1 to 5.
CN201910214499.9A 2019-03-20 2019-03-20 Intelligent audio processing method, storage medium, intelligent terminal and intelligent Bluetooth headset Active CN109922397B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910214499.9A CN109922397B (en) 2019-03-20 2019-03-20 Intelligent audio processing method, storage medium, intelligent terminal and intelligent Bluetooth headset


Publications (2)

Publication Number Publication Date
CN109922397A CN109922397A (en) 2019-06-21
CN109922397B true CN109922397B (en) 2020-06-16

Family

ID=66965872

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910214499.9A Active CN109922397B (en) 2019-03-20 2019-03-20 Intelligent audio processing method, storage medium, intelligent terminal and intelligent Bluetooth headset

Country Status (1)

Country Link
CN (1) CN109922397B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111866644B (en) * 2020-07-14 2022-02-22 歌尔科技有限公司 Bluetooth headset, detection method and device of Bluetooth headset and storage medium
CN112584225A (en) * 2020-12-03 2021-03-30 维沃移动通信有限公司 Video recording processing method, video playing control method and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN202563894U (en) * 2012-02-21 2012-11-28 爱国者电子科技有限公司 Sound control recorder pen
CN107360327A (en) * 2017-07-19 2017-11-17 腾讯科技(深圳)有限公司 Audio recognition method, device and storage medium
CN107765891A (en) * 2017-10-19 2018-03-06 广东小天才科技有限公司 The control method and microphone of a kind of microphone
CN108521621A (en) * 2018-03-30 2018-09-11 广东欧珀移动通信有限公司 Signal processing method, device, terminal, earphone and readable storage medium storing program for executing
CN108986826A (en) * 2018-08-14 2018-12-11 中国平安人寿保险股份有限公司 Automatically generate method, electronic device and the readable storage medium storing program for executing of minutes
CN109325737A (en) * 2018-09-17 2019-02-12 态度国际咨询管理(深圳)有限公司 A kind of enterprise intelligent virtual assistant system and its method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103685185B (en) * 2012-09-14 2018-04-27 上海果壳电子有限公司 Mobile equipment voiceprint registration, the method and system of certification
CN104631998A (en) * 2013-11-13 2015-05-20 句容智恒安全设备有限公司 Novel double-unlocking safe case
CN105637895B (en) * 2014-07-10 2019-03-26 奥林巴斯株式会社 The control method of recording device and recording device
CN105807726A (en) * 2014-12-30 2016-07-27 北京奇虎科技有限公司 Terminal control system and method




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220919

Address after: Room 401, No. 292, Longgang Section, Shenshan Road, Longdong Community, Baolong Street, Longgang District, Shenzhen, Guangdong 518100

Patentee after: Shenzhen Tinglai Technology Co.,Ltd.

Address before: Room 202, building 5, Dongya group, No. 6, Nanling North Road, Nanwan street, Longgang District, Shenzhen, Guangdong 518000

Patentee before: SHENZHEN QUCHANG TECHNOLOGY Co.,Ltd.
