CN108449497A

CN108449497A - Voice communication data processing method, device, storage medium and mobile terminal

Info

Publication number: CN108449497A
Application number: CN201810201118.9A
Authority: CN
Inventors: 郑志勇; 柳明; 李智豪
Original assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Current assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date: 2018-03-12
Filing date: 2018-03-12
Publication date: 2018-08-24
Anticipated expiration: 2038-03-12
Also published as: CN108449497B

Abstract

Embodiments of the present application disclose a voice communication data processing method, device, storage medium and mobile terminal. The method includes: detecting that a voice communication group in a preset application has been successfully established; obtaining first sound data collected by the microphone of the mobile terminal in the current period; and, when it is detected that an anti-howling processing event is triggered, performing anti-howling processing on the first sound data to obtain uplink voice communication data, and sending the uplink voice communication data to the server corresponding to the preset application. With this technical solution, after the voice communication group of the preset application in the mobile terminal is successfully established, anti-howling processing can be performed on the uplink voice communication data of the current mobile terminal in a timely manner whenever the anti-howling processing event is detected to be triggered.

Description

Voice communication data processing method, device, storage medium and mobile terminal

Technical field

The embodiments of the present application relate to the technical field of voice communication, and in particular to a voice communication data processing method, device, storage medium and mobile terminal.

Background

At present, with the rapid popularization of mobile terminals, devices such as mobile phones and tablet computers have become one of people's indispensable means of communication. The ways in which mobile terminal users communicate are increasingly varied and are no longer limited to the traditional telephone and short message services provided by mobile network operators. In many scenarios, users prefer Internet-based communication, such as the voice chat and video chat functions of various social applications.

In addition, the functions of applications (Application, APP) in mobile terminals are increasingly complete, and many applications provide a voice call function that makes it easy for users of the same application to communicate. Taking game applications as an example, some games that require interaction between players have a built-in voice call function, so that a user can talk with other players while playing a game on the mobile terminal. However, during a voice call, the voice communication data contains many types of sound, such as each player's speech, the sound of the application itself (for example, the background sound or special-effect sound of a game) and other sounds in the environment of the mobile terminal. Because the sound is complex, howling occurs easily and seriously affects the user's experience.

Summary of the invention

The embodiments of the present application provide a voice communication data processing method, device, storage medium and mobile terminal that can perform targeted anti-howling processing after the voice call function of an application in the mobile terminal is turned on.

In a first aspect, an embodiment of the present application provides a voice communication data processing method, including:

detecting that a voice communication group in a preset application has been successfully established;

obtaining first sound data collected by the microphone of the mobile terminal in the current period;

when it is detected that an anti-howling processing event is triggered, performing anti-howling processing on the first sound data to obtain uplink voice communication data, and sending the uplink voice communication data to the server corresponding to the preset application.

In a second aspect, an embodiment of the present application provides a voice communication data processing device, including:

a voice communication group detection module, configured to detect that a voice communication group in a preset application has been successfully established;

a sound data acquisition module, configured to obtain first sound data collected by the microphone of the mobile terminal in the current period;

an anti-howling processing module, configured to, when it is detected that an anti-howling processing event is triggered, perform anti-howling processing on the first sound data to obtain uplink voice communication data, and send the uplink voice communication data to the server corresponding to the preset application.

In a third aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements the voice communication data processing method described in the embodiments of the present application.

In a fourth aspect, an embodiment of the present application provides a mobile terminal including a memory, a processor, and a computer program stored in the memory and executable by the processor; when the processor executes the computer program, it implements the voice communication data processing method described in the embodiments of the present application.

In the voice communication data processing scheme provided in the embodiments of the present application, after a voice communication group in a preset application of the mobile terminal is successfully established, the first sound data collected by the microphone of the mobile terminal in the current period is obtained. When it is detected that an anti-howling processing event is triggered, anti-howling processing is performed on the first sound data to obtain uplink voice communication data, which is sent to the server corresponding to the preset application. With this technical solution, once the voice communication group is established, anti-howling processing can be applied to the uplink voice communication data of the current mobile terminal in a timely manner whenever the anti-howling processing event is triggered, reducing the inconvenience that howling sound causes to the user.

Brief description of the drawings

Fig. 1 is a flow diagram of a voice communication data processing method provided by an embodiment of the present application;

Fig. 2 is a flow diagram of another voice communication data processing method provided by an embodiment of the present application;

Fig. 3 is a flow diagram of another voice communication data processing method provided by an embodiment of the present application;

Fig. 4 is a flow diagram of another voice communication data processing method provided by an embodiment of the present application;

Fig. 5 is a structural block diagram of a voice communication data processing device provided by an embodiment of the present application;

Fig. 6 is a structural schematic diagram of a mobile terminal provided by an embodiment of the present application;

Fig. 7 is a structural schematic diagram of another mobile terminal provided by an embodiment of the present application.

Detailed description

The technical solution of the present application is further described below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present application rather than the entire structure.

Before the exemplary embodiments are discussed in more detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart describes the steps as sequential processing, many of the steps can be performed in parallel, concurrently or simultaneously. In addition, the order of the steps can be rearranged. The processing can be terminated when its operations are completed, but it may also have additional steps not included in the drawings. The processing can correspond to a method, function, procedure, subroutine, subprogram and so on.

Fig. 1 is a flow diagram of a voice communication data processing method provided by an embodiment of the present application. The method can be executed by a voice communication data processing device, which can be implemented in software and/or hardware and can generally be integrated in a mobile terminal. As shown in Fig. 1, the method includes:

Step 101: Detect that a voice communication group in a preset application has been successfully established.

By way of example, the mobile terminal in the embodiments of the present application may include mobile devices such as mobile phones and tablet computers. The preset application may be an application with a built-in group voice call function, such as an online game application, an online classroom application, a video conference application, or another application that requires multi-person collaboration.

By way of example, a voice communication group may contain 2 members, but in most cases it contains 3 or more members, so that voice communication can be carried out among 3 or more mobile terminals. A voice communication group can be initiated and established by a user of the preset application on a mobile terminal. After the group is successfully established, all mobile terminals in the group can communicate with one another. In general, when a mobile terminal is in neither silent mode nor earphone mode, it can be regarded as being in speaker playback mode: the voice of each user in the group is collected by the microphone of that user's mobile terminal and, after network transmission and processing, played through the loudspeakers of the other users' mobile terminals. Taking a game application as an example, players who need to team up for a cooperative battle can turn on the team voice function. Assuming there are 5 players in the team, once the voice communication group is established these 5 players can talk to one another, and any player can hear what the other 4 players say at the same time, as if the other 4 players were talking beside them, which makes in-game communication easier.

Step 102: Obtain the first sound data collected by the microphone of the mobile terminal in the current period.

In the embodiments of the present application, the current period can be understood as the time span obtained by tracing back a preset duration from the current moment. The length of the current period can be determined according to factors such as the configuration of the mobile terminal, its data processing capability and the timeliness requirements of voice communication, and is not limited by the embodiments. For example, it can be 300 milliseconds, or any duration between 100 milliseconds and 1 second.

In general, when the mobile terminal is in speaker playback mode, the sound collected by its microphone includes not only the user's own speech but possibly also the sound of the preset application itself played by the loudspeaker (such as background music), the sound of the surrounding environment, and the speech of other members of the voice communication group played by the loudspeaker. When several mobile terminals each send the mixed sound they have collected over the network to the same mobile terminal (for example, in a group of 5 mobile terminals, 4 of them send their collected sound to the server, and the server forwards the sound data of those 4 terminals to the 5th), these sounds are mixed and played on that terminal, and howling may occur.

In the embodiments of the present application, the first sound data may include the speech of the current mobile terminal's user collected in the current period, and may also include the ambient sound of the environment in which the mobile terminal is located. When the loudspeaker or earpiece of the mobile terminal plays downlink voice communication data, the user can hear it and the microphone can also pick it up, so the first sound data may also contain sound data corresponding to the downlink voice communication data. Accordingly, if the downlink voice communication data contains howling sound, the first sound data collected by the microphone may contain howling sound as well.

Step 103: When it is detected that an anti-howling processing event is triggered, perform anti-howling processing on the first sound data to obtain uplink voice communication data, and send the uplink voice communication data to the server corresponding to the preset application.

In the embodiments of the present application, the conditions under which the anti-howling processing event is triggered can be set in advance so that anti-howling processing is performed at a suitable time. Optionally, to prevent howling in a more targeted way while saving the extra power consumed by anti-howling processing, scenarios in which howling is likely can be identified through theoretical analysis or investigation and configured as preset scenarios; the anti-howling processing event is then triggered when the mobile terminal is detected to be in a preset scenario.

Optionally, in multi-person voice scenarios, the inventors found that howling occurs easily when the sound data collected by the microphone of the mobile terminal in the current period is highly similar to the sound data collected in the previous period. If the current mobile terminal keeps playing the same sound data over a certain period and sends that sound data to the server, the server forwards the identical sound data collected in that period to the other mobile terminals in the voice communication group. When the identical sound data is played on those terminals, it is superimposed and the sound is amplified many times over, producing howling sound. Therefore, in the embodiments of the present application, the similarity between the sound data collected by the microphone in the current period and the sound data collected in the previous period can first be determined, and it can be judged whether the similarity of the sound data of the two adjacent periods exceeds a similarity threshold. If so, the anti-howling processing event is triggered, and anti-howling processing needs to be performed on the first sound data collected in the current period. Optionally, the sound data collected in the current period and the sound data collected in the previous period can instead be superimposed in simulation, that is, the sound data of the two adjacent periods is played simultaneously in simulation, and the superimposed sound data is checked for howling sound. If howling sound is present, the anti-howling processing event is triggered, and anti-howling processing needs to be performed on the first sound data collected in the current period.

Optionally, in multi-person voice scenarios, the inventors found that howling occurs easily when two mobile terminals are close to each other. Suppose mobile terminal A and mobile terminal B in the voice communication group are close together. The loudspeaker of mobile terminal A amplifies and plays the sound that was collected by the microphone of mobile terminal B. Because the two terminals are close, this sound is picked up again by the microphone of mobile terminal B and sent to mobile terminal A once more, where it is amplified and played again. Positive feedback amplification of the sound forms easily, producing howling sound. Therefore, in the embodiments of the present application, it can first be judged whether any other mobile terminal in the voice communication group is close to the current mobile terminal. If so, the anti-howling processing event is triggered, and anti-howling processing needs to be performed on the first sound data collected in the current period.

It should be noted that the embodiments of the present application do not specifically limit the trigger condition of the anti-howling processing event. Other scenarios in which howling is likely can also be identified through theoretical analysis or investigation and configured as preset scenarios, and the anti-howling processing event is triggered when the mobile terminal is detected to be in such a preset scenario.

In the embodiments of the present application, the first sound data after anti-howling processing is used as the uplink voice communication data and sent to the server corresponding to the preset application. The advantage of this arrangement is that when the server forwards the uplink voice communication data to the other mobile terminals in the voice communication group, howling sound in the sound data received by those terminals can be effectively avoided.

In the voice communication data processing method provided in the embodiments of the present application, after a voice communication group in a preset application of the mobile terminal is successfully established, the first sound data collected by the microphone of the mobile terminal in the current period is obtained. When it is detected that an anti-howling processing event is triggered, anti-howling processing is performed on the first sound data to obtain uplink voice communication data, which is sent to the server corresponding to the preset application. With this technical solution, once the voice communication group is established, anti-howling processing can be applied to the uplink voice communication data of the current mobile terminal in a timely manner whenever the anti-howling processing event is triggered, reducing the inconvenience that howling sound causes to the user.

In some embodiments, detecting that the anti-howling processing event is triggered includes: comparing the first sound data with pre-stored second sound data to determine the similarity between the first sound data and the second sound data, where the second sound data is a sound clip collected by the microphone of the mobile terminal in the previous period; and, when the similarity is greater than a preset similarity value, determining that the anti-howling processing event has been triggered. The advantage of this arrangement is that the similarity of the sound data collected in two adjacent periods can be determined quickly and simply, so that whether the anti-howling processing event needs to be triggered can be decided quickly.

By way of example, the pre-stored second sound data is obtained. The second sound data can be stored in the buffer area of the uplink voice channel of the mobile terminal, and it is the sound clip collected by the microphone of the mobile terminal in the previous period. The second sound data is not fixed; it is updated once every preset period, for example once every 300 milliseconds, in which case the second sound data is the sound data collected in the previous 300 milliseconds. For example, the first sound data is the sound clip collected in the current 300 milliseconds, and it is compared with the second sound data collected in the previous 300 milliseconds to determine the similarity between the two. The first sound data can be compared as a whole with the second sound data, and the comparison result is taken as the similarity of the two adjacent pieces of sound data. The larger the similarity, the more alike the first sound data and the second sound data are, that is, the more identical or similar sound content they contain. When the similarity between the first sound data and the second sound data (for example, 0.7) is greater than the preset similarity threshold (for example, 0.5), the anti-howling processing event is triggered.
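The whole-segment comparison described above can be sketched as follows. The source does not say how similarity is measured, so this sketch assumes the absolute cosine similarity of the raw samples. The names `similarity` and `anti_howling_triggered` are illustrative, and the default threshold of 0.5 is taken from the example value in the text.

```python
import math


def similarity(first, second):
    """Absolute cosine similarity of two sample sequences, clamped to [0, 1].

    Assumption: the source does not define the metric; this is one common choice.
    """
    n = min(len(first), len(second))
    a, b = first[:n], second[:n]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return min(1.0, abs(dot) / norm) if norm else 0.0


def anti_howling_triggered(first, second, threshold=0.5):
    """Trigger when the two adjacent segments are more similar than the threshold."""
    return similarity(first, second) > threshold


# Two identical tone segments are maximally similar, so the event is triggered.
tone = [math.sin(2 * math.pi * k / 16) for k in range(64)]
print(anti_howling_triggered(tone, tone))
```

In a real terminal the second segment would be whatever the uplink buffer held for the previous period; here both segments are synthetic tones.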

Optionally, comparing the first sound data with the pre-stored second sound data to determine the similarity between them may include: dividing the first sound data into blocks; comparing each data block with the second sound data to obtain the sub-similarity corresponding to that data block; and summing the sub-similarities to obtain the similarity between the first sound data and the second sound data. The advantage of this arrangement is that the similarity between the first sound data and the second sound data can be determined accurately. By way of example, the first sound data is divided into blocks according to a preset unit length, which can be 30 milliseconds. Assuming the first sound data is the sound clip of the current 300-millisecond period and the preset unit length is 30 milliseconds, the first sound data can be divided into 10 data blocks. These 10 data blocks are each compared with the second sound data to obtain 10 corresponding sub-similarities, and the sum of the 10 sub-similarities is taken as the similarity between the first sound data and the second sound data. Alternatively, the mean of the sub-similarities can be taken as the similarity between the first sound data and the second sound data.
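A minimal sketch of the block-wise variant follows. It rests on two assumptions that the source leaves open: each block of the first sound data is compared with the block at the same position in the second sound data, and the per-block metric is absolute cosine similarity. The function names are illustrative.

```python
import math


def cosine(a, b):
    """Absolute cosine similarity of two equal-length blocks (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return abs(dot) / norm if norm else 0.0


def split_blocks(samples, block_len):
    """Split into consecutive blocks of block_len samples; a short tail is dropped."""
    return [samples[i:i + block_len]
            for i in range(0, len(samples) - block_len + 1, block_len)]


def blockwise_similarity(first, second, block_len, use_mean=True):
    """Combine per-block sub-similarities by mean (default) or by sum."""
    subs = [cosine(a, b) for a, b in zip(split_blocks(first, block_len),
                                         split_blocks(second, block_len))]
    if not subs:
        return 0.0
    total = sum(subs)
    return total / len(subs) if use_mean else total


# 300 ms at an assumed 16 kHz sample rate is 4800 samples; 30 ms blocks hold 480.
segment = [math.sin(0.1 * k) for k in range(4800)]
print(len(split_blocks(segment, 480)))
```

The mean keeps the result on the same 0 to 1 scale as a whole-segment comparison, whereas the sum grows with the number of blocks, so the preset threshold would need to be chosen to match.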

In some embodiments, detecting that the anti-howling processing event is triggered includes: performing simulated superposition on the first sound data and pre-stored second sound data to obtain simulated sound data, where the second sound data is a sound clip collected by the microphone of the mobile terminal in the previous period; and, when it is determined that howling is present in the simulated sound data, determining that the anti-howling processing event has been triggered. The advantage of this arrangement is that it can be determined accurately whether howling sound would be produced when the sound data collected in two adjacent periods is played, so that whether the anti-howling processing event needs to be triggered can be decided accurately.

By way of example, the first sound data collected in the current period and the second sound data collected in the previous period are superimposed to obtain the simulated sound data. It is then judged whether the simulated sound data contains howling features; if it does, it is determined that a howl point exists in the simulated sound data. The howling features may include concentrated energy, periodicity, a frequency higher than a preset frequency threshold, and so on. Howling detection can also be performed on the superimposed data to judge whether a howl point exists in the simulated sound data, that is, whether howling sound is present in the simulated sound data.

In some embodiments, the following ways can be used to judge whether howling sound is present in the simulated sound data:

First way: divide the simulated sound data into blocks; for each data block, determine the suspected howl points present in the current data block using a preset analysis mode; when there are multiple suspected howl point groups that exhibit a periodic feature, and the energy values corresponding to the suspected howl points show a rising trend in the order of the data blocks to which they belong, determine that howling sound is present in the simulated sound data. Here, a suspected howl point group consists of suspected howl points in consecutive adjacent data blocks whose frequency differences are within a preset range, and the number of consecutive adjacent data blocks reaches a preset continuity threshold.

Second way: divide the simulated sound data into blocks to obtain M data blocks; using the preset analysis mode, analyze in turn whether a suspected howl point exists in the current data block, and take the data block in which a suspected howl point first appears as the start data block; starting from the start data block, take n data blocks in turn as the data segment to be analyzed, and use the preset analysis mode to analyze the suspected howl points contained in the current data segment; when the frequency differences between the suspected howl points contained in the N data segments are within a preset range, determine that howling sound is present in the simulated sound data. Here, n = 2, 3, ..., N; N is less than or equal to M and greater than or equal to 2; the start point of each data segment is the same as the start point of the start data block, and the start data block is the first data segment.

Of course, the embodiments of the present application can also use other ways to judge whether howling sound is present in the simulated sound data, and the present application does not limit this. The two ways above are described in detail below as examples.

For the first way, the simulated sound data can be divided into blocks according to a preset unit length, which can be, for example, 40 milliseconds. Assuming that the simulated sound data obtained after superimposing the first sound data and the second sound data corresponds to a time length of 600 milliseconds and the preset unit length is 40 milliseconds, the data can be divided into 15 data blocks.
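A simplified sketch of the first way is shown below. It takes as input one already-identified suspected howl point per data block and looks for a single run of consecutive blocks whose frequencies stay close while the energy keeps rising. The check for several periodic groups is omitted, and the tolerance `max_freq_diff_hz` and run length `min_run` are hypothetical values, not taken from the source.

```python
def count_blocks(total_ms, block_ms):
    """Number of whole blocks of block_ms milliseconds in total_ms milliseconds."""
    return total_ms // block_ms


def has_howling(points, max_freq_diff_hz=50.0, min_run=3):
    """points holds one entry per data block: (freq_hz, energy_db) or None.

    Returns True when at least min_run consecutive blocks contain suspected
    howl points with nearby frequencies and strictly rising energy.
    """
    run = []
    for point in points:
        if point is None:
            run = []  # a block without a suspected point breaks the run
            continue
        freq_hz, energy_db = point
        if (run and abs(freq_hz - run[-1][0]) <= max_freq_diff_hz
                and energy_db > run[-1][1]):
            run.append(point)
        else:
            run = [point]  # start a new candidate run at this block
        if len(run) >= min_run:
            return True
    return False


# 600 ms of simulated sound data in 40 ms blocks gives 15 blocks.
print(count_blocks(600, 40))
rising = [None] * 5 + [(3362.0, -20.0 + 2 * i) for i in range(4)] + [None] * 6
print(has_howling(rising))
```

Requiring the energy to rise strictly is one reading of "rising trend"; a tolerant version could allow small dips between neighbouring blocks.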

The embodiments of the present application do not specifically limit the preset analysis mode. For example, the preset analysis mode may include: obtaining, in the frequency domain, a frequency point to be determined whose energy value in the high-frequency region is higher than a preset energy threshold; calculating the energy difference value of a preset number of frequency points around the frequency point to be determined; and, when the energy difference value is greater than a preset difference threshold, determining that the frequency point to be determined is a suspected howl point. The high-frequency region is the frequency range above a preset frequency threshold.

Specifically, the current data block can first be transformed from the time domain to the frequency domain to facilitate spectrum analysis. The embodiments of the present application do not limit the transformation method; a Fourier transform can be used, such as the fast algorithm for the discrete Fourier transform (Fast Fourier Transformation, FFT). Taking 40 ms as an example, the size of 40 ms of audio data (16 bit, 16 kHz sample rate) is 40*16*16/8 = 1280 bytes, which is suitable for spectrum analysis with a 1024-point FFT. After the FFT, the frequency range of the analysis is 0 to 16K/2 and the step is (16K/2)/1024, which is about 8 Hz.
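The arithmetic in this example can be checked with a few lines. The sketch assumes mono PCM audio and reads "1024" as the number of frequency points spanning 0 to 8 kHz, which is what yields the step of about 8 Hz quoted in the text; the constant names are illustrative.

```python
SAMPLE_RATE_HZ = 16_000   # 16 kHz sample rate from the example
BITS_PER_SAMPLE = 16      # 16-bit samples from the example
BLOCK_MS = 40             # block length from the example
FREQUENCY_POINTS = 1024   # assumed number of points across 0 to 8 kHz

samples_per_block = SAMPLE_RATE_HZ * BLOCK_MS // 1000        # samples in one block
bytes_per_block = samples_per_block * BITS_PER_SAMPLE // 8   # 8 bits per byte
nyquist_hz = SAMPLE_RATE_HZ / 2                              # top of the analysed range
step_hz = nyquist_hz / FREQUENCY_POINTS                      # spacing between points

print(samples_per_block, bytes_per_block, nyquist_hz, step_hz)
```

Under these assumptions a block holds 640 samples, fewer than the number of frequency points, so in practice the block would be zero-padded before the transform.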

In the embodiments of the present application, a preset frequency threshold can be used as the boundary between the high-frequency region and the other regions. The preset frequency threshold can be set according to actual conditions, for example according to the frequency of the human voice and the frequency characteristics of typical howling, and can be, for example, 1 kHz, 1.5 kHz or 2 kHz. If the preset frequency threshold is 2 kHz, the part above 2 kHz is the high-frequency region. Howling generally appears in the high-frequency region and is loud (that is, its energy value is high), so the embodiments of the present application can quickly determine the suspected howl points in a data block from the distribution of energy values.

Illustratively, the energy value of each frequency point (frequency bin) in the data block is obtained. Candidate frequency points whose energy value exceeds a preset energy threshold are then found in the high-frequency region, and for each candidate an energy difference value is calculated over a preset number of surrounding frequency points. The preset energy threshold and the preset number may be set according to actual requirements; for example, the preset energy threshold may be -10 dB and the preset number may be 8 (4 before the candidate frequency point and 4 after it). Taking the step size of about 8 Hz mentioned above as an example, if the candidate frequency point is at 3362 Hz, the surrounding frequency points are at approximately 3330 Hz, 3338 Hz, 3346 Hz, 3354 Hz, 3370 Hz, 3378 Hz, 3386 Hz and 3394 Hz. The energy difference value measures how much the candidate frequency point differs from the surrounding frequency points. It may be the difference between the maximum and minimum energy values, or an energy variance, an energy mean square deviation, and so on; the present application does not limit this. The preset difference threshold corresponds to the energy difference value; for example, when the energy difference value is an energy variance, the preset difference threshold is a preset variance threshold. When the energy difference value exceeds the preset difference threshold, the candidate frequency point stands out from its neighbours and is very likely a howl point, so it is determined to be a suspected howl point. This arrangement identifies suspected howl points quickly and accurately, laying the foundation for further judging whether the simulated sound data contains howling.
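The neighbour comparison described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function name, the uniform bin spacing `bin_hz`, and the 20 dB difference threshold are assumptions, and the energy difference value is taken to be the maximum minus the minimum over the candidate and its 8 neighbours (one of the options the text allows).

```python
def find_suspected_howl_bin(energies_db, bin_hz, freq_threshold_hz=2000.0,
                            energy_threshold_db=-10.0, neighbors=8,
                            diff_threshold_db=20.0):
    """Return the index of a suspected howl bin, or None if there is none.

    energies_db[i] is the energy (dB) of the bin centred at i * bin_hz.
    diff_threshold_db is a hypothetical value; the text gives no number.
    """
    half = neighbors // 2
    # Candidates: bins in the high-frequency region above the energy threshold.
    candidates = [i for i, e in enumerate(energies_db)
                  if i * bin_hz > freq_threshold_hz and e > energy_threshold_db]
    # Start from the candidate with the highest energy.
    for i in sorted(candidates, key=lambda k: energies_db[k], reverse=True):
        lo, hi = i - half, i + half
        if lo < 0 or hi >= len(energies_db):
            continue  # not enough neighbours on one side
        window = energies_db[lo:hi + 1]  # candidate plus 4 bins on each side
        if max(window) - min(window) > diff_threshold_db:
            return i
    return None
```

A flat spectrum yields no suspected howl point even if every bin is loud, because no bin stands out from its neighbours.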

Illustratively, a data block may contain multiple candidate frequency points. In that case, the judgment of suspected howl points may start from the candidate frequency point with the highest energy value.

In addition, the preset analysis mode may also include: obtaining, in the frequency domain, a first frequency point with the maximum energy value in the high-frequency region and a second frequency point with the maximum energy value in the low-frequency region; and, when the first frequency point satisfies a preset suspected-howl condition, determining that the first frequency point is a suspected howl point in the current data block. The preset suspected-howl condition includes: the energy value of the first frequency point is greater than a preset energy threshold, and the energy difference between the first frequency point and the second frequency point is greater than a preset difference threshold.

Specifically, the current data block may first be transformed from the time domain to the frequency domain to facilitate spectrum analysis. Likewise, a preset division frequency may serve as the boundary between the high-frequency region and the low-frequency region. The preset division frequency may be set according to actual conditions, for example according to the frequency range of the human voice and the frequencies at which howling tends to occur; it may be 1 kHz, 1.5 kHz, 2 kHz, and so on. If the preset division frequency is 2 kHz, for instance, the part above 2 kHz is the high-frequency region and the part at or below 2 kHz is the low-frequency region.

Illustratively, the energy value of each frequency point in the data block is obtained. The first frequency point, with the maximum energy value, is then found in the high-frequency region, and the second frequency point, with the maximum energy value, is found in the low-frequency region. If the energy value of the first frequency point is greater than the preset energy threshold (such as -30 dB), and the difference between the energy values of the first and second frequency points is greater than the preset difference threshold (such as 60), the first frequency point is considered a suspected howl point in the current data block. This arrangement identifies suspected howl points quickly and accurately, laying the foundation for further judging whether the simulated sound data contains howling.
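A minimal sketch of this second analysis mode, under stated assumptions: the function name and the uniform bin spacing `bin_hz` are hypothetical, and the difference threshold of 60 is applied in the same unit as the energy values because the text does not state a unit.

```python
def first_bin_if_suspected(energies_db, bin_hz, split_hz=2000.0,
                           energy_threshold_db=-30.0, diff_threshold=60.0):
    """Return the index of the first frequency point if it meets the preset
    suspected-howl condition, otherwise None."""
    high = [i for i in range(len(energies_db)) if i * bin_hz > split_hz]
    low = [i for i in range(len(energies_db)) if i * bin_hz <= split_hz]
    if not high or not low:
        return None
    first = max(high, key=lambda i: energies_db[i])   # loudest high-frequency bin
    second = max(low, key=lambda i: energies_db[i])   # loudest low-frequency bin
    if (energies_db[first] > energy_threshold_db
            and energies_db[first] - energies_db[second] > diff_threshold):
        return first
    return None
```

Compared with the neighbour-based mode, this check needs only two maxima per data block, at the cost of ignoring how sharply the peak stands out locally.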

Illustratively, the preset analysis mode described above is applied to each data block to judge whether a suspected howl point exists. If one exists, it is recorded, and it is further judged whether the current simulated sound data contains howling.

It can be understood that the presence of a suspected howl in one data block does not mean that the whole segment of simulated sound data contains howling, because certain special sounds may be misidentified as howling. For example, the first sound data or the second sound data may contain a harsh sound produced by objects rubbing together. Such a sound is generally high in frequency and loud, so it may well be identified as a suspected howl, but it is very brief and is not howling. A further judgment is therefore needed.

In the embodiments of the present application, the distribution of the suspected howl points found in the data blocks is analyzed. When suspected howl points with a small frequency difference appear in several consecutive adjacent data blocks, these suspected howl points may be called a suspected howl point group. That is, a suspected howl point group consists of suspected howl points, in consecutive adjacent data blocks, whose frequency difference lies within a preset range, where the number of consecutive adjacent data blocks reaches a preset continuity threshold. The preset continuity threshold may be determined according to actual conditions, for example 3; the preset range for the frequency difference may likewise be determined according to actual conditions, for example 40 Hz. The inventors found that howling generally persists over a short time, recurs periodically, and grows progressively louder. Therefore, in the embodiments of the present application, the decision condition is that multiple (that is, two or more) suspected howl point groups show a periodic pattern and that the energy values of the suspected howl points rise in the order of the data blocks to which they belong. If this condition is satisfied, it is determined that howling is present in the current simulated sound data. Howling can be identified quickly and accurately in this way.

Illustratively, assume the simulated sound data is divided into 15 data blocks. If, for example, suspected howl points with frequencies in the interval (A-40, A+40) are detected in the 10 data blocks numbered 1, 2, 3, 5, 7, 8, 9, 13, 14 and 15, the suspected howl points of every 2 data blocks form one suspected howl point group, the 5 suspected howl point groups show a periodic pattern, and the energy values of the suspected howl points increase in sequence, then it is determined that the simulated sound data contains howling. As another example, if suspected howl points with frequencies in the interval (B-40, B+40) are detected only in data blocks 1 and 2, the suspected howl points of these 2 data blocks form one suspected howl point group, but only this one group exists and no periodic pattern appears, so it can be determined that the simulated sound data does not contain howling.
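The group-based decision can be sketched as below. Several points are assumptions rather than statements of the application: a group is formed from runs of adjacent blocks whose neighbouring suspected howl points differ by at most 40 Hz, the continuity threshold is 3, "periodic" is interpreted as equal spacing between the first blocks of successive groups, and "rising" as strictly increasing energy across all grouped points. All names are hypothetical.

```python
def find_groups(points, min_run=3, freq_range_hz=40.0):
    """points maps block index -> (frequency_hz, energy_db) of its suspected
    howl point. Returns runs of adjacent block indices of length >= min_run."""
    groups, run = [], []
    for idx in sorted(points):
        adjacent = bool(run) and idx == run[-1] + 1
        if adjacent and abs(points[idx][0] - points[run[-1]][0]) <= freq_range_hz:
            run.append(idx)
        else:
            if len(run) >= min_run:
                groups.append(run)
            run = [idx]
    if len(run) >= min_run:
        groups.append(run)
    return groups


def contains_howl(points, min_run=3, freq_range_hz=40.0):
    groups = find_groups(points, min_run, freq_range_hz)
    if len(groups) < 2:
        return False  # a single group cannot show a periodic pattern
    starts = [g[0] for g in groups]
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    periodic = len(set(gaps)) == 1  # assumption: equal spacing between groups
    energies = [points[i][1] for g in groups for i in g]
    rising = all(b > a for a, b in zip(energies, energies[1:]))
    return periodic and rising
```

With only two groups the spacing test is trivially satisfied, so a stricter period check may be preferable when more blocks are available.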

For the second way, the block division and the preset analysis mode may follow the related description of the first way and are not repeated here.

Specifically, the preset analysis mode described above is used to analyze whether a suspected howl point exists in the first data block. If one exists, the suspected howl point has appeared for the first time, and the first data block is determined to be the initial data block. If none exists, the next data block becomes the new current data block and is analyzed in the same way. This continues until the data block in which a suspected howl point first appears is determined to be the initial data block. If no suspected howl point exists in any of the M data blocks, the current simulated sound data may be considered free of howling.

Taking the block division above as an example, M = 15 and 2 ≤ N ≤ 15. In spectrum analysis, the length of the data being analyzed affects the result: with fewer data points the precision may be lower. Analyzing again with longer data therefore acts as a correction and allows howling to be determined more accurately. The present application does not limit the specific value of N. Assume N = 4 and a data block length of 40 ms. The time range of the initial data block can then be written as 0 to 40 ms. Since the initial data block has already been analyzed, it serves as the first data segment, and analysis continues from n = 2, the second data segment, whose time range can be written as 0 to 80 ms. By analogy, the time range of the third data segment can be written as 0 to 120 ms and that of the fourth data segment as 0 to 160 ms.

Illustratively, the preset range may be set according to actual conditions, for example 40 Hz (which, in the example above, corresponds to 5 step sizes). Assume the frequencies of the suspected howl points obtained from the 4 data segments are A, B, C and D. If A, B, C and D differ from one another by no more than 40 Hz, it can be determined that the simulated sound data contains howling.
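The growing data segments and the mutual-difference check can be illustrated with a short sketch. The function names are hypothetical, segment times are measured from the start of the initial data block, and "within 40 Hz" is read as a difference of at most 40 Hz.

```python
from itertools import combinations


def segment_ranges_ms(block_ms, n_segments):
    """Time range of each data segment: segment n spans the first n blocks."""
    return [(0, block_ms * n) for n in range(1, n_segments + 1)]


def frequencies_agree(freqs_hz, preset_range_hz=40.0):
    """True if every pair of suspected howl frequencies differs by at most
    preset_range_hz."""
    return all(abs(a - b) <= preset_range_hz
               for a, b in combinations(freqs_hz, 2))
```

For N = 4 and 40 ms blocks, `segment_ranges_ms(40, 4)` gives the ranges listed in the text: 0 to 40 ms, 0 to 80 ms, 0 to 120 ms and 0 to 160 ms.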

When it is determined that the simulated sound data contains howling, the suspected howl point is determined to be a howl point.

In some embodiments, determining that the anti-howling processing event is triggered when the similarity exceeds the preset similarity value includes: when the similarity exceeds the preset similarity value, performing simulated superposition of the first sound data and the pre-stored second sound data to obtain simulated sound data; and, when it is determined that a howl point exists in the simulated sound data, determining that the anti-howling processing event is triggered. The advantage of this arrangement is that whether the anti-howling processing event needs to be triggered can be determined quickly and accurately.

In some embodiments, performing anti-howling processing on the first sound data includes: weakening the first sound data; or removing the first sound data; or determining the target audio signals in the first sound data that correspond to a preset number of the largest sub-similarities, and weakening or removing the target audio signals; or weakening or removing the audio signal in the first sound data that corresponds to the howl point. The preset number may be set according to actual requirements; for example, it may be 3. Assume the first sound data is divided into 10 data blocks, so that 10 sub-similarities are obtained. The three largest of these 10 sub-similarities are chosen as target similarities, the data in the first sound data corresponding to the target similarities is taken as the target audio signal, and the target audio signal is weakened or removed. Here, weakening may include reducing the sound energy, and removal may be understood as filtering out the corresponding audio data.
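Selecting the blocks with the largest sub-similarities and weakening them might look like the following sketch. The function name, the equal-length block split, and the gain of 0.25 are assumptions; a gain of 0.0 corresponds to removal.

```python
def weaken_top_blocks(samples, sub_similarities, top_k=3, gain=0.25):
    """Scale the samples of the top_k most similar blocks by gain.

    Returns (processed_samples, sorted_target_block_indices).
    """
    n_blocks = len(sub_similarities)
    block_len = len(samples) // n_blocks
    targets = sorted(range(n_blocks), key=lambda i: sub_similarities[i],
                     reverse=True)[:top_k]
    out = list(samples)
    for b in targets:
        start = b * block_len
        # The last block absorbs any remainder samples.
        end = len(samples) if b == n_blocks - 1 else start + block_len
        for i in range(start, end):
            out[i] = out[i] * gain
    return out, sorted(targets)
```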

In some embodiments, detecting that the anti-howling processing event is triggered includes: when the voice call group contains a target mobile terminal whose distance from the mobile terminal is less than a preset distance value, determining that the anti-howling processing event is triggered. The inventors found that, in multi-person voice scenarios, howling occurs easily when two mobile terminals are close to each other. Therefore, in the embodiments of the present application, it may first be judged whether another mobile terminal in the voice call is close to the current mobile terminal; if so, the anti-howling processing event is triggered and is then detected as triggered. The preset distance value may be set according to actual requirements, for example 20 meters or 10 meters.

In the embodiments of the present application, there are many possible ways to judge whether the voice call group contains a target mobile terminal whose distance from the mobile terminal is less than the preset distance value, and they are not limited here. Several ways are given below as illustrations.

1. A preset sound clip is played in a preset manner, and feedback information is received from the other mobile terminals in the voice call group, the feedback information including the result of each other mobile terminal's attempt to capture the sound signal corresponding to the preset sound clip. Whether the voice call group contains a target mobile terminal whose distance from the mobile terminal is less than the preset distance value is judged according to the feedback information.

The advantage of this arrangement is that the presence of a target mobile terminal can be judged quickly and accurately, so that whether the anti-howling processing event needs to be triggered is determined quickly. Illustratively, a pre-recorded or previously obtained sound clip may be played through the loudspeaker at a preset volume, or an ultrasonic clip of a preset frequency and preset intensity may be played through an ultrasonic transmitter. The preset volume, or the preset frequency and preset intensity, may be set according to the preset distance value. The result contained in the feedback information may indicate whether the other mobile terminal was able to capture the sound signal. When another mobile terminal can capture the sound signal corresponding to the preset sound clip, the distance between the two mobile terminals is less than the preset distance value. The feedback information may be forwarded by the server corresponding to the preset online game application. In addition, the feedback information may include attribute information of the captured sound signal, such as sound intensity. The intensity of the sound played by the mobile terminal is known, and sound attenuates as it propagates, with greater attenuation over longer distances. The distance between the other mobile terminal and the current mobile terminal can therefore be determined from the intensity information in the feedback information, and it can be judged whether that distance is less than the preset distance value.
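The application does not give an attenuation formula. As one possible illustration, the sketch below assumes free-field spherical spreading, in which the sound level falls by about 6 dB per doubling of distance; the feedback fields `captured` and `level_db`, the reference level at 1 m, and the function names are all hypothetical.

```python
def estimate_distance_m(played_db_at_1m, received_db):
    """Distance implied by the level drop, assuming spherical spreading:
    level(d) = level(1 m) - 20 * log10(d)."""
    return 10 ** ((played_db_at_1m - received_db) / 20.0)


def is_target_terminal(feedback, played_db_at_1m, preset_distance_m):
    """feedback is a dict such as {"captured": True, "level_db": 62.0}."""
    if not feedback.get("captured"):
        return False
    level = feedback.get("level_db")
    if level is None:
        # Capture alone is treated as "within the preset distance", because
        # the playback volume was chosen according to that distance.
        return True
    return estimate_distance_m(played_db_at_1m, level) < preset_distance_m
```

Real rooms reflect and absorb sound, so an estimate of this kind is coarse and is better used as a threshold test than as a measurement.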

2. First positioning information of the mobile terminal and second positioning information of the other mobile terminals in the voice call group are obtained. According to the first positioning information and the second positioning information, it is judged whether the voice call group contains a target mobile terminal whose distance from the mobile terminal is less than the preset distance value.

The advantage of this arrangement is that mobile terminals generally have a positioning function, so positioning information can be used to judge quickly and accurately whether a target mobile terminal exists, and whether the anti-howling processing event needs to be triggered is then determined quickly. Illustratively, the mobile terminal may obtain positioning information through satellite positioning such as the Global Positioning System (GPS) or BeiDou, or through base station positioning, network positioning, and similar means. Positioning information may include latitude and longitude coordinates. The second positioning information of the other mobile terminals in the voice call group may be forwarded to the current mobile terminal by the server corresponding to the preset online game application. The current mobile terminal compares its own first positioning information one by one with the at least one piece of second positioning information forwarded by the server, and judges whether any second positioning information lies at a distance from the first positioning information that is less than the preset distance value.
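When the positioning information is a pair of latitude and longitude coordinates, the one-by-one comparison can be sketched with the haversine great-circle formula. The choice of haversine, the mean Earth radius constant, and the function names are assumptions; the application does not specify how the distance is computed.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def has_target_terminal(first, seconds, preset_distance_m=10.0):
    """first is (lat, lon) of this terminal; seconds is a list of (lat, lon)
    pairs forwarded by the server for the other terminals."""
    return any(haversine_m(*first, *s) < preset_distance_m for s in seconds)
```

Positioning error of consumer devices is often comparable to a 10 m threshold, so a result from this check may need confirmation by one of the other ways.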

3. First WiFi information of the connection of the mobile terminal and second WiFi information of the connections of the other mobile terminals in the voice call group are obtained. According to the first WiFi information and the second WiFi information, it is judged whether the voice call group contains a target mobile terminal whose distance from the mobile terminal is less than the preset distance value.

The advantage of this arrangement is as follows. To save mobile data charges, users generally conduct voice calls over a Wi-Fi hotspot. This can be used to judge quickly and accurately whether a target mobile terminal exists, so that whether the anti-howling processing event needs to be triggered is determined quickly. Illustratively, the WiFi information may include attribute information of the Wi-Fi hotspot, such as the hotspot name or the media access control (MAC) address of the hotspot, and may also include the WiFi signal strength. The effective signal range of a Wi-Fi hotspot is limited, generally to about 50 meters. If the preset distance value is greater than the effective signal range of the hotspot, whether the voice call group contains a target mobile terminal can be determined by checking whether the hotspot attribute information in any second WiFi information is identical to that in the first WiFi information. If the hotspot attribute information of any second WiFi information matches that of the first WiFi information, it is determined that a target mobile terminal exists in the voice call group. In other words, when another mobile terminal in the voice call group is connected to the same Wi-Fi hotspot as the current mobile terminal, that other mobile terminal is considered the target mobile terminal. In addition, if the preset distance value is less than the effective signal range of the hotspot, for example 10 meters, the distance between mobile terminals connected to the same Wi-Fi hotspot can be further estimated from the WiFi signal strength, or the distance of each of the two mobile terminals from the Wi-Fi hotspot can be determined, in order to judge whether the distance is less than the preset distance value.
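The two cases above can be sketched as follows. The dictionary fields `bssid` and `rssi_dbm`, the log-distance path-loss model with its parameters, and the use of the sum of the two terminal-to-hotspot distances as an upper bound on the terminal separation are all assumptions made for illustration; the application states only that signal strength may be used.

```python
def same_hotspot(first, second):
    """Compare hotspot MAC addresses (BSSID), ignoring letter case."""
    return first["bssid"].lower() == second["bssid"].lower()


def rssi_to_distance_m(rssi_dbm, rssi_at_1m_dbm=-40.0, path_loss_exp=2.5):
    """Log-distance path-loss estimate; both parameters are hypothetical."""
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10.0 * path_loss_exp))


def wifi_indicates_target(first, second, preset_distance_m, hotspot_range_m=50.0):
    if not same_hotspot(first, second):
        return False
    if preset_distance_m > hotspot_range_m:
        return True  # sharing a hotspot already implies being within range
    d1 = rssi_to_distance_m(first["rssi_dbm"])
    d2 = rssi_to_distance_m(second["rssi_dbm"])
    # By the triangle inequality the terminals are at most d1 + d2 apart.
    return d1 + d2 < preset_distance_m
```

Because the sum is only an upper bound, this check can miss nearby terminals that are both far from the hotspot; it errs on the side of not triggering.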

4. Third sound data captured by the microphone is obtained, and downlink voice call data in the mobile terminal is obtained, where the third sound data does not contain sound played by the loudspeaker of the mobile terminal. According to whether the third sound data and the downlink voice call data contain the voice of the same person, it is judged whether the voice call group contains a target mobile terminal whose distance from the mobile terminal is less than the preset distance value.

The advantage of this arrangement is that the presence of a target mobile terminal can be judged quickly and accurately without relying on other information (such as the positioning information or WiFi information above), so that whether the anti-howling processing event needs to be triggered is determined quickly. Illustratively, the third sound data can be kept free of sound played by the mobile terminal's loudspeaker in either of the following ways: the loudspeaker of the mobile terminal is turned off while the third sound data and the downlink voice call data are obtained; or the loudspeaker is on while they are obtained, and the third sound data is what remains after the sound played by the loudspeaker is filtered out of all the sound data captured by the microphone. Consider two users holding mobile terminals close to each other: user A uses mobile terminal A and user B uses mobile terminal B. User A's speech is captured by the microphone of mobile terminal A and sent to mobile terminal B, so the downlink voice call data of mobile terminal B contains user A's speech. Because user A and user B are close together, user A's speech is also captured by the microphone of mobile terminal B. For mobile terminal B, therefore, the third sound data captured by the microphone and the obtained downlink voice call data both contain the voice of the same person (user A). It is thus determined that the distance between mobile terminal A and mobile terminal B in the voice call group is less than the preset distance value; that is, for mobile terminal B, mobile terminal A is the target mobile terminal.

It can be understood that any one of the above ways, or a combination of several, may be chosen according to actual conditions to judge whether a target mobile terminal exists; the embodiments of the present application do not limit this. Moreover, the steps for judging whether a target mobile terminal exists may also be performed by the server corresponding to the preset online game application. When the server judges that a target mobile terminal exists, it sends the judgment result to the mobile terminal, and the judgment result instructs the mobile terminal to trigger the anti-howling processing event. Correspondingly, the method of the embodiments of the present application further includes: receiving the judgment result sent by the server corresponding to the preset online game application, and triggering the anti-howling processing event when the judgment result indicates that the voice call group contains a target mobile terminal whose distance from the mobile terminal is less than the preset distance value. The specific judgment process of the server may follow the judgment ways given above and is not repeated here.

In some embodiments, performing anti-howling processing on the first sound data to obtain uplink voice call data includes: separating the human voice from the background sound in the first sound data; weakening the separated background sound; and mixing the weakened background sound with the separated human voice to obtain the uplink voice call data. The advantage of this arrangement is that howling caused by the background sound can be effectively prevented. Illustratively, when the mobile terminal has a microphone array (two or more microphones), the sound source position can be determined, and sound that originates far from the mobile terminal (for example more than 1 meter away) can be separated out as background sound according to the sound source position. Alternatively, the voiceprint information of the mobile terminal user may be obtained in advance; the user's speech is extracted from the sound data as the human voice according to the voiceprint information, and the remaining sound is treated as background sound. Illustratively, weakening the separated background sound may mean lowering its volume by adjusting the gain, or filtering the background sound out entirely. After weakening, the background sound is quieter, which breaks the condition under which the sound grows louder and louder and thus effectively reduces howling caused by the background sound.

Optionally, before the weakened background sound is mixed with the separated human voice to obtain the uplink voice call data, the method further includes: enhancing the separated human voice. In that case, mixing the weakened background sound with the separated human voice to obtain the uplink voice call data includes: mixing the weakened background sound with the enhanced human voice to obtain the uplink voice call data. The advantage of this arrangement is that, while anti-howling processing is applied to the uplink voice call data of the current mobile terminal in time, the other users in the voice call group can also hear the user of the current mobile terminal more clearly.
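Assuming the human voice and the background sound have already been separated into two sample sequences of equal length with values in [-1, 1], the weakening, enhancement and mixing can be sketched with simple gains. The function name, both gain values, and the clipping step are assumptions; the separation itself (microphone array or voiceprint) is outside this sketch.

```python
def mix_uplink(voice, background, background_gain=0.2, voice_gain=1.5):
    """Mix an enhanced voice track with a weakened background track.

    background_gain=0.0 filters the background sound out entirely.
    """
    mixed = []
    for v, b in zip(voice, background):
        s = v * voice_gain + b * background_gain
        mixed.append(max(-1.0, min(1.0, s)))  # clip to the valid sample range
    return mixed
```

Applying the two gains in either order, or at the same time as here, gives the same result, since each gain acts on its own track before the sum.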

Fig. 2 is a schematic flowchart of another voice call data detection method provided by an embodiment of the present application, taking an online game application as the preset application. The method includes the following steps:

Step 201: detect that the voice call group in the preset game application is successfully established.

Illustratively, take a team battle game such as Honor of Kings as an example. Each team has 5 players, and the red and blue teams fight each other. The 5 players of each team need to communicate to discuss battle strategy, so many players choose to turn on the in-team voice call function. Once a player requests to turn on the in-team voice call function, the voice call group is successfully established. After that, any of the 5 players on the same team can hear the other 4 players speak. Players generally set the mobile terminal to speaker mode for convenience while gaming.

Step 202: obtain the first sound data captured by the microphone of the mobile terminal in the current period.

Step 203: divide the first sound data into blocks, and compare each data block with the second sound data to obtain the sub-similarity corresponding to that data block.

Here, the second sound data is the sound clip captured by the microphone of the mobile terminal in the previous period.

Step 204: sum the sub-similarities to obtain the similarity between the first sound data and the second sound data.

Step 205: judge whether the similarity is greater than the preset similarity value; if so, execute step 206; otherwise, return to step 202.

Step 206: determine the target audio signals in the first sound data that correspond to a preset number of the largest sub-similarities.

Step 207: weaken or remove the target audio signals to obtain the uplink voice call data, and send the uplink voice call data to the server corresponding to the preset application.

In this embodiment of the present application, after the voice call group in the game application is successfully established, the first sound data captured by the microphone of the mobile terminal in the current period is obtained and divided into blocks, and each data block is compared with the pre-stored second sound data captured in the previous period to determine the similarity between the first sound data and the second sound data. When the similarity is high, the target audio signals in the first sound data that correspond to a preset number of the largest sub-similarities are weakened or removed. Anti-howling processing is thus applied to the uplink voice call data of the current mobile terminal in time, which prevents howling from interfering with the game, reduces pain points for game players, and makes the functions of the mobile terminal more complete.
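Steps 203 to 205 can be sketched as below. The application does not define the similarity measure or say which part of the second sound data each block is compared with; this sketch assumes cosine similarity between each block of the first sound data and the block at the same position in the second sound data. All function names are hypothetical.

```python
import math


def split_blocks(samples, n_blocks):
    """Split samples into n_blocks equal-length blocks (remainder dropped)."""
    size = len(samples) // n_blocks
    return [samples[i * size:(i + 1) * size] for i in range(n_blocks)]


def cosine_similarity(a, b):
    n = min(len(a), len(b))
    dot = sum(x * y for x, y in zip(a[:n], b[:n]))
    na = math.sqrt(sum(x * x for x in a[:n]))
    nb = math.sqrt(sum(y * y for y in b[:n]))
    return 0.0 if na == 0 or nb == 0 else dot / (na * nb)


def total_similarity(first, second, n_blocks=10):
    """Return (sum of sub-similarities, list of sub-similarities)."""
    subs = [cosine_similarity(f, s)
            for f, s in zip(split_blocks(first, n_blocks),
                            split_blocks(second, n_blocks))]
    return sum(subs), subs
```

Because the similarity is a sum over blocks, the preset similarity value in step 205 has to be chosen relative to the number of blocks; with 10 blocks the sum of cosine similarities lies between -10 and 10.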

Fig. 3 is a schematic flowchart of another voice call data detection method provided by an embodiment of the present application, taking an online game application as the preset application. The method includes the following steps:

Step 301: detect that the voice call group in the preset game application is successfully established.

Step 302: obtain the first sound data captured by the microphone of the mobile terminal in the current period.

Step 303: perform simulated superposition of the first sound data and the pre-stored second sound data to obtain simulated sound data.

Here, the second sound data is the sound clip captured by the microphone of the mobile terminal in the previous period.

Step 304: judge whether a howl point exists in the simulated sound data; if so, execute step 305; otherwise, return to step 302.

Step 305: weaken or remove the audio signal in the first sound data that corresponds to the howl point to obtain the uplink voice call data, and send the uplink voice call data to the server corresponding to the preset application.

In this embodiment of the present application, after the voice call group in the game application is successfully established, the first sound data captured by the microphone of the mobile terminal in the current period is obtained and superposed, in simulation, with the pre-stored second sound data captured in the previous period. When it is determined that howling exists in the superposed sound data, the audio signal in the first sound data that corresponds to the howl point is weakened or removed. Anti-howling processing is thus applied to the uplink voice call data of the current mobile terminal in time, which prevents howling from interfering with the game, reduces pain points for game players, and makes the functions of the mobile terminal more complete.
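Steps 303 to 305 can be sketched as a small pipeline. The application does not define the superposition operation; this sketch assumes a sample-wise sum, and takes the howl detector and the suppressor as caller-supplied functions (`find_howl` and `suppress` are hypothetical names). Returning `None` when no howl point is found stands for going back to step 302.

```python
def superpose(first, second, second_gain=1.0):
    """Sample-wise sum of two tracks; the shorter one is zero-padded."""
    n = max(len(first), len(second))
    out = []
    for i in range(n):
        a = first[i] if i < len(first) else 0.0
        b = second[i] if i < len(second) else 0.0
        out.append(a + second_gain * b)
    return out


def process_uplink(first, second, find_howl, suppress):
    """find_howl(samples) returns a howl point or None;
    suppress(samples, howl) returns the processed uplink data."""
    simulated = superpose(first, second)      # step 303
    howl = find_howl(simulated)               # step 304
    if howl is None:
        return None                           # caller returns to step 302
    return suppress(first, howl)              # step 305 acts on the first sound data
```

Note that detection runs on the simulated sound data while suppression is applied to the first sound data, as in step 305.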

Fig. 4 is a flow diagram of another voice communication data processing method provided by an embodiment of the present application. Taking an online game application as an example of the preset application program, the method includes the following steps:

Step 401: Detect that the voice communication group in the preset game application has been successfully established.

Step 402: Obtain the first sound data collected by the microphone of the mobile terminal in the current period.

Step 403: Determine whether the voice communication group contains a target mobile terminal whose distance from the mobile terminal is less than a preset distance value. If so, execute step 404; otherwise, repeat step 403.

Step 404: Separate the human voice and the background sound in the first sound data.

Step 405: Attenuate the separated background sound.

Step 406: Enhance the separated human voice.

Step 407: Mix the attenuated background sound with the enhanced human voice to obtain uplink voice communication data, and send the uplink voice communication data to the server corresponding to the preset application program.

It should be noted that this embodiment of the present application does not limit the execution order of step 405 and step 406. Step 405 may be executed before step 406, step 406 may be executed before step 405, or the two steps may be executed simultaneously.

In this embodiment of the present application, after the voice communication group in the game application is successfully established, the first sound data collected by the microphone of the mobile terminal in the current period is obtained. When it is determined that the voice communication group contains another mobile terminal close to the current mobile terminal, the human voice and the background sound in the first sound data are separated, and the background sound is attenuated or removed. Anti-howling processing can thus be applied quickly and in time to the uplink voice communication data of the current mobile terminal, which prevents howling from disturbing the game, reduces a pain point for game players, and improves the functionality of the mobile terminal.
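Steps 405 to 407 can be sketched as follows, assuming the separation of step 404 has already produced two equal-length sample sequences. The separation algorithm itself is not specified in the application and is outside this sketch. The function name `mix_uplink`, the gain values, and the clipping range of [-1.0, 1.0] are illustrative assumptions.

```python
def mix_uplink(voice, background, voice_gain=1.5, background_gain=0.3):
    """Attenuate the background, enhance the voice, and mix the two.

    Assumptions: samples are floats in [-1.0, 1.0]; enhancement and
    attenuation are modelled as plain gains; the mix is hard-clipped.
    """
    mixed = []
    for v, b in zip(voice, background):
        sample = v * voice_gain + b * background_gain
        mixed.append(max(-1.0, min(1.0, sample)))  # keep the result in range
    return mixed
```

Because the two gains are applied independently before the sum, the result does not depend on whether attenuation or enhancement is carried out first, which matches the statement that steps 405 and 406 may be executed in either order.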

Fig. 5 is a structural block diagram of a voice communication data processing device provided by an embodiment of the present application. The device can be implemented in software and/or hardware, is typically integrated in a mobile terminal, and can perform anti-howling processing on voice communication data by executing the voice communication data processing method. As shown in Fig. 5, the device includes:

A voice communication group detection module 501, configured to detect that a voice communication group in the preset application program has been successfully established;

A sound data acquisition module 502, configured to obtain the first sound data collected by the microphone of the mobile terminal in the current period;

An anti-howling processing module 503, configured to, when it is detected that an anti-howling processing event is triggered, perform anti-howling processing on the first sound data to obtain uplink voice communication data, and send the uplink voice communication data to the server corresponding to the preset application program.

With the voice communication data processing device provided in this embodiment of the present application, after the voice communication group of the preset application program in the mobile terminal is successfully established, and when it is detected that the anti-howling processing event is triggered, anti-howling processing can be applied in time to the uplink voice communication data of the current mobile terminal, reducing the inconvenience that howling causes to the user.

Optionally, detecting that the anti-howling processing event is triggered includes:

Comparing the first sound data with pre-stored second sound data to determine the similarity between the first sound data and the second sound data, where the second sound data is a sound clip collected by the microphone of the mobile terminal in the previous period;

When the similarity is greater than a preset similarity value, determining that the anti-howling processing event is detected as triggered.
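The similarity-based trigger can be illustrated with a short Python sketch. The application does not name a similarity measure, so cosine similarity between the two sample sequences is used here purely as an assumption; the function names and the `threshold` default are likewise hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sample sequences (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0  # silence is treated as dissimilar


def anti_howling_triggered(first, second, threshold=0.9):
    """True when the current clip closely repeats the previous one."""
    return cosine_similarity(first, second) > threshold
```

The intuition is that howling is a feedback loop, so a current clip that largely repeats the previous clip is treated as a sign that the same sound is being picked up again.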

Optionally, detecting that the anti-howling processing event is triggered includes:

Performing simulated superposition of the first sound data and pre-stored second sound data to obtain simulated sound data, where the second sound data is a sound clip collected by the microphone of the mobile terminal in the previous period;

When it is determined that howling exists in the simulated sound data, determining that the anti-howling processing event is detected as triggered.

Optionally, the anti-howling processing module is configured to:

Attenuate the first sound data;

Or remove the first sound data.

Optionally, comparing the first sound data with the pre-stored second sound data to determine the similarity between the first sound data and the second sound data includes:

Dividing the first sound data into blocks and comparing each data block with the second sound data to obtain a sub-similarity corresponding to that data block;

Summing the sub-similarities to obtain the similarity between the first sound data and the second sound data;

The anti-howling processing module is configured to:

Determine, in the first sound data, a preset number of target audio signals with the largest sub-similarities;

Attenuate or remove the target audio signals.
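The block-wise variant can be sketched as follows. Several details are assumptions made only for illustration: each block of the first sound data is compared with the time-aligned block of the second sound data, cosine similarity serves as the sub-similarity, and attenuation is a plain gain. None of these choices is stated in the application.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sample sequences (assumed sub-similarity)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def block_similarities(first, second, block_size):
    """One sub-similarity per full block; trailing samples are ignored."""
    sims = []
    for start in range(0, len(first) - block_size + 1, block_size):
        end = start + block_size
        sims.append(cosine_similarity(first[start:end], second[start:end]))
    return sims


def attenuate_top_blocks(first, sims, block_size, count, gain=0.2):
    """Scale the `count` blocks with the largest sub-similarities by `gain`."""
    ranked = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)
    out = list(first)
    for i in ranked[:count]:
        for j in range(i * block_size, (i + 1) * block_size):
            out[j] *= gain  # gain = 0.0 would correspond to removal
    return out
```

The overall similarity used for the trigger decision would then be `sum(block_similarities(...))`, while only the highest-ranked blocks are modified, so the rest of the first sound data passes through unchanged.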

Optionally, the anti-howling processing module is configured to:

Attenuate or remove the audio signal in the first sound data that corresponds to the howling point.

Optionally, detecting that the anti-howling processing event is triggered includes:

When the voice communication group contains a target mobile terminal whose distance from the mobile terminal is less than a preset distance value, determining that the anti-howling processing event is detected as triggered.
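A distance-based trigger might look like the following sketch. The passage above does not say how the distance between terminals is obtained, so the sketch assumes that each terminal in the group reports a latitude and longitude and uses the haversine great-circle formula. The function names and the 5-metre default are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby_terminal_exists(own, others, max_distance_m=5.0):
    """True if any other group member is closer than max_distance_m.

    `own` is a (lat, lon) pair; `others` is a list of (lat, lon) pairs.
    """
    return any(haversine_m(own[0], own[1], lat, lon) < max_distance_m
               for lat, lon in others)
```

In practice, satellite positioning is often too coarse to resolve a few metres indoors, so a real implementation would probably rely on a different proximity signal; the sketch only shows the threshold comparison.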

Optionally, the anti-howling processing module includes:

A sound separation unit, configured to separate the human voice and the background sound in the first sound data;

A background sound attenuation unit, configured to attenuate the separated background sound;

A mixing unit, configured to mix the attenuated background sound with the separated human voice to obtain uplink voice communication data.

Optionally, the module further includes:

A voice enhancement unit, configured to enhance the separated human voice before the attenuated background sound is mixed with the separated human voice to obtain the uplink voice communication data;

The mixing unit is configured to:

Mix the attenuated background sound with the enhanced human voice to obtain the uplink voice communication data.

Optionally, the preset application program is an online game application.

An embodiment of the present application also provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform a voice communication data processing method, the method including:

Storage medium: any of various types of memory devices or storage devices. The term "storage medium" is intended to include: installation media, such as a CD-ROM, a floppy disk, or a tape device; computer system memory or random access memory, such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, and so on; non-volatile memory, such as flash memory or magnetic media (for example, a hard disk or optical storage); registers or other similar types of memory elements. The storage medium may further include other types of memory or combinations thereof. In addition, the storage medium may be located in a first computer system in which the program is executed, or in a different, second computer system that is connected to the first computer system through a network (such as the Internet). The second computer system may provide program instructions to the first computer for execution. The term "storage medium" may include two or more storage media that reside in different locations (for example, in different computer systems connected by a network). The storage medium may store program instructions (for example, implemented as a computer program) executable by one or more processors.

Of course, in the storage medium containing computer-executable instructions provided by an embodiment of the present application, the computer-executable instructions are not limited to the voice communication data processing operations described above and may also perform related operations in the voice communication data processing method provided by any embodiment of the present application.

An embodiment of the present application provides a mobile terminal in which the voice communication data processing device provided by the embodiments of the present application can be integrated. Fig. 6 is a structural schematic diagram of a mobile terminal provided by an embodiment of the present application. The mobile terminal 600 may include: a memory 601, a processor 602, and a computer program stored in the memory 601 and runnable on the processor 602. When executing the computer program, the processor 602 implements the voice communication data processing method described in the embodiments of the present application.

With the mobile terminal provided by this embodiment of the present application, after the voice communication group of the preset application program in the mobile terminal is successfully established, and when it is detected that the anti-howling processing event is triggered, anti-howling processing can be applied in time to the uplink voice communication data of the current mobile terminal, reducing the inconvenience that howling causes to the user.

Fig. 7 is a structural schematic diagram of another mobile terminal provided by an embodiment of the present application. The mobile terminal may include: a housing (not shown), a memory 701, a central processing unit (CPU) 702 (also called a processor, hereinafter CPU), a circuit board (not shown), and a power supply circuit (not shown). The circuit board is placed inside the space enclosed by the housing; the CPU 702 and the memory 701 are arranged on the circuit board; the power supply circuit supplies power to each circuit or device of the mobile terminal; the memory 701 stores executable program code; the CPU 702 runs a computer program corresponding to the executable program code by reading the executable program code stored in the memory 701, so as to implement the following steps:

The mobile terminal further includes: a peripheral interface 703, an RF (radio frequency) circuit 705, an audio circuit 706, a speaker 711, a power management chip 708, an input/output (I/O) subsystem 709, a touch screen 712, other input/control devices 710, and an external port 704. These components communicate through one or more communication buses or signal lines 707.

It should be understood that the illustrated mobile terminal 700 is only one example of a mobile terminal. The mobile terminal 700 may have more or fewer components than shown in the drawings, may combine two or more components, or may have a different configuration of components. The various components shown in the drawings may be implemented in hardware, software, or a combination of hardware and software, including one or more signal processing and/or application-specific integrated circuits.

The mobile terminal for voice communication data processing provided in this embodiment is described in detail below, taking a mobile phone as an example.

Memory 701: the memory 701 can be accessed by the CPU 702, the peripheral interface 703, and so on. The memory 701 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic disk storage devices, flash memory devices, or other non-volatile solid-state storage devices.

Peripheral interface 703: the peripheral interface 703 can connect the input and output peripherals of the device to the CPU 702 and the memory 701.

I/O subsystem 709: the I/O subsystem 709 can connect the input/output peripherals of the device, such as the touch screen 712 and the other input/control devices 710, to the peripheral interface 703. The I/O subsystem 709 may include a display controller 7091 and one or more input controllers 7092 for controlling the other input/control devices 710. The one or more input controllers 7092 receive electrical signals from, or send electrical signals to, the other input/control devices 710, which may include physical buttons (push buttons, rocker buttons, and so on), dials, slide switches, joysticks, and click wheels. It is worth noting that an input controller 7092 may be connected to any of the following: a keyboard, an infrared port, a USB interface, or a pointing device such as a mouse.

Touch screen 712: the touch screen 712 is the input interface and output interface between the user's mobile terminal and the user. It displays visual output to the user, and the visual output may include graphics, text, icons, video, and so on.

The display controller 7091 in the I/O subsystem 709 receives electrical signals from the touch screen 712 or sends electrical signals to the touch screen 712. The touch screen 712 detects contact on the touch screen, and the display controller 7091 converts the detected contact into interaction with a user interface object displayed on the touch screen 712, thereby realizing human-computer interaction. The user interface object displayed on the touch screen 712 may be an icon for running a game, an icon for connecting to a corresponding network, and so on. It is worth noting that the device may also include an optical mouse, which is a touch-sensitive surface that does not display visual output, or an extension of the touch-sensitive surface formed by the touch screen.

RF circuit 705: mainly used to establish communication between the mobile phone and the wireless network (that is, the network side), so as to receive and send data between the mobile phone and the wireless network, for example to send and receive short messages and e-mail. Specifically, the RF circuit 705 receives and sends RF signals, which are also called electromagnetic signals. The RF circuit 705 converts electrical signals into electromagnetic signals or electromagnetic signals into electrical signals, and communicates with communication networks and other devices through the electromagnetic signals. The RF circuit 705 may include known circuits for performing these functions, including but not limited to an antenna system, an RF transceiver, one or more amplifiers, a tuner, one or more oscillators, a digital signal processor, a CODEC (COder-DECoder) chipset, a subscriber identity module (SIM), and so on.

Audio circuit 706: mainly used to receive audio data from the peripheral interface 703, convert the audio data into an electrical signal, and send the electrical signal to the speaker 711.

Speaker 711: restores to sound the voice signal that the mobile phone receives from the wireless network through the RF circuit 705, and plays the sound to the user.

Power management chip 708: supplies power to, and performs power management for, the hardware connected to the CPU 702, the I/O subsystem, and the peripheral interface.

The voice communication data processing device, storage medium, and mobile terminal provided in the above embodiments can execute the voice communication data processing method provided by any embodiment of the present application, and have the functional modules and beneficial effects corresponding to executing that method. For technical details not described in detail in the above embodiments, refer to the voice communication data processing method provided by any embodiment of the present application.

Note that the above describes only the preferred embodiments of the present application and the technical principles applied. Those skilled in the art will understand that the present application is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made without departing from the protection scope of the present application. Therefore, although the present application has been described in some detail through the above embodiments, it is not limited to those embodiments and may include other equivalent embodiments without departing from the concept of the present application; the scope of the present application is determined by the scope of the appended claims.

Claims

1. A voice communication data processing method, characterized by comprising:

When it is detected that an anti-howling processing event is triggered, performing anti-howling processing on the first sound data to obtain uplink voice communication data, and sending the uplink voice communication data to the server corresponding to the preset application program.

2. The method according to claim 1, characterized in that detecting that the anti-howling processing event is triggered comprises:

Comparing the first sound data with pre-stored second sound data to determine the similarity between the first sound data and the second sound data, wherein the second sound data is a sound clip collected by the microphone of the mobile terminal in the previous period;

3. The method according to claim 1, characterized in that detecting that the anti-howling processing event is triggered comprises:

Performing simulated superposition of the first sound data and pre-stored second sound data to obtain simulated sound data, wherein the second sound data is a sound clip collected by the microphone of the mobile terminal in the previous period;

4. The method according to claim 2 or 3, characterized in that performing anti-howling processing on the first sound data comprises:

Attenuating the first sound data;

Or removing the first sound data.

5. The method according to claim 2, characterized in that comparing the first sound data with the pre-stored second sound data to determine the similarity between the first sound data and the second sound data comprises:

Dividing the first sound data into blocks and comparing each data block with the second sound data to obtain a sub-similarity corresponding to that data block;

Summing the sub-similarities to obtain the similarity between the first sound data and the second sound data;

Performing anti-howling processing on the first sound data comprises:

Attenuating or removing the target audio signal.

6. The method according to claim 3, characterized in that performing anti-howling processing on the first sound data comprises:

7. The method according to claim 1, characterized in that detecting that the anti-howling processing event is triggered comprises:

When the voice communication group contains a target mobile terminal whose distance from the mobile terminal is less than a preset distance value, determining that the anti-howling processing event is detected as triggered.

8. The method according to claim 1, characterized in that performing anti-howling processing on the first sound data to obtain uplink voice communication data comprises:

Separating the human voice and the background sound in the first sound data;

Attenuating the separated background sound;

Mixing the attenuated background sound with the separated human voice to obtain the uplink voice communication data.

9. The method according to claim 8, characterized in that, before mixing the attenuated background sound with the separated human voice to obtain the uplink voice communication data, the method further comprises:

Enhancing the separated human voice;

Mixing the attenuated background sound with the separated human voice to obtain the uplink voice communication data comprises:

Mixing the attenuated background sound with the enhanced human voice to obtain the uplink voice communication data.

10. The method according to claim 1, characterized in that the preset application program is an online game application.

11. A voice communication data processing device, characterized by comprising:

An anti-howling processing module, configured to, when it is detected that an anti-howling processing event is triggered, perform anti-howling processing on the first sound data to obtain uplink voice communication data, and send the uplink voice communication data to the server corresponding to the preset application program.

12. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the voice communication data processing method according to any one of claims 1 to 10.

13. A mobile terminal, characterized by comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the computer program, implements the voice communication data processing method according to any one of claims 1 to 10.