CN108449496B

CN108449496B - Voice call data detection method and device, storage medium and mobile terminal

Info

Publication number: CN108449496B
Application number: CN201810201116.XA
Authority: CN
Inventors: 郑志勇; 柳明; 李智豪
Original assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Current assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date: 2018-03-12
Filing date: 2018-03-12
Publication date: 2019-12-10
Anticipated expiration: 2038-03-12
Also published as: CN108449496A

Abstract

The embodiment of the application discloses a voice call data detection method and device, a storage medium and a mobile terminal. The method comprises the following steps: after a voice call group in a preset application program is successfully established, detecting that a howling detection event is triggered; acquiring downlink voice call data with a preset time length and performing blocking processing; for each data block, acquiring, in the frequency domain, a first frequency point with the largest energy value in a high-frequency area and a second frequency point with the largest energy value in a low-frequency area, and determining that the first frequency point is a suspected howling point in the current data block when the first frequency point meets a preset suspected howling condition; and determining that howling sound exists in the downlink voice call data when a plurality of suspected howling point groups with periodic characteristics exist and the energy values corresponding to the suspected howling points are in an ascending trend according to the sequence of the data blocks to which they belong. By adopting this technical scheme, the embodiment of the application can carry out howling detection on the downlink voice call data in a timely and accurate manner.

Description

Voice call data detection method and device, storage medium and mobile terminal

Technical Field

The embodiment of the application relates to the technical field of voice call, in particular to a voice call data detection method, a voice call data detection device, a storage medium and a mobile terminal.

Background

At present, with the rapid popularization of mobile terminals, mobile terminals such as mobile phones and tablet computers have become one of the necessary communication tools for people. Communication modes between mobile terminal users are becoming more and more abundant and are no longer limited to the traditional telephone and short message services provided by mobile communication operators. In many scenarios, users tend to use internet-based communication modes, such as the voice chat and video chat functions in various social software.

In addition, the functions of application programs (APPs) in the mobile terminal are increasingly complete, and a voice call function is provided in many APPs, which facilitates communication between users of the same APP. Taking a game application as an example, some games that require interaction between players have a built-in voice call function, so that a user can talk with other players while playing the game on a mobile terminal. However, during the voice call, the voice data includes many kinds of sound, such as the speech of each player, the sound of the application program itself (e.g., background sounds or special effects of a game), and other sounds in the environment where the mobile terminal is located. The sound is therefore relatively complex, so a howling phenomenon is easily generated, which seriously affects the user.

Disclosure of Invention

The embodiment of the application provides a voice call data detection method, a voice call data detection device, a storage medium and a mobile terminal, which can timely and accurately detect howling when a voice call function in a mobile terminal application program is started.

In a first aspect, an embodiment of the present application provides a voice call data detection method, including:

After a voice call group in a preset application program is successfully established, detecting that a howling detection event is triggered;

Acquiring downlink voice call data with a preset time length in a mobile terminal, and performing blocking processing on the downlink voice call data;

For each data block, acquiring a first frequency point with the largest energy value in a high-frequency area and a second frequency point with the largest energy value in a low-frequency area on a frequency domain, and determining the first frequency point as a suspected howling point in the current data block when the first frequency point meets a preset suspected howling condition; the preset suspected howling condition comprises that the energy value of the first frequency point is greater than a preset energy threshold value, and the energy difference value between the first frequency point and the second frequency point is greater than a preset difference threshold value;

When a plurality of suspected howling point groups presenting periodic characteristics exist and energy values corresponding to the suspected howling points are in an ascending trend according to the sequence of the data blocks, determining that howling sound exists in the downlink voice call data; the suspected howling point group is a suspected howling point of which the frequency difference in the continuous adjacent data blocks is within a preset range, and the number of the continuous adjacent data blocks reaches a preset continuous threshold value.

In a second aspect, an embodiment of the present application provides a voice call data detection apparatus, including:

The trigger detection module is used for detecting that a howling detection event is triggered after a voice call group in a preset application program is successfully established;

A downlink voice data acquisition module, configured to acquire downlink voice call data with a preset time length in a mobile terminal and perform blocking processing on the downlink voice call data;

A suspected howling point determining module, configured to, for each data block, obtain, in a frequency domain, a first frequency point with a largest energy value in a high-frequency region and a second frequency point with a largest energy value in a low-frequency region, and when the first frequency point meets a preset suspected howling condition, determine that the first frequency point is a suspected howling point in a current data block; the preset suspected howling condition comprises that the energy value of the first frequency point is greater than a preset energy threshold value, and the energy difference value between the first frequency point and the second frequency point is greater than a preset difference threshold value;

A howling sound determination module, configured to determine that a howling sound exists in the downlink voice call data when multiple suspected howling point groups exhibiting periodic characteristics exist and energy values corresponding to the suspected howling points are in an ascending trend according to the sequence of the data blocks to which the suspected howling points belong; the suspected howling point group is a suspected howling point of which the frequency difference in the continuous adjacent data blocks is within a preset range, and the number of the continuous adjacent data blocks reaches a preset continuous threshold value.

In a third aspect, an embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements a voice call data detection method according to an embodiment of the present application.

In a fourth aspect, an embodiment of the present application provides a mobile terminal, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the computer program to implement the voice call data detection method according to the embodiment of the present application.

According to the voice call data detection scheme provided by the embodiment of the application, after a voice call group in a preset application program is successfully established, when a howling detection event is detected to be triggered, downlink voice call data with a preset time length in a mobile terminal are obtained, and blocking processing is carried out; for each data block, respectively determining whether a suspected howling point exists; and then, according to the distribution situation of the suspected howling points, quickly determining whether howling sound exists in the downlink voice call data. By adopting the technical scheme, after the voice call group of the preset application program in the mobile terminal is successfully established, howling detection can be timely and accurately carried out on the downlink voice call data, so that corresponding measures can be taken subsequently, and inconvenience brought to users by howling is reduced.

Drawings

Fig. 1 is a schematic flowchart of a voice call data detection method according to an embodiment of the present application;

Fig. 2 is a schematic flowchart of another voice call data detection method according to an embodiment of the present application;

Fig. 3 is a block diagram illustrating a voice call data detection apparatus according to an embodiment of the present disclosure;

Fig. 4 is a schematic structural diagram of a mobile terminal according to an embodiment of the present application;

Fig. 5 is a schematic structural diagram of another mobile terminal according to an embodiment of the present application.

Detailed Description

The technical scheme of the application is further explained below through specific implementations in combination with the attached drawings. It is to be understood that the specific embodiments described herein are merely illustrative of the application and do not limit the application. It should be further noted that, for convenience of description, only some of the structures related to the present application are shown in the drawings, not all of the structures.

Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the steps as a sequential process, many of the steps can be performed in parallel, concurrently or simultaneously. In addition, the order of the steps may be rearranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.

Fig. 1 is a flowchart illustrating a voice call data detection method according to an embodiment of the present application, where the method may be executed by a voice call data detection apparatus, where the apparatus may be implemented by software and/or hardware, and may be generally integrated in a mobile terminal. As shown in fig. 1, the method includes:

Step 101, after a voice call group in a preset application program is established successfully, it is detected that a howling detection event is triggered.

For example, the mobile terminal in the embodiment of the present application may include mobile devices such as a mobile phone and a tablet computer. The preset application program may be an application program with a built-in voice group call function, such as a network game application, an online classroom application, a video conference application, or another application that requires multi-person collaboration.

For example, the voice call group may include 2 members, but in most cases it includes 3 or more members, that is, voice calls among 3 or more mobile terminals can be realized. The voice call group can be initiated by a user through the preset application program on a mobile terminal, and after the voice call group is established successfully, all the mobile terminals included in the voice call group can communicate with each other. Generally, when the mobile terminal is not in the mute mode or the earphone mode, it can be understood to be in the play-out mode: the sound of each user in the voice call group is collected by the microphone of the mobile terminal being used by that user and, after being transmitted and processed through the network, is played through the speakers of the mobile terminals of the other users. Taking a game application as an example, if players need to form a team to cooperate, the team voice function can be started. If there are 5 players in a team, then after the voice call group is successfully established the 5 players can talk with each other, and any one player can hear the other 4 players at the same time, so that the players can communicate conveniently while playing the game. The execution subject of the technical scheme of the application, namely the current mobile terminal, may be any one mobile terminal in the voice call group, or one or more specified mobile terminals in the voice call group. That is to say, in the voice call group, any one mobile terminal, one or more specified mobile terminals, or all the mobile terminals may execute the method provided by the embodiment of the present application.

Generally, when the mobile terminal is in the play-out mode, the sound collected by its microphone includes not only the voice of the user speaking, but may also include the sound of the preset application program played by the speaker (such as background music), ambient sounds, and the speech of other members of the voice call group played by the speaker. As a result, when a plurality of mobile terminals send data containing the various sounds they have collected to the same mobile terminal through the network (for example, the voice call group includes 5 mobile terminals, 4 of which send the sound they have collected to a server, and the server sends the sound data of those 4 mobile terminals to the 5th mobile terminal), these sounds are mixed and played in that mobile terminal, which may generate a howling phenomenon.

In the embodiment of the present application, in order to perform howling detection at an appropriate time, the condition under which a howling detection event is triggered may be set in advance. Optionally, in order to perform real-time howling detection in a timely and effective manner, a howling detection event may be triggered immediately after a voice call group in a preset application program is successfully established. Optionally, in order to perform howling detection in a more targeted manner and save the extra power consumption caused by the howling detection operation, scenes in which howling easily occurs can be identified through theoretical analysis, investigation and the like, a reasonable preset scene is set, and a howling detection event is triggered when the mobile terminal is detected to be in the preset scene.

Step 102, acquiring downlink voice call data with a preset time length in the mobile terminal, and performing blocking processing on the downlink voice call data.

For example, the downlink voice call data may be data that a server corresponding to the preset application program sends to the mobile terminal after receiving the sound data of the other mobile terminals in the voice call group and applying processing such as sound mixing, or data that the server forwards directly to the mobile terminal. In the related art, after receiving the downlink voice call data from the server, the mobile terminal plays the data through the speaker without performing howling detection. In the present application, after it is detected that a howling detection event is triggered, the downlink voice call data is not played directly, but is first analyzed to determine whether howling sound exists in the downlink voice call data.

In the embodiment of the present application, the preset time length may be determined according to factors such as the specific configuration of the mobile terminal, its data processing capability, and the timeliness requirement of the voice call, and is not limited by the embodiment of the present application. For example, it may be any time length between 1 and 2 seconds. The blocking processing of the downlink voice call data may be performed according to a preset unit length, which may be, for example, 40 milliseconds. Assuming that the preset time length is 1.2 seconds and the preset unit length is 40 milliseconds, the downlink voice call data can be divided into 30 data blocks.
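The blocking step can be illustrated with a minimal sketch. It assumes 16 kHz PCM samples held in a Python list (the sampling rate is taken from the later FFT example); the function name `split_into_blocks` and the choice to drop a trailing partial block are hypothetical and not specified by the application.

```python
def split_into_blocks(samples, sample_rate_hz=16000, block_ms=40):
    """Split a sequence of PCM samples into consecutive fixed-length blocks."""
    block_len = sample_rate_hz * block_ms // 1000  # samples per block
    # Assumption: a trailing partial block is dropped so all blocks are equal.
    n_blocks = len(samples) // block_len
    return [samples[i * block_len:(i + 1) * block_len] for i in range(n_blocks)]


# 1.2 s of silence at 16 kHz stands in for real downlink voice call data.
pcm = [0] * (16000 * 1200 // 1000)
blocks = split_into_blocks(pcm)
print(len(blocks), len(blocks[0]))  # 30 blocks of 640 samples each
```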

Step 103, for each data block, acquiring a first frequency point with the largest energy value in a high-frequency area and a second frequency point with the largest energy value in a low-frequency area on a frequency domain, and when the first frequency point meets a preset suspected howling condition, determining that the first frequency point is a suspected howling point in the current data block.

The preset suspected howling condition comprises that the energy value of the first frequency point is greater than a preset energy threshold, and the energy difference value between the first frequency point and the second frequency point is greater than a preset difference threshold.

For example, each data block may first be transformed from the time domain to the frequency domain to facilitate spectral analysis. The transform mode is not limited in the embodiment of the present application, and a Fourier transform, such as the Fast Fourier Transform (FFT), may be adopted. Taking 40 ms as an example, the size of 40 ms of audio data (16 bit, 16K sampling rate) is 40 × 16 × 16/8 = 1280 bytes, which is suitable for spectrum analysis using a 1024-point FFT. The frequency range of the analysis after FFT processing is 0 to 16K/2, and the step size is (16K/2)/1024, which is about 8 Hz.
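As an illustration of the time-to-frequency transform, the following sketch computes a per-frequency-point energy value in dB for one block using a small recursive radix-2 FFT written with the standard library only. Several choices are assumptions of the sketch rather than statements of the application: the names `fft` and `block_spectrum_db`, scaling 16-bit samples so that a full-scale sine reads 0 dB, and zero-padding the 640 samples of a 40 ms block to 1024 points. The bin spacing in this sketch is `sample_rate_hz / n_fft`, which follows from how the sketch is written.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def block_spectrum_db(block, sample_rate_hz=16000, n_fft=1024):
    """Return a list of (frequency_hz, energy_db) pairs for one data block."""
    frame = [s / 32768.0 for s in block[:n_fft]]   # scale 16-bit PCM to [-1, 1)
    n_real = max(len(frame), 1)                    # samples before zero-padding
    frame += [0.0] * (n_fft - len(frame))          # zero-pad up to n_fft points
    bins = fft(frame)[: n_fft // 2]                # keep 0 Hz up to Nyquist
    spectrum = []
    for k, c in enumerate(bins):
        amplitude = 2.0 * abs(c) / n_real          # full-scale sine maps to 1.0
        energy_db = 20.0 * math.log10(max(amplitude, 1e-12))
        spectrum.append((k * sample_rate_hz / n_fft, energy_db))
    return spectrum
```

The output list can be passed directly to a per-block check of the suspected howling condition.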

In the embodiment of the application, the dividing frequency can be preset as a boundary value to divide the high-frequency area and the low-frequency area. The preset division frequency can be set according to actual conditions, for example, the preset division frequency can be set according to the frequency of human voice and the frequency characteristics of easy occurrence of howling, and can be 1KHz, 1.5KHz, 2KHz and the like. For example, the preset division frequency is 2KHz, that is, the part greater than 2KHz is a high frequency region, and the part less than or equal to 2KHz is a low frequency region. Generally, the frequency of the howling sound appears in a high-frequency area, and the sound is relatively large (i.e. the energy value is relatively high), and the suspected howling point in one data block can be quickly determined according to the distribution characteristics of the energy value.

Illustratively, the energy value corresponding to each frequency point in the data block is obtained, then the first frequency point with the largest energy value is found in the high-frequency region and the second frequency point with the largest energy value is found in the low-frequency region. If the energy value of the first frequency point is greater than a preset energy threshold (such as -30 dB), and the difference between the energy value of the first frequency point and the energy value of the second frequency point is greater than a preset difference threshold (such as 60), the first frequency point can be considered a suspected howling point in the current data block. In this way the suspected howling point can be identified quickly and accurately, which lays a foundation for improving howling detection efficiency.
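The suspected howling condition can be sketched as follows. The function name `find_suspected_howling_point` and the input layout (a list of `(frequency_hz, energy_db)` pairs) are hypothetical; the defaults use the example values from the text (2 kHz division frequency, -30 dB energy threshold, difference threshold of 60), and treating the difference threshold as a value in dB is an assumption.

```python
def find_suspected_howling_point(spectrum, split_hz=2000.0,
                                 energy_threshold_db=-30.0,
                                 diff_threshold_db=60.0):
    """Return the (frequency_hz, energy_db) of the suspected howling point
    in one data block, or None if the preset condition is not met."""
    high = [p for p in spectrum if p[0] > split_hz]    # high-frequency region
    low = [p for p in spectrum if p[0] <= split_hz]    # low-frequency region
    if not high or not low:
        return None
    first = max(high, key=lambda p: p[1])    # first frequency point
    second = max(low, key=lambda p: p[1])    # second frequency point
    if (first[1] > energy_threshold_db
            and first[1] - second[1] > diff_threshold_db):
        return first
    return None


# A loud narrow peak at 3 kHz over a very quiet low band satisfies the condition.
howl_like = [(500.0, -90.0), (1500.0, -85.0), (3000.0, -10.0), (4000.0, -50.0)]
# Ordinary speech has strong low-frequency energy, so the difference is small.
speech_like = [(500.0, -20.0), (1500.0, -25.0), (3000.0, -10.0)]
print(find_suspected_howling_point(howl_like))
print(find_suspected_howling_point(speech_like))
```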

Illustratively, for each data block, whether a suspected howling point exists is respectively determined, if yes, the suspected howling point is recorded, and whether a howling sound is contained in the current downlink voice call data is further determined.

Step 104, when a plurality of suspected howling point groups presenting periodic characteristics exist and the energy values corresponding to the suspected howling points are in an ascending trend according to the sequence of the data blocks, determining that howling sound exists in the downlink voice call data.

The suspected howling point group is a suspected howling point of which the frequency difference in the continuous adjacent data blocks is within a preset range, and the number of the continuous adjacent data blocks reaches a preset continuous threshold value.

For example, if a suspected howling point exists only in a certain data block, the whole downlink voice call audio cannot be considered to contain howling sound, because some special sounds may be mistakenly recognized as howling. For example, the harsh sound generated when objects rub against each other is generally high in frequency and loud, and is likely to be recognized as suspected howling, but such a sound is generally short in duration and is not howling sound. Therefore, a further determination needs to be added.

In the embodiment of the application, the distribution characteristics of suspected howling sounds existing in each data block are analyzed. When there are suspected howling points with small frequency differences in a plurality of consecutive adjacent data blocks, these several suspected howling points may be referred to as a group of suspected howling points. Namely, the suspected howling point group is a suspected howling point in which the frequency difference between the consecutive adjacent data blocks is within a preset range, and the number of the consecutive adjacent data blocks reaches a preset consecutive threshold. The preset continuous threshold value can be determined according to actual conditions, for example, 3; the preset range corresponding to the frequency difference can also be determined according to actual conditions, such as 40 Hz. The inventors found that howling generally exhibits a persistent characteristic in a short time and occurs periodically, and further, the sound gradually becomes louder. Therefore, in the embodiment of the present application, whether there is a howling sound in the current downlink voice call data is identified by using a determination condition that a plurality of (which may be understood as 2 or more) groups of suspected howling points exhibit a periodic characteristic and an energy value corresponding to the suspected howling points increases according to the order of the belonging data blocks, and if the above condition is satisfied, it is determined that there is a howling sound, so that the howling sound can be identified quickly and accurately.

For example, it is assumed that the downlink voice call data is divided into 30 data blocks. If suspected howling points with a frequency within the interval (A-40, A+40) are detected in the 15 data blocks numbered 1, 2, 3, 7, 8, 9, 13, 14, 15, 19, 20, 21, 25, 26 and 27, the suspected howling points corresponding to each run of 3 consecutive data blocks form one suspected howling point group. The 5 suspected howling point groups exhibit a periodic characteristic, and the energy values corresponding to the suspected howling points increase in sequence, so it is determined that the downlink voice call data contains howling sound. For another example, if suspected howling points with a frequency within the interval (B-40, B+40) are detected only in the 3 data blocks numbered 1, 2 and 3, the suspected howling points corresponding to these 3 data blocks form one suspected howling point group, but only one group exists and no periodic characteristic is present, so it is determined that the downlink voice call data does not contain howling sound.
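The two examples above can be reproduced with a short sketch. The application does not define how periodicity or the ascending trend is measured, so the following are assumptions of the sketch: groups count as periodic when the spacing between their first data blocks is constant (within `period_tolerance_blocks`), and the trend counts as ascending when the mean energy of each group is higher than that of the previous group. The function names and the input layout are hypothetical.

```python
def group_suspected_points(points, max_freq_diff_hz=40.0, min_run=3):
    """points: dict {block_index: (frequency_hz, energy_db)}.
    Returns suspected howling point groups as lists of (index, freq, energy)."""
    groups, run = [], []
    for idx in sorted(points):
        freq, energy = points[idx]
        adjacent = run and idx == run[-1][0] + 1
        if adjacent and abs(freq - run[-1][1]) <= max_freq_diff_hz:
            run.append((idx, freq, energy))
        else:
            if len(run) >= min_run:      # run reached the continuous threshold
                groups.append(run)
            run = [(idx, freq, energy)]
    if len(run) >= min_run:
        groups.append(run)
    return groups


def has_howling(points, period_tolerance_blocks=0):
    """Apply the (assumed) periodicity and ascending-energy checks."""
    groups = group_suspected_points(points)
    if len(groups) < 2:                  # "a plurality" means 2 or more groups
        return False
    starts = [g[0][0] for g in groups]
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    periodic = max(gaps) - min(gaps) <= period_tolerance_blocks
    means = [sum(p[2] for p in g) / len(g) for g in groups]
    rising = all(b > a for a, b in zip(means, means[1:]))
    return periodic and rising


# First example: blocks 1-3, 7-9, 13-15, 19-21, 25-27 with growing energy.
blocks_a = [1, 2, 3, 7, 8, 9, 13, 14, 15, 19, 20, 21, 25, 26, 27]
example_a = {i: (3000.0, -25.0 + 0.5 * i) for i in blocks_a}
# Second example: a single group in blocks 1-3.
example_b = {i: (3000.0, -25.0 + 0.5 * i) for i in [1, 2, 3]}
print(has_howling(example_a), has_howling(example_b))
```

The sketch does not check that different groups share a similar frequency; a stricter variant could add that comparison.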

According to the voice call data detection method provided by the embodiment of the application, after a voice call group in a preset application program is successfully established, when a howling detection event is detected to be triggered, downlink voice call data with a preset time length in a mobile terminal are obtained, and blocking processing is carried out; for each data block, respectively determining whether a suspected howling point exists; and then, according to the distribution situation of the suspected howling points, quickly determining whether howling sound exists in the downlink voice call data. By adopting the technical scheme, after the voice call group of the preset application program in the mobile terminal is successfully established, howling detection can be timely and accurately carried out on the downlink voice call data, so that corresponding measures can be taken subsequently, and inconvenience brought to users by howling is reduced.

In some embodiments, after determining that howling sound exists in the downlink voice call data, the method further includes: determining the suspected howling points as howling points; and performing howling suppression processing on the downlink voice call data according to the howling points. Once it is determined that howling sound exists in the downlink voice call data, the previously identified suspected howling points that satisfy the howling determination condition are in fact howling points, so howling suppression processing needs to be performed on the downlink voice call data according to these howling points, to prevent the howling sound from being played through a speaker or a receiver and affecting the user. Further, after the howling suppression processing is performed, the processed downlink voice call data is played through a speaker or an earphone.

In some embodiments, performing howling suppression processing on the downlink voice call data according to the howling points includes: selecting the frequencies corresponding to a preset number of howling points with the highest energy values as target frequencies, and performing attenuation processing on the audio signals corresponding to the target frequencies in the downlink voice call data. The preset number can be freely set, such as 1, 3, or even more, and can also be determined dynamically according to the number of howling points. The howling points can be sorted in descending order of energy value, the preset number of howling points at the front of the order are selected, and the frequencies of the selected howling points are determined as the target frequencies. The higher the energy value, the louder the howling sound and the greater its influence on the user. The advantage of this arrangement is therefore that howling suppression is targeted at the frequencies with higher energy values, which improves howling suppression efficiency and ensures the timeliness of the voice call.
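Selecting the target frequencies amounts to a sort and a slice, as the following minimal sketch shows. The function name `select_target_frequencies` and the `(frequency_hz, energy_db)` tuple layout are hypothetical.

```python
def select_target_frequencies(howling_points, preset_number=3):
    """howling_points: list of (frequency_hz, energy_db).
    Returns the frequencies of the preset_number highest-energy points."""
    ranked = sorted(howling_points, key=lambda p: p[1], reverse=True)
    return [freq for freq, _energy in ranked[:preset_number]]


points = [(3000.0, -10.0), (3500.0, -5.0), (4000.0, -20.0), (5000.0, -8.0)]
print(select_target_frequencies(points, preset_number=2))
```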

In some embodiments, performing howling suppression processing on the downlink voice call data according to the howling points may also include: performing attenuation processing on the audio signals corresponding to the frequencies of all howling points in the downlink voice call data. The advantage of this arrangement is that howling suppression is performed comprehensively on all howling points, which prevents the howling sound from being played.

For example, a notch filter may be used to attenuate an audio signal corresponding to a frequency of a howling point (i.e., a target frequency) that needs to be suppressed. The notch filter can quickly attenuate an input signal at a certain frequency point so as to achieve a filtering effect of preventing the frequency signal from passing through. The type of notch filter and the specific parameter values are not limited in this application. Generally, the target frequency is used as the center frequency of the notch filter, and parameters such as processing bandwidth and gain of the notch filter can be set according to actual requirements.
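Since the application leaves the filter type open, the following sketch uses one common choice, a second-order (biquad) notch in the widely used audio-EQ-cookbook form, purely as an assumption. The function names `notch_coefficients` and `apply_biquad`, the default `q` value, and the 16 kHz sampling rate are hypothetical illustration choices.

```python
import math


def notch_coefficients(center_hz, sample_rate_hz=16000, q=30.0):
    """Return normalized (b, a) coefficients of a biquad notch filter."""
    w0 = 2.0 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)    # larger q gives a narrower notch
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a


def apply_biquad(samples, b, a):
    """Filter a sample sequence with a direct-form I biquad."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


# Attenuate a 3 kHz target frequency; a 1 kHz tone should pass almost unchanged.
fs = 16000
b, a = notch_coefficients(3000.0, fs)
howl = [math.sin(2 * math.pi * 3000 * n / fs) for n in range(fs)]
voice = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs)]
howl_out = apply_biquad(howl, b, a)
voice_out = apply_biquad(voice, b, a)
```

When several target frequencies are selected, one such filter per frequency can be applied in sequence.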

In some embodiments, after determining the suspected howling point as the howling point, the method may further include: setting a suppression flag for the howling point. After performing howling suppression processing on the downlink voice call data according to the howling point, the method further includes: continuing to acquire downlink voice call data with the preset time length; when the new downlink voice call data contains a suspected howling point, judging whether a suppression flag has been set for that suspected howling point; and if so, performing howling suppression processing on the new downlink voice call data according to the suspected howling point for which the suppression flag has been set. The advantage of this arrangement is as follows: after a segment of downlink voice call data containing howling sound, suspected howling points usually continue to exist, and a suspected howling point that already appeared in the previous segment of downlink voice call data is very likely a howling point. Such a point can therefore be suppressed directly without further determination, which saves the howling point determination steps, reduces power consumption, and improves the timeliness of the voice call. Optionally, if not, whether the suspected howling point is a howling point continues to be determined in the manner of the above embodiment (i.e., step 104). Optionally, after setting the suppression flag for the howling point, the method further includes: updating a howling index according to the howling point for which the suppression flag has been set, so that the time at which the howling point appears is recorded in time, which makes it convenient to subsequently judge the time difference between a suspected howling point and a howling point with a suppression flag, and thus to judge more accurately whether the suspected howling point is a howling point. After a suspected howling point is further determined to be a howling point in step 104, a suppression flag may be set for the new howling point, and the howling index may be updated.
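The suppression-flag shortcut can be sketched as a small registry. Everything named here is hypothetical: the class `SuppressionRegistry`, the function `next_action`, modelling the howling index as a mapping from flagged frequency to the index of the segment in which it was last confirmed, and matching a new suspected point to a flagged one within a frequency tolerance of 40 Hz.

```python
class SuppressionRegistry:
    """Remembers frequencies already confirmed as howling points."""

    def __init__(self, tolerance_hz=40.0):
        self.tolerance_hz = tolerance_hz
        # Assumed form of the howling index: frequency -> last segment index.
        self.howling_index = {}

    def flag(self, frequency_hz, segment_index):
        """Set the suppression flag and update the howling index."""
        self.howling_index[frequency_hz] = segment_index

    def is_flagged(self, frequency_hz):
        return any(abs(frequency_hz - f) <= self.tolerance_hz
                   for f in self.howling_index)


def next_action(registry, frequency_hz):
    """Decide how to treat a suspected howling point in a new segment."""
    if registry.is_flagged(frequency_hz):
        return "suppress"        # skip the group and periodicity checks
    return "run_step_104"        # fall back to the full determination


registry = SuppressionRegistry()
first = next_action(registry, 3000.0)      # nothing flagged yet
registry.flag(3000.0, segment_index=0)     # confirmed by step 104
second = next_action(registry, 3020.0)     # close to a flagged frequency
print(first, second)
```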

In some embodiments, detecting that the howling detection event is triggered includes: judging whether a target mobile terminal whose distance from the mobile terminal is smaller than a preset distance value exists in the voice call group, and if so, determining that the howling detection event is triggered. In multi-person voice scenarios, the inventors found that howling occurs very easily when two mobile terminals are close to each other. Suppose mobile terminal A and mobile terminal B in the voice call group are close to each other. The speaker of mobile terminal A amplifies and plays the received sound collected by the microphone of mobile terminal B. Because the two mobile terminals are close, this sound is collected again by the microphone of mobile terminal B and sent to mobile terminal A, where it is amplified and played again, so positive feedback amplification of the sound is easily formed and howling is generated. Therefore, in the embodiment of the present application, it may be judged whether another mobile terminal in the voice call group is close to the current mobile terminal; if so, the howling detection event is triggered, and the triggering of the howling detection event is then detected. The preset distance value may be, for example, 20 meters or 10 meters, and may be set according to actual requirements.

In the embodiment of the present application, there may be many specific ways for determining whether there is a target mobile terminal in the voice call group whose distance from the mobile terminal is smaller than the preset distance value, and the specific ways are not limited, and several ways are given below as schematic descriptions.

1. Playing a preset sound segment in a preset mode, and receiving feedback information of other mobile terminals in the voice call group, wherein the feedback information comprises a result of the other mobile terminals trying to acquire sound signals corresponding to the preset sound segment; and judging whether a target mobile terminal with the distance between the target mobile terminal and the mobile terminal being smaller than a preset distance value exists in the voice call group according to the feedback information.

The advantage of this arrangement is that whether a target mobile terminal exists can be judged quickly and accurately, so that whether the howling detection event needs to be triggered can be determined quickly. Illustratively, a prerecorded or previously obtained sound clip may be played through the loudspeaker at a preset volume, or an ultrasonic segment of preset frequency and preset intensity may be played through an ultrasonic transmitter. Correspondingly, the other mobile terminals can collect the sound signal corresponding to the preset sound segment through a microphone or an ultrasonic receiver. The preset volume, or the preset frequency and preset intensity, can be set according to the preset distance value. The result included in the feedback information may indicate whether the other mobile terminal was able to collect the sound signal. When another mobile terminal is able to collect the sound signal corresponding to the preset sound segment, the distance between the two mobile terminals is smaller than the preset distance value. The feedback information can be forwarded by the server corresponding to the preset application program. In addition, the feedback information may further include attribute information of the collected sound signal, such as sound intensity. Because the intensity of the sound played by the mobile terminal is known, and sound attenuates as it propagates (the farther it travels, the greater the attenuation), the distance between the other mobile terminal and the current mobile terminal can be determined from the intensity information of the sound signal in the feedback information, and it can then be judged whether that distance is smaller than the preset distance value.
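
The intensity-based distance estimate can be illustrated with a short sketch. It assumes free-field spherical spreading, in which the sound level falls by 20·log10(d/d_ref) dB; the embodiment does not specify an attenuation model, so this model and the names `estimate_distance_m` and `is_within_preset_distance` are assumptions.

```python
def estimate_distance_m(level_at_ref_db, measured_level_db, ref_distance_m=1.0):
    """Estimate distance from the drop in sound level (assumed free-field model).

    level_at_ref_db: known level of the played clip at ref_distance_m.
    measured_level_db: level reported in the other terminal's feedback.
    """
    drop_db = level_at_ref_db - measured_level_db
    return ref_distance_m * 10 ** (drop_db / 20.0)


def is_within_preset_distance(level_at_ref_db, measured_level_db, preset_distance_m):
    """True if the estimated distance is smaller than the preset distance value."""
    return estimate_distance_m(level_at_ref_db, measured_level_db) < preset_distance_m
```

Under this model a clip that measures 80 dB at 1 meter and 60 dB at the other terminal corresponds to about 10 meters. Reflections and obstacles indoors would make a real estimate less reliable.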

2. Acquiring first positioning information of the mobile terminal and second positioning information of other mobile terminals in the voice call group; and judging whether a target mobile terminal with the distance between the target mobile terminal and the mobile terminal being smaller than the preset distance value exists in the voice call group or not according to the first positioning information and the second positioning information.

The advantage of this arrangement is that a mobile terminal generally has a positioning function, and the positioning information can be used to judge quickly and accurately whether a target mobile terminal exists, so that whether the howling detection event needs to be triggered can be determined quickly. For example, the mobile terminal may obtain the positioning information through the Global Positioning System (GPS) or the Beidou satellite system, or through base station positioning or network positioning. The positioning information may include latitude and longitude coordinates, etc. The second positioning information of the other mobile terminals in the voice call group can be forwarded to the current mobile terminal through the server corresponding to the preset application program. The current mobile terminal compares its own first positioning information, one by one, with each second positioning information forwarded by the server, and judges whether the distance between any second positioning information and the first positioning information is smaller than the preset distance value.
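
The one-by-one comparison of positioning information can be sketched with the haversine great-circle formula. The formula is a common choice for latitude and longitude coordinates rather than something the embodiment prescribes; the function names and the dictionary layout of `second_fixes` are assumptions.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def find_target_terminals(first_fix, second_fixes, preset_distance_m):
    """Return ids of terminals closer than the preset distance value.

    first_fix: (lat, lon) of the current terminal.
    second_fixes: dict mapping terminal id -> (lat, lon), as forwarded by the server.
    """
    lat0, lon0 = first_fix
    return [tid for tid, (lat, lon) in second_fixes.items()
            if haversine_m(lat0, lon0, lat, lon) < preset_distance_m]
```

Because the positioning error of a consumer device can be of the same order as a 10 or 20 meter preset distance value, a result from this method might reasonably be combined with one of the other manners.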

3. Acquiring first WiFi information connected with the mobile terminal and second WiFi information connected with other mobile terminals in the voice call group; and judging whether a target mobile terminal with the distance between the target mobile terminal and the mobile terminal being smaller than the preset distance value exists in the voice call group or not according to the first WiFi information and the second WiFi information.

The advantage of this arrangement is that, to save data charges, users generally make voice calls over a WiFi hotspot; this characteristic can be used to judge quickly and accurately whether a target mobile terminal exists, and thus to determine quickly whether a howling detection event needs to be triggered. For example, the WiFi information may include attribute information of the WiFi hotspot, such as the name of the WiFi hotspot or the Media Access Control (MAC) address of the WiFi hotspot, and may further include the WiFi signal strength. The effective signal range of a WiFi hotspot is limited, generally to a radius of about 50 meters. If the preset distance value is greater than the effective signal range of the WiFi hotspot, whether a target mobile terminal exists in the voice call group can be judged according to whether the WiFi hotspot attribute information in any second WiFi information is the same as that in the first WiFi information; if so, it is determined that a target mobile terminal exists in the voice call group. That is, when another mobile terminal in the voice call group is connected to the same WiFi hotspot as the current mobile terminal, that other mobile terminal may be considered the target mobile terminal.
In addition, if the preset distance value is smaller than the effective signal range of the WiFi hotspot, for example 10 meters, the distance between each mobile terminal connected to the same WiFi hotspot and the hotspot can be further estimated from the WiFi signal strength, so as to determine the distance between the two mobile terminals and judge whether it is smaller than the preset distance value.
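
The two WiFi cases above can be sketched as follows. The sketch assumes a log-distance path-loss model with illustrative constants (`rssi_at_1m_dbm`, `path_loss_exp`); neither the model nor the constants come from the embodiment. Since two distances to the hotspot only bound the distance between the terminals (by the triangle inequality it is at most their sum), the sketch uses that upper bound as a conservative test, which is also an assumption.

```python
def distance_to_hotspot_m(rssi_dbm, rssi_at_1m_dbm=-40.0, path_loss_exp=3.0):
    """Estimate distance to the hotspot from signal strength (assumed log-distance model)."""
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10.0 * path_loss_exp))


def is_target_by_wifi(first_wifi, second_wifi, preset_distance_m, hotspot_range_m=50.0):
    """first_wifi / second_wifi: dicts with hypothetical keys 'bssid' and 'rssi_dbm'."""
    if first_wifi["bssid"] != second_wifi["bssid"]:
        return False  # different hotspots: this manner gives no evidence of proximity
    if preset_distance_m > hotspot_range_m:
        return True   # same hotspot is sufficient when the preset distance exceeds its range
    d1 = distance_to_hotspot_m(first_wifi["rssi_dbm"])
    d2 = distance_to_hotspot_m(second_wifi["rssi_dbm"])
    # The terminals are at most d1 + d2 apart, so this test errs on the strict side.
    return d1 + d2 < preset_distance_m
```

Signal strength varies with walls, antenna orientation and interference, so any distance obtained this way would be a rough estimate.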

4. Acquiring first sound data collected by a microphone and acquiring downlink voice call data in the mobile terminal, wherein the first sound data does not contain sound played by a loudspeaker of the mobile terminal; and judging whether a target mobile terminal with the distance between the target mobile terminal and the mobile terminal being smaller than the preset distance value exists in the voice call group according to whether the first sound data and the downlink voice call data contain the voice of the same person.

The advantage of this arrangement is that whether a target mobile terminal exists can be judged quickly and accurately without using other information (such as the positioning information or WiFi information mentioned above), so that whether a howling detection event needs to be triggered can be determined quickly. Illustratively, the requirement that the first sound data not contain sound played by the loudspeaker of the mobile terminal can be met in either of two ways: the loudspeaker of the mobile terminal is turned off while the first sound data and the downlink voice call data are acquired; or the loudspeaker is on during that process, and the first sound data is the sound data that remains after the sound played by the loudspeaker is filtered out of all the sound data collected by the microphone. Suppose two users holding mobile terminals are close to each other: the first user uses the first mobile terminal and the second user uses the second mobile terminal. The voice of the first user is collected by the microphone of the first mobile terminal and sent to the second mobile terminal, so the downlink voice call data of the second mobile terminal contains the voice of the first user. Because the two users are close, the voice of the first user is also collected by the microphone of the second mobile terminal. For the second mobile terminal, therefore, the first sound data collected by the microphone and the acquired downlink voice call data both contain the voice of the same person (the first user), and it can be determined that the distance between the first mobile terminal and the second mobile terminal in the voice call group is smaller than the preset distance value; that is, for the second mobile terminal, the first mobile terminal is the target mobile terminal.

It can be understood that any one of the manners described above, or a combination of several of them, may be selected according to the actual situation to judge whether a target mobile terminal exists, and the embodiment of the present application is not limited in this respect. In addition, the steps for judging whether a target mobile terminal exists can also be performed by the server corresponding to the preset application program; when the server judges that a target mobile terminal exists, it sends a judgment result to the mobile terminal, and the judgment result instructs the mobile terminal to trigger a howling detection event. Correspondingly, the method in the embodiment of the present application further includes receiving the judgment result sent by the server corresponding to the preset application program, and triggering a howling detection event when the judgment result indicates that a target mobile terminal whose distance from the mobile terminal is smaller than the preset distance value exists in the voice call group. For the specific judgment process of the server, reference may be made to the judgment manners above, which are not described again in this embodiment of the present application.

In the embodiment of the present application, when two mobile terminals in a voice call group give rise to howling, the howling is not avoided by turning off a loudspeaker; instead, howling suppression processing is performed on the downlink voice call data. This choice follows from the particular application scenario of the embodiment of the present application. If the loudspeaker of user B's mobile terminal were turned off, the voice of user A would no longer be played on user B's mobile terminal, but the voice of user C would not be played there either; user B could not hear user C speak, and the voice call group would lose its purpose.

In some embodiments, after it is determined that howling sound exists in the downlink voice call data, the method further includes: acquiring sound data collected by the mobile terminal; separating human voice from background sound in the sound data; attenuating the separated background sound; and mixing the attenuated background sound with the separated human voice, then sending the result as uplink voice call data to the server corresponding to the preset application program. The advantage of this arrangement is that howling caused by background sound can be effectively weakened. For example, when the mobile terminal has a microphone array (two or more microphones), the position of a sound source can be determined, and sound from sources far from the mobile terminal (for example, more than 1 meter away) is identified as background sound according to the position of the source. Alternatively, voiceprint information of the user of the mobile terminal can be acquired in advance, the user's speech can be extracted from the sound data according to the voiceprint information as the human voice, and the remaining sound can be treated as background sound. For example, the separated background sound may be attenuated by reducing its volume through gain adjustment, or by filtering it. Once the background sound is attenuated, its volume falls, the condition for the sound growing ever louder is broken, and howling caused by the background sound is effectively weakened.
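
The attenuation and mixing steps can be sketched as below, taking the separated human voice and background sound as given, since the separation itself (by microphone array or voiceprint) is outside the scope of this sketch. The function names, the default gain of -12 dB and the sample range of [-1.0, 1.0] are assumptions.

```python
def attenuate(samples, gain_db):
    """Scale samples by a gain in dB (a negative value reduces the volume)."""
    g = 10 ** (gain_db / 20.0)
    return [s * g for s in samples]


def mix(voice, background, limit=1.0):
    """Sum two equal-length signals sample by sample and clip to [-limit, limit]."""
    return [max(-limit, min(limit, v + b)) for v, b in zip(voice, background)]


def build_uplink(voice, background, background_gain_db=-12.0):
    """Attenuate the separated background sound, then mix it with the human voice."""
    return mix(voice, attenuate(background, background_gain_db))
```

A gain of -20 dB, for instance, scales the background sound to one tenth of its amplitude while leaving the human voice unchanged.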

Fig. 2 is a schematic flow chart of another voice call data detection method according to an embodiment of the present application, taking an online game application as an example of the preset application. The method includes the following steps:

Step 201, detecting that the voice call group in the preset game application is successfully established.

Illustratively, take a team battle game such as Honor of Kings as an example: each team has 5 players, the red team and the blue team fight each other, and the 5 players of each team need to communicate with one another to discuss battle strategy. Many players therefore choose to turn on the in-team voice call function; for example, after one player applies to turn on the in-team voice call function, the voice call group is successfully established. Thereafter, any one of the 5 players on the same team can hear the other 4 players speak. Generally, a player sets the mobile terminal to loudspeaker mode, which makes it easier to play the game.

Step 202, judging whether a target mobile terminal with the distance between the target mobile terminal and the mobile terminal being smaller than a preset distance value exists in the voice call group, if so, executing step 203; otherwise, step 202 is repeated.

If the mobile terminals of two of the 5 players are close to each other, for example when two friends play together at home, and the mobile terminals are set to loudspeaker mode, howling is very easily caused. Therefore, in the embodiment of the present application, it may be determined whether any other mobile terminal in the voice call group is close to the current mobile terminal, and if so, howling detection is required.

Optionally, in this embodiment of the present application, whether a target mobile terminal exists may be determined by using any one or a combination of several of the manners described above, which is not limited in this embodiment of the present application.

Step 203, acquiring downlink voice call data of the preset duration in the mobile terminal.

For example, the downlink voice call data includes the sound collected by the microphones of the mobile terminals of the other 4 teammates. This sound generally includes not only the speech of the 4 teammates but also sound played by the loudspeakers of their mobile terminals and other ambient sound. Generally, the game server collects the uplink voice call data uploaded by the other 4 mobile terminals and sends it to the current mobile terminal.

Step 204, performing block processing on the downlink voice call data.

Step 205, for each data block, acquiring in the frequency domain a first frequency point with the largest energy value in a high-frequency region and a second frequency point with the largest energy value in a low-frequency region, and when the first frequency point meets a preset suspected howling condition, determining that the first frequency point is a suspected howling point in the current data block.

Step 206, judging whether there exist a plurality of suspected howling point groups that present periodic characteristics and in which the energy values corresponding to the suspected howling points trend upward in the order of the data blocks; if so, executing step 207; otherwise, returning to step 203.

Step 207, determining that howling sound exists in the downlink voice call data, and determining the suspected howling points as howling points.

Step 208, selecting the frequencies corresponding to a preset number of howling points with higher energy values as target frequencies, and attenuating the audio signals corresponding to the target frequencies in the downlink voice call data with a notch filter.

Step 209, acquiring the sound data collected by the mobile terminal, separating human voice from background sound in the sound data, attenuating the separated background sound, mixing the attenuated background sound with the separated human voice, and sending the result as uplink voice call data to the server corresponding to the preset game application.

In the embodiment of the application, after a voice call group in a game application is successfully established, howling detection is performed if a target mobile terminal close to the current mobile terminal is detected in the voice call group. When howling sound is determined to exist, suppression processing is applied to both the uplink voice call data and the downlink voice call data, so that the howling sound is effectively weakened, its interference with the game is avoided, pain points of game players are reduced, and the functions of the mobile terminal become more complete.
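
Steps 204 and 205 can be sketched as follows, using a plain discrete Fourier transform from the Python standard library. The suspected howling condition is the one stated for this method (energy of the first frequency point above a preset energy threshold, and energy difference between the first and second frequency points above a preset difference threshold). The block size, the boundary `split_hz` between the low-frequency and high-frequency regions, the dB scale and all function names are assumptions.

```python
import cmath
import math


def split_blocks(samples, block_size):
    """Step 204: cut the data into consecutive full blocks (a trailing partial block is dropped)."""
    return [samples[i:i + block_size]
            for i in range(0, len(samples) - block_size + 1, block_size)]


def bin_energies_db(block):
    """Energy in dB of each DFT bin from 0 to the Nyquist bin (naive O(N^2) DFT)."""
    n = len(block)
    energies = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(block))
        energies.append(10.0 * math.log10(abs(acc) ** 2 / n + 1e-12))
    return energies


def suspected_howling_point(block, fs, split_hz, energy_thresh_db, diff_thresh_db):
    """Step 205: return (freq_hz, energy_db) of the suspected howling point, or None.

    split_hz must lie strictly between 0 and fs / 2 so both regions are non-empty.
    """
    energies = bin_energies_db(block)
    n = len(block)
    pairs = [(e, k * fs / n) for k, e in enumerate(energies)]
    e_low, _ = max(p for p in pairs if p[1] < split_hz)         # second frequency point
    e_high, f_high = max(p for p in pairs if p[1] >= split_hz)  # first frequency point
    if e_high > energy_thresh_db and e_high - e_low > diff_thresh_db:
        return f_high, e_high
    return None
```

A practical implementation would more likely use a fast Fourier transform and a window function; the naive transform is used here only to keep the sketch self-contained.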

Fig. 3 is a block diagram of a voice call data detection apparatus according to an embodiment of the present application. The apparatus may be implemented by software and/or hardware, is generally integrated in a mobile terminal, and may perform howling detection on voice call data by executing the voice call data detection method. As shown in Fig. 3, the apparatus includes:

a trigger detection module 301, configured to detect that a howling detection event is triggered after a voice call group in a preset application program is successfully established;

a downlink voice data obtaining module 302, configured to obtain downlink voice call data of a preset time duration in a mobile terminal, and perform blocking processing on the downlink voice call data;

a suspected howling point determining module 303, configured to obtain, for each data block, a first frequency point with the largest energy value in a high-frequency region and a second frequency point with the largest energy value in a low-frequency region in the frequency domain, and when the first frequency point meets a preset suspected howling condition, determine that the first frequency point is a suspected howling point in the current data block; the preset suspected howling condition includes that the energy value of the first frequency point is greater than a preset energy threshold and that the energy difference between the first frequency point and the second frequency point is greater than a preset difference threshold;

a howling sound determination module 304, configured to determine that howling sound exists in the downlink voice call data when a plurality of suspected howling point groups with periodic characteristics exist and the energy values corresponding to the suspected howling points trend upward in the order of the data blocks to which they belong; a suspected howling point group consists of suspected howling points in consecutive adjacent data blocks whose frequency difference is within a preset range, where the number of consecutive adjacent data blocks reaches a preset continuity threshold.

According to the voice call data detection apparatus provided by the embodiment of the application, after the voice call group in the preset application program is successfully established and a howling detection event is detected to be triggered, downlink voice call data of a preset duration in the mobile terminal is acquired and divided into blocks; for each data block, it is determined whether a suspected howling point exists; and whether howling sound exists in the downlink voice call data is then determined quickly from the distribution of the suspected howling points. With this technical scheme, after the voice call group of the preset application program in the mobile terminal is successfully established, howling detection can be performed on the downlink voice call data promptly and accurately, so that corresponding measures can be taken afterwards and the inconvenience that howling causes to users is reduced.
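
The judgment based on the distribution of suspected howling points can be sketched as below. Grouping follows the definition given for module 304 (consecutive adjacent blocks, frequency difference within a preset range, run length reaching a continuity threshold). This section does not spell out how periodicity and the upward trend are tested, so the two tests used here (roughly constant spacing between group starts, and strictly increasing mean energy from group to group) are assumptions, as are all names and default values.

```python
def find_groups(points, freq_range_hz, min_run):
    """points: one entry per data block, either None or (freq_hz, energy_db).

    Returns (start, end_exclusive) block-index pairs of suspected howling point groups.
    """
    groups, start = [], None
    for i, p in enumerate(points + [None]):  # trailing None closes the last run
        if p is not None and start is None:
            start = i
        elif p is not None and abs(p[0] - points[i - 1][0]) > freq_range_hz:
            # Frequency jumped: close the current run and start a new one here.
            if i - start >= min_run:
                groups.append((start, i))
            start = i
        elif p is None and start is not None:
            if i - start >= min_run:
                groups.append((start, i))
            start = None
    return groups


def is_periodic(groups, tolerance_blocks=1):
    """Assumed test: spacing between group starts is roughly constant."""
    starts = [s for s, _ in groups]
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    return len(gaps) >= 1 and max(gaps) - min(gaps) <= tolerance_blocks


def energy_rising(points, groups):
    """Assumed test: mean energy increases strictly from one group to the next."""
    means = [sum(points[i][1] for i in range(s, e)) / (e - s) for s, e in groups]
    return all(b > a for a, b in zip(means, means[1:]))


def howling_present(points, freq_range_hz=100.0, min_run=3, min_groups=3):
    groups = find_groups(points, freq_range_hz, min_run)
    return (len(groups) >= min_groups and is_periodic(groups)
            and energy_rising(points, groups))
```

The input `points` would be the per-block results of the suspected howling point determination, with `None` for blocks that contain no suspected howling point.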

Optionally, the apparatus further comprises:

a howling point determining module, configured to determine the suspected howling point as a howling point after it is determined that howling sound exists in the downlink voice call data;

a howling suppression module, configured to perform howling suppression processing on the downlink voice call data according to the howling point.

Optionally, the howling suppression module is specifically configured to:

selecting the frequencies corresponding to a preset number of howling points with higher energy values as target frequencies, and attenuating the audio signals corresponding to the target frequencies in the downlink voice call data; or,

attenuating the audio signals corresponding to the frequencies of all howling points in the downlink voice call data.
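
Attenuating the audio signal at selected target frequencies can be sketched with a second-order notch filter. The coefficient formulas are the widely used biquad notch design from the Audio EQ Cookbook; the quality factor `q`, the default of two target frequencies and the function names are assumptions, and the embodiment does not prescribe a particular filter design.

```python
import math


def notch_coeffs(f0_hz, fs_hz, q=30.0):
    """Normalized biquad notch coefficients (b, a) centered on f0_hz."""
    w0 = 2 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2 * q)
    cosw = math.cos(w0)
    a0 = 1 + alpha
    b = [1 / a0, -2 * cosw / a0, 1 / a0]
    a = [1.0, -2 * cosw / a0, (1 - alpha) / a0]
    return b, a


def biquad(samples, b, a):
    """Direct form I: y[n] = b0 x[n] + b1 x[n-1] + b2 x[n-2] - a1 y[n-1] - a2 y[n-2]."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def suppress_howling(samples, howling_points, fs_hz, max_targets=2, q=30.0):
    """howling_points: list of (freq_hz, energy). Notch the highest-energy ones."""
    targets = sorted(howling_points, key=lambda p: p[1], reverse=True)[:max_targets]
    for f0, _ in targets:
        b, a = notch_coeffs(f0, fs_hz, q)
        samples = biquad(samples, b, a)
    return samples
```

Passing `max_targets=len(howling_points)` corresponds to the alternative of attenuating the frequencies of all howling points. A larger `q` narrows the notch, so less of the neighboring speech band is affected, at the cost of a longer settling time.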

Optionally, detecting that the howling detection event is triggered includes:

judging whether a target mobile terminal whose distance from the mobile terminal is smaller than a preset distance value exists in the voice call group, and if so, determining that a howling detection event is triggered.

Optionally, judging whether a target mobile terminal whose distance from the mobile terminal is smaller than a preset distance value exists in the voice call group includes:

playing a preset sound segment in a preset manner, and receiving feedback information from the other mobile terminals in the voice call group, wherein the feedback information includes the result of the other mobile terminals attempting to collect the sound signal corresponding to the preset sound segment; and judging, according to the feedback information, whether a target mobile terminal whose distance from the mobile terminal is smaller than the preset distance value exists in the voice call group;

or,

acquiring first positioning information of the mobile terminal and second positioning information of the other mobile terminals in the voice call group; and judging, according to the first positioning information and the second positioning information, whether a target mobile terminal whose distance from the mobile terminal is smaller than the preset distance value exists in the voice call group;

or,

acquiring first WiFi information connected with the mobile terminal and second WiFi information connected with the other mobile terminals in the voice call group; and judging, according to the first WiFi information and the second WiFi information, whether a target mobile terminal whose distance from the mobile terminal is smaller than the preset distance value exists in the voice call group;

or,

acquiring first sound data collected by a microphone and acquiring downlink voice call data in the mobile terminal, wherein the first sound data does not contain sound played by a loudspeaker of the mobile terminal; and judging, according to whether the first sound data and the downlink voice call data contain the voice of the same person, whether a target mobile terminal whose distance from the mobile terminal is smaller than the preset distance value exists in the voice call group.

Optionally, the apparatus further comprises:

a sound data acquisition module, configured to acquire sound data collected by the mobile terminal after it is determined that howling sound exists in the downlink voice call data;

a sound separation module, configured to separate human voice from background sound in the sound data;

a background sound attenuation module, configured to attenuate the separated background sound;

an uplink data sending module, configured to mix the attenuated background sound with the separated human voice and send the result as uplink voice call data to the server corresponding to the preset application program.

Optionally, the preset application program is an online game application program.

Embodiments of the present application also provide a storage medium containing computer-executable instructions, which when executed by a computer processor, are configured to perform a voice call data detection method, the method including:

Storage medium: any of various types of memory devices or storage devices. The term "storage medium" is intended to include: installation media, such as CD-ROM, floppy disk, or tape devices; computer system memory or random access memory, such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, etc.; non-volatile memory, such as flash memory, magnetic media (e.g., hard disk), or optical storage; registers or other similar types of memory elements, etc. The storage medium may also include other types of memory or combinations thereof. In addition, the storage medium may be located in a first computer system in which the program is executed, or may be located in a different second computer system connected to the first computer system through a network (such as the internet). The second computer system may provide program instructions to the first computer system for execution. The term "storage medium" may include two or more storage media that may reside in different locations, such as in different computer systems that are connected by a network. The storage medium may store program instructions (e.g., embodied as a computer program) that are executable by one or more processors.

Of course, the computer-executable instructions contained in the storage medium provided in the embodiments of the present application are not limited to the voice call data detection operations described above, and may also perform related operations in the voice call data detection method provided in any embodiment of the present application.

The embodiment of the application provides a mobile terminal, in which the voice call data detection apparatus provided by the embodiment of the application can be integrated. Fig. 4 is a schematic structural diagram of a mobile terminal according to an embodiment of the present application. The mobile terminal 400 may include: a memory 401, a processor 402, and a computer program stored on the memory 401 and executable by the processor 402, wherein the processor 402, when executing the computer program, implements the voice call data detection method according to the embodiment of the present application.

The mobile terminal provided by the embodiment of the application can perform howling detection on downlink voice call data promptly and accurately after the voice call group of the preset application program in the mobile terminal is successfully established, so that corresponding measures can be taken afterwards and the inconvenience that howling sound causes to users is reduced.

Fig. 5 is a schematic structural diagram of another mobile terminal provided in an embodiment of the present application, where the mobile terminal may include: a housing (not shown), a memory 501, a Central Processing Unit (CPU) 502 (also called processor, hereinafter referred to as CPU), a circuit board (not shown), and a power circuit (not shown). The circuit board is arranged in a space enclosed by the shell; the CPU502 and the memory 501 are provided on the circuit board; the power supply circuit is used for supplying power to each circuit or device of the mobile terminal; the memory 501 is used for storing executable program codes; the CPU502 executes a computer program corresponding to the executable program code by reading the executable program code stored in the memory 501 to implement the steps of:

The mobile terminal further includes: a peripheral interface 503, an RF (Radio Frequency) circuit 505, an audio circuit 506, a speaker 511, a power management chip 508, an input/output (I/O) subsystem 509, other input/control devices 510, a touch screen 512, and an external port 504, which communicate via one or more communication buses or signal lines 507.

It should be understood that the illustrated mobile terminal 500 is merely one example of a mobile terminal and that the mobile terminal 500 may have more or fewer components than shown, may combine two or more components, or may have a different configuration of components. The various components shown in the figures may be implemented in hardware, software, or a combination of hardware and software, including one or more signal processing and/or application specific integrated circuits.

The following describes in detail the mobile terminal for voice call data howling detection provided in this embodiment, taking a mobile phone as an example.

A memory 501, which can be accessed by the CPU 502, the peripheral interface 503, and the like. The memory 501 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic disk storage devices, flash memory devices, or other non-volatile solid-state storage devices.

A peripheral interface 503, the peripheral interface 503 may connect input and output peripherals of the device to the CPU502 and the memory 501.

An I/O subsystem 509, which I/O subsystem 509 may connect input and output peripherals on the device, such as a touch screen 512 and other input/control devices 510, to the peripheral interface 503. The I/O subsystem 509 may include a display controller 5091 and one or more input controllers 5092 for controlling other input/control devices 510. Where one or more input controllers 5092 receive electrical signals from or send electrical signals to other input/control devices 510, the other input/control devices 510 may include physical buttons (push buttons, rocker buttons, etc.), dials, slide switches, joysticks, click wheels. It is noted that the input controller 5092 may be connected to any one of: a keyboard, an infrared port, a USB interface, and a pointing device such as a mouse.

A touch screen 512, which is the input interface and output interface between the mobile terminal and the user, and which displays visual output to the user; the visual output may include graphics, text, icons, video, and the like.

The display controller 5091 in the I/O subsystem 509 receives electrical signals from the touch screen 512 or transmits electrical signals to the touch screen 512. The touch screen 512 detects a contact on the touch screen, and the display controller 5091 converts the detected contact into an interaction with a user interface object displayed on the touch screen 512, that is, it implements human-computer interaction. The user interface object displayed on the touch screen 512 may be an icon for running a game, an icon for connecting to a corresponding network, or the like. It is worth mentioning that the device may also comprise a light mouse, which is a touch-sensitive surface that does not show visual output, or an extension of the touch-sensitive surface formed by the touch screen.

The RF circuit 505 is mainly used to establish communication between the mobile phone and the wireless network (i.e., the network side), and to implement data reception and transmission between the mobile phone and the wireless network, such as sending and receiving short messages, e-mails, and the like. In particular, the RF circuit 505 receives and transmits RF signals, also referred to as electromagnetic signals. The RF circuit 505 converts electrical signals into electromagnetic signals, or electromagnetic signals into electrical signals, and communicates with communication networks and other devices via the electromagnetic signals. The RF circuit 505 may include known circuitry for performing these functions, including, but not limited to, an antenna system, an RF transceiver, one or more amplifiers, a tuner, one or more oscillators, a digital signal processor, a coder-decoder (CODEC) chipset, a Subscriber Identity Module (SIM), and so forth.

The audio circuit 506 is mainly used to receive audio data from the peripheral interface 503, convert the audio data into an electrical signal, and transmit the electrical signal to the speaker 511.

The speaker 511 is used to restore the voice signal, received by the mobile phone from the wireless network through the RF circuit 505, to sound and to play the sound to the user.

A power management chip 508, which is used for power supply and power management of the hardware connected to the CPU 502, the I/O subsystem 509, and the peripheral interface 503.

The voice call data detection device, the storage medium and the mobile terminal provided in the above embodiments can execute the voice call data detection method provided in any embodiment of the present application, and have the corresponding functional modules and beneficial effects for executing the method. For technical details not described in detail in the above embodiments, reference may be made to the voice call data detection method provided in any embodiment of the present application.

It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present application and the technical principles employed. Those skilled in the art will understand that the present application is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements and substitutions can be made by those skilled in the art without departing from the scope of the application. Therefore, although the present application has been described in some detail with reference to the above embodiments, it is not limited to those embodiments and may include other equivalent embodiments without departing from the spirit of the present application; the scope of the present application is determined by the scope of the appended claims.

Claims

1. A voice call data detection method is characterized by comprising the following steps:

detecting that a voice call group in a preset game application in a mobile terminal is successfully established, determining whether a target mobile terminal whose distance from the mobile terminal is less than a preset distance value exists in the voice call group, and if so, determining that a howling detection event is triggered;

when a plurality of suspected howling point groups exhibiting periodic characteristics exist and the energy values corresponding to the suspected howling points are in an ascending trend according to the sequence of the data blocks to which they belong, determining that howling sound exists in the downlink voice call data; wherein a suspected howling point group consists of suspected howling points in consecutive adjacent data blocks whose frequency difference is within a preset range, and the number of the consecutive adjacent data blocks reaches a preset continuity threshold;

determining the suspected howling point as a howling point, and setting a suppression flag for the howling point;

and performing howling suppression processing on the downlink voice call data according to the howling point, continuing to acquire downlink voice call data of the preset time length, and, when it is determined that the new downlink voice call data contains a suspected howling point, determining whether the suppression flag is set for that suspected howling point, and if so, performing howling suppression processing on the new downlink voice call data according to the suspected howling point for which the suppression flag is set.

2. The method according to claim 1, wherein said performing howling suppression processing on the downlink voice call data according to the howling point comprises:

3. The method of claim 1, wherein the determining whether a target mobile terminal whose distance from the mobile terminal is less than a preset distance value exists in the voice call group comprises:

or,

4. The method of claim 1, further comprising, after determining that howling sound exists in the downlink voice call data:

acquiring sound data collected by the mobile terminal;

separating voice and background sound from the sound data;

weakening the separated background sound;

and mixing the weakened background sound with the separated voice, and sending the mixed result, as uplink voice call data, to a server corresponding to the preset application program.

5. The method of claim 1, wherein the preset application program is an online game application.

6. A voice call data detection apparatus, comprising:

a trigger detection module, configured to detect that a voice call group in a preset game application in a mobile terminal is successfully established, determine whether a target mobile terminal whose distance from the mobile terminal is less than a preset distance value exists in the voice call group, and if so, determine that a howling detection event is triggered;

a howling sound determination module, configured to determine that howling sound exists in the downlink voice call data when a plurality of suspected howling point groups exhibiting periodic characteristics exist and the energy values corresponding to the suspected howling points are in an ascending trend according to the sequence of the data blocks to which the suspected howling points belong; wherein a suspected howling point group consists of suspected howling points in consecutive adjacent data blocks whose frequency difference is within a preset range, and the number of the consecutive adjacent data blocks reaches a preset continuity threshold;

a howling point determining module, configured to determine the suspected howling point as a howling point after it is determined that howling sound exists in the downlink voice call data, and to set a suppression flag for the howling point;

and a howling suppression module, configured to perform howling suppression processing on the downlink voice call data according to the howling point, continue to acquire downlink voice call data of the preset time length, and, when the new downlink voice call data contains a suspected howling point, determine whether the suppression flag is set for that suspected howling point, and if so, perform howling suppression processing on the new downlink voice call data according to the suspected howling point for which the suppression flag is set.

7. A computer-readable storage medium on which a computer program is stored, the program, when being executed by a processor, implementing the voice call data detection method according to any one of claims 1 to 5.

8. A mobile terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the voice call data detection method according to any one of claims 1 to 5 when executing the computer program.
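
For illustration only, the howling determination recited in claim 1 (suspected howling point groups formed over consecutive adjacent data blocks, periodic recurrence of the groups, and an ascending energy trend) might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `find_groups` and `has_howling`, all default threshold values, the reading of "periodic characteristics" as roughly equal spacing between group start blocks, and the reading of "ascending trend" as a strictly increasing mean energy per group are hypothetical choices that the text above does not specify.

```python
from typing import List, Optional, Tuple

# One entry per data block: (frequency in Hz, energy value) of the suspected
# howling point found in that block, or None if the block has none.
Point = Tuple[float, float]


def find_groups(blocks: List[Optional[Point]],
                max_freq_diff: float,
                min_run: int) -> List[Tuple[int, List[Point]]]:
    """Return (start_block_index, points) for every run of adjacent blocks
    whose suspected points stay within max_freq_diff of the previous block's
    point and whose length reaches min_run (the continuity threshold)."""
    groups: List[Tuple[int, List[Point]]] = []
    run: List[Point] = []
    start = 0
    # A trailing None acts as a sentinel so the last run is flushed too.
    for i, point in enumerate(blocks + [None]):
        if point is not None and run and abs(point[0] - run[-1][0]) <= max_freq_diff:
            run.append(point)
            continue
        if len(run) >= min_run:
            groups.append((start, run))
        run = [point] if point is not None else []
        start = i
    return groups


def has_howling(blocks: List[Optional[Point]],
                max_freq_diff: float = 50.0,   # assumed preset frequency range
                min_run: int = 3,              # assumed continuity threshold
                min_groups: int = 3,           # assumed meaning of "a plurality"
                period_tolerance: int = 1) -> bool:
    """Decide whether the block sequence contains howling.

    Assumption: groups are "periodic" when the gaps between their start
    blocks differ by at most period_tolerance blocks.
    Assumption: the "ascending trend" holds when the mean energy of each
    group is strictly greater than that of the preceding group.
    """
    groups = find_groups(blocks, max_freq_diff, min_run)
    if len(groups) < max(min_groups, 2):
        return False
    starts = [s for s, _ in groups]
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    periodic = max(gaps) - min(gaps) <= period_tolerance
    means = [sum(e for _, e in pts) / len(pts) for _, pts in groups]
    rising = all(a < b for a, b in zip(means, means[1:]))
    return periodic and rising


def demo_blocks(energies: List[float]) -> List[Optional[Point]]:
    """Build a test sequence: one 3-block group near 3 kHz per energy value,
    with two empty blocks between groups."""
    blocks: List[Optional[Point]] = []
    for e in energies:
        blocks += [(3000.0, e), (3010.0, e), (2995.0, e), None, None]
    return blocks
```

With these assumed defaults, `has_howling(demo_blocks([1.0, 2.0, 3.0]))` evaluates to `True`, whereas the same groups with falling energies, or a sequence with no suspected points, evaluate to `False`. In a fuller sketch, the frequencies of the confirmed groups would then be recorded with a suppression flag so that later data blocks containing the same suspected points can be suppressed directly.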