CN114710589A - Call quality evaluation method and device - Google Patents

Publication number
CN114710589A
CN114710589A CN202210312627.5A CN202210312627A CN114710589A CN 114710589 A CN114710589 A CN 114710589A CN 202210312627 A CN202210312627 A CN 202210312627A CN 114710589 A CN114710589 A CN 114710589A
Authority
CN
China
Prior art keywords
signal
sound signal
analog
digital
digital sound
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210312627.5A
Other languages
Chinese (zh)
Inventor
郎睿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Xumi Yuntu Space Technology Co Ltd
Original Assignee
Shenzhen Jizhi Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Jizhi Digital Technology Co Ltd

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M3/00: Automatic or semi-automatic exchanges
    • H04M3/22: Arrangements for supervision, monitoring or testing
    • H04M3/2236: Quality of speech transmission monitoring

Abstract

The disclosure relates to the technical field of communications and provides a method and a device for evaluating call quality. The method comprises: when it is detected that a first object and a second object are conducting a voice call, respectively acquiring, from the first object, a first device, a call channel, a second device and the second object, a first analog sound signal, a first digital sound signal, a first data packet, a second digital sound signal and a second analog sound signal corresponding to the first object within a preset time; segmenting the first analog sound signal, the first digital sound signal, the first data packet, the second digital sound signal and the second analog sound signal respectively; and determining a call quality score corresponding to the voice call based on the segmented first analog sound signal, first digital sound signal, first data packet, second digital sound signal and second analog sound signal.

Description

Call quality evaluation method and device
Technical Field
The present disclosure relates to the field of communications technologies, and in particular, to a method and an apparatus for evaluating call quality.
Background
To improve call quality, it is first necessary to measure the call quality of a call system or a voice call and to determine whether information is lost during the call. Current methods for measuring call quality obtain only two signals from two different positions in the call system and check whether the two signals are consistent, for example by comparing the signals at the input end and the output end of the call system. Such methods have low accuracy.
In the course of implementing the disclosed concept, the inventors found at least the following technical problem in the related art: existing methods for measuring call quality have low accuracy.
Disclosure of Invention
In view of this, embodiments of the present disclosure provide a method and an apparatus for evaluating call quality, an electronic device, and a computer-readable storage medium, so as to solve the problem in the prior art that the accuracy of the current method for measuring call quality is low.
In a first aspect of the embodiments of the present disclosure, a method for evaluating call quality is provided, including: when it is detected that a first object and a second object are conducting a voice call, respectively acquiring, from the first object, a first device, a call channel, a second device and the second object, a first analog sound signal, a first digital sound signal, a first data packet, a second digital sound signal and a second analog sound signal corresponding to the first object within a preset time, wherein the call system corresponding to the voice call comprises the first object, the first device, the call channel, the second device and the second object; segmenting the first analog sound signal, the first digital sound signal, the first data packet, the second digital sound signal and the second analog sound signal respectively; and determining a call quality score corresponding to the voice call based on the segmented first analog sound signal, first digital sound signal, first data packet, second digital sound signal and second analog sound signal.
In a second aspect of the embodiments of the present disclosure, an apparatus for evaluating call quality is provided, including: an obtaining module configured to, when it is detected that a first object and a second object are conducting a voice call, respectively acquire, from the first object, a first device, a call channel, a second device and the second object, a first analog sound signal, a first digital sound signal, a first data packet, a second digital sound signal and a second analog sound signal corresponding to the first object within a preset time, wherein the call system corresponding to the voice call comprises the first object, the first device, the call channel, the second device and the second object; a processing module configured to segment the first analog sound signal, the first digital sound signal, the first data packet, the second digital sound signal and the second analog sound signal respectively; and a determining module configured to determine a call quality score corresponding to the voice call based on the segmented first analog sound signal, first digital sound signal, first data packet, second digital sound signal and second analog sound signal.
In a third aspect of the disclosed embodiments, an electronic device is provided, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the steps of the above method when executing the computer program.
In a fourth aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, which stores a computer program, which when executed by a processor, implements the steps of the above-mentioned method.
Compared with the prior art, the embodiments of the present disclosure have the following beneficial effects. When it is detected that a first object and a second object are conducting a voice call, a first analog sound signal, a first digital sound signal, a first data packet, a second digital sound signal and a second analog sound signal corresponding to the first object within a preset time are respectively acquired from the first object, the first device, the call channel, the second device and the second object, which together make up the call system corresponding to the voice call. These five signals are each segmented, and a call quality score corresponding to the voice call is determined based on the segmented signals. Because the speech is observed at five positions in the call system rather than at only two, these technical means address the low accuracy of existing call quality measurement methods and improve the accuracy of call quality evaluation.
Drawings
To more clearly illustrate the technical solutions in the embodiments of the present disclosure, the drawings needed for the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without inventive efforts.
Fig. 1 is a schematic diagram of an application scenario of an embodiment of the present disclosure;
Fig. 2 is a schematic flowchart of a method for evaluating call quality according to an embodiment of the present disclosure;
Fig. 3 is a schematic structural diagram of an apparatus for evaluating call quality according to an embodiment of the present disclosure;
Fig. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, techniques, etc. in order to provide a thorough understanding of the disclosed embodiments. However, it will be apparent to one skilled in the art that the present disclosure may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present disclosure with unnecessary detail.
A method and an apparatus for evaluating call quality according to an embodiment of the present disclosure will be described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of an application scenario of an embodiment of the present disclosure. The application scenario may include terminal devices 1, 2 and 3, a server 4 and a network 5.
The terminal devices 1, 2, and 3 may be hardware or software. When they are hardware, they may be various electronic devices that have a display screen and support communication with the server 4, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers; when they are software, they may be installed in such electronic devices. The terminal devices 1, 2, and 3 may be implemented as multiple pieces of software or software modules, or as a single piece of software or software module, which is not limited in this disclosure. Further, various applications, such as a data processing application, an instant messaging tool, social platform software, a search application, or a shopping application, may be installed on the terminal devices 1, 2, and 3.
The server 4 may be a server providing various services, for example, a backend server receiving a request sent by a terminal device establishing a communication connection with the server, and the backend server may receive and analyze the request sent by the terminal device and generate a processing result. The server 4 may be one server, may also be a server cluster composed of a plurality of servers, or may also be a cloud computing service center, which is not limited in this disclosure.
The server 4 may be hardware or software. When the server 4 is hardware, it may be various electronic devices that provide various services to the terminal devices 1, 2, and 3. When the server 4 is software, it may be a plurality of software or software modules providing various services for the terminal devices 1, 2, and 3, or may be a single software or software module providing various services for the terminal devices 1, 2, and 3, which is not limited by the embodiment of the present disclosure.
The network 5 may be a wired network connected by coaxial cable, twisted pair, or optical fiber, or a wireless network that interconnects communication devices without wiring, for example Bluetooth, Near Field Communication (NFC), or infrared, which is not limited in the embodiments of the present disclosure.
A user can establish a communication connection with the server 4 via the network 5 through the terminal devices 1, 2, and 3 to receive or transmit information or the like. It should be noted that the specific types, numbers and combinations of the terminal devices 1, 2 and 3, the server 4 and the network 5 may be adjusted according to the actual requirements of the application scenarios, and the embodiment of the present disclosure does not limit this.
Fig. 2 is a flowchart illustrating a method for evaluating call quality according to an embodiment of the present disclosure. The method for evaluating call quality of fig. 2 may be performed by the terminal device or the server of fig. 1. As shown in fig. 2, the method for evaluating call quality includes:
S201, when it is detected that a first object and a second object are conducting a voice call, respectively acquiring, from the first object, the first device, the call channel, the second device and the second object, a first analog sound signal, a first digital sound signal, a first data packet, a second digital sound signal and a second analog sound signal corresponding to the first object within a preset time, where the call system corresponding to the voice call includes: the first object, the first device, the call channel, the second device, and the second object;
S202, segmenting the first analog sound signal, the first digital sound signal, the first data packet, the second digital sound signal and the second analog sound signal respectively;
S203, determining a call quality score corresponding to the voice call based on the segmented first analog sound signal, first digital sound signal, first data packet, second digital sound signal and second analog sound signal.
For example, a first object A and a second object B conduct a voice call; the first device used by A and the second device used by B are both mobile phones, and the call channel is over the air. In this embodiment, the quality of the voice call between A and B is judged solely by evaluating how one or more utterances spoken by A within the preset time are transmitted through the call system. Segmentation divides a signal into multiple segments according to a fixed time window. The first analog sound signal is the speech uttered by the first object. The first device processes the first analog sound signal to obtain the first digital sound signal; this processing is mainly analog-to-digital conversion, that is, conversion from an analog signal to a digital signal. The data packet is the data transmitted in the call channel, which uses an information transmission protocol: before transmission, the first digital sound signal is processed by the information transmission protocol to obtain the data packet, and before entering the second device from the call channel, the data packet is processed by the information transmission protocol again to obtain the second digital sound signal. The information transmission protocol may be the Real-time Transport Protocol (RTP), a network transport protocol, in which case the data packet is an RTP packet. The second device processes the second digital sound signal to obtain the second analog sound signal; this processing is mainly digital-to-analog conversion, that is, conversion from a digital signal to an analog signal.
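The fixed-time-window segmentation described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `segment_signal`, the 8 kHz sample rate and the 0.25 s window are assumptions chosen for the example, and the signal is assumed to already be available as a list of samples.

```python
from typing import List, Sequence


def segment_signal(samples: Sequence[float], sample_rate: int,
                   window_seconds: float) -> List[List[float]]:
    """Split samples into consecutive fixed-length time windows.

    The last window may be shorter when the signal length is not a
    multiple of the window size.
    """
    window = int(sample_rate * window_seconds)
    if window <= 0:
        raise ValueError("window must cover at least one sample")
    return [list(samples[i:i + window]) for i in range(0, len(samples), window)]


# Hypothetical 1 s capture at 8 kHz, cut into 0.25 s windows.
signal = [0.0] * 8000
segments = segment_signal(signal, sample_rate=8000, window_seconds=0.25)
print(len(segments), len(segments[0]))
```

The same function could be applied to each of the five captured signals once they are expressed as sample sequences; how the data packets would be mapped onto time windows is not specified in the source.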
According to the technical solution provided by the embodiments of the present disclosure, when it is detected that the first object and the second object are conducting a voice call, a first analog sound signal, a first digital sound signal, a first data packet, a second digital sound signal and a second analog sound signal corresponding to the first object within a preset time are respectively acquired from the first object, the first device, the call channel, the second device and the second object, which together make up the call system corresponding to the voice call. These five signals are each segmented, and a call quality score corresponding to the voice call is determined based on the segmented signals. These technical means address the low accuracy of existing call quality measurement methods and improve the accuracy of call quality evaluation.
In step S203, determining the call quality score corresponding to the voice call based on the segmented first analog sound signal, first digital sound signal, first data packet, second digital sound signal and second analog sound signal includes: performing analog-to-digital conversion on the segmented first analog sound signal and the segmented second analog sound signal to obtain a first signal and a second signal, respectively; processing the segmented first data packet using an information transmission protocol to obtain a third signal, where the information transmission protocol is the transmission protocol used by the call channel; applying the following processing, in order, to each of the first signal, the second signal, the third signal, the segmented first digital sound signal and the segmented second digital sound signal to obtain a fourth signal, a fifth signal, a sixth signal, a seventh signal and an eighth signal, respectively: level adjustment, filtering, time alignment and compensation, and auditory transformation; and determining the call quality score corresponding to the voice call based on the fourth signal, the fifth signal, the sixth signal, the seventh signal and the eighth signal.
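As an illustration of processing the data packets with the transmission protocol to recover a digital signal, the sketch below assumes the protocol is RTP and parses the 12-byte fixed RTP header to extract the payload bytes. The names `RtpPacket`, `parse_rtp`, `reassemble` and `make_packet` are hypothetical, and several simplifications are assumed: header extensions, padding, lost packets and sequence-number wraparound are not handled, and decoding the payload with an audio codec is left out.

```python
import struct
from typing import List, NamedTuple


class RtpPacket(NamedTuple):
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    payload: bytes


def parse_rtp(packet: bytes) -> RtpPacket:
    """Parse the 12-byte fixed RTP header and split off the payload.

    Simplification: header extensions and padding are ignored.
    """
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    if b0 >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    header_len = 12 + 4 * (b0 & 0x0F)  # skip any CSRC identifiers
    return RtpPacket(b1 & 0x7F, seq, ts, ssrc, packet[header_len:])


def reassemble(packets: List[bytes]) -> bytes:
    """Order packets by sequence number and concatenate their payloads.

    Simplification: sequence-number wraparound and lost packets are ignored.
    """
    parsed = sorted((parse_rtp(p) for p in packets),
                    key=lambda p: p.sequence_number)
    return b"".join(p.payload for p in parsed)


def make_packet(seq: int, payload: bytes) -> bytes:
    """Hypothetical helper that builds a minimal RTP packet for the demo."""
    return struct.pack("!BBHII", 0x80, 0, seq, 160 * seq, 0x1234) + payload


# Two packets arriving out of order are restored to sending order.
stream = reassemble([make_packet(2, b"\x03\x04"), make_packet(1, b"\x01\x02")])
print(stream)
```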
The first signal, the second signal, the third signal, the first digital sound signal and the second digital sound signal are all digital signals. This embodiment evaluates the call quality score based on digital signals, so the segmented first analog sound signal and second analog sound signal must first undergo analog-to-digital conversion. Level adjustment sets the amplification factor and circuit according to the output level range of the sensor and the level range of the acquisition end, leaving a certain margin. Time alignment places the first signal, the second signal, the third signal, the segmented first digital sound signal and the segmented second digital sound signal on the same time axis so that, for example, the information each of them carries at 1 s can be compared. Time compensation fills gaps: for example, if the first signal has no corresponding information at 1 s, that position can be filled with 0, so that in the end the information corresponding to all five signals has the same length. The auditory transformation computes the spectrum frame by frame using the fast Fourier transform to obtain the loudness of the signal, also called the volume, that is, how strong or weak the sound is as perceived by the human ear.
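The preprocessing steps can be sketched roughly as below. Everything here is an assumption made for illustration: level adjustment is modelled as peak normalisation with headroom, time compensation as zero-padding at the end (the signals are assumed to start at the same instant, so no delay estimation is performed), filtering is omitted, and the per-frame level is computed from a naive discrete Fourier transform, which yields the same spectrum as an FFT but more slowly. The frame level in dB is only a crude stand-in for perceived loudness.

```python
import cmath
import math
from typing import List, Sequence


def adjust_level(samples: Sequence[float], target_peak: float = 0.9) -> List[float]:
    """Scale so the largest absolute sample equals target_peak (leaves headroom)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    return [s * target_peak / peak for s in samples]


def align_and_pad(signals: Sequence[Sequence[float]]) -> List[List[float]]:
    """Zero-pad every signal at the end to the length of the longest one."""
    longest = max(len(s) for s in signals)
    return [list(s) + [0.0] * (longest - len(s)) for s in signals]


def frame_levels_db(samples: Sequence[float], frame_size: int) -> List[float]:
    """Per-frame spectral energy in dB, computed with a naive DFT."""
    levels = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energy = 0.0
        for k in range(frame_size):
            bin_value = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                            for n, x in enumerate(frame))
            energy += abs(bin_value) ** 2
        energy /= frame_size ** 2  # by Parseval, the mean square of the frame
        levels.append(10 * math.log10(energy + 1e-12))
    return levels


padded = align_and_pad([[1.0], [1.0, 2.0, 3.0]])
levels = frame_levels_db([1.0] * 8, frame_size=4)
print(padded, len(levels))
```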
Determining a call quality score corresponding to the voice call based on the fourth signal, the fifth signal, the sixth signal, the seventh signal, and the eighth signal, including: respectively extracting signal parameters of a fourth signal, a fifth signal, a sixth signal, a seventh signal and an eighth signal to sequentially obtain a first signal parameter, a second signal parameter, a third signal parameter, a fourth signal parameter and a fifth signal parameter; and determining a call quality score corresponding to the voice call based on the first signal parameter, the second signal parameter, the third signal parameter, the fourth signal parameter and the fifth signal parameter.
The signal parameter may be a parameter representing the waveform or duration of the signal, or the amplitude, frequency, or phase of each frequency component of the signal.
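A few such parameters can be extracted with very little code. The sketch below is only an example under assumptions: the function name `extract_parameters` and the choice of duration, peak amplitude and RMS amplitude are illustrative, and per-component frequency and phase (which would require a spectral analysis) are not computed.

```python
import math
from typing import Dict, Sequence


def extract_parameters(samples: Sequence[float], sample_rate: int) -> Dict[str, float]:
    """Return duration, peak amplitude and RMS amplitude of a sampled signal."""
    n = len(samples)
    if n == 0:
        return {"duration_s": 0.0, "peak": 0.0, "rms": 0.0}
    return {
        "duration_s": n / sample_rate,
        "peak": max(abs(s) for s in samples),
        "rms": math.sqrt(sum(s * s for s in samples) / n),
    }


params = extract_parameters([3.0, -4.0], sample_rate=2)
print(params)
```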
Determining a call quality score corresponding to the voice call based on the first signal parameter, the second signal parameter, the third signal parameter, the fourth signal parameter, and the fifth signal parameter, including: comparing any two signal parameters of the first signal parameter, the second signal parameter, the third signal parameter, the fourth signal parameter and the fifth signal parameter to obtain a first comparison result; and determining a call quality score corresponding to the voice call based on the first comparison result.
Comparing every pair among the first signal parameter, the second signal parameter, the third signal parameter, the fourth signal parameter and the fifth signal parameter yields ten pieces of comparison information in total (each pair of signal parameters corresponds to one piece), so the first comparison result includes ten pieces of comparison information. Each piece of comparison information represents the loss and gain of the signal between the two signal parameters and may be a score: the closer the score is to 100, the smaller the loss and gain of the signal between the two signal parameters. The call quality score may then be determined from the first comparison result by computing a weighted sum of the ten pieces of comparison information using preset weights and taking the result as the call quality score.
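The pairwise comparison and weighted sum can be sketched as follows. The source does not say how an individual comparison score is computed, so `similarity_score` below is a hypothetical relative-difference measure, each signal parameter is assumed to be a single number, and the equal weights are an arbitrary choice for the example.

```python
from itertools import combinations
from typing import Dict, Sequence, Tuple


def similarity_score(a: float, b: float) -> float:
    """Hypothetical score in [0, 100]; 100 means the two values are identical."""
    largest = max(abs(a), abs(b))
    if largest == 0.0:
        return 100.0
    return 100.0 * (1.0 - min(abs(a - b) / largest, 1.0))


def call_quality_score(params: Sequence[float],
                       weights: Dict[Tuple[int, int], float]) -> float:
    """Weighted sum of the scores of every unordered pair of parameters."""
    pairs = combinations(range(len(params)), 2)
    return sum(weights[(i, j)] * similarity_score(params[i], params[j])
               for i, j in pairs)


# Five hypothetical parameter values, one per measurement point.
params = [1.00, 0.98, 0.95, 0.95, 0.90]
pairs = list(combinations(range(len(params)), 2))  # 5 choose 2 = 10 pairs
weights = {pair: 1.0 / len(pairs) for pair in pairs}
score = call_quality_score(params, weights)
print(len(pairs), round(score, 2))
```

With weights that sum to 1, the result stays on the same 0 to 100 scale as the individual comparison scores.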
After step S201 is executed, that is, after the first analog sound signal, the first digital sound signal, the first data packet, the second digital sound signal and the second analog sound signal corresponding to the first object within the preset time have been respectively acquired from the first object, the first device, the call channel, the second device and the second object, the method further includes: processing the first data packet using an information transmission protocol to obtain a ninth signal, where the information transmission protocol is the transmission protocol used by the call channel; performing digital-to-analog conversion on the first digital sound signal, the second digital sound signal and the ninth signal to obtain a tenth signal, an eleventh signal and a twelfth signal, respectively; recognizing the first analog sound signal, the second analog sound signal, the tenth signal, the eleventh signal and the twelfth signal using automatic speech recognition to obtain a first text, a second text, a third text, a fourth text and a fifth text, respectively; and determining a call quality score corresponding to the voice call based on the first text, the second text, the third text, the fourth text and the fifth text.
The tenth signal, the eleventh signal and the twelfth signal are analog signals. This embodiment determines the call quality score based on analog signals, so the first digital sound signal, the second digital sound signal and the ninth signal undergo digital-to-analog conversion. Automatic Speech Recognition (ASR) aims to have a computer take dictation of continuous speech spoken by different people, that is, to act as a "speech dictation machine"; it is a technology that converts speech into text.
Determining a call quality score corresponding to the voice call based on the first text, the second text, the third text, the fourth text and the fifth text, including: comparing any two texts of the first text, the second text, the third text, the fourth text and the fifth text to obtain a second comparison result; and determining a call quality score corresponding to the voice call based on the second comparison result.
Comparing any two of the first text, the second text, the third text, the fourth text, and the fifth text is similar to comparing any two of the first signal parameter, the second signal parameter, the third signal parameter, the fourth signal parameter, and the fifth signal parameter.
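A minimal sketch of the pairwise text comparison, assuming the five transcripts are already available as strings (the ASR step itself is out of scope here). The similarity measure is an assumption: the source does not name one, so the example uses the ratio from Python's standard `difflib.SequenceMatcher`, scaled to 0 to 100; the transcripts are made-up examples.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Dict, Sequence, Tuple


def text_similarity(a: str, b: str) -> float:
    """Similarity of two transcripts on a 0-100 scale (100 means identical)."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def pairwise_text_scores(texts: Sequence[str]) -> Dict[Tuple[int, int], float]:
    """Score every unordered pair of transcripts."""
    return {(i, j): text_similarity(texts[i], texts[j])
            for i, j in combinations(range(len(texts)), 2)}


# Hypothetical transcripts from the five measurement points.
texts = [
    "please call me back at nine",
    "please call me back at nine",
    "please call me back at nine",
    "please call me back at nine",
    "please call me at nine",
]
scores = pairwise_text_scores(texts)
print(len(scores), scores[(0, 1)])
```

The ten scores could then be combined with preset weights in the same way as the signal-parameter comparison.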
After performing step S202, that is, after performing the segmentation processing on the first analog sound signal, the first digital sound signal, the first data packet, the second digital sound signal and the second analog sound signal, respectively, the method further includes: respectively acquiring a third analog sound signal, a third digital sound signal, a second data packet, a fourth digital sound signal and a fourth analog sound signal corresponding to a second object within preset time from the second object, the second device, a communication channel, the first device and the first object; respectively carrying out segmentation processing on the third analog sound signal, the third digital sound signal, the second data packet, the fourth digital sound signal and the fourth analog sound signal; determining a first score based on the first analog sound signal, the first digital sound signal, the first data packet, the second digital sound signal and the second analog sound signal after the segmentation processing; determining a second score based on the segmented third analog sound signal, the segmented third digital sound signal, the segmented second data packet, the segmented fourth digital sound signal and the segmented fourth analog sound signal; and determining a call quality score corresponding to the voice call based on the first score and the second score.
To further improve the accuracy of the call quality score, this embodiment evaluates both the loss and gain in transmission through the call system of one or more utterances spoken by A within the preset time and the loss and gain in transmission through the call system of one or more utterances spoken by B within the preset time. The first score may be the call quality score determined based on the segmented first analog sound signal, first digital sound signal, first data packet, second digital sound signal and second analog sound signal. The second score is determined in the same way from the segmented third analog sound signal, third digital sound signal, second data packet, fourth digital sound signal and fourth analog sound signal.
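The source does not specify how the first score and the second score are combined. One simple possibility, shown purely as an assumption, is a weighted average of the two directions; the function name `combine_scores` and the default weight of 0.5 are hypothetical.

```python
def combine_scores(first: float, second: float, weight_first: float = 0.5) -> float:
    """Weighted average of the A-to-B score and the B-to-A score."""
    if not 0.0 <= weight_first <= 1.0:
        raise ValueError("weight_first must lie between 0 and 1")
    return weight_first * first + (1.0 - weight_first) * second


# Hypothetical direction scores: A-to-B scored 80, B-to-A scored 90.
overall = combine_scores(80.0, 90.0)
print(overall)
```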
All the above optional technical solutions may be combined arbitrarily to form optional embodiments of the present application, and are not described herein again.
The following are embodiments of the disclosed apparatus that may be used to perform embodiments of the disclosed methods. For details not disclosed in the embodiments of the apparatus of the present disclosure, refer to the embodiments of the method of the present disclosure.
Fig. 3 is a schematic diagram of an apparatus for evaluating call quality according to an embodiment of the present disclosure. As shown in fig. 3, the call quality evaluation device includes:
the obtaining module 301 is configured to obtain, when it is detected that a first object and a second object perform a voice call, a first analog sound signal, a first digital sound signal, a first data packet, a second digital sound signal, and a second analog sound signal corresponding to the first object within a preset time from the first object, the first device, the call channel, the second device, and the second object, respectively, where a call system corresponding to the voice call includes: a first object, a first device, a talk channel, a second device, and a second object;
a processing module 302 configured to segment the first analog sound signal, the first digital sound signal, the first data packet, the second digital sound signal, and the second analog sound signal, respectively;
a determining module 303 configured to determine a call quality score corresponding to the voice call based on the segmented first analog sound signal, the first digital sound signal, the first data packet, the second digital sound signal and the second analog sound signal.
Optionally, the determining module 303 is further configured to perform analog-to-digital conversion on the segmented first analog sound signal and the segmented second analog sound signal, respectively, so as to obtain a first signal and a second signal in sequence; process the segmented first data packet by using an information transmission protocol to obtain a third signal, wherein the information transmission protocol is the transmission protocol used by the call channel; apply the following processing, in order, to each of the first signal, the second signal, the third signal, the segmented first digital sound signal, and the segmented second digital sound signal, so as to obtain a fourth signal, a fifth signal, a sixth signal, a seventh signal, and an eighth signal in sequence: level adjustment, filtering, time alignment and compensation, and auditory transformation; and determine a call quality score corresponding to the voice call based on the fourth signal, the fifth signal, the sixth signal, the seventh signal, and the eighth signal.
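Where the information transmission protocol is RTP, processing a data packet to recover the carried signal begins with stripping the RTP header. The sketch below parses the fixed 12-byte header layout defined in RFC 3550 and returns the payload; the names `RtpPacket` and `parse_rtp` are hypothetical, and decoding the payload into digital sound samples is codec-specific and is not shown.

```python
import struct
from typing import NamedTuple


class RtpPacket(NamedTuple):
    version: int
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    payload: bytes


def parse_rtp(packet: bytes) -> RtpPacket:
    """Parse the fixed RTP header (RFC 3550) and return the payload bytes."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte RTP header")
    first, second, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    version = first >> 6
    has_padding = bool(first & 0x20)
    has_extension = bool(first & 0x10)
    csrc_count = first & 0x0F
    offset = 12 + 4 * csrc_count  # skip the CSRC identifier list
    if has_extension:
        if len(packet) < offset + 4:
            raise ValueError("truncated header extension")
        # Extension header: 16-bit profile id, 16-bit length in 32-bit words.
        (ext_words,) = struct.unpack("!H", packet[offset + 2:offset + 4])
        offset += 4 + 4 * ext_words
    end = len(packet)
    if has_padding:
        end -= packet[-1]  # the last octet holds the padding length
    if offset > end:
        raise ValueError("malformed RTP packet")
    return RtpPacket(version, second & 0x7F, seq, timestamp, ssrc,
                     packet[offset:end])


# Hypothetical packet: version 2, payload type 0, sequence number 7.
header = struct.pack("!BBHII", 0x80, 0, 7, 160, 0x1234)
pkt = parse_rtp(header + b"\x01\x02\x03")
print(pkt.sequence_number, pkt.payload)  # 7 b'\x01\x02\x03'
```

The sequence number and timestamp fields are what would allow lost or reordered packets to be detected before the payload is compared with the signals captured at the devices.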
The first signal, the second signal, the third signal, the first digital sound signal, and the second digital sound signal are all digital signals. The embodiment of the present disclosure evaluates the call quality score based on digital signals, so the segmented first analog sound signal and second analog sound signal must first undergo analog-to-digital conversion. Level adjustment means adjusting the amplification factor and the circuit according to the output level range of the sensor and the level range of the acquisition end, while reserving a certain margin. Time alignment places the first signal, the second signal, the third signal, and the segmented first digital sound signal and second digital sound signal on the same time axis, so that, for example, the information corresponding to each of the five signals at 1 s can be identified. Time compensation fills gaps: for example, if the first signal has no corresponding information at 1 s, that position can be filled with 0, so that the information corresponding to the five signals finally has the same length. The auditory transformation computes the spectrum frame by frame using the fast Fourier transform to obtain the loudness corresponding to the signal; loudness, also called volume, is the strength of a sound as perceived by the human ear.
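Time alignment with zero-fill compensation, followed by a frame-level spectral measure, can be illustrated roughly as below. This is a sketch under stated assumptions: each signal is given with a hypothetical start index on a shared sample axis, a naive discrete Fourier transform stands in for the fast Fourier transform, and the mean-square power in decibels is only a crude proxy for loudness, not a perceptual loudness model. The function names are hypothetical.

```python
import cmath
import math
from typing import Dict, List, Tuple


def align_and_pad(signals: Dict[str, Tuple[int, List[float]]]) -> Dict[str, List[float]]:
    """Place signals on one sample axis and zero-fill where data is missing.

    Each value is (start_index, samples), where start_index is the position
    of the first sample on the shared axis. All outputs have equal length.
    """
    start = min(s for s, _ in signals.values())
    end = max(s + len(x) for s, x in signals.values())
    aligned = {}
    for name, (s, x) in signals.items():
        aligned[name] = [0.0] * (s - start) + list(x) + [0.0] * (end - s - len(x))
    return aligned


def frame_power_db(frame: List[float]) -> float:
    """Mean-square power of one frame, computed from its DFT, in dB."""
    n = len(frame)
    if n == 0:
        raise ValueError("frame must not be empty")
    spectrum = [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # By Parseval's theorem this equals the mean of the squared samples.
    power = sum(abs(c) ** 2 for c in spectrum) / (n * n)
    return 10.0 * math.log10(power + 1e-12)


# Hypothetical example: "first" starts two samples late and ends early.
aligned = align_and_pad({
    "first": (2, [0.5, 0.5]),
    "second": (0, [1.0, 1.0, 1.0, 1.0, 1.0]),
})
print(aligned["first"])  # [0.0, 0.0, 0.5, 0.5, 0.0]
```

After this step every signal has the same number of samples, so frame `i` of one signal can be compared directly with frame `i` of another.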
Optionally, the determining module 303 is further configured to extract signal parameters of a fourth signal, a fifth signal, a sixth signal, a seventh signal and an eighth signal respectively, and obtain a first signal parameter, a second signal parameter, a third signal parameter, a fourth signal parameter and a fifth signal parameter in sequence; and determining a call quality score corresponding to the voice call based on the first signal parameter, the second signal parameter, the third signal parameter, the fourth signal parameter and the fifth signal parameter.
The signal parameter may be a parameter representing the waveform or duration of the signal, or the amplitude, frequency, or phase of each frequency component of the signal.
Optionally, the determining module 303 is further configured to compare any two signal parameters of the first signal parameter, the second signal parameter, the third signal parameter, the fourth signal parameter, and the fifth signal parameter, so as to obtain a first comparison result; and determining a call quality score corresponding to the voice call based on the first comparison result.
Comparing every pair among the first signal parameter, the second signal parameter, the third signal parameter, the fourth signal parameter, and the fifth signal parameter yields ten pieces of comparison information in total (each pair of signal parameters corresponds to one piece of comparison information), so the first comparison result includes ten pieces of comparison information. Each piece of comparison information represents the loss and gain of the signal between two signal parameters and may be a score; the closer the score is to 100, the smaller the loss and gain of the signal between the two signal parameters. Determining the call quality score corresponding to the voice call based on the first comparison result may consist of weighting and summing the ten pieces of comparison information according to preset weight values and taking the result as the call quality score.
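The pairwise comparison and weighted summation can be sketched as follows. The disclosure does not specify how two signal parameters are compared or which weights are used, so `relative_score` and the default equal weights are assumptions made purely for illustration, and each signal parameter is reduced to a single number.

```python
from itertools import combinations
from typing import Callable, Dict, Optional, Tuple


def relative_score(a: float, b: float) -> float:
    """Hypothetical comparison: 100 when equal, lower as the values diverge."""
    largest = max(abs(a), abs(b))
    if largest == 0:
        return 100.0
    return 100.0 * (1.0 - abs(a - b) / (2 * largest))


def pairwise_quality_score(
    params: Dict[str, float],
    compare: Callable[[float, float], float] = relative_score,
    weights: Optional[Dict[Tuple[str, str], float]] = None,
) -> float:
    """Weighted sum of the comparison scores over every pair of parameters."""
    pairs = list(combinations(params, 2))  # five parameters give ten pairs
    if weights is None:
        # Assumed default: every pair contributes equally.
        weights = {pair: 1.0 / len(pairs) for pair in pairs}
    return sum(weights[p] * compare(params[p[0]], params[p[1]]) for p in pairs)


# Hypothetical parameters: the fifth one has lost half of its value.
params = {"p1": 1.0, "p2": 1.0, "p3": 1.0, "p4": 1.0, "p5": 0.5}
score = pairwise_quality_score(params)
print(round(score, 6))  # 90.0
```

Here the six pairs that exclude `p5` each score 100 and the four pairs that include it each score 75, so equal weights give 90. Unequal preset weights could emphasize particular stages, for example the pair spanning the whole path from the first object to the second object.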
Optionally, the processing module 302 is further configured to process the first data packet by using an information transmission protocol, so as to obtain a ninth signal, where the information transmission protocol is a transmission protocol used by the call channel; respectively carrying out digital-to-analog conversion on the first digital sound signal, the second digital sound signal and the ninth signal to sequentially obtain a tenth signal, an eleventh signal and a twelfth signal; respectively identifying a first analog sound signal, a second analog sound signal, a tenth signal, an eleventh signal and a twelfth signal by utilizing an automatic voice recognition technology to obtain a first text, a second text, a third text, a fourth text and a fifth text in sequence; and determining a call quality score corresponding to the voice call based on the first text, the second text, the third text, the fourth text and the fifth text.
The tenth signal, the eleventh signal, and the twelfth signal are analog signals. The embodiment of the present disclosure determines the call quality score based on analog signals, so the first digital sound signal, the second digital sound signal, and the ninth signal undergo digital-to-analog conversion. Automatic Speech Recognition (ASR) technology aims to have a computer "take dictation" of continuous speech spoken by different people, that is, to act as a "speech dictation machine"; it is a technology that converts "sound" into "text".
Optionally, the processing module 302 is further configured to compare any two texts in the first text, the second text, the third text, the fourth text, and the fifth text to obtain a second comparison result; and determining a call quality score corresponding to the voice call based on the second comparison result.
Comparing any two of the first text, the second text, the third text, the fourth text, and the fifth text is similar to comparing any two of the first signal parameter, the second signal parameter, the third signal parameter, the fourth signal parameter, and the fifth signal parameter.
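A pairwise text comparison along these lines can be sketched with the Python standard library. The disclosure does not name a text similarity measure; `difflib.SequenceMatcher` is used here only as one plausible stand-in, and the function name and example transcripts are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Dict, Tuple


def text_similarity_scores(texts: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Score every pair of texts from 0 to 100 (100 means identical)."""
    return {
        (a, b): 100.0 * SequenceMatcher(None, texts[a], texts[b]).ratio()
        for a, b in combinations(texts, 2)
    }


# Hypothetical ASR outputs: the fifth text has lost two words.
texts = {
    "first": "please call me back tomorrow",
    "second": "please call me back tomorrow",
    "third": "please call me back tomorrow",
    "fourth": "please call me back tomorrow",
    "fifth": "please call me tomorrow",
}
scores = text_similarity_scores(texts)
print(len(scores))                  # 10
print(scores[("first", "second")])  # 100.0
```

The ten scores can then be weighted and summed in the same way as the ten pieces of comparison information for the signal parameters.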
Optionally, the processing module 302 is further configured to obtain a third analog sound signal, a third digital sound signal, a second data packet, a fourth digital sound signal and a fourth analog sound signal corresponding to the second object within a preset time from the second object, the second device, the communication channel, the first device and the first object, respectively; respectively carrying out segmentation processing on the third analog sound signal, the third digital sound signal, the second data packet, the fourth digital sound signal and the fourth analog sound signal; determining a first score based on the first analog sound signal, the first digital sound signal, the first data packet, the second digital sound signal and the second analog sound signal after the segmentation processing; determining a second score based on the segmented third analog sound signal, the segmented third digital sound signal, the segmented second data packet, the segmented fourth digital sound signal and the segmented fourth analog sound signal; and determining a call quality score corresponding to the voice call based on the first score and the second score.
In order to further improve the accuracy of the call quality evaluation, the embodiment of the present disclosure evaluates both the loss and gain in transmitting one or more utterances of A through the call system within the preset time and the loss and gain in transmitting one or more utterances of B through the call system within the preset time. The first score may be a call quality score determined based on the segmented first analog sound signal, the segmented first digital sound signal, the segmented first data packet, the segmented second digital sound signal, and the segmented second analog sound signal. The second score is determined in a similar way.
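The disclosure does not state how the first score and the second score are combined. One simple possibility, shown here only as an assumption, is a weighted average of the two directions; the function name and the `first_weight` parameter are hypothetical.

```python
def combine_direction_scores(first_score: float, second_score: float,
                             first_weight: float = 0.5) -> float:
    """Weighted average of the A-to-B score and the B-to-A score."""
    if not 0.0 <= first_weight <= 1.0:
        raise ValueError("first_weight must lie between 0 and 1")
    return first_weight * first_score + (1.0 - first_weight) * second_score


# Hypothetical scores for the two directions of the same call.
overall = combine_direction_scores(90.0, 70.0)
print(overall)  # 80.0
```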
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by functions and internal logic of the process, and should not constitute any limitation to the implementation process of the embodiments of the present disclosure.
Fig. 4 is a schematic diagram of an electronic device 4 provided by the embodiment of the present disclosure. As shown in fig. 4, the electronic apparatus 4 of this embodiment includes: a processor 401, a memory 402 and a computer program 403 stored in the memory 402 and executable on the processor 401. The steps in the various method embodiments described above are implemented when the processor 401 executes the computer program 403. Alternatively, the processor 401 implements the functions of the respective modules/units in the above-described respective apparatus embodiments when executing the computer program 403.
Illustratively, the computer program 403 may be partitioned into one or more modules/units, which are stored in the memory 402 and executed by the processor 401 to accomplish the present disclosure. One or more modules/units may be a series of computer program instruction segments capable of performing certain functions, which are used to describe the execution of the computer program 403 in the electronic device 4.
The electronic device 4 may be a desktop computer, a notebook, a palm computer, a cloud server, or other electronic devices. The electronic device 4 may include, but is not limited to, a processor 401 and a memory 402. Those skilled in the art will appreciate that Fig. 4 is merely an example of the electronic device 4 and does not constitute a limitation of the electronic device 4, which may include more or fewer components than shown, combine certain components, or use different components; for example, the electronic device may also include input/output devices, a network access device, a bus, etc.
The Processor 401 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 402 may be an internal storage unit of the electronic device 4, for example, a hard disk or internal memory of the electronic device 4. The memory 402 may also be an external storage device of the electronic device 4, for example, a plug-in hard disk provided on the electronic device 4, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like. Further, the memory 402 may also include both internal storage units of the electronic device 4 and external storage devices. The memory 402 is used for storing computer programs and other programs and data required by the electronic device. The memory 402 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules, so as to perform all or part of the functions described above. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the description of each embodiment has its own emphasis, and reference may be made to the related description of other embodiments for parts that are not described or recited in any embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
In the embodiments provided in the present disclosure, it should be understood that the disclosed apparatus/electronic device and method may be implemented in other ways. For example, the above-described apparatus/electronic device embodiments are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the shown or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. Based on such understanding, the present disclosure may implement all or part of the flow in the methods of the above embodiments by means of a computer program instructing related hardware; the computer program may be stored in a computer readable storage medium, and when the computer program is executed by a processor, the steps of the above method embodiments may be realized. The computer program may comprise computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer readable medium may be appropriately added to or removed from in accordance with the requirements of legislation and patent practice within a jurisdiction; for example, in some jurisdictions, computer readable media may not include electrical carrier signals or telecommunications signals in accordance with legislation and patent practice.
The above examples are only intended to illustrate the technical solutions of the present disclosure, not to limit them; although the present disclosure has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present disclosure, and are intended to be included within the scope of the present disclosure.

Claims (10)

1. A method for evaluating call quality, comprising:
when detecting that a first object and a second object carry out a voice call, respectively acquiring a first analog sound signal, a first digital sound signal, a first data packet, a second digital sound signal and a second analog sound signal corresponding to the first object within a preset time from the first object, a first device, a call channel, a second device and the second object, wherein a call system corresponding to the voice call comprises: the first object, the first device, the call channel, the second device, and the second object;
segmenting the first analog sound signal, the first digital sound signal, the first data packet, the second digital sound signal, and the second analog sound signal, respectively;
and determining a call quality score corresponding to the voice call based on the first analog sound signal, the first digital sound signal, the first data packet, the second digital sound signal and the second analog sound signal which are subjected to the segmentation processing.
2. The method of claim 1, wherein determining the call quality score corresponding to the voice call based on the segmented first analog sound signal, the first digital sound signal, the first data packet, the second digital sound signal, and the second analog sound signal comprises:
respectively carrying out analog-to-digital conversion on the first analog sound signal and the second analog sound signal which are subjected to the segmentation processing to obtain a first signal and a second signal in sequence;
processing the first data packet after the segmentation processing by using an information transmission protocol to obtain a third signal, wherein the information transmission protocol is a transmission protocol used by the call channel;
the first signal, the second signal, the third signal, the first digital sound signal and the second digital sound signal after the segmentation processing are sequentially processed as follows to sequentially obtain a fourth signal, a fifth signal, a sixth signal, a seventh signal and an eighth signal: level adjustment, filtering, temporal alignment and compensation, and auditory transformation;
determining a call quality score corresponding to the voice call based on the fourth signal, the fifth signal, the sixth signal, the seventh signal, and the eighth signal.
3. The method of claim 2, wherein determining a call quality score corresponding to the voice call based on the fourth signal, the fifth signal, the sixth signal, the seventh signal, and the eighth signal comprises:
respectively extracting signal parameters of the fourth signal, the fifth signal, the sixth signal, the seventh signal and the eighth signal to sequentially obtain a first signal parameter, a second signal parameter, a third signal parameter, a fourth signal parameter and a fifth signal parameter;
determining a call quality score corresponding to the voice call based on the first signal parameter, the second signal parameter, the third signal parameter, the fourth signal parameter, and the fifth signal parameter.
4. The method of claim 3, wherein determining the call quality score corresponding to the voice call based on the first signal parameter, the second signal parameter, the third signal parameter, the fourth signal parameter, and the fifth signal parameter comprises:
comparing any two signal parameters of the first signal parameter, the second signal parameter, the third signal parameter, the fourth signal parameter and the fifth signal parameter to obtain a first comparison result;
and determining a call quality score corresponding to the voice call based on the first comparison result.
5. The method of claim 1, wherein after acquiring the first analog sound signal, the first digital sound signal, the first data packet, the second digital sound signal, and the second analog sound signal corresponding to the first object within a preset time from the first object, the first device, the call channel, the second device, and the second object, respectively, when detecting that the first object has a voice call with the second object, the method further comprises:
processing the first data packet by using an information transmission protocol to obtain a ninth signal, wherein the information transmission protocol is a transmission protocol used by the call channel;
respectively carrying out digital-to-analog conversion on the first digital sound signal, the second digital sound signal and the ninth signal to obtain a tenth signal, an eleventh signal and a twelfth signal in sequence;
respectively recognizing the first analog sound signal, the second analog sound signal, the tenth signal, the eleventh signal and the twelfth signal by utilizing an automatic voice recognition technology to sequentially obtain a first text, a second text, a third text, a fourth text and a fifth text;
determining a call quality score corresponding to the voice call based on the first text, the second text, the third text, the fourth text, and the fifth text.
6. The method of claim 5, wherein determining a call quality score corresponding to the voice call based on the first text, the second text, the third text, the fourth text, and the fifth text comprises:
comparing any two texts of the first text, the second text, the third text, the fourth text and the fifth text to obtain a second comparison result;
and determining a call quality score corresponding to the voice call based on the second comparison result.
7. The method of claim 1, wherein after the segmenting the first analog sound signal, the first digital sound signal, the first data packet, the second digital sound signal, and the second analog sound signal, respectively, the method further comprises:
acquiring a third analog sound signal, a third digital sound signal, a second data packet, a fourth digital sound signal and a fourth analog sound signal corresponding to the second object within a preset time from the second object, the second device, the call channel, the first device and the first object respectively;
segmenting the third analog sound signal, the third digital sound signal, the second data packet, the fourth digital sound signal, and the fourth analog sound signal, respectively;
determining a first score based on the segmented first analog sound signal, the segmented first digital sound signal, the segmented first data packet, the segmented second digital sound signal and the segmented second analog sound signal;
determining a second score based on the segmented third analog sound signal, the segmented third digital sound signal, the segmented second data packet, the segmented fourth digital sound signal and the segmented fourth analog sound signal;
and determining a call quality score corresponding to the voice call based on the first score and the second score.
8. An apparatus for evaluating call quality, comprising:
the obtaining module is configured to obtain a first analog sound signal, a first digital sound signal, a first data packet, a second digital sound signal and a second analog sound signal corresponding to a first object within a preset time from the first object, a first device, a call channel, a second device and a second object when detecting that the first object and the second object are in a voice call, wherein a call system corresponding to the voice call comprises: the first object, the first device, the call channel, the second device, and the second object;
a processing module configured to segment the first analog sound signal, the first digital sound signal, the first data packet, the second digital sound signal, and the second analog sound signal, respectively;
the determining module is configured to determine a call quality score corresponding to the voice call based on the segmented first analog sound signal, the segmented first digital sound signal, the segmented first data packet, the segmented second digital sound signal and the segmented second analog sound signal.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202210312627.5A 2022-03-28 2022-03-28 Call quality evaluation method and device Pending CN114710589A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210312627.5A CN114710589A (en) 2022-03-28 2022-03-28 Call quality evaluation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210312627.5A CN114710589A (en) 2022-03-28 2022-03-28 Call quality evaluation method and device

Publications (1)

Publication Number Publication Date
CN114710589A true CN114710589A (en) 2022-07-05

Family

ID=82170009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210312627.5A Pending CN114710589A (en) 2022-03-28 2022-03-28 Call quality evaluation method and device

Country Status (1)

Country Link
CN (1) CN114710589A (en)

Similar Documents

Publication Publication Date Title
CN107393541B (en) Information verification method and device
CN107623614A (en) Method and apparatus for pushed information
CN110347863B (en) Speaking recommendation method and device and storage medium
CN106156009A (en) Voice translation method and device
CN107172255A (en) Voice signal self-adapting regulation method, device, mobile terminal and storage medium
CN107172256A (en) Earphone call self-adapting regulation method, device, mobile terminal and storage medium
CN111312283B (en) Cross-channel voiceprint processing method and device
CN112017630B (en) Language identification method and device, electronic equipment and storage medium
CN107580155B (en) Network telephone quality determination method, network telephone quality determination device, computer equipment and storage medium
CN111683317B (en) Prompting method and device applied to earphone, terminal and storage medium
CN109545193A (en) Method and apparatus for generating model
CN109151148B (en) Call content recording method, device, terminal and computer readable storage medium
CN108962231A (en) A kind of method of speech classification, device, server and storage medium
CN107172313A (en) Improve method, device, mobile terminal and the storage medium of hand-free call quality
CN107682553B (en) Call signal sending method and device, mobile terminal and storage medium
CN111243595A (en) Information processing method and device
CN109286554B (en) Social function unlocking method and device in social application
CN108600559B (en) Control method and device of mute mode, storage medium and electronic equipment
CN112992190B (en) Audio signal processing method and device, electronic equipment and storage medium
CN104348655A (en) Method and device for determining degree of safety and health of system
CN112382266A (en) Voice synthesis method and device, electronic equipment and storage medium
CN114710589A (en) Call quality evaluation method and device
CN107608718B (en) Information processing method and device
CN114861064A (en) Object recommendation method and device based on double-tower model
CN113517000A (en) Echo cancellation test method, terminal and storage device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20221201

Address after: 518054 cable information transmission building 25f2504, no.3369 Binhai Avenue, Haizhu community, Yuehai street, Nanshan District, Shenzhen City, Guangdong Province

Applicant after: Shenzhen Xumi yuntu Space Technology Co.,Ltd.

Address before: No.103, no.1003, Nanxin Road, Nanshan community, Nanshan street, Nanshan District, Shenzhen City, Guangdong Province

Applicant before: Shenzhen Jizhi Digital Technology Co.,Ltd.
