CN108074586B - Method and device for positioning voice problem

Method and device for positioning voice problem

Info

Publication number
CN108074586B
Authority
CN
China
Prior art keywords
frame
analyzed
voice
voice data
data
Prior art date
Legal status
Active
Application number
CN201611013656.2A
Other languages
Chinese (zh)
Other versions
CN108074586A (en)
Inventor
王威
Current Assignee
China Academy of Telecommunications Technology CATT
Original Assignee
China Academy of Telecommunications Technology CATT
Priority date
Filing date
Publication date
Application filed by China Academy of Telecommunications Technology CATT
Priority to CN201611013656.2A
Publication of CN108074586A
Application granted
Publication of CN108074586B

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 - Speech or voice analysis techniques specially adapted for particular use
    • G10L25/51 - Speech or voice analysis techniques specially adapted for comparison or discrimination
    • G10L25/60 - Speech or voice analysis techniques for measuring the quality of voice signals

Abstract

The invention provides a method and a device for locating a voice problem. The method includes: acquiring voice data to be analyzed, and parsing the voice data to be analyzed to obtain parsed voice data; establishing, for each voice frame in the parsed voice data, a correspondence that includes source frame number information; searching the parsed voice data for problem frames in which a voice problem occurs; and outputting result information that includes the source frame number information of the problem frames. Embodiments of the invention can quickly locate the frames in which a voice problem occurs, thereby increasing the speed at which voice problems are located.

Description

Method and device for positioning voice problem
Technical Field
The present invention relates to the field of communications technologies, and in particular to a method and a device for locating a voice problem.
Background
When analyzing a voice problem on a terminal, locating the problem, that is, finding the frames in which the voice problem occurs, usually requires analyzing a large amount of data. Taking narrowband speech as an example, the sampling frequency is 8000 Hz and one frame is 20 milliseconds (ms), so there are 50 frames per second. Voice tests are usually measured in hours, and the time at which a problem is reported often differs from the actual problem point by several seconds, tens of seconds or even minutes, so problem analysis may require searching several minutes of data before and after the reported point, that is, several thousand to tens of thousands of frames. In current practice this analysis is done with locally built software simulation projects: the bitstream has to be parsed and separated into different data types, and the encoding/decoding simulation projects then have to be repeatedly modified and adjusted according to the codec mode and the various properties of the files. These operations consume a great deal of time and seriously affect working efficiency, so locating a voice problem takes long. The current process of locating voice problems therefore suffers from being too slow.
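As a rough illustration of the search space described above, the sketch below (Python, using only the numbers quoted in this background section) computes how many frames fall inside a search window around a reported problem time; the window sizes are illustrative assumptions.

```python
FRAME_MS = 20                           # one narrowband speech frame is 20 ms
FRAMES_PER_SECOND = 1000 // FRAME_MS    # 50 frames per second at 8000 Hz sampling

def frames_to_inspect(window_seconds: int) -> int:
    """Number of frames in a window of +/- window_seconds around a reported problem time."""
    return 2 * window_seconds * FRAMES_PER_SECOND

if __name__ == "__main__":
    for window in (10, 60, 180):        # illustrative window sizes in seconds
        print(f"+/-{window:>3} s -> {frames_to_inspect(window):>6} frames")
    # +/-180 s already means 18000 frames, i.e. the "several thousand to tens of
    # thousands of frames" that currently have to be examined by hand.
```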
Disclosure of Invention
The invention aims to provide a method and a device for locating a voice problem, so as to solve the problem that locating a voice problem is too slow.
To achieve the above object, an embodiment of the present invention provides a method for locating a voice problem, including:
acquiring voice data to be analyzed, and parsing the voice data to be analyzed to obtain parsed voice data;
establishing, for each voice frame in the parsed voice data, a correspondence that includes source frame number information;
searching the parsed voice data for a problem frame in which a voice problem occurs;
and outputting result information that includes the source frame number information of the problem frame.
Optionally, the establishing, for each voice frame in the parsed voice data, a correspondence that includes source frame number information includes:
reading the source frame number information of each voice frame in the parsed voice data and the timestamp information of the channel protocol layer, and establishing a correspondence between the source frame number information of each voice frame and the timestamp information of the channel protocol layer; and/or
reading the source frame number information of each voice frame in the parsed voice data and the frame number information of the channel protocol layer, and establishing a correspondence between the source frame number information of each voice frame and the frame number information of the channel protocol layer.
Optionally, the searching the parsed voice data for a problem frame in which a voice problem occurs includes:
searching the parsed voice data for a problem frame in which a voice problem occurs during cell handover or channel switching, and marking the position of the problem frame with a tag; and/or
searching the parsed voice data for bad frames, and marking the bad frames with bad frame indication information.
Optionally, the voice data to be analyzed is voice data of the terminal under test after channel decoding and before source decoding, and the method further includes:
selecting a target decoding mode;
decoding the parsed voice data in the target decoding mode to obtain offline pulse code modulation (PCM) data;
and comparing the offline PCM data with acquired online PCM data of the terminal under test to determine whether the terminal under test has a decoding problem, wherein the online PCM data is data obtained after source decoding by the terminal under test.
Optionally, the voice data to be analyzed includes:
voice data after source encoding and before channel encoding; or
voice data after channel decoding and before source decoding.
An embodiment of the present invention further provides a device for locating a voice problem, including:
a parsing module, configured to acquire voice data to be analyzed and parse the voice data to be analyzed to obtain parsed voice data;
an establishing module, configured to establish, for each voice frame in the parsed voice data, a correspondence that includes source frame number information;
a searching module, configured to search the parsed voice data for a problem frame in which a voice problem occurs;
and an output module, configured to output result information that includes the source frame number information of the problem frame.
Optionally, the establishing module is configured to read the source frame number information of each voice frame in the parsed voice data and the timestamp information of the channel protocol layer, and establish a correspondence between the source frame number information of each voice frame and the timestamp information of the channel protocol layer; and/or
the establishing module is configured to read the source frame number information of each voice frame in the parsed voice data and the frame number information of the channel protocol layer, and establish a correspondence between the source frame number information of each voice frame and the frame number information of the channel protocol layer.
Optionally, the searching module is configured to search the parsed voice data for a problem frame in which a voice problem occurs during cell handover or channel switching, and mark the position of the problem frame with a tag; and/or
the searching module is configured to search the parsed voice data for bad frames and mark the bad frames with bad frame indication information.
Optionally, the voice data to be analyzed is voice data of the terminal under test after channel decoding and before source decoding, and the device further includes:
a selecting module, configured to select a target decoding mode;
a decoding module, configured to decode the parsed voice data in the target decoding mode to obtain offline pulse code modulation (PCM) data;
and an analysis module, configured to compare the offline PCM data with acquired online PCM data of the terminal under test to determine whether the terminal under test has a decoding problem, wherein the online PCM data is data obtained after source decoding by the terminal under test.
Optionally, the voice data to be analyzed includes:
voice data after source encoding and before channel encoding; or
voice data after channel decoding and before source decoding.
The technical solutions of the present invention have at least the following beneficial effects:
In the embodiments of the present invention, voice data to be analyzed is acquired and parsed to obtain parsed voice data; a correspondence that includes source frame number information is established for each voice frame in the parsed voice data; the parsed voice data is searched for problem frames in which a voice problem occurs; and result information including the source frame number information of the problem frames is output. Because a correspondence including the source frame number information is established for each voice frame, the problem frames can be located quickly through this correspondence, which increases the speed at which voice problems are located.
Drawings
Fig. 1 is a model diagram of a digital baseband communication system to which an embodiment of the present invention is applicable;
Fig. 2 is a flowchart of a method for locating a voice problem according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of reported data according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of a voice frame analysis according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of a correspondence of voice frames according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of another voice frame analysis according to an embodiment of the present invention;
Fig. 7 is a schematic diagram of another correspondence of voice frames according to an embodiment of the present invention;
Fig. 8 is a schematic diagram of an implementation flow of software modules according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a device for locating a voice problem according to an embodiment of the present invention;
Fig. 10 is another schematic structural diagram of a device for locating a voice problem according to an embodiment of the present invention;
Fig. 11 is a schematic structural diagram of a device for locating a voice problem according to another embodiment of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.
Referring to fig. 1, fig. 1 is a schematic model of a digital baseband communication system to which an embodiment of the present invention is applicable. As shown in fig. 1, the process of transmitting voice data from a source to a sink includes analog-to-digital conversion, source coding, channel transmission, channel decoding, source decoding and digital-to-analog conversion, and a noise source is typically introduced during channel transmission. The source can be understood as the sender (transmitting end) of the voice data, and the sink as the receiver (receiving end) of the voice data.
In this embodiment of the present invention, the source may be a user terminal, for example a mobile phone, a computer, a home appliance, a tablet personal computer, a laptop computer, a personal digital assistant (PDA), a mobile internet device (MID), a wearable device or another terminal device; or the source may be a network side device, for example a base station or a server. Similarly, the sink may be a user terminal or a network side device. It should be noted that the specific types of the source and the sink are not limited in the embodiments of the present invention.
Referring to fig. 2, an embodiment of the present invention provides a method for locating a voice problem. As shown in fig. 2, the method includes the following steps:
201. acquiring voice data to be analyzed, and parsing the voice data to be analyzed to obtain parsed voice data;
202. establishing, for each voice frame in the parsed voice data, a correspondence that includes source frame number information;
203. searching the parsed voice data for a problem frame in which a voice problem occurs;
204. outputting result information that includes the source frame number information of the problem frame.
In this embodiment of the present invention, the voice data to be analyzed may be voice data acquired from a terminal under test, where the terminal under test may be either the source or the sink. The voice data to be analyzed may also be voice data of a specific location or a specific time period. For example, the voice data at a specific location in the terminal under test can be obtained from a start frame number and an end frame number, or the voice data of a specific period can be obtained from a start time and an end time.
After the voice data is parsed, the source frame number information of each voice frame can be obtained, and a correspondence that includes the source frame number information is established for each voice frame. The source frame number information may be the source frame number of the voice frame, for example the frame number of the source pulse code modulation (PCM) data. The source frame number is the frame number assigned when the source generates the voice signal; because it is tied to the source, a problem at the source can be located quickly through this number. A correspondence that "includes source frame number information" should be understood as any correspondence from which the source frame number information can be found, for example a correspondence between the source frame number information and time, or between the source frame number information and the frame number information of the channel protocol layer; of course, the source frame number information may also be found in other ways, and this embodiment of the present invention is not limited thereto.
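The following is a minimal sketch, not taken from the patent, of the kind of per-frame correspondence record described above; the field names (source_frame_no, channel_frame_no, channel_timestamp, codec, bfi) are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class FrameRecord:
    """One entry of the per-frame correspondence (field names are illustrative)."""
    source_frame_no: int                        # frame number of the source PCM data
    channel_frame_no: Optional[int] = None      # frame number at the channel protocol layer
    channel_timestamp: Optional[float] = None   # channel protocol layer timestamp, in seconds
    codec: str = ""                             # e.g. "AMR-NB", "EFR", "FR"
    bfi: int = 0                                # bad frame indication, 1 marks a bad frame

def index_by_source_frame(records: List[FrameRecord]) -> Dict[int, FrameRecord]:
    """Index the parsed frames by source frame number for quick problem lookup."""
    return {rec.source_frame_no: rec for rec in records}
```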
In this embodiment of the present invention, searching the parsed voice data for a problem frame in which a voice problem occurs may be done by analyzing each voice frame in the parsed voice data, for example by analyzing the power spectrum of each voice frame or by analyzing the sound quality of each voice frame; other methods of finding problem frames may also be used, and this embodiment of the present invention is not limited thereto. The voice problem may be a sound break, a dropped word, unclear speech or the like, and this embodiment of the present invention is not limited thereto either. For example, as shown in fig. 3, sample data around the reported problem time is collected; the data may include the source frame number, the sample values and the time at which the problem was reported. Analyzing the data shown in fig. 3 gives the view shown in fig. 4, in which the position of the problem frames can be read directly from the sample values. In this way the time needed to locate a problem among tens of thousands of frames of voice data is greatly shortened, and working efficiency is improved.
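The patent leaves the detection criterion open (power spectrum, sound quality, and so on). The sketch below uses a simple per-frame energy check over PCM samples as one possible stand-in for flagging suspicious near-silent frames; the frame length assumes the 20 ms / 8 kHz narrowband case, and the threshold is an arbitrary assumption.

```python
from typing import List
import numpy as np

def find_silent_frames(pcm: np.ndarray,
                       frame_len: int = 160,             # 20 ms at 8000 Hz
                       energy_threshold: float = 1e-4) -> List[int]:
    """Return indices of frames whose mean energy falls below a threshold.

    A run of near-silent frames in the middle of active speech is one cheap
    indicator of a muting / sound-break problem; a real tool might instead look
    at the power spectrum or an objective speech-quality score.
    """
    samples = pcm.astype(np.float64)
    if pcm.dtype == np.int16:
        samples /= 32768.0                                # normalise 16-bit PCM to [-1, 1)
    n_frames = len(samples) // frame_len
    frames = samples[:n_frames * frame_len].reshape(n_frames, frame_len)
    energy = (frames ** 2).mean(axis=1)
    return [i for i, e in enumerate(energy) if e < energy_threshold]
```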
After the problem frame is found, result information including the source frame number information of the problem frame can be output. The result information of the problem frame alone may be output, or the result information of all voice frames may be output; this is not limited. The output mode includes, but is not limited to, display, printing and the like.
It should be noted that, in this embodiment of the present invention, the execution order of step 202 and step 203 is not limited; for example, they may be performed simultaneously or sequentially.
Through the above steps, the problem frames in which a voice problem occurs can be located quickly using the correspondence, which increases the speed of locating voice problems.
Optionally, the establishing, for each voice frame in the parsed voice data, a correspondence that includes source frame number information includes:
reading the source frame number information of each voice frame in the parsed voice data and the timestamp information of the channel protocol layer, and establishing a correspondence between the source frame number information of each voice frame and the timestamp information of the channel protocol layer; and/or
reading the source frame number information of each voice frame in the parsed voice data and the frame number information of the channel protocol layer, and establishing a correspondence between the source frame number information of each voice frame and the frame number information of the channel protocol layer.
The timestamp information of the channel protocol layer may be the timestamp recorded when the voice signal is channel encoded or decoded. The correspondence here is one-to-one: each voice frame corresponds to one pair of source frame number information and channel protocol layer timestamp. As a result, the reported problem time can be mapped through the timestamp information to the corresponding source frame number, so that the problem point can be located quickly.
The frame number information of the channel protocol layer may be the frame number recorded when the voice signal is channel encoded or decoded; likewise, this correspondence is one-to-one. Through the correspondence between the source frame number information of a voice frame and the frame number information of the channel protocol layer, a problem point at the channel protocol layer can be aligned exactly with the corresponding problem point at the source. A problem found at the source can thus be analyzed at the source, and if the source turns out to be fine, the exact corresponding position is available for further analysis on the channel side. In this way the source and the channel are linked, and the accuracy and speed of determining where a problem lies are improved.
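A minimal sketch of how a reported problem time could be mapped to a source frame number through such a timestamp correspondence; the (timestamp, source frame number) pair list is an assumed representation, not the patent's data structure.

```python
from bisect import bisect_left
from typing import List, Tuple

def source_frame_at(report_time: float,
                    timeline: List[Tuple[float, int]]) -> int:
    """Map a reported problem timestamp to the nearest source frame number.

    `timeline` holds (channel_timestamp, source_frame_no) pairs sorted by
    timestamp, i.e. the correspondence built from the parsed voice data.
    """
    if not timeline:
        raise ValueError("empty correspondence")
    times = [t for t, _ in timeline]
    i = bisect_left(times, report_time)
    if i == 0:
        return timeline[0][1]
    if i == len(timeline):
        return timeline[-1][1]
    before, after = timeline[i - 1], timeline[i]
    # pick whichever timestamp is closer to the reported time
    return before[1] if report_time - before[0] <= after[0] - report_time else after[1]
```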
It should be noted that, in this embodiment of the present invention, both of the above correspondences may be established at the same time, which makes problem location even more convenient. For example, as shown in fig. 5, a correspondence may be established for each voice frame that includes the source frame number, the frame number of the channel protocol layer and the timestamp information of the channel protocol layer; the correspondence may further include the sample values, the encoding/decoding mode and a bad frame indication (BFI), where BFI = 1 indicates a bad frame. Of course, other information may also be included, and this embodiment of the present invention is not limited thereto.
Optionally, in this embodiment, the searching the parsed voice data for a problem frame in which a voice problem occurs includes:
searching the parsed voice data for a problem frame in which a voice problem occurs during cell handover or channel switching, and marking the position of the problem frame with a tag; and/or
searching the parsed voice data for bad frames, and marking the bad frames with bad frame indication information.
Cell handover or channel switching may be detected by analyzing changes in the channel protocol layer frame number of the voice frames. For example, as shown in fig. 5, the channel protocol layer frame number of one voice frame is 1376854 and that of the next voice frame is 217701; from these two frame numbers it can be determined that the two frames are voice frames during a cell handover or channel switch. If the two frames exhibit a voice problem they are taken as the problem frames described above, or they may be treated as problem frames directly. When a terminal moves and is handed over between cells or channels, sound breaks, dropped words, unclear speech and similar problems may occur, and certain handover-related information changes. A tag can therefore be placed automatically at the position indicated by the handover information, with the channel timestamp and the source frame number clearly associated, so that the location can be analyzed quickly and automatically, most problems can be filtered out quickly, and the efficiency of problem location and analysis is improved.
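A minimal sketch of detecting handover candidates from discontinuities in the channel protocol layer frame numbers, using the example values quoted above; the jump threshold is an assumption.

```python
from typing import List

def handover_candidates(channel_frame_nos: List[int], max_step: int = 1000) -> List[int]:
    """Return frame indices where the channel protocol layer frame number jumps.

    A large jump between consecutive frames (such as 1376854 followed by 217701
    in the example above) suggests a cell handover or channel switch at that
    position; those frames are then tagged for closer inspection.
    """
    return [i for i in range(1, len(channel_frame_nos))
            if abs(channel_frame_nos[i] - channel_frame_nos[i - 1]) > max_step]

# With the values quoted in the description:
# handover_candidates([1376853, 1376854, 217701, 217702]) -> [2]
```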
Searching the parsed voice data for bad frames may be done by analyzing each voice frame for bad-frame conditions; most voice quality problems such as sound breaks, dropped words and unclear speech are caused by bad frames, and such frames can be marked as bad frames. Specifically, as shown in fig. 5, a bad frame indication is added to the correspondence of each frame, so that the bad frame indications of all encoding/decoding modes are associated with the channel timestamp and the source frame number; the voice quality and the channel condition can then be analyzed by checking the quality of every frame of data.
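One possible way to use the bad frame indications once they are aligned with the source frame numbers is to measure how dense the bad frames are around a suspected problem frame; the window size below is an illustrative assumption.

```python
from typing import List

def bad_frame_ratio(bfi_flags: List[int], center: int, half_window: int = 50) -> float:
    """Fraction of frames marked bad (BFI == 1) around a suspected problem frame.

    A burst of bad frames around the reported position usually explains sound
    breaks or dropped words; a clean window points back at the source side.
    """
    lo = max(0, center - half_window)
    hi = min(len(bfi_flags), center + half_window + 1)
    window = bfi_flags[lo:hi]
    return sum(window) / len(window) if window else 0.0
```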
In this embodiment of the present invention, the cause of the problem can be read further from the information shown in fig. 5: the sudden change of the channel frame number in this period indicates that the terminal underwent a cell handover at that moment, and the large number of bad frames explains the severe sound break at the problem point. If a deeper understanding of the cause is needed, analysis can continue at the corresponding channel protocol layer, using the channel timestamp that corresponds to the source frame number of the problem; the position of the problem point is then exact, and the problem can be analyzed and located quickly.
In addition, in this embodiment of the present invention, cell handover or channel switching may also be determined by analyzing a change in the encoding mode of the voice frames. Taking ringback-tone noise as an example, as shown in fig. 6, it can be determined from the per-frame correspondence information shown in fig. 7 that the voice frames switch from the adaptive multi-rate (AMR) encoding mode to the enhanced full rate (EFR) encoding mode. It can therefore be concluded that a channel switch occurred and that the noise problem was caused by bad frames. Many similar voice problems can be analyzed and handled quickly in this way, and it can also be determined quickly whether a problem originates at the source layer or at the channel/network layer, which clarifies the origin of the problem and speeds up its location and resolution.
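A minimal sketch of spotting per-frame codec changes such as the AMR-to-EFR switch described above; the codec labels come from the description, while the function itself is an assumption.

```python
from typing import List, Tuple

def codec_changes(codecs: List[str]) -> List[Tuple[int, str, str]]:
    """Return (index, old_codec, new_codec) for every per-frame codec change.

    For the ringback-tone example, a change such as ("AMR-NB", "EFR") together
    with nearby bad frames points at a channel switch rather than a source-side
    coding problem.
    """
    return [(i, codecs[i - 1], codecs[i])
            for i in range(1, len(codecs)) if codecs[i] != codecs[i - 1]]
```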
Optionally, the voice data to be analyzed is voice data of the terminal under test after channel decoding and before source decoding, and the method further includes:
selecting a target decoding mode;
decoding the parsed voice data in the target decoding mode to obtain offline pulse code modulation (PCM) data;
and comparing the offline PCM data with acquired online PCM data of the terminal under test to determine whether the terminal under test has a decoding problem, wherein the online PCM data is data obtained after source decoding by the terminal under test.
In this embodiment, the decoding mode used to decode the voice frames can be selected flexibly; the target decoding mode may be full rate (FR), half rate (HR), enhanced full rate (EFR), adaptive multi-rate narrowband (AMR-NB), adaptive multi-rate wideband (AMR-WB), or another decoding mode. For example, when the encoding mode of the voice frames is FR, the FR decoding mode can be selected to decode them.
After the offline PCM data is obtained, it can be compared with the online PCM data of the terminal under test to determine whether the terminal under test has a decoding problem. For example, when the offline PCM data does not match the online PCM data, it can be determined that the terminal under test has a decoding problem; otherwise, it is determined that the terminal under test has no decoding problem. When it is determined that the terminal under test has no decoding problem, the correspondence provides the exact corresponding position for continuing the analysis on the channel side, so that the source and the channel are linked and the accuracy and speed of determining the location of the problem are improved.
It should be noted that the data obtained after source decoding by the terminal under test may be the data decoded by the terminal online, that is, the data after source decoding during channel transmission. During the comparison, the offline PCM data of a frame may be compared with the online PCM data of the same frame to determine whether the terminal under test has a decoding problem. In this scenario, the terminal under test may be the sink.
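A minimal sketch of the offline/online PCM comparison, under the assumptions that both streams are aligned frame by frame and that 20 ms narrowband frames (160 samples) are used; the tolerance parameter is an assumption.

```python
from typing import Optional
import numpy as np

def first_mismatched_frame(offline_pcm: np.ndarray,
                           online_pcm: np.ndarray,
                           frame_len: int = 160,      # 20 ms at 8000 Hz
                           tolerance: int = 0) -> Optional[int]:
    """Compare offline-decoded PCM with the terminal's online PCM frame by frame.

    Returns the index of the first frame whose samples differ by more than
    `tolerance`, or None if the streams match; a match suggests the terminal's
    own source decoding is fine and the analysis should move to the channel side.
    """
    n_frames = min(len(offline_pcm), len(online_pcm)) // frame_len
    for i in range(n_frames):
        a = offline_pcm[i * frame_len:(i + 1) * frame_len].astype(np.int64)
        b = online_pcm[i * frame_len:(i + 1) * frame_len].astype(np.int64)
        if np.max(np.abs(a - b)) > tolerance:
            return i
    return None
```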
Optionally, in this embodiment of the present invention, the voice data to be analyzed may include: voice data after source encoding and before channel encoding; or voice data after channel decoding and before source decoding.
With voice data after source encoding and before channel encoding, problem points in that data can be located quickly, and it can be determined whether the terminal under test has an encoding problem, that is, whether the source has an encoding problem or whether the problem lies on the channel side. With voice data after channel decoding and before source decoding, problem points in that data can be located quickly, and it can be determined whether the terminal under test has a decoding problem, that is, whether the sink has a decoding problem or whether the problem lies on the channel side.
Optionally, in this embodiment of the present invention, the method may be implemented by dedicated software. The software may include an input/output file module, a bitstream parsing and problem analysis module, a decoding module and a print/display module. The implementation flow of the four modules may be as shown in fig. 8, and the modules can be described as follows:
The input/output file module is mainly responsible for importing or saving the corresponding files and file types through control buttons and for displaying the corresponding file path names. For the output files of the bitstream parsing function, the output file names do not need to be filled in; the parsed and separated files are automatically saved, numbered in sequence, in a folder under the same path as the corresponding input file.
The input file types handled by bitstream parsing include data before source decoding and data after source encoding, where the data before source decoding can be understood as the voice data after channel decoding and before source decoding described above, and the data after source encoding can be understood as the voice data after source encoding and before channel encoding. The bitstream parsing steps may include:
importing an input file, which can be done by clicking a button and selecting the corresponding input file; because the output files are of different types, no operation on them is needed, and the generated output files are automatically saved in a folder under the same path as the input file;
selecting the data type, where the selected data type may be data before source decoding or data after source encoding;
after the data type is selected, the data can be parsed and the problem analyzed, with the parsed data saved in the same folder as the input file, which completes the bitstream parsing. For the parsing and the problem analysis, reference may be made to the description of the foregoing embodiments, which is not repeated here.
The decoding module can implement the following steps:
selecting the data type to be decoded and importing the input file name;
saving the corresponding output file name for the decoded data;
and completing the decoding automatically by clicking the button corresponding to the codec type of the input file.
The basic offline decoding types of a communication terminal include AMR-WB, AMR-NB, EFR, HR and FR; other required codec types can be added to the software's interface tool to meet day-to-day work requirements (see the dispatch sketch below).
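A minimal sketch of how such a decoding module could dispatch a parsed bitstream to the decoder matching its codec type. The codec names come from the list above; the decoder functions are placeholders, since the patent does not name a specific codec library.

```python
from typing import Callable, Dict

def _placeholder_decoder(bitstream: bytes) -> bytes:
    """Stand-in for a real reference decoder (e.g. 3GPP AMR-NB/AMR-WB, GSM EFR/HR/FR)."""
    raise NotImplementedError("plug in the corresponding reference decoder here")

DECODERS: Dict[str, Callable[[bytes], bytes]] = {
    "AMR-WB": _placeholder_decoder,
    "AMR-NB": _placeholder_decoder,
    "EFR": _placeholder_decoder,
    "HR": _placeholder_decoder,
    "FR": _placeholder_decoder,
}

def decode_offline(codec: str, bitstream: bytes) -> bytes:
    """Dispatch the parsed bitstream to the decoder matching its codec type."""
    try:
        return DECODERS[codec](bitstream)
    except KeyError:
        raise ValueError(f"unsupported codec type: {codec}") from None
```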
With the dedicated software, the invention performs offline decoding: the input file is decoded into PCM data quickly and automatically, with simple operation. The offline data is then compared with the online PCM data to check whether the encoding/decoding on the terminal side is abnormal. If no abnormality is found on the source side, the analysis passes on to the channel side, and the corresponding problem point on the channel side is easy to find through the parsed correspondence between the channel timestamp and the source frame number.
The print/display module provides the printed output information and error prompts for each step of the tool's operation, so that the user can check and understand them in time.
In this embodiment of the present invention, the dedicated software makes the tool usable not only by voice specialists but also, on a daily basis, by channel protocol layer engineers, test engineers and other personnel, with simple and convenient operation. A single installation package allows installation and use under common terminal operating systems, and it also prevents company code from leaking out, which satisfies confidentiality requirements.
In the embodiment of the present invention, the above-described embodiments may be implemented in combination with each other or implemented separately, and the embodiment of the present invention is not limited thereto.
In this embodiment of the present invention, the method may be applied to any terminal on which the software is installed, for example an intelligent device such as a computer, a laptop computer or a tablet computer.
In the embodiments of the present invention, voice data to be analyzed is acquired and parsed to obtain parsed voice data; a correspondence that includes source frame number information is established for each voice frame in the parsed voice data; the parsed voice data is searched for problem frames in which a voice problem occurs; and result information including the source frame number information of the problem frames is output. Because a correspondence including the source frame number information is established for each voice frame, the problem frames can be located quickly through this correspondence, which increases the speed at which voice problems are located.
Referring to fig. 9, an embodiment of the present invention provides a device for locating a voice problem. As shown in fig. 9, the device 900 for locating a voice problem includes the following modules:
a parsing module 901, configured to acquire voice data to be analyzed and parse the voice data to be analyzed to obtain parsed voice data;
an establishing module 902, configured to establish, for each voice frame in the parsed voice data, a correspondence that includes source frame number information;
a searching module 903, configured to search the parsed voice data for a problem frame in which a voice problem occurs;
and an output module 904, configured to output result information that includes the source frame number information of the problem frame.
Optionally, the establishing module 902 is configured to read the source frame number information of each voice frame in the parsed voice data and the timestamp information of the channel protocol layer, and establish a correspondence between the source frame number information of each voice frame and the timestamp information of the channel protocol layer; and/or
the establishing module 902 is configured to read the source frame number information of each voice frame in the parsed voice data and the frame number information of the channel protocol layer, and establish a correspondence between the source frame number information of each voice frame and the frame number information of the channel protocol layer.
Optionally, the searching module 903 is configured to search the parsed voice data for a problem frame in which a voice problem occurs during cell handover or channel switching, and mark the position of the problem frame with a tag; and/or
the searching module 903 is configured to search the parsed voice data for bad frames and mark the bad frames with bad frame indication information.
Optionally, the voice data to be analyzed is voice data of the terminal under test after channel decoding and before source decoding, and as shown in fig. 10, the device further includes:
a selecting module 905, configured to select a target decoding mode;
a decoding module 906, configured to decode the parsed voice data in the target decoding mode to obtain offline pulse code modulation (PCM) data;
and an analysis module 907, configured to compare the offline PCM data with acquired online PCM data of the terminal under test to determine whether the terminal under test has a decoding problem, wherein the online PCM data is data obtained after source decoding by the terminal under test.
Optionally, the output module 904 is further configured to output the analysis result of the analysis module 907.
Optionally, the voice data to be analyzed includes:
voice data after source encoding and before channel encoding; or
voice data after channel decoding and before source decoding.
In this embodiment of the present invention, the device may be applied to, or may itself be, any terminal on which the software is installed, for example an intelligent device such as a computer, a laptop computer or a tablet computer.
It should be noted that the device 900 for locating a voice problem in this embodiment can implement any implementation of the method embodiment shown in fig. 2 of the embodiments of the present invention and achieve the same beneficial effects, which are not described here again.
Referring to fig. 11, a structure of a device for locating a voice problem is shown, including: a processor 1100, a transceiver 1110, a memory 1120, a user interface 1130, and a bus interface, wherein:
the processor 1100 reads the program in the memory 1120 and performs the following processes:
acquiring voice data to be analyzed, and parsing the voice data to be analyzed to obtain parsed voice data;
establishing, for each voice frame in the parsed voice data, a correspondence that includes source frame number information;
searching the parsed voice data for a problem frame in which a voice problem occurs;
and outputting result information that includes the source frame number information of the problem frame.
Among other things, the transceiver 1110 is used for receiving and transmitting data under the control of the processor 1100.
In FIG. 11, the bus architecture may include any number of interconnected buses and bridges, with one or more processors, represented by processor 1100, and various circuits of memory, represented by memory 1120, being linked together. The bus architecture may also link together various other circuits such as peripherals, voltage regulators, power management circuits, and the like, which are well known in the art, and therefore, will not be described any further herein. The bus interface provides an interface. The transceiver 1110 may be a number of elements including a transmitter and a receiver that provide a means for communicating with various other apparatus over a transmission medium. For different user devices, the user interface 1130 may also be an interface capable of interfacing with a desired device, including but not limited to a keypad, display, speaker, microphone, joystick, etc.
The processor 1100 is responsible for managing the bus architecture and general processing, and the memory 1120 may store data used by the processor 1100 in performing operations.
Optionally, the establishing, for each voice frame in the parsed voice data, a correspondence that includes source frame number information includes:
reading the source frame number information of each voice frame in the parsed voice data and the timestamp information of the channel protocol layer, and establishing a correspondence between the source frame number information of each voice frame and the timestamp information of the channel protocol layer; and/or
reading the source frame number information of each voice frame in the parsed voice data and the frame number information of the channel protocol layer, and establishing a correspondence between the source frame number information of each voice frame and the frame number information of the channel protocol layer.
Optionally, the searching the parsed voice data for a problem frame in which a voice problem occurs includes:
searching the parsed voice data for a problem frame in which a voice problem occurs during cell handover or channel switching, and marking the position of the problem frame with a tag; and/or
searching the parsed voice data for bad frames, and marking the bad frames with bad frame indication information.
Optionally, the voice data to be analyzed is voice data of the terminal under test after channel decoding and before source decoding, and the processor 1100 is further configured to:
select a target decoding mode;
decode the parsed voice data in the target decoding mode to obtain offline pulse code modulation (PCM) data;
and compare the offline PCM data with acquired online PCM data of the terminal under test to determine whether the terminal under test has a decoding problem, wherein the online PCM data is data obtained after source decoding by the terminal under test.
Optionally, the voice data to be analyzed includes:
voice data after source encoding and before channel encoding; or
voice data after channel decoding and before source decoding.
It should be noted that the device for locating a voice problem in this embodiment can implement any implementation of the method embodiment shown in fig. 2 of the embodiments of the present invention and achieve the same beneficial effects, which are not described here again.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be physically included alone, or two or more units may be integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit, if implemented in the form of a software functional unit, may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to perform some of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (8)

1. A method for locating a voice problem, comprising:
acquiring voice data to be analyzed, and parsing the voice data to be analyzed to obtain parsed voice data;
establishing, for each voice frame in the parsed voice data, a correspondence that comprises source frame number information;
searching the parsed voice data for a problem frame in which a voice problem occurs;
and outputting result information that comprises the source frame number information of the problem frame;
wherein the voice data to be analyzed comprises:
voice data after source encoding and before channel encoding; or
voice data after channel decoding and before source decoding.
2. The method of claim 1, wherein the establishing, for each voice frame in the parsed voice data, a correspondence that comprises source frame number information comprises:
reading the source frame number information of each voice frame in the parsed voice data and the timestamp information of the channel protocol layer, and establishing a correspondence between the source frame number information of each voice frame and the timestamp information of the channel protocol layer; and/or
reading the source frame number information of each voice frame in the parsed voice data and the frame number information of the channel protocol layer, and establishing a correspondence between the source frame number information of each voice frame and the frame number information of the channel protocol layer.
3. The method of claim 2, wherein the searching the parsed voice data for a problem frame in which a voice problem occurs comprises:
searching the parsed voice data for a problem frame in which a voice problem occurs during cell handover or channel switching, and marking the position of the problem frame with a tag; and/or
searching the parsed voice data for bad frames, and marking the bad frames with bad frame indication information.
4. The method according to any one of claims 1-3, wherein the voice data to be analyzed is voice data of the terminal under test after channel decoding and before source decoding, and the method further comprises:
selecting a target decoding mode;
decoding the parsed voice data in the target decoding mode to obtain offline pulse code modulation (PCM) data;
and comparing the offline PCM data with acquired online PCM data of the terminal under test to determine whether the terminal under test has a decoding problem, wherein the online PCM data is data obtained after source decoding by the terminal under test.
5. A device for locating a voice problem, comprising:
a parsing module, configured to acquire voice data to be analyzed and parse the voice data to be analyzed to obtain parsed voice data;
an establishing module, configured to establish, for each voice frame in the parsed voice data, a correspondence that comprises source frame number information;
a searching module, configured to search the parsed voice data for a problem frame in which a voice problem occurs;
and an output module, configured to output result information that comprises the source frame number information of the problem frame;
wherein the voice data to be analyzed comprises:
voice data after source encoding and before channel encoding; or
voice data after channel decoding and before source decoding.
6. The device of claim 5, wherein the establishing module is configured to read the source frame number information of each voice frame in the parsed voice data and the timestamp information of the channel protocol layer, and establish a correspondence between the source frame number information of each voice frame and the timestamp information of the channel protocol layer; and/or
the establishing module is configured to read the source frame number information of each voice frame in the parsed voice data and the frame number information of the channel protocol layer, and establish a correspondence between the source frame number information of each voice frame and the frame number information of the channel protocol layer.
7. The device of claim 6, wherein the searching module is configured to search the parsed voice data for a problem frame in which a voice problem occurs during cell handover or channel switching, and mark the position of the problem frame with a tag; and/or
the searching module is configured to search the parsed voice data for bad frames and mark the bad frames with bad frame indication information.
8. The device according to any one of claims 5-7, wherein the voice data to be analyzed is voice data of the terminal under test after channel decoding and before source decoding, and the device further comprises:
a selecting module, configured to select a target decoding mode;
a decoding module, configured to decode the parsed voice data in the target decoding mode to obtain offline pulse code modulation (PCM) data;
and an analysis module, configured to compare the offline PCM data with acquired online PCM data of the terminal under test to determine whether the terminal under test has a decoding problem, wherein the online PCM data is data obtained after source decoding by the terminal under test.
CN201611013656.2A 2016-11-15 2016-11-15 Method and device for positioning voice problem Active CN108074586B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611013656.2A CN108074586B (en) 2016-11-15 2016-11-15 Method and device for positioning voice problem

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611013656.2A CN108074586B (en) 2016-11-15 2016-11-15 Method and device for positioning voice problem

Publications (2)

Publication Number Publication Date
CN108074586A (en) 2018-05-25
CN108074586B (en) 2021-02-12

Family

ID=62160222

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611013656.2A Active CN108074586B (en) 2016-11-15 2016-11-15 Method and device for positioning voice problem

Country Status (1)

Country Link
CN (1) CN108074586B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5928379A (en) * 1996-06-28 1999-07-27 Nec Corporation Voice-coded data error processing apparatus and method
CN102034476A (en) * 2009-09-30 2011-04-27 华为技术有限公司 Methods and devices for detecting and repairing error voice frame
CN102143519A (en) * 2010-02-01 2011-08-03 中兴通讯股份有限公司 Device and method for positioning voice transmission faults
CN102148665A * 2011-05-25 2011-08-10 电子科技大学 Decoding method for LT (Luby transform) codes
CN102595498A (en) * 2007-09-17 2012-07-18 华为技术有限公司 Methods and systems for processing uploaded and downloaded data in wireless communication network
CN104538041A (en) * 2014-12-11 2015-04-22 深圳市智美达科技有限公司 Method and system for detecting abnormal sounds
CN104685564A (en) * 2012-11-13 2015-06-03 华为技术有限公司 Voice problem detection method and network element device applied to voice communication network system
CN105374367A (en) * 2014-07-29 2016-03-02 华为技术有限公司 Abnormal frame detecting method and abnormal frame detecting device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1796083B1 (en) * 2000-04-24 2009-01-07 Qualcomm Incorporated Method and apparatus for predictively quantizing voiced speech
US20040204935A1 (en) * 2001-02-21 2004-10-14 Krishnasamy Anandakumar Adaptive voice playout in VOP
ES2635027T3 (en) * 2013-06-21 2017-10-02 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for improved signal fading for audio coding systems changed during error concealment
CN105338148B (en) * 2014-07-18 2018-11-06 华为技术有限公司 A kind of method and apparatus that audio signal is detected according to frequency domain energy
KR101975057B1 (en) * 2015-03-20 2019-05-03 한국전자통신연구원 Apparatus and method for feature compensation for speech recognition in noise enviroment

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5928379A (en) * 1996-06-28 1999-07-27 Nec Corporation Voice-coded data error processing apparatus and method
CN102595498A (en) * 2007-09-17 2012-07-18 华为技术有限公司 Methods and systems for processing uploaded and downloaded data in wireless communication network
CN102034476A (en) * 2009-09-30 2011-04-27 华为技术有限公司 Methods and devices for detecting and repairing error voice frame
CN102143519A (en) * 2010-02-01 2011-08-03 中兴通讯股份有限公司 Device and method for positioning voice transmission faults
CN102148665A * 2011-05-25 2011-08-10 电子科技大学 Decoding method for LT (Luby transform) codes
CN104685564A (en) * 2012-11-13 2015-06-03 华为技术有限公司 Voice problem detection method and network element device applied to voice communication network system
CN105374367A (en) * 2014-07-29 2016-03-02 华为技术有限公司 Abnormal frame detecting method and abnormal frame detecting device
CN104538041A (en) * 2014-12-11 2015-04-22 深圳市智美达科技有限公司 Method and system for detecting abnormal sounds

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Linfang Wang, "Multi-Level Error Detection and Concealment Algorithm to Improve Speech Quality in GSM Full Rate Speech Codecs," Tsinghua Science and Technology, vol. 16, no. 3, pp. 247-255, 30 June 2011 *
俞兆强, "Analysis and Handling of Voice Faults in the Access Network" (in Chinese), China Master's Theses Full-text Database, Information Science and Technology, pp. 136-249, 31 May 2012 *

Also Published As

Publication number Publication date
CN108074586A (en) 2018-05-25

Similar Documents

Publication Publication Date Title
CN111459794B (en) Communication network testing method, device, computer equipment and storage medium
CN101236523B (en) Input method test method and device
CN104850499B (en) The automated testing method and device of base-band software
CN102067208A (en) Methods and systems for measuring user performance with speech-to-text conversion for dictation systems
CN110674009B (en) Application server performance monitoring method and device, storage medium and electronic equipment
CN103578463A (en) Automatic testing method and automatic testing device
CN100473084C (en) Testing system and method for mobile telephones
CN107202924A (en) A kind of terminal test method, equipment and system
CN110362545A (en) Log monitoring method, device, terminal and computer readable storage medium
CN110058147A (en) Chip test system and method based on fpga
CN110769002A (en) LabVIEW-based message analysis method, system, electronic device and medium
CN107678948A (en) Method for generating test case, terminal and storage medium
CN107580155A (en) Networking telephone quality determination method, device, computer equipment and storage medium
CN113094029A (en) Automatic CAPL code generation method and system and automobile
CN110990222B (en) Cross-platform graphical operation monitoring method and device based on mainframe
CN110289015A (en) A kind of audio-frequency processing method, device, server, storage medium and system
CN103913672B (en) A kind of satellite low frequency interface automatization test system
CN108074586B (en) Method and device for positioning voice problem
CN107182078A (en) Method, terminal and the storage device of analysing terminal circuit domain dropping test report
CN101123483A (en) Detection method and device for service link
CN112669814A (en) Data processing method, device, equipment and medium
CN105389340B (en) A kind of information test method and device
CN101320067B (en) Automatic testing equipment and method of multi-channel selector
CN114896309A (en) Method and system for converting and displaying monitoring data of hydropower station
CN113645052B (en) Firmware debugging method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant