CN110827838A - Opus-based voice coding method and apparatus - Google Patents
- Publication number
- CN110827838A
- Authority
- CN
- China
- Prior art keywords
- preset
- speech
- voice
- module
- coded
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
Abstract
The invention provides an opus-based speech coding method comprising: step S1: acquiring a current speech frame to be coded from a preset speech segment; step S2: encoding the acquired speech frame using the opus coding technology to obtain an encoded speech frame; step S3: representing and determining the byte size of the encoded speech frame by means of a preset byte header. By using the opus coding technology and representing the size of each encoded speech frame with a preset byte header, the method reduces the proportion of header information and thereby the byte size of the encoded speech.
Description
Technical Field
The present invention relates to the field of speech coding technology, and in particular, to a speech coding method and apparatus based on opus.
Background
In currently used speech coding technology, each encoded speech frame generally carries 8 bytes of header information, and an encoded speech segment consists of a sequence of units of header information + encoded speech data. With an 8-byte header, the header information occupies a large proportion of the encoded bytes. For example, at a sampling rate of 16k, one 20ms frame of speech is about 80 bytes after 8-times compression coding, and the header information occupies about 10% of the total size; at a sampling rate of 8k, one 20ms frame of speech is about 40 bytes after 8-times compression coding, and the header information occupies about 16% of the total size. Reducing the proportion of header information is therefore important for reducing the byte size of the encoded speech.
Disclosure of Invention
The invention provides an opus-based speech coding method that uses the opus coding technology and represents the size of each encoded speech frame with a preset byte header, thereby reducing the proportion of header information and, in turn, the byte size of the encoded speech.
The embodiment of the invention provides an opus-based speech coding method, which comprises the following steps:
step S1: acquiring a current speech frame to be coded in a preset speech section;
step S2: based on the opus coding technology, coding the obtained current speech frame to be coded to obtain a coded speech frame;
step S3: the byte size of the obtained encoded speech frame is represented and determined based on a preset byte header.
In a possible implementation manner, before performing step S1, the method further includes:
step S11: acquiring a preset voice section input by a user;
step S12: and according to the preset time length, carrying out segmentation processing on the acquired preset voice section input by the user, and acquiring a plurality of voice frames to be coded.
In a possible implementation manner, after performing step S3, the method further includes:
step S31: acquiring a next speech frame to be coded of the current speech frame to be coded;
step S32: controlling the obtained next speech frame to be encoded to execute steps S2-S3;
step S33: and based on the preset arrangement sequence of the speech frames to be coded, continuing to execute the steps S31-S32 until all the speech frames to be coded in the preset speech segment are completely executed.
In a possible implementation manner, after performing step S3, the method further includes:
step S41: determining the proportional size of the preset byte header in the obtained byte size of the encoded voice frame;
step S42: judging whether the determined proportion size is smaller than a preset proportion size;
if yes, executing a first alarm operation;
otherwise, determining the byte size of the preset byte header, judging whether the determined byte size is smaller than the preset byte size, and if so, executing a second alarm operation;
otherwise, executing a third alarm operation.
In a possible implementation manner, after the step S11 is executed and before the step S12 is executed, the method further includes:
step S111: judging whether blank voice exists in the acquired preset voice section input by the user, and if so, sending a discarding instruction;
step S112: based on a voice position database, determining the position information of the blank voice in the preset voice section according to the sent discarding instruction;
step S113: based on the position information determined in step S112, deleting the corresponding blank speech, and recombining into a new preset speech segment.
The embodiment of the invention provides an opus-based speech coding device, which comprises:
the first acquisition module is used for acquiring a current speech frame to be coded in a preset speech segment;
the encoding module is used for encoding the current speech frame to be encoded acquired by the first acquisition module based on the opus encoding technology to acquire an encoded speech frame;
a first determining module, configured to indicate and determine a byte size of the encoded speech frame obtained by the encoding module based on a preset byte header.
In one possible implementation manner, the apparatus further includes:
the second acquisition module is used for acquiring a preset voice segment input by a user before the first acquisition module acquires the current voice frame to be coded;
and the segmentation module is used for segmenting the preset voice segment input by the user and acquired by the second acquisition module according to the preset time length and acquiring a plurality of voice frames to be coded.
In one possible implementation manner, the apparatus further includes:
a third obtaining module, configured to obtain a next speech frame to be encoded of the current speech frame to be encoded;
the first control module is used for controlling the next speech frame to be coded, which is acquired by the third acquisition module, to execute corresponding subsequent operations;
and for controlling the remaining speech frames to be coded in the preset speech segment, based on the preset arrangement order of the speech frames to be coded, to execute the corresponding subsequent operations.
In one possible implementation manner, the apparatus further includes:
a second determining module, configured to determine a proportional size of the preset byte header in the obtained byte size of the encoded speech frame;
the second control module is used for judging whether the proportion size determined by the second determination module is smaller than a preset proportion size or not;
if so, controlling an alarm module to execute a first alarm operation;
otherwise, determining the byte size of the preset byte header, judging whether the determined byte size is smaller than the preset byte size, and if so, controlling an alarm module to execute a second alarm operation;
otherwise, controlling the alarm module to execute a third alarm operation.
In one possible implementation manner, the apparatus further includes:
the judging module is used for judging whether blank voice exists in the preset voice section input by the user and acquired by the second acquiring module, and if yes, a discarding instruction is sent out;
the determining module is used for determining the position information of the blank voice in the preset voice section based on a voice position database and according to the discarding instruction sent by the judging module;
and the recombination module is used for deleting the corresponding blank speech according to the position information, determined by the determining module, of the blank speech in the preset speech segment, and recombining the remaining speech into a new preset speech segment.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a flow chart of a method for opus-based speech coding according to an embodiment of the present invention;
FIG. 2 is a block diagram of an opus-based speech coder according to an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
An embodiment of the present invention provides an opus-based speech coding method, as shown in fig. 1, including:
step S1: acquiring a current speech frame to be coded in a preset speech section;
step S2: based on the opus coding technology, coding the obtained current speech frame to be coded to obtain a coded speech frame;
step S3: the byte size of the obtained encoded speech frame is represented and determined based on a preset byte header.
Opus is an audio coding format; the opus coding technology is used here to compress and encode the speech frame to be coded;
the current speech frame to be coded is one frame of the preset speech segment, which may be obtained by taking, for example, 20ms as one frame;
the preset byte header is 2 bytes in size, which has the advantage of effectively reducing the proportion of the encoded speech frame that is occupied by the header, for example:
the maximum sampling rate supported by opus is 48k and the maximum frame length is 60ms, so the size of one frame of speech is: sampling rate / 1000 × 2 × frame length (ms) = 48000 / 1000 × 2 × 60, i.e. at most 5760 bytes, where the factor 2 is taken to be the number of bytes per sample; a preset byte header of 2 bytes can represent values up to 65535 and is therefore sufficient to represent and determine the byte size of the obtained encoded speech frame;
and, for example: in the prior art, assuming a sampling rate of 16k, one 20ms frame of speech is 80 bytes after 8-times compression coding, and the 8-byte header information accounts for 10% of the total size of the encoded speech frame; at a sampling rate of 8k, one 20ms frame of speech is 40 bytes after 8-times compression coding, and the header information accounts for 16% of the total size of the encoded speech frame. With the 2-byte preset byte header, each frame of speech is 6 bytes smaller than in the prior art. At a sampling rate of 16k, where one 20ms frame of speech is 80 bytes after 8-times compression coding, the preset byte header accounts for 2-3%, a reduction of 6-7 percentage points compared with the proportion of the header information in the prior art;
similarly, at a sampling rate of 8k, where one 20ms frame of speech is 40 bytes after 8-times compression coding, the preset byte header accounts for 4-6%, a reduction of 10-12 percentage points compared with the proportion of the header information in the prior art; the higher the compression factor, the larger the reduction in proportion achieved by the preset byte header relative to the header information of the prior art.
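The arithmetic in the examples above can be checked with a short calculation. This is only a sketch under two assumptions that the text does not state explicitly: samples are 16-bit mono PCM (the factor 2 in the frame-size formula), and the header share is measured against header plus encoded payload. Under those assumptions the shares come out to about 9.1% versus 2.4% at 16k and about 16.7% versus 4.8% at 8k, which falls inside the ranges quoted in the text. The function names are illustrative.

```python
# Assumptions (not stated in the source): 16-bit mono PCM, and the header
# share is measured against header + encoded payload.


def raw_frame_bytes(sample_rate_hz: int, frame_ms: int, bytes_per_sample: int = 2) -> int:
    """Size of one uncompressed frame: sample rate / 1000 x 2 x frame length in ms."""
    return sample_rate_hz // 1000 * bytes_per_sample * frame_ms


def header_share(header_bytes: int, payload_bytes: int) -> float:
    """Fraction of the stored frame (header + payload) taken up by the header."""
    return header_bytes / (header_bytes + payload_bytes)


# Largest raw frame according to the text: 48 kHz sampling, 60 ms frame.
max_raw = raw_frame_bytes(48_000, 60)
fits_in_two_bytes = max_raw <= 0xFFFF  # a 2-byte unsigned field holds 0..65535

for rate in (16_000, 8_000):
    payload = raw_frame_bytes(rate, 20) // 8  # 8-times compression, as in the text
    old = header_share(8, payload)  # 8-byte header of the prior art
    new = header_share(2, payload)  # 2-byte preset byte header
    print(f"{rate} Hz: payload={payload} B, 8-byte header={old:.1%}, "
          f"2-byte header={new:.1%}, reduction={old - new:.1%}")
```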
The beneficial effects of the above technical scheme are: the method is used for representing the size of the coded voice frame based on the preset byte header by adopting the opus coding technology, so that the proportion of header information can be effectively reduced, and the byte size after voice coding can be effectively reduced.
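Step S3 can be illustrated with a minimal sketch of a 2-byte size header. The byte order (big-endian here) and the helper names `add_size_header` and `read_size_header` are assumptions made for illustration; the source only states that the header is 2 bytes and represents the byte size of the encoded frame.

```python
import struct

# Assumption: 2-byte unsigned integer, big-endian. The source gives no byte order.
HEADER_FORMAT = ">H"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def add_size_header(encoded_frame: bytes) -> bytes:
    """Prefix an encoded frame with a 2-byte header holding its byte size."""
    if len(encoded_frame) > 0xFFFF:
        raise ValueError("encoded frame too large for a 2-byte size header")
    return struct.pack(HEADER_FORMAT, len(encoded_frame)) + encoded_frame


def read_size_header(buffer: bytes, offset: int = 0) -> tuple[bytes, int]:
    """Read one framed payload at `offset`; return (payload, offset of next frame)."""
    (size,) = struct.unpack_from(HEADER_FORMAT, buffer, offset)
    start = offset + HEADER_SIZE
    end = start + size
    if end > len(buffer):
        raise ValueError("truncated frame")
    return buffer[start:end], end
```

Because the header only carries a length, a reader can walk a concatenated stream of frames by repeatedly calling `read_size_header` with the returned offset.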
The embodiment of the invention provides an opus-based speech coding method, which further comprises the following steps before the step S1 is executed:
step S11: acquiring a preset voice section input by a user;
step S12: and according to the preset time length, carrying out segmentation processing on the acquired preset voice section input by the user, and acquiring a plurality of voice frames to be coded.
The preset voice segment input by the user can be a segment of audio information;
the preset time length may be any frame length less than or equal to 60ms, for example 20ms, because the maximum frame length supported by opus is 60ms.
The beneficial effects of the above technical scheme are: the preset voice segment is segmented, so that a plurality of voice frames to be coded can be conveniently obtained, and convenience is brought to the coding processing of the subsequent voice frames to be coded.
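Steps S11 and S12 might be sketched as follows. The sketch assumes the preset speech segment is raw 16-bit mono PCM held in a `bytes` object and that a short final frame is zero-padded; neither detail is specified in the source, and `split_into_frames` is a hypothetical name. Note also that the Opus codec itself accepts only certain frame durations (2.5, 5, 10, 20, 40 and 60 ms), so a practical implementation would restrict the preset time length accordingly.

```python
def split_into_frames(pcm: bytes, sample_rate_hz: int, frame_ms: int = 20,
                      bytes_per_sample: int = 2) -> list[bytes]:
    """Split a PCM segment into fixed-length frames of `frame_ms` milliseconds."""
    if not 0 < frame_ms <= 60:
        raise ValueError("frame length must be greater than 0 and at most 60 ms")
    frame_bytes = sample_rate_hz * frame_ms // 1000 * bytes_per_sample
    frames = [pcm[i:i + frame_bytes] for i in range(0, len(pcm), frame_bytes)]
    # Assumption: pad the last frame with silence so all frames have equal length.
    if frames and len(frames[-1]) < frame_bytes:
        frames[-1] = frames[-1].ljust(frame_bytes, b"\x00")
    return frames
```

For a 16k segment and the default 20ms frame length, each returned frame is 640 bytes long.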
The embodiment of the present invention provides an opus-based speech encoding method, which further includes, after performing step S3:
step S31: acquiring a next speech frame to be coded of the current speech frame to be coded;
step S32: controlling the obtained next speech frame to be encoded to execute steps S2-S3;
step S33: and based on the preset arrangement sequence of the speech frames to be coded, continuing to execute the steps S31-S32 until all the speech frames to be coded in the preset speech segment are completely executed.
The preset arrangement order of the speech frames to be coded may be, for example, chronological order.
The beneficial effects of the above technical scheme are: processing the frames according to the preset arrangement order makes it convenient to process all the segmented frames and avoids omissions and loss of information.
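Steps S31 to S33 amount to a loop over the frames in their preset order. The sketch below is illustrative only: the `encode` callable stands in for an opus encoder (the test usage passes `zlib.compress` purely as a placeholder so the example runs without an opus binding), and the big-endian 2-byte header is an assumption.

```python
import struct
import zlib
from typing import Callable, Iterable, Iterator


def encode_segment(frames: Iterable[bytes], encode: Callable[[bytes], bytes]) -> bytes:
    """Encode every frame in order and prefix each with a 2-byte size header."""
    out = bytearray()
    for frame in frames:                        # S31/S33: next frame in preset order
        payload = encode(frame)                 # S2: encode the frame
        out += struct.pack(">H", len(payload))  # S3: 2-byte size header
        out += payload
    return bytes(out)


def iter_payloads(stream: bytes) -> Iterator[bytes]:
    """Walk a stream produced by encode_segment and yield each encoded payload."""
    offset = 0
    while offset < len(stream):
        (size,) = struct.unpack_from(">H", stream, offset)
        offset += 2
        yield stream[offset:offset + size]
        offset += size


# Placeholder usage: zlib is NOT opus, it only stands in for the encoder here.
example_frames = [bytes([i]) * 640 for i in range(3)]
example_stream = encode_segment(example_frames, zlib.compress)
```

Keeping the encoder as a parameter separates the framing logic, which is what the preset byte header concerns, from the choice of codec binding.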
The embodiment of the present invention provides an opus-based speech encoding method, which further includes, after performing step S3:
step S41: determining the proportional size of the preset byte header in the obtained byte size of the encoded voice frame;
step S42: judging whether the determined proportion size is smaller than a preset proportion size;
if yes, executing a first alarm operation;
otherwise, determining the byte size of the preset byte header, judging whether the determined byte size is smaller than the preset byte size, and if so, executing a second alarm operation;
otherwise, executing a third alarm operation.
For example: at a sampling rate of 16k, one 20ms frame of speech is 80 bytes after 8-times compression coding, of which the header information is 8 bytes, accounting for 10% of the total size of the encoded speech frame;
or, at a sampling rate of 16k, one 20ms frame of speech is 80 bytes after 8-times compression coding, the preset byte header is 2 bytes in size, and the preset byte header accounts for 2-3% of the total size;
wherein one 20ms frame of speech is one speech frame to be coded; the preset proportion size is the 10% share occupied by the header information, and the determined proportion size is the 2-3% share occupied by the preset byte header;
the determined byte size is the size of the preset byte header;
the preset byte size may be the 8 bytes of the header information;
the first alarm operation may be an alarm indicating that the preset byte header is qualified;
the second alarm operation may be an alarm indicating that the byte size of the preset byte header is qualified;
the third alarm operation may be an alarm indicating that the byte size of the preset byte header is unqualified.
The beneficial effects of the above technical scheme are: checking the byte size of the preset byte header avoids representation errors caused by misoperation and effectively ensures that a qualified preset byte header is used for the representation.
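The decision in steps S41 and S42 can be written as a small function. This is a sketch: the `Alarm` enum, the function name, and the default thresholds (10% and 8 bytes, taken from the example values above) are illustrative, and whether the total size includes the header itself is not specified in the source, so the caller supplies the total.

```python
from enum import Enum


class Alarm(Enum):
    FIRST = "preset byte header qualified"
    SECOND = "byte size of preset byte header qualified"
    THIRD = "byte size of preset byte header unqualified"


def check_header(header_bytes: int, frame_total_bytes: int,
                 preset_ratio: float = 0.10, preset_header_bytes: int = 8) -> Alarm:
    """Apply S41/S42: compare the header's share, then its absolute size."""
    if frame_total_bytes <= 0:
        raise ValueError("frame_total_bytes must be positive")
    ratio = header_bytes / frame_total_bytes  # S41: proportion of the header
    if ratio < preset_ratio:                  # S42: proportion below the preset
        return Alarm.FIRST
    if header_bytes < preset_header_bytes:    # otherwise compare the byte size
        return Alarm.SECOND
    return Alarm.THIRD
```

With the example values from the text, a 2-byte header on an 80-byte encoded frame yields the first alarm, while an 8-byte header on a 40-byte frame yields the third.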
The embodiment of the present invention provides an opus-based speech encoding method, which, after performing step S11 and before performing step S12, further includes:
step S111: judging whether blank voice exists in the acquired preset voice section input by the user, and if so, sending a discarding instruction;
step S112: based on a voice position database, determining the position information of the blank voice in the preset voice section according to the sent discarding instruction;
step S113: based on the position information determined in step S112, deleting the corresponding blank speech, and recombining into a new preset speech segment.
The discarding instruction may be an instruction concerning the blank speech and may include the start time and the end time of the blank speech within the preset speech segment;
the new preset speech segment formed by the recombination contains no blank speech.
The beneficial effects of the above technical scheme are: by deleting the blank speech, the encoding compression work of the blank speech can be reduced, the efficiency of encoding compression of the reconstructed preset speech segment is improved, and meanwhile, the storage space of the speech segment after encoding compression can be saved.
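Steps S111 to S113 can be sketched as detecting blank spans and cutting them out. The source does not describe how blank speech is detected or what the voice position database contains, so this sketch substitutes a simple peak-amplitude threshold over 20ms windows; the threshold value, the little-endian 16-bit mono PCM layout, and the function names are all assumptions. The returned spans play the role of the start and end times carried by the discarding instruction.

```python
import struct


def find_blank_spans(pcm: bytes, sample_rate_hz: int, window_ms: int = 20,
                     threshold: int = 200) -> list[tuple[int, int]]:
    """Return (start_ms, end_ms) spans whose peak amplitude stays below threshold."""
    usable = len(pcm) - len(pcm) % 2  # ignore a trailing odd byte
    samples = [s for (s,) in struct.iter_unpack("<h", pcm[:usable])]
    window = sample_rate_hz * window_ms // 1000
    spans: list[tuple[int, int]] = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        if max(abs(s) for s in chunk) < threshold:
            start_ms = start * 1000 // sample_rate_hz
            end_ms = (start + len(chunk)) * 1000 // sample_rate_hz
            if spans and spans[-1][1] == start_ms:
                spans[-1] = (spans[-1][0], end_ms)  # merge adjacent blank windows
            else:
                spans.append((start_ms, end_ms))
    return spans


def drop_spans(pcm: bytes, sample_rate_hz: int, spans: list[tuple[int, int]]) -> bytes:
    """Delete the given time spans and recombine the remaining speech."""
    kept = bytearray()
    cursor = 0
    for start_ms, end_ms in spans:
        start = start_ms * sample_rate_hz // 1000 * 2  # 2 bytes per sample
        end = end_ms * sample_rate_hz // 1000 * 2
        kept += pcm[cursor:start]
        cursor = end
    kept += pcm[cursor:]
    return bytes(kept)
```

A production system would more likely use a voice activity detector than a fixed amplitude threshold; the threshold is used here only to keep the example self-contained.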
An embodiment of the present invention provides an opus-based speech encoding apparatus, as shown in fig. 2, including:
the first acquisition module is used for acquiring a current speech frame to be coded in a preset speech segment;
the encoding module is used for encoding the current speech frame to be encoded acquired by the first acquisition module based on the opus encoding technology to acquire an encoded speech frame;
a first determining module, configured to determine, based on a preset byte header, a byte size of the encoded speech frame obtained by the encoding module.
The beneficial effects of the above technical scheme are: the method is used for representing the size of the coded voice frame based on the preset byte header by adopting the opus coding technology, so that the proportion of header information can be effectively reduced, and the byte size after voice coding can be effectively reduced.
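One way to picture the three modules of the apparatus is as methods of a single object. This is an illustrative sketch only: the class and method names are hypothetical, the `encode` callable stands in for an opus encoder, and the big-endian 2-byte header is an assumption.

```python
import struct
import zlib
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class SpeechCodingApparatus:
    # Encoding module: any callable mapping a raw frame to encoded bytes.
    encode: Callable[[bytes], bytes]

    def acquire_frame(self, frames: Sequence[bytes], index: int) -> bytes:
        """First acquisition module: select the current frame to be coded."""
        return frames[index]

    def determine_size(self, encoded: bytes) -> bytes:
        """First determining module: 2-byte header carrying the encoded size."""
        return struct.pack(">H", len(encoded))

    def process(self, frames: Sequence[bytes], index: int) -> bytes:
        """Run acquisition, encoding and size determination for one frame."""
        frame = self.acquire_frame(frames, index)
        encoded = self.encode(frame)
        return self.determine_size(encoded) + encoded


# Placeholder usage: zlib.compress is NOT opus, it only stands in for the encoder.
apparatus = SpeechCodingApparatus(encode=zlib.compress)
framed = apparatus.process([bytes(640)], 0)
```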
The embodiment of the invention provides an opus-based speech coding device, which further comprises:
the second acquisition module is used for acquiring a preset voice segment input by a user before the first acquisition module acquires the current voice frame to be coded;
and the segmentation module is used for segmenting the preset voice segment input by the user and acquired by the second acquisition module according to the preset time length and acquiring a plurality of voice frames to be coded.
The beneficial effects of the above technical scheme are: the preset voice segment is segmented, so that a plurality of voice frames to be coded can be conveniently obtained, and convenience is brought to the coding processing of the subsequent voice frames to be coded.
The embodiment of the invention provides an opus-based speech coding device, which further comprises:
a third obtaining module, configured to obtain a next speech frame to be encoded of the current speech frame to be encoded;
the first control module is used for controlling the next speech frame to be coded, which is acquired by the third acquisition module, to execute corresponding subsequent operations;
and for controlling the remaining speech frames to be coded in the preset speech segment, based on the preset arrangement order of the speech frames to be coded, to execute the corresponding subsequent operations.
The beneficial effects of the above technical scheme are: processing the frames according to the preset arrangement order makes it convenient to process all the segmented frames and avoids omissions and loss of information.
The embodiment of the invention provides an opus-based speech coding device, which further comprises:
a second determining module, configured to determine a proportional size of the preset byte header in the obtained byte size of the encoded speech frame;
the second control module is used for judging whether the proportion size determined by the second determination module is smaller than a preset proportion size or not;
if so, controlling an alarm module to execute a first alarm operation;
otherwise, determining the byte size of the preset byte header, judging whether the determined byte size is smaller than the preset byte size, and if so, controlling an alarm module to execute a second alarm operation;
otherwise, controlling the alarm module to execute a third alarm operation.
The beneficial effects of the above technical scheme are: checking the byte size of the preset byte header avoids representation errors caused by misoperation and effectively ensures that a qualified preset byte header is used for the representation.
The embodiment of the invention provides an opus-based speech coding device, which further comprises:
the judging module is used for judging whether blank voice exists in the preset voice section input by the user and acquired by the second acquiring module, and if yes, a discarding instruction is sent out;
the determining module is used for determining the position information of the blank voice in the preset voice section based on a voice position database and according to the discarding instruction sent by the judging module;
and the recombination module is used for deleting the corresponding blank speech according to the position information, determined by the determining module, of the blank speech in the preset speech segment, and recombining the remaining speech into a new preset speech segment.
The beneficial effects of the above technical scheme are: by deleting the blank speech, the encoding compression work of the blank speech can be reduced, the efficiency of encoding compression of the reconstructed preset speech segment is improved, and meanwhile, the storage space of the speech segment after encoding compression can be saved.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
Claims (10)
1. An opus-based speech coding method, comprising:
step S1: acquiring a current speech frame to be coded in a preset speech section;
step S2: based on the opus coding technology, coding the obtained current speech frame to be coded to obtain a coded speech frame;
step S3: the byte size of the obtained encoded speech frame is represented and determined based on a preset byte header.
2. The opus-based speech coding method of claim 1, further comprising, before performing step S1:
step S11: acquiring a preset voice section input by a user;
step S12: and according to the preset time length, carrying out segmentation processing on the acquired preset voice section input by the user, and acquiring a plurality of voice frames to be coded.
3. The opus-based speech coding method of claim 2, further comprising, after performing step S3:
step S31: acquiring a next speech frame to be coded of the current speech frame to be coded;
step S32: controlling the obtained next speech frame to be encoded to execute steps S2-S3;
step S33: and based on the preset arrangement sequence of the speech frames to be coded, continuing to execute the steps S31-S32 until all the speech frames to be coded in the preset speech segment are completely executed.
4. The opus-based speech coding method of claim 1, further comprising, after performing step S3:
step S41: determining the proportional size of the preset byte header in the obtained byte size of the encoded voice frame;
step S42: judging whether the determined proportion size is smaller than a preset proportion size;
if yes, executing a first alarm operation;
otherwise, determining the byte size of the preset byte header, judging whether the determined byte size is smaller than the preset byte size, and if so, executing a second alarm operation;
otherwise, executing a third alarm operation.
5. The opus-based speech coding method of claim 2, wherein after the step S11 is performed and before the step S12 is performed, further comprising:
step S111: judging whether blank voice exists in the acquired preset voice section input by the user, and if so, sending a discarding instruction;
step S112: based on a voice position database, determining the position information of the blank voice in the preset voice section according to the sent discarding instruction;
step S113: based on the position information determined in step S112, deleting the corresponding blank speech, and recombining into a new preset speech segment.
6. An opus-based speech coder, comprising:
the first acquisition module is used for acquiring a current speech frame to be coded in a preset speech segment;
the encoding module is used for encoding the current speech frame to be encoded acquired by the first acquisition module based on the opus encoding technology to acquire an encoded speech frame;
a first determining module, configured to indicate and determine a byte size of the encoded speech frame obtained by the encoding module based on a preset byte header.
7. The opus-based speech coder of claim 6, further comprising:
the second acquisition module is used for acquiring a preset voice segment input by a user before the first acquisition module acquires the current voice frame to be coded;
and the segmentation module is used for segmenting the preset voice segment input by the user and acquired by the second acquisition module according to the preset time length and acquiring a plurality of voice frames to be coded.
8. The opus-based speech coder of claim 7, further comprising:
a third obtaining module, configured to obtain a next speech frame to be encoded of the current speech frame to be encoded;
the first control module is used for controlling the next speech frame to be coded, which is acquired by the third acquisition module, to execute corresponding subsequent operations;
and controlling the rest speech frames to be coded in the preset speech segment based on the preset arrangement sequence of the speech frames to be coded, and executing corresponding subsequent operations.
9. The opus-based speech coder of claim 6, further comprising:
a second determining module, configured to determine a proportional size of the preset byte header in the obtained byte size of the encoded speech frame;
the second control module is used for judging whether the proportion size determined by the second determination module is smaller than a preset proportion size or not;
if so, controlling an alarm module to execute a first alarm operation;
otherwise, determining the byte size of the preset byte header, judging whether the determined byte size is smaller than the preset byte size, and if so, controlling an alarm module to execute a second alarm operation;
otherwise, controlling the alarm module to execute a third alarm operation.
10. The opus-based speech coder of claim 7, further comprising:
the judging module is used for judging whether blank voice exists in the preset voice section input by the user and acquired by the second acquiring module, and if yes, a discarding instruction is sent out;
the determining module is used for determining the position information of the blank voice in the preset voice section based on a voice position database and according to the discarding instruction sent by the judging module;
and the recombination module is used for deleting the corresponding blank speech according to the position information, determined by the determining module, of the blank speech in the preset speech segment, and recombining the remaining speech into a new preset speech segment.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910984964.7A CN110827838A (en) | 2019-10-16 | 2019-10-16 | Opus-based voice coding method and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110827838A true CN110827838A (en) | 2020-02-21 |
Family
ID=69549439
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910984964.7A Pending CN110827838A (en) | 2019-10-16 | 2019-10-16 | Opus-based voice coding method and apparatus |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110827838A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200221 |