CN112511407A - Self-adaptive voice playing method and system - Google Patents

Self-adaptive voice playing method and system

Info

Publication number
CN112511407A
Authority
CN
China
Prior art keywords
voice
voice data
time
max
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011196492.8A
Other languages
Chinese (zh)
Other versions
CN112511407B (en)
Inventor
刘祥国
张营
杜慧珺
李文敬
雷现惠
彭佳
杨坤
周佳
淳于岳松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
TaiAn Power Supply Co of State Grid Shandong Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
TaiAn Power Supply Co of State Grid Shandong Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, TaiAn Power Supply Co of State Grid Shandong Electric Power Co Ltd
Priority to CN202011196492.8A
Publication of CN112511407A
Application granted
Publication of CN112511407B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00: User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/04: Real-time or near real-time messaging, e.g. instant messaging [IM]
    • H04L51/52: User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail for supporting social networking services

Abstract

The invention provides a self-adaptive voice playing method, which accumulates the voice durations of voice data that belong to the same user ID and whose time interval between two adjacent pieces of voice data is not larger than a preset interval threshold, and then uses multiple-speed playback when the accumulated voice duration is larger than a preset time threshold. The invention also provides a self-adaptive voice playing system. The invention can perform self-adaptive multiple-speed playback of voice data.

Description

Self-adaptive voice playing method and system
Technical Field
The invention relates to a voice playing method and a voice playing system, in particular to a self-adaptive voice playing method and a self-adaptive voice playing system.
Background
With the rapid development of mobile internet technology, users increasingly use instant messaging software to communicate. Instant messaging software supports voice transmission, but some people send excessively long voice messages, which reduces the efficiency with which users obtain the information in the voice. For example, some users send several voice messages of 60 seconds or more; to understand what they express, each message must be listened to in full, which takes a long time. One solution is to use the speech-to-text function of the instant messaging software, converting the audio into text to speed up browsing. However, the drawback of this scheme is also obvious: when the audio is converted into text, some information is lost or transcribed incorrectly.
Some playback software in the prior art can increase the playback speed, for example to 1.25 times or 1.5 times the original speed. By increasing the playback speed, the user can grasp the information in the audio more quickly, as shown in Fig. 1. However, such software can only play at fixed speed multiples and cannot adapt the speed to the voice duration, so the user cannot obtain much information in a short time and the user experience suffers.
Disclosure of Invention
The invention provides a self-adaptive voice playing method that plays longer voice messages in instant messaging software at a speed matched to their voice duration, thereby accelerating playback so that the user can obtain more information in a short time. The invention also provides a self-adaptive voice playing system.
The technical scheme adopted by the invention is as follows:
In one aspect, the invention provides a self-adaptive voice playing method, which comprises the following steps:
S401, acquiring the first unplayed voice data in an instant messaging software chat, wherein the voice data comprises a voice duration and a corresponding user ID;
S402, acquiring the next unplayed voice data in the instant messaging software chat;
S403, if the user ID corresponding to the voice data acquired in step S402 is the same as the user ID corresponding to the previously acquired voice data, and the time interval between the acquired voice data and the previously acquired voice data is lower than a preset interval threshold, executing S404;
S404, repeatedly executing steps S402 and S403 until the user ID of the acquired voice data differs from that of the previously acquired voice data, or the time interval between the acquired voice data and the previously acquired voice data is greater than the preset interval threshold; then executing S405;
S405, accumulating the voice durations of the acquired voice data having the same user ID to obtain a total voice duration Z; then executing S406;
S406, if the total voice duration Z is greater than a preset time threshold Z0, playing the acquired voice data having the same user ID at a preset multiple speed Q, where Q > 1 and Q is determined based on the total voice duration Z.
Optionally, the preset interval threshold is 3-5 seconds.
Optionally, the preset multiple speed Q is given by:
[Formula image BDA0002754175400000021; not reproduced in this text]
where Qmin is a preset minimum multiple speed, Qmax is a preset maximum multiple speed, and f(Qmax, Qmin) is a speed compensation function related to Qmin and Qmax.
Optionally, f(Qmax, Qmin) is positively correlated with Qmax and negatively correlated with Qmin.
Optionally, f(Qmax, Qmin) is positively correlated with (Qmax - Qmin).
Optionally, Z0 = max(a preset threshold, k × the maximum voice duration allowed by the instant messaging software), where k is a coefficient smaller than 1.
Optionally, the preset threshold is 10-30 seconds.
Optionally, Qmin = 1.1 and Qmax = 1.5.
Optionally, Qmin and Qmax are each set based on the corresponding user ID.
Another aspect of the present invention provides an adaptive voice playing system, including a processor and a storage medium in which a computer program is stored, wherein the processor executes the computer program and implements the above method when a voice playing instruction is obtained.
In the self-adaptive voice playing method and system provided by the embodiments of the invention, for unplayed voice data, the voice durations are accumulated for voice data that belong to the same user ID and whose time interval between two adjacent pieces of voice data does not exceed the preset interval threshold. When the accumulated voice duration is greater than the preset time threshold, multiple-speed playback is performed, which speeds up playback while keeping extra processing to a minimum.
Drawings
FIG. 1 is a diagram of a conventional playback software;
fig. 2 is a schematic flowchart illustrating a method for adaptive voice playing according to an embodiment of the present invention;
fig. 3 is a schematic flowchart of a method for adaptive voice playing according to another embodiment of the present invention;
fig. 4 is a schematic flowchart of a method for adaptive voice playing according to another embodiment of the present invention;
fig. 5 is a diagram showing a set of voice data having the same user ID but different voice durations;
fig. 6 is a schematic flowchart of a method for adaptive voice playing according to another embodiment of the present invention;
fig. 7 shows a set of voice data transmitted at different times.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.
In some of the flows described in the present specification and claims and in the above figures, a number of operations are included that occur in a particular order, but it should be clearly understood that these operations may be performed out of order or in parallel as they occur herein, with the order of the operations being indicated as 101, 102, etc. merely to distinguish between the various operations, and the order of the operations by themselves does not represent any order of performance. Additionally, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 2 is a diagram illustrating an adaptive voice playing method according to an embodiment of the present invention, which is suitable for voice playing of instant messaging software. As shown in fig. 2, the adaptive speech playing method provided in the embodiment of the present invention includes the following steps:
S101, acquiring unplayed voice data X in the instant messaging software chat, wherein the voice data X comprises a voice duration Z and a corresponding user ID.
S102, if the acquired voice duration Z is greater than a preset time threshold Z0, playing the corresponding voice data at a preset multiple speed Q, where Q > 1 and Q is determined based on the acquired voice duration.
In practice, the duration of a voice message sent in an instant messaging chat varies, and multiple-speed playback requires additional processing of the audio. If the voice is short, multiple-speed playback saves little time; if the voice is long, it saves more. Therefore, the embodiment of the invention uses a time threshold Z0 so that only longer voice is played at multiple speed, which speeds up playback while keeping extra processing to a minimum.
Further, in the embodiment of the present invention, the preset time threshold Z0 may be, for example, 10 to 30 seconds. Preferably, Z0 = max(a preset threshold, k × the maximum voice duration allowed by the instant messaging software), where k is a small coefficient, i.e., k < 1, for example k = 0.5. The preset threshold may be 10 to 30 seconds.
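The threshold rule above can be worked through with a small sketch. The function name `time_threshold` and the 60-second and 40-second message limits are assumptions chosen for illustration; only the max(...) rule, k < 1, and k = 0.5 come from the text.

```python
def time_threshold(preset_threshold_s: float, max_message_s: float, k: float = 0.5) -> float:
    """Z0 = max(preset threshold, k * longest single voice message the app allows)."""
    if not 0 < k < 1:
        raise ValueError("k is described as a coefficient smaller than 1")
    return max(preset_threshold_s, k * max_message_s)


# Assumed app limit of 60 s per message: 0.5 * 60 = 30 exceeds the 10 s preset.
print(time_threshold(10.0, 60.0))  # 30.0
# Assumed app limit of 40 s per message: 0.5 * 40 = 20 is below the 25 s preset.
print(time_threshold(25.0, 40.0))  # 25.0
```

With this rule the threshold scales with the longest message the software allows but never drops below the preset floor.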
Further, in one embodiment of the present invention, the preset multiple speed Q may be equal to 1.25 or 1.5. In another embodiment, the preset multiple speed Q may lie between a preset minimum multiple speed Qmin and a preset maximum multiple speed Qmax. Preferably, in one embodiment, Qmin = 1.1 and Qmax = 1.5.
Further, the preset multiple speed Q may be determined by the following formula (1):
[Formula (1): image BDA0002754175400000041; not reproduced in this text]
where f(Qmax, Qmin) is a speed compensation function related to Qmin and Qmax. f(Qmax, Qmin) is positively correlated with Qmax, i.e., the larger Qmax is, the larger f(Qmax, Qmin) is; f(Qmax, Qmin) is negatively correlated with Qmin, i.e., the larger Qmin is, the smaller f(Qmax, Qmin) is. Further, f(Qmax, Qmin) is positively correlated with (Qmax - Qmin), i.e., the larger (Qmax - Qmin) is, the larger f(Qmax, Qmin) is. The technical effect of determining Q by formula (1) is as follows: the larger Z/Z0 is, the longer the voice and the faster it is played, which saves the user's time; and the longer the voice, the more widely its information is distributed, so missing details is less of a concern. Conversely, the smaller Z/Z0 is, the shorter the voice and the slower it is played, which ensures that the user does not miss the information in the voice.
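Formula (1) is available only as an image, so the following is not the patent's formula. It is a purely hypothetical mapping, chosen to illustrate the properties stated above: Q grows with Z/Z0, the compensation value f sets the step size, and Q stays between Qmin and Qmax. The function name `playback_speed` and the linear form are assumptions.

```python
def playback_speed(z: float, z0: float, q_min: float = 1.1,
                   q_max: float = 1.5, f: float = 0.15) -> float:
    """Hypothetical stand-in for formula (1); NOT the formula from the patent.

    Returns 1.0 (normal speed) at or below the threshold Z0, otherwise a speed
    that rises linearly with Z/Z0 in steps of f and is capped at q_max.
    """
    if z <= z0:
        return 1.0
    q = q_min + f * (z / z0 - 1.0)
    return min(q_max, q)


for z in (20, 45, 90, 300):
    print(z, round(playback_speed(z, z0=30), 3))
```

Any other monotonically increasing, bounded mapping would satisfy the same stated properties; the linear step is only the simplest choice.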
In one embodiment, f(Qmax, Qmin) = Qmax/N1, for example, N1 = 10.
In one embodiment, f(Qmax, Qmin) = (Qmax - Qmin)/N2, for example, N2 = 5.
In one embodiment, f(Qmax, Qmin) = min(Qmax/N1, (Qmax - Qmin)/N2), where N1 and N2 take the same values as above.
The reason for the above embodiments is that, in general, multiple-speed audio playback does not exceed 2 times and does not fall below 1.1 times; otherwise the audio is either unintelligible or the speed-up is not meaningful. On this basis, the compensation function f(Qmax, Qmin) may be 1/10 of Qmax, i.e., about 0.15-0.2, or 1/5 of Qmin, i.e., about 0.22-0.3, or the smaller of the two; these settings are reasonable.
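The three compensation-function embodiments given as formulas above can be evaluated directly. The function names below are assumptions; the expressions Qmax/N1, (Qmax - Qmin)/N2 and their minimum, with N1 = 10 and N2 = 5, are taken from the text.

```python
def f_ratio(q_max: float, n1: float = 10) -> float:
    """f = Qmax / N1."""
    return q_max / n1


def f_span(q_max: float, q_min: float, n2: float = 5) -> float:
    """f = (Qmax - Qmin) / N2."""
    return (q_max - q_min) / n2


def f_smaller(q_max: float, q_min: float, n1: float = 10, n2: float = 5) -> float:
    """f = min(Qmax / N1, (Qmax - Qmin) / N2)."""
    return min(f_ratio(q_max, n1), f_span(q_max, q_min, n2))


# Evaluate for the two (Qmin, Qmax) pairs mentioned in the description.
for q_min, q_max in ((1.1, 1.5), (1.2, 2.0)):
    print(q_min, q_max,
          round(f_ratio(q_max), 3),
          round(f_span(q_max, q_min), 3),
          round(f_smaller(q_max, q_min), 3))
```

For both pairs the span-based value is the smaller of the two, so the min(...) embodiment selects it.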
In another embodiment,
[Formula image BDA0002754175400000051; not reproduced in this text]
Qmax = 2, Qmin = 1.2,
[Formula image BDA0002754175400000052; not reproduced in this text]
Further, Qmin and Qmax can be set based on the corresponding user ID, and can be set by the user; for example, a settings interface is provided in which the user enters Qmin and Qmax. Because friends in a private chat are familiar to the user, the user knows how fast they speak: for people who speak fast, Qmin and Qmax can be set smaller, and for people who speak slowly, Qmin and Qmax can be set larger, so that the information in the voice is understood more accurately.
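Per-user speed limits of this kind could be stored as a simple lookup with a default. This is a minimal sketch; the `SpeedRange` type, the user IDs, and the override values are hypothetical, and only the default Qmin = 1.1, Qmax = 1.5 comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeedRange:
    q_min: float = 1.1  # default from the description
    q_max: float = 1.5


DEFAULT_RANGE = SpeedRange()

# Hypothetical user-entered settings: a smaller range for a fast talker,
# a larger range for a slow talker.
user_ranges = {
    "fast_talker": SpeedRange(q_min=1.1, q_max=1.3),
    "slow_talker": SpeedRange(q_min=1.3, q_max=1.8),
}


def speed_range_for(user_id: str, overrides: dict) -> SpeedRange:
    """Return the sender-specific range, falling back to the default."""
    return overrides.get(user_id, DEFAULT_RANGE)


print(speed_range_for("fast_talker", user_ranges))
print(speed_range_for("someone_else", user_ranges))
```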
Fig. 3 is a diagram illustrating an adaptive audio playing method according to another embodiment of the present invention.
As shown in fig. 3, in one embodiment, preferably, the step S101 includes:
S201, sequentially acquiring the unplayed voice data in the instant messaging software chat.
Preferably, the step S102 includes:
S202, if the voice duration of a piece of acquired voice data is greater than the preset time threshold, playing that voice data at the preset multiple speed Q.
In this embodiment, the unplayed voice data may be obtained one by one in sequence, and a piece of voice data is played at the preset multiple speed Q only when its voice duration Z is greater than the preset time threshold Z0, which speeds up playback while keeping extra processing to a minimum. The preset time threshold Z0 and the preset multiple speed Q in this embodiment are defined as in the previous embodiment and are not described again here.
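The per-message decision in S201 and S202 can be sketched as follows. The function name and the sample durations are assumptions, and a fixed Q = 1.25 (one of the values mentioned earlier) is used in place of formula (1).

```python
def plan_single_messages(durations, z0, q=1.25):
    """Pair each unplayed message duration with the speed it would be played at.

    A message is sped up only if its own duration exceeds the threshold Z0.
    """
    return [(z, q if z > z0 else 1.0) for z in durations]


# Hypothetical unplayed messages of 8 s, 45 s and 19 s with Z0 = 30 s.
print(plan_single_messages([8, 45, 19], z0=30))
# [(8, 1.0), (45, 1.25), (19, 1.0)]
```

Because each message is judged on its own, several short consecutive messages from the same sender are never sped up under this embodiment.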
Fig. 4 is a diagram illustrating an adaptive speech playing method according to another embodiment of the present invention.
As shown in fig. 4, in an embodiment, preferably, the steps S101 and S102 may further include:
S301, acquiring the first unplayed voice data in the instant messaging software chat;
S302, acquiring the next unplayed voice data in the instant messaging software chat;
S303, if the user ID corresponding to the voice data acquired in step S302 is the same as the user ID corresponding to the previously acquired voice data, that is, the newly acquired voice data and the preceding voice data have the same user ID, executing S304;
S304, repeatedly executing steps S302 and S303 until the user ID of the acquired voice data differs from that of the previously acquired voice data; then executing S305;
S305, accumulating the voice durations of the acquired voice data having the same user ID to obtain a total voice duration; for example, if n pieces of voice data X1, X2, …, Xn with the same user ID are acquired and their voice durations are Z1, Z2, …, Zn, the total voice duration is Z = Z1 + Z2 + … + Zn; then executing S306;
S306, if the total voice duration Z is greater than the preset time threshold Z0, playing the acquired voice data having the same user ID at the preset multiple speed Q, for example playing the voice data X1, X2, …, Xn in sequence. If the total voice duration Z is not greater than the preset time threshold Z0, multiple-speed playback is not performed.
In this embodiment, short voice and long voice can be distinguished effectively over a group of unplayed voice data. Short voice does not need to be sped up, while long voice does. Taking Z0 = 30 seconds as an example, the two audio segments of 19 seconds and 17 seconds in Fig. 5 would not be played at multiple speed without steps S302 to S304, but are played at multiple speed when steps S302 to S304 are used. Evidently, the four utterances in Fig. 5 were spoken together as one continuous message; if each were judged separately against the time threshold Z0, the playback speed would change from one utterance to the next, harming the user experience. Such runs of consecutive voice data of varying duration reflect the user's habits in instant messaging chats and should therefore be given full consideration. The preset time threshold Z0 and the preset multiple speed Q in this embodiment are defined as in the previous embodiment and are not described again here.
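Steps S301 to S306 amount to grouping consecutive messages by sender and comparing each group's total duration with Z0. The sketch below assumes messages are (user ID, duration) pairs and handles every run in the list rather than only the first; the function names and the senders are hypothetical, while the 19-second and 17-second durations and Z0 = 30 seconds follow the example in the text.

```python
from itertools import groupby


def consecutive_runs(messages):
    """Group consecutive (user_id, duration) pairs that share a user ID."""
    runs = []
    for user_id, group in groupby(messages, key=lambda m: m[0]):
        runs.append((user_id, [duration for _, duration in group]))
    return runs


def runs_over_threshold(messages, z0):
    """Keep only the runs whose total duration Z = Z1 + ... + Zn exceeds Z0."""
    return [(uid, ds) for uid, ds in consecutive_runs(messages) if sum(ds) > z0]


chat = [("alice", 19), ("alice", 17), ("bob", 5), ("alice", 8)]
print(runs_over_threshold(chat, z0=30))
# [('alice', [19, 17])]
```

Neither the 19-second nor the 17-second message exceeds 30 seconds alone, but their 36-second total does, so both are sped up together.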
Fig. 6 is a diagram illustrating an adaptive audio playing method according to another embodiment of the present invention.
As shown in fig. 6, in one embodiment, preferably, the steps S101 and S102 may further include:
S401, acquiring the first unplayed voice data in an instant messaging software chat, wherein the voice data comprises a voice duration and a corresponding user ID;
S402, acquiring the next unplayed voice data in the instant messaging software chat;
S403, if the user ID corresponding to the voice data acquired in step S402 is the same as the user ID corresponding to the previously acquired voice data, and the time interval between the acquired voice data and the previously acquired voice data is lower than the preset interval threshold P, executing S404; if not, executing S405;
that is, S404 is executed if the newly acquired voice data and the preceding voice data have the same user ID and the time interval Ts2 - Te1 between them is lower than the preset interval threshold P, where Ts2 is the start time of the newly acquired voice data and Te1 is the end time of the preceding voice data;
S404, repeatedly executing steps S402 and S403 until the user ID of the acquired voice data differs from that of the previously acquired voice data, or the time interval between the acquired voice data and the previously acquired voice data is greater than the preset interval threshold; then executing S405;
S405, accumulating the voice durations of the acquired voice data having the same user ID to obtain a total voice duration Z; for example, if n pieces of voice data X1, X2, …, Xn with the same user ID are acquired and their voice durations are Z1, Z2, …, Zn, the total voice duration is Z = Z1 + Z2 + … + Zn; then executing S406;
S406, if the total voice duration Z is greater than the preset time threshold Z0, playing the acquired voice data having the same user ID at the preset multiple speed Q, for example playing the voice data X1, X2, …, Xn in sequence, where Q > 1 and Q is determined based on the total voice duration Z. If the total voice duration Z is not greater than the preset time threshold Z0, multiple-speed playback is not performed.
In this embodiment, the preset interval threshold P may be, for example, 3 to 5 seconds. In addition, the preset time threshold Z0 and the preset speed Q in this embodiment are defined in accordance with the previous embodiment, and detailed descriptions thereof are omitted here for avoiding redundancy.
In this embodiment, for a group of unplayed voice data, voice durations are accumulated only for voice data that belong to the same user ID and whose interval from the adjacent voice data does not exceed the preset interval threshold; multiple-speed playback is performed when the accumulated duration exceeds the preset time threshold. For example, as shown in Fig. 7, the durations of the 11-second and 4-second voice data at the upper right can be accumulated for continuous multiple-speed playback, whereas the 11-second and 5-second voice data on the left cannot be played continuously. This improves both playback efficiency and the accuracy of information extraction.
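The flow of S401 to S406 can be sketched as a single pass that starts a new run whenever the sender changes or the gap Ts2 - Te1 is not below P. The `VoiceMessage` type, the timestamps, Z0 = 12 seconds and Q = 1.25 are assumptions for illustration, and duration is assumed to equal end minus start. The source does not say what happens when the gap equals P exactly; this sketch treats that case as a break.

```python
from dataclasses import dataclass


@dataclass
class VoiceMessage:
    user_id: str
    start: float  # Ts, seconds (assumed to be when the recording begins)
    end: float    # Te, seconds

    @property
    def duration(self) -> float:
        return self.end - self.start


def split_runs(messages, gap_threshold):
    """Split messages into runs of one sender with gaps below the threshold P."""
    runs = []
    for msg in messages:
        prev = runs[-1][-1] if runs else None
        if (prev is not None and msg.user_id == prev.user_id
                and msg.start - prev.end < gap_threshold):
            runs[-1].append(msg)
        else:
            runs.append([msg])
    return runs


def plan(messages, gap_threshold, z0, q):
    """Return (total duration Z, playback speed, message count) per run."""
    result = []
    for run in split_runs(messages, gap_threshold):
        total = sum(m.duration for m in run)
        result.append((total, q if total > z0 else 1.0, len(run)))
    return result


chat = [
    VoiceMessage("alice", 0.0, 11.0),   # 11 s
    VoiceMessage("alice", 13.0, 17.0),  # 4 s, sent 2 s after the previous one
    VoiceMessage("alice", 60.0, 65.0),  # 5 s, sent 43 s later: new run
]
print(plan(chat, gap_threshold=4.0, z0=12.0, q=1.25))
# [(15.0, 1.25, 2), (5.0, 1.0, 1)]
```

The first two messages accumulate to 15 seconds and are played together at multiple speed, while the late 5-second message is judged on its own and stays at normal speed.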
An embodiment of the present invention further provides an adaptive voice playing system, including a memory and a processor, the memory storing a computer program; when the processor executes the computer program upon obtaining a voice playing instruction, the steps of the adaptive voice playing method are implemented. The adaptive voice playing system provided by the embodiment of the invention can be deployed on a mobile terminal.
Specifically, the memory and the processor may be a general-purpose memory and processor and are not specifically limited here. When the processor runs the computer program stored in the memory, the adaptive voice playing method is executed, which solves the problem in the related art that voice cannot be adaptively played at multiple speed.
The above embodiments are only specific embodiments of the present invention, intended to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that, within the technical scope disclosed herein, anyone skilled in the art may still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present invention and shall be covered by its protection scope. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A time-based adaptive voice playing method, characterized by comprising the following steps:
S401, acquiring the first unplayed voice data in an instant messaging software chat, wherein the voice data comprises a voice duration and a corresponding user ID;
S402, acquiring the next unplayed voice data in the instant messaging software chat;
S403, if the user ID corresponding to the voice data acquired in step S402 is the same as the user ID corresponding to the previously acquired voice data, and the time interval between the acquired voice data and the previously acquired voice data is lower than a preset interval threshold, executing S404;
S404, repeatedly executing steps S402 and S403 until the user ID of the acquired voice data differs from that of the previously acquired voice data, or the time interval between the acquired voice data and the previously acquired voice data is greater than the preset interval threshold; then executing S405;
S405, accumulating the voice durations of the acquired voice data having the same user ID to obtain a total voice duration Z; then executing S406;
S406, if the total voice duration Z is greater than a preset time threshold Z0, playing the acquired voice data having the same user ID at a preset multiple speed Q, where Q > 1 and Q is determined based on the total voice duration Z.
2. The time-based adaptive voice playing method according to claim 1, wherein the preset interval threshold is 3-5 seconds.
3. The time-based adaptive voice playing method according to claim 1, wherein the preset multiple speed Q is given by:
[Formula image FDA0002754175390000011; not reproduced in this text]
where Qmin is a preset minimum multiple speed, Qmax is a preset maximum multiple speed, and f(Qmax, Qmin) is a speed compensation function related to Qmin and Qmax.
4. The time-based adaptive voice playing method according to claim 3, wherein f(Qmax, Qmin) is positively correlated with Qmax and negatively correlated with Qmin.
5. The time-based adaptive voice playing method according to claim 3, wherein f(Qmax, Qmin) is positively correlated with (Qmax - Qmin).
6. The time-based adaptive voice playing method according to claim 3, wherein Z0 = max(a preset threshold, k × the maximum single voice duration allowed by the instant messaging software), and k is a small coefficient.
7. The time-based adaptive voice playing method according to claim 6, wherein the preset threshold is 10-30 seconds.
8. The time-based adaptive voice playing method according to claim 3, wherein Qmin = 1.1 and Qmax = 1.5.
9. The time-based adaptive voice playing method according to claim 3, wherein Qmin and Qmax are each set based on the corresponding user ID.
10. An adaptive voice playing system, comprising: a processor and a storage medium having a computer program stored thereon, the computer program being executable by the processor to perform the method of any one of claims 1 to 9 when a voice playing instruction is obtained.
CN202011196492.8A 2020-10-30 2020-10-30 Self-adaptive voice playing method and system Active CN112511407B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011196492.8A CN112511407B (en) 2020-10-30 2020-10-30 Self-adaptive voice playing method and system

Publications (2)

Publication Number Publication Date
CN112511407A true CN112511407A (en) 2021-03-16
CN112511407B CN112511407B (en) 2022-04-29

Family

ID=74954725

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011196492.8A Active CN112511407B (en) 2020-10-30 2020-10-30 Self-adaptive voice playing method and system

Country Status (1)

Country Link
CN (1) CN112511407B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113595868A (en) * 2021-06-28 2021-11-02 深圳云之家网络有限公司 Voice message processing method and device based on instant messaging and computer equipment

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101719371A (en) * 2009-11-20 2010-06-02 安凯(广州)微电子技术有限公司 Voice speed changing method
WO2016000219A1 (en) * 2014-07-02 2016-01-07 华为技术有限公司 Information transmission method and transmission device
CN107124352A (en) * 2017-05-26 2017-09-01 维沃移动通信有限公司 The processing method and mobile terminal of a kind of voice messaging
US20180061404A1 (en) * 2016-09-01 2018-03-01 Amazon Technologies, Inc. Indicator for voice-based communications
CN109979474A (en) * 2019-03-01 2019-07-05 珠海格力电器股份有限公司 Speech ciphering equipment and its user speed modification method, device and storage medium
CN110086941A (en) * 2019-04-30 2019-08-02 Oppo广东移动通信有限公司 Speech playing method, device and terminal device
CN110177298A (en) * 2019-05-27 2019-08-27 湖南快乐阳光互动娱乐传媒有限公司 A kind of voice-based video speed playback method and system
CN110365574A (en) * 2019-05-24 2019-10-22 珠海格力电器股份有限公司 A kind of playback method of voice messaging, device and storage medium
CN110830654A (en) * 2019-11-07 2020-02-21 腾讯科技(深圳)有限公司 Method and equipment for playing voice message
CN111726693A (en) * 2020-06-02 2020-09-29 广州视源电子科技股份有限公司 Audio and video playing method, device, equipment and medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
51准答: "QQ语音消息怎么调倍速", 《百度经验》 *
舒永明: "音视频变速播放算法及其在IPTV中应用的研究", 《中国优秀硕士论文电子期刊网》 *

Also Published As

Publication number Publication date
CN112511407B (en) 2022-04-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant