WO2014079382A1 - Voice transmission method, terminal, voice server, and voice transmission system - Google Patents

Voice transmission method, terminal, voice server, and voice transmission system

Info

Publication number
WO2014079382A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
voice data
data segment
transmission
audio
Prior art date
Application number
PCT/CN2013/087653
Other languages
English (en)
French (fr)
Inventor
文孝木
王永鑫
尹凡
Original Assignee
腾讯科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2014079382A1
Priority to US14/719,144 (US9832621B2)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/16 Communication-related supplementary services, e.g. call-transfer or call-hold
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/04 Real-time or near real-time messaging, e.g. instant messaging [IM]
    • H04L51/046 Interoperability with other network applications or services
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066 Session management
    • H04L65/1069 Session establishment or de-establishment
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/253 Telephone sets using digital voice transmission
    • H04M1/2535 Telephone sets using digital voice transmission adapted for voice communication over an Internet Protocol [IP] network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M1/7243 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
    • H04M1/72433 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for voice messaging, e.g. dictaphones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/12 Messaging; Mailboxes; Announcements

Definitions

  • The embodiments of the present invention relate to communication technologies, and in particular to a voice transmission method, a terminal, a voice server, and a voice transmission system.

Background

  • Instant messaging is a communication technology developed for the Internet and mobile communication networks. It supports communication by video, text, short message, and voice, and has been widely welcomed by users.
  • The voice intercom function is an important form of voice communication in instant messaging.
  • With it, users can chat by voice in real time, much as they would by short message, which effectively satisfies their need for real-time communication. It is widely used in instant messaging on mobile terminals such as mobile phones.
  • In existing voice intercom technology, voice data is transmitted as an attachment, that is, as a complete file.
  • For example, voice intercom between user B1, who holds mobile terminal A1, and user B2, who holds mobile terminal A2, proceeds as follows. When user B1 sends voice to user B2, mobile terminal A1 detects that user B1 has pressed the voice function button.
  • First, the voice uttered by user B1 while the button is held down is collected, and collection ends when the button is detected to be released. Second, the collected voice is encoded and then compressed to obtain a voice file. Third, the voice file is sent to voice server C. After receiving the voice file, voice server C forwards it to mobile terminal A2, which decompresses and then decodes the received voice file to obtain the voice and plays it to user B2. When user B2 sends voice to user B1, the same processing is used, so that voice intercom is realized between the two mobile terminals.
  • In this approach, the entire voice file is obtained only after recording, encoding, and compression have all finished, and only then is the whole file sent. This lengthens the voice data transmission time, resulting in low transmission efficiency.
  • The real-time performance of voice transmission is therefore poor.
  • In particular, in a wireless network environment such as mobile communication, voice transmission often fails because the wireless network is unstable, and the entire voice file must be retransmitted after a failure. Retransmitting the voice file consumes considerable network resources, which further reduces voice transmission efficiency and real-time performance.
  • Voice intercom technology, however, requires voice to be delivered in real time.
  • Embodiments of the present invention provide a voice transmission method, a terminal, a voice server, and a voice transmission system that overcome the low transmission efficiency and poor real-time performance of attachment-mode voice transmission in existing voice intercom technology.
  • An embodiment of the invention provides a voice transmission method, including:
  • collecting voice audio;
  • processing the collected voice audio during voice audio collection; and
  • during voice audio processing, when the length of the processed voice data reaches a preset data length, sending the voice data as a voice data segment.
  • An embodiment of the invention further provides a voice transmission method, including:
  • receiving a voice data segment sent by the voice sending terminal using the voice transmission method provided by the foregoing embodiment; and
  • forwarding the received voice data segment to the voice receiving terminal in real time.
  • An embodiment of the invention further provides a voice transmission method, including:
  • receiving a voice data segment, where the voice data segment is one sent by the voice sending terminal using the voice transmission method provided by the foregoing embodiment, or one sent by the voice sending terminal and forwarded by the voice server using the voice transmission method provided by the foregoing embodiment;
  • combining the obtained voice data segments according to the order of the voice data segments in voice audio processing to obtain a voice data file; and
  • parsing the voice data file to obtain voice audio.
  • An embodiment of the invention further provides a voice transmission terminal, including:
  • a voice audio collection module configured to collect voice audio;
  • a voice audio processing module configured to process the collected voice audio during voice audio collection; and
  • a voice sending module configured to send the voice data as a voice data segment when, during voice audio processing, the length of the processed voice data reaches a preset data length.
  • An embodiment of the invention further provides a voice server, including:
  • a voice data receiving module configured to receive the voice data segment sent by the voice transmission terminal provided by the foregoing embodiment; and
  • a voice data forwarding module configured to forward the received voice data segment to the voice receiving terminal in real time.
  • An embodiment of the invention further provides a voice transmission terminal, including:
  • a receiving module configured to receive a voice data segment, where the voice data segment is one sent by the voice transmission terminal provided by the foregoing embodiment, or one forwarded by the voice server provided by the foregoing embodiment;
  • a combination module configured to combine the obtained voice data segments according to the order of the voice data segments in voice audio processing to obtain a voice data file; and
  • a parsing module configured to parse the voice data file to obtain voice audio.
  • An embodiment of the present invention further provides a voice transmission system, including a mobile terminal and a voice server, where the mobile terminal is the voice transmission terminal provided by the foregoing embodiment and the voice server is the voice server provided by the foregoing embodiment.
  • An embodiment of the invention further provides a voice transmission terminal, including:
  • one or more processors; and
  • a memory storing one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for:
  • collecting voice audio;
  • processing the collected voice audio during voice audio collection; and
  • during voice audio processing, when the length of the processed voice data reaches a preset data length, sending the voice data as a voice data segment.
  • An embodiment of the invention further provides a voice server, including:
  • one or more processors; and
  • a memory storing one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for:
  • receiving the voice data segment sent by the voice transmission terminal according to the foregoing embodiment; and
  • forwarding the received voice data segment to the voice receiving terminal in real time.
  • An embodiment of the invention further provides a voice transmission terminal, including:
  • one or more processors; and
  • a memory storing one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for:
  • receiving a voice data segment, where the voice data segment is one sent by the voice sending terminal, or one sent by the voice sending terminal and forwarded by the voice server of the foregoing embodiment;
  • combining the obtained voice data segments according to the order of the voice data segments in voice audio processing to obtain a voice data file; and
  • parsing the voice data file to obtain voice audio.
  • The voice transmission method, terminal, voice server, and voice transmission system provided by the embodiments of the present invention process voice audio in real time while it is being collected, and transmit the processed voice data in real time as voice data segments of a preset data length.
  • Voice processing and transmission thus take place during voice audio collection, so that collection, processing, and transmission proceed in parallel, which improves both the efficiency and the real-time performance of voice transmission.
  • In addition, because the voice is transmitted as separate voice data segments, when a network fault occurs (for example, the wireless communication network is unstable and data transmission fails), only the voice data segments that failed need to be retransmitted.
  • This avoids the heavy network resource consumption of retransmitting the entire voice file, as existing approaches require, and the resulting low transmission efficiency and poor real-time performance.
  • FIG. 1 is a schematic flowchart of a voice transmission method according to an embodiment of the present invention.
  • FIG. 2 is a schematic flowchart of a voice transmission method according to another embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of voice retransmission in a voice transmission method according to another embodiment of the present invention.
  • FIG. 4 is a schematic flowchart of a voice transmission method according to another embodiment of the present invention.
  • FIG. 5 is a schematic flowchart of a voice transmission method according to another embodiment of the present invention.
  • FIG. 6 is a schematic flowchart of a voice transmission method according to another embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of a voice transmission terminal according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a voice transmission terminal according to another embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a voice transmission terminal according to another embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of a voice server according to an embodiment of the present invention.
  • FIG. 11 is a schematic structural diagram of a voice transmission terminal according to an embodiment of the present invention.
  • FIG. 12 is a schematic structural diagram of a voice transmission system according to another embodiment of the present invention.
  • FIG. 13 is a schematic structural diagram of a voice transmission terminal according to an embodiment of the present invention.
  • FIG. 14 is a schematic structural diagram of a voice server according to another embodiment of the present invention.

Detailed description

  • FIG. 1 is a schematic flowchart of a voice transmission method according to an embodiment of the present invention.
  • The voice transmission method of this embodiment is applied to instant messaging and can be used to transmit voice data during voice intercom.
  • The voice sending terminal can process the voice uttered by user A according to the method of this embodiment.
  • The method of this embodiment may include the following steps:
  • Step 101: The voice sending terminal collects voice audio;
  • Step 102: The voice sending terminal processes the collected voice audio during voice audio collection;
  • Step 103: During voice audio processing, when the length of the processed voice data reaches a preset data length, the voice data is sent as a voice data segment.
  • When the voice sending terminal detects that user A has pressed the voice intercom function button, it starts collecting voice audio. While collection is in progress, it processes the collected voice audio and sends the processed voice data to the voice server in the network in real time, as voice data segments of the preset data length, until voice audio collection ends.
  • Meanwhile, the voice server forwards each received voice data segment in real time to the voice receiving terminal held by user B, which processes the voice data and presents it to user B. Voice transmission in voice intercom is thereby realized.
  • When user B sends voice data to user A, the same voice transmission process applies.
  • In this embodiment, processing the voice data into multiple voice data segments amounts to dividing one large block of voice data into smaller voice data blocks.
  • When the voice is sent, it is sent in units of these smaller voice data blocks.
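
The segmentation described above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function name `segment_stream`, the constant `PRESET_LEN`, and the idea that processed audio arrives as an iterable of byte chunks are all assumptions made for illustration.

```python
from typing import Iterable, Iterator

PRESET_LEN = 1500  # assumed preset data length, in bytes


def segment_stream(chunks: Iterable[bytes], preset_len: int = PRESET_LEN) -> Iterator[bytes]:
    """Yield a voice data segment each time the buffered data reaches preset_len."""
    buf = bytearray()
    for chunk in chunks:  # chunks arrive while collection is still in progress
        buf.extend(chunk)
        while len(buf) >= preset_len:
            yield bytes(buf[:preset_len])  # this segment can be sent immediately
            del buf[:preset_len]
    if buf:  # remainder left over when collection ends
        yield bytes(buf)


# Hypothetical processed audio: three chunks totalling 3200 bytes
segments = list(segment_stream([b"a" * 1000, b"b" * 1000, b"c" * 1200]))
```

Because the function is a generator, each segment becomes available to the sender as soon as enough data has accumulated, rather than after the whole recording is finished.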
  • The voice sending terminal and the voice receiving terminal may each be a mobile terminal based on a mobile communication network, such as a mobile phone, or a mobile terminal based on another existing wireless network such as a Wi-Fi network, for example a tablet computer.
  • The embodiments of the present invention place no particular limitation on this: any terminal capable of instant messaging is a terminal as described in this embodiment.
  • The voice transmission method provided by this embodiment processes voice audio in real time while it is being collected and transmits the processed voice data in real time as voice data segments of a preset data length.
  • Voice processing and transmission thus take place during voice audio collection, so that collection, processing, and transmission proceed in parallel, which improves both the efficiency and the real-time performance of voice transmission. In addition, the voice is transmitted as separate voice data segments.
  • When the network is faulty (for example, the wireless communication network is unstable and data transmission fails), only the voice data segment that failed needs to be retransmitted.
  • This avoids the heavy network resource consumption of retransmitting the entire voice file, and the resulting low transmission efficiency and poor real-time performance.
  • FIG. 2 is a schematic flowchart of a voice transmission method according to another embodiment of the present invention.
  • In this embodiment, the voice sending terminal may add a logical identifier to each processed voice data segment, so that the voice receiving terminal that receives the voice data segments can reassemble them according to the logical identifiers.
  • Specifically, as shown in FIG. 2, the method of this embodiment may include the following steps:
  • Step 201: User A presses the voice intercom function button on the voice sending terminal to instruct it to start sending voice to the voice receiving terminal held by user B;
  • Step 202: After detecting that user A has pressed the button, the voice sending terminal immediately starts recording and collects the voice audio uttered by user A until user A releases the button to stop sending voice;
  • Step 203: During voice audio collection, encode the collected voice audio and compress the encoded data;
  • Step 204: While the voice data is being processed in step 203, determine whether the length of the compressed voice data has reached the preset data length; if yes, perform step 205; otherwise, continue with step 203;
  • Step 205: Determine whether voice audio collection has ended; if yes, perform step 206; otherwise, perform step 203 and step 208;
  • Step 206: Determine whether all of the collected voice audio has been processed; if yes, perform step 207; otherwise, perform step 203 and step 208;
  • Step 207: Add a voice end identifier to the last voice data segment obtained from processing, and perform step 209;
  • Step 208: Add a logical identifier to the voice data segment obtained from processing, where the logical identifier indicates the processing order of the voice data segment;
  • Step 209: Send the voice data segment to the voice server in the network in real time.
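
Steps 207 and 208 can be sketched in Python as follows. This is only an illustration under stated assumptions: the `VoiceSegment` type, its `seq` and `is_last` fields, and the choice to give the final segment a sequence number as well as the end flag are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class VoiceSegment:
    seq: int        # logical identifier: processing order (hypothetical field)
    is_last: bool   # voice end identifier (hypothetical field)
    payload: bytes


def tag_segments(payloads: Iterable[bytes]) -> Iterator[VoiceSegment]:
    """Attach a sequence number to each segment and mark the final one."""
    it = iter(payloads)
    try:
        current = next(it)
    except StopIteration:
        return  # nothing was recorded, so nothing is sent
    seq = 0
    for nxt in it:
        # A later segment exists, so `current` is not the last one.
        yield VoiceSegment(seq, False, current)
        current, seq = nxt, seq + 1
    yield VoiceSegment(seq, True, current)


tagged = list(tag_segments([b"seg0", b"seg1", b"seg2"]))
```

Looking one segment ahead lets the sketch mark the last segment without knowing the total count in advance, which matches a recording whose length is unknown until the user releases the button.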
  • Besides pressing the voice intercom function button on the voice sending terminal, the user may also initiate voice intercom by other means, such as a voice command to the voice sending terminal.
  • This embodiment places no particular limitation on the instruction used to initiate voice sending.
  • The collected voice audio may be buffered in real time until a voice sending end instruction is detected, that is, until the user stops pressing the voice intercom function button.
  • Step 203 is performed while the voice audio collection of step 202 is in progress: the voice audio collected in step 202 is encoded and compressed at the same time, that is, steps 202 and 203 are performed synchronously.
  • Encoding converts the collected voice audio into a digital signal suitable for network transmission; compressing the encoded data reduces the size of the voice data in network transmission and thereby increases the voice transmission rate.
  • The specific encoding and compression processes are the same as or similar to conventional techniques and are not described here.
  • The length of the data being compressed in step 203 may be monitored, so that when the data length reaches the preset data length, the compressed data is taken as one voice data segment.
  • The preset data length can be set to a suitable size according to the needs of network transmission. For example, when voice data is transmitted over TCP/IP, the preset data length can be set to 1500 bytes to fit the packet length limit of the underlying Media Access Control (MAC) protocol. This avoids having the lower layer re-segment data longer than 1500 bytes, reduces the work done by the underlying protocol, and improves data transmission efficiency.
  • Step 204 and step 203 are also performed synchronously.
  • When voice audio collection has ended and all of the collected voice audio has been processed, a voice end identifier is added to the voice data segment to indicate the end of the voice, so that the voice server and the voice receiving terminal can determine that the voice has ended.
  • Alternatively, a voice end command may be sent to the voice server and the voice receiving terminal to announce the end of the voice; the embodiments of the present invention place no particular limitation on this.
  • The end of voice audio collection means that the voice sending terminal stops collecting voice when it receives the user's voice sending end instruction. In this embodiment, collection stops when the terminal detects that the user is no longer pressing the voice intercom function button. At that point, the voice the user wants to send has ended.
  • A logical identifier, for example a processing sequence number, is added to each voice data segment obtained from processing to indicate its processing order, so that the voice receiving terminal can reassemble the voice data according to the sequence numbers.
  • The voice server can also determine from the logical identifiers of the received voice data segments whether any voice data segment has been lost or garbled.
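
As a hedged illustration of how a server might detect lost segments from the logical identifiers, the sketch below assumes the identifiers are consecutive integers starting at 0 and that the identifier of the final segment is known (for example, from the voice end identifier). The function name `missing_segments` is hypothetical.

```python
from typing import Iterable, List


def missing_segments(received_ids: Iterable[int], last_id: int) -> List[int]:
    """Return the identifiers in 0..last_id that were never received, in ascending order."""
    seen = set(received_ids)
    return [i for i in range(last_id + 1) if i not in seen]


# Segments 2 and 4 never arrived at the server
lost = missing_segments([0, 1, 3, 5], last_id=5)
```

The resulting list is the kind of information a retransmission identifier could carry back to the sender.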
  • Each voice data segment produced in step 203 can be sent to the voice server in real time. After receiving a voice data segment, the voice server can forward it to the voice receiving terminal in real time, so that the voice receiving terminal processes the received voice data segments
  • and plays the resulting voice to the receiving user. The specific processing is described later.
  • TCP: Transmission Control Protocol
  • UDP: User Datagram Protocol
  • When TCP is used, the logical identifier need not be added to the voice data segments; instead, TCP's own ordering control ensures that the voice data segments arrive in order.
  • In this embodiment, the voice data is processed into multiple slices that are transmitted separately, so the voice does not have to be sent as one entire file.
  • This makes voice data transmission more efficient and more timely, and meets the real-time requirements of instant messaging.
  • FIG. 3 is a schematic flowchart of voice retransmission in a voice transmission method according to another embodiment of the present invention.
  • the voice transmitting terminal can ensure that each voice data segment is reliably transmitted to the voice server, and can also retransmit the voice data segment that fails to be transmitted.
  • the method of this embodiment may further include the following steps:
  • Step 301: The voice sending terminal receives transmission feedback information returned by the voice server, where the transmission feedback information includes a retransmission identifier indicating the voice data segment that needs to be retransmitted;
  • Step 302: According to the retransmission identifier, resend the voice data segment that needs to be retransmitted.
  • If, while the voice sending terminal is transmitting voice data segments, the network (for example, the mobile communication network) fails or becomes unstable so that the voice server does not receive a voice data segment and the segment is lost, the voice server can return transmission feedback information to the voice sending end.
  • The feedback information indicates which voice data segments the sending end needs to retransmit, so that the sending end retransmits only those segments.
  • After being sent, each voice data segment is stored temporarily so that it can be retransmitted if its transmission fails, until the voice server reports that the voice was transmitted successfully.
  • The sent voice data segments may also be stored for a set period of time; this embodiment places no particular limitation on this.
  • In some cases the voice sending terminal may resend all of the voice data segments.
  • The transmission failure rate of voice data can be high. If the voice is transmitted as an entire voice file and transmission fails partway through, the entire voice file must be retransmitted.
  • In this embodiment, by contrast, the voice is transmitted one voice data segment at a time, so even if a voice data segment fails during transmission, only that voice data segment needs to be retransmitted. This reduces the network resources occupied by retransmission and improves voice transmission efficiency.
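
The retransmission behaviour of steps 301 and 302 can be sketched as a small sender-side buffer. This is an assumption-laden illustration: the class `SendBuffer`, its method names, and the `send(seq, payload)` transport callback are hypothetical, and the retransmission identifier is modelled simply as a list of sequence numbers.

```python
from typing import Callable, Dict, Iterable


class SendBuffer:
    """Keeps sent segments until the server confirms success, so any of them can be resent."""

    def __init__(self, send: Callable[[int, bytes], None]) -> None:
        self._send = send                     # hypothetical transport callback
        self._pending: Dict[int, bytes] = {}  # seq -> payload, stored temporarily

    def send_segment(self, seq: int, payload: bytes) -> None:
        self._pending[seq] = payload
        self._send(seq, payload)

    def on_feedback(self, retransmit_ids: Iterable[int]) -> None:
        """Resend only the segments named by the retransmission identifier."""
        for seq in retransmit_ids:
            if seq in self._pending:
                self._send(seq, self._pending[seq])

    def on_success(self) -> None:
        """The server confirmed the whole voice; stored segments are no longer needed."""
        self._pending.clear()


sent = []  # records the order in which sequence numbers go out on the wire
buffer = SendBuffer(lambda seq, payload: sent.append(seq))
for i, p in enumerate([b"a", b"b", b"c"]):
    buffer.send_segment(i, p)
buffer.on_feedback([1])  # the server reports that segment 1 was lost
```

Only segment 1 is sent a second time; segments 0 and 2 are not touched, which is the saving over resending a whole voice file.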
  • FIG. 4 is a schematic flowchart of a voice transmission method according to another embodiment of the present invention.
  • In this embodiment, user A can be shown a sending-success prompt, which improves the user experience of voice communication in instant messaging.
  • Specifically, as shown in FIG. 4, the method of this embodiment may include the following steps:
  • Step 401: User A presses the voice intercom function button on the voice sending terminal to instruct the voice sending terminal to start sending voice to the voice receiving terminal;
  • Step 402: After detecting that user A has pressed the button, the voice sending terminal immediately starts recording and collects the voice audio uttered by user A;
  • Step 403: During voice audio collection, encode and compress the collected voice audio; during encoding and compression, whenever the length of the voice data reaches the preset data length, send the voice data to the voice server in real time as a voice data segment;
  • Step 404: Determine whether voice audio collection has ended; if yes, perform step 405; otherwise, perform step 403;
  • Step 405: Determine whether all of the collected voice audio has been processed; if yes, perform step 406; otherwise, perform step 403;
  • Step 406: Determine whether all voice data segments have been sent; if yes, perform step 407; otherwise, perform step 403;
  • Step 407: Detect whether the network connection of the voice sending terminal is normal; if yes, perform step 409; otherwise, perform step 408;
  • Step 408: The voice sending terminal shows the user a prompt that sending is in progress, and continues to perform the step.
  • Step 409: The voice sending terminal shows the user a sending-success prompt;
  • Step 410: Determine whether transmission success information returned by the voice server is received within a preset time period; if yes, end the transmission of the entire voice; otherwise, perform step 411;
  • Step 411: Resend all of the voice data segments.
  • In this embodiment, after voice audio collection has ended and the data has been sent, the voice sending terminal shows the user a sending-success prompt as soon as it detects that its network connection is normal. This better reflects the real-time nature of instant messaging and improves the user's instant messaging experience.
  • In step 408, when the network connection is detected to be abnormal, the data may not yet have been successfully sent to the voice server and the voice receiving terminal. The user may therefore be shown a prompt that the voice is being sent.
  • A time limit can also be set, for example 1 minute. If the network connection is still detected to be abnormal throughout this period, the user may be shown a prompt such as "sending failed".
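
The prompt logic of steps 407 to 409, together with the optional time limit, can be summarised in a small Python sketch. The function name `choose_prompt`, the prompt strings, and the way elapsed time is passed in are assumptions for illustration only.

```python
def choose_prompt(network_ok: bool, seconds_abnormal: float, timeout: float = 60.0) -> str:
    """Pick the prompt shown once all voice data segments have been handed to the network.

    network_ok:       whether the terminal's network connection is currently normal
    seconds_abnormal: how long the connection has been continuously abnormal
    timeout:          assumed time limit (the text gives 1 minute as an example)
    """
    if network_ok:
        return "sent"      # step 409: sending-success prompt
    if seconds_abnormal >= timeout:
        return "failed"    # still abnormal after the time limit
    return "sending"       # step 408: sending in progress


prompt_now = choose_prompt(network_ok=False, seconds_abnormal=5.0)
```

Note that the success prompt here depends only on the local network state; confirmation from the voice server is handled separately in steps 410 and 411.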
  • In this embodiment, the terminal checks whether the voice server has returned transmission success information, which ensures that the voice data is reliably delivered to the voice server and effectively improves the reliability of voice data transmission.
  • Conventionally, the voice sending terminal shows the user a sending-success prompt only after receiving the voice server's confirmation that transmission succeeded; otherwise it keeps waiting. In voice intercom, the voice sending terminal sends voice data to the voice server over a wireless network such as a mobile communication network. A wireless network environment is far more complex than a wired one, and the uplink and downlink bandwidths of the voice sending terminal are severely asymmetric. In a wireless environment with a low signal-to-noise ratio, a certain proportion of the signaling data between the voice sending terminal and the voice server is lost.
  • In that case, even though the voice data itself has been transmitted successfully, the acknowledgment message from the voice server is delayed, so the success prompt on the voice sending terminal is delayed as well, which seriously affects the user experience of the voice intercom service.
  • the present embodiment optimizes the prompting process for successful voice transmission, which can effectively improve the user experience when using instant messaging.
  • FIG. 5 is a schematic flowchart of a voice transmission method according to Embodiment 5 of the present invention.
  • In this embodiment, the voice server can receive in real time the voice data segments sent by the voice sending terminal in the foregoing method embodiments of the present invention, and can forward them to the voice receiving end in real time.
  • Specifically, the method of this embodiment may include the following steps:
  • Step 501 The voice server receives the voice data segment sent by the voice sending terminal.
  • Step 502 The voice server forwards the received voice data segment to the voice receiving terminal in real time.
  • the voice server can receive the voice data segment sent by the voice transmitting terminal described above in FIG. 1 to FIG. 4 in real time, and can forward the voice data terminal to the voice receiving terminal in real time to improve the voice data transmission efficiency.
  • the voice server may return the transmission feedback information to the voice sending terminal when the voice data segment fails to receive the voice data segment, and the transmission feedback information may include a retransmission identifier, which is used to indicate that the retransmission is needed.
  • the voice data segment can be retransmitted according to the retransmission identifier.
  • the specific processing procedure can be referred to the description in the method shown in FIG. 3 above.
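One way a server might decide which retransmission identifiers to return is sketched below. It assumes, purely for illustration, that each segment carries an integer sequence number starting at 0 as its logical identifier; the function name `build_feedback` and the dictionary layout of the feedback are hypothetical and not taken from the patent.

```python
def build_feedback(received_seqs, end_seq=None):
    """Build transmission feedback listing the segments to retransmit.

    received_seqs: sequence numbers of the segments received so far.
    end_seq: sequence number of the segment carrying the voice end
             identifier, if that segment's number is known.
    """
    seen = set(received_seqs)
    if end_seq is not None:
        upper = end_seq
    elif seen:
        # Without an end marker, only gaps below the highest number seen
        # can be detected.
        upper = max(seen)
    else:
        return {"retransmit": []}
    missing = [seq for seq in range(upper + 1) if seq not in seen]
    return {"retransmit": missing}
```

For example, after receiving segments 0, 1 and 3 the sketch reports segment 2 as missing, so only that segment would need to be resent rather than the whole voice file.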
  • FIG. 6 is a schematic flowchart diagram of a voice transmission method according to Embodiment 6 of the present invention.
  • In this embodiment, the voice receiving terminal can receive, in real time, the voice data segments forwarded by the voice server in the method of the embodiment shown in FIG. 5. Specifically, the method of this embodiment may include the following steps:
  • Step 601: The voice receiving terminal receives the voice data segment.
  • Step 602: The voice receiving terminal combines the obtained voice data segments in the order in which they were produced during voice audio processing to obtain a voice data file.
  • Step 603: The voice receiving terminal parses the voice data file to obtain voice audio.
  • In this embodiment, the voice receiving terminal can receive, in real time, the voice data segments that were sent by the voice transmitting terminal described with reference to FIG. 1 to FIG. 4 and forwarded by the voice server, and can combine the received voice data segments to obtain a complete voice file. The voice file can then be parsed and the corresponding voice audio played to the user.
  • Specifically, the voice receiving terminal may combine the received voice data segments in their processing order, according to the logical identifiers carried in the voice data segments, to obtain the voice data file.
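The combining step on the receiving side can be illustrated with a short sketch. The `VoiceSegment` type, its field names, and the convention that sequence numbers start at 0 are assumptions made for the example; the patent only requires that a logical identifier convey processing order and that the last segment carry a voice end identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceSegment:
    seq: int               # logical identifier (processing order); hypothetical field
    payload: bytes         # encoded and compressed voice data
    is_last: bool = False  # voice end identifier


def combine_segments(segments):
    """Return the voice data file as bytes, or None if it is incomplete."""
    # Index by sequence number so retransmitted duplicates collapse.
    by_seq = {s.seq: s for s in segments}
    end_seqs = [s.seq for s in by_seq.values() if s.is_last]
    if not end_seqs:
        return None  # segment with the end identifier not received yet
    end = end_seqs[0]
    if any(seq not in by_seq for seq in range(end + 1)):
        return None  # a gap remains; wait for retransmission
    return b"".join(by_seq[seq].payload for seq in range(end + 1))
```

Because segments are ordered by their identifier rather than by arrival, the sketch tolerates out-of-order delivery, which matters if a transport such as UDP is used.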
  • The foregoing embodiments use forwarding of the voice data by the voice server as an example. Voice can also be transmitted in the foregoing manner without such forwarding, and the embodiments of the present invention impose no particular limitation on this. For example, when two voice terminals are in the same communication network and perform voice intercom directly, one mobile terminal can send the intercom voice directly to the other mobile terminal using the voice collection, processing and sending methods described above.
  • the aforementioned program can be stored in a computer readable storage medium.
  • the program when executed, performs the steps including the foregoing method embodiments; and the foregoing storage medium includes: a medium that can store program codes, such as a ROM, a RAM, a magnetic disk, or an optical disk.
  • FIG. 7 is a schematic structural diagram of a voice transmission terminal according to an embodiment of the present invention.
  • the voice transmission terminal in this embodiment may be the voice transmission terminal described in the foregoing method embodiment of the present invention, to perform voice transmission.
  • The voice transmission terminal in this embodiment includes a voice audio collection module 11, a voice audio processing module 12 and a voice sending module 13, wherein:
  • the voice audio collection module 11 is configured to collect voice audio;
  • the voice audio processing module 12 is configured to process the collected voice audio during voice audio collection;
  • the voice sending module 13 is configured to send the voice data as a voice data segment when, during voice audio processing, the length of the processed voice data reaches a preset data length.
  • The voice transmission terminal in this embodiment may perform voice transmission based on the method embodiment shown in FIG. 1, FIG. 2, FIG. 3 or FIG. 4. For the specific implementation, refer to the description of the foregoing method embodiments of the present invention; details are not described herein again.
  • FIG. 8 is a schematic structural diagram of a voice transmission terminal according to another embodiment of the present invention.
  • The voice transmission terminal of this embodiment may further include an identifier adding module 14 and a voice end identifier adding module 15. The identifier adding module 14 may be used to add a logical identifier to each voice data segment sent by the voice sending module 13, the logical identifier indicating the processing order of the voice data segment in the voice audio processing process. The voice end identifier adding module 15 may be used to add a voice end identifier to the last voice data segment obtained by processing after voice audio collection is completed.
  • The voice transmission terminal of this embodiment may further include a feedback information receiving module 16 and a feedback retransmission module 17. The feedback information receiving module 16 may be configured to receive transmission feedback information returned by the voice server, where the feedback information includes a retransmission identifier indicating a voice data segment that needs to be retransmitted. The feedback retransmission module 17 is configured to resend the voice data segment that needs to be retransmitted according to the retransmission identifier.
  • The voice transmission terminal of this embodiment may further include a voice audio retransmission module 18. After voice audio collection is completed and every voice data segment obtained by processing has been sent, if no transmission success message returned by the voice server is received within a preset time period, the voice audio retransmission module 18 may be used to resend all the voice data segments produced in the voice audio processing process.
  • The voice transmission terminal in this embodiment can implement voice transmission according to the method embodiment shown in FIG. 2 or FIG. 3 of the present invention. For the specific implementation, refer to the description of the foregoing method embodiments of the present invention; details are not described herein again.
  • FIG. 9 is a schematic structural diagram of a voice transmission terminal according to an embodiment of the present invention.
  • On the basis of the foregoing, this embodiment may further include a sending success prompting module 19, configured to provide the user with a sending-success prompt when the network connection is detected to be normal after voice audio collection ends.
  • the voice transmission terminal in this embodiment can implement the voice transmission based on the method embodiment shown in FIG. 4, and the specific implementation can refer to the description of the foregoing method embodiment of the present invention, and details are not described herein again.
  • FIG. 10 is a schematic structural diagram of a voice server according to an embodiment of the present invention.
  • The voice server of this embodiment includes a voice data receiving module 21 and a voice data forwarding module 22, wherein the voice data receiving module 21 is configured to receive a voice data segment sent by the voice transmitting terminal, and the voice data forwarding module 22 is configured to forward the received voice data segment to the voice receiving terminal in real time.
  • The voice server of this embodiment may further include a feedback module 23, configured to return transmission feedback information when reception of a voice data segment fails. The transmission feedback information includes a retransmission identifier indicating the voice data segment that needs to be retransmitted, so that the voice transmitting terminal resends that voice data segment.
  • The voice server of this embodiment can process the voice data segments sent by the voice transmission terminal shown in FIG. 7, FIG. 8 or FIG. 9 according to the method embodiment shown in FIG. 5. For the specific implementation, refer to the description of the foregoing method embodiments of the present invention; details are not described herein again.
  • FIG. 11 is a schematic structural diagram of a voice transmission terminal according to an embodiment of the present invention.
  • The voice transmission terminal can be used as a voice receiving terminal and receive the voice data segments sent by the voice server or by the voice transmitting terminal. The voice transmission terminal in this embodiment can include a receiving module 31, a combining module 32 and a parsing module 33, wherein:
  • the receiving module 31 is configured to receive voice data segments;
  • the combining module 32 is configured to combine the obtained voice data segments in the order in which they were produced during voice audio processing to obtain a voice data file;
  • the parsing module 33 is configured to parse the voice data file to obtain voice audio.
  • Each voice data segment carries a logical identifier indicating the processing order of that voice data segment, and the combining module 32 is specifically configured to combine the voice data segments in their processing order, according to the logical identifier carried in each voice data segment, to obtain the voice data file.
  • The last voice data segment sent by the voice transmitting terminal carries a voice end identifier, and the combining module 32 is specifically configured to combine the received voice data segments into the voice data file after the voice data segment carrying the voice end identifier has been received.
  • The voice transmission terminal in this embodiment can be used as a voice receiving terminal that processes the received voice data segments according to the foregoing method embodiments of the present invention. For the specific implementation, refer to the description of the method embodiments of the present invention; details are not described herein again.
  • FIG. 12 is a schematic structural diagram of a voice transmission system according to an embodiment of the present invention.
  • The system of the present embodiment includes a voice transmitting terminal 10 and a voice receiving terminal 30, both of which are mobile terminals, and a voice server 20. The voice transmitting terminal 10 and the voice receiving terminal 30 both communicate with the voice server 20 through the mobile communication network.
  • The voice transmitting terminal 10 can be the voice transmission terminal shown in FIG. 7, FIG. 8 or FIG. 9; the voice receiving terminal 30 can be the voice transmission terminal shown in FIG. 11; and the voice server 20 can be the voice server shown in FIG. 10. For the specific structure and working process, refer to the description of the device embodiments of the present invention; details are not described herein again.
  • FIG. 13 is a block diagram showing the structure of a voice transmission terminal according to an embodiment of the present invention.
  • the voice transmission terminal is used to implement the voice transmission method provided by the foregoing embodiment.
  • The voice transmission terminal in the embodiment of the present invention may include one or more of the following components: a processor for executing computer program instructions to complete various processes and methods, random access memory (RAM) and read-only memory (ROM) for storing information and program instructions, memory for storing data and information, I/O devices, interfaces, antennas, and the like. Specifically:
  • The voice transmission terminal 300 may include an RF (Radio Frequency) circuit 310, a memory 320, an input unit 330, a display unit 340, a sensor 350, an audio circuit 360, a WiFi (wireless fidelity) module 370, a processor 380, a power supply 382, a camera 390 and other components.
  • the components of the voice transmission terminal 300 will be specifically described below with reference to FIG. 13:
  • The RF circuit 310 can be used for receiving and transmitting signals while information is being sent or received. In particular, it receives downlink information from the base station and passes it to the processor 380 for processing; in addition, it transmits uplink data to the base station.
  • RF circuits include, but are not limited to, an antenna, at least one amplifier, a transceiver, a coupler, an LNA (Low Noise Amplifier), a duplexer, and the like.
  • RF circuitry 310 can also communicate with the network and other devices via wireless communication.
  • The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System of Mobile communication), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), e-mail, SMS (Short Messaging Service), and so on.
  • the memory 320 can be used to store software programs and modules, and the processor 380 executes various functional applications and data processing of the voice transmission terminal 300 by running software programs and modules stored in the memory 320.
  • The memory 320 may mainly include a storage program area and a storage data area. The storage program area may store an operating system, an application required for at least one function (such as a sound playing function or an image playing function), and the like; the storage data area may store data created through use of the terminal 300 (such as audio data or a phone book), and the like.
  • the memory 320 may include a high speed random access memory, and may also include a nonvolatile memory such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
  • the input unit 330 can be configured to receive input digital or character information, and to generate key signal inputs related to user settings and function control of the voice transmission terminal 300.
  • The input unit 330 can include a touch panel 331 and other input devices 332.
  • The touch panel 331, also referred to as a touch screen, can collect the user's touch operations on or near it (such as operations performed by the user on or near the touch panel 331 with a finger, a stylus, or the like) and drive the corresponding connected device according to a preset program.
  • The touch panel 331 can include two parts: a touch detection device and a touch controller.
  • The touch detection device detects the user's touch position, detects the signal produced by the touch operation, and transmits the signal to the touch controller. The touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends them to the processor 380, and can receive and execute commands sent by the processor 380.
  • the touch panel 331 can be implemented in various types such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave.
  • the input unit 330 may also include other input devices 332.
  • other input devices 332 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackballs, mice, joysticks, and the like.
  • the display unit 340 can be used to display information input by the user or information provided to the user and various menus of the voice transmission terminal 300.
  • the display unit 340 may include a display panel 341.
  • the display panel 341 may be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or the like.
  • The touch panel 331 can cover the display panel 341. When the touch panel 331 detects a touch operation on or near it, it passes the operation to the processor 380 to determine the type of the touch event, and the processor 380 then provides a corresponding visual output on the display panel 341 according to the type of the touch event.
  • Although in FIG. 13 the touch panel 331 and the display panel 341 are shown as two independent components that implement the input and output functions of the voice transmission terminal 300, in some embodiments the touch panel 331 and the display panel 341 may be integrated to implement the input and output functions of the voice transmission terminal 300.
  • the voice transmission terminal 300 may also include at least one type of sensor 350, such as a gyro sensor, a magnetic induction sensor, a light sensor, a motion sensor, and other sensors.
  • The light sensor may include an ambient light sensor and a proximity sensor. The ambient light sensor may adjust the brightness of the display panel 341 according to the brightness of the ambient light, and the proximity sensor may turn off the display panel 341 and/or the backlight when the terminal 300 is moved to the ear.
  • the acceleration sensor can detect the magnitude of acceleration in all directions (usually three axes). When it is stationary, it can detect the magnitude and direction of gravity. It can be used to identify the attitude of the terminal (such as horizontal and vertical screen switching, related games).
  • the audio circuit 360, the speaker 361, and the microphone 362 can provide an audio interface between the user and the voice transmission terminal 300.
  • The audio circuit 360 can convert received audio data into an electrical signal and transmit it to the speaker 361, which converts it into a sound signal for output. In the other direction, the microphone 362 converts the collected sound signal into an electrical signal, which the audio circuit 360 receives and converts into audio data. The audio data is then output to the processor 380 for processing and transmitted, for example, to another terminal via the RF circuit 310, or output to the memory 320 for further processing.
  • WiFi is a short-range wireless transmission technology. Through the WiFi module 370, the voice transmission terminal 300 can help users send and receive emails, browse web pages and access streaming media, providing users with wireless broadband Internet access.
  • Although FIG. 13 shows the WiFi module 370, it can be understood that the module is not an essential component of the voice transmission terminal 300 and may be omitted as needed without changing the essence of the invention.
  • The processor 380 is the control center of the voice transmission terminal 300. It connects the various parts of the entire terminal using various interfaces and lines, and performs the various functions of the voice transmission terminal 300 and processes data by running or executing the software programs and/or modules stored in the memory 320 and by calling the data stored in the memory 320, thereby monitoring the terminal as a whole.
  • the processor 380 may include one or more processing units; preferably, the processor 380 may integrate an application processor and a modem processor, where the application processor mainly processes an operating system, a user interface, an application, and the like.
  • the modem processor primarily handles wireless communications. It will be appreciated that the above described modem processor may also not be integrated into the processor 380.
  • The voice transmission terminal 300 further includes a power source 382 (such as a battery) for supplying power to the various components.
  • The power source can be logically connected to the processor 380 through a power management system, so that charging, discharging, and power consumption are managed through the power management system.
  • the camera 390 is generally composed of a lens, an image sensor, an interface, a digital signal processor, a CPU, a display screen, and the like.
  • the lens is fixed above the image sensor, and the focus can be changed by manually adjusting the lens;
  • the image sensor is equivalent to the "film" of the conventional camera, and is the heart of the image captured by the camera;
  • the interface is used to connect the camera to the terminal motherboard by a flat cable, a board-to-board connector, or a spring-type connection, and to send the collected image to the memory 320;
  • the digital signal processor processes the collected image through mathematical operations, converts the collected analog image into a digital image, and sends it through the interface to the memory 320.
  • the voice transmission terminal 300 may further include a Bluetooth module or the like, and details are not described herein again.
  • the voice transmission terminal 300 includes a memory 320 in addition to one or more processors 380.
  • The memory 320 stores one or more programs configured to be executed by the one or more processors 380, the one or more programs including instructions for performing the voice transmission method shown in FIG. 1, FIG. 2, FIG. 3, FIG. 4 or FIG. 6.
  • FIG. 14 is a schematic structural diagram of a voice server according to an embodiment of the present invention.
  • the voice server 400 includes a central processing unit (CPU) 401, a system memory 404 including a random access memory (RAM) 402 and a read only memory (ROM) 403, and a system bus 405 that connects the system memory 404 and the central processing unit 401.
  • The voice server 400 also includes a basic input/output system (I/O system) 406 that facilitates the transfer of information between various devices within the computer, and a mass storage device 407 for storing the operating system 413, applications 414, and other program modules 415.
  • The basic input/output system 406 includes a display 408 for displaying information and an input device 409, such as a mouse or keyboard, for user input of information.
  • the display 408 and input device 409 are both coupled to the central processing unit 401 via an input and output controller 410 coupled to the system bus 405.
  • the basic input/output system 406 can also include an input and output controller 410 for receiving and processing input from a plurality of other devices, such as a keyboard, mouse, or electronic stylus.
  • input output controller 410 also provides output to a display screen, printer, or other type of output device.
  • the mass storage device 407 is coupled to the central processing unit 401 via a mass storage controller (not shown) coupled to the system bus 405.
  • the mass storage device 407 and its associated computer readable medium provide non-volatile storage for the voice server 400. That is, the mass storage device 407 can include a computer readable medium (not shown) such as a hard disk or a CD-ROM drive.
  • the computer readable medium can include computer storage media and communication media.
  • Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
  • Computer storage media includes RAM, ROM, EPROM, EEPROM, flash memory or other solid state storage technologies, CD-ROM, DVD or other optical storage, tape cartridges, magnetic tape, disk storage or other magnetic storage devices.
  • According to various embodiments of the present invention, the voice server 400 may also operate by being connected, through a network such as the Internet, to a remote computer on the network. That is, the voice server 400 can be connected to the network 412 through the network interface unit 411 connected to the system bus 405, or the network interface unit 411 can be used to connect to other types of networks or remote computer systems (not shown).
  • The memory also stores one or more programs, which are configured to be executed by one or more central processing units 401 to perform the voice transmission method provided by the embodiment shown in FIG.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)

Abstract

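Embodiments of the present invention provide a voice transmission method, a terminal, a voice server and a voice transmission system. The voice transmission method includes: collecting voice audio; processing the collected voice audio during voice audio collection; and, during voice audio processing, when the length of the processed voice data reaches a preset data length, sending the voice data out as one voice data segment. The present invention further provides a voice transmission terminal, a voice server and a voice transmission system. The technical solution of the present invention can effectively improve voice transmission efficiency and can effectively meet the real-time requirements that voice intercom in instant messaging places on voice transmission.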

Description

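Specification

Voice transmission method, terminal, voice server and voice transmission system

This application claims priority to Chinese Patent Application No. 201210479379.X, filed with the Chinese Patent Office on November 22, 2012 and entitled "Voice transmission method, terminal, voice server and voice transmission system", which is incorporated herein by reference in its entirety.

Technical Field

Embodiments of the present invention relate to communications technologies, and in particular to a voice transmission method, a terminal, a voice server and a voice transmission system.

Background

Instant messaging is a communication technology that has developed on the basis of the Internet and mobile communication networks. It supports communication by video, text, short message and voice, and is widely popular with users. The voice intercom function is an important voice communication mode within instant messaging: with it, users can chat by voice in real time, much as they would chat by short message. It effectively meets users' need for real-time communication and is widely used for instant messaging on mobile terminals such as mobile phones.
In existing voice intercom technology, voice data is transmitted as an attachment. Specifically, voice intercom between user B1, who holds mobile terminal A1, and user B2, who holds mobile terminal A2, proceeds as follows. When user B1 sends voice to user B2, mobile terminal A1 detects that user B1 has pressed the voice function key, collects the voice that user B1 utters while the key is pressed, and stops collecting when it detects that the key has been released. It then encodes and compresses the collected voice in turn to obtain a voice file, and sends the voice file to voice server C. After receiving the voice file, voice server C forwards it to mobile terminal A2, which decompresses and decodes the received voice file in turn to obtain the voice and plays it to user B2. Likewise, when user B2 sends voice to user B1, the same processing is used. In this way, voice intercom is achieved between the two mobile terminals.
However, in the existing voice transmission process, the entire voice file is sent only after collection, encoding and compression of the voice have all been completed. This makes voice data transmission take a long time, resulting in low transmission efficiency and poor real-time performance. Moreover, when an entire voice file is transmitted over a wireless network such as a mobile communication network, transmission often fails because the wireless network is unstable, and after a failure the entire voice file must be retransmitted. Retransmission therefore consumes substantial network resources and further reduces transmission efficiency and real-time performance, so the real-time requirements of voice intercom cannot be met.

Summary

Embodiments of the present invention provide a voice transmission method, a terminal, a voice server and a voice transmission system that can overcome the low transmission efficiency and poor real-time performance of attachment-based voice transmission in existing voice intercom technology.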
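An embodiment of the present invention provides a voice transmission method, including:
collecting voice audio;
processing the collected voice audio during voice audio collection;
during voice audio processing, when the length of the processed voice data reaches a preset data length, sending the voice data out as one voice data segment.
An embodiment of the present invention further provides a voice transmission method, including:
receiving a voice data segment sent by a voice sending terminal using the voice transmission method provided by the foregoing embodiment;
forwarding the received voice data segment to a voice receiving terminal in real time.
An embodiment of the present invention further provides a voice transmission method, including:
receiving voice data segments, each being a voice data segment sent by a voice sending terminal using the voice transmission method provided by the foregoing embodiment, or a voice data segment sent by a voice sending terminal and forwarded by a voice server using the voice transmission method provided by the foregoing embodiment;
combining the obtained voice data segments in the order in which they were produced during voice audio processing to obtain a voice data file;
parsing the voice data file to obtain voice audio.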
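An embodiment of the present invention further provides a voice transmission terminal, including:
a voice audio collection module, configured to collect voice audio;
a voice audio processing module, configured to process the collected voice audio during voice audio collection;
a voice sending module, configured to send the voice data out as one voice data segment when, during voice audio processing, the length of the processed voice data reaches a preset data length.
An embodiment of the present invention further provides a voice server, including:
a voice data receiving module, configured to receive a voice data segment sent by the voice transmission terminal provided by the foregoing embodiment;
a voice data forwarding module, configured to forward the received voice data segment to a voice receiving terminal in real time.
An embodiment of the present invention further provides a voice transmission terminal, including:
a receiving module, configured to receive voice data segments, each being a voice data segment sent by the voice transmission terminal provided by the foregoing embodiment, or a voice data segment forwarded by the voice server provided by the foregoing embodiment;
a combining module, configured to combine the obtained voice data segments in the order in which they were produced during voice audio processing to obtain a voice data file;
a parsing module, configured to parse the voice data file to obtain voice audio.
An embodiment of the present invention further provides a voice transmission system, including a mobile terminal and a voice server, wherein the mobile terminal is the voice transmission terminal provided by the foregoing embodiment, and the voice server is the voice server provided by the foregoing embodiment.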
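An embodiment of the present invention further provides a voice transmission terminal, including:
one or more processors; and
a memory;
the memory stores one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for performing the following operations:
collecting voice audio;
processing the collected voice audio during voice audio collection;
during voice audio processing, when the length of the processed voice data reaches a preset data length, sending the voice data out as one voice data segment.
An embodiment of the present invention further provides a voice server, including:
one or more processors; and
a memory;
the memory stores one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for performing the following operations:
receiving a voice data segment sent by the voice transmission terminal described in the foregoing embodiment;
forwarding the received voice data segment to a voice receiving terminal in real time.
An embodiment of the present invention further provides a voice transmission terminal, including:
one or more processors; and
a memory;
the memory stores one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for performing the following operations:
receiving voice data segments, each being a voice data segment sent by the voice sending terminal described in the foregoing embodiment, or a voice data segment sent by the voice sending terminal and forwarded by the voice server described in the foregoing embodiment;
combining the obtained voice data segments in the order in which they were produced during voice audio processing to obtain a voice data file;
parsing the voice data file to obtain voice audio.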
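With the voice transmission method, terminal, voice server and voice transmission system provided by the embodiments of the present invention, voice audio can be processed in real time while it is being collected, and the processed voice data can be sent in real time as voice data segments of a preset data length. Voice processing and transmission therefore take place during voice audio collection, so that collection, processing and transmission proceed concurrently, which improves voice transmission efficiency and real-time performance. In addition, because voice is sent as separate voice data segments, when a network fault such as an unstable wireless communication network causes data transmission to fail, only the voice data segments that failed need to be retransmitted. This avoids the heavy network resource consumption, low transmission efficiency and poor real-time performance caused by having to retransmit the entire voice file.

Brief Description of the Drawings

To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings used in the description of the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.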
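FIG. 1 is a schematic flowchart of a voice transmission method according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a voice transmission method according to another embodiment of the present invention;
FIG. 3 is a schematic flowchart of voice retransmission in a voice transmission method according to another embodiment of the present invention;
FIG. 4 is a schematic flowchart of a voice transmission method according to another embodiment of the present invention;
FIG. 5 is a schematic flowchart of a voice transmission method according to another embodiment of the present invention;
FIG. 6 is a schematic flowchart of a voice transmission method according to another embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a voice transmission terminal according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a voice transmission terminal according to another embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a voice transmission terminal according to another embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a voice server according to an embodiment of the present invention;
FIG. 11 is a schematic structural diagram of a voice transmission terminal according to an embodiment of the present invention;
FIG. 12 is a schematic structural diagram of a voice transmission system according to another embodiment of the present invention;
FIG. 13 is a schematic structural diagram of a voice transmission terminal according to an embodiment of the present invention;
FIG. 14 is a schematic structural diagram of a voice server according to another embodiment of the present invention.

Detailed Description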
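To make the objectives, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
FIG. 1 is a schematic flowchart of a voice transmission method according to an embodiment of the present invention. The voice transmission method of this embodiment is applied to instant messaging and can transmit voice data during voice intercom. When user A in a mobile communication network needs to send voice to user B, user A can press the intercom function key on the voice sending terminal that user A holds, and the voice sending terminal then processes the voice uttered by user A according to the method of this embodiment. Specifically, as shown in FIG. 1, the method of this embodiment can include the following steps:
Step 101: The voice sending terminal collects voice audio.
Step 102: During voice audio collection, the voice sending terminal processes the collected voice audio.
Step 103: During voice audio processing, when the length of the processed voice data reaches a preset data length, the voice data is sent out as one voice data segment.
In this embodiment, when the voice sending terminal detects that user A has pressed the voice intercom function key, it can begin collecting voice audio. During collection, it can simultaneously process the collected voice audio and send the processed voice data, as voice data segments of the preset data length, in real time to the voice server in the network, until voice audio collection ends. Meanwhile, the voice server can forward each received voice data segment in real time to the voice receiving terminal held by user B, so that the voice receiving terminal can process the voice data and present it to user B, thereby transmitting the voice in voice intercom. Similarly, when user B sends voice data to user A, the same voice transmission process applies.
A person skilled in the art will understand that, in processing voice data, this embodiment turns the voice audio into a series of voice data segments. In essence, one large block of voice data is divided into smaller blocks, so that voice is sent one small block at a time.
In this embodiment, the voice sending terminal and the voice receiving terminal may be mobile terminals based on a mobile communication network, such as mobile phones, or mobile terminals based on other existing wireless networks such as WiFi networks, for example tablet computers or notebook computers. The embodiments of the present invention impose no particular limitation on this: any terminal capable of instant messaging is a terminal as described in this embodiment.
With the voice transmission method provided by this embodiment of the present invention, voice audio can be processed in real time while it is being collected, and the processed voice data can be sent in real time as voice data segments of a preset data length. Voice processing and transmission therefore take place during voice audio collection, so that collection, processing and transmission proceed concurrently, which improves voice transmission efficiency and real-time performance. In addition, because voice is sent as separate voice data segments, when a network fault such as an unstable wireless communication network causes data transmission to fail, only the voice data segments that failed need to be retransmitted. This avoids the heavy network resource consumption, low transmission efficiency and poor real-time performance caused by having to retransmit the entire voice file.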
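FIG. 2 is a schematic flowchart of a voice transmission method according to another embodiment of the present invention. In this embodiment, when user A sends intercom voice through the voice sending terminal, the voice sending terminal can add a logical identifier to each voice data segment obtained by processing, so that the voice receiving terminal that receives the voice data segments can reassemble them based on the logical identifiers. Specifically, as shown in FIG. 2, the method of this embodiment can include the following steps:
Step 201: User A presses the voice intercom function key on the voice sending terminal to instruct the voice sending terminal to start sending voice to the voice receiving terminal held by user B.
Step 202: After detecting that user A has pressed the key, the voice sending terminal immediately starts recording and collects the voice audio uttered by user A until user A releases the key, indicating that voice sending should stop.
Step 203: During voice audio collection, the collected voice audio is encoded, and the encoded data is compressed.
Step 204: While the voice data is being processed in step 203, determine whether the length of the compressed voice data has reached the preset data length. If so, perform step 205; otherwise, continue with step 203.
Step 205: Determine whether voice audio collection has ended. If so, perform step 206; otherwise, perform step 203 and step 208.
Step 206: Determine whether all the collected voice audio has been processed. If so, perform step 207; otherwise, perform step 203 and step 208.
Step 207: Add a voice end identifier to the last voice data segment obtained by processing, and perform step 209.
Step 208: Add a logical identifier to the voice data segment obtained by processing, the logical identifier indicating the processing order of the voice data segment.
Step 209: Transmit the voice data segment in real time to the voice server in the network.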
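In steps 201 and 202 above, besides pressing the voice intercom function key on the voice sending terminal, the user may in practice also start the intercom and send voice in other ways, such as a voice command. This embodiment places no particular limitation on the command used to initiate voice sending.
In step 202, while capturing voice audio, the voice sending terminal can buffer the captured audio in real time until the end of sending is indicated, that is, until it detects that the user has stopped pressing the voice intercom function key.
Step 203 encodes and compresses the audio captured in step 202 while step 202 is still capturing; that is, steps 202 and 203 are performed concurrently.
Those skilled in the art will understand that encoding converts the captured voice audio into a digital signal suitable for network transmission, and that compressing the encoded data reduces the size of the voice data on the network and so speeds up transmission. The specific encoding and compression procedures are the same as or similar to conventional techniques and are not described further here.
In step 204, the length of the data produced by the compression in step 203 is monitored, and when it reaches the preset data length the compressed data is treated as one voice data segment. The preset data length can be set to a suitable size according to the needs of network transmission. For example, when voice data is transmitted over TCP/IP, the preset data length can be set to 1500 bytes, which fits the packet-length limit of the underlying Media Access Control (MAC) protocol, avoids re-fragmentation and reassembly at the lower layer of data exceeding 1500 bytes, reduces lower-layer protocol operations, and improves transmission efficiency. Steps 204 and 203 are also performed concurrently.
In steps 205 to 208, once voice audio capture has ended and all captured audio has been processed, an end-of-voice identifier can be added to the last voice data segment to mark the end of this voice message, so that the voice server and the voice receiving terminal can determine that the voice has ended. Those skilled in the art will understand that, in practice, an end-of-voice instruction may instead be sent to the voice server and the voice receiving terminal once capture ends, that is, after user A indicates the end of sending. The embodiments of the present invention place no particular limitation on this.
In this embodiment, the end of voice audio capture means that the voice sending terminal stops capturing on receiving the user's instruction to end sending. Here, capture stops when the terminal detects that the user is no longer pressing the voice intercom function key, which indicates that the voice the user wanted to send is complete.
Step 208 adds a logical identifier, for example a processing sequence number, to each processed voice data segment to indicate the order in which the segments were processed. The voice receiving terminal can then reassemble the voice data according to these sequence numbers and obtain the complete voice file. The voice server can also use the logical identifiers of the segments it receives to determine whether any segment has been lost or has arrived out of order.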
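One way to carry the logical identifier of step 208 and the end-of-voice identifier of step 207 is a small fixed header in front of each segment. The application does not specify any wire format, so the layout below (a 4-byte big-endian sequence number followed by a 1-byte end flag) and the names `pack_segment` and `unpack_segment` are purely hypothetical.

```python
import struct
from typing import Tuple

# Hypothetical header layout: 4-byte sequence number, 1-byte end-of-voice flag,
# both in network byte order with no padding.
HEADER = struct.Struct("!IB")


def pack_segment(seq: int, payload: bytes, is_last: bool = False) -> bytes:
    """Prefix a voice data segment with its sequence number and end flag."""
    return HEADER.pack(seq, 1 if is_last else 0) + payload


def unpack_segment(packet: bytes) -> Tuple[int, bool, bytes]:
    """Split a packet back into (sequence number, end flag, payload)."""
    seq, end_flag = HEADER.unpack_from(packet)
    return seq, bool(end_flag), packet[HEADER.size:]


packet = pack_segment(3, b"xyz", is_last=True)
```

A header like this adds 5 bytes per segment, so a sender that wants the whole packet to stay within a given size would need to reduce the payload length accordingly.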
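In step 209, the voice data segments produced by step 203 are sent to the voice server in real time. On receiving a segment, the voice server can forward it in real time to the voice receiving terminal, which processes the received segments and plays the resulting voice to the receiving user. The specific processing is described later.
Those skilled in the art will understand that the voice data segments may be sent to the voice server in real time using the Transmission Control Protocol (TCP), the User Datagram Protocol (UDP), or another transport protocol; this embodiment places no particular limitation on this.
Those skilled in the art will also understand that, when TCP is used, the logical identifier may be omitted from the voice data segments, with TCP itself relied on to keep the segments in order.
Those skilled in the art will understand that the processing performed during capture in this embodiment turns the voice data into multiple fragments that are sent separately. The whole voice file therefore never has to be sent in one piece, which makes sending more efficient and more timely and meets the real-time requirements of instant messaging.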
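Figure 3 is a schematic flowchart of voice retransmission in a voice transmission method provided by another embodiment of the present invention. Building on the technical solutions of the embodiments above, the voice sending terminal can also retransmit voice data segments that failed to send, to ensure that every segment reaches the voice server reliably. Specifically, as shown in Figure 3, the method may further include the following steps:
Step 301: the voice sending terminal receives transmission feedback information returned by the voice server, the transmission feedback information including a retransmission identifier that indicates the voice data segment to be resent;
Step 302: resend the indicated voice data segment according to the retransmission identifier.
In this embodiment, when the network over which the voice sending terminal transmits voice data segments (for example, a mobile communication network) fails or is unstable, so that segments are lost and never reach the voice server, the voice server can return transmission feedback information to the voice sending terminal indicating which segments must be resent. The voice sending terminal then resends only those segments.
Those skilled in the art will understand that, after sending each voice data segment, the voice sending terminal stores it temporarily so that it can be retransmitted if transmission fails, until the voice server reports that the voice was transmitted successfully. In practice, sent segments may also be stored for a set period of time; this embodiment places no particular limitation on this.
In this embodiment, if voice audio capture has ended and all processed voice data segments have been sent, but no transmission success message is received from the voice server within a preset time period, the voice sending terminal can resend all of the voice data segments. Those skilled in the art will understand that, if no reception success message arrives from the voice server for a long time after sending ends, the server has failed to receive the voice; retransmitting the voice data therefore ensures that the voice is reliably delivered to the voice server.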
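The sender-side bookkeeping described above (keep each sent segment until success is confirmed, resend selected segments on feedback, resend everything on timeout) might be sketched as follows. The class name `SentSegmentStore`, its method names, and the injected `send` callback are illustrative assumptions; the application does not prescribe a data structure, and timeout detection itself is left to the caller.

```python
from typing import Callable, Dict, Iterable


class SentSegmentStore:
    """Keeps sent segments until the server confirms the whole voice message."""

    def __init__(self, send: Callable[[int, bytes], None]) -> None:
        self._send = send                      # hypothetical transport callback
        self._pending: Dict[int, bytes] = {}   # sequence number -> segment

    def send_segment(self, seq: int, payload: bytes) -> None:
        self._pending[seq] = payload           # store temporarily for retransmission
        self._send(seq, payload)

    def on_feedback(self, resend_ids: Iterable[int]) -> None:
        """Step 302: resend only the segments named in the feedback."""
        for seq in resend_ids:
            if seq in self._pending:
                self._send(seq, self._pending[seq])

    def resend_all(self) -> None:
        """Called by the owner when no success message arrived in time."""
        for seq in sorted(self._pending):
            self._send(seq, self._pending[seq])

    def on_success(self) -> None:
        """Server confirmed success: stored segments are no longer needed."""
        self._pending.clear()


sent = []
store = SentSegmentStore(lambda seq, data: sent.append((seq, data)))
store.send_segment(0, b"a")
store.send_segment(1, b"b")
store.on_feedback([1])
```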
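Those skilled in the art will understand that, in wireless environments such as mobile communication networks, the instability of the network makes the failure rate of voice data transmission relatively high. With the existing approach of transmitting a whole voice file, a failure partway through means the entire file must be retransmitted. In this embodiment, because voice is transmitted as individual voice data segments, a failed segment is the only thing that needs to be resent, which reduces the network resources consumed by retransmission and improves voice transmission efficiency.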
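Figure 4 is a schematic flowchart of a voice transmission method provided by another embodiment of the present invention. Unlike the embodiments above, this embodiment can show user A a send-success prompt as soon as voice audio capture ends, improving the user experience of voice intercom as a form of instant messaging. Specifically, as shown in Figure 4, the method may include the following steps:
Step 401: user A presses the voice intercom function key on the voice sending terminal, instructing it to start sending voice to the voice receiving terminal;
Step 402: on detecting that user A has pressed the key, the voice sending terminal immediately starts recording and captures the voice audio uttered by user A;
Step 403: during voice audio capture, encode and compress the captured voice audio and, during encoding and compression, whenever the length of the processed voice data reaches the preset data length, send the voice data in real time to the voice server as one voice data segment;
Step 404: determine whether voice audio capture has ended; if so, go to step 405; otherwise, go to step 403;
Step 405: determine whether all captured voice audio has been processed; if so, go to step 406; otherwise, go to step 403;
Step 406: determine whether all processed voice data segments have been sent; if so, go to step 407; otherwise, go to step 403;
Step 407: check whether the network connection of the voice sending terminal is normal; if so, go to step 409; otherwise, go to step 408;
Step 408: the voice sending terminal shows the user a "sending" prompt, then continues with step 407;
Step 409: the voice sending terminal shows the user a send-success prompt;
Step 410: determine whether a transmission success message is received from the voice server within a preset time period; if so, end the voice transmission; otherwise, go to step 411;
Step 411: resend all of the voice data segments.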
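In step 409, once voice audio capture has ended and the data has been sent, the voice sending terminal shows the send-success prompt as soon as it detects that its network connection is normal. This lets the user better experience the immediacy of instant messaging and improves the instant messaging experience.
In step 408, if the network connection is found to be abnormal after the voice data has been sent, the data may not yet have reached the voice server and the voice receiving terminal, so the user can be shown a "sending" prompt. In practice, a time limit can also be set, for example 1 minute; if the network connection is still abnormal when that time has elapsed, the user can be shown a prompt such as "send failed".
In steps 410 and 411, after capture has ended and the send-success prompt has been shown, the terminal checks whether the voice server has returned a transmission success confirmation, which ensures that the voice data is reliably delivered to the voice server and effectively improves the reliability of voice data sending.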
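The prompt logic of steps 407 to 411 reduces to two small decisions, sketched below in Python. The names `Prompt`, `choose_prompt`, and `needs_full_resend` are hypothetical, and the 60-second default simply mirrors the 1-minute example in the text; how the terminal measures network health and elapsed time is outside the sketch.

```python
from enum import Enum


class Prompt(Enum):
    SENDING = "sending"
    SUCCESS = "sent"
    FAILED = "send failed"


def choose_prompt(network_ok: bool, seconds_offline: float,
                  fail_after: float = 60.0) -> Prompt:
    """Steps 407-409: decide what to show once every segment has been sent."""
    if network_ok:
        return Prompt.SUCCESS            # optimistic prompt, before the server's ack
    if seconds_offline >= fail_after:
        return Prompt.FAILED             # optional time limit from the description
    return Prompt.SENDING


def needs_full_resend(ack_received: bool, seconds_waited: float,
                      ack_timeout: float) -> bool:
    """Steps 410-411: resend everything if no success message arrived in time."""
    return (not ack_received) and seconds_waited >= ack_timeout
```

The design choice here is that the success prompt is decoupled from the acknowledgement: the user sees "sent" based on local connectivity, while reliability is still enforced in the background by the timeout check.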
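In the prior art, the voice sending terminal usually shows the send-success prompt only after receiving the success confirmation fed back by the voice server, and otherwise keeps waiting. In voice intercom, the voice sending terminal sends voice data to the voice server over a wireless network such as a mobile communication network. A wireless network environment is far more complex than a wired one, and the uplink and downlink bandwidth of the voice sending terminal is severely asymmetric. In a wireless environment with a low signal-to-noise ratio, a noticeable proportion of the signaling data between the voice sending terminal and the voice server is lost. In that situation the voice data itself may already have been sent successfully, yet the server's success confirmation is delayed, so the terminal cannot show the success prompt for a long time, which seriously degrades the intercom experience. This embodiment therefore optimizes the send-success prompting process, which effectively improves the user's experience of instant messaging.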
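Figure 5 is a schematic flowchart of a voice transmission method provided by Embodiment 5 of the present invention. In this embodiment, the voice server receives in real time the voice data segments sent by the voice sending terminal of the method embodiments above and forwards them in real time to the voice receiving terminal. Specifically, as shown in Figure 5, the method may include the following steps:
Step 501: the voice server receives a voice data segment sent by the voice sending terminal;
Step 502: the voice server forwards the received voice data segment in real time to the voice receiving terminal.
In this embodiment, the voice server can receive in real time the voice data segments sent by the voice sending terminal described for Figures 1 to 4 and forward them in real time to the voice receiving terminal, improving voice data transmission efficiency.
In this embodiment, when the voice server fails to receive a voice data segment, so that the segment is lost, it can return transmission feedback information to the voice sending terminal. The feedback may include a retransmission identifier indicating the voice data segment to be resent, so that the voice sending terminal can resend that segment according to the identifier. The specific processing is as described for the method shown in Figure 3.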
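A server-side relay consistent with steps 501 and 502 might look like the sketch below. It assumes, beyond what the application states, that the logical identifiers are consecutive integers starting at 0 and that a gap in those numbers is how the server recognises a lost segment; `VoiceRelay` and its `forward` callback are hypothetical names.

```python
from typing import Callable, List, Set


class VoiceRelay:
    """Forwards each segment on arrival and reports gaps in sequence numbers."""

    def __init__(self, forward: Callable[[int, bytes], None]) -> None:
        self._forward = forward          # hypothetical link to the receiving terminal
        self._seen: Set[int] = set()
        self._highest = -1

    def on_segment(self, seq: int, payload: bytes) -> List[int]:
        """Forward immediately; return the sequence numbers to request again."""
        if seq not in self._seen:        # ignore duplicates from retransmission
            self._seen.add(seq)
            self._forward(seq, payload)
        self._highest = max(self._highest, seq)
        return [s for s in range(self._highest) if s not in self._seen]


forwarded = []
relay = VoiceRelay(lambda seq, data: forwarded.append(seq))
first = relay.on_segment(0, b"a")
gap = relay.on_segment(2, b"c")
filled = relay.on_segment(1, b"b")
```

The list returned by `on_segment` plays the role of the retransmission identifier in the feedback; a real server would also need a policy for how long to wait before treating a late segment as lost.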
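Figure 6 is a schematic flowchart of a voice transmission method provided by Embodiment 6 of the present invention. In this embodiment, the voice receiving terminal receives in real time the voice data segments forwarded by the voice server in the method of the embodiment shown in Figure 5. Specifically, as shown in Figure 6, the method may include the following steps:
Step 601: the voice receiving terminal receives voice data segments;
Step 602: the voice receiving terminal combines the received voice data segments, in the order in which they were produced during voice audio processing, to obtain a voice data file;
Step 603: the voice receiving terminal parses the voice data file to obtain the voice audio.
In this embodiment, the voice receiving terminal can receive in real time the voice data segments that were sent by the voice sending terminal described for Figures 1 to 4 and forwarded by the voice server, combine the received segments into a complete voice file, and parse that file to obtain the voice audio, which is played to the user.
In this embodiment, the voice receiving terminal may combine the received voice data segments into the voice data file after it receives the segment that carries the end-of-voice identifier.
In this embodiment, when combining the received voice data segments, the voice receiving terminal may use the logical identifier carried in each segment to combine them in processing order into the voice data file.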
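Reassembly on the receiving terminal (step 602, triggered by the end-of-voice identifier) can be sketched as below. As with the earlier sketches, consecutive integer sequence numbers starting at 0 are an assumption, and `Reassembler` is a hypothetical name; waiting until every segment up to the final one is present is one possible policy, not something the application mandates.

```python
from typing import Dict, Optional


class Reassembler:
    """Collects segments and joins them in sequence order once complete."""

    def __init__(self) -> None:
        self._segments: Dict[int, bytes] = {}
        self._last_seq: Optional[int] = None

    def add(self, seq: int, payload: bytes, is_last: bool = False) -> Optional[bytes]:
        """Store one segment; return the full voice data once nothing is missing."""
        self._segments[seq] = payload
        if is_last:
            self._last_seq = seq         # segment carrying the end-of-voice identifier
        if self._last_seq is None:
            return None                  # end not seen yet
        needed = range(self._last_seq + 1)
        if any(s not in self._segments for s in needed):
            return None                  # still waiting for a retransmission
        return b"".join(self._segments[s] for s in needed)


r = Reassembler()
step1 = r.add(1, b"B")
step2 = r.add(2, b"C", is_last=True)
step3 = r.add(0, b"A")
```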
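Those skilled in the art will understand that, in the embodiments above, voice is always relayed through the voice server. In practice, where mobile terminals communicate with each other directly, voice may also be sent and received in the manner described above; the embodiments of the present invention place no particular limitation on this. For example, when two mobile terminals on the same communication network hold a voice intercom session directly, one terminal can send the intercom voice straight to the other using the capture, processing, and transmission approach described above.
Those of ordinary skill in the art will understand that all or some of the steps of the method embodiments above may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments above. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk, or an optical disc.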
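Figure 7 is a schematic structural diagram of a voice transmission terminal provided by one embodiment of the present invention. The voice transmission terminal of this embodiment may be the voice sending terminal described in the method embodiments above and is used to send voice. Specifically, as shown in Figure 7, the voice transmission terminal includes a voice audio capture module 11, a voice audio processing module 12, and a voice sending module 13, where:
the voice audio capture module 11 is configured to capture voice audio;
the voice audio processing module 12 is configured to process the captured voice audio during voice audio capture;
the voice sending module 13 is configured to send the processed voice data out as one voice data segment when, during voice audio processing, the length of the processed voice data reaches the preset data length.
The voice transmission terminal of this embodiment can send voice based on the method embodiments shown in Figure 1, Figure 2, Figure 3, or Figure 4. For its specific implementation, refer to the description of the method embodiments above; it is not repeated here.
Figure 8 is a schematic structural diagram of a voice transmission terminal provided by another embodiment of the present invention. On the basis of the embodiment shown in Figure 7, as shown in Figure 8, the voice transmission terminal may further include an identifier adding module 14 and an end-of-voice identifier adding module 15. The identifier adding module 14 can add a logical identifier to the voice data segments sent by the voice sending module 13, the logical identifier indicating the processing order of the voice data segment during voice audio processing. The end-of-voice identifier adding module 15 can add an end-of-voice identifier to the last processed voice data segment after voice audio capture ends.
As shown in Figure 8, the voice transmission terminal may further include a feedback information receiving module 16 and a feedback retransmission module 17. The feedback information receiving module 16 can receive transmission feedback information returned by the voice server, the feedback information including a retransmission identifier that indicates the voice data segment to be resent. The feedback retransmission module 17 can resend that voice data segment according to the retransmission identifier.
Further, as shown in Figure 8, the voice transmission terminal may also include a voice audio retransmission module 18, which can resend all voice data segments obtained during voice audio processing when no transmission success message is received from the voice server within a preset time period after voice audio capture has ended and all processed voice data segments have been sent.
The voice transmission terminal of this embodiment can send voice based on the method embodiments shown in Figure 2 or Figure 3. For its specific implementation, refer to the description of the method embodiments above; it is not repeated here.
Figure 9 is a schematic structural diagram of a voice transmission terminal provided by one embodiment of the present invention. On the basis of the technical solutions of the embodiments shown in Figure 7 or Figure 8, as shown in Figure 9, this embodiment may further include a send-success prompt module 19, configured to show the user a send-success prompt when the network connection is detected to be normal after voice audio capture ends.
The voice transmission terminal of this embodiment can send voice based on the method embodiment shown in Figure 4. For its specific implementation, refer to the description of the method embodiments above; it is not repeated here.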
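Figure 10 is a schematic structural diagram of a voice server provided by one embodiment of the present invention. As shown in Figure 10, the voice server of this embodiment includes a voice data receiving module 21 and a voice data forwarding module 22, where:
the voice data receiving module 21 is configured to receive voice data segments sent by the voice sending terminal;
the voice data forwarding module 22 is configured to forward the received voice data segments in real time to the voice receiving terminal.
As shown in Figure 10, the voice server may further include a feedback module 23, configured to return transmission feedback information to the voice sending terminal when reception of a voice data segment fails. The transmission feedback information includes a retransmission identifier that indicates the voice data segment to be resent, so that the voice sending terminal can resend that segment.
The voice server of this embodiment can process, based on the method embodiment shown in Figure 5, the voice data segments sent by the voice transmission terminal shown in Figure 7, Figure 8, or Figure 9. For its specific implementation, refer to the description of the method embodiments above; it is not repeated here.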
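Figure 11 is a schematic structural diagram of a voice transmission terminal provided by one embodiment of the present invention. The voice transmission terminal of this embodiment can act as a voice receiving terminal and receive the voice data segments sent by the voice server or the voice sending terminal described above. Specifically, as shown in Figure 11, the voice transmission terminal may include a receiving module 31, a combining module 32, and a parsing module 33, where:
the receiving module 31 is configured to receive voice data segments;
the combining module 32 is configured to combine the received voice data segments, in the order in which they were produced during voice audio processing, to obtain a voice data file;
the parsing module 33 is configured to parse the voice data file to obtain the voice audio.
In this embodiment, each voice data segment carries a logical identifier indicating its processing order, and the combining module 32 is specifically configured to combine the segments in processing order, according to the logical identifiers they carry, to obtain the voice data file.
In addition, the last voice data segment sent by the voice sending terminal carries an end-of-voice identifier, and the combining module 32 may specifically combine the received voice data segments into the voice data file after receiving the segment that carries the end-of-voice identifier.
The voice transmission terminal of this embodiment can act as a voice receiving terminal and process received voice data segments based on method Embodiment 6 above. For its specific implementation, refer to the description of the method embodiments above; it is not repeated here.
Figure 12 is a schematic structural diagram of a voice transmission system provided by one embodiment of the present invention. As shown in Figure 12, the system includes a voice sending terminal 10 and a voice receiving terminal 30, both mobile terminals, and a voice server 20. The voice sending terminal 10 and the voice receiving terminal 30 both exchange data with the voice server 20 over a mobile communication network. The voice sending terminal 10 may be the voice transmission terminal shown in Figure 7, 8, or 9, the voice receiving terminal 30 may be the voice transmission terminal shown in Figure 11, and the voice server 20 may be the voice server shown in Figure 10. For their specific structure and operation, refer to the description of the apparatus embodiments above; it is not repeated here.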
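Figure 13 is a structural block diagram of a voice transmission terminal provided by one embodiment of the present invention. The terminal is used to implement the voice transmission methods provided by the embodiments above. The voice transmission terminal in the embodiments of the present invention may include one or more of the following components: a processor for executing computer program instructions to carry out various processes and methods; random access memory (RAM) and read-only memory (ROM) for information and for storing program instructions; memory for storing data and information; I/O devices; interfaces; antennas; and so on. Specifically:
The voice transmission terminal 300 may include an RF (Radio Frequency) circuit 310, a memory 320, an input unit 330, a display unit 340, a sensor 350, an audio circuit 360, a WiFi (wireless fidelity) module 370, a processor 380, a power supply 382, a camera 390, and other components. Those skilled in the art will understand that the terminal structure shown in Figure 13 does not limit the voice transmission terminal, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently.
The components of the voice transmission terminal 300 are described in detail below with reference to Figure 13: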
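The RF circuit 310 can be used to receive and send signals while information is being sent or received or during a call. In particular, it receives downlink information from a base station and passes it to the processor 380 for processing, and it sends uplink data to the base station. An RF circuit typically includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, an LNA (Low Noise Amplifier), a duplexer, and so on. The RF circuit 310 can also communicate with networks and other devices by wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System of Mobile communication), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), e-mail, SMS (Short Messaging Service), and so on.
The memory 320 can be used to store software programs and modules. The processor 380 runs the software programs and modules stored in the memory 320 to perform the various functional applications and data processing of the voice transmission terminal 300. The memory 320 may mainly include a program storage area and a data storage area. The program storage area can store the operating system and the application programs required by at least one function (such as a sound playback function or an image playback function); the data storage area can store data created through use of the terminal 300 (such as audio data or a phone book). In addition, the memory 320 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
The input unit 330 can be used to receive entered numeric or character information and to generate key signal inputs related to user settings and function control of the voice transmission terminal 300. Specifically, the input unit 330 may include a touch panel 331 and other input devices 332. The touch panel 331, also called a touch screen, can collect the user's touch operations on or near it (for example, operations performed on or near the touch panel 331 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connected apparatus according to a preset program. Optionally, the touch panel 331 may include two parts: a touch detection apparatus and a touch controller. The touch detection apparatus detects the position of the user's touch, detects the signal produced by the touch operation, and passes the signal to the touch controller. The touch controller receives touch information from the touch detection apparatus, converts it into contact coordinates, sends these to the processor 380, and can receive and execute commands sent by the processor 380. The touch panel 331 may be implemented as a resistive, capacitive, infrared, surface acoustic wave, or other type. Besides the touch panel 331, the input unit 330 may include other input devices 332, which may include but are not limited to one or more of a physical keyboard, function keys (such as volume control keys and a power key), a trackball, a mouse, a joystick, and so on.
The display unit 340 can be used to display information entered by the user or provided to the user, as well as the various menus of the voice transmission terminal 300. The display unit 340 may include a display panel 341, which may optionally be configured as an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode) display, or a similar form. Further, the touch panel 331 may cover the display panel 341. When the touch panel 331 detects a touch operation on or near it, it passes the operation to the processor 380 to determine the type of touch event, and the processor 380 then provides the corresponding visual output on the display panel 341 according to that type. Although in Figure 13 the touch panel 331 and the display panel 341 are two separate components that implement the input and output functions of the voice transmission terminal 300, in some embodiments the touch panel 331 and the display panel 341 may be integrated to implement those input and output functions.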
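The voice transmission terminal 300 may also include at least one sensor 350, such as a gyroscope sensor, a magnetic induction sensor, a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor. The ambient light sensor can adjust the brightness of the display panel 341 according to the ambient light, and the proximity sensor can turn off the display panel 341 and/or the backlight when the terminal 300 is moved to the ear. As one kind of motion sensor, an acceleration sensor can detect the magnitude of acceleration in each direction (generally three axes) and, when stationary, the magnitude and direction of gravity. It can be used for applications that recognize the terminal's posture (such as switching between landscape and portrait, related games, and magnetometer posture calibration) and for vibration-recognition functions (such as a pedometer or tap detection). Other sensors that the voice transmission terminal 300 may be equipped with, such as a barometer, hygrometer, thermometer, or infrared sensor, are not described further here.
The audio circuit 360, a speaker 361, and a microphone 362 can provide an audio interface between the user and the voice transmission terminal 300. The audio circuit 360 can convert received audio data into an electrical signal and transmit it to the speaker 361, which converts it into a sound signal for output. In the other direction, the microphone 362 converts collected sound signals into electrical signals, which the audio circuit 360 receives and converts into audio data; the audio data is then output to the processor 380 for processing and sent through the RF circuit 310 to, for example, another terminal, or output to the memory 320 for further processing.
WiFi is a short-range wireless transmission technology. Through the WiFi module 370, the voice transmission terminal 300 can help the user send and receive e-mail, browse web pages, access streaming media, and so on, providing the user with wireless broadband Internet access. Although Figure 13 shows the WiFi module 370, it is not an essential part of the voice transmission terminal 300 and may be omitted as needed without changing the essence of the invention.
The processor 380 is the control center of the voice transmission terminal 300. It connects all parts of the terminal through various interfaces and lines and, by running or executing the software programs and/or modules stored in the memory 320 and calling data stored in the memory 320, performs the various functions of the voice transmission terminal 300 and processes data, thereby monitoring the terminal as a whole. Optionally, the processor 380 may include one or more processing units. Preferably, the processor 380 may integrate an application processor, which mainly handles the operating system, user interface, and application programs, and a modem processor, which mainly handles wireless communication. The modem processor need not be integrated into the processor 380.
The voice transmission terminal 300 also includes a power supply 382 (such as a battery) that powers the components. Preferably, the power supply is logically connected to the processor 380 through a power management system, which manages charging, discharging, power consumption, and similar functions.
The camera 390 generally consists of a lens, an image sensor, an interface, a digital signal processor, a CPU, a display screen, and so on. The lens is fixed above the image sensor, and focus can be changed by adjusting the lens manually. The image sensor is the equivalent of the "film" in a traditional camera and is the heart of image capture. The interface connects the camera to the terminal mainboard by ribbon cable, board-to-board connector, or spring-type connection, and sends captured images to the memory 320. The digital signal processor processes captured images mathematically, converting the captured analog image into a digital image and sending it through the interface to the memory 320.
Although not shown, the voice transmission terminal 300 may also include a Bluetooth module and the like, which are not described further here.
In addition to one or more processors 380, the voice transmission terminal 300 includes the memory 320, which stores one or more programs configured to be executed by the one or more processors 380. The one or more programs contain instructions for performing the voice transmission method shown in Figure 1, Figure 2, Figure 3, Figure 4, or Figure 6.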
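Figure 14 is a schematic structural diagram of a voice server provided by one embodiment of the present invention. The voice server 400 includes a central processing unit (CPU) 401, a system memory 404 comprising a random access memory (RAM) 402 and a read-only memory (ROM) 403, and a system bus 405 connecting the system memory 404 and the central processing unit 401. The voice server 400 also includes a basic input/output system (I/O system) 406, which helps transfer information between the devices within the computer, and a mass storage device 407 for storing an operating system 413, application programs 414, and other program modules 415.
The basic input/output system 406 includes a display 408 for displaying information and an input device 409, such as a mouse or keyboard, for the user to enter information. The display 408 and the input device 409 are both connected to the central processing unit 401 through an input/output controller 410 connected to the system bus 405. The basic input/output system 406 may also include the input/output controller 410 for receiving and processing input from a keyboard, a mouse, an electronic stylus, or other devices. Similarly, the input/output controller 410 also provides output to a display screen, a printer, or other types of output device.
The mass storage device 407 is connected to the central processing unit 401 through a mass storage controller (not shown) connected to the system bus 405. The mass storage device 407 and its associated computer-readable media provide non-volatile storage for the voice server 400. That is, the mass storage device 407 may include a computer-readable medium (not shown) such as a hard disk or a CD-ROM drive.
Without loss of generality, the computer-readable media may include computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include RAM, ROM, EPROM, EEPROM, flash memory or other solid-state storage technologies, CD-ROM, DVD or other optical storage, tape cassettes, magnetic tape, magnetic disk storage, or other magnetic storage devices. Those skilled in the art will of course know that computer storage media are not limited to these. The system memory 404 and the mass storage device 407 may be referred to collectively as memory.
According to various embodiments of the present invention, the voice server 400 may also operate through a remote computer connected to a network such as the Internet. That is, the voice server 400 can connect to a network 412 through a network interface unit 411 connected to the system bus 405; the network interface unit 411 can also be used to connect to other types of network or to remote computer systems (not shown).
The memory also includes one or more programs, which are stored in the memory and configured to be executed by one or more central processing units 401. The one or more programs contain instructions for performing the voice transmission method provided by the embodiment shown in Figure 5.
Finally, it should be noted that the embodiments above only illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions recorded in those embodiments or make equivalent replacements of some or all of their technical features, and that such modifications or replacements do not take the essence of the corresponding technical solutions outside the scope of the technical solutions of the embodiments of the present invention.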

Claims

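1. A voice transmission method, characterized by comprising:
capturing voice audio;
during voice audio capture, processing the captured voice audio;
during voice audio processing, when the length of the processed voice data reaches a preset data length, sending the voice data out as one voice data segment.
2. The voice transmission method according to claim 1, characterized in that processing the captured voice audio during voice audio capture comprises:
during voice audio capture, encoding and compressing the captured voice audio.
3. The voice transmission method according to claim 1, characterized by further comprising, before the voice data segment is sent:
adding a logical identifier to the voice data segment, the logical identifier indicating the processing order of the voice data segment during the voice audio processing.
4. The voice transmission method according to claim 1, characterized by further comprising:
after the voice audio capture ends, adding an end-of-voice identifier to the last voice data segment obtained by processing.
5. The voice transmission method according to claim 1, characterized by further comprising:
receiving transmission feedback information returned by a voice server, the transmission feedback information including a retransmission identifier, the retransmission identifier indicating a voice data segment to be resent;
resending, according to the retransmission identifier, the voice data segment to be resent.
6. The voice transmission method according to claim 1, characterized by further comprising:
after voice audio capture ends, when the network connection is detected to be normal, providing the user with a send-success prompt.
7. The voice transmission method according to any one of claims 1 to 6, characterized by further comprising:
when no transmission success message returned by the voice server is received within a preset time period after voice audio capture has ended and all processed voice data segments have been sent, resending all voice data segments obtained during the voice audio processing.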
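8. A voice transmission method, characterized by comprising:
receiving a voice data segment sent by a voice sending terminal using the voice transmission method according to claim 1;
forwarding the received voice data segment in real time to a voice receiving terminal.
9. The voice transmission method according to claim 8, characterized by further comprising:
when reception of a voice data segment fails, returning transmission feedback information to the voice sending terminal, the transmission feedback information including a retransmission identifier, the retransmission identifier indicating the voice data segment to be resent, so that the voice sending terminal resends the voice data segment to be resent.
10. A voice transmission method, characterized by comprising:
receiving voice data segments, each being a voice data segment sent by a voice sending terminal according to claim 1, or a voice data segment sent by a voice sending terminal and forwarded by a voice server according to claim 8;
combining the received voice data segments, in the order in which they were produced during voice audio processing, to obtain a voice data file;
parsing the voice data file to obtain voice audio.
11. The voice transmission method according to claim 10, characterized in that each voice data segment carries a logical identifier, the logical identifier indicating the processing order of the voice data segment during voice audio processing;
and combining the received voice data segments, in the order in which they were produced during voice audio processing, to obtain a voice data file comprises:
combining the voice data segments in processing order, according to the logical identifier carried in each voice data segment, to obtain the voice data file.
12. The voice transmission method according to claim 10, characterized in that the last voice data segment sent by the voice sending terminal carries an end-of-voice identifier;
and combining the received voice data segments, in the order in which they were produced during voice audio processing, to obtain a voice data file comprises:
after receiving the voice data segment carrying the end-of-voice identifier, combining the received voice data segments in the order in which they were produced during voice audio processing to obtain the voice data file.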
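13. A voice transmission terminal, characterized by comprising:
a voice audio capture module, configured to capture voice audio;
a voice audio processing module, configured to process the captured voice audio during voice audio capture;
a voice sending module, configured to send the processed voice data out as one voice data segment when, during voice audio processing, the length of the processed voice data reaches a preset data length.
14. The voice transmission terminal according to claim 13, characterized by further comprising:
an identifier adding module, configured to add a logical identifier to the voice data segment, the logical identifier indicating the processing order of the voice data segment during the voice audio processing.
15. The voice transmission terminal according to claim 13, characterized by further comprising:
an end-of-voice identifier adding module, configured to add an end-of-voice identifier to the last voice data segment obtained by processing after the voice audio capture ends.
16. The voice transmission terminal according to claim 13, characterized by further comprising:
a feedback information receiving module, configured to receive transmission feedback information returned by a voice server, the transmission feedback information including a retransmission identifier, the retransmission identifier indicating a voice data segment to be resent;
a feedback retransmission module, configured to resend, according to the retransmission identifier, the voice data segment to be resent.
17. The voice transmission terminal according to claim 13, characterized by further comprising:
a send-success prompt module, configured to provide the user with a send-success prompt when the network connection is detected to be normal after voice audio capture ends.
18. The voice transmission terminal according to any one of claims 13 to 17, characterized by further comprising:
a voice audio retransmission module, configured to resend all voice data segments obtained during the voice audio processing when no transmission success message returned by the voice server is received within a preset time period after voice audio capture has ended and all processed voice data segments have been sent.
19. A voice server, characterized by comprising:
a voice data receiving module, configured to receive a voice data segment sent by the voice transmission terminal according to claim 13;
a voice data forwarding module, configured to forward the received voice data segment in real time to a voice receiving terminal.
20. The voice server according to claim 19, characterized by further comprising:
a feedback module, configured to return transmission feedback information to the voice sending terminal when reception of a voice data segment fails, the transmission feedback information including a retransmission identifier, the retransmission identifier indicating the voice data segment to be resent, so that the voice sending terminal resends the voice data segment to be resent.
21. A voice transmission terminal, characterized by comprising:
a receiving module, configured to receive voice data segments, each being a voice data segment sent by the voice transmission terminal according to claim 13 or a voice data segment forwarded by the voice server according to claim 19;
a combining module, configured to combine the received voice data segments, in the order in which they were produced during voice audio processing, to obtain a voice data file;
a parsing module, configured to parse the voice data file to obtain voice audio.
22. The voice transmission terminal according to claim 21, characterized in that each voice data segment carries a logical identifier, the logical identifier indicating the processing order of the voice data segment during voice audio processing;
the combining module is configured to combine the voice data segments in processing order, according to the logical identifier carried in each voice data segment, to obtain the voice data file.
23. The voice transmission terminal according to claim 21, characterized in that the last voice data segment sent by the voice sending terminal carries an end-of-voice identifier;
the combining module is configured to combine, after receiving the voice data segment carrying the end-of-voice identifier, the received voice data segments in the order in which they were produced during voice audio processing to obtain the voice data file.
24. A voice transmission system comprising a mobile terminal and a voice server, characterized in that the mobile terminal is the voice transmission terminal according to any one of claims 13 to 18, or the mobile terminal is the voice transmission terminal according to any one of claims 21 to 23; and the voice server is the voice server according to claim 19 or 20.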
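25. A voice transmission terminal, characterized by comprising:
one or more processors; and
a memory;
the memory storing one or more programs configured to be executed by the one or more processors, the one or more programs containing instructions for performing the following operations:
capturing voice audio;
during voice audio capture, processing the captured voice audio;
during voice audio processing, when the length of the processed voice data reaches a preset data length, sending the voice data out as one voice data segment.
26. The voice transmission terminal according to claim 25, characterized by further comprising instructions for performing the following operation:
during voice audio capture, encoding and compressing the captured voice audio.
27. The voice transmission terminal according to claim 25, characterized by further comprising instructions for performing the following operation:
adding a logical identifier to the voice data segment, the logical identifier indicating the processing order of the voice data segment during the voice audio processing.
28. The voice transmission terminal according to claim 25, characterized by further comprising instructions for performing the following operation:
after the voice audio capture ends, adding an end-of-voice identifier to the last voice data segment obtained by processing.
29. The voice transmission terminal according to claim 25, characterized by further comprising instructions for performing the following operations:
receiving transmission feedback information returned by a voice server, the transmission feedback information including a retransmission identifier, the retransmission identifier indicating a voice data segment to be resent;
resending, according to the retransmission identifier, the voice data segment to be resent.
30. The voice transmission terminal according to claim 25, characterized by further comprising instructions for performing the following operation:
after voice audio capture ends, when the network connection is detected to be normal, providing the user with a send-success prompt.
31. The voice transmission terminal according to any one of claims 25 to 30, characterized by further comprising instructions for performing the following operation:
when no transmission success message returned by the voice server is received within a preset time period after voice audio capture has ended and all processed voice data segments have been sent, resending all voice data segments obtained during the voice audio processing.
32. A voice server, characterized by comprising:
one or more processors; and
a memory;
the memory storing one or more programs configured to be executed by the one or more processors, the one or more programs containing instructions for performing the following operations:
receiving a voice data segment sent by the voice transmission terminal according to claim 25;
forwarding the received voice data segment in real time to a voice receiving terminal.
33. The voice server according to claim 32, characterized by further comprising instructions for performing the following operation:
when reception of a voice data segment fails, returning transmission feedback information to the voice sending terminal, the transmission feedback information including a retransmission identifier, the retransmission identifier indicating the voice data segment to be resent, so that the voice sending terminal resends the voice data segment to be resent.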
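34. A voice transmission terminal, characterized by comprising:
one or more processors; and
a memory;
the memory storing one or more programs configured to be executed by the one or more processors, the one or more programs containing instructions for performing the following operations:
receiving voice data segments, each being a voice data segment sent by the voice sending terminal according to claim 25, or a voice data segment sent by a voice sending terminal and forwarded by the voice server according to claim 32;
combining the received voice data segments, in the order in which they were produced during voice audio processing, to obtain a voice data file;
parsing the voice data file to obtain voice audio.
35. The voice transmission terminal according to claim 34, characterized in that each voice data segment carries a logical identifier, the logical identifier indicating the processing order of the voice data segment during voice audio processing;
further comprising instructions for performing the following operation:
combining the voice data segments in processing order, according to the logical identifier carried in each voice data segment, to obtain the voice data file.
36. The voice transmission terminal according to claim 34, characterized in that the last voice data segment sent by the voice sending terminal carries an end-of-voice identifier;
further comprising instructions for performing the following operation:
after receiving the voice data segment carrying the end-of-voice identifier, combining the received voice data segments in the order in which they were produced during voice audio processing to obtain the voice data file.
37. A voice transmission system comprising a mobile terminal and a voice server, characterized in that the mobile terminal is the voice transmission terminal according to any one of claims 25 to 31, or the mobile terminal is the voice transmission terminal according to any one of claims 34 to 36; and the voice server is the voice server according to claim 32 or 33.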
PCT/CN2013/087653 2012-11-22 2013-11-22 Voice transmission method, terminal, voice server, and voice transmission system WO2014079382A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/719,144 US9832621B2 (en) 2012-11-22 2015-05-21 Method, terminal, server, and system for audio signal transmission

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201210479379.XA CN103841002B (zh) 2012-11-22 2012-11-22 Voice transmission method, terminal, voice server, and voice transmission system
CN201210479379.X 2012-11-22

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/719,144 Continuation US9832621B2 (en) 2012-11-22 2015-05-21 Method, terminal, server, and system for audio signal transmission

Publications (1)

Publication Number Publication Date
WO2014079382A1 true WO2014079382A1 (zh) 2014-05-30

Family

ID=50775555

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/087653 WO2014079382A1 (zh) 2012-11-22 2013-11-22 Voice transmission method, terminal, voice server, and voice transmission system

Country Status (3)

Country Link
US (1) US9832621B2 (zh)
CN (1) CN103841002B (zh)
WO (1) WO2014079382A1 (zh)

Also Published As

Publication number Publication date
CN103841002A (zh) 2014-06-04
CN103841002B (zh) 2018-08-03
US20150256988A1 (en) 2015-09-10
US9832621B2 (en) 2017-11-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13856355

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 05.10.2015)

122 Ep: pct application non-entry in european phase

Ref document number: 13856355

Country of ref document: EP

Kind code of ref document: A1