WO2014046239A1 - Communication system and method, server device, and terminal - Google Patents

Communication system and method, server device, and terminal

Info

Publication number
WO2014046239A1
Authority
WO
WIPO (PCT)
Prior art keywords
terminal
codec
voice
audio codec
code
Prior art date
Application number
PCT/JP2013/075469
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
一範 小澤
Original Assignee
日本電気株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電気株式会社
Publication of WO2014046239A1 publication Critical patent/WO2014046239A1/ja

Classifications

    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 — Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 — Speech or audio signals analysis-synthesis techniques using predictive techniques
    • G10L19/16 — Vocoder architecture
    • G10L19/18 — Vocoders using multiple modes
    • G10L19/173 — Transcoding, i.e. converting between two coded representations avoiding cascaded coding-decoding

Definitions

  • the present invention is based on a Japanese patent application: Japanese Patent Application No. 2012-208663 (filed on September 21, 2012), and the entire contents of this application are incorporated in the present specification by reference.
  • the present invention relates to a communication system and method, a server device, and a terminal.
  • A system using a virtual machine (also referred to as a “virtual terminal” or “virtual client”) running on a virtual OS (Operating System), also called a guest OS, is known.
  • the terminal communicates with the server apparatus via the network, operates the virtual terminal on the server apparatus as if operating the real terminal, operates the application, and generates screen information, for example.
  • the application on the virtual terminal transmits the screen information to the terminal and displays the screen information on the display device of the terminal.
  • a terminal of such a system is also referred to as a thin client terminal.
  • The terminal connects to its virtual terminal on the server device, so the server device can be accessed from home or while away from the office, and the user can securely connect to his or her own virtual terminal and carry out work.
  • Since no data remains on a thin client terminal, confidential information and corporate information are not leaked to the outside even if the terminal is lost, for example.
  • Patent Document 1 discloses an information processing apparatus, an information processing system, and a control method for the information processing apparatus that can use a thin client, in which a computer screen is transmitted as a video signal using hardware resources assigned to each of a plurality of users, thereby suppressing an increase in processing load and communication load.
  • Patent Document 2 discloses a thin client system in which, after the server device acquires operation information from the terminal (client terminal), the processing requested by the operation is executed by an application that operates on the server device, and a screen for displaying the processing result is generated and transmitted to the terminal (client terminal).
  • There are also cases where a voice call (VoIP: Voice over IP, in which voice is exchanged over an IP (Internet Protocol) network) is made via the server device.
  • However, the bandwidth of the network is not necessarily large, and it fluctuates over time due to traffic congestion.
  • When the bandwidth narrows, voice data accumulates in the network, the delay until the voice data arrives at the client becomes long, and it becomes difficult to hold a smooth call.
  • the voice codec installed in the thin client terminal is different from the voice codec installed in the non-thin client terminal.
  • the server device needs to convert (transcode) the voice codec.
  • the server device decodes the audio signal encoded by the audio codec of the thin client terminal, encodes it with an encoding method corresponding to the audio codec of the non-thin client terminal, and transmits the encoded signal to the non-thin client terminal.
  • the server device also decodes the audio signal encoded with the audio codec of the non-thin client terminal, encodes it with an encoding method corresponding to the audio codec of the thin client terminal, and transmits the encoded signal to the thin client terminal. Since transcoding requires a large amount of processing, the load on the server device becomes large.
  • The present invention has been made in view of the above problems, and one of its objects is to make it possible to eliminate voice delay caused by fluctuations in network bandwidth.
  • Another object of the present invention is to provide a system, apparatus, and method that can reduce the load on a server apparatus during a voice call between terminals whose voice codecs differ, for example between a thin client terminal and a non-thin client terminal.
  • According to one aspect, a communication system is provided that comprises a plurality of terminals and a server device connected to the plurality of terminals via a network, the plurality of terminals including first and second terminals having first and second audio codecs different from each other.
  • The server apparatus comprises means for estimating the bandwidth of the network, obtaining a bit rate for the audio codec of the first and/or second terminal, and notifying the obtained bit rate to the first and/or second terminal; means for receiving a code produced by the voice codec of the first or second terminal and, only for a time interval in which a signal obtained from a part of the code satisfies a predetermined condition, performing transcoding processing in which the signal obtained by decoding the code is encoded at the obtained bit rate with the encoding method corresponding to the audio codec of the second or first terminal that is the destination of the code; and means for transmitting the transcoded code toward the second or first terminal.
  • According to another aspect, there is provided a server device that connects to a terminal via a network and transfers, to the terminal, screen information obtained by operating an application in a virtual client unit in response to an operation on the terminal.
  • the terminal includes at least a first terminal, and the other terminal includes a second terminal having a voice codec different from the voice codec of the first terminal;
  • the voice conversion unit performs transcoding only for a time interval in which a signal obtained from a part of the code by the voice codec of the second terminal satisfies a predetermined condition.
  • a transmission / reception unit that connects to a terminal via a network, receives an operation signal from the terminal, and transmits / receives a signal to / from the terminal and another terminal;
  • a control unit for determining whether or not the operation is a voice call based on an operation signal received from the terminal;
  • a voice conversion unit that, when the control unit determines that a voice call is made, transcodes the packet storing the voice data transmitted from the terminal according to the instruction of the control unit, or passes the packet through without transcoding, and outputs it toward the other party;
  • a bandwidth estimation rate calculation unit that estimates the bandwidth of the network based on a response signal from the terminal to transmission of a predetermined packet, calculates the bit rate of a voice codec, and notifies the terminal of the bit rate;
  • the terminal includes at least a first terminal;
  • the other terminal includes a second terminal having a voice codec different from the voice codec of the first terminal;
  • the voice conversion unit performs transcoding based on an instruction from the control unit; it transcodes from the voice codec of the second terminal to the voice codec of the first terminal for a time interval in which a signal obtained from a part of the code by the voice codec of the second terminal satisfies a predetermined condition, and outputs the transcoded signal toward the first terminal; and it transcodes from the voice codec of the first terminal to the voice codec of the second terminal for a time interval in which a signal obtained from a part of the code by the voice codec of the first terminal satisfies a predetermined condition, and outputs the transcoded signal toward the second terminal.
  • the first terminal and the second terminal each have a voice call via a server device connected via a network
  • the server device estimates the bandwidth of the network based on response signals from the first and second terminals for transmission of a predetermined packet from the server device, and calculates the bit rate of the voice codec of the terminal And notifying the bit rate to the first and second terminals
  • it is determined whether the audio codec of the first terminal and the second terminal is the same, and if they are the same, bitstreams based on the audio codec of the first and second terminals are respectively Output directly to the second and first terminals
  • if the audio codecs are different, transcoding is performed in the server device; in that case, transcoding from the audio codec of the second terminal to the audio codec of the first terminal is performed for a time interval in which a signal obtained from a part of the code by the audio codec of the second terminal satisfies a predetermined condition, and the transcoded code is output toward the first terminal; similarly, transcoding from the audio codec of the first terminal to the audio codec of the second terminal is performed for a time interval in which a signal obtained from a part of the code by the audio codec of the first terminal satisfies a predetermined condition, and the transcoded code is output toward the second terminal.
  • According to another aspect of the present invention, there is provided a terminal that connects via a network to the server apparatus according to the present invention, which transfers to the terminal screen information obtained by operating an application in a virtual client unit in response to operations on the terminal; the terminal decodes the screen information from the server apparatus with a decoder, displays the decoded information on a display unit, and makes a voice call with another terminal through the server apparatus.
  • a transmission / reception process for connecting to a terminal via a network, receiving an operation signal from the terminal, and transmitting / receiving a signal to / from the terminal and another terminal;
  • Control processing for determining whether or not the operation is a voice call based on an operation signal received from the terminal;
  • a voice conversion process that, when the control process determines that a voice call is made, transcodes the packet storing the voice data transmitted from the terminal according to the instruction of the control process, or passes the packet through without transcoding, and outputs it toward the other party;
  • the terminal includes at least a first terminal;
  • the other terminal includes a second terminal having a voice codec different from the voice codec of the first terminal;
  • the voice conversion process performs transcoding based on the instruction of the control process; at that time, transcoding from the voice codec of the second terminal to the voice codec of the first terminal is performed for a time interval in which a signal obtained from a part of the code by the voice codec of the second terminal satisfies a predetermined condition, and the transcoded signal is output toward the first terminal.
  • According to the present invention, it is possible to reduce the load on a server apparatus during voice calls between terminals whose voice codecs differ, such as between a thin client terminal and a non-thin client terminal.
  • In one exemplary embodiment, the server device (110 in FIG. 1) connects to a terminal (170 in FIG. 1) via a network; an application (214 in FIG. 2) is operated by a virtual client unit (211 in FIG. 2) in response to operations on the terminal (170), and the resulting screen information is transferred to the terminal (170) and displayed on the terminal.
  • The server device (110) includes a control unit (212 in FIG. 2) that receives a packet storing an operation signal from the terminal (170) and determines from the operation signal that the operation is a voice call, and a voice conversion unit (185 in FIG. 2) that, when a voice call is determined, transcodes the packet storing the voice data transmitted from the terminal (170) or passes the packet through without transcoding.
  • The terminal (first terminal, for example a thin client terminal) (170) and another terminal (second terminal, for example a non-thin client terminal) (175) have audio codecs whose types of encoding differ.
  • Based on an instruction from the control unit (212 in FIG. 2), the voice conversion unit (185 in FIG. 2) transcodes from the voice codec (first voice codec) of the other terminal (175) to the voice codec (second voice codec) of the terminal (170) only in a time interval in which a signal obtained from a part of the code by the first voice codec satisfies a predetermined condition, and outputs the transcoded code toward the terminal (170); it also transcodes from the voice codec (second voice codec) of the terminal (170) to the voice codec (first voice codec) of the other terminal (175) only in a time interval in which a signal obtained from a part of the code by the second voice codec satisfies a predetermined condition, and outputs the transcoded code toward the other terminal (175).
  • This makes it possible to reduce the load on the server device during voice calls between the terminal (thin client terminal) and the other terminal (non-thin client terminal), and to eliminate voice delays due to fluctuations in network bandwidth and the like.
  • FIG. 1 is a diagram illustrating the configuration of the first exemplary embodiment of the present invention.
  • In FIG. 1, a mobile 3G packet network is used as the network, and the configuration includes an SGSN (Serving GPRS (General Packet Radio Service) Support Node) and a GGSN (Gateway GPRS Support Node).
  • An SGSN/GGSN device 190 represents a device in which an SGSN device and a GGSN device are integrated.
  • The network is not limited to a mobile 3G packet network; a mobile LTE (Long Term Evolution) network, a WiFi (Wireless Fidelity) or WiMAX (Worldwide Interoperability for Microwave Access) network, a fixed IP (Internet Protocol) network, an NGN (Next Generation Network), and the like can also be used.
  • a mobile terminal 170 (a mobile phone terminal, a smartphone, a tablet, or the like) connects to the server apparatus 110 installed in the cloud network 130 and performs screen data transfer using a thin client method.
  • An example is shown in which a voice call is made from the portable terminal 170 to the terminal 175 via the server apparatus 110.
  • the portable terminal 170 is a thin client terminal equipped with client software that implements the thin client method.
  • the terminal 175 is a non-thin client terminal not equipped with thin client type client software.
  • the mobile terminal 170 is connected to the mobile network 150, but the terminal 175 is connected to the fixed network 151.
  • the MGW device (media gateway device) 196 terminates the fixed network 151, converts voice into IP packets and transmits the IP packets to the packet transfer device, and converts IP packets into voice and sends them to the terminal 175 via the fixed network 151. Send.
  • A telephone book 111 in which user names and telephone numbers are registered is prepared in advance and connected to the server apparatus 110. That is, the telephone directory (registered user names, telephone numbers, and so on) necessary for making a call from the portable terminal (thin client terminal) 170 is held on the server apparatus 110 side, so the mobile terminal 170 does not need to hold any telephone directory. Therefore, even if the portable terminal 170 is lost, the security of telephone numbers, user names, and the like can be ensured.
  • FIG. 1 shows a procedure in which, for the mobile terminal 170 to connect to the server apparatus 110 and start a voice call with the terminal 175, screen data generated by starting a voice call VoIP application on a virtual client (not shown) in the server apparatus 110 is transferred to the mobile terminal 170.
  • The screen data is decoded and displayed by the client software of the portable terminal 170, a destination user name is designated on the screen, and a voice call is then made from the portable terminal 170 to the terminal 175.
  • the mobile terminal 170 includes client software 171 for operating as a thin client terminal.
  • the client software 171 will be described later.
  • the terminal 175 is a non-thin client terminal that is not a thin client. For this reason, no client software is installed. Therefore, in this embodiment, it is assumed that the audio codec installed in the client software of the mobile terminal 170 and the audio codec installed in the terminal 175 are different (encoding schemes are different).
  • It is assumed that the audio codec installed in the client software of the mobile terminal 170 is the AMR-NB (Adaptive Multi-Rate Narrow Band) codec standardized by 3GPP (Third Generation Partnership Project), and that the audio codec of the terminal 175 is the G.711 codec standardized by ITU-T (International Telecommunication Union Telecommunication Standardization Sector).
  • For details of the AMR audio codec, reference is made to, for example, the 3GPP TS 26.090 standard; for details of the G.711 audio codec, reference is made to, for example, the ITU-T G.711 standard.
  • Instead of these audio codecs, other known audio codecs may be used.
  • To start a voice call, the mobile terminal 170 transmits to the server apparatus 110 a packet storing an operation signal for activating the voice call VoIP application (not shown in FIG. 1; 214 in FIG. 2) on the virtual client (not shown in FIG. 1; 211 in FIG. 2) of the server apparatus 110.
  • On receiving this operation signal, the control unit (not shown in FIG. 1; 212 in FIG. 2) of the server device 110 determines that a voice call is being started, and the voice call VoIP application (not shown in FIG. 1; 214 in FIG. 2) is activated on the virtual client to generate a screen; the screen information is encoded by the image encoder (not shown in FIG. 1; 188 in FIG. 2), transferred from the server device 110 to the portable terminal 170, decoded by the portable terminal 170, and displayed on the screen of the portable terminal 170. The end user then performs operations such as selecting the destination user name and telephone number.
  • Any audio signal attached to the screen is processed through a path different from the voice call path.
  • The screen generated on the virtual client is captured by the screen capture unit (not shown in FIG. 1; 180 in FIG. 2), and the accompanying audio signal is compressed and encoded by the audio encoder (not shown in FIG. 1; 189 in FIG. 2) into a compressed encoded stream, which is transmitted to the portable terminal 170 as a packet different from the voice call according to the defined protocol.
  • From the portable terminal 170, a packet storing a session control message based on a well-known session control protocol and a packet storing a bit stream (code) obtained by compressing and encoding the audio signal with the AMR speech encoder installed in the client software of the mobile terminal 170 are transmitted.
  • SIP (Session Initiation Protocol) is used as the session control protocol, but other well-known protocols may be used.
  • These packets reach the base station device 194 of the mobile network 150 in the area, and further pass through the RNC (Radio Network Controller) device 195 and the SGSN/GGSN device 190 to reach the server device 110 of the cloud network 130.
  • FIG. 2 is a diagram illustrating the configuration of the server device 110.
  • the server device 110 includes a packet transmission / reception unit 186, a control unit 212, a bandwidth estimation / rate calculation unit 183, a voice conversion unit 185, a screen generation unit 213, a voice call VoIP application software 214, a screen capture unit 180, An image encoder unit 188, a first packet transmission / reception unit 187, a second packet transmission unit 176, a third packet transmission unit 177, an audio encoder unit 189, and a virtual client unit 211 are provided.
  • the virtual client unit 211 includes a control unit 212, a screen generation unit 213, and voice call VoIP application software 214.
  • Each of these units may execute at least one or all of its processes and functions by a program executed on the computer of the server apparatus 110; the program may be recorded on a computer-readable recording medium (semiconductor memory, magnetic/optical disk, etc.).
  • the virtual client unit 211 is operating on the guest OS in the virtualization environment on the host OS.
  • A well-known OS can be used as the host OS and the guest OS; for example, Linux (registered trademark), Android (registered trademark), Windows (registered trademark), and other OSs can be used.
  • the virtual client unit 211 includes a control unit 212 and a screen generation unit 213.
  • the mobile terminal 170 illustrated in FIG. 1 stores an operation signal for starting the voice call VoIP application software on the virtual client 211 in a packet and transmits the packet to the server apparatus 110.
  • the packet transmitting / receiving unit 186 of the server apparatus 110 receives the operation signal packet, extracts the operation signal from the packet, and outputs it to the control unit 212.
  • When the control unit 212 receives the operation signal and determines that it is a start signal of the VoIP application software for a voice call, the control unit 212 executes the voice call VoIP application software 214.
  • a screen is generated and output to the screen capture unit 180.
  • the screen capture unit 180 captures the generated screen at a predetermined screen resolution and a predetermined frame rate, and outputs the captured screen to the image encoder unit 188.
  • the image encoder unit 188 compresses and encodes the input screen using a predetermined image encoder at a predetermined screen resolution, bit rate, and frame rate to obtain a compressed encoded stream, and outputs the stream to the second packet transmission unit 176.
  • H.264 is used as the compression encoding method, but MPEG (Moving Picture Experts Group)-4, JPEG (Joint Photographic Experts Group) 2000, and the like can also be used.
  • the second packet transmission unit 176 stores the compressed and encoded stream input from the image encoder unit 188 in a predetermined packet and outputs the packet to the SGSN / GGSN apparatus 190 illustrated in FIG.
  • the packet protocol may be RTP (Real-time Transport Protocol) / UDP (User Datagram Protocol) / IP (Internet Protocol), UDP / IP, or TCP (Transmission Control Protocol) / IP.
  • UDP / IP is used as an example.
  • the mobile terminal 170 receives the compression-encoded stream transmitted from the second packet transmission unit 176 of the server device 110, decodes it with a predetermined screen resolution and frame rate, and displays it on the display unit (not shown) of the mobile terminal 170.
  • the control unit 212 obtains the destination user name (the user who owns the terminal 175 in FIG. 1) and the destination phone number (the phone number of the terminal 175 in FIG. 1) from the telephone directory 111 in FIG. 1.
  • the screen is generated by the screen generation unit 213, and the generated screen is compression-encoded by the image encoder 188 and sent to the portable terminal 170 (FIG. 1).
  • the end user of the mobile terminal 170 selects the user and telephone number of the call destination while viewing the screen displayed on the display unit (not shown) of the mobile terminal 170.
  • the portable terminal 170 sends a packet storing the SIP message for starting the voice call to the server device 110 (FIG. 1).
  • The portable terminal 170 compresses and encodes the audio signal with the AMR encoder (263 in FIG. 4) installed in its client software (not shown in FIG. 1; 171 in FIG. 4), and sends the packet storing the resulting bit stream to the server device 110.
  • the server apparatus 110 processes a packet related to a voice call by using a path different from the audio associated with the screen, thereby reducing the delay of the voice call.
  • Among the packets received from the mobile terminal 170, the packet transmission / reception unit 186 outputs the packet storing the SIP message to the control unit 212, outputs the packet storing the compression-encoded audio stream to the audio conversion unit 185, and outputs the response packet to the band estimation / rate calculation unit 183.
  • When the control unit 212 receives the operation signal from the packet transmission / reception unit 186, the control unit 212 performs the following operations.
  • the operation signal is analyzed, and in the case of a voice call activation operation, the voice call VoIP application software 214 is activated.
  • The control unit 212 instructs the bandwidth estimation / rate calculation unit 183 to estimate, using the response packets from the packet transmission / reception unit 186 and the first packet transmission / reception unit 187, the upstream and downstream bandwidths of the network 150 (FIG. 1) to which the mobile terminal 170 (FIG. 1) is connected and of the network 151 (FIG. 1) to which the terminal 175 is connected.
  • It then instructs the band estimation / rate calculation unit 183 to calculate, at least for each of the uplink and downlink of the network 150 (FIG. 1), the bit rate from the estimated bandwidth and to notify the voice conversion unit 185 of the bit rate.
  • The control unit 212 receives the SDP (Session Description Protocol) from the mobile terminal 170 (FIG. 1) via the packet transmitting / receiving unit 186, and checks the capability information about the voice codec (second voice codec) installed in the client software of the mobile terminal 170 (FIG. 1).
  • an AMR audio codec is used as the second audio codec.
  • Similarly, the first packet transmission / reception unit 187 of the server apparatus 110 receives the SDP from the terminal 175 and checks the capability information regarding the voice codec (first voice codec) installed in the terminal 175.
  • Here it is assumed that a G.711 audio codec is used as the first audio codec.
  • The control unit 212 checks whether or not the first audio codec and the second audio codec match; in the present embodiment, it determines that they do not match and decides to perform transcoding.
  • The bandwidth estimation / rate calculation unit 183 receives, by a passive or active method, information included in the response packet from the packet transmission / reception unit 186, and estimates the bandwidth BW_1 of the mobile network 150 (FIG. 1) to which the mobile terminal 170 (FIG. 1) is connected.
  • the bandwidth estimation / rate calculation unit 183 inputs information included in the response packet from the first packet transmission / reception unit 187, and the bandwidth of the network 151 (FIG. 1) to which the terminal 175 (FIG. 1) is connected. BW_2 is estimated.
  • the band estimation / rate calculation unit 183 sends a predetermined probe packet to the mobile network 150 and / or the network 151 at predetermined time intervals.
  • the bandwidth of the mobile network 150 or / and the network 151 is estimated using the response signal packet from the portable terminal 170 or / and the terminal 175 with respect to the packet.
  • the probe packet includes a plurality of data having a predetermined size.
  • the band estimation / rate calculation unit 183 estimates the network band using three types of information included in the response signal packet.
  • the bandwidth estimation / rate calculation unit 183 estimates the downstream bandwidth of the mobile network 150 to which the mobile terminal 170 (FIG. 1) is connected using Equation (1) and Equation (2):
  W = D(j) / (R(j) - R(j-1))   ... (1)
  where W is the bandwidth estimate, D(j) is the data size of the j-th packet sent from the packet transmitting / receiving unit 186 or the first packet transmitting / receiving unit 187 to the mobile terminal 170 or the terminal 175 (FIG. 1), and R(j) and R(j-1) are the reception times at which the mobile terminal 170 or the terminal 175 (FIG. 1) receives the j-th and (j-1)-th packets, respectively.
  • the band estimation / rate calculation unit 183 smooths the band estimation value W calculated by Equation (1) temporally using Equation (2):
  BW(n) = (1 - α) × BW(n-1) + α × W   ... (2)
  where BW(n) is the band estimation value after smoothing at the n-th time and α is a constant in the range 0 < α < 1.
  • the band estimation / rate calculation unit 183 obtains an upstream band estimation value as follows. The upstream data size P(m) transmitted from the mobile terminal 170 (FIG. 1) or the terminal 175 (FIG. 1) is included in the response signal packet, and the upstream bandwidth W' is obtained as
  W' = P(m) / (T(m) - T(m-1))   ... (3)
  where T(m) and T(m-1) are the reception times at which the server apparatus 110 receives the m-th and (m-1)-th response signal packets.
  • the band estimation / rate calculation unit 183 smooths W' in the time direction in the same manner as Equation (2),
  BW'(n) = (1 - α) × BW'(n-1) + α × W'   ... (4)
  and sets the smoothed value BW'(n) as the upstream band estimation value.
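  • As a concrete illustration of Equations (1) to (4), the following is a minimal Python sketch of the bandwidth estimation and smoothing; the function names, parameter names, and the value of α are assumptions for illustration and are not specified in the present disclosure.

```python
# Minimal sketch of the bandwidth estimation of Equations (1)-(4).
# Function/parameter names and alpha's value are illustrative assumptions.

def estimate_downlink(bw_prev, d_j, r_j, r_jm1, alpha=0.1):
    """Smoothed downlink estimate BW(n) from packet size D(j) [bits] and the
    reception times R(j), R(j-1) [s] reported back by the terminal."""
    w = d_j / (r_j - r_jm1)                        # Equation (1)
    return (1.0 - alpha) * bw_prev + alpha * w     # Equation (2)

def estimate_uplink(bw_prev, p_m, t_m, t_mm1, alpha=0.1):
    """Smoothed uplink estimate BW'(n) from the uplink data size P(m) carried in
    the response packet and the server-side reception times T(m), T(m-1)."""
    w_up = p_m / (t_m - t_mm1)                     # Equation (3)
    return (1.0 - alpha) * bw_prev + alpha * w_up  # Equation (4)

# Example: a 12000-bit probe packet received 0.05 s after the previous one.
bw = estimate_downlink(bw_prev=200_000.0, d_j=12_000, r_j=1.25, r_jm1=1.20)
```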
  • Further, the band estimation / rate calculation unit 183 uses the band estimation value BW(n) smoothed by Equation (2) and, at each predetermined time, calculates the downlink bit rate C(n) according to Equation (5) and Equation (6), where C(n) is the bit rate at the n-th time and the constant used is in the range between 0 and 1.
  • Similarly, the uplink bit rate C'(n) is calculated according to Equation (7) and Equation (8) using the band estimation value BW'(n) smoothed by Equation (4).
  • The bandwidth estimation / rate calculation unit 183 calculates the uplink and downlink bit rates for the second audio codec (here, the AMR-NB audio codec) installed in the client software of the mobile terminal 170 based on Equations (7) and (8). Specifically, since the AMR-NB audio codec has eight bit rates (modes), the closest bit rate that does not exceed the bit rate of Equation (7) or (8) is selected from those eight bit rates. The band estimation / rate calculation unit 183 then outputs the selected uplink and downlink bit rates to the packet transmission / reception unit 186.
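  • The mode selection described above can be illustrated by the following sketch; the eight bit rates are the standard AMR-NB modes, while the function name and fallback behavior are assumptions.

```python
# The eight AMR-NB modes (bit rates in kbit/s) defined for the codec.
AMR_NB_MODES_KBPS = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]

def select_amr_mode(target_kbps):
    """Pick the highest AMR-NB mode whose bit rate does not exceed target_kbps.
    Falls back to the lowest mode if the target is below 4.75 kbit/s."""
    candidates = [r for r in AMR_NB_MODES_KBPS if r <= target_kbps]
    return max(candidates) if candidates else AMR_NB_MODES_KBPS[0]

# Example: a calculated downlink rate of 8.1 kbit/s selects the 7.95 kbit/s mode.
downlink_mode = select_amr_mode(8.1)
```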
  • The packet transmitting / receiving unit 186 receives the selected uplink and downlink bit rates, sets the downlink bit rate in the CMR (Codec Mode Request) of AMR-NB, describes the CMR in the payload header of the packet, and outputs the packet to the portable terminal 170.
  • For details of the CMR, reference is made to, for example, IETF (The Internet Engineering Task Force) RFC (Request for Comments) 3267.
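  • As a rough illustration of how the CMR can be carried in the payload header (octet-aligned mode of RFC 3267), consider the sketch below; the mode-index mapping helper is an assumption, and a real payload is followed by a table of contents and the speech frames themselves.

```python
# Sketch: build the first octet of an octet-aligned AMR payload (RFC 3267),
# which carries the CMR (Codec Mode Request) in its upper 4 bits.

AMR_NB_MODES_KBPS = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]

def cmr_octet(requested_kbps=None):
    """Payload-header octet: CMR in bits 7..4, reserved bits 3..0 set to zero.
    CMR = 15 means 'no mode request'."""
    cmr = 15 if requested_kbps is None else AMR_NB_MODES_KBPS.index(requested_kbps)
    return bytes([(cmr & 0x0F) << 4])

# Example: request the 7.95 kbit/s mode for the downlink direction.
header = cmr_octet(7.95)
```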
  • the packet transmission / reception unit 186 outputs the uplink and downlink bit rates to the audio conversion unit 185.
  • the downlink and uplink bit rates for the terminal 175 are output to the first packet transmission / reception unit 187 and are notified to the terminal 175 using SDP.
  • Alternatively, the downlink and uplink bit rates may be notified using SDP (Session Description Protocol) instead of the CMR, or other known methods may be used.
  • FIG. 3 is a diagram illustrating the configuration of the audio conversion unit 185.
  • the voice conversion unit 185 includes transcoding / through switching units 220_1 and 220_2, level determination units 222 and 223, a G.711 decoder 221, an AMR encoder 224, an AMR decoder 225, and a G.711 encoder 228.
  • When the first audio codec and the second audio codec differ, the transcoding / through switching units 220_1 and 220_2 receive from the control unit 212 an instruction to transcode between the first audio codec and the second audio codec, switch the processing to transcoding, and perform the following processing.
  • Here, the first audio codec is ITU-T G.711 and the second audio codec is 3GPP AMR-NB.
  • When the first audio codec and the second audio codec are the same, the transcoding / through switching units 220_1 and 220_2 switch from transcoding to through mode and allow the packets to pass: a packet input to the transcoding / through switching unit 220_1 bypasses the G.711 decoder 221 and the AMR encoder 224 (the decoding and encoding processes are skipped) and is transferred to the transcoding / through switching unit 220_2, and a packet input to the transcoding / through switching unit 220_2 bypasses the AMR decoder 225 and the G.711 encoder 228 (decoding and encoding are skipped) and is transferred to the transcoding / through switching unit 220_1.
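  • The through/transcode switching can be summarized by the following sketch; the function and codec identifiers are illustrative assumptions, and decode/encode stand in for the actual codec implementations (G.711 decoder 221, AMR encoder 224, AMR decoder 225, G.711 encoder 228).

```python
# Sketch of the decision made by the transcoding/through switching units:
# identical codecs -> pass the payload through untouched (through mode);
# different codecs -> decode with the source codec and re-encode for the destination.

def convert_voice_packet(payload, src_codec, dst_codec, decode, encode):
    if src_codec == dst_codec:
        return payload          # through mode: decoding/encoding skipped
    pcm = decode(payload)       # e.g. G.711 decoder 221 or AMR decoder 225
    return encode(pcm)          # e.g. AMR encoder 224 or G.711 encoder 228
```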
  • the transcoding from the first audio codec (G.711) to the second audio codec (AMR-NB) will be described.
  • the transcoding / through switching unit 220_1 receives a bit stream (code) by the first audio codec from the first packet transmitting / receiving unit 187 of FIG. 2, and outputs it to the level discriminating unit 222 and the G.711 decoder 221.
  • the level discriminating unit 222 extracts a part of the code from the bit stream (code) of the first audio codec (G.711). Specifically, 3 bits from the upper part are extracted and decoded except for the sign bit which is MSB (Most Significant Bit).
  • the decoded signal is smoothed or averaged over a predetermined time interval (for example, 20 ms) to obtain a processing result G1.
  • the predetermined condition is a comparison with a predetermined threshold value Th1, as shown in the following expression, but other conditions can also be used:
  G1 > Th1   ... (9)
  where Th1 is a predetermined threshold value relating to the level.
  • In the time interval in which the processing result G1 is greater than Th1, the level discriminating unit 222 instructs the AMR encoder 224 to encode the output signal of the G.711 decoder 221; in the time interval in which G1 is less than Th1, it instructs the AMR encoder 224 not to encode the output signal of the G.711 decoder 221.
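  • A minimal sketch of this level determination is shown below; it assumes a mu-law-like octet layout (sign bit followed by 3 exponent bits) and ignores the codeword complementing performed by real G.711 encoders, so the level values are only a coarse illustration.

```python
# Sketch of the level determination: take the 3 bits that follow the sign bit (MSB)
# of each G.711 octet as a coarse level, average them over a 20 ms frame
# (160 samples at 8 kHz), and compare the result G1 against the threshold Th1.

FRAME_SAMPLES = 160  # 20 ms at 8 kHz

def g711_frame_active(frame_octets, th1):
    """True if the averaged coarse level G1 of the frame exceeds Th1, i.e. the
    frame should be AMR-encoded by the AMR encoder 224."""
    levels = [(b >> 4) & 0x07 for b in frame_octets]  # 3 bits below the sign bit
    g1 = sum(levels) / len(levels)                    # average over the frame
    return g1 > th1
```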
  • the G.711 decoder 221 receives the bit stream of the first audio codec (G.711) from the transcoding / through switching unit 220_1, decodes it, and outputs the decoded signal to the AMR encoder 224.
  • the AMR encoder 224 has an AMR-NB encoder; it receives the instruction to perform AMR encoding from the level discriminating unit 222 and the downstream bit rate from the band estimation / rate calculating unit 183, and, only for the time interval with an encoding instruction, encodes the decoded signal (G.711 decoded signal) input from the G.711 decoder 221 into AMR-NB and outputs it to the transcoding / through switching unit 220_2.
  • bit rate for AMR-NB encoding follows the bit rate input from the band estimation / bit rate calculation unit 183.
  • the AMR encoder 224 does not perform AMR-NB encoding and does not output to the transcoding / through switching unit 220_2 in a time section where no encoding instruction is given.
  • the transcoding / through switching unit 220_2 outputs the bit stream (code) of the second audio codec input from the AMR encoder 224 to the packet transmitting / receiving unit 186 in FIG. 2.
  • the transcoding / through switching unit 220_2 receives the bit stream (code) of the second audio codec from the packet transmission / reception unit 186, and outputs it to the level determination unit 223 and the AMR decoder 225.
  • the level discriminating unit 223 and the AMR decoder 225 receive the bit rate in the uplink direction of the second audio codec from the band estimation / rate calculating unit 183.
  • the level determination unit 223 receives the bit stream (code) of the second audio codec and extracts a part of the code based on the bit rate information. Specifically, the portion indicating the gain is extracted from the code, the gain is decoded from the extracted code, and a value G2 is obtained for each 20 ms time interval. The decoded gain G2 is smoothed in the time direction according to the following equation:
  Gm = (1 - β) × Gm-1 + β × G2   ... (10)
  where Gm represents the gain after smoothing and β is a smoothing constant with 0 < β < 1.
  • the predetermined condition is the following expression (11), namely that the smoothed gain Gm is equal to or greater than a predetermined threshold value, but other conditions can also be used.
  • When the condition of expression (11) is satisfied, the level discriminating unit 223 instructs the G.711 encoder 228 to G.711-encode the output signal (decoded signal) of the AMR decoder 225; when the condition of expression (11) is not satisfied, no encoding instruction is given to the G.711 encoder 228.
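  • The gain smoothing of expression (10) and the decision of expression (11) can be sketched as follows; the function name and parameter names are assumptions for illustration.

```python
# Sketch of expressions (10) and (11): the gain G2 decoded from each 20 ms AMR frame
# is smoothed recursively, and the smoothed value Gm is compared with a threshold to
# decide whether the G.711 encoder 228 should encode the AMR-decoded signal.

def update_gain_and_decide(gm_prev, g2, beta, threshold):
    """Return (Gm, encode_flag) for the current 20 ms time interval."""
    gm = (1.0 - beta) * gm_prev + beta * g2   # expression (10)
    return gm, gm >= threshold                # expression (11)
```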
  • the AMR decoder 225 receives the upstream bit rate of the second audio codec from the band estimation / rate calculation unit 183, performs AMR decoding on the bit stream of the second audio codec according to that bit rate, and outputs the decoded signal to the G.711 encoder 228.
  • When the G.711 encoder 228 receives an encoding instruction from the level discriminating unit 223, it G.711-encodes the decoded signal and outputs it to the transcoding / through switching unit 220_1; in a time interval with no encoding instruction, the G.711 encoder 228 does not perform G.711 encoding and does not output to the transcoding / through switching unit 220_1.
  • the transcoding / through switching unit 220_1 outputs the bit stream (code) of the first audio codec input from the G.711 encoder 228 to the first packet transmitting / receiving unit 187 in FIG. 2.
  • the first packet transmitting / receiving unit 187 in FIG. 2 inputs the destination IP address and the SIP message from the control unit 212, and inputs the SDP describing the uplink and downlink bit rates from the band estimation / rate calculation unit 183. These are output as SIP / SDP packets to the MGW apparatus 196 in FIG. Further, the first packet transmission / reception unit 187 in FIG. 2 receives the bit stream by the first audio codec from the audio conversion unit 185, packetizes it by a predetermined protocol, and outputs it to the MGW apparatus 196 in FIG. .
  • RTP / UDP / IP is used as a predetermined protocol, but other well-known protocols may be used.
  • the packet transmission / reception unit 186 in FIG. 2 inputs from the band estimation / rate calculation unit 183 the CMR or SDP describing the uplink and downlink bit rates for the AMR-NB audio codec installed in the client software of the mobile terminal 170, A bit stream of the second audio codec is input from the conversion unit 185, and a predetermined packet is configured and output to the SGSN / GGSN apparatus 190 of FIG.
  • RTP / UDP / IP is used as a predetermined protocol, but other well-known protocols may be used.
  • CMR is used as a method for specifying the uplink and downlink bit rates for AMR-NB
  • CMR is incorporated into the payload format of the RTP packet.
  • FIG. 4 is a diagram illustrating the configuration of the mobile terminal 170 (FIG. 1) that is a thin client terminal.
  • the client software 171 includes a first packet transmission / reception unit 260, second and third packet reception units 250 and 251, a packet transmission unit 258, an image decoder 252, a screen display unit 256, an audio decoder 255, a bit rate control unit 261, an AMR decoder 262, an AMR encoder 263, and an operation signal generation unit 257.
  • the processing and functions of these units are realized by a program executed on the computer of the mobile terminal 170.
  • client software 171 is installed in the portable terminal 170 to execute the thin client operation.
  • the thin client software is equipped with the AMR-NB audio codec that is the second audio codec.
  • In the case of a voice call, when the user operates the screen of the portable terminal to activate the voice call VoIP application software, the operation signal generation unit 257 generates an operation signal for activation; this is packetized by the packet transmission unit 258 and transmitted from the portable terminal 170 to the mobile network 150 (FIG. 1).
  • the first packet transmission / reception unit 260 inputs a packet storing the SIP / SDP message and the second voice codec transmitted from the server apparatus 110 (FIG. 1).
  • the first packet transmitting / receiving unit 260 also receives probe packets sent from the band estimation / rate calculation unit 183 (FIG. 2) of the server apparatus 110.
  • the first packet transmitting / receiving unit 260 extracts the AMR-NB uplink and downlink bit rate information from the SDP message or from the CMR of the RTP payload format of the packet, and outputs it to the bit rate control unit 261. Also, the bit stream of the second audio codec is extracted from the RTP payload and output to the AMR decoder 262.
  • For the probe packet, a response signal packet including the necessary information is created and transmitted from the mobile terminal 170 to the mobile network 150.
  • The necessary information includes, for example: (1) the data size at which the arrival time at the mobile terminal 170 starts to be delayed with respect to the probe packet flow, (2) the arrival time of the probe packet, (3) the data size included in the response packet sent from the mobile terminal to the server device, and (4) the sending time at which the response packet is sent.
  • Similarly, a response signal packet is created for a packet transmitted from the server device 110 (FIG. 1) and transmitted from the mobile terminal 170 to the mobile network 150 (FIG. 1).
  • The response signal packet includes, for example: (1) the received data size, (2) the reception time at which the transmitted packet is received by the mobile terminal, and (3) the data size included in the response signal packet sent from the mobile terminal 170 to the server device 110.
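  • For illustration, the fields listed above could be grouped as follows; the dataclass and field names are assumptions, not part of the present disclosure.

```python
# Sketch of the information the terminal returns in a response signal packet.
from dataclasses import dataclass

@dataclass
class ResponseSignal:
    received_data_size: int   # (1) size of the packet received from the server
    reception_time: float     # (2) time at which the terminal received that packet
    response_data_size: int   # (3) size of this response signal packet itself
```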
  • the bit rate control unit 261 outputs the downlink bit rate to the AMR decoder 262, and outputs the uplink bit rate to the AMR encoder 263.
  • the AMR decoder 262 receives the downlink bit rate from the bit rate control unit 261, selects one of the eight AMR-NB modes based on the downlink bit rate, and decodes the input bit stream of the second audio codec with the AMR decoder at the selected bit rate. In a time interval in which no bit stream is input, a low-level noise signal is generated by CNG (Comfort Noise Generation) (background noise such as white noise in a silent interval is generated in a pseudo manner) and connected to the decoded signal, so that a continuous audio signal is generated and output from the portable terminal 170.
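  • The terminal-side reconstruction with comfort noise can be sketched as follows; decode_amr_frame stands in for the real AMR-NB decoder, and the names and noise level are assumptions.

```python
# Sketch of the terminal-side decoding loop: received AMR frames are decoded at the
# selected mode, and intervals with no received frame are filled with low-level
# comfort noise (CNG) so that a continuous audio signal is produced.
import random

FRAME_SAMPLES = 160  # 20 ms at 8 kHz

def reconstruct_audio(received_frames, decode_amr_frame, noise_level=50):
    """received_frames: list of AMR frames, with None where nothing arrived."""
    pcm = []
    for frame in received_frames:
        if frame is not None:
            pcm.extend(decode_amr_frame(frame))
        else:
            pcm.extend(random.randint(-noise_level, noise_level)
                       for _ in range(FRAME_SAMPLES))
    return pcm
```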
  • the AMR encoder 263 selects one of eight modes based on the bit rate input from the bit rate control unit 261, encodes the voice uttered by the user of the mobile terminal 170 at the specified bit rate, 2 transmits a bit stream based on the second audio codec to the first packet transmitting / receiving unit 260, and the first packet transmitting / receiving unit 260 transmits the bit stream to the mobile network 150 from the mobile terminal 170.
  • the second packet receiving unit 250 receives a compression-encoded bitstream for the screen signal, decodes the compression-encoded bitstream using the same image codec as that of the server apparatus 110, and outputs it to the screen display unit 256.
  • the screen display unit 256 inputs the decoded screen signal, constructs a screen, and displays it on the screen of a display unit (not shown) of the mobile terminal.
  • the third packet receiving unit 251 When there is an audio signal associated with the screen, the third packet receiving unit 251 inputs a packet in which a compressed encoded bit stream for the audio signal is stored, extracts the compressed encoded bit stream for the audio signal, and extracts the audio decoder To 255.
  • the audio decoder 255 receives a compression-encoded bit stream for the audio signal, decodes it, and outputs it from a speaker (not shown) of the mobile terminal 170.
  • the case of a mobile 3G network has been described as the network 150.
  • a mobile LTE (Long Term Evolution) network may be used.
  • a fixed network, an NGN (Next Generation Network) network, a W-LAN network, the Internet network, or the like can also be used.
  • a fixed terminal can be used instead of the portable terminal.
  • the server device 110 can be arranged not in the corporate network but in a mobile network or a fixed network.
  • the server device 110 may be arranged in the mobile network. Alternatively, it may be arranged in a fixed network. Moreover, a smart phone and a tablet can also be used for the portable terminal 170 as a terminal. Other known audio codecs can be used for the first audio codec and the second audio codec.
  • a condition different from the condition of the above embodiment may be used.
  • In the above embodiment, when the audio codecs of the terminals do not match, the audio codec is transcoded by the audio conversion unit 185 of the server apparatus 110.
  • When the audio codecs match, the server apparatus 110 does not perform transcoding; the packet may be passed through by the voice conversion unit 185, or a configuration may be adopted in which the packet does not pass through the voice conversion unit 185 at all and is passed directly between the packet transmission / reception unit 186 and the first packet transmission / reception unit 187.
  • According to the above embodiment, the bit rate is calculated by the server in accordance with fluctuations of the bandwidth of the mobile network or the like, and the bit rate of the audio codec of the thin client terminal is switched based on it. Therefore, when the network bandwidth narrows, the above problem that the delay time becomes long and it becomes difficult to make a call can be solved.
  • Furthermore, since the bit rate of the audio codec at the thin client terminal is switched in accordance with the network bandwidth, even if the audio codecs differ between the thin client terminal and the non-thin client terminal, transcoding can be performed with a small load on the server device.
  • A communication system comprising: a terminal; and a server device that connects to the terminal via a network, transfers, to the terminal, screen information obtained by operating an application in a virtual client unit in response to an operation on the terminal, and displays the screen information on the terminal;
  • the server device determines whether or not the operation is a voice call based on an operation signal received from the terminal;
  • when the control unit determines that a voice call is made, the packet storing the voice data transmitted from the terminal is transcoded based on the instruction from the control unit or is passed through without being transcoded;
  • the terminal includes at least a first terminal;
  • the voice conversion unit performs transcoding only for a time interval in which a signal obtained from a part of the code by the voice codec of the second terminal satisfies a predetermined condition.
  • the voice conversion unit comprises: a first decoder for decoding a bit stream by the audio codec of the second terminal; a second decoder for decoding a bit stream by the audio codec of the first terminal; a first encoder; a second encoder; and a first determination unit that, when the condition that a signal obtained by extracting a part of the bit stream by the audio codec of the second terminal and smoothing or averaging it over a predetermined time interval is equal to or greater than a predetermined threshold is satisfied, instructs the first encoder to encode the output of the first decoder for that time interval;
  • a second determination unit that extracts the portion representing the gain from the bit stream by the voice codec of the first terminal, decodes the gain from the extracted code, smooths in the time direction the decoded gain obtained for each predetermined time interval, and, when the condition that the smoothed gain is equal to or greater than a predetermined threshold is satisfied, instructs the second encoder to encode the output of the second decoder;
  • the first encoder encodes the signal decoded by the first decoder using a coding method of the voice codec of the first terminal for a time interval instructed by the first determination unit.
  • the second encoder encodes the signal decoded by the second decoder in the encoding method of the audio codec of the second terminal for a time interval instructed by the second determination unit.
  • the first and second encoders output the encoded signals toward the first and second terminals, respectively, and do not encode and do not output in a time interval without an instruction to encode;
  • based on an instruction from the control unit, when the audio codecs of the first terminal and the second terminal are the same, the bit stream by the audio codec of the second terminal is output as it is toward the first terminal, and the bit stream by the audio codec of the first terminal is output as it is toward the second terminal.
  • the band estimation rate calculation unit calculates uplink and downlink bit rates for the audio codec of the first terminal and the second terminal, and notifies the first terminal and the second terminal, respectively, Further, the uplink and downlink bit rates are output to the audio conversion unit,
  • the communication system according to appendix 2 wherein in the voice conversion unit, the bit rate of encoding of the first and second encoders is in accordance with the bit rate output from the band estimation rate calculation unit.
  • the controller:
  (A) analyzes the operation signal and, in the case of a voice call start operation, starts a voice call application;
  (B) obtains the destination telephone number selected by the user of the first terminal from the voice call application, and obtains the destination address from the destination telephone number;
  (C) sets the destination address in the message received from the first terminal;
  (D) instructs the bandwidth estimation rate calculation unit to estimate the upstream bandwidth and downstream bandwidth of the network connected to the first terminal;
  (E) instructs it to estimate the upstream bandwidth and downstream bandwidth of the network connected to the second terminal;
  (F) checks capability information regarding the first audio codec, which is the audio codec of the second terminal, and the second audio codec, which is the audio codec of the first terminal, and determines whether they match.
  • the band estimation rate calculation unit, from information included in the response signal from each of the first and second terminals:
  obtains the downstream bandwidth W of the network to which each terminal is connected by dividing the data size D(j) of the j-th (j is a predetermined positive integer) packet sent from the server device by the difference R(j) - R(j-1) between the reception times R(j) and R(j-1) at which the terminal receives the j-th and (j-1)-th packets;
  smooths the band estimation value W in time to obtain the band estimation value BW(n) at the n-th time after smoothing;
  and, since the response signal from each terminal includes the uplink data size transmitted by that terminal, obtains the upstream bandwidth W' of the network to which each terminal is connected by dividing the m-th (m is a predetermined positive integer) data size P(m) by the difference T(m) - T(m-1) between the reception times T(m) and T(m-1) at which the m-th and (m-1)-th response signals are received, smooths W' in the time direction, and sets the band estimation value BW'(n) at the n-th time after smoothing as the upstream band estimation value;
  • at least one of the first and second terminals includes a decoder that receives the encoded bit stream transmitted from the server device and outputs a decoded speech signal;
  • in a time interval in which no bit stream is received, a noise signal is generated by CNG (Comfort Noise Generation) and connected to the decoded signal.
  • a transmission / reception unit that connects to a terminal via a network, receives an operation signal from the terminal, and transmits / receives a signal to / from the terminal and another terminal;
  • a control unit for determining whether or not the operation is a voice call based on an operation signal received from the terminal;
  • a voice conversion unit that, when the control unit determines that a voice call is made, transcodes the packet storing the voice data transmitted from the terminal according to the instruction of the control unit, or passes the packet through without transcoding, and outputs it toward the other party;
  • a bandwidth estimation rate calculation unit that estimates the bandwidth of the network based on a response signal from the terminal to transmission of a predetermined packet, calculates the bit rate of a voice codec, and notifies the terminal of the bit rate;
  • the terminal includes a first terminal;
  • the other terminal includes a second terminal having a voice codec different from the voice codec of the first terminal;
  • the voice conversion unit performs transcoding only for a time interval in which a signal obtained from a part of the code by the voice codec of the second terminal satisfies a predetermined condition.
  • The server device according to appendix 8, wherein the server device is connected to the terminal via a network, and screen information obtained by operating an application in a virtual client unit in response to an operation on the terminal is transferred to the terminal and displayed on the terminal.
  • the voice conversion unit comprises: a first decoder for decoding a bit stream by the audio codec of the second terminal; a second decoder for decoding a bit stream by the audio codec of the first terminal; a first encoder; a second encoder; and a first determination unit that, when the condition that a signal obtained by extracting a part of the bit stream by the audio codec of the second terminal and smoothing or averaging it over a predetermined time interval is equal to or greater than a predetermined threshold is satisfied, instructs the first encoder to encode the output of the first decoder for that time interval;
  • a second determination unit that extracts the portion representing the gain from the bit stream by the voice codec of the first terminal, decodes the gain from the extracted code, smooths in the time direction the decoded gain obtained for each predetermined time interval, and, when the condition that the smoothed gain is equal to or greater than a predetermined threshold is satisfied, instructs the second encoder to encode the output of the second decoder;
  • the first encoder encodes the signal decoded by the first decoder using a coding method of the voice codec of the first terminal for a time interval instructed by the first determination unit.
  • the second encoder encodes, for a time interval instructed by the second determination unit, the signal decoded by the second decoder with the encoding method of the audio codec of the second terminal and outputs it toward the second terminal, and does not encode and does not output in a time interval without an instruction to encode;
  • The server apparatus according to appendix 8 or 9, characterized in that, based on an instruction from the control unit, when the audio codecs of the first terminal and the second terminal are the same, the bit stream by the audio codec of the second terminal is output as it is toward the first terminal, and the bit stream by the audio codec of the first terminal is output as it is toward the second terminal.
  • the band estimation rate calculation unit calculates uplink and downlink bit rates for the audio codec of the first terminal and the second terminal, and notifies the first terminal and the second terminal, respectively, Further, the uplink and downlink bit rates are output to the audio conversion unit,
  • the server apparatus according to appendix 8, wherein in the voice conversion unit, the bit rates of encoding of the first and second encoders are in accordance with the bit rate output from the band estimation rate calculation unit.
  • the controller:
  (A) analyzes the operation signal and, in the case of a voice call start operation, starts a voice call application;
  (B) obtains the destination telephone number selected by the user of the first terminal from the voice call application, and obtains the destination address from the destination telephone number;
  (C) sets the destination address in the message received from the first terminal;
  (D) instructs the bandwidth estimation rate calculation unit to estimate the upstream bandwidth and downstream bandwidth of the network connected to the first terminal;
  (E) instructs it to estimate the upstream bandwidth and downstream bandwidth of the network connected to the second terminal;
  (F) checks capability information regarding the first audio codec, which is the audio codec of the second terminal, and the second audio codec, which is the audio codec of the first terminal, and determines whether they match.
  • The bandwidth estimation/rate calculation unit, from information included in the response signals from each of the first and second terminals, obtains the downstream bandwidth W of the network to which each terminal is connected by dividing the data size D(j) of the j-th packet (j is a predetermined positive integer) sent from the server device by the difference R(j) - R(j-1) between the reception times R(j) and R(j-1) at which the terminal received the j-th and (j-1)-th packets, smooths the bandwidth estimate W in the time direction, and uses BW(n), the value at the n-th time after smoothing, as the downstream bandwidth estimate; and
  • the response signal from each terminal includes the uplink data size transmitted by that terminal; the upstream bandwidth W' of the network to which each terminal is connected is obtained by dividing the m-th data size P(m) (m is a predetermined positive integer) by the difference T(m) - T(m-1) between the reception times T(m) and T(m-1) at which the m-th and (m-1)-th response signals are received, W' is smoothed in the time direction, and BW'(n), the value at the n-th time after smoothing, is used as the upstream bandwidth estimate.
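The downstream estimate above is a per-packet division followed by smoothing in the time direction, and the upstream estimate BW'(n) follows the same pattern with P(m) and T(m). The sketch below assumes exponential smoothing and sizes in bits with times in seconds; these are illustrative choices only.

```python
def downstream_bandwidth(sizes_bits, recv_times_s, alpha=0.8):
    """W(j) = D(j) / (R(j) - R(j-1)), smoothed over time to give BW(n).

    sizes_bits[j]   : data size D(j) of the j-th packet sent by the server device
    recv_times_s[j] : reception time R(j) reported back by the terminal
    alpha           : smoothing factor (assumption; the appendix only requires
                      smoothing in the time direction)
    """
    bw = None
    estimates = []
    for j in range(1, len(sizes_bits)):
        dt = recv_times_s[j] - recv_times_s[j - 1]
        if dt <= 0:
            continue                     # skip degenerate or reordered reports
        w = sizes_bits[j] / dt           # instantaneous estimate W in bit/s
        bw = w if bw is None else alpha * bw + (1 - alpha) * w
        estimates.append(bw)             # BW(n): smoothed estimate at the n-th time
    return estimates

# The upstream estimate is analogous: divide the uplink data size P(m) carried in
# the m-th response signal by T(m) - T(m-1), the spacing of response-signal
# arrivals at the server, then smooth to obtain BW'(n).
```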
  • A communication method in which a first terminal and a second terminal make a voice call via a server device connected over a network, wherein
  • the server device estimates the bandwidth of the network based on response signals from the first and second terminals to transmission of predetermined packets from the server device, calculates bit rates for the voice codecs of the terminals, and notifies the first and second terminals of the bit rates;
  • it is determined whether the voice codecs of the first terminal and the second terminal are the same, and if they are the same, the bit streams produced by the voice codecs of the first and second terminals are output as-is to the second and first terminals, respectively;
  • otherwise transcoding is performed in the server device; in that case, a part of the code produced by the voice codec of the second terminal is transcoded from the voice codec of the second terminal to the voice codec of the first terminal for time intervals that satisfy a predetermined condition, and the transcoded code is output to the first terminal; and
  • the communication method comprises transcoding from the voice codec of the first terminal to the voice codec of the second terminal for time intervals that satisfy a predetermined condition, and outputting the transcoded code to the second terminal.
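The through-versus-transcode branching of this method can be pictured as follows; the session and packet attributes and the converter object are placeholders, and the transcoder itself would apply the interval-selective scheme sketched earlier.

```python
def route_voice_packet(packet, session, converter):
    """Forward a voice packet unchanged when both terminals use the same codec;
    otherwise transcode it toward the receiving terminal's codec."""
    if session.codec_a == session.codec_b:
        return packet                                   # through: output the bit stream as-is
    if packet.source == session.terminal_b:
        # Direction: second terminal -> first terminal.
        return converter.transcode(packet, src=session.codec_b, dst=session.codec_a)
    # Direction: first terminal -> second terminal.
    return converter.transcode(packet, src=session.codec_a, dst=session.codec_b)
```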
  • The server device is connected to a terminal via a network and, in response to operations performed on the terminal, transfers screen information obtained by operating an application in a virtual client unit to the terminal and causes the terminal to display it, and executes:
  • a transmission/reception process of connecting to the terminal via the network, receiving an operation signal from the terminal, and transmitting and receiving signals to and from the terminal and another terminal; a control process of determining, based on the operation signal received from the terminal, whether or not the operation is a voice call; a voice conversion process of, when the control process determines that a voice call is being made, transcoding packets storing the voice data transmitted from the terminal in accordance with an instruction from the control process, or passing the packets through without transcoding; and
  • a bandwidth estimation/rate calculation process of estimating the network bandwidth based on a response signal from the terminal to the packet transmission, calculating a voice codec bit rate, and notifying the terminal of the bit rate, wherein
  • the terminal includes a first terminal,
  • the other terminal includes a second terminal having a voice codec different from the voice codec of the first terminal, and
  • the voice conversion process performs the transcoding by transcoding from the voice codec of the second terminal to the voice codec of the first terminal for time intervals in which a signal obtained from a part of the code produced by the voice codec of the second terminal satisfies a predetermined condition and outputting the transcoded code for the first terminal, and by transcoding from the voice codec of the first terminal to the voice codec of the second terminal for time intervals in which a signal obtained from a part of the code produced by the voice codec of the first terminal satisfies a predetermined condition and outputting the transcoded code for the second terminal.
  • The voice conversion process includes: a first decoding process of decoding a bit stream produced by the voice codec of the second terminal; a second decoding process of decoding a bit stream produced by the voice codec of the first terminal; a first encoding process; a second encoding process; a first determination process of, when a signal obtained by extracting a part of the code from the bit stream produced by the voice codec of the second terminal and smoothing or averaging it over a predetermined time interval is equal to or greater than a predetermined threshold, instructing the first encoding process to encode the output of the first decoding process for that time interval; and a second determination process of extracting the portion representing the gain from the bit stream produced by the voice codec of the first terminal, decoding the gain from the extracted code, smoothing the decoded gain obtained for each predetermined time interval in the time direction, and, when the smoothed gain is equal to or greater than a predetermined threshold, instructing the second encoding process to encode the output of the second decoding process;
  • the first encoding process encodes the signal decoded by the first decoding process using the encoding scheme of the voice codec of the first terminal for the time intervals for which encoding is instructed by the first determination process, outputs the result for the first terminal, and performs no encoding and no output in time intervals without an encoding instruction;
  • the second encoding process encodes the signal decoded by the second decoding process using the encoding scheme of the voice codec of the second terminal for the time intervals for which encoding is instructed by the second determination process, outputs the result for the second terminal, and performs no encoding and no output in time intervals without an encoding instruction; and
  • based on an instruction from the control process, when the voice codecs of the first terminal and the second terminal are the same, the bit stream produced by the voice codec of the second terminal is output as-is for the first terminal and the bit stream produced by the voice codec of the first terminal is output as-is for the second terminal.
  • server apparatus; 111 phone book; 130 cloud network; 150 mobile network; 151 fixed network; 170 mobile terminal; 171 client software; 175 terminal; 176 second packet transmission unit; 177 third packet transmission unit; 180 screen capture unit; 183 bandwidth estimation/rate calculation unit; 185 voice conversion unit; 186 packet transmission/reception unit; 187 first packet transmission/reception unit; 188 image encoder unit; 189 audio encoder unit; 190 SGSN/GGSN device; 194 base station device; 195 RNC device; 196 MGM device; 211 virtual client unit; 212 control unit; 213 screen generation unit; 214 voice call VoIP application software; 220_1, 220_2 transcoding/through switching unit; 221 G.711 decoder; 222, 223 level determination unit; 224 AMR encoder; 225 AMR decoder; 228 G.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)
PCT/JP2013/075469 2012-09-21 2013-09-20 Communication system and method, server device, and terminal WO2014046239A1 (ja)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2012208663 2012-09-21
JP2012-208663 2012-09-21

Publications (1)

Publication Number Publication Date
WO2014046239A1 true WO2014046239A1 (ja) 2014-03-27

Family

ID=50341540

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2013/075469 WO2014046239A1 (ja) 2012-09-21 2013-09-20 Communication system and method, server device, and terminal

Country Status (2)

Country Link
TW (1) TW201421963A (zh)
WO (1) WO2014046239A1 (zh)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003158534A * 2001-11-21 2003-05-30 Ntt Comware Corp Thin client server, call connection method, program therefor, and recording medium on which the program is recorded
JP2005039724A * 2003-07-18 2005-02-10 Motorola Inc Communication control method and communication control apparatus
WO2011055721A1 * 2009-11-04 2011-05-12 日本電気株式会社 Gateway device, mobile terminal, mobile communication method, and program

Also Published As

Publication number Publication date
TW201421963A (zh) 2014-06-01

Similar Documents

Publication Publication Date Title
JP4645856B2 (ja) Protocol conversion system for media communication between a packet-switched network and a circuit-switched network
KR101479393B1 (ko) Codec deployment using in-band signals
JP2010521856A (ja) Data transmission method in a communication system
US9826072B1 (en) Network-terminal interoperation using compatible payloads
JP5943082B2 (ja) Remote communication system, server device, remote communication method, and program
US8359620B2 (en) Set-top box for wideband IP telephony service and method for providing wideband IP telephony service using set-top box
WO2014017468A1 (ja) Communication system, method, and program
KR101709244B1 (ko) Communication system, method, and program
TWI519104B (zh) Voice information transmission method and packet communication system
WO2021017807A1 (zh) Call connection establishment method, first terminal, server, and storage medium
WO2014046239A1 (ja) Communication system and method, server device, and terminal
JP5858164B2 (ja) Communication system, server device, control method of server device, and program
JP4120440B2 (ja) Communication processing device, communication processing method, and computer program
US7619994B2 (en) Adapter for use with a tandem-free conference bridge
JPWO2010035791A1 (ja) Gateway device, method, system, and terminal
JP6972576B2 (ja) Communication device, communication system, communication method, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13839675

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13839675

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP