JP2005045742A - Calling device, calling method, and calling system - Google Patents

Publication number
JP2005045742A
Authority
JP
Japan
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2003280435A
Other languages
Japanese (ja)
Other versions
JP4207701B2 (en)
Inventor
Tadayuki Hattori
Akihiro Hokimoto
Satoru Kawabata
Yoshiyuki Kunito
晃弘 保木本
義之 國頭
哲 川畑
忠幸 服部
Original Assignee
Sony Corp
ソニー株式会社
Application filed by Sony Corp (ソニー株式会社)
Priority to JP2003280435A
Publication of JP2005045742A
Application granted
Publication of JP4207701B2
Application status: Expired - Fee Related

Abstract

PROBLEM TO BE SOLVED: When a headset is plugged into a PC but removed from the ear, the ringtone cannot be heard, so a user who is away from the PC does not notice an incoming call.
SOLUTION: A headphone reproducing unit 44 converts ring tone data multiplied by a gain coefficient k6 into an analog signal and supplies the analog signal to a headphone 7b. The headphone 7b therefore emits, at the user's ear, a ringtone at the headphone ring tone volume level set by the user when a call is received from another VoIP client 5. Further, a gain adjustment unit 43 multiplies the ring tone PCM data, which is the decoded output of a decoding unit 41, by a gain coefficient k7, which is the speaker ring tone volume level set by the user, and supplies the result to a speaker reproducing unit 45.
[Selection] Figure 2

Description

  The present invention relates to a call device, a call method, and a call system that use a network such as the Internet, enable calls in a high-sound-quality environment, and transmit and receive background music (BGM) or sound effects (SE) in addition to the call voice.

  In Japanese Patent Application Laid-Open No. 2002-237873, the applicant of the present application discloses a technique for a digital mobile phone having a music data reproduction function. When a call arrives while the user is listening to music through an earphone, the ringtone is superimposed on the music and output from the earphone, and music playback is stopped in response to operation of the on-hook/off-hook button. In response to operation of a switch on the main unit, the phone switches between making the call with the combination of the earphone and an external microphone and making the call with the combination of the speaker and microphone of the telephone body.

  In recent years, telephones using the Internet have become widespread. When a personal computer (PC) is used as the call device, for example, a headset including headphones and a microphone is often used instead of the speaker and microphone of the PC main body, both to free the user's hands for operating the keyboard and mouse and to prevent echo.

  In particular, when the speaker and microphone of the PC main body are used, an echo canceller is required. Without an echo canceller, the voice you utter is output from the other party's speaker, picked up by the other party's microphone, and returned to you over the connection, which makes it very difficult to speak. The difficulty is even greater for a stereo voice call.

Patent Document 1: JP 2002-237873 A

  When the technique disclosed in Patent Document 1 is applied to an Internet telephone, the ringtone cannot be heard if the headset is plugged into the PC but removed from the ear. A user who is away from the PC then cannot hear the ringtone and does not notice an incoming call. It is therefore conceivable to reproduce the ringtone from both the headset and the speaker built into the PC. However, if the volume is set so that the speaker can be heard from a distance and a call happens to arrive while the user is wearing the headset, the user is unintentionally exposed to a loud sound at the ear, which is inconvenient.

  The same is true for mobile phone devices. When the user is away from the mobile phone, the ringtone is set to a loud volume so that it can be heard.

  In order to solve the above-described problem, a call device according to the present invention performs two-way communication for voice conversation via a network and includes, as a reception system: ringtone data storage means for storing ringtone data in units of files; first gain adjustment means for adjusting gain by multiplying data from the ringtone data storage means by a variable gain coefficient; headphone reproducing means for driving headphones based on the output of the first gain adjustment means; second gain adjustment means for adjusting gain by multiplying the data from the ringtone data storage means by a variable gain coefficient; and speaker reproducing means for driving a speaker based on the output of the second gain adjustment means.

  The first gain adjustment means adjusts the gain by multiplying the data from the ring tone data storage means by a variable gain coefficient. The headphone reproducing means drives the headphones based on the output of the first gain adjusting means. The second gain adjustment means adjusts the gain by multiplying the data from the ring tone data storage means by a variable gain coefficient. The speaker reproducing means drives the speaker based on the output of the second gain adjusting means.

  In order to solve the above-described problem, a call method according to the present invention performs two-way communication for voice conversation via a network and, as a reception system that stores ringtone data in units of files in ringtone data storage means, includes: a first gain adjustment step of adjusting gain by multiplying data from the ringtone data storage means by a variable gain coefficient; a headphone reproduction step of driving headphones based on the output of the first gain adjustment step; a second gain adjustment step of adjusting gain by multiplying the data from the ringtone data storage means by a variable gain coefficient; and a speaker reproduction step of driving a speaker based on the output of the second gain adjustment step.

  In the first gain adjustment step, the gain is adjusted by multiplying the data from the ring tone data storage means by a variable gain coefficient. In the headphone reproduction step, the headphones are driven based on the output of the first gain adjustment step. In the second gain adjustment step, the gain is adjusted by multiplying the data from the ring tone data storage means by a variable gain coefficient. In the speaker reproduction step, the speaker is driven based on the output of the second gain adjustment step.

  In order to solve the above problems, a call system according to the present invention performs two-way communication for voice conversation using a plurality of call devices connected to a network. Each call device includes, as a reception system: ringtone data storage means for storing ringtone data in units of files; first gain adjustment means for adjusting gain by multiplying data from the ringtone data storage means by a variable gain coefficient; headphone reproducing means for driving headphones based on the output of the first gain adjustment means; second gain adjustment means for adjusting gain by multiplying the data from the ringtone data storage means by a variable gain coefficient; and speaker reproducing means for driving a speaker based on the output of the second gain adjustment means.

  In the receiving system of the call device, the first gain adjustment means adjusts the gain by multiplying the data from the ring tone data storage means by a variable gain coefficient. The headphone reproducing means drives the headphones based on the output of the first gain adjusting means. The second gain adjustment means adjusts the gain by multiplying the data from the ring tone data storage means by a variable gain coefficient. The speaker reproducing means drives the speaker based on the output of the second gain adjusting means.

  According to the communication device of the present invention, the first gain adjustment means adjusts the gain by multiplying the data from the ring tone data storage means by a variable gain coefficient. The headphone reproducing means drives the headphones based on the output of the first gain adjusting means. The second gain adjustment means adjusts the gain by multiplying the data from the ring tone data storage means by a variable gain coefficient. The speaker reproducing means drives the speaker based on the output of the second gain adjusting means. Therefore, optimal volume adjustment can be performed with the headset and the speaker. It is also possible to increase only the ringtone of the speaker so that it can be heard from a distance.

  According to the calling method of the present invention, the first gain adjustment step adjusts the gain by multiplying the data from the ring tone data storage means by a variable gain coefficient. The headphone reproduction step drives the headphones based on the output of the first gain adjustment step. In the second gain adjustment step, the gain is adjusted by multiplying the data from the ring tone data storage means by a variable gain coefficient. The speaker reproduction step drives the speaker based on the output of the second gain adjustment step. Therefore, optimal volume adjustment can be performed with the headset and the speaker. It is also possible to increase only the ringtone of the speaker so that it can be heard from a distance.

  According to the calling system of the present invention, in the receiving system of the calling device, the first gain adjusting means adjusts the gain by multiplying the data from the ring tone data storing means by the variable gain coefficient. The headphone reproducing means drives the headphones based on the output of the first gain adjusting means. The second gain adjustment means adjusts the gain by multiplying the data from the ring tone data storage means by a variable gain coefficient. The speaker reproducing means drives the speaker based on the output of the second gain adjusting means. Therefore, optimal volume adjustment can be performed with the headset and the speaker. It is also possible to increase only the ringtone of the speaker so that it can be heard from a distance.

  Hereinafter, as the best mode for carrying out the present invention, a VoIP call system conforming to the Internet telephony protocol called Voice over IP (VoIP), and a VoIP client used in that system, will be described.

  First, an outline of the VoIP call system 1 will be described. This VoIP call system transmits and receives background music (BGM) or sound effect (SE) in addition to call voice between VoIP clients.

  As shown in FIG. 1, a VoIP client 2 is connected to the Internet 4 via, for example, a public line 3, and communicates bidirectionally for voice conversation with another VoIP client 5 connected to the Internet 4. A VoIP server 6 is also connected to the Internet 4 and performs communication control based on VoIP. In this VoIP call system 1, a call between two parties, the VoIP client 2 and the VoIP client 5, is taken as an example, but the number of VoIP clients is of course not limited to two, and the call system may have two or more participants.

  The Internet 4 is a network environment that is spread all over the world by connecting a plurality of communication lines such as general public lines and information communication networks. Currently, broadband transmission is enabled by the widespread use of broadband and high-speed communication lines. The network is composed of communication lines of 500kbps or higher using optical fiber, asymmetric digital subscriber line, radio, etc.

  The VoIP server 6 is in the VoIP call system 1 and manages the contractor's IP address, authenticates, or controls communication. It consists of a computer such as a workstation. Of course, a server for billing processing and a server for processing management information such as the contractor's IP address may be provided separately.

  The VoIP client 2 is, for example, a personal computer (PC) to which a headset 7 including a microphone and a speaker or a microphone 7a and a headphone 7b is connected. The PC becomes the VoIP client 2 by executing the VoIP client program 2a realized by software. In the following, it is assumed that the VoIP client 2 makes a call to the VoIP client 5, that is, the VoIP client 2 transmits first and the VoIP client 5 receives. Of course, the VoIP client 5 is also composed of a PC that executes the VoIP client program 5a, and performs the same operation when it first becomes the transmission side.

  During a VoIP call, the VoIP client 2 on the transmission side can mix into the call voice a background sound, for example background music (BGM), which is a continuous sound lasting several minutes, or a sound effect (SE) lasting several seconds. The VoIP client 2 individually adjusts the volume levels of the background sound and the sound effect as well as the call voice.

  Further, when the VoIP client 2 becomes the receiving side, the volume of the ringtone can be adjusted independently by the headset 7 and the speaker.

  Hereinafter, a configuration and operation in which the VoIP client 2 can individually adjust the volume levels of the background sound and the sound effect, and a configuration and operation in which the volume of the ringtone can be adjusted independently for the headset 7 and the speaker, will be described with reference to FIG. By executing the VoIP client program 2a, the VoIP client 2 functionally provides the transmission system and the reception system described below. First, in the transmission system 10, the microphone capture unit 11 captures the electric signal into which the microphone 7a has converted the user's voice. The gain adjustment unit 12 multiplies the electric signal captured by the microphone capture unit 11 by a gain coefficient k1 that is the microphone volume level set by the user. The multiplication output of the gain adjustment unit 12 is supplied to the adding unit 13.

  In addition, the VoIP client 2 converts sound effects lasting several seconds, such as machine-gun fire, thunder, applause, and laughter, into PCM data, for example, and then compresses them in advance using a compression technique such as MP3 (MPEG-1 Audio Layer-III), MPEG4, or ATRAC (Adaptive Transform Acoustic Coding). A plurality of these files are stored in the SE file storage unit 14 as SE data in file units. Examples of the SE file storage unit 14 include a hard disk drive (HDD), a ROM, and a magneto-optical disk, as will be described later.

  In addition, the VoIP client 2 converts background sounds lasting several minutes, such as the sound of waves, birdsong, or music of various genres, into PCM data, for example, and then compresses them in advance using a compression technique such as MP3, MPEG4, or ATRAC. A plurality of these files are stored in the BGM file storage unit 15 as BGM data in file units.

  When the SE file stored in the SE file storage unit 14 is selected as desired by the user, it is decoded by the decoding unit 17 while being read by the SE file reading unit 16 into a RAM (not shown), and becomes PCM data. The gain adjustment unit 18 multiplies the decoding output (PCM data) of the decoding unit 17 by a gain coefficient k2 that is an SE volume level set by the user. The multiplication output of the gain adjusting unit 18 is supplied to the adding unit 13.

  When a BGM file stored in the BGM file storage unit 15 is likewise selected by the user, it is read into a RAM (not shown) by the BGM file reading unit and decoded into PCM data by the decoding unit 20. The gain adjustment unit 21 multiplies the decoded output of the decoding unit 20 by a gain coefficient k3 that is the BGM volume level set by the user. The multiplication output of the gain adjustment unit 21 is supplied to the adding unit 13. The adding unit 13 adds the multiplication outputs of the three gain adjustment units 12, 18, and 21 while performing saturation processing, and supplies the addition output to the encoding unit 22.
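The gain-and-add stage of the transmission system can be illustrated with a minimal sketch. It assumes signed 16-bit PCM samples held in plain Python lists and zero-pads shorter streams; neither detail is stated in the text, and the function names are hypothetical.

```python
def scale(samples, gain):
    """Multiply each PCM sample by a gain coefficient (k1, k2 or k3 in the text)."""
    return [s * gain for s in samples]


def saturate(value, lo=-32768, hi=32767):
    """Clamp to the signed 16-bit range (assumed sample format) instead of wrapping."""
    return max(lo, min(hi, int(round(value))))


def pad(samples, length):
    """Zero-pad a stream so all three inputs have the same length (an assumption)."""
    return list(samples) + [0] * (length - len(samples))


def mix_transmit(mic, se, bgm, k1, k2, k3):
    """Rough analogue of adding unit 13: sum three gain-adjusted streams with saturation."""
    n = max(len(mic), len(se), len(bgm))
    streams = [scale(pad(mic, n), k1), scale(pad(se, n), k2), scale(pad(bgm, n), k3)]
    return [saturate(a + b + c) for a, b, c in zip(*streams)]
```

Clamping rather than wrapping keeps an overdriven mix from turning into loud noise, which is presumably why the text calls for saturation processing in the adder.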

  The encoding unit 22 compresses the addition output (PCM data) of the adding unit 13 to several tens of kbps, for example 64 kbps, using a compression technique such as MP3, MPEG4, or ATRAC. These are high-efficiency audio compression coding/decoding techniques applied to PCM audio data such as that used on CDs. The audio that is packetized, transmitted over the Internet, and played back on the receiving side can therefore be two-channel stereo with high sound quality.

  The compressed data is supplied to an RTP packetizing unit 23 that packetizes the data in accordance with a real-time transport protocol (RTP). The RTP packetizing unit 23 puts the compressed data into an RTP packet and further packetizes with UDP and IP. Details of packetization according to RTP will be described later. The packetized packet data is sent from the transmission processing unit 24 to the Internet.

  In the reception system 30, packet data transmitted from another VoIP client 5 via the Internet 4 is received by the reception processing unit 31. The packetized data received by the reception processing unit 31 is depacketized by the RTP depacketizing unit 32. The de-jitter unit 33 corrects the arrival time based on the RTP time stamp and sequence number that the RTP depacketizing unit 32 has extracted after removing the IP and UDP headers.

  A packet loss compensation unit 34 compensates for packet loss based on the RTP time stamp and sequence number, and sends the compensated data to the decoding unit 35. The decoding unit 35 decodes the compressed data, after arrival time correction and packet loss compensation, into PCM data and sends the PCM data to the gain adjustment unit 36. The gain adjustment unit 36 multiplies the PCM data by a gain coefficient k5 that is the playback volume level set by the user. The multiplication output of the gain adjustment unit 36 is sent to the adding unit 37. Further, in order to share the transmitted voice with the other party, the gain adjustment unit 38 multiplies the transmission voice data by a gain coefficient k4 that is the loopback volume level set by the user. The multiplication output of the gain adjustment unit 38 is also supplied to the adding unit 37.

  Further, the VoIP client 2 converts ring tones into PCM data, for example, and then compresses them in advance using a compression technique such as MP3, MPEG4, or ATRAC. A plurality of these files are stored in the ring tone file storage unit 39 as ring tone data in file units.

  A ring tone file in the ring tone file storage unit 39 is selected in advance by the user. When a call arrives, the ring tone file reading unit 40 reads it into a RAM (not shown), and the decoding unit 41 decodes it into PCM data. The decoded output of the decoding unit 41 is supplied to the gain adjustment unit 42 and the gain adjustment unit 43. The gain adjustment unit 42 multiplies the ring tone decoded output (PCM data) by a gain coefficient k6 that is the headphone ring tone volume level set by the user, and supplies the result to the adding unit 37. The adding unit 37 adds the call voice mixed with background sound and the like, which is the multiplication output of the gain adjustment unit 36, and the PCM data of the user's own voice, which is the multiplication output of the gain adjustment unit 38, and supplies the addition output to the headphone reproducing unit 44. The headphone reproducing unit 44 converts the addition output into an analog signal, amplifies it, and supplies it to the headphone 7b. The headphone 7b reproduces the mixed output at the user's ear.

  In addition, at the timing when a call is received from another VoIP client 5, the adding unit 37 supplies to the headphone reproducing unit 44 the decoded output (PCM data) of the ring tone file read by the ring tone file reading unit 40, multiplied by the gain coefficient k6, which is the headphone ring tone volume level set by the user. The headphone reproducing unit 44 converts the ring tone data multiplied by the gain coefficient k6 into an analog signal and supplies the analog signal to the headphone 7b. The headphone 7b therefore emits, at the user's ear, a ringtone at the headphone ring tone volume level set by the user when a call from another VoIP client 5 is received.

  The gain adjustment unit 43 multiplies the ring tone PCM data, which is the decoded output from the decoding unit 41, by a gain coefficient k7 which is a speaker ringing tone volume level set by the user, and supplies the result to the speaker reproduction unit 45. The speaker reproducing unit 45 converts the multiplication output into an analog signal, amplifies it, and supplies it to the speaker 46. The speaker 46 generates a ringtone having a speaker ringtone volume level set by the user for the speaker.

  Therefore, when the VoIP client 2 becomes the receiving side, the volume of the ringtone can be adjusted independently by the headset 7 and the speaker.
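The two independent ringtone paths can be sketched as follows. This is only an illustration under assumptions: samples are signed 16-bit PCM values in Python lists, shorter inputs are zero-padded, and the function names are hypothetical. The coefficients k4 to k7 are the ones named in the text.

```python
def clip16(value):
    """Clamp to the signed 16-bit range (assumed sample format)."""
    return max(-32768, min(32767, int(round(value))))


def pad(samples, length):
    """Zero-pad so that all inputs have the same length (an assumption)."""
    return list(samples) + [0] * (length - len(samples))


def receive_outputs(call, own, ring, k4, k5, k6, k7):
    """Return (headphone_samples, speaker_samples).

    Headphone path: k5 * call voice + k4 * own voice + k6 * ring tone
    (gain adjustment units 36, 38 and 42 feeding adding unit 37).
    Speaker path: k7 * ring tone only (gain adjustment unit 43).
    """
    n = max(len(call), len(own), len(ring))
    call, own, ring = pad(call, n), pad(own, n), pad(ring, n)
    headphone = [clip16(k5 * c + k4 * o + k6 * r) for c, o, r in zip(call, own, ring)]
    speaker = [clip16(k7 * r) for r in ring]
    return headphone, speaker
```

Because k6 and k7 are separate, a setting such as k6 = 0.25 and k7 = 2.0 keeps the ringtone quiet at the ear while making it loud enough to hear across the room.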

  Next, packetization and depacketization based on RTP will be described. RTP is a transport protocol for transmitting and receiving voice and moving images in real time over an IP network such as the Internet, and is specified in RFC 1889. RTP is located in the transport layer and is generally used together with the RTP Control Protocol (RTCP) over the User Datagram Protocol (UDP).

  As shown in FIG. 3, the RTP packet includes an IP header, a UDP header, an RTP header, and RTP data. The RTP header has fields storing version information (Version: V), padding (Padding: P), presence or absence of an extension header (Extension: X), the number of contributing sources (CSRC count: CC), marker information (Marker: M), payload type (Payload Type: PT), sequence number (Sequence Number), RTP time stamp, synchronization source (Synchronization Source: SSRC) identifier, and contributing source (Contributing Source: CSRC) identifiers.
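As a rough illustration of the RTP header fields listed above, the sketch below packs and parses the 12-byte fixed header followed by optional CSRC identifiers, using the bit layout of RFC 1889. The function names and the example payload type are hypothetical and not taken from the patent.

```python
import struct


def pack_rtp_header(seq, timestamp, ssrc, payload_type, marker=0, csrcs=()):
    """Build an RTP header: V=2, P=0, X=0, CC=len(csrcs), M, PT, seq, timestamp, SSRC, CSRCs."""
    version, padding, extension = 2, 0, 0
    byte0 = (version << 6) | (padding << 5) | (extension << 4) | (len(csrcs) & 0x0F)
    byte1 = ((marker & 0x01) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + b"".join(struct.pack("!I", c) for c in csrcs)


def unpack_rtp_header(data):
    """Parse the fields back out of a header produced by pack_rtp_header."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    cc = byte0 & 0x0F
    csrcs = [struct.unpack("!I", data[12 + 4 * i:16 + 4 * i])[0] for i in range(cc)]
    return {
        "version": byte0 >> 6,
        "padding": (byte0 >> 5) & 1,
        "extension": (byte0 >> 4) & 1,
        "marker": byte1 >> 7,
        "payload_type": byte1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "csrcs": csrcs,
    }
```

The compressed audio from the encoding unit would follow this header as the RTP payload; the UDP and IP headers are normally added by the operating system's network stack rather than by the application.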

  The RTP packetizing unit 23 in FIG. 2 packetizes the compressed data that is the output of the encoding unit 22 according to the RTP described above. The compressed data itself is included in the RTP payload portion shown in FIG. The RTP packet is sent from the transmission processing unit 24 to another VoIP client (for example, the VoIP client 5 in FIG. 1) via the Internet 4.

  In the reception system 30 of another VoIP client 5, the reception processing unit 31 receives the RTP packet. Here, the operation of the other VoIP client 5 will be described with reference to FIG. The RTP depacketizing unit 32 separates the RTP header and RTP data from the IP header and the UDP header. The sequence number and time stamp stored in the RTP header are sent to the de-jitter unit 33.

  The de-jitter unit 33 corrects non-uniform arrival times based on the sequence number and the time stamp. Because RTP packets travel over the Internet together with other traffic, they are affected by that traffic and do not arrive at equal intervals; the packet spacing may be compressed or stretched on the time axis. The de-jitter unit 33 therefore uses the sequence number and the time stamp to restore equal intervals.
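One common way to restore even spacing is a fixed-delay playout buffer, sketched below. The patent does not say how the de-jitter unit 33 is implemented, so this is only an assumed approach; the 8000 Hz RTP clock, the 60 ms buffering delay, and the function name are illustrative, and sequence-number wraparound is ignored.

```python
def playout_schedule(packets, clock_rate=8000, delay=0.06):
    """Compute evenly spaced playout times from RTP time stamps.

    packets: list of (arrival_time_s, sequence_number, rtp_timestamp) in arrival order.
    Returns a list of (sequence_number, playout_time_s) sorted by sequence number.
    Packets that arrive after their own playout time are dropped as too late.
    """
    if not packets:
        return []
    ref_arrival, _, ref_ts = packets[0]  # first packet received anchors the timeline
    schedule = []
    for arrival, seq, ts in packets:
        # Playout time depends only on the sender's time stamp, not on arrival jitter.
        playout = ref_arrival + delay + (ts - ref_ts) / clock_rate
        if arrival <= playout:
            schedule.append((seq, playout))
    return sorted(schedule)
```

A larger delay tolerates more jitter and reordering at the cost of added latency in the conversation.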

  The packet loss compensation unit 34 compensates for packet loss based on the sequence number and the time stamp. Because RTP packets are transmitted and received via the Internet, a packet may be lost in transit and never received. The packet loss compensation unit 34 therefore compensates for packet loss by substituting the preceding or succeeding packet for the missing packet, or by setting the missing data to 0.
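The two compensation strategies mentioned above (repeat a neighbouring packet, or fill with zeros) can be sketched as below. For simplicity the sketch treats each packet as a fixed-length list of sample values and repeats only the preceding frame; the function name, the frame representation, and the `mode` parameter are assumptions for illustration.

```python
def conceal_losses(received, first_seq, last_seq, frame_len, mode="repeat"):
    """Fill gaps in a sequence-numbered stream of frames.

    received: dict mapping sequence number -> list of frame_len values.
    mode "repeat" substitutes the preceding frame; mode "zero" substitutes zeros.
    Returns the frames for first_seq..last_seq in order, with gaps filled.
    """
    output = []
    previous = [0] * frame_len  # used if the very first frame is missing
    for seq in range(first_seq, last_seq + 1):
        frame = received.get(seq)
        if frame is None:
            frame = list(previous) if mode == "repeat" else [0] * frame_len
        output.append(frame)
        previous = frame
    return output
```

Repeating the previous frame usually sounds less abrupt than silence for a single lost packet, while zero fill avoids a stuck, buzzing sound when several packets in a row are lost.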

  Then, the decoding unit 35 decodes the mixing data such as the call sound and the background sound in which the arrival time is corrected and the packet loss is compensated, and converts it into PCM data.

  A feature of the VoIP client 2 having this functional configuration, obtained by applying the present invention, is that the volume level of the background sound, as well as that of the call voice, can be adjusted individually.

  The volume level of the call voice is adjusted in the gain adjustment unit 12 by multiplying the voice data by the gain coefficient k1, which is the microphone volume level set by the user. The volume level of the sound effect or BGM is adjusted in the gain adjustment unit 18 or the gain adjustment unit 21 by multiplying the respective audio data by the gain coefficient k2, which is the SE volume level set by the user, or the gain coefficient k3, which is the BGM volume level.

  The call voice data and the sound effect or BGM audio data, with volume levels adjusted by the gain adjustment units 12, 18, and 21, are combined by the adding unit 13, encoded by the encoding unit 22, packetized by the RTP packetizing unit 23, and transmitted from the transmission processing unit 24 to the VoIP client 5 of the other party.

  The other party's VoIP client 5 receives the RTP packet transmitted via the Internet 4 at the reception processing unit 31, depacketizes it in the RTP depacketizing unit 32, corrects the arrival time intervals in the de-jitter unit 33, compensates for packet loss in the packet loss compensation unit 34, and then decodes it into PCM data in the decoding unit 35. The decoded audio data (PCM data) is multiplied in the gain adjustment unit 36 by the gain coefficient k5 set by the receiving-side user, and the call voice from the sender, mixed with BGM or SE, can be heard through the headphone reproducing unit 44 and the headphone 7b.

  The VoIP client 2 realizes the functions shown in FIG. 2 by executing software modules corresponding to the protocols of each layer based on the open systems interconnection (OSI) architecture shown in FIG.

  In FIG. 4, each layer will be described from the lower layer to the upper layer. First, functions as a physical layer include a universal serial bus (USB) camera driver, a USB audio driver, and various drivers. This layer matches the physical conditions of the transmission conditions of video data from the camera driver and audio data from the audio driver. Next, the function as the data link layer includes an operating system (OS). This is for executing error-free data transfer between adjacent nodes.

  As a function of the network layer, there is the Internet Protocol (IP). The network layer selects a communication path used for data transmission / reception and performs communication control such as flow control and quality control. IP, which is a connectionless packet transfer protocol that does not pursue reliability, leaves the reliability assurance function, flow control function, and error recovery function to the upper layers (transport layer and application layer).

  The functions of the transport layer include the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP). In the transport layer, end-to-end transmission is performed using an IP address. Regardless of the type of network, flow control and sequence control are performed according to the required quality class. TCP has a reliability guarantee function: it attaches a sequence number to each byte of transferred data and retransmits data if an acknowledgment (ACK) is not returned from the receiving side. UDP provides a function for transmitting datagrams between applications. When streaming audio or video over an IP network, a transport protocol such as TCP that retransmits when an error occurs generally cannot be used. TCP is also a protocol for one-to-one communication, and information cannot be transmitted to a plurality of partners. Therefore, UDP is used for such applications.

  UDP is designed to allow application processes to transfer data to other application processes on a remote machine with minimal overhead. The information carried in the UDP header is therefore only the source port number, destination port number, data length, and checksum; unlike TCP, there is no field for a number indicating the order of packets. When the order of packets changes during transmission through the network, UDP alone cannot restore the correct order. In addition, neither TCP nor UDP has a field for time information such as a time stamp at the time of transmission.
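To make the point about the minimal UDP header concrete, the sketch below packs its four 16-bit fields. The function name and port numbers are hypothetical; in practice the operating system builds this header when an application sends a datagram.

```python
import struct


def pack_udp_header(src_port, dst_port, payload_len, checksum=0):
    """Pack the 8-byte UDP header: source port, destination port, length, checksum.

    The length field covers the header (8 bytes) plus the payload.
    A checksum of 0 means "not computed", which is permitted over IPv4.
    """
    length = 8 + payload_len
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)
```

There is no sequence number or time stamp among these four fields, which is why RTP adds its own header on top of UDP.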

  The functions of the session layer include the Session Initiation Protocol (SIP) and the modules required by the software for synthesizing the call voice with BGM or SE, which is a main part of the present invention: on-hold tone generation, BGM synthesis, ring tone generation, codec, and RTP. The session layer controls information transfer; it manages conversation modes between applications and controls conversation units. SIP is an application layer signaling protocol for establishing, changing, and terminating multimedia sessions on an IP network, and is standardized in RFC 3261.

  As a function as a presentation layer, there is VoIP call control. The presentation layer manages the expression format of information transmitted and received by the application, and performs data conversion and encryption.

  As a function as an application layer, there is a graphical user interface (GUI). In the application layer, the external specification of the communication function used in the user program is managed, and information is exchanged based thereon.

  Next, a hardware configuration of the VoIP client 2 that actually executes the software modules will be described. FIG. 5 shows the configuration of the VoIP client 2. In FIG. 5, a CPU (Central Processing Unit) 51 executes various processes in accordance with the programs constituting the software modules, which are stored in a ROM (Read Only Memory) 52 or loaded from a storage unit 58 into a RAM (Random Access Memory) 53. The RAM 53 also stores, as appropriate, data necessary for the CPU 51 to execute the various processes.

  The CPU 51, ROM 52, and RAM 53 are connected to each other via a bus 54. An input/output interface 55 is also connected to the bus 54. Connected to the input/output interface 55 are an input unit 56 including a keyboard and a mouse, an output unit 57 including a display such as a CRT or LCD, headphones, and speakers, a storage unit 58 including a hard disk, and a communication unit 59 composed of a modem, a terminal adapter, or the like. The microphone 7a of the headset 7 is included in the input unit 56. The headphone 7b is included in the output unit 57.

  The communication unit 59 performs communication processing via the Internet 4: it transmits data provided by the CPU 51 and outputs data received from the communication partner to the CPU 51, the RAM 53, and the storage unit 58. The storage unit 58 exchanges information with the CPU 51 to save and erase information. The communication unit 59 also performs analog or digital signal communication processing with other clients.

  A drive 60 is connected to the input/output interface 55 as necessary. A magnetic disk 61, an optical disk 62, a magneto-optical disk 63, a semiconductor memory 64, or the like is mounted in the drive as appropriate, and a computer program read from that medium is installed in the storage unit 58 as necessary.

  The storage unit 58 is, for example, an HDD, and constitutes the SE file storage unit 14, the BGM file storage unit 15, and the ring tone file storage unit 39 shown in FIG.

  The above hardware configuration applies to the VoIP clients 2 and 5, and also to the VoIP server 6 and a Web server described later.

  Next, the GUI (Graphical User Interface) displayed on the display constituting the output unit 57 will be described with reference to FIG. This GUI belongs to the application layer of the VoIP client. It is the interface through which the user operates the PC visually, and it handles the user's manual input. From top to bottom, the GUI includes an application control unit 71, an information display unit 72, a dial unit 73, a headset volume unit 74, a speaker volume unit 75, a sound effect (SE) selection display unit 76, an SE control unit 77, a BGM selection display unit 78, and a BGM control unit 79.

  The application control unit 71 performs termination processing for the VoIP client application. The information display unit 72 displays the dialed number and information about the other party (busy, etc.). The dial unit 73 is a numeric keypad for dialing a VoIP partner. The headset volume unit 74 adjusts the volume output from the headphones 7b of the headset 7: when the user moves the slider 74a left or right with the mouse, the gain coefficient k5 in the gain adjusting unit 36 is set. The headset volume unit 74 may also be used to adjust the volume of the ringtone output from the headphones 7b; in that case, moving the slider 74a left or right with the mouse sets the gain coefficient k6 in the gain adjustment unit 42.

  The speaker volume unit 75 is for adjusting the volume of the ringtone output from the speaker 46. When the user moves the slider 75a left and right using the mouse, the gain coefficient k7 in the gain adjusting unit 43 is set.
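  The relationship between a slider position and a gain coefficient can be illustrated with a minimal Python sketch. The text does not say how the slider position is converted, so the linear mapping, the 0 to 100 slider range, the maximum gain of 1.0, and the function name `slider_to_gain` are all assumptions made for illustration only.

```python
def slider_to_gain(position: int, slider_max: int = 100, max_gain: float = 1.0) -> float:
    """Map a slider position (0..slider_max) linearly to a gain coefficient.

    The linear law and the default ranges are assumptions; the source only
    states that moving a slider sets the corresponding gain coefficient.
    """
    if slider_max <= 0:
        raise ValueError("slider_max must be positive")
    clamped = max(0, min(position, slider_max))  # keep the position in range
    return max_gain * clamped / slider_max


# Hypothetical positions of slider 74a (headset) and slider 75a (speaker).
k6 = slider_to_gain(25)  # ring tone gain for the headphones 7b
k7 = slider_to_gain(80)  # ring tone gain for the speaker 46
```

  Because each slider produces its own coefficient, changing the speaker slider leaves the headphone gain untouched, and vice versa.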

  The SE selection display unit 76 displays the available SE sound source data files (the SE files stored in the SE file storage unit 14) for the user to choose from; for example, sound effects such as a gunshot, thunder, applause, and cheers are displayed for selection. The SE control unit 77 lets the user play and stop sound effects and adjust their volume with the play button 77b, the stop button 77c, and the slider 77a, operated through an input device such as a mouse.

  For example, suppose the user selects a desired SE on the SE selection display unit 76 with the mouse, moves the slider 77a to an appropriate position, and clicks the play button 77b. The decoding unit 17 then decodes the desired SE file read by the SE file reading unit 16, and the gain adjustment unit 18 multiplies the PCM data of the SE file by the gain coefficient k2, the SE volume level corresponding to the position of the slider 77a, and outputs the result to the adder 13. The user can thereby express feelings to the other party with various sound effects.

  The BGM selection display unit 78 displays the available BGM sound source data files for the user to choose from. The BGM control unit 79 lets the user play and stop the BGM and adjust its volume with the play button 79b, the stop button 79c, and the slider 79a, operated through an input device such as a mouse. For example, suppose the user selects a desired BGM on the BGM selection display unit 78 with the mouse, moves the slider 79a to an appropriate position, and clicks the play button 79b. The decoding unit 20 then decodes the desired BGM file read by the BGM file reading unit 19, and the gain adjustment unit 21 multiplies the PCM data of the BGM file by the gain coefficient k3, the BGM volume level corresponding to the position of the slider 79a, and outputs the result to the adder 13. As with SE, the user's mood and the atmosphere of the place can thereby be conveyed to the communication partner at a volume that the user has selected and adjusted.
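  The gain-and-add path described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the sample values are made up, the name k1 for the microphone gain coefficient of the gain adjustment unit 12 is assumed, and clipping the sum to the signed 16-bit range is a common precaution that the text does not mention.

```python
from typing import List, Sequence

INT16_MIN, INT16_MAX = -32768, 32767


def apply_gain(pcm: Sequence[int], k: float) -> List[float]:
    """Multiply every PCM sample by the gain coefficient k."""
    return [sample * k for sample in pcm]


def add_streams(*streams: Sequence[float]) -> List[int]:
    """Sum the streams sample by sample, as the adder 13 would.

    Clipping to the 16-bit range is an assumption added for safety.
    """
    if not streams:
        return []
    length = min(len(s) for s in streams)  # stop at the shortest stream
    mixed = []
    for i in range(length):
        total = round(sum(s[i] for s in streams))
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed


# Made-up 16-bit PCM samples for the microphone voice, an SE file, and a BGM file.
voice = [1000, -2000, 30000]
se = [500, 500, 20000]
bgm = [200, -200, 10000]

k1, k2, k3 = 1.0, 0.5, 0.25  # k1 is an assumed name; k2 and k3 follow the text
mixed = add_streams(apply_gain(voice, k1), apply_gain(se, k2), apply_gain(bgm, k3))
# mixed == [1300, -1800, 32767]; the last sample is clipped
```

  Scaling each source before the sum is what lets the SE and BGM sliders change one source's loudness without affecting the others.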

  Accordingly, by executing the programs constituting the software modules, the VoIP client 2 can adjust the volume of the ringtone from the headset 7 and the volume of the ringtone from the speaker 46 independently of each other.
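  A minimal Python sketch of this independent adjustment is given below. It assumes 16-bit PCM ring tone samples and uses hypothetical function names; the only behaviour taken from the text is that the same decoded ring tone data is multiplied by k6 for the headphones and by k7 for the speaker.

```python
from typing import List, Sequence, Tuple


def apply_gain_int16(pcm: Sequence[int], k: float) -> List[int]:
    """Scale PCM samples by k and clip to the signed 16-bit range (clipping is an assumption)."""
    return [max(-32768, min(32767, round(sample * k))) for sample in pcm]


def route_ring_tone(ring_pcm: Sequence[int], k6: float, k7: float) -> Tuple[List[int], List[int]]:
    """Return (headphone samples, speaker samples) from one decoded ring tone.

    k6 plays the role of the gain in the gain adjustment unit 42 and
    k7 the gain in the gain adjusting unit 43.
    """
    return apply_gain_int16(ring_pcm, k6), apply_gain_int16(ring_pcm, k7)


ring = [1000, -1000, 2000]  # made-up ring tone samples

# Headset silenced, speaker at full level: the call is still audible in the room.
headphone_out, speaker_out = route_ring_tone(ring, k6=0.0, k7=1.0)
```

  With these values `headphone_out` is all zeros while `speaker_out` equals the original samples, which is the situation where the user has taken the headset off but can still hear an incoming call.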

The drawings are: a block diagram of a VoIP call system; a functional block diagram of a VoIP client; a format diagram of an RTP packet; a diagram showing the software modules executed by a VoIP client; a hardware block diagram of a PC used as a VoIP client; and a diagram showing the GUI displayed on the display unit of a VoIP client.

Explanation of symbols

  1 VoIP system, 2, 5 VoIP client, 4 Internet, 6 VoIP server, 7 headset, 12 gain adjustment unit, 13 synthesis unit, 14 SE file, 15 BGM file, 17 decoding unit, 18 gain adjustment unit, 21 gain adjustment unit, 22 encoding, 36 gain adjusting unit, 42 gain adjusting unit, 43 gain adjusting unit

Claims (5)

  1. A call device that performs two-way communication for voice conversation over a network, comprising,
    as a receiving system:
    ringtone data storage means for storing ringtone data in units of files;
    first gain adjusting means for adjusting the gain by multiplying the data from the ringtone data storage means by a variable gain coefficient;
    headphone reproducing means for driving headphones based on the output of the first gain adjusting means;
    second gain adjusting means for adjusting the gain by multiplying the data from the ringtone data storage means by a variable gain coefficient; and
    speaker reproducing means for driving a speaker based on the output of the second gain adjusting means.
  2. The call device according to claim 1, further comprising,
    as a transmission system:
    third gain adjusting means for adjusting the gain by multiplying, by a variable gain coefficient, the sound signal from sound converting means that converts collected sound into an electric signal;
    sound data storage means for storing sound data in units of files;
    decoding means for decoding the sound data in units of files read from the sound data storage means;
    fourth gain adjusting means for adjusting the gain by multiplying the decoded output from the decoding means by a variable gain coefficient;
    combining means for combining the third output from the third gain adjusting means and the fourth output from the fourth gain adjusting means;
    encoding means for encoding the combined output of the combining means; and
    transmission means for transmitting the encoded output from the encoding means to the network; and,
    as a receiving system:
    receiving means for receiving the encoded output transmitted from the transmission means of another call device via the network;
    decoding means for decoding the encoded data received by the receiving means;
    fifth gain adjusting means for adjusting the gain by multiplying the decoded output from the decoding means by a variable gain coefficient; and
    sound output means for converting the output from the fifth gain adjusting means into sound and outputting the sound.
  3. A call method that performs two-way communication for voice conversation over a network, comprising,
    as a receiving system:
    a first gain adjustment step of adjusting the gain by multiplying, by a variable gain coefficient, the data from ringtone data storage means that stores ringtone data in units of files;
    a headphone reproduction step of driving headphones based on the output of the first gain adjustment step;
    a second gain adjustment step of adjusting the gain by multiplying the data from the ringtone data storage means by a variable gain coefficient; and
    a speaker reproduction step of driving a speaker based on the output of the second gain adjustment step.
  4. The call method according to claim 3, further comprising,
    as a transmission system:
    a third gain adjustment step of adjusting the gain by multiplying, by a variable gain coefficient, the audio signal from audio conversion means that converts collected audio into an electric signal;
    a sound data storage step of storing sound data in units of files;
    a decoding step of decoding the sound data in units of files read from the sound data storage means;
    a fourth gain adjustment step of adjusting the gain by multiplying the decoded output from the decoding step by a variable gain coefficient;
    a synthesis step of synthesizing the third output from the third gain adjustment step and the fourth output from the fourth gain adjustment step;
    an encoding step of encoding the combined output of the synthesis step; and
    a transmission step of transmitting the encoded output from the encoding step to the network; and,
    as a receiving system:
    a receiving step of receiving an encoded output transmitted in a transmission step of another call device via the network;
    a decoding step of decoding the encoded data received in the receiving step;
    a fifth gain adjustment step of adjusting the gain by multiplying the decoded output from the decoding step by a variable gain coefficient; and
    a sound output step of converting the output from the fifth gain adjustment step into sound and outputting the sound.
  5. A call system that performs two-way communication for voice conversation using a plurality of call devices connected to a network, wherein
    the plurality of call devices each comprise, as a receiving system:
    ringtone data storage means for storing ringtone data in units of files;
    first gain adjusting means for adjusting the gain by multiplying the data from the ringtone data storage means by a variable gain coefficient;
    headphone reproducing means for driving headphones based on the output of the first gain adjusting means;
    second gain adjusting means for adjusting the gain by multiplying the data from the ringtone data storage means by a variable gain coefficient; and
    speaker reproducing means for driving a speaker based on the output of the second gain adjusting means.
JP2003280435A 2003-07-25 2003-07-25 Call device, call method, and call system Expired - Fee Related JP4207701B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2003280435A JP4207701B2 (en) 2003-07-25 2003-07-25 Call device, call method, and call system

Publications (2)

Publication Number Publication Date
JP2005045742A true JP2005045742A (en) 2005-02-17
JP4207701B2 JP4207701B2 (en) 2009-01-14

Family

ID=34266266

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2003280435A Expired - Fee Related JP4207701B2 (en) 2003-07-25 2003-07-25 Call device, call method, and call system

Country Status (1)

Country Link
JP (1) JP4207701B2 (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2429870A (en) * 2005-08-29 2007-03-07 Nec Infrontia Corp Attenuating incoming ringer tone alert
GB2429870B (en) * 2005-08-29 2007-10-24 Nec Infrontia Corp Voice call system adapted to support computer terminal
US8594316B2 (en) 2005-08-29 2013-11-26 Nec Infrontia Corporation Voice call system adapted to support a computer terminal and that adjusts a ringer tone
GB2461058A (en) * 2008-06-18 2009-12-23 Skype Ltd Controlling an audio output device in a terminal executing a communication client
GB2461058B (en) * 2008-06-18 2012-10-17 Skype Audio device control method and apparatus
US8862993B2 (en) 2008-06-18 2014-10-14 Skype Audio device control method and apparatus
KR101099239B1 (en) 2009-08-07 2011-12-27 대덕대학산학협력단 Terminal apparatus for softphone with easy telephone bell sound recognition
JP5733445B1 (en) * 2014-03-25 2015-06-10 Nttエレクトロニクス株式会社 Automatic packet receiver
WO2015145869A1 (en) * 2014-03-25 2015-10-01 Nttエレクトロニクス株式会社 Packet automatic reception device

Also Published As

Publication number Publication date
JP4207701B2 (en) 2009-01-14

Legal Events

Date | Code | Title | Description
2006-07-25 | A621 | Written request for application examination | JAPANESE INTERMEDIATE CODE: A621
2007-12-25 | A977 | Report on retrieval | JAPANESE INTERMEDIATE CODE: A971007
2008-01-08 | A131 | Notification of reasons for refusal | JAPANESE INTERMEDIATE CODE: A131
2008-03-10 | A521 | Written amendment | JAPANESE INTERMEDIATE CODE: A523
(none) | TRDD | Decision of grant or rejection written |
2008-09-30 | A01 | Written decision to grant a patent or to grant a registration (utility model) | JAPANESE INTERMEDIATE CODE: A01
(none) | A01 | Written decision to grant a patent or to grant a registration (utility model) | JAPANESE INTERMEDIATE CODE: A01
2008-10-13 | A61 | First payment of annual fees (during grant procedure) | JAPANESE INTERMEDIATE CODE: A61
(none) | FPAY | Renewal fee payment (prs date is renewal date of database) | Year of fee payment: 3; PAYMENT UNTIL: 20111031
(none) | FPAY | Renewal fee payment (prs date is renewal date of database) | Year of fee payment: 4; PAYMENT UNTIL: 20121031
(none) | LAPS | Cancellation because of no payment of annual fees |