WO2006067981A1 - Electronic mail transmitting terminal and electronic mail system - Google Patents
Electronic mail transmitting terminal and electronic mail system
- Publication number
- WO2006067981A1 (international application PCT/JP2005/022677)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- processing
- voice
- terminal
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
- G06Q10/107—Computer-aided management of electronic mailing [e-mailing]
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72403—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
- H04M1/7243—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
- H04M1/72433—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for voice messaging, e.g. dictaphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72403—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
- H04M1/72442—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality for playing music files
Definitions
- The present invention relates to a technique for creating and transmitting e-mail content.
- Electronic mail on a mobile phone terminal has the advantage that messages can be exchanged easily regardless of time and place.
- Pictograms are used to enhance expressiveness.
- As mobile phone terminals become more sophisticated, it has also become possible to exchange messages that include still images and moving images.
- An example of a mail service using audio media is "Singing Mail" in Non-Patent Document 1 below. It uses an application program to convert input kana characters into a singing voice, and the voice quality, melody, and accompaniment arrangement can be changed freely.
- Non-Patent Document 1: http://www.g-search.or.jp/release/2004/20040726.html
- However, because Non-Patent Document 1 converts input characters into artificial speech using an application program, it is difficult to express the accents and emotions peculiar to human beings.
- An object of the present invention is to provide an expressive e-mail system using audio media.
- The invention according to claim 1 is directed to voice input means, means for recording the voice input by the voice input means and storing the voice data as recorded data, and the recorded data.
- The invention according to claim 2 is the electronic mail transmitting terminal according to claim 1, wherein the music data includes sound effect data, and the synthesis processing means includes means for synthesizing the sound effect data at a specified point on the reproduction time axis of the synthesized voice data.
- The invention according to claim 3 is the electronic mail transmitting terminal according to claim 1 or claim 2, wherein the music data is stored in a recording medium detachably attached to the electronic mail transmitting terminal or in a storage device of a server on a network to which the electronic mail transmitting terminal can be connected.
- The invention according to claim 4 is the electronic mail transmitting terminal according to claim 3, wherein the music data stored in the recording medium or the storage device is encrypted, and the synthesis processing means includes means for decrypting the encrypted music data.
- The invention according to claim 5 is the electronic mail transmitting terminal according to any one of claims 1 to 4, wherein one or more timing markers are set in advance on a reproduction time axis of the music data.
- The invention according to claim 6 is the electronic mail transmitting terminal according to claim 5, wherein the marker-added music data includes music data in which a timing marker is set at the head of each measure.
- The invention according to claim 7 is the electronic mail transmitting terminal according to any one of claims 1 to 6, wherein the synthesis processing means responds to a designation operation by a user.
- The invention according to claim 8 is the electronic mail transmitting terminal according to any one of claims 1 to 6, further comprising means for compressing and converting the synthesized voice data.
- The invention according to claim 9 is the electronic mail transmitting terminal according to claim 7, further comprising means for compressing and converting the synthesized audio data with video data.
- The invention according to claim 10 is the electronic mail transmitting terminal according to any one of claims 1 to 6, wherein the electronic mail transmitting terminal is a mobile phone terminal, and the synthesized voice data is converted to the standard music data format specified for mobile phone terminals.
- The invention according to claim 11 is the electronic mail transmitting terminal according to any one of claims 1 to 10, further comprising means for generating scenario data in which a rule of the synthesis process is recorded, wherein the transmission means includes means for transmitting the recording data and the scenario data as an electronic mail.
- The invention according to claim 12 is the electronic mail transmitting terminal according to any one of claims 1 to 11, further comprising processing means for processing the voice input by the voice input means, wherein the processing means includes means for modulating the voice and/or means for giving special effects to the voice, and the synthesis processing means synthesizes the music data with the recording data after it has been processed by the processing means.
- The invention according to claim 13 is the electronic mail transmitting terminal according to claim 12, wherein the processing means includes means for performing tempo change processing and/or pitch shift processing on the voice input by the voice input means.
- The invention according to claim 14 is the electronic mail transmitting terminal according to claim 12, wherein the processing means includes means for executing one or more of equalizer processing, harmonization processing, and echo processing on the voice input by the voice input means.
- The invention according to claim 15 is the electronic mail transmitting terminal according to any one of claims 12 to 14, wherein a plurality of pieces of setting information corresponding to a plurality of themes are prepared in advance, each piece of setting information defines the contents of the processing executed by the processing means, and selecting one piece of setting information determines the contents of the processing by the processing means.
- The invention according to claim 16 is the electronic mail transmitting terminal according to any one of claims 1 to 15, further comprising means for modulating the music defined by the music data and/or means for giving a special effect to the music defined by the music data.
- The invention of claim 17 is a system for transferring electronic mail, comprising a terminal and a synthesis server, wherein the terminal is a voice input means and a sound input by the voice input means.
- The invention of claim 18 is a system for transferring electronic mail, comprising a terminal and a synthesis server, wherein the terminal includes voice input means and means for transmitting the voice input by the voice input means to the synthesis server, and the synthesis server records the received voice as recorded data and synthesizes the recorded data and music data to generate synthesized voice data.
- The invention according to claim 19 is the electronic mail system according to claim 17 or 18, wherein the music data includes sound effect data, and the synthesis processing means includes means for synthesizing the sound effect data at a specified point on the playback time axis of the synthesized voice data.
- The invention according to claim 20 is the electronic mail system according to any one of claims 17 to 19, wherein the music data includes marker-added music data in which one or a plurality of timing markers are set in advance on a reproduction time axis, and the synthesis processing means includes means for decomposing the recorded data into a plurality of recorded data elements by treating portions below a predetermined volume as silent portions, and means for synthesizing each of the plurality of recorded data elements with the marker-added music data in synchronism with the times set by the timing markers.
- The invention according to claim 21 is the electronic mail system according to claim 20, wherein the marker-added music data includes music data in which a timing marker is set at the head of each measure.
- The invention according to claim 22 is the electronic mail system according to any one of claims 17 to 21, and further on the time axis of the synthesized speech data in response to a designation operation by a user.
- The invention according to claim 23 is the electronic mail system according to any one of claims 17 to 21, further comprising means for compressing and converting the synthesized speech data.
- The invention according to claim 24 is the electronic mail system according to claim 22, further comprising means for compressing and converting the synthesized audio data with video data.
- The invention according to claim 25 is the electronic mail system according to any one of claims 17 to 21, wherein the terminal is a mobile phone terminal, and the synthesized voice data is converted to a standard music data format specified for mobile phone terminals.
- The invention according to claim 26 is the electronic mail system according to any one of claims 17 to 25, wherein the synthesis server further comprises processing means for processing a voice received from the terminal, the processing means includes means for modulating the voice and/or means for giving a special effect to the voice, and the synthesis processing means synthesizes the music data with the recording data after it has been processed by the processing means.
- The invention according to claim 27 is the electronic mail system according to claim 26, wherein the processing means includes means for performing tempo change processing and/or pitch shift processing on the voice input by the voice input means.
- The invention according to claim 28 is the electronic mail system according to claim 26, wherein the processing means includes means for executing one or a plurality of processes among equalizer processing, harmonization processing, and echo processing on the voice input by the voice input means.
- The invention according to claim 29 is the electronic mail system according to any one of claims 26 to 28, wherein a plurality of pieces of setting information corresponding to a plurality of themes are prepared in advance, each piece of setting information defines the content of the processing executed by the processing means, and the content of the processing by the processing means is determined by selecting one piece of setting information.
- The invention of claim 30 is the electronic mail system according to any of claims 17 to 29, further comprising means for modulating the music defined by the music data and/or means for giving a special effect to the music defined by the music data.
- On a terminal such as a mobile phone, the user does not merely play back a music file but overlays his or her own recorded voice on the music. This makes it possible to create a synthesized voice mail full of originality.
- The music data stored in the recording medium is encrypted and must be read by a predetermined program in order to be decrypted, which makes it possible to prevent unauthorized use of the content.
- Because timing markers are added to the music data, the recording data can be synthesized in time with the BGM rhythm without a sense of incongruity.
- By performing noise gate processing during or after recording, outdoor environmental noise is eliminated and content that is easy to listen to as music can be created.
- By converting the synthesized voice data into a general-purpose data format standardized for mobile phones, the invention can be provided as a versatile e-mail system that works across mobile phone carriers regardless of the playback environment.
- FIG. 1 is a block diagram of a mobile phone terminal according to the embodiment.
- FIG. 2 is a diagram showing a layer structure of voice electronic mail.
- FIG. 3 is a diagram showing a layer structure of multimedia mail.
- FIG. 4 is a diagram showing a general-purpose music file format.
- FIG. 5 is a flowchart showing main processing of composition processing.
- FIG. 6 is a flowchart of noise gate processing.
- FIG. 7 is a flowchart of BGM data acquisition processing.
- FIG. 8 is a flowchart showing multimedia mail creation processing.
- FIG. 9 is a flowchart showing voice mail creation processing.
- FIG. 10 is a diagram showing a flow of processing in the first embodiment.
- FIG. 11 is a diagram showing a flow of processing in the second embodiment.
- FIG. 12 is a diagram showing a flow of processing in the third embodiment.
- FIG. 13 is a diagram showing a flow of processing in the fourth embodiment.
- FIG. 14 is a flowchart of audio processing.
- FIG. 1 is a block diagram showing the configuration of the mobile phone terminal 100 according to the first embodiment of the present invention.
- The mobile phone terminal 100 includes a microphone device 101 for inputting voice and a microphone interface (Mic I/F) 102.
- It also includes a ROM card 103, which is a card medium, and a card control interface (Card Cont I/F) 104 for accessing the card medium.
- It further includes a memory 106 used as a storage area for various application programs, and a memory management unit (MMU).
- In addition, it is provided with an operation unit 107, an audio processing unit 108 that performs audio signal encoding and decoding, a video processing unit 109 that performs video signal encoding and decoding, a communication unit 110 that executes communication processing with the base station via the antenna 111 when the mobile phone terminal 100 performs voice calls or data communication, and a CPU 112 that controls the mobile phone terminal 100.
- The mobile phone terminal 100 also includes a monitor 113 and a speaker 114.
- As the ROM card 103, for example, a CompactFlash (registered trademark) card, SmartMedia (registered trademark), or an SD memory card (registered trademark) can be used.
- As the memory 106, SDRAM or the like can be used.
- The audio processing unit 108 has a function of encoding and decoding audio signals based on standards such as MP3 and AAC, and the video processing unit 109 has a function of encoding and decoding video signals based on standards such as MPEG4.
- The ROM card 103 stores BGM data BD and sound effect data ED.
- BGM data BD is background music data that is synthesized with the voice input from the microphone in the voice electronic mail (voice mail) or multimedia mail transmitted in the present embodiment.
- The sound effect data ED is data for adding a relatively short sound effect to the synthesized voice data obtained by synthesizing the voice input from the microphone with the BGM data BD. Examples include sounds such as clapping, shouts, and cymbals.
- BGM data BD and sound effect data ED are stored in the ROM card 103 in a file format compressed in accordance with, for example, the MP3 or AAC standard. Alternatively, they are stored in the ROM card 103 in a file format that is standard on mobile phone terminals, such as a ringtone melody format.
- The BGM data BD and sound effect data ED stored in the ROM card 103 are encrypted, and are decrypted by the authoring program AP stored in the memory 106.
- The authoring program AP holds the decryption key information necessary to decrypt the BGM data BD and the sound effect data ED. Decrypting these data makes it possible to play back the BGM and sound effects.
- The authoring program AP stored in the memory 106 can execute various functions for creating a voice electronic mail or multimedia mail in the present embodiment, including the voice synthesis process.
- Specifically, the authoring program AP can execute: a function for recording voice input from the microphone device 101; a function for reading and decrypting the BGM data BD and sound effect data ED stored in the ROM card 103; a noise gate function applied during or after recording of the microphone input; a function for synthesizing the recorded data with the BGM data BD; a function for synthesizing the synthesized voice data with the sound effect data ED; a function for synthesizing the synthesized voice data with video data; a function for saving the synthesized voice data and the synthesized voice data with video data; a function for playing back the synthesized voice data and the synthesized voice data with video data; and a function for transmitting the synthesized voice data and the synthesized voice data with video data to other terminals as e-mail.
- These functions are realized by the authoring program AP using hardware resources such as the CPU 112 and RAM (not shown).
- The noise gate function includes a noise gate function used during voice recording and a noise gate function executed after voice recording.
- The noise gate function during voice recording generates recording data by starting recording when the volume of the voice input from the microphone device 101 exceeds a predetermined level and stopping recording when the volume falls below the predetermined level.
- The noise gate function executed after voice recording operates on recording data that has already been recorded: it discards the portions where the volume is lower than a predetermined level and holds only the portions where the volume exceeds the predetermined level as recording data.
- Each of these three pieces of data is called a recording data element.
- V1, V2, and V3 shown as the recording data in FIG. 2 correspond to the above three recording data elements.
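The post-recording noise gate can be illustrated with a minimal sketch. It assumes the recording is a list of amplitude samples; the function name `split_by_noise_gate` and the `min_silence` parameter (the number of consecutive quiet samples that counts as a silent portion) are hypothetical and do not appear in the specification.

```python
from typing import List, Tuple


def split_by_noise_gate(
    samples: List[float], threshold: float, min_silence: int
) -> List[Tuple[int, List[float]]]:
    """Split a recording into elements, discarding quiet stretches.

    Returns (start_index, element_samples) pairs. A stretch is treated as a
    silent portion only when at least `min_silence` consecutive samples are
    at or below `threshold` (an assumption made for this sketch).
    """
    elements: List[Tuple[int, List[float]]] = []
    start = None  # index where the current element began, if any
    quiet = 0     # consecutive quiet samples seen inside the current element
    for i, sample in enumerate(samples):
        if abs(sample) > threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_silence:
                end = i - quiet + 1  # first quiet sample of this gap
                elements.append((start, samples[start:end]))
                start = None
                quiet = 0
    if start is not None:
        # The recording ended while an element was still open.
        elements.append((start, samples[start:len(samples) - quiet]))
    return elements
```

With a recording of three spoken phrases separated by pauses, a function of this kind would return three elements corresponding to V1, V2, and V3.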
- The BGM data BD is divided into a plurality of measures as shown in the figure.
- In the BGM data BD of the present embodiment, a timing marker specifying the head position of each measure (a point on the time axis) is recorded.
- BGM data BD in which timing markers specifying the head positions of the measures are set is called music data with markers.
- Setting a marker at the head position of each measure is only an example; markers may be set at points other than measure boundaries.
- The authoring program AP links each recording data element decomposed as described above to a timing marker position of the music data with markers.
- Each recording data element is synthesized at the head of a measure of the music data with markers.
- Synthesized voice data is generated by synthesizing the recording data and the BGM data BD.
- Regardless of the actual recording timing, "Hello" is played from the head of the first measure of the background music reproduced from the BGM data BD, "Today is the weather" from the head of the second measure, and "Let's go play somewhere!" from the head of the third measure. In this way, the background music and the recorded data are synchronized.
- The phrase of the recording data element V3 extends into the fourth measure of the BGM data BD.
- In such a case, the third measure of the BGM data BD may be played back in a loop so that the background music and the recorded data are combined without a sense of incongruity.
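The marker-synchronized synthesis described above can be sketched as follows. This is only an illustration under stated assumptions: audio is modeled as lists of integer samples, markers are sample indices, and synthesis is plain additive mixing. The function name `synthesize_at_markers` is hypothetical. An element that runs past the end of the BGM is padded with silence here, which is a simplification; the text instead suggests looping a measure.

```python
from typing import List, Sequence


def synthesize_at_markers(
    bgm: Sequence[int],
    elements: Sequence[Sequence[int]],
    markers: Sequence[int],
) -> List[int]:
    """Mix each recording data element into the BGM at its timing marker.

    The i-th element starts at markers[i], regardless of when it was
    actually recorded.
    """
    mixed = list(bgm)
    for element, start in zip(elements, markers):
        end = start + len(element)
        if end > len(mixed):
            # Simplification: extend with silence rather than looping BGM.
            mixed.extend([0] * (end - len(mixed)))
        for offset, sample in enumerate(element):
            mixed[start + offset] += sample
    return mixed
```

Because each element is placed by marker position rather than by recording time, pauses of any length between phrases in the original recording do not affect the result.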
- The sound effect data synthesis function is also a function of the authoring program AP.
- The sound effect data ED can be synthesized at any point on the playback time axis of the synthesized audio data.
- The synthesis method is as follows.
- The authoring program AP accepts sound effect insertion instructions from the user while reproducing the BGM data BD (or the synthesized voice data). While listening to the background music, the user operates the operation unit 107 at the point where a sound effect is to be inserted, thereby issuing a sound effect insertion instruction. In this way, synthesized voice data in which the BGM data BD, recording data, and sound effect data ED are synthesized can be generated.
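As a rough sketch of this insertion step, assume that each user operation is captured as the playback position in seconds at the moment the key was pressed, and that audio is a list of integer samples. The function name `insert_sound_effects` and its parameters are hypothetical; the specification does not define how the designated point is represented.

```python
from typing import List, Sequence


def insert_sound_effects(
    mix: Sequence[int],
    effect: Sequence[int],
    press_times_sec: Sequence[float],
    sample_rate: int,
) -> List[int]:
    """Overlay a sound effect at each user-designated playback time."""
    out = list(mix)
    for t in press_times_sec:
        start = round(t * sample_rate)  # playback time -> sample index
        for offset, sample in enumerate(effect):
            if start + offset < len(out):
                out[start + offset] += sample
            # Effect samples beyond the end of the mix are dropped.
    return out
```

The same routine could be applied repeatedly with different effects (clapping, cymbals, and so on), one call per effect type.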
- The authoring program AP can synthesize video data such as still images or moving images with the generated synthesized audio data to generate synthesized audio data with video data.
- The synthesis method is as follows.
- The first method links each piece of video data to a measure of the BGM data BD as shown in the figure. One piece of video data may be linked across multiple measures.
- In the second method, the authoring program AP is executed and the user specifies the switching points of the video data while the synthesized audio data is played back.
- Synthesized audio data is generated by synthesizing the BGM data BD, recording data, and sound effect data ED, and synthesized audio data with video data is then generated by further synthesizing the video data.
- The authoring program AP uses the audio processing unit 108 to compress the synthesized voice data generated by the above processing based on a standard such as MP3 or AAC. The synthesized audio data with video data generated by the above processing is subjected to video compression based on a standard such as MPEG4 using the audio processing unit 108 and the video processing unit 109. In this manner, the synthesized voice data in the present embodiment is stored in the memory 106 as voice electronic mail compressed based on a standard such as MP3 or AAC. Alternatively, the synthesized audio data with video data is stored in the memory 106 as multimedia mail compressed based on a standard such as MPEG4 (the audio is based on a standard such as MP3 or AAC).
- The authoring program AP can also convert the synthesized voice data into a general-purpose music data format (for example, mmf) that is normally used in a mobile phone terminal.
- In this case, the authoring program AP does not use the audio processing unit 108 and converts the synthesized voice data into the general-purpose music data format by processing on the CPU 112.
- The data generated in this way is stored in the memory 106 as voice electronic mail.
- The CPU 112 designates the destination address of the e-mail and sends the voice e-mail or multimedia mail to the destination address.
- A terminal that receives the voice electronic mail can play back the voice synthesized with the BGM.
- Because this voice was actually recorded by the sender and is not an artificial voice, emotional expression can be conveyed accurately.
- Since the recorded voice is played back in time with the BGM, the transmitted content can be presented in various ways, unlike an ordinary voice call.
- The expressiveness of voice mail can be further enhanced by sound effects inserted at arbitrary points.
- With multimedia mail, video is played in addition to this expressive voice mail, enabling communication with even richer expressiveness.
- Synthesized audio data or synthesized audio data with video data is composed of multiple data layers.
- Synthesized audio data or synthesized audio data with video data can therefore be defined by the name of each data file (data identification information) synthesized with the BGM data BD and the synchronization information (time information, etc.) relative to the BGM data BD.
- The data identification information and the synchronization information are used as scenario data, and only the recording data and the scenario data are transmitted to the destination address.
- The scenario data includes information specifying the BGM data BD, sound effect data ED, and video data, and information indicating the timing at which these data and the recording data are combined.
- The sending terminal does not need to send each data file, so the amount of data transferred can be reduced.
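One way to picture scenario data is as a small structured record that names each data file and gives its synchronization information. The sketch below uses JSON purely as an assumed encoding; the field names (`bgm`, `recording`, `effects`, `videos`) and the helper functions are hypothetical, since the specification does not define a concrete scenario format.

```python
import json
from typing import Dict, Iterable, Sequence, Tuple


def build_scenario(
    bgm_id: str,
    element_markers: Sequence[int],
    effects: Iterable[Tuple[str, float]],
    videos: Iterable[Tuple[str, int, int]] = (),
) -> Dict:
    """Describe a synthesis by identifiers and timing only, with no audio."""
    return {
        # Data identification information for the background music.
        "bgm": bgm_id,
        # Which timing marker each recording data element is linked to.
        "recording": [
            {"element": i, "marker": m} for i, m in enumerate(element_markers)
        ],
        # Sound effects and the playback time at which each is inserted.
        "effects": [{"id": eid, "time_sec": t} for eid, t in effects],
        # Video data and the range of measures it is linked to.
        "videos": [
            {"id": vid, "from_measure": a, "to_measure": b}
            for vid, a, b in videos
        ],
    }


def encode_scenario(scenario: Dict) -> bytes:
    """Serialize scenario data for attachment alongside the recording."""
    return json.dumps(scenario, sort_keys=True).encode("utf-8")


def decode_scenario(body: bytes) -> Dict:
    """Recover scenario data on the receiving side."""
    return json.loads(body.decode("utf-8"))
```

A record of this kind is typically a few hundred bytes, whereas the BGM, sound effect, and video files it refers to can be far larger, which is the source of the transfer saving described above.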
- The scenario data is executed by a predetermined application program: according to the data identification information described in the scenario data, the BGM data, sound effect data, video data, and so on are read out from the recording medium and played according to the described synchronization information.
- <Synthesis process flow>
- In the noise gate processing (step S10), the recorded data is decomposed into recorded data elements.
- FIG. 6 is a flowchart showing the processing contents of the noise gate processing (step S10).
- In step S11, the noise gate processing timing is selected. If noise gate processing is set to be performed after recording, microphone recording processing (step S12) is performed, and then noise gate processing (step S13) is performed on the recording data. If noise gate processing is set to be performed during recording, noise gate recording processing (step S14) is executed.
- The recording data elements are generated by the above processing. The user can freely change the setting of whether noise gate processing is performed after recording or during recording.
- FIG. 7 is a flowchart showing the processing contents of the BGM data acquisition processing (step S20).
- In step S21, the source from which the BGM data BD is read is selected.
- When the card medium (ROM card 103) is selected, processing proceeds to step S22 for the card medium; otherwise, connection processing to the public server is performed in step S25.
- In steps S23 and S26, the BGM data BD to be used in the synthesis processing is selected from among the plurality of BGM data BD.
- The obtained BGM data BD is decrypted (steps S24 and S27).
- A synthesis process is performed (step S30).
- The contents of the synthesis process are as described above: the recording data elements are synthesized so as to be synchronized with the timing markers of the music data with markers.
- The data after the synthesis processing has a layer structure as shown in FIG. 2, and the BGM data BD and the recording data are synthesized in a separable state. For example, link information between the BGM data BD and the recorded data is generated. Alternatively, synthesized voice data including a plurality of tracks (channels) is generated, and each piece of data is stored in its own track. Note that the recording data elements are synthesized at the timing marker positions of the BGM data BD.
- A specific recording data element may also be synthesized at a plurality of timing marker positions. For example, synthesizing a specific recording data element at several consecutive timing marker positions repeats a specific message, which makes it possible to emphasize a particularly important phrase in the recorded message.
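The separable, multi-track layer structure and the repetition of one element at several markers can be sketched together. The `Track` class and the `place` and `render` helpers below are hypothetical names chosen for illustration, and audio is again modeled as lists of integer samples.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class Track:
    """One layer of the synthesized data, kept separate from other layers."""
    name: str
    # Each clip is (start sample index, samples).
    clips: List[Tuple[int, List[int]]] = field(default_factory=list)


def place(track: Track, samples: Sequence[int], starts: Sequence[int]) -> None:
    """Place the same element at every given marker position (repetition)."""
    for start in starts:
        track.clips.append((start, list(samples)))


def render(tracks: Sequence[Track], length: int) -> List[int]:
    """Mix the chosen tracks down to a single stream of `length` samples."""
    out = [0] * length
    for track in tracks:
        for start, samples in track.clips:
            for offset, sample in enumerate(samples):
                if start + offset < length:
                    out[start + offset] += sample
    return out
```

Because each layer keeps its own clips, rendering only a subset of the tracks recovers, for example, the BGM without the voice, which is one reading of "synthesized in a separable state".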
- a sound effect insertion process (step S40) is performed.
- the sound effect insertion method is as described above, and the sound effect is synthesized at an arbitrary point on the playback time axis of the synthesized voice data in response to the user's designated operation.
- the synthesized speech data after the sound effect is inserted also has a layer structure as shown in FIG. 2, and is synthesized in a state where each data can be separated.
- the method for obtaining the sound effect data ED is the same as the method for obtaining the BGM data BD shown in FIG.
- in step S50, a selection is made as to whether or not to add video. If a video is added, a multimedia mail creation process (step S60) is performed. If a video is not added, a voice electronic mail (voice mail) creation process (step S70) is performed. Whether or not to add video is specified by the user.
- a multimedia mail creation process step S60
- a voice electronic mail (voice mail) creation process step S70
- FIG. 8 is a flowchart showing the contents of the multimedia mail creation process.
- a file format is selected (step S61).
- the moving image compression format such as MPEG
- the specified video data is added and the compression processing is executed according to the set file format (step S62).
- when scenario data is set as the file format, scenario data defining the synthesized audio data with video data is generated (step S63). This scenario data also includes information specifying video data.
- the multimedia mail generated in this way is stored in the memory 106 (step S64).
- the data after the processing in step S62 is compressed audio data with video data
- the data after the processing in step S63 is scenario data and recording data.
- the multimedia mail stored in the memory 106 is sent to another terminal with a destination address designated (step S65).
- FIG. 9 is a flowchart showing the contents of voice electronic mail (voice mail) creation processing.
- a file format is selected (step S71).
- a voice compression format such as MP3 or AAC is set as the file format
- compression processing is executed according to the set file format (step S72).
- the general-purpose music file format standardized by the mobile phone terminal is set as the file format
- conversion processing to the set file format is performed (step S73).
- when scenario data is set as the file format, scenario data defining the synthesized speech data is generated (step S74).
- the voice e-mail generated in this manner is stored in the memory 106 (step S75).
- the voice e-mail stored in the memory 106 is compressed or converted synthesized voice data after the processing in steps S72 and S73, and the data after the processing of step S74 is scenario data and recorded data.
- the voice electronic mail stored in the memory 106 is transmitted to another terminal with the destination address designated (step S76).
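The format-dependent branches of the voice mail creation process (steps S71 to S74) can be summarized in a short sketch. The function name `create_voice_mail`, the format labels, and the dictionary fields are assumptions made for illustration; no real encoder is invoked, and the contents of the scenario data are placeholders.

```python
def create_voice_mail(synth_audio, file_format):
    """Return (payload_kind, payload) for the selected file format.

    synth_audio is a hypothetical dict with the keys "bgm", "markers"
    and "recording"; it stands in for the synthesized voice data.
    """
    if file_format in ("mp3", "aac"):
        # Step S72: compression. The actual encoding is out of scope here.
        return ("compressed", {"codec": file_format, "source": synth_audio})
    if file_format == "general_music":
        # Step S73: conversion to the general-purpose music file format.
        return ("converted", {"source": synth_audio})
    if file_format == "scenario":
        # Step S74: scenario data defining the synthesis, sent together
        # with the recorded data rather than a rendered audio file.
        scenario = {"bgm": synth_audio["bgm"], "markers": synth_audio["markers"]}
        return ("scenario", {"scenario": scenario,
                             "recording": synth_audio["recording"]})
    raise ValueError(f"unsupported file format: {file_format}")


audio = {"bgm": "bgm-01", "markers": [0.0, 2.0], "recording": "rec-01"}
kind, payload = create_voice_mail(audio, "scenario")
```

In the scenario branch the receiving side would need its own copy of the BGM data BD to reproduce the message, which is why only the scenario data and the recorded data form the payload.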
- the synthesis process for generating a voice electronic mail or a multimedia mail is executed in the mobile phone terminal 100. That is, as shown in FIG. 10, the mobile phone terminal 100 executes all the processes of voice input, voice recording, synthesis, and mail transmission, and sends the synthesized e-mail to the receiving terminal 200.
- the synthesis process is executed in the synthesis server 300 connected via a network.
- the cellular phone terminal 100 executes only voice input processing and voice recording processing.
- the mobile phone terminal 100 transmits the recording data to the synthesis server 300, and the synthesis server 300 executes the synthesis process.
- information indicating the conditions for the synthesis process may be transmitted from the mobile phone terminal 100 to the synthesis server 300.
- the information indicating the conditions of the synthesis process may be information of the same type as the scenario data described in the first embodiment.
- the synthesis server 300 performs a synthesis process similar to that of the first embodiment, thereby generating a voice electronic mail or a multimedia mail.
- when a voice electronic mail or a multimedia mail is generated, this e-mail is stored in a storage device.
- information specifying a URL is transmitted to the mail receiving terminal 200 as access path information to the electronic mail stored in the storage device.
- the mail receiving terminal 200 can receive voice electronic mail or multimedia mail by specifying this URL.
- here, the access path information (URL information) is transmitted from the synthesis server 300 to the mail receiving terminal 200. Alternatively, the access path information may be transmitted to the mail receiving terminal 200 from the mobile phone terminal 100, which is the mail transmitting terminal.
- since the mobile phone terminal 100, which is the mail transmission terminal, does not execute the synthesis process, the processing load on the terminal can be reduced.
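The store-and-notify scheme of this embodiment can be illustrated with a small in-memory sketch. The class `SynthesisServerStore`, its methods, and the example host name are hypothetical; the source only states that the generated mail is stored and that a URL is sent to the mail receiving terminal 200 as access path information.

```python
class SynthesisServerStore:
    """In-memory stand-in for the storage device of the synthesis server 300."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")
        self._mails = {}

    def store(self, mail_bytes):
        """Store a generated mail and return its access path (URL)."""
        mail_id = str(len(self._mails) + 1)
        self._mails[mail_id] = mail_bytes
        return f"{self.base_url}/{mail_id}"

    def fetch(self, url):
        """Return the mail that the receiving terminal requests by URL."""
        prefix = self.base_url + "/"
        if not url.startswith(prefix):
            raise KeyError(url)
        return self._mails[url[len(prefix):]]


# The host name below is a placeholder, not taken from the source.
store = SynthesisServerStore("https://synthesis.example/mail/")
url = store.store(b"voice-mail")        # sent to the mail receiving terminal 200
received = store.fetch(url)             # the terminal retrieves the mail by URL
```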
- the third embodiment is different from the second embodiment in that voice recording is also performed by the synthesis server 300.
- the other points are the same as in the second embodiment.
- the cellular phone terminal 100 that is a mail transmission terminal connects a telephone line to the synthesis server 300.
- the user utters a message toward the microphone device 101.
- This message is transferred to the synthesis server 300 through the telephone line, and recording processing is performed in the synthesis server 300.
- the subsequent processing is the same as in the second embodiment.
- the processing load on the terminal can be reduced.
- the fourth embodiment is different from the first embodiment in that various voice processing is applied to the voice input from the microphone device 101, and the voice after the voice processing is synthesized with the BGM data BD.
- the processing flow is substantially the same as that described with reference to the flowcharts of Figs. However, the synthesis process in step S30 is different. That is, in the synthesis process of step S30 in the first embodiment, the recorded voice is directly synthesized with the BGM data BD, but in this embodiment, the recorded voice is processed and then synthesized with the BGM data BD.
- FIG. 13 is a flowchart of the synthesis process (step S30) in this embodiment. This synthesis process is also a process executed by the authoring program AP. In the synthesis processing flow, first, voice processing (step S31) is executed. Audio processing is processing that modulates recorded audio or that gives special effects to recorded audio.
- the processing for giving special effects to the sound includes equalizer processing, harmonization processing, echo processing, and the like.
- Equalizer processing is processing that changes the frequency characteristics of a voice message by emphasizing the high range, enhancing the low range, or cutting a specific range.
- the harmonization process is a process that adds other chords to the pitch of the recorded voice to make the voice message a chord.
- the echo process is a process for reproducing a recorded voice with a time difference and making the sound resonate. For example, a message with a beautiful atmosphere can be generated by the harmonization process. In addition, a fantastic atmosphere can be produced by echo processing.
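As an illustration of the echo process, the following sketch mixes delayed and attenuated copies of a signal into itself. The function name `apply_echo` and the parameters `delay_s`, `decay`, and `repeats` are assumptions for the example; the source does not specify how the echo is computed.

```python
def apply_echo(samples, sample_rate, delay_s, decay, repeats):
    """Mix delayed, attenuated copies of the signal into itself.

    samples:     list of float samples of the recorded voice.
    delay_s:     time difference between successive echoes, in seconds.
    decay:       gain factor applied per echo (0 < decay < 1 fades out).
    repeats:     number of echoes, which determines how long the echo lasts.
    """
    delay = int(round(delay_s * sample_rate))
    if delay <= 0:
        raise ValueError("delay must be at least one sample")
    # Extend the output so the last echo is not cut off.
    out = list(samples) + [0.0] * (delay * repeats)
    for k in range(1, repeats + 1):
        gain = decay ** k
        offset = delay * k
        for i, s in enumerate(samples):
            out[offset + i] += gain * s
    return out


echoed = apply_echo([1.0, 0.0], sample_rate=10, delay_s=0.2, decay=0.5, repeats=2)
```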
- in step S31, it is first determined whether or not to change the tempo of the recorded sound (step S301). If the tempo is not changed, the process proceeds to step S305.
- when changing the tempo (Yes in step S301), it is determined whether or not to perform the synchronization process (step S302).
- the manual tempo change process is a process for changing the tempo of the recorded sound in accordance with the tempo setting specified by the user.
- the automatic tempo change process is a process in which the tempo of the recorded sound is changed to synchronize with the BGM data BD.
- a method for obtaining a tempo when recording sound (before changing the tempo) will be described.
- One is a method determined by the guide voice.
- When voice is input by the user, a guide voice with a certain rhythm is played like a metronome. The user records the voice while listening to the guide voice. The recorded voice includes information on the guide voice, and the tempo of the recorded voice is determined by the guide voice.
- Another method is to automatically acquire the rhythm of the recorded voice.
- Voice analysis processing is performed on the recorded voice, and the voice is divided into syllable units. The tempo of the recorded voice is automatically acquired from the voicing timing of each syllable.
- the tempo at the time of recording the recorded sound is determined by the above two methods.
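The second method, acquiring the tempo from syllable timings, might look like the sketch below. It assumes the voice analysis has already produced syllable onset times and that each syllable corresponds to one beat; both assumptions, and the name `estimate_tempo_bpm`, are illustrative and not taken from the source.

```python
from statistics import median


def estimate_tempo_bpm(onsets_s):
    """Estimate the tempo from syllable onset times given in seconds.

    Assumes one beat per syllable; the median interval is used so that
    a single long pause does not dominate the estimate.
    """
    if len(onsets_s) < 2:
        raise ValueError("need at least two syllable onsets")
    intervals = [b - a for a, b in zip(onsets_s, onsets_s[1:])]
    if any(d <= 0 for d in intervals):
        raise ValueError("onsets must be strictly increasing")
    return 60.0 / median(intervals)


recorded_bpm = estimate_tempo_bpm([0.0, 0.5, 1.0, 1.5])
```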
- the manual tempo change processing method is a method for correcting the tempo at the time of recording in accordance with the tempo setting designated by the user. For example, if the recording tempo is determined by the guide voice, the tempo of the guide voice is corrected to the set tempo. Accordingly, the tempo of the recorded sound is also changed. Alternatively, if the recording tempo is automatically acquired, that tempo is automatically changed according to the tempo setting.
- the automatic tempo change processing method is a method in which the tempo at the time of recording of the recorded sound is automatically changed so as to be synchronized with the BGM data BD.
- if the tempo of the BGM data BD is recorded in the BGM data BD, it can be used.
- otherwise, the tempo of the BGM data BD can be obtained by audio analysis. For example, it is possible to analyze the tempo based on a rhythmic sound such as a drum. If the recording tempo is determined by the guide voice, the tempo of the guide voice is changed to match the tempo of the BGM data BD. Accordingly, the tempo of the recorded sound is also changed to synchronize with the BGM. Alternatively, when the recording tempo is automatically acquired, this automatically acquired tempo is automatically changed to synchronize with the tempo of the BGM data BD.
- when the synchronization process is performed (Yes in step S302), the tempo of the recorded sound is automatically changed to synchronize with the BGM data BD (step S303). When the synchronization process is not performed (No in step S302), the tempo of the recorded sound is changed according to the set value (step S304).
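The branching of steps S301 to S304 can be expressed as a function that returns the speed ratio to apply to the recorded voice. This is a sketch under assumptions: tempos are expressed in beats per minute, and the name `tempo_ratio` and its parameters are hypothetical. The time-stretching itself is not shown.

```python
def tempo_ratio(recorded_bpm, change_tempo, sync_to_bgm, bgm_bpm=None, set_bpm=None):
    """Return the playback-speed ratio for the recorded voice."""
    if not change_tempo:
        # No in step S301: leave the tempo as recorded.
        return 1.0
    if sync_to_bgm:
        # Yes in step S302, then step S303: match the tempo of the BGM data BD.
        if bgm_bpm is None:
            raise ValueError("BGM tempo is required for synchronization")
        return bgm_bpm / recorded_bpm
    # No in step S302, then step S304: follow the tempo set by the user.
    if set_bpm is None:
        raise ValueError("a tempo setting is required for manual change")
    return set_bpm / recorded_bpm


ratio = tempo_ratio(100.0, change_tempo=True, sync_to_bgm=True, bgm_bpm=120.0)
```

A ratio above 1 means the recorded voice is sped up, and a ratio below 1 means it is slowed down.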
- in step S305, it is determined whether or not to perform pitch shift processing. If it is set to perform pitch shift processing (Yes in step S305), pitch shift processing is executed according to the set value (step S306).
- the set value for the pitch shift process is the pitch shift amount.
- the shift amount of the recorded voice can be set by the user.
- in step S307, it is determined whether or not to perform equalizer processing. If it is set to perform equalizer processing (Yes in step S307), equalizer processing is executed according to the set value (step S308).
- the set value of the equalizer process is information on a sound range to be emphasized or information on a sound range to be cut, and can be set by the user.
- in step S309, it is determined whether or not to perform harmonization processing. If it is set to perform harmonization processing (Yes in step S309), harmonization processing is executed according to the set value (step S310).
- the setting value of the harmonization process can be set by the user. For example, a third sound is added to the bass sound. Alternatively, settings such as adding the 3rd and 5th sounds to the bass sound can be made.
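To make the "3rd and 5th" setting concrete, the sketch below computes the frequencies of the added notes from the frequency of the base note. It assumes 12-tone equal temperament, a major third (4 semitones), and a perfect fifth (7 semitones); the source does not state the tuning or the interval quality, and `harmony_frequencies` is a hypothetical name.

```python
def harmony_frequencies(root_hz, semitone_offsets=(4, 7)):
    """Frequencies of the notes added above the root.

    In 12-tone equal temperament, one semitone corresponds to a
    frequency ratio of 2 ** (1 / 12).
    """
    return [root_hz * 2 ** (n / 12) for n in semitone_offsets]


third_hz, fifth_hz = harmony_frequencies(220.0)
```

Pitch-shifted copies of the recorded voice at these frequencies would then be mixed with the original to form the chord.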
- in step S311, it is determined whether or not to perform echo processing. If it is set to perform echo processing (Yes in step S311), echo processing is executed according to the set value (step S312).
- the set value for echo processing is information that specifies the time difference of echo sound, the duration of echo sound, etc., and can be set by the user.
- in step S31, it is only necessary that the user can set which audio processing is executed.
- the user can freely select a combination of audio processing, for example automatic tempo change together with harmonization processing, or manual tempo change together with pitch shift processing and echo processing.
- the user can specify the point on the time axis of the voice to which voice processing is applied. For example, the first half of the voice can produce a fantastic atmosphere by echo processing, and the second half can be given a faster tempo to create a message with a sense of speed.
- This setting set specifies the combination of voice processing to be executed and the setting value of each voice processing to be executed. For example, it would be convenient to have a set that matches the tune, such as rock, ballad, and rap. Alternatively, it is convenient to have a set that matches emotions such as sadness, anger, and joy. If the user selects rock music as BGM data BD and selects rock music as the setting set, the user can easily create a rock voice message.
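A setting set can be modeled as a named bundle of processing choices and their values. The table below is a sketch only: every key, preset name, and value is an illustrative placeholder rather than something defined in the source, and `enabled_processes` is a hypothetical helper.

```python
SETTING_SETS = {
    # All values below are illustrative placeholders, not taken from the source.
    "rock": {"tempo": "sync", "pitch_shift": 0,
             "equalizer": "low_boost", "echo": None},
    "ballad": {"tempo": None, "pitch_shift": 0,
               "equalizer": None, "echo": {"delay_s": 0.3}},
    "rap": {"tempo": "sync", "pitch_shift": -2,
            "equalizer": None, "echo": None},
}


def enabled_processes(set_name):
    """List the audio processes a setting set turns on, in flowchart order."""
    settings = SETTING_SETS[set_name]
    order = ["tempo", "pitch_shift", "equalizer", "harmonize", "echo"]
    # A missing key, None, or a zero shift amount means the process is skipped.
    return [name for name in order if settings.get(name)]


rock_processes = enabled_processes("rock")
```

Selecting one name then replaces the individual yes/no decisions of steps S301 to S311 with a single choice.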
- when the audio processing is completed in step S31, the synthesis processing of the audio subjected to the audio processing and the BGM data BD is performed (step S32).
- steps S40, S50, S60 (or S70) shown in FIG. 5 are executed, and multimedia mail or voice mail is transmitted.
- the recorded voice is modulated or synthesized with BGM data BD after giving various special effects. This makes it possible to further enhance the expressiveness of voice mail and voice mail with video.
- By devising a combination of audio processing settings and BGM data BD, it is possible to play messages that match the mood and tone of BGM data BD.
- with the noise gate process and the tempo change process, the BGM does not simply play independently of the recorded voice but merges with the recorded voice to constitute one sound.
- with the noise gate processing, the clauses or sentences of the recorded voice are linked to and merged with the measures of the BGM.
- the tempo change process merges the syllables of the recorded voice into the BGM rhythm within each measure of the BGM, creating a unified sound.
- in the above description, the BGM data BD is acquired, and then the voice processing and the synthesis processing of the recorded voice are performed. However, audio processing may be executed during recording. In this case, voice recording and voice processing are performed in parallel, and then BGM data BD is acquired, and the voice after voice processing and BGM data BD are synthesized.
- however, the automatic tempo change processing requires information on the BGM data BD, so it should be performed after the BGM data BD is obtained.
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Human Resources & Organizations (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- Entrepreneurship & Innovation (AREA)
- Strategic Management (AREA)
- Human Computer Interaction (AREA)
- Operations Research (AREA)
- Health & Medical Sciences (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- Computer Hardware Design (AREA)
- Marketing (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Economics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Telephone Function (AREA)
- Information Transfer Between Computers (AREA)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020077016878A KR101236496B1 (ko) | 2004-12-24 | 2005-12-09 | E-mail transmission terminal and e-mail system |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2004373043A JP2006127443A (ja) | 2004-09-30 | 2004-12-24 | E-mail transmission terminal and e-mail system |
JP2004-373043 | 2004-12-24 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2006067981A1 WO2006067981A1 (ja) | 2006-06-29 |
Family
ID=36601583
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2005/022677 WO2006067981A1 (ja) | 2004-12-24 | 2005-12-09 | E-mail transmission terminal and e-mail system |
Country Status (2)
Country | Link |
---|---|
KR (1) | KR101236496B1 (ja) |
WO (1) | WO2006067981A1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2011507092A (ja) * | 2007-12-13 | 2011-03-03 | Samsung Electronics Co., Ltd. | Multimedia e-mail composition apparatus and method thereof |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH07212475A (ja) * | 1994-01-25 | 1995-08-11 | Hitachi Ltd | Voice mail voice message superimposition method |
JP3060617U (ja) * | 1998-12-28 | 1999-09-07 | Hitachi Software Engineering Co., Ltd. | Image synthesis output device with voice synthesis output function |
JP2000224269A (ja) * | 1999-01-28 | 2000-08-11 | Feisu:Kk | Telephone and telephone system |
JP2001125599A (ja) * | 1999-10-25 | 2001-05-11 | Mitsubishi Electric Corp | Audio data synchronization device and audio data creation device |
JP2003195863A (ja) * | 2001-12-27 | 2003-07-09 | Sony Corp | Information creation device, mobile phone, and information creation method |
JP2004326152A (ja) * | 2003-04-21 | 2004-11-18 | Yamaha Corp | Music content utilization device and program |
-
2005
- 2005-12-09 WO PCT/JP2005/022677 patent/WO2006067981A1/ja not_active Application Discontinuation
- 2005-12-09 KR KR1020077016878A patent/KR101236496B1/ko not_active IP Right Cessation
Non-Patent Citations (1)
Title |
---|
NAKAJIMA A.: "MIDI Bible I MIDI 1.0 Kikaku Kisohen", vol. 2ND ED., 1 June 1998, KABUSHIKI KAISHA RITT MYUJIKKU, pages: 246 - 247, XP003007771 * |
Also Published As
Publication number | Publication date |
---|---|
KR101236496B1 (ko) | 2013-02-21 |
KR20070091679A (ko) | 2007-09-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
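KR100619826B1 (ko) | Apparatus and method for synthesizing music and voice in a mobile communication terminal | |
TW529018B | Terminal apparatus, guide voice reproducing method, and storage medium | |
KR100549634B1 (ko) | Data compression method, data transmission method, and data reproduction method | |
KR101236496B1 (ko) | E-mail transmission terminal and e-mail system | |
JP2006127443A (ja) | E-mail transmission terminal and e-mail system | |
KR20080016109A (ko) | Method and system for providing an audiobook service with background sound output | |
KR100731232B1 (ko) | Music data editing and playback device and portable information terminal | |
US20210367987A1 | Live Broadcast Network Using Musical Encoding to Deliver Solo, Group or Collaborative Performances | |
JP4512286B2 (ja) | Program transmission system and program transmission device used therein | |
JP2005062420A (ja) | Content generation system, content generation method, and content generation program | |
JP3859200B2 (ja) | Portable mixing recording device, control method therefor, and program | |
JP2002229576A (ja) | Portable karaoke terminal, model singing signal transmission device, and portable karaoke system | |
JP2001051688A (ja) | E-mail reading device using speech synthesis | |
US20040193429A1 | Music file generating apparatus, music file generating method, and recorded medium | |
JP2007251581A (ja) | Voice transmission terminal and voice playback terminal | |
JP2008191536A (ja) | Device for recording a singing voice and synthesizing it with accompaniment music | |
TW200427297A | Speech and music reproduction apparatus | |
JP2005217614A (ja) | Telephone terminal device and music playback method | |
JP4244706B2 (ja) | Audio playback device | |
JP4337726B2 (ja) | Portable terminal device, program, and recording medium | |
JP4514513B2 (ja) | Music mail output method, music mail output system, and music output device | |
JP4153453B2 (ja) | Music playback device | |
GB2395631A | Audio file reproduction in a mobile telecommunications device | |
KR100755526B1 (ko) | Ringtone generating device, ringtone generating method, and recording medium on which the ringtone generating method is recorded | |
JP2006235468A (ja) | Music file generation device and portable terminal device using the same |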
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS KE KG KM KN KP KR KZ LC LK LR LS LT LU LV LY MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
NENP | Non-entry into the national phase |
Ref country code: DE |
|
WWE | Wipo information: entry into national phase |
Ref document number: 1020077016878 Country of ref document: KR |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 05814474 Country of ref document: EP Kind code of ref document: A1 |
|
WWW | Wipo information: withdrawn in national office |
Ref document number: 5814474 Country of ref document: EP |