WO2011122522A1 - Sensitivity expression word selection system, sensitivity expression word selection method, and program - Google Patents
Sensitivity expression word selection system, sensitivity expression word selection method, and program
- Publication number
- WO2011122522A1 (PCT/JP2011/057543)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sound
- expression word
- frequency
- sensitivity
- sensitivity expression
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/63—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2240/00—Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
- G10H2240/075—Musical metadata derived from musical analysis or for use in electrophonic musical instruments
- G10H2240/085—Mood, i.e. generation, detection or selection of a particular emotional content or atmosphere in a musical piece
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2250/00—Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
- G10H2250/131—Mathematical functions for musical analysis, processing, synthesis or composition
- G10H2250/215—Transforms, i.e. mathematical transforms into domains appropriate for musical signal processing, coding or compression
- G10H2250/235—Fourier transform; Discrete Fourier Transform [DFT]; Fast Fourier Transform [FFT]
Definitions
- The present invention relates to a sensitivity expression word selection system, a sensitivity expression word selection method, and a program.
- A stereo telephone device capable of realizing high-quality, realistic telephone communication has been proposed (for example, Patent Document 1).
- Because the stereo telephone device described in Patent Document 1 can perform stereo voice communication between stereo telephones, users can converse with voice that has a stereoscopic effect rather than monaural sound.
- Patent Document 2 has been proposed as a technique that aims to convey the environmental sound of a place to the other party.
- In Patent Document 2, the telephone number of a content server is input together with the telephone number of the receiver.
- Content servers include those that collect environmental sounds around the caller and distribute them in real time as stereophonic sound data, and those that distribute music.
- Because the receiving-side telephone device is notified of the content server designated on the transmitting side when a call is made, it connects to the content server based on this address information, obtains the stereophonic data, and plays 3D sound through a surround system connected to the telephone device. The receiver can thereby experience almost the same atmosphere as the caller while talking.
- Human beings live among various sounds, including voices, and feel sensibility not only toward the meaning of speech but also toward the sound itself. For example, in a place where many people are present, even if not everyone is speaking, there are sounds of people moving around and leafing through papers. In such a case, a person feels that the place is "zawa-zawa" (a murmuring bustle). On the other hand, even where many people are present, there may be little or no sound at all. In such a case, a person feels that the place is "shiin" (the mimetic word for hushed silence). In this way, human beings feel various sensibilities from the sounds (including silence) of the place they are in.
- However, Patent Document 1 and Patent Document 2 aim to reproduce the sound field occurring on the spot as faithfully as possible in order to create a realistic sound field; they cannot convey the sensibility that a person feels from that sound.
- The present invention was made in view of the above problems. Its object is to provide a sensitivity expression word selection system, a sensitivity expression word selection method, and a program that express the atmosphere of a place and each party's situation with sensitivity expression words appealing to human sensibility, thereby making it easy to share that sensibility and providing a sense of reality.
- The present invention that solves the above problems is a sensitivity expression word selection system comprising: a signal analysis unit that analyzes an audio signal and generates sensitivity sound information about the sound occurring at the location where the audio signal was acquired; and a sensitivity expression word selection unit that, based on the sensitivity sound information, selects a sensitivity expression word expressing what a person feels from the sound occurring at that location.
- The present invention that solves the above problems is a sensitivity expression word selection method that analyzes an audio signal, generates sensitivity sound information about the sound occurring at the location where the audio signal was acquired, and, based on the sensitivity sound information, selects a sensitivity expression word expressing what a person feels from the sound occurring at that location.
- The present invention that solves the above problems is a program that causes an information processing apparatus to execute: a signal analysis process that analyzes an audio signal and generates sensitivity sound information about the sound occurring at the location where the audio signal was acquired; and a sensitivity expression word selection process that, based on the sensitivity sound information, selects a sensitivity expression word expressing what a person feels from the sound occurring at that location.
- FIG. 1 is a block diagram of the sensitivity expression word selection system according to the present embodiment.
- FIG. 2 is a block diagram of the sensitivity expression word selection system according to the first embodiment.
- FIG. 3 is a diagram showing an example of the sensitivity expression word database 21.
- FIG. 4 is a block diagram of the sensitivity expression word selection system according to the second embodiment.
- FIG. 5 is a diagram explaining an example of the frequency information of an audio signal.
- FIG. 6 is a diagram showing an example of the sensitivity expression word database 21 in which, when the sensitivity sound information is the sound pressure level and the frequency centroid, the sensitivity expression words are mapped in two dimensions of the sound pressure level (normalized value) and the frequency centroid (normalized value).
- FIG. 7 is a diagram explaining an example in which the frequency information is the slope of the spectrum envelope.
- FIG. 8 is a diagram explaining an example in which the frequency information is the number of harmonics.
- FIG. 9 is a diagram explaining an example in which the frequency information is the frequency band and the frequency centroid.
- FIG. 10 is a block diagram of the sensitivity expression word selection system according to the third embodiment.
- FIG. 11 is a block diagram of the sensitivity expression word selection system according to the fourth embodiment.
- FIG. 12 is a block diagram of the sensitivity expression word selection system according to the fifth embodiment.
- FIG. 13 is a block diagram of the sensitivity expression word selection system according to the sixth embodiment.
- FIG. 1 is a block diagram of the sensitivity expression word selection system according to the present embodiment.
- The sensitivity expression word selection system of the present embodiment includes an input signal analysis unit 1 and a sensitivity expression word selection unit 2.
- The input signal analysis unit 1 receives an audio signal acquired at a certain predetermined place, analyzes the audio signal, and generates information (sensitivity sound information) about the sound occurring at that place (hereinafter, "sensitivity sound").
- Sensitivity sound is a concept that covers the various sounds occurring when the audio signal is acquired, for example voice and environmental sounds other than voice. Humans live among various sounds, including voices, and feel sensibility not only toward the meaning of speech but also toward the sound itself. For example, in a place where many people are present, even if not everyone is speaking, there are sounds of people moving around and leafing through papers. In such a case, a person feels that the place is "zawa-zawa", for example.
- The input signal analysis unit 1 analyzes the audio signal of the sensitivity sound occurring at the predetermined place, determines what kind of sensitivity sound is occurring on the spot, and generates sensitivity sound information about that sound.
- Examples of sensitivity sound information include the sound pressure level of the audio signal, the frequency of the audio signal, and the type of the sound (for example, voice, or a type of environmental sound other than voice such as rain sound or car sound).
- The sensitivity expression word selection unit 2 selects, based on the sensitivity sound information generated by the input signal analysis unit 1, a sensitivity expression word corresponding to the sensitivity sound occurring when the audio signal was acquired.
- A sensitivity expression word is a word that expresses what a person feels (for example, feelings, sensibility, and sensation) from the sound occurring when the audio signal is acquired.
- For example, when the sound pressure level is high, the sensitivity expression word selection unit 2 selects an onomatopoeic or mimetic sensitivity expression word such as "zawa-zawa" or "gaya-gaya". When the sound pressure level is nearly 0 and the situation is considered close to silence, it selects an onomatopoeic or mimetic word such as "shiin".
- When the frequency of the audio signal is low, the sensitivity expression word selection unit 2 selects a word that evokes construction noise, such as "dodo", or the exhaust sound of a car, such as "boon"; conversely, when the frequency is high, it selects a word expressing a metallic image, such as "kin-kin", or the sound of knocking on wood, such as "kon-kon".
- Further, the sensitivity expression word selection unit 2 can select a more accurate sensitivity expression word according to the type of sound occurring on the spot; for example, it can distinguish a construction drill from the exhaust sound of a car and select "dodo" or "boon" accordingly.
- The sensitivity expression word selected in this way is output as text data, as metadata such as Exif, in a format used for video-search tags, or as sound.
- In the first embodiment, sensitivity sound information is generated by paying attention to the volume of the audio signal acquired from the sensitivity sound occurring at a certain predetermined place, and a sensitivity expression word (an onomatopoeia, a mimetic word, or the like) suited to that place is selected based on this information.
- FIG. 2 is a block diagram of the sensitivity expression word selection system according to the first embodiment.
- The sensitivity expression word selection system includes the input signal analysis unit 1 and the sensitivity expression word selection unit 2.
- The input signal analysis unit 1 has a sound pressure level calculation unit 10.
- The sound pressure level calculation unit 10 calculates the sound pressure of the audio signal of the input sensitivity sound and outputs the normalized sound pressure level (0 to 1.0) to the sensitivity expression word selection unit 2 as sensitivity sound information.
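A minimal sketch of this step in Python (not from the patent: the RMS measure, the dB floor, and the exact mapping to the 0 to 1.0 range are assumptions, since the text only states that the level is normalized):

```python
import numpy as np

def normalized_sound_pressure_level(frame: np.ndarray,
                                    db_floor: float = -60.0) -> float:
    """Map the RMS level of one audio frame (samples in [-1, 1]) to [0, 1]."""
    rms = np.sqrt(np.mean(frame ** 2))
    db = 20.0 * np.log10(max(rms, 1e-10))       # level in dBFS
    return float(np.clip(1.0 - db / db_floor, 0.0, 1.0))
```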
- The sensitivity expression word selection unit 2 includes a sensitivity expression word database 21 and a sensitivity expression word search unit 22.
- The sensitivity expression word database 21 stores sensitivity expression words corresponding to the values (0 to 1.0) of the sensitivity sound information.
- FIG. 3 shows an example of the sensitivity expression word database 21.
- The sensitivity expression word database 21 shown in FIG. 3 lists values of the sensitivity sound information (sound pressure level: 0 to 1.0) and the corresponding sensitivity expression words (for example, onomatopoeia and mimetic words). For example, when the value of the sensitivity sound information is "0.0", the sensitivity expression word is "shiin"; when the value is "0.1", the word is "koso-koso"; when the value is "0.9 or more and less than 0.95", the word is "wai-wai"; and when the value is "0.95 or more and 1 or less", the word is "gaya-gaya". In this way, a sensitivity expression word is stored for each value of the sensitivity sound information.
- The sensitivity expression word search unit 22 receives the sensitivity sound information from the input signal analysis unit 1 and searches the sensitivity expression word database 21 for the corresponding sensitivity expression word. For example, when the value of the sensitivity sound information obtained from the input signal analysis unit 1 is "0.64", the word corresponding to "0.64" is selected from the database. In the example of FIG. 3, values from 0.6 to 0.7 correspond to "pecha-pecha", so "pecha-pecha" is retrieved as the sensitivity expression word for the value "0.64". The retrieved word is output as text data, as metadata such as Exif, in a format used for video-search tags, or as sound.
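The search step thus amounts to a range lookup over the FIG. 3 table. A sketch follows; only the entries quoted above ("shiin", "koso-koso", "pecha-pecha", "wai-wai", "gaya-gaya") come from the text, and the intermediate range boundaries are placeholders:

```python
# Hypothetical excerpt of the FIG. 3 mapping (intermediate spans assumed).
WORD_TABLE = [
    (0.00, 0.10, "shiin"),        # near silence
    (0.10, 0.60, "koso-koso"),    # placeholder span (assumption)
    (0.60, 0.70, "pecha-pecha"),
    (0.70, 0.90, "zawa-zawa"),    # placeholder span (assumption)
    (0.90, 0.95, "wai-wai"),
    (0.95, 1.01, "gaya-gaya"),
]

def lookup_expression_word(level: float) -> str:
    """Return the expression word whose range contains the normalized level."""
    for low, high, word in WORD_TABLE:
        if low <= level < high:
            return word
    raise ValueError("level must lie in [0, 1]")

print(lookup_expression_word(0.64))  # -> pecha-pecha, as in the example above
```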
- In this way, a sensitivity expression word (an onomatopoeia or a mimetic word) corresponding to the sound level of the place is selected: a word that appeals to human sensibility about the situation there.
- In the second embodiment, frequency analysis is performed on the audio signal acquired from the sensitivity sound occurring at a certain predetermined place, and sensitivity sound information is generated by paying attention to both the volume and the frequency spectrum. An example in which a sensitivity expression word suited to the place where the audio signal was acquired is selected based on this information is described below.
- FIG. 4 is a block diagram of the sensitivity expression word selection system according to the second embodiment.
- In addition to the configuration of the first embodiment, the input signal analysis unit 1 includes a frequency analysis unit 11.
- The frequency analysis unit 11 calculates frequency information representing the frequency characteristics of the sound, such as the fundamental frequency of the input signal, the frequency centroid, the frequency band, the slope of the spectrum envelope, and the number of harmonics.
- FIG. 5 shows a conceptual diagram of each item.
- The fundamental frequency represents the pitch of a periodic sound and is determined by the vibration period of the sound.
- The frequency centroid is the energy-weighted average frequency and represents the pitch of a noise-like sound.
- The frequency band is the range of frequencies that the input audio signal can take.
- The spectral envelope represents the overall trend of the spectrum, and its slope affects the timbre.
- The frequency analysis unit 11 outputs the frequency information described above as sensitivity sound information.
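A sketch of how such frequency information might be computed from one frame with an FFT; the window choice, the linear-fit envelope slope, and the energy threshold used for the band are assumptions, since the patent does not fix the computation:

```python
import numpy as np

def frequency_features(frame: np.ndarray, fs: int) -> dict:
    """Compute frequency centroid, envelope slope, and band from one frame."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    centroid = np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-10)
    # Envelope slope: linear fit to the log-magnitude spectrum (assumption).
    slope = np.polyfit(freqs[1:], 20.0 * np.log10(spectrum[1:] + 1e-10), 1)[0]
    # Band: span where the magnitude is non-negligible (threshold assumed).
    active = freqs[spectrum > 0.01 * (spectrum.max() + 1e-10)]
    band = (float(active[0]), float(active[-1])) if active.size else (0.0, 0.0)
    return {"centroid_hz": float(centroid),
            "envelope_slope": float(slope),
            "band_hz": band}
```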
- The sensitivity expression word search unit 22 receives the sound pressure level and the frequency information as sensitivity sound information and selects the corresponding sensitivity expression word from the sensitivity expression word database 21. For this purpose, the sensitivity expression word database 21 stores sensitivity expression words learned in consideration of not only the sound pressure level but also the frequency information.
- FIG. 6 shows an example of the sensitivity expression word database 21 in which, when the sensitivity sound information is the sound pressure level and the frequency centroid, the sensitivity expression words are mapped in two dimensions of the sound pressure level (normalized value) and the frequency centroid (normalized value).
- For sensitivity sound information with a large sound pressure level and a small frequency centroid, the sensitivity expression word search unit 22 determines that a powerful sound is occurring where the audio signal was acquired and selects the sensitivity expression word "don-don".
- For sensitivity sound information with a small sound pressure level and a large frequency centroid, it determines that a weak, unsatisfying sound is occurring and selects the sensitivity expression word "ton-ton".
- For sensitivity sound information with a large sound pressure level and a large frequency centroid, it determines that a sharp sound is occurring and selects the sensitivity expression word "kin-kin".
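Together with the "dull" case of supplementary note 8, these examples suggest a quadrant lookup over the two normalized features. A sketch; the 0.5 boundaries and the word chosen for the fourth quadrant are assumptions:

```python
def select_word_2d(level: float, centroid: float) -> str:
    """Quadrant lookup over (sound pressure level, frequency centroid),
    both normalized to [0, 1]."""
    if level >= 0.5 and centroid < 0.5:
        return "don-don"   # powerful: loud and low
    if level < 0.5 and centroid >= 0.5:
        return "ton-ton"   # weak, unsatisfying: quiet and high
    if level >= 0.5 and centroid >= 0.5:
        return "kin-kin"   # sharp: loud and high
    return "gon-gon"       # dull: quiet and low (word is an assumption)
```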
- The examples above use the sound pressure level together with the frequency centroid or the fundamental frequency, but the present invention is not limited to these.
- For example, when the frequency information is the slope of the spectrum envelope, a sensitivity expression word corresponding to the sound pressure level may be selected from words containing voiced (muddy) consonants, which give a dull impression, when the slope is negative, and from words without voiced consonants, which give a sharp impression, when the slope is positive.
- Alternatively, when the frequency information is the number of harmonics, a sensitivity expression word corresponding to the sound pressure level may be selected from words with voiced consonants, which give a dirty (noise-like) impression, when the number is large, and from words without voiced consonants, which give a clean impression (close to a pure tone), when the number is small.
- When the frequency band is narrow and the fundamental frequency or the frequency centroid is low, the sound gives a non-metallic impression without high-frequency components, and a sensitivity expression word expressing a low sound, for example "don-don", is selected; when the frequency band is wide and the fundamental frequency or the frequency centroid is high, the sound gives a sharp, metallic impression including high-frequency components, and a word such as "kin-kin" may be selected.
- The sensitivity expression word selected in this way is output as text data, as metadata such as Exif, in a format used for video-search tags, or as sound.
- In the third embodiment, the audio signal acquired from the sensitivity sound occurring at a certain predetermined place is classified into voice and environmental sound other than voice, and sensitivity sound information is generated by paying attention to the volume, the frequency analysis, and this voice/environmental-sound discrimination.
- FIG. 10 is a block diagram of the sensitivity expression word selection system according to the third embodiment.
- In addition to the configuration of the second embodiment, the input signal analysis unit 1 includes a voice/environmental sound determination unit 12.
- The voice/environmental sound determination unit 12 determines whether the input audio signal is voice uttered by a person or other environmental sound. The following determination methods can be considered (minimal sketches of two of them follow the list).
- For example, linear prediction over frames of about several ms (10th order in the case of 8 kHz sampling) is performed on the audio signal; if the linear prediction gain is large, the signal is determined to be voice, and if it is small, environmental sound.
- Alternatively, long-term prediction over about ten ms or more is performed on the audio signal, and the decision is made from the prediction gain in the same way.
- Alternatively, the input sound is converted into a cepstrum, the distance between the converted signal and a standard model of voice is measured, and if the distance exceeds a certain value the signal is determined to be environmental sound other than voice.
- As the standard model, a GMM (Gaussian Mixture Model) or an HMM (Hidden Markov Model) can be used.
- The GMM or HMM is created from voice previously uttered by people, using a statistical or machine-learning algorithm.
- A garbage model is a model created from sounds other than human voice, and a universal model is a model created from all sounds, combining human voice with other sounds.
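A sketch of the linear-prediction-gain test from the list above; the order follows the text (10th order at 8 kHz sampling), while the decision threshold is an assumption:

```python
import numpy as np

def lpc_prediction_gain_db(frame: np.ndarray, order: int = 10) -> float:
    """Levinson-Durbin recursion; returns signal-to-residual energy in dB."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1:n + order]
    a = np.zeros(order + 1)
    a[0], err = 1.0, r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / max(err, 1e-12)
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= (1.0 - k * k)
    return 10.0 * np.log10(max(r[0], 1e-12) / max(err, 1e-12))

def is_voice(frame: np.ndarray, threshold_db: float = 6.0) -> bool:
    return lpc_prediction_gain_db(frame) > threshold_db  # threshold assumed
```

And a sketch of the GMM-based decision, comparing average log-likelihoods under a voice model and a garbage model; the feature dimensionality, the mixture sizes, and the placeholder training data are assumptions:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
voice_feats = rng.normal(0.0, 1.0, (500, 12))    # placeholder cepstral features
garbage_feats = rng.normal(2.0, 1.5, (500, 12))  # placeholder non-voice sounds

voice_gmm = GaussianMixture(n_components=8, random_state=0).fit(voice_feats)
garbage_gmm = GaussianMixture(n_components=8, random_state=0).fit(garbage_feats)

def classify(feats: np.ndarray) -> str:
    """Pick the model under which the observed features are more likely."""
    return ("voice" if voice_gmm.score(feats) > garbage_gmm.score(feats)
            else "environmental sound")
```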
- The input signal analysis unit 1 outputs, as sensitivity sound information, the sound pressure level calculated by the sound pressure level calculation unit 10, the frequency information calculated by the frequency analysis unit 11, and the sound type determined by the voice/environmental sound determination unit 12 (voice, or environmental sound other than voice).
- The sensitivity expression word search unit 22 has the same basic configuration as in the second embodiment, but searches for a sensitivity expression word using the sound pressure level, the frequency information, and the sound type (voice, or environmental sound other than voice) as sensitivity sound information. For this purpose, the sensitivity expression word database 21 stores sensitivity expression words learned in consideration of not only the sound pressure level and the frequency information but also the sound type.
- For example, when the sound occurring where the audio signal was acquired is voice and the sound pressure level is low, the sensitivity expression word search unit 22 searches for a voice-specific sensitivity expression word such as "hiso-hiso".
- When the sound occurring where the audio signal was acquired is environmental sound other than voice, the frequency centroid is low, and the sound pressure level is low, the sensitivity expression word search unit 22 searches for a word corresponding to environmental sound, such as "gon-gon".
- When the sound occurring where the audio signal was acquired is environmental sound other than voice, the frequency centroid is high, and the sound pressure level is high, the sound is expressed with an environmental-sound word such as "kin-kin".
- The retrieved sensitivity expression word is output as text data, as metadata such as Exif, or in a format used for video-search tags.
- When the sound is voice, the sensitivity expression word search unit 22 may further estimate the number of speakers from the sound pressure level and the frequency information and select a sensitivity expression word suited to that number: for example, "butsu-butsu" if one person is speaking quietly, "waa-waa" if one person is loud, "hiso-hiso" if several people are speaking quietly, and "wai-wai" if several people are loud.
- The sensitivity expression word selected in this way is output as text data, as metadata such as Exif, in a format used for video-search tags, or as sound.
- In the above, the sound pressure level, the frequency information, and the voice/environmental-sound discrimination are combined, but a sensitivity expression word can also be selected using only the sound pressure level combined with the voice/environmental-sound discrimination.
- Since voice and environmental sound other than voice are distinguished, a sensitivity expression word matching the type of sound occurring where the audio signal was acquired can be selected.
- In the fourth embodiment, the type of the sound is identified in more detail, and sensitivity sound information is generated by paying attention to the volume, the frequency analysis, and the identification of the sound type (voice, or a type of environmental sound such as car sound or rain sound). An example in which a sensitivity expression word suited to the place where the audio signal was acquired is selected based on this information is described below.
- FIG. 11 is a block diagram of the sensitivity expression word selection system according to the fourth embodiment.
- In addition to the configuration of the second embodiment, the input signal analysis unit 1 includes a voice/environmental sound type determination unit 13.
- The voice/environmental sound type determination unit 13 determines, for the input audio signal, whether it is voice uttered by a person or which type of environmental sound it is.
- As determination methods, a method using GMMs or a method using HMMs can be considered.
- GMMs or HMMs created in advance for each type of environmental sound other than voice are stored, and the type whose model is closest to the input sound is selected (a minimal sketch of this selection follows below).
- For a method of identifying these environmental-sound types, see, for example, the document "Speech Language Information Processing 29-14, 'Examination of Environmental Sound Identification Using HMM'".
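Under the GMM approach, per-type identification reduces to picking the best-scoring model. A sketch, assuming `models` maps each type name to a GaussianMixture fitted as in the previous sketch:

```python
def identify_sound_type(feats, models: dict) -> str:
    """models: {"voice": gmm, "car sound": gmm, "rain sound": gmm, ...};
    the best-scoring model names the type of the input sound."""
    return max(models, key=lambda name: models[name].score(feats))
```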
- The input signal analysis unit 1 outputs, as sensitivity sound information, the sound pressure level calculated by the sound pressure level calculation unit 10, the frequency information calculated by the frequency analysis unit 11, and the sound type determined by the voice/environmental sound type determination unit 13 (voice, or a type of environmental sound such as car sound or rain sound).
- The sensitivity expression word search unit 22 receives the sound pressure level, the frequency information, and the sound type (voice, or a type of environmental sound such as car sound or rain sound) as sensitivity sound information and selects a sensitivity expression word. For this purpose, the sensitivity expression word database 21 stores sensitivity expression words learned in consideration of not only the sound pressure level and the frequency information but also the sound type.
- For example, when the type of sound occurring where the audio signal was acquired is "sounding metal", the frequency centroid is high, and the sound pressure level is low, the sensitivity expression word search unit 22 searches for the word "kan-kan" corresponding to sounding metal; when the type is "sounding metal", the frequency centroid is low, and the sound pressure level is low, it searches for the corresponding word "gan-gan".
- The retrieved sensitivity expression word is output as text data, as metadata such as Exif, in a format used for video-search tags, or as sound.
- Since the type of environmental sound is identified, a sensitivity expression word matching the type of sound occurring where the audio signal was acquired can be selected.
- FIG. 12 is a block diagram of the sensitivity expression word selection system according to the fifth embodiment.
- In addition to the configuration of the fourth embodiment, the input signal analysis unit 1 includes an active determination unit 30.
- The active determination unit 30 passes the audio signal to the sound pressure level calculation unit 10, the frequency analysis unit 11, and the voice/environmental sound type determination unit 13 only when the audio signal is at or above a certain level.
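A sketch of this gate; the RMS threshold value is an assumption, since the text only says "a certain level":

```python
import numpy as np

def process_if_active(frame: np.ndarray, analyze, rms_threshold: float = 0.01):
    """Forward the frame to the analysis units only when it is loud enough."""
    if np.sqrt(np.mean(frame ** 2)) >= rms_threshold:
        return analyze(frame)
    return None  # below the level: skip expression-word selection entirely
```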
- Because the sensitivity expression word selection operation is performed only in that case, unnecessary selection processing can be avoided.
- FIG. 13 is a block diagram of the sensitivity expression word selection system according to the sixth embodiment.
- In the sixth embodiment, the sensitivity expression word selection system includes a computer 50 and the sensitivity expression word database 21.
- The computer 50 has a program memory 52 in which a program is stored and a CPU 51 that operates according to the program.
- The CPU 51 performs processing similar to the operation of the sound pressure level calculation unit 10 in a sound pressure level calculation process 100, processing similar to the operation of the frequency analysis unit 11 in a frequency analysis process 101, processing similar to the operation of the voice/environmental sound determination unit 12 in a voice/environmental sound determination process 102, and processing similar to the operation of the sensitivity expression word search unit 22 in a sensitivity expression word search process 200 (an end-to-end sketch combining the earlier helper functions follows this section).
- The sensitivity expression word database 21 may also be stored inside the computer 50.
- Although the sixth embodiment has been described as corresponding to the third embodiment, the present invention is not limited to this; configurations corresponding to the first, second, fourth, and fifth embodiments can also be realized on a computer.
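Tying the sketches together, a hypothetical end-to-end path corresponding to the second embodiment, reusing the helper functions sketched in the earlier embodiments (normalizing the centroid by the Nyquist frequency is an assumption):

```python
import numpy as np

def select_expression_word(frame: np.ndarray, fs: int) -> str:
    """Level + centroid -> sensitivity expression word (second embodiment)."""
    level = normalized_sound_pressure_level(frame)
    centroid = frequency_features(frame, fs)["centroid_hz"]
    centroid_norm = min(centroid / (fs / 2.0), 1.0)  # assumed normalization
    return select_word_2d(level, centroid_norm)

fs = 8000
t = np.arange(fs) / fs
frame = 0.5 * np.sin(2 * np.pi * 3000 * t)  # a loud, high-pitched tone
print(select_expression_word(frame, fs))    # -> kin-kin
```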
- A sensitivity expression word selection system comprising: a signal analysis unit that analyzes an audio signal and generates sensitivity sound information about the sound occurring at the location where the audio signal was acquired; and a sensitivity expression word selection unit that, based on the sensitivity sound information, selects a sensitivity expression word expressing what a person feels from the sound occurring at that location.
- The signal analysis unit analyzes at least one of the sound pressure level of the audio signal, frequency information representing the frequency characteristics of the audio signal, and the sound type of the audio signal, and generates the sensitivity sound information.
- The sensitivity expression word selection system according to supplementary note 1 or 2.
- The sensitivity expression word selection unit selects a sensitivity expression word expressing a low sound when the fundamental frequency or the frequency centroid is low, and selects a sensitivity expression word expressing a high sound when the fundamental frequency or the frequency centroid is high. The sensitivity expression word selection system according to supplementary note 3 or 4.
- The sensitivity expression word selection unit selects, when the frequency band is narrow and the fundamental frequency or the frequency centroid is low, a sensitivity expression word that gives a non-metallic impression without high-frequency components and expresses a low sound, and selects, when the frequency band is wide and the fundamental frequency or the frequency centroid is high, a sensitivity expression word that gives a metallic impression including high-frequency components and expresses a high sound.
- The sensitivity expression word selection system according to any one of the above supplementary notes.
- The sensitivity expression word selection unit selects a sensitivity expression word containing voiced (muddy) consonants, as a word of dull impression, when the slope of the spectrum envelope is negative. The sensitivity expression word selection system according to any one of supplementary notes 3 to 6, wherein a sensitivity expression word without voiced consonants is selected, as a word of sharp impression, when the slope of the spectrum envelope is positive.
- The sensitivity expression word selection unit selects a sensitivity expression word expressing a powerful sound as the sound pressure level increases and the frequency centroid or the fundamental frequency decreases; a word expressing a weak, unsatisfying sound as the sound pressure level decreases and the frequency centroid or the fundamental frequency increases; and a word expressing a dull sound as the sound pressure level decreases and the frequency centroid or the fundamental frequency decreases. The sensitivity expression word selection system according to any one of supplementary notes 3 to 7, wherein a word expressing a sharp sound is selected as the sound pressure level increases and the frequency centroid or the fundamental frequency increases.
- When the sensitivity sound information includes the sound type, the sensitivity expression word selection unit selects a sensitivity expression word matching that type. The sensitivity expression word selection system according to any one of supplementary notes 3 to 8.
- A sensitivity expression word selection method comprising: analyzing an audio signal and generating sensitivity sound information about the sound occurring at the location where the audio signal was acquired; and, based on the sensitivity sound information, selecting a sensitivity expression word expressing what a person feels from the sound occurring at that location.
- The sensitivity expression word selection method according to supplementary note 10, wherein the sensitivity expression word is at least one of an onomatopoeia and a mimetic word.
- Supplementary note 13: The sensitivity expression word selection method according to supplementary note 12, wherein, when the sensitivity sound information includes a sound pressure level, a sensitivity expression word expressing noisiness is selected as the sound pressure level increases.
- When the sensitivity sound information includes a frequency band and a fundamental frequency or a frequency centroid:
- when the frequency band is narrow and the fundamental frequency or the frequency centroid is low, a sensitivity expression word that gives a non-metallic impression without high-frequency components and expresses a low sound is selected;
- when the frequency band is wide and the fundamental frequency or the frequency centroid is high, a sensitivity expression word that gives a metallic impression including high-frequency components and expresses a high sound is selected.
- The sensitivity expression word selection method according to any one of the above supplementary notes.
- When the sensitivity sound information includes the sound pressure level and the frequency centroid or the fundamental frequency:
- a sensitivity expression word expressing a powerful sound is selected as the sound pressure level increases and the frequency centroid or the fundamental frequency decreases; a word expressing a weak, unsatisfying sound is selected as the sound pressure level decreases and the frequency centroid or the fundamental frequency increases; and a word expressing a dull sound is selected as the sound pressure level decreases and the frequency centroid or the fundamental frequency decreases.
- The sensitivity expression word selection method according to any one of supplementary notes 12 to 16, wherein a word expressing a sharp sound is selected as the sound pressure level increases and the frequency centroid or the fundamental frequency increases.
- A signal analysis process that analyzes an audio signal and generates sensitivity sound information about the sound occurring at the location where the audio signal was acquired; and
- a program that causes an information processing apparatus to execute a sensitivity expression word selection process that, based on the sensitivity sound information, selects a sensitivity expression word expressing what a person feels from the sound occurring at that location.
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Hospice & Palliative Care (AREA)
- Psychiatry (AREA)
- General Health & Medical Sciences (AREA)
- Child & Adolescent Psychology (AREA)
- Signal Processing (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
The first embodiment will be described.
The second embodiment will be described.
The third embodiment will be described.
The fourth embodiment will be described.
The fifth embodiment will be described.
The sixth embodiment will be described.
a sensitivity expression word selection unit that, based on the sensitivity sound information, selects a sensitivity expression word expressing what a person feels from the sound occurring at the acquisition location;
a sensitivity expression word selection system having the above.
The sensitivity expression word selection system according to supplementary note 1.
The sensitivity expression word selection system according to supplementary note 1 or 2.
The sensitivity expression word selection unit selects a sensitivity expression word expressing noisiness as the sound pressure level increases.
The sensitivity expression word selection system according to supplementary note 3.
The sensitivity expression word selection unit:
selects a sensitivity expression word expressing a low sound when the fundamental frequency or the frequency centroid is low, and
selects a sensitivity expression word expressing a high sound when the fundamental frequency or the frequency centroid is high.
The sensitivity expression word selection system according to supplementary note 3 or 4.
The sensitivity expression word selection unit:
selects, when the frequency band is narrow and the fundamental frequency or the frequency centroid is low, a sensitivity expression word that gives a non-metallic impression without high-frequency components and expresses a low sound, and
selects, when the frequency band is wide and the fundamental frequency or the frequency centroid is high, a sensitivity expression word that gives a metallic impression including high-frequency components and expresses a high sound.
The sensitivity expression word selection system according to any one of supplementary notes 3 to 5.
The sensitivity expression word selection unit:
selects, when the slope of the spectrum envelope is negative, a sensitivity expression word containing voiced (muddy) consonants as a word of dull impression, and
selects, when the slope of the spectrum envelope is positive, a sensitivity expression word without voiced consonants as a word of sharp impression.
The sensitivity expression word selection system according to any one of supplementary notes 3 to 6.
The sensitivity expression word selection unit:
selects a sensitivity expression word expressing a powerful sound as the sound pressure level increases and the frequency centroid or the fundamental frequency decreases,
selects a sensitivity expression word expressing a weak, unsatisfying sound as the sound pressure level decreases and the frequency centroid or the fundamental frequency increases,
selects a sensitivity expression word expressing a dull sound as the sound pressure level decreases and the frequency centroid or the fundamental frequency decreases, and
selects a sensitivity expression word expressing a sharp sound as the sound pressure level increases and the frequency centroid or the fundamental frequency increases.
The sensitivity expression word selection system according to any one of supplementary notes 3 to 7.
The sensitivity expression word selection system according to any one of supplementary notes 3 to 8.
Based on the sensitivity sound information, a sensitivity expression word expressing what a person feels from the sound occurring at the acquisition location is selected.
A sensitivity expression word selection method.
The sensitivity expression word selection method according to supplementary note 10.
The sensitivity expression word selection method according to supplementary note 10 or 11.
The sensitivity expression word selection method according to supplementary note 12.
A sensitivity expression word expressing a low sound is selected when the fundamental frequency or the frequency centroid is low, and
a sensitivity expression word expressing a high sound is selected when the fundamental frequency or the frequency centroid is high.
The sensitivity expression word selection method according to supplementary note 12 or 13.
When the frequency band is narrow and the fundamental frequency or the frequency centroid is low, a sensitivity expression word that gives a non-metallic impression without high-frequency components and expresses a low sound is selected, and
when the frequency band is wide and the fundamental frequency or the frequency centroid is high, a sensitivity expression word that gives a metallic impression including high-frequency components and expresses a high sound is selected.
The sensitivity expression word selection method according to any one of supplementary notes 12 to 14.
When the slope of the spectrum envelope is negative, a sensitivity expression word containing voiced (muddy) consonants is selected as a word of dull impression, and
when the slope of the spectrum envelope is positive, a sensitivity expression word without voiced consonants is selected as a word of sharp impression.
The sensitivity expression word selection method according to any one of supplementary notes 12 to 15.
A sensitivity expression word expressing a powerful sound is selected as the sound pressure level increases and the frequency centroid or the fundamental frequency decreases,
a sensitivity expression word expressing a weak, unsatisfying sound is selected as the sound pressure level decreases and the frequency centroid or the fundamental frequency increases,
a sensitivity expression word expressing a dull sound is selected as the sound pressure level decreases and the frequency centroid or the fundamental frequency decreases, and
a sensitivity expression word expressing a sharp sound is selected as the sound pressure level increases and the frequency centroid or the fundamental frequency increases.
The sensitivity expression word selection method according to any one of supplementary notes 12 to 16.
The sensitivity expression word selection method according to any one of supplementary notes 12 to 17.
A sensitivity expression word selection process that, based on the sensitivity sound information, selects a sensitivity expression word expressing what a person feels from the sound occurring at the acquisition location; and
a program that causes an information processing apparatus to execute the above.
2 Sensitivity expression word selection unit
10 Sound pressure level calculation unit
11 Frequency analysis unit
12 Voice/environmental sound determination unit
13 Voice/environmental sound type determination unit
21 Sensitivity expression word database
22 Sensitivity expression word search unit
30 Active determination unit
50 Computer
51 CPU
52 Program memory
Claims (19)
- 1. A sensitivity expression word selection system comprising: a signal analysis unit that analyzes an audio signal and generates sensitivity sound information about the sound occurring at the location where the audio signal was acquired; and a sensitivity expression word selection unit that, based on the sensitivity sound information, selects a sensitivity expression word expressing what a person feels from the sound occurring at the acquisition location.
- 2. The sensitivity expression word selection system according to claim 1, wherein the sensitivity expression word is at least one of an onomatopoeia and a mimetic word.
- 3. The sensitivity expression word selection system according to claim 1 or 2, wherein the signal analysis unit analyzes at least one of the sound pressure level of the audio signal, frequency information representing the frequency characteristics of the audio signal, and the sound type of the audio signal, and generates the sensitivity sound information.
- 4. The sensitivity expression word selection system according to claim 3, wherein, when the sensitivity sound information includes a sound pressure level, the sensitivity expression word selection unit selects a sensitivity expression word expressing noisiness as the sound pressure level increases.
- 5. The sensitivity expression word selection system according to claim 3 or 4, wherein, when the sensitivity sound information includes a fundamental frequency or a frequency centroid, the sensitivity expression word selection unit selects a sensitivity expression word expressing a low sound when the fundamental frequency or the frequency centroid is low, and selects a sensitivity expression word expressing a high sound when the fundamental frequency or the frequency centroid is high.
- 6. The sensitivity expression word selection system according to any one of claims 3 to 5, wherein, when the sensitivity sound information includes a frequency band and a fundamental frequency or a frequency centroid, the sensitivity expression word selection unit selects, when the frequency band is narrow and the fundamental frequency or the frequency centroid is low, a sensitivity expression word that gives a non-metallic impression without high-frequency components and expresses a low sound, and selects, when the frequency band is wide and the fundamental frequency or the frequency centroid is high, a sensitivity expression word that gives a metallic impression including high-frequency components and expresses a high sound.
- 7. The sensitivity expression word selection system according to any one of claims 3 to 6, wherein, when the sensitivity sound information includes the slope of a spectrum envelope, the sensitivity expression word selection unit selects, when the slope of the spectrum envelope is negative, a sensitivity expression word containing voiced (muddy) consonants as a word of dull impression, and selects, when the slope of the spectrum envelope is positive, a sensitivity expression word without voiced consonants as a word of sharp impression.
- 8. The sensitivity expression word selection system according to any one of claims 3 to 7, wherein, when the sensitivity sound information includes a sound pressure level and a frequency centroid or a fundamental frequency, the sensitivity expression word selection unit selects a sensitivity expression word expressing a powerful sound as the sound pressure level increases and the frequency centroid or the fundamental frequency decreases, selects a sensitivity expression word expressing a weak, unsatisfying sound as the sound pressure level decreases and the frequency centroid or the fundamental frequency increases, selects a sensitivity expression word expressing a dull sound as the sound pressure level decreases and the frequency centroid or the fundamental frequency decreases, and selects a sensitivity expression word expressing a sharp sound as the sound pressure level increases and the frequency centroid or the fundamental frequency increases.
- 9. The sensitivity expression word selection system according to any one of claims 3 to 8, wherein, when the sensitivity sound information includes a sound type, the sensitivity expression word selection unit selects a sensitivity expression word matching the sound type.
- 10. A sensitivity expression word selection method comprising: analyzing an audio signal and generating sensitivity sound information about the sound occurring at the location where the audio signal was acquired; and, based on the sensitivity sound information, selecting a sensitivity expression word expressing what a person feels from the sound occurring at the acquisition location.
- 11. The sensitivity expression word selection method according to claim 10, wherein the sensitivity expression word is at least one of an onomatopoeia and a mimetic word.
- 12. The sensitivity expression word selection method according to claim 10 or 11, comprising analyzing at least one of the sound pressure level of the audio signal, frequency information representing the frequency characteristics of the audio signal, and the sound type of the audio signal, and generating the sensitivity sound information.
- 13. The sensitivity expression word selection method according to claim 12, wherein, when the sensitivity sound information includes a sound pressure level, a sensitivity expression word expressing noisiness is selected as the sound pressure level increases.
- 14. The sensitivity expression word selection method according to claim 12 or 13, wherein, when the sensitivity sound information includes a fundamental frequency or a frequency centroid, a sensitivity expression word expressing a low sound is selected when the fundamental frequency or the frequency centroid is low, and a sensitivity expression word expressing a high sound is selected when the fundamental frequency or the frequency centroid is high.
- 15. The sensitivity expression word selection method according to any one of claims 12 to 14, wherein, when the sensitivity sound information includes a frequency band and a fundamental frequency or a frequency centroid, a sensitivity expression word that gives a non-metallic impression without high-frequency components and expresses a low sound is selected when the frequency band is narrow and the fundamental frequency or the frequency centroid is low, and a sensitivity expression word that gives a metallic impression including high-frequency components and expresses a high sound is selected when the frequency band is wide and the fundamental frequency or the frequency centroid is high.
- 16. The sensitivity expression word selection method according to any one of claims 12 to 15, wherein, when the sensitivity sound information includes the slope of a spectrum envelope, a sensitivity expression word containing voiced (muddy) consonants is selected as a word of dull impression when the slope of the spectrum envelope is negative, and a sensitivity expression word without voiced consonants is selected as a word of sharp impression when the slope of the spectrum envelope is positive.
- 17. The sensitivity expression word selection method according to any one of claims 12 to 16, wherein, when the sensitivity sound information includes a sound pressure level and a frequency centroid or a fundamental frequency, a sensitivity expression word expressing a powerful sound is selected as the sound pressure level increases and the frequency centroid or the fundamental frequency decreases, a sensitivity expression word expressing a weak, unsatisfying sound is selected as the sound pressure level decreases and the frequency centroid or the fundamental frequency increases, a sensitivity expression word expressing a dull sound is selected as the sound pressure level decreases and the frequency centroid or the fundamental frequency decreases, and a sensitivity expression word expressing a sharp sound is selected as the sound pressure level increases and the frequency centroid or the fundamental frequency increases.
- 18. The sensitivity expression word selection method according to any one of claims 12 to 17, wherein, when the sensitivity sound information includes a sound type, a sensitivity expression word matching the sound type is selected.
- 19. A program that causes an information processing apparatus to execute: a signal analysis process that analyzes an audio signal and generates sensitivity sound information about the sound occurring at the location where the audio signal was acquired; and a sensitivity expression word selection process that, based on the sensitivity sound information, selects a sensitivity expression word expressing what a person feels from the sound occurring at the acquisition location.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/638,856 US9286913B2 (en) | 2010-03-30 | 2011-03-28 | Atmosphere expression word selection system, atmosphere expression word selection method, and program |
JP2012508289A JPWO2011122522A1 (ja) | 2010-03-30 | 2011-03-28 | Sensitivity expression word selection system, sensitivity expression word selection method, and program |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2010-078123 | 2010-03-30 | ||
JP2010078123 | 2010-03-30 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011122522A1 (ja) | 2011-10-06 |
Family
ID=44712219
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2011/057543 WO2011122522A1 (ja) | 2010-03-30 | 2011-03-28 | Sensitivity expression word selection system, sensitivity expression word selection method, and program |
Country Status (3)
Country | Link |
---|---|
US (1) | US9286913B2 (ja) |
JP (1) | JPWO2011122522A1 (ja) |
WO (1) | WO2011122522A1 (ja) |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH06268722A (ja) | 1993-03-11 | 1994-09-22 | Hitachi Telecom Technol Ltd | ステレオ電話装置 |
JP2000081892A (ja) * | 1998-09-04 | 2000-03-21 | Nec Corp | 効果音付加装置および効果音付加方法 |
US7035873B2 (en) * | 2001-08-20 | 2006-04-25 | Microsoft Corporation | System and methods for providing adaptive media property classification |
JP2002318594A (ja) * | 2001-04-20 | 2002-10-31 | Sony Corp | 言語処理装置および言語処理方法、並びにプログラムおよび記録媒体 |
US6506148B2 (en) * | 2001-06-01 | 2003-01-14 | Hendricus G. Loos | Nervous system manipulation by electromagnetic fields from monitors |
JP2006033562A (ja) * | 2004-07-20 | 2006-02-02 | Victor Co Of Japan Ltd | 擬声語受信装置 |
CN101069213B (zh) * | 2004-11-30 | 2010-07-14 | 松下电器产业株式会社 | 场景修饰表现生成装置以及场景修饰表现生成方法 |
JP2007306597A (ja) | 2007-06-25 | 2007-11-22 | Yamaha Corp | 音声通信装置、音声通信システム、及び音声通信装置用プログラム |
CN102405495B (zh) * | 2009-03-11 | 2014-08-06 | 谷歌公司 | 使用稀疏特征对信息检索进行音频分类 |
CN102117614B (zh) * | 2010-01-05 | 2013-01-02 | 索尼爱立信移动通讯有限公司 | 个性化文本语音合成和个性化语音特征提取 |
US9224033B2 (en) * | 2010-11-24 | 2015-12-29 | Nec Corporation | Feeling-expressing-word processing device, feeling-expressing-word processing method, and feeling-expressing-word processing program |
US9183632B2 (en) * | 2010-11-24 | 2015-11-10 | Nec Corporation | Feeling-expressing-word processing device, feeling-expressing-word processing method, and feeling-expressing-word processing program |
WO2012070430A1 (ja) * | 2010-11-24 | 2012-05-31 | 日本電気株式会社 | 感性表現語処理装置、感性表現語処理方法および感性表現語処理プログラム |
US8183997B1 (en) * | 2011-11-14 | 2012-05-22 | Google Inc. | Displaying sound indications on a wearable computing system |
-
2011
- 2011-03-28 JP JP2012508289A patent/JPWO2011122522A1/ja active Pending
- 2011-03-28 WO PCT/JP2011/057543 patent/WO2011122522A1/ja active Application Filing
- 2011-03-28 US US13/638,856 patent/US9286913B2/en active Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002057736A (ja) * | 2000-08-08 | 2002-02-22 | Nippon Telegr & Teleph Corp <Ntt> | データ伝送方法、データ伝送装置及びデータ伝送プログラムを記録した媒体 |
WO2008032787A1 (fr) * | 2006-09-13 | 2008-03-20 | Nippon Telegraph And Telephone Corporation | ProcÉDÉ de dÉtection de sensations, dispositif de dÉtection de sensations, programme de dÉtection de sensations contenant le procÉDÉ, et support d'enregistrement contenant le programme |
JP2008204193A (ja) * | 2007-02-20 | 2008-09-04 | Nippon Telegr & Teleph Corp <Ntt> | コンテンツ検索・推薦方法、コンテンツ検索・推薦装置およびコンテンツ検索・推薦プログラム |
WO2008134625A1 (en) * | 2007-04-26 | 2008-11-06 | Ford Global Technologies, Llc | Emotive advisory system and method |
WO2009090600A1 (en) * | 2008-01-16 | 2009-07-23 | Koninklijke Philips Electronics N.V. | System and method for automatically creating an atmosphere suited to social setting and mood in an environment |
JP2010258687A (ja) * | 2009-04-23 | 2010-11-11 | Fujitsu Ltd | 無線通信装置 |
Non-Patent Citations (1)
Title |
---|
KAZUSHI ISHIHARA: "Automatic Transformation of Environmental Sounds into Onomatopoeia Based on Japanese Syllable Structure", IEICE TECHNICAL REPORT, vol. 103, no. 154, pages 19 - 24 * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2015528969A (ja) * | 2012-08-02 | 2015-10-01 | マイクロソフト コーポレーション | 人間対話証明として読み上げる能力を使用すること |
US10158633B2 (en) | 2012-08-02 | 2018-12-18 | Microsoft Technology Licensing, Llc | Using the ability to speak as a human interactive proof |
JP2014142627A (ja) * | 2013-01-24 | 2014-08-07 | ▲華▼▲為▼終端有限公司 | 音声識別方法および装置 |
JP2014142626A (ja) * | 2013-01-24 | 2014-08-07 | ▲華▼▲為▼終端有限公司 | 音声識別方法および装置 |
US9607619B2 (en) | 2013-01-24 | 2017-03-28 | Huawei Device Co., Ltd. | Voice identification method and apparatus |
US9666186B2 (en) | 2013-01-24 | 2017-05-30 | Huawei Device Co., Ltd. | Voice identification method and apparatus |
JP2017187676A (ja) * | 2016-04-07 | 2017-10-12 | キヤノン株式会社 | 音声判別装置、音声判別方法、コンピュータプログラム |
JP2017211995A (ja) * | 2017-06-22 | 2017-11-30 | オリンパス株式会社 | 再生装置、再生方法、再生プログラム、音声要約装置、音声要約方法および音声要約プログラム |
US11562819B2 (en) * | 2018-03-05 | 2023-01-24 | Kaha Pte. Ltd. | Method and system for determining and improving behavioral index |
Also Published As
Publication number | Publication date |
---|---|
US9286913B2 (en) | 2016-03-15 |
JPWO2011122522A1 (ja) | 2013-07-08 |
US20130024192A1 (en) | 2013-01-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 11762746 Country of ref document: EP Kind code of ref document: A1 |
WWE | Wipo information: entry into national phase |
Ref document number: 2012508289 Country of ref document: JP |
NENP | Non-entry into the national phase |
Ref country code: DE |
WWE | Wipo information: entry into national phase |
Ref document number: 13638856 Country of ref document: US |
122 | Ep: pct application non-entry in european phase |
Ref document number: 11762746 Country of ref document: EP Kind code of ref document: A1 |