EP1182645A1 - Emotion expression device and method for expressing a plurality of emotions through voices - Google Patents
- Publication number
- EP1182645A1 (application EP01119055A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- voice data
- quasi
- emotion
- emotions
- voice
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/033—Voice editing, e.g. manipulating the voice of the synthesiser
- G10L13/04—Details of speech synthesis systems, e.g. synthesiser structure or memory management
Definitions
- This invention relates to a quasi-emotion expression device for expressing a plurality of quasi-emotions through voices and to a method for expressing a plurality of quasi-emotions through voices.
- a device for expressing quasi-emotions of a pet type robot through voices has been comprised of, for example, a voice data storage section for storing voice data for each of a plurality of different quasi-emotions, a plurality of sensors for detecting stimuli from the outside, a quasi-emotion generation section for generating the intensity of each quasi-emotion based on the detection result of the sensors, a voice data reading section for reading, from the voice data storage section, voice data corresponding to a quasi-emotion with highest intensity of the quasi-emotions generated by the quasi-emotion generation section, and a voice output section for outputting a voice based on the voice data read by the voice data reading section.
- a voice is outputted based on the voice data corresponding to a quasi-emotion with highest intensity of the quasi-emotions generated by the quasi-emotion generation section, so that no more than one quasi-emotion generated by a pet type robot can be expressed at a time.
- an actual pet is not able to transmit distinctly each of a plurality of different emotions to an observer when it feels them simultaneously.
- if a pet type robot is developed that is capable of transmitting distinctly each of a plurality of quasi-emotions to an observer, it will provide attractiveness and cuteness not expected from an actual pet.
- said objective is solved by a quasi-emotion expression device for expressing a plurality of quasi-emotions through voices according to claim 1.
- said objective is solved by a method for expressing a plurality of quasi-emotions through voices according to claim 7.
- Said apparatus and said method are suited for transmitting distinctly each of a plurality of different quasi-emotions to an observer.
- Fig. 1 to Fig. 5 illustrate an embodiment of a voice synthesis device, a quasi-emotion expression device and a voice synthesizing method.
- the voice synthesis device, the quasi-emotion expression device and the voice synthesizing method are applied to a case where a plurality of different quasi-emotions generated by a pet type robot 1 are expressed through voices, as shown in Fig. 1.
- Fig. 1 is a block diagram of the same.
- the pet type robot 1, as shown in Fig. 1, is comprised of an external information input section 2 for inputting external information on stimuli, etc., given from the outside; an internal information input section 3 for inputting internal information obtained within the pet type robot 1; a control section 4 for controlling quasi-emotions or actions of the pet type robot 1; and a quasi-emotion expression section 5 for expressing quasi-emotions or actions of the pet type robot 1 based on the control result of the control section 4.
- the external information input section 2 comprises, as visual information input devices, a camera 2a for detecting the user 6's face, gestures, position, etc., and an IR (infrared) sensor 2b for detecting surrounding obstacles; as an auditory information input device, a mike 2c for detecting the user 6's utterances or ambient sounds; and further, as tactile information input devices, a pressure sensitive sensor 2d for detecting stroking or patting by the user 6, a torque sensor 2e for detecting forces and torques in the legs or forefeet of the pet type robot 1, and a potential sensor 2f for detecting positions of the articulations of the legs and forefeet of the pet type robot 1.
- the information from these sensors 2a-2f is outputted to the control section 4.
- the internal information input section 3 comprises a battery meter 3a for detecting information on hunger of the pet type robot 1, and a motor thermometer 3b for detecting information on fatigue of the pet type robot 1.
- the information from these sensors 3a, 3b is outputted to the control section 4.
- the control section 4 comprises a facial information detection device 4a and a gesture information detection device 4b for detecting facial and gesture information on the user 6 from signals of the camera 2a; a voice information detection device 4c for detecting voice information on the user 6 from signals of the mike 2c; a contact information detection device 4d for detecting tactile information on the user 6 from signals of the pressure sensitive sensor 2d; an environment detection device 4e for detecting the environment from signals of the camera 2a, IR sensor 2b, mike 2c and pressure sensitive sensor 2d; and a movement detection device 4f for detecting movements and resistance forces of the arms of the pet type robot 1 from signals of the torque sensor 2e and potential sensor 2f.
- the internal information recognition and processing device 4g is adapted to recognize internal information on the pet type robot 1 based on signals from the battery meter 3a and the motor thermometer 3b, and to output the recognition result to the storage information processing device 4h and the quasi-emotion generation device 4j.
- Fig. 2 is a block diagram of the same.
- the user and environment recognition device 4i comprises a user identification device 7 for identifying the user 6, a user condition distinction device 8 for distinguishing user conditions, a reception device 9 for receiving information on the user 6, and an environment recognition device 10 for recognizing surrounding environments.
- the user identification device 7 is adapted to identify the user 6 based on the information from the facial information detection device 4a and the voice information detection device 4c, and to output the identification result to the user condition distinction device 8 and the reception device 9.
- the user condition distinction device 8 is adapted to distinguish user 6's conditions based on the information from the facial information detection device 4a, the movement detection device 4f and the user identification device 7, and to output the distinction result to the quasi-emotion generation device 4j.
- the reception device 9 is adapted to input information separately from the gesture information detection device 4b, the voice information detection device 4c, the contact information detection device 4d and the user identification device 7, and to output the received information to the characteristic action storage and processing device 4m.
- the environment recognition device 10 is adapted to recognize surrounding environments based on the information from the environment detection device 4e, and to output the recognition result to the action determination device 4k.
- the quasi-emotion generation device 4j is adapted to generate a plurality of different quasi-emotions of the pet type robot 1 based on the information from the user condition distinction device 8 and quasi-emotion models in the storage information processing device 4h, and to output them to the action determination device 4k and the characteristic action storage and processing device 4m.
- the quasi-emotion models are calculation formulas used for finding parameters, such as grief, delight, fear, shame, fatigue, hunger and sleepiness, expressing quasi-emotions of the pet type robot 1, and generate quasi-emotions of the pet type robot 1 in response to the user information (the user 6's temper or commands) detected as voices or images and environmental information (brightness of the room, sounds, etc.).
- Generation of the quasi-emotions is performed by generating the intensity of each quasi-emotion. For example, when the user 6 appears in front of the robot, the quasi-emotion of "delight" is emphasized by generating the quasi-emotions such that the intensity of the quasi-emotion of "delight" is "5" and that of the quasi-emotion of "anger" is "0;" on the contrary, when a stranger appears in front of the robot, the quasi-emotion of "anger" is emphasized by generating the quasi-emotions such that the intensity of the quasi-emotion of "delight" is "0" and that of the quasi-emotion of "anger" is "5."
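- The following minimal Python sketch (not part of the patent text; all names and values are illustrative) shows how such intensity generation might look: every quasi-emotion receives an intensity at once, and the recognition result decides which one is emphasized.

```python
# Illustrative sketch only: intensities (0-5) are generated for every
# quasi-emotion simultaneously, based on who was recognized.
from typing import Dict

EMOTIONS = ["delight", "sorrow", "anger", "surprise", "hatred", "terror"]

def generate_quasi_emotions(recognized_person: str) -> Dict[str, int]:
    """Return an intensity for each quasi-emotion."""
    intensities = {emotion: 0 for emotion in EMOTIONS}
    if recognized_person == "user":
        intensities["delight"] = 5   # the user appears: emphasize "delight"
    else:
        intensities["anger"] = 5     # a stranger appears: emphasize "anger"
    return intensities

print(generate_quasi_emotions("user"))      # delight = 5, anger = 0
print(generate_quasi_emotions("stranger"))  # delight = 0, anger = 5
```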
- the character forming device 4n is adapted to form the character of the pet type robot 1 into any of a plurality of different characters, such as "a quick-tempered one", “a cheerful one” and “a gloomy one", based on the information from the user and environment recognition device 4i, and to output the formed character of the pet type robot 1 as character data to the quasi-emotion generation device 4j and the action determination device 4k.
- the growing stage calculation device 4p is adapted to change the quasi-emotions of the pet type robot 1 through praising and scolding by the user, based on the information from the user and environment recognition device 4i, to allow the pet type robot 1 to grow, and to output the growth result as growth data to the action determination device 4k.
- the quasi-emotion models are prepared such that the pet type robot 1 behaves childishly when very young and in a more mature manner as it grows.
- the growing process is specified, for example, as three stages of "childhood,” "youth” and "old age.”
- the characteristic action storage and processing device 4m is adapted to store and process characteristic actions such as actions through which the pet type robot 1 becomes tame gradually with the user 6, or actions of learning user 6's gestures, and to output the processed result to the action determination device 4k.
- the quasi-emotion expression section 5 comprises a visual emotion expression device 5a for expressing quasi-emotions visually, an auditory emotion expression device 5b for expressing quasi-emotions auditorily, and a tactile emotion expression device 5c for expressing quasi-emotions tactilely.
- the visual emotion expression device 5a is adapted to drive movement mechanisms such as the face, arms and body of the pet type robot 1, based on action set parameters from an action set parameter setting device 12 (described later), and through the device 5a, the quasi-emotions of the pet type robot 1 are transmitted to the user 6 as attention or locomotion information (for example, facial expressions, nodding or dancing).
- the movement mechanisms may be, for example, actuators such as a motor, an electromagnetic solenoid, and a pneumatic or hydraulic cylinder.
- the auditory emotion expression device 5b is adapted to output voices by driving a speaker, based on voice data synthesized by a voice data synthesis device 15 (described later), and through the device 5b, the quasi-emotions of the pet type robot 1 are transmitted to the user 6 as tone or rhythm information (for example, cries).
- the tactile emotion expression device 5c is adapted to drive the movement mechanisms such as the face, arms and body, based on the action set parameters from the action set parameter setting device 12, and the quasi-emotions of the pet type robot 1 are transmitted to the user 6 as resistance force or rhythm information (for example, tactile sensation received by the user 6 when the robot performs a trick of "hand up").
- the movement mechanisms may be, for example, actuators such as a motor, an electromagnetic solenoid, and a pneumatic or hydraulic cylinder.
- Fig. 3 is a block diagram of the same.
- the action determination device 4k as shown in Fig. 3, comprises an action set selection device 11, an action set parameter setting device 12, an action reproduction device 13, a voice data registration data base 14 with voice data stored for each quasi-emotion, and a voice data synthesis device 15 for synthesizing voice data of the voice data registration data base.
- the action set selection device 11 is adapted to determine a fundamental action of the pet type robot 1 based on the information from the quasi-emotion generation device 4j, by referring to an action set (action library) of the storage information processing device 4h, and to output the determined fundamental action to the action set parameter setting device 12.
- in the action library, sequences of actions are registered for specific expressions of the pet type robot 1, for example, a sequence of actions of "moving each leg in a predetermined order" for the action pattern of "advancing," and a sequence of actions of "folding the hind legs into a sitting posture and putting the forelegs up and down alternately" for the action pattern of "dancing."
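- A minimal sketch of such an action library (purely illustrative; the patent does not specify a data format) could map each action pattern to its registered sequence of actions:

```python
# Illustrative sketch: an action library mapping action patterns to the
# sequences of actions registered for them.
ACTION_LIBRARY = {
    "advancing": ["move each leg in a predetermined order"],
    "dancing": [
        "fold the hind legs into a sitting posture",
        "put the left foreleg up", "put the left foreleg down",
        "put the right foreleg up", "put the right foreleg down",
    ],
}

def fundamental_action(pattern: str):
    """Look up the registered sequence of actions for an action pattern."""
    return ACTION_LIBRARY[pattern]

print(fundamental_action("dancing"))
```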
- the action reproduction device 13 is adapted to correct an action set of the action set selection device 11 based on the action set of the characteristic action storage device 4m, and to output the corrected action set to the action set parameter setting device 12.
- the action set parameter setting device 12 is adapted to set action set parameters such as the speed at which the pet type robot 1 approaches the user 6 or the resistance force when it grips the user 6's hand, and to output the set action set parameters to the visual emotion expression device 5a and the tactile emotion expression device 5c.
- the voice data registration data base 14, as shown in Fig. 4, contains a plurality of voice data pieces, and voice data correspondence tables 100-104 in which voice data is registered corresponding to each quasi-emotion, one for each growing stage.
- Fig. 4 is a diagram showing the data structure of the voice data correspondence tables.
- the voice data correspondence table 100 is a table which is to be referred to when the growing stage of the pet type robot 1 is in "childhood," and in which records are registered, one for each quasi-emotion. These records are arranged such that they include a field 110 for voice data pieces 1i (i represents a record number) which are to be outputted when the character of the pet type robot 1 is "quick-tempered," a field 112 for voice data pieces 2i which are to be outputted when the character of the pet type robot 1 is "cheerful," and a field 114 for voice data pieces 3i which are to be outputted when the character of the pet type robot 1 is "gloomy."
- the voice data correspondence table 102 is a table which is to be referred to when the growing stage of the pet type robot 1 is in "youth,” in which are registered records, one for each quasi-emotion. These records, like the records of the voice correspondence table 100, are arranged such that they include fields 110-114.
- the voice data correspondence table 104 is a table which is to be referred to when the growing stage of the pet type robot 1 is in "old age,” in which are registered records, one for each quasi-emotion. These records, like the records of the voice correspondence table 100, are arranged such that they include fields 110-114.
- voice data to be outputted for each quasi-emotion can be identified in response to the growing stage and the character of the pet type robot 1.
- in the example of Fig. 4, the growing stage of the pet type robot 1 is in "childhood," so that when its character is "cheerful," it is seen that music data 11 may be read for the quasi-emotion of "delight," music data 12 for the quasi-emotion of "sorrow," and music data 13 for the quasi-emotion of "anger."
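- A minimal sketch of this lookup (illustrative only; the concrete identifiers are assumptions, not the registered data) could model one correspondence table per growing stage, one field per character and one record per quasi-emotion:

```python
# Illustrative sketch: correspondence tables indexed by growing stage,
# character (field) and quasi-emotion (record).
VOICE_TABLES = {
    "childhood": {  # corresponds to table 100
        "quick-tempered": {"delight": "voice_11", "sorrow": "voice_12", "anger": "voice_13"},
        "cheerful":       {"delight": "voice_21", "sorrow": "voice_22", "anger": "voice_23"},
        "gloomy":         {"delight": "voice_31", "sorrow": "voice_32", "anger": "voice_33"},
    },
    # "youth" (table 102) and "old age" (table 104) are structured identically.
}

def identify_voice_data(growing_stage: str, character: str, emotion: str) -> str:
    """Identify the voice data piece for one quasi-emotion."""
    return VOICE_TABLES[growing_stage][character][emotion]

print(identify_voice_data("childhood", "cheerful", "delight"))
```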
- the voice data synthesis device 15 is comprised of a CPU, a ROM, a RAM, an I/F, etc., connected by a bus, and further includes a voice data synthesis IC having a plurality of channels for synthesizing and outputting voice data preset for each channel.
- the CPU of the voice data synthesis device 15 is made of a microprocessing unit, etc, and adapted to start a given program stored in a given region of the ROM and to execute voice data synthesis processing shown by the flow chart in Fig. 5 by interruption at given time intervals (for example, 100ms) according to the program.
- Fig. 5 is a flow chart showing the voice data synthesis procedure.
- the voice data synthesis procedure is one through which voice data corresponding to each quasi-emotion generated by the quasi-emotion generation device 4j is read from the voice data registration data base 14 and synthesized, based on the information from the user and environment information recognition device 4i, the quasi-emotion generation device 4j, the character forming device 4n and the growing stage calculation device 4p, and when executed by the CPU, first, as shown in Fig. 5, the procedure proceeds to step S100.
- at step S100, after it is determined whether or not a voice stopping command has been entered from the control section 4, etc., it is determined whether or not voice output is to be stopped. If it is determined that the voice output is not to be stopped (No), the procedure proceeds to step S102, where it is determined whether or not voice data is to be updated, and if it is determined that the voice data is to be updated (Yes), the procedure proceeds to step S104.
- at step S104, one of the voice data correspondence tables 100-104 is identified, based on the growth data from the growing stage calculation device 4p, and the procedure proceeds to step S106, where a field from which the voice data is read is identified from among the fields in the voice data correspondence table identified at step S104, based on the character data from the character forming device 4n. Then, the procedure proceeds to step S108.
- at step S108, the voice output time, which is necessary to measure the length of time that has elapsed from the start of the voice output, is set to "0," and the procedure proceeds to step S110, where voice data corresponding to each quasi-emotion generated by the quasi-emotion generation device 4j is read from the voice data registration data base 14, by referring to the field identified at step S106 from among the fields in the voice data correspondence table identified at step S104. Then, the procedure proceeds to step S112.
- at step S112, a volume parameter is determined such that the read-out voice data has a voice volume in response to the intensity of the quasi-emotion generated by the quasi-emotion generation device 4j, and the procedure proceeds to step S114, where other parameters for specifying the total volume, tempo or other acoustic effects are determined. Then, the procedure proceeds to step S116, where the voice output time is incremented, and to step S118.
- at step S118, it is determined whether or not the voice output time exceeds a predetermined value (the upper limit of the output time specified for each voice data piece), and if it is determined that the voice output time is less than the predetermined value (No), the procedure proceeds to step S120, where the determined voice parameters and the read-out voice data are preset for each channel in the voice data synthesis IC. A series of processes is then completed and the procedure is returned to the original processing.
- at step S118, if it is determined that the voice output time exceeds the predetermined value (Yes), the procedure proceeds to step S122, where an output stopping flag, indicative of whether or not the voice output is to be stopped, is set, and the procedure proceeds to step S124, where a stopping command to stop the voice output is outputted to the voice data synthesis IC to thereby stop the voice output. Then a series of processes is completed and the procedure is returned to the original processing.
- at step S102, if it is determined that the voice data is not to be updated (No), the procedure proceeds to step S110.
- at step S100, if it is determined that the voice output is to be stopped (Yes), the procedure proceeds to step S126, where a stopping command to stop the voice output is outputted to the voice data synthesis IC to thereby stop the voice output. Then, a series of processes is completed and the procedure is returned to the original processing.
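- The control flow of Fig. 5 can be summarized by the sketch below (illustrative only; the state variables, helper names and the upper output-time limit are assumptions, and the voice data synthesis IC is reduced to a running flag):

```python
# Illustrative sketch of the procedure of Fig. 5, run once per 100 ms interrupt.
def synthesis_tick(state, stop_requested, update_requested,
                   tick_ms=100, max_output_ms=3000):
    if stop_requested:                               # S100 (Yes) -> S126
        state["ic_running"] = False
        return state
    if update_requested:                             # S102 (Yes) -> S104-S108
        state["table"] = state["growth_data"]        # identify the table
        state["field"] = state["character_data"]     # identify the field
        state["output_time_ms"] = 0
    # S110-S114: read voice data for each generated quasi-emotion and derive
    # a volume parameter from its intensity.
    state["channels"] = {
        emotion: {"voice_data": (state["table"], state["field"], emotion),
                  "volume": intensity}
        for emotion, intensity in state["intensities"].items()
    }
    state["output_time_ms"] += tick_ms               # S116
    if state["output_time_ms"] > max_output_ms:      # S118 (Yes) -> S122-S124
        state["ic_running"] = False
    else:                                            # S118 (No) -> S120
        state["ic_running"] = True
    return state

state = {"growth_data": "childhood", "character_data": "cheerful",
         "intensities": {"delight": 5, "anger": 0}, "output_time_ms": 0}
state = synthesis_tick(state, stop_requested=False, update_requested=True)
print(state["channels"]["delight"]["volume"])  # 5
```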
- when stimuli are given to the pet type robot 1, for example by the user stroking it or speaking to it, the stimuli are recognized by the sensors 2a-2f, the detection devices 4a-4f and the user and environment information recognition device 4i, and the intensity of each quasi-emotion is generated by the quasi-emotion generation device 4j, based on the recognition result. For example, if it is assumed that the robot has quasi-emotions of "delight," "sorrow," "anger," "surprise," "hatred" and "terror," the intensity of each quasi-emotion is generated as having the grades of "5," "4," "3," "2" and "1."
- the character of the pet type robot 1 is formed by the character forming device 4n into any of a plurality of characters such as "a quick-tempered one,” “a cheerful one” and “a gloomy one,” based on the information from the user and environment recognition device 4i, and the formed character is outputted as character data.
- the quasi-emotions of the pet type robot 1 are changed by the growing stage calculation device 4p to allow the pet type robot 1 to grow, based on the information from the user and environment information recognition device 4i, and the growth result is outputted as growth data.
- the growing process changes through three stages of "childhood,” “youth” and "old age” in this order.
- one of the voice data correspondence tables 100-104 is identified by the voice data synthesis device 15 at steps S104-S106, based on the growth data from the growing stage calculation device 4p, and a field from which voice data is read is identified from among the fields in the identified voice data correspondence table, based on the character data from the character forming device 4n. For example, if the growing stage is in "childhood" and the character is "quick-tempered," the voice data correspondence table 100 is identified as the voice data correspondence table, and the field 110 as the field from which voice data is read.
- voice data corresponding to each quasi-emotion generated by the quasi-emotion generation device 4j is read from the voice data registration data base 14, by referring to the field identified from among the fields in the identified voice data correspondence table, and a voice parameter of the voice volume is determined such that the read-out voice data has the voice volume in response to the intensity of the quasi-emotion generated by the quasi-emotion generation device 4j.
- the determined voice parameter and the read-out voice data are preset for each channel in the voice data synthesis IC, and voice data is synthesized by the voice data synthesis IC, based on the preset voice parameter, to be outputted to the auditory emotion expression device 5b.
- Voices are outputted by the auditory emotion expression device 5c, based on the voice data synthesized by the voice data synthesis device 15.
- voice data corresponding to each quasi-emotion is synthesized and a voice is outputted with a voice volume in response to the intensity of each quasi-emotion. For example, if the quasi-emotion of "delight" is strong, the voice corresponding to the quasi-emotion of "delight" is outputted, among the output voices, with a relatively large volume, and if the quasi-emotion of "anger" is strong, the voice corresponding to the quasi-emotion of "anger" is outputted with a relatively large volume.
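- As a sketch of this behaviour (illustrative; the patent does not disclose a mixing formula), each quasi-emotion's voice can be scaled by a gain that follows its intensity before the voices are combined:

```python
# Illustrative sketch: mix per-emotion voices with volumes that follow the
# generated intensities, so a strong "delight" dominates the output.
def mix_voices(samples_by_emotion, intensities, max_intensity=5):
    length = len(next(iter(samples_by_emotion.values())))
    mixed = [0.0] * length
    for emotion, samples in samples_by_emotion.items():
        gain = intensities.get(emotion, 0) / max_intensity
        for i, sample in enumerate(samples):
            mixed[i] += gain * sample
    return mixed

voices = {"delight": [0.2, 0.4, 0.2], "anger": [0.3, 0.1, 0.3]}
print(mix_voices(voices, {"delight": 5, "anger": 1}))  # delight dominates
```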
- stimuli given from the outside are recognized; a plurality of quasi-emotions are generated, based on the recognition result; voice data corresponding to each quasi-emotion generated is read from the voice data registration data base 14 and synthesized; and a voice is outputted, based on the synthesized voice data.
- a voice corresponding to each quasi-emotion is synthesized to be outputted, so that each of a plurality of different quasi-emotions can be transmitted relatively distinctly to a user.
- attractiveness and cuteness not expected from an actual pet can be expressed.
- the character of the pet type robot 1 is formed into any of a plurality of different characters; and voice data corresponding to each quasi-emotion generated is read from the voice data registration data base 14 and synthesized, by referring to a field corresponding to the formed character of the fields in the voice data correspondence table.
- growing stages of the pet type robot 1 are specified; and voice data corresponding to each quasi-emotion generated is read from the voice data registration data base 14 and synthesized, by referring to a voice data correspondence table corresponding to the specified growing stage.
- the intensity of each quasi-emotion is generated; and the read-out voice data is synthesized such that it has the voice volume in response to the intensity of the generated quasi-emotion.
- the intensity of each of a plurality of different quasi-emotions can be transmitted relatively distinctly to a user.
- attractiveness and cuteness not expected from an actual pet can be expressed further.
- the voice data registration data base 14 corresponds to the voice data storage means of the claims; the quasi-emotion generation device 4j to the quasi-emotion generation means of the claims; the voice data synthesis device 15 to the voice data synthesis means of the claims; and the auditory emotion expression device 5b to the voice output means of the claims.
- the sensors 2a-2f, the detection devices 4a-4f and the user and environment information recognition device 4i correspond to the stimulus recognition means of the claims; the character forming device 4n to the character forming means of the claims; and the growing stage calculation device 4p to the growing stage specifying means of the claims.
- in the embodiment described above, a different synthesized voice is outputted for each character or each growing stage; alternatively, it may be arranged such that a switch for selecting the voice data correspondence table is provided at a position accessible to a user for switching, and voice data corresponding to each quasi-emotion generated is read from the voice data registration data base 14 and synthesized, by referring to the voice data correspondence table selected by the switch.
- voice data is stored in the voice data registration data base 14 in advance, alternatively, voice data downloaded from the internet, etc, or voice data read from a portable storage medium, etc, may be registered in the voice data registration data base 14.
- the contents of the voice data correspondence tables 100-104 are registered in advance; alternatively, they may be registered and compiled at the discretion of a user.
- the read-out voice data is synthesized such that it has the voice volume in response to the intensity of the generated quasi-emotion, alternatively, it may be arranged such that an effect is given, for example, of changing the voice frequency or the voice pitch in response to the intensity of the generated quasi-emotion.
- voice data may be synthesized based on the information from the user condition distinction device 8. For example, if it is recognized that the user is in a good temper, movement may be accelerated to produce a light feeling, or on the contrary, if it is recognized that the user is not in a good temper, the total voice volume may be decreased to keep conditions quiet.
- voice data may be synthesized based on the information from the environment recognition device 10. For example, if it is recognized that the surrounding environment is bright, movement may be accelerated to produce a light feeling, or if it is recognized that the surrounding environment is calm, the total voice volume may be decreased to keep conditions quiet.
- voice output may be stopped or resumed in response to stimuli given from the outside, for example, by a voice stopping switch provided in the pet type robot 1.
- three growing stages are specified in the embodiment described above; alternatively, two stages, or four or more stages, may be specified. If the growing stages increase in number or take a continuous value, a great number of voice data correspondence tables must be prepared, which increases the memory occupancy ratio. In such a case, voice data may be identified using a given calculation formula based on the growing stage, or the voice data to be synthesized may be given a certain acoustic effect based on the growing stage, using a given calculation formula.
- characters of the pet type robot 1 are divided into three categories in the embodiment described above; alternatively, they may be divided into two, or four or more, categories. If the characters of the pet type robot 1 increase in number or take a continuous value, a great number of voice data correspondence tables must be prepared, which increases the memory occupancy ratio. In such a case, voice data may be identified using a given calculation formula based on the character, or the voice data to be synthesized may be given a certain acoustic effect based on the character, using a given calculation formula; a sketch of this formula-based approach follows below.
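- As a sketch of such a formula-based approach (the coefficients are purely illustrative assumptions, not disclosed values), a continuous growing stage could be mapped directly to acoustic parameters instead of requiring one table per stage:

```python
# Illustrative sketch: derive acoustic parameters from a continuous growing
# stage (0.0 = childhood, 1.0 = old age) instead of storing many tables.
def stage_effect(growing_stage: float):
    pitch_factor = 1.5 - 0.7 * growing_stage   # younger voice: higher pitch
    tempo_factor = 1.2 - 0.4 * growing_stage   # younger voice: faster tempo
    return pitch_factor, tempo_factor

print(stage_effect(0.0))   # childhood-like voice
print(stage_effect(1.0))   # old-age-like voice
```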
- the voice data synthesis IC is provided in the voice synthesis device 15, alternatively, it may be provided in the auditory emotion expression device 5b.
- the voice data synthesis device 15 is arranged such that voice data read from the voice data registration data base 14 is outputted to each channel in the voice data synthesis IC.
- the voice data registration data base 14 is used as a built-in memory of the pet type robot 1, alternatively, it may be used as a memory mounted detachably to the pet type robot 1.
- a user may remove the voice data registration data base 14 from the pet type robot 1 and mount it back to the pet type robot 1 after writing new voice data on an outside PC, to thereby update the contents of the voice data registration data base 14.
- voice data compiled originally on an outside PC may be used, as well as voice data obtained by an outside PC through networks such as the internet, etc.
- a user is able to enjoy new quasi-emotion expressions of the pet type robot 1.
- an interface and a communication device for communicating with outside sources through the interface may be provided in the pet type robot 1, and the interface may be connected to networks such as the internet, etc, or PCs storing voice data, for communication by radio or cables, so that voice data in the voice data registration data base 14 may be updated by downloading the voice data from networks or PCs.
- the embodiment described above includes a voice data registration data base 14, a voice data synthesis device 15 and an auditory emotion expression device 5b; alternatively, the voice data registration data base 14, the voice data synthesis device 15 and the auditory emotion expression device 5b may be modularized integrally, and the modularized unit may be mounted detachably to a portion of the auditory emotion expression device 5b in Fig. 3. That is, when an existing pet type robot is required to perform quasi-emotion expression according to the voice synthesizing method of this embodiment, the above described module may be mounted in place of the existing auditory emotion expression device 5b. In such a construction, emotion expression according to the voice synthesizing method of this embodiment can be performed relatively easily, without the need to change the construction of the existing pet type robot to a large extent.
- the storage medium includes a semiconductor storage medium such as a RAM, a ROM or the like, a magnetic storage medium such as an FD, an HD or the like, an optically readable storage medium such as a CD, a CVD, an LD, a DVD or the like, and a magnetic/optically readable storage medium such as an MD or the like, and further any storage medium readable by a computer, whether the reading method is electronic, magnetic or optical.
- the voice synthesis device, the quasi-emotion expression device and the voice synthesizing method according to this embodiment are applied, as shown in Fig. 1, to a case where a plurality of different quasi-emotions generated are expressed through voices; alternatively, they may be applied to other cases to the extent that they fall within the spirit of this invention.
- this embodiment may be applied to a case where a plurality of different quasi-emotions are expressed through voices in a virtual pet type robot implemented by software on a computer.
- the embodiment described above teaches a voice synthesis device applied to a quasi-emotion expression device which utilizes quasi-emotion generation means for generating a plurality of different quasi-emotions to express said plurality of quasi-emotions through voices, wherein when voice data storage means is provided in which voice data is stored for each of said quasi-emotions, voice data corresponding to each quasi-emotion generated by said quasi-emotion generating means is read from said voice data storage means and synthesized.
- voice data corresponding to each quasi-emotion generated by the quasi-emotion generation means is read from the voice data storage means and synthesized.
- voice data includes, for example, voice data in which voices of human beings or animals are recorded, musical data in which music is recorded, or sound effect data in which sound effect is recorded.
- the embodiment can be applied not only to the pet type robot, but also, for example, to a virtual pet type robot implemented on a computer through software.
- in the former case, quasi-emotion generation means may be utilized for generating a plurality of quasi-emotions, for example, based on stimuli given from the outside, and in the latter case, quasi-emotion generation means may be utilized for generating a plurality of quasi-emotions, for example, based on the contents inputted into a computer by a user.
- the same is true for the voice synthesis device set forth in claim 2 and the voice synthesizing method set forth in claim 9.
- an embodiment of a voice synthesis device applied to a quasi-emotion expression device which utilizes quasi-emotion generation means for generating a plurality of different quasi-emotions to express said plurality of quasi-emotions through voices, said device comprising voice data storage means for storing voice data for each of said quasi-emotions; and voice data synthesis means for reading from said voice data storage means and synthesizing voice data corresponding to each quasi-emotion generated by said quasi-emotion generation means can be taken.
- voice data corresponding to each quasi-emotion generated by the quasi-emotion generation means is read from the voice data storage means and synthesized.
- the voice data storage means, which may store voice data by any means and at any time, may be one in which voice data has been stored in advance, or one in which, instead of the voice data being stored in advance, voice data is stored as input data from the outside during operation of the device.
- a voice corresponding to each quasi-emotion is synthesized, so that each of a plurality of different quasi-emotions can be transmitted relatively distinctly to an observer.
- attractiveness and cuteness not expected from an actual pet can be expressed.
- the quasi-emotion expression device is characterized by a device for expressing a plurality of quasi-emotions through voices, comprising voice data storage means for storing voice data for each of said quasi-emotions; quasi-emotion generation means for generating said plurality of quasi-emotions; voice data synthesis means for reading from said voice data storage means and synthesizing voice data corresponding to each quasi-emotion generated by said quasi-emotion generation means; and voice output means for outputting a voice based on voice data synthesized by said voice data synthesis means.
- a plurality of quasi-emotions are generated by the quasi-emotion generation means, and through the voice data synthesis means, voice data corresponding to each quasi-emotion generated is read from the voice data storage means and synthesized.
- a voice is outputted, based on the synthesized voice data, by the voice output means.
- the embodiment can be applied not only to the pet type robot, but also, for example, to a virtual pet type robot implemented on a computer through software.
- the quasi-emotion generation means may generate a plurality of quasi-emotions, for example, based on stimuli given from the outside, and in the latter case, the quasi-emotion generation means may generate a plurality of quasi-emotions, for example, based on the contents inputted into a computer by a user.
- the quasi-emotion expression device is characterized by a device for expressing a plurality of quasi-emotions through voices, comprising voice data storage means for storing voice data for each of said quasi-emotions; stimulus recognition means for recognizing stimuli given from the outside; quasi-emotion generation means for generating said plurality of quasi-emotions based on the recognition result of said stimulus recognition means; voice data synthesis means for reading from said voice data storage means and synthesizing voice data corresponding to each quasi-emotion generated by said quasi-emotion generation means; and voice output means for outputting a voice based on voice data synthesized by said voice data synthesis means.
- stimuli refer to not only ones that are perceivable by the five senses of human beings or animals, but also to ones that are detectable by detection means even if they are not perceivable by the five senses of human beings or animals.
- the stimulus recognition means may be provided, for example, with image input means such as a camera when recognizing stimuli perceivable by visual sensation of human beings or animals, and tactile detection means such as a pressure sensor or a tactile sensor when recognizing stimuli perceivable by tactile sensation of human beings or animals.
- a voice corresponding to each quasi-emotion is synthesized to be outputted, so that each of a plurality of different quasi-emotions can be transmitted relatively distinctly to an observer.
- attractiveness and cuteness not expected from an actual pet can be expressed.
- the quasi-emotion expression device further comprises character forming means for forming any of a plurality of different characters, wherein said voice data storage means is capable of storing, for each of said characters, a voice data correspondence table in which said voice data is registered corresponding to each of said quasi-emotions; and said voice data synthesis means is adapted to read from said voice storage means and synthesize voice data corresponding to each quasi-emotion generated by said quasi-emotion generation means, by referring to a voice data correspondence table corresponding to a character formed by said character forming means.
- any of a plurality of different characters is formed by the character forming means, and through the voice data synthesis means, voice data corresponding to each quasi-emotion generated by the quasi-emotion generation means is read from the voice data storage means and synthesized, by referring to a voice data correspondence table corresponding to the formed character.
- the voice data storage means, which may store voice data correspondence tables by any means and at any time, may be one in which the voice data correspondence tables have been stored in advance, or one in which, instead of the voice data correspondence tables being stored in advance, the voice data correspondence tables are stored as input information from the outside during operation of the device.
- a different synthesized voice can be outputted for each character, so that each of a plurality of different characters can be transmitted relatively distinctly to an observer.
- attractiveness and cuteness not expected from an actual pet can be expressed.
- the quasi-emotion expression device further comprises growing stage specifying means for specifying growing stages, wherein said voice data storage means is capable of storing, for each of said growing stages, a voice data correspondence table in which said voice data is registered corresponding to each of said quasi-emotions; and said voice data synthesis means is adapted to read from said voice storage means and synthesize voice data corresponding to each quasi-emotion generated by said quasi-emotion generation means, by referring to a voice data correspondence table corresponding to a growing stage specified by said growing stage specifying means.
- growing stages are specified by the growing stage specifying means, and through the voice data synthesis means, voice data corresponding to each quasi-emotion generated by the quasi-emotion generation means is read from the voice data storage means and synthesized, by referring to a voice data correspondence table corresponding to the specified growing stage.
- a different synthesized voice can be outputted for each growing stage, so that each of a plurality of growing stages can be transmitted relatively distinctly to an observer.
- attractiveness and cuteness not expected from an actual pet can be expressed.
- said voice data storage means is capable of storing a plurality of voice data correspondence tables in which said voice data is registered corresponding to each of said quasi-emotions; table selection means is provided for selecting any of said plurality of voice data correspondence tables; and said voice data synthesis means is adapted to read from said voice storage means and synthesize voice data corresponding to each quasi-emotion generated by said quasi-emotion generation means, by referring to a voice data correspondence table selected by said table selection means.
- voice data corresponding to each quasi-emotion generated by the quasi-emotion generation means is read from the voice data storage means and synthesized, by referring to the selected voice data correspondence table.
- the selection means may be adapted to select the voice data correspondence table by hand, or based on random numbers or a given condition.
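- A sketch of such selection means (illustrative only; the mode names and interface are assumptions) might offer the three selection modes side by side:

```python
# Illustrative sketch: select a voice data correspondence table by hand,
# at random, or based on a given condition.
import random

def select_table(tables, mode="manual", manual_key=None, condition=None):
    if mode == "manual":
        return tables[manual_key]
    if mode == "random":
        return tables[random.choice(list(tables))]
    return tables[condition()]  # condition() returns a table key

tables = {"table_a": "...", "table_b": "..."}
print(select_table(tables, mode="manual", manual_key="table_a"))
print(select_table(tables, mode="random"))
```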
- a different synthesized voice can be outputted for each selection by the selection means, so that attractiveness and cuteness not expected from an actual pet can be expressed.
- said quasi-emotion generation means is adapted to generate the intensity of each of said quasi-emotions; and said voice data synthesis means is adapted to produce an acoustic effect equivalent to the intensity of the quasi-emotion generated by said quasi-emotion generation means and synthesize said voice data.
- the intensity of each quasi-emotion is generated by the quasi-emotion generation means, and through the voice data synthesis means, an acoustic effect equivalent to the intensity of the generated quasi-emotion is given to the read-out voice data and the voice data is synthesized.
- the acoustic effect refers to one that changes voice data such that the voice outputted based on the voice data is changed before and after the acoustic effect is given, and includes, for example, an effect of changing the volume of the voice, an effect of changing the frequency of the voice, or an effect of changing the pitch of the voice.
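- The following sketch (illustrative; the scaling factors are assumptions) gives voice data an acoustic effect equivalent to an intensity by changing both its volume and, crudely, its pitch:

```python
# Illustrative sketch: apply an intensity-dependent acoustic effect to voice
# data by scaling the volume and shifting the pitch via naive resampling.
def apply_acoustic_effect(samples, intensity, max_intensity=5):
    gain = intensity / max_intensity
    step = 1.0 + 0.1 * intensity   # step > 1 shortens the signal: higher pitch
    output, position = [], 0.0
    while int(position) < len(samples):
        output.append(gain * samples[int(position)])
        position += step
    return output

print(apply_acoustic_effect([0.1, 0.2, 0.3, 0.4], intensity=5))
```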
- the intensity of each of a plurality of different quasi-emotions can be transmitted relatively distinctly to an observer.
- attractiveness and cuteness not expected from an actual pet can be expressed.
- the voice synthesizing method is characterized by a voice synthesizing method applied to a quasi-emotion expression device which utilizes quasi-emotion generation means for generating a plurality of different quasi-emotions to express said plurality of quasi-emotions through voices, wherein when voice data storage means is provided in which voice data is stored for each of said quasi-emotions, voice data corresponding to each quasi-emotion generated by said quasi-emotion generating means is read from said voice data storage means and synthesized.
- the first voice synthesizing method is characterized by a method that may be applied to a quasi-emotion expression device which utilizes quasi-emotion generation means for generating a plurality of different quasi-emotions to express said plurality of quasiemotions through voices, said method including steps of storing voice data for each of said quasi-emotions to voice data storage means, and reading from said voice data storage means and synthesizing voice data corresponding to each quasi-emotion generated by said quasi-emotion generation means.
- the first voice synthesizing method may be applied not only to the pet type robot, but also, for example, to a virtual pet type robot implemented on a computer through software.
- in the former case, quasi-emotion generation means may be utilized for generating a plurality of quasi-emotions, for example, based on stimuli given from the outside, and in the latter case, quasi-emotion generation means may be utilized for generating a plurality of quasi-emotions, for example, based on the contents inputted into a computer by a user.
- the first quasi-emotion expressing method is characterized by a method for expressing a plurality of quasi-emotions through voices, including steps of storing voice data for each of said quasi-emotions to the voice data storage means, generating said plurality of quasi-emotions, reading from said voice data storage means and synthesizing voice data corresponding to each quasi-emotion generated at said quasi-emotion generating step, and outputting a voice based on voice data synthesized at said voice data synthesizing step.
- the first quasi-emotion expressing method can be applied not only to the pet type robot, but also, for example, to a virtual pet type robot implemented on a computer through software.
- in the former case, at the quasi-emotion generating step, a plurality of quasi-emotions are generated, for example, based on stimuli given from the outside, and in the latter case, at the quasi-emotion generating step, a plurality of quasi-emotions are generated, for example, based on the contents inputted into a computer by a user.
- the second quasi-emotion expressing method is characterized by a method of expressing a plurality of quasi-emotions through voices, including steps of storing voice data for each of said quasi-emotions to the voice data storage means, recognizing stimuli given from the outside, generating said plurality of quasi-emotions based on the recognition result of said stimulus recognizing step, reading from said voice data storage means and synthesizing voice data corresponding to each quasi-emotion generated at said quasi-emotion generating step, and outputting a voice based on voice data synthesized at said voice data synthesizing step.
- the third quasi-emotion expressing method is characterized by either of the first and the second quasi-emotion expressing method, further including a step of forming any of a plurality of different characters, wherein at said voice data storing step is stored, for each of said characters in said voice data storage means, a voice data correspondence table in which said voice data is registered corresponding to each of said quasi-emotions, and at said voice data synthesizing step is read from said voice storage means and synthesized voice data corresponding to each quasi-emotion generated at said quasi-emotion generating step, by referring to a voice data correspondence table corresponding to a character formed at said character forming step.
- the fourth quasi-emotion expressing method is characterized by any of the first through the third quasi-emotion expressing method, further including a step of specifying growing stages, wherein at said voice data storing step is stored, for each of said growing stages in said voice data storage means, a voice data correspondence table in which said voice data is registered corresponding to each of said quasi-emotions, and at said voice data synthesizing step is read from said voice storage means and synthesized voice data corresponding to each quasi-emotion generated at said quasi-emotion generating step, by referring to a voice data correspondence table corresponding to a growing stage specified at said growing stage specifying step.
- the fifth quasi-emotion expressing method is characterized by any of the first through the fourth quasi-emotion expressing method, wherein at said voice data storing step are stored, in said voice data storage means, a plurality of voice data correspondence tables in which said voice data is registered corresponding to each of said quasi-emotions, a step is included of selecting any of said plurality of voice data correspondence tables, and at said voice data synthesizing step is read from said voice storage means and synthesized voice data corresponding to each quasi-emotion generated at said quasi-emotion generating step, by referring to a voice data correspondence table selected at said table selecting step.
- at the selecting step, the voice data correspondence table may be selected by hand, or based on random numbers or a given condition.
- the sixth quasi-emotion expressing method is characterized by any of the first through fifth quasi-emotion expressing method, wherein at said quasi-emotion generating step is generated the intensity of each of said quasi-emotions, and at said voice data synthesizing step is produced an acoustic effect equivalent to the intensity of the quasi-emotion generated at said quasi-emotion generating step and synthesized said voice data.
- voice synthesis devices, quasi-emotion expression devices and voice synthesizing methods have been suggested to achieve the foregoing object, but in addition to these devices, the following storage medium can also be suggested.
- This storage medium is characterized by a computer readable storage medium for storing a quasi-emotion expression program for expressing a plurality of different quasi-emotions through voices, wherein a program is stored for executing processing implemented by quasi-emotion generation means for generating said plurality of quasi-emotions, voice data synthesis means for reading from said voice data storage means and synthesizing voice data corresponding to each quasi-emotion generated by said quasi-emotion generation means, and voice output means for outputting a voice based on voice data synthesized by said voice data synthesis means, on a computer with voice data storage means for storing voice data on each of said quasi-emotions.
- the quasi-emotion expression program stored in the storage medium is read by a computer and the computer runs according to the read-out program.
- the embodiment described above teaches a quasi-emotion expression device for expressing a plurality of quasi-emotions through voices, especially for a pet-robot, comprising quasi-emotion generation means 4j for generating said plurality of quasi-emotions, voice data storage means 14 for storing voice data for each of said quasi-emotions; and voice data synthesis means 15 for reading voice data from said voice data storage means 14 and synthesizing voice data corresponding to each quasi-emotion generated by said quasi-emotion generation means 4j.
- Said quasi-emotion expression device further comprises a voice output means 5b for outputting a voice based on voice data synthesized by said voice data synthesis means 15.
- the quasi-emotion expression device further comprises a stimulus recognition means 4i for recognizing stimuli given from an outside.
- the quasi-emotion generation means 4j is provided for generating said plurality of quasi-emotions based on recognition results of said stimulus recognition means 4i.
- Said voice data storage means 14 is provided for storing a plurality of voice data corresponding to each of said quasi-emotions on at least one voice data correspondence table 100,102,104.
- a plurality of voice data correspondence tables 100,102,104 are provided, and a table selection means is provided for selecting any of said plurality of voice data correspondence tables 100,102,104, and said voice data synthesis means 15 is provided for reading voice data from said selected voice data correspondence table 100,102,104 and for synthesizing voice data corresponding to each quasi-emotion generated by said quasi-emotion generation means 4j.
- the quasi-emotion expression device further comprises a character forming means 4n for forming any of a plurality of different characters.
- Said voice data storage means 14 is provided for storing voice data for each of said characters corresponding to each of said quasi-emotions and according to the character on a voice data correspondence table 100,102,104, and said voice data synthesis means 15 is provided for reading voice data from said voice data correspondence table 100,102,104 and for synthesizing voice data corresponding to each quasi-emotion generated by said quasi-emotion generation means 4j and corresponding to the character formed by said character forming means 4n.
- the embodiment comprises growing stage specifying means 4p for specifying growing stages.
- Said voice data storage means 14 is provided for storing voice data for each of said growing stages corresponding to each of said quasi-emotions and according to the growing stage on a voice data correspondence table 100,102,104, and said voice data synthesis means 15 is provided for reading voice data from said voice data correspondence table 100,102,104 and for synthesizing voice data corresponding to each quasi-emotion generated by said quasi-emotion generation means 4j and corresponding to a growing stage specified by said growing stage specifying means 4p.
- Said quasi-emotion generation means 4j is provided for generating an intensity of each of said quasi-emotions.
- Said voice data synthesis means 15 is provided for producing an acoustic effect corresponding to the intensity of the quasi-emotion generated by said quasi-emotion generation means 4j and for synthesizing the related voice data (see the intensity sketch after this list).
- The method for expressing a plurality of quasi-emotions through voices comprises the steps of: generating said plurality of quasi-emotions; storing voice data for each of said quasi-emotions; and reading from said stored voice data and synthesizing voice data corresponding to each quasi-emotion as generated.
- Said method further comprises outputting a voice based on voice data as synthesized.
- The method for expressing a plurality of quasi-emotions through voices further comprises recognizing stimuli given from the outside and generating said plurality of quasi-emotions based on the recognition results.
- the method for expressing a plurality of quasi-emotions through voices further comprises storing a plurality of voice data corresponding to each of said quasi-emotions on at least one voice data correspondence table.
- The method for expressing a plurality of quasi-emotions through voices further comprises the steps of: forming any of a plurality of different characters; storing, on a voice data correspondence table, voice data for each of said characters corresponding to each of said quasi-emotions and according to the character; and reading voice data from said voice data correspondence table and synthesizing voice data corresponding to each quasi-emotion as generated and corresponding to the character as formed.
- Said method further comprises the steps of: specifying growing stages; storing, on a voice data correspondence table, voice data for each of said growing stages corresponding to each of said quasi-emotions and according to the growing stage; and reading voice data from said voice data correspondence table and synthesizing voice data corresponding to each quasi-emotion as generated and corresponding to a growing stage as specified.
- The method for expressing a plurality of quasi-emotions through voices further comprises the steps of: generating an intensity of each of said quasi-emotions, and producing an acoustic effect corresponding to the intensity of the quasi-emotion as generated and synthesizing the related voice data.
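The following minimal Python sketches illustrate one plausible realization of the means described above. They are not taken from the patent, and every class, function, emotion, character, stage, and file name in them is an assumption, since the embodiment identifies its components only by reference numerals (4i, 4j, 4n, 4p, 5b, 14, 15 and tables 100, 102, 104). The first sketch shows the basic pipeline: quasi-emotion generation, voice data storage, voice data synthesis, and voice output.

```python
from dataclasses import dataclass


@dataclass
class QuasiEmotion:
    """One quasi-emotion produced by the quasi-emotion generation means (4j)."""
    name: str         # e.g. "joy", "anger"
    intensity: float  # assumed range 0.0 .. 1.0


class VoiceDataStorage:
    """Voice data storage means (14): voice data keyed by quasi-emotion name."""

    def __init__(self, table: dict[str, bytes]):
        self._table = table

    def read(self, emotion_name: str) -> bytes:
        return self._table[emotion_name]


class VoiceDataSynthesizer:
    """Voice data synthesis means (15): reads stored voice data and assembles the
    data corresponding to each generated quasi-emotion."""

    def __init__(self, storage: VoiceDataStorage):
        self._storage = storage

    def synthesize(self, emotions: list[QuasiEmotion]) -> bytes:
        # For illustration the fragments are simply concatenated; an actual
        # device could mix or sequence them with acoustic effects.
        return b"".join(self._storage.read(e.name) for e in emotions)


class VoiceOutput:
    """Voice output means (5b): plays back the synthesized voice data (stubbed)."""

    def play(self, voice_data: bytes) -> None:
        print(f"playing {len(voice_data)} bytes of synthesized voice data")


if __name__ == "__main__":
    storage = VoiceDataStorage({"joy": b"\x01\x02", "anger": b"\x03\x04"})
    synthesizer = VoiceDataSynthesizer(storage)
    VoiceOutput().play(synthesizer.synthesize([QuasiEmotion("joy", 0.8),
                                               QuasiEmotion("anger", 0.2)]))
```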
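The second sketch shows how recognition results from the stimulus recognition means (4i) might drive the quasi-emotion generation means (4j). The stimulus labels and the update increments are purely illustrative assumptions.

```python
def recognize_stimulus(sensor_event: str) -> str:
    """Stimulus recognition means (4i): map a raw sensor event to a stimulus label."""
    known = {"patted": "pleasant_touch", "hit": "unpleasant_touch", "called": "voice_heard"}
    return known.get(sensor_event, "unknown")


def generate_quasi_emotions(stimulus: str, state: dict[str, float]) -> dict[str, float]:
    """Quasi-emotion generation means (4j): adjust emotion intensities from the
    recognized stimulus (increments are assumptions, clamped to 0.0 .. 1.0)."""
    effects = {
        "pleasant_touch": {"joy": +0.3, "anger": -0.1},
        "unpleasant_touch": {"anger": +0.4, "joy": -0.2},
        "voice_heard": {"surprise": +0.2},
    }
    for emotion, delta in effects.get(stimulus, {}).items():
        state[emotion] = min(1.0, max(0.0, state.get(emotion, 0.0) + delta))
    return state


if __name__ == "__main__":
    state = {"joy": 0.5, "anger": 0.1}
    print(generate_quasi_emotions(recognize_stimulus("patted"), state))
    # -> {'joy': 0.8, 'anger': 0.0}
```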
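The third sketch shows several voice data correspondence tables (standing in for the patent's tables 100, 102, 104) together with a table selection step; the table contents are invented for illustration.

```python
# Hypothetical correspondence tables mapping quasi-emotions to voice data entries.
VOICE_TABLES = {
    "table_100": {"joy": "joy_a.wav", "anger": "anger_a.wav", "sadness": "sad_a.wav"},
    "table_102": {"joy": "joy_b.wav", "anger": "anger_b.wav", "sadness": "sad_b.wav"},
    "table_104": {"joy": "joy_c.wav", "anger": "anger_c.wav", "sadness": "sad_c.wav"},
}


def select_table(table_id: str) -> dict[str, str]:
    """Table selection means: choose one correspondence table to synthesize from."""
    return VOICE_TABLES[table_id]


def synthesize_from_table(table: dict[str, str], emotion: str) -> str:
    """Voice data synthesis means (15): read the entry for the generated
    quasi-emotion from the selected table."""
    return table[emotion]


if __name__ == "__main__":
    print(synthesize_from_table(select_table("table_102"), "joy"))  # -> joy_b.wav
```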
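The fourth sketch combines the character-dependent lookup (character forming means 4n) with the growing-stage-dependent lookup (growing stage specifying means 4p) in a single correspondence table. Character names, stage names, and file names are assumptions.

```python
# Hypothetical table keyed by (character, growing stage), holding one voice data
# entry per quasi-emotion.
VOICE_TABLE = {
    ("cheerful", "infancy"):   {"joy": "coo_bright.wav",  "anger": "cry_short.wav"},
    ("cheerful", "adulthood"): {"joy": "laugh_loud.wav",  "anger": "bark_firm.wav"},
    ("timid",    "infancy"):   {"joy": "coo_soft.wav",    "anger": "whimper.wav"},
    ("timid",    "adulthood"): {"joy": "chuckle_low.wav", "anger": "growl_quiet.wav"},
}


def synthesize(character: str, stage: str, emotion: str) -> str:
    """Read the voice data entry matching the formed character, the specified
    growing stage, and the generated quasi-emotion."""
    return VOICE_TABLE[(character, stage)][emotion]


if __name__ == "__main__":
    print(synthesize("timid", "adulthood", "anger"))  # -> growl_quiet.wav
```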
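The last sketch shows one possible acoustic effect tied to the intensity generated by the quasi-emotion generation means (4j): scaling the amplitude of the stored samples. Amplitude scaling is an assumption; pitch or tempo changes would be equally plausible realizations of an intensity-dependent acoustic effect.

```python
def apply_intensity(samples: list[float], intensity: float) -> list[float]:
    """Scale sample amplitudes in proportion to the quasi-emotion's intensity (0.0-1.0)."""
    intensity = min(1.0, max(0.0, intensity))
    return [s * intensity for s in samples]


if __name__ == "__main__":
    stored_voice_data = [0.1, -0.4, 0.8, -0.2]      # illustrative samples for one quasi-emotion
    print(apply_intensity(stored_voice_data, 0.5))  # half-strength expression
```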
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Toys (AREA)
- Manipulator (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2000237853A JP2002049385A (ja) | 2000-08-07 | 2000-08-07 | 音声合成装置、疑似感情表現装置及び音声合成方法 |
JP2000237853 | 2000-08-07 |
Publications (1)
Publication Number | Publication Date |
---|---|
EP1182645A1 true EP1182645A1 (de) | 2002-02-27 |
Family
ID=18729640
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP01119055A Withdrawn EP1182645A1 (de) | 2000-08-07 | 2001-08-07 | Emotionsausdrückende Vorrichtung und Verfahren zum Ausdruckbringen mehrerer Emotionen durch Stimmen |
Country Status (3)
Country | Link |
---|---|
US (1) | US20020019678A1 (de) |
EP (1) | EP1182645A1 (de) |
JP (1) | JP2002049385A (de) |
Families Citing this family (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002268699A (ja) * | 2001-03-09 | 2002-09-20 | Sony Corp | 音声合成装置及び音声合成方法、並びにプログラムおよび記録媒体 |
US7457752B2 (en) * | 2001-08-14 | 2008-11-25 | Sony France S.A. | Method and apparatus for controlling the operation of an emotion synthesizing device |
JP3824920B2 (ja) * | 2001-12-07 | 2006-09-20 | ヤマハ発動機株式会社 | マイクロホンユニット及び音源方向同定システム |
JP4556425B2 (ja) * | 2003-12-11 | 2010-10-06 | ソニー株式会社 | コンテンツ再生システム、コンテンツ再生方法、コンテンツ再生装置 |
KR100762653B1 (ko) * | 2004-03-31 | 2007-10-01 | 삼성전자주식회사 | 캐릭터 육성 시뮬레이션을 제공하는 이동 통신 장치 및 방법 |
JP4661074B2 (ja) * | 2004-04-07 | 2011-03-30 | ソニー株式会社 | 情報処理システム、情報処理方法、並びにロボット装置 |
JP5688574B2 (ja) * | 2009-11-04 | 2015-03-25 | 株式会社国際電気通信基礎技術研究所 | 触覚提示付ロボット |
EP2505001A1 (de) * | 2009-11-24 | 2012-10-03 | Nokia Corp. | Vorrichtung |
TWI413938B (zh) * | 2009-12-02 | 2013-11-01 | Phison Electronics Corp | 情感引擎、情感引擎系統及電子裝置的控制方法 |
US8731932B2 (en) | 2010-08-06 | 2014-05-20 | At&T Intellectual Property I, L.P. | System and method for synthetic voice generation and modification |
FR3004831B1 (fr) * | 2013-04-19 | 2022-05-06 | La Gorce Baptiste De | Controle numerique des effets sonores d'un instrument de musique. |
JP6053847B2 (ja) * | 2014-06-05 | 2016-12-27 | Cocoro Sb株式会社 | 行動制御システム、システム及びプログラム |
US11010726B2 (en) | 2014-11-07 | 2021-05-18 | Sony Corporation | Information processing apparatus, control method, and storage medium |
JP2017042085A (ja) * | 2015-08-26 | 2017-03-02 | ソニー株式会社 | 情報処理装置、情報処理方法およびプログラム |
JP6212525B2 (ja) * | 2015-09-25 | 2017-10-11 | シャープ株式会社 | ネットワークシステム、機器、およびサーバ |
EP3499501A4 (de) * | 2016-08-09 | 2019-08-07 | Sony Corporation | Informationsverarbeitungsvorrichtung und informationsverarbeitungsverfahren |
CN110177660B (zh) * | 2017-01-19 | 2022-06-14 | 夏普株式会社 | 言行控制装置、机器人、存储介质及控制方法 |
WO2018190178A1 (ja) * | 2017-04-12 | 2018-10-18 | 川崎重工業株式会社 | 乗物の疑似感情生成システムおよび会話情報出力方法 |
JP7420385B2 (ja) * | 2018-08-30 | 2024-01-23 | Groove X株式会社 | ロボット及び音声生成プログラム |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0612401A (ja) * | 1992-06-26 | 1994-01-21 | Fuji Xerox Co Ltd | 感情模擬装置 |
US5860064A (en) * | 1993-05-13 | 1999-01-12 | Apple Computer, Inc. | Method and apparatus for automatic generation of vocal emotion in a synthetic text-to-speech system |
US5632189A (en) * | 1995-03-14 | 1997-05-27 | New Venture Manufacturing & Service, Inc. | Saw shifting apparatus |
JPH10289006A (ja) * | 1997-04-11 | 1998-10-27 | Yamaha Motor Co Ltd | 疑似感情を用いた制御対象の制御方法 |
US5966691A (en) * | 1997-04-29 | 1999-10-12 | Matsushita Electric Industrial Co., Ltd. | Message assembler using pseudo randomly chosen words in finite state slots |
US6185534B1 (en) * | 1998-03-23 | 2001-02-06 | Microsoft Corporation | Modeling emotion and personality in a computer user interface |
JP2001038053A (ja) * | 1999-08-03 | 2001-02-13 | Konami Co Ltd | ゲーム展開の制御方法、ゲーム装置及び記録媒体 |
- 2000
  - 2000-08-07 JP JP2000237853A patent/JP2002049385A/ja active Pending
- 2001
  - 2001-08-06 US US09/922,760 patent/US20020019678A1/en not_active Abandoned
  - 2001-08-07 EP EP01119055A patent/EP1182645A1/de not_active Withdrawn
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5559927A (en) * | 1992-08-19 | 1996-09-24 | Clynes; Manfred | Computer system producing emotionally-expressive speech messages |
EP0730261A2 (de) * | 1995-03-01 | 1996-09-04 | Seiko Epson Corporation | Interaktive Spracherkennungsvorrichtung |
WO1997041936A1 (en) * | 1996-04-05 | 1997-11-13 | Maa Shalong | Computer-controlled talking figure toy with animated features |
EP1107227A2 (de) * | 1999-11-30 | 2001-06-13 | Sony Corporation | Sprachverarbeitung |
EP1113417A2 (de) * | 1999-12-28 | 2001-07-04 | Sony Corporation | Vorrichtung, Verfahren und Aufzeichnungsmedium zur Sprachsynthese |
Non-Patent Citations (1)
Title |
---|
MORIYAMA ET AL: "Emotion recognition and synthesis system on speech", PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON MULTIMEDIA COMPUTING AND SYSTEMS, LOS ALAMITOS, CA, US, vol. 1, 7 June 1999 (1999-06-07), pages 840 - 844, XP002154370 * |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7222076B2 (en) | 2001-03-22 | 2007-05-22 | Sony Corporation | Speech output apparatus |
Also Published As
Publication number | Publication date |
---|---|
US20020019678A1 (en) | 2002-02-14 |
JP2002049385A (ja) | 2002-02-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP1182645A1 (de) | Emotionsausdrückende Vorrichtung und Verfahren zum Ausdruckbringen mehrerer Emotionen durch Stimmen | |
Hermann et al. | Sound and meaning in auditory data display | |
US6714840B2 (en) | User-machine interface system for enhanced interaction | |
KR100864339B1 (ko) | 로봇 장치 및 로봇 장치의 행동 제어 방법 | |
KR20100121420A (ko) | 가상 세계에서의 객체를 제어하는 시스템, 방법 및 기록 매체 | |
KR20030074473A (ko) | 스피치 합성 방법 및 장치, 프로그램, 기록 매체, 억제정보 생성 방법 및 장치, 및 로봇 장치 | |
JP3211186B2 (ja) | ロボット、ロボットシステム、ロボットの学習方法、ロボットシステムの学習方法および記録媒体 | |
KR20010101883A (ko) | 로봇 장치와 그 제어 방법, 및 로봇 장치의 성격 판별 방법 | |
EP1256931A1 (de) | Verfahren und Vorrichtung zur Sprachsynthese und Roboter | |
WO2002076686A1 (fr) | Appareil d'apprentissage d'actions et procede d'apprentissage d'actions pour systeme robotique, et support de memoire | |
JP7528981B2 (ja) | 機器の制御装置、機器、機器の制御方法及びプログラム | |
KR100580617B1 (ko) | 오브젝트 성장제어 시스템 및 그 방법 | |
CN112601592A (zh) | 机器人及声音生成程序 | |
WO2021174144A1 (en) | Systems and methods for interactive, multimodal book reading | |
KR20200065499A (ko) | 음악과 춤의 상관관계를 학습하여 춤을 추는 로봇 | |
JP2024108175A (ja) | ロボット、音声合成プログラム、及び音声出力方法 | |
Hahn et al. | Pikapika–the collaborative composition of an interactive sonic character | |
Tejada et al. | Blowhole: Blowing-Activated Tags for Interactive 3D-Printed Models. | |
WO2020166373A1 (ja) | 情報処理装置および情報処理方法 | |
JP3860409B2 (ja) | ペットロボット装置及びペットロボット装置プログラム記録媒体 | |
CN111278611A (zh) | 信息处理设备、信息处理方法和程序 | |
EP4263013B1 (de) | Interaktiver spielzeugsatz zur wiedergabe digitaler medien | |
JP7414735B2 (ja) | 複数のロボットエフェクターを制御するための方法 | |
JP2002307349A (ja) | ロボット装置、情報学習方法、プログラム及び記録媒体 | |
JP2001105363A (ja) | ロボットにおける自律的行動表現システム |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): DE FR GB IT |
AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR |
AX | Request for extension of the european patent | Free format text: AL;LT;LV;MK;RO;SI |
17P | Request for examination filed | Effective date: 20020820 |
AKX | Designation fees paid | Free format text: DE FR GB IT |
17Q | First examination report despatched | Effective date: 20021021 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
18D | Application deemed to be withdrawn | Effective date: 20030501 |