CN101276434A - Method and apparatus for learning behavior in software robot - Google Patents

Publication number
CN101276434A
Authority
CN
China
Prior art keywords
state
action
variation
type
perception
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2008100099134A
Other languages
Chinese (zh)
Inventor
李江熙
金光春
金礼薰
金钟焕
赵世衡
崔胜唤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Samsung Electronics Co Ltd
Publication of CN101276434A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/12: Use of codes for handling textual entities
    • G06F40/151: Transformation
    • G06F40/16: Automatic learning of transformation rules, e.g. from examples
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40: Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/44: Browsing; Visualisation therefor
    • G06F16/444: Spatial browsing, e.g. 2D maps, 3D or virtual spaces
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00: Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/02: Comparing digital values
    • G06F7/023: Comparing digital values adaptive, e.g. self learning

Abstract

Disclosed are a method and an apparatus for learning behavior in a software robot. The method includes: detecting the type of an object in cyberspace related to the type of a currently manifested action, and the type and variation of at least one state, among the perception states and emotional states, that is preset to change in relation to that type of action; finding, among multiple episodes, the episode that matches the detected action type, the related object type, and the type of the at least one state, where each episode corresponds to a combination of one of the objects in the cyberspace, one of the emotional states or perception states defined in the software robot, and one of the action types, and stores a variation related to that state; calculating a representative variation from the variation stored in the found episode and the variation generated by the manifested action; and storing the representative variation as the variation of the found episode.

Description

Method and apparatus for learning behavior in a software robot
Technical field
The present invention relates to a genetic robot and, more particularly, to a method and apparatus for learning behavior (that is, actions) in a software robot, which is one kind of genetic robot.
Background art
In general, a genetic robot is an artificial creature, a software robot (that is, a Sobot), or a general robot that has its own unique genetic code. A robot genetic code represents the genome of a single robot and is made up of multiple artificial chromosomes. A software robot is an artificial creature in software form. Because it can be transferred over a network, it can act as an independent software agent that interacts with a user, or as the intelligent unit of a hardware robot that links the robot to a sensor network.
As the robot interacts with its external environment, the artificial chromosomes defined in the software robot define changes in the robot's internal state (including motivation, homeostasis, and emotional state) and determine the personality or characteristic traits of the robot, which are expressed through the behavior that follows those changes. Table 1 defines the terms artificial creature, personality, motivation, homeostasis, emotion, and behavior.
Table 1
Term | Definition
Artificial creature | An artificial creature that acts according to the robot's own motivation, has emotions, selects its own behavior, and can interact with humans in real time.
Personality | Not a simple summary of behavior, but a determinant of part or all of it; in human terms, it can be understood as personality. The concept includes motivation, homeostasis, and emotion, so a personality engine is an engine that has motivation, homeostasis, and emotion. It corresponds to the determinant that generates the various internal states and behavior manifestations.
Motivation | The process that arouses and sustains a living being's behavior and controls the pattern of that behavior. It gives rise to the selection and performance of behavior. Examples: curiosity, intimacy, boredom, avoidance, desire for control.
Homeostasis | The function that keeps a living being's physiological state stable as an individual despite changes in the external and internal environment. It gives rise to the selection and performance of behavior. Examples: hunger, sleepiness, fatigue.
Emotion | The subjective agitation that occurs when a living being performs a specific behavior. Examples: happiness, sadness, anger, fear.
Behavior | A general term for an individual's actions, including moving to a specific place, stopping, and so on. For an animal, examples include sleeping, eating, and running. The number of behaviors an individual can select is limited, and at any given time each individual can perform only one behavior.
In addition, the artificial chromosomes can be classified into genetic information related to essential elements, genetic information related to internal state, and genetic information related to behavior determination. The genetic information related to essential elements refers to the essential parameters that significantly affect changes in internal state and the manifestation of external behavior. The genetic information related to internal state refers to the parameters that affect the robot's internal state in relation to external inputs applied to the robot. The genetic information related to behavior determination refers to the parameters that determine, according to the currently determined internal state, the external behavior related to that internal state.
Here, internal state refers to states such as motivation, homeostasis, and emotion. As shown in Table 2, the internal state of the robot is determined by each internal state and by the parameters of the internal state for each external stimulus (that is, the genetic information related to internal state).
Table 2
[Table 2 appears as an image in the original publication: Figure A20081000991300061]
The genetic information related to behavior determination can be represented in the same way as Table 2, with the various manifested behaviors in place of the external stimuli. It therefore comprises parameters that relate each specific action to each internal state, that is, values of the internal-state parameters (such as motivation, homeostasis, and emotion) that allow each action to be manifested.
The essential parameters that significantly affect changes in these internal states and the manifestation of external behavior can include: volatility (whether the state can change), initial value, mean value, convergence value, decay value over time, a specific value designated at a specific time, and so on. The genetic information related to essential elements can configure these parameters for a specific purpose. It therefore comprises, for each internal state (that is, each motivation, homeostasis, and emotion state), the volatility, initial value, mean value, convergence value, decay value, specific value, and so on.
In this case, the robot genome comprises the genetic information related to essential elements, the genetic information related to internal state, and the genetic information related to behavior determination. The genetic information related to essential elements comprises the internal-state parameters and the parameters of the elements that correspond to each internal state and are essential to changes in internal state and the manifestation of external behavior. The genetic information related to internal state comprises the parameters of the various external stimuli and the internal-state parameters corresponding to each external stimulus. The genetic information related to behavior determination comprises the parameters of the various manifested actions and the internal-state parameters corresponding to each manifested action. As shown in Table 3 below, the robot genome can therefore be represented as a two-dimensional matrix of the internal states against the essential elements, external stimuli, and manifested actions corresponding to each internal state.
Table 3
[Table 3 appears as images in the original publication: Figure A20081000991300071 and Figure A20081000991300081]
A current robot platform therefore determines a specific behavior to manifest based on its current internal state (that is, states such as motivation, homeostasis, and emotion) and carries out the determined behavior. For example, if the robot's internal state corresponds to hunger, the robot decides to ask a person for something to eat and puts that decision into practice. In this way, the robot can appear to act like a real living being.
A software robot with these characteristics should provide services to users in a variety of environments without being constrained by time or space. To move freely over a network, the software robot must have the IP address of a device to which it can transfer, and it resides in the device currently in use. To interact with the user of that device, the software robot can perform the same functions as a real living being: selecting its own behavior, adapting to its environment, expressing its own emotions, and so on.
For a software robot to adapt to its environment, it must be taught how to behave. When the software robot shows its response to an object of interest, the user gives it a reward (that is, praise) or a punishment (that is, blame). In this way, the robot's tendency, such as whether to avoid or approach the object, can change the next time an object of interest comes near. This is called "preference learning". Preference learning teaches the software robot a degree of preference, that is, how much it likes or dislikes a specific object. For example, if the user praises the software robot when it finds a yellow ball, the strength of association between the robot's related internal states and its behavior can be adjusted by increasing happiness among the emotional states and decreasing the motivational state of escape.
Voice learning allows the action the user wants to be manifested from a group of similar actions, determined according to an arbitrary voice command from the user. Voice learning can teach the behavior that suits an arbitrary command out of all actions by progressively narrowing the set of learning targets and reinforcing the learned result for each group of similar actions. For example, the set similar to "sit down" includes "take a seat", "curl up", and "lie down", and the set similar to "come here" includes "catch up", "approach", "play", and "touch".
Thus, with the learning functions of existing software robots, the behavior that can be taught is limited to a group of similar actions, and only some specific actions can be taught. Moreover, during reinforcement learning the user must reward or punish the software robot's behavior one instance at a time. With prior-art learning methods, the robot can learn emotion and motivation, but it cannot learn how to maintain homeostasis.
Summary of the invention
The present invention is intended to solve the above problems of the prior art, and provides a method and apparatus for learning the association between a software robot's actions and its internal states.
The invention also provides a method and apparatus that enable learning between all possible actions and all internal states of the software robot.
The invention also provides a method and apparatus in which, even when the user gives no explicit reward or punishment as feedback, the software robot can perceive each outcome that can be recognized as a reward or punishment and use it for learning.
The invention also provides a method and apparatus in which the software robot can learn actions in relation to physical states and emotional states.
To achieve these aspects of the present invention, a method according to the invention is provided. The method comprises: detecting the type of an object in the information space (cyberspace) related to the type of a currently manifested action, and the type and variation of at least one state, among the perception states and emotional states, that is preset to change in relation to that type of action; finding, among a plurality of episodes, the episode that matches the detected action type, the related object type, and the type of the at least one state, where each episode corresponds to a combination of one of the objects in the information space, one of the emotional states or perception states defined in the software robot, and one of the action types, and stores the variation related to that state; calculating a representative variation from the variation stored in the found episode and the variation produced by the manifested action; and storing the representative variation as the variation of the found episode.
Description of drawings
The above and other exemplary features, aspects, and advantages of the present invention will become more apparent from the following detailed description taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a diagram illustrating the configuration of a software robot according to an embodiment of the present invention;
Fig. 2 illustrates the relation between the episode memory and the blackboard according to an embodiment of the present invention;
Fig. 3 is a diagram illustrating the structure of an episode stored in the episode memory according to an embodiment of the present invention;
Fig. 4A to Fig. 4C illustrate a process for storing an episode according to an embodiment of the present invention; and
Fig. 5 illustrates the change in memory through action learning according to an embodiment of the present invention.
Embodiments
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings. The same components are denoted by the same reference numerals throughout the description and drawings, even when they appear in different drawings. Detailed descriptions of known functions and configurations incorporated herein are omitted where they could obscure the subject matter of the invention.
A software robot, by its nature, exists in an information space. The information space in which the software robot exists can contain one or more software robots, together with various other components such as articles, food, toys, and chairs. In the present invention, software robots and all such components are called objects. Besides objects, the information space can contain environmental information, which includes environmental factors, the position information of objects, and information on interactions between objects. The environmental factors are the main factors that describe the environmental attributes of the information space, and can include temperature, humidity, time, amount of sunlight, sound, spatial attributes, and so on. The position information of an object is the fixed position of that object in the information space, or the current position of an object that has moved. The information on interactions between objects describes direct interactions between objects, such as when the software robot eats food or kicks a ball.
Environmental information can be delivered to the software robot. In general, the environmental factors and the position information of objects are delivered to the software robot through specific functions and are sensed by the software robot's sensor unit. Information on interactions between objects can be delivered to the software robot as events, each represented by a specific function.
An event function is needed to deliver to the software robot a situation that occurs in the information space. It comprises identifying information for the objects involved in the situation (who), identifying information for the type of motion involved in the situation (what), and identifying information for the effect caused by the motion (parameter). Events can be classified as external events and internal events. An external event represents an interaction between different objects. For example, when the software robot eats food, the objects are the software robot and the food, the type of motion is eating, and the effect of the motion can be a feeling of fullness and happiness after the meal. An internal event is needed to handle an effect that arises inside the software robot as the result of one of its own actions. For example, when the software robot walks, the object involved is the software robot, the type of motion is walking, and the effect of the motion can be fatigue. The software robot can sense the occurrence of such events through its sensor unit, physical state unit, and so on.
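The who / what / parameter structure of an event can be sketched as a small data type. The following Python sketch is illustrative only: the class name `Event`, the field names, the rule that an event involving more than one object is external, and the numeric effect values are all assumptions, not taken from the patent.

```python
from dataclasses import dataclass
from enum import Enum


class EventKind(Enum):
    EXTERNAL = "external"  # interaction between different objects
    INTERNAL = "internal"  # effect arising inside a single object


@dataclass
class Event:
    who: tuple        # identifiers of the objects involved in the situation
    what: str         # type of motion involved in the situation
    parameter: dict   # effects caused by the motion (hypothetical values)

    @property
    def kind(self):
        # Assumption: more than one distinct object means an external event.
        if len(set(self.who)) > 1:
            return EventKind.EXTERNAL
        return EventKind.INTERNAL


# The two examples from the text, with made-up effect magnitudes.
eating = Event(who=("sobot", "food"), what="eat",
               parameter={"fullness": 1, "happiness": 1})
walking = Event(who=("sobot",), what="walk", parameter={"fatigue": 1})
```

Deriving the kind from the number of objects keeps the two event classes in one type; an implementation could equally store the kind explicitly.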
According to an embodiment of the present invention, the software robot can be configured as shown in Fig. 1, which is a diagram illustrating the configuration of a software robot according to an embodiment of the present invention. Referring to Fig. 1, the software robot comprises: a physical state unit 10, a perception unit 20, an emotional state unit 30, a behavior management unit 40, a sensor unit 80, a short-term memory 70, an episode memory 60, a behavior execution unit 50, a blackboard 90, and a memory (not shown).
The software robot is made up of various modules, such as the physical state unit 10, perception unit 20, emotional state unit 30, behavior management unit 40, sensor unit 80, and behavior execution unit 50. The modules depend on one another and exchange agreed data. If these complex dependencies are not standardized, the form of the data exchanged and the method of exchange must be defined separately for every dependency at the implementation stage. The blackboard 90 is needed to overcome this inconvenience. The modules share the blackboard 90, which serves as a means of unifying the various information resources. The structure follows the same idea as several people writing on a shared blackboard the information that each needs from the others in order to solve a complex problem. A common data area, which can itself be called the blackboard, sits at the center of the blackboard 90, and the information provided by the modules is unified there. The blackboard 90 is implemented as a C++ blackboard class. The class holds the data structures defined in Table 4 below, and each item of data is supplied to, or updated by, the modules that make up the virtual creature through a corresponding Put function and Get function.
Table 4
Structure | Definition
Environment value 91 | Virtual environment information delivered to the software robot
External event value 92 | Information about situations that occur in the information space
Internal event value 93 | Information about situations that occur inside the software robot
Sensor value 94 | Information about the information space sensed by the software robot
Physical state value 95 | Values of the software robot's physical states
Perception value 96 | Perception information of the software robot
Emotional state value 97 | Dominant emotion value of the software robot
Behavior-plus-object value 98 | The action selected for manifestation and the object related to the selected action
Sensor list 99 | List of the senses present in the software robot
Physical state list 100 | List of the physical states present in the software robot
Perception list 101 | List of the perceptions present in the software robot
Emotion list 102 | List of the emotions present in the software robot
Behavior list 103 | List of the actions present in the software robot
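The blackboard idea, in which modules exchange data only through a shared area with Put and Get access, can be sketched as follows. The patent states that the blackboard is a C++ class; this Python version is only a minimal illustration, and the method names `put` and `get`, the key strings, and the sample values are assumptions.

```python
class Blackboard:
    """Shared data area that modules read and write instead of calling each other."""

    def __init__(self):
        self._values = {}

    def put(self, name, value):
        # A module publishes or updates one named item of data.
        self._values[name] = value

    def get(self, name, default=None):
        # A module reads one named item; returns default if it was never put.
        return self._values.get(name, default)


board = Blackboard()

# Hypothetical exchange: the physical state unit writes, the perception unit reads.
board.put("physical_state_value", {"energy": 8})
energy = board.get("physical_state_value")["energy"]
board.put("perception_value", {"hunger": 1.0 if energy < 10 else 0.0})
```

With this arrangement a new module only needs to know the names of the entries it uses, not the interfaces of the other modules.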
The sensor unit 80 takes environmental information and external events as input, updates its sensor data, and outputs the sensor data that affect the software robot to the blackboard 90 as the sensor value 94. All information in the information space is delivered to the virtual creature in the form of environmental information and external events. However, depending on the position or abilities of the virtual creature, some of that information cannot be sensed. The sensor unit 80 therefore acts as a filter that passes on to the software robot only the information that can be sensed. For example, information about objects outside the virtual creature's visual range is not included in the sensor value 94, and external events unrelated to the software robot are not processed.
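The filtering role of the sensor unit can be illustrated with a simple range check. This is a sketch under assumptions: the function name `sense`, the use of straight-line distance as the visibility test, and the coordinates are all hypothetical, since the patent does not specify how the visual range is evaluated.

```python
import math


def sense(robot_position, objects, visual_range):
    """Return only the objects that lie within visual_range of the robot."""
    visible = {}
    for name, position in objects.items():
        # Euclidean distance between the robot and the object.
        if math.dist(robot_position, position) <= visual_range:
            visible[name] = position
    return visible


# Hypothetical scene: the ball is nearby, the chair is out of range.
objects = {"ball": (1.0, 2.0, 0.0), "chair": (30.0, 0.0, 0.0)}
sensor_values = sense((0.0, 0.0, 0.0), objects, visual_range=10.0)
```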
The physical state unit 10 changes the robot's physical state according to external events, internal events, and environmental information, and outputs the result to the blackboard 90 as the physical state value 95. As described in Table 5 below, examples of physical states include the stomach state, energy state, waste state, activity state, health state, and growth state.
Table 5
State | Definition | Influence
Stomach state | Amount of food ingested and not yet digested | Affects hunger.
Energy state | Amount of energy currently held | Affects whether digestion takes place.
Waste state | Amount of waste that must be excreted | Affects excretion.
Activity state | Energy available for action | Affects fatigue.
Health state | State of health | Affects activity.
Growth state | Degree of physical growth | Affects the external appearance of the virtual creature.
The perception unit 20 is the module that manages the results of the software robot's perception of the environmental information of the information space and of its own physical state. It senses the external environment through the sensor unit 80, detects the internal state through the physical state value 95, and outputs the perception value 96 to the blackboard 90. For example, if the sensor unit 80 delivers the information "patted with a strength of 100", the perception "feels pain" can result. If the amount of energy held is less than 10, the perception "is hungry" can result. According to an embodiment of the present invention, the types of perception state are defined as in Table 6.
Table 6
State | Definition
Brightness | Brightness of the virtual environment
Sound | Loudness of sound produced in the virtual environment
Taste | Degree of taste of food eaten
Hunger | Degree of hunger
Fatigue | Degree of fatigue
Hit | Degree to which the virtual creature is hit by situations occurring in the virtual environment
Pat | Degree to which the virtual creature is patted by situations occurring in the virtual environment
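The two perception examples given for the perception unit (a pat of strength 100 perceived as pain, and energy below 10 perceived as hunger) can be written as threshold rules. In this sketch the function name `perceive`, the dictionary keys, and the use of "at least 100" as the pain condition are assumptions; the text gives only the two example values.

```python
def perceive(sensor_values, physical_state):
    """Map sensed input and physical state to perceptions using threshold rules."""
    percepts = {}
    # Assumed rule: a pat of strength 100 or more is perceived as pain.
    if sensor_values.get("pat_strength", 0) >= 100:
        percepts["pain"] = True
    # From the text: energy held below 10 is perceived as hunger.
    energy = physical_state.get("energy")
    if energy is not None and energy < 10:
        percepts["hungry"] = True
    return percepts


strong_pat = perceive({"pat_strength": 100}, {"energy": 50})
low_energy = perceive({}, {"energy": 5})
```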
The emotional state unit 30 is the module that manages the software robot's emotional states. It changes the emotional states with reference to the perception value 96 and outputs the changed emotional state to the blackboard 90 as the emotional state value 97. Emotional states can include happiness, sadness, anger, fear, and so on, and the emotional state unit 30 determines the emotional state with the largest value to be the dominant emotion.
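Selecting the dominant emotion as the emotional state with the largest value amounts to a maximum over the current emotion values. The sketch below assumes the emotions are held in a dictionary; the function name and the numeric values are hypothetical.

```python
def dominant_emotion(emotional_states):
    """Return the name of the emotional state with the largest value."""
    return max(emotional_states, key=emotional_states.get)


# Hypothetical emotion values.
emotions = {"happy": 40, "sad": 10, "angry": 25, "fear": 5}
current = dominant_emotion(emotions)
```

If two emotions share the largest value, `max` returns the first one encountered; the text does not say how such ties are resolved.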
The short-term memory 70 is a memory that stores information generated over a short period. Taking the software robot as the center, it stores the positions of other objects, using the three variables of a spherical coordinate system (γ, θ, and φ), together with the time t.
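A robot-centered spherical position can be derived from Cartesian offsets as in the sketch below. It assumes that the three variables are the radial distance, the polar angle measured from the +z axis, and the azimuth in the x-y plane; the text does not state which convention is used, and the function name and sample values are hypothetical.

```python
import math


def to_spherical(x, y, z):
    """Convert an offset from the robot (x, y, z) to (r, theta, phi)."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r) if r > 0 else 0.0  # polar angle from the +z axis
    phi = math.atan2(y, x)                      # azimuth in the x-y plane
    return r, theta, phi


# Hypothetical short-term memory entry: object position plus observation time t.
t = 12.5
short_term_memory = {"ball": (*to_spherical(0.0, 3.0, 4.0), t)}
```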
The behavior management unit 40 is the module that finally determines the software robot's behavior. It determines behavior with reference to the perception value 96, the emotional state value 97, the short-term memory 70, and the episode memory 60, and outputs the resulting behavior-plus-object value 98 to the blackboard 90. The behavior management unit 40 determines behavior mainly with reference to the episode memory 60 and, when necessary, manifests a guided behavior directed by the user. The emotional state value 97 does not itself take part in selecting behavior; after a behavior has been selected, it affects how the selected behavior is manifested. That is, after the behavior "walk" has been selected, emotion is used to produce variations of that behavior, such as "walking happily" or "walking angrily". In addition, if the perception value 96 and the emotional state value 97 fall within an unstable-state range, which indicates an unstable state, the behavior management unit 40 refers to the episode memory 60 to determine the behavior that must be performed for that reason. The unstable-state range is predetermined as an internal constant of the software robot and corresponds to a genetic value.
The memory (not shown) stores the unstable-state ranges and the artificial chromosome information set in the software robot. It also stores the types of the physical states, perception states, emotional states, and actions set in the software robot. For each type of action, it stores information on the perception states, physical states, or emotional states related to that action. It also stores the variations of the emotional states or physical states related to each type of action.
The episode memory 60 is the module responsible for learning the relations between the software robot's behavior and perception and between its behavior and emotional state. As shown in Fig. 2, it refers to the perception value 96 and the emotional state value 97 and determines episodes and the behavior-plus-object value 98. Fig. 2 illustrates the relation between the episode memory 60 and the blackboard 90 according to an embodiment of the present invention.
The episode memory 60 comprises multiple episodes 68, each with the structure shown in Fig. 3, which is a diagram illustrating the structure of an episode stored in the episode memory according to an embodiment of the present invention. Each episode 68 is information representing one combination of a perception state or emotional state among the internal states defined in the software robot, an object present in the information space, and a type of behavior, and so expresses the relation among the action, the perception or emotional state, and the object for that combination. Referring to Fig. 3, an episode 68 comprises a behavior (that is, an action) 61 and an object 62, and, as variables, a category 63, a state 64, a variation 65, and a frequency of occurrence 66. Table 7 defines the meaning of each item.
Table 7
Episode element | Definition
Behavior (i.e., action) 61 | Unique identifying information of the behavior that is selected and expressed
Object 62 | Unique identifying information of the object associated with the expressed behavior
Category 63 | Information indicating whether the episode corresponds to a memory of a perception state or a memory of an emotional state; it takes the value "perception" or "emotion"
State 64 | Stores, according to the category, the unique identifying information of the perception state or of the emotional state; its initial value is "0"
Change 65 | Amount of change in the associated state
Frequency of occurrence 66 | Indicates how many times the same combination of behavior, object, and state has been learned; its initial value is "0"
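As a minimal sketch, the episode elements of Table 7 could be modeled as a small record. The names `Episode`, `Category`, and the field names are hypothetical and chosen only for illustration; the description does not specify a data layout. Using strings as identifiers and `0.0` as the default change are likewise assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """Category 63: whether the episode remembers a perception or an emotional state."""
    PERCEPTION = "perception"
    EMOTION = "emotion"


@dataclass
class Episode:
    behavior: str          # behavior (action) 61: identifier of the expressed behavior
    obj: str               # object 62: identifier of the related object
    category: Category     # category 63
    state: str             # state 64: identifier of the perception or emotional state
    change: float = 0.0    # change 65 (default value is an assumption)
    occurrences: int = 0   # frequency of occurrence 66, initially 0 per Table 7

    def key(self):
        """Combination that identifies the episode, independent of the learned values."""
        return (self.behavior, self.obj, self.category, self.state)


# An episode in the style of the description: eat-object 1-perception-hunger
first = Episode("eat", "object 1", Category.PERCEPTION, "hunger", change=-10, occurrences=1)
```

Separating the identifying key from the learned values (change and occurrences) mirrors the way an episode is looked up by its combination and then updated.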
The total number of episodes 68 stored in the episode memory 60, and hence the maximum size of the memory, is fixed by the number of perception states and the number of emotional states defined in the software robot, the number of objects present in the information space, and the number of action types. The total can be calculated by equation (1) below.
$$\text{total number of episodes} = (\text{number of perception states} + \text{number of emotional states}) \times \text{number of action types} \times \text{number of objects} \qquad (1)$$
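Equation (1) can be checked with a short calculation. The function name and the sizes used below are hypothetical and are not taken from the description.

```python
def total_episodes(n_perception_states, n_emotional_states, n_action_types, n_objects):
    """Equation (1): one episode per (state, action type, object) combination."""
    return (n_perception_states + n_emotional_states) * n_action_types * n_objects


# Hypothetical sizes: 5 perception states, 4 emotional states, 10 action types, 3 objects.
print(total_episodes(5, 4, 10, 3))  # prints 270
```

Because the perception and emotional states are added rather than multiplied, the memory grows linearly with the number of states.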
The process of storing an episode 68 in the episode memory 60 is as follows. The software robot can express a specific action according to an external event, environmental information, its internal state, or the user's guidance. As a result of expressing the specific action, the emotional state or perception state associated with that action changes. At this time, if the type of the emotional state or perception state associated with the specific action is predetermined, then the change, under the unique artificial chromosome, of that emotional state or perception state is also predetermined. When the specific action is expressed, the episode memory 60 detects the type of the specific action and can sense the object associated with it, together with the category, the type of state, and the change of the internal state of the software robot that changes in relation to the specific action. The episode memory 60 then searches for the episode whose combination corresponds to the detected action type, object, category, state type, and change. For example, when the software robot performs the behavior "eat object 1", and the states that change in relation to object 1 are hunger (change: "-10") and happiness (change: "+5"), the episode memory 60 finds, for the behavior "eat object 1", the episodes eat-object 1-perception-hunger-(-10) and eat-object 1-emotion-happiness-(5).
If an episode 68 with the same combination is found, the episode memory 60 detects the change 65 stored in that episode. A representative change is then calculated from the detected change 65 and the change produced by the specific action. Because the episode memory 60 stores the result of learning a behavior, the change produced by the specific action is not stored as it is; instead, a representative change that reflects the degree of learning is calculated, and the calculated representative change is stored.
To this end, the detected change 65 can be regarded as the existing representative change, and the representative change is calculated by equation (2):
$$\text{representative change} = (1 - p) \times \text{existing representative change} + p \times \text{generated change} \qquad (2)$$
Here, "p" denotes the degree to which the generated change influences the representative change; it is predetermined and lies in the range 0 < p < 1.
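Equation (2) is a weighted blend of the stored value and the new observation. The sketch below assumes a hypothetical function name and hypothetical input values, and shows how repeated observations of the same generated change pull the stored value toward it.

```python
def representative_change(existing, generated, p):
    """Equation (2): blend the stored change with the newly generated change."""
    if not 0 < p < 1:
        raise ValueError("p must satisfy 0 < p < 1")
    return (1 - p) * existing + p * generated


# Hypothetical values: start from 0 and observe a generated change of 10 three times.
value = 0.0
for _ in range(3):
    value = representative_change(value, generated=10.0, p=0.5)
    print(value)  # 5.0, then 7.5, then 8.75
```

A small p makes the stored value change slowly, so a single unusual observation has little effect; a p close to 1 makes the memory follow the most recent observation.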
With reference to Figs. 4A through 4C, the process of storing a set of episodes 68 in the episode memory 60 is described as follows.
Fig. 4A illustrates six episodes stored in the episode memory 60 according to an embodiment of the present invention. The six episodes have the combinations eat-object 1-perception-hunger-(-10)-1, eat-object 2-perception-hunger-(12)-1, eat-object 2-emotion-sadness-(5)-1, eat-object 1-emotion-happiness-(10)-1, put object in mouth-object 3-emotion-fear-(15)-1, and put object in mouth-object 4-emotion-happiness-(8)-1, respectively. Fig. 4 shows the combination of action type, object, category, state type, and change perceived for the currently expressed specific action, and the currently expressed action is "eat object 1". The state that changes in relation to the action "eat object 1" is "hunger", and the change in that state is assumed to be "-20". In addition, the degree to which the change produced by the expressed action influences the representative change is assumed to be "0.1". Accordingly, as shown in Fig. 4B, the episode memory 60 searches for the episode related to the currently expressed action, which has the combination eat-object 1-perception-hunger-(-20). Here, the episode to be detected needs to match the combination for the currently expressed action only in action type, object, category, and state type. Among the episodes shown in Fig. 4A, the first episode is the one related to the currently expressed action, so the episode memory 60 detects "-10" as the existing representative change. The episode memory 60 then calculates the representative change using equation (2) as follows.
$$\text{representative change} = (1 - 0.1) \times (-10) + 0.1 \times (-20) = -11$$
Accordingly, as shown in Fig. 4C, the episode memory 60 stores the new representative change "-11" in the episode related to the currently expressed action, increases the frequency of occurrence by 1, and stores the resulting count "2". The final episode in this case has the combination eat-object 1-perception-hunger-(-11)-2. An example of how memory changes through such learning is shown in Fig. 5, which illustrates a change in memory through action learning according to an embodiment of the present invention and depicts a remembered value of "100" becoming "30" through relearning.
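The lookup and update in the Fig. 4 example can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the class `EpisodeMemory`, its dictionary layout, and the handling of unseen combinations are hypothetical. The existing change of -10 and the generated change of -20 are the signed values that yield the stated result of -11 with p = 0.1.

```python
class EpisodeMemory:
    """Hypothetical store keyed by (action, object, category, state)."""

    def __init__(self, p):
        self.p = p          # influence of a newly generated change, 0 < p < 1
        self.episodes = {}  # key -> (representative change, frequency of occurrence)

    def learn(self, action, obj, category, state, generated_change):
        # Matching uses only action type, object, category, and state type.
        key = (action, obj, category, state)
        # Assumption: an unseen combination starts from change 0 and count 0.
        change, count = self.episodes.get(key, (0.0, 0))
        change = (1 - self.p) * change + self.p * generated_change  # equation (2)
        self.episodes[key] = (change, count + 1)
        return self.episodes[key]


memory = EpisodeMemory(p=0.1)
# First episode of Fig. 4A: eat-object 1-perception-hunger-(-10)-1
memory.episodes[("eat", "object 1", "perception", "hunger")] = (-10.0, 1)

change, count = memory.learn("eat", "object 1", "perception", "hunger", generated_change=-20.0)
print(round(change, 6), count)  # prints -11.0 2
```

Keying the dictionary on the four identifying elements means the generated change never takes part in the lookup, which matches the matching rule stated for Fig. 4B.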
To remember many relationships in a minimal amount of memory, the learning method of the episode memory 60 described above assumes that each perception state and each emotional state are independent of one another. Thus, when a specific behavior is expressed, the change in each perception state and the change in each emotional state are remembered independently, so that a large amount of information can be remembered in a minimal amount of memory. In addition, the episode memory 60 can be configured to execute periodically. Because the episode memory 60 remembers changes in perception states and changes in emotional states, effective learning is achieved only when the episode memory 60 is executed at appropriate intervals.
As described above, the present invention enables the software robot to learn the association between its actions and its internal states. Even when the user does not give an artificial reward or punishment as feedback, the software robot can sense each input and perceive it as a reward or punishment for learning. In addition, the software robot can learn even when an action is related to both a physical state and an emotional state.
Although the present invention has been shown and described with reference to certain exemplary embodiments thereof, it will be understood by those skilled in the art that various changes in form and detail may be made without departing from the spirit and scope of the present invention. Therefore, the spirit and scope of the present invention are not limited by the embodiments described herein, but are defined by the claims and their equivalents.

Claims (12)

1. A method of learning behavior in a software robot, the method comprising the steps of:
detecting the type of an object in an information space that is related to the type of a currently expressed action, and the type and change of at least one state, among a plurality of perception states and emotional states, that is preset to change in relation to the type of the action;
finding, among a plurality of episodes, the episode corresponding to the detected type of action, the type of the object in the information space related to it, and the type of the at least one state, among the perception states and emotional states, preset to change in relation to the type of the action, wherein the plurality of episodes correspond respectively to each of one or more objects in the information space, each of one or more emotional states defined in the software robot, and each of one or more perception states defined in the software robot, in combination with each type of one or more actions, and each episode stores a change related to its state;
calculating a representative change by using the change stored in the found episode and the change produced by the expressed action; and
storing the representative change as the change of the found episode.
2. The method of claim 1, wherein the perception state corresponds to a state that reflects the result of perceiving environmental information of the information space and the physical state of the software robot.
3. The method of claim 2, wherein the number of the plurality of episodes is calculated by the following equation:
$$\text{number of episodes} = (\text{number of perception states} + \text{number of emotional states}) \times \text{number of action types} \times \text{number of objects}$$
wherein the number of perception states equals the number of all perception states defined in the software robot,
the number of emotional states equals the number of all emotional states defined in the software robot,
the number of action types equals the number of types of actions defined in the software robot, and
the number of objects equals the number of all objects present in the information space.
4. The method of claim 3, wherein each of the plurality of episodes further includes category information corresponding to the associated emotional state or the associated perception state.
5. The method of claim 4, wherein each of the plurality of episodes further includes information on the number of occurrences of the associated action.
6. The method of claim 4, wherein the representative change is calculated from the change stored in the found episode and the change produced by the expressed action by the following equation:
$$\text{representative change} = (1 - p) \times \text{existing representative change} + p \times \text{generated change}$$
wherein "p" denotes the degree to which the generated change influences the representative change, is predetermined, and lies in the range 0 < p < 1.
7. An apparatus for learning behavior in a software robot, the apparatus comprising:
a behavior expression unit for expressing an action of the software robot; and
an episode memory unit which:
detects the type of an object in an information space that is related to the type of the action currently expressed by the behavior expression unit, and the type and change of at least one state, among a plurality of perception states and emotional states, that is preset to change in relation to the type of the action;
finds, among a plurality of episodes, the episode corresponding to the detected type of action, the type of the object in the information space related to it, and the type of the at least one state, among the perception states and emotional states, preset to change in relation to the type of the action, wherein the plurality of episodes correspond respectively to each of one or more objects in the information space, each of one or more emotional states defined in the software robot, and each of one or more perception states defined in the software robot, in combination with each type of at least one action, and each episode stores a change related to its state;
calculates a representative change by using the change stored in the found episode and the change produced by the expressed action; and
stores the representative change as the change of the found episode.
8. The apparatus of claim 7, wherein the perception state corresponds to a state that reflects the result of perceiving environmental information of the information space and the physical state of the software robot.
9. The apparatus of claim 8, wherein the episode memory unit calculates the number of the plurality of episodes as defined by the following equation:
$$\text{number of episodes} = (\text{number of perception states} + \text{number of emotional states}) \times \text{number of action types} \times \text{number of objects}$$
wherein the number of perception states equals the number of all perception states defined in the software robot,
the number of emotional states equals the number of all emotional states defined in the software robot,
the number of action types equals the number of types of actions defined in the software robot, and
the number of objects equals the number of all objects present in the information space.
10. The apparatus of claim 9, wherein each of the plurality of episodes further includes category information corresponding to the associated emotional state or the associated perception state.
11. The apparatus of claim 10, wherein each of the plurality of episodes further includes information on the number of occurrences of the associated action.
12. The apparatus of claim 7, wherein the episode memory unit calculates the representative change from the change stored in the found episode and the change produced by the expressed action as defined by the following equation:
$$\text{representative change} = (1 - p) \times \text{existing representative change} + p \times \text{generated change}$$
wherein "p" denotes the degree to which the generated change influences the representative change, is predetermined, and lies in the range 0 < p < 1.
CNA2008100099134A 2007-02-07 2008-02-13 Method and apparatus for learning behavior in software robot Pending CN101276434A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
KR20070012951 2007-02-07
KR10-2007-0012951 2007-02-07
KR10-2007-0061095 2007-06-21

Publications (1)

Publication Number Publication Date
CN101276434A true CN101276434A (en) 2008-10-01

Family

ID=39883565

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2008100099134A Pending CN101276434A (en) 2007-02-07 2008-02-13 Method and apparatus for learning behavior in software robot

Country Status (2)

Country Link
KR (1) KR100909532B1 (en)
CN (1) CN101276434A (en)

Also Published As

Publication number Publication date
KR100909532B1 (en) 2009-07-27
KR20080074008A (en) 2008-08-12


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20081001