CN109550252A - Game AI training method, apparatus and system - Google Patents
- Publication number: CN109550252A
- Application number: CN201811323771.9A
- Authority: CN (China)
- Prior art keywords: data, game, decision, training, current status
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Classifications
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/60—Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor
- A63F13/67—Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor adaptively or by learning from player actions, e.g. skill level adjustment or by storing successful combat sequences for re-use
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/30—Interconnection arrangements between game servers and game devices; Interconnection arrangements between game devices; Interconnection arrangements between game servers
- A63F13/35—Details of game servers
- A63F13/358—Adapting the game course according to the network or server load, e.g. for reducing latency due to different connection speeds between clients
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F2300/00—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
- A63F2300/50—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by details of game servers
- A63F2300/53—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by details of game servers details of basic data processing
- A63F2300/534—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by details of game servers details of basic data processing for network load management, e.g. bandwidth optimization, latency reduction
Abstract
The invention discloses a game AI training method, apparatus and system. The method comprises: while running a game process, transmitting collected current state data to a server; controlling execution of a delay action; obtaining the decision data corresponding to the current state data and ending execution of the delay action; controlling the game AI character being trained to execute a decision action according to the decision data, and generating reward data and subsequent state data; and organizing the current state data, decision data, reward data and subsequent state data into a training sample and transmitting the training sample to the server, so that the server trains a training network on the training sample and updates a decision network according to the data of the training network, until the network converges. Because the client can control execution of the delay action while it waits for the server to return decision data, the game keeps running, so game AI training can proceed normally even for games without a built-in pause function.
Description
Technical field
The present invention relates to the technical field of game artificial intelligence, and in particular to a game AI training method, apparatus and system.
Background art
Artificial intelligence (AI) is technology that enables computers to simulate certain human thought processes and intelligent behaviors (such as learning and planning). Game AI, as applied in the gaming field, mainly uses a series of algorithms to produce reactive, adaptive or intelligent behavior in non-player characters, so that the non-player characters behave intelligently. A good game AI can respond appropriately to changes in its environment, increasing the richness and variety of the game.
To let game AI respond to its environment spontaneously, game AI is currently trained with reinforcement learning: through the program's own exploration and its imitation of human behavior, responses to the environment emerge spontaneously, which removes the need to hand-design game rules. At present, to keep sampling and training synchronized, the game has to be paused during game AI training and resumed once the game AI has finished. That is, the game and the game AI execute alternately: the game runs and generates a series of data, then pauses; the game AI obtains the data and makes a decision while the game waits; once the decision is received, the game runs again. This keeps sampling synchronized with training.
However, many current games, for reasons such as their internal clock, do not support pausing while the game AI is being trained. If the game does not pause, sampling falls out of step with game AI training, and game AI training cannot proceed normally.
Summary of the invention
Embodiments of the present invention provide a game AI training method, apparatus and system, to solve the technical problem in the prior art that game AI training cannot proceed normally for games without a pause function.
To solve the above technical problem, the present invention provides a game AI training method, applicable to a client, the method comprising:
while running a game process, transmitting collected current state data to a server, wherein the game process is distributed by the server;
controlling execution of a delay action, wherein the delay action is the action executed by the game AI character being trained while waiting for the server to return the decision data corresponding to the current state data;
obtaining the decision data corresponding to the current state data, and ending execution of the delay action;
controlling, according to the decision data corresponding to the current state data, the game AI character being trained to execute a decision action, and generating reward data and subsequent state data;
organizing the current state data, the decision data, the reward data and the subsequent state data into a training sample, and transmitting the training sample to the server, so that the server trains its training network on the training sample and updates its decision network according to the data of the training network, until the network converges.
Further, the game process being distributed by the server comprises:
the server dividing the whole game data to be trained into several shares of game data in advance, and distributing the shares to the corresponding clients, wherein each share of game data contains at least one game process to be run.
Further, obtaining the decision data corresponding to the current state data comprises:
detecting the execution time of the delay action;
when the execution time of the delay action exceeds a threshold, randomly selecting data from the decision data returned by the server as the decision data corresponding to the current state data.
Further, there are several game processes; in that case, transmitting the collected current state data to the server while running a game process specifically comprises:
while running the game processes, collecting the current state data corresponding to each game process;
after the current state data of every game process has been collected, transmitting the current state data of all the game processes to the server together as one batch.
Further, obtaining the decision data corresponding to the current state data is specifically:
obtaining together all the decision data corresponding to the current state data of the same batch.
Further, data association information is added to the current state data of each game process in the order of collection, so that each piece of decision data is associated with its corresponding current state data, wherein each piece of decision data carries the same data association information as its corresponding current state data.
The present invention also provides a game AI training apparatus, comprising:
a first state data module, configured to transmit collected current state data to a server while running a game process, wherein the game process is distributed by the server;
a delay action execution module, configured to control execution of a delay action, wherein the delay action is the action executed by the game AI character being trained while waiting for the server to return the decision data corresponding to the current state data;
a first decision data module, configured to obtain the decision data corresponding to the current state data and end execution of the delay action;
a decision action execution module, configured to control, according to the decision data corresponding to the current state data, the game AI character being trained to execute a decision action, and to generate reward data and subsequent state data;
a first training sample module, configured to organize the current state data, the decision data, the reward data and the subsequent state data into a training sample and transmit the training sample to the server, so that the server trains its training network on the training sample and updates its decision network according to the data of the training network, until the network converges.
The present invention also provides a game AI training method, applicable to a server, the method comprising:
receiving current state data sent by a client, wherein the current state data is generated by the client running a game process;
generating corresponding decision data according to the current state data and feeding it back to the client, wherein the decision data is generated while the client controls the game AI character being trained to execute a delay action;
receiving a training sample sent by the client, wherein the training sample is organized by the client from the decision data, the current state data, and the reward data and subsequent state data generated after the decision action is executed according to the decision data;
training a training network according to the training sample and updating a decision network according to the data of the training network, until the network converges.
Further, before receiving the current state data sent by the client, the method further comprises:
dividing the whole game data to be trained into several shares of game data, and distributing the shares to the corresponding clients, wherein each share of game data contains at least one game process to be run.
Further, training the training network according to the training sample and updating the decision network according to the data of the training network specifically comprises:
randomly selecting several training samples, inputting them into the training network, and training the game AI using stochastic gradient descent;
detecting the number of times the training network has been trained on the selected training samples;
when the number of training iterations exceeds a preset number, updating the data of the decision network according to the data of the training network.
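The update schedule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: `train_step` and `sync` are hypothetical callbacks standing in for one stochastic-gradient-descent update of the training network and for copying its data into the decision network, and the batch size, step count and sync interval are assumed values. The sketch syncs whenever the step count reaches a multiple of the preset number, which is only one possible reading of "exceeds a preset number".

```python
import random


def train_with_periodic_sync(samples, train_step, sync,
                             batch_size=32, steps=1000, sync_every=100, seed=0):
    """Train on random minibatches and periodically refresh the decision network.

    train_step(batch): hypothetical callback, one SGD update of the training network.
    sync(): hypothetical callback, copies training-network data into the decision network.
    Returns the number of syncs performed.
    """
    rng = random.Random(seed)
    syncs = 0
    for step in range(1, steps + 1):
        # Randomly select several training samples for this update.
        batch = rng.sample(samples, min(batch_size, len(samples)))
        train_step(batch)
        # Once the preset number of training steps has accumulated,
        # update the decision network from the training network.
        if step % sync_every == 0:
            sync()
            syncs += 1
    return syncs
```

Keeping the decision network fixed between syncs means the clients are served by a stable policy while the training network changes on every step.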
Further, after generating the corresponding decision data according to the current state data and feeding it back to the client, the method further comprises:
deleting the current state data corresponding to the decision data that has been fed back to the client.
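A minimal sketch of this clean-up step, assuming the server keeps received state data in an in-memory dictionary. The class name `PendingStates`, the key scheme and the `decision_network` callable are hypothetical and are used only for illustration.

```python
class PendingStates:
    """Holds current state data only until its decision has been produced."""

    def __init__(self):
        self._states = {}

    def receive(self, key, state):
        # Store state data received from a client under its association key.
        self._states[key] = state

    def decide(self, key, decision_network):
        # Compute the decision, then delete the state data it was computed from.
        decision = decision_network(self._states[key])
        del self._states[key]
        return decision

    def __len__(self):
        return len(self._states)
```

Deleting state data as soon as its decision is produced keeps the server's memory use bounded by the number of requests still in flight.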
The present invention also provides a game AI training apparatus, comprising:
a second state data module, configured to receive current state data sent by a client, wherein the current state data is generated by the client running a game process;
a second decision data module, configured to generate corresponding decision data according to the current state data and feed it back to the client, wherein the decision data is generated while the client controls the game AI character being trained to execute a delay action;
a second training sample module, configured to receive a training sample sent by the client, wherein the training sample is organized by the client from the decision data, the current state data, and the reward data and subsequent state data generated after the decision action is executed according to the decision data;
a training module, configured to train a training network according to the training sample and update a decision network according to the data of the training network, until the network converges.
The present invention also provides a game AI training system, characterized by comprising a server and N clients, N ≥ 1, wherein each client executes the following steps:
while running a game process, transmitting collected current state data to the server, wherein the game process is distributed by the server;
controlling execution of a delay action, wherein the delay action is the action executed by the game AI character being trained while waiting for the server to return the decision data corresponding to the current state data;
obtaining the decision data corresponding to the current state data, and ending execution of the delay action;
controlling, according to the decision data corresponding to the current state data, the game AI character being trained to execute a decision action, and generating reward data and subsequent state data;
organizing the current state data, the decision data, the reward data and the subsequent state data into a training sample, and transmitting the training sample to the server, so that the server trains its training network on the training sample and updates its decision network according to the data of the training network, until the network converges;
and wherein the server, after dividing the whole game data to be trained into several shares of game data and distributing the shares to the corresponding clients, executes the following steps:
receiving the current state data sent by a client, wherein the current state data is generated by the client running the game process;
generating corresponding decision data according to the current state data and feeding it back to the client, wherein the decision data is generated while the client controls the game AI character being trained to execute the delay action;
receiving the training sample sent by the client, wherein the training sample is organized by the client from the decision data, the current state data, and the reward data and subsequent state data generated after the decision action is executed according to the decision data;
training the training network according to the training sample and updating the decision network according to the data of the training network, until the network converges.
With the game AI training method, apparatus and system provided above, the client controls the game AI character being trained to execute a delay action while waiting for the server to return decision data, so the game on the client keeps running and does not need to be paused to wait for the server's decision data. Game AI training can therefore proceed normally even for games without a built-in pause function.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the game AI training platform provided by an embodiment of the present invention;
Fig. 2 is a flowchart of the game AI training method provided by Embodiment 1 of the present invention;
Fig. 3 is a flowchart of one embodiment of step S100 in the embodiment shown in Fig. 2;
Fig. 4 is a flowchart of one embodiment of step S300 in the embodiment shown in Fig. 2;
Fig. 5 is a schematic structural diagram of the deep neural network in an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of the game AI training apparatus provided by Embodiment 2 of the present invention;
Fig. 7 is a flowchart of the game AI training method provided by Embodiment 3 of the present invention;
Fig. 8 is a flowchart of one embodiment of step S704 in the embodiment shown in Fig. 7;
Fig. 9 is a schematic structural diagram of the game AI training apparatus provided by Embodiment 4 of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Embodiment one
The game AI training method provided in this embodiment is executed by a game AI training platform and is carried out during the game development stage; the game AI training platform is used to collect game running data and to train and customize the game AI. Specifically, the method provided in this embodiment is executed by the part of the game AI training platform that runs the game. For example, the platform is divided into a client that samples game data and a server that trains the game AI. Referring to Fig. 1, Fig. 1 is a schematic structural diagram of the game AI training platform provided by an embodiment of the present invention. In that case, the game AI training method provided in this embodiment is executed by the client. The embodiments are described taking a platform divided into a client and a server as an example.
Referring to Fig. 2, Fig. 2 is a flowchart of the game AI training method provided by Embodiment 1 of the present invention. The present invention provides a game AI training method, applicable to a client, the method comprising:
S100, while running a game process, transmitting collected current state data to a server, wherein the game process is distributed by the server.
Here, the current state data is data representing the current state of several game characters in the current game. As the game keeps running, the state of the game characters may change; the current state data represents the state after a game character's most recent action. It may include, but is not limited to, state data of the current game such as the characters' current running state and current action state. For example, a game has characters A, B and C, whose current states are attacking, fleeing and waiting at a specific location, respectively. In general, the current state of the game characters reflects the surroundings of the game AI character in the game.
Specifically, the client runs the game process distributed by the server, which generates the raw running data of the game process. After the raw running data is preprocessed and recognized, it is converted into current state data usable for decision-making, and the current state data is transmitted to the server over a connection established with the server.
Optionally, the game processes run by the client may be the game processes in all of the game data that the server distributed to that client, or the game processes in part of the game data that the server distributed to that client.
Preferably, the game process being distributed by the server comprises:
the server dividing the whole game data to be trained into several shares of game data in advance, and distributing the shares to the corresponding clients, wherein each share of game data contains at least one game process to be run.
Specifically, the whole game data is divided into several shares and distributed to the corresponding clients, and multiple clients run their respective game data. Running the game data in this distributed way speeds up data generation, gives high sampling efficiency and shortens the game development cycle, and it accelerates data generation even for games without a built-in speed-up function. At the same time, the current state data of several clients is sent to the server for processing, which makes full use of the parallel processing capability of the server's GPU (graphics processing unit) and reduces development cost.
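The division step can be sketched as below. The text does not say how the game data is divided, so the round-robin split and the function name `split_game_data` are assumptions made only for illustration; the one property taken from the text is that every share handed to a client contains at least one game process.

```python
def split_game_data(game_processes, num_clients):
    """Split a list of game processes into per-client shares (round-robin).

    Empty shares are dropped, so every returned share holds at least
    one game process to be run.
    """
    if num_clients < 1:
        raise ValueError("num_clients must be at least 1")
    shares = [game_processes[i::num_clients] for i in range(num_clients)]
    return [share for share in shares if share]
```

With fewer game processes than clients, some clients simply receive no share under this scheme.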
Referring to Fig. 3, Fig. 3 is a flowchart of one embodiment of step S100 in the embodiment shown in Fig. 2. Preferably, there are several game processes; in that case, in step S100, transmitting the collected current state data to the server while running a game process specifically comprises:
S101, while running the game processes, collecting the current state data corresponding to each game process;
S102, after the current state data of every game process has been collected, transmitting the current state data of all the game processes to the server together as one batch.
Specifically, the client runs several game processes, each of which generates its own current state data. Once the current state data of every game process has been collected, it is sent to the server as one batch, which reduces the number of transmissions. Moreover, a game environment generally contains multiple game characters, and each character may be controlled by a different game process. Sending the current state data of all game processes together gives the server a more complete picture of the game environment as a whole, so the decision data it returns fits the current game environment better.
Preferably, data association information is added to the current state data of each game process in the order of collection, so that each piece of decision data is associated with its corresponding current state data, wherein each piece of decision data carries the same data association information as its corresponding current state data.
Here, the data association information is information that associates each piece of decision data with its corresponding current state data. The data association information added to a piece of decision data and to its corresponding current state data is identical, so the client can obtain the decision data that corresponds to a given piece of current state data.
Optionally, the data association information is SID information, which may include the process ID of the game process corresponding to the current state data and the client's timestamp at the moment the current state data was collected.
Optionally, when multiple clients run game data in a distributed manner, the corresponding client information may be added to the data association information. The client information may be the MAC address of the client.
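One way to picture the data association information is as a small hashable key attached to every state in a batch and echoed back with every decision. The sketch below assumes the three fields named above (client information, process ID, timestamp); the class and function names are hypothetical, and a real system would also have to serialize these keys for transport.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class SID:
    client_id: str    # e.g. the client's MAC address
    process_id: int   # ID of the game process that produced the state
    timestamp: float  # client time when the state was collected


def tag_batch(client_id, states_by_process, now=None):
    """Attach an SID to the current state of every game process in one batch."""
    now = time.time() if now is None else now
    return [(SID(client_id, pid, now), state)
            for pid, state in states_by_process.items()]


def match_decisions(batch, decisions_by_sid):
    """Pair each state with the decision that carries the same SID."""
    return [(state, decisions_by_sid[sid]) for sid, state in batch]
```

Because the key travels with both the state and the decision, the client can pair them up even if the server returns the decisions of a batch in a different order.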
S200, controlling execution of a delay action, wherein the delay action is the action executed by the game AI character being trained while waiting for the server to return the decision data corresponding to the current state data.
Here, decision data is data that the server obtains by performing a series of calculations, analyses and simulations on the current state data, and that is used to control the game AI character to perform a behavior responding to its game environment. The decision data is computed by the server's decision network. Specifically, the decision network is a deep neural network that evaluates the value of executing each decision in each state and computes the decision data.
Specifically, after the client transmits the current state data to the server, the server needs a certain amount of time to respond and return the corresponding decision data to the client. While waiting for the server to return the decision data corresponding to the current state data, the client's game keeps running and the game AI character being trained is controlled to execute a delay action. In this way, the client's game does not need to pause while the client waits for the server's decision data; the game keeps running, with the game AI character executing the delay action during the wait, so games without a pause function can also undergo game AI training normally.
Optionally, the delay action may be the game AI character staying where it is, or some other random action.
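The waiting behavior of step S200 can be sketched with a background request and a loop that keeps the game stepping. This is an illustrative sketch only: `request_decision` and `do_delay_action` are hypothetical callbacks for the server round trip and for one game tick in which the character performs the delay action, and the `tick` sleep is an assumed stand-in for the game's own frame timing.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def act_while_waiting(request_decision, state, do_delay_action, tick=0.01):
    """Send the state to the server and keep the game running until it answers.

    Returns the decision and the number of delay actions executed meanwhile.
    """
    delay_steps = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(request_decision, state)
        while not future.done():
            do_delay_action()   # e.g. stay in place or take a random action
            delay_steps += 1
            time.sleep(tick)    # stand-in for one game frame
        return future.result(), delay_steps
```

The loop never blocks on the server, so the game itself is never paused; the wait is absorbed by the delay actions.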
S300, obtaining the decision data corresponding to the current state data, and ending execution of the delay action.
Referring to Fig. 4, Fig. 4 is a flowchart of one embodiment of step S300 in the embodiment shown in Fig. 2. Preferably, obtaining the decision data corresponding to the current state data comprises:
S301, detecting the execution time of the delay action;
S302, when the execution time of the delay action exceeds a threshold, randomly selecting data from the decision data returned by the server as the decision data corresponding to the current state data.
Specifically, the execution time of the delay action is monitored, and when it exceeds the threshold, data is randomly selected from the decision data previously returned by the server and used as the decision data corresponding to the current state data. This avoids the situation where the server's decision data returns too slowly, for reasons such as an unstable network, so that the game AI character stays stuck executing the delay action and sampling efficiency drops. Randomly selecting data from the decision data returned by the server also further ensures that the game keeps running without being paused.
Optionally, the threshold for the execution time may be set to 100 ms.
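Steps S301 and S302 can be sketched as a wait loop with a deadline. The function name, the `future` object representing the pending server reply, and the callbacks are assumptions for illustration; the 100 ms default follows the optional threshold mentioned above. The sketch assumes `previous_decisions` is non-empty, since `random.choice` raises an error on an empty sequence.

```python
import random
import time


def wait_with_deadline(future, do_delay_action, previous_decisions,
                       threshold_s=0.1, tick=0.01, rng=random):
    """Wait for the server's decision, falling back to a random earlier one.

    Returns (decision, used_fallback).
    """
    start = time.monotonic()
    while not future.done():
        # S301: measure how long the delay action has been executing.
        if time.monotonic() - start > threshold_s:
            # S302: threshold exceeded, reuse a decision returned earlier.
            return rng.choice(previous_decisions), True
        do_delay_action()
        time.sleep(tick)  # stand-in for one game frame
    return future.result(), False
```

The fallback trades decision quality for sampling throughput: a stale or random decision still produces a usable training sample, whereas an indefinitely idle character produces none.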
Preferably, obtaining the decision data corresponding to the current state data is specifically:
obtaining together all the decision data corresponding to the current state data of the same batch.
Specifically, when there are several game processes, the client collects the current state data of every game process and then transmits it to the server together as one batch; accordingly, when the client obtains decision data, it obtains from the server, in one go, all the decision data corresponding to the current state data of that batch.
S400, controlling, according to the decision data corresponding to the current state data, the game AI character being trained to execute a decision action, and generating reward data and subsequent state data.
Here, a decision action is the action executed according to the decision data and is the response to the current state. For example, when it is known that character B is fleeing and the decision action is to pursue character B, the game AI character being trained executes the decision action, responding to character B's fleeing state by pursuing character B. Reward data is the result of executing the decision action and may be positive or negative. In general, positive reward data means the result of the action is "good" and contributes positively to reaching the final goal. The reward data is determined by a designed reward function. Through long-term training, the game AI character being trained interacts with its environment and observes the reward in each state in order to learn action sequences. Subsequent state data is the current state data after the decision action has been executed; it also serves as the current state data subsequently transmitted to the server, so the client samples data continuously.
S500, organizing the current state data, the decision data, the reward data and the subsequent state data into a training sample, and transmitting the training sample to the server, so that the server trains its training network on the training sample and updates its decision network according to the data of the training network, until the network converges.
Here, the current state data, decision data, reward data and subsequent state data in a training sample correspond to one another; the server's training network and decision network are both deep neural networks.
Optionally, the data of the training network and of the decision network may be set to be identical at the start.
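A training sample as described in S400 and S500 is a four-part record. The sketch below uses a toy one-dimensional chase in which the reward is the reduction in distance to the target; this reward function, the `chase` environment and all names are hypothetical and serve only to show how the four fields line up.

```python
from typing import Any, NamedTuple


class TrainingSample(NamedTuple):
    state: Any        # current state data
    decision: Any     # decision data returned by the server
    reward: float     # reward data from executing the decision action
    next_state: Any   # subsequent state data


def step_and_record(state, decision, apply_decision):
    """Execute a decision action and organize the result into a training sample."""
    reward, next_state = apply_decision(state, decision)
    return TrainingSample(state, decision, reward, next_state)


def chase(state, move):
    """Toy environment: state is (own position, target position) on a line."""
    pos, target = state
    new_pos = pos + move
    # Positive reward when the move closes the distance, negative when it widens it.
    reward = abs(target - pos) - abs(target - new_pos)
    return reward, (new_pos, target)
```

The `next_state` of one sample is what the client would send to the server as the next current state data, which is how sampling continues without interruption.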
Referring to Fig. 5, Fig. 5 is the structural schematic diagram of the deep neural network in the embodiment of the present invention;Optionally, depth mind
Structure through network can be set as including 2 convolutional layers and 2 full articulamentums, be specifically divided into the first convolutional layer 51, the second convolutional layer
52, the first full articulamentum 53, the second full articulamentum 54;Convolutional layer is used for feature extraction, and full articulamentum is determined as classifier calculated
The decision in plan space is distributed;Each convolutional layer is all made of batch standardization and relu activation primitive, and each full articulamentum is all made of
Relu activation primitive.
Optionally, to balance the learning capability of the game AI against computational cost, the parameters of the convolutional layers and fully connected layers can be adjusted for the specific game. In general, the convolution kernel of the first convolutional layer 51 can be larger, the stride is generally 1 to 4, and the number of channels is a power of 2; the more channels, the stronger the learning capability of the model, but the greater the corresponding computational cost. For example, the first convolutional layer 51 has a 6 × 6 kernel, 8 channels and a stride of 3; the second convolutional layer 52 has a 3 × 3 kernel, 8 channels and a stride of 2; the first fully connected layer 53 has 128 channels (output units); and the number of channels of the second fully connected layer 54 equals the size of the decision space.
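As a rough illustration of the example parameters above, the following sketch computes the tensor shapes that flow through the four layers. It assumes unpadded convolutions and a hypothetical 84 × 84 input with 5 possible decisions; the text specifies neither the padding, the input size nor the size of the decision space, so these values are assumptions only.

```python
def conv_out(size, kernel, stride):
    """Output length along one axis of an unpadded convolution (assumption: no padding)."""
    return (size - kernel) // stride + 1


def network_shapes(height, width, num_decisions):
    """Shapes after each layer for the example parameters given in the text."""
    # First convolutional layer 51: 6 x 6 kernel, 8 channels, stride 3.
    h1, w1 = conv_out(height, 6, 3), conv_out(width, 6, 3)
    # Second convolutional layer 52: 3 x 3 kernel, 8 channels, stride 2.
    h2, w2 = conv_out(h1, 3, 2), conv_out(w1, 3, 2)
    return {
        "conv1": (8, h1, w1),
        "conv2": (8, h2, w2),
        "flatten": 8 * h2 * w2,   # input width of the first fully connected layer 53
        "fc1": 128,               # first fully connected layer 53
        "fc2": num_decisions,     # second fully connected layer 54: one output per decision
    }


# Hypothetical input: an 84 x 84 game frame and a decision space of size 5.
shapes = network_shapes(84, 84, 5)
```

With a larger first-layer kernel and stride, the spatial size shrinks quickly, which is one way the example keeps the first fully connected layer small.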
It should be noted that the training network needs to perform a preset number of training iterations, and the parameters of the decision network are updated periodically; the preset number is set according to the game data of the specific game.
In a specific implementation, when the client runs a game process distributed by the server, the collected current state data is transmitted to the server. While the client waits for the server to return the decision data corresponding to the current state data, it controls the execution of a delay action, so that the game keeps running during the wait without being paused. The client then obtains the decision data corresponding to the current state data and ends the execution of the delay action; controls the game AI role being trained to execute a decision action according to that decision data, generating reward data and subsequent state data; organizes the current state data, the decision data, the reward data and the subsequent state data into a training sample; and transmits the training sample to the server, so that the server trains its training network on the training sample and updates its decision network according to the data of the training network, until the network converges. In this way the training of the game AI completes normally while the game runs continuously.
With the game AI training method provided in this embodiment, while the client waits for the server to return decision data, it controls the game AI role being trained to execute a delay action, so that the game on the client keeps running and does not need to be paused to wait for the server's decision data. Games without a built-in pause function can therefore also be used for game AI training.
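The client-side flow described above can be sketched as follows. This is a minimal illustration, not the patented implementation: `ToyGame`, `slow_server_policy` and the `NOOP` delay action are hypothetical stand-ins, and a thread pool plays the role of the network round trip to the server.

```python
import concurrent.futures
import time

NOOP = 0  # hypothetical delay action: keeps the game ticking without a real decision


class ToyGame:
    """Hypothetical stand-in for a game that cannot be paused."""

    def __init__(self):
        self.frame = 0
        self.actions = []

    def observe(self):
        return self.frame

    def step(self, action):
        self.actions.append(action)
        self.frame += 1
        reward = 1.0 if action != NOOP else 0.0
        return reward, self.frame


def slow_server_policy(state):
    """Hypothetical server: takes a while, then returns a decision for the state."""
    time.sleep(0.05)
    return 1 + state % 3


def client_step(game, pool):
    """Send the state, run delay actions while waiting, then act and build a sample."""
    state = game.observe()
    future = pool.submit(slow_server_policy, state)  # transmit current state data
    while not future.done():                         # game keeps running, never paused
        game.step(NOOP)
        time.sleep(0.01)
    decision = future.result()                       # delay action ends here
    reward, next_state = game.step(decision)         # execute the decision action
    return (state, decision, reward, next_state)     # one training sample
```

The returned tuple is what would be sent on to the server as a training sample; how the delay action is chosen for a given game is left open here.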
To aid understanding, the implementation process of the game AI training method provided in this embodiment is further described below:
Referring to Fig. 1, Fig. 1 is a structural schematic diagram of the game AI training platform provided in an embodiment of the present invention. A server and a client are built on the game AI training platform, and the client may execute the game AI training method on one or more hosts. A first exchange data pool, a game management module and a training sample organization module are established on the client; a second exchange data pool, a decision network and a training network are created on the server. The first exchange data pool and the second exchange data pool exchange data over a network connection; the game management module can run game data, generate running data, and so on; and the training sample organization module can organize the relevant data into training samples for the server to train on.
Further, the first exchange data pool of the client and the second exchange data pool of the server each contain a state pool, a decision pool and a training pool; the decision network and the training network are deep neural networks containing function models.
The client runs the game data distributed by the server in the game management module, generates current state data after preprocessing and recognizing the running data, and stores the current state data in the client's state pool. When the running game data contains several game processes, the client may, after collecting the current state data of every game process, transmit the current state data of all game processes to the server's state pool as one batch.
At this point, the client controls the game AI role being trained to execute the delay action.
Under normal conditions, within a certain time the client obtains the decision data corresponding to the current state data from the server's decision pool and stores it in the client's decision pool. The decision data is generated as follows: after the server's state pool receives the current state data, the server preprocesses it and passes it to the server's decision network, which analyzes and computes on the current state data to obtain the decision data. Under abnormal conditions, when the client has not obtained the corresponding decision data from the server within a certain time threshold, the client may randomly select one of the previously obtained decision data stored in its decision pool as the decision data corresponding to the current state data.
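The normal and abnormal cases above can be sketched with a timeout on the pending reply. This is an illustrative sketch only: `obtain_decision` and its parameters are hypothetical names, and a `concurrent.futures.Future` stands in for the pending server response.

```python
import concurrent.futures
import random


def obtain_decision(future, decision_pool, timeout_s, rng=random):
    """Return the server's decision, or a random stored one if the wait times out.

    future        -- pending server reply (stand-in for the network call)
    decision_pool -- list of decisions previously obtained from the server
    timeout_s     -- time threshold for the abnormal case, in seconds
    """
    try:
        decision = future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        if not decision_pool:
            raise  # nothing stored yet, so there is no fallback to draw from
        return rng.choice(decision_pool)
    decision_pool.append(decision)  # keep it for possible later fallbacks
    return decision
```

Re-raising when the pool is empty is a design choice of this sketch; the text does not say what happens if no decision data has been stored yet.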
After the client obtains the decision data, execution of the delay action stops, and the client executes the decision action according to the decision data, generating reward data and subsequent state data.
The training sample organization module of the client obtains the current state data and the decision data from the client's state pool and decision pool respectively, organizes them together with the corresponding reward data and subsequent state data into a training sample, and stores the sample in the client's training pool.
The training samples stored in the client's training pool are transmitted to the server's training pool for storage. The server's training pool passes randomly selected training samples to the training network, which trains the game AI role and periodically updates the decision network, until the network converges.
Embodiment two
Referring to Fig. 6, Fig. 6 is a structural schematic diagram of a game AI training device provided by Embodiment two of the present invention. Embodiment two of the present invention further provides a game AI training device, comprising:
First state data module 11, configured to transmit the collected current state data to the server when a game process is running; wherein the game process is distributed by the server;
Delay action execution module 12, configured to control the execution of a delay action; wherein the delay action is the action executed by the game AI role being trained while waiting for the server to return the decision data corresponding to the current state data;
First decision data module 13, configured to obtain the decision data corresponding to the current state data and to end the execution of the delay action;
Decision action execution module 14, configured to control, according to the decision data corresponding to the current state data, the game AI role being trained to execute a decision action, and to generate reward data and subsequent state data;
First training sample module 15, configured to organize the current state data, the decision data, the reward data and the subsequent state data into a training sample and transmit the training sample to the server, so that the server trains its training network on the training sample and updates its decision network according to the data of the training network, until the network converges.
Preferably, the first decision data module 13 further includes:
Detection unit, configured to detect the execution time of the delay action;
Random decision data obtaining unit, configured to, when the execution time of the delay action exceeds a threshold, randomly select data from the decision data returned by the server as the decision data corresponding to the current state data.
Preferably, there are several game processes; the first state data module 11 then further includes:
Current state data collection unit, configured to collect the current state data corresponding to each game process when the game processes are running;
Current state data unified transmission unit, configured to, after the current state data of every game process has been collected, transmit the current state data of all game processes to the server together as one batch.
Preferably, the first decision data module 13 further includes:
Unified decision data obtaining unit, configured to obtain together the decision data corresponding to all current state data of the same batch.
Preferably, the game AI training device further includes:
Data association information writing unit, configured to add data association information to the current state data of each game process in order of collection, so that each decision data is associated with its corresponding current state data; wherein each decision data carries the same data association information as its corresponding current state data.
In the technical solution provided in this embodiment, the first state data module 11 transmits the collected current state data to the server when the client runs a game process, the game process being distributed by the server. The delay action execution module 12 controls the execution of a delay action, which is the action executed by the game AI role being trained while waiting for the server to return the decision data corresponding to the current state data. The first decision data module 13 obtains the decision data corresponding to the current state data and ends the execution of the delay action. The decision action execution module 14 then controls the game AI role being trained to execute a decision action according to that decision data, generating reward data and subsequent state data. Finally, the first training sample module 15 organizes the current state data, the decision data, the reward data and the subsequent state data into a training sample and transmits it to the server, so that the server trains its training network on the training sample and updates its decision network according to the data of the training network, until the network converges. With this arrangement, while waiting for the server to return decision data, the client controls the game AI role being trained to execute a delay action, so that the game on the client keeps running and does not need to be paused to wait for the server's decision data. Games without a built-in pause function can therefore also be used for game AI training.
It should be noted that the game AI training device provided by Embodiment two of the present invention is used to perform the steps of the game AI training method of any item of Embodiment one above; the working principles and beneficial effects of the two correspond one to one and are therefore not repeated.
Those skilled in the art will understand that the schematic diagram is merely an example of the game AI training device and does not limit it; the device may include more or fewer components than illustrated, combine certain components, or use different components. For example, the game AI training device may also include input/output devices, network access devices, buses and the like.
Embodiment three
The game AI training method provided in this embodiment is executed by the server; in this embodiment, the description assumes that the game AI training platform is divided into a client and a server. Referring to Fig. 7, Fig. 7 is a flow chart of a game AI training method provided by Embodiment three of the present invention.
Embodiment three of the present invention further provides a game AI training method suitable for a server, the method including:
S701: receive the current state data sent by the client; wherein the current state data is generated by the client running a game process;
S702: generate corresponding decision data according to the current state data and feed it back to the client; wherein the decision data is generated while the client controls the game AI role being trained to execute the delay action;
Specifically, after the server receives the current state data, and while the client controls the game AI being trained to execute the delay action, the server performs a series of analysis and computation on the current state data, generates the decision data corresponding to the current state data, and returns it to the corresponding client. The decision data is produced by the data processing of the server's decision network.
S703: receive the training sample sent by the client, wherein the training sample is organized by the client from the decision data, the current state data, and the reward data and subsequent state data generated after the decision action is executed according to the decision data;
Here, the current state data, decision data, reward data and subsequent state data in a training sample correspond to one another.
S704: train the training network according to the training sample and update the decision network according to the data of the training network, until the network converges.
Here, the training network and the decision network of the server are both deep neural networks.
Optionally, the data of the training network and of the decision network may be set to be identical at the start.
Referring to Fig. 5, Fig. 5 is a structural schematic diagram of the deep neural network in this embodiment of the present invention. Optionally, the deep neural network may comprise 2 convolutional layers and 2 fully connected layers, namely a first convolutional layer 51, a second convolutional layer 52, a first fully connected layer 53 and a second fully connected layer 54. The convolutional layers perform feature extraction, and the fully connected layers act as a classifier that computes the decision distribution over the decision space. Each convolutional layer uses batch normalization and a ReLU activation function, and each fully connected layer uses a ReLU activation function.
Optionally, to balance the learning capability of the game AI against computational cost, the parameters of the convolutional layers and fully connected layers can be adjusted for the specific game. In general, the convolution kernel of the first convolutional layer 51 can be larger, the stride is generally 1 to 4, and the number of channels is a power of 2; the more channels, the stronger the learning capability of the model, but the greater the corresponding computational cost. For example, the first convolutional layer 51 has a 6 × 6 kernel, 8 channels and a stride of 3; the second convolutional layer 52 has a 3 × 3 kernel, 8 channels and a stride of 2; the first fully connected layer 53 has 128 channels (output units); and the number of channels of the second fully connected layer 54 equals the size of the decision space.
It should be noted that the training network needs to perform a preset number of training iterations, and the parameters of the decision network are updated periodically.
Preferably, before receiving the current state data sent by the client, the method further includes:
Dividing the whole game data to be trained on into several parts, and distributing the parts to corresponding clients; wherein each part of game data includes at least one game process to be run.
Specifically, the server establishes connections with several clients; the clients run their corresponding game data and each communicate with the server. The game data the server distributes to the clients may be identical, or different clients may be given different game data so that they handle different tasks. With this design, the whole game data is run in a distributed manner, which speeds up data generation, improves sampling efficiency and shortens the development cycle; and sending the current state data of several clients to the server for decision generation makes full use of the parallel processing capability of the server's graphics processing unit (GPU), saving development cost.
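One simple way to divide the game data is a round-robin split of the game processes across clients, sketched below. The function name and the use of integer process identifiers are assumptions for illustration; the text does not prescribe how the division is made.

```python
def split_game_data(process_ids, num_clients):
    """Divide game processes into at most num_clients parts, round-robin.

    Empty parts are dropped so that every distributed part contains at
    least one game process to be run.
    """
    parts = [[] for _ in range(num_clients)]
    for index, pid in enumerate(process_ids):
        parts[index % num_clients].append(pid)
    return [part for part in parts if part]


# Hypothetical example: 7 game processes spread over 3 clients.
parts = split_game_data(list(range(7)), 3)
```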
Optionally, when the received current state data carries data association information, the server writes data association information into the generated decision data; specifically, the data association information written into the decision data is identical to that of the corresponding current state data.
Optionally, the data association information is SID information, which may include the process ID of the game process corresponding to the current state data and the client's timestamp at the moment the current state data was collected.
Optionally, when multiple clients run game data in a distributed manner, the corresponding client information may be added to the data association information. The client information may be the MAC address of the client.
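The data association information described above can be sketched as a small immutable record that travels with both the state and the returned decision. The field names, the `tag_states` and `match_decisions` helpers and the example MAC address are hypothetical; only the three kinds of information (process ID, client timestamp, client MAC address) come from the text.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class SID:
    """Data association information (field names are assumptions)."""
    client_mac: str
    process_id: int
    timestamp: float


def tag_states(states_by_pid, client_mac, clock=time.time):
    """Attach a SID to each process's current state, in collection order."""
    return [(SID(client_mac, pid, clock()), state)
            for pid, state in states_by_pid.items()]


def match_decisions(batch, tagged_decisions):
    """Pair each state with the decision carrying the identical SID."""
    by_sid = dict(tagged_decisions)
    return [(state, by_sid[sid]) for sid, state in batch]
```

Because the record is frozen it is hashable, so decisions can be matched to states with a dictionary lookup even when the server returns them in a different order.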
Referring to Fig. 8, Fig. 8 is a flow chart of one embodiment of step S704 in the embodiment shown in Fig. 7. Preferably, training the training network according to the training sample and updating the decision network according to the data of the training network specifically includes:
S801: randomly select several training samples, input them to the training network, and train the game AI using stochastic gradient descent;
Specifically, the server may store the training samples obtained from clients; when training the game AI, it randomly selects several of the stored training samples and trains with stochastic gradient descent. Because the set of training samples is large, stochastic gradient descent avoids having to input all training samples for each training step, which reduces the demand on compute and memory resources and speeds up training.
Optionally, when the number of training samples stored on the server is detected to exceed a preset value, some training samples are deleted; specifically, the server may delete samples in the order in which they were stored, starting from the earliest.
S802: count the number of times the training network has been trained with the selected training samples;
S803: when the number of training iterations exceeds the preset number, update the data of the decision network according to the data of the training network.
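Steps S801 to S803 can be sketched as a loop that counts training iterations and copies the training network's data into the decision network once the count exceeds the preset number. In this sketch the networks are plain dictionaries of parameters and `sgd_step` is a caller-supplied placeholder for one stochastic gradient descent update; resetting the counter after each update is an assumption, since the text only says the update is periodic.

```python
import copy


def train_with_periodic_sync(training_params, decision_params, batches,
                             sgd_step, preset_times):
    """Train on each batch and periodically refresh the decision network.

    Returns the number of training iterations since the last update.
    """
    count = 0
    for batch in batches:
        sgd_step(training_params, batch)   # S801: one training iteration
        count += 1                         # S802: count the iterations
        if count > preset_times:           # S803: exceeds the preset number
            decision_params.clear()
            decision_params.update(copy.deepcopy(training_params))
            count = 0                      # assumption: restart the count
    return count
```

Keeping the decision network fixed between updates means the decisions returned to clients come from a stable set of parameters while the training network keeps changing.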
Preferably, after generating corresponding decision data according to the current state data and feeding it back to the client, the method further includes:
Deleting the current state data corresponding to the decision data that has been fed back to the client.
In this preferred embodiment, deleting the current state data corresponding to decision data already fed back to the client avoids wasting memory; further, the decision data already fed back to the client may also be deleted.
In a specific implementation, the server receives the current state data sent by the client, generates corresponding decision data according to the current state data and feeds it back to the client, wherein the decision data is generated while the client controls the game AI role being trained to execute the delay action. The server then receives the training sample sent by the client, trains the training network according to the training sample and updates the decision network according to the data of the training network, until the network converges.
With the game AI training method provided by Embodiment three of the present invention, decision data can be generated while the game on the client keeps running, specifically while the client controls the game AI role being trained to execute the delay action, so that games without a built-in pause function can also be used for game AI training.
Embodiment four
Referring to Fig. 9, Fig. 9 is a structural schematic diagram of a game AI training device provided by Embodiment four of the present invention. The present invention further provides a game AI training device, comprising:
Second state data module 21, configured to receive the current state data sent by the client; wherein the current state data is generated by the client running a game process;
Second decision data module 22, configured to generate corresponding decision data according to the current state data and feed it back to the client; wherein the decision data is generated while the client controls the game AI role being trained to execute the delay action;
Second training sample module 23, configured to receive the training sample sent by the client, wherein the training sample is organized by the client from the decision data, the current state data, and the reward data and subsequent state data generated after the decision action is executed according to the decision data;
Training module 24, configured to train the training network according to the training sample and update the decision network according to the data of the training network, until the network converges.
Preferably, the game AI training device further includes:
Distribution module, configured to divide the whole game data to be trained on into several parts and distribute the parts to corresponding clients; wherein each part of game data includes at least one game process to be run.
Preferably, the training module 24 further includes:
Training unit, configured to randomly select several training samples, input them to the training network, and train the game AI using stochastic gradient descent;
Training count detection unit, configured to count the number of times the training network has been trained with the selected training samples;
Updating unit, configured to update the data of the decision network according to the data of the training network when the number of training iterations exceeds the preset number.
In the technical solution provided in this embodiment, the second state data module 21 receives the current state data sent by the client; the second decision data module 22 generates corresponding decision data according to the current state data and feeds it back to the client, wherein the decision data is generated while the client controls the game AI role being trained to execute the delay action; the second training sample module 23 receives the training sample sent by the client; and the training module 24 trains the training network according to the training sample and updates the decision network according to the data of the training network, until the network converges. With this arrangement, decision data can be generated while the game on the client keeps running, specifically while the client controls the game AI role being trained to execute the delay action, so that games without a built-in pause function can also be used for game AI training.
It should be noted that the game AI training device provided by Embodiment four of the present invention is used to perform the steps of the game AI training method of any item of Embodiment three above; the working principles and beneficial effects of the two correspond one to one and are therefore not repeated.
Those skilled in the art will understand that the schematic diagram is merely an example of the game AI training device and does not limit it; the device may include more or fewer components than illustrated, combine certain components, or use different components. For example, the game AI training device may also include input/output devices, network access devices, buses and the like.
Embodiment five
Embodiment five of the present invention further provides a game AI training system comprising a server and N clients, N ≥ 1; wherein each client performs the following steps:
When running a game process, transmitting the collected current state data to the server; wherein the game process is distributed by the server;
Controlling the execution of a delay action; wherein the delay action is the action executed by the game AI role being trained while waiting for the server to return the decision data corresponding to the current state data;
Obtaining the decision data corresponding to the current state data, and ending the execution of the delay action;
Controlling, according to the decision data corresponding to the current state data, the game AI role being trained to execute a decision action, and generating reward data and subsequent state data;
Organizing the current state data, the decision data, the reward data and the subsequent state data into a training sample, and transmitting the training sample to the server, so that the server trains its training network on the training sample and updates its decision network according to the data of the training network, until the network converges;
After dividing the whole game data to be trained on into several parts and distributing the parts to corresponding clients, the server performs the following steps:
Receiving the current state data sent by the client; wherein the current state data is generated by the client running the game process;
Generating corresponding decision data according to the current state data and feeding it back to the client; wherein the decision data is generated while the client controls the game AI role being trained to execute the delay action;
Receiving the training sample sent by the client, wherein the training sample is organized by the client from the decision data, the current state data, and the reward data and subsequent state data generated after the decision action is executed according to the decision data;
Training the training network according to the training sample and updating the decision network according to the data of the training network, until the network converges.
It should be noted that the steps executed by each client and by the server in the game AI training system provided by Embodiment five of the present invention correspond to the steps of the game AI training methods of Embodiment one and Embodiment three respectively; the working principles and beneficial effects correspond one to one and are therefore not repeated.
Optionally, the steps executed by each client in the game AI training system may also include the steps of the game AI training method of any item of Embodiment one, and the steps executed by the server may also include the steps of the game AI training method of any item of Embodiment three; the working principles and beneficial effects correspond one to one and are therefore not repeated.
It should be noted that the device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the device embodiments provided by the present invention, a connection between modules indicates a communication connection between them, which may be implemented as one or more communication buses or signal lines. Those of ordinary skill in the art can understand and implement this without creative effort.
The above are preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make various improvements and modifications without departing from the principle of the present invention, and such improvements and modifications are also regarded as falling within the protection scope of the present invention.
Claims (13)
1. A game AI training method, characterized in that it is suitable for a client, the method comprising:
When running a game process, transmitting the collected current state data to a server; wherein the game process is distributed by the server;
Controlling the execution of a delay action; wherein the delay action is the action executed by the game AI role being trained while waiting for the server to return the decision data corresponding to the current state data;
Obtaining the decision data corresponding to the current state data, and ending the execution of the delay action;
Controlling, according to the decision data corresponding to the current state data, the game AI role being trained to execute a decision action, and generating reward data and subsequent state data;
Organizing the current state data, the decision data, the reward data and the subsequent state data into a training sample, and transmitting the training sample to the server, so that the server trains its training network on the training sample and updates its decision network according to the data of the training network, until the network converges.
2. The game AI training method according to claim 1, characterized in that the process by which the game process is distributed by the server comprises:
The server dividing, in advance, the whole game data to be trained on into several parts, and distributing the parts to corresponding clients; wherein each part of game data includes at least one game process to be run.
3. The game AI training method according to claim 1, characterized in that obtaining the decision data corresponding to the current state data comprises:
Detecting the execution time of the delay action;
When the execution time of the delay action exceeds a threshold, randomly selecting data from the decision data returned by the server as the decision data corresponding to the current state data.
4. The game AI training method according to claim 1, characterized in that there are several game processes; transmitting the collected current state data to the server when running a game process then specifically comprises:
When running the game processes, collecting the current state data corresponding to each game process;
After the current state data of every game process has been collected, transmitting the current state data of all game processes to the server together as one batch.
5. The game AI training method according to claim 4, characterized in that obtaining the decision data corresponding to the current state data is specifically:
Obtaining together the decision data corresponding to all current state data of the same batch.
6. The game AI training method according to claim 4, characterized in that data association information is added to the current status data of each game process according to the order of collection, so that each piece of decision data is associated with its corresponding current status data; wherein each piece of decision data carries the same data association information as its corresponding current status data.
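The batching and association described in claims 4 to 6 can be sketched as follows. The field name `assoc_id`, the helper names and the toy decision rule are assumptions made for illustration; the claims do not specify the format of the data association information.

```python
def build_batch(states_in_collection_order):
    """Tag each (process, state) pair with an id that follows collection order."""
    return [
        {"assoc_id": i, "process": process, "state": state}
        for i, (process, state) in enumerate(states_in_collection_order)
    ]


def fake_server_decide_batch(batch):
    """Hypothetical server: echoes each assoc_id back with its decision.

    The replies are deliberately returned in reverse order to show that
    matching relies on the association id, not on position.
    """
    return [
        {"assoc_id": item["assoc_id"], "decision": sum(item["state"]) % 3}
        for item in reversed(batch)
    ]


def match_decisions(batch, decisions):
    """Map each game process to the decision that carries the same assoc_id."""
    by_id = {d["assoc_id"]: d["decision"] for d in decisions}
    return {item["process"]: by_id[item["assoc_id"]] for item in batch}


batch = build_batch([("proc-a", (1, 2)), ("proc-b", (4, 4))])
matched = match_decisions(batch, fake_server_decide_batch(batch))
```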
7. A game AI training device, characterized by comprising:
A first state data module, configured to transmit collected current status data to a server when running a game process; wherein the game process is allocated by the server;
A delay action execution module, configured to control the execution of a delay action; wherein the delay action is the action executed by the game AI character being trained while waiting for the server to return the decision data corresponding to the current status data;
A first decision data module, configured to obtain the decision data corresponding to the current status data and terminate the execution of the delay action;
A decision action execution module, configured to control, according to the decision data corresponding to the current status data, the game AI character being trained to execute a decision action, and to generate reward data and succeeding state data;
A first training sample module, configured to organize the current status data, the decision data, the reward data and the succeeding state data into a training sample and transmit the training sample to the server, so that the server trains its training network based on the training sample and updates its decision network according to the data of the training network, until the network converges.
8. A game AI training method, characterized in that it is applied to a server, the method comprising:
Receiving current status data sent by a client; wherein the current status data is generated by the client running a game process;
Generating corresponding decision data according to the current status data and feeding it back to the client; wherein the decision data is generated while the client controls the game AI character being trained to execute a delay action;
Receiving a training sample sent by the client; wherein the training sample is organized by the client from the decision data, the current status data, and the reward data and succeeding state data generated after a decision action is executed according to the decision data;
Training a training network according to the training sample and updating a decision network according to the data of the training network, until the network converges.
9. The game AI training method according to claim 8, characterized in that, before receiving the current status data sent by the client, the method further comprises:
Dividing the whole game data to be trained into several shares of game data and allocating the shares to the corresponding clients; wherein each share of game data includes at least one game process to be run.
10. The game AI training method according to claim 8, characterized in that training the training network according to the training sample and updating the decision network according to the data of the training network specifically comprises:
Randomly selecting several training samples, inputting them to the training network, and training the game AI using stochastic gradient descent;
Detecting the number of times the training network has been trained on the selected training samples;
When the number of training times exceeds a preset number, updating the data of the decision network according to the data of the training network.
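The training schedule of this claim resembles the periodic network synchronization used in DQN-style training, and can be sketched in plain Python with a tiny linear value model. Everything named here (`LinearQ`, `train`, `sync_every`, the learning rate and discount) is hypothetical. In particular, computing the target as a one-step temporal-difference value evaluated with the decision network is an assumption; the claim itself only specifies random sampling, stochastic gradient descent, counting training passes, and the periodic update.

```python
import random


class LinearQ:
    """Tiny linear action-value model: q(s, a) = w[a] . s"""

    def __init__(self, n_features, n_actions):
        self.w = [[0.0] * n_features for _ in range(n_actions)]

    def q(self, state, action):
        return sum(wi * si for wi, si in zip(self.w[action], state))

    def copy_from(self, other):
        self.w = [row[:] for row in other.w]


def train(samples, training_net, decision_net, steps, batch_size,
          sync_every, lr=0.1, gamma=0.9, rng=None):
    """Run `steps` training passes; return how often the decision net was updated."""
    rng = rng or random.Random(0)
    n_actions = len(training_net.w)
    passes_since_sync = 0
    syncs = 0
    for _ in range(steps):
        # Randomly select several training samples.
        for state, action, reward, next_state in rng.sample(samples, batch_size):
            # Assumed TD target, evaluated with the decision network.
            target = reward + gamma * max(
                decision_net.q(next_state, a) for a in range(n_actions)
            )
            error = training_net.q(state, action) - target
            # One SGD step on the squared error 0.5 * error**2.
            training_net.w[action] = [
                w - lr * error * s for w, s in zip(training_net.w[action], state)
            ]
        passes_since_sync += 1
        # Update the decision network once the pass count exceeds the preset number.
        if passes_since_sync > sync_every:
            decision_net.copy_from(training_net)
            passes_since_sync = 0
            syncs += 1
    return syncs


samples = [
    ((1.0, 0.0), 1, 1.0, (0.0, 1.0)),
    ((0.0, 1.0), 0, 0.0, (1.0, 0.0)),
]
training_net = LinearQ(n_features=2, n_actions=2)
decision_net = LinearQ(n_features=2, n_actions=2)
syncs = train(samples, training_net, decision_net,
              steps=10, batch_size=2, sync_every=4)
```

Keeping the decision network fixed between updates means the clients see a stable policy while the training network changes on every pass.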
11. The game AI training method according to claim 8, characterized in that, after generating the corresponding decision data according to the current status data and feeding it back to the client, the method further comprises:
Deleting the current status data corresponding to the decision data that has been fed back to the client.
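A minimal sketch of this cleanup step, assuming the server keeps pending states in an in-memory dictionary keyed by a request id. The class `PendingStates` and its methods are hypothetical names, not part of the patent.

```python
class PendingStates:
    """Holds received states only until their decision has been produced."""

    def __init__(self):
        self._states = {}

    def receive(self, request_id, state):
        self._states[request_id] = state

    def decide_and_release(self, request_id, policy):
        decision = policy(self._states[request_id])
        # The decision is ready to be fed back, so the state is no longer needed.
        del self._states[request_id]
        return decision

    def __len__(self):
        return len(self._states)


store = PendingStates()
store.receive("req-1", (1, 2))
decision = store.decide_and_release("req-1", policy=lambda s: sum(s))
```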
12. A game AI training device, characterized by comprising:
A second state data module, configured to receive current status data sent by a client; wherein the current status data is generated by the client running a game process;
A second decision data module, configured to generate corresponding decision data according to the current status data and feed it back to the client; wherein the decision data is generated while the client controls the game AI character being trained to execute a delay action;
A second training sample module, configured to receive a training sample sent by the client; wherein the training sample is organized by the client from the decision data, the current status data, and the reward data and succeeding state data generated after a decision action is executed according to the decision data;
A training module, configured to train a training network according to the training sample and update a decision network according to the data of the training network, until the network converges.
13. A game AI training system, characterized by comprising a server and N clients, N ≥ 1; wherein each client executes the following steps:
When running a game process, transmitting collected current status data to the server; wherein the game process is allocated by the server;
Controlling the execution of a delay action; wherein the delay action is the action executed by the game AI character being trained while waiting for the server to return the decision data corresponding to the current status data;
Obtaining the decision data corresponding to the current status data, and terminating the execution of the delay action;
Controlling, according to the decision data corresponding to the current status data, the game AI character being trained to execute a decision action, and generating reward data and succeeding state data;
Organizing the current status data, the decision data, the reward data and the succeeding state data into a training sample, and transmitting the training sample to the server, so that the server trains its training network based on the training sample and updates its decision network according to the data of the training network, until the network converges;
After dividing the whole game data to be trained into several shares of game data and allocating the shares to the corresponding clients, the server executes the following steps:
Receiving the current status data sent by the client; wherein the current status data is generated by the client running the game process;
Generating corresponding decision data according to the current status data and feeding it back to the client; wherein the decision data is generated while the client controls the game AI character being trained to execute the delay action;
Receiving the training sample sent by the client; wherein the training sample is organized by the client from the decision data, the current status data, and the reward data and succeeding state data generated after the decision action is executed according to the decision data;
Training the training network according to the training sample and updating the decision network according to the data of the training network, until the network converges.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811323771.9A CN109550252A (en) | 2018-11-07 | 2018-11-07 | A kind of game AI training method, apparatus and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811323771.9A CN109550252A (en) | 2018-11-07 | 2018-11-07 | A kind of game AI training method, apparatus and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109550252A true CN109550252A (en) | 2019-04-02 |
Family
ID=65866066
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811323771.9A Pending CN109550252A (en) | 2018-11-07 | 2018-11-07 | A kind of game AI training method, apparatus and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109550252A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110251942A (en) * | 2019-06-04 | 2019-09-20 | 腾讯科技(成都)有限公司 | Control the method and device of virtual role in scene of game |
CN110782004A (en) * | 2019-09-26 | 2020-02-11 | 超参数科技(深圳)有限公司 | Model training method, model calling equipment and readable storage medium |
CN110909890A (en) * | 2019-12-04 | 2020-03-24 | 腾讯科技(深圳)有限公司 | Game artificial intelligence training method and device, server and storage medium |
CN111841017A (en) * | 2020-05-29 | 2020-10-30 | 北京编程猫科技有限公司 | Game AI programming realization method and device |
CN112169311A (en) * | 2020-10-20 | 2021-01-05 | 网易(杭州)网络有限公司 | Method, system, storage medium and computer device for training AI (Artificial Intelligence) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070265043A1 (en) * | 2006-04-12 | 2007-11-15 | Wang Andy Y | Team-based networked video gaming and automatic event management |
CN106422332A (en) * | 2016-09-08 | 2017-02-22 | 腾讯科技(深圳)有限公司 | Artificial intelligence operation method and device applied to game |
CN107256174A (en) * | 2017-05-27 | 2017-10-17 | 武汉秀宝软件有限公司 | The implementation method and device of artificial intelligence |
CN107480059A (en) * | 2017-08-03 | 2017-12-15 | 网易(杭州)网络有限公司 | Acquisition methods, device, storage medium, processor and the service end of the sequence of operation |
CN107506830A (en) * | 2017-06-20 | 2017-12-22 | 同济大学 | Towards the artificial intelligence training platform of intelligent automobile programmed decision-making module |
-
2018
- 2018-11-07 CN CN201811323771.9A patent/CN109550252A/en active Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070265043A1 (en) * | 2006-04-12 | 2007-11-15 | Wang Andy Y | Team-based networked video gaming and automatic event management |
CN106422332A (en) * | 2016-09-08 | 2017-02-22 | 腾讯科技(深圳)有限公司 | Artificial intelligence operation method and device applied to game |
CN107256174A (en) * | 2017-05-27 | 2017-10-17 | 武汉秀宝软件有限公司 | The implementation method and device of artificial intelligence |
CN107506830A (en) * | 2017-06-20 | 2017-12-22 | 同济大学 | Towards the artificial intelligence training platform of intelligent automobile programmed decision-making module |
CN107480059A (en) * | 2017-08-03 | 2017-12-15 | 网易(杭州)网络有限公司 | Acquisition methods, device, storage medium, processor and the service end of the sequence of operation |
Non-Patent Citations (1)
Title |
---|
房晓溪: "《Java程序设计》", 28 February 2005 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110251942A (en) * | 2019-06-04 | 2019-09-20 | 腾讯科技(成都)有限公司 | Control the method and device of virtual role in scene of game |
CN110251942B (en) * | 2019-06-04 | 2022-09-13 | 腾讯科技(成都)有限公司 | Method and device for controlling virtual character in game scene |
CN110782004A (en) * | 2019-09-26 | 2020-02-11 | 超参数科技(深圳)有限公司 | Model training method, model calling equipment and readable storage medium |
CN110782004B (en) * | 2019-09-26 | 2022-06-21 | 超参数科技(深圳)有限公司 | Model training method, model calling equipment and readable storage medium |
CN110909890A (en) * | 2019-12-04 | 2020-03-24 | 腾讯科技(深圳)有限公司 | Game artificial intelligence training method and device, server and storage medium |
CN111841017A (en) * | 2020-05-29 | 2020-10-30 | 北京编程猫科技有限公司 | Game AI programming realization method and device |
CN112169311A (en) * | 2020-10-20 | 2021-01-05 | 网易(杭州)网络有限公司 | Method, system, storage medium and computer device for training AI (Artificial Intelligence) |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109550252A (en) | A kind of game AI training method, apparatus and system | |
CN109902820B (en) | AI model training method, device, storage medium and equipment | |
CN109754060A (en) | A kind of training method and device of neural network machine learning model | |
CN106951926A (en) | The deep learning systems approach and device of a kind of mixed architecture | |
CN112015536B (en) | Kubernetes cluster container group scheduling method, device and medium | |
KR20210028728A (en) | Method, apparatus, and device for scheduling virtual objects in a virtual environment | |
CN110102050A (en) | Virtual objects display methods, device, electronic equipment and storage medium | |
CN106681826B (en) | Resource planning method, system and device for cluster computing architecture | |
CN108888958A (en) | Virtual object control method, device, equipment and storage medium in virtual scene | |
CN110251942B (en) | Method and device for controlling virtual character in game scene | |
CN108289246A (en) | Data processing method, device, storage medium and electronic device | |
CN109819032A (en) | A kind of base station selected cloud robot task distribution method with computation migration of joint consideration | |
CN111708641A (en) | Memory management method, device and equipment and computer readable storage medium | |
Ye et al. | A new approach for resource scheduling with deep reinforcement learning | |
CN109255439A (en) | A kind of DNN model training method and device that multiple GPU are parallel | |
CN110659023B (en) | Method for generating programming content and related device | |
CN106557611A (en) | The Dynamic Load-balancing Algorithm research of distributed traffic network simulation platform and application | |
CN111193802A (en) | Dynamic resource allocation method, system, terminal and storage medium based on user group | |
CN105553732A (en) | Distributed network simulation method and system | |
US9977795B1 (en) | System and method for multiplayer network gaming | |
CN110598853A (en) | Model training method, information processing method and related device | |
CN106294395B (en) | A kind of method and device of task processing | |
CN103577705B (en) | A kind of system controls the data processing method and device of group | |
CN113326103B (en) | Virtual machine creation method and device | |
CN110465092A (en) | A kind of method and relevant apparatus of resource allocation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190402 |