CN110443284A - Training method and calling method for an AI model, server, and readable storage medium - Google Patents
- Publication number
- CN110443284A (application CN201910636868.3A)
- Authority
- CN
- China
- Prior art keywords
- model
- vector
- role
- label
- fully connected layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/60—Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/70—Game security or game management aspects
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/70—Game security or game management aspects
- A63F13/79—Game security or game management aspects involving player-related data, e.g. identities, accounts, preferences or play histories
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Business, Economics & Management (AREA)
- Computer Security & Cryptography (AREA)
- Business, Economics & Management (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- General Physics & Mathematics (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The application provides a training method and a calling method for an AI model, a server, and a readable storage medium. The method comprises: obtaining a first sample data set, wherein the first sample data set includes a first image-like feature, a first vector feature, and an annotated role election label; obtaining a second sample data set, wherein the second sample data set includes a second image-like feature, a second vector feature, and an annotated strategy label; loading an AI model to be trained, wherein the AI model includes a role election model and a strategy prediction model; iteratively training the role election model according to the first sample data set until the role election model converges, and iteratively training the strategy prediction model according to the second sample data set until the strategy prediction model converges. The application can improve the prediction efficiency and accuracy of the AI model.
Description
Technical field
This application relates to the technical field of artificial intelligence, and in particular to a training method and a calling method for an AI model, a server, and a readable storage medium.
Background
With the rapid development of artificial intelligence (Artificial Intelligence, AI) technology, AI is widely used in many fields. In game entertainment, AI technology already allows a virtual AI to play board games against human players and to defeat the top professional players. Card games, by contrast, usually involve several players, and the players cannot see one another's cards. Developing an AI model for card games is therefore more challenging.
Currently, AI models are mainly implemented with deep neural networks (Deep Neural Network, DNN) and supervised learning. However, an AI model implemented in this way needs several models to determine a strategy prediction result, which is inefficient and gives a poor user experience. In addition, such approaches train the AI model independently on the data of each party and cannot make full use of the data, so the accuracy of the AI model is poor. How to improve the accuracy of the AI model is therefore an urgent problem to be solved.
Summary of the invention
The main purpose of the application is to provide a training method and a calling method for an AI model, a server, and a readable storage medium, with the aim of improving the accuracy of the AI model.
In a first aspect, the application provides a training method for an AI model, the training method comprising the following steps:
Obtaining a first sample data set, wherein the first sample data set includes a first image-like feature, a first vector feature, and an annotated role election label;
Obtaining a second sample data set, wherein the second sample data set includes a second image-like feature, a second vector feature, and an annotated strategy label;
Loading an AI model to be trained, wherein the AI model includes a role election model and a strategy prediction model;
Iteratively training the role election model according to the first sample data set until the role election model converges, and iteratively training the strategy prediction model according to the second sample data set until the strategy prediction model converges.
In a second aspect, the application also provides a calling method for an AI model, the calling method comprising the following steps:
Determining whether a call instruction for the AI model has been triggered, wherein the AI model includes a role election model and a strategy prediction model;
If a triggered call instruction is detected, obtaining the current game data, and determining, according to the current game data, whether the model to be called is the role election model or the strategy prediction model;
If the model to be called is the role election model, calling the role election model to determine a role election label based on the current game data, and generating the role election instruction corresponding to the role election label;
If the model to be called is the strategy prediction model, calling the strategy prediction model to determine a strategy prediction result based on the current game data, and generating the strategy output instruction corresponding to the strategy prediction result.
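The dispatch described in the second aspect can be sketched as follows. This is only an illustrative sketch: the `phase` field, the dictionary layout of the game data and of the returned instruction, and the function name `handle_call` are hypothetical, since the text does not state how the model to be called is determined from the current game data.

```python
def handle_call(game_data, election_model, strategy_model):
    """Pick the model to call from the current game data and build an instruction.

    Assumption: game_data carries a "phase" field; the source only says the
    choice is made "according to the current game data".
    """
    if game_data.get("phase") == "election":
        # Role election branch: the model returns a role election label.
        label = election_model(game_data)
        return {"instruction": "role_election", "label": label}
    # Otherwise treat the call as a strategy prediction request.
    result = strategy_model(game_data)
    return {"instruction": "strategy_output", "result": result}
```

Both models are passed in as plain callables here, so the same dispatch works with any trained model object that accepts the game data.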
In a third aspect, the application also provides a server. The server includes a processor, a memory, and a computer program that is stored on the memory and executable by the processor. The memory stores an AI model, and the AI model includes a role election model and a strategy prediction model, wherein the computer program, when executed by the processor, implements the steps of the calling method for an AI model described above.
In a fourth aspect, the application also provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the calling method for an AI model described above.
The application provides a training method and a calling method for an AI model, a server, and a readable storage medium. The application obtains a first sample data set that stores the image-like features and vector features required by the role election model and is annotated with role election labels, and then iteratively trains the role election model on the first sample data set, so that an accurate role election model can be obtained. At the same time, it obtains a second sample data set that stores the image-like features and vector features required by the strategy prediction model and is annotated with strategy labels, and then iteratively trains the strategy prediction model on the second sample data set, so that an accurate strategy prediction model can be obtained. An AI model that includes the role election model and the strategy prediction model is thereby obtained, which can improve the accuracy of the AI model.
Brief description of the drawings
In order to explain the technical solutions of the embodiments of the application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show some embodiments of the application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow diagram of a training method for an AI model provided by an embodiment of the application;
Fig. 2 is a schematic representation of the image-like feature required by the role election model in an embodiment of the application;
Fig. 3 is a schematic representation of the image-like feature required by the strategy prediction model in an embodiment of the application;
Fig. 4 is a schematic diagram of the layer structure of the role election model in an embodiment of the application;
Fig. 5 is a schematic diagram of one layer structure of the strategy prediction model in an embodiment of the application;
Fig. 6 is a schematic diagram of another layer structure of the strategy prediction model in an embodiment of the application;
Fig. 7 is a flow diagram of a calling method for an AI model provided by an embodiment of the application;
Fig. 8 is a schematic flow block diagram of the training and calling of the AI model in an embodiment of the application;
Fig. 9 is a schematic structural block diagram of the server according to an embodiment of the application.
The realization of the purpose, the functional characteristics, and the advantages of the application are further described below with reference to the drawings and in conjunction with the embodiments.
Detailed description of the embodiments
The technical solutions in the embodiments of the application are described clearly and completely below with reference to the drawings of the embodiments. Obviously, the described embodiments are some, not all, of the embodiments of the application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the application without creative effort fall within the protection scope of the application.
The flow charts shown in the drawings are only illustrative. They need not include all contents and operations/steps, and the steps need not be executed in the order described. For example, some operations/steps can be decomposed, combined, or partially merged, so the actual order of execution may change according to the actual situation.
The embodiments of the application provide a training method and a calling method for an artificial intelligence (Artificial Intelligence, AI) model, a server, and a readable storage medium. The training method of the AI model can be applied to a server, and the server can be a single server or a server cluster composed of multiple servers.
Some embodiments of the application are described in detail below with reference to the drawings. Where there is no conflict, the following embodiments and the features in the embodiments can be combined with one another.
Please refer to Fig. 1, which is a flow diagram of a training method for an AI model provided by an embodiment of the application.
As shown in Fig. 1, the training method of the AI model includes steps S101 to S104.
Step S101: obtain a first sample data set, wherein the first sample data set includes a first image-like feature, a first vector feature, and an annotated role election label.
The AI model includes a role election model and a strategy prediction model. The image-like feature and the vector feature required by the role election model are extracted from game data, and a role election label is annotated. The extracted image-like feature and vector feature together with the annotated role election label form one group of sample data, and the groups form a sample data set, denoted the first sample data set. The first sample data set thus includes the image-like features and vector features required by the role election model and the annotated role election labels; the image-like feature and the vector feature required by the role election model are denoted the first image-like feature and the first vector feature, respectively. The role election labels include a label corresponding to participating in the role election and a label corresponding to not participating in the role election. When the AI model needs to be trained, the server obtains the first sample data set.
It should be noted that the game data can be game data of card games, including but not limited to game data of Dou Dizhu (Fight the Landlord) and game data of Mahjong, or game data of other types of games; the application does not specifically limit this.
The game data includes the match information of a large number of game players, including, for each game, the multiplier information, the initial card-holding information of each game player, the card output information and card-holding information in each round, the hidden card information, the information on cards that have not appeared, and the role information. The multiplier information characterizes the multiplier of the game; the card-holding information includes the cards held and the quantity of each card held; the hidden card information includes the hidden cards and the quantity of hidden cards; the information on cards that have not appeared includes those cards and the quantity of each; and the role information characterizes the role of each game player in the game.
The vector feature required by the role election model characterizes the role election situation of the game. It should be noted that the image-like features and vector features required by the role election model differ from one card game to another. The following takes Dou Dizhu as an example to explain the image-like feature and the vector feature required by the role election model.
In Dou Dizhu, the game players include the previous player, the current player, and the next player, and there are at most 4 cards of the same rank. The card-holding information is the hand information of a game player, including the cards held and their counts; the card output information is the play information of a game player, including the cards played and their counts; the hidden card information is the bottom-card information, including the bottom cards and their counts; and the information on cards that have not appeared includes the cards that have not appeared and their counts.
This image-like feature characterizes the hand information of a game player and the information on cards that have not appeared. The horizontal axis lists all card ranks in descending order, and the vertical axis encodes the count of each rank: if the count is 1, the vertical axis is [1000]; if the count is 2, it is [1100]; if the count is 3, it is [1110]; and if the count is 4, it is [1111]. The vector feature is a five-dimensional vector. The first dimension indicates whether the previous player participated in the role election, where 1 indicates participation and 0 indicates no participation. The second dimension indicates whether the next player participated in the role election, where 1 indicates participation and 0 indicates no participation. The last three dimensions represent the role election multiplier of another game player: 1x is expressed as [100], 2x as [010], and 3x as [001].
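The encoding just described can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the rank symbols and their order in `RANKS` (with `B` and `R` taken to be the two jokers), the 4 x 15 layout, and the function names are illustrative and are not specified by the text beyond "all ranks in descending order".

```python
# Assumed rank order, largest first; "B" and "R" are taken to be the two jokers.
RANKS = ["B", "R", "2", "A", "K", "Q", "J", "10", "9", "8", "7", "6", "5", "4", "3"]

def count_to_column(count):
    """Encode a count as in the text: 1 -> [1,0,0,0], ..., 4 -> [1,1,1,1]."""
    if not 0 <= count <= 4:
        raise ValueError("a rank appears between 0 and 4 times")
    return [1] * count + [0] * (4 - count)

def encode_cards(counts):
    """Build a 4-row by 15-column matrix from a {rank: count} mapping."""
    columns = [count_to_column(counts.get(rank, 0)) for rank in RANKS]
    # Transpose so that each row is one position of the per-rank code.
    return [[column[row] for column in columns] for row in range(4)]

def encode_election_vector(prev_joined, next_joined, multiplier):
    """Five-dimensional vector: two participation flags plus a one-hot multiplier."""
    if multiplier not in (1, 2, 3):
        raise ValueError("multiplier must be 1, 2 or 3")
    one_hot = [1 if multiplier == m else 0 for m in (1, 2, 3)]
    return [int(prev_joined), int(next_joined)] + one_hot
```

For instance, `encode_election_vector(True, False, 2)` yields `[1, 0, 0, 1, 0]`: the previous player participated, the next player did not, and the multiplier is 2x.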
Fig. 2 is a schematic representation of the image-like feature required by the role election model in an embodiment of the application. As shown in Fig. 2, this image-like feature includes the feature representation of a game player's hand information and the feature representation of the information on cards that have not appeared. Panel A of Fig. 2 is the feature representation of the game player's hand information, and panel B of Fig. 2 is the feature representation of the information on cards that have not appeared.
Step S102: obtain a second sample data set, wherein the second sample data set includes a second image-like feature, a second vector feature, and an annotated strategy label.
The image-like feature and the vector feature required by the strategy prediction model are extracted from game data, and a strategy label is annotated. The extracted image-like feature and vector feature together with the annotated strategy label form one group of sample data, and the groups form a sample data set, denoted the second sample data set. The second sample data set thus includes the image-like features and vector features required by the strategy prediction model and the annotated strategy labels; the image-like feature and the vector feature required by the strategy prediction model are denoted the second image-like feature and the second vector feature, respectively. The strategy labels include a main strategy label and a secondary strategy label. When the AI model needs to be trained, the server obtains the second sample data set.
It should be noted that the image-like features and vector features required by the strategy prediction model differ from one card game to another, and the strategy labels also differ. The following takes Dou Dizhu as an example to explain the image-like feature, the vector feature, and the strategy labels required by the strategy prediction model. The main strategy label is the main card label, which includes but is not limited to the labels corresponding to a single card, a single straight, a pair, a double straight, a triple, a triple straight, four of a kind, a rocket, and a pass. The secondary strategy label is the kicker card label, which includes but is not limited to the labels corresponding to a single card, two single cards, one pair, two pairs, and no kicker.
The image-like feature required by the strategy prediction model characterizes the spatial distribution of card types. The horizontal axis of this image-like feature lists all card ranks in descending order, and the vertical axis encodes the count of each rank: if the count is 1, the vertical axis is [1000]; if the count is 2, it is [1100]; if the count is 3, it is [1110]; and if the count is 4, it is [1111]. The image-like feature required by the strategy prediction model includes 13 channels:
- the hand information of the current player (1 channel);
- the play information of the three game players in the most recent round (3 channels);
- the play information of the three game players in the second most recent round (3 channels);
- the play information of the three game players in the third most recent round (3 channels);
- all play information before the third most recent round (1 channel);
- the information on cards that have not appeared (1 channel);
- the bottom-card information (1 channel).
Fig. 3 is a schematic representation of the image-like feature required by the strategy prediction model in an embodiment of the application. As shown in Fig. 3, this image-like feature includes 13 channels. Panel A of Fig. 3 is the feature representation of the current player's hand information BR22AAKK10109873. Panel B is the feature representation of the previous player's play information QQQ8 in the most recent round. Panel C is the feature representation of the previous player's play information 4445 in the second most recent round. Panel D is the feature representation of the previous player's play information 6663 in the third most recent round. Panel E is the feature representation of all play information before the third most recent round, 210109988765433. Panel F is the feature representation of the information on cards that have not appeared, BR222AAAAKKKKQJJJJ101099777. Panel G is the feature representation of the bottom-card information R23. Panel H is the feature representation of any one of the following: the current player's play information in the most recent round, the next player's play information in the most recent round, the current player's play information in the second most recent round, the next player's play information in the second most recent round, the current player's play information in the third most recent round, or the next player's play information in the third most recent round.
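To make the 13-channel layout concrete, the sketch below parses card strings such as `BR22AAKK10109873` and stacks one 4 x 15 matrix per channel. It is an illustration only: the rank order, the reading of `B` and `R` as the two jokers, the channel names, and the order of the three players within a round are assumptions that the text does not fix.

```python
from collections import Counter

# Assumed rank order, largest first; "B" and "R" are taken to be the two jokers.
RANKS = ["B", "R", "2", "A", "K", "Q", "J", "10", "9", "8", "7", "6", "5", "4", "3"]

def tokenize(cards):
    """Split a card string into rank tokens, treating "10" as one token."""
    tokens, i = [], 0
    while i < len(cards):
        step = 2 if cards.startswith("10", i) else 1
        tokens.append(cards[i:i + step])
        i += step
    return tokens

def encode_channel(cards):
    """Encode one card string as a 4 x 15 matrix using the count code from the text."""
    counts = Counter(tokenize(cards))
    return [[1 if counts[rank] > row else 0 for rank in RANKS] for row in range(4)]

# Hypothetical channel names; the channel counts follow the text (they sum to 13).
CHANNEL_LAYOUT = [
    ("current_hand", 1), ("round_1_plays", 3), ("round_2_plays", 3),
    ("round_3_plays", 3), ("earlier_plays", 1), ("unseen_cards", 1),
    ("bottom_cards", 1),
]

def stack_channels(groups):
    """Stack all groups into a list of 13 matrices, checking each group's size."""
    channels = []
    for name, size in CHANNEL_LAYOUT:
        if len(groups[name]) != size:
            raise ValueError(f"{name} needs {size} card string(s)")
        channels.extend(encode_channel(cards) for cards in groups[name])
    return channels
```

An empty string encodes a player who played nothing in that round, which gives an all-zero channel.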
The vector feature required by the strategy prediction model characterizes features that are unrelated to space. It includes the roles, the numbers of cards held, and the role election multipliers of the three game players, and further includes the number of cards played by the previous player and whether the current player's hand contains cards that beat the previous player's play. If the role is landlord, the role is encoded as 1; if the role is peasant, the role is encoded as 0. The code for the number of cards held lies between 00000 (holding 0 cards) and 10100 (holding 20 cards). The code for the role election multiplier lies between 01 (1x) and 11 (3x). The code for the number of cards played by the previous player lies between 00000 (0 cards played) and 10100 (20 cards played). If the current player's hand contains cards that beat the previous player's play, the corresponding code is 1; otherwise, the corresponding code is 0.
For example, suppose the roles of the three game players are landlord, peasant, and peasant; the numbers of cards they hold are 15, 12, and 8; their role election multipliers are 3x, 2x, and 2x; the number of cards played by the previous player is 5; and the current player's hand contains cards that beat the previous player's play. The corresponding vector feature required by the strategy prediction model is then: [1,0,0,01111,01100,01000,11,10,10,00101,1].
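The worked example above can be reproduced with a short sketch. The function and parameter names are hypothetical; only the field order and the bit widths (5 bits for card counts, 2 bits for multipliers) come from the text.

```python
def to_bits(value, width):
    """Fixed-width binary string, e.g. to_bits(15, 5) gives "01111"."""
    if not 0 <= value < 2 ** width:
        raise ValueError(f"{value} does not fit in {width} bits")
    return format(value, f"0{width}b")

def encode_strategy_vector(roles, held, multipliers, prev_played, can_beat):
    """Assemble the vector fields in the order used by the example in the text."""
    fields = ["1" if role == "landlord" else "0" for role in roles]
    fields += [to_bits(n, 5) for n in held]          # cards held: 00000..10100
    fields += [to_bits(m, 2) for m in multipliers]   # multiplier: 01..11
    fields.append(to_bits(prev_played, 5))           # cards played by previous player
    fields.append("1" if can_beat else "0")          # hand beats the previous play?
    return fields

example = encode_strategy_vector(
    roles=["landlord", "peasant", "peasant"],
    held=[15, 12, 8],
    multipliers=[3, 2, 2],
    prev_played=5,
    can_beat=True,
)
```

Joining the fields of `example` gives 30 bits in total, which would be the length of the flattened input vector under this encoding.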
Step S103: load an AI model to be trained, wherein the AI model includes a role election model and a strategy prediction model.
After the first sample data set and the second sample data set are obtained, the server loads the AI model to be trained, wherein the AI model includes a role election model and a strategy prediction model. Please refer to Fig. 4, which is a schematic diagram of the layer structure of the role election model in an embodiment of the application. As shown in Fig. 4, the role election model includes three fully connected layers, two convolutional layers, and one vector concatenation layer. The first fully connected layer is connected to the second fully connected layer, and the second fully connected layer is connected to the vector concatenation layer. The first convolutional layer is connected to the second convolutional layer, and the second convolutional layer is connected to the vector concatenation layer. The vector concatenation layer is connected to the third fully connected layer.
Please refer to Fig. 5, which is a schematic diagram of one layer structure of the strategy prediction model in an embodiment of the application. As shown in Fig. 5, the strategy prediction model includes seven fully connected layers, two convolutional layers, and two vector concatenation layers. The first fully connected layer is connected to the second fully connected layer, and the second fully connected layer is connected to the first vector concatenation layer. The first convolutional layer is connected to the second convolutional layer, and the second convolutional layer is connected to the first vector concatenation layer. The first vector concatenation layer is connected to the third, fourth, and fifth fully connected layers respectively; the third, fourth, and fifth fully connected layers are connected to the second vector concatenation layer; and the second vector concatenation layer is connected to the sixth and seventh fully connected layers respectively. It should be noted that the third fully connected layer serves the classification task for the first guess result and outputs the first guess label, its output probability, and Loss1; the fourth fully connected layer serves the classification task for the second guess result and outputs the second guess label, its output probability, and Loss2; the sixth fully connected layer serves the classification task for the main strategy and outputs the main strategy label, its output probability, and Loss3; and the seventh fully connected layer serves the classification task for the secondary strategy and outputs the secondary strategy label, its output probability, and Loss4. The loss value of the model is Loss1 + Loss2 + Loss3 + Loss4.
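The summed loss can be illustrated as follows. The text does not name the per-head loss function, so cross-entropy is used here purely as an assumption; the function names are likewise hypothetical.

```python
import math

def cross_entropy(probabilities, label_index):
    """Negative log-probability of the annotated label (assumed loss form)."""
    return -math.log(probabilities[label_index])

def total_loss(head_outputs):
    """Sum the per-head losses, i.e. Loss1 + Loss2 + Loss3 + Loss4.

    head_outputs holds one (probabilities, label_index) pair per output head:
    first guess, second guess, main strategy, secondary strategy.
    """
    return sum(cross_entropy(probs, label) for probs, label in head_outputs)
```

Because the four terms are simply added, a single backward pass over the sum would update the shared layers with gradients from all four classification tasks at once.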
Please refer to Fig. 6, which is a schematic diagram of another layer structure of the strategy prediction model in an embodiment of the application. As shown in Fig. 6, the strategy prediction model includes seven fully connected layers, two convolutional layers, and three vector concatenation layers. The first fully connected layer is connected to the second fully connected layer, and the second fully connected layer is connected to the first vector concatenation layer. The first convolutional layer is connected to the second convolutional layer, and the second convolutional layer is connected to the first vector concatenation layer. The first vector concatenation layer is connected to the third, fourth, and fifth fully connected layers respectively; the third, fourth, and fifth fully connected layers are connected to the second vector concatenation layer; the second vector concatenation layer is connected to the sixth fully connected layer and the third vector concatenation layer respectively; and the third vector concatenation layer is connected to the seventh fully connected layer.
Step S104: iteratively train the role election model according to the first sample data set until the role election model converges, and iteratively train the strategy prediction model according to the second sample data set until the strategy prediction model converges.
After the first sample data set and the second sample data set are obtained and the AI model to be trained is loaded, the server iteratively trains the role election model according to the first sample data set until the role election model converges, and iteratively trains the strategy prediction model according to the second sample data set until the strategy prediction model converges. After both the role election model and the strategy prediction model have converged, the converged role election model and strategy prediction model are stored.
In one embodiment, the specific training process of the role election model is as follows. Each time, one group of sample data is obtained from the first sample data set, wherein the sample data includes the first image-like feature, the first vector feature, and the role election label. The first vector feature is processed through two fully connected layers to obtain a first target vector, and the first image-like feature is convolved through two convolutional layers to obtain a second target vector. The first target vector and the second target vector are concatenated through the vector concatenation layer to obtain a concatenated vector, and the concatenated vector is processed through the role election layer to obtain the output probability of the role election label. The current loss value is calculated from the role election label and the output probability, and whether the role election model has converged is determined from the current loss value. If the role election model has converged, training stops; if it has not converged, the parameters of the role election model are updated and training continues with the updated role election model. It should be noted that the parameter update algorithm can be set according to actual conditions, and the application does not specifically limit this; optionally, the parameters of the role election model are updated based on the back-propagation algorithm.
In one embodiment, whether the role election model has converged is determined as follows. The loss value from the previous round of model training is obtained and recorded as the historical loss value, and the difference between the historical loss value and the current loss value is calculated. It is then determined whether this difference is less than the preset threshold corresponding to the role election model. If the difference is less than the preset threshold, the role election model is determined to have converged; if the difference is greater than or equal to the preset threshold, the role election model is determined not to have converged.
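The stopping rule described above can be sketched in a few lines. This is a minimal illustration, not the application's implementation: the function name `train_until_converged` and the simulated loss values are hypothetical, and taking the absolute value of the difference is an assumption, since the text only speaks of "the difference".

```python
def train_until_converged(loss_values, threshold):
    """Return the 1-based step at which training stops, or None if it never converges."""
    history_loss = None  # loss value from the previous training step
    for step, current_loss in enumerate(loss_values, start=1):
        # Assumption: the "difference" is compared as an absolute value.
        if history_loss is not None and abs(history_loss - current_loss) < threshold:
            return step  # converged: stop training
        # Not converged: a real trainer would update the model parameters here
        # (for example with back-propagation) before the next step.
        history_loss = current_loss
    return None


simulated_losses = [1.0, 0.5, 0.3, 0.29, 0.289]  # made-up values for illustration
print(train_until_converged(simulated_losses, threshold=0.05))
```

With these made-up losses the loop stops at the fourth step, the first step whose loss differs from the previous one by less than the threshold.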
In one embodiment, when the layer structure of the strategy prediction model is as shown in Fig. 5, the strategy prediction model is trained as follows. One group of sample data is obtained from the second sample data set at a time, where the sample data includes second-type image-like features, second vector features, and strategy labels, and the strategy labels include a first guess label, a second guess label, a main strategy label, and a secondary strategy label. The second vector features are processed by the first and second fully connected layers to obtain a first target vector. The second-type image-like features are convolved by the first and second convolutional layers to obtain a second target vector. The first vector concatenation layer concatenates the first target vector and the second target vector to obtain a first concatenated vector. The third fully connected layer determines the output probability of the first guess label based on the first concatenated vector, and a first loss value is calculated from the first guess label and the corresponding output probability.
The fourth fully connected layer determines the output probability of the second guess label based on the first concatenated vector, and a second loss value is calculated from the second guess label and the corresponding output probability. The fifth fully connected layer processes the first concatenated vector, and the second vector concatenation layer concatenates the first guess label, the second guess label, and the processed first concatenated vector to obtain a second concatenated vector. The sixth fully connected layer determines the output probability of the main strategy label based on the second concatenated vector, and a third loss value is calculated from the main strategy label and the corresponding output probability. The seventh fully connected layer determines the output probability of the secondary strategy label based on the second concatenated vector, and a fourth loss value is calculated from the secondary strategy label and the corresponding output probability. Whether the strategy prediction model has converged is determined from the first, second, third, and fourth loss values. If the strategy prediction model has converged, training stops; otherwise, the parameters of the strategy prediction model are updated and training continues on the updated model.
In one embodiment, when the layer structure of the strategy prediction model is as shown in Fig. 6, the difference from Fig. 5 is that the strategy prediction model further includes a third vector concatenation layer: the second vector concatenation layer is connected to the sixth fully connected layer and to the third vector concatenation layer respectively, and the third vector concatenation layer is connected to the seventh fully connected layer. The secondary strategy label is therefore determined as follows: the third vector concatenation layer concatenates the second concatenated vector and the main strategy label to obtain a third concatenated vector; the seventh fully connected layer determines the output probability of the secondary strategy label based on the third concatenated vector, and the fourth loss value is calculated from the secondary strategy label and the corresponding output probability.
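The wiring of the heads after the first concatenation, in both the Fig. 5 and Fig. 6 variants, can be illustrated with a small pure-Python sketch. Everything here is a stand-in under stated assumptions: `dense` uses random, untrained weights, the hidden width 8 and the secondary-strategy size 4 are hypothetical, the main strategy output probability is used where the text concatenates the "main strategy label", and the two input vectors are assumed to have already been produced by the fully connected and convolutional branches.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def dense(x, out_dim):
    """Stand-in fully connected layer with random, untrained weights."""
    return [sum(rng.uniform(-1, 1) * v for v in x) for _ in range(out_dim)]


def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def strategy_heads(first_target, second_target, n_main=5, n_sub=4,
                   use_third_concat=True):
    splice1 = first_target + second_target       # first vector concatenation layer
    guess1 = dense(splice1, 15)                  # third FC layer: first guess (15 card kinds)
    guess2 = dense(splice1, 15)                  # fourth FC layer: second guess
    hidden = dense(splice1, 8)                   # fifth FC layer (width 8 is hypothetical)
    splice2 = guess1 + guess2 + hidden           # second vector concatenation layer
    main_prob = softmax(dense(splice2, n_main))  # sixth FC layer: main strategy
    if use_third_concat:
        sub_input = splice2 + main_prob          # Fig. 6: third vector concatenation layer
    else:
        sub_input = splice2                      # Fig. 5: no third concatenation
    sub_prob = softmax(dense(sub_input, n_sub))  # seventh FC layer: secondary strategy
    return guess1, guess2, main_prob, sub_prob


g1, g2, main_p, sub_p = strategy_heads([0.1, 0.2, 0.3], [0.4, 0.5])
print(len(g1), len(g2), len(main_p), len(sub_p))
```

The only difference between the two variants is the input to the seventh fully connected layer: in the Fig. 6 arrangement it also sees the main strategy output, so the secondary strategy can be conditioned on the main one.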
It should be noted that the parameter update algorithm can be set according to actual conditions and is not specifically limited in this application; optionally, the parameters of the strategy prediction model are updated using the back-propagation algorithm. The first guess label, the second guess label, the main strategy label, and the secondary strategy label are represented as vectors. The first guess label represents the cards of the previous player, and the second guess label represents the cards of the next player. For example, if the guessed cards of the previous player are B2AAKJJJ0933, the first guess label is [200000113012110]; the first guess label gives the number of cards of each kind, with 15 kinds of cards in total. As another example, if there are 5 main card types, there are correspondingly 5 main strategy labels, and the vectors corresponding to the main strategy labels are [10000], [01000], [00100], [00010], and [00001].
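The two encodings in this example can be reproduced with a short sketch. The rank order used below (3 through 9, 0 for ten, J, Q, K, A, 2, B, then a second joker) is inferred from the example rather than stated in the text, and the symbol `R` for the second joker is purely hypothetical.

```python
# Assumed rank order, inferred from the example label; "R" is a hypothetical
# symbol for the second joker, which the text does not name.
RANKS = "34567890JQKA2BR"


def encode_hand(cards):
    """Count how many cards of each of the 15 kinds appear in the hand."""
    counts = [0] * len(RANKS)
    for card in cards:
        counts[RANKS.index(card)] += 1
    return counts


def one_hot(index, size):
    """Vector of length `size` with a single 1 at position `index`."""
    return [1 if i == index else 0 for i in range(size)]


print(encode_hand("B2AAKJJJ0933"))
print([one_hot(i, 5) for i in range(5)])
```

Under the assumed ordering, the count vector for B2AAKJJJ0933 matches the label [200000113012110] given above, and the five one-hot vectors match the five main strategy labels.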
In one embodiment, whether the strategy prediction model has converged is determined as follows. The sum of the first, second, third, and fourth loss values is calculated and taken as the total loss value. The historical total loss value is obtained, where the historical total loss value is the total loss value from the previous round of model training, and the difference between the historical total loss value and the current total loss value is calculated. It is then determined whether this difference is less than or equal to a preset threshold. If the difference is less than or equal to the preset threshold, the strategy prediction model is determined to have converged; if the difference is greater than the preset threshold, the strategy prediction model is determined not to have converged. It should be noted that the preset threshold can be set according to actual conditions and is not specifically limited in this application.
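As a minimal sketch of this check, the four loss values can be summed and compared with the previous total. The function name and the numbers are hypothetical, and using the absolute difference is an assumption; note that, unlike the role election model, the comparison here is "less than or equal to".

```python
def strategy_model_converged(loss_values, history_total_loss, threshold):
    """Return (total_loss, converged) for one training step of the strategy model."""
    total_loss = sum(loss_values)  # first + second + third + fourth loss value
    # Assumption: the "difference" is compared as an absolute value.
    converged = abs(history_total_loss - total_loss) <= threshold
    return total_loss, converged


# Made-up loss values chosen to be exactly representable in binary floating point.
losses = (0.5, 0.25, 0.125, 0.125)
print(strategy_model_converged(losses, history_total_loss=1.25, threshold=0.25))
print(strategy_model_converged(losses, history_total_loss=1.25, threshold=0.125))
```

The first call sits exactly on the threshold and therefore counts as converged; the second uses a tighter threshold and does not.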
The training method for an AI model provided by the above embodiment obtains a first sample data set that stores the image-like features and vector features required by the role election model and is annotated with role election labels, and iteratively trains the role election model on it, which can yield an accurate role election model. It likewise obtains a second sample data set that stores the image-like features and vector features required by the strategy prediction model and is annotated with strategy labels, and iteratively trains the strategy prediction model on it, which can yield an accurate strategy prediction model. The result is an AI model comprising the role election model and the strategy prediction model, which can improve the card-playing efficiency and accuracy of the AI model.
The embodiments of this application also provide a call method for an AI model. The call method can be applied to a server, where the server may be a single server or a server cluster composed of multiple servers.
Referring to Fig. 7, Fig. 7 is a schematic flowchart of a call method for an AI model provided by an embodiment of this application.
As shown in Fig. 7, the call method for the AI model includes steps S201 to S204.
Step S201: determine whether a call instruction for the AI model is triggered, where the AI model includes a role election model and a strategy prediction model.
The AI model is stored in the server and includes the role election model and the strategy prediction model. The role election model includes three fully connected layers, two convolutional layers, and one vector concatenation layer: the first fully connected layer is connected to the second fully connected layer, the second fully connected layer is connected to the vector concatenation layer, the first convolutional layer is connected to the second convolutional layer, the second convolutional layer is connected to the vector concatenation layer, and the vector concatenation layer is connected to the third fully connected layer.
The strategy prediction model includes seven fully connected layers, two convolutional layers, and two vector concatenation layers: the first fully connected layer is connected to the second fully connected layer, the second fully connected layer is connected to the first vector concatenation layer, the first convolutional layer is connected to the second convolutional layer, the second convolutional layer is connected to the first vector concatenation layer, the first vector concatenation layer is connected to the third, fourth, and fifth fully connected layers respectively, the third, fourth, and fifth fully connected layers are connected to the second vector concatenation layer, and the second vector concatenation layer is connected to the sixth and seventh fully connected layers respectively.
In one embodiment, whether a call instruction for the AI model is triggered is determined as follows. During the game, it is monitored whether the game state of a game participant is the offline state. When the game state of a game participant is detected to be offline, the call instruction for the AI model is triggered; when the game state is detected to be online, the call instruction is not triggered. By monitoring the game state of the game participants, the AI model can be called to host the game when a participant goes offline. Because the AI model is relatively accurate, this can reduce the losses caused by going offline and improve the user experience.
In one embodiment, whether a call instruction for the AI model is triggered is determined as follows. A game control instruction sent by a game participant's terminal is received, where the game control instruction includes a game control label. It is then judged whether the game control label belongs to a preset label group, where the preset label group includes the labels corresponding to online hosting, man-machine battle, and quick match. If the game control label belongs to the preset label group, the call instruction for the AI model is triggered; if it does not, the call instruction is not triggered.
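The label check can be sketched as a simple set-membership test. The label strings, the dictionary key `game_control_label`, and the function name are all hypothetical; the text does not specify how the labels or the instruction are represented.

```python
# Hypothetical label values for the three modes named in the text.
PRESET_LABEL_GROUP = {"online_hosting", "man_machine_battle", "quick_match"}


def should_trigger_ai_call(game_control_instruction):
    """Trigger the AI model call only if the instruction's label is in the preset group."""
    label = game_control_instruction.get("game_control_label")
    return label in PRESET_LABEL_GROUP


print(should_trigger_ai_call({"game_control_label": "online_hosting"}))
print(should_trigger_ai_call({"game_control_label": "real_player_battle"}))
```

An instruction without a label, or with a label outside the group (such as a battle between real players), does not trigger the call.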
The game control instruction is triggered as follows. A game participant takes part in the game through a terminal and, during the game, can click the online hosting control to trigger the game control instruction corresponding to online hosting. This game control instruction includes the label corresponding to online hosting and is sent to the server, so that the server performs the online hosting operation based on it. Because the online hosting operation needs to call the AI model, the call instruction for the AI model is triggered.
Similarly, when a game participant takes part in the game through a terminal, the game offers several modes, including a man-machine battle mode, a quick match mode, and a real-player battle mode, so the participant can choose among different game modes. After the participant selects a game mode, the game control instruction corresponding to the selected mode is generated and sent to the server, so that the server can call the corresponding game data. Because modes such as the man-machine battle mode and the quick match mode need to call the AI model, the call instruction for the AI model is triggered.
Step S202: if a triggered call instruction is detected, obtain the current game data and, according to the current game data, determine whether the model to be called is the role election model or the strategy prediction model.
If a triggered call instruction for the AI model is detected, the server obtains the current game data and, according to the current game data, determines whether the model to be called is the role election model or the strategy prediction model.
The current game data includes, but is not limited to, the held-character information of the game participants, the character output information of each round, information on characters that have not yet appeared, hidden-character information, multiplier information, and role information. The multiplier information represents the multiplier of the game; the held-character information includes the number of held characters and each held character; the hidden-character information includes the number of hidden characters and each hidden character; the information on characters that have not yet appeared includes the number of such characters and each such character; and the role information represents the role each game participant plays in the game.
It should be noted that different types of games include different information in their current game data. Taking Dou Dizhu (Fight the Landlord) as an example, the current game data includes the hand information of the three game participants, the play information of each round, the multiplier information, the role information, and information on cards that have not yet appeared. The multiplier information represents the multiplier of the game; the hand information includes the number of cards held and the character of each card held; the hole-card information includes the number of hole cards and the character of each hole card; the information on cards that have not yet appeared includes the number of such cards and the character of each such card; and the role information represents the role each game participant plays in the game, namely landlord or peasant.
In one embodiment, whether the model to be called is the role election model or the strategy prediction model is determined as follows. The role label of each game participant is obtained from the current game data, and it is determined whether the role labels of all game participants are the same. If the role labels of all game participants are the same, the model to be called is determined to be the strategy prediction model; if the role label of at least one game participant differs, the model to be called is determined to be the role election model.
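The selection rule, exactly as worded in the paragraph above, reduces to checking whether all role labels are equal. This is a minimal sketch: the function name, the returned strings, and the example label values are hypothetical.

```python
def select_model(role_labels):
    """Pick the model to call from the participants' role labels, as worded in the text."""
    if not role_labels:
        raise ValueError("at least one role label is required")
    all_same = len(set(role_labels)) == 1
    # Per the text: identical labels -> strategy prediction model,
    # at least one differing label -> role election model.
    return "strategy_prediction_model" if all_same else "role_election_model"


print(select_model(["undetermined", "undetermined", "undetermined"]))  # hypothetical labels
print(select_model(["landlord", "peasant", "peasant"]))
```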
Step S203: if the model to be called is the role election model, call the role election model to determine a role election label based on the current game data, and generate the role election instruction corresponding to the role election label.
If the model to be called is the role election model, the server calls the role election model to determine a role election label based on the current game data, and generates the role election instruction corresponding to the role election label. The server either executes the role election instruction itself or sends it to the game server, which then executes it.
The role election label is either the label corresponding to taking part in the role election or the label corresponding to not taking part. If the role election label is the label corresponding to taking part in the role election, the operation of taking part is performed; if it is the label corresponding to not taking part, the operation of not taking part is performed.
Specifically, the first vector features and first-type image-like features of the corresponding game participant are extracted from the current game data, where the first vector features and first-type image-like features are the features required by the role election model. The first vector features are processed by two fully connected layers, namely the first and second fully connected layers, to obtain a first target vector. The first-type image-like features are convolved by two convolutional layers, namely the first and second convolutional layers, to obtain a second target vector. The vector concatenation layer concatenates the first target vector and the second target vector to obtain a concatenated vector, and the role election layer determines the role election label of the game participant based on the concatenated vector.
Step S204: if the model to be called is the strategy prediction model, call the strategy prediction model to determine a strategy prediction result based on the current game data, and generate the strategy output instruction corresponding to the strategy prediction result.
If the model to be called is the strategy prediction model, the strategy prediction model is called to determine a strategy prediction result based on the current game data, and the strategy output instruction corresponding to the strategy prediction result is generated. The server either executes the strategy output instruction itself or sends it to the game server, which then executes it. For example, if the strategy prediction result is 9993, the strategy output instruction corresponding to 9993 is generated, and the server or the game server selects 9993 to play based on that instruction.
Specifically, the second vector features and second-type image-like features of the corresponding game participant are extracted from the current game data, where the second vector features and second-type image-like features are the features required by the strategy prediction model. The second vector features are processed by the first and second fully connected layers to obtain a first target vector. The second-type image-like features are convolved by the first and second convolutional layers to obtain a second target vector. The first vector concatenation layer concatenates the first target vector and the second target vector to obtain a first concatenated vector. The third fully connected layer determines the first guess label based on the first concatenated vector, and the fourth fully connected layer determines the second guess label based on the first concatenated vector. The fifth fully connected layer processes the first concatenated vector, and the second vector concatenation layer concatenates the first guess label, the second guess label, and the processed first concatenated vector to obtain a second concatenated vector. The sixth fully connected layer determines the main strategy label and the main strategy based on the second concatenated vector, and the seventh fully connected layer determines the secondary strategy label and the secondary strategy based on the second concatenated vector. The main strategy and the secondary strategy are taken as the strategy prediction result of the game participant.
In one embodiment, the strategy prediction result of the game participant is determined from the main strategy label and secondary strategy label output by the strategy prediction model together with the current game data. Specifically, the game participant's current hand information and historical play information are obtained from the current game data, and the strategy prediction result is determined from the historical play information, the current hand information, the main strategy label, and the secondary strategy label. To do so, it is judged whether the previous player's play in the historical play information is empty. If it is empty, the play information corresponding to the main strategy label and the secondary strategy label is obtained from the current hand information and used as the strategy prediction result of the game participant. If it is not empty, the play information corresponding to the main strategy label and the secondary strategy label is obtained from the current hand information such that the main strategy in that play information is greater than the main strategy of the previous player's play.
Referring to Fig. 8, Fig. 8 is a schematic diagram of a scenario for training and calling the AI model in an embodiment of this application. As shown in Fig. 8, the scenario includes a model training flow and a model call flow, and the model training flow includes a role election model training flow and a strategy prediction model training flow. In the role election model training flow, role election features, including the required vector features and image-like features, are extracted from game data; role election labels are then extracted to form a sample data set, and the role election model is trained on that sample data set. In the strategy prediction model training flow, strategy prediction features, including the required vector features and image-like features, are extracted from game data; strategy labels, including guess labels and strategy prediction labels, are then extracted to form a sample data set, and the strategy prediction model is trained on that sample data set. In the model call flow, the game server sends game data to the AI server, and the AI server determines whether the model to be called is the role election model or the strategy prediction model. If the model to be called is the role election model, the role election model is called and a role election instruction is output to the game server; if the model to be called is the strategy prediction model, the strategy prediction model is called and a strategy output instruction is output to the game server.
With the call method for an AI model provided by the above embodiment, a triggered call instruction allows the AI model to be called quickly to perform role election and play prediction, which improves the calling speed of the AI model. Because the AI model performs well, role election and play prediction can be carried out accurately, which effectively improves the user experience.
The apparatus provided by the above embodiments can be implemented in the form of a computer program, and the computer program can run on the server shown in Fig. 9.
Referring to Fig. 9, Fig. 9 is a schematic structural block diagram of a server provided by an embodiment of this application. The server may be a single server or a server cluster composed of multiple servers.
As shown in Fig. 9, the server includes a processor, a memory, and a network interface connected through a system bus, where the memory may include a non-volatile storage medium and an internal memory. The memory stores an AI model, and the AI model includes a role election model and a strategy prediction model.
The non-volatile storage medium can store an operating system and a computer program. The computer program includes program instructions which, when executed, can cause the processor to perform any one of the AI model call methods.
The processor provides computing and control capabilities and supports the operation of the entire computer device.
The internal memory provides an environment for running the computer program stored in the non-volatile storage medium; when the computer program is executed by the processor, it can cause the processor to perform any one of the AI model call methods. It should be noted that those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working process of the server described above may refer to the corresponding process in the foregoing embodiments of the AI model call method, and is not repeated here.
The network interface is used for network communication, such as sending assigned tasks. Those skilled in the art will understand that the structure shown in Fig. 9 is only a block diagram of the part of the structure relevant to the solution of this application and does not limit the computer device to which the solution is applied; a specific server may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
It should be understood that the processor may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
In one embodiment, the processor is configured to run the computer program stored in the memory to implement the following steps:
Determine whether a call instruction for the AI model is triggered, where the AI model includes a role election model and a strategy prediction model;
If a triggered call instruction is detected, obtain the current game data and, according to the current game data, determine whether the model to be called is the role election model or the strategy prediction model;
If the model to be called is the role election model, call the role election model to determine a role election label based on the current game data, and generate the role election instruction corresponding to the role election label;
If the model to be called is the strategy prediction model, call the strategy prediction model to determine a strategy prediction result based on the current game data, and generate the strategy output instruction corresponding to the strategy prediction result.
In one embodiment, the role election model includes two fully connected layers, two convolutional layers, a vector concatenation layer, and a role election layer, where the role election layer is a fully connected layer. When calling the role election model to determine the role election label based on the current game data, the processor is configured to implement:
Extract the first vector features and first-type image-like features of the corresponding game participant from the current game data;
Process the first vector features through the two fully connected layers to obtain a first target vector;
Convolve the first-type image-like features through the two convolutional layers to obtain a second target vector;
Concatenate the first target vector and the second target vector through the vector concatenation layer to obtain a concatenated vector;
Determine the role election label of the game participant based on the concatenated vector through the role election layer.
In one embodiment, the strategy prediction model includes a first fully connected layer, a second fully connected layer, a third fully connected layer, a fourth fully connected layer, a fifth fully connected layer, a sixth fully connected layer, a seventh fully connected layer, a first convolutional layer, a second convolutional layer, a first vector concatenation layer, and a second vector concatenation layer. When calling the strategy prediction model to determine the strategy prediction result based on the current game data, the processor is configured to implement:
Extract the second vector features and second-type image-like features of the corresponding game participant from the current game data;
Process the second vector features through the first and second fully connected layers to obtain a first target vector;
Convolve the second-type image-like features through the first and second convolutional layers to obtain a second target vector;
Concatenate the first target vector and the second target vector through the first vector concatenation layer to obtain a first concatenated vector;
Determine the first guess label based on the first concatenated vector through the third fully connected layer, and determine the second guess label based on the first concatenated vector through the fourth fully connected layer;
Process the first concatenated vector through the fifth fully connected layer, and concatenate the first guess label, the second guess label, and the processed first concatenated vector through the second vector concatenation layer to obtain a second concatenated vector;
Determine the main strategy label and the main strategy based on the second concatenated vector through the sixth fully connected layer, and determine the secondary strategy label and the secondary strategy based on the second concatenated vector through the seventh fully connected layer;
Take the main strategy and the secondary strategy as the strategy prediction result of the game participant.
In one embodiment, when determining whether to trigger a call instruction for the AI model, the processor is configured to:
Monitor, during the game, whether the game state of a participating player is the offline state;
Trigger the call instruction for the AI model when the monitored game state of the participating player is the offline state;
Not trigger the call instruction for the AI model when the monitored game state of the participating player is the online state.
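The offline-state trigger described above can be sketched as a simple predicate. This is only an illustration: the `GameState` enum and the function name `should_trigger_ai_call` are hypothetical names introduced here, not identifiers from the application.

```python
from enum import Enum


class GameState(Enum):
    """Hypothetical game states of a participating player."""
    ONLINE = "online"
    OFFLINE = "offline"


def should_trigger_ai_call(state: GameState) -> bool:
    """Return True only when the monitored player is offline,
    i.e. when the AI model should be called to take over."""
    return state is GameState.OFFLINE


if __name__ == "__main__":
    for state in GameState:
        print(state.name, should_trigger_ai_call(state))
```

In a real server the state would come from a connection monitor; here it is passed in directly to keep the sketch self-contained.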
In one embodiment, when determining whether to trigger a call instruction for the AI model, the processor is configured to:
Receive a game control instruction sent by a game participant terminal, wherein the game control instruction includes a game control label;
Determine whether the game control label belongs to a preset label set, wherein the preset label set includes the labels corresponding to online hosting, human-machine battle and quick game matching;
Trigger the call instruction for the AI model if the game control label belongs to the preset label set, and do not trigger the call instruction for the AI model if the game control label does not belong to the preset label set.
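The label-based trigger amounts to a set membership test. The sketch below assumes the control instruction arrives as a dictionary with a `control_label` key and that the three labels are encoded as the strings shown; both the key and the string values are assumptions made for illustration.

```python
# Hypothetical string encodings of the three labels named in the text.
PRESET_LABELS = frozenset({"online_hosting", "human_vs_machine", "quick_match"})


def should_trigger_from_command(command: dict) -> bool:
    """Trigger the AI model call only if the control label carried by the
    game control instruction belongs to the preset label set."""
    label = command.get("control_label")  # assumed field name
    return label in PRESET_LABELS


if __name__ == "__main__":
    print(should_trigger_from_command({"control_label": "quick_match"}))
    print(should_trigger_from_command({"control_label": "friend_room"}))
```

A missing label is treated the same as a label outside the set, so no call instruction is triggered in that case.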
An embodiment of the present application further provides a computer-readable storage medium storing a computer program. The computer program includes program instructions that, when executed, implement the following steps:
Determining whether a call instruction for the AI model is triggered, wherein the AI model includes a role election model and a strategy prediction model;
If a triggered call instruction is detected, obtaining current game data and determining, according to the current game data, whether the model to be called is the role election model or the strategy prediction model;
If the model to be called is the role election model, calling the role election model to determine a role election label based on the current game data, and generating a role election instruction corresponding to the role election label;
If the model to be called is the strategy prediction model, calling the strategy prediction model to determine a strategy prediction result based on the current game data, and generating a strategy output instruction corresponding to the strategy prediction result.
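The dispatch between the two models can be sketched as follows. The text does not state here how the model is chosen from the game data, so the sketch assumes a hypothetical `phase` field; the two models are passed in as plain callables, and the returned dictionaries stand in for the generated instructions.

```python
from typing import Callable, Tuple


def handle_ai_call(
    game_data: dict,
    role_model: Callable[[dict], int],
    strategy_model: Callable[[dict], Tuple[int, int]],
) -> dict:
    """Pick the model to call from the current game data and build the
    corresponding instruction (represented here as a plain dict)."""
    # Assumption: a 'phase' field tells whether roles are still being elected.
    if game_data.get("phase") == "role_election":
        label = role_model(game_data)
        return {"instruction": "role_election", "label": label}
    main, secondary = strategy_model(game_data)
    return {"instruction": "strategy_output", "main": main, "secondary": secondary}


if __name__ == "__main__":
    # Stub models used only to exercise the dispatch logic.
    stub_role = lambda data: 1
    stub_strategy = lambda data: (3, 7)
    print(handle_ai_call({"phase": "role_election"}, stub_role, stub_strategy))
    print(handle_ai_call({"phase": "play"}, stub_role, stub_strategy))
```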
In one embodiment, the role election model includes two fully connected layers, two convolutional layers, a vector concatenation layer and a role election layer, the role election layer being a fully connected layer. When calling the role election model to determine a role election label based on the current game data, the processor is configured to:
Extract the first vector feature and the first-class image feature of the corresponding game participant from the current game data;
Process the first vector feature through the two fully connected layers to obtain a first target vector;
Convolve the first-class image feature through the two convolutional layers to obtain a second target vector;
Concatenate the first target vector and the second target vector through the vector concatenation layer to obtain a concatenated vector;
Determine the role election label of the game participant based on the concatenated vector through the role election layer.
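A minimal forward pass with this layer layout might look like the sketch below. It is written in plain Python so that it runs without dependencies. The layer sizes, ReLU activations, single-channel convolution, softmax output and random weights are all assumptions chosen for illustration; the text fixes only the order of the layers.

```python
import math
import random


def dense(x, w, b, relu=True):
    """Fully connected layer: y = W x + b, optionally followed by ReLU."""
    out = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
    return [max(0.0, v) for v in out] if relu else out


def conv2d(img, kernel):
    """Single-channel 'valid' 2D convolution followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    rows, cols = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[max(0.0, sum(img[r + i][c + j] * kernel[i][j]
                          for i in range(kh) for j in range(kw)))
             for c in range(cols)] for r in range(rows)]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def role_election_forward(vec, img, p):
    first = dense(dense(vec, p["w1"], p["b1"]), p["w2"], p["b2"])  # two FC layers
    fmap = conv2d(conv2d(img, p["k1"]), p["k2"])                   # two conv layers
    second = [v for row in fmap for v in row]                      # flatten feature map
    concat = first + second                                        # concatenation layer
    probs = softmax(dense(concat, p["w_out"], p["b_out"], relu=False))  # election layer
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs


rng = random.Random(0)
params = {
    "w1": rand_matrix(rng, 8, 4), "b1": [0.0] * 8,
    "w2": rand_matrix(rng, 6, 8), "b2": [0.0] * 6,
    "k1": rand_matrix(rng, 3, 3), "k2": rand_matrix(rng, 3, 3),
    "w_out": rand_matrix(rng, 2, 10), "b_out": [0.0] * 2,  # 6 + 4 inputs, 2 labels
}
vec = [1.0, 0.0, 0.5, 0.25]                                         # toy vector feature
img = [[float((r + c) % 2) for c in range(6)] for r in range(6)]    # toy 6x6 image feature
label, probs = role_election_forward(vec, img, params)
print(label, probs)
```

The 6x6 image shrinks to 4x4 and then 2x2 under the two 3x3 convolutions, so the flattened second target vector has 4 entries and the concatenated vector has 10.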
In one embodiment, the strategy prediction model includes a first fully connected layer, a second fully connected layer, a third fully connected layer, a fourth fully connected layer, a fifth fully connected layer, a sixth fully connected layer, a seventh fully connected layer, a first convolutional layer, a second convolutional layer, a first vector concatenation layer and a second vector concatenation layer. When calling the strategy prediction model to determine a strategy prediction result based on the current game data, the processor is configured to:
Extract the second vector feature and the second-class image feature of the corresponding game participant from the current game data;
Process the second vector feature through the first fully connected layer and the second fully connected layer to obtain a first target vector;
Convolve the second-class image feature through the first convolutional layer and the second convolutional layer to obtain a second target vector;
Concatenate the first target vector and the second target vector through the first vector concatenation layer to obtain a first concatenated vector;
Determine a first guess label based on the first concatenated vector through the third fully connected layer, and determine a second guess label based on the first concatenated vector through the fourth fully connected layer;
Process the first concatenated vector through the fifth fully connected layer, and concatenate the first guess label, the second guess label and the processed first concatenated vector through the second vector concatenation layer to obtain a second concatenated vector;
Determine a main strategy label and a main strategy based on the second concatenated vector through the sixth fully connected layer, and determine a secondary strategy label and a secondary strategy based on the second concatenated vector through the seventh fully connected layer;
Use the main strategy and the secondary strategy as the strategy prediction result of the game participant.
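The two-stage concatenation of the strategy prediction model can be sketched in the same dependency-free style. Layer sizes, activations and random weights are illustrative assumptions. The sketch also assumes that the guess labels enter the second concatenation as their softmax probability vectors, which the text does not specify.

```python
import math
import random


def dense(x, layer, relu=True):
    """Fully connected layer given as a (weights, biases) pair."""
    w, b = layer
    out = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
    return [max(0.0, v) for v in out] if relu else out


def conv2d(img, kernel):
    """Single-channel 'valid' 2D convolution followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    rows, cols = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[max(0.0, sum(img[r + i][c + j] * kernel[i][j]
                          for i in range(kh) for j in range(kw)))
             for c in range(cols)] for r in range(rows)]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def argmax(p):
    return max(range(len(p)), key=p.__getitem__)


def make_layer(rng, n_out, n_in):
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


def strategy_forward(vec, img, p):
    first = dense(dense(vec, p["fc1"]), p["fc2"])             # FC1 -> FC2
    fmap = conv2d(conv2d(img, p["conv1"]), p["conv2"])         # Conv1 -> Conv2
    second = [v for row in fmap for v in row]
    concat1 = first + second                                   # first concatenation
    guess1 = softmax(dense(concat1, p["fc3"], relu=False))     # FC3: first guess
    guess2 = softmax(dense(concat1, p["fc4"], relu=False))     # FC4: second guess
    hidden = dense(concat1, p["fc5"])                          # FC5: processed vector
    concat2 = guess1 + guess2 + hidden                         # second concatenation
    main = softmax(dense(concat2, p["fc6"], relu=False))       # FC6: main strategy
    secondary = softmax(dense(concat2, p["fc7"], relu=False))  # FC7: secondary strategy
    return {"guess1": argmax(guess1), "guess2": argmax(guess2),
            "main": argmax(main), "secondary": argmax(secondary),
            "concat2_len": len(concat2)}


rng = random.Random(0)
params = {
    "fc1": make_layer(rng, 8, 4), "fc2": make_layer(rng, 6, 8),
    "conv1": make_layer(rng, 3, 3)[0], "conv2": make_layer(rng, 3, 3)[0],
    "fc3": make_layer(rng, 3, 10), "fc4": make_layer(rng, 3, 10),
    "fc5": make_layer(rng, 5, 10),
    "fc6": make_layer(rng, 4, 11), "fc7": make_layer(rng, 4, 11),
}
vec = [1.0, 0.0, 0.5, 0.25]
img = [[float((r + c) % 2) for c in range(6)] for r in range(6)]
result = strategy_forward(vec, img, params)
print(result)
```

Feeding the guesses into the second concatenation lets the main and secondary strategy heads condition on the model's own intermediate predictions, which is the design the layer ordering suggests.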
In one embodiment, when determining whether to trigger a call instruction for the AI model, the processor is configured to:
Monitor, during the game, whether the game state of a participating player is the offline state;
Trigger the call instruction for the AI model when the monitored game state of the participating player is the offline state;
Not trigger the call instruction for the AI model when the monitored game state of the participating player is the online state.
In one embodiment, when determining whether to trigger a call instruction for the AI model, the processor is configured to:
Receive a game control instruction sent by a game participant terminal, wherein the game control instruction includes a game control label;
Determine whether the game control label belongs to a preset label set, wherein the preset label set includes the labels corresponding to online hosting, human-machine battle and quick game matching;
Trigger the call instruction for the AI model if the game control label belongs to the preset label set, and do not trigger the call instruction for the AI model if the game control label does not belong to the preset label set.
The computer-readable storage medium may be an internal storage unit of the server described in the foregoing embodiments, such as a hard disk or memory of the server. The computer-readable storage medium may also be an external storage device of the server, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card (Flash Card) equipped on the server.
It should be understood that the terms used in this specification are for the purpose of describing particular embodiments only and are not intended to limit the application. As used in this specification and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes these combinations. It should be noted that, herein, the terms "include", "comprise" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or system that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or system. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or system that includes the element.
The serial numbers of the above embodiments of the present application are for description only and do not represent the relative merits of the embodiments. The above are only specific embodiments of the present application, but the protection scope of the application is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope of the present application, and these modifications or substitutions shall all fall within the protection scope of the application. Therefore, the protection scope of the application shall be subject to the protection scope of the claims.
Claims (10)
1. A training method for an artificial intelligence (AI) model, characterized by comprising:
Obtaining a first sample data set, wherein the first sample data set includes first-class image features, first vector features and annotated role election labels;
Obtaining a second sample data set, wherein the second sample data set includes second-class image features, second vector features and annotated strategy labels;
Loading an AI model to be trained, wherein the AI model includes a role election model and a strategy prediction model;
Iteratively training the role election model according to the first sample data set until the role election model converges, and iteratively training the strategy prediction model according to the second sample data set until the strategy prediction model converges.
2. The training method for an AI model according to claim 1, characterized in that the role election model includes two fully connected layers, two convolutional layers, a vector concatenation layer and a role election layer, the role election layer being a fully connected layer; and the iteratively training the role election model according to the first sample data set until the role election model converges comprises:
Obtaining one group of sample data from the first sample data set each time, wherein the sample data includes a first-class image feature, a first vector feature and a role election label;
Processing the first vector feature through the two fully connected layers to obtain a first target vector, and convolving the first-class image feature through the two convolutional layers to obtain a second target vector;
Concatenating the first target vector and the second target vector through the vector concatenation layer to obtain a concatenated vector, and processing the concatenated vector through the role election layer to obtain an output probability of the role election label;
Calculating a current loss value according to the role election label and the output probability, and determining according to the current loss value whether the role election model has converged;
If the role election model has converged, stopping model training; if the role election model has not converged, updating the parameters of the role election model and continuing to train the updated role election model.
3. The training method for an AI model according to claim 1, characterized in that the strategy prediction model includes a first fully connected layer, a second fully connected layer, a third fully connected layer, a fourth fully connected layer, a fifth fully connected layer, a sixth fully connected layer, a seventh fully connected layer, a first convolutional layer, a second convolutional layer, a first vector concatenation layer and a second vector concatenation layer; and the iteratively training the strategy prediction model according to the second sample data set until the strategy prediction model converges comprises:
Obtaining one group of sample data from the second sample data set each time, wherein the sample data includes a second-class image feature, a second vector feature and strategy labels, the strategy labels including a first guess label, a second guess label, a main strategy label and a secondary strategy label;
Processing the second vector feature through the first fully connected layer and the second fully connected layer to obtain a first target vector;
Convolving the second-class image feature through the first convolutional layer and the second convolutional layer to obtain a second target vector;
Concatenating the first target vector and the second target vector through the first vector concatenation layer to obtain a first concatenated vector;
Determining an output probability of the first guess label based on the first concatenated vector through the third fully connected layer, and calculating a first loss value according to the first guess label and the corresponding output probability;
Determining an output probability of the second guess label based on the first concatenated vector through the fourth fully connected layer, and calculating a second loss value according to the second guess label and the corresponding output probability;
Processing the first concatenated vector through the fifth fully connected layer, and concatenating the first guess label, the second guess label and the processed first concatenated vector through the second vector concatenation layer to obtain a second concatenated vector;
Determining an output probability of the main strategy label based on the second concatenated vector through the sixth fully connected layer, and calculating a third loss value according to the main strategy label and the corresponding output probability;
Determining an output probability of the secondary strategy label based on the second concatenated vector through the seventh fully connected layer, and calculating a fourth loss value according to the secondary strategy label and the corresponding output probability;
Determining whether the strategy prediction model has converged according to the first loss value, the second loss value, the third loss value and the fourth loss value;
If the strategy prediction model has converged, stopping model training; if the strategy prediction model has not converged, updating the parameters of the strategy prediction model and continuing to train the updated strategy prediction model.
4. The training method for an AI model according to claim 3, characterized in that the determining whether the strategy prediction model has converged according to the first loss value, the second loss value, the third loss value and the fourth loss value further comprises:
Calculating the sum of the first loss value, the second loss value, the third loss value and the fourth loss value, and using this sum as a total loss value;
Obtaining a historical total loss value, and calculating the difference between the historical total loss value and the total loss value, wherein the historical total loss value is the total loss value of the previous round of model training;
Determining whether the difference between the historical total loss value and the total loss value is less than or equal to a preset threshold;
If the difference between the historical total loss value and the total loss value is less than or equal to the preset threshold, determining that the strategy prediction model has converged;
If the difference between the historical total loss value and the total loss value is greater than the preset threshold, determining that the strategy prediction model has not converged.
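As an informal illustration of the convergence test in claim 4 (not part of the claims), the sketch below sums four per-head losses and compares the total with that of the previous round. Using cross-entropy as the loss and taking the absolute value of the difference are assumptions; the claim says only that each loss is calculated from a label and its output probability and that the difference is compared with a preset threshold. All function and key names are hypothetical.

```python
import math

HEADS = ("guess1", "guess2", "main", "secondary")


def cross_entropy(probs, label):
    """Negative log-probability assigned to the annotated label (assumed loss)."""
    return -math.log(max(probs[label], 1e-12))


def total_loss(outputs, labels):
    """Sum of the first, second, third and fourth loss values."""
    return sum(cross_entropy(outputs[h], labels[h]) for h in HEADS)


def has_converged(previous_total, current_total, threshold):
    """Converged when the total loss changed by at most the preset threshold."""
    return abs(previous_total - current_total) <= threshold


outputs = {h: [0.5, 0.5] for h in HEADS}   # toy output probabilities per head
labels = {h: 0 for h in HEADS}             # toy annotated labels
current = total_loss(outputs, labels)      # equals 4 * ln(2) for this toy input
print(current, has_converged(current + 0.0005, current, 1e-3))
```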
5. A calling method for an AI model, characterized by comprising:
Determining whether a call instruction for the AI model is triggered, wherein the AI model includes a role election model and a strategy prediction model;
If a triggered call instruction is detected, obtaining current game data and determining, according to the current game data, whether the model to be called is the role election model or the strategy prediction model;
If the model to be called is the role election model, calling the role election model to determine a role election label based on the current game data, and generating a role election instruction corresponding to the role election label;
If the model to be called is the strategy prediction model, calling the strategy prediction model to determine a strategy prediction result based on the current game data, and generating a strategy output instruction corresponding to the strategy prediction result.
6. The calling method for an AI model according to claim 5, characterized in that the role election model includes two fully connected layers, two convolutional layers, a vector concatenation layer and a role election layer, the role election layer being a fully connected layer; and the calling the role election model to determine a role election label based on the current game data comprises:
Extracting the first vector feature and the first-class image feature of the corresponding game participant from the current game data;
Processing the first vector feature through the two fully connected layers to obtain a first target vector;
Convolving the first-class image feature through the two convolutional layers to obtain a second target vector;
Concatenating the first target vector and the second target vector through the vector concatenation layer to obtain a concatenated vector;
Determining the role election label of the game participant based on the concatenated vector through the role election layer.
7. The calling method for an AI model according to claim 5, characterized in that the strategy prediction model includes a first fully connected layer, a second fully connected layer, a third fully connected layer, a fourth fully connected layer, a fifth fully connected layer, a sixth fully connected layer, a seventh fully connected layer, a first convolutional layer, a second convolutional layer, a first vector concatenation layer and a second vector concatenation layer; and the calling the strategy prediction model to determine a strategy prediction result based on the current game data comprises:
Extracting the second vector feature and the second-class image feature of the corresponding game participant from the current game data;
Processing the second vector feature through the first fully connected layer and the second fully connected layer to obtain a first target vector;
Convolving the second-class image feature through the first convolutional layer and the second convolutional layer to obtain a second target vector;
Concatenating the first target vector and the second target vector through the first vector concatenation layer to obtain a first concatenated vector;
Determining a first guess label based on the first concatenated vector through the third fully connected layer, and determining a second guess label based on the first concatenated vector through the fourth fully connected layer;
Processing the first concatenated vector through the fifth fully connected layer, and concatenating the first guess label, the second guess label and the processed first concatenated vector through the second vector concatenation layer to obtain a second concatenated vector;
Determining a main strategy label and a main strategy based on the second concatenated vector through the sixth fully connected layer, and determining a secondary strategy label and a secondary strategy based on the second concatenated vector through the seventh fully connected layer;
Using the main strategy and the secondary strategy as the strategy prediction result of the game participant.
8. The calling method for an AI model according to claim 5, characterized in that the determining whether a call instruction for the AI model is triggered comprises:
Receiving a game control instruction sent by a game participant terminal, wherein the game control instruction includes a game control label;
Determining whether the game control label belongs to a preset label set, wherein the preset label set includes the labels corresponding to online hosting, human-machine battle and quick game matching;
Triggering the call instruction for the AI model if the game control label belongs to the preset label set, and not triggering the call instruction for the AI model if the game control label does not belong to the preset label set.
9. A server, characterized in that the server includes a processor, a memory, and a computer program stored on the memory and executable by the processor, the memory storing an AI model that includes a role election model and a strategy prediction model, wherein the computer program, when executed by the processor, implements the steps of the calling method for an AI model according to any one of claims 5 to 8.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, wherein the computer program, when executed by a processor, implements the steps of the calling method for an AI model according to any one of claims 5 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910636868.3A CN110443284B (en) | 2019-07-15 | 2019-07-15 | Artificial intelligence AI model training method, calling method, server and readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110443284A true CN110443284A (en) | 2019-11-12 |
CN110443284B CN110443284B (en) | 2022-04-05 |
Family
ID=68430323
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910636868.3A Active CN110443284B (en) | 2019-07-15 | 2019-07-15 | Artificial intelligence AI model training method, calling method, server and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110443284B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN110443284B (en) | 2022-04-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||