CN111249734B - Data processing method and device, computer equipment and storage medium


Info

Publication number
CN111249734B
Authority
CN
China
Prior art keywords
action
virtual character
fight
target virtual
model
Legal status
Active
Application number
CN202010023018.9A
Other languages
Chinese (zh)
Other versions
CN111249734A (en)
Inventor
Kai Guan
Lei Lin
Changjie Fan
Zhipeng Hu
Current Assignee
Netease Hangzhou Network Co Ltd
Original Assignee
Netease Hangzhou Network Co Ltd
Application filed by Netease Hangzhou Network Co Ltd
Priority to CN202010023018.9A
Publication of CN111249734A
Application granted
Publication of CN111249734B
Status: Active
Anticipated expiration

Classifications

    • A - HUMAN NECESSITIES
    • A63 - SPORTS; GAMES; AMUSEMENTS
    • A63F - CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 - Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/55 - Controlling game characters or game objects based on the game progress
    • A63F13/56 - Computing the motion of game characters with respect to other game characters, game objects or elements of the game scene, e.g. for simulating the behaviour of a group of virtual soldiers or for path finding
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 - Road transport of goods or passengers
    • Y02T10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T10/40 - Engine management systems

Abstract

The application provides a data processing method, a data processing apparatus, a computer device, and a storage medium. The method first trains a plurality of first action models with different combat strategies based on the acquired game state features corresponding to a target virtual character in a target game scene, and trains a plurality of second action models with different combat strategies based on the game state features of a reference virtual character that battles the target virtual character. It then controls the target virtual character to battle according to the plurality of first action models and the reference virtual character to battle according to the plurality of second action models, and finally adjusts the attribute value of a skill attribute of the target virtual character according to the results of multiple rounds of battle. This scheme avoids the complicated, time-consuming, and labor-intensive workflow of manual adjustment; it is simple to operate and saves time and labor.

Description

Data processing method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a data processing method and apparatus, a computer device, and a storage medium.
Background
With the development of network technologies, network games are receiving more and more attention, especially role-playing games and real-time combat games. Regardless of the type of game, the balance of the game characters is a key factor affecting the lifespan of the game. If the attributes of one game character are much stronger than those of another (i.e., one character is too strong and the other too weak), the game as a whole becomes unbalanced.
To solve this problem, the related art provides an attribute adjustment scheme in which a tester adjusts attributes on the basis of a game character's initial attribute values according to human experience, then manually tests the game balance of the character in a game environment corresponding to the adjusted attribute values; if the balance is still poor, the adjustment and testing are repeated.
This method therefore requires repeated manual attribute adjustment; the operation flow is complicated, time-consuming, and labor-intensive.
Disclosure of Invention
In view of this, an object of the present application is to provide at least one data processing scheme that can automatically control character battles using model training results and further automatically adjust character attributes using the battle results; the scheme is simple to operate and saves time and labor.
The scheme mainly comprises the following aspects:
In a first aspect, the present application provides a data processing method, including:
acquiring game state features corresponding to a target virtual character in a target game scene and game state features corresponding to a reference virtual character that battles the target virtual character;
training to obtain a plurality of first action models of the target virtual character according to the game state features corresponding to the target virtual character, and training to obtain a plurality of second action models of the reference virtual character based on the game state features corresponding to the reference virtual character that battles the target virtual character, wherein the combat strategies of different first action models differ from one another, and the combat strategies of different second action models differ from one another;
controlling the target virtual character to battle according to the first action models and the reference virtual character to battle according to the second action models, to obtain battle results after multiple rounds of battle; and
adjusting an attribute value of a skill attribute of the target virtual character according to the battle results.
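Taken end to end, these four steps form a train-battle-adjust loop. The following minimal Python sketch illustrates only that control flow, under loudly stated assumptions: the models, the battle, and the adjustment rule are toy stand-ins (random strength numbers and a linear win-rate rule), and every name in it is hypothetical rather than part of the claimed method.

    import random

    # Hypothetical sketch of the first-aspect pipeline; the toy stand-ins
    # below are illustrative only, not the claimed method.
    def train_models(n_models=3):
        # Stand-in for training: each "model" is reduced to a strength number.
        return [i + random.random() for i in range(n_models)]

    def run_round(first_models, second_models):
        # One round of battle: pick one model per side; 1 means the target wins.
        return 1 if random.choice(first_models) > random.choice(second_models) else -1

    def adjust_attribute(attr_value, results, step=2.0):
        # A target character that wins more than half the rounds is too strong,
        # so its skill-attribute value is lowered, and vice versa.
        win_rate = sum(r == 1 for r in results) / len(results)
        return attr_value - step * (win_rate - 0.5)

    first_models = train_models()    # first action models (target character)
    second_models = train_models()   # second action models (reference character)
    battle_results = [run_round(first_models, second_models) for _ in range(1000)]
    print(adjust_attribute(100.0, battle_results))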
In one embodiment, training to obtain the plurality of first action models of the target virtual character according to the game state features corresponding to the target virtual character includes:
for a first action model to be trained, training the first action model to be trained for a preset number of training times according to the game state features corresponding to the target virtual character, to obtain a trained first action model and updated game state features corresponding to the target virtual character;
for the next first action model to be trained, cyclically performing the step of training for the preset number of training times according to the updated game state features corresponding to the target virtual character, to obtain the trained next first action model and further updated game state features, until the plurality of first action models are obtained; wherein the combat strategy of each next first action model is an optimization of the combat strategy of the corresponding previous first action model.
In an embodiment, training the first action model to be trained for the preset number of training times according to the game state features corresponding to the target virtual character to obtain a trained first action model includes:
for the first action model to be trained, inputting the game state features corresponding to the target virtual character into the first action model to be trained, and determining execution action information output by the model; sending the execution action information to a user terminal, and receiving, from the user terminal, the updated game state features corresponding to the target virtual character after the target virtual character in the target game scene executes an action according to the execution action information; and determining an action reward value according to a comparison between the updated game state features and the game state features before the update;
inputting the determined action reward value and the updated game state features into the first action model to be trained again, determining the next execution action information output by the model, and cyclically performing the step of sending the next execution action information to the user terminal, until a first preset number of training times is reached, to obtain the trained first action model.
In one embodiment, training to obtain the plurality of first action models of the target virtual character according to the game state features corresponding to the target virtual character further includes:
for a trained next first action model, determining the target second action models corresponding to the next first action model, where the target second action models are the at least one second action model trained before the second action model corresponding to the next first action model is obtained through training;
for each second action model among the target second action models, controlling the target virtual character to battle according to the next first action model and the reference virtual character to battle according to that second action model, to obtain the battle results between the target virtual character and the reference virtual character;
when it is determined that the battle results between the target virtual character and the reference virtual character satisfy a preset battle win rate, determining the trained next first action model as the last trained model among the plurality of first action models.
In one embodiment, controlling the target virtual character to battle according to the first action models and the reference virtual character to battle according to the second action models to obtain battle results after multiple rounds of battle includes:
selecting one first action model from the plurality of first action models and one second action model from the plurality of second action models; and
controlling the target virtual character to battle according to the selected first action model and the reference virtual character to battle according to the selected second action model, to obtain the battle result of one battle;
adjusting the battle score of the selected first action model or of the selected second action model based on that battle result;
cyclically performing, based on the adjusted battle scores, the steps of selecting one first action model from the plurality of first action models and one second action model from the plurality of second action models, and controlling the target virtual character to battle according to the selected first action model and the reference virtual character according to the selected second action model to obtain the battle result of one battle, until a preset battle cutoff condition is reached, to obtain the battle results after multiple rounds of battle.
In one embodiment, controlling the target virtual character to battle according to the selected first action model and the reference virtual character to battle according to the selected second action model to obtain the battle result of one battle includes:
inputting the game state features corresponding to the target virtual character into the selected first action model and determining first execution action information output by that model; and inputting the acquired game state features corresponding to the reference virtual character into the selected second action model and determining second execution action information output by that model;
obtaining the battle result of one battle after controlling, in the target game scene, the target virtual character to execute a first action according to the first execution action information and the reference virtual character to execute a second action according to the second execution action information.
In one embodiment, adjusting the attribute value of the skill attribute of the target virtual character according to the battle results includes:
ranking the plurality of first action models in descending order of battle score based on the battle results after the multiple rounds of battle, to obtain a first ranking result, and ranking the plurality of second action models in descending order of battle score, to obtain a second ranking result;
adjusting the attribute value of the skill attribute of the target virtual character based on the first ranking result and the second ranking result.
In one embodiment, adjusting the attribute value of the skill attribute of the target virtual character based on the first ranking result and the second ranking result includes:
selecting the highest-scoring first action model from the first ranking result, and selecting the highest-scoring second action model from the second ranking result;
determining the battle score difference between the selected highest-scoring first action model and the selected highest-scoring second action model;
adjusting the attribute value of the skill attribute of the target virtual character based on the battle score difference.
In one embodiment, adjusting the attribute value of the skill attribute of the target virtual character based on the first ranking result and the second ranking result includes:
selecting a preset number of top-ranked first action models from the first ranking result, and selecting a preset number of top-ranked second action models from the second ranking result;
controlling the target virtual character to battle according to the preset number of first action models and the reference virtual character to battle according to the preset number of second action models, to obtain battle results after multiple rounds of battle;
ranking the preset number of first action models in descending order of battle score based on those battle results, to obtain a final first ranking result, and ranking the preset number of second action models in descending order of battle score, to obtain a final second ranking result;
adjusting the attribute value of the skill attribute of the target virtual character based on the final first ranking result and the final second ranking result.
In one embodiment, the preset battle cutoff condition includes one or more of the following:
the number of battles of a single first action model reaches a first preset number of battles;
the number of battles of a single second action model reaches a second preset number of battles;
the total number of battles reaches a third preset number of battles.
In some embodiments, the multiple rounds of battle include a first round of battle and a second round of battle, where the first action model used in the first round differs from the first action model used in the second round, and/or the second action model used in the first round differs from the second action model used in the second round.
In a second aspect, the present application further provides a data processing apparatus, comprising:
a feature acquisition module, configured to acquire game state features corresponding to a target virtual character in a target game scene and game state features corresponding to a reference virtual character that battles the target virtual character;
a model training module, configured to train to obtain a plurality of first action models of the target virtual character according to the game state features corresponding to the target virtual character, and to train to obtain a plurality of second action models of the reference virtual character based on the game state features corresponding to the reference virtual character that battles the target virtual character, where the combat strategies of different first action models differ from one another, as do those of different second action models;
a battle control module, configured to control the target virtual character to battle according to the first action models and the reference virtual character to battle according to the second action models, to obtain battle results after multiple rounds of battle; and
an attribute adjustment module, configured to adjust the attribute value of the skill attribute of the target virtual character according to the battle results.
In a third aspect, the present application further provides a computer device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when the computer device is running, the machine-readable instructions, when executed by the processor, performing the steps of the data processing method according to the first aspect and any of its various embodiments.
In a fourth aspect, the present application further provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the data processing method according to the first aspect and any of its various embodiments.
By adopting this scheme, the method first trains a plurality of first action models with different combat strategies based on the acquired game state features corresponding to the target virtual character in the target game scene, and trains a plurality of second action models with different combat strategies based on the game state features of the reference virtual character that battles the target virtual character; it then controls the target virtual character to battle according to the plurality of first action models and the reference virtual character to battle according to the plurality of second action models, and finally adjusts the attribute value of the skill attribute of the target virtual character according to the results of multiple rounds of battle.
In this scheme, training first action models with different combat strategies can simulate the operation behaviors of players of different levels on the target virtual character (for example, a first action model with a high-level combat strategy simulates the operation behavior of a high-level player on the target virtual character); similarly, training second action models with different combat strategies can simulate the operation behaviors of players of different levels on the reference virtual character. That is, the scheme comprehensively considers how well players of different levels master the game characters. The trained first and second action models can then automatically control multiple rounds of character battles between players of different levels to realize automatic adjustment of the character attributes, which avoids the complicated, time-consuming, and labor-intensive workflow of manual adjustment; the operation is simple and saves time and labor.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 is a flowchart illustrating a data processing method according to a first embodiment of the present application;
fig. 2 is a schematic diagram illustrating a specific example of training a plurality of first motion models in the data processing method according to a first embodiment of the present application;
fig. 3 is a schematic diagram illustrating a specific example of determining a last first motion model in the data processing method according to the first embodiment of the present application;
fig. 4 is a schematic diagram illustrating a specific example of determining a fight result in the data processing method according to the first embodiment of the present application;
fig. 5 is a schematic diagram illustrating a specific example of adjusting attribute values in the data processing method according to the first embodiment of the present application;
fig. 6 is a schematic diagram illustrating another specific example of adjusting attribute values in the data processing method according to the first embodiment of the present application;
fig. 7 is a schematic diagram illustrating another specific example of adjusting attribute values in the data processing method according to the first embodiment of the present application;
fig. 8 is a schematic diagram illustrating a data processing apparatus according to a second embodiment of the present application;
fig. 9 shows a schematic diagram of a computer device provided in the third embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
In the related art, a tester first adjusts attributes on the basis of a game character's initial attribute values according to human experience, then manually tests the game balance in a game environment corresponding to the adjusted attribute values; if the balance is still poor, the adjustment and testing are repeated. The operation process is complicated, time-consuming, and labor-intensive.
Based on this research, the present application provides at least one data processing scheme that can automatically control character battles using model training results and then automatically adjust character attributes using the battle results; the scheme is simple to operate and saves time and labor.
The above-mentioned drawbacks were identified by the inventors through practice and careful study. Therefore, the discovery of these problems and the solutions that the present application proposes for them in the following paragraphs are both contributions made by the inventors in the course of this application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined or explained in subsequent figures.
In order to facilitate understanding of the data processing method, apparatus, computer device and storage medium provided in the present application, a detailed description is provided below through several embodiments.
Embodiment One
Referring to the flowchart of the data processing method provided in the first embodiment of the present application shown in fig. 1, the execution subject of the method may be a game server, and the data processing method is specifically implemented through the following steps S101 to S104:
S101, acquiring game state features corresponding to a target virtual character in a target game scene and game state features corresponding to a reference virtual character that battles the target virtual character;
S102, training to obtain a plurality of first action models of the target virtual character according to the game state features corresponding to the target virtual character, and training to obtain a plurality of second action models of the reference virtual character based on the game state features corresponding to the reference virtual character that battles the target virtual character, where different first action models have different combat strategies and different second action models have different combat strategies;
S103, controlling the target virtual character to battle according to the first action models and the reference virtual character to battle according to the second action models, to obtain battle results after multiple rounds of battle;
S104, adjusting the attribute value of the skill attribute of the target virtual character according to the battle results.
Here, the attribute value of the skill attribute of the target virtual character may be adjusted based on the results of controlling the target virtual character and the reference virtual character to battle according to the trained first action models and second action models.
In this embodiment of the application, the battle score difference between a first action model and a second action model can be determined from the battle results, so that the attribute value of the skill attribute of the target virtual character can be adjusted according to that score difference. The main consideration is that the battle score difference directly reflects the difference in ability between the target virtual character and the reference virtual character in game battles: when the target virtual character is the stronger in battle, balance between the two characters can be achieved by lowering the attribute value of the target virtual character's skill attribute; likewise, when the target virtual character is the weaker, balance can be achieved by raising that attribute value.
To account for the influence that the ability differences among players of different levels have on the balance of game battles, the battle results in this embodiment can be obtained through multiple rounds of battle based on the trained plurality of first action models with different combat strategies and plurality of second action models with different combat strategies, where the first action models with different combat strategies can represent the operation behaviors of players of different levels on the target virtual character, and the second action models with different combat strategies can represent the operation behaviors of players of different levels on the reference virtual character.
The multiple rounds of battle in this embodiment include a first round of battle and a second round of battle, where the first action model used in the first round differs from that used in the second round, and/or the second action model used in the first round differs from that used in the second round. "First round" and "second round" do not limit the order of battle; they merely denote two different rounds.
That is, each round of battle may select one first action model from the plurality of first action models and one second action model from the plurality of second action models. In different rounds, the selected first action model differs, or the selected second action model differs, or both differ. In this way, after multiple rounds of battle, the differences in ability between players of each level operating the target virtual character and players of each level operating the reference virtual character are reflected in the battle results, which improves the applicability of the balance adjustment scheme across application scenarios.
The plurality of first action models may be trained sequentially: a first action model and updated game state features are obtained by training on the game state features corresponding to the target virtual character, then a second first action model and further updated game state features are obtained by training on the updated features, and so on until the plurality of first action models are obtained. The training of each first action model comprehensively considers the influence of the output actions and the action reward values on the game state features; that is, as the first action models are trained in sequence, the actions output by the models come closer to the actual actions, so the combat strategy becomes more and more optimized.
Similarly, the plurality of second action models may also be trained sequentially; the specific process is similar to that of the first action models and is not repeated here.
Note that the game state features may be updated while each of the plurality of first action models is trained, and likewise while each of the plurality of second action models is trained.
The game state features corresponding to the target virtual character may be obtained from one user terminal connected to the server (i.e., the game server), and the game state features corresponding to the reference virtual character may be obtained from another user terminal connected to the server. In one interaction, the action output by an action model on the server guides the virtual character in the user terminal's target game scene to execute the corresponding action, after which the game state features corresponding to that virtual character are updated; the server then receives the updated game state features and can perform the next round of model training based on them.
The game state features corresponding to the target virtual character may be attribute features of the target virtual character, such as its own position and the camp it belongs to; features arising from battling the reference virtual character, such as health points, ammunition, number of hits, and direction of attack; or other features related to the target virtual character's game state. Similarly, the game state features corresponding to the reference virtual character may be attribute features of the reference virtual character or features arising from battling the target virtual character, as described above, and are not repeated here. That is, during the training of the first action models and the second action models, the game state features corresponding to the target virtual character and those corresponding to the reference virtual character may influence each other.
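As a concrete illustration, the game state features can be packaged as a flat numeric vector. The sketch below is one possible encoding only; the field names are assumptions chosen to mirror the examples in the preceding paragraph (position, camp, health points, ammunition, hits, attack direction), not a schema fixed by this application.

    from dataclasses import dataclass, astuple

    @dataclass
    class GameState:
        # Attribute features of the character itself (illustrative fields).
        pos_x: float = 0.0
        pos_y: float = 0.0
        camp_id: int = 0
        # Battle-related features with respect to the opponent.
        own_hp: float = 100.0
        opponent_hp: float = 100.0
        ammo: int = 30
        hits_landed: int = 0
        attack_direction: float = 0.0  # direction of incoming attack, radians

        def to_vector(self) -> list:
            # Flatten into the numeric feature vector fed to an action model.
            return [float(v) for v in astuple(self)]

    print(GameState().to_vector())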
Training the plurality of first action models and the plurality of second action models is the key step in implementing attribute adjustment. Since the two training processes are similar, the training of the plurality of first action models is described below as an example with reference to fig. 2.
As shown in fig. 2, the training process of the plurality of first action models provided in this embodiment specifically includes the following steps:
S201, for a first action model to be trained, training the first action model to be trained for a preset number of training times according to the game state features corresponding to the target virtual character, to obtain a trained first action model and updated game state features corresponding to the target virtual character;
S202, for the next first action model to be trained, cyclically performing the step of training for the preset number of training times according to the updated game state features corresponding to the target virtual character, to obtain the trained next first action model and further updated game state features, until a plurality of first action models are obtained, where the combat strategy of each next first action model is an optimization of the combat strategy of the corresponding previous first action model.
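A minimal sketch of this sequential scheme follows, under the assumption that each next model is initialized from its predecessor's strategy and continues from the game state that training left behind; train_one_model is a hypothetical stand-in for the per-model training loop detailed below, with a single "strength" number standing in for a real policy.

    import copy
    import random

    def train_one_model(model, state, n_steps):
        # Stand-in for the per-model training loop: each step improves the
        # model's strategy a little and updates the game state features.
        for _ in range(n_steps):
            model["strength"] += random.random() * 0.1
            state["steps_seen"] += 1
        return model, state

    def train_model_sequence(n_models=4, n_steps=100):
        models, state = [], {"steps_seen": 0}
        model = {"strength": 0.0}
        for _ in range(n_models):
            # Each next model starts from the previous model's strategy and
            # from the state left by the previous run, so its combat strategy
            # is an optimization of its predecessor's.
            model, state = train_one_model(copy.deepcopy(model), state, n_steps)
            models.append(model)
        return models

    print([round(m["strength"], 2) for m in train_model_sequence()])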
Here, the first action model to be trained may be trained according to the following steps:
step one, for the first action model to be trained, inputting the game state features corresponding to the target virtual character into the first action model to be trained, and determining the execution action information output by the model; sending the execution action information to the user terminal, and receiving, from the user terminal, the updated game state features corresponding to the target virtual character after the target virtual character in the target game scene executes an action according to the execution action information; and determining an action reward value according to a comparison between the updated game state features and the game state features before the update;
step two, inputting the determined action reward value and the updated game state features into the first action model to be trained again, determining the next execution action information output by the model, and cyclically performing the step of sending the next execution action information to the user terminal, until the first preset number of training times is reached, to obtain the trained first action model.
That is, the server may input the game state features corresponding to the target virtual character into a first action model to be trained and determine the execution action information output by the model. Through the interaction between the server and the user terminal, the server sends the execution action information to the user terminal corresponding to the target virtual character, so that the target virtual character in the target game scene presented by that terminal executes an action according to the information. At that point the game state features related to the target virtual character in the target game scene are updated, and the server receives the updated game state features corresponding to the target virtual character.
After the server receives the updated game state features, the action reward value obtained by the target virtual character executing the action in the target game scene can be fed back synchronously to the first action model being trained, so that the next execution action information output by the model is obtained; after that information is sent to the user terminal, the game state features are updated again and the action reward value for that action is determined to guide the next output, and so on, until the first action model is trained.
Here, the feedback of action reward values ensures that the actions output by the first action model being trained tend increasingly toward the actions with higher reward values; that is, the model learns correct game behaviors better and better. For example, if killing an enemy yields a reward value of 1 and being injured yields a reward value of -1, the trained first action model is more inclined to perform killing actions; if killing an enemy yields a reward value of only 0.1 while being injured still yields -1, the trained first action model is more inclined to perform self-protective actions.
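The two steps above can be combined into one training loop. The sketch below is a hedged illustration, not the application's implementation: ToyPolicy stands in for a first action model (a real system would train a reinforcement-learning policy), apply_action stands in for the user terminal executing the action and returning updated features, and the reward rule mirrors the kill/injury example just given.

    import random

    def compute_reward(prev_state, new_state):
        # Compare the updated features with those before the action
        # (example values: +1 for defeating the enemy, -1 for being injured).
        reward = 0.0
        if new_state["opponent_hp"] <= 0 < prev_state["opponent_hp"]:
            reward += 1.0
        if new_state["own_hp"] < prev_state["own_hp"]:
            reward -= 1.0
        return reward

    class ToyPolicy:
        """Stand-in for a first action model to be trained."""
        def act(self, state):
            return random.choice(["attack", "retreat"])
        def learn(self, reward):
            pass  # parameter update omitted in this sketch

    def apply_action(state, action):
        # Stand-in for the user terminal executing the action in the game
        # scene and returning the updated game state features.
        new_state = dict(state)
        if action == "attack":
            new_state["opponent_hp"] -= random.uniform(0.0, 40.0)
            new_state["own_hp"] -= random.uniform(0.0, 20.0)
        return new_state

    def train_first_model(policy, n_train_steps=200):
        state = {"own_hp": 100.0, "opponent_hp": 100.0}
        for _ in range(n_train_steps):        # preset number of training times
            action = policy.act(state)        # model outputs execution action info
            new_state = apply_action(state, action)
            policy.learn(compute_reward(state, new_state))
            state = new_state
            if state["own_hp"] <= 0 or state["opponent_hp"] <= 0:
                state = {"own_hp": 100.0, "opponent_hp": 100.0}  # new episode
        return policy

    train_first_model(ToyPolicy())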
The preset number of training times for a first action model may be set based on the requirements of different application scenarios; for example, a larger number of training times may be chosen for a more complex game environment. The embodiments of the present application do not specifically limit this number.
After one first action model is trained, the next first action model can be trained based on the updated game state features corresponding to the target virtual character, similarly to the training of its predecessor; here too, the updated game state features sent by the user terminal are received through the interaction between the server and the user terminal, and the actions output by the model are guided by the action reward values.
It is worth noting that as a first action model keeps learning, its combat strategy becomes better and better; that is, training a plurality of first action models with different combat strategies can simulate the operation levels of players of different levels on the target virtual character, and as the combat strategy is upgraded, the corresponding simulated player level also rises.
Similar to the training of the plurality of first action models described above, the plurality of second action models may be trained in sequence in this embodiment: the first second action model is trained first, and once it is obtained, the second, third, and subsequent second action models are trained in turn. As the second action models keep learning, their combat strategies become better and better; that is, training a plurality of second action models with different combat strategies can simulate the operation levels of players of different levels on the reference virtual character, and the corresponding simulated player level rises as the combat strategy is upgraded.
The training process of the second action models is similar to that of the first action models described in detail above and is not repeated here.
As an action model is trained, its combat strategy is optimized accordingly. To balance the computational cost of model training against the degree of optimization, the data processing method provided in this embodiment limits the number of trained models. Taking the training of the first action models as an example, as shown in fig. 3, the last first action model may be determined as follows:
S301, for a trained next first action model, determining the target second action models corresponding to the next first action model, where the target second action models are the at least one second action model trained before the second action model corresponding to the next first action model is obtained through training;
S302, for each second action model among the target second action models, controlling the target virtual character to battle according to the next first action model and the reference virtual character to battle according to that second action model, to obtain the battle results between the target virtual character and the reference virtual character;
S303, when it is determined that the battle results between the target virtual character and the reference virtual character satisfy the preset battle win rate, determining the trained next first action model as the last of the plurality of trained first action models.
Here, the at least one second action model trained before the second action model corresponding to the next first action model may be determined as the target second action models corresponding to that next first action model. For each second action model among the target second action models, the target virtual character can then be controlled to battle according to the next first action model against the reference virtual character according to that second action model, yielding the respective battle results between the target virtual character and the reference virtual character.
At this point it can be determined whether the battle results between the target virtual character and the reference virtual character satisfy the preset battle win rate. For example, if there are 4 opposing second action models and the target virtual character wins 3 times and loses once, the win rate is 0.75; with a preset battle win rate of 0.7, since 0.75 is greater than 0.7, the trained next first action model can be determined as the last of the trained first action models.
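This stopping test is simple enough to state directly in code; the sketch below just follows the worked example (threshold 0.7, four opposing models, three wins), with the function name being an assumption.

    def is_last_model(battle_results, preset_win_rate=0.7):
        # battle_results: one win/loss flag per target second action model battled.
        win_rate = sum(1 for won in battle_results if won) / len(battle_results)
        return win_rate > preset_win_rate

    # 3 wins in 4 battles gives a win rate of 0.75 > 0.7,
    # so this model would end the training chain.
    print(is_last_model([True, True, True, False]))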
It is worth mentioning that when determining the last first action model in this embodiment, the selection may be based on the win rate, or it may be determined based on other statistical parameters (such as the skill release success rate), which are not detailed here.
The step in which the data processing method provided in this embodiment determines the last second action model is similar to the above process; likewise, the target first action models need to be selected, and the determination is made according to the comparison between the battle results and the preset battle win rate. For the specific process, refer to the description above, which is not repeated here.
This embodiment can adjust the attribute value of the skill attribute of the target virtual character based on the battle results after multiple rounds of battle; the multi-round battle process is described in detail first.
As shown in fig. 4, the multi-round battle process specifically includes the following steps:
S401, selecting one first action model from the plurality of first action models, and selecting one second action model from the plurality of second action models;
S402, controlling the target virtual character to battle according to the selected first action model and the reference virtual character to battle according to the selected second action model, to obtain the battle result of one battle;
S403, adjusting the battle score of the selected first action model or of the selected second action model based on that battle result;
S404, cyclically performing, based on the adjusted battle scores, the steps of selecting one first action model from the plurality of first action models and one second action model from the plurality of second action models, and controlling the target virtual character to battle according to the selected first action model and the reference virtual character according to the selected second action model to obtain the battle result of one battle, until a preset battle cutoff condition is reached, to obtain the battle results after multiple rounds of battle.
Here, one first action model and one second action model may be selected from the plurality of first action models and the plurality of second action models, respectively, and the target virtual character and the reference virtual character are controlled to battle according to the selected first action model and the selected second action model. The battle score of the selected first action model or the selected second action model is then adjusted based on the resulting battle outcome; for example, if the target virtual character defeats the reference virtual character, the battle score of the selected first action model may be raised. After the scores are adjusted, another pair of action models is selected to battle, and the scores are adjusted again by that battle's result, and so on, until the preset battle cutoff condition is reached and the battle results after multiple rounds of battle are obtained.
When selecting a pair of action models for another battle, in order to keep the target virtual character and the reference virtual character as evenly matched as possible, two action models whose battle scores differ only slightly may be selected to battle.
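One concrete way to realize "select two models with close battle scores and adjust the scores from each result" is an Elo-style rating loop. This is a hedged illustration only; the application does not prescribe Elo, and the constants (initial score 1500, K-factor 16) are conventional assumptions.

    import random

    def pick_close_pair(first_scores, second_scores):
        # Pick one first action model at random, then the second action model
        # whose current battle score differs from it the least.
        i = random.randrange(len(first_scores))
        j = min(range(len(second_scores)),
                key=lambda k: abs(first_scores[i] - second_scores[k]))
        return i, j

    def update_scores(first_scores, second_scores, i, j, first_won, k=16.0):
        # Elo-style update: the score change depends on the current score gap.
        expected = 1.0 / (1.0 + 10 ** ((second_scores[j] - first_scores[i]) / 400.0))
        delta = k * ((1.0 if first_won else 0.0) - expected)
        first_scores[i] += delta
        second_scores[j] -= delta

    first_scores, second_scores = [1500.0] * 3, [1500.0] * 3
    for _ in range(200):                 # until a preset battle cutoff condition
        i, j = pick_close_pair(first_scores, second_scores)
        update_scores(first_scores, second_scores, i, j,
                      first_won=random.random() < 0.5)
    print(sorted(first_scores, reverse=True))  # descending = first ranking result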
In this embodiment, whether for the first battle or for any later battle, the specific battle process can be realized according to the following steps:
step one, inputting the game state features corresponding to the target virtual character into the selected first action model, and determining the first execution action information output by that model; and inputting the acquired game state features corresponding to the reference virtual character into the selected second action model, and determining the second execution action information output by that model;
step two, obtaining the battle result of one battle after controlling, in the target game scene, the target virtual character to execute a first action according to the first execution action information and the reference virtual character to execute a second action according to the second execution action information.
Here, the server may input the game state features corresponding to the target virtual character into the selected first action model to obtain the first execution action information output by that model, and may similarly input the game state features corresponding to the reference virtual character into the selected second action model to obtain the second execution action information output by that model.
After the first execution action information output by the first action model and the second execution action information output by the second action model are synchronously sent to the user terminal corresponding to the target virtual character and the user terminal corresponding to the reference virtual character, the virtual characters of the two user terminals execute the actions in their respective target game scenes according to the execution action information, so that the result of one battle can be obtained.
It should be noted that one battle may correspond to one game round, and within one game round there are multiple pieces of first execution action information related to the target virtual character; that is, the target virtual character executes multiple pieces of first execution action information in sequence within one game round. For example, the first piece of first execution action information may be obtained by inputting the game state features corresponding to the target virtual character into the selected first action model; that action is applied in the target game scene presented at the user terminal to obtain updated game state features, which are then input into the selected first action model again to obtain the next piece of first execution action information, and so on in turn. Similarly, there are multiple pieces of second execution action information related to the reference virtual character; the specific determination process is not repeated here.
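A single battle can thus be viewed as a loop in which both models repeatedly emit execution action information until the game round ends. The sketch below compresses this into a toy turn loop; next_action stands in for a model query, and the damage numbers and hit-point bookkeeping are invented for illustration.

    import random

    def next_action(state, attack_bias):
        # Stand-in for "input the state features, receive execution action info".
        return "attack" if random.random() < attack_bias else "defend"

    def run_one_battle(first_bias=0.6, second_bias=0.5, max_turns=50):
        state = {"target_hp": 100.0, "reference_hp": 100.0}
        for _ in range(max_turns):
            # Each turn both models output their next execution action info,
            # and both user terminals apply the actions in the game scene.
            if next_action(state, first_bias) == "attack":
                state["reference_hp"] -= random.uniform(5.0, 15.0)
            if next_action(state, second_bias) == "attack":
                state["target_hp"] -= random.uniform(5.0, 15.0)
            if state["target_hp"] <= 0 or state["reference_hp"] <= 0:
                break
        return state["target_hp"] > state["reference_hp"]  # True: target wins

    print(run_one_battle())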
In this embodiment, the preset battle cutoff condition may be that the number of battles of a single first action model reaches a first preset number of battles, that the number of battles of a single second action model reaches a second preset number of battles, that the total number of battles reaches a third preset number of battles, or another battle cutoff condition. Here, the first, second, and third preset numbers of battles may be set in proportion to the number of action models; that is, the more action models there are, the more battles each model is allowed. This increases the proportion of the whole set of action models (i.e., the plurality of first action models and the plurality of second action models) that gets selected for battle, ensuring the accuracy of the battle results while reducing computational complexity.
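Expressed as code, the cutoff is an OR over per-model and total battle counters; the proportional scaling with the number of models mentioned above appears here as simple multipliers, which are assumptions about one reasonable parameterization.

    def battle_finished(first_counts, second_counts, per_model_cap, total_cap):
        # first_counts / second_counts: battles fought so far by each model.
        # Each battle increments exactly one first-model counter and one
        # second-model counter, so sum(first_counts) is the total battle count.
        return (max(first_counts) >= per_model_cap
                or max(second_counts) >= per_model_cap
                or sum(first_counts) >= total_cap)

    n_models = 4
    per_model_cap = 25 * n_models   # caps scaled with the number of models
    total_cap = 100 * n_models
    print(battle_finished([10, 99, 3, 7], [30, 30, 30, 29], per_model_cap, total_cap))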
In this embodiment, the character attribute values can be adjusted according to ranking results obtained from the battle results after multiple rounds of battle. As shown in fig. 5, the method for adjusting the character attribute values specifically includes the following steps:
S501, ranking the plurality of first action models in descending order of battle score based on the battle results after the multiple rounds of battle, to obtain a first ranking result, and ranking the plurality of second action models in descending order of battle score, to obtain a second ranking result;
S502, adjusting the attribute value of the skill attribute of the target virtual character based on the first ranking result and the second ranking result.
Here, based on the battle results of the multiple rounds of battle, the plurality of first action models may be ranked in descending order of battle score and the plurality of second action models likewise ranked in descending order of battle score; the attribute value of the skill attribute of the target virtual character is then adjusted based on the first ranking result obtained from ranking the first action models and the second ranking result obtained from ranking the second action models.
The skill attribute of the target virtual character may be a physical attack attribute, a magic attack attribute, or an attribute related to other skills of the target virtual character.
The data processing method provided in this embodiment can adjust the character attribute values in the following two manners.
First manner: as shown in fig. 6, the character attribute values may be adjusted in this embodiment according to the following steps:
S601, selecting the highest-scoring first action model from the first ranking result, and selecting the highest-scoring second action model from the second ranking result;
S602, determining the battle score difference between the selected highest-scoring first action model and the selected highest-scoring second action model;
S603, adjusting the attribute value of the skill attribute of the target virtual character based on the battle score difference.
Here, the highest-scoring first action model may be selected from the first action models based on the first ranking result, and the highest-scoring second action model may be selected from the second action models based on the second ranking result; the character attribute value is then adjusted by the battle score difference between the two, where the degree of adjustment of the character attribute value may be proportional to the battle score difference.
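A minimal sketch of this first manner, assuming a linear mapping from the top-model score gap to the attribute change (the application leaves the exact mapping open; the gain constant is an assumption):

    def adjust_by_top_gap(attr_value, first_scores, second_scores, gain=0.05):
        # Compare the highest-scoring model on each side; a positive gap means
        # the target character is too strong, so its skill attribute value is
        # lowered in proportion to the battle score difference.
        score_gap = max(first_scores) - max(second_scores)
        return attr_value - gain * score_gap

    # The target side tops out 40 points higher, so the attribute drops by 2.0.
    print(adjust_by_top_gap(120.0, [1620.0, 1540.0], [1580.0, 1500.0]))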
Second manner: as shown in fig. 7, the character attribute values may be adjusted in this embodiment according to the following steps:
S701, selecting a preset number of top-ranked first action models from the first ranking result, and selecting a preset number of top-ranked second action models from the second ranking result;
S702, controlling the target virtual character to battle according to the preset number of first action models and the reference virtual character to battle according to the preset number of second action models, to obtain battle results after multiple rounds of battle;
S703, ranking the preset number of first action models in descending order of battle score based on those battle results, to obtain a final first ranking result, and ranking the preset number of second action models in descending order of battle score, to obtain a final second ranking result;
S704, adjusting the attribute value of the skill attribute of the target virtual character based on the final first ranking result and the final second ranking result.
Here, based on the first ranking result and the second ranking result, the preset number of top-ranked first action models and the preset number of top-ranked second action models are selected, and the battle scores of the selected models are reset. The target virtual character and the reference virtual character are then controlled to battle according to the preset number of first action models and the preset number of second action models by a method similar to the battle control process above, after which the preset number of first action models and the preset number of second action models can be ranked again in descending order of battle score, yielding the final first ranking result and the final second ranking result. Based on these final ranking results, the character attribute values may be adjusted.
After one reset, the character attribute value can be adjusted based on the fight score difference between the highest-scoring first action model in the final first ranking result and the highest-scoring second action model in the final second ranking result. After multiple resets, each reset yields one group of final first and second ranking results; the multiple final first ranking results can be combined into a first average ranking result, the multiple final second ranking results can be combined into a second average ranking result, and the character attribute value can be adjusted based on the fight score difference between the highest-scoring first action model in the first average ranking result and the highest-scoring second action model in the second average ranking result.
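The reset-and-rerank procedure of the second manner can be sketched as follows. The sketch rests on stated assumptions: models are hashable identifiers, play_round(a, b) is assumed to return one round's result as a score delta for model a, scores are reset to zero, and pairings are drawn uniformly at random; none of these choices are mandated by the embodiment.

    import random

    def rerank_after_reset(first_models, second_models, play_round, rounds=200):
        """S701-S703: reset the fight scores of the selected top-ranked
        models, run a fresh series of fights between the two sides, and
        return both final rankings in descending order of fight score."""
        scores = {m: 0.0 for m in first_models + second_models}  # reset scores
        for _ in range(rounds):
            a = random.choice(first_models)    # model for the target character
            b = random.choice(second_models)   # model for the reference character
            delta = play_round(a, b)           # one round's result as a delta
            scores[a] += delta                 # adjust both fight scores
            scores[b] -= delta

        def rank(models):
            return sorted(((m, scores[m]) for m in models),
                          key=lambda pair: pair[1], reverse=True)

        return rank(first_models), rank(second_models)

    def average_top_score(final_rankings):
        """Average the highest fight score across the final rankings from
        several resets (one way to form the 'average ranking result')."""
        return sum(r[0][1] for r in final_rankings) / len(final_rankings)

Calling rerank_after_reset once corresponds to one reset; calling it several times and feeding the returned rankings to average_top_score corresponds to the multi-reset averaging described above.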
In the embodiment of the application, the reference virtual character that fights against the target virtual character can be any of various types of virtual characters, and different types of reference virtual characters can be selected according to the requirements of different game applications when adjusting the attribute values of the skill attributes of the target virtual character. Considering that different types of virtual characters may counter one another, the fight score differences corresponding to the various types of reference virtual characters may be considered together when adjusting the attribute value of the target virtual character, so as to improve the applicability of the data processing method provided by the embodiment of the present application.
Example Two
Based on the same inventive concept, a second embodiment of the present application further provides a device corresponding to the data processing method provided in the foregoing embodiment. Since the principle by which the device solves the problem is similar to that of the data processing method in the foregoing embodiment, the implementation of the device may refer to the implementation of the method, and repeated details are not described here.
Referring to fig. 8, a schematic diagram of a data processing apparatus according to a second embodiment of the present application is shown, where the apparatus includes:
a feature obtaining module 801, configured to obtain a game state feature corresponding to a target virtual character in a target game scene and a game state feature corresponding to a reference virtual character that is in match with the target virtual character;
the model training module 802 is configured to train to obtain a plurality of first action models of the target virtual character according to the game state features corresponding to the target virtual character; training a plurality of second action models of the reference virtual character based on the game state characteristics corresponding to the reference virtual character which is in match with the target virtual character; wherein the fighting strategies between different first action models are different, and the fighting strategies between different second action models are different;
the fight control module 803 is used for controlling the target virtual character to fight according to the first action models and the reference virtual character to fight according to the second action models, to obtain fight results of multiple rounds of fighting;
the attribute adjustment module 804 is used for adjusting the attribute value of the skill attribute of the target virtual character according to the fight results. A skeleton of these four modules is sketched below.
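For orientation, the four modules can be pictured as the Python skeleton below. This is a hypothetical rendering of fig. 8, not code from the patent; the method names, signatures, and the composition in run() are assumptions made for illustration.

    class DataProcessingApparatus:
        """Skeleton of the apparatus of fig. 8; bodies are placeholders."""

        def obtain_features(self, scene):
            """Feature obtaining module 801: game state features of the
            target virtual character and the reference virtual character."""
            raise NotImplementedError

        def train_models(self, target_features, reference_features):
            """Model training module 802: first and second action models
            with mutually different fight strategies."""
            raise NotImplementedError

        def control_fights(self, first_models, second_models):
            """Fight control module 803: run multiple rounds of fighting."""
            raise NotImplementedError

        def adjust_attributes(self, character, fight_results):
            """Attribute adjustment module 804: tune skill attribute values."""
            raise NotImplementedError

        def run(self, scene, target_character):
            target_features, reference_features = self.obtain_features(scene)
            firsts, seconds = self.train_models(target_features, reference_features)
            results = self.control_fights(firsts, seconds)
            return self.adjust_attributes(target_character, results)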
In one embodiment, the model training module 802 is configured to train a plurality of first motion models of the target virtual character according to the following steps:
for a first action model to be trained, training the first action model to be trained for a preset number of training times according to the game state features corresponding to the target virtual character, to obtain the trained first action model and updated game state features corresponding to the target virtual character;
for the next first action model to be trained, circularly executing the step of training the next first action model to be trained for the preset number of training times according to the updated game state features corresponding to the target virtual character, to obtain the trained next first action model and further updated game state features corresponding to the target virtual character, until the plurality of first action models are obtained; wherein the fight strategy of each next first action model is an optimization of the fight strategy of the corresponding previous first action model.
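One natural reading of this loop is checkpointing a single reinforcement-learning run: each snapshot becomes one first action model, so every later model's fight strategy is an optimization of the one before it. The sketch below follows that reading and assumes a train_once callable; both the reading and the interface are illustrative, not required by the text.

    def train_model_sequence(initial_features, train_once,
                             num_models, steps_per_model):
        """Train a chain of first action models. train_once(policy, features)
        is assumed to run one training iteration and return the updated
        policy together with the updated game state features."""
        models, policy, features = [], None, initial_features
        for _ in range(num_models):
            for _ in range(steps_per_model):   # preset number of training times
                policy, features = train_once(policy, features)
            models.append(policy)              # snapshot one trained model;
            # training continues from here, so the next model's fight
            # strategy optimizes this one's strategy
        return models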
In one embodiment, the model training module 802 is configured to obtain a trained first motion model according to the following steps:
for a first action model to be trained, inputting the game state characteristics corresponding to the target virtual character into the first action model to be trained, and determining the execution action information output by the model; sending the execution action information to the user side, and receiving the updated game state characteristics corresponding to the target virtual character after the target virtual character in the target game scene executes an action according to the execution action information, as returned by the user side; and determining an action reward value according to a comparison result between the updated game state characteristics and the game state characteristics before updating;
inputting the determined action reward value and the updated game state characteristics into the first action model to be trained again, determining the next execution action information output by the model, and circularly executing the step of sending the next execution action information to the user side, until a first preset number of training times is reached, to obtain the trained first action model.
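Concretely, this is a standard agent-environment interaction step. The sketch below is a minimal illustration under assumptions: model.act, model.learn, and client.execute are hypothetical interfaces, and the hit-point bookkeeping is just one example of deriving the action reward value by comparing the states before and after the action.

    def training_iteration(model, state, client):
        """One iteration: state features in, execution action information
        out, action executed on the user side, reward from a before/after
        comparison of game state features."""
        action = model.act(state)            # execution action information
        new_state = client.execute(action)   # user side runs the action and
                                             # returns updated state features
        # action reward value: damage dealt minus damage taken since the
        # previous state (illustrative comparison terms)
        reward = (new_state["enemy_hp_lost"] - state["enemy_hp_lost"]) \
               - (new_state["own_hp_lost"] - state["own_hp_lost"])
        model.learn(state, action, reward, new_state)
        return new_state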
In one embodiment, the model training module 802 is further configured to train a plurality of first motion models of the target virtual character according to the following steps:
for a trained next first action model, determining a target second action model corresponding to the next first action model, wherein the target second action model is at least one second action model trained before the second action model corresponding to the next first action model is obtained through training;
for each second action model in the target second action models, controlling the target virtual character to fight according to the next first action model and the reference virtual character to fight according to that second action model, to obtain a fight result between the target virtual character and the reference virtual character;
when each fight result between the target virtual character and the reference virtual character is determined to meet a preset fight winning rate, determining the trained next first action model as the last first action model in the plurality of trained first action models.
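Put differently, training of first action models can stop once the newest model reliably beats every previously trained opponent model. A sketch of that stopping test, assuming a play_match callable that returns True when the candidate wins; the win-rate threshold and game count are illustrative parameters:

    def is_last_first_model(candidate, earlier_second_models, play_match,
                            win_rate=0.9, games=100):
        """True when the candidate first action model meets the preset fight
        winning rate against every earlier-trained second action model."""
        for opponent in earlier_second_models:
            wins = sum(play_match(candidate, opponent) for _ in range(games))
            if wins / games < win_rate:
                return False   # an opponent still wins too often; keep training
        return True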
In one embodiment, the fight control module 803 is configured to obtain the fight results of multiple rounds of fighting according to the following steps:
selecting one first action model from the plurality of first action models and one second action model from the plurality of second action models; and,
controlling the target virtual character to fight according to the selected first action model and the reference virtual character to fight according to the selected second action model, to obtain a fight result of one round of fighting;
adjusting the fight score of the selected first action model or the selected second action model based on the fight result of that round;
circularly executing, based on the adjusted fight scores, the selection of one first action model from the plurality of first action models and of one second action model from the plurality of second action models, and the controlling of the target virtual character to fight according to the selected first action model and the reference virtual character to fight according to the selected second action model to obtain a fight result of one round of fighting, until a preset fight cutoff condition is reached, thereby obtaining the fight results of multiple rounds of fighting.
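This select-fight-rescore loop resembles a rating-based tournament. The sketch below uses an Elo-style update as one concrete way to adjust fight scores from results; Elo itself, the uniform random selection, the K-factor, and the initial rating of 1200 are assumptions of the sketch, since the embodiment only requires that scores be adjusted and that selection depend on them.

    import random

    def run_fight_loop(first_models, second_models, play_match,
                       max_total_fights=1000, k_factor=16.0):
        """Multi-round fight control: repeatedly select one model per side,
        fight one round, and adjust both fight scores from the result."""
        scores = {m: 1200.0 for m in first_models + second_models}
        for _ in range(max_total_fights):        # preset fight cutoff condition
            a = random.choice(first_models)      # selection for the target side
            b = random.choice(second_models)     # selection for the reference side
            expected_a = 1.0 / (1.0 + 10 ** ((scores[b] - scores[a]) / 400.0))
            outcome = 1.0 if play_match(a, b) else 0.0   # one round of fighting
            scores[a] += k_factor * (outcome - expected_a)   # adjust scores
            scores[b] -= k_factor * (outcome - expected_a)
        return scores

A score-aware selection policy (for instance, preferring closely rated pairings) could replace random.choice without changing the rest of the loop.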
In one embodiment, the fight control module 803 is configured to obtain the fight result of one round of fighting according to the following steps:
inputting game state characteristics corresponding to the target virtual character into a selected first action model, and determining first execution action information output by the model; inputting the acquired game state characteristics corresponding to the reference virtual character into a selected second action model, and determining second execution action information output by the model;
After the target virtual character is controlled to execute the first action according to the first execution action information and the reference virtual character executes the second action according to the second execution action information in the target game scene, the fight result of one round of fighting is obtained.
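A single round therefore pairs one forward pass per side with one step of the game scene. A minimal sketch, with env.features, env.step, and model.act as hypothetical interfaces:

    def play_one_fight(env, first_model, second_model):
        """One fight: each side's model maps its game state features to
        execution action information, the scene executes both actions, and
        the fight result of this round is returned."""
        target_state = env.features("target")         # target character
        reference_state = env.features("reference")   # reference character
        first_action = first_model.act(target_state)        # first action info
        second_action = second_model.act(reference_state)   # second action info
        return env.step(first_action, second_action)  # result of this round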
In one embodiment, the attribute adjustment module 804 is configured to adjust the attribute values of the skill attributes of the target virtual character as follows:
ranking the plurality of first action models in descending order of fight score based on the fight results of the multiple rounds of fighting to obtain a first ranking result; and ranking the plurality of second action models in descending order of fight score to obtain a second ranking result;
and adjusting the attribute value of the skill attribute of the target virtual character based on the first ranking result and the second ranking result.
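Given the score table produced by the fight loop, the two rankings are simply descending sorts, as in this small sketch (scores is assumed to be a model-to-score mapping):

    def rank_by_fight_score(scores, models):
        """Rank the given models in descending order of fight score,
        returning (model, score) pairs."""
        return sorted(((m, scores[m]) for m in models),
                      key=lambda pair: pair[1], reverse=True)

    # usage: first_ranking = rank_by_fight_score(scores, first_models)
    #        second_ranking = rank_by_fight_score(scores, second_models)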
In one embodiment, the attribute adjustment module 804 is configured to adjust the attribute values of the skill attributes of the target virtual character as follows:
selecting the first action model with the highest score from the first ranking result, and selecting the second action model with the highest score from the second ranking result;
determining a fight score difference between the selected first action model with the highest score and the selected second action model with the highest score;
and adjusting the attribute value of the skill attribute of the target virtual character based on the fighting score difference.
In one embodiment, the attribute adjustment module 804 is configured to adjust the attribute values of the skill attributes of the target virtual character as follows:
selecting a preset number of first action models which are ranked at the top from the first ranking result, and selecting a preset number of second action models which are ranked at the top from the second ranking result;
controlling the target virtual character to fight according to the preset number of first action models and the reference virtual character to fight according to the preset number of second action models, to obtain fight results of multiple rounds of fighting;
ranking the preset number of first action models in descending order of fight score based on the fight results of the multiple rounds of fighting to obtain a final first ranking result; and ranking the preset number of second action models in descending order of fight score to obtain a final second ranking result;
and adjusting the attribute value of the skill attribute of the target virtual character based on the final first ranking result and the final second ranking result.
In some embodiments, the preset fight cutoff condition comprises one or more of the following conditions (a predicate sketch follows this list):
the number of times of fight of a single first action model reaches a first preset number of times of fight;
the number of times of fight of a single second action model reaches a second preset number of times of fight;
the total number of the battles reaches a third preset number of the battles.
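The three conditions combine with a logical OR, as in the sketch below; the per-model count bookkeeping and the concrete limits are illustrative assumptions.

    def cutoff_reached(first_counts, second_counts, total_fights,
                       per_first_limit=50, per_second_limit=50,
                       total_limit=2000):
        """True once any preset fight cutoff condition holds: a single first
        action model's fight count, a single second action model's fight
        count, or the total fight count reaches its preset number."""
        return (any(c >= per_first_limit for c in first_counts.values())
                or any(c >= per_second_limit for c in second_counts.values())
                or total_fights >= total_limit)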
In some embodiments, the multiple rounds of fighting comprise a first round of fighting and a second round of fighting, a first action model used in the first round of fighting being different from a first action model used in the second round of fighting, and/or a second action model used in the first round of fighting being different from a second action model used in the second round of fighting.
Example Three
An embodiment of the present application provides a computer device. As shown in fig. 9, which is a schematic structural diagram of the computer device provided in the embodiment of the present application, the device includes: a processor 901, a memory 902, and a bus 903. The memory 902 stores machine-readable instructions executable by the processor 901 (for example, execution instructions corresponding to the feature obtaining module 801, the model training module 802, the fight control module 803, and the attribute adjustment module 804 in the data processing apparatus in fig. 8). When the computer device runs, the processor 901 and the memory 902 communicate through the bus 903, and when the machine-readable instructions are executed by the processor 901, the following instructions are executed:
acquiring game state characteristics corresponding to a target virtual character in a target game scene and game state characteristics corresponding to a reference virtual character which is in match with the target virtual character;
training to obtain a plurality of first action models of the target virtual character according to the game state characteristics corresponding to the target virtual character; training a plurality of second action models of the reference virtual character based on the game state characteristics corresponding to the reference virtual character which is in match with the target virtual character; wherein the fighting strategies of different first action models are different, and the fighting strategies of different second action models are different;
controlling the target virtual character to fight according to the first action models and the reference virtual character to fight according to the second action models, to obtain fight results of multiple rounds of fighting; and
adjusting the attribute value of the skill attribute of the target virtual character according to the fight results.
In one embodiment, the instructions executed by the processor 901 for training the plurality of first action models of the target virtual character according to the game state features corresponding to the target virtual character include:
for a first action model to be trained, training the first action model to be trained for a preset number of training times according to the game state features corresponding to the target virtual character, to obtain the trained first action model and updated game state features corresponding to the target virtual character;
for the next first action model to be trained, circularly executing the step of training the next first action model to be trained for the preset number of training times according to the updated game state features corresponding to the target virtual character, to obtain the trained next first action model and further updated game state features corresponding to the target virtual character, until the plurality of first action models are obtained; wherein the fight strategy of each next first action model is an optimization of the fight strategy of the corresponding previous first action model.
In one embodiment, in the instructions executed by the processor 901, training a first action model to be trained for a preset number of times according to the game state features corresponding to the target virtual character, to obtain a trained first action model, includes:
for a first action model to be trained, inputting the game state characteristics corresponding to the target virtual character into the first action model to be trained, and determining the execution action information output by the model; sending the execution action information to the user side, and receiving the updated game state characteristics corresponding to the target virtual character after the target virtual character in the target game scene executes an action according to the execution action information, as returned by the user side; and determining an action reward value according to a comparison result between the updated game state characteristics and the game state characteristics before updating;
inputting the determined action reward value and the updated game state characteristics into the first action model to be trained again, determining the next execution action information output by the model, and circularly executing the step of sending the next execution action information to the user side, until the first preset number of training times is reached, to obtain the trained first action model.
In one embodiment, the instructions executed by the processor 901 further include:
for a trained next first action model, determining a target second action model corresponding to the next first action model, wherein the target second action model is at least one second action model trained before the second action model corresponding to the next first action model is obtained through training;
for each second action model in the target second action models, controlling the target virtual character to fight according to the next first action model and the reference virtual character to fight according to that second action model, to obtain a fight result between the target virtual character and the reference virtual character;
when each fight result between the target virtual character and the reference virtual character is determined to meet a preset fight winning rate, determining the trained next first action model as the last first action model in the plurality of trained first action models.
In one embodiment, in the instructions executed by the processor 901, controlling the target virtual character to fight according to the first action models and the reference virtual character to fight according to the second action models, to obtain fight results of multiple rounds of fighting, includes:
selecting one first action model from the plurality of first action models and one second action model from the plurality of second action models; and,
controlling the target virtual character to fight according to the selected first action model and the reference virtual character to fight according to the selected second action model, to obtain a fight result of one round of fighting;
adjusting the fight score of the selected first action model or the selected second action model based on the fight result of that round;
circularly executing, based on the adjusted fight scores, the selection of one first action model from the plurality of first action models and of one second action model from the plurality of second action models, and the controlling of the target virtual character to fight according to the selected first action model and the reference virtual character to fight according to the selected second action model to obtain a fight result of one round of fighting, until the preset fight cutoff condition is reached, thereby obtaining the fight results of multiple rounds of fighting.
In one embodiment, in the instructions executed by the processor 901, controlling the target virtual character to fight according to the selected first action model and the reference virtual character to fight according to the selected second action model, to obtain a fight result of one round of fighting, includes:
inputting the game state characteristics corresponding to the target virtual character into the selected first action model, and determining the first execution action information output by the model; and inputting the acquired game state characteristics corresponding to the reference virtual character into the selected second action model, and determining the second execution action information output by the model;
after the target virtual character is controlled to execute the first action according to the first execution action information and the reference virtual character executes the second action according to the second execution action information in the target game scene, the fight result of one round of fighting is obtained.
In one embodiment, the instructions executed by the processor 901 for adjusting the attribute value of the skill attribute of the target virtual character according to the fight results include:
ranking the plurality of first action models in descending order of fight score based on the fight results of the multiple rounds of fighting to obtain a first ranking result; and ranking the plurality of second action models in descending order of fight score to obtain a second ranking result;
and adjusting the attribute value of the skill attribute of the target virtual character based on the first ranking result and the second ranking result.
In one embodiment, the instructions executed by the processor 901 for adjusting the attribute value of the skill attribute of the target virtual character based on the first ranking result and the second ranking result include:
selecting a first action model with the highest score from the first ranking result, and selecting a second action model with the highest score from the second ranking result;
determining a fight score difference between the selected first action model with the highest score and the selected second action model with the highest score;
and adjusting the attribute value of the skill attribute of the target virtual character based on the fighting score difference.
In one embodiment, the instructions executed by the processor 901 for adjusting the attribute value of the skill attribute of the target virtual character based on the first ranking result and the second ranking result include:
selecting a preset number of first action models which are ranked at the top from the first ranking result, and selecting a preset number of second action models which are ranked at the top from the second ranking result;
controlling the target virtual character to fight according to the preset number of first action models and the reference virtual character to fight according to the preset number of second action models, to obtain fight results of multiple rounds of fighting;
ranking the preset number of first action models in descending order of fight score based on the fight results of the multiple rounds of fighting to obtain a final first ranking result; and ranking the preset number of second action models in descending order of fight score to obtain a final second ranking result;
and adjusting the attribute value of the skill attribute of the target virtual character based on the final first ranking result and the final second ranking result.
In one embodiment, the preset fight cutoff condition comprises one or more of the following conditions:
the number of times of fight of a single first action model reaches a first preset number of times of fight;
the number of times of fight of a single second action model reaches a second preset number of times of fight;
the total number of the battles reaches a third preset number of the battles.
In some embodiments, the multiple rounds of fighting comprise a first round of fighting and a second round of fighting, a first action model used in the first round of fighting being different from a first action model used in the second round of fighting, and/or a second action model used in the first round of fighting being different from a second action model used in the second round of fighting.
An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the data processing method described above are performed.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described here again.

In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the division of the units is only one logical division, and there may be other divisions in actual implementation; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be electrical, mechanical or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in software functional units and sold or used as a stand-alone product, may be stored in a non-transitory computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that: the above-mentioned embodiments are only specific embodiments of the present application, and are used for illustrating the technical solutions of the present application, but not limiting the same, and the scope of the present application is not limited thereto, and although the present application is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive the technical solutions described in the foregoing embodiments or equivalent substitutes for some technical features within the technical scope disclosed in the present application; such modifications, changes and substitutions do not depart from the spirit and scope of the embodiments of the present application and are intended to be covered by the claims. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (14)

1. A method of data processing, the method comprising:
acquiring game state characteristics corresponding to a target virtual character in a target game scene and game state characteristics corresponding to a reference virtual character which is in match with the target virtual character;
training to obtain a plurality of first action models of the target virtual character according to the game state characteristics corresponding to the target virtual character; training a plurality of second action models of the reference virtual character on the basis of game state characteristics corresponding to the reference virtual character which is in match with the target virtual character; wherein the fighting strategies of different first action models are different, and the fighting strategies of different second action models are different;
controlling the target virtual character to fight according to the first action models and the reference virtual character to fight according to the second action models, to obtain fight results of multiple rounds of fighting; and
adjusting the attribute value of the skill attribute of the target virtual character according to the fight results.
2. The data processing method of claim 1, wherein the training of the plurality of first action models of the target virtual character according to the game state features corresponding to the target virtual character comprises:
for a first action model to be trained, training the first action model to be trained for a preset number of training times according to the game state features corresponding to the target virtual character, to obtain the trained first action model and updated game state features corresponding to the target virtual character;
for the next first action model to be trained, circularly executing the step of training the next first action model to be trained for the preset number of training times according to the updated game state features corresponding to the target virtual character, to obtain the trained next first action model and further updated game state features corresponding to the target virtual character, until a plurality of first action models are obtained; wherein the fight strategy of the next first action model is an optimization of the fight strategy of the corresponding previous first action model.
3. The data processing method of claim 2, wherein the training of the first to-be-trained action model for a preset number of times according to the game state features corresponding to the target virtual character to obtain a trained first action model comprises:
for a first action model to be trained, inputting the game state characteristics corresponding to the target virtual character into the first action model to be trained, and determining the execution action information output by the model; sending the execution action information to a user side, and receiving the updated game state characteristics corresponding to the target virtual character after the target virtual character in the target game scene executes an action according to the execution action information, as returned by the user side; and determining an action reward value according to a comparison result between the updated game state characteristics and the game state characteristics before updating;
inputting the determined action reward value and the updated game state characteristics into the first action model to be trained again, determining the next execution action information output by the model, and circularly executing the step of sending the next execution action information to the user side, until a first preset number of training times is reached, to obtain a trained first action model.
4. The data processing method of claim 2, wherein the training to obtain a plurality of first action models of the target virtual character according to the game state feature corresponding to the target virtual character further comprises:
for a trained next first action model, determining a target second action model corresponding to the next first action model, wherein the target second action model is at least one second action model trained before the second action model corresponding to the next first action model is obtained through training;
for each second action model in the target second action models, controlling the target virtual character to fight according to the next first action model and the reference virtual character to fight according to that second action model, to obtain a fight result between the target virtual character and the reference virtual character;
when each fight result between the target virtual character and the reference virtual character is determined to meet a preset fight winning rate, determining the trained next first action model as the last first action model in the plurality of trained first action models.
5. The data processing method of claim 1, wherein the controlling the target virtual character to fight according to the first action models and the reference virtual character to fight according to the second action models to obtain fight results of multiple rounds of fighting comprises:
selecting one first action model from the plurality of first action models and one second action model from the plurality of second action models; and,
controlling the target virtual character to fight according to the selected first action model and the reference virtual character to fight according to the selected second action model, to obtain a fight result of one round of fighting;
adjusting the fight score of the selected first action model or the selected second action model based on the fight result of that round;
circularly executing, based on the adjusted fight scores, the selection of one first action model from the plurality of first action models and of one second action model from the plurality of second action models, and the controlling of the target virtual character to fight according to the selected first action model and the reference virtual character to fight according to the selected second action model to obtain a fight result of one round of fighting, until a preset fight cutoff condition is reached, thereby obtaining the fight results of multiple rounds of fighting.
6. The data processing method of claim 5, wherein the controlling the target virtual character to fight according to the selected first action model and the reference virtual character to fight according to the selected second action model to obtain a fight result of one round of fighting comprises:
inputting the game state characteristics corresponding to the target virtual character into a selected first action model, and determining first execution action information output by the model; inputting the acquired game state characteristics corresponding to the reference virtual character into a selected second action model, and determining second execution action information output by the model;
controlling the target virtual character to execute a first action according to the first execution action information and the reference virtual character to execute a second action according to the second execution action information in the target game scene, to obtain the fight result of one round of fighting.
7. The data processing method according to claim 1, wherein the adjusting the attribute value of the skill attribute of the target virtual character according to the fight results comprises:
ranking the plurality of first action models in descending order of fight score based on the fight results of the multiple rounds of fighting to obtain a first ranking result; and ranking the plurality of second action models in descending order of fight score to obtain a second ranking result;
adjusting attribute values of skill attributes of the target virtual character based on the first ranking result and the second ranking result.
8. The data processing method of claim 7, wherein the adjusting the attribute value of the skill attribute of the target virtual character based on the first ranking result and the second ranking result comprises:
selecting the first action model with the highest score from the first ranking result, and selecting the second action model with the highest score from the second ranking result;
determining a fight score difference between the selected first action model with the highest score and the selected second action model with the highest score;
adjusting attribute values of skill attributes of the target virtual character based on the fight score difference.
9. The data processing method of claim 7, wherein the adjusting the attribute value of the skill attribute of the target virtual character based on the first ranking result and the second ranking result comprises:
selecting a preset number of top-ranked first action models from the first ranking result, and selecting a preset number of top-ranked second action models from the second ranking result;
controlling the target virtual character to fight according to the preset number of first action models and the reference virtual character to fight according to the preset number of second action models, to obtain fight results of multiple rounds of fighting;
ranking the preset number of first action models in descending order of fight score based on the fight results of the multiple rounds of fighting to obtain a final first ranking result; and ranking the preset number of second action models in descending order of fight score to obtain a final second ranking result;
adjusting attribute values of the skill attributes of the target virtual character based on the final first ranking result and the final second ranking result.
10. The data processing method of claim 5 or 6, wherein the preset fight cutoff condition comprises one or more of the following conditions:
the number of times of fight of a single first action model reaches a first preset number of times of fight;
the number of times of fight of a single second action model reaches a second preset number of times of fight;
the total number of the battles reaches a third preset number of the battles.
11. The data processing method according to any one of claims 1 to 9, wherein
the multiple rounds of fighting comprise a first round of fighting and a second round of fighting, a first action model used in the first round of fighting being different from a first action model used in the second round of fighting, and/or a second action model used in the first round of fighting being different from a second action model used in the second round of fighting.
12. A data processing apparatus, characterized in that the apparatus comprises:
the characteristic acquisition module is used for acquiring game state characteristics corresponding to a target virtual character in a target game scene and game state characteristics corresponding to a reference virtual character which is in match with the target virtual character;
the model training module is used for training to obtain a plurality of first action models of the target virtual character according to the game state characteristics corresponding to the target virtual character; training a plurality of second action models of the reference virtual character on the basis of game state characteristics corresponding to the reference virtual character which is in match with the target virtual character; wherein the fighting strategies between different first action models are different, and the fighting strategies between different second action models are different;
the fight control module is used for controlling the target virtual character to fight according to the first action models and the reference virtual character to fight according to the second action models, so as to obtain fight results of multiple rounds of fighting;
the attribute adjustment module is used for adjusting the attribute value of the skill attribute of the target virtual character according to the fight results.
13. A computer device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when a computer device is running, the machine-readable instructions when executed by the processor performing the steps of the data processing method of any of claims 1 to 11.
14. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, performs the steps of the data processing method according to one of the claims 1 to 11.