CN110478907B - Mahjong AI data processing method based on big data driving - Google Patents

Mahjong AI data processing method based on big data driving

Info

Publication number
CN110478907B
Authority
CN
China
Prior art keywords
node
cards
card
hand
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910759449.9A
Other languages
Chinese (zh)
Other versions
CN110478907A (en)
Inventor
尹鹏程
徐明胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Bianfeng Network Technology Co ltd
Original Assignee
Hangzhou Bianfeng Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Bianfeng Network Technology Co ltd filed Critical Hangzhou Bianfeng Network Technology Co ltd
Priority to CN201910759449.9A priority Critical patent/CN110478907B/en
Publication of CN110478907A publication Critical patent/CN110478907A/en
Application granted granted Critical
Publication of CN110478907B publication Critical patent/CN110478907B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/60 Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor
    • A63F13/67 Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor adaptively or by learning from player actions, e.g. skill level adjustment or by storing successful combat sequences for re-use
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/70 Game security or game management aspects
    • A63F13/77 Game security or game management aspects involving data related to game devices or game servers, e.g. configuration data, software version or amount of memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to the technical field of artificial intelligence and provides a big-data-driven mahjong AI data processing method that can find solutions under specific conditions with limited resources. The method selects the course of play from historical hands, extracts the node hands that occur during play, and stores them by composition as behavioral guidance for the AI; the AI then plays under that guidance according to the node composition of its current hand.

Description

Mahjong AI data processing method based on big data driving
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a mahjong AI data processing method based on big data driving.
Background
Board and card games fall into two cases: symmetric data and asymmetric data. With symmetric data, as in chess, the state of every piece is known, the only factor affecting the win rate is decision-making, and the optimal solution can be obtained given infinite computing resources; under limited resources the search must therefore be pruned as far as possible to find a locally optimal solution. With asymmetric data, as in Dou Dizhu (a three-player card game), the hidden cards are the cards in the opponents' hands. The number of unknown cards drops rapidly each time an opponent plays, so the dimension of the unknown space that the machine has to consider also drops rapidly. For example, if the AI holds the 2 of diamonds and the other three 2s are hidden, then as long as the opponents have not played them the possible splits are nothing more than 1+2 and 0+3. Because the initial hands are fixed, the order of play and the strategy are essentially stable, and the machine can find a stable solution (not necessarily a locally or globally optimal one).
Mahjong is also a game of asymmetric data, but compared with Dou Dizhu its winning probability changes continually and dynamically. The influencing factors include the opponents' tiles, the tiles held, and the tiles in the wall. Tiles in the wall are invisible to all players, and invisible tiles make up a very large share of all tiles. For example, suppose the AI holds one 2 of characters. Because of the wall, the opponents' possible holdings include: (1) no opponent holds a 2 of characters: 0+0+0; (2) one opponent holds one: 0+0+1, 0+1+0, and so on; (3) every opponent holds one: 1+1+1; (4) one opponent holds one and another holds two: 1+2+0; and so on. In total there are 1+3+3+3+1+3×2 = 17 arrangements of the opponents' holdings. So when a state space is built for the current situation in mahjong, a single dimension is about 9 times that of Dou Dizhu, and the wall, the discard river, and the hand each have 34 dimensions (against 15 dimensions in Dou Dizhu). Considering the hand alone, the difference in space complexity is 34^9 / 15^2 ≈ 2.7 × 10^11. When the machine builds a space from the current situation, every draw by an opponent affects the composition of that space, and the space is invalidated (the discard river changes) as soon as any opponent changes discard strategy. Because the disturbance is so large, mahjong has no stable solution even with infinite resources.
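The complexity figure quoted above can be checked with a few lines of Python. This is only an arithmetic sketch: the constants 34 and 15 and the exponents 9 and 2 are taken directly from the expression in the text, while the names used here are hypothetical.

```python
# Arithmetic check of the hand-only complexity ratio quoted in the text.
# The constant names below are illustrative and do not come from the patent.
MAHJONG_DIMENSIONS = 34    # dimensions quoted for mahjong
DOU_DIZHU_DIMENSIONS = 15  # dimensions quoted for Dou Dizhu


def complexity_ratio() -> float:
    """Return 34^9 / 15^2, the ratio given in the text."""
    return MAHJONG_DIMENSIONS ** 9 / DOU_DIZHU_DIMENSIONS ** 2


print(f"{complexity_ratio():.2e}")  # prints 2.70e+11
```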
Therefore, the main problem to be solved by the present invention is how to find solutions under specific conditions under the condition of limited resources.
Limited resources means that, in actual operation, computation time is bounded (for example, a discard must be chosen within 2 s) and memory and CPU usage must be low enough for one machine to run many AI instances. This restricts which search algorithms can be applied and how deep they can search, which in turn limits their ability to find a better solution or a solution under specific conditions.
For example, the closer the skill levels of the participants, the better the gaming experience in general, that is, the higher the playability. Because of the above limitations, however, a mahjong AI system using the earlier algorithms cannot be sufficiently "intelligent": its level differs from that of the other participants. Guided by such an algorithm, the AI is prone to misplays in the current situation. A "misplay" here means a line of play that clearly departs from the logic a player of the same level would follow.
A solution under specific conditions means that some restrictions are placed on the winning conditions. For example, the AI may aim for high-scoring winning hands as far as possible while, to some extent, giving up low-scoring hands and fast wins.
Disclosure of Invention
The invention aims to provide a mahjong AI data processing method based on big data driving, which can find out a solution under a specific condition under the condition of limited resources.
Compared with earlier AI systems, the method can find a solution under specific conditions, and an AI system using the big-data-driven mahjong AI data processing method provided by the invention plays at a level better matched to its opponents, which increases playability.
We first define the following mahjong terms used in the present invention.
Hand: the tiles placed in front of a player, 13 in number.
Play: the playing process. It starts with the dealer's first discard and includes drawing, discarding, chow, pung, kong (exposed kong, concealed kong), and replacement draws, until a win or a drawn game.
Win: the tiles in the hand form the required combinations under the established rules and the player wins.
Ready: the state in which only one more tile is needed to win.
Shanten number: the distance of the hand from the ready state. Calculating it typically requires decomposing the hand into sets such as sequences, triplets, pairs, and partial sets.
Tile change: discarding a tile that was originally needed in order to reach a better hand shape. The probabilistic search framework referred to in this application can be regarded as an extension of tile-change behavior to the ready-hand module.
Improving tile change: a tile change after which the shanten number of the current hand is unchanged but the number of tiles that can finally complete the hand increases.
Dealing in: a tile discarded by one player allows another player to win.
Melds: chow, pung, and kong, excluding concealed kongs.
Common implementations of the algorithm are recursive or based on dynamic programming; the technical scheme provided by this application uses a recursive implementation.
The AI data processing method first developed by the applicant used a fastest-win algorithm based on tile changes. For ease of distinction it is called the old AI data processing method, and a system using it is called the old AI. It scores each tile change by calculating the winning probability of the hand after that change, and selects the play with the highest score.
Under the resource limits, the number of tile-change computations can only be set to 2, and when the hand is far from ready, in most cases no line of play leading to a win can be found. The calculated probability only reflects the number of usable tiles in the current situation; as soon as any player acts, the score becomes invalid and must be recomputed, so the disturbance is large. The win rate of the old AI is approximately 40%, and its score settlement is 24% lower than that of human players.
To raise the challenge level of the AI, the applicant added a fan (score multiple) estimate to the fastest-win algorithm described above: after each tile change, the fan value of the current hand is estimated and added to the score. With this improved algorithm (hereinafter the "improved AI data processing method"), the win rate after going online is about 38% and the score settlement is 18% lower than that of human players, so the challenge level is 6% higher than that of the old AI. Built on top of the fastest-win algorithm, the improved AI data processing method is subject to larger disturbance than the old method and therefore has a lower win rate, but its overall level is higher.
Calculating the shanten number r(H) generally requires decomposing the hand H into sets such as sequences, triplets, pairs, and partial sets:
r(H) = 8 − 2 × #(sequences + triplets) − #pairs − #partial sets
The distance R(H) of the hand H from a win is: R(H) = r(H) + 1
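The two formulas above can be sketched in Python as follows. This is a minimal illustration that assumes the hand has already been decomposed and that the caller supplies the counts; the function and parameter names are hypothetical and do not appear in the patent.

```python
def shanten_number(sequences: int, triplets: int, pairs: int, partial_sets: int) -> int:
    """r(H) = 8 - 2 * #(sequences + triplets) - #pairs - #partial sets.

    Assumes the counts come from a valid decomposition of the hand H.
    """
    return 8 - 2 * (sequences + triplets) - pairs - partial_sets


def distance_to_win(sequences: int, triplets: int, pairs: int, partial_sets: int) -> int:
    """R(H) = r(H) + 1."""
    return shanten_number(sequences, triplets, pairs, partial_sets) + 1


# Example: two sequences, one triplet, one pair, one partial set.
print(shanten_number(2, 1, 1, 1))   # 8 - 6 - 1 - 1 = 0
print(distance_to_win(2, 1, 1, 1))  # 1
```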
The set of all tile-change paths by which hand H reaches the winning state after changing R(H) tiles is denoted T(H).
The set of all tile-change paths by which hand H reaches the winning state after changing R(H)+n tiles is denoted T_n(H).
Considering only the single-player case, a hand H falls into one of two cases, |H| = 3k+1 or |H| = 3k+2. First, a draw is defined as:
<H, tile, H′>
where the probability of drawing tile is P(tile).
A discard is defined as:
<H, H′>
The single-player winning probability P(H) of the hand is defined as:
[Formula image GDA0004128775920000031: definition of P(H)]
Naturally, the single-player expected value E(H) for hand H can be defined as:
[Formula image GDA0004128775920000041: definition of E(H)]
To limit the depth of the search, the depth of the search tree is specified during the recursive search. Using the definitions of T_0(H) and T_1(H), the above recursive definition can be constrained further:
[Formula image GDA0004128775920000042: depth-limited recursive definition]
If the search target is the expected fan value, it can be handled in the same way.
Without a limit on search depth, the AI's computation space is theoretically very large; after each discard perturbs the space, the AI can correct its current strategy to some extent and, at high search depth, find a play that balances winning speed and score. However, because information asymmetry causes a dimension explosion, once the search depth reaches 3 or more the computation time on a single server is 6 to 10 s, which is too long for the other participants to wait.
Scoring is also tied to search depth. Under the tile-change algorithm, at least 5 to 6 tile changes are needed to obtain a final, true, and stable score, with a complexity of 34^(2×6). With the scoring method that adds fan estimation, once another player discards, the AI's previous score estimate may be directly proved wrong, and with it the AI's previous discard strategy, so that in the worst case the AI cannot win at all.
Therefore, neither the old AI data processing method (the fastest-win algorithm based on tile changes) nor the improved AI data processing method (which adds fan estimation after each tile change to predict the score) can achieve the technical purpose of the present invention.
To this end, the applicant further improved the algorithm by adding a component-based analysis algorithm and integrating it into the AI data processing method described above.
Under the rules of mahjong, most winning hand types are fixed, and the forms of the components at the time of winning are fixed, differing only in rank or suit; the basic components are pairs, sequences, and triplets. This means that during a game there are certain nodes whose hands contain these fixed components, and the closer the hand is to winning, the higher the proportion of these components and the more stable the form of the hand, because under mahjong playing logic players do not normally change tiles frequently, let alone move toward a higher shanten number. By analyzing a large amount of game data, the key nodes in the course of play are extracted. A game can then be represented as a sequence of transitions between key nodes.
The applicant then proposes the following solutions:
A big-data-driven mahjong AI data processing method selects the course of play from historical hands, extracts the node hands that occur during play, and stores them by composition as behavioral guidance for the AI; the AI then plays under that guidance according to the node composition of its current hand.
Nodes matching the node features are extracted from the historical hands. The node features are: a decrease in the hand's shanten number; an increase in the hand's shanten number; a chow, pung, or kong; and an improving tile change. The hand data of each extracted node and the corresponding meld data are compressed to obtain a hash value for the node's hand. Nodes are then placed into arrays according to the current shanten number of their hands, with nodes of the same shanten number in the same array, giving N arrays arr[1], arr[2], …, arr[N], where N is the maximum shanten number among the hands extracted from all the data. Together the arrays form the database Data Map, and each node holds index data for its previous-shanten node and its next-shanten node.
The compression comprises the following steps:
(1) Group the hand by suit: characters, bamboo, dots, wind tiles, and dragon tiles;
(2) Sort each group individually within the group, for example in ascending order;
(3) Represent the number of occurrences of each tile by a digit, with "0" for a tile that does not occur;
(4) Check whether the same group in step (3) contains several consecutive "0"s; if so, compress them into a single "0";
(5) Check whether the last position of one group and the first position of the next group in step (3) are both "0"; if so, do not compress them, and still record two consecutive "0"s;
(6) Repeat steps (2) to (5) until the whole hand is represented numerically;
(7) Compress the melds corresponding to the hand by the same method as steps (2) to (6) and append the result to the end of the hand's compressed value;
(8) Superimpose and compress the suited groups (characters, bamboo, dots), the wind group, and the dragon group respectively into one 32-bit integer, and record this value as the hash value hash of the hand.
This compression merges many hand situations, reduces the data volume, and removes redundant and useless information, while still preserving the characteristic information of the hand; for example, information about sequences and triplets is retained. This makes the data easier to analyze.
The information stored in each node includes the historical hand information and the tiles in the discard pool, as well as the action taken at the previous node and the action taken at the current node.
Compressing hands by the above steps shows that the meaningful components of a hand fall into 11 types in total: 4, 3, 2, 1111, 1110, 0111, 1011, 1101, 0101, 1010, 0110.
Every hand can ultimately be expressed in terms of these 11 components. Components may overlap, because different groupings form different hand shapes; for example, 111011 can be regarded as 1110 and 1011.
Take the component data 0702071011110707017074070111707 as an example, where "7" separates the suits and is used only for ease of presentation. The first five groups are hand data and the last four groups are the corresponding meld data. The characteristic components are decomposed as follows: 020 decomposes into 2; 1011110 decomposes into 1011, 0111, 1111, and 1110; 01 decomposes into 1; 40 decomposes into 4; and 0111 decomposes into 0111.
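The decomposition in the example above can be sketched in Python. This is an illustration under stated assumptions, not the patented procedure: it assumes that four-digit components are found by sliding a width-4 window over one compressed group and that a digit 2, 3, or 4 is reported as a component on its own. The names `WINDOW_COMPONENTS`, `MULTIPLE_COMPONENTS`, and `decompose` are hypothetical. The sketch covers only the 11 listed component types, so the isolated-tile case "01 decomposes into 1" is not reproduced.

```python
# The eight four-digit components and three single-digit components listed in the text.
WINDOW_COMPONENTS = {"1111", "1110", "0111", "1011", "1101", "0101", "1010", "0110"}
MULTIPLE_COMPONENTS = {"4", "3", "2"}


def decompose(group: str) -> list:
    """Return the components found in one compressed group string.

    Assumption: single digits 2/3/4 are reported first, then every
    width-4 window that matches one of the listed four-digit components.
    """
    found = [digit for digit in group if digit in MULTIPLE_COMPONENTS]
    for start in range(len(group) - 3):
        window = group[start:start + 4]
        if window in WINDOW_COMPONENTS:
            found.append(window)
    return found


print(decompose("020"))      # ['2']
print(decompose("1011110"))  # ['1011', '0111', '1111', '1110']
print(decompose("40"))       # ['4']
```

Overlapping windows are reported individually, which matches the remark in the text that components may overlap.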
For mahjong hand shapes, the three suits (characters, bamboo, dots), the four winds (east, south, west, north), and the three dragons (red, green, white) can each be interchanged among themselves without changing the shape. For example, a hand of "one, two, three of characters; two, three, four in a second suit; east wind; red dragon" has the same pattern as "one, two, three of dots; two, three, four of characters; south wind; green dragon".
To unify such cases, the suited groups (characters, bamboo, dots), the wind group, and the dragon group are respectively superimposed and compressed into one 32-bit integer, and this value is recorded as the hash value hash of the hand.
For example, for the initial shape 0702071011110707017074070111707, the character, bamboo, and dot groups are aligned at the low-order end and combined by bitwise NAND, while the wind and dragon groups are aligned at the high-order end, shifted right by 3 bits, and likewise combined by bitwise NAND.
The AI's play process includes the following steps:
(I) The AI plays normally according to the fastest-win rule, checks the current shanten number after each action, and proceeds to step (II) when the current shanten number decreases.
(II) Compress the current AI hand data and the corresponding meld data to obtain the hash value hash, as follows:
(1) Group the current hand by suit: characters, bamboo, dots, wind tiles, and dragon tiles;
(2) Sort each group individually within the group, for example in ascending order;
(3) Represent the number of occurrences of each tile by a digit, with "0" for a tile that does not occur;
(4) Check whether the same group in step (3) contains several consecutive "0"s; if so, compress them into a single "0";
(5) Check whether the last position of one group and the first position of the next group in step (3) are both "0"; if so, do not compress them, and still record two consecutive "0"s;
(6) Repeat steps (2) to (5) until the whole hand is represented numerically;
(7) Compress the melds corresponding to the hand by the same method as steps (2) to (6) and append the result to the end of the hand's compressed value;
(8) Superimpose and compress the suited groups (characters, bamboo, dots), the wind group, and the dragon group respectively into one 32-bit integer, and record this value as the hash value hash of the current hand.
(III) Using the current shanten number shanten obtained in step (I), find the corresponding array arr[shanten] in the database Data Map; using the hash value hash obtained in step (II), find in arr[shanten] the node with the same hash value and denote it node[0]. If node[0] is empty (that is, no node with the same hash value exists), return to step (I); otherwise, proceed to step (IV).
(IV) The node[0] obtained in step (III) heads a chain of several child nodes, each representing a hand at a given shanten number. Denote the AI's current hand node as node and the matched node as node[0]; in the chain structure of the matched node, the subsequent nodes are denoted node[1], node[2], …, node[shanten], where the maximum value of shanten in node[shanten] is the shanten number obtained in step (I).
(V) Perform characteristic component analysis on each node of the matched chain node[0], node[1], node[2], …, node[shanten] from step (IV) to obtain the components nc[0], nc[1], nc[2], nc[3], …, nc[x] (the number of components x varies with the arrangement of the node), and record the components of node[n] as nc[n][0], nc[n][1], …, nc[n][x]. Then, starting from the first node, compare each node's components with those of the following node in turn, and group together the components with the smallest change (that is, the fewest tile changes).
(VI) Using the method of step (V), group the components of node and node[0] to obtain a path from the components of node to the components of node[shanten], namely nc[n] → nc[0] → nc[1] → … → nc[shanten]. Each component change represents one or several tile changes that move the current node's hand toward a win. Continue step (VI) until the hand wins; if a tile required by a node component is no longer in the wall, or the probability of obtaining it is too small, play according to the fastest-win rule until the hand wins.
A tile required by a node component is generally considered absent from the wall when four copies of it are already visible in the discard rivers and melds. In addition, if only one copy of a required tile generally remains, the probability of obtaining it is considered too small.
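The fallback condition described above can be sketched as a simple count over visible tiles. This is a minimal illustration under assumptions: tiles are represented as short strings such as "2m", which is a hypothetical encoding; only the discard rivers and melds are counted, as in the text; and "only one copy remains" is read as three visible copies. The function names and status labels are illustrative.

```python
from collections import Counter

COPIES_PER_TILE = 4  # each tile kind exists four times in the set


def required_tile_status(tile: str, rivers: list, melds: list) -> str:
    """Classify a required tile as 'absent', 'unlikely', or 'available'."""
    visible = Counter(rivers)[tile] + Counter(melds)[tile]
    remaining = COPIES_PER_TILE - visible
    if remaining <= 0:
        return "absent"     # all four copies are visible
    if remaining == 1:
        return "unlikely"   # assumed reading of "only one" copy left
    return "available"


def should_fall_back(required: list, rivers: list, melds: list) -> bool:
    """True when any required tile is absent or too unlikely to obtain."""
    return any(required_tile_status(t, rivers, melds) != "available" for t in required)


rivers = ["2m", "2m", "5p", "5p", "5p"]
melds = ["2m", "2m"]
print(required_tile_status("2m", rivers, melds))  # absent
print(required_tile_status("5p", rivers, melds))  # unlikely
print(should_fall_back(["7s"], rivers, melds))    # False
```

When `should_fall_back` returns True, the AI in this sketch would return to the fastest-win rule, as described in step (VI).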
By adopting the above technical scheme, the invention has the following beneficial effects compared with the prior art: based on learning from historical hands, and by also extracting discard-river components, the AI can estimate to some extent what the other participants are likely to discard in the next turn; after comparing each participant's discard river in the current hand with the turn-by-turn discard rivers of historical high-scoring hands, the AI can estimate the likely winning probability of their hands and thus reduce the probability of dealing in; a matching node is found among the historical nodes, play follows the historical line of play, and the node data is extracted as behavioral guidance for the AI, so that a locally optimal solution is found; and different hand data can be supplied as guidance data according to the current level of different players, so that the AI playing under that guidance is at the same level as the players in the guidance data, giving players the best game experience.
Drawings
Other advantages and features of the invention are illustrated by the following description of an embodiment of the invention, given by way of example and not by way of limitation, in connection with the accompanying drawings, in which:
FIG. 1 is a schematic diagram of the relationship between the compressed data of feature nodes in historical hands and the nodes in the resulting database Data Map.
FIG. 2 is a flow chart of AI play in the present invention.
Detailed Description
Nodes matching the node features are extracted from the historical hands. The node features are: a decrease in the hand's shanten number; an increase in the hand's shanten number; a chow, pung, or kong; and an improving tile change. The hand data of each extracted node and the corresponding meld data are compressed to obtain a hash value for the node's hand.
The compression of the hand comprises the following steps:
(1) Group the hand by suit: characters, bamboo, dots, wind tiles, and dragon tiles;
(2) Sort each group in ascending order within the group, for example 1, 3, 4, 5, 8, 8 of characters after sorting;
(3) Represent the number of occurrences of each tile by a digit, with "0" for a tile that does not occur; the group above becomes "101110020";
(4) Check whether the same group in step (3) contains several consecutive "0"s; if so, compress them into a single "0"; the group above becomes "10111020";
(5) Check whether the last position of one group and the first position of the next group in step (3) are both "0"; if so, do not compress them, and still record two consecutive "0"s;
(6) Repeat steps (2) to (5) until the whole hand is represented numerically;
(7) Compress the meld data corresponding to the hand data by the same method as steps (2) to (6) of the hand compression, and append the result to the end of the hand's compressed value.
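Steps (1) to (6) above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the tile encoding as `(suit, rank)` pairs, the suit names, and the function names are assumptions made here. Groups are returned as a list of strings, so zeros on either side of a group boundary are never merged, as step (5) requires. Melds would be compressed the same way and appended, which is omitted here.

```python
import re
from collections import Counter

# Hypothetical encoding: (suit name, number of distinct ranks in that suit).
SUITS = [("characters", 9), ("bamboo", 9), ("dots", 9), ("winds", 4), ("dragons", 3)]


def compress_group(ranks: list, size: int) -> str:
    """Steps (2)-(4): count each rank in order, then collapse runs of zeros."""
    counts = Counter(ranks)
    digits = "".join(str(counts.get(rank, 0)) for rank in range(1, size + 1))
    return re.sub(r"0{2,}", "0", digits)


def compress_hand(tiles: list) -> list:
    """Steps (1) and (6): group tiles by suit and compress every group."""
    return [
        compress_group([rank for suit, rank in tiles if suit == name], size)
        for name, size in SUITS
    ]


# The worked example from the text: 1, 3, 4, 5, 8, 8 of characters.
print(compress_group([1, 3, 4, 5, 8, 8], 9))  # 10111020
```

An empty group compresses to a single "0", which is consistent with the "0" groups that appear in the component data example elsewhere in the description.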
Through the compression mode, a plurality of hand situations can be combined, the data volume is reduced, the compression of information is realized, redundancy is reduced, useless information is removed, and meanwhile, the characteristic information of the hand can be well reserved. For example: the information such as cistron and scale can be well stored. This is more advantageous for analysis of the data.
The information stored by each node contains the current hand information of the player and the data in the pool of cards, as well as the actions taken by the player (such as what cards were presented) and the actions taken by the current node (such as what cards were presented) by the player.
The hand is compressed through the steps, so that the meaningful components in the hand are determined to be divided into 11 types in total: 4,3,2, 1111, 1110, 0111, 1011, 1101, 0101, 1010, 0110.
All hands may be in the final form of 11 components as described above. The above compositions may overlap due to different combinations forming different card types, for example: 111011 can be regarded as 1110 and 1011.
Taking component data 0702071011110707017074070111707 as an example, wherein '7' is the division of each color, is only used for word expression, is convenient to distinguish, the first five groups of data are hand data, and the second four groups of data are the dew data corresponding to the hand. The characteristic components of the composition data are decomposed into: the 020 feature decomposes to 2;1011110 features decompose into 1011, 0111, 1111, 1110; the 01 feature is decomposed into 1; the 40 features decompose into 4 and the 0111 features decompose into 0111.
For mahjong types, the ten thousand barrels, southeast, northwest, middle blushing, respectively, can be exchanged in sequence without changing, for example: "ten thousand, twenty thousand, thirty thousand, twenty, three, four, ten thousand" the card type of Dongfeng, zhongzhong, the patterns are consistent with the patterns of 'one barrel, two barrels, three barrels, twenty thousands, thirty thousands, four tens of thousands, south wind, dealing'.
To unify this, the kaleidoscope, wind plate, arrow plate are respectively superimposed and compressed into a 32-bit integer data, which is noted as the hash of the current hand.
For example, for the initial tile pattern 0702071011110707017074070111707, the characters, bamboo, and dots data are aligned at the low-order bits and superimposed using bitwise AND and NAND operations; likewise, the arrow tile data are aligned at the high-order bits, shifted right by 3 bits, and superimposed using the same bitwise operations.
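The property the hash must have is that exchanging suits within a family leaves the value unchanged. The sketch below demonstrates only that property. It deliberately substitutes a different technique (sorting the per-suit strings and taking a CRC-32 via Python's zlib.crc32) for the bitwise superposition described above, whose exact bit layout is not reproduced here. The function name and argument layout are assumptions.

```python
import zlib

def suit_invariant_key(number_suits, winds, dragons):
    """32-bit key that ignores which suit, wind, or dragon carries each pattern.

    number_suits: compressed strings for characters, bamboo, dots (any order)
    winds, dragons: per-tile count strings, e.g. "1000" for a single East
    """
    parts = sorted(number_suits)             # suit order no longer matters
    parts.append("".join(sorted(winds)))     # which wind it is no longer matters
    parts.append("".join(sorted(dragons)))   # which dragon it is no longer matters
    return zlib.crc32("|".join(parts).encode("ascii"))

# Characters 1-2-3, bamboo 2-3-4, East, Red ...
a = suit_invariant_key(["1110", "01110", "0"], "1000", "100")
# ... versus dots 1-2-3, characters 2-3-4, South, Green.
b = suit_invariant_key(["01110", "0", "1110"], "0100", "010")
print(a == b)
```

Sorting is chosen here only because it makes the invariance easy to see; it is not a claim about how the patented hash is computed.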
After the nodes are extracted from the historical data, they are compressed as described above. Nodes with the same shanten number are then placed in the same array according to the current shanten number of each node's hand, giving N arrays arr, namely arr[1], arr[2], …, arr[N], where N is the maximum shanten number among the hands extracted from all data. Together the arrays arr form the database Data Map. Each node also contains index data for the node with the previous shanten number and the node with the next shanten number. The entire chain represents all specific actions in one game, as shown in Fig. 1.
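One possible in-memory layout for the Data Map is sketched below: a dictionary from shanten number to the list of nodes at that number, with each node linked to its neighbours in the same game. The Node fields and the use of object references in place of stored index data are assumptions for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

@dataclass(eq=False)  # compare nodes by identity; the links form cycles
class Node:
    hand_hash: int
    shanten: int
    action: str
    prev: Optional["Node"] = None  # previous-shanten node in the same game
    next: Optional["Node"] = None  # next-shanten node in the same game

def build_data_map(games):
    """games: list of chains, each a list of Nodes in playing order."""
    data_map = defaultdict(list)
    for chain in games:
        # Link consecutive nodes so the chain can be walked in both directions.
        for earlier, later in zip(chain, chain[1:]):
            earlier.next, later.prev = later, earlier
        # Bucket every node by the shanten number of its hand.
        for node in chain:
            data_map[node.shanten].append(node)
    return data_map

n3, n2, n1 = Node(111, 3, "discard"), Node(222, 2, "pong"), Node(333, 1, "discard")
data_map = build_data_map([[n3, n2, n1]])
print(sorted(data_map), data_map[2][0].next.shanten)
```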
In actual play, the specific game flow is shown in Fig. 2 and comprises the following steps:
(I) The AI plays normally according to the fastest-win rule and evaluates the current shanten value after each operation; when the current shanten value decreases, it proceeds to step (II).
(II) The current hand data of the AI and the corresponding meld data are compressed to obtain a hash value hash. The compression method is the same as the one used for the hand data and corresponding meld data of the nodes extracted from historical hands.
(III) Using the current shanten value shanten obtained in step (I), the corresponding array arr[shanten] is located in the database Data Map. Using the hash value hash obtained in step (II), a node with the same hash value is then looked up in arr[shanten] and denoted node[0]. If node[0] is empty (that is, no node with the same hash value exists), the AI returns to step (I); otherwise it proceeds to step (IV).
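Step (III) amounts to a lookup with a fallback. A minimal sketch follows, assuming the Data Map is a dictionary from shanten value to a list of node records and that each record carries its hash under a "hash" key; both are hypothetical choices.

```python
def find_matching_node(data_map, shanten, hand_hash):
    """Return the first node in arr[shanten] whose hash equals hand_hash, else None."""
    for node in data_map.get(shanten, []):
        if node["hash"] == hand_hash:
            return node
    return None

# Hypothetical two-bucket Data Map.
data_map = {
    2: [{"hash": 0x1A2B, "action": "discard"}],
    1: [{"hash": 0x3C4D, "action": "pong"}],
}

node0 = find_matching_node(data_map, 2, 0x1A2B)
# An empty result sends the AI back to step (I); a match moves on to step (IV).
next_step = "IV" if node0 is not None else "I"
print(next_step)
```

The linear scan keeps the sketch close to the wording of the step; indexing each array by hash value would be an obvious alternative if the arrays are large.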
(IV) The chain of node[0] obtained in step (III) contains several child nodes, each representing a hand at a particular shanten value. The AI's current hand node is denoted node, and the matched node is node[0]. In the chain structure of the matched node, the subsequent nodes are denoted node[1], node[2], …, node[shanten], where the maximum value of shanten in node[shanten] is the shanten value obtained in step (I).
(V) Feature-component analysis is performed on each node of the matched chain structure from step (IV), node[0], node[1], node[2], …, node[shanten], to obtain the components nc[0], nc[1], nc[2], nc[3], …, nc[x] (the number of components x varies with the arrangement of the node). The components corresponding to node[n] are denoted nc[n][0], nc[n][1], …, nc[n][x]. Once the components are obtained, each node is compared in turn with the node that follows it, starting from the first node, and the components with the smallest change (that is, the fewest tiles exchanged) are grouped together.
For example:
node[0] components: nc[0][0] = 2, nc[0][1] = 0110, nc[0][2] = 3, nc[0][3] = 1111
node[1] components: nc[1][0] = 2, nc[1][1] = 0111, nc[1][2] = 2, nc[1][3] = 0110
Then:
{nc[0][0],nc[1][0]}
{nc[0][1],nc[1][3]}
{nc[0][3],nc[1][1]}
{nc[0][2],nc[1][2]}
are grouped in pairs; this combination minimizes the number of tiles exchanged, which here is 2.
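The grouping in this example can be reproduced with a small brute-force search over all one-to-one pairings. The cost function is an assumption made for illustration (digit-wise difference for components of equal width, otherwise the tile counts of both components); the text states only that the pairing with the smallest change is chosen.

```python
from itertools import permutations

def change_cost(a, b):
    """Hypothetical cost: digit-wise difference for same-width components,
    otherwise the tiles of both components (drop one, build the other)."""
    if len(a) == len(b):
        return sum(abs(int(x) - int(y)) for x, y in zip(a, b))
    return sum(map(int, a)) + sum(map(int, b))

def best_pairing(src, dst):
    """Try every one-to-one pairing and keep the cheapest (assumes equal lengths)."""
    def total(perm):
        return sum(change_cost(src[i], dst[j]) for i, j in enumerate(perm))
    best = min(permutations(range(len(dst))), key=total)
    return total(best), list(enumerate(best))

nc0 = ["2", "0110", "3", "1111"]   # components of node[0]
nc1 = ["2", "0111", "2", "0110"]   # components of node[1]
cost, pairs = best_pairing(nc0, nc1)
print(cost, pairs)  # → 2 [(0, 0), (1, 3), (2, 2), (3, 1)]
```

With this cost the search returns the same four pairs as the example and a total change of 2. Exhaustive search is adequate for the handful of components in one hand; an assignment algorithm would be a natural replacement for larger inputs.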
(VI) Using the method of step (V), the components of node and node[0] are grouped, giving a path that transforms the components of node into those of node[shanten], namely nc[n] → nc[0] → nc[1] → … → nc[shanten]. Each component change represents one or several tile exchanges, which move the hand of the current node toward a winning hand.
(VII) Step (VI) continues until the hand wins. If a tile required by a node component does not exist in the tile wall, or the probability of obtaining it is too small, the AI plays according to the fastest-win rule until it wins.
Here, a required tile not existing in the tile wall generally means that four copies of it are already visible in the discard river and in the melds. In addition, if only one copy of a required tile remains, the probability of obtaining it is considered too small.
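The two fallback conditions can be expressed as a count of visible copies. The sketch below assumes tiles are short strings such as "5m", that the river is a flat list, and that melds are lists of tiles; these representations and the function names are hypothetical.

```python
COPIES_PER_TILE = 4  # each mahjong tile exists in four copies

def remaining_copies(tile, river, melds):
    """Copies not yet visible in the discard river or in any meld."""
    visible = river.count(tile) + sum(meld.count(tile) for meld in melds)
    return max(COPIES_PER_TILE - visible, 0)

def should_fall_back(tile, river, melds):
    """True when the tile is gone (0 left) or too unlikely (only 1 left)."""
    return remaining_copies(tile, river, melds) <= 1

river = ["5m", "9p"]
melds = [["5m", "5m", "5m"]]
print(remaining_copies("5m", river, melds), should_fall_back("5m", river, melds))
```

When should_fall_back returns True for a required tile, the AI would return to the fastest-win rule as described in step (VII).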
Although the invention has been described in terms of the preferred embodiment, it is not intended to limit the scope of the invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention as defined by the appended claims.

Claims (3)

1. A big-data-driven mahjong AI data processing method, characterized in that tile-play processes are selected from historical games, the node hands in those processes are extracted according to their components and stored as behavior guidance for the AI, and the AI plays tiles according to the node components of its current hand on the basis of that guidance; the method specifically comprises the following steps:
extracting nodes that conform to node features from historical hands, wherein the node features comprise: a decrease in the shanten number of the hand, an increase in the shanten number of the hand, a chow, pong, or kong action, and an improving tile exchange; compressing the hand of each extracted node to obtain the hash value corresponding to the node hand; placing nodes with the same shanten value into the same array according to the shanten value of the node hand, to obtain N arrays arr, denoted arr[1], arr[2], …, arr[N], wherein N is the maximum shanten value among the hands extracted from all data, the arrays arr together form a database Data Map, and each node contains index data of the previous-shanten node and the next-shanten node of that node, through which the hand before the exchange and the hand after the exchange can be found from the current hand of a historical node;
the AI tile-play process comprises the following steps:
(I) the AI plays normally according to the fastest-win rule and evaluates the current shanten value after each operation, and when the current shanten value decreases, proceeds to step (II);
(II) compressing the current hand data of the AI and the corresponding meld data to obtain the current hash value hash;
(III) according to the current shanten value shanten obtained in step (I), locating the corresponding array arr[shanten] in the database Data Map, and, according to the hash value hash obtained in step (II), finding in arr[shanten] a node with the same hash value, denoted node[0]; if node[0] is empty, returning to step (I), and otherwise proceeding to step (IV);
(IV) the chain of node[0] obtained in step (III) contains several child nodes, each representing a hand at a particular shanten value; the current hand node of the AI is denoted node and the matched node is node[0]; in the chain structure of the matched node, the subsequent nodes are denoted node[1], node[2], …, node[shanten], wherein the maximum value of shanten in node[shanten] is the shanten value obtained in step (I);
(V) performing feature-component analysis on each node of the matched chain structure from step (IV), node[0], node[1], node[2], …, node[shanten], to obtain the corresponding components nc[0], nc[1], nc[2], nc[3], …, nc[x], wherein the number of components x varies with the arrangement of the node and the components corresponding to node[n] are denoted nc[n][0], nc[n][1], …, nc[n][x]; after the components are obtained, comparing each node in turn with the node that follows it, starting from the first node, and grouping together the components with the smallest change;
(VI) grouping the components of node and node[0] by the method of step (V) to obtain a path that transforms the components of node into those of node[shanten], namely nc[n] → nc[0] → nc[1] → … → nc[shanten], wherein each component change represents one or several tile exchanges, so that the hand of the current node moves toward a winning hand;
(VII) continuing step (VI) until the hand wins, and, if a tile required by a node component does not exist in the tile wall or the probability of obtaining it is too small, playing according to the fastest-win rule until the hand wins;
wherein the compression comprises the following steps:
(1) Grouping the hand tiles by suit;
(2) Sorting the tiles within each group;
(3) Representing the number of occurrences of each tile by a digit, with a tile that does not occur represented by "0";
(4) Determining whether several consecutive "0"s exist within the same group in step (3), and if so, compressing them into a single "0";
(5) Determining whether the last position of the preceding group and the first position of the following group in step (3) are both "0", and if so, leaving them uncompressed and still recording them as two consecutive "0"s;
(6) Repeating steps (2)-(5) until the whole hand is represented by digits;
(7) Compressing the melds corresponding to the hand by the same method as steps (2)-(6) and appending the result to the end of the compressed value of the hand;
(8) Superimposing and compressing the number-suit tiles (characters, bamboo, dots), the wind tiles, and the arrow tiles respectively into 32-bit integer data, the value of which is recorded as the hash value hash of the hand.
2. The big-data-driven mahjong AI data processing method according to claim 1, wherein: in step (VI), a tile required by a node component not existing in the tile wall means that four copies of the required tile are already present in the discard river and in the opponents' melds.
3. The big-data-driven mahjong AI data processing method according to claim 1, wherein: in step (VI), the probability of obtaining the tile being too small means that only one copy of the required tile remains.
CN201910759449.9A 2019-08-16 2019-08-16 Mahjong AI data processing method based on big data driving Active CN110478907B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910759449.9A CN110478907B (en) 2019-08-16 2019-08-16 Mahjong AI data processing method based on big data driving

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910759449.9A CN110478907B (en) 2019-08-16 2019-08-16 Mahjong AI data processing method based on big data driving

Publications (2)

Publication Number Publication Date
CN110478907A CN110478907A (en) 2019-11-22
CN110478907B true CN110478907B (en) 2023-04-28

Family

ID=68551557

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910759449.9A Active CN110478907B (en) 2019-08-16 2019-08-16 Mahjong AI data processing method based on big data driving

Country Status (1)

Country Link
CN (1) CN110478907B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112685921B (en) * 2021-03-12 2021-06-15 中至江西智能技术有限公司 Mahjong intelligent decision method, system and equipment for efficient and accurate search

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013075085A (en) * 2011-09-30 2013-04-25 Xpec Entertainment Inc Machine-implemented method of handling hand tile/card of virtual player in tile game/card game
US20150018059A1 (en) * 2013-07-15 2015-01-15 Yaowen CHEN Online Mahjong Game
CN106469317A (en) * 2016-09-20 2017-03-01 哈尔滨工业大学深圳研究生院 A kind of method based on carrying out Opponent Modeling in non-perfect information game
CN108926843A (en) * 2017-05-22 2018-12-04 周少华 The control method and system won the game in a kind of mahjong class game

Also Published As

Publication number Publication date
CN110478907A (en) 2019-11-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant