WO2016085042A1

WO2016085042A1 - Card use pattern analysis method for predicting type of business used, and server for performing same

Info

Publication number: WO2016085042A1
Application number: PCT/KR2015/001297
Authority: WO
Inventors: 이태영
Original assignee: 비씨카드(주)
Priority date: 2014-11-28
Filing date: 2015-02-09
Publication date: 2016-06-02
Also published as: KR101624272B1; CN107004221A; CN107004221B

Abstract

According to one embodiment of the present invention, a card use pattern analysis method for predicting the type of business used by a card user comprises the steps of: collecting business type card use information from a plurality of users; forming at least one hash function group including at least one hash function; calculating, for each hash function, hash values for the business type card use information and extracting minimum values from the hash values; generating, for each hash function group, clustering keys on the basis of the minimum values; grouping the plurality of users by using the clustering keys; and predicting the type of business to be used in the future by the card user by using the grouping information.

Description

Card usage pattern analysis method for predicting usage industry and server performing the same

The present invention relates to a card usage pattern analysis method for predicting a usage industry and a card company server performing the same. More particularly, the present invention relates to a method of quickly analyzing card usage patterns by grouping users through a plurality of hash function groups and applying a suffix tree and Bayes' theorem, and of using the analysis results, and to a card company server performing the same.

In modern society, where economic activity is becoming more active, payment methods for goods and services are becoming more and more diversified and complicated.

Among them, a payment method using a card is one of the most commonly used payment methods in addition to cash payment, and various types of cards such as check cards, credit cards, and debit cards are used.

Each card company conducts a kind of CRM (Customer Relationship Management) that analyzes the customer's payment information for customer management and recommends cards for the industries that customers frequently use. Algorithms such as Nearest Neighbor (K-NN) and K-Means have been used as a method of clustering information.

However, since a single card company has many customers, analyzing all of their card usage patterns with these memory-based algorithms requires a high-performance system such as a supercomputer. Card companies have therefore sampled a certain amount of customer information and compared and analyzed the card usage patterns of the sampled customers. Increasing the amount of sampled data increased the time required for analysis, while decreasing it lowered the accuracy of the analysis.

Therefore, there is a demand for a method of quickly processing a large amount of customer information and more accurately calculating the probability of use of each target customer.

The present invention aims to solve the above-mentioned problems of the prior art.

An object of the present invention is to more accurately predict the probability for each of the industries to be used in the future based on the industries previously used by the user.

Another object of the present invention is to provide card product recommendation information suitable for the user based on the probability calculated for the user's future use.

Another object of the present invention is to detect card misuse based on the probability calculated for the user's future use.

In order to achieve the above objects, an embodiment of the present invention provides a card usage pattern analysis method for predicting the usage industry of a card user, the method comprising: collecting card usage industry information from a plurality of users; constructing at least one hash function group, each hash function group comprising at least one hash function; calculating, for each hash function, hash values of the card usage industry information and extracting a minimum value among the hash values; generating, for each hash function group, a clustering key based on the minimum values; grouping the plurality of users using the clustering keys; and predicting a future usage industry of a card user by using the grouping information.

The grouping step may further include constructing a use path for each industry type included in each group classified for each clustering key as a suffix tree.

The suffix tree may set weights of the use paths according to the frequency of use of each type of business.

The usage industry prediction step may include: extracting a plurality of users whose industry usage patterns are similar to that of a target user, who is the subject of the usage industry prediction; calculating a prior probability from the usage history of each of the extracted users; and calculating, using Bayes' theorem, the probability of industries that may be used in the future, based on the industries previously used by the target user.

The method of analyzing a card usage pattern may further include accumulating an anomaly value when the user uses a card in an industry whose future usage probability is below a certain value, and determining that the card is being used fraudulently when the anomaly value reaches a certain value.

The grouping step may include distributing data processing to a plurality of processing units in generating a user group for each of the clustering keys by performing a Map-Reduce operation.

The method of analyzing a card use pattern may further include recommending a card product including a benefit for the industry based on the predicted use industry.

In order to achieve the above objects, another embodiment of the present invention provides a card company server for analyzing a user's card usage pattern, the server comprising: a card usage information collecting unit for collecting card usage industry information from a plurality of users; a hash function group constructing unit for constructing at least one hash function group including at least one hash function; a map reducer for calculating, for each hash function, hash values of the card usage industry information and extracting a minimum value among the hash values, generating a clustering key for each hash function group based on the extracted minimum values, and grouping the plurality of users using the generated clustering keys; a suffix tree construction unit for constructing, as a suffix tree, the industry usage paths included in each group classified by clustering key; and a per-industry usage probability calculation unit for calculating the usage probability of future usage industries based on information on the industries previously used by the target user to be analyzed.

The suffix tree building unit may set weights for the use paths according to the frequency of use of each type of business.

The per-industry usage probability calculation unit may extract a plurality of users whose usage patterns are similar to that of the target user, calculate a prior probability from the usage history of each of the extracted users, and calculate the usage probability of the target user using Bayes' theorem.

The card company server may further include a card product recommendation unit for providing card product recommendation information to the user based on the calculated use probability for each industry type.

The card company server may further include a fraud detection unit that, based on the calculated usage probability for each industry, accumulates an anomaly value when the user uses the card in an industry whose usage probability is at or below a certain value, and determines that the user's card is being used fraudulently when the anomaly value reaches a certain value.

The map reducer may be configured to distribute data processing necessary for performing user grouping to a plurality of processing units.

According to an embodiment of the present invention, Bayes' theorem can be used to calculate the usage probability of each industry based on the industries previously used by the user.

According to an embodiment of the present invention, based on the probabilities calculated for the user's future usage industries, it is possible to provide the user with information on card products that offer substantial benefits, and to detect that the user's card is being used fraudulently.

The effects of the present invention are not limited to the above-described effects, but should be understood to include all the effects deduced from the configuration of the invention described in the detailed description or claims of the present invention.

FIG. 1 is a diagram schematically illustrating a card usage pattern analysis system according to an embodiment of the present invention.

FIG. 2 illustrates a process of hashing the industry codes of the industries in which a user has made payments using a card, and a process of generating clustering keys from the resulting hash values, according to an embodiment of the present invention.

FIG. 3 is a schematic diagram illustrating a method of grouping users through a Min-Hash algorithm according to an embodiment of the present invention.

FIG. 4 is a diagram briefly showing a suffix tree constructed according to an embodiment of the present invention.

FIG. 5 is a block diagram showing the internal configuration of a card company server according to an embodiment of the present invention.

Hereinafter, the present invention will be described with reference to the accompanying drawings. As those skilled in the art would realize, the described embodiments may be modified in various different ways, all without departing from the spirit or scope of the present invention. In the drawings, parts irrelevant to the description are omitted in order to clearly describe the present invention, and like reference numerals designate like parts throughout the specification.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only being "directly connected" but also being "indirectly connected" with another member in between. In addition, when a part is said to "include" a certain component, this means that it may further include other components, rather than excluding them, unless otherwise stated.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

The card usage pattern analysis system according to an embodiment of the present invention may include a user's card 100, affiliated stores 200, and a card company server 300.

The card 100 according to an embodiment may be any card with which the user can pay directly, such as a credit card, check card, or debit card. In addition, the card 100 may take the form of a magnetic card, IC card, mobile card, or RF card.

An affiliated store 200 according to an embodiment may have a CAT terminal, POS terminal, or other payment device that can perform payment through the user's card 100, and may communicate with the card company server 300 through it. Communication between the terminal of the affiliated store 200 and the card company server 300 may be performed through a VAN company server.

Affiliated stores 200 according to an embodiment may be classified into a plurality of industries on the card company server 300. For example, the card company may classify each affiliated store 200 as a convenience store, restaurant, large mart, mobile communication provider, coffee shop, hospital, academy, and the like, and may provide various benefits, such as discounts and points, when a payment is made at an affiliated store 200 of a particular industry.

The card company server 300 may assign a code to each industry in order to manage the industry classification of the affiliated stores 200 described above. Industry codes may be configured in various forms, but for convenience the present description assumes that each code consists of four digits.

The card company server 300 manages the payment history of the cards 100 of a plurality of users, including the industry code of each affiliated store 200 at which a payment was made through each card 100. As an example, consider four users whose usage histories are summarized in Table 1 below.

Table 1

User 1	4072	4063	4012	4011	4566
User 2	4072	4063	4076	4099	4800
User 3	4095	4044	4042	4511	4566	4800	4099
User 4	4702	4063	4012	4011	4566	4042	4511	4566	4800

Referring to Table 1, it can be seen that the first user and the second user each made payments with their cards 100 in five industries, the third user in seven industries, and the fourth user at nine affiliated stores 200. In addition, according to an embodiment, the industry code information may be managed in the order in which each user made payments at the affiliated stores 200.

Based on this information, the card company server 300 may analyze each user's card usage pattern and predict the industry of the affiliated store 200 at which each user will make the next payment.

As a specific method for predicting a user's future usage industry, the card company server 300 may use a Min-Hash method. The Min-Hash algorithm groups users by the minimum of their hash values, where a hash value is the fixed-size value generated when data is hashed.

Hashing means transforming data into another form so that it can be used for various purposes. In the present invention, the hash function used to perform hashing may be a one-way hash function. With a one-way hash function, the content of the original data cannot be recovered from the hash value generated by hashing it. Also, identical original data always produces identical hash values under the same hash function, but identical hash values do not guarantee that the original data is identical.

The hash algorithms used by the card company server 300 disclosed in the present invention may be verified hash algorithms selected from the Secure Hash Algorithm (SHA) family, or may be algorithms that the card company server 300 generates arbitrarily.

The card company server 300 may generate a hash function group including at least one hash function, and a plurality of hash function groups may be set. Hereinafter, the present invention will be described on the assumption that two hash function groups are implemented on the card company server 300 and that each hash function group includes two hash functions.

When the two hash function groups configured on the card company server 300 are referred to as q1 and q2 and the hash functions are referred to as h1 to h4, q1 may include h1 and h2, and q2 may include h3 and h4. This may be expressed as Equation 1.

Equation 1

$$q_1 = \{h_1, h_2\}, \qquad q_2 = \{h_3, h_4\}$$

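According to one embodiment, each of the hash functions disclosed in the present invention may be configured in the form of Equation 2, where x is an industry code and a, b, and p are constants chosen for each hash function.

Equation 2

$$h(x) = (a x + b) \bmod p$$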

For example, the hash functions of h1 to h4 may be configured as in Equation 3 below.

Equation 3

The card company server 300 may obtain the hash value corresponding to each industry code by substituting the industry codes of the affiliated stores 200 at which the users made payments into each of the hash functions h1 to h4.

For example, consider the process of obtaining the hash values for the data shown in Table 1. When 4072, one of the industry codes of the industries in which the first user made payments, is substituted into the hash function h1, the calculation h1(4072) = 3 × 4072 + 5 (mod 17) = 12221 (mod 17) = 15 produces a hash value of 15.
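The calculation above can be reproduced with a short sketch. The function name `h1` and the dictionary layout are illustrative choices; only the constants 3, 5, and 17 and the first user's industry codes from Table 1 come from the description.

```python
# Hash function h1 as used in the worked example: h1(x) = (3x + 5) mod 17.
def h1(code: int) -> int:
    return (3 * code + 5) % 17

# Industry codes of the first user, in payment order (Table 1).
user1_codes = [4072, 4063, 4012, 4011, 4566]

# Hash value for each industry code.
hashes = {code: h1(code) for code in user1_codes}

# Minimum hash value, which later becomes part of the clustering key.
min_hash = min(hashes.values())

print(hashes)
print(min_hash)
```

Under these constants, codes 4063 and 4012 both hash to 5, and the first user's minimum for h1 is 1, which matches the first half of the clustering key 00010000 given for that user.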

The card company server 300 obtains a hash value for each industry code by performing the above calculation for every industry code of the industries in which each user has made a payment. The results can be found in the table shown in FIG. 2.

FIG. 2 illustrates a process of hashing the industry codes of the industries in which a user has made payments using a card 100, and a process of generating clustering keys from the resulting hash values, according to an embodiment of the present invention.

Referring to FIG. 2, the industry codes of the affiliated stores 200 at which the four users made payments are shown together with the hash values calculated by substituting each industry code into the hash functions h1 to h4. Referring to part (a) of FIG. 2, which corresponds to the first user, the industry codes 4063 and 4012 are included. Although the two industry codes differ, h1 yields the same hash value, 5, for both, whereas h2 yields different values, 1 and 6.

The Min-Hash algorithm used by the card company server 300 extracts, for each hash function, the minimum of the hash values calculated when hashing the industry codes of the affiliated stores 200 at which each user used the card 100. The minimum values thus extracted are shaded in the table shown in FIG. 2.

Referring to FIG. 2, when the hash values of the industry codes of each user's card usage are calculated with each hash function, one minimum value is extracted per hash function. In the present invention, these minimum values are divided by hash function group and used to group the users. In the example of FIG. 2, a clustering key of 00010000 is generated for the first user in the hash function group q1. Specifically, the key 00010000 is formed by combining 1, the minimum of the hash values generated by h1, and 0, the minimum of the hash values generated by h2. Similarly, a clustering key of 00050006 is generated for the first user in the hash function group q2. In the same manner, two clustering keys are generated for each of the second to fourth users.

The clustering key is the value used as the criterion for classifying users in the present invention, and users having the same clustering key are included in the same group. Since as many clustering keys are generated per user as there are hash function groups set by the card company server 300, each user can belong to as many groups as there are hash function groups.
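The key construction can be sketched as follows. The four-digit zero padding is inferred from the example keys 00010000 and 00050006. The function `h1` follows the worked example, while the constants in `h2` are hypothetical and chosen only for illustration, so keys produced with it will not necessarily match FIG. 2.

```python
from typing import Callable, Sequence

HashFn = Callable[[int], int]

def h1(code: int) -> int:
    # Constants taken from the worked example in the description.
    return (3 * code + 5) % 17

def h2(code: int) -> int:
    # Hypothetical constants, not taken from the description.
    return (5 * code + 2) % 13

def format_key(minima: Sequence[int]) -> str:
    """Concatenate minimum hash values, each zero-padded to four digits."""
    return "".join(f"{m:04d}" for m in minima)

def clustering_key(codes: Sequence[int], group: Sequence[HashFn]) -> str:
    """Build one clustering key for one user and one hash function group."""
    return format_key([min(h(c) for c in codes) for h in group])

# Minimum values stated in the description for the first user.
print(format_key([1, 0]))  # key for q1
print(format_key([5, 6]))  # key for q2

# Key computed from Table 1 with the illustrative group [h1, h2].
user1_codes = [4072, 4063, 4012, 4011, 4566]
print(clustering_key(user1_codes, [h1, h2]))
```

Adding more hash functions to a group lengthens the key, which makes an exact match between two users less likely and the resulting groups smaller.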

Referring to FIG. 3, a process of grouping first to fourth users through the clustering key calculated in FIG. 2 may be seen.

For example, the clustering key 00010000 groups the first user, the third user, and the fourth user together, and the clustering key 00030002 groups the third user and the fourth user together. Each of the remaining clustering keys, 00050001, 00050002, and 00050006, forms a group that contains only one user.

According to an embodiment, as the number of hash functions included in one hash function group increases, the probability that each clustering key is unique increases, so the number of users included in one group may decrease.
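The grouping just described can be sketched with a dictionary keyed by clustering key. The keys below are the ones named in the description, written as eight digits. Assigning both 00050001 and 00050002 to the second user is an inference from the fact that every user has two keys and these two groups contain one user each.

```python
from collections import defaultdict

# Clustering keys per user, as named in the description of FIG. 2 and FIG. 3.
user_keys = {
    "User 1": ["00010000", "00050006"],
    "User 2": ["00050001", "00050002"],  # inferred assignment, see lead-in
    "User 3": ["00010000", "00030002"],
    "User 4": ["00010000", "00030002"],
}

# Users sharing a clustering key fall into the same group.
groups = defaultdict(list)
for user, keys in user_keys.items():
    for key in keys:
        groups[key].append(user)

for key in sorted(groups):
    print(key, groups[key])
```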

According to an embodiment of the present invention, the card company server 300 may use Hadoop, which is a platform for processing big data, in grouping users using the clustering key as described above.

Hadoop is a platform that distributes data processing tasks across a plurality of data processing systems in order to process huge amounts of data, such as big data, efficiently.

Hadoop is composed of the Hadoop Distributed File System (HDFS) and the Map-Reduce algorithm. The following describes how the card company server 300 groups users using the Map-Reduce algorithm.

According to an embodiment, the Map-Reduce algorithm performs a map operation and a reduce operation. In the present invention, through the map operation the card company server 300 derives each user's clustering keys from the minimum hash values of the industry codes of the affiliated stores 200 at which the user made payments, and through the reduce operation it groups the users by clustering key.

To implement the Min-Hash algorithm through the map operation, the card company server 300 distributes each user's information to the processing units that perform the map operation. Each processing unit may convert the industry codes of the affiliated stores at which each user made payments into hash values and generate a clustering key for each hash function group.

As described above, the card company server 300 extracts the minimum of the hash values calculated by substituting the user's industry usage information into each hash function and, when a plurality of hash functions exist, combines the extracted minimum values to generate a clustering key.

As described above, since the process of generating the clustering key is performed for each user, user information may be distributed and processed in a plurality of processing units included in or connected to the card company server 300 and then grouped through a reduce operation.

The example here is limited to four users. Processing the data of all members of a card company on a single system, however, either requires a high-performance system such as a supercomputer or takes a long time. Distributing user data over a plurality of processing units included in or connected to the card company server 300, as described above, improves the data processing speed. According to an embodiment of the present invention, as the data processing speed increases, the card company server 300 can predict the industry of the affiliated store 200 at which a user will make future payments and recommend to the user, in real time, a card with benefits for that industry.

The card company server 300 performs a reduce operation based on the data for which mapping is completed, and performs grouping based on the clustering key derived for each user. Referring to FIG. 3, the process from (b) to (c) is processed by the reduce operation.
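The map and reduce steps can be illustrated with a small single-machine sketch. It is not Hadoop code: a thread pool stands in for the plurality of processing units, and the names `map_user` and `reduce_pairs` are hypothetical. The first hash function follows the worked example, and the constants of the other three are assumptions made for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Two hash function groups of two functions each.
# Only the first function uses constants from the description.
HASH_GROUPS = {
    "q1": [lambda x: (3 * x + 5) % 17, lambda x: (5 * x + 2) % 13],
    "q2": [lambda x: (7 * x + 1) % 19, lambda x: (2 * x + 3) % 11],
}

# Usage histories from Table 1.
USAGE = {
    "User 1": [4072, 4063, 4012, 4011, 4566],
    "User 2": [4072, 4063, 4076, 4099, 4800],
    "User 3": [4095, 4044, 4042, 4511, 4566, 4800, 4099],
    "User 4": [4702, 4063, 4012, 4011, 4566, 4042, 4511, 4566, 4800],
}

def map_user(item):
    """Map step: emit one (clustering key, user) pair per hash function group."""
    user, codes = item
    pairs = []
    for fns in HASH_GROUPS.values():
        key = "".join(f"{min(h(c) for c in codes):04d}" for h in fns)
        pairs.append((key, user))
    return pairs

def reduce_pairs(mapped):
    """Reduce step: collect the users that share each clustering key."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, user in pairs:
            groups[key].append(user)
    return dict(groups)

# Each user is mapped independently, so the work can be spread over workers.
with ThreadPoolExecutor(max_workers=2) as pool:
    mapped = list(pool.map(map_user, USAGE.items()))

groups = reduce_pairs(mapped)
for key in sorted(groups):
    print(key, groups[key])
```

Because the map step needs only one user's history at a time, it parallelizes without coordination. Only the reduce step has to bring together records that share a key.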

The card company server 300 may then construct a suffix tree of the industry usage paths included in each group classified by clustering key. A suffix tree is an index data structure that allows a string to be searched through the suffixes it contains. It can be used to detect whether a specific string pattern exists in the indexed data.

The card company server 300 may generate information on user groups based on the clustering key through a mapping operation and a reduce operation. The card company server 300 may construct a usage path of each user's industry in the form of a suffix tree for each group grouped by users having the same clustering key.

For example, suppose a specific group includes two users, referred to as the first user and the second user. The first user may have used the card 100 in the industry order A-B-C-D, and the second user in the industry order A-F-B-C.

In this case, the card company server 300 may build the suffix tree shown in FIG. 4 based on the card usage industry paths of the first and second users. In constructing the suffix tree of FIG. 4, when the order of the industries in which the first user used the card 100 is A-B-C-D, the suffixes of this string are D, C-D, B-C-D, and A-B-C-D. Similarly, when the order of the industries in which the second user used the card 100 is A-F-B-C, the suffixes of this string are C, B-C, F-B-C, and A-F-B-C. In this way, the card company server 300 constructs the suffix tree from the list of suffixes of the strings representing the order of industries used by each user.
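The suffix lists for the two example paths can be generated with a few lines. The helper name `suffixes` is illustrative.

```python
def suffixes(path):
    """Return all suffixes of a usage path, shortest first."""
    return [path[i:] for i in range(len(path) - 1, -1, -1)]

first_user = ["A", "B", "C", "D"]
second_user = ["A", "F", "B", "C"]

for path in (first_user, second_user):
    print(["-".join(s) for s in suffixes(path)])
# ['D', 'C-D', 'B-C-D', 'A-B-C-D']
# ['C', 'B-C', 'F-B-C', 'A-F-B-C']
```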

In Fig. 4, each circle is called a node and includes industry type information as described on the drawing. The line connecting each node may indicate an order of use by industry.

Referring to FIG. 4, the nodes branching from the top (root node) of the suffix tree may have the values A, BC, CD, D, FB, and C, respectively. From the node corresponding to A, nodes having the values BCD and FBC branch out, and from the node corresponding to BC, a node having the value D branches out.

Looking at node A, the industry sequences that can follow A, as seen in ABCD and AFBC, are BCD and FBC, so the corresponding nodes are derived. Looking at node BC, BC is followed by D in ABCD, but no industry follows BC in AFBC, so only the node corresponding to D is derived.

As such, by constructing the industry usage paths included in each group generated through the clustering key as a suffix tree, the time required to search for an industry usage path can be shortened. That is, when a usage path is searched for using the suffix tree, the search can be performed in O(n) time, where n is the length of the pattern being searched. For example, when searching for the last five industries used by a particular user, the pattern can be found with only five accesses even in the worst case.

According to an embodiment of the present disclosure, when building the suffix tree the card company server 300 may assign a weight to nodes corresponding to industries frequently used by the users included in the group. The weights given to the nodes of the suffix tree may be applied to the probability calculation in the usage industry prediction described later.

According to an embodiment of the present disclosure, the card company server 300 may predict the industry of the affiliated store 200 at which a user will pay next by analyzing the user's previous card usage pattern based on the constructed suffix trees.

To perform the usage industry prediction, the card company server 300 may first select the user who is the subject of the prediction. Hereinafter, the selected user will be referred to as the target user. According to an embodiment of the present disclosure, the target user is a user who has recently made a payment with the card 100. The usage industry prediction is then performed in real time on the card company server 300, and a card product recommendation message based on the prediction may be transmitted to the user within a predetermined time after the payment.
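A weighted lookup structure of this kind can be sketched as follows. For brevity the sketch uses an uncompressed suffix trie with one industry per node, rather than the compressed suffix tree of FIG. 4, and it uses the number of suffixes passing through a node as the weight. Both choices, and all names, are assumptions for illustration.

```python
class Node:
    def __init__(self):
        self.children = {}  # industry -> Node
        self.weight = 0     # number of inserted suffixes passing through

def build_suffix_trie(paths):
    """Insert every suffix of every usage path, counting visits per node."""
    root = Node()
    for path in paths:
        for start in range(len(path)):
            node = root
            for industry in path[start:]:
                node = node.children.setdefault(industry, Node())
                node.weight += 1
    return root

def find(root, pattern):
    """Follow the pattern from the root; one step per industry in the pattern."""
    node = root
    for industry in pattern:
        node = node.children.get(industry)
        if node is None:
            return None
    return node

def next_industries(root, pattern):
    """Weights of the industries observed directly after the pattern."""
    node = find(root, pattern)
    return {} if node is None else {k: c.weight for k, c in node.children.items()}

# Usage paths of the two example users, one character per industry.
root = build_suffix_trie(["ABCD", "AFBC"])

print(next_industries(root, "BC"))  # {'D': 1}
print(next_industries(root, "A"))   # {'B': 1, 'F': 1}
print(find(root, "BC").weight)      # 2
```

The lookup cost depends only on the length of the pattern, not on how many users are in the group, which is what makes the search fast enough for real-time use.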

The card company server 300 may then search for all the usage paths for each type of business in the group to which the target user belongs, and extract a predetermined number of users that show a card use pattern similar to the target user.

According to an embodiment, in determining the similarity between the card usage pattern of the target user and the card use pattern of the other users, Jaccard Similarity, Pearson Correlation, Cosine Similarity, and the like may be used.
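As one concrete case, Jaccard similarity compares two users by the overlap of the sets of industries they have used. The sketch below applies it to the data in Table 1. Treating each history as a set (ignoring order and repeats) and the choice of the first user as the target are assumptions for illustration.

```python
def jaccard(a, b):
    """Size of the intersection divided by size of the union of two code sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Usage histories from Table 1.
USAGE = {
    "User 1": [4072, 4063, 4012, 4011, 4566],
    "User 2": [4072, 4063, 4076, 4099, 4800],
    "User 3": [4095, 4044, 4042, 4511, 4566, 4800, 4099],
    "User 4": [4702, 4063, 4012, 4011, 4566, 4042, 4511, 4566, 4800],
}

target = "User 1"
ranked = sorted(
    (u for u in USAGE if u != target),
    key=lambda u: jaccard(USAGE[target], USAGE[u]),
    reverse=True,
)

print(jaccard(USAGE["User 1"], USAGE["User 2"]))  # 0.25
print(ranked)  # ['User 4', 'User 2', 'User 3']
```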

The card company server 300 may use Bayes' theorem to predict the usage industry of the target user. Briefly, Bayes' theorem obtains the probability of a hypothesis from a prior probability and from the probability that changes as new data is added, and it can be expressed by the following equation.

Equation 4

$$P(Y \mid X) = \frac{P(X \mid Y)\,P(Y)}{P(X)}$$

When Equation 4 is applied to the present invention, X and Y each represent an industry of the affiliated stores 200, and P(X) and P(Y) represent the probability that a user makes a payment with the card 100 at an affiliated store 200 of industry X or industry Y, that is, the probability that the user uses industry X or industry Y. P(Y|X) represents the probability that users who use industry X use industry Y, and P(X|Y) represents the probability that users who use industry Y use industry X.

The card company server 300 calculates these probabilities over a certain number of users who, within the group to which the target user belongs among the groups classified by clustering key, show industry usage patterns similar to the target user's, and thereby finds the industry in which the target user is most likely to make a future payment. The card company server 300 may also predict a predetermined number of top industries in which the target user is likely to make a payment.

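Equation 4 may be converted as follows, where the sum runs over the industries Y_i other than X.

Equation 5

$$P(Y \mid X) = \frac{P(X \mid Y)\,P(Y)}{\sum_{i} P(X \mid Y_i)\,P(Y_i)}$$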

In Equation 5, P(Y) in the numerator is fixed as the prior probability, and the denominator expresses the probability that industry X is used as the sum of the probabilities that X is used after each of the remaining industries other than X. In this way, the conditional probability that a user of industry Y uses industry X is used to calculate the probability that a user of industry X uses industry Y. By repeating this calculation and combining the probabilities of the industries that a user of industry X may use next, the industry with the highest probability can be extracted. Instead of selecting a single industry, the card company server 300 may also extract a certain number of industries in descending order of probability.

According to an embodiment of the present disclosure, the card company server 300 may recommend a new card product to the target user based on the probability analysis of the industries the target user will use in the future. The card product recommended to the target user may be the card with the greatest benefit for the industry the target user is most likely to use, or a card chosen by considering together the benefits for a certain number of industries the target user is likely to use. For example, for each card product, the probability of each industry the target user may use can be multiplied by an evaluation score for the card's benefit in that industry, and card products can be recommended in descending order of the total score.

The card company server 300 may recommend card products in real time through the user's terminal. Real-time recommendation is possible because, as described above, the data processing work is distributed over a plurality of processing units through the Map-Reduce method and the user's industry usage pattern can be retrieved quickly through the suffix tree.
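The calculation in Equation 5 can be sketched numerically. The industries and all probability values below are made up for illustration, and the function name `posterior` is hypothetical. In practice the priors and conditional probabilities would be estimated from the usage histories of the similar users extracted earlier.

```python
from fractions import Fraction as F

def posterior(prior, likelihood):
    """Return P(Y | X) for every candidate industry Y.

    prior[y]      : P(Y), how often the similar users use industry y
    likelihood[y] : P(X | Y), how often industry X accompanies industry y
    """
    # Denominator of Equation 5: sum over all candidate industries.
    evidence = sum(likelihood[y] * prior[y] for y in prior)
    return {y: likelihood[y] * prior[y] / evidence for y in prior}

# Illustrative values; X is the industry the target user has just used.
prior = {"coffee shop": F(1, 2), "large mart": F(1, 4), "hospital": F(1, 4)}
likelihood = {"coffee shop": F(1, 2), "large mart": F(1, 4), "hospital": F(0)}

result = posterior(prior, likelihood)
best = max(result, key=result.get)

print({y: str(p) for y, p in result.items()})
print(best)  # coffee shop
```

With these numbers the denominator is 1/4 + 1/16 + 0 = 5/16, so the posterior probabilities are 4/5, 1/5, and 0, and they sum to 1.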
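The scoring rule in the example above can be sketched as a weighted sum. The card names, benefit scores, and probabilities are made-up values, and `card_score` is a hypothetical helper.

```python
def card_score(probabilities, benefits):
    """Sum of (predicted usage probability x benefit score) over industries."""
    return sum(p * benefits.get(industry, 0) for industry, p in probabilities.items())

# Predicted usage probabilities for the target user (illustrative).
probabilities = {"coffee shop": 0.8, "large mart": 0.2}

# Benefit evaluation score of each card per industry (illustrative).
cards = {
    "Card A": {"coffee shop": 3, "large mart": 1},
    "Card B": {"large mart": 5},
}

ranking = sorted(
    cards,
    key=lambda name: card_score(probabilities, cards[name]),
    reverse=True,
)
print(ranking)  # ['Card A', 'Card B']
```

Card B has the larger single benefit, but Card A ranks first because its benefit applies to the industry the user is far more likely to use.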

According to an embodiment of the present disclosure, the card company server 300 may determine whether the target user's card 100 is used for fraudulent use, based on the probability of the industry to be used later by the target user.

In detail, the probability that the target user will use each industry in the classification system of the card company server 300 is calculated, and if the target user repeatedly uses industries whose probability is lower than a predetermined value, the usage is judged to be fraudulent.

For example, the card company server 300 may manage an anomaly value related to fraudulent use for each user, and increase the anomaly value whenever the target user uses an industry whose calculated probability of future use is lower than a certain value. When the anomaly value increases in this way and reaches a certain value, the target user's card usage can be determined to be fraudulent.

According to an embodiment of the present disclosure, when it is determined that a specific user's card is being used fraudulently, the card company server 300 may set the corresponding card 100 to an unusable state.
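The accumulation rule can be sketched with a small counter. The class name, the threshold 0.05, and the limit of 3 are hypothetical. The description does not say whether the anomaly value is ever reset, so this sketch never resets it.

```python
class FraudMonitor:
    def __init__(self, min_probability, anomaly_limit):
        self.min_probability = min_probability  # below this, a payment is unusual
        self.anomaly_limit = anomaly_limit      # anomaly value that triggers a flag
        self.anomaly = {}                       # user -> accumulated anomaly value

    def observe(self, user, predicted_probability):
        """Record one payment; return True once the anomaly value reaches the limit."""
        if predicted_probability < self.min_probability:
            self.anomaly[user] = self.anomaly.get(user, 0) + 1
        return self.anomaly.get(user, 0) >= self.anomaly_limit

monitor = FraudMonitor(min_probability=0.05, anomaly_limit=3)

# Predicted probability of the industry used in each successive payment.
flags = [monitor.observe("User 1", p) for p in (0.40, 0.01, 0.02, 0.30, 0.03)]
print(flags)  # [False, False, False, False, True]
```

Requiring several unusual payments before flagging keeps a single out-of-pattern purchase from blocking the card.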

Hereinafter, the configuration of the card company server 300 according to an embodiment of the present invention will be described with reference to FIG. 5. FIG. 5 is a block diagram showing the internal configuration of the card company server 300 according to an embodiment of the present invention.

The card company server 300 according to an embodiment of the present invention may include a card usage information collecting unit 310, a hash function group configuration unit 320, a map reducer 330, a suffix tree construction unit 340, an industry-specific use probability calculation unit 350, a card product recommendation unit 360, a fraud detection unit 370, a control unit 380, and a communication unit 390.

According to an embodiment of the present invention, the card usage information collecting unit 310, the hash function group configuration unit 320, the map reducer 330, the suffix tree construction unit 340, the industry-specific use probability calculation unit 350, the card product recommendation unit 360, the fraud detection unit 370, the control unit 380, and the communication unit 390 may be program modules or hardware capable of communicating with an external device. Such program modules or hardware may be included in the card company server 300, or in another device that can communicate with the card company server 300, in the form of an operating system, application modules, or other program modules, and may be physically stored on various known storage devices. Such program modules or hardware include, but are not limited to, routines, subroutines, programs, objects, components, and data structures that perform particular tasks or implement particular abstract data types in accordance with the present invention.

The card usage information collecting unit 310 according to an embodiment may collect payment information of card users. In the present invention, the information used to analyze a user's card usage pattern is the industry information of the affiliated store 200 at which the user made a payment with the card 100. The card usage information collecting unit 310 may store this industry information in the form of an industry code corresponding to each industry.

The card usage information collecting unit 310 may also store the time of each user's payments, so that the suffix tree can later be constructed based on the order in which the user used the industries.
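As a minimal sketch of why the payment time is stored, the following turns unordered payment records into a per-user sequence of industry codes ordered by time. The record layout, the function name `usage_sequences`, and the codes `C01` to `C03` are hypothetical and are not taken from the specification.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Tuple

# (user id, payment time, industry code); layout and values are hypothetical
Payment = Tuple[str, datetime, str]


def usage_sequences(payments: List[Payment]) -> Dict[str, List[str]]:
    """Return each user's industry codes in the order the payments were made."""
    sequences: Dict[str, List[str]] = defaultdict(list)
    for user_id, _, code in sorted(payments, key=lambda p: p[1]):
        sequences[user_id].append(code)
    return dict(sequences)


payments = [
    ("u1", datetime(2014, 11, 3, 19, 0), "C01"),
    ("u1", datetime(2014, 11, 1, 9, 30), "C02"),
    ("u2", datetime(2014, 11, 2, 12, 0), "C01"),
    ("u1", datetime(2014, 11, 2, 20, 15), "C03"),
]
print(usage_sequences(payments))
```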

The hash function group configuration unit 320 according to an embodiment may configure the hash functions used to hash the industry codes of the industries used by a user, and may configure hash function groups each including at least one hash function.

As described above, the hash functions used in the present invention may take the form shown in Equations 2 and 3, and a plurality of recognized hash algorithms may be used.

The hash function group configuration unit 320 may configure a plurality of hash function groups, and the number of groups may be adjusted according to the results of the user card usage pattern analysis. For example, as the number of hash function groups increases, the number of groups to which one user belongs may increase, so that each user's card usage pattern can be analyzed from more varied perspectives. However, as the number of groups that the card company server 300 must analyze for each user increases, the analysis takes more time, and the load on the card company server 300 and on the processing units connected to it for data analysis may increase.

The hash function group configuration unit 320 may adjust the number of hash functions included in each hash function group. If more hash functions are included in one hash function group, the number of possible clustering keys becomes larger, so the number of users included in one group becomes smaller. As a result, the pool of users in the same group as the target user, from which users with a usage history similar to the target user's are later extracted, may be reduced. Conversely, if the number of hash functions in one hash function group is small, the number of users in one group increases, so the accuracy of finding users whose industry usage pattern is similar to the target user's may decrease. The hash function group configuration unit 320 may weigh these advantages and disadvantages and adjust each hash function group so that it includes an appropriate number of hash functions.
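The trade-off between the number of hash functions per group and the number of groups can be made concrete with the standard Min-Hash collision property. The sketch below assumes idealized, independent hash functions, for which the probability that two users share one minimum value equals the Jaccard similarity of their industry sets, and it assumes that a clustering key matches only when every minimum value in the group matches. Neither assumption is stated in this form in the specification, and the function names are hypothetical.

```python
def same_key_probability(jaccard: float, hashes_per_group: int) -> float:
    """Probability that two users get the same clustering key in one group."""
    return jaccard ** hashes_per_group


def any_group_probability(
    jaccard: float, hashes_per_group: int, num_groups: int
) -> float:
    """Probability that two users share a key in at least one group."""
    miss = 1.0 - same_key_probability(jaccard, hashes_per_group)
    return 1.0 - miss ** num_groups


# Two users whose industry sets have Jaccard similarity 0.5
print(same_key_probability(0.5, 2))      # more hash functions: stricter groups
print(any_group_probability(0.5, 2, 2))  # more groups: more chances to meet
```

Under these assumptions, adding hash functions to a group lowers the chance that dissimilar users are grouped together, while adding groups raises the chance that similar users meet in at least one group, at the cost of more groups to analyze.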

The map reducer 330 according to an embodiment may generate clustering keys for each user by performing a map operation and a reduce operation, and may group users based on the generated clustering keys. The map reducer 330 may be divided into a mapping unit and a reducing unit that perform their respective roles. The mapping unit and the reducing unit may be included in the card company server 300, or may be distributed across a plurality of processing units connected to the card company server 300.

The map reducer 330 may implement the Min-Hash algorithm through the map operation and generate clustering keys through the map-reduce operation. In detail, the map reducer 330 hashes the industry code of each industry used by each user with each hash function generated by the hash function group configuration unit 320, and extracts, for each hash function, the minimum of the calculated hash values. Thereafter, the map reducer 330 may generate a clustering key for each hash function group based on the extracted minimum values.

The map reducer 330 may group users based on the clustering keys generated for each user. Users with the same clustering key are placed in one group, and because a user has a plurality of clustering keys, one user may belong to a plurality of groups.
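The key generation and grouping described above can be sketched in a single process as follows. Several details are assumptions made only for illustration: the linear hash form `(a * x + b) mod PRIME` stands in for the hash functions of the specification, the clustering key is formed by joining the minimum values of one group with hyphens, the group index is included in the key so that keys from different hash function groups are not mixed, and the division of work between the map and reduce steps is one possible choice. A real deployment would run these steps on a distributed map-reduce framework.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Set, Tuple

PRIME = 2_147_483_647  # assumed modulus (2**31 - 1)

HashParams = Tuple[int, int]        # (a, b) of the assumed hash form
ClusterKey = Tuple[int, str]        # (hash group index, joined minimum values)


def min_hash(codes: Set[int], params: HashParams) -> int:
    """Minimum hash value over a user's industry codes for one hash function."""
    a, b = params
    return min((a * code + b) % PRIME for code in codes)


def map_user(
    user_id: str, codes: Set[int], groups: List[List[HashParams]]
) -> Iterator[Tuple[ClusterKey, str]]:
    """Map step: emit one (clustering key, user id) pair per hash function group."""
    for index, group in enumerate(groups):
        key = "-".join(str(min_hash(codes, params)) for params in group)
        yield (index, key), user_id


def reduce_groups(
    pairs: Iterable[Tuple[ClusterKey, str]]
) -> Dict[ClusterKey, List[str]]:
    """Reduce step: collect the users that share each clustering key."""
    clusters: Dict[ClusterKey, List[str]] = defaultdict(list)
    for key, user_id in pairs:
        clusters[key].append(user_id)
    return dict(clusters)


# Hypothetical industry codes per user and two hypothetical hash function groups
usage = {"u1": {101, 205, 309}, "u2": {101, 205, 309}, "u3": {412}}
hash_groups = [[(3, 7), (11, 13)], [(5, 1), (17, 19)]]

pairs = [
    pair
    for user_id, codes in usage.items()
    for pair in map_user(user_id, codes, hash_groups)
]
for key, users in reduce_groups(pairs).items():
    print(key, users)
```

In this example `u1` and `u2` used the same industries, so they receive identical clustering keys in both hash function groups, while `u3` lands in separate groups.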

The suffix tree construction unit 340 according to an embodiment may build, for each group divided by the clustering key, the industry usage paths of the users in that group in the form of a suffix tree. As shown in FIG. 4, the suffix tree is a data structure in which the usage path of each user belonging to a group is arranged in tree form, and it may include information on the order in which the users used the industries.

The suffix tree construction unit 340 may assign a weight to each node in the suffix tree. For example, it may assign a weight to nodes corresponding to frequently occurring industry usage patterns, so that the weight can be taken into account when the future use probability for each industry is later calculated.
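A weighted structure of this kind can be sketched as below. For simplicity the sketch builds an uncompressed suffix trie rather than a compressed suffix tree, and it uses the number of times a path was observed as the node weight; both choices, along with the class and function names and the industry labels, are assumptions made for illustration.

```python
from typing import Dict, List, Optional


class Node:
    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.weight = 0  # how many times the path ending here was observed


def build_suffix_trie(sequences: List[List[str]]) -> Node:
    """Insert every suffix of every usage sequence, counting visits per node."""
    root = Node()
    for sequence in sequences:
        for start in range(len(sequence)):
            node = root
            for code in sequence[start:]:
                node = node.children.setdefault(code, Node())
                node.weight += 1
    return root


def path_weight(root: Node, path: List[str]) -> int:
    """Weight of a consecutive usage path, or 0 if it was never observed."""
    node: Optional[Node] = root
    for code in path:
        node = node.children.get(code) if node else None
        if node is None:
            return 0
    return node.weight


# Hypothetical usage sequences of the users in one group
group_sequences = [
    ["fuel", "cafe", "cinema"],
    ["cafe", "cinema"],
    ["fuel", "cafe"],
]
trie = build_suffix_trie(group_sequences)
print(path_weight(trie, ["cafe", "cinema"]))
print(path_weight(trie, ["cinema", "fuel"]))
```

Because every consecutive run of industries is a prefix of some suffix, looking up a path takes time proportional to the path length, regardless of how many users are in the group.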

According to an exemplary embodiment, the industry-specific use probability calculation unit 350 may calculate the probability that the target user will use other industries in the future, based on information about the industries the target user has previously used.

To this end, the industry-specific use probability calculation unit 350 may use the suffix tree to search the industry usage paths in each group to which the target user belongs, and may extract from each group a predetermined number of users whose industry usage patterns are most similar to the target user's.

The industry-specific use probability calculation unit 350 may use Bayes' theorem, as described above, in calculating the target user's use probability for each industry. In detail, it may calculate the prior probability that each industry is used by analyzing the entire usage history of the users whose industry usage patterns are similar to the target user's, extracted as described above.

When information about the industry previously used by the target user is collected, the industry-specific use probability calculation unit 350 may use Bayes' theorem to calculate the posterior probability based on this information, that is, the probability of each industry being used in the future given the previously used industry.
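One way to read the prior and posterior calculation is sketched below. The event definitions are assumptions made for illustration: the prior P(B) is the share of industry B among all "next" usages in the similar users' histories, the likelihood P(A | B) is the share of those usages of B that were immediately preceded by industry A, and the posterior P(B | A) follows from Bayes' theorem. The function name and the industry labels are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def next_industry_probabilities(
    histories: List[List[str]], previous: str
) -> Dict[str, float]:
    """Posterior P(next industry | previous industry) via Bayes' theorem."""
    transitions: Counter = Counter()  # counts of (previous, next) pairs
    next_counts: Counter = Counter()  # counts of each industry as "next"
    for sequence in histories:
        for prev, nxt in zip(sequence, sequence[1:]):
            transitions[(prev, nxt)] += 1
            next_counts[nxt] += 1

    total = sum(next_counts.values())
    unnormalized: Dict[str, float] = {}
    for nxt, count in next_counts.items():
        prior = count / total                              # P(B)
        likelihood = transitions[(previous, nxt)] / count  # P(A | B)
        unnormalized[nxt] = likelihood * prior             # P(A | B) * P(B)

    evidence = sum(unnormalized.values())                  # P(A)
    if evidence == 0:
        return {}  # the previous industry was never followed by anything
    return {
        nxt: value / evidence
        for nxt, value in unnormalized.items()
        if value > 0
    }


# Hypothetical histories of users similar to the target user
similar_histories = [
    ["fuel", "cafe", "cinema"],
    ["fuel", "cafe"],
    ["fuel", "mart"],
    ["cafe", "cinema"],
]
print(next_industry_probabilities(similar_histories, "fuel"))
```

With these made-up histories, a target user who last used `fuel` is assigned roughly a two-thirds probability of using `cafe` next and one third of using `mart`.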

According to an embodiment of the present disclosure, the card product recommendation unit 360 may recommend a new card product to the user based on the probabilities calculated by the industry-specific use probability calculation unit 350. The card product may be recommended by having the card product recommendation unit 360 transmit the relevant information to a user terminal such as a computer or a mobile phone, or by having a telemarketer of the card company obtain the information from the card product recommendation unit 360 and conduct telemarketing.

The card product recommendation unit 360 may recommend the card with the greatest benefit for the industry that the target user is most likely to use in the future, or may recommend a card that includes benefits for all of a certain number of industries that the target user is likely to use. In addition, the card product recommendation unit 360 may provide the card product recommendation information to the user together with information analyzing the user's card usage pattern.

The fraud detection unit 370 according to an embodiment may detect fraudulent use of the user's card, based on the probabilities derived by the industry-specific use probability calculation unit 350.

The fraud detection unit 370 may determine fraudulent use by, for example, managing an anomaly value in the form of points when the user repeatedly makes payments in industries that the user has been determined to have a low probability of using. For example, when a user makes a payment in an industry whose probability is lower than a certain value, points may be assigned according to the calculated probability, and when the accumulated points reach a certain value, the card 100 may be selected as a fraud detection target.
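The point accumulation described above might look like the following sketch. The class name `FraudMonitor`, the threshold values, and in particular the rule that a payment adds `1 - probability` points are hypothetical; the specification says only that points are assigned according to the calculated probability.

```python
class FraudMonitor:
    """Accumulates anomaly points for one card (illustrative only)."""

    def __init__(self, prob_threshold: float, point_limit: float) -> None:
        self.prob_threshold = prob_threshold  # "certain value" for the probability
        self.point_limit = point_limit        # "certain value" for the points
        self.points = 0.0
        self.flagged = False

    def record_payment(self, use_probability: float) -> bool:
        """Update the points for one payment; return True once the card is flagged."""
        if use_probability < self.prob_threshold:
            # Assumed rule: the less likely the industry, the more points added
            self.points += 1.0 - use_probability
        if self.points >= self.point_limit:
            self.flagged = True
        return self.flagged


monitor = FraudMonitor(prob_threshold=0.25, point_limit=1.5)
for probability in [0.5, 0.0, 0.125]:
    print(probability, monitor.record_payment(probability), monitor.points)
```

A payment in a likely industry leaves the points unchanged, so occasional unusual purchases do not flag the card until enough low-probability payments have accumulated.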

As described above, according to the embodiment of the present invention, the card company server 300 can quickly analyze a user's card usage pattern and, based on the probabilities of the industries the user is likely to use in the future, can recommend suitable card products to the user in real time or detect fraudulent use of the user's card.

The foregoing description of the present invention is intended for illustration, and it will be understood by those skilled in the art that the present invention may be readily modified into other specific forms without changing its technical spirit or essential features. Therefore, it should be understood that the embodiments described above are exemplary in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and similarly, components described as distributed may be implemented in a combined form.

The scope of the present invention is represented by the following claims, and it should be construed that all changes or modifications derived from the meaning and scope of the claims and their equivalents are included in the scope of the present invention.

Claims

A card usage pattern analysis method for predicting the industries a card user will use, the method comprising:

Collecting card use industry information from a plurality of users;

Constructing at least one hash function group, the at least one hash function group comprising at least one hash function;

Calculating hash values of the card use industry information for each hash function and extracting a minimum value among the hash values;

For each hash function group, generating a clustering key based on the minimum values;

Grouping the plurality of users using the clustering key; and

Predicting the industries the card user will use in the future by using the grouping information.
The method of claim 1,

wherein the grouping step comprises:

constructing, as a suffix tree, a usage path for each industry included in each group formed by the clustering key.
The method of claim 2,

wherein a weight is set for each usage path in the suffix tree according to the frequency with which users use the industries.
The method of claim 1,

wherein the predicting step comprises:

Extracting a plurality of users whose industry usage patterns are similar to that of a target user who is the subject of the industry prediction;

Calculating a prior probability from the usage histories of the extracted users; and

Calculating, using Bayes' theorem, the probability of the industries that may be used in the future based on information about the industries previously used by the target user.
The method of claim 4,

further comprising accumulating an anomaly value when the user uses the card in an industry whose probability of future use is below a certain value, and determining the user's card use to be fraudulent when the anomaly value reaches a certain value.
The method of claim 1,

wherein the grouping step comprises:

generating a user group for each clustering key by performing a map-reduce operation in which the data processing is distributed to a plurality of processing units.
The method of claim 1,

further comprising recommending, based on the predicted industries, a card product that includes benefits for those industries.
A card company server for analyzing a user's card usage pattern, the server comprising:

A card usage information collecting unit configured to collect card use industry information from a plurality of users;

A hash function group configuration unit for constructing at least one hash function group including at least one hash function;

A map reducer for calculating hash values of the card use industry information for each hash function, extracting a minimum value among the hash values, generating a clustering key for each hash function group based on the extracted minimum values, and grouping the plurality of users using the generated clustering key;

A suffix tree construction unit for constructing, as a suffix tree, a usage path for each industry included in each group formed by the clustering key; and

An industry-specific use probability calculation unit for calculating the use probability of industries to be used in the future, based on information about the industries previously used by a target user to be analyzed.
The card company server of claim 8,

wherein the suffix tree construction unit

sets a weight for each usage path according to the frequency with which users use the industries.
The card company server of claim 8,

wherein the industry-specific use probability calculation unit

extracts a plurality of users whose usage patterns are similar to the target user's, calculates a prior probability from the usage histories of the extracted users, and calculates the target user's use probability for each industry using Bayes' theorem.
The card company server of claim 8,

further comprising a card product recommendation unit for providing card product recommendation information to the user based on the calculated industry-specific use probability.
The card company server of claim 8,

further comprising a fraud detection unit that accumulates an anomaly value when a user uses the card in an industry whose use probability is equal to or less than a certain value, and determines the user's card use to be fraudulent when the anomaly value reaches a certain value.
The card company server of claim 8,

wherein the map reducer

is configured to distribute the data processing necessary for grouping users to a plurality of processing units.