CN111368543A - Method and device for determining merchant category - Google Patents

Method and device for determining merchant category

Publication number
CN111368543A
Authority
CN
China
Prior art keywords
merchant
category
name
determining
categories
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010099718.6A
Other languages
Chinese (zh)
Other versions
CN111368543B (en)
Inventor
郑磊
张达
朱峰言
赵萌
徐婷婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unionpay Advisors Counselor Shanghai Co ltd
Original Assignee
Unionpay Advisors Counselor Shanghai Co ltd
Application filed by Unionpay Advisors Counselor Shanghai Co ltd
Priority to CN202010099718.6A
Publication of CN111368543A
Application granted
Publication of CN111368543B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Databases & Information Systems (AREA)
  • Finance (AREA)
  • Theoretical Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for determining merchant categories. The method comprises: acquiring a merchant name library, wherein the merchant name library corresponds to a plurality of merchant categories and comprises first merchant names, whose merchant categories have been determined, and second merchant names, whose merchant categories have not been determined; and executing classification logic for each merchant category in turn, according to a preset sequence, as follows: determining the merchant name features corresponding to the merchant category according to the first merchant names in that category, and determining that any second merchant name conforming to those features belongs to the merchant category. The technical scheme automatically classifies merchants into merchant categories while ensuring the accuracy of the classification.

Description

Method and device for determining merchant category
Technical Field
The embodiment of the invention relates to the technical field of information, in particular to a method and a device for determining a merchant category.
Background
The financial industry classifies merchants mainly by merchant category code (MCC for short). The MCC identifies the transaction environment of a UnionPay card, the merchant's main business scope, and its industry affiliation, and is the main basis for determining the merchant's settlement fee standard in domestic interbank transactions. It is also important basic data for UnionPay card transaction industry analysis and reporting, and for UnionPay card business risk management and control. The MCC is formulated according to the ISO international standard "Retail financial services - Merchant category codes", which ensures that merchant industries are identified in the same way when bank cards are used across countries and borders.
When applying to register a POS terminal, a merchant reports its merchant category according to the MCC mapping table; the category is then reviewed by an auditor and recorded. This approach depends on self-reporting by the merchant and on-site investigation by the auditor, which consumes a large amount of human resources. Moreover, the category reported by the merchant may not match the actual situation.
Disclosure of Invention
The embodiment of the invention provides a method and a device for determining merchant categories, which are used for automatically dividing merchant categories to which merchants belong and ensuring the accuracy of division.
The method for determining the merchant category provided by the embodiment of the invention comprises the following steps:
acquiring a merchant name library; the merchant name library corresponds to a plurality of merchant categories, and the merchant name library comprises first merchant names of determined merchant categories and second merchant names of undetermined merchant categories;
according to a preset sequence, sequentially executing classification logic for each merchant category in the following modes:
determining merchant name characteristics corresponding to the merchant category according to a first merchant name in the merchant category; and determining that the second merchant name which accords with the merchant name characteristics corresponding to the merchant category belongs to the merchant category.
Optionally, the merchant name feature corresponding to the merchant category is determined according to the first merchant name in the merchant category; determining that a second merchant name that meets the merchant name characteristics corresponding to the merchant category belongs to the merchant category includes:
obtaining a plurality of first merchant names in the merchant category;
performing word segmentation processing on each first merchant name in the plurality of first merchant names to obtain a word set;
if the occurrence frequency of any word in the word set is greater than a first threshold and the word is not a preset common word, and/or the word is a brand word, determining the word as a whitelist word of the merchant category;
determining that a second merchant name that includes the whitelist term belongs to the merchant category.
Optionally, after determining that the second merchant name including the white list word belongs to the merchant category, the method further includes:
determining blacklist terms of the merchant category;
determining that a second merchant name in the merchant category that includes the blacklist terms does not belong to the merchant category, and determining other merchant categories to which the second merchant name that includes the blacklist terms belongs.
Optionally, after the classification logic is sequentially executed on each merchant category according to the preset sequence, the method further includes:
and if the determined classification result does not accord with the preset condition, adjusting the preset sequence, and sequentially executing the classification logic for each merchant category according to the adjusted preset sequence until the determined classification result accords with the preset condition.
Optionally, after the classification logic is sequentially executed on each merchant category according to the preset sequence, the method further includes:
aiming at any merchant name, acquiring transaction data of the merchant corresponding to the merchant name in a preset time period;
and determining the sub-category of the merchant name in the merchant category according to the transaction data of the merchant in the preset time period.
Optionally, the determining, according to the transaction data of the merchant in the preset time period, a sub-category of the merchant name in the merchant category includes:
if the difference value between the maximum transaction amount and the minimum transaction amount in the transaction data of the merchant in the preset time period is larger than a second threshold value, determining a plurality of transaction scenes corresponding to the merchant according to the transaction amount and the commodity details in the transaction data;
and determining the scene category of the merchant name in the merchant category according to a plurality of transaction scenes corresponding to the merchant.
Optionally, after determining the sub-category of the merchant name in the merchant category to which it belongs, the method further includes:
and for the customers involved in each piece of transaction data, determining the customer category to which the customers belong according to the sub-category of the merchant name in the merchant category to which it belongs.
In the above technical scheme, merchant categories are set, classification logic is executed for each merchant category in a preset sequence, and merchants whose categories have not yet been determined are classified into the corresponding merchant categories. The merchant categories of all merchants are thus determined automatically, which reduces labor cost. In addition, when the classification logic is executed for a merchant category, the merchant name features of that category are determined from the merchant names already in the category, and unclassified merchants that conform to those features are then classified into the current category. This avoids the mismatch with the actual situation that can arise when merchants report their own categories, and improves the accuracy of merchant classification.
Correspondingly, an embodiment of the present invention further provides a device for determining a merchant category, including:
the acquisition unit is used for acquiring a merchant name library; the merchant name library corresponds to a plurality of merchant categories, and the merchant name library comprises first merchant names of determined merchant categories and second merchant names of undetermined merchant categories;
the classification unit is used for sequentially executing classification logic on each merchant category according to a preset sequence in the following way:
determining merchant name characteristics corresponding to the merchant category according to a first merchant name in the merchant category; and determining that the second merchant name which accords with the merchant name characteristics corresponding to the merchant category belongs to the merchant category.
Optionally, the classification unit is specifically configured to:
obtaining a plurality of first merchant names in the merchant category;
performing word segmentation processing on each first merchant name in the plurality of first merchant names to obtain a word set;
if the occurrence frequency of any word in the word set is greater than a first threshold and the word is not a preset common word, and/or the word is a brand word, determining the word as a whitelist word of the merchant category;
determining that a second merchant name that includes the whitelist term belongs to the merchant category.
Optionally, the classifying unit is further configured to:
determining blacklist terms for the merchant category after the determining that a second merchant name including the whitelist terms belongs to the merchant category;
determining that a second merchant name in the merchant category that includes the blacklist terms does not belong to the merchant category, and determining other merchant categories to which the second merchant name that includes the blacklist terms belongs.
Optionally, the classifying unit is further configured to:
after the classification logic is sequentially executed on each merchant category according to the preset sequence, if the determined classification result does not accord with the preset condition, the preset sequence is adjusted, and the classification logic is sequentially executed on each merchant category according to the adjusted preset sequence until the determined classification result accords with the preset condition.
Optionally, the classifying unit is further configured to:
after the classification logic is sequentially executed on each merchant category according to the preset sequence, acquiring transaction data of the merchant corresponding to the merchant name in a preset time period aiming at any merchant name;
and determining the sub-category of the merchant name in the merchant category according to the transaction data of the merchant in the preset time period.
Optionally, the classification unit is specifically configured to:
if the difference value between the maximum transaction amount and the minimum transaction amount in the transaction data of the merchant in the preset time period is larger than a second threshold value, determining a plurality of transaction scenes corresponding to the merchant according to the transaction amount and the commodity details in the transaction data;
and determining the scene category of the merchant name in the merchant category according to a plurality of transaction scenes corresponding to the merchant.
Optionally, the classifying unit is further configured to:
after determining the sub-category of the merchant name in the affiliated merchant category, for the customers involved in each transaction data, determining the customer category to which the customers belong according to the sub-category of the merchant name in the affiliated merchant category.
Correspondingly, an embodiment of the present invention further provides a computing device, including:
a memory for storing program instructions;
and the processor is used for calling the program instructions stored in the memory and executing the method for determining the merchant category according to the obtained program.
Accordingly, an embodiment of the present invention further provides a computer-readable non-volatile storage medium, which includes computer-readable instructions, and when the computer reads and executes the computer-readable instructions, the computer is caused to execute the method for determining the category of the merchant.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a method for determining a merchant category according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of another method for determining a merchant category according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an apparatus for determining a merchant category according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
According to the consumption industry category concerned by commercial banks and relevant institutions, label design is carried out on merchants in the transaction information, and a label system suitable for the invention is obtained. The label system can be divided into 18 primary labels, 80 secondary labels and 135 tertiary labels.
The primary labels cover fields such as car enthusiasts, leisure and entertainment, catering and food, and house purchase. Each primary label (such as leisure and entertainment) is refined into secondary labels (such as beauty and health care, sports, and amusement places), and each secondary label (such as amusement places) is further refined into tertiary labels (such as amusement parks, chess and card rooms, and game halls).
In addition, the label system not only re-divides traditional industries but also covers emerging industries, with separate subdivisions for short-term rental accommodation, campus card transactions, shared bicycles, and the like.
Based on the above description, fig. 1 exemplarily illustrates a flow of a method for determining a merchant category according to an embodiment of the present invention, and the flow may be performed by an apparatus for determining a merchant category.
As shown in fig. 1, the process specifically includes:
step 101, acquiring a merchant name library;
The merchant name library is compiled from the merchant names appearing in bank card transaction data. It corresponds to a plurality of merchant categories and includes first merchant names, whose merchant categories have been determined, and second merchant names, whose merchant categories have not. For example, if the merchant name library includes 500 merchant names, of which 100 have a determined merchant category and the other 400 do not, the library includes 100 first merchant names and 400 second merchant names. Here, the merchant category of a first merchant name may be determined manually, or may be derived from the original merchant category in a way that guarantees accuracy.
In a specific implementation, the bank card transaction data is summarized into the merchant name library, which contains each merchant's original merchant category (the category the merchant reported based on the MCC), merchant name, merchant code, and so on. The original merchant category of each merchant can be compared with the merchant categories of this scheme, and the merchant names that conform to a merchant category of this scheme are taken as initial samples, that is, as first merchant names of that category. Accuracy is the priority here, although a certain sample size is also needed.
In this embodiment of the present invention, the merchant categories corresponding to the merchant name library may be the tertiary labels of the label system, that is, the merchant name library corresponds to 135 merchant categories, each of which includes a plurality of first merchant names. For example, the merchant category "fitness center" includes first merchant names such as "AAA fitness club", "BBB fitness", "AAA fitness studio", "fight fitness studio", and "CCC yoga".
step 102, executing classification logic for each merchant category in turn, according to a preset sequence, as follows: determining the merchant name features corresponding to the merchant category according to the first merchant names in the merchant category; and determining that any second merchant name conforming to the merchant name features corresponding to the merchant category belongs to the merchant category.
Because the merchant name library corresponds to multiple merchant categories, a preset sequence is determined and the classification logic is executed for the merchant categories one by one in that sequence. The classification logic determines the second merchant names that belong to the current merchant category from the first merchant names in that category. In other words, since the merchant name library includes both first merchant names (category determined) and second merchant names (category undetermined), the merchant name features of a given merchant category can be determined from all first merchant names under that category, and the second merchant names that conform to those features are then identified among the uncategorized names and assigned to that category.
In the classification logic executed for any merchant category, a plurality of first merchant names in the category are obtained and each is segmented into words; all the resulting words form a word set. If the occurrence frequency of a word in the word set is greater than a first threshold and the word is not a preset common word, the word is determined to be a whitelist word of the merchant category, and any second merchant name containing the whitelist word is determined to belong to the category. Here, common words are words set in advance that cannot indicate a merchant category, such as "studio" and "club".
For example, suppose the merchant category is "fitness center" and its first merchant names include "AAA fitness club", "BBB fitness", "AAA fitness studio", "fight fitness studio", "CCC yoga", and so on. Segmenting these names yields the word set "AAA, fitness, club, BBB, fitness, AAA, fitness, studio, fight, fitness, studio, CCC, yoga, ...". If the frequencies of "fitness" and "yoga" exceed the first threshold and neither is a preset common word, "fitness" and "yoga" are determined to be whitelist words of "fitness center", and the second merchant names in the merchant name library that contain "fitness" or "yoga" are determined to belong to "fitness center".
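The whitelist screening described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: `segment` is a stand-in whitespace tokenizer (Chinese merchant names would need a real word segmenter), "frequency" is assumed to mean a raw occurrence count, and the names `whitelist_words`, `belongs`, and the extra sample name "DDD yoga" are hypothetical.

```python
from collections import Counter


def segment(name):
    # Stand-in segmenter (assumption): lowercase and split on whitespace.
    # Chinese merchant names would need a real word segmenter instead.
    return name.lower().split()


def whitelist_words(first_names, first_threshold, common_words,
                    brand_words=frozenset()):
    """Select whitelist words for one merchant category."""
    counts = Counter(w for name in first_names for w in segment(name))
    # Words whose count exceeds the threshold and that are not common words.
    frequent = {w for w, c in counts.items()
                if c > first_threshold and w not in common_words}
    # Brand words found in the word set are whitelisted regardless of count.
    brands = {w for w in counts if w in brand_words}
    return frequent | brands


def belongs(second_name, whitelist):
    """True if the second merchant name contains any whitelist word."""
    return any(w in whitelist for w in segment(second_name))


# "DDD yoga" is a hypothetical extra sample so that "yoga" occurs twice.
first_names = ["AAA fitness club", "BBB fitness", "AAA fitness studio",
               "fight fitness studio", "CCC yoga", "DDD yoga"]
whitelist = whitelist_words(first_names, first_threshold=1,
                            common_words={"studio", "club"},
                            brand_words={"aaa", "bbb", "ccc", "ddd"})
print(sorted(whitelist))
print(belongs("EEE yoga house", whitelist))
```

A relative frequency (count divided by the number of first merchant names) would work equally well as the threshold quantity; the source does not say which is intended.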
To better identify the second merchant names that belong to the current merchant category, the above scheme can be supplemented: before determining that second merchant names containing whitelist words belong to the merchant category, the whitelist of the category is extended by taking the brand words in the word set as whitelist words. In the above example, brand words such as "AAA", "BBB", and "CCC" are determined to be whitelist words.
In the embodiment of the present invention, misclassified second merchant names in each merchant category may also be corrected. For any merchant category, the blacklist words of the category, that is, words that should not appear in the category, are determined; second merchant names in the category that contain a blacklist word are determined not to belong to the current category, and the other merchant category to which each such name belongs is then determined. In other words, second merchant names that contain a blacklist word of their current category are reclassified into other merchant categories. For example, if "fitness meal" is a blacklist word of the "fitness center" category, all second merchant names containing "fitness meal" are removed from "fitness center" and assigned to a catering-related merchant category.
In the embodiment of the present invention, after the classification logic has been executed for each merchant category in the preset sequence, that is, after it has been executed for all merchant categories, the classification result is determined. If the classification result meets the preset condition, the classification is complete; if it does not, the classification logic is executed again for all merchant categories in the preset sequence, and this is repeated until the classification result meets the preset condition.
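The blacklist correction can be illustrated with a short sketch. It assumes that assignments are held as a mapping from category to a list of names, that a blacklist word matches as a substring of the lowercased name, and that a caller-supplied `reassign` function (hypothetical) chooses the new category; the source does not specify how the other category is chosen.

```python
def correct_with_blacklist(assigned, category, blacklist, reassign):
    """Move names containing a blacklist word out of `category`.

    assigned:  dict mapping category -> list of second merchant names
    blacklist: lowercase words or phrases that should not appear in `category`
    reassign:  hypothetical callable that picks another category for a name
    """
    result = {cat: list(names) for cat, names in assigned.items()}
    result[category] = []
    for name in assigned[category]:
        lowered = name.lower()
        if any(term in lowered for term in blacklist):
            # The name does not belong here; hand it to another category.
            result.setdefault(reassign(name), []).append(name)
        else:
            result[category].append(name)
    return result


assigned = {"fitness center": ["BBB fitness", "EEE fitness meal"],
            "catering": []}
corrected = correct_with_blacklist(assigned, "fitness center",
                                   {"fitness meal"},
                                   reassign=lambda name: "catering")
print(corrected)
```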
For the above technical solution, refer to the flowchart of Fig. 2. The specific steps are as follows:
A merchant name library is acquired, the first merchant names under each merchant category are determined, and a merchant list is obtained for each merchant category;
(1) screening white list words of each merchant category
The method comprises the steps of performing word segmentation on all first business names under each business category, determining high-frequency words corresponding to each business category, and screening white list words (also called seed white words) of each business category.
(2) Discovering more merchants according to white list word diffusion
Based on the whitelist words of each merchant category, more merchant names (that is, second merchant names) belonging to each category are found in the merchant name library by diffusion; once diffusion is complete, a new merchant name list is determined for each merchant category.
(3) Correcting misclassified merchant names
Blacklist words are summarized from the misclassified second merchant names in each merchant category and used to correct the second merchant names in each category, yielding a corrected merchant name list for each merchant category.
The corrected merchant name list of each category is then used as the initial merchant list of that category, and the steps of screening whitelist words, finding more merchants by whitelist word diffusion, and correcting misclassified merchant names are repeated until the classification result meets the preset condition.
Note that, because the classification logic is executed for the merchant categories one by one in a preset sequence, a second merchant name that has been classified into an earlier merchant category is not classified into a later one. For example, if "fitness center" precedes "hot pot" in the preset sequence, then once "yoga hot pot restaurant" is determined to belong to "fitness center", it is not classified into "hot pot".
For this reason, the preset sequence can be adjusted, and the classification logic is executed for each merchant category in the adjusted sequence until the classification result meets the preset condition. That is, a first round of merchant classification is performed based on the preset sequence (the classification logic of all merchant categories is executed in that sequence) to obtain a first classification result. If that result does not meet the preset condition, the preset sequence is adjusted and a second round of merchant classification is performed based on the adjusted sequence to obtain a second classification result, and so on, until the classification result meets the preset condition.
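The effect of the preset sequence can be seen in a small sketch. It assumes that each category already has a whitelist and that a whitelist entry matches as a substring of the lowercased name; the function name `classify_in_order` is hypothetical.

```python
def classify_in_order(second_names, whitelists, order):
    """Assign each second merchant name to the first matching category."""
    assigned = {}
    for category in order:
        for name in second_names:
            if name in assigned:
                continue  # already claimed by an earlier category
            lowered = name.lower()
            if any(word in lowered for word in whitelists[category]):
                assigned[name] = category
    return assigned


whitelists = {"fitness center": {"fitness", "yoga"}, "hot pot": {"hot pot"}}
names = ["yoga hot pot restaurant"]

first_round = classify_in_order(names, whitelists,
                                ["fitness center", "hot pot"])
second_round = classify_in_order(names, whitelists,
                                 ["hot pot", "fitness center"])
print(first_round)
print(second_round)
```

With "fitness center" first, the restaurant is captured by the word "yoga"; swapping the order lets "hot pot" claim it, which is the kind of adjustment the paragraph above describes.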
The classification result may include a classification coverage rate and a classification accuracy rate. The classification coverage rate is the ratio of the number of merchant names whose merchant category has currently been determined to the number of all merchant names in the merchant name library. The classification accuracy rate is the ratio of the number of merchant names whose merchant category has been determined correctly to the number of merchant names whose merchant category has been determined. The classification result meets the preset condition when both the classification coverage rate and the classification accuracy rate reach their preset values.
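A rough sketch of the stopping check follows. Coverage is computed as described above. The accuracy definition in the source is ambiguous, so this sketch assumes it means the share of classified names whose category agrees with a reference label (for example, a manually checked sample); `reference`, `min_coverage`, and `min_accuracy` are hypothetical names.

```python
def coverage(assigned, all_names):
    """Share of names in the library that currently have a category."""
    return len(assigned) / len(all_names) if all_names else 0.0


def accuracy(assigned, reference):
    """Share of checked names whose category matches the reference label.

    Assumption: `reference` maps a sample of names to trusted categories.
    """
    checked = [name for name in assigned if name in reference]
    if not checked:
        return 0.0
    correct = sum(assigned[name] == reference[name] for name in checked)
    return correct / len(checked)


def meets_condition(assigned, all_names, reference,
                    min_coverage, min_accuracy):
    """True when both rates reach their preset values."""
    return (coverage(assigned, all_names) >= min_coverage
            and accuracy(assigned, reference) >= min_accuracy)


all_names = ["a", "b", "c", "d"]
assigned = {"a": "x", "b": "y", "c": "x"}
reference = {"a": "x", "b": "x"}
print(coverage(assigned, all_names), accuracy(assigned, reference))
```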
In the embodiment of the present invention, after the classification logic is sequentially executed for each merchant category according to the preset sequence, the transaction data of the merchant corresponding to the merchant name within the preset time period is obtained for any merchant name, and the sub-category of the merchant name in the merchant category to which the merchant belongs is determined according to the transaction data of the merchant within the preset time period.
For the specific situation of online consumption, catering-industry merchants and multi-scene merchants are described in detail below.
(1) Catering-industry merchants
If the merchant category to which a merchant name belongs corresponds to the catering industry, the consumption grade of the merchant name within that merchant category can be determined from the merchant's transaction data in a preset time period.
In a specific implementation, in order to more accurately identify the sub-category of the merchant category to which the merchant belongs, the following three factors may be considered:
brand information: brand information is obtained through external data, partial light luxury catering and fast food brands are selected, and light luxury catering and fast food are marked by matching with merchant information;
class information: the method comprises the following steps of grading partial catering products further selected by merchants which are not marked based on brand information, wherein the products are classified as light luxury catering by self-help of steak shops and seafood, and the products are classified as fast food in steamed stuffed bun shops and wonton shops;
transaction amount: calculating and sequencing the transaction amount of each restaurant merchant which is not marked based on brand information and category information in nearly 6 months, wherein the first 30% is light luxury, the last 20% is fast food, and the rest is middle;
marking is herein understood to be labeling, i.e. determining the sub-category under the category of merchants to which the merchant belongs.
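The three factors can be combined as a cascade: brand first, then catering category, then transaction-amount rank for whatever remains. The sketch below assumes simple substring matching on the merchant name and hypothetical lookup tables `brand_tiers` and `category_tiers`; only the 30% and 20% shares come from the text.

```python
from typing import Dict, Optional

LIGHT_LUXURY, MID_RANGE, FAST_FOOD = "light luxury", "mid-range", "fast food"


def tier_by_brand_or_category(
    name: str,
    brand_tiers: Dict[str, str],
    category_tiers: Dict[str, str],
) -> Optional[str]:
    """Return a tier from brand keywords, else category keywords, else None.

    Both tables map a keyword to a tier and are illustrative assumptions.
    """
    for table in (brand_tiers, category_tiers):
        for keyword, tier in table.items():
            if keyword in name:
                return tier
    return None


def tier_by_amount(
    amounts: Dict[str, float],
    top_percent: int = 30,
    bottom_percent: int = 20,
) -> Dict[str, str]:
    """Rank still-unmarked merchants by transaction amount and cut into tiers."""
    ranked = sorted(amounts, key=amounts.get, reverse=True)
    n = len(ranked)
    n_top = n * top_percent // 100        # integer arithmetic avoids float drift
    n_bottom = n * bottom_percent // 100
    tiers: Dict[str, str] = {}
    for i, name in enumerate(ranked):
        if i < n_top:
            tiers[name] = LIGHT_LUXURY
        elif i >= n - n_bottom:
            tiers[name] = FAST_FOOD
        else:
            tiers[name] = MID_RANGE
    return tiers
```

Merchants for which `tier_by_brand_or_category` returns `None` would be passed on to `tier_by_amount`, so the percentile cut applies only to merchants the first two factors could not mark.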
(2) Multi-scenario merchants
If the difference between the maximum transaction amount and the minimum transaction amount in the merchant's transaction data within the preset time period is greater than a second threshold, a plurality of transaction scenes corresponding to the merchant are determined according to the transaction amounts and the commodity details in the transaction data, and the scene category of the merchant name within the merchant category to which it belongs is then determined according to those transaction scenes.
For example, some merchants operate in multiple scenarios, such as a mobile carrier's service outlet (broadband top-up versus mobile phone purchase) or a 4S car dealership (maintenance versus car purchase). Classifying such a merchant under a single scenario would reduce the accuracy of the classification. Because the transaction amounts of different lines of business at the same merchant differ greatly, a check on the merchant's per-transaction amount is added on top of the result of the previous step in order to distinguish the scenarios.
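A rough sketch of the multi-scenario check follows. The max-minus-min test against a second threshold is taken from the text; how a single transaction is mapped to a scene is not specified, so the keyword table `item_scenes` and the fallback `amount_cutoff` are purely illustrative assumptions.

```python
from typing import Dict, Sequence, Set, Tuple

Transaction = Tuple[float, str]  # (transaction amount, commodity detail)


def is_multi_scene(amounts: Sequence[float], second_threshold: float) -> bool:
    """True when the spread between largest and smallest amount is large."""
    return bool(amounts) and (max(amounts) - min(amounts)) > second_threshold


def scene_of(
    amount: float,
    item: str,
    item_scenes: Dict[str, str],
    amount_cutoff: float,
    high_scene: str,
    low_scene: str,
) -> str:
    """Pick a scene from the commodity detail, falling back to the amount."""
    for keyword, scene in item_scenes.items():
        if keyword in item:
            return scene
    return high_scene if amount >= amount_cutoff else low_scene


def merchant_scenes(
    transactions: Sequence[Transaction],
    second_threshold: float,
    item_scenes: Dict[str, str],
    amount_cutoff: float,
    high_scene: str,
    low_scene: str,
) -> Set[str]:
    """Return the set of scenes for a merchant, or an empty set if single-scene."""
    amounts = [amount for amount, _ in transactions]
    if not is_multi_scene(amounts, second_threshold):
        return set()
    return {
        scene_of(amount, item, item_scenes, amount_cutoff, high_scene, low_scene)
        for amount, item in transactions
    }
```

For a 4S dealership, a sale of several hundred thousand alongside a few-hundred maintenance bill would exceed the threshold and yield both a purchase scene and a maintenance scene.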
In one implementation, after the sub-category of the merchant name within its merchant category has been determined, the customer category of the customer involved in each item of transaction data is determined according to that sub-category; customer categories and merchant categories may correspond to each other. For example, if a merchant in the "hot pot" category has the sub-category "Sichuan hot pot", a customer who has transacted with that merchant can be determined to be a customer who likes "Sichuan hot pot", or the customer can be labeled "Sichuan hot pot".
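Propagating merchant sub-categories to customers can be sketched as a simple lookup over transactions. The record layout (customer id, merchant name) and the function name are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def label_customers(
    transactions: Iterable[Tuple[str, str]],
    merchant_subcategory: Dict[str, str],
) -> Dict[str, Set[str]]:
    """Give each customer the sub-categories of the merchants they paid.

    `transactions` holds (customer id, merchant name) pairs; merchants
    without a determined sub-category contribute no label.
    """
    labels: Dict[str, Set[str]] = defaultdict(set)
    for customer_id, merchant_name in transactions:
        sub_category = merchant_subcategory.get(merchant_name)
        if sub_category is not None:
            labels[customer_id].add(sub_category)
    return dict(labels)
```

A customer can accumulate several labels this way, one for each distinct sub-category among the merchants they have transacted with.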
In addition, the label types designed for customers may also cover areas such as cross-border payment, mobile phone PAY, mobile payment, child-oriented consumption tendency, and female-oriented consumption tendency. For example, a customer who spends at a postpartum care center may be considered highly likely to have children, while a customer who spends at a mother-and-baby store may be considered to have some probability of having children. Consumption frequency and consumption amount may also be considered together: the higher the frequency and the amount, the higher the probability that the customer has children.
In this technical solution, merchant categories are set and the classification logic is executed for each merchant category in a preset sequence, so that merchants whose category has not yet been determined are assigned to the corresponding merchant category. The merchant category of every merchant is thereby determined automatically, which reduces labor cost. In addition, when the classification logic is executed for a merchant category, the merchant name features of that category are determined from the merchant names already in the category, and unclassified merchants whose names match those features are then assigned to the current category. This avoids the mismatch with reality that can arise when merchants report their own category, and improves the accuracy of merchant classification.
Based on the same inventive concept, FIG. 3 exemplarily illustrates the structure of an apparatus for determining a merchant category according to an embodiment of the present invention; the apparatus may perform the flow of the method for determining a merchant category.
The apparatus comprises:
an obtaining unit 301, configured to obtain a merchant name library; the merchant name library corresponds to a plurality of merchant categories, and the merchant name library comprises first merchant names of determined merchant categories and second merchant names of undetermined merchant categories;
a classifying unit 302, configured to sequentially execute a classification logic for each merchant category according to a preset sequence in the following manner:
determining merchant name characteristics corresponding to the merchant category according to a first merchant name in the merchant category; and determining that the second merchant name which accords with the merchant name characteristics corresponding to the merchant category belongs to the merchant category.
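The sequential classification performed by the classifying unit can be sketched as follows. This is a minimal illustration: `extract_features` is a hypothetical pluggable function that derives name features from the first merchant names of a category, and "conforming to" a feature is assumed here to mean that the feature appears as a substring of the name.

```python
from typing import Callable, Dict, List, Sequence, Set


def classify_sequentially(
    order: Sequence[str],
    first_names: Dict[str, List[str]],
    second_names: Sequence[str],
    extract_features: Callable[[List[str]], Set[str]],
) -> Dict[str, str]:
    """Run the classification logic once per category, in the given order.

    first_names: category -> merchant names whose category is already known.
    second_names: merchant names whose category is not yet determined.
    """
    unclassified = list(second_names)
    result: Dict[str, str] = {}
    for category in order:
        features = extract_features(first_names.get(category, []))
        remaining = []
        for name in unclassified:
            if any(feature in name for feature in features):
                result[name] = category   # claimed by the first matching category
            else:
                remaining.append(name)
        unclassified = remaining          # later categories only see the leftovers
    return result
```

Because a name is claimed by the first category whose features it matches, the order of the categories affects the outcome, which is why the text treats the preset sequence as adjustable.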
Optionally, the classifying unit 302 is specifically configured to:
obtaining a plurality of first merchant names in the merchant category;
performing word segmentation processing on each first merchant name in the plurality of first merchant names to obtain a word set;
if the frequency of occurrence of any word in the word set is greater than a first threshold and the word is not a preset common word, and/or the word is a brand word, determining the word as a whitelist term of the merchant category;
determining that a second merchant name that includes the whitelist term belongs to the merchant category.
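The whitelist steps can be sketched as follows. The sketch makes several assumptions that the text does not fix: frequency is taken as the number of first merchant names containing the word, whitespace splitting stands in for a real word segmenter (Chinese merchant names would need a dedicated one), and the and/or condition is read as "(frequent and not common) or brand".

```python
from collections import Counter
from typing import Callable, Iterable, List, Set


def whitelist_terms(
    first_names: Iterable[str],
    first_threshold: int,
    common_words: Set[str],
    brand_words: Set[str],
    segment: Callable[[str], List[str]] = str.split,
) -> Set[str]:
    """Derive whitelist terms for one merchant category."""
    counts: Counter = Counter()
    for name in first_names:
        counts.update(set(segment(name)))  # count each word once per name
    frequent = {
        word for word, count in counts.items()
        if count > first_threshold and word not in common_words
    }
    brands = {word for word in counts if word in brand_words}
    return frequent | brands


def matches_whitelist(name: str, terms: Set[str]) -> bool:
    """True if a second merchant name contains any whitelist term."""
    return any(term in name for term in terms)
```

Filtering out preset common words keeps generic terms such as "Shop" from pulling unrelated merchants into the category, while brand words are admitted even when they appear only rarely.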
Optionally, the classifying unit 302 is further configured to:
determining blacklist terms for the merchant category after the determining that a second merchant name including the whitelist terms belongs to the merchant category;
determining that a second merchant name in the merchant category that includes the blacklist terms does not belong to the merchant category, and determining other merchant categories to which the second merchant name that includes the blacklist terms belongs.
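The blacklist step can be sketched as a filter applied after whitelist matching. How the removed names are given another category is not detailed in the text; this sketch assumes they are simply handed back so that the classification logic of other categories can process them.

```python
from typing import Iterable, List, Set, Tuple


def apply_blacklist(
    assigned_names: Iterable[str],
    blacklist_terms: Set[str],
) -> Tuple[List[str], List[str]]:
    """Split names assigned to a category into (kept, removed).

    A name is removed when it contains any blacklist term; removed names
    are returned so other categories can claim them (an assumption here).
    """
    kept: List[str] = []
    removed: List[str] = []
    for name in assigned_names:
        if any(term in name for term in blacklist_terms):
            removed.append(name)
        else:
            kept.append(name)
    return kept, removed
```

For instance, a name matched to catering through a whitelist term such as "Hotpot" would be removed again if it also contained a blacklist term such as "Wholesale".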
Optionally, the classifying unit 302 is further configured to:
after the classification logic is sequentially executed on each merchant category according to the preset sequence, if the determined classification result does not accord with the preset condition, the preset sequence is adjusted, and the classification logic is sequentially executed on each merchant category according to the adjusted preset sequence until the determined classification result accords with the preset condition.
Optionally, the classifying unit 302 is further configured to:
after the classification logic is sequentially executed for each merchant category according to the preset sequence, obtaining, for any merchant name, transaction data of the merchant corresponding to the merchant name within a preset time period;
determining the sub-category of the merchant name within the merchant category to which it belongs according to the transaction data of the merchant within the preset time period.
Optionally, the classifying unit 302 is specifically configured to:
if the difference value between the maximum transaction amount and the minimum transaction amount in the transaction data of the merchant in the preset time period is larger than a second threshold value, determining a plurality of transaction scenes corresponding to the merchant according to the transaction amount and the commodity details in the transaction data;
and determining the scene category of the merchant name in the merchant category according to a plurality of transaction scenes corresponding to the merchant.
Optionally, the classifying unit 302 is further configured to:
after determining the sub-category of the merchant name in the affiliated merchant category, for the customers involved in each transaction data, determining the customer category to which the customers belong according to the sub-category of the merchant name in the affiliated merchant category.
Based on the same inventive concept, an embodiment of the present invention further provides a computing device, including:
a memory for storing program instructions;
a processor for calling the program instructions stored in the memory and executing the method for determining a merchant category according to the obtained program.
Based on the same inventive concept, the embodiment of the present invention further provides a computer-readable non-volatile storage medium, which includes computer-readable instructions, and when the computer reads and executes the computer-readable instructions, the computer is caused to execute the method for determining the category of the merchant.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A method for determining a merchant category, comprising:
acquiring a merchant name library; the merchant name library corresponds to a plurality of merchant categories, and the merchant name library comprises first merchant names of determined merchant categories and second merchant names of undetermined merchant categories;
according to a preset sequence, sequentially executing classification logic for each merchant category in the following modes:
determining merchant name characteristics corresponding to the merchant category according to a first merchant name in the merchant category; and determining that the second merchant name which accords with the merchant name characteristics corresponding to the merchant category belongs to the merchant category.
2. The method according to claim 1, wherein determining the merchant name characteristics corresponding to the merchant category according to a first merchant name in the merchant category, and determining that a second merchant name which accords with the merchant name characteristics corresponding to the merchant category belongs to the merchant category, comprises:
obtaining a plurality of first merchant names in the merchant category;
performing word segmentation processing on each first merchant name in the plurality of first merchant names to obtain a word set;
if the frequency of occurrence of any word in the word set is greater than a first threshold and the word is not a preset common word, and/or the word is a brand word, determining the word as a whitelist term of the merchant category;
determining that a second merchant name that includes the whitelist term belongs to the merchant category.
3. The method of claim 2, wherein, after determining that a second merchant name that includes the whitelist term belongs to the merchant category, the method further comprises:
determining blacklist terms of the merchant category;
determining that a second merchant name in the merchant category that includes the blacklist terms does not belong to the merchant category, and determining other merchant categories to which the second merchant name that includes the blacklist terms belongs.
4. The method of claim 1, wherein, after sequentially executing the classification logic for each merchant category according to the preset sequence, the method further comprises:
if the determined classification result does not meet a preset condition, adjusting the preset sequence, and sequentially executing the classification logic for each merchant category according to the adjusted preset sequence until the determined classification result meets the preset condition.
5. The method of claim 1, wherein, after sequentially executing the classification logic for each merchant category according to the preset sequence, the method further comprises:
for any merchant name, obtaining transaction data of the merchant corresponding to the merchant name within a preset time period;
determining the sub-category of the merchant name within the merchant category to which it belongs according to the transaction data of the merchant within the preset time period.
6. The method as claimed in claim 5, wherein the determining the sub-category of the merchant name in the merchant category according to the transaction data of the merchant in the preset time period comprises:
if the difference value between the maximum transaction amount and the minimum transaction amount in the transaction data of the merchant in the preset time period is larger than a second threshold value, determining a plurality of transaction scenes corresponding to the merchant according to the transaction amount and the commodity details in the transaction data;
and determining the scene category of the merchant name in the merchant category according to a plurality of transaction scenes corresponding to the merchant.
7. The method of claim 5, wherein, after determining the sub-category of the merchant name within the merchant category to which it belongs, the method further comprises:
for the customer involved in each item of transaction data, determining the customer category to which the customer belongs according to the sub-category of the merchant name within the merchant category to which it belongs.
8. An apparatus for determining a merchant category, comprising:
an obtaining unit, configured to obtain a merchant name library; the merchant name library corresponds to a plurality of merchant categories, and the merchant name library comprises first merchant names of determined merchant categories and second merchant names of undetermined merchant categories;
a classifying unit, configured to sequentially execute classification logic for each merchant category according to a preset sequence in the following manner:
determining merchant name characteristics corresponding to the merchant category according to a first merchant name in the merchant category; and determining that the second merchant name which accords with the merchant name characteristics corresponding to the merchant category belongs to the merchant category.
9. A computing device, comprising:
a memory for storing program instructions;
a processor for calling program instructions stored in said memory to perform the method of any of claims 1 to 7 in accordance with the obtained program.
10. A computer-readable non-transitory storage medium including computer-readable instructions which, when read and executed by a computer, cause the computer to perform the method of any one of claims 1 to 7.
CN202010099718.6A 2020-02-18 2020-02-18 Method and device for determining merchant category Active CN111368543B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010099718.6A CN111368543B (en) 2020-02-18 2020-02-18 Method and device for determining merchant category

Publications (2)

Publication Number Publication Date
CN111368543A (en) 2020-07-03
CN111368543B (en) 2023-06-02

Family

ID=71206250

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010099718.6A Active CN111368543B (en) 2020-02-18 2020-02-18 Method and device for determining merchant category

Country Status (1)

Country Link
CN (1) CN111368543B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant