CN111368543B - Method and device for determining merchant category - Google Patents

Info

Publication number
CN111368543B
Authority
CN
China
Prior art keywords
merchant
category
name
determining
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010099718.6A
Other languages
Chinese (zh)
Other versions
CN111368543A (en
Inventor
郑磊
张达
朱峰言
赵萌
徐婷婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unionpay Advisors Counselor Shanghai Co ltd
Original Assignee
Unionpay Advisors Counselor Shanghai Co ltd
Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • G06F16/285Clustering or classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Databases & Information Systems (AREA)
  • Finance (AREA)
  • Theoretical Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for determining a merchant category. The method comprises: acquiring a merchant name library, where the library corresponds to a plurality of merchant categories and comprises first merchant names, whose merchant category has been determined, and second merchant names, whose merchant category has not been determined; and executing classification logic for each merchant category in turn, in a preset sequence, as follows: determining the merchant name feature corresponding to the merchant category from the first merchant names in that category, and determining that any second merchant name conforming to that feature belongs to the merchant category. The technical scheme automatically assigns merchants to merchant categories while safeguarding the accuracy of the assignment.

Description

Method and device for determining merchant category
Technical Field
Embodiments of the invention relate to the field of information technology, and in particular to a method and a device for determining a merchant category.
Background
Within the financial industry, merchants are mainly classified by merchant category code (MCC code for short). The MCC code marks the UnionPay card transaction environment, the merchant's main line of business and the industry it belongs to, and is the main basis for determining the merchant settlement commission standard for inter-border transactions. It is also important basic data for analyzing and reporting on the UnionPay card transaction industry and for managing and controlling UnionPay card business risk. The MCC code is formulated with reference to the ISO international standard "financial retail industry merchant category code", which ensures that a merchant's industry is marked in the same way when a bank card is used in different countries.
When a merchant applies to register a POS machine, the merchant consults the MCC code lookup table and self-reports its merchant category, which an auditor then checks and records. This approach relies on self-reporting by merchants and on-site investigation by auditors, which consumes a great deal of human resources. Moreover, the self-reported merchant category may not match the merchant's actual business.
Disclosure of Invention
Embodiments of the invention provide a method and a device for determining a merchant category, which automatically assign merchants to merchant categories while safeguarding the accuracy of the assignment.
The method for determining a merchant category provided by an embodiment of the invention comprises:
acquiring a merchant name library, where the merchant name library corresponds to a plurality of merchant categories and comprises first merchant names, whose merchant category has been determined, and second merchant names, whose merchant category has not been determined;
executing classification logic for each merchant category in turn, in a preset sequence, as follows:
determining the merchant name feature corresponding to the merchant category from the first merchant names in the merchant category; and determining that a second merchant name conforming to that merchant name feature belongs to the merchant category.
Optionally, determining the merchant name feature corresponding to the merchant category from the first merchant names in the merchant category, and determining that a second merchant name conforming to that merchant name feature belongs to the merchant category, comprises:
acquiring a plurality of first merchant names in the merchant category;
performing word segmentation on each of the plurality of first merchant names to obtain a word set;
if the occurrence frequency of any word in the word set is greater than a first threshold and the word is not a preset general word, and/or the word is a brand word, determining the word to be a whitelist word of the merchant category;
and determining that a second merchant name containing the whitelist word belongs to the merchant category.
Optionally, after determining that the second merchant name containing the whitelist word belongs to the merchant category, the method further comprises:
determining blacklist words of the merchant category;
determining that a second merchant name in the merchant category that contains a blacklist word does not belong to the merchant category, and determining the other merchant category to which that second merchant name belongs.
Optionally, after executing the classification logic for each merchant category in turn in the preset sequence, the method further comprises:
if the resulting classification result does not meet a preset condition, adjusting the preset sequence and executing the classification logic for each merchant category in turn in the adjusted sequence, until the classification result meets the preset condition.
Optionally, after executing the classification logic for each merchant category in turn in the preset sequence, the method further comprises:
for any merchant name, acquiring the transaction data, within a preset period, of the merchant corresponding to the merchant name;
and determining the sub-category of the merchant name within its merchant category according to the merchant's transaction data in the preset period.
Optionally, determining the sub-category of the merchant name within its merchant category according to the merchant's transaction data in the preset period comprises:
if the difference between the maximum and minimum transaction amounts in the merchant's transaction data for the preset period is greater than a second threshold, determining a plurality of transaction scenes corresponding to the merchant according to the transaction amount and commodity details in each piece of transaction data;
and determining the scene category of the merchant name within its merchant category according to the plurality of transaction scenes corresponding to the merchant.
Optionally, after determining the sub-category of the merchant name within its merchant category, the method further comprises:
for the customer involved in each piece of transaction data, determining the customer category to which the customer belongs according to the sub-category of the merchant name within its merchant category.
In this technical scheme, merchant categories are defined and the classification logic is executed for each of them in a preset sequence, so that merchants whose category has not been determined are assigned to the corresponding merchant categories. The merchant categories of all merchants are thereby determined automatically, which reduces labor cost. In addition, when the classification logic is executed for a merchant category, the merchant name feature of that category is determined from the merchant names already in it, and unclassified merchants that conform to the feature are assigned to it. This avoids the mismatch with reality that can arise when merchants self-report their categories, and improves the accuracy of merchant classification.
Correspondingly, an embodiment of the invention also provides a device for determining a merchant category, which comprises:
the acquisition unit is used for acquiring a merchant name library; the merchant name library corresponds to a plurality of merchant categories, and comprises a first merchant name of a determined merchant category and a second merchant name of an undetermined merchant category;
the classification unit is used for executing classification logic on each merchant category in turn according to a preset sequence in the following way:
determining a merchant name characteristic corresponding to the merchant category according to the first merchant name in the merchant category; and determining that a second merchant name conforming to the merchant name characteristics corresponding to the merchant category belongs to the merchant category.
Optionally, the classification unit is specifically configured to:
acquiring a plurality of first merchant names in the merchant category;
word segmentation processing is carried out on each first merchant name in the plurality of first merchant names, so as to obtain a word set;
if the occurrence frequency of any word in the word set is greater than a first threshold and the word is not a preset general word, and/or the word is a brand word, determining the word to be a whitelist word of the merchant category;
and determining that a second merchant name containing the white list word belongs to the merchant category.
Optionally, the classification unit is further configured to:
after the second merchant name containing the white list words is determined to belong to the merchant category, determining the black list words of the merchant category;
determining that a second merchant name including the blacklist term in the merchant category does not belong to the merchant category, and determining other merchant categories to which the second merchant name including the blacklist term belongs.
Optionally, the classification unit is further configured to:
after the classification logic is executed on each merchant category in turn according to the preset sequence, if the determined classification result does not meet the preset condition, the preset sequence is adjusted, and the classification logic is executed on each merchant category in turn according to the adjusted preset sequence until the determined classification result meets the preset condition.
Optionally, the classification unit is further configured to:
after the classification logic is sequentially executed on each merchant category according to the preset sequence, transaction data of merchants corresponding to any merchant name in a preset period are obtained;
and determining the subcategory of the merchant name in the category of the merchant according to the transaction data of the merchant in the preset period.
Optionally, the classification unit is specifically configured to:
if the difference value between the maximum transaction amount and the minimum transaction amount in the transaction data in the preset period of time of the merchant is larger than a second threshold value, determining a plurality of transaction scenes corresponding to the merchant according to the transaction amount and commodity detail in each transaction data;
and determining the scene category of the merchant name in the category of the merchant according to the transaction scenes corresponding to the merchant.
Optionally, the classification unit is further configured to:
after determining the sub-category of the merchant name within its merchant category, determining, for the customer involved in each piece of transaction data, the customer category to which the customer belongs according to that sub-category.
Accordingly, an embodiment of the present invention further provides a computing device, including:
a memory for storing program instructions;
and a processor for calling the program instructions stored in the memory and executing the above method for determining a merchant category according to the obtained program.
Accordingly, an embodiment of the present invention further provides a computer-readable non-volatile storage medium, which includes computer-readable instructions that, when read and executed by a computer, cause the computer to perform the method for determining a category of a merchant.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart illustrating a method for determining a category of a merchant according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating another method for determining a category of a merchant according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a device for determining a category of a merchant according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail below with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Labels are designed for the merchants appearing in transaction information according to the consumer industry categories that commercial banks and related institutions care about, yielding the label system used in the invention. The label system may be divided into 18 first-level labels, 80 second-level labels and 135 third-level labels.
The first-level labels cover fields such as cars, recreation, food and house purchase. A first-level label (such as recreation) is refined into second-level labels (such as beauty care, sports and recreation places), and a second-level label (such as recreation places) is further refined into third-level labels (such as amusement parks, chess and card rooms and game halls).
In addition, the label system not only re-divides traditional industries but also adds labels for emerging industries, such as short-term rentals, campus card transactions and shared bicycles.
Based on the foregoing description, fig. 1 illustrates a flow of a method for determining a category of a merchant according to an embodiment of the present invention, where the flow may be performed by a device for determining a category of a merchant.
As shown in fig. 1, the process specifically includes:
step 101, acquiring a merchant name library;
The merchant name library is compiled from the merchant names appearing in all bank card transaction data. It corresponds to a plurality of merchant categories and comprises first merchant names, whose merchant category has been determined, and second merchant names, whose merchant category has not been determined. For example, if the library contains 500 merchant names, of which 100 have a determined merchant category and the other 400 do not, then it contains 100 first merchant names and 400 second merchant names. The merchant category of a first merchant name may be determined manually, or may be derived from the merchant's original category where that is known to be accurate.
In a specific implementation, bank card transaction data is aggregated into the merchant name library, which holds each merchant's original merchant category (the category the merchant self-reported based on the MCC code), merchant name, merchant code and so on. The original merchant category of each merchant can be compared with the merchant categories of this scheme, and the merchant names that fit a category of this scheme are taken as the initial sample, that is, as the first merchant names of that category. The priority in selecting the initial sample is accuracy, while still keeping a reasonable sample size.
In an embodiment of the invention, the merchant categories corresponding to the merchant name library may be the third-level labels of the label system, that is, the library corresponds to 135 merchant categories, each containing a plurality of first merchant names. For example, the merchant category "gym center" contains first merchant names such as "AAA gym meeting", "BBB gym", "AAA gym studio", "fight gym studio" and "CCC yoga".
Step 102, executing classification logic for each merchant category in turn according to a preset sequence by the following method: determining a merchant name characteristic corresponding to the merchant category according to the first merchant name in the merchant category; and determining that a second merchant name conforming to the merchant name characteristics corresponding to the merchant category belongs to the merchant category.
Because the merchant name library corresponds to several merchant categories, a preset sequence is determined first and the classification logic is executed for each merchant category in that sequence. The classification logic determines which second merchant names belong to the current merchant category on the basis of the first merchant names already in it. Specifically, since the library contains both first merchant names (category determined) and second merchant names (category undetermined), the merchant name feature of a merchant category can be determined from all first merchant names in that category; the second merchant names that conform to the feature are then picked out of the uncategorized names and assigned to the category.
In the classification logic executed for any merchant category, the first merchant names in the category are acquired and word segmentation is performed on each of them; the words obtained from all names form a word set. If the occurrence frequency of a word in the word set is greater than a first threshold and the word is not a preset general word, the word is determined to be a whitelist word of the merchant category, and any second merchant name containing a whitelist word is determined to belong to the category. A general word is a word that cannot indicate a merchant category, such as "studio" or "meeting".
For example, the merchant category "gym center" contains first merchant names such as "AAA gym meeting", "BBB gym", "AAA gym studio", "fight gym studio" and "CCC yoga". Word segmentation of each first merchant name yields the word set "AAA, gym, meeting, BBB, gym, AAA, gym, studio, fight, gym, studio, CCC, yoga, ……". If "gym" and "yoga" each occur more often than the first threshold and neither is a preset general word, they are determined to be whitelist words of "gym center", and every second merchant name in the library that contains "gym" or "yoga" is determined to belong to "gym center".
To find the second merchant names of the current category more completely, the scheme can be supplemented: before second merchant names containing whitelist words are assigned to the category, the whitelist is extended with brand words taken from the word set. In the example above, brand words such as "AAA", "BBB" and "CCC" are added as whitelist words.
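The whitelist step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: whitespace splitting stands in for a real word segmenter, the function names `segment`, `whitelist_words` and `belongs_to_category` are hypothetical, the threshold value is arbitrary, and the merchant name "DDD yoga" is added to the sample only so that "yoga" exceeds the example threshold.

```python
from collections import Counter

def segment(name):
    # Assumption: whitespace splitting stands in for real word segmentation.
    return name.lower().split()

def whitelist_words(first_names, first_threshold, general_words, brand_words=()):
    """Return the whitelist words of one merchant category."""
    counts = Counter(word for name in first_names for word in segment(name))
    # Frequent words that are not general words.
    frequent = {w for w, c in counts.items()
                if c > first_threshold and w not in general_words}
    # Brand words seen in the word set are added regardless of frequency.
    brands = set(brand_words) & set(counts)
    return frequent | brands

def belongs_to_category(name, whitelist):
    # A second merchant name belongs to the category if it contains a whitelist word.
    return any(word in whitelist for word in segment(name))

gym_names = ["AAA gym meeting", "BBB gym", "AAA gym studio",
             "fight gym studio", "CCC yoga", "DDD yoga"]  # last name is hypothetical
gym_whitelist = whitelist_words(
    gym_names,
    first_threshold=1,                    # hypothetical value
    general_words={"studio", "meeting"},
    brand_words={"aaa", "bbb", "ccc"},
)
```

With this sample, "studio" is frequent but excluded as a general word, while "bbb" and "ccc" occur only once and enter the whitelist as brand words.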
In an embodiment of the invention, second merchant names that were misclassified can be corrected. For any merchant category, the blacklist words of the category, that is, words that should not appear in it, are determined. Any second merchant name in the category that contains a blacklist word is determined not to belong to the category, and the other merchant category to which it belongs is then determined. In other words, second merchant names containing a category's blacklist words are reassigned to other merchant categories. For example, if "gym meal" is a blacklist word of "gym center", all second merchant names containing "gym meal" are removed from "gym center" and assigned to a dining-related merchant category.
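A minimal Python sketch of this correction step is given below. The text does not say how the new category of a removed name is chosen, so the `reroute` mapping from blacklist word to target category is an assumption made for illustration, as are the function name `apply_blacklist` and the `"UNRESOLVED"` bucket.

```python
def apply_blacklist(assigned, blacklist, reroute, unresolved="UNRESOLVED"):
    """Move names containing a category's blacklist word out of that category.

    assigned:  category -> list of second merchant names currently in it
    blacklist: category -> set of blacklist words for that category
    reroute:   blacklist word -> target category (hypothetical mapping)
    """
    corrected = {category: [] for category in assigned}
    for category, names in assigned.items():
        for name in names:
            hit = next((w for w in blacklist.get(category, ())
                        if w in name.lower()), None)
            if hit is None:
                corrected[category].append(name)   # name stays where it is
            else:
                target = reroute.get(hit, unresolved)
                corrected.setdefault(target, []).append(name)
    return corrected

assigned = {"gym center": ["AAA gym", "BBB gym meal"], "dining": ["CCC noodles"]}
corrected = apply_blacklist(
    assigned,
    blacklist={"gym center": {"gym meal"}},
    reroute={"gym meal": "dining"},
)
```

Here "BBB gym meal" leaves "gym center" and lands in "dining", mirroring the "gym meal" example in the text.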
After the classification logic has been executed for every merchant category in the preset sequence, the classification result is evaluated. If it meets the preset condition, classification is complete; if not, the classification logic is executed again for all merchant categories in the preset sequence, and this is repeated until the classification result meets the preset condition.
The above technical solution may refer to the flowchart in fig. 2. The specific steps are as follows:
A merchant name library is acquired, and the first merchant names under each merchant category are determined, giving a merchant list for each merchant category;
(1) Screening whitelist words for each merchant category
Word segmentation is performed on all first merchant names under each merchant category, the high-frequency words of each category are determined, and the whitelist words (also called seed white words) of each category are screened from them.
(2) Finding more merchants by whitelist word diffusion
Using the whitelist words of each merchant category, further merchant names in the library (that is, second merchant names) are assigned to the category by diffusion; when diffusion is complete, a new merchant name list is obtained for each category.
(3) Correcting misclassified merchant names
Blacklist words are summarized from the misclassified second merchant names in each merchant category and used to correct the category, giving a corrected merchant name list for each category.
The corrected merchant name list of each category is then used as its starting merchant list, and the three steps (screening whitelist words, diffusion, and correction) are repeated until the classification result meets the preset condition.
Note that, because the classification logic is executed for the merchant categories one after another in the preset sequence, a second merchant name that has been assigned to an earlier category is not assigned to a later one. For example, if "gym center" precedes "hot pot" in the preset sequence, then once "yoga hot pot restaurant" has been determined to belong to "gym center", it is not assigned to "hot pot".
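The effect of the preset sequence can be illustrated with a short Python sketch. It assumes that each category already has a whitelist and that "contains a whitelist word" is tested by substring matching; the function name `classify_in_order` and the sample names other than "yoga hot pot restaurant" are hypothetical.

```python
def classify_in_order(order, whitelists, second_names):
    """Assign each name to the first category (in `order`) whose whitelist it matches."""
    remaining = list(second_names)
    assigned = {category: [] for category in order}
    for category in order:
        unmatched = []
        for name in remaining:
            if any(word in name.lower() for word in whitelists[category]):
                assigned[category].append(name)   # claimed by this category
            else:
                unmatched.append(name)            # left for later categories
        remaining = unmatched
    return assigned, remaining

whitelists = {"gym center": {"gym", "yoga"}, "hot pot": {"hot pot"}}
names = ["yoga hot pot restaurant", "BBB gym", "CCC hot pot"]

gym_first, _ = classify_in_order(["gym center", "hot pot"], whitelists, names)
hot_pot_first, _ = classify_in_order(["hot pot", "gym center"], whitelists, names)
```

Running "gym center" first puts "yoga hot pot restaurant" in "gym center"; swapping the order puts it in "hot pot", which is why the sequence matters.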
For this reason, the preset sequence can be adjusted and the classification logic executed for each merchant category in the adjusted sequence, until the classification result meets the preset condition. That is, a first round of merchant classification is carried out in the preset sequence (the classification logic of all merchant categories is executed in that order) to obtain a first classification result. If this result does not meet the preset condition, the sequence is adjusted and a second round of classification is carried out in the adjusted sequence to obtain a second classification result, ……, and so on until the classification result meets the preset condition.
The classification result may include a classification coverage and a classification accuracy. Classification coverage is the ratio of the number of merchant names whose merchant category has been determined so far to the number of all merchant names in the library. Classification accuracy is the ratio of the number of merchant names whose category has been determined correctly to the number of merchant names whose category has been determined. The classification result may be deemed to meet the preset condition when both the classification coverage and the classification accuracy reach preset values.
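The two metrics and the retry loop might be sketched as follows. Several points are assumptions rather than part of the text: accuracy is estimated against a manually audited sample (`audited`), the sequence is "adjusted" by simply trying permutations (only practical for a handful of categories, not for 135), and `find_order`, `toy_classify` and the preset values are hypothetical.

```python
from itertools import permutations

def coverage(assignments, library_size):
    # Share of merchant names in the library that have received a category.
    return len(assignments) / library_size if library_size else 0.0

def accuracy(assignments, audited):
    # Share of audited, classified names whose category matches the audit.
    checked = [name for name in assignments if name in audited]
    if not checked:
        return 0.0
    correct = sum(assignments[name] == audited[name] for name in checked)
    return correct / len(checked)

def find_order(categories, classify, library_size, audited,
               min_coverage, min_accuracy):
    """Try category orders until the result meets both preset values."""
    for order in permutations(categories):
        assignments = classify(list(order))   # name -> category
        if (coverage(assignments, library_size) >= min_coverage
                and accuracy(assignments, audited) >= min_accuracy):
            return list(order), assignments
    return None, {}

def toy_classify(order):
    # Toy stand-in: the one ambiguous name goes to whichever category runs first.
    return {"yoga hot pot restaurant": order[0]}

order, result = find_order(
    ["gym center", "hot pot"], toy_classify, library_size=1,
    audited={"yoga hot pot restaurant": "hot pot"},
    min_coverage=1.0, min_accuracy=1.0,
)
```

In this toy run the initial order fails the accuracy check, and the adjusted order with "hot pot" first satisfies both conditions.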
In the embodiment of the invention, after the classification logic is executed on each merchant category in turn according to the preset sequence, transaction data of the merchant corresponding to the merchant name in a preset period is acquired for any one merchant name, and the sub-category of the merchant name in the merchant category is determined according to the transaction data of the merchant in the preset period.
The following describes in detail the business and multi-scenario business of the catering industry for the specific case of under-line consumption.
(1) Commercial tenant in catering industry
If the class of the merchant to which a certain merchant name belongs corresponds to the catering industry, the class of the consumption level of the merchant name in the class of the merchant can be determined according to the transaction data of the merchant in a preset period.
In a specific implementation, to identify more accurately the sub-category under the merchant category to which a merchant belongs, the following three factors may be considered:
brand information: obtain brand information from external data, select a number of light-luxury dining brands and fast-food brands, and match them against merchant information to label merchants as light-luxury dining or fast food;
category information: for merchants not yet labeled on the basis of brand information, further select certain catering categories; for example, steak houses and seafood buffets are classified as light-luxury dining, and wonton shops and the like are classified as fast food;
transaction amount: for catering merchants not yet labeled on the basis of brand information or category information, calculate and rank their transaction amounts over a 6-month period, take the highest-ranked 30% as light luxury, the lowest-ranked 20% as fast food, and the remainder as mid-range;
Labeling here means tagging a merchant, i.e. determining the sub-category under the merchant category to which the merchant belongs.
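The three factors can be applied as a cascade: brand rule, then category rule, then a ranking by transaction amount. The sketch below illustrates that order under stated assumptions. The keyword tables and function names are hypothetical, the amount passed in is assumed to be one representative figure per merchant for the 6-month period (for example an average per transaction), and the highest amounts are assumed to map to light luxury.

```python
from typing import Dict, List, Optional

# Hypothetical keyword tables; the source names only the kinds of
# brands and categories used, not concrete lists.
LIGHT_LUXURY_BRANDS = {"brandlux"}
FAST_FOOD_BRANDS = {"brandquick"}
LIGHT_LUXURY_CATEGORIES = {"steak", "seafood buffet"}
FAST_FOOD_CATEGORIES = {"wonton"}


def label_by_keywords(name: str) -> Optional[str]:
    """Apply the brand rule first, then the category rule."""
    lowered = name.lower()
    rules = (
        (LIGHT_LUXURY_BRANDS, FAST_FOOD_BRANDS),
        (LIGHT_LUXURY_CATEGORIES, FAST_FOOD_CATEGORIES),
    )
    for light, fast in rules:
        if any(k in lowered for k in light):
            return "light luxury"
        if any(k in lowered for k in fast):
            return "fast food"
    return None


def label_catering(amount_by_merchant: Dict[str, float]) -> Dict[str, str]:
    """Label every catering merchant with a consumption-level sub-category."""
    labels: Dict[str, str] = {}
    unlabeled: List[str] = []
    for name in amount_by_merchant:
        label = label_by_keywords(name)
        if label is None:
            unlabeled.append(name)
        else:
            labels[name] = label
    # Rank the remaining merchants by amount, highest first.
    unlabeled.sort(key=lambda n: amount_by_merchant[n], reverse=True)
    n = len(unlabeled)
    top = n * 30 // 100     # highest-ranked 30% -> light luxury
    bottom = n * 20 // 100  # lowest-ranked 20% -> fast food
    for i, name in enumerate(unlabeled):
        if i < top:
            labels[name] = "light luxury"
        elif i >= n - bottom:
            labels[name] = "fast food"
        else:
            labels[name] = "mid-range"
    return labels
```

Because the keyword rules run first, a merchant matched by brand or category keeps that label regardless of where its transaction amount would rank.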
(2) Multi-scene merchants
If the difference between the maximum transaction amount and the minimum transaction amount in the merchant's transaction data in the preset period is larger than a second threshold, a plurality of transaction scenes corresponding to the merchant are determined according to the transaction amount and the commodity details in each piece of transaction data, and the scene category of the merchant name within the merchant category is then determined according to those transaction scenes.
For example, some merchants operate in multiple business scenes, such as a mobile carrier's business hall (phone-bill or broadband top-up versus mobile phone purchase) or a car 4S dealership (maintenance versus car purchase). Classifying such a merchant into a single scene would reduce the accuracy of the related classification. Because the transaction amounts of different lines of business at the same merchant differ greatly, a check on the merchant's transaction amounts is added in order to distinguish the scenes.
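A minimal sketch of the amount-spread check follows. It assumes each transaction record carries an amount and a free-text item detail, and it maps item details to scenes with a hypothetical keyword table; the source does not specify how scenes are derived from the amount and commodity details, so the mapping here is illustrative only.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Transaction:
    amount: float
    item_detail: str


# Hypothetical keyword -> scene table for a mobile business hall.
SCENE_KEYWORDS = {
    "top-up": "recharge",
    "broadband": "recharge",
    "handset": "device purchase",
}


def transaction_scenes(
    transactions: List[Transaction],
    second_threshold: float,
) -> Set[str]:
    """Return the scenes of a multi-scene merchant, or an empty set.

    Scenes are looked up only when the spread between the largest and
    smallest transaction amount exceeds the second threshold.
    """
    if not transactions:
        return set()
    amounts = [t.amount for t in transactions]
    if max(amounts) - min(amounts) <= second_threshold:
        return set()  # spread too small: treat as a single-scene merchant
    scenes: Set[str] = set()
    for t in transactions:
        detail = t.item_detail.lower()
        for keyword, scene in SCENE_KEYWORDS.items():
            if keyword in detail:
                scenes.add(scene)
                break
    return scenes
```

In this sketch the amount is used only for the spread check; a fuller version could also cluster amounts to separate scenes when item details are missing.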
The embodiment of the invention can also classify customer behavior, which is equivalent to designing labels for customers. In one implementation, after the sub-category of a merchant name within its merchant category is determined, for the customer involved in each piece of transaction data, the customer category to which the customer belongs is determined according to that sub-category; customer categories and merchant categories may be in a corresponding relationship. For example, if the sub-category of a merchant in the merchant category "hot pot" is "Sichuan hot pot", a customer who transacts with the merchant may be determined to be a customer who prefers Sichuan hot pot, or the customer may be labeled "Sichuan hot pot".
In addition, the label types designed for customers can also cover fields such as cross-border payment, mobile phone PAY, mobile payment, child-related consumption tendency, and female consumption tendency. For example, a customer who spends at a postpartum care center may be considered very likely to have a child, and a customer who spends at a mother-and-baby store may be considered to have a certain probability of having a child. Factors such as consumption frequency and consumption amount may also be considered together: the higher the consumption frequency and amount, the higher the probability that the customer has a child.
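Deriving customer labels from merchant sub-categories amounts to a join between transaction records and the merchant classification. The following sketch assumes transactions are available as `(customer_id, merchant_name)` pairs and that sub-categories are held in a dictionary; both structures and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def label_customers(
    transactions: Iterable[Tuple[str, str]],
    merchant_subcategory: Dict[str, str],
) -> Dict[str, Set[str]]:
    """Map each customer to the sub-categories of the merchants they paid.

    transactions: (customer_id, merchant_name) pairs.
    merchant_subcategory: merchant name -> determined sub-category.
    """
    labels: Dict[str, Set[str]] = defaultdict(set)
    for customer_id, merchant_name in transactions:
        sub = merchant_subcategory.get(merchant_name)
        if sub is not None:  # skip merchants without a sub-category
            labels[customer_id].add(sub)
    return dict(labels)
```

Probability-style labels such as "likely to have a child" would additionally need the consumption frequency and amount per customer, which this sketch does not model.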
In the above technical solution, merchant categories are set, and classification logic is executed for each merchant category in a preset order so that merchants whose category has not been determined are assigned to the corresponding merchant categories. The merchant category of every merchant is thus determined automatically, which reduces labor cost. In addition, when the classification logic is executed for a merchant category, the merchant name characteristics of that category are determined from the merchant names already in the category, and unclassified merchants whose names conform to those characteristics are assigned to the category. This avoids cases where the category a merchant reports for itself is inconsistent with its actual business, and improves the accuracy of merchant classification.
Based on the same inventive concept, Fig. 3 illustrates an exemplary structure of an apparatus for determining a merchant category according to an embodiment of the present invention; the apparatus may perform the flow of the method for determining a merchant category.
The device comprises:
an obtaining unit 301, configured to obtain a merchant name library; the merchant name library corresponds to a plurality of merchant categories, and comprises a first merchant name of a determined merchant category and a second merchant name of an undetermined merchant category;
a classification unit 302, configured to execute classification logic for each merchant category in turn according to a preset sequence by:
determining a merchant name characteristic corresponding to the merchant category according to the first merchant name in the merchant category; and determining that a second merchant name conforming to the merchant name characteristics corresponding to the merchant category belongs to the merchant category.
Optionally, the classifying unit 302 is specifically configured to:
acquiring a plurality of first merchant names in the merchant category;
performing word segmentation on each of the plurality of first merchant names to obtain a word set;
if the occurrence frequency of any word in the word set is greater than a first threshold and the word does not belong to preset general words and/or belongs to brand words, determining the word to be a whitelist word of the merchant category;
and determining that a second merchant name containing the whitelist word belongs to the merchant category.
Optionally, the classifying unit 302 is further configured to:
after determining that the second merchant name containing the whitelist word belongs to the merchant category, determining blacklist words of the merchant category;
determining that a second merchant name in the merchant category that contains a blacklist word does not belong to the merchant category, and determining the other merchant category to which that second merchant name belongs.
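The whitelist and blacklist steps performed by the classification unit can be sketched as below. Several points are assumptions made for illustration: the whitespace tokenizer stands in for a real word segmenter, the "and/or" condition is read as "(frequent and not a general word) or a brand word", and the blacklist is supplied by the caller because the source does not say how blacklist words are derived.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def segment(name: str) -> List[str]:
    """Stand-in tokenizer: lowercase whitespace split (an assumption)."""
    return name.lower().split()


def whitelist_words(
    first_names: Iterable[str],
    first_threshold: int,
    general_words: Set[str],
    brand_words: Set[str],
) -> Set[str]:
    """Derive whitelist words from names already in the category."""
    counts = Counter(w for name in first_names for w in segment(name))
    return {
        w for w, c in counts.items()
        if (c > first_threshold and w not in general_words) or w in brand_words
    }


def apply_whitelist(second_names: Iterable[str], whitelist: Set[str]) -> List[str]:
    """Assign uncategorized names that contain any whitelist word."""
    return [n for n in second_names if set(segment(n)) & whitelist]


def apply_blacklist(
    assigned: Iterable[str], blacklist: Set[str]
) -> Tuple[List[str], List[str]]:
    """Split assigned names into (kept, removed) using blacklist words."""
    kept: List[str] = []
    removed: List[str] = []
    for name in assigned:
        (removed if set(segment(name)) & blacklist else kept).append(name)
    return kept, removed
```

Names returned in `removed` would then be passed on to the classification logic of the other merchant categories.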
Optionally, the classifying unit 302 is further configured to:
after the classification logic is executed on each merchant category in turn according to the preset sequence, if the determined classification result does not meet the preset condition, the preset sequence is adjusted, and the classification logic is executed on each merchant category in turn according to the adjusted preset sequence until the determined classification result meets the preset condition.
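The adjust-and-retry behavior can be sketched as a search over category orders. The source does not specify how the preset order is adjusted, so trying permutations one after another is purely an assumption here, as are the function name and the two callbacks (`run_classification` executes the classification logic in a given order, and `meets_condition` checks the preset condition on its result).

```python
from itertools import permutations
from typing import Callable, Dict, Optional, Sequence, Tuple

Result = Dict[str, str]  # merchant name -> determined category


def search_order(
    categories: Sequence[str],
    run_classification: Callable[[Sequence[str]], Result],
    meets_condition: Callable[[Result], bool],
) -> Optional[Tuple[Tuple[str, ...], Result]]:
    """Try category orders until the classification result is acceptable.

    The first permutation is the original preset order; later ones are
    the adjusted orders. Returns None if no order satisfies the condition.
    """
    for order in permutations(categories):
        result = run_classification(order)
        if meets_condition(result):
            return order, result
    return None
```

The number of permutations grows factorially with the number of categories, so a practical system would more likely adjust the order heuristically, for example by moving categories with many misclassified names earlier or later.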
Optionally, the classifying unit 302 is further configured to:
after the classification logic is sequentially executed on each merchant category according to the preset sequence, transaction data of merchants corresponding to any merchant name in a preset period are obtained;
and determining the subcategory of the merchant name in the category of the merchant according to the transaction data of the merchant in the preset period.
Optionally, the classifying unit 302 is specifically configured to:
if the difference between the maximum transaction amount and the minimum transaction amount in the merchant's transaction data in the preset period is larger than a second threshold, determining a plurality of transaction scenes corresponding to the merchant according to the transaction amount and commodity details in each piece of transaction data;
and determining the scene category of the merchant name in the category of the merchant according to the transaction scenes corresponding to the merchant.
Optionally, the classifying unit 302 is further configured to:
after determining the sub-category of the merchant name within the merchant category, for the customer involved in each piece of transaction data, determining the customer category to which the customer belongs according to the sub-category of the merchant name within the merchant category.
Based on the same inventive concept, an embodiment of the present invention further provides a computing device, including:
a memory for storing program instructions;
and a processor for invoking the program instructions stored in the memory and executing the method for determining a merchant category according to the obtained program.
Based on the same inventive concept, an embodiment of the present invention also provides a computer-readable non-volatile storage medium comprising computer-readable instructions which, when read and executed by a computer, cause the computer to execute the method for determining a merchant category.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (7)

1. A method of determining a category of merchants, comprising:
acquiring a merchant name library; the merchant name library corresponds to a plurality of merchant categories, and comprises a first merchant name of a determined merchant category and a second merchant name of an undetermined merchant category;
executing classification logic for each merchant category in turn according to a preset order, in the following manner:
determining a merchant name characteristic corresponding to the merchant category according to the first merchant name in the merchant category; determining that a second merchant name conforming to the merchant name characteristics corresponding to the merchant category belongs to the merchant category;
wherein determining the merchant name characteristic corresponding to the merchant category according to the first merchant name in the merchant category, and determining that a second merchant name conforming to the merchant name characteristic corresponding to the merchant category belongs to the merchant category, comprises: acquiring a plurality of first merchant names in the merchant category; performing word segmentation on each of the plurality of first merchant names to obtain a word set; if the occurrence frequency of any word in the word set is greater than a first threshold and the word does not belong to preset general words and/or belongs to brand words, determining the word to be a whitelist word of the merchant category; and determining that a second merchant name containing the whitelist word belongs to the merchant category;
after determining that the second merchant name containing the whitelist term belongs to the merchant category, the method further comprises: determining blacklist words of the merchant category; determining that a second merchant name including the blacklist word in the merchant category does not belong to the merchant category, and determining other merchant categories to which the second merchant name including the blacklist word belongs;
after the classification logic is sequentially executed on each merchant category according to the preset sequence, the method further comprises the following steps: and if the determined classification result does not meet the preset condition, adjusting the preset sequence, and executing the classification logic on each merchant category in sequence according to the adjusted preset sequence until the determined classification result meets the preset condition.
2. The method of claim 1, wherein after sequentially performing classification logic for each merchant category in a predetermined order, further comprising:
for any merchant name, acquiring transaction data of the merchant corresponding to the merchant name in a preset period;
and determining the subcategory of the merchant name in the category of the merchant according to the transaction data of the merchant in the preset period.
3. The method of claim 2, wherein the determining the sub-category of the merchant name in the category of the merchant based on transaction data of the merchant within the preset period of time comprises:
if the difference between the maximum transaction amount and the minimum transaction amount in the merchant's transaction data in the preset period is larger than a second threshold, determining a plurality of transaction scenes corresponding to the merchant according to the transaction amount and commodity details in each piece of transaction data;
and determining the scene category of the merchant name in the category of the merchant according to the transaction scenes corresponding to the merchant.
4. The method of claim 2, wherein after the determining the sub-category of the merchant name within the category of the merchant, the method further comprises:
for the customer involved in each piece of transaction data, determining the customer category to which the customer belongs according to the sub-category of the merchant name within the merchant category.
5. An apparatus for determining a category of merchants, comprising:
the acquisition unit is used for acquiring a merchant name library; the merchant name library corresponds to a plurality of merchant categories, and comprises a first merchant name of a determined merchant category and a second merchant name of an undetermined merchant category;
the classification unit is used for executing classification logic on each merchant category in turn according to a preset sequence in the following way: determining a merchant name characteristic corresponding to the merchant category according to the first merchant name in the merchant category; determining that a second merchant name conforming to the merchant name characteristics corresponding to the merchant category belongs to the merchant category;
when determining the merchant name characteristic corresponding to the merchant category according to the first merchant name in the merchant category, and determining that a second merchant name conforming to the merchant name characteristic corresponding to the merchant category belongs to the merchant category, the classification unit is specifically configured to: acquire a plurality of first merchant names in the merchant category; perform word segmentation on each of the plurality of first merchant names to obtain a word set; if the occurrence frequency of any word in the word set is greater than a first threshold and the word does not belong to preset general words and/or belongs to brand words, determine the word to be a whitelist word of the merchant category; and determine that a second merchant name containing the whitelist word belongs to the merchant category;
the classification unit is further configured to, after determining that the second merchant name including the whitelist term belongs to the merchant category: determining blacklist words of the merchant category; determining that a second merchant name including the blacklist word in the merchant category does not belong to the merchant category, and determining other merchant categories to which the second merchant name including the blacklist word belongs;
the classifying unit is further configured to, after executing the classification logic on each merchant category in turn according to a preset order: and if the determined classification result does not meet the preset condition, adjusting the preset sequence, and executing the classification logic on each merchant category in sequence according to the adjusted preset sequence until the determined classification result meets the preset condition.
6. A computing device, comprising:
a memory for storing program instructions;
a processor for invoking program instructions stored in said memory to perform the method of any of claims 1 to 4 in accordance with the obtained program.
7. A computer readable non-transitory storage medium comprising computer readable instructions which, when read and executed by a computer, cause the computer to perform the method of any of claims 1 to 4.
CN202010099718.6A 2020-02-18 2020-02-18 Method and device for determining merchant category Active CN111368543B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010099718.6A CN111368543B (en) 2020-02-18 2020-02-18 Method and device for determining merchant category

Publications (2)

Publication Number Publication Date
CN111368543A (en) 2020-07-03
CN111368543B (en) 2023-06-02

Family

ID=71206250

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010099718.6A Active CN111368543B (en) 2020-02-18 2020-02-18 Method and device for determining merchant category

Country Status (1)

Country Link
CN (1) CN111368543B (en)

Also Published As

Publication number Publication date
CN111368543A (en) 2020-07-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant