CN113836916A - Method and device for determining brand side of article and server - Google Patents


Info

Publication number
CN113836916A
Authority
CN
China
Prior art keywords
brand
target
party
article
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111140101.5A
Other languages
Chinese (zh)
Other versions
CN113836916B (en)
Inventor
李广
徐文斌安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Duodian Life Chengdu Technology Co ltd
Original Assignee
Duodian Life Chengdu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Duodian Life Chengdu Technology Co ltd filed Critical Duodian Life Chengdu Technology Co ltd
Priority to CN202111140101.5A priority Critical patent/CN113836916B/en
Publication of CN113836916A publication Critical patent/CN113836916A/en
Application granted granted Critical
Publication of CN113836916B publication Critical patent/CN113836916B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Game Theory and Decision Science (AREA)
  • Mathematical Physics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method, a device and a server for determining the brand party of an article. The method comprises: acquiring article description data of a target article; if the article description data does not contain a target keyword, performing word segmentation on the article description data to obtain a plurality of description keywords, wherein the target keyword represents the target brand party of the target article; calculating total score values of a plurality of first candidate brand parties based on a pre-established scoring model and each description keyword; and determining the target brand party of the target article according to the total score value of each first candidate brand party. The invention enables automatic matching of articles to brand parties, which both significantly reduces the cost of determining an article's brand party and effectively improves the accuracy of that determination.

Description

Method and device for determining brand side of article and server
Technical Field
The invention relates to the technical field of data processing, and in particular to a method, a device and a server for determining the brand party of an article.
Background
In the new retail era, local supermarket chains are actively pursuing digital transformation. Commodities are the core of retail, and accurate commodity information underpins the entire business flow, whether in a traditional ERP (Enterprise Resource Planning) system or in an intelligent operating system. Brand is one of the basic attributes of a commodity. When a merchant presents commodity data to users, and especially when purchase ordering, gross-profit settlement and data analysis are carried out from the perspective of the brand owner, it is particularly important that the brand party to which each commodity belongs is recorded accurately. At present, the brand of a commodity is mainly determined by store staff or operations staff of the retailer, who manually enter the brand in a commodity management system and thereby maintain the relationship between commodity and brand. However, manual entry is costly and error-prone.
Disclosure of Invention
In view of the above, the present invention provides a method, an apparatus, and a server for determining an item brand party, which can achieve automatic matching between an item and a brand party, significantly reduce the cost of determining the item brand party, and effectively improve the accuracy of determining the item brand party.
In a first aspect, an embodiment of the present invention provides a method for determining an item brand side, including: acquiring article description data of a target article; if the article description data does not contain the target keywords, performing word segmentation processing on the article description data to obtain a plurality of description keywords; wherein the target keyword is used for representing a target brand side of the target item; calculating total scoring values of a plurality of first candidate brand parties based on a pre-established scoring model and the description keywords; determining a target brand party for the target item based on the total score value for each of the first candidate brand parties.
In one embodiment, the item description data includes at least a national barcode and an item name; the step of performing word segmentation processing on the item description data to obtain a plurality of description keywords if the item description data does not contain the target keyword comprises: extracting a specified field of the national barcode to determine a manufacturer identification code of the target item; searching a pre-established attribution database for at least one second candidate brand party matching the manufacturer identification code, wherein the attribution database comprises at least a mapping relationship between historical identification codes and brand party sets, each brand party set comprising at least one historical brand party; judging whether the item name contains a target keyword corresponding to a second candidate brand party; and if not, performing word segmentation processing on the item name to obtain a plurality of description keywords.
In one embodiment, the step of calculating a total score value of a plurality of first candidate brand parties based on a pre-established scoring model and each of the description keywords comprises: respectively determining a first candidate brand party corresponding to each description keyword in the attribution database; the attribution database also comprises a mapping relation between historical brand parties and a keyword set, wherein the keyword set comprises at least one historical keyword; for each first candidate brand party, determining the sub-score value of each description keyword for the first candidate brand party based on a pre-established scoring model, and taking the sum of each sub-score value as the total score value of the first candidate brand party.
In one embodiment, the method further comprises: for each historical brand party, performing statistical processing on each historical keyword in the keyword set corresponding to the historical brand party, and determining a first frequency and a second frequency of each historical keyword for the historical brand party, wherein the first frequency represents how often the historical keyword appears in the keyword set corresponding to that historical brand party, and the second frequency represents how often the historical keyword appears across the keyword sets corresponding to all historical brand parties; and determining the sub-score value of each historical keyword for the historical brand party according to the first frequency and the second frequency.
In one embodiment, the sub-score value is positively correlated with the first frequency and the sub-score value is negatively correlated with the second frequency.
In one embodiment, the step of determining a target brand party for the target item based on the total score value of each of the first candidate brand parties includes: determining a third candidate brand party from the first candidate brand parties based on the total score value for each of the first candidate brand parties; and determining the third candidate brand party with the highest scoring value as the target brand party of the target item.
In one embodiment, the step of determining a third candidate brand party from the first candidate brand parties based on the total score value of each of the first candidate brand parties includes: for each first candidate brand party, judging whether the total score value of the first candidate brand party is greater than a preset threshold value; if so, the first candidate brand party is determined to be a third candidate brand party.
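The threshold screening and final selection described in the two embodiments above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `select_target_brand` and the dictionary layout are assumptions, and the example scores and the threshold of 0.7 are borrowed from the worked example given later in the description.

```python
from typing import Dict, Optional


def select_target_brand(total_scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Pick the target brand party from first-candidate total scores.

    `total_scores` maps each first candidate brand party to its total score
    value (hypothetical layout). Candidates whose score exceeds `threshold`
    become third candidates; the best third candidate is returned, or None
    when no candidate clears the threshold.
    """
    third_candidates = {
        brand: score for brand, score in total_scores.items() if score > threshold
    }
    if not third_candidates:
        return None
    return max(third_candidates, key=third_candidates.get)


# Example values taken from the description's worked example.
print(select_target_brand({"Tide": 2.2, "Piao": 0.8}, threshold=0.7))
print(select_target_brand({"Tide": 2.2, "Piao": 0.8}, threshold=3.0))
```

Returning `None` when nothing clears the threshold mirrors the statement that, in that case, no target brand party is determined from the first candidates.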
In a second aspect, an embodiment of the present invention further provides an apparatus for determining a brand of an article, including: the data acquisition module is used for acquiring article description data of a target article; the word segmentation module is used for carrying out word segmentation on the article description data to obtain a plurality of description keywords if the article description data does not contain the target keywords; wherein the target keyword is used for representing a target brand side of the target item; the scoring value calculating module is used for calculating total scoring values of a plurality of first candidate brand parties based on a pre-established scoring model and the description keywords; a brand party determination module to determine a target brand party for the target item based on the total score value for each of the first candidate brand parties.
In a third aspect, an embodiment of the present invention further provides a server, including a processor and a memory, where the memory stores computer-executable instructions that can be executed by the processor, and the processor executes the computer-executable instructions to implement any one of the methods provided in the first aspect.
In a fourth aspect, embodiments of the present invention also provide a computer-readable storage medium storing computer-executable instructions that, when invoked and executed by a processor, cause the processor to implement any one of the methods provided in the first aspect.
According to the method, device and server provided by the embodiments of the invention, if the acquired article description data of the target article does not contain a target keyword representing the target brand party of the target article, word segmentation is performed on the article description data to obtain a plurality of description keywords; total score values of a plurality of first candidate brand parties are then calculated based on the pre-established scoring model and the description keywords; and the target brand party of the target article is determined according to the total score value of each first candidate brand party. In other words, when the article description data does not contain the target keyword, the first candidate brand parties are scored using the scoring model and the description keywords, and the target brand party is determined from those total score values. The target article is thus matched to its target brand party automatically, which significantly reduces the cost of determining the brand party and effectively improves accuracy.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a schematic flow chart of a method for determining an item brand party according to an embodiment of the present invention;
Fig. 2 is an architecture diagram of a commodity management system according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a scoring model according to an embodiment of the present invention;
Fig. 4 is a schematic flow chart of another method for determining an item brand party according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an apparatus for determining an item brand party according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a server according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the embodiments, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
At present, manually entering the brand party of a commodity has the following problems: (1) an e-commerce platform carries commodities on the order of millions, so manual maintenance is too costly; (2) manual maintenance is prone to operator error, so the accuracy of the entered brand information is low. In view of this, the embodiments of the invention provide a method, a device and a server for determining the brand party of an article, which enable automatic matching of articles to brand parties, significantly reduce the cost of determining an article's brand party, and effectively improve the accuracy of that determination.
To facilitate understanding of the present embodiment, the method for determining an item brand party disclosed herein is first described in detail. Referring to the flow chart shown in Fig. 1, the method mainly includes the following steps S102 to S108:
Step S102: acquire the article description data of the target article. The target article is the article whose brand is to be matched. The article description data may include a national barcode and an article name, and may also include an article description text and the like. In one embodiment, an upload channel may be provided so that the user can upload the article description data of the target article; alternatively, the article description data may be obtained by code scanning or image recognition. For example, the user scans the area of the target article that carries the national barcode with a code scanning device, and the article description data of the target article is read automatically.
Step S104: if the article description data does not contain the target keyword, perform word segmentation on the article description data to obtain a plurality of description keywords. The target keyword represents the target brand party of the target article, and the description keywords characterize the target article. In one embodiment, the brand range of the target article may be determined from the national barcode, after which it is judged whether the article name contains a target keyword belonging to that brand range. For example, if the brand range determined from the national barcode includes "illite", "mengku" and "power", it is judged whether the article name contains a target keyword such as "illite", "mengku" or "power"; when no target keyword is present, word segmentation is performed on the article name. Optionally, the segmented words obtained from the article name may be used directly as description keywords, or they may first be screened and only the retained words used as description keywords, which improves the overall efficiency of determining the target brand party to which the target article belongs.
Step S106: calculate the total score values of a plurality of first candidate brand parties based on the pre-established scoring model and each description keyword. The scoring model comprises a plurality of historical brand parties and, for each historical brand party, the sub-score values of its historical keywords. In one embodiment, at least one first candidate brand party of the target article may be determined based on the description keywords; the sub-score value of each description keyword for each first candidate brand party is then determined from the scoring model, and the total score value of each first candidate brand party is obtained by summing the sub-score values of the description keywords for that brand party.
Step S108: determine the target brand party of the target article according to the total score value of each first candidate brand party. In one embodiment, the first candidate brand party with the highest total score value may be determined as the target brand party. In another embodiment, to guard against the case where every first candidate brand party has a low total score value, the total score value of each first candidate brand party may be compared with a preset threshold in order to screen out third candidate brand parties with high reliability, and the third candidate brand party with the highest total score value is then determined as the target brand party. If the total score values of all first candidate brand parties are below the preset threshold, no target brand party is determined from the first candidate brand parties.
According to the method for determining the brand party of an article provided by the embodiment of the invention, when the article description data does not contain the target keyword, the first candidate brand parties are scored using the scoring model and the description keywords, and the target brand party is determined based on the total score value of each first candidate brand party, so that the target article is matched to its target brand party automatically.
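The control flow of steps S102 to S108 can be sketched as below. This is a minimal sketch under stated assumptions: the function name `determine_brand_party`, its parameters, and the idea of passing the segmenter and the scoring step in as callables are all hypothetical, chosen only to make the branch structure visible. The stub segmenter and stub scores in the usage lines are placeholders, not real components.

```python
from typing import Callable, Dict, List, Optional


def determine_brand_party(
    description: str,
    target_keywords: Dict[str, str],
    segment: Callable[[str], List[str]],
    total_scores: Callable[[List[str]], Dict[str, float]],
) -> Optional[str]:
    """Hypothetical outline of steps S104 to S108 for one target article.

    target_keywords: target keyword -> brand party it represents (assumed layout).
    segment:         word segmentation function (assumed to be supplied).
    total_scores:    maps description keywords to {first candidate: total score}.
    """
    # S104, first branch: a target keyword in the description names the brand directly.
    for keyword, brand in target_keywords.items():
        if keyword in description:
            return brand
    # S104, second branch: no target keyword, so segment into description keywords.
    description_keywords = segment(description)
    # S106: score the first candidate brand parties with the scoring model.
    scores = total_scores(description_keywords)
    # S108: take the candidate with the highest total score, if there is one.
    if not scores:
        return None
    return max(scores, key=scores.get)


# Usage with placeholder stubs standing in for the real segmenter and model.
brand = determine_brand_party(
    "natural essence laundry detergent",
    target_keywords={"Tide": "Tide", "Piao": "Piao"},
    segment=lambda text: ["natural", "essence", "laundry detergent"],
    total_scores=lambda keywords: {"Tide": 2.2, "Piao": 0.8},
)
print(brand)
```

Keeping segmentation and scoring behind callables is only a design convenience for the sketch; the description itself places them in the preprocessing and model prediction units.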
In one embodiment, a commodity management system is deployed on the server, and the method for determining the brand party of an article is executed by that system. The idea is as follows: sample data of commodities are collected from commodity-related systems, and core feature data such as commodity core keywords (i.e., the historical keywords) and historical manufacturer identification codes (i.e., the historical identification codes) are extracted from the sample data to form a brand word segmentation corpus (i.e., the attribution database). A combined model of rule matching and TF-IDF (term frequency-inverse document frequency) is then established from the brand word segmentation corpus and the core feature data. For each subsequent commodity whose brand needs to be matched, description keywords and the manufacturer identification code are extracted from its commodity description data, and the combined model is used to match the target brand party to which the commodity belongs. For ease of understanding, an embodiment of the present invention provides the architecture diagram of a commodity management system shown in Fig. 2, in which the commodity management system includes a commodity corpus preprocessing unit, a rule matching unit, a model prediction unit, a commodity data set, and a service database.
The commodity corpus preprocessing unit is responsible for screening the collected sample data or commodity description data and for structured preprocessing, which comprises word segmentation and identification code extraction. Word segmentation is performed on the commodity name and the commodity description text of each commodity. Identification code extraction obtains the manufacturer identification code from the national barcode; optionally, a specified field of the national barcode is extracted, for example the first 7 digits of the national barcode. In addition, for sample data, commodity core keywords that are strongly correlated with the corresponding brand can be screened out of the segmented words to obtain the attribution database. The attribution database may include a mapping relationship between historical identification codes and brand party sets, a mapping relationship between historical brand parties and keyword sets, and the commodity SKU ID (Stock Keeping Unit identifier, the unique code of a stock keeping unit). Each brand party set comprises at least one historical brand party, and each keyword set comprises at least one historical keyword. The brand word segmentation corpus stored in the attribution database is shown in Table 1 below:
TABLE 1
[Table 1 appears as an image in the original publication (Figure BDA0003283509630000081).]
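Since Table 1 is only available as an image, the following sketch shows one possible in-memory shape for the attribution database and the identification code extraction described above. Everything here is an assumption made for illustration: the class name `AttributionDatabase`, its two fields, the `add_sample` helper, the sample keywords, and the second barcode are invented; only the 7-digit prefix rule, the barcode "6907992104554" and the brand labels come from the text.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set

MANUFACTURER_ID_LENGTH = 7  # the text's example uses the first 7 digits


def manufacturer_id(barcode: str) -> str:
    """Extract the manufacturer identification code from a national barcode."""
    if not barcode.isdigit() or len(barcode) < MANUFACTURER_ID_LENGTH:
        raise ValueError(f"not a usable barcode: {barcode!r}")
    return barcode[:MANUFACTURER_ID_LENGTH]


@dataclass
class AttributionDatabase:
    """Hypothetical layout of the brand word segmentation corpus."""

    # historical identification code -> set of historical brand parties
    brands_by_manufacturer: Dict[str, Set[str]] = field(default_factory=dict)
    # historical brand party -> keyword occurrences (a bag, so counts are kept)
    keywords_by_brand: Dict[str, Counter] = field(default_factory=dict)

    def add_sample(self, barcode: str, brand: str, keywords: Iterable[str]) -> None:
        """Record one preprocessed sample commodity."""
        code = manufacturer_id(barcode)
        self.brands_by_manufacturer.setdefault(code, set()).add(brand)
        self.keywords_by_brand.setdefault(brand, Counter()).update(keywords)


db = AttributionDatabase()
# Keywords and the second barcode are made up for the demonstration.
db.add_sample("6907992104554", "illite", ["illite", "pure", "milk"])
db.add_sample("6907992000001", "mong cattle", ["mong cattle", "milk"])
print(sorted(db.brands_by_manufacturer["6907992"]))
```

Keeping keyword counts rather than plain sets is a choice made so that the same structure could later feed frequency statistics.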
The rule matching unit is responsible for performing custom rule matching on a commodity to be matched (i.e., the target item) based on the attribution database, in order to determine the target brand party to which the commodity belongs. In one embodiment, the matching rules include manufacturer identification code matching and keyword matching. Manufacturer identification code matching finds the brand range to which the commodity to be matched belongs, and keyword matching determines, within that brand range, the unique brand (namely, the target brand party) to which the commodity belongs. Illustratively, the data structure of a commodity to be matched is shown in Table 2 below:
TABLE 2
[Table 2 appears as an image in the original publication (Figures BDA0003283509630000082 and BDA0003283509630000091).]
Based on the rule matching unit, the embodiment of the present invention provides an implementation manner of step S104, which is shown in the following steps a to e:
Step a: extract the specified field of the national barcode to determine the manufacturer identification code of the target item. For example, the first 7 digits "6907992" of the national barcode "6907992104554" are extracted, and "6907992" is the manufacturer identification code.
Step b: search the pre-established attribution database for at least one second candidate brand party matching the manufacturer identification code. As can be seen from Table 1, the attribution database includes a mapping relationship between manufacturer identification codes and brand party sets, so the second candidate brand parties corresponding to the manufacturer identification code can be looked up in the attribution database. For example, the brand range corresponding to the manufacturer identification code "6907992" includes "illite" and "mong cattle".
Step c: judge whether the item name contains a target keyword corresponding to a second candidate brand party. If not, execute step d; if yes, execute step e. Illustratively, if the item name contains "illite", then "illite" is directly determined as the target brand party; if the item name contains neither "illite" nor "mong cattle", word segmentation is performed on the item name.
Step d: perform word segmentation on the item name to obtain a plurality of description keywords.
Step e: determine the brand party represented by the target keyword as the target brand party.
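Steps a to e can be sketched as a single function. This is an illustrative sketch only: the name `rule_match`, the return convention, and the assumption that the target keyword of a brand party is simply its name are not stated in the text, and `str.split` stands in for a real word segmenter.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple


def rule_match(
    barcode: str,
    item_name: str,
    brands_by_manufacturer: Dict[str, Set[str]],
    segment: Callable[[str], List[str]],
) -> Tuple[Optional[str], List[str]]:
    """Return (target brand party, []) on a rule hit, else (None, description keywords)."""
    # Step a: the manufacturer identification code is the first 7 digits.
    code = barcode[:7]
    # Step b: look up the second candidate brand parties (the brand range).
    second_candidates = brands_by_manufacturer.get(code, set())
    # Steps c and e: a candidate named in the item name is the target brand party.
    # Assumption: each brand's target keyword is its own name; ties resolve alphabetically.
    for brand in sorted(second_candidates):
        if brand in item_name:
            return brand, []
    # Step d: no target keyword found, so segment the item name instead.
    return None, segment(item_name)


brand_range = {"6907992": {"illite", "mong cattle"}}
print(rule_match("6907992104554", "illite pure milk", brand_range, str.split))
print(rule_match("6907992104554", "pure milk 250ml", brand_range, str.split))
```

When the function returns `None` together with description keywords, those keywords are what the scoring model would consume next.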
The embodiment of the invention improves the accuracy of determining the target brand party to which the target article belongs based on the manufacturer identification code and by combining the description key words of the commodity.
The model prediction unit is responsible for constructing a scoring model (also called a TF-IDF model or brand probability prediction model) using the TF-IDF statistical rule, so that the target brand party of a commodity can be predicted by the scoring model. Its feature data come from the commodity word segmentation results of the commodity corpus preprocessing unit. The unit mainly comprises a model construction part and a model prediction part.
For the model construction part, the main purpose is to score the commodity keywords that appear under each brand. The scoring rule is mainly based on the TF-IDF statistical rule, and the embodiment of the invention provides the scoring model construction method shown in steps 1 to 2 below:
Step 1: for each historical brand party, perform statistical processing on each historical keyword in the keyword set corresponding to the historical brand party, and determine a first frequency and a second frequency of each historical keyword for that historical brand party. The first frequency represents how often the historical keyword appears in the keyword set corresponding to that historical brand party, and the second frequency represents how often the historical keyword appears across the keyword sets corresponding to all historical brand parties. For example, the brand party "Tide" corresponds to keyword set 1: "tide, laundry powder, laundry detergent, full effect, clean white, incense, lavender, high concentration, overcoat, natural, clean, laundry soap, lemon, degerming, collar clean". Taking "laundry powder" as an example, the number x of occurrences of "laundry powder" in keyword set 1 (i.e., the first frequency) is determined, and the number y of occurrences of "laundry powder" in all the corpora included in the brand word segmentation corpus (i.e., the second frequency) is determined.
Step 2: determine the sub-score value of each historical keyword for each historical brand party according to the first frequency and the second frequency of that keyword for that brand party. In one embodiment, the more frequently a historical keyword appears under the brand party, the higher its sub-score value, and the more frequently it appears in the brand word segmentation corpus as a whole, the lower its sub-score value; that is, the sub-score value is positively correlated with the first frequency and negatively correlated with the second frequency. Illustratively, referring to the schematic diagram of a scoring model shown in Fig. 3, the scoring model defines a sub-score value for each keyword under each brand. Taking the brand "Tide" as an example, the sub-score value of the keyword "tide" is 5.1, that of "laundry powder" is 2.1, and that of "laundry detergent" is 1.9.
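Steps 1 and 2 can be illustrated with a TF-IDF-style weighting. The text does not give the exact formula, so the one below is an assumption: the first frequency x is normalized by the size of the brand's keyword bag, and the second frequency y enters through a logarithmic inverse term, which makes the sub-score rise with x and fall with y. The function name `build_scoring_model` and the toy counts are likewise hypothetical, and the resulting numbers will not reproduce the values shown in Fig. 3.

```python
import math
from collections import Counter
from typing import Dict


def build_scoring_model(keywords_by_brand: Dict[str, Counter]) -> Dict[str, Dict[str, float]]:
    """Compute a sub-score for every (brand party, historical keyword) pair.

    Assumed formula: (x / brand_total) * log(1 + corpus_total / y), where
    x is the keyword count under the brand (first frequency) and
    y is the keyword count over the whole corpus (second frequency).
    """
    corpus_counts: Counter = Counter()
    for bag in keywords_by_brand.values():
        corpus_counts.update(bag)
    corpus_total = sum(corpus_counts.values())

    model: Dict[str, Dict[str, float]] = {}
    for brand, bag in keywords_by_brand.items():
        brand_total = sum(bag.values())
        model[brand] = {
            keyword: (x / brand_total) * math.log(1 + corpus_total / corpus_counts[keyword])
            for keyword, x in bag.items()
        }
    return model


# Toy counts invented for the demonstration.
toy_corpus = {
    "Tide": Counter({"tide": 3, "laundry detergent": 2, "natural": 1}),
    "Piao": Counter({"piao": 3, "essence": 2, "natural": 1}),
}
model = build_scoring_model(toy_corpus)
for keyword, sub_score in sorted(model["Tide"].items(), key=lambda kv: -kv[1]):
    print(keyword, round(sub_score, 3))
```

In this toy corpus the brand-specific keyword "tide" ranks above the shared keyword "natural", which is the qualitative behaviour the text asks of the sub-score.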
Based on the scoring model, in subsequent applications, the to-be-predicted commodity is segmented by using a segmentation library in the commodity corpus preprocessing unit, the sum of the sub-scoring values of each keyword in a certain brand party is used as a judgment standard, and the brand party with the highest total scoring value is used as a finally selected brand (namely, a target brand party). The embodiment of the invention also provides an implementation mode for calculating the total score values of a plurality of first candidate brand parties based on pre-established scoring models and various description keywords, which is shown in the following (1) to (2):
(1) Determining, in the attribution database, the first candidate brand party corresponding to each description keyword. For example, the article name of the target article is "natural essence laundry detergent", and word segmentation yields the description keywords ["natural", "essence", "laundry detergent"]. The keyword set corresponding to each brand party in the attribution database is searched for the keywords "natural", "essence", and "laundry detergent"; if the keyword sets of "Tide" and "Piaorou" contain them, "Tide" and "Piaorou" are determined as first candidate brand parties.
(2) For each first candidate brand party, determining the sub-score value of each description keyword for the first candidate brand party based on the pre-established scoring model, and taking the sum of the sub-score values as the total score value of the first candidate brand party. With continued reference to Fig. 3, which shows the sub-score values of "natural", "essence", and "laundry detergent" for "Tide" and "Piaorou", the total score value under the scoring model (denoted score(n)) is the sum of the sub-score values, under that brand's TF-IDF model, of the segmentation-result words that match, as follows:
score(Tide) = natural: 0.3 + laundry detergent: 1.9 = 2.2;
score(Piaorou) = natural: 0.2 + essence: 0.6 = 0.8.
By building a scoring model over the keywords, the embodiment of the invention can accurately output the target brand party to which the target item belongs even when the commodity has no manufacturer identification code.
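The summation in (1) to (2) can be illustrated with a short Python sketch. The sub-score values below are the ones quoted in the worked example (brand names rendered as "Tide" and "Piaorou"); the function name `total_scores` and the dictionary layout of the scoring model are assumptions made for illustration.

```python
def total_scores(description_keywords, scoring_model):
    """Return {brand: total score} for brands matching at least one keyword."""
    scores = {}
    for brand, sub_scores in scoring_model.items():
        # A brand is a first candidate brand party if its keyword set
        # contains at least one of the description keywords.
        matched = [kw for kw in description_keywords if kw in sub_scores]
        if matched:
            scores[brand] = sum(sub_scores[kw] for kw in matched)
    return scores


# Sub-score values taken from the worked example above.
scoring_model = {
    "Tide": {"tide": 5.1, "laundry powder": 2.1, "laundry detergent": 1.9, "natural": 0.3},
    "Piaorou": {"natural": 0.2, "essence": 0.6},
}
scores = total_scores(["natural", "essence", "laundry detergent"], scoring_model)
# scores is approximately {"Tide": 2.2, "Piaorou": 0.8}, up to floating-point rounding
```

Brands whose keyword sets match none of the description keywords are simply left out of the result, so they never become candidates.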
The embodiment of the present invention further provides an implementation for determining the target brand party of the target item according to the total score value of each first candidate brand party, as shown in (a) to (b) below:
(a) Determining a third candidate brand party from the first candidate brand parties based on the total score value of each first candidate brand party. In an optional implementation, for each first candidate brand party, it is judged whether the total score value of the first candidate brand party is greater than a preset threshold; if so, the first candidate brand party is determined to be a third candidate brand party. For example, the preset threshold is 0.7; since score(Tide) and score(Piaorou) are both greater than 0.7, both "Tide" and "Piaorou" are determined as third candidate brand parties.
(b) Determining the third candidate brand party with the highest total score value as the target brand party of the target item. For example, since score(Tide) > score(Piaorou), "Tide" is determined to be the target brand party of the target item.
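The threshold filter and the final selection can be sketched as below. The function name `select_target_brand` is hypothetical; the threshold 0.7 and the two total score values are those of the example above. Returning `None` when no brand exceeds the threshold is an assumed convention for "no target brand party found".

```python
def select_target_brand(total_scores, threshold):
    """Pick the brand with the highest total score among those above threshold."""
    # Third candidate brand parties: total score strictly greater than threshold.
    qualified = {brand: s for brand, s in total_scores.items() if s > threshold}
    if not qualified:
        return None  # assumed convention: no brand reaches the threshold
    # Target brand party: the third candidate with the highest total score.
    return max(qualified, key=qualified.get)


target = select_target_brand({"Tide": 2.2, "Piaorou": 0.8}, threshold=0.7)
```

Here both brands pass the 0.7 threshold and `target` is "Tide", the brand with the larger total score value.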
For ease of understanding, the embodiment of the present invention provides an application example of the method for determining the brand side of an item. Referring to the flowchart of another method for determining the brand side of an item shown in Fig. 4, the method mainly includes the following steps S402 to S418:
Step S402, loading commodity data and manufacturer identification code data of the commodity to be matched, where the commodity data includes the commodity name and the commodity description text. For example, the commodity name is "orchid permanent care moist 750g shampoo packs" and the manufacturer identification code data is "6903148".
Step S404, judging whether the national bar code of the commodity to be matched can be matched with a manufacturer identification code. If yes, go to step S406; if not, go to step S412. In one embodiment, the manufacturer identification code "6903148" is looked up in the brand word-segmentation corpus; if it is found, step S406 is performed, and if it is not found, step S412 is performed.
Step S406, obtaining the brand range corresponding to the manufacturer identification code. For example, the brand range found in the brand word-segmentation corpus for the manufacturer identification code "6903148" includes "power", "Piaorou", and "Pantene".
Step S408, acquiring the historical keywords associated with the matched brand range. In one embodiment, the brand range includes a plurality of second candidate brand parties, each of which corresponds to a plurality of historical keywords, and the historical keywords may include keywords that characterize the brand party to which they belong. For example, the historical keywords associated with the brand range include "power", "Piaorou", and "Pantene".
Step S410, judging whether the commodity name contains a target keyword. If yes, go to step S418; if not, go to step S412. For example, if the commodity name is "orchid long-acting smooth water 750g shampoo preferential package", which contains no word matching the above historical keywords, step S412 is performed; if the commodity name is "Piaorou orchid long-acting clean and smooth water 750g shampoo preferential package", which contains the word "Piaorou" matching the above historical keywords, step S418 is performed.
Step S412, inputting the commodity name into the word segmentation model to obtain a word segmentation list. The word segmentation model performs word segmentation on the commodity name, and the word segmentation list presents the resulting description keywords in list form.
Step S414, performing score prediction on the word segmentation list through the TF-IDF model, and outputting the first candidate brand parties corresponding to the commodity name and the total score value of each first candidate brand party.
Step S416, judging whether the total score value of a first candidate brand party is greater than a preset threshold. If so, the first candidate brand party with the highest total score value is taken as the target brand party; if not, the process ends.
Step S418, writing the commodity to be matched and the target brand party into a service database.
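Steps S402 to S418 can be tied together in a single Python sketch. Several details are assumptions not stated in the source: the manufacturer identification code is taken to be the first 7 digits of the bar code (consistent with the 7-digit example "6903148"), the word segmentation model is passed in as a hypothetical callable `segment`, and writing to the service database (S418) is left to the caller. All function and parameter names are illustrative.

```python
def determine_brand(item_name, barcode, vendor_to_brands, brand_keywords,
                    scoring_model, segment, threshold):
    """Rule matching first (S404-S410), then model scoring (S412-S416)."""
    # S404: assumed field layout, first 7 digits are the manufacturer code.
    vendor_id = barcode[:7]
    # S406-S410: look for a historical keyword of a brand in the brand range.
    for brand in vendor_to_brands.get(vendor_id, []):
        if any(kw in item_name for kw in brand_keywords.get(brand, [])):
            return brand  # S418 (persisting the result) is left to the caller
    # S412: segment the commodity name into description keywords.
    tokens = segment(item_name)
    # S414: sum the sub-score values of matching keywords per brand.
    scores = {}
    for brand, sub_scores in scoring_model.items():
        matched = [t for t in tokens if t in sub_scores]
        if matched:
            scores[brand] = sum(sub_scores[t] for t in matched)
    # S416: keep brands above the threshold and return the best, if any.
    qualified = {b: s for b, s in scores.items() if s > threshold}
    return max(qualified, key=qualified.get) if qualified else None
```

The rule path returns as soon as a brand keyword is found in the commodity name, so the scoring model is only consulted when the bar code is unknown or no keyword of the matched brand range appears in the name.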
In the method for determining the brand side of an article provided by the embodiment of the invention, brand prediction is the process of passing the article description data of the article to be matched through the rule matching module and the model prediction module to obtain its target brand party. The overall flow is as follows: first, the article description data of the article to be matched is loaded from the article-related system; then the manufacturer identification code and the description keywords are determined, the national bar code is matched against the manufacturer identification codes, and the article name is matched against the keywords; if the conditions are met, the exact brand relation is output. If the conditions are not met, the commodity name is segmented and scored by the TF-IDF model to obtain the brand relation that reaches the threshold and has the highest score. The method for determining the brand side of the article has at least the following characteristics:
(1) The commodity-brand relationship maintenance process is automated, so commodity brand data can be matched quickly and the labor cost of manual maintenance is reduced.
(2) The accuracy of commodity-brand relationships is improved, which also provides data support for subsequent merchant-brand cooperative operations.
(3) Erroneous data produced by manual maintenance is found in time and corrected quickly.
Corresponding to the method for determining the brand side of an article provided by the foregoing embodiment, an embodiment of the present invention provides an apparatus for determining the brand side of an article. Referring to the schematic structural diagram of the apparatus shown in Fig. 5, the apparatus mainly includes the following components:
a data obtaining module 502, configured to obtain item description data of a target item;
a word segmentation module 504, configured to perform word segmentation on the article description data to obtain multiple description keywords if the article description data does not include the target keyword; the target keywords are used for representing a target brand side of the target object;
a scoring value calculating module 506, configured to calculate total scoring values of the plurality of first candidate brand parties based on a pre-established scoring model and the respective description keywords;
and a brand party determining module 508, configured to determine a target brand party of the target item according to the total score value of each first candidate brand party.
With the apparatus for determining the brand party of an article provided by the embodiment of the invention, when the article description data does not contain the target keyword, the first candidate brand parties are scored using the scoring model and the description keywords, and the target brand party is determined based on the total score value of each first candidate brand party, so that the target brand party to which the target article belongs is determined automatically.
In one embodiment, the item description data includes at least a national barcode and an item name; the word segmentation module 504 is further configured to: extracting a specified field in the national bar code, and determining a manufacturer identification code of the target object; searching at least one second candidate brand party matched with the manufacturer identification code in a pre-established attribution database; the attribution database at least comprises a mapping relation between historical identification codes and a brand party set, and the brand party set comprises at least one historical brand party; judging whether the name of the article contains a target keyword corresponding to the second candidate brand party; and if not, performing word segmentation processing on the article name to obtain a plurality of description keywords.
In one embodiment, the score calculation module 506 is further configured to: respectively determining a first candidate brand party corresponding to each description keyword in an attribution database; the attribution database also comprises a mapping relation between historical brand parties and a keyword set, wherein the keyword set comprises at least one historical keyword; for each first candidate brand party, determining the sub-score value of each description keyword for the first candidate brand party based on a pre-established scoring model, and taking the sum of the sub-score values as the total score value of the first candidate brand party.
In one embodiment, the apparatus further includes a model building module configured to: for each historical brand party, perform statistical processing on each historical keyword in the keyword set corresponding to the historical brand party, and determine a first frequency and a second frequency of each historical keyword for the historical brand party, where the first frequency represents how often the historical keyword appears in the keyword set corresponding to that historical brand party, and the second frequency represents how often the historical keyword appears across the keyword sets corresponding to all historical brand parties; and determine the sub-score value of each historical keyword for the brand party according to the first frequency and the second frequency of that historical keyword for the historical brand party.
In one embodiment, the sub-score value is positively correlated with the first frequency and the sub-score value is negatively correlated with the second frequency.
In one embodiment, the brand party determining module 508 is further configured to: determine a third candidate brand party from the first candidate brand parties based on the total score value of each first candidate brand party; and determine the third candidate brand party with the highest total score value as the target brand party of the target item.
In one embodiment, the brand party determining module 508 is further configured to: for each first candidate brand party, judge whether the total score value of the first candidate brand party is greater than a preset threshold; if so, determine the first candidate brand party to be a third candidate brand party.
The device provided by the embodiment of the present invention has the same implementation principle and technical effect as the foregoing method embodiments. For brevity, where the device embodiment does not mention a detail, reference may be made to the corresponding content in the method embodiments.
The embodiment of the invention provides a server, which particularly comprises a processor and a storage device; the storage means has stored thereon a computer program which, when executed by the processor, performs the method of any of the above described embodiments.
Fig. 6 is a schematic structural diagram of a server according to an embodiment of the present invention, where the server 100 includes: a processor 60, a memory 61, a bus 62 and a communication interface 63, wherein the processor 60, the communication interface 63 and the memory 61 are connected through the bus 62; the processor 60 is arranged to execute executable modules, such as computer programs, stored in the memory 61.
The Memory 61 may include a high-speed Random Access Memory (RAM) and may also include a non-volatile Memory (non-volatile Memory), such as at least one disk Memory. The communication connection between the network element of the system and at least one other network element is realized through at least one communication interface 63 (which may be wired or wireless), and the internet, a wide area network, a local network, a metropolitan area network, and the like can be used.
The bus 62 may be an ISA bus, PCI bus, EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 6, but that does not indicate only one bus or one type of bus.
The memory 61 is used for storing a program, the processor 60 executes the program after receiving an execution instruction, and the method executed by the apparatus defined by the flow process disclosed in any of the foregoing embodiments of the present invention may be applied to the processor 60, or implemented by the processor 60.
The processor 60 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 60 or by instructions in the form of software. The processor 60 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 61, and the processor 60 reads the information in the memory 61 and, in combination with its hardware, performs the steps of the above method.
The computer program product of the readable storage medium provided in the embodiment of the present invention includes a computer readable storage medium storing a program code, where instructions included in the program code may be used to execute the method described in the foregoing method embodiment, and specific implementation may refer to the foregoing method embodiment, which is not described herein again.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Finally, it should be noted that the above embodiments are only specific embodiments of the present invention, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the art may, within the technical scope disclosed by the present invention, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features; such modifications, changes, or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present invention, and shall all be covered by its protection scope. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method for determining a brand of an item, comprising:
acquiring article description data of a target article;
if the article description data does not contain the target keywords, performing word segmentation processing on the article description data to obtain a plurality of description keywords; wherein the target keyword is used for representing a target brand side of the target item;
calculating total scoring values of a plurality of first candidate brand parties based on a pre-established scoring model and the description keywords;
determining a target brand party for the target item based on the total score value for each of the first candidate brand parties.
2. The method of claim 1, wherein the item description data includes at least a country barcode and an item name;
if the article description data does not contain the target keyword, the step of performing word segmentation processing on the article description data to obtain a plurality of description keywords comprises the following steps:
extracting a specified field in the national bar code, and determining a manufacturer identification code of the target object;
searching at least one second candidate brand party matched with the manufacturer identification code in a pre-established attribution database; wherein the attribution database comprises at least a mapping relationship between a historical identifier and a set of branded parties, the set of branded parties comprising at least one historical branded party;
judging whether the article name contains a target keyword corresponding to the second candidate brand party or not;
and if not, performing word segmentation processing on the article name to obtain a plurality of description keywords.
3. The method of claim 2, wherein the step of calculating a total score value for a plurality of first candidate brand parties based on a pre-established scoring model and each of the descriptive keywords comprises:
respectively determining a first candidate brand party corresponding to each description keyword in the attribution database; the attribution database also comprises a mapping relation between historical brand parties and a keyword set, wherein the keyword set comprises at least one historical keyword;
for each first candidate brand party, determining the sub-score value of each description keyword for the first candidate brand party based on a pre-established scoring model, and taking the sum of each sub-score value as the total score value of the first candidate brand party.
4. The method of claim 3, further comprising:
for each historical brand party, performing statistical processing on each historical keyword in a keyword set corresponding to the historical brand party, and determining a first frequency and a second frequency of each historical keyword for the historical brand party; the first frequency is used for representing the frequency of the historical keywords appearing in the keyword set corresponding to the historical brand party, and the second frequency is used for representing the frequency of the historical keywords appearing in the keyword set corresponding to each historical brand party;
determining a sub-score value of each historical keyword for the brand party according to the first frequency and the second frequency of each historical keyword for the historical brand party.
5. The method of claim 4, wherein the sub-score value is positively correlated with the first frequency and the sub-score value is negatively correlated with the second frequency.
6. The method of claim 1, wherein the step of determining a target brand party for the target item based on the total score value for each of the first candidate brand parties comprises:
determining a third candidate brand party from the first candidate brand parties based on the total score value for each of the first candidate brand parties;
and determining the third candidate brand party with the highest scoring value as the target brand party of the target item.
7. The method of claim 6, wherein the step of determining a third candidate brand party from the first candidate brand parties based on the total score value of each of the first candidate brand parties comprises:
for each first candidate brand party, judging whether the total score value of the first candidate brand party is greater than a preset threshold value;
if so, the first candidate brand party is determined to be a third candidate brand party.
8. An apparatus for determining a brand side of an article, comprising:
the data acquisition module is used for acquiring article description data of a target article;
the word segmentation module is used for carrying out word segmentation on the article description data to obtain a plurality of description keywords if the article description data does not contain the target keywords; wherein the target keyword is used for representing a target brand side of the target item;
the scoring value calculating module is used for calculating total scoring values of a plurality of first candidate brand parties based on a pre-established scoring model and the description keywords;
a brand party determination module to determine a target brand party for the target item based on the total score value for each of the first candidate brand parties.
9. A server comprising a processor and a memory, the memory storing computer-executable instructions executable by the processor, the processor executing the computer-executable instructions to implement the method of any one of claims 1 to 7.
10. A computer-readable storage medium having computer-executable instructions stored thereon which, when invoked and executed by a processor, cause the processor to implement the method of any of claims 1 to 7.
CN202111140101.5A 2021-09-28 2021-09-28 Method, device and server for determining brand party of article Active CN113836916B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111140101.5A CN113836916B (en) 2021-09-28 2021-09-28 Method, device and server for determining brand party of article

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111140101.5A CN113836916B (en) 2021-09-28 2021-09-28 Method, device and server for determining brand party of article

Publications (2)

Publication Number Publication Date
CN113836916A true CN113836916A (en) 2021-12-24
CN113836916B CN113836916B (en) 2023-06-20

Family

ID=78970784

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111140101.5A Active CN113836916B (en) 2021-09-28 2021-09-28 Method, device and server for determining brand party of article

Country Status (1)

Country Link
CN (1) CN113836916B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116521906A (en) * 2023-04-28 2023-08-01 广州商研网络科技有限公司 Meta description generation method, device, equipment and medium thereof

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10387568B1 (en) * 2016-09-19 2019-08-20 Amazon Technologies, Inc. Extracting keywords from a document
CN110457568A (en) * 2018-05-03 2019-11-15 北京京东尚科信息技术有限公司 The recognition methods of brand word and system, object recommendation method and system
CN110750985A (en) * 2018-07-04 2020-02-04 阿里巴巴集团控股有限公司 Brand word recognition method, device, equipment and storage medium
CN110781307A (en) * 2019-11-06 2020-02-11 北京沃东天骏信息技术有限公司 Target item keyword and title generation method, search method and related equipment
CN111259660A (en) * 2020-01-15 2020-06-09 中国平安人寿保险股份有限公司 Method, device and equipment for extracting keywords based on text pairs and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116521906A (en) * 2023-04-28 2023-08-01 广州商研网络科技有限公司 Meta description generation method, device, equipment and medium thereof
CN116521906B (en) * 2023-04-28 2023-10-24 广州商研网络科技有限公司 Meta description generation method, device, equipment and medium thereof

Also Published As

Publication number Publication date
CN113836916B (en) 2023-06-20

Similar Documents

Publication Publication Date Title
US11450125B2 (en) Methods and systems for automated table detection within documents
KR101999471B1 (en) Information recommendation methods and devices
JP6991163B2 (en) How to push information and devices
US20230005286A1 (en) Methods, systems, articles of manufacture, and apparatus for decoding purchase data using an image
CN113570413B (en) Advertisement keyword generation method and device, storage medium and electronic equipment
US20220414630A1 (en) Methods, systems, articles of manufacture, and apparatus for decoding purchase data using an image
US20230110941A1 (en) Data processing for enterprise application chatbot
CN113535817A (en) Method and device for generating characteristic broad table and training business processing model
CN112685635A (en) Item recommendation method, device, server and storage medium based on classification label
CN110362702B (en) Picture management method and equipment
CN113836916A (en) Method and device for determining brand side of article and server
CN110796178B (en) Decision model training method, sample feature selection method, device and electronic equipment
CN110717095B (en) Service item pushing method and device
CN112669053A (en) Fraud group identification method, device, equipment and medium based on sales data
CN111898378A (en) Industry classification method and device for government and enterprise clients, electronic equipment and storage medium
CN116150477A (en) Financial information personalized recommendation method, device, equipment and medium
CN110765100A (en) Label generation method and device, computer readable storage medium and server
CN116186286A (en) International logistics information recommendation method, system and medium based on enterprise knowledge graph
CN111695922A (en) Potential user determination method and device, storage medium and electronic equipment
CN108345600B (en) Management of search application, data search method and device thereof
RU2480828C1 (en) Method of predicting target value of events based on unlimited number of characteristics
US11282093B2 (en) Method and system for machine learning based item matching by considering user mindset
CN114328844A (en) Text data set management method, device, equipment and storage medium
CN113806526A (en) Feature extraction method, device and storage medium
CN112598185A (en) Agricultural public opinion analysis method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant