CN113836916B - Method, device and server for determining brand party of article - Google Patents
- Publication number
- CN113836916B CN202111140101.5A
- Authority
- CN
- China
- Prior art keywords
- brand
- party
- target
- candidate
- keyword
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06393—Score-carding, benchmarking or key performance indicator [KPI] analysis
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- Physics & Mathematics (AREA)
- Human Resources & Organizations (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Strategic Management (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Entrepreneurship & Innovation (AREA)
- Economics (AREA)
- Health & Medical Sciences (AREA)
- Educational Administration (AREA)
- Development Economics (AREA)
- Probability & Statistics with Applications (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Game Theory and Decision Science (AREA)
- Mathematical Physics (AREA)
- Marketing (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a method, a device and a server for determining the brand party of an article. The method comprises: acquiring article description data of a target article; if the article description data does not contain a target keyword, performing word segmentation on the article description data to obtain a plurality of description keywords, wherein the target keyword characterizes the target brand party of the target article; calculating total scoring values of a plurality of first candidate brand parties based on a pre-established scoring model and each description keyword; and determining the target brand party of the target article according to the total scoring value of each first candidate brand party. The method matches articles to brand parties automatically, which significantly reduces the cost of determining the brand party of an article and effectively improves the accuracy of the determination.
Description
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method, an apparatus, and a server for determining a brand of an article.
Background
In the new retail era, local businesses are actively undergoing digital transformation. Commodities are the most important link in retail, and whether in a traditional ERP (Enterprise Resource Planning) system or in a digital intelligent operating system, accurate commodity information is fundamental to the entire business flow. The brand is one of the basic attributes of a commodity, and the accuracy of the brand party to which a commodity belongs is particularly important when commodity data is used for more than display to users, especially for purchase orders, gross settlement and data analysis from the perspective of the brand party. At present, the brand party of a commodity is mainly determined by store personnel or operators of the retailer manually entering the brand party in a commodity management system, which maintains the relationship between the commodity and the brand party. However, manual entry suffers from high cost and low accuracy.
Disclosure of Invention
In view of the above, the present invention aims to provide a method, a device and a server for determining a brand of an article, which can realize automatic matching between the article and the brand, and not only remarkably reduce the cost for determining the brand of the article, but also effectively improve the accuracy for determining the brand of the article.
In a first aspect, an embodiment of the present invention provides a method for determining a brand party of an article, including: acquiring article description data of a target article; if the article description data does not contain the target keywords, performing word segmentation processing on the article description data to obtain a plurality of description keywords; wherein the target keyword is used for characterizing a target brand party of the target article; calculating total scoring values of a plurality of first candidate brand parties based on a pre-established scoring model and each description keyword; a target brand party of the target item is determined based on the total scoring value for each of the first candidate brand parties.
In one embodiment, the article description data includes at least a national barcode and an article name; and the step of performing word segmentation on the article description data to obtain a plurality of description keywords if the article description data does not contain the target keyword comprises: extracting a specified field of the national barcode to determine a manufacturer identification code of the target article; searching a pre-established attribution database for at least one second candidate brand party matching the manufacturer identification code, wherein the attribution database comprises at least a mapping relation between historical identification codes and brand party sets, and each brand party set comprises at least one historical brand party; judging whether the article name contains a target keyword corresponding to the second candidate brand party; and if not, performing word segmentation on the article name to obtain a plurality of description keywords.
In one embodiment, the step of calculating the total scoring values of the plurality of first candidate brands based on the pre-established scoring model and each of the descriptive keywords comprises: respectively determining a first candidate brand party corresponding to each description keyword in the attribution database; wherein the attribution database further comprises a mapping relation between a historical brand party and a keyword set, and the keyword set comprises at least one historical keyword; for each first candidate brand party, determining the sub-scoring values of the descriptive keywords for the first candidate brand party based on a pre-established scoring model, and taking the sum of the sub-scoring values as the total scoring value of the first candidate brand party.
In one embodiment, the method further comprises: for each historical brand party, performing statistical processing on each historical keyword in the keyword set corresponding to the historical brand party, and determining a first frequency and a second frequency of each historical keyword for the historical brand party, wherein the first frequency represents how often the historical keyword occurs in the keyword set corresponding to that historical brand party, and the second frequency represents how often the historical keyword occurs across the keyword sets corresponding to all historical brand parties; and determining a sub-scoring value of each historical keyword for the historical brand party according to the first frequency and the second frequency of the historical keyword for the historical brand party.
In one embodiment, the sub-scoring value is positively correlated with the first frequency and the sub-scoring value is negatively correlated with the second frequency.
In one embodiment, the step of determining the target brand party of the target article based on the total scoring value of each first candidate brand party includes: determining a third candidate brand party from the first candidate brand parties based on the total scoring value of each first candidate brand party; and determining the third candidate brand party with the highest total scoring value to be the target brand party of the target article.
In one embodiment, the step of determining a third candidate brand party from the first candidate brand parties based on the total scoring value of each of the first candidate brand parties includes: for each first candidate brand party, judging whether the total scoring value of the first candidate brand party is larger than a preset threshold value; if so, the first candidate brand party is determined to be a third candidate brand party.
In a second aspect, an embodiment of the present invention further provides a device for determining a brand party of an article, including: the data acquisition module is used for acquiring article description data of the target article; the word segmentation module is used for carrying out word segmentation processing on the article description data to obtain a plurality of description keywords if the article description data does not contain the target keywords; wherein the target keyword is used for characterizing a target brand party of the target article; the scoring value calculation module is used for calculating total scoring values of a plurality of first candidate brand parties based on a pre-established scoring model and each description keyword; and the brand party determining module is used for determining a target brand party of the target object according to the total scoring value of each first candidate brand party.
In a third aspect, embodiments of the present invention also provide a server comprising a processor and a memory storing computer executable instructions executable by the processor, the processor executing the computer executable instructions to implement the method of any one of the first aspects.
In a fourth aspect, embodiments of the present invention also provide a computer-readable storage medium storing computer-executable instructions which, when invoked and executed by a processor, cause the processor to implement the method of any one of the first aspects.
According to the method, the device and the server for determining the brand party of an article provided by the embodiments of the invention, if the acquired article description data of the target article does not contain a target keyword characterizing the target brand party of the target article, word segmentation is performed on the article description data to obtain a plurality of description keywords, and the total scoring values of a plurality of first candidate brand parties are then calculated based on a pre-established scoring model and the description keywords, so that the target brand party of the target article is determined according to the total scoring value of each first candidate brand party. In other words, when the article description data does not contain the target keyword, the first candidate brand parties are scored using the scoring model and the description keywords, and the target brand party is determined based on the total scoring value of each first candidate brand party, so that the target brand party to which the target article belongs is determined automatically.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
In order to make the above objects, features and advantages of the present invention more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for determining brand identity of an article according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a commodity management system according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a scoring model according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating another method for determining brand identity of an article according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a device for determining brand of an article according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a server according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described in conjunction with the embodiments, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
At present, manually entering the brand party to which a commodity belongs has the following problems: (1) the commodities on an e-commerce platform number in the millions, so the cost of manual maintenance is too high; (2) manual maintenance is highly prone to operation errors, so the accuracy of the entered commodity brand information is low. In view of this, the embodiments of the invention provide a method, a device and a server for determining the brand party of an article, which match articles to brand parties automatically, significantly reduce the cost of determining the brand party of an article, and effectively improve the accuracy of the determination.
To facilitate understanding of this embodiment, a method for determining the brand party of an article according to an embodiment of the present invention is first described in detail. Referring to the flowchart of the method shown in FIG. 1, the method mainly includes the following steps S102 to S108:
Step S102, acquiring article description data of the target article. The target article is the article whose brand party is to be matched. The article description data may include a national barcode and an article name, and may also include an article description text and the like. In one embodiment, an uploading channel may be provided so that a user can upload the article description data of the target article through it. The article description data may also be obtained by code scanning or image recognition; for example, the user scans the area where the national barcode of the target article is located with a code scanning device, so that the article description data of the target article is read automatically.
Step S104, if the article description data does not contain the target keyword, performing word segmentation on the article description data to obtain a plurality of description keywords. The target keyword characterizes the target brand party of the target article, and the description keywords characterize features of the target article. In one embodiment, the brand range of the target article may be determined based on the national barcode, and it may then be judged whether the article name contains a target keyword belonging to that brand range. For example, if the brand range is determined to include "illi", "Mongolian" and "Lishi", it is judged whether the article name contains a target keyword such as "illi", "Mongolian" or "Lishi"; when no target keyword is present, word segmentation is performed on the article name. Optionally, the segmented words obtained from the article name may be used directly as description keywords, or the segmented words may be further screened and the screened words used as description keywords, which improves the overall efficiency of determining the target brand party to which the target article belongs.
Step S106, calculating total scoring values of a plurality of first candidate brand parties based on the pre-established scoring model and the descriptive keywords. The scoring model comprises a plurality of historical brand parties and sub-scoring values of historical keywords corresponding to each historical brand party. In one embodiment, at least one first candidate brand party of the target object may be determined based on the description keywords, then the sub-scoring values of each description keyword for each first candidate brand party may be determined based on the scoring model, and the total scoring value of each first candidate brand party may be obtained by calculating the sum of the sub-scoring values of each description keyword for each first candidate brand party.
Step S108, determining the target brand party of the target article according to the total scoring value of each first candidate brand party. In one embodiment, the first candidate brand party with the highest total scoring value may be determined to be the target brand party. In another embodiment, considering that the total scoring values of all first candidate brand parties may be low, and in order to improve the accuracy of the target brand party, it may first be judged whether the total scoring value of each first candidate brand party is greater than a preset threshold, so that third candidate brand parties with high reliability are screened out, and the third candidate brand party with the highest total scoring value is determined to be the target brand party. If the total scoring values of all first candidate brand parties are lower than the preset threshold, no target brand party is determined from the first candidate brand parties.
According to the method for determining the brand party of an article provided by the embodiment of the invention, when the article description data does not contain the target keyword, the first candidate brand parties are scored using the scoring model and the description keywords, and the target brand party is determined based on the total scoring value of each first candidate brand party, so that the target brand party to which the target article belongs is determined automatically.
In one embodiment, a commodity management system is deployed in the server, and the method for determining the brand party of an article is executed by the commodity management system. The concept is as follows: commodity sample data is collected from commodity-related systems, and core feature data such as commodity core keywords (namely, the historical keywords) and historical manufacturer identification codes (namely, the historical identification codes) is extracted from the sample data to form a brand word segmentation corpus (namely, the attribution database). A combined model consisting of rule matching and a TFIDF (term frequency-inverse document frequency) model is then constructed from the brand word segmentation corpus and the core feature data. For each commodity whose brand is to be matched, description keywords and a manufacturer identification code are extracted from the commodity description data, and the combined model is used to match the target brand party to which the commodity belongs. For ease of understanding, the embodiment of the invention provides the architecture diagram of a commodity management system shown in FIG. 2, in which the commodity management system comprises a commodity corpus preprocessing unit, a rule matching unit, a model prediction unit, a commodity data set and a business database.
The commodity corpus preprocessing unit is responsible for sample screening and structured preprocessing of the collected sample data or commodity description data. The structured preprocessing comprises word segmentation and identification code extraction: word segmentation is performed on the commodity name and commodity description text of each commodity, and identification code extraction obtains the manufacturer identification code from the national barcode. Optionally, the manufacturer identification code is obtained by extracting a specified field of the national barcode; for example, the first 7 digits of the national barcode are the manufacturer identification code. In addition, for sample data, commodity core keywords that are strongly correlated with the corresponding brand can be screened out from the segmented words to obtain the attribution database. The attribution database may comprise a mapping relation between historical identification codes and brand party sets, a mapping relation between historical brand parties and keyword sets, and commodity SKU IDs (Stock Keeping Unit Identity, the unique code of a stock keeping unit), where each brand party set comprises at least one historical brand party and each keyword set comprises at least one historical keyword. The brand word segmentation corpus stored in the attribution database is shown in Table 1 below:
TABLE 1
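As an illustration only, the two mappings described for the attribution database could be held in memory as follows. This is a minimal Python sketch: the class name `AttributionDatabase`, its method names and the use of dictionaries are assumptions rather than part of the disclosure, SKU IDs are omitted, and the sample entries reuse values quoted elsewhere in this description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class AttributionDatabase:
    """Hypothetical in-memory form of the attribution database."""

    # historical identification code -> brand party set
    code_to_brands: Dict[str, Set[str]] = field(default_factory=dict)
    # historical brand party -> keyword set (repeats kept so they can be counted)
    brand_to_keywords: Dict[str, List[str]] = field(default_factory=dict)

    def add_code(self, code: str, brand: str) -> None:
        """Record that a manufacturer identification code belongs to a brand party."""
        self.code_to_brands.setdefault(code, set()).add(brand)

    def add_keywords(self, brand: str, keywords: List[str]) -> None:
        """Append historical keywords observed for a brand party."""
        self.brand_to_keywords.setdefault(brand, []).extend(keywords)

    def brands_for_code(self, code: str) -> Set[str]:
        """Return the brand range for a code, or an empty set if the code is unknown."""
        return self.code_to_brands.get(code, set())


db = AttributionDatabase()
db.add_code("6907992", "illi")
db.add_code("6907992", "Mongolian")
db.add_keywords("dip", ["dip", "laundry powder", "lavender"])
```

Keeping repeated keywords in a list rather than a set is a design choice of this sketch; it allows occurrence counts to be derived later when a scoring model is built.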
The rule matching unit is responsible for performing custom rule matching on the commodity to be matched (i.e., the target article) based on the attribution database, so as to confirm the target brand party to which the commodity belongs. In one embodiment, the matching rules include manufacturer identification code matching and keyword matching. Manufacturer identification code matching is used to find the brand range to which the commodity to be matched belongs, and keyword matching is used to determine, within that brand range, the unique brand (namely, the target brand party) to which the commodity belongs. By way of example, the data structure of a matched commodity is shown in Table 2 below:
TABLE 2
Based on the rule matching unit, the embodiment of the present invention provides an implementation manner of step S104, see the following steps a to e:
Step a, extracting a specified field of the national barcode to determine the manufacturer identification code of the target article. For example, the first 7 digits "6907992" of the national barcode "6907992104554" are extracted, and "6907992" is the manufacturer identification code.
Step b, searching the pre-established attribution database for at least one second candidate brand party matching the manufacturer identification code. As can be seen from Table 1, the attribution database contains the mapping relation between manufacturer identification codes and brand parties, so the second candidate brand party corresponding to the manufacturer identification code can be found in the attribution database. For example, the brand range corresponding to manufacturer identification code "6907992" includes "illi" and "Mongolian".
Step c, judging whether the article name contains a target keyword corresponding to a second candidate brand party. If not, executing step d; if yes, executing step e. For example, if the article name contains "illi", then "illi" is determined directly to be the target brand party; if the article name contains neither "illi" nor "Mongolian", word segmentation is performed on the article name.
Step d, performing word segmentation on the article name to obtain a plurality of description keywords.
Step e, determining the brand party characterized by the target keyword to be the target brand party.
By combining the manufacturer identification code with the description keywords of the commodity, the embodiment of the invention improves the accuracy of determining the target brand party to which the target article belongs.
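Steps a to e can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the target keyword of a brand party is assumed to be the brand name itself, whitespace splitting stands in for a real word segmenter, and the article name "illi pure milk" is invented for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

# The example in the text takes the first 7 digits as the manufacturer code.
MANUFACTURER_CODE_LENGTH = 7


def extract_manufacturer_code(barcode: str) -> str:
    """Step a: take the specified field (here the leading digits) of the barcode."""
    return barcode[:MANUFACTURER_CODE_LENGTH]


def segment(name: str) -> List[str]:
    """Placeholder segmenter (assumption): split on whitespace."""
    return name.split()


def rule_match(
    barcode: str, name: str, code_to_brands: Dict[str, Set[str]]
) -> Tuple[Optional[str], List[str]]:
    """Return (target brand, []) on a keyword hit, else (None, description keywords)."""
    code = extract_manufacturer_code(barcode)            # step a
    candidates = code_to_brands.get(code, set())         # step b: brand range
    for brand in sorted(candidates):                     # step c: keyword check
        if brand in name:
            return brand, []                             # step e: direct match
    return None, segment(name)                           # step d: fall back to segmentation


code_to_brands = {"6907992": {"illi", "Mongolian"}}
hit = rule_match("6907992104554", "illi pure milk", code_to_brands)
miss = rule_match("6907992104554", "natural essence detergent", code_to_brands)
```

When no target keyword is found, the returned description keywords are what the scoring model would consume next.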
For the model prediction unit, it is responsible for constructing a scoring model (also referred to as TFIDF model or brand probability prediction model) using TFIDF statistical rules, so that a target brand party of a commodity is predicted by the scoring model. The feature data is derived from commodity word segmentation results in a commodity corpus preprocessing unit. The model mainly comprises two parts of model construction and model prediction.
For the model construction part, the main purpose of model construction is to score commodity keywords appearing under each brand, and the scoring rule is mainly based on the TFIDF statistical rule, and the embodiment of the invention provides a scoring model construction method shown in the following steps 1 to 2:
Step 1, for each historical brand party, performing statistical processing on each historical keyword in the keyword set corresponding to the historical brand party, and determining a first frequency and a second frequency of each historical keyword for the historical brand party. The first frequency represents how often the historical keyword occurs in the keyword set corresponding to that historical brand party, and the second frequency represents how often the historical keyword occurs across the keyword sets corresponding to all historical brand parties. For example, brand party "dip" corresponds to keyword set 1: "dip, laundry powder, laundry detergent, full effect, whitening, incense, lavender, high concentration, clothing, natural, clean, laundry soap, lemon, sterilization, collar cleaning". Taking "laundry powder" as an example, the number of occurrences x of "laundry powder" in keyword set 1 (i.e., the first frequency) and the number of occurrences y of "laundry powder" in all corpora contained in the brand word segmentation corpus (i.e., the second frequency) are determined.
Step 2, determining the sub-scoring value of each historical keyword for the historical brand party according to the first frequency and the second frequency of the historical keyword for that historical brand party. In one embodiment, the more frequently a historical keyword occurs under the brand party, the higher its sub-scoring value, and the more frequently it occurs in the whole brand word segmentation corpus, the lower its sub-scoring value; that is, the sub-scoring value is positively correlated with the first frequency and negatively correlated with the second frequency. For example, referring to the schematic diagram of a scoring model shown in FIG. 3, the scoring model defines the sub-scoring value of each keyword under each brand party. Taking brand party "dip" as an example, the sub-scoring value of the keyword "dip" is 5.1, that of the keyword "laundry powder" is 2.1, and that of the keyword "laundry detergent" is 1.9.
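The construction in steps 1 to 2 can be sketched as below. The text does not give the exact formula, so the one used here (term frequency within the brand's keyword set multiplied by a logarithmic inverse corpus frequency) is an assumption chosen only because it rises with the first frequency x and falls with the second frequency y. The function name and the tiny corpus are likewise hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def build_scoring_model(
    brand_to_keywords: Dict[str, List[str]]
) -> Dict[str, Dict[str, float]]:
    """Compute a sub-scoring value for every (brand party, keyword) pair."""
    # Second frequency y: occurrences of each keyword across all keyword sets.
    corpus_counts: Counter = Counter()
    for keywords in brand_to_keywords.values():
        corpus_counts.update(keywords)
    total = sum(corpus_counts.values())

    model: Dict[str, Dict[str, float]] = {}
    for brand, keywords in brand_to_keywords.items():
        # First frequency x: occurrences of each keyword under this brand party.
        brand_counts = Counter(keywords)
        model[brand] = {}
        for keyword, x in brand_counts.items():
            y = corpus_counts[keyword]
            tf = x / len(keywords)           # grows with x
            idf = math.log(1 + total / y)    # shrinks as y grows (assumed form)
            model[brand][keyword] = tf * idf
    return model


# Hypothetical corpus, for illustration only.
corpus = {
    "dip": ["dip", "dip", "laundry powder", "natural"],
    "float": ["float", "natural", "essence"],
}
model = build_scoring_model(corpus)
```

In this corpus, "dip" scores above "natural" under brand party "dip" because it occurs more often there, and "laundry powder" scores above "natural" because it occurs less often across the corpus.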
Based on the scoring model, in subsequent application the commodity to be predicted is segmented using the word segmentation lexicon of the commodity corpus preprocessing unit, the sum of the sub-scoring values of its keywords under a given brand party is used as the criterion, and the brand party with the highest total scoring value is taken as the finally selected brand (namely, the target brand party). The embodiment of the invention also provides an implementation of calculating the total scoring values of a plurality of first candidate brand parties based on the pre-established scoring model and each description keyword, as described in (1) to (2) below:
(1) Determining, in the attribution database, the first candidate brand party corresponding to each description keyword. For example, the article name of the target article is "natural essence laundry detergent", and the description keywords ["natural", "essence", "laundry detergent"] are obtained after word segmentation. The keyword set corresponding to each brand party in the attribution database is searched for the keywords "natural", "essence" and "laundry detergent". If the keyword sets of "dip" and "float" contain these description keywords, "dip" and "float" are determined to be the first candidate brand parties.
(2) For each first candidate brand party, determining the sub-scoring value of each description keyword for the first candidate brand party based on the pre-established scoring model, and taking the sum of the sub-scoring values as the total scoring value of the first candidate brand party. With continued reference to FIG. 3, FIG. 3 shows the sub-scoring values of "natural", "essence" and "laundry detergent" for "dip" and "float". According to the scoring model, the total scoring value (denoted Score(n)) is the sum of the sub-scoring values of the segmented words that match under the brand's TFIDF model, as follows:
Score(eliminating stain) = natural: 0.3 + laundry detergent: 1.9 = 2.2;
Score(floating softness) = natural: 0.2 + essence: 0.6 = 0.8.
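The two steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the dictionary `SUB_SCORES` stands in for both the attribution database and the pre-established scoring model, its values are the ones quoted in the example, and the names `SUB_SCORES`, `first_candidates`, and `total_scores` are hypothetical.

```python
# Hypothetical stand-in for the scoring model:
# brand party -> {historical keyword: sub-scoring value}.
# The numbers are the ones quoted in the example above.
SUB_SCORES = {
    "eliminating stain": {"natural": 0.3, "laundry detergent": 1.9},
    "floating softness": {"natural": 0.2, "essence": 0.6},
}


def first_candidates(keywords, sub_scores=SUB_SCORES):
    """Step (1): brand parties whose keyword set contains a description keyword."""
    return [
        brand
        for brand, table in sub_scores.items()
        if any(k in table for k in keywords)
    ]


def total_scores(keywords, sub_scores=SUB_SCORES):
    """Step (2): per candidate, sum the sub-scoring values of matching keywords."""
    return {
        brand: sum(sub_scores[brand].get(k, 0.0) for k in keywords)
        for brand in first_candidates(keywords, sub_scores)
    }


scores = total_scores(["natural", "essence", "laundry detergent"])
```

Keywords that a brand party does not list contribute nothing to that brand's total, which is why "essence" does not affect the first total and "laundry detergent" does not affect the second.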
The embodiment of the invention builds the scoring model for the keywords, and can accurately output the target brand party to which the target object belongs under the condition that the commodity has no manufacturer identification code.
The embodiment of the invention also provides an implementation manner of determining the target brand party of the target article according to the total scoring value of each first candidate brand party, as described in (I) to (II) below:
(I) Determining a third candidate brand party from the first candidate brand parties based on the total scoring value of each first candidate brand party. In an alternative embodiment, for each first candidate brand party, it is judged whether the total scoring value of the first candidate brand party is greater than a preset threshold; if so, the first candidate brand party is determined to be a third candidate brand party. For example, the preset threshold is 0.7; since Score(eliminating stain) = 2.2 and Score(floating softness) = 0.8 are both greater than 0.7, both "eliminating stain" and "floating softness" are determined to be third candidate brand parties.
(II) Determining the third candidate brand party with the highest total scoring value as the target brand party of the target article. For example, "eliminating stain" is determined as the target brand party of the target article because Score(eliminating stain) > Score(floating softness).
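The two selection steps above amount to a threshold filter followed by an argmax. A minimal Python sketch is given below; the function name `select_target_brand` is hypothetical, and returning `None` when no candidate exceeds the threshold is an assumption made here for illustration.

```python
def select_target_brand(total_scores, threshold):
    """Return the highest-scoring brand party above the threshold, or None."""
    # Keep only candidates whose total scoring value exceeds the preset
    # threshold (the "third candidate brand parties").
    third_candidates = {b: s for b, s in total_scores.items() if s > threshold}
    if not third_candidates:
        return None  # assumption: no target brand party is output
    # Pick the third candidate with the highest total scoring value.
    return max(third_candidates, key=third_candidates.get)


target = select_target_brand(
    {"eliminating stain": 2.2, "floating softness": 0.8}, threshold=0.7
)
```

With the example values, both brand parties pass the 0.7 threshold and the one with the total of 2.2 is selected.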
For ease of understanding, the embodiment of the present invention provides an application example of the method for determining the brand party of an article. Referring to the flowchart of another method for determining the brand party of an article shown in fig. 4, the method mainly includes the following steps S402 to S418:
Step S402, loading commodity data and manufacturer identification code data of the commodity to be matched, wherein the commodity data comprises the commodity name and the commodity description text. For example, the commodity name is "orchid long-acting clean water 750g shampoo preferential package", and the manufacturer identification code data is "6903148".
Step S404, judging whether the national bar code of the commodity to be matched can be matched with a manufacturer identification code. If yes, go to step S406; if not, step S412 is performed. In one embodiment, the manufacturer identification code "6903148" is looked up in the brand word segmentation corpus; if it is found, step S406 is performed, and if it is not found, step S412 is performed.
Step S406, obtaining the brand range corresponding to the manufacturer identification code. For example, the brand range found in the brand word segmentation corpus for manufacturer identification code "6903148" includes "Lishi, piaorou, panting".
Step S408, acquiring the history keywords associated with the brand range. In one embodiment, the brand range includes a plurality of second candidate brand parties, each corresponding to a plurality of history keywords, which may include keywords that characterize the brand party to which they belong. For example, the history keywords associated with the brand range include "power, softness, panting".
Step S410, judging whether the commodity name contains the target keyword. If yes, go to step S418; if not, step S412 is performed. If the commodity name is "orchid long-acting clean water 750g shampoo preferential package", which does not contain any word matching the above history keywords, step S412 is performed. If the commodity name is "long-acting clean and smooth rinse 750g shampoo preferential package", which contains the word "soft" matching the above history keywords, step S418 is performed.
Step S412, inputting the commodity name into the word segmentation model to obtain a word segmentation list. The word segmentation model is used for carrying out word segmentation processing on commodity names, and a word segmentation list is used for displaying the description keywords in a list mode.
Step S414, score prediction is carried out on the word segmentation list through a TFIDF model, and a first candidate brand party corresponding to the commodity name and a total score value of the first candidate brand party are output.
Step S416, judging whether the total score value of the first candidate brand party is greater than a preset threshold. If so, the first candidate brand party with the highest total score value is taken as the target brand party and step S418 is performed; if not, the flow ends.
Step S418, writing the commodity to be matched and the target brand party into a business database.
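The control flow of steps S402 to S418 can be sketched as a rule-matching pass with a model-scoring fallback. The Python below is only an illustration: every table, brand name, keyword, and score in it is made up, `determine_brand` is a hypothetical name, whitespace splitting stands in for the word segmentation model, and the write to the business database in step S418 is left to the caller.

```python
# Hypothetical lookup tables; real data would come from the attribution database.
CODE_TO_BRANDS = {"6903148": ["brand A", "brand B"]}          # code -> brand range
BRAND_KEYWORDS = {"brand A": ["alpha"], "brand B": ["beta"]}  # brand -> history keywords
SUB_SCORES = {                                                # brand -> keyword -> sub-score
    "brand A": {"shampoo": 0.4, "clean": 0.5},
    "brand B": {"shampoo": 0.2},
}


def determine_brand(name, manufacturer_code, threshold=0.7, segment=str.split):
    """Return the target brand party for a commodity, or None if none qualifies."""
    # S404-S410: rule matching. Look up the brand range for the manufacturer
    # code and check whether the name contains one of its history keywords.
    for brand in CODE_TO_BRANDS.get(manufacturer_code, []):
        if any(kw in name for kw in BRAND_KEYWORDS.get(brand, [])):
            return brand

    # S412: word segmentation (whitespace split is a placeholder assumption).
    keywords = segment(name)

    # S414: sum the sub-scores of matching keywords for each brand party.
    totals = {}
    for brand, table in SUB_SCORES.items():
        matched = [table[k] for k in keywords if k in table]
        if matched:
            totals[brand] = sum(matched)

    # S416: keep the best candidate only if it exceeds the preset threshold.
    best = max(totals, key=totals.get, default=None)
    if best is not None and totals[best] > threshold:
        return best
    return None
```

Under these made-up tables, `determine_brand("beta clean shampoo", "6903148")` is resolved by rule matching, while `determine_brand("clean shampoo", "6903148")` falls through to scoring.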
In the method for determining the brand party of an article provided by the embodiment of the invention, the brand prediction flow is the flow in which the rule matching module and the model prediction module process the article description data of the commodity to be matched to obtain the target brand party to which the commodity belongs. In the overall process, the article description data of the commodity to be matched is first loaded from a commodity-related system, and the manufacturer identification code and the description keywords are then determined. The national bar code is matched against the manufacturer identification codes and the commodity name is matched against the keywords; if both conditions are met, an exact brand relationship is output. When the conditions are not met, the commodity name is subjected to word segmentation processing and scored by the TFIDF model, and the brand relationship that reaches the threshold and has the highest score is obtained. The method for determining the brand party of an article has at least the following characteristics:
(1) The automated commodity brand relationship maintenance flow can quickly match commodity brand data and reduces the labor cost of manual maintenance.
(2) The accuracy of the commodity brand relationship is improved, which also provides data support for subsequent brand cooperation operations of merchants.
(3) Erroneous data produced by manual maintenance is found in time, so that it can be corrected quickly.
Corresponding to the method for determining the brand party of an article provided by the foregoing embodiment, the embodiment of the present invention provides a device for determining the brand party of an article. Referring to the schematic structural diagram of the device shown in fig. 5, the device mainly includes the following parts:
a data acquisition module 502, configured to acquire item description data of a target item;
the word segmentation module 504 is configured to perform word segmentation processing on the item description data to obtain a plurality of description keywords if the item description data does not include the target keywords; the target keywords are used for representing target brand parties of target articles;
a scoring value calculation module 506, configured to calculate total scoring values of a plurality of first candidate brand parties based on a pre-established scoring model and each description keyword;
the brand party determination module 508 is configured to determine a target brand party of the target object according to the total scoring value of each first candidate brand party.
With the device for determining the brand party of an article provided by the embodiment of the invention, when the article description data does not contain the target keyword, the first candidate brand parties are scored by using the scoring model and the description keywords, and the target brand party is determined based on the total scoring value of each first candidate brand party, so that the target brand party to which the target article belongs is determined automatically.
In one embodiment, the article description data includes at least a national bar code and an article name; the word segmentation module 504 is further configured to: extract a designated field in the national bar code, and determine a manufacturer identification code of the target article; search a pre-established attribution database for at least one second candidate brand party matched with the manufacturer identification code, where the attribution database at least comprises a mapping relation between a historical identification code and a brand party set, and the brand party set comprises at least one historical brand party; judge whether the article name contains a target keyword corresponding to the second candidate brand party; and if not, perform word segmentation processing on the article name to obtain a plurality of description keywords.
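The extraction and lookup just described can be sketched in Python as below. Which field of the bar code is "designated" is not stated in this passage; taking the first 7 digits is purely an assumption, chosen because the example manufacturer identification code "6903148" has 7 digits. The function names and the bar code values used with them are hypothetical.

```python
def manufacturer_code(barcode: str, start: int = 0, length: int = 7) -> str:
    """Slice the designated field out of a numeric bar code string.

    The default field (first 7 digits) is an illustrative assumption.
    """
    digits = barcode.strip()
    if not digits.isdigit() or len(digits) < start + length:
        raise ValueError(f"unexpected bar code: {barcode!r}")
    return digits[start:start + length]


def second_candidates(barcode: str, code_to_brands: dict) -> list:
    """Look up the brand party set mapped to the bar code's manufacturer code."""
    return code_to_brands.get(manufacturer_code(barcode), [])
```

An empty result from `second_candidates` corresponds to the case where the bar code cannot be matched with any historical identification code, so processing falls through to word segmentation and scoring.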
In one embodiment, the scoring value calculation module 506 is further configured to: respectively determine a first candidate brand party corresponding to each description keyword in the attribution database, where the attribution database further comprises a mapping relation between the historical brand party and a keyword set, and the keyword set comprises at least one historical keyword; and for each first candidate brand party, determine the sub-scoring value of each description keyword for the first candidate brand party based on the pre-established scoring model, and take the sum of the sub-scoring values as the total scoring value of the first candidate brand party.
In one embodiment, the model building module is configured to: for each historical brand party, perform statistical processing on each historical keyword in the keyword set corresponding to the historical brand party, and determine a first frequency and a second frequency of each historical keyword for the historical brand party, where the first frequency represents the frequency with which the historical keyword occurs in the keyword set corresponding to that historical brand party, and the second frequency represents the frequency with which the historical keyword occurs across the keyword sets corresponding to all historical brand parties; and for each historical keyword, determine the sub-scoring value of the historical keyword for the historical brand party according to its first frequency and second frequency for the historical brand party.
In one embodiment, the sub-scoring value is positively correlated with the first frequency and the sub-scoring value is negatively correlated with the second frequency.
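One scoring rule with these two monotonic properties is a TF-IDF style product, sketched below in Python. The exact formula here is an assumption chosen for illustration, not necessarily the one used by the scoring model of the embodiment; `build_sub_scores` and the sample keywords are hypothetical.

```python
import math
from collections import Counter


def build_sub_scores(brand_keywords):
    """brand_keywords: brand party -> list of historical keywords (repeats allowed)."""
    n_brands = len(brand_keywords)
    # Second frequency: number of brand parties whose keyword set has the keyword.
    brand_freq = Counter(k for kws in brand_keywords.values() for k in set(kws))
    scores = {}
    for brand, kws in brand_keywords.items():
        counts = Counter(kws)
        total = len(kws)
        scores[brand] = {
            # The first factor grows with the first frequency (share of the
            # brand's own keywords); the log factor shrinks as the second
            # frequency grows. The "+ 1" keeps every score strictly positive.
            k: (c / total) * math.log((1 + n_brands) / brand_freq[k])
            for k, c in counts.items()
        }
    return scores


sample = {
    "brand A": ["detergent", "detergent", "natural"],
    "brand B": ["natural", "essence"],
}
sub_scores = build_sub_scores(sample)
```

In this sample, "natural" appears under both brand parties and therefore scores lower for each of them than a keyword that only one brand party uses.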
In one embodiment, the brand party determination module 508 is further configured to: determine a third candidate brand party from the first candidate brand parties based on the total scoring value of each first candidate brand party; and determine the third candidate brand party with the highest total scoring value as the target brand party of the target article.
In one embodiment, the brand party determination module 508 is further configured to: for each first candidate brand party, judge whether the total scoring value of the first candidate brand party is greater than a preset threshold; and if so, determine the first candidate brand party to be a third candidate brand party.
The device provided by the embodiment of the present invention has the same implementation principle and technical effects as those of the foregoing method embodiment, and for the sake of brevity, reference may be made to the corresponding content in the foregoing method embodiment where the device embodiment is not mentioned.
The embodiment of the invention provides a server, which specifically comprises a processor and a storage device; the storage means has stored thereon a computer program which, when executed by the processor, performs the method of any of the embodiments described above.
Fig. 6 is a schematic structural diagram of a server according to an embodiment of the present invention, where the server 100 includes: a processor 60, a memory 61, a bus 62 and a communication interface 63, the processor 60, the communication interface 63 and the memory 61 being connected by the bus 62; the processor 60 is arranged to execute executable modules, such as computer programs, stored in the memory 61.
The memory 61 may include a high-speed random access memory (RAM, Random Access Memory), and may further include a non-volatile memory, such as at least one magnetic disk memory. The communication connection between the system network element and at least one other network element is achieved via at least one communication interface 63 (which may be wired or wireless), and may use the internet, a wide area network, a local area network, a metropolitan area network, etc.
The bus 62 may be an ISA bus, a PCI bus, an EISA bus, or the like. Buses may be classified as address buses, data buses, control buses, etc. For ease of illustration, only one bi-directional arrow is shown in fig. 6, but this does not mean that there is only one bus or one type of bus.
The memory 61 is configured to store a program, and the processor 60 executes the program after receiving an execution instruction, and the method executed by the apparatus for flow defining disclosed in any of the foregoing embodiments of the present invention may be applied to the processor 60 or implemented by the processor 60.
The processor 60 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 60 or by instructions in the form of software. The processor 60 may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; it may also be a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. Such a processor may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be embodied directly as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 61; the processor 60 reads the information in the memory 61 and, in combination with its hardware, performs the steps of the method described above.
The computer program product of the readable storage medium provided by the embodiment of the present invention includes a computer readable storage medium storing a program code, where the program code includes instructions for executing the method described in the foregoing method embodiment, and the specific implementation may refer to the foregoing method embodiment and will not be described herein.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, an optical disk, or other various media capable of storing program code.
Finally, it should be noted that the above examples are only specific embodiments of the present invention, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing examples, those skilled in the art should understand that any person skilled in the art may, within the technical scope disclosed by the present invention, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions of some of their technical features; such modifications, changes or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (9)
1. A method of determining a brand of an article, comprising:
acquiring article description data of a target article;
if the article description data does not contain the target keywords, performing word segmentation processing on the article description data to obtain a plurality of description keywords; wherein the target keyword is used for characterizing a target brand party of the target article;
calculating total scoring values of a plurality of first candidate brand parties based on a pre-established scoring model and each description keyword;
determining a target brand party for the target item based on the total scoring value for each of the first candidate brand parties;
the step of calculating total scoring values of a plurality of first candidate brand parties based on a pre-established scoring model and each of the descriptive keywords comprises the steps of:
respectively determining a first candidate brand party corresponding to each description keyword in an attribution database; wherein the attribution database comprises a mapping relation between a historical brand party and a keyword set, and the keyword set comprises at least one historical keyword;
for each first candidate brand party, determining the sub-scoring values of the descriptive keywords for the first candidate brand party based on a pre-established scoring model, and taking the sum of the sub-scoring values as the total scoring value of the first candidate brand party.
2. The method of claim 1, wherein the article description data includes at least a national bar code and an article name;
and if the article description data does not contain the target keywords, performing word segmentation processing on the article description data to obtain a plurality of description keywords, wherein the step comprises the following steps:
extracting a specified field in the national bar code, and determining a manufacturer identification code of the target object;
searching at least one second candidate brand party matched with the manufacturer identification code in a pre-established attribution database; the attribution database at least comprises a mapping relation between a historical identification code and a brand party set, wherein the brand party set comprises at least one historical brand party;
judging whether the object name contains a target keyword corresponding to the second candidate brand party or not;
if not, the article names are subjected to word segmentation processing to obtain a plurality of description keywords.
3. The method according to claim 2, wherein the method further comprises:
for each historical brand party, carrying out statistical processing on each historical keyword in a keyword set corresponding to the historical brand party, and determining a first frequency and a second frequency of each historical keyword aiming at the historical brand party; the first frequency is used for representing the frequency of occurrence of the historical keywords in the keyword sets corresponding to the historical brand sides, and the second frequency is used for representing the frequency of occurrence of the historical keywords in the keyword sets corresponding to each historical brand side;
for each historical keyword, determining a sub-scoring value of the historical keyword for the historical brand party according to the first frequency and the second frequency of the historical keyword for the historical brand party.
4. The method of claim 3, wherein the sub-scoring value is positively correlated with the first frequency and the sub-scoring value is negatively correlated with the second frequency.
5. The method of claim 1, wherein the step of determining a target brand party for the target item based on the total scoring value for each of the first candidate brand parties comprises:
determining a third candidate brand party from the first candidate brand parties based on the total scoring value of each of the first candidate brand parties;
and determining the third candidate brand party with the highest scoring value as the target brand party of the target object.
6. The method of claim 5, wherein the step of determining a third candidate brand party from the first candidate brand parties based on the total scoring value for each of the first candidate brand parties comprises:
for each first candidate brand party, judging whether the total scoring value of the first candidate brand party is larger than a preset threshold value;
if so, the first candidate brand party is determined to be a third candidate brand party.
7. A device for determining a brand of an article, comprising:
the data acquisition module is used for acquiring article description data of the target article;
the word segmentation module is used for carrying out word segmentation processing on the article description data to obtain a plurality of description keywords if the article description data does not contain the target keywords; wherein the target keyword is used for characterizing a target brand party of the target article;
the scoring value calculation module is used for calculating total scoring values of a plurality of first candidate brand parties based on a pre-established scoring model and each description keyword;
a brand party determination module configured to determine a target brand party of the target item based on the total scoring value for each of the first candidate brand parties;
the scoring value calculation module is further configured to:
respectively determining a first candidate brand party corresponding to each description keyword in an attribution database; wherein the attribution database further comprises a mapping relation between a historical brand party and a keyword set, and the keyword set comprises at least one historical keyword;
for each first candidate brand party, determining the sub-scoring values of the descriptive keywords for the first candidate brand party based on a pre-established scoring model, and taking the sum of the sub-scoring values as the total scoring value of the first candidate brand party.
8. A server comprising a processor and a memory, the memory storing computer executable instructions executable by the processor, the processor executing the computer executable instructions to implement the method of any one of claims 1 to 6.
9. A computer readable storage medium storing computer executable instructions which, when invoked and executed by a processor, cause the processor to implement the method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111140101.5A CN113836916B (en) | 2021-09-28 | 2021-09-28 | Method, device and server for determining brand party of article |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113836916A CN113836916A (en) | 2021-12-24 |
CN113836916B true CN113836916B (en) | 2023-06-20 |
Family
ID=78970784
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111140101.5A Active CN113836916B (en) | 2021-09-28 | 2021-09-28 | Method, device and server for determining brand party of article |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113836916B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN113836916A (en) | 2021-12-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||