CN110135769B - Goods attribute filling method and device, storage medium and electronic terminal - Google Patents
- Publication number
- CN110135769B CN110135769B CN201810104645.8A CN201810104645A CN110135769B CN 110135769 B CN110135769 B CN 110135769B CN 201810104645 A CN201810104645 A CN 201810104645A CN 110135769 B CN110135769 B CN 110135769B
- Authority
- CN
- China
- Prior art keywords
- goods
- filled
- attribute
- filling
- item
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/08—Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
- G06Q10/087—Inventory or stock management, e.g. order filling, procurement or balancing against orders
- G06Q10/0875—Itemisation or classification of parts, supplies or services, e.g. bill of materials
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The disclosure relates to the field of computer technology, and in particular to an item attribute filling method and apparatus, a storage medium, and an electronic terminal. The method comprises the following steps: acquiring typical items from the items to be filled and acquiring the attribute values of the typical items as basic data; acquiring the description information of the items to be filled and generating a feature matrix from that description information; training on the basic data with a machine learning algorithm and generating a classifier model; and acquiring the attribute values of the items to be filled using the classifier model and the feature matrix, then automatically filling them in. By training the classifier model on the complete attribute values of a subset of typical items, the method enables the classifier model to adapt to the different attributes of each item, thereby ensuring the accuracy of item attribute filling.
Description
Technical Field
The disclosure relates to the field of computer technology, and in particular to an item attribute filling method and apparatus, a storage medium, and an electronic terminal.
Background
For an e-commerce platform, classifying and managing commodities, and letting users filter them, both rely on the attribute information of the commodities, so managing this attribute information is critical. For example, the attributes of a television may include its screen size, brand, and place of origin; those of a shirt may include color, style, size, and material. Besides these regular attributes, new attribute information may also need to be added when a new item is introduced.
In the conventional approach, every basic attribute of an item is filled in manually. This is feasible only when the number of items is small; once the items reach a certain order of magnitude, the time and labor cost of manual filling becomes huge and working efficiency drops. Some existing schemes use filling algorithms to fill in item attributes automatically, but they still have defects. Such algorithms require different filling rules to be designed for different attributes, lack generality, and cannot effectively guarantee the filling rate. For the "screen size" attribute, for example, a value domain such as {4 inches, 4.5 inches, 4.8 inches, 5.2 inches, ...} must be set in advance; when a new size of "5.9 inches" appears, it cannot be filled in unless "5.9 inches" was already added to the value domain. For a color attribute, if the product information is "red pepper hua color mobile phone, black, 5 inches", several keywords in the product information relate to "color", so the filling algorithm may fail to pick out the correct value "black" and mismatches can occur. Meanwhile, attributes the algorithm cannot fill must still be filled manually, which lowers both the filling rate and the filling efficiency.
It should be noted that the information disclosed in the above background section is only for enhancing understanding of the background of the present disclosure and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
It is an object of the present disclosure to provide an item property filling method, an item property filling apparatus, a storage medium, and an electronic terminal, which overcome, at least in part, one or more of the problems due to the limitations and disadvantages of the related art.
Other features and advantages of the present disclosure will be apparent from the following detailed description, or may be learned in part by the practice of the disclosure.
According to a first aspect of the present disclosure, there is provided an item attribute filling method comprising:
acquiring typical items from the items to be filled and acquiring the attribute values of the typical items as basic data;
acquiring the description information of the items to be filled and generating a feature matrix from that description information;
training on the basic data with a machine learning algorithm and generating a classifier model; and
acquiring the attribute values of the items to be filled using the classifier model and the feature matrix, and automatically filling them in.
In one exemplary embodiment of the present disclosure, acquiring typical items from the items to be filled comprises:
acquiring the items to be filled and extracting feature fields from their description information;
clustering the items to be filled according to the extracted feature fields; and
randomly extracting one or more items to be filled from each class of the clustering result as typical items.
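The three steps above can be sketched in miniature. This is an illustrative toy, not the patent's implementation: the clustering algorithm is unspecified in the text, so a greedy grouping by Jaccard similarity of feature-field sets stands in for it, and all item names and fields are invented.

```python
import random

def jaccard(a, b):
    """Jaccard similarity between two sets of feature fields."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_items(items, threshold=0.5):
    """Greedy clustering: an item joins the first cluster whose first
    member shares enough feature fields with it, else starts a new cluster."""
    clusters = []  # each cluster is a list of (item_id, fields) pairs
    for item_id, fields in items:
        for cluster in clusters:
            if jaccard(fields, cluster[0][1]) >= threshold:
                cluster.append((item_id, fields))
                break
        else:
            clusters.append([(item_id, fields)])
    return clusters

def pick_typical(clusters, k=1, seed=0):
    """Randomly draw up to k typical items per cluster."""
    rng = random.Random(seed)
    return [rng.sample(c, min(k, len(c))) for c in clusters]

items = [
    ("sku1", {"tea seed oil", "1.1L", "pressed"}),
    ("sku2", {"tea seed oil", "1.8L", "pressed"}),
    ("sku3", {"shirt", "cotton", "red"}),
]
clusters = cluster_items(items)
typical = pick_typical(clusters)
```

The typical items drawn this way would then be labelled with complete attribute values (manually or semi-automatically) to form the basic data.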
In an exemplary embodiment of the present disclosure, extracting the feature fields from the description information of the items to be filled comprises:
performing word segmentation on the description information of the items to be filled; and
extracting the feature fields of the items to be filled with a word2vec model from the segments obtained by the word segmentation.
In an exemplary embodiment of the present disclosure, when generating the feature matrix from the description information of the items to be filled, the method further comprises:
judging whether a filling quantity m has been received; and
setting the number of categories for clustering the items to be filled to m when the filling quantity m is judged to have been received, where m > 0.
In an exemplary embodiment of the present disclosure, generating the feature matrix from the description information of the items to be filled comprises:
performing word segmentation on the description information of the items to be filled to obtain segments;
extracting feature fields from the obtained segments with a preset model; and
vectorizing and sparsifying the extracted feature fields respectively to generate a feature matrix.
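A minimal sketch of the vectorization and sparsification steps, under stated assumptions: the patent's Chinese word segmentation is replaced by whitespace tokenisation, and the sparse representation is a plain dict of column index to count rather than any particular sparse-matrix library.

```python
def tokenize(description):
    """Very rough word-segmentation stand-in: split on whitespace.
    (The patent describes Chinese word segmentation; this is illustrative only.)"""
    return description.lower().split()

def build_vocab(descriptions):
    """Assign each distinct segment a column index in the feature matrix."""
    vocab = {}
    for d in descriptions:
        for tok in tokenize(d):
            vocab.setdefault(tok, len(vocab))
    return vocab

def to_sparse_rows(descriptions, vocab):
    """Each row is a {column_index: count} dict, a simple sparse encoding
    that skips the zeros a dense feature matrix would store."""
    rows = []
    for d in descriptions:
        row = {}
        for tok in tokenize(d):
            j = vocab[tok]
            row[j] = row.get(j, 0) + 1
        rows.append(row)
    return rows

docs = ["tea seed oil 1.1L pressed", "tea seed oil 1.8L bottled"]
vocab = build_vocab(docs)
matrix = to_sparse_rows(docs, vocab)
```

A production system would more likely use a TF-IDF or word2vec vectorizer and a compressed sparse format, but the shape of the data is the same.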
In one exemplary embodiment of the present disclosure, training the basic data with a machine learning algorithm and generating a classifier model comprises:
training the basic data with n machine learning algorithms respectively and obtaining n corresponding classifier models, where n > 0;
evaluating, by cross-validation, the accuracy with which each classifier model fills in the attribute values of the items to be filled; and
calculating, from each classifier model's filling accuracy for each attribute value, the weight of each classifier model for filling in the attributes of the items to be filled,
where w_c is the weight of classifier model c, n is the number of classifier models, and Accuracy_cv is the accuracy obtained by cross-validation.
In an exemplary embodiment of the present disclosure, obtaining the attribute values of the items to be filled using the classifier model and the feature matrix comprises:
selecting an output result of a classifier model as the attribute value of the item to be filled according to the weight each classifier model carries for filling in each attribute of the item.
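The weight-based selection just described amounts to a weighted vote over the candidate values. A minimal sketch, with invented classifier names and weights (the exact selection rule is not spelled out in the text, so summing the weight behind each candidate value is an assumption):

```python
from collections import defaultdict

def select_attribute_value(predictions, weights):
    """predictions: {classifier_name: predicted_value};
    weights: {classifier_name: weight}.
    Sums the weight behind each candidate value and returns the winner."""
    score = defaultdict(float)
    for name, value in predictions.items():
        score[value] += weights.get(name, 0.0)
    return max(score, key=score.get)

preds = {"svm": "black", "nb": "black", "rf": "red", "mlp": "black", "knn": "red"}
w = {"svm": 0.25, "nb": 0.2, "rf": 0.2, "mlp": 0.2, "knn": 0.15}
best = select_attribute_value(preds, w)
```

Here "black" accumulates 0.65 of the weight against 0.35 for "red", so it is selected.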
In an exemplary embodiment of the present disclosure, the method further comprises:
calculating, with an automatic dictionary model, a relationship score between each attribute value in the basic data and each item to be filled, according to the basic data and the description information of each item to be filled;
calculating the confidence score S of the attribute value corresponding to the current attribute of the item to be filled from the relationship score computed by the automatic dictionary model and the weights of the n classifier models,
where C is the set of classifier models; f_c(i) ∈ {0, 1} indicates whether classifier model c fills in the attribute value for the current item to be filled; w_c ∈ [0, 1] is the weight of classifier c for filling in the attribute value at the current time; d(i) ∈ [0, 1] is the relationship score computed by the automatic dictionary model; and w_d ∈ [0, 1] is the weight of the automatic dictionary model;
judging whether the confidence score of the current attribute of the item to be filled is greater than a preset accuracy threshold; and
outputting the attribute value of the current attribute when the confidence score is judged to be greater than that threshold.
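The patent's confidence formula is not reproduced in the text above, so the following sketch assumes one natural reading of the variable descriptions: a weighted linear combination S = sum over c of w_c * f_c(i), plus w_d * d(i). All numbers are invented; the patent's exact formula may differ.

```python
def confidence_score(fills, weights, dict_score, dict_weight):
    """fills: {classifier: 0 or 1}, whether classifier c filled this value.
    weights: {classifier: w_c in [0, 1]}.
    dict_score: d(i) in [0, 1], from the automatic dictionary model.
    Assumed form: S = sum_c w_c * f_c + w_d * d  (a weighted sum; this is
    an assumption, not the patent's verbatim formula)."""
    s = sum(weights[c] * fills[c] for c in fills)
    return s + dict_weight * dict_score

def fill_if_confident(score, threshold):
    """Output the value only when confidence exceeds the preset accuracy."""
    return score > threshold

s = confidence_score(
    fills={"svm": 1, "nb": 1, "rf": 0},
    weights={"svm": 0.4, "nb": 0.35, "rf": 0.25},
    dict_score=0.8,
    dict_weight=0.5,
)
ok = fill_if_confident(s, threshold=0.9)
```

With these toy numbers S = 0.4 + 0.35 + 0.5 * 0.8 = 1.15, which clears the 0.9 threshold, so the value would be written out.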
In an exemplary embodiment of the present disclosure, the method further comprises:
and adding the output attribute value to the basic data.
In an exemplary embodiment of the present disclosure, the method further comprises:
and when the confidence score of the current attribute of the item to be filled is judged to be smaller than the preset accuracy threshold, retaining the item to be filled and waiting for the filling of the next attribute.
In an exemplary embodiment of the disclosure, calculating, with the automatic dictionary model, the relationship score between each attribute value in the basic data and each item to be filled comprises:
establishing the existing attribute value domain corresponding to each attribute from the basic data;
performing word segmentation on the elements in the existing attribute value domain with a jieba word-segmentation model to obtain a first segment set;
splitting and filtering the elements according to a first preset rule to obtain a second segment list;
merging and filtering the first segment set and the second segment list according to a second preset rule to obtain a third segment set;
establishing a mapping relation table between the elements and the third segment set;
traversing each element of the attribute value domain and adding the mapping relation established for each element to the mapping relation table; and
searching the mapping relation table according to the feature fields in the description information of the item to be filled and generating a prediction result.
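The mapping-table steps can be sketched as follows. This is a simplified stand-in: the jieba segmentation and the two preset splitting/merging rules the text mentions are collapsed into one regex-based splitter, and the value domain is invented for illustration.

```python
import re

def segment(value):
    """Stand-in for the jieba segmentation described in the text: split an
    attribute value on whitespace and punctuation (illustrative only)."""
    return [t for t in re.split(r"[\s,/;]+", value) if t]

def build_mapping(value_domain):
    """Map every segment of every known attribute value back to the value(s)
    containing it, mirroring the element-to-segment mapping relation table."""
    mapping = {}
    for value in value_domain:
        for seg in segment(value):
            mapping.setdefault(seg, set()).add(value)
    return mapping

def predict(feature_fields, mapping):
    """Look the item's feature fields up in the mapping table and return the
    candidate attribute values they point to."""
    hits = set()
    for field in feature_fields:
        hits |= mapping.get(field, set())
    return hits

domain = ["tea seed oil", "olive oil", "peanut oil"]
mapping = build_mapping(domain)
candidates = predict(["tea", "1.1L"], mapping)
```

An unmatched field such as "1.1L" simply contributes nothing, while "tea" maps back to the one value containing it.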
In one exemplary embodiment of the present disclosure, the machine learning algorithm includes:
Support vector machine algorithm, polynomial naive Bayes algorithm, multi-layer perceptron algorithm, random forest algorithm and K-nearest neighbor algorithm.
In one exemplary embodiment of the present disclosure, the unit of an item is a commodity, a single product, or a SKU.
According to a second aspect of the present disclosure, there is provided an article attribute filling apparatus including:
The typical data acquisition module is used for acquiring typical goods from goods to be filled and acquiring attribute values of the typical goods as basic data;
the feature matrix generation module is used for acquiring the description information of the goods to be filled and generating a feature matrix according to the description information of the goods to be filled;
The classifier model generation module is used for training the basic data by utilizing a machine learning algorithm and generating a classifier model;
And the attribute filling execution module is used for acquiring the attribute value of the goods to be filled by utilizing the classifier model and the feature matrix, and automatically filling the attribute value of the goods to be filled according to the attribute value.
According to a third aspect of the present disclosure, there is provided a storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described item attribute filling method.
According to a fourth aspect of the present disclosure, there is provided an electronic terminal comprising:
A processor; and
A memory for storing executable instructions of the processor;
wherein the processor is configured to perform the following via execution of the executable instructions:
acquiring typical items from the items to be filled and acquiring the attribute values of the typical items as basic data;
acquiring the description information of the items to be filled and generating a feature matrix from that description information;
training on the basic data with a machine learning algorithm and generating a classifier model; and
acquiring the attribute values of the items to be filled using the classifier model and the feature matrix, and automatically filling them in.
According to the item attribute filling method provided by the embodiments of the disclosure, typical items whose attribute data is completely filled in are first obtained as basic data; a feature matrix is generated from the basic item information of all items; a classifier is trained on the basic data with a machine learning algorithm; and finally the generated classifier and the feature matrix are used to automatically fill in the attribute values of the items to be filled. By training the classifier on the complete attribute data of a subset of typical items, the computer can automatically learn the filling rules of already-filled items, so the classifier model adapts to the different attributes of each item and the learned rules are reused on similar items, thereby ensuring the accuracy of attribute filling. In addition, labor cost can be greatly reduced.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure. It will be apparent to those of ordinary skill in the art that the drawings in the following description are merely examples of the disclosure and that other drawings may be derived from them without undue effort.
FIG. 1 schematically illustrates a schematic diagram of an item property filling method in an exemplary embodiment of the present disclosure;
FIG. 2 schematically illustrates a method of acquiring typical item attribute data in an exemplary embodiment of the present disclosure;
FIG. 3 schematically illustrates a method of generating a feature matrix for an item to be filled in an exemplary embodiment of the present disclosure;
FIG. 4 schematically illustrates a composition diagram of an item property filling apparatus in an exemplary embodiment of the present disclosure;
FIG. 5 schematically illustrates another schematic diagram of another article property filling apparatus in an exemplary embodiment of the present disclosure;
Fig. 6 schematically illustrates yet another schematic diagram of yet another article property filling apparatus in an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Embodiments of the invention first provide an item attribute filling method, which can be applied in fields such as e-commerce platforms and warehouse management to automatically fill in item attributes, particularly when the number of items is large. For example, the attributes of a television set may include screen size, brand, place of production, and so on; the attributes of a shirt may include color, style, material, and so on. Referring to FIG. 1, the item attribute filling method may include the following steps:
step S1, acquiring typical items from the items to be filled and acquiring the attribute values of the typical items as basic data;
step S2, acquiring the description information of the items to be filled and generating a feature matrix from that description information;
step S3, training the feature matrix against the basic data with a machine learning algorithm and generating a classifier model;
step S4, acquiring the attribute values of the items to be filled using the classifier model and the feature matrix, and automatically filling them in.
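Steps S1 through S4 can be sketched end to end in miniature. This is a toy illustration, not the patent's implementation: word segmentation is replaced by whitespace tokenisation, and the machine-learning classifier of step S3 is replaced by a 1-nearest-neighbour lookup over the labelled typical items; all names and data are invented.

```python
def tokens(desc):
    """Crude stand-in for the word segmentation of step S2."""
    return set(desc.lower().split())

def similarity(a, b):
    """Jaccard similarity between two token sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def train(typical):
    """Steps S1 and S3 collapsed: the 'model' is just the typical items with
    their known attribute values; 1-nearest-neighbour over token sets stands
    in for the machine-learning algorithms named in the text."""
    return [(tokens(desc), value) for desc, value in typical]

def fill(model, desc):
    """Step S4: predict an unfilled item's attribute value from the most
    similar typical item."""
    feats = tokens(desc)
    return max(model, key=lambda tv: similarity(feats, tv[0]))[1]

typical = [("tea seed oil 1.1L pressed", "edible oil"),
           ("cotton shirt red size L", "clothing")]
model = train(typical)
category = fill(model, "tea seed oil 1.8L bottled")
```

The unseen "1.8L" item shares most of its tokens with the first typical item, so it inherits that item's attribute value.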
According to the item attribute filling method provided by this example embodiment, the classifier is trained with the attribute data of a subset of typical items, so that the computer can automatically learn the filling rules of already-filled items, the classifier model can adapt to the different attributes of each item, and the learned rules are reused on similar items, thereby ensuring the accuracy of attribute filling. In addition, labor cost can be greatly reduced.
Hereinafter, each step in the method for filling the property of the article in the present exemplary embodiment will be described in more detail with reference to the accompanying drawings and examples.
Step S1, acquiring a typical goods from goods to be filled and acquiring attribute values of the typical goods as basic data.
In this example embodiment, referring to fig. 2, a method of acquiring a typical item from items to be filled may include:
Step S11, obtaining an article to be filled and extracting a characteristic field according to the description information of the article to be filled;
step S12, clustering the goods to be filled according to the extracted characteristic fields;
step S13, randomly extracting one or more goods to be filled in from each class of the clustering result as typical goods;
And step S14, determining attribute values of the various attributes of the typical goods.
An item to be filled, in this embodiment of the disclosure, is an item whose attribute values have not been filled in. For all such items, feature fields can be extracted from their description information and a feature matrix generated from the extracted fields. The items are then classified with a clustering algorithm, and part of the items in each category are randomly selected as typical items. The attribute values of all attributes of the typical items can be filled in accurately, either manually or by automatic filling with manual assistance, and these attribute values are used as the basic data.
From the clustering result, an item clustering result list can be generated and one item randomly extracted from each class of the list as a typical item, minimizing the amount of computation while still ensuring the accuracy of subsequent attribute filling. Alternatively, to ensure the validity of the basic data, several items may be randomly extracted as typical items in each category, for example 2 or 3 per category.
In this example embodiment, the extracting the feature field according to the description information of the to-be-filled item may include:
step S111, word segmentation processing is carried out on the description information of the goods to be filled;
and step S112, extracting the characteristic fields of the goods to be filled by using a word2vec model according to the segmented words obtained after the segmentation processing.
The word2vec model maps each word, according to its context in the text, into a common coordinate system, forming a large matrix in which the relationships between words are reflected. These relationships are obtained from context, which has a front-to-back order. word2vec also employs Huffman coding (for its hierarchical softmax) and down-weights overly frequent words, so it performs well on similar words and on word expansion. The feature fields in the item description information can therefore be extracted accurately with a word2vec model.
For example, if the description information of an item reads "25-degree camellia oil edible oil, first-grade pressed tea seed oil, bottled, 1.1L", the extracted feature fields may include: tea seed oil, 1.1L, 25 degrees, bottled, pressed, and so on.
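The example above can be reproduced with a crude extraction sketch. This is not the word2vec-based extraction the text describes; a regex tokeniser plus a stopword filter stands in for it, and the stopword list is invented.

```python
import re

STOPWORDS = {"the", "of", "is", "in", "a"}

def extract_feature_fields(description):
    """Keep tokens that look informative (units, sizes, product words) and
    drop punctuation and stopwords. A real system would rank candidates
    with a word2vec model as the text describes; this is a crude stand-in."""
    tokens = re.findall(r"[A-Za-z0-9.]+", description.lower())
    return [t for t in tokens if t not in STOPWORDS]

fields = extract_feature_fields(
    "25-degree camellia oil edible oil, first-grade pressed tea seed oil, bottled, 1.1L"
)
```

Punctuation is discarded by the tokeniser, while size and unit tokens such as "1.1l" survive as feature fields.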
Based on the above, in this exemplary embodiment, when generating the feature matrix according to the description information of the item to be filled, the method further includes:
Step 12-1, judging whether a filling quantity m is received;
Step 12-2, setting the category number of the to-be-filled item clusters as m when judging that the filling number m is received; wherein m >0.
Before clustering the items to be filled, it can be judged whether a quantity m entered by the user has been received. If so, the number of cluster categories can be set to m according to the user's requirement, and the quantity of items filled in each pass can be set at the same time, so that the number of items filled at once is controlled during attribute filling.
Step S2, the description information of the goods to be filled is obtained, and a feature matrix is generated according to the description information of the goods to be filled.
In this exemplary embodiment, referring to fig. 3, the step S2 may specifically include:
Step S21, performing word segmentation processing on the description information of the goods to be filled to obtain segmented words;
s22, extracting feature fields of the obtained segmented words by using a preset model;
step S23, vectorization processing and sparsification processing are respectively carried out on the extracted characteristic fields to generate a characteristic matrix.
After the attribute values of the typical items are determined, Chinese word segmentation can be performed on the description information of all items to be filled and feature fields extracted, removing punctuation marks, erroneous characters, special symbols, and the like from the description information. The feature fields can be extracted with a word2vec model or another model. The feature fields are then sparsified and vectorized respectively to obtain the feature matrix of the items to be filled. In other exemplary embodiments of the disclosure, feature-field extraction may be performed on the segment sets of all items (the items to be filled, the typical items, and other items that already have complete attribute values) to enlarge the feature matrix and further improve its effectiveness.
The sparsification can be completed by a sparsification model, which effectively removes redundant data; sparse, group-sparse, tree-sparse, or graph-sparse models may be used. Vectorizing the segments effectively increases the data processing speed and thus the working efficiency.
And step S3, training the feature matrix according to the basic data by using a machine learning algorithm and generating a classifier model.
In this example embodiment, the attribute data of the typical goods selected from the goods to be filled as described above may be used as a training set to train and generate a classifier model, and the obtained classifier model is used to automatically fill in the attribute values of the remaining goods to be filled.
For example, the machine learning algorithm may employ an algorithm model such as a support vector machine algorithm, a polynomial naive bayes algorithm, a multi-layer perceptron algorithm, a random forest algorithm, or a K-nearest neighbor algorithm.
Further, in other exemplary embodiments of the present disclosure, the step S3 may further include:
step S31, training the basic data by using n machine learning algorithms respectively and obtaining n corresponding classifier models; wherein n >0;
Step S32, evaluating the filling accuracy of the attribute values of the to-be-filled goods attribute by each classifier model through cross verification;
Step S33, calculating the weight of each classifier model for filling the attribute of the goods to be filled according to the accuracy of each classifier model for filling the attribute value; wherein,
w_c is the weight of the classifier model c; n is the number of classifier models; and Accuracy_cv is the accuracy obtained by cross-verification, computed from the confusion counts as Accuracy = (TP + TN) / (TP + TN + FP + FN).
In the above formula, TP represents that the true value is positive and the predicted value is also positive; TN represents that the true value is negative and the predicted value is negative; FP represents that the true value is negative and the predicted value is positive; FN indicates that the true value is positive and the predicted value is negative.
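The accuracy and weight computation of steps S32 and S33 might look like the following sketch. Normalizing each model's cross-validation accuracy over the sum of all models' accuracies is one plausible reading of the weight formula, which is an assumption here, not necessarily the exact formula of this disclosure:

```python
# Accuracy from the confusion counts defined above.
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

# Assumed weighting scheme: each model's cross-validation accuracy
# normalized over all n models, so the weights sum to 1.
def model_weights(cv_accuracies):
    total = sum(cv_accuracies.values())
    return {c: a / total for c, a in cv_accuracies.items()}

acc = accuracy(tp=40, tn=40, fp=10, fn=10)            # (40 + 40) / 100 = 0.8
weights = model_weights({"svm": 0.9, "random_forest": 0.6})
```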
In this exemplary embodiment, when filling in an attribute value of a certain attribute of a good, in order to avoid the low filling accuracy that may result from using a single classifier model, multiple machine learning algorithms may be used to train the basic data and generate multiple classifier models at the same time, for example, the above-mentioned support vector machine algorithm, polynomial naive Bayes algorithm, multi-layer perceptron algorithm, random forest algorithm, K-nearest neighbor algorithm, or other algorithms. The attribute values of the current attribute are then automatically filled in by each of these classifier models. For example, 3, 4, or 5 machine learning algorithms may be selected simultaneously for training to generate 3, 4, or 5 classifier models.
Because different classifier models often fill in the same attribute value with different effectiveness, and even the same classifier may produce different results for a single attribute across multiple fillings, a plurality of classifier models may be made to fill in the current attribute value in order to obtain a more accurate result. The accuracy of each classifier model in filling the attribute value of the current attribute is evaluated by cross-validation, and the result corresponding to the classifier model with the highest accuracy is selected and output. Alternatively, the evaluation results may be converted into a weight for each classifier model when filling each attribute, and the attribute value may be filled according to the weight value of each classifier model for the current attribute.
For example, for an item "shirt," the attributes that need to be filled include: color, brand, material, place of origin, price, collar type, cuff type, and so on. When the attribute values are automatically filled, the five machine learning algorithms above may be used to obtain five corresponding classifier models, the accuracy of each of the five classifier models in filling each attribute is evaluated, and the result corresponding to the classifier model with the highest accuracy is selected and output as the attribute value filling result. Alternatively, the weight value of each classifier model for the current attribute may be calculated according to the accuracy evaluation results of the five classifier models for each attribute; then, for the current attribute, the results of the classifiers are weighted according to these weights to obtain the final result. For example, suppose that for the attribute "collar type" the accuracy of the random forest algorithm model is calculated to be twice that of the multi-layer perceptron algorithm model. When the "collar type" attribute of a certain shirt is filled, if the result obtained by the random forest model is "stand collar" and the result obtained by the multi-layer perceptron model is "round collar," then in the final weighting the weighted vote received by the filling result "stand collar" is twice that received by "round collar."
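The weighted voting in the shirt example can be sketched as follows; the model names and the 2:1 weights are taken from the "collar type" example above:

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Each model's filled value receives that model's weight;
    the value with the highest total weight wins."""
    tally = defaultdict(float)
    for model, value in predictions.items():
        tally[value] += weights[model]
    return max(tally, key=tally.get)

# "Collar type" example: the random forest model's weight is twice the MLP's.
weights = {"random_forest": 2.0, "mlp": 1.0}
predictions = {"random_forest": "stand collar", "mlp": "round collar"}
winner = weighted_vote(predictions, weights)   # "stand collar"
```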
Or, when the attribute "color" is filled, if the accuracy of the classifier model corresponding to the support vector machine algorithm is highest, the weight w_c1 of the classifier model C_1 corresponding to the support vector machine algorithm is the largest, which means that the contribution of that classifier model to the prediction result is the largest when the "color" attribute is filled.
If the weights of the classifier models filling in a certain attribute value are the same, the classifier models have the same influence on the final output result, which is equivalent to each classifier model casting a vote of equal weight for the prediction result it fills in for each attribute.
And S4, acquiring the attribute value of the goods to be filled by using the classifier model and the feature matrix, and automatically filling the attribute value of the goods to be filled according to the attribute value.
Preferably, based on the foregoing, in the present exemplary embodiment, when the attribute values of the goods are automatically filled in by using the classifier model and the feature matrix, the method may further include:
Step S41, calculating the relation score between each attribute value in the basic data and each item to be filled by using an automatic dictionary model according to the basic data and the description information of each item to be filled;
Step S42, calculating a confidence score S of the attribute value corresponding to the current attribute of the goods to be filled according to the relation score calculated by the automatic dictionary model and the weights of the n classifier models:
Wherein C is the set of classifier models; f_c(i) ∈ {0,1} indicates whether the classifier model c fills in the attribute value for the current item to be filled; w_c ∈ [0,1] represents the weight of the classifier model c for the attribute value filled in at present; d(i) ∈ [0,1] represents the relationship score calculated by the automatic dictionary model; and w_d ∈ [0,1] represents the weight of the automatic dictionary model;
Step S43, judging whether the confidence score of the current attribute of the goods to be filled is larger than a preset accuracy;
Step S44, outputting the attribute value of the current attribute when the confidence score is judged to be larger than the accuracy.
In this embodiment, after the weight value automatically filled in by each classifier model for each attribute is obtained, the relationship score between each attribute value in the basic data and each item to be filled in may be calculated according to the automatic dictionary model, and whether to output the attribute value of the current attribute of the item obtained by the classifier model is determined by using the relationship score and the weight value of each classifier model.
For example, when the confidence score of the current attribute value is determined to be greater than the preset accuracy, the attribute value of the current attribute of the article may be output; if the confidence score is less than the preset accuracy, the attribute value is not output, and the goods are returned to the set of goods to be filled to await the next round of filling. The preset accuracy can be set when the user sets the cluster quantity of the goods to be filled, or can be set independently. The present disclosure is not particularly limited thereto.
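A minimal sketch of the confidence check in steps S41 to S44. It assumes the confidence score is the sum of the weights of the classifiers that filled the value plus the dictionary relation score times its weight, which is one reading of the symbols defined for step S42; the numbers and the 0.7 threshold are illustrative:

```python
# Confidence score S for one candidate attribute value (step S42), under the
# assumed reading: sum of w_c over classifiers with f_c(i) = 1, plus d(i)*w_d.
def confidence_score(filled, weights, dict_score, dict_weight):
    s = sum(w for model, w in weights.items() if filled.get(model, 0) == 1)
    return s + dict_score * dict_weight

score = confidence_score(
    filled={"svm": 1, "random_forest": 0},        # f_c(i)
    weights={"svm": 0.6, "random_forest": 0.4},   # w_c
    dict_score=0.5,                               # d(i)
    dict_weight=0.4,                              # w_d
)
# Steps S43/S44: output only when the score exceeds the preset accuracy.
decision = "output" if score > 0.7 else "defer"
```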
In this exemplary embodiment, calculating, using an automatic dictionary model, the relationship score between each attribute value in the basic data and each item to be filled according to the basic data and the description information of each item to be filled includes:
step S301, an existing attribute value field corresponding to each attribute is established according to the basic data;
Step S302, performing word segmentation processing on the elements in the existing attribute value domain by using a jieba word segmentation model so as to obtain a first word segmentation set;
Step S303, splitting and filtering the elements according to a first preset rule so as to obtain a second word segmentation list;
Step S304, merging and filtering the first word segmentation set and the second word segmentation list according to a second preset rule so as to obtain a third word segmentation set;
Step S305, establishing a mapping relation table between the elements and the third word segmentation set;
Traversing each element of the attribute value field and adding a mapping relation established by each element into the mapping relation table;
And step S306, searching and generating a prediction result in the mapping relation table according to the characteristic field in the description information of the goods to be filled.
For example, for the capacity attribute, the existing basic data includes: 1.1-3L, 0.6-1L, 0-0.5L, others, more than 5L, 3.1-5L, 0.5-1L, 1-3L, unlimited, 3-5L. These data make up the existing capacity attribute value field attVals. Each element of the capacity attribute value field is then split using the jieba word segmentation model. For example, "1.1-3L" yields the first word segmentation set attValToks after splitting: ['1.1', '-', '3', 'L'].
For "1.1-3L", list("1.1-3L") is used, which after splitting yields singleChars: ['1', '.', '1', '-', '3', 'L']. Then, using a regular expression, the strings beginning with letters or digits in singleChars are removed, so as to obtain the second word segmentation list filteredSingleChars: ['.', '-']. Merging the first word segmentation set with the second word segmentation list gives a new segmentation set: ['1.1', '-', '3', 'L', '.']. Removing the Chinese and English punctuation from this set yields the third word segmentation set: ['1.1', '3', 'L']. At this point, a mapping relation list between the original element "1.1-3L" and the third word segmentation set is established, attValLookup: {'1.1': ['1.1-3L'], '3': ['1.1-3L'], 'L': ['1.1-3L']}.
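The construction of the mapping relation list can be sketched as below. A simple regular expression stands in for the jieba segmentation plus the splitting and filtering of steps S302 to S305, so the tokenization is only an approximation, and the attribute values are hypothetical stand-ins for the capacity field:

```python
import re

# Stand-in tokenizer: keep numbers (with decimals) and word tokens,
# drop punctuation, approximating the third word segmentation set.
def tokenize(element):
    return re.findall(r"\d+(?:\.\d+)?|[A-Za-z]+", element)

# Step S305: mapping table from each token to the original attribute values.
def build_lookup(attribute_values):
    lookup = {}
    for val in attribute_values:
        for tok in tokenize(val):
            lookup.setdefault(tok, []).append(val)
    return lookup

# Hypothetical capacity attribute value field.
att_vals = ["1.1-3L", "0.6-1L", "0-0.5L", "over 5L", "3.1-5L", "0.5-1L", "1-3L"]
lookup = build_lookup(att_vals)
```

Here `lookup["1"]` collects every value whose segmentation contains the token "1", mirroring the example in the text.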
Based on the above method, all elements in the capacity attribute value field are traversed and added to the mapping relation list, and the following can be obtained:
{'1.1': ['1.1-3L']; '3': ['1.1-3L', '1-3L', '3-5L']; 'L': ['1.1-3L', '0.6-1L', '0-0.5L', 'more than 5L', '3.1-5L', '0.5-1L', '1-3L', '3-5L']; '0.6': ['0.6-1L']; '1': ['0.6-1L', '0.5-1L', '1-3L']; '0': ['0-0.5L']; '0.5': ['0-0.5L', '0.5-1L']; 'others': ['others']; '5': ['more than 5L', '3.1-5L', '3-5L']; '3.1': ['3.1-5L']; 'unlimited': ['unlimited']}.
For example, the name in the description of a good includes "25° camellia oil edible oil first-stage pressed tea seed oil 1.1L bottle". Each key field in the mapping list is looked up in the name in turn. For example, "1" appears in "25° camellia oil edible oil first-stage pressed tea seed oil 1.1L bottle", so for the values corresponding to "1", namely ['0.6-1L', '0.5-1L', '1-3L'], each of the three is added as a key of predictionsForSku with value = 1. If one of them is already a key of predictionsForSku, its corresponding value is incremented by 1.
For "25° camellia oil edible oil first-stage pressed tea seed oil 1.1L bottle", the fields "1.1", "1", and "5" are present, so the obtained result is predictionsForSku: {'1.1-3L': 1; '0.6-1L': 1; '0.5-1L': 1; '1-3L': 1; 'more than 5L': 1; '3.1-5L': 1; '3-5L': 1}. Normalizing this result (each value divided by the sum of all values) gives: {'1.1-3L': 1/7; '0.6-1L': 1/7; '0.5-1L': 1/7; '1-3L': 1/7; 'more than 5L': 1/7; '3.1-5L': 1/7; '3-5L': 1/7}. For each item, its corresponding predictionsForSku is calculated as the prediction of the automatic dictionary model.
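The lookup-and-normalize step for a single item might be sketched as follows, with a hand-built fragment of the mapping list standing in for the full attValLookup; as in the text, a key counts when it appears as a substring of the item name:

```python
def predict_for_item(name, lookup):
    """Step S41 lookup: a key counts if it appears in the item name;
    the counts are then normalized into relation scores d(i)."""
    counts = {}
    for key, values in lookup.items():
        if key in name:
            for val in values:
                counts[val] = counts.get(val, 0) + 1
    total = sum(counts.values())
    return {val: c / total for val, c in counts.items()} if total else {}

# Hypothetical fragment of the mapping list for the capacity attribute.
lookup = {
    "1.1": ["1.1-3L"],
    "1": ["0.6-1L", "0.5-1L", "1-3L"],
    "5": ["more than 5L", "3.1-5L", "3-5L"],
}
name = "25 degree camellia oil first-stage pressed tea seed oil 1.1L bottle"
scores = predict_for_item(name, lookup)
```

Note that "5" matches via the "25" in the name, which is why seven candidate values each end up with a score of 1/7, as in the example above.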
The weight w_d of the automatic dictionary model may also be determined by cross-validation. Taking 5-fold cross-validation as an example: first, a selection interval [0.1, 0.2, 0.3, …, 0.9] for w_d is determined; the already-filled rows are evenly divided into 5 parts, four of which are used to train the classifiers and the automatic dictionary while the remaining part is used for prediction. With the weight of each classifier assumed fixed, w_d is set to 0.1 and the accuracy is calculated on the prediction data.
Then w_d is traversed over [0.1, 0.2, 0.3, …, 0.9] to obtain A_1, A_2, A_3, …, A_9 respectively. Assuming the calculated A_5 is the maximum, then w_d = 0.5.
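The grid search over w_d can be sketched as below. The evaluation function here is a toy stand-in that peaks at 0.5, mirroring the example where A_5 is the maximum, rather than a real 5-fold accuracy computation:

```python
def choose_dict_weight(candidates, evaluate):
    """Pick the w_d whose held-out accuracy A_k is largest."""
    return max(candidates, key=evaluate)

candidates = [round(0.1 * k, 1) for k in range(1, 10)]   # 0.1, 0.2, ..., 0.9

# Toy stand-in for "train on 4 folds, score on the 5th": accuracy peaks at 0.5.
wd = choose_dict_weight(candidates, lambda w: -abs(w - 0.5))
```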
In other exemplary embodiments of the present disclosure, the above method may also be applied in units of commodity, single item, or SKU when filling the goods attributes. The present disclosure is not particularly limited thereto.
By combining the automatic dictionary model with each classifier model to judge whether the confidence score of each filled attribute value is greater than the preset accuracy, the user's requirement on the accuracy of attribute filling can be met. Goods whose confidence score is smaller than the preset accuracy are added back to the set of goods to be filled to await the next round of automatic filling; after several iterations, the filling rate can be effectively guaranteed while the accuracy is maintained, so that the goods attribute filling method is applicable to various types of goods and goods attributes.
Further, referring to fig. 4, there is also provided in the embodiment of the present example an article attribute filling apparatus 4 including: a typical data acquisition module 41, a feature matrix generation module 42, a classifier model generation module 43, and an attribute fill execution module 44. Wherein:
The typical data acquisition module 41 may be configured to acquire a typical item from items to be filled and acquire an attribute value of the typical item as basic data.
The feature matrix generation module 42 may be configured to obtain description information of the to-be-filled item and generate a feature matrix according to the description information of the to-be-filled item.
The classifier model generation module 43 may be configured to train the underlying data using a machine learning algorithm and generate a classifier model.
The attribute filling execution module 44 may be configured to obtain the attribute value of the to-be-filled item by using the classifier model and the feature matrix, and to automatically fill in the attribute value of the to-be-filled item accordingly.
The specific details of each module in the above-mentioned goods attribute filling device are described in detail in the corresponding goods attribute filling method, so that the details are not repeated here.
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
In an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above goods attribute filling method is also provided.
Those skilled in the art will appreciate that various aspects of the invention may be implemented as a system, method, or program product. Accordingly, aspects of the invention may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be referred to herein as a "circuit," "module," or "system."
An electronic device 600 according to this embodiment of the invention is described below with reference to fig. 5. The electronic device 600 shown in fig. 5 is merely an example, and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.
As shown in fig. 5, the electronic device 600 is embodied in the form of a general purpose computing device. Components of electronic device 600 may include, but are not limited to: the at least one processing unit 610, the at least one memory unit 620, and a bus 630 that connects the various system components, including the memory unit 620 and the processing unit 610.
Wherein the storage unit stores program code that is executable by the processing unit 610 such that the processing unit 610 performs steps according to various exemplary embodiments of the present invention described in the above-described "exemplary methods" section of the present specification. For example, the processing unit 610 may perform S1 as shown in fig. 1 to acquire a typical item from items to be filled and acquire an attribute value of the typical item as basic data; s2: acquiring description information of the goods to be filled and generating a feature matrix according to the description information of the goods to be filled; s3: training the basic data by using a machine learning algorithm and generating a classifier model; s4: and acquiring the attribute value of the goods to be filled by using the classifier model and the feature matrix, and automatically filling the attribute value of the goods to be filled according to the attribute value.
The storage unit 620 may include readable media in the form of volatile storage units, such as Random Access Memory (RAM) 6201 and/or cache memory unit 6202, and may further include Read Only Memory (ROM) 6203.
The storage unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
Bus 630 may represent one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, bluetooth device, etc.), one or more devices that enable a user to interact with the electronic device 600, and/or any device (e.g., router, modem, etc.) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 650. Also, electronic device 600 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through network adapter 660. As shown, network adapter 660 communicates with other modules of electronic device 600 over bus 630. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 600, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a terminal device, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, a computer-readable storage medium having stored thereon a program product capable of implementing the method described above in the present specification is also provided. In some possible embodiments, the various aspects of the invention may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the invention as described in the "exemplary methods" section of this specification, when said program product is run on the terminal device.
Referring to fig. 6, a program product 800 for implementing the above-described method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
Furthermore, the above-described drawings are only schematic illustrations of processes included in the method according to the exemplary embodiment of the present invention, and are not intended to be limiting. It will be readily appreciated that the processes shown in the above figures do not indicate or limit the temporal order of these processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, for example, among a plurality of modules.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any adaptations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
Claims (16)
1. A method of filling an item property, comprising:
Acquiring a typical goods from goods to be filled and acquiring attribute values of the typical goods as basic data;
acquiring description information of the goods to be filled and generating a feature matrix according to the description information of the goods to be filled;
Training the base data and generating a classifier model using a machine learning algorithm, comprising: respectively training the basic data by using n machine learning algorithms and obtaining n corresponding classifier models; wherein n >0; evaluating the accuracy of filling the attribute values of the to-be-filled goods attribute by each classifier model through cross verification; calculating the weight of each classifier model for filling the attribute of the goods to be filled according to the accuracy rate of each classifier model for filling the attribute value;
Acquiring attribute values of the goods to be filled by using the classifier model and the feature matrix, and automatically filling the attribute values of the goods to be filled according to the attribute values;
the method further comprises the steps of: calculating the relation score between each attribute value in the basic data and each goods to be filled according to the basic data and the description information of each goods to be filled by using an automatic dictionary model; calculating a confidence score S of an attribute value corresponding to the current attribute of the goods to be filled according to the relation score calculated by the automatic dictionary model and the weights of the n classifier models; judging whether the confidence score of the current attribute of the goods to be filled is larger than a preset accuracy; and outputting the attribute value of the current attribute when the confidence score is judged to be larger than the accuracy rate.
2. The item property filling method according to claim 1, wherein the acquiring a typical item from items to be filled comprises:
Acquiring an article to be filled and extracting a characteristic field according to the description information of the article to be filled;
Clustering the goods to be filled according to the extracted characteristic fields;
randomly extracting one or more to-be-filled goods from each class of the clustering result as typical goods.
3. The item attribute filling method of claim 2, wherein the extracting feature fields according to the description information of the item to be filled comprises:
word segmentation processing is carried out on the description information of the goods to be filled;
And extracting the characteristic fields of the goods to be filled by using a word2vec model according to the segmented words obtained after the segmentation processing.
4. The item attribute filling method according to claim 2, wherein when generating a feature matrix from the description information of the item to be filled, the method further comprises:
judging whether a filling quantity m is received or not;
setting the category number of the to-be-filled item clusters as m when judging that the filling number m is received; wherein m >0.
5. The item attribute filling method of claim 1, wherein the generating a feature matrix according to the description information of the item to be filled comprises:
performing word segmentation processing on the description information of the goods to be filled to obtain word segmentation;
extracting feature fields of the obtained segmented words by using a preset model;
And carrying out vectorization processing and sparsification processing on the extracted characteristic fields respectively to generate a characteristic matrix.
6. The method of claim 1, wherein calculating the weight of each classifier model for filling in the property of the item to be filled in comprises:
Wherein w_c is the weight of the classifier model c; n is the number of classifier models; Accuracy_cv is the accuracy obtained by cross-validation.
7. The method of claim 6, wherein the obtaining the attribute values of the item to be filled using the classifier model and the feature matrix comprises:
and selecting an output result of the classifier model as an attribute value of the goods to be filled according to the weight of each classifier model for filling each attribute of the goods to be filled.
8. The method according to claim 1, wherein calculating the confidence score S of the attribute value corresponding to the current attribute of the item to be filled comprises:
Wherein C is the set of classifier models; f_c(i) ∈ {0,1} indicates whether the classifier model c fills in the attribute value for the current item to be filled; w_c ∈ [0,1] represents the weight of the classifier c for the attribute value filled in at present; d(i) ∈ [0,1] represents the relationship score calculated by the automatic dictionary model; w_d ∈ [0,1] represents the weight of the automatic dictionary model.
9. The article attribute filling method of claim 8 further comprising:
and adding the output attribute value to the basic data.
10. The article attribute filling method of claim 8 further comprising:
and when it is judged that the confidence score of the current attribute of the goods to be filled is smaller than the preset accuracy rate, retaining the goods to be filled to await filling of the next attribute.
11. The method of claim 8, wherein calculating the relationship score between each attribute value in the basic data and each item to be filled according to the basic data and the description information of each item to be filled by using an automatic dictionary model comprises:
establishing the existing attribute value fields corresponding to the attributes according to the basic data;
performing word segmentation processing on the elements in the existing attribute value field by using a jieba word segmentation model to obtain a first word segmentation set;
splitting and filtering the elements according to preset rules to obtain a second word segmentation list;
merging and filtering the first word segmentation set and the second word segmentation list according to a preset rule to obtain a third word segmentation set;
establishing a mapping relation table between the elements and the third word segmentation set;
traversing each element of the attribute value field and adding the mapping relation established for each element to the mapping relation table;
and searching the mapping relation table according to the feature fields in the description information of the goods to be filled to generate a prediction result.
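The dictionary model of claim 11 can be sketched as a reverse mapping: each existing attribute value is mapped to the tokens derived from it, and a prediction is made by looking up the tokens of an item's description. Plain lowercase splitting again stands in for the jieba segmentation and the rule-based splitting/merging/filtering steps, and all names here are illustrative.

```python
def build_mapping(attribute_values):
    # token -> set of attribute values it came from (the mapping relation table)
    mapping = {}
    for value in attribute_values:
        tokens = set(value.lower().split())   # stand-in for the merged segmentation sets
        for tok in tokens:
            mapping.setdefault(tok, set()).add(value)
    return mapping

def predict(mapping, description):
    # count how many description tokens point at each known attribute value
    hits = {}
    for tok in description.lower().split():
        for value in mapping.get(tok, ()):
            hits[value] = hits.get(value, 0) + 1
    return max(hits, key=hits.get) if hits else None

mapping = build_mapping(["camellia oil", "soybean oil"])
result = predict(mapping, "pressed camellia oil 5L")
```

The hit count here plays the role of an unnormalized relation score; the claims normalize it into D(i) ∈ [0, 1] before combining it with the classifier weights.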
12. The method of claim 6, wherein the machine learning algorithm comprises:
a support vector machine algorithm, a multinomial naive Bayes algorithm, a multi-layer perceptron algorithm, a random forest algorithm, and a K-nearest neighbor algorithm.
13. The item property filling method of claim 1, wherein the item is in units of commodity, single item, or SKU.
14. An article property filling apparatus, comprising:
the typical data acquisition module is used for acquiring typical goods from the goods to be filled and acquiring attribute values of the typical goods as basic data;
the feature matrix generation module is used for acquiring the description information of the goods to be filled and generating a feature matrix according to the description information of the goods to be filled;
the classifier model generation module is used for training the basic data by using a machine learning algorithm and generating a classifier model, comprising: respectively training the basic data by using n machine learning algorithms and obtaining n corresponding classifier models, wherein n > 0; evaluating, through cross-verification, the accuracy of each classifier model in filling attribute values for the attributes of the goods to be filled; and calculating the weight of each classifier model for filling each attribute of the goods to be filled according to the accuracy rate of each classifier model in filling the attribute values;
the attribute filling execution module is used for acquiring the attribute values of the goods to be filled by using the classifier models and the feature matrix, and automatically filling the attribute values of the goods to be filled accordingly;
the apparatus is further used for calculating, by using an automatic dictionary model, the relation score between each attribute value in the basic data and each item of goods to be filled according to the basic data and the description information of each item of goods to be filled; calculating a confidence score S of the attribute value corresponding to the current attribute of the goods to be filled according to the relation score calculated by the automatic dictionary model and the weights of the n classifier models; judging whether the confidence score of the current attribute of the goods to be filled is greater than a preset accuracy rate; and outputting the attribute value of the current attribute when the confidence score is judged to be greater than the accuracy rate.
15. A storage medium having stored thereon a computer program which when executed by a processor implements the item property filling method of any one of claims 1 to 13.
16. An electronic terminal, comprising:
A processor; and
A memory for storing executable instructions of the processor;
wherein the processor is configured to perform the following via execution of the executable instructions:
acquiring typical goods from the goods to be filled and acquiring attribute values of the typical goods as basic data;
acquiring description information of the goods to be filled and generating a feature matrix according to the description information of the goods to be filled;
training the basic data by using a machine learning algorithm and generating a classifier model, comprising: respectively training the basic data by using n machine learning algorithms and obtaining n corresponding classifier models, wherein n > 0; evaluating, through cross-verification, the accuracy of each classifier model in filling attribute values for the attributes of the goods to be filled; and calculating the weight of each classifier model for filling each attribute of the goods to be filled according to the accuracy rate of each classifier model in filling the attribute values;
acquiring attribute values of the goods to be filled by using the classifier model and the feature matrix, and automatically filling the attribute values of the goods to be filled accordingly;
the processor is further configured to calculate, by using an automatic dictionary model, the relation score between each attribute value in the basic data and each item of goods to be filled according to the basic data and the description information of each item of goods to be filled; calculate a confidence score S of the attribute value corresponding to the current attribute of the goods to be filled according to the relation score calculated by the automatic dictionary model and the weights of the n classifier models; and judge whether the confidence score of the current attribute of the goods to be filled is greater than a preset accuracy rate;
and outputting the attribute value of the current attribute when the confidence score is judged to be greater than the accuracy rate.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810104645.8A CN110135769B (en) | 2018-02-02 | 2018-02-02 | Goods attribute filling method and device, storage medium and electronic terminal |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110135769A CN110135769A (en) | 2019-08-16 |
CN110135769B true CN110135769B (en) | 2024-09-20 |
Family
ID=67566977
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810104645.8A Active CN110135769B (en) | 2018-02-02 | 2018-02-02 | Goods attribute filling method and device, storage medium and electronic terminal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110135769B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111859878A (en) * | 2020-07-29 | 2020-10-30 | 广州易行信息技术有限公司 | Intelligent material attribute value filling method |
US11775565B2 (en) * | 2020-10-14 | 2023-10-03 | Coupang Corp. | Systems and methods for database reconciliation |
CN112365365A (en) * | 2020-11-10 | 2021-02-12 | 贵州电网有限责任公司 | Method for counting business expansion file accuracy of marketing system |
CN114936849A (en) * | 2022-06-28 | 2022-08-23 | 中国电信股份有限公司 | Attribute value determination method and device, nonvolatile storage medium and computer equipment |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104866578A (en) * | 2015-05-26 | 2015-08-26 | 大连理工大学 | Hybrid filling method for incomplete data |
CN106919957A (en) * | 2017-03-10 | 2017-07-04 | 广州视源电子科技股份有限公司 | Method and device for processing data |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7660779B2 (en) * | 2004-05-12 | 2010-02-09 | Microsoft Corporation | Intelligent autofill |
US10102195B2 (en) * | 2014-06-25 | 2018-10-16 | Amazon Technologies, Inc. | Attribute fill using text extraction |
CN106295175B (en) * | 2016-08-09 | 2018-12-14 | 西安电子科技大学 | Station meteorological data missing value fill method based on svd algorithm |
CN106407258A (en) * | 2016-08-24 | 2017-02-15 | 广东工业大学 | Missing data prediction method and apparatus |
Also Published As
Publication number | Publication date |
---|---|
CN110135769A (en) | 2019-08-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6929971B2 (en) | Neural network-based translation of natural language queries into database queries | |
US11334635B2 (en) | Domain specific natural language understanding of customer intent in self-help | |
AU2016225947B2 (en) | System and method for multimedia document summarization | |
US20230102337A1 (en) | Method and apparatus for training recommendation model, computer device, and storage medium | |
CN110135769B (en) | Goods attribute filling method and device, storage medium and electronic terminal | |
US11429405B2 (en) | Method and apparatus for providing personalized self-help experience | |
WO2023011382A1 (en) | Recommendation method, recommendation model training method, and related product | |
CN110827112B (en) | Deep learning commodity recommendation method and device, computer equipment and storage medium | |
US20230096118A1 (en) | Smart dataset collection system | |
CN111078842A (en) | Method, device, server and storage medium for determining query result | |
CN111461345A (en) | Deep learning model training method and device | |
CN111966886A (en) | Object recommendation method, object recommendation device, electronic equipment and storage medium | |
CN111695024A (en) | Object evaluation value prediction method and system, and recommendation method and system | |
CN111353838A (en) | Method and device for automatically checking commodity category | |
CN111209351A (en) | Object relation prediction method and device, object recommendation method and device, electronic equipment and medium | |
CN115293332A (en) | Method, device and equipment for training graph neural network and storage medium | |
CN115018549A (en) | Method for generating advertisement file, device, equipment, medium and product thereof | |
CA3144405A1 (en) | Text information recognizing method, extracting method, devices and system | |
CN113139115A (en) | Information recommendation method, search method, device, client, medium and equipment | |
CN112749300A (en) | Method, apparatus, device, storage medium and program product for video classification | |
CN113591881B (en) | Intention recognition method and device based on model fusion, electronic equipment and medium | |
CN112989182B (en) | Information processing method, information processing device, information processing apparatus, and storage medium | |
CN110348581B (en) | User feature optimizing method, device, medium and electronic equipment in user feature group | |
CN111915414A (en) | Method and device for displaying target object sequence to target user | |
CN111724221B (en) | Method, system, electronic device and storage medium for determining commodity matching information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||