CN114677174A - Method and device for calculating sales volume of non-shelved items

Info

Publication number: CN114677174A
Authority: CN (China)
Prior art keywords: shelved, articles, item, similar, data
Legal status: Pending
Application number: CN202210300331.1A
Other languages: Chinese (zh)
Inventors: 闵旭, 吕昊, 王答明, 易津锋
Current Assignee: Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee: Beijing Jingdong Shangke Information Technology Co Ltd
Application filed by: Beijing Jingdong Shangke Information Technology Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities

Abstract

The invention discloses a method and a device for calculating the sales volume of non-shelved items, and relates to the technical field of machine learning. One embodiment of the method comprises: screening out similar items of a non-shelved item from shelved first items; calculating supplementary features of the non-shelved item according to the shelving data of its similar items; and inputting the inherent features and the supplementary features of the non-shelved item into a sales calculation model to output the sales volume of the non-shelved item, wherein the sales calculation model is obtained by training a machine learning model using the inherent features and the supplementary features of a shelved second item. This embodiment can solve the technical problem that the sales volume of a non-shelved item cannot be calculated.

Description

Method and device for calculating sales volume of non-shelved items
Technical Field
The invention relates to the technical field of machine learning, in particular to a method and a device for calculating the sales volume of non-shelved items.
Background
Existing sales calculation methods can be divided into three major categories: 1) the Delphi method, also called the expert survey method, which calculates sales by consulting the judgments or opinions of experts; 2) the time series prediction method, which uses the sales data of an item along its historical timeline to calculate the future sales volume of the item; 3) the machine learning method, which comprehensively considers the attributes, price, spatio-temporal information, historical sales data, pictures, text and other related data of an item to calculate its future sales volume.
In the process of implementing the invention, the inventor finds that at least the following problems exist in the prior art:
since the new products which are not put on shelves have no historical sales data, the calculation of the sales volume of the new products after being put on shelves becomes a difficult problem.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for calculating the sales volume of a non-shelved item, so as to solve the technical problem that the sales volume of a non-shelved item cannot be calculated.
To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a method of calculating the sales volume of a non-shelved item, including:
screening similar articles of the articles which are not shelved from the first articles which are shelved;
calculating supplementary features of the non-shelved item according to shelving data of similar items of the non-shelved item;
inputting the inherent features and the supplementary features of the non-shelved item into a sales calculation model to output the sales volume of the non-shelved item; and the sales calculation model is obtained by training a machine learning model using the inherent features and the supplementary features of the shelved second item.
Optionally, screening out similar items of the non-shelved item from the shelved first items comprises:
clustering the first articles which are placed on the shelves and the articles which are not placed on the shelves to obtain a plurality of clusters and the similarity between the articles in the clusters;
and taking each article in the cluster where the non-shelved article is positioned as a similar article of the non-shelved article.
Optionally, calculating a complementary characteristic of the non-shelved item from shelving data of similar items of the non-shelved item, comprising:
calculating the characteristic weight of the similar articles of the non-shelved articles according to the similarity between the non-shelved articles and the similar articles;
according to shelving data of similar articles of the non-shelving articles, calculating supplementary features of the similar articles of the non-shelving articles in all dimensions;
and for the supplementary features of the non-shelved item in each dimension, carrying out weighted summation on the supplementary features of the dimension based on the feature weights of similar items of the non-shelved item to obtain the supplementary features of the non-shelved item in the dimension.
Optionally, calculating a feature weight of the similar article of the non-shelved article according to the similarity between the non-shelved article and the similar article thereof, including:
for each similar article of the non-shelved article, calculating the similarity between the similar article and the non-shelved article, and subtracting that similarity from one, thereby obtaining the characteristic weight of the similar article.
Optionally, the shelving data comprises at least one of:
sales data, discount data, price data, weather data, holiday data, and inventory data.
Optionally, before screening out similar articles of the non-shelved articles from the shelved first articles, the method further comprises:
screening out similar articles of the second article which is already put on the shelf from the first article which is already put on the shelf;
calculating supplementary features of the second item according to shelving data of similar items of the second item;
and inputting the inherent features and the supplementary features of the second article into a machine learning model, taking the sales data of the second article as output, and obtaining a sales calculation model through iterative training.
Optionally, the machine learning model is LightGBM.
Optionally, the shelf loading time of the first item is earlier than the shelf loading time of the second item.
In addition, according to another aspect of the embodiments of the present invention, there is provided an apparatus for calculating the sales volume of a non-shelved item, including:
the screening module is used for screening similar articles of the articles which are not shelved from the first articles which are shelved;
the characteristic module is used for calculating the supplementary characteristics of the non-shelved articles according to shelving data of similar articles of the non-shelved articles;
the calculation module is used for inputting the inherent characteristics and the supplementary characteristics of the non-shelved goods into a sales calculation model so as to output the sales of the non-shelved goods; and the sales calculation model is obtained by training a machine learning model by using the inherent characteristics and the supplementary characteristics of the second goods which are put on shelves.
Optionally, the screening module is further configured to:
clustering the first articles which are placed on the shelves and the articles which are not placed on the shelves to obtain a plurality of clusters and the similarity between the articles in the clusters;
and taking each article in the cluster where the non-shelved article is positioned as a similar article of the non-shelved article.
Optionally, the feature module is further configured to:
calculating the characteristic weight of the similar articles of the non-shelved articles according to the similarity between the non-shelved articles and the similar articles;
according to shelving data of similar articles of the non-shelving articles, calculating supplementary features of the similar articles of the non-shelving articles in all dimensions;
and for the supplementary features of the non-shelved item in each dimension, carrying out weighted summation on the supplementary features of the dimension based on the feature weights of similar items of the non-shelved item to obtain the supplementary features of the non-shelved item in the dimension.
Optionally, the feature module is further configured to:
for each similar article of the non-shelved article, calculating the similarity between the similar article and the non-shelved article, and subtracting that similarity from one, thereby obtaining the characteristic weight of the similar article.
Optionally, the shelving data comprises at least one of:
sales data, discount data, price data, weather data, holiday data, and inventory data.
Optionally, a training module is further included for:
screening out similar articles of the second article which is already put on the shelf from the first article which is already put on the shelf;
calculating supplementary features of the second item according to shelving data of similar items of the second item;
and inputting the inherent features and the supplementary features of the second article into a machine learning model, taking the sales data of the second article as output, and obtaining a sales calculation model through iterative training.
Optionally, the machine learning model is LightGBM.
Optionally, the shelf loading time of the first item is earlier than the shelf loading time of the second item.
According to another aspect of the embodiments of the present invention, there is also provided an electronic device, including:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the method of any of the embodiments described above.
According to another aspect of the embodiments of the present invention, there is also provided a computer readable medium, on which a computer program is stored, which when executed by a processor implements the method of any of the above embodiments.
One embodiment of the above invention has the following advantages or benefits: by calculating the supplementary features of a non-shelved item from the shelving data of its similar items, and inputting the inherent features and the supplementary features of the non-shelved item into the sales calculation model to output its sales volume, the technical problem in the prior art that the sales volume of non-shelved items cannot be calculated is solved. The embodiment of the invention calculates the supplementary features of non-shelved items based on the data of shelved items, and accurately calculates the sales volume of non-shelved items over a period of time using a machine learning regression method. This keeps the required data dimensions low and the data readily available, and also establishes an intuitive relationship between features and results, improving the interpretability of the results.
Further effects of the above non-conventional optional implementations will be described below in connection with the specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main flow of a method for calculating the sales volume of a non-shelved item according to an embodiment of the present invention;
FIG. 2 is a schematic view of the main flow of a method for calculating sales of shelved items according to one referential embodiment of the present invention;
FIG. 3 is a schematic view of the main flow of a method for calculating the sales volume of a non-shelved item according to another referential embodiment of the present invention;
FIG. 4 is a schematic diagram of the major modules of an apparatus for calculating the sales volume of a non-shelved item according to an embodiment of the present invention;
FIG. 5 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 6 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
When calculating the sales of new products using item attributes, the three methods commonly used in the prior art all have significant disadvantages:
1) In the Delphi method, experts give predictions based on past market experience, but a new product release is a brand-new problem for which historical experience may deviate considerably. Meanwhile, given the wide variety of items and the massive features and data related to them, standardized models and methods are needed to improve the efficiency and accuracy of calculation.
2) A time series model calculates the future sales of an item from its observable historical sales data. Because a new product that has not been shelved has no historical sales data, a time series model cannot calculate its sales volume.
3) Large-scale machine learning models often need multidimensional features and data as input, but the data of new products that have not been shelved are blank in many respects, so the future sales of such products cannot be calculated.
FIG. 1 is a schematic diagram of the main flow of a method for calculating the sales volume of a non-shelved item according to an embodiment of the present invention. As an embodiment of the present invention, as shown in FIG. 1, the method for calculating the sales volume of a non-shelved item may include:
step 101, screening out similar articles of the non-shelved articles from the shelved first articles.
Since non-shelved items lack the shelving data that becomes available only after shelving, such as historical sales data, discount data and price data, it is necessary first to find similar items among the shelved items, then to calculate features from those similar items and use them as supplementary features of the non-shelved items.
To facilitate processing and calculation on the data of each item in the subsequent steps, the data of each item need to be preprocessed and cleaned first, which improves the calculation efficiency of the subsequent steps. The preprocessing and cleaning mainly cover the inherent attribute data, shelving time, spatio-temporal data, discount data, price data, inventory data, historical sales data and the like of the shelved items, as well as the inherent attribute data and planned shelving time of the non-shelved items.
The specific operations include: fusing multiple data tables using the item SKU code as the index; determining data types (distinguishing string features from numerical features); filling or deleting rows with missing data; merging and splitting existing features (for example, merging similar feature values and subdividing broadly defined features); determining the data granularity (per item SKU, or per item SKU per region); re-encoding the features (if a linear regression model or a clustering model is used subsequently, the features must be one-hot encoded so that feature values become 0/1 values; if a classification and regression tree model is used subsequently, feature values are defined as character strings, each string representing a category); determining the prediction duration; and calculating the sales volume of each shelved item from its shelving time to the prediction time. The new-product sales calculation problem naturally yields a training set and a test set: the embodiment of the invention uses the data of shelved items as the training set for training the sales calculation model for non-shelved items, and uses the non-shelved items as the test set.
one-hot encoding: the N states are encoded using an N-bit status register, each state being represented by its own independent register bit and only one of which is active at any one time.
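The preprocessing operations listed above can be illustrated with a minimal sketch. It assumes each data table is a dictionary keyed by SKU code; the helper names `fuse_by_sku`, `drop_incomplete` and `one_hot`, the field names and the sample values are all hypothetical and do not come from the patent.

```python
def fuse_by_sku(*tables):
    """Merge several {sku: {field: value}} tables into one record per SKU."""
    fused = {}
    for table in tables:
        for sku, fields in table.items():
            fused.setdefault(sku, {}).update(fields)
    return fused


def drop_incomplete(records, required):
    """Delete records that lack a value for any required field."""
    return {
        sku: rec
        for sku, rec in records.items()
        if all(rec.get(field) is not None for field in required)
    }


def one_hot(records, field):
    """Replace a string-valued field with 0/1 columns, one per category."""
    categories = sorted({rec[field] for rec in records.values()})
    encoded = {}
    for sku, rec in records.items():
        row = {k: v for k, v in rec.items() if k != field}
        for cat in categories:
            row[f"{field}={cat}"] = 1 if rec[field] == cat else 0
        encoded[sku] = row
    return encoded


# Hypothetical input tables keyed by SKU code.
attributes = {
    "SKU-1": {"category": "snack"},
    "SKU-2": {"category": "drink"},
    "SKU-3": {"category": "snack"},
}
prices = {"SKU-1": {"price": 9.9}, "SKU-2": {"price": 4.5}}

fused = fuse_by_sku(attributes, prices)
clean = drop_incomplete(fused, ["category", "price"])  # SKU-3 has no price
encoded = one_hot(clean, "category")
```

Here missing rows are deleted rather than filled; filling with a default value would be an equally valid reading of the text.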
Optionally, before step 101, further comprising: screening out similar articles of the second article which is already put on the shelf from the first article which is already put on the shelf; calculating supplementary features of the second item according to shelving data of similar items of the second item; and inputting the inherent features and the supplementary features of the second article into a machine learning model, taking the sales data of the second article as output, and obtaining a sales calculation model through iterative training.
Since non-shelved items have few features, namely only their inherent attributes, which are not enough to support the training of a machine learning model, the embodiment of the invention first screens out similar items of the shelved second item (the shelved second item being an item used for training the model) from the shelved first items, then calculates features from the shelving data of those similar items and uses them as supplementary features of the second item, and finally inputs the inherent features and the supplementary features of the second item together into the machine learning model, with the sales data of the second item as output, to obtain the sales calculation model through iterative training. To screen out similar items of the second item accurately from a large number of items, a K-means clustering model can be used to find the similar items of the second item among the shelved first items.
Optionally, screening out similar items of the shelved second item from the shelved first items comprises: clustering the shelved first items and the shelved second item to obtain a plurality of clusters and the similarity between the items in each cluster; and taking each item in the cluster containing the second item as a similar item of the second item. Optionally, the shelving time of the first item is earlier than that of the second item; that is, the first items were shelved earlier and have more complete shelving data, so similar items of the second item are screened from the first items. Optionally, the length of time the second item has been shelved equals the prediction duration for the non-shelved item. For example, if the sales volume of the non-shelved item over the coming week is to be calculated, the second item has been shelved for one week; if the sales volume of the non-shelved item over the coming month is to be calculated, the second item was shelved one month ago.
Optionally, calculating the supplementary features of the second item according to the shelving data of similar items of the second item may include: calculating the characteristic weight of the similar articles of the second article according to the similarity between the second article and the similar articles; according to the shelving data of the similar articles of the second article, calculating supplementary features of the similar articles of the second article in all dimensions; and for the supplementary features of the second item in each dimension, carrying out weighted summation on the supplementary features of the dimension based on the feature weight of the similar items of the second item to obtain the supplementary features of the second item in the dimension.
After similar items of the shelved second item are screened out from the shelved first items by the clustering model, the shelving data of the similar items are weighted to serve as the supplementary features of the second item. The distance between two items in the clustering model is defined as the Euclidean distance:
$$d(x, x') = \sqrt{\sum_{s} \left(x_s - x'_s\right)^2}$$
where $x_s$ is the s-th dimension feature of the second item and $x'_s$ is the s-th dimension feature of a similar item of the second item.
Optionally, the shelving data comprises at least one of: sales data, discount data, price data, weather data, holiday data, and inventory data. After an item has been shelved for a period of time, shelving data such as sales data, discount data, price data, weather data, holiday data, and inventory data are generated, and the supplementary features of the second item can be calculated from them. Optionally, the supplementary features of each similar item in each dimension are calculated from its shelving data, for example sales volume in the first week after shelving, sales volume in the first month after shelving, price trend in the first month after shelving, inventory quantity in the first quarter after shelving, holiday sales volume, and air temperature in the first month after shelving; which dimensions to calculate can be decided according to business needs.
Items are aggregated into a predetermined number of clusters according to the similarity between their attributes. The cluster containing the second item is found first, and all items in that cluster are similar items of the second item. Alternatively, only some of the items in the cluster may be used as similar items of the second item; for example, based on the Euclidean distance between each item in the cluster and the second item, the several items with the smallest Euclidean distances may be selected as similar items of the second item.
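Selecting only the nearest members of a cluster can be sketched as follows. This is an illustration under assumptions: the function names, the SKU labels and the encoded attribute vectors are hypothetical, and the cluster is assumed to have been computed already.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_in_cluster(target, cluster_items, k):
    """Return the k (sku, distance) pairs closest to the target vector."""
    scored = [
        (sku, euclidean_distance(target, vec))
        for sku, vec in cluster_items.items()
    ]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Hypothetical encoded attribute vectors of shelved first items in one cluster.
cluster = {
    "SKU-A": [1.0, 0.0, 0.2],
    "SKU-B": [0.9, 0.1, 0.2],
    "SKU-C": [0.0, 1.0, 0.8],
}
second_item = [1.0, 0.0, 0.3]
similar = nearest_in_cluster(second_item, cluster, k=2)
```

With these sample vectors, `SKU-A` and `SKU-B` are retained and `SKU-C`, the farthest member, is dropped.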
Optionally, calculating the feature weight of the similar items of the second item according to the similarity between the second item and its similar items includes: for each similar item of the second item, calculating the similarity between the similar item and the second item, and subtracting that similarity from one to obtain the feature weight of the similar item. If the supplementary features of the second item are constructed using N similar items in the cluster, the feature weighting over the similar items of the second item is given by the formula in Figure BDA0003565238610000094, whose terms are the Euclidean distance between the second item and each of its similar items and the s-th dimension feature of the similar item.
It should be noted that the cosine distance may also be used to calculate the similarity, which is not limited in this embodiment of the present invention.
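The weighted summation of supplementary features can be sketched as below. The exact weighting formula is not recoverable from the text, so this sketch makes two labeled assumptions: each distance is normalized by the sum of all distances before being subtracted from one, and the resulting weights are rescaled to sum to one. The function names and sample numbers are hypothetical.

```python
def feature_weights(distances):
    """Weight each similar item by one minus its normalized distance.

    Assumption: distances are normalized by their sum and the weights are
    rescaled to sum to 1; the source text does not fix this detail.
    """
    n = len(distances)
    if n == 0:
        return []
    total = sum(distances)
    if total == 0 or n == 1:
        # Identical neighbors, or a single neighbor: fall back to equal weights.
        return [1.0 / n] * n
    raw = [1.0 - d / total for d in distances]
    raw_total = sum(raw)  # equals n - 1 when n > 1
    return [w / raw_total for w in raw]


def supplementary_features(distances, neighbor_features):
    """Weighted sum, per dimension, of the similar items' features."""
    weights = feature_weights(distances)
    dims = len(neighbor_features[0])
    return [
        sum(w * feats[s] for w, feats in zip(weights, neighbor_features))
        for s in range(dims)
    ]


# Hypothetical values: two dimensions (first-week sales, first-month sales).
distances = [1.0, 3.0]
neighbor_features = [[100.0, 400.0], [60.0, 200.0]]
weights = feature_weights(distances)
features = supplementary_features(distances, neighbor_features)
```

In this example the closer neighbor receives weight 0.75 and the farther one 0.25, so nearer items contribute more to each supplementary feature.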
Optionally, the machine learning model is LightGBM. The embodiment of the invention selects the machine learning model LightGBM for training, and fits the existing sales data by constructing a regression task. The LightGBM model is an implementation framework of the GBDT algorithm; it is iteratively trained using weak learners (decision trees) to obtain an optimal model, and it supports efficient parallel training. Specifically, suppose the features obtained for item i are $x_i$ and the corresponding sales volume is $y_i$. LightGBM is trained so that its output $r_i$ fits $y_i$, and the squared loss can be constructed as:
$L = (y_i - r_i)^2.$
The LightGBM model is then trained on this loss function to obtain the trained sales calculation model, which can afterwards be applied to prediction samples to obtain prediction results. In this way, the embodiment of the invention fuses the LightGBM regression model with the K-means clustering model and introduces them into the new-product sales prediction scenario.
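Why a squared loss leads each new tree to fit the current residuals can be sketched with the following derivation. It describes generic gradient boosting rather than LightGBM's exact procedure, and the summation over items as well as the symbols $F_m$ (ensemble after m trees), $h_m$ (the m-th tree) and $\eta$ (learning rate) are notation assumed here.

```latex
% Assumed notation: F_m is the ensemble after m trees, h_m the m-th tree,
% and \eta the learning rate; r_i is the current model output for item i.
\begin{aligned}
L &= \sum_i (y_i - r_i)^2, \qquad r_i = F_{m-1}(x_i) \\
-\frac{\partial L}{\partial r_i} &= 2\,(y_i - r_i) \\
F_m(x_i) &= F_{m-1}(x_i) + \eta\, h_m(x_i),
\qquad h_m \text{ fitted to the residuals } y_i - F_{m-1}(x_i)
\end{aligned}
```

The negative gradient is proportional to the residual, so every boosting round moves the output $r_i$ closer to the observed sales volume $y_i$.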
In the inference phase, as in the training process, similar items of the non-shelved item need to be found first. Optionally, step 101 may comprise: clustering the shelved first items and the non-shelved item to obtain a plurality of clusters and the similarity between the items in each cluster; and taking each item in the cluster containing the non-shelved item as a similar item of the non-shelved item. Because non-shelved items have few features, namely only their inherent attributes, the embodiment of the invention screens out similar items of the non-shelved item from the shelved first items. To do so accurately among a large number of items, a K-means clustering model can be used to cluster the shelved first items together with the non-shelved item: the items are aggregated into a predetermined number of clusters according to the similarity between their attributes, the cluster containing the non-shelved item is found first, and all items in that cluster are similar items of the non-shelved item. Alternatively, only some of the items in the cluster may be used as similar items of the non-shelved item; for example, based on the Euclidean distance between each item in the cluster and the non-shelved item, the several items with the smallest Euclidean distances may be selected as similar items of the non-shelved item.
Step 102, calculating the supplementary features of the non-shelved item according to the shelving data of similar items of the non-shelved item.
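Joint clustering of shelved and non-shelved items can be sketched with a small, dependency-free K-means. This is a simplified illustration, not the embodiment's implementation: seeding the centroids with the first k points, the fixed iteration count, and all SKU names and vectors are assumptions made here.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain K-means; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def similar_items(skus, vectors, target_sku, shelved, k):
    """Shelved items that fall in the same cluster as the target item."""
    labels = kmeans(vectors, k)
    target_label = labels[skus.index(target_sku)]
    return [
        sku for sku, lab in zip(skus, labels)
        if lab == target_label and sku in shelved
    ]


# Hypothetical attribute vectors: four shelved items and one non-shelved item.
skus = ["S1", "S2", "S3", "S4", "NEW"]
vectors = [[0.0, 0.0], [10.0, 10.0], [0.5, 0.0], [10.0, 9.0], [0.2, 0.1]]
shelved = {"S1", "S2", "S3", "S4"}
neighbors = similar_items(skus, vectors, "NEW", shelved, k=2)
```

With these sample vectors, the non-shelved item `NEW` lands in the same cluster as `S1` and `S3`, which become its similar items.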
After similar articles of the articles which are not placed on the shelves are screened out, characteristics are calculated according to the placing data of the similar articles, the characteristics are used as supplementary characteristics of the articles which are not placed on the shelves, and the defect that the articles which are not placed on the shelves can obtain fewer characteristics is overcome.
Optionally, step 102 may comprise: calculating the characteristic weight of the similar articles of the non-shelved articles according to the similarity between the non-shelved articles and the similar articles; according to shelving data of similar articles of the non-shelving articles, calculating supplementary features of the similar articles of the non-shelving articles in all dimensions; and for the supplementary features of the non-shelved items in each dimension, carrying out weighted summation on the supplementary features of the dimension based on the feature weights of the similar items of the non-shelved items to obtain the supplementary features of the non-shelved items in the dimension. After the similar articles of the non-shelved articles are screened out, the feature weight is calculated according to the similarity between the non-shelved articles and the similar articles, then the supplementary features of the similar articles in each dimension are calculated based on shelving data of the similar articles, such as sales volume of a week for shelving, sales volume of a month for shelving, price trend of a month for shelving, inventory quantity of a quarter for shelving, sales volume of holidays, air temperature of a month for shelving and the like, the supplementary features of the similar articles in each dimension can be calculated according to business needs, and finally the supplementary features of each dimension are weighted and summed respectively based on the feature weight of the similar articles.
Optionally, calculating a feature weight of the similar article of the non-shelved article according to the similarity between the non-shelved article and the similar article thereof, including: for each similar item of the non-shelved items, subtracting the similarity between the similar item and the non-shelved item from one to obtain the characteristic weight of the similar item. If the complementary features of the non-shelved item are constructed using the N similar items in the cluster, the feature weighting for the similar items of the non-shelved item is:
[Figure BDA0003565238610000111]
where [Figure BDA0003565238610000112] is the Euclidean distance between the second item and its similar item, and [Figure BDA0003565238610000113] is the s-th dimension feature of the similar item.
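The weighting described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the distances are assumed to be already scaled into the range 0 to 1 so that one minus the distance is a non-negative weight, and the sum is left unnormalized because the text does not say whether the weights are normalized.

```python
from typing import Dict, List


def feature_weights(distances: List[float]) -> List[float]:
    """Feature weight of each similar item: one minus its distance value.

    Assumption: distances are pre-scaled to [0, 1].
    """
    return [1.0 - d for d in distances]


def supplementary_features(
    distances: List[float],
    similar_features: List[Dict[str, float]],
) -> Dict[str, float]:
    """Weighted sum, per dimension, of the similar items' supplementary features."""
    weights = feature_weights(distances)
    dimensions = similar_features[0].keys()
    return {
        s: sum(w * f[s] for w, f in zip(weights, similar_features))
        for s in dimensions
    }


# Two similar items (N = 2) with two feature dimensions each.
example = supplementary_features(
    [0.25, 0.5],
    [
        {"week_sales": 100.0, "month_sales": 400.0},
        {"week_sales": 60.0, "month_sales": 200.0},
    ],
)
```

A closer similar item (smaller distance) contributes more to every dimension of the result, which is the intent of taking one minus the distance as the weight.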
Optionally, the shelving data comprises at least one of: sales data, discount data, price data, weather data, holiday data, and inventory data. After a period of time of sale of the shelved item, shelving data such as sales data, discount data, price data, weather data, holiday data, and inventory data may be generated, and the complementary characteristics of the shelved item may be calculated based on the shelving data.
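As a rough illustration of how shelving data could be turned into per-dimension supplementary features for one shelved item, consider the sketch below. The record layout (`DailyRecord`), the feature names, and the 7-day and 30-day windows are all assumptions chosen for the example; the text leaves the choice of dimensions to business needs.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DailyRecord:
    day: int            # days since the item was shelved, starting at 0
    units_sold: int     # sales data
    price: float        # price data
    is_holiday: bool    # holiday data


def shelving_features(records: List[DailyRecord]) -> Dict[str, float]:
    """Aggregate one shelved item's daily records into feature dimensions."""
    week = [r for r in records if r.day < 7]
    month = [r for r in records if r.day < 30]
    mean_price = sum(r.price for r in month) / len(month) if month else 0.0
    return {
        "week_sales": float(sum(r.units_sold for r in week)),
        "month_sales": float(sum(r.units_sold for r in month)),
        "month_mean_price": mean_price,
        "holiday_sales": float(sum(r.units_sold for r in records if r.is_holiday)),
    }


# Forty days of constant sales, with a single holiday on day 3.
records = [DailyRecord(d, 2, 10.0, d == 3) for d in range(40)]
features = shelving_features(records)
```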
Step 103, inputting the inherent features and the supplementary features of the non-shelved item into a sales calculation model to output the sales volume of the non-shelved item. The sales calculation model is obtained by training a machine learning model with the inherent features and the supplementary features of the shelved second items.
Finally, the inherent features and the supplementary features of the non-shelved item are input into the sales calculation model, and the output of the sales calculation model is the future sales volume of the non-shelved item, such as its sales volume over the coming week or the coming month.
It should be noted that the prediction horizon for the non-shelved item is determined according to the needs of the business scenario and is not limited in this embodiment of the present invention, but it must be consistent with the length of time for which the second item has been shelved. For example, if the sales volume of the non-shelved item over the coming week is calculated, the second item has been shelved for one week; if the sales volume of the non-shelved item over the coming quarter is calculated, the second item has been shelved for one quarter. This keeps the sales period used in the training stage consistent with that used in the inference stage and improves the prediction accuracy of the sales calculation model.
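The consistency requirement can be made concrete with a small helper that builds the training label of a second item over exactly the same number of days as the prediction horizon. The function name and the daily granularity are assumptions made for this sketch.

```python
from typing import List


def horizon_label(daily_sales: List[int], horizon_days: int) -> int:
    """Total sales of a second item over its first `horizon_days` days on the shelf.

    Raises ValueError if the item has not been shelved long enough, since a
    shorter window would not match the horizon used at inference time.
    """
    if len(daily_sales) < horizon_days:
        raise ValueError("item has fewer days of sales than the prediction horizon")
    return sum(daily_sales[:horizon_days])


# One-week horizon: the label is the sum of the first seven days.
label = horizon_label([3, 1, 4, 1, 5, 9, 2, 6], 7)
```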
According to the various embodiments described above, the embodiments of the present invention calculate the supplementary features of the non-shelved item according to the shelving data of its similar items, and input the inherent features and the supplementary features of the non-shelved item into the sales calculation model to output the sales volume of the non-shelved item, thereby solving the technical problem in the prior art that the sales volume of a non-shelved item cannot be calculated. The embodiment of the invention calculates the supplementary features of the non-shelved item based on the shelving data of shelved items, and accurately calculates the sales volume of the non-shelved item over a period after shelving by using a machine learning regression method. This requires few data dimensions and is highly practicable, and it also establishes an intuitive relation between the features and the result, improving the interpretability of the result.
Fig. 2 is a schematic view of a main flow of a method of calculating sales of unladen items according to a reference embodiment of the present invention. As another embodiment of the present invention, as shown in fig. 2, the method for calculating the sales volume of the unladen goods may include:
Step 201, screening out similar items of the shelved second item from the shelved first items.
Steps 201 to 203 constitute the training phase, in which the sales calculation model is trained. The shelving time of the first item is earlier than that of the second item; for example, the first item may be an item that has been shelved for one year, two years, or even longer. The shelving time of the second item is determined according to the needs of the business scenario, and the length of time the second item has been shelved is the same as the prediction horizon for the non-shelved item. For example, if the sales volume of the non-shelved item over the coming quarter is calculated, the second item was shelved one quarter ago; if the sales volume of the non-shelved item over the coming half year is calculated, the second item was shelved half a year ago.
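One possible way to separate shelved items into first items and second items by shelving date is sketched below. The function name and the `slack_days` tolerance are hypothetical; the text only requires that first items were shelved earlier than second items and that second items have been shelved for about one prediction horizon.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple


def split_by_shelving_time(
    shelved_on: Dict[str, date],
    today: date,
    horizon_days: int,
    slack_days: int = 7,
) -> Tuple[List[str], List[str]]:
    """Return (first_items, second_items) based on each item's shelving date."""
    latest = today - timedelta(days=horizon_days)    # shelved one horizon ago
    earliest = latest - timedelta(days=slack_days)   # tolerated extra age
    first_items: List[str] = []
    second_items: List[str] = []
    for item_id, shelved in sorted(shelved_on.items()):
        if shelved < earliest:
            first_items.append(item_id)      # long-shelved item
        elif shelved <= latest:
            second_items.append(item_id)     # shelved about one horizon ago
        # Items shelved more recently have too little history and are skipped.
    return first_items, second_items


first, second = split_by_shelving_time(
    {"a": date(2020, 1, 1), "b": date(2021, 12, 20), "c": date(2022, 3, 1)},
    today=date(2022, 3, 25),
    horizon_days=90,
)
```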
Step 202, calculating supplementary features of the second item according to shelving data of similar items of the second item.
Because a non-shelved item has few features, namely only its inherent attributes, which are not enough to support training of the machine learning model, the embodiment of the invention first screens out, from the shelved first items, the similar items of a shelved second item (the data of the second item is used to train the model). It then calculates features from the shelving data of those similar items and uses them as the supplementary features of the second item. Finally, it inputs the inherent features and the supplementary features of the second item together into the machine learning model, takes the sales data of the second item as the output, and obtains the sales calculation model through iterative training.
Step 203, inputting the inherent features and the supplementary features of the second item into a machine learning model, taking the sales data of the second item as output, and obtaining a sales calculation model through iterative training.
Optionally, the machine learning model is LightGBM. The embodiment of the invention selects a machine learning model LightGBM for training, and fits the existing sales data by constructing a regression task.
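A minimal sketch of the training step follows. The dictionary layout of each second item, the column ordering, and the hyperparameter values are assumptions for illustration only; `LGBMRegressor` is the scikit-learn style regressor from the third-party `lightgbm` package, which is imported inside the training function so that the data-assembly part runs without it.

```python
from typing import Dict, List, Tuple


def build_training_set(
    second_items: List[Dict],
) -> Tuple[List[List[float]], List[float], List[str]]:
    """Concatenate inherent and supplementary features into rows; sales are targets."""
    inherent_keys = sorted(second_items[0]["inherent"])
    supplementary_keys = sorted(second_items[0]["supplementary"])
    X = [
        [item["inherent"][k] for k in inherent_keys]
        + [item["supplementary"][k] for k in supplementary_keys]
        for item in second_items
    ]
    y = [item["sales"] for item in second_items]
    return X, y, inherent_keys + supplementary_keys


def train_sales_model(X: List[List[float]], y: List[float]):
    """Fit a LightGBM regressor (requires the third-party lightgbm package)."""
    import lightgbm as lgb  # imported lazily; hyperparameters below are placeholders

    model = lgb.LGBMRegressor(n_estimators=200, learning_rate=0.05)
    model.fit(X, y)
    return model


items = [
    {"inherent": {"price": 9.9, "brand_id": 3.0},
     "supplementary": {"week_sales": 105.0}, "sales": 120.0},
    {"inherent": {"price": 19.9, "brand_id": 1.0},
     "supplementary": {"week_sales": 40.0}, "sales": 35.0},
]
X, y, columns = build_training_set(items)
```

With a realistic number of second items, `train_sales_model(X, y)` would return a fitted model whose `predict` method can then be applied to the feature rows of non-shelved items built with the same column order.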
The LightGBM model can measure the influence of each feature dimension on the prediction result according to the number of times that feature is used for splitting across all decision trees, so that the interpretable association between the attributes of a non-shelved item and its sales volume can be described quantitatively. Based on this interpretable association, attribute customization of the non-shelved item can be suggested in reverse from the merchant's sales target; the merchant can be advised to adjust its sales expectations according to the predicted sales volume and to adjust related production and inventory management accordingly; and the prediction can also serve as a reference for the merchant's pricing of the non-shelved item. Therefore, the embodiment of the invention predicts the sales volume of a non-shelved item over a period after shelving by using a machine learning regression method based on the inherent attributes of the item and the historical sales data of shelved items, establishes an interpretable relation between the attributes of the item and its sales volume after shelving, and provides support for attribute customization, production and inventory management, and pricing decisions for the non-shelved item.
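The split-count measure can be illustrated without any library by counting, over a set of toy decision trees, how often each feature appears as a split. The nested-dictionary tree format here is invented for the example and is not LightGBM's internal representation; in LightGBM itself, the "split" importance type reports the number of times each feature is used in the model.

```python
from collections import Counter
from typing import Dict, List


def count_splits(node: Dict, counts: Counter) -> None:
    """Recursively count split features; leaves have no 'feature' key."""
    if "feature" not in node:
        return
    counts[node["feature"]] += 1
    count_splits(node["left"], counts)
    count_splits(node["right"], counts)


def split_importance(trees: List[Dict]) -> Dict[str, int]:
    """Number of times each feature is used for splitting across all trees."""
    counts: Counter = Counter()
    for tree in trees:
        count_splits(tree, counts)
    return dict(counts)


toy_trees = [
    {"feature": "price",
     "left": {"value": 1.0},
     "right": {"feature": "week_sales",
               "left": {"value": 2.0},
               "right": {"value": 3.0}}},
    {"feature": "price", "left": {"value": 0.5}, "right": {"value": 1.5}},
]
importance = split_importance(toy_trees)
```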
Step 204, screening out similar items of the non-shelved item from the shelved first items.
Since a non-shelved item lacks the shelving data that become available only after shelving, such as historical sales data, discount data, and price data, it is necessary to first find its similar items among the shelved items, then calculate features based on those similar items, and use these features as the supplementary features of the non-shelved item.
Step 205, calculating the supplementary features of the non-shelved item according to the shelving data of the similar items of the non-shelved item.
After the similar articles of the non-shelved articles are screened out, the feature weight is calculated according to the similarity between the non-shelved articles and the similar articles, then the complementary features of the similar articles in each dimension are calculated based on the shelved data of the similar articles, such as sales volume of a week, sales volume of a month, price trend of a month, inventory quantity of a quarter, sales volume of holidays, air temperature of a month and the like, the complementary features of the similar articles in each dimension can be calculated according to business needs, and finally the complementary features in each dimension are weighted and summed respectively based on the feature weight of the similar articles.
Step 206, inputting the inherent features and the supplementary features of the non-shelved item into the sales calculation model to output the sales volume of the non-shelved item.
Finally, the inherent features and the supplementary features of the non-shelved item are input into the sales calculation model, and the output of the sales calculation model is the future sales volume of the non-shelved item, such as its sales volume over the coming two weeks or the coming two months.
In addition, in a reference embodiment of the present invention, the detailed implementation of the method for calculating the sales amount of the non-shelved item is already described in detail in the above method for calculating the sales amount of the non-shelved item, so that the repeated content is not described again.
Fig. 3 is a schematic view of a main flow of a method for calculating sales of an unladen item according to another referential embodiment of the present invention. As another embodiment of the present invention, as shown in fig. 3, the step of the training phase may include:
Step 301, clustering the shelved first items and second items to obtain a plurality of clusters and the similarity between the items within each cluster.
Step 302, taking each item in the cluster where the second item is located as a similar item of the second item.
Step 303, calculating the feature weights of the similar items of the second item according to the similarity between the second item and its similar items.
Step 304, calculating the supplementary features of the similar items of the second item in each dimension according to the shelving data of the similar items of the second item.
Step 305, for each dimension, performing a weighted summation of the similar items' supplementary features in that dimension based on the feature weights of the similar items of the second item, to obtain the supplementary feature of the second item in that dimension.
Step 306, inputting the inherent features and the supplementary features of the second item into a machine learning model, taking the sales data of the second item as output, and obtaining a sales calculation model through iterative training.
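Steps 301 and 302 can be sketched as follows. The text does not name a clustering algorithm, so plain k-means with a deterministic initialization is used here purely as a stand-in; the function names and the use of inherent-feature vectors as clustering input are likewise assumptions. The returned Euclidean distances would still need to be scaled before being turned into feature weights in step 303.

```python
import math
from typing import Dict, List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points: Dict[str, Sequence[float]], k: int, iters: int = 20) -> Dict[str, int]:
    """Assign each item to one of k clusters (stand-in clustering method)."""
    names = sorted(points)
    # Deterministic initialization: the first k items in name order.
    centroids: List[List[float]] = [list(points[n]) for n in names[:k]]
    labels: Dict[str, int] = {}
    for _ in range(iters):
        labels = {
            n: min(range(k), key=lambda c: euclidean(points[n], centroids[c]))
            for n in names
        }
        for c in range(k):
            members = [points[n] for n in names if labels[n] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def similar_items(target: str, points: Dict[str, Sequence[float]],
                  labels: Dict[str, int]) -> Dict[str, float]:
    """Items in the target's cluster, with their Euclidean distance to the target."""
    return {
        n: euclidean(points[target], points[n])
        for n in points
        if n != target and labels[n] == labels[target]
    }


points = {"a": (0.0, 0.0), "b": (0.0, 1.0), "c": (10.0, 10.0),
          "d": (10.0, 11.0), "second": (0.2, 0.4)}
labels = kmeans(points, k=2)
neighbours = similar_items("second", points, labels)
```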
In addition, in another embodiment of the present invention, the detailed implementation of the method for calculating the sales amount of the non-shelved item is described in detail in the above-mentioned method for calculating the sales amount of the non-shelved item, and therefore the repeated content is not described herein.
FIG. 4 is a schematic diagram of the main modules of an apparatus for calculating the sales volume of non-shelved items according to an embodiment of the present invention. As shown in fig. 4, the device 400 for calculating the sales volume of non-shelved items comprises a screening module 401, a feature module 402, and a calculation module 403. The screening module 401 is used for screening out similar items of the non-shelved item from the shelved first items; the feature module 402 is used for calculating the supplementary features of the non-shelved item according to the shelving data of the similar items of the non-shelved item; the calculation module 403 is used for inputting the inherent features and the supplementary features of the non-shelved item into a sales calculation model to output the sales volume of the non-shelved item. The sales calculation model is obtained by training a machine learning model with the inherent features and the supplementary features of the shelved second items.
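The division of labour among the three modules can be pictured as a small composition of callables. The class name, field names, and type signatures below are hypothetical and only mirror the roles of modules 401 to 403; the stub functions stand in for real screening, feature calculation, and a trained model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Features = Dict[str, float]


@dataclass
class SalesDevice:
    screen: Callable[[Features], List[str]]               # role of screening module 401
    featurize: Callable[[Features, List[str]], Features]  # role of feature module 402
    model: Callable[[Features], float]                    # model used by calculation module 403

    def calculate(self, inherent: Features) -> float:
        """Screen similar items, build supplementary features, run the model."""
        similar = self.screen(inherent)
        supplementary = self.featurize(inherent, similar)
        return self.model({**inherent, **supplementary})


# Stubs for illustration only.
device = SalesDevice(
    screen=lambda inherent: ["item_a"],
    featurize=lambda inherent, similar: {"week_sales": 10.0 * len(similar)},
    model=lambda features: sum(features.values()),
)
predicted = device.calculate({"price": 5.0})
```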
Optionally, the screening module 401 is further configured to:
clustering the first articles which are placed on the shelves and the articles which are not placed on the shelves to obtain a plurality of clusters and the similarity between the articles in the clusters;
and taking each article in the cluster where the non-shelved article is positioned as a similar article of the non-shelved article.
Optionally, the feature module 402 is further configured to:
calculating the characteristic weight of the similar articles of the non-shelved articles according to the similarity between the non-shelved articles and the similar articles;
according to shelving data of similar articles of the non-shelving articles, calculating supplementary features of the similar articles of the non-shelving articles in all dimensions;
and for the supplementary features of the non-shelved item in each dimension, carrying out weighted summation on the supplementary features of the dimension based on the feature weights of similar items of the non-shelved item to obtain the supplementary features of the non-shelved item in the dimension.
Optionally, the feature module 402 is further configured to:
for each similar item of the non-shelved item, subtracting the similarity between the similar item and the non-shelved item from one to obtain the feature weight of the similar item.
Optionally, the shelving data comprises at least one of:
sales data, discount data, price data, weather data, holiday data, and inventory data.
Optionally, a training module is further included for:
screening out similar articles of the second article which is already put on the shelf from the first article which is already put on the shelf;
calculating supplementary features of the second item according to shelving data of similar items of the second item;
and inputting the inherent features and the supplementary features of the second article into a machine learning model, taking the sales data of the second article as output, and obtaining a sales calculation model through iterative training.
Optionally, the machine learning model is LightGBM.
Optionally, the shelving time of the first item is earlier than the shelving time of the second item.
According to the various embodiments described above, the embodiments of the present invention calculate the supplementary features of the non-shelved item according to the shelving data of its similar items, and input the inherent features and the supplementary features of the non-shelved item into the sales calculation model to output the sales volume of the non-shelved item, thereby solving the technical problem in the prior art that the sales volume of a non-shelved item cannot be calculated. The embodiment of the invention calculates the supplementary features of the non-shelved item based on the shelving data of shelved items, and accurately calculates the sales volume of the non-shelved item over a period after shelving by using a machine learning regression method. This requires few data dimensions and is highly practicable, and it also establishes an intuitive relation between the features and the result, improving the interpretability of the result.
It should be noted that, in the implementation of the apparatus for calculating the sales amount of the non-shelved item according to the present invention, the above method for calculating the sales amount of the non-shelved item has been described in detail, and therefore, the repeated content herein will not be described again.
Fig. 5 illustrates an exemplary system architecture 500 to which the method or the apparatus for calculating the sales volume of non-shelved items of embodiments of the present invention may be applied.
As shown in fig. 5, the system architecture 500 may include terminal devices 501, 502, 503, a network 504, and a server 505. The network 504 is the medium used to provide communication links between terminal devices 501, 502, 503 and the server 505. Network 504 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 501, 502, 503 to interact with a server 505 over a network 504 to receive or send messages or the like. The terminal devices 501, 502, 503 may have installed thereon various communication client applications, such as shopping-like applications, web browser applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 501, 502, 503 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 505 may be a server providing various services, such as a background management server (for example only) providing support for shopping websites browsed by users using the terminal devices 501, 502, 503. The background management server can analyze and process the received data such as the article information query request and feed back the processing result to the terminal equipment.
It should be noted that the method for calculating the sales volume of non-shelved items provided by the embodiment of the present invention is generally executed by the server 505, and accordingly, the device for calculating the sales volume of non-shelved items is generally disposed in the server 505. The method for calculating the sales volume of non-shelved items provided by the embodiment of the present invention can also be executed by the terminal devices 501, 502, 503, and accordingly, the device for calculating the sales volume of non-shelved items can be disposed in the terminal devices 501, 502, 503.
It should be understood that the number of terminal devices, networks, and servers in fig. 5 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 6, a block diagram of a computer system 600 suitable for use with a terminal device implementing an embodiment of the invention is shown. The terminal device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU)601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM603, various programs and data necessary for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read out therefrom is installed into the storage section 608 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 601.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer programs according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor includes a screening module, a feature module, and a calculation module, where the names of the modules do not in some cases constitute a limitation on the modules themselves.
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the device described in the above embodiments, or may exist separately without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by a device, implement the following method: screening out similar items of the non-shelved item from the shelved first items; calculating supplementary features of the non-shelved item according to shelving data of the similar items of the non-shelved item; inputting the inherent features and the supplementary features of the non-shelved item into a sales calculation model to output the sales volume of the non-shelved item; wherein the sales calculation model is obtained by training a machine learning model with the inherent features and the supplementary features of the shelved second items.
According to the technical scheme of the embodiment of the invention, the technical means that the supplement characteristics of the articles which are not shelved are calculated according to the shelving data of the similar articles of the articles which are not shelved is adopted, and the inherent characteristics and the supplement characteristics of the articles which are not shelved are input into the sales volume calculation model to output the sales volume of the articles which are not shelved is adopted, so that the technical problem that the sales volume of the articles which are not shelved cannot be calculated in the prior art is solved. The embodiment of the invention calculates the supplementary characteristics of the non-shelved items based on the shelving data of the shelved items, and accurately calculates the sales volume of the non-shelved items in a period of shelving by using a machine learning regression method, thereby not only having low data demand dimension and strong availability, but also establishing the visual relation between the characteristics and the result and improving the interpretability of the result.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (11)

1. A method of calculating sales of unladen items, comprising:
screening similar articles of the articles which are not shelved from the first articles which are shelved;
calculating supplementary features of the non-shelved item according to shelving data of similar items of the non-shelved item;
inputting the inherent features and the supplementary features of the non-shelved item into a sales calculation model to output the sales volume of the non-shelved item; wherein the sales calculation model is obtained by training a machine learning model with the inherent features and the supplementary features of the shelved second items.
2. The method of claim 1, wherein screening out similar items from the shelved first item for non-shelved items comprises:
clustering the first articles which are placed on the shelves and the articles which are not placed on the shelves to obtain a plurality of clusters and the similarity between the articles in the clusters;
and taking each article in the cluster where the non-shelved article is positioned as a similar article of the non-shelved article.
3. The method of claim 2, wherein calculating the supplementary features of the non-shelved item according to the shelving data of the similar items of the non-shelved item comprises:
calculating the characteristic weight of the similar articles of the non-shelved articles according to the similarity between the non-shelved articles and the similar articles;
according to shelving data of similar articles of the non-shelving articles, calculating supplementary features of the similar articles of the non-shelving articles in all dimensions;
and for the supplementary features of the non-shelved item in each dimension, carrying out weighted summation on the supplementary features of the dimension based on the feature weights of similar items of the non-shelved item to obtain the supplementary features of the non-shelved item in the dimension.
4. The method of claim 3, wherein calculating the feature weights of the similar items of the non-shelved item according to the similarity between the non-shelved item and its similar items comprises:
for each similar item of the non-shelved item, subtracting the similarity between the similar item and the non-shelved item from one to obtain the feature weight of the similar item.
5. The method of claim 3, wherein the shelving data comprises at least one of:
sales data, discount data, price data, weather data, holiday data, and inventory data.
6. The method of claim 1, further comprising, before screening out similar items of the non-shelved item from the shelved first items:
screening out similar articles of the second article which is already put on the shelf from the first article which is already put on the shelf;
calculating supplementary features of the second item according to shelving data of similar items of the second item;
and inputting the inherent features and the supplementary features of the second article into a machine learning model, taking the sales data of the second article as output, and obtaining a sales calculation model through iterative training.
7. The method of claim 6, wherein the machine learning model is LightGBM.
8. The method of claim 6, wherein the shelving time of the first item is earlier than the shelving time of the second item.
9. An apparatus for calculating sales of non-shelved items, comprising:
the screening module is used for screening similar articles of the articles which are not shelved from the first articles which are shelved;
the characteristic module is used for calculating the supplementary characteristics of the non-shelved articles according to shelving data of similar articles of the non-shelved articles;
the calculation module is used for inputting the inherent characteristics and the supplementary characteristics of the non-shelved goods into a sales calculation model so as to output the sales of the non-shelved goods; and the sales calculation model is obtained by training a machine learning model by using the inherent characteristics and the supplementary characteristics of the second item which is placed on the shelf.
10. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
the one or more programs, when executed by the one or more processors, implement the method of any of claims 1-8.
11. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-8.
CN202210300331.1A 2022-03-25 2022-03-25 Method and device for calculating sales volume of unladen articles Pending CN114677174A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210300331.1A CN114677174A (en) 2022-03-25 2022-03-25 Method and device for calculating sales volume of unladen articles

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210300331.1A CN114677174A (en) 2022-03-25 2022-03-25 Method and device for calculating sales volume of unladen articles

Publications (1)

Publication Number Publication Date
CN114677174A true CN114677174A (en) 2022-06-28

Family

ID=82074896

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210300331.1A Pending CN114677174A (en) 2022-03-25 2022-03-25 Method and device for calculating sales volume of unladen articles

Country Status (1)

Country Link
CN (1) CN114677174A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117196695A (en) * 2023-11-03 2023-12-08 中国民航信息网络股份有限公司 Target product sales data prediction method and device
CN117196695B (en) * 2023-11-03 2024-02-27 中国民航信息网络股份有限公司 Target product sales data prediction method and device

Similar Documents

Publication Publication Date Title
CN110751497A (en) Commodity replenishment method and device
CN110348921B (en) Method and device for selecting store articles
CN109214730A (en) Information-pushing method and device
CN113095893A (en) Method and device for determining sales of articles
CN110929136A (en) Personalized recommendation method and device
CN112184348A (en) Order data processing method and device, electronic equipment and medium
CN111126442A (en) Method for generating key attribute of article, method and device for classifying article
CN110866625A (en) Promotion index information generation method and device
CN112749323A (en) Method and device for constructing user portrait
CN115033801A (en) Article recommendation method, model training method and electronic equipment
CN114677174A (en) Method and device for calculating sales volume of unladen articles
CN112784212B (en) Inventory optimization method and device
CN110599281A (en) Method and device for determining target shop
CN110827102A (en) Method and device for adjusting goods price ratio
CN111612385B (en) Method and device for clustering articles to be distributed
CN113743971A (en) Data processing method and device
CN116823404A (en) Commodity combination recommendation method, device, equipment and medium thereof
CN113495991A (en) Recommendation method and device
CN112825182A (en) Method and device for determining recommended commodities
CN114663015A (en) Replenishment method and device
CN110880119A (en) Data processing method and device
CN113449175A (en) Hot data recommendation method and device
CN110826948B (en) Warehouse selecting method and device
CN112784861A (en) Similarity determination method and device, electronic equipment and storage medium
CN113159877A (en) Data processing method, device, system and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination