CN110633401A - Prediction model of store data and establishment method thereof - Google Patents
- Publication number
- CN110633401A
- Authority
- CN
- China
- Prior art keywords
- data
- store
- prediction model
- variable data
- sales
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/258—Data format conversion from or to a database
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/10—Pre-processing; Data cleansing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
- G06Q30/0202—Market predictions or forecasting for commercial activities
Abstract
The embodiment of the invention discloses a prediction model of store data and an establishment method thereof. The establishment method comprises the following steps: acquiring first store information around stores through a web crawler technology, and acquiring second store information through an enterprise internal management system; cleaning the information and performing structured data processing to obtain different types of sales data; dividing the sales data into continuous variable data and discrete variable data; performing normalization correction on the continuous variable data and assignment continuous optimization on the discrete variable data; performing training set learning by a machine learning method to obtain a prediction model of the sales scale; and performing retrospective optimization on the prediction model by using actual sales scale data of the store to be predicted. The prediction model and the establishment method can reasonably and accurately predict the sales data of stores, so that store operations can be arranged reasonably and the stores' sales per unit area can be improved.
Description
Technical Field
The invention relates to the field of big data, in particular to a prediction model of store data and an establishment method thereof.
Background
Traditionally, an enterprise forecasts the sales scale of its stores manually, relying on the experience and manual analysis of business experts and sales personnel. This approach, known as the expert method, yields forecasts quickly and simply, but it does not use sales data to reach reasonable and persuasive judgments and has no theoretical support. Forecast quality is therefore uneven, which in turn affects store establishment, staffing, stocking, operation and management.
With machine learning now widely applied, enterprises can import the data of different stores directly into a computer and predict store sales through machine learning. Such prediction, however, is only a simple application of machine learning. Store sales are influenced not only by a store's own operating data but also by multiple factors such as geographic location, urban development and the enterprise's other stores, and this approach does not analyze or process those factors. In addition, applying store operating data in machine learning carries considerable uncertainty, which can also degrade the final prediction.
Disclosure of Invention
In order to solve the problems in the prior art, the embodiment of the invention provides an establishment method and a system for a store data prediction model, which can reasonably and accurately predict the sales data of stores, so that store operations can be arranged reasonably and the stores' sales per unit area can be improved.
In order to solve the technical problems, the invention adopts the technical scheme that:
in a first aspect, an embodiment of the present invention provides a method for building a store data prediction model, including the following steps:
acquiring first store information around a store through a web crawler technology, acquiring second store information through an enterprise internal management system, and storing the first store information and the second store information into a database;
cleaning and structuring data processing are carried out on the first store information and the second store information to obtain different types of sales data, the sales data are divided into continuous variable data and discrete variable data, normalization correction is carried out on the continuous variable data, and assignment continuous optimization is carried out on the discrete variable data;
scoring the feature data obtained after the normalization correction and the assignment continuous optimization by using a machine learning method, automatically selecting the feature data with high scores to perform training set learning to obtain a sales scale prediction model, and performing retrospective optimization on the prediction model by using the actual sales scale data of the store to be predicted.
And further, outputting a predicted value of the sales scale of the store to be predicted by using the prediction model according to the specific position and retail scene of the store to be predicted.
Furthermore, the crawling ways of the crawler technology at least comprise crawling map data, crawling online shopping data and crawling official statistical data.
Further, the first store information comprises a city data characteristic, a business circle data characteristic and a geographic data characteristic, and the second store information comprises a store member data characteristic and a store project data characteristic.
Further, the normalization correction includes: carrying out logarithmic skew correction on the continuous variable data which are approximately normally distributed, and then carrying out mean normalization on the corrected continuous variable data.
Further, the assignment continuous optimization includes: carrying out one-way analysis of variance on the discrete variable data, screening out the discrete variable data which have a large influence on the store's sales per unit area, converting the screened discrete variable data into the corresponding mean sales per unit area of the stores, and ordering the different mean values consecutively according to their magnitudes.
Further, before the prediction result is output by using the prediction model, the method further comprises determining the business format type of the retail scenario, and automatically associating and matching the retail scenario with the corresponding specified prediction model according to that type.
On the other hand, the embodiment of the invention also provides a prediction model of store data, which comprises:
the system comprises a data acquisition module and a database, wherein the data acquisition module comprises a web crawler unit and an enterprise data acquisition unit, the web crawler unit is used for acquiring first store information around stores through a web crawler technology, the enterprise data acquisition unit is used for acquiring second store information through an enterprise internal management system, and the data acquisition module stores the first store information and the second store information into the database;
the data processing module is used for cleaning and carrying out structured data processing on the first store information and the second store information to obtain different types of sales data, dividing the sales data into continuous variable data and discrete variable data, carrying out normalization correction on the continuous variable data, and carrying out assignment continuous optimization on the discrete variable data;
and the machine learning module is used for scoring the feature data obtained after the normalization correction and the assignment continuous optimization, automatically selecting the feature data with high scores to perform training set learning to obtain a prediction model of the sales scale, and performing retrospective optimization on the prediction model by using the actual sales scale data of the store to be predicted.
Furthermore, the data processing module comprises a continuous variable correction unit, and is used for carrying out logarithmic correction on the continuous variable data which is approximately normally distributed, and then carrying out mean value normalization processing on the corrected continuous variable data.
Further, the data processing module comprises a discrete variable optimization unit, which is used for performing one-way analysis of variance on the discrete variable data, screening out the discrete variable data which have a large influence on the store's sales per unit area, converting the screened discrete variable data into the corresponding mean sales per unit area of the stores, and ordering the different mean values consecutively according to their magnitudes.
The technical scheme provided by the embodiment of the invention has the following beneficial effects:
the embodiment of the invention provides an establishment method and a system for a store data prediction model. Information around a store is collected through computer web crawler technology, which can acquire a large amount of first store information accurately, so the quantity and accuracy of the acquired information are markedly improved. Meanwhile, second store information is acquired from the enterprise's internal management system. The first store information and the second store information are cleaned and optimized by computer analysis into reasonable and accurate target feature data; this data processing removes multiple interference factors from the acquired data before machine learning, which markedly improves the accuracy of the model obtained by machine learning. The prediction model is further subjected to retrospective optimization using the actual sales data of the store, which improves its output accuracy again, so that the predicted data it finally outputs are reasonable and effective. Store operations can then be arranged reasonably on the basis of effective prediction results, improving the stores' sales per unit area.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a flow chart of a method for building a store data prediction model according to an embodiment of the present invention;
FIG. 2 is a flow chart of data processing analysis in the establishment method of the store data prediction model disclosed in the embodiment of the present invention;
fig. 3 is an architectural diagram of a predictive model of store data according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The first embodiment is as follows:
fig. 1 discloses a method for building a store data prediction model according to an embodiment of the present invention, which includes the following steps:
s1: acquiring first store information around a store through a web crawler technology, acquiring second store information through an enterprise internal management system, and storing the first store information and the second store information into a database;
s2: cleaning and structuring data processing are carried out on the first store information and the second store information to obtain different types of sales data, the sales data are divided into continuous variable data and discrete variable data, normalization correction is carried out on the continuous variable data, and assignment continuous optimization is carried out on the discrete variable data;
s3: scoring the feature data obtained after the normalization correction and the assignment continuous optimization by using a machine learning method, automatically selecting the feature data with high scores to perform training set learning to obtain a sales scale prediction model, and performing retrospective optimization on the prediction model by using the actual sales scale data of the store to be predicted.
Preferably, according to the specific location and retail scenario of the store to be predicted, the prediction model outputs a predicted value of the sales scale of that store; the operation of the store can then be arranged reasonably on the basis of an effective prediction result, improving the store's sales per unit area.
A web crawler is a program that automatically browses the network and captures the information the user requires from the World Wide Web according to certain rules. With the development of the internet, the network has become the carrier of a large amount of information, and crawling has become an important part of data acquisition and the most basic step in big data analysis. Preferably, the crawling channels of the crawler technology include, but are not limited to, crawling map data, crawling online shopping data, and crawling official statistical data. Specifically, a web crawler tool crawls data from websites such as Amap (Gaode Map), National Data, Dianping and Meituan to obtain information such as surrounding passenger flow, population, commercial distribution, point-of-interest distribution and traffic conditions.
Preferably, the first store information includes city data characteristics, business circle data characteristics and geographic data characteristics. Specifically, since the macroeconomic characteristics of a city correlate with the number and quality of potential customers, the city data characteristics need to be considered first. The collected city data characteristics include: resident population, registered (household) population, total GDP, urbanization rate, disposable income of urban residents, and total retail sales of social consumer goods. A detailed analysis of the population structure is also necessary so that the analysis within the radiation range of the business circle is more reasonable. The collected business circle data characteristics include: demographic data characteristics, passenger flow data characteristics, and customer profile characteristics. According to the theory of "retail clustering", retail enterprises that open stores in the same area generate a "spillover effect" that attracts larger passenger flow, so the number and distribution of competitors need to be considered. Transportation facilities and points of interest (POI) also clearly attract passenger flow: people usually choose to consume in places with convenient transportation, and large gathering points directly bring passenger flow. The collected geographic data characteristics therefore include: competitor data, transportation facilities, and points of interest (POI).
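As a rough illustration of how geographic data characteristics such as competitor and POI counts might be derived from crawled records, the following Python sketch counts records within a fixed radius of a store using the haversine great-circle distance. The record layout, the category names, the coordinates and the 1 km radius are all hypothetical; the patent does not specify them.

```python
import math
from collections import Counter

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def geographic_features(store, pois, radius_km=1.0):
    """Count crawled POIs of each category within radius_km of the store."""
    counts = Counter()
    for poi in pois:
        distance = haversine_km(store["lat"], store["lon"], poi["lat"], poi["lon"])
        if distance <= radius_km:
            counts[poi["category"]] += 1
    return dict(counts)

# Hypothetical crawled records (made-up coordinates, for illustration only).
store = {"lat": 31.2304, "lon": 121.4737}
pois = [
    {"category": "competitor", "lat": 31.2310, "lon": 121.4740},
    {"category": "subway_station", "lat": 31.2330, "lon": 121.4760},
    {"category": "competitor", "lat": 31.3000, "lon": 121.6000},  # several km away
]
features = geographic_features(store, pois)
```

Here `features` holds one count per category inside the radius; the distant competitor is excluded.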
The second store information includes store membership data characteristics and store item data characteristics. Specifically, since the members of a store are its major consumers, their characteristics need to be analyzed, including: number of members, member penetration rate, membership annual rate, annual purchase amount per member, and average transaction value per member. Since the characteristics of the store item itself are obviously directly related to sales, the subsequent characteristic analysis is based mainly on these data. The store item data characteristics include: city category, market level, city grade division, business district type, store nature, operation time, annual tax-inclusive sales, annual rent, floor distribution, interior (in-house) area, and building area.
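Storing the two kinds of store information in a database and reading them back as one record per store could look like the sketch below. It uses Python's built-in `sqlite3` module with an in-memory database; the table names, column names and sample values are assumptions made for illustration and are not taken from the patent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for the sketch only
conn.executescript("""
CREATE TABLE first_store_info (      -- crawled external data (hypothetical columns)
    store_id TEXT PRIMARY KEY,
    resident_population INTEGER,
    competitor_count INTEGER
);
CREATE TABLE second_store_info (     -- internal management system data (hypothetical columns)
    store_id TEXT PRIMARY KEY,
    member_count INTEGER,
    annual_rent REAL,
    interior_area REAL
);
""")

# Made-up sample rows for one store.
conn.execute("INSERT INTO first_store_info VALUES (?, ?, ?)", ("S001", 1200000, 3))
conn.execute("INSERT INTO second_store_info VALUES (?, ?, ?, ?)", ("S001", 5400, 860000.0, 420.0))

# Join both sources into one feature row per store.
rows = conn.execute("""
    SELECT f.store_id, f.resident_population, f.competitor_count,
           s.member_count, s.annual_rent, s.interior_area
    FROM first_store_info AS f
    JOIN second_store_info AS s ON f.store_id = s.store_id
""").fetchall()
```

Keying both tables on a shared store identifier is one simple way to make the later cleaning and structuring step operate on a single row per store.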
As shown in the data processing analysis flow chart of fig. 2, the normalization correction comprises: carrying out logarithmic skew correction on the continuous variable data which are approximately normally distributed, and then carrying out mean normalization on the corrected continuous variable data. Specifically, after the above data are collated, analysis of the continuous variable data that have a large influence on store sales (for example, sales per unit area, sales volume, interior area and rent) shows that they are close to normally distributed but have an obvious positive skew, so normalizing them directly would not achieve the desired effect. The positive skew is therefore first relieved by taking log(1 + x), where x represents the continuous variable data before the logarithm is taken. Analyzing the logarithmized result again by computer shows that it has reached a normal distribution. On this basis, mean normalization is applied to the result: $x_{\mathrm{scale}} = (x_1 - \mathrm{mean}) / \mathrm{std}$, where mean is the mean of the set of logarithmized continuous variable data, std is the standard deviation of that set, $x_1$ is an element of the set, and $x_{\mathrm{scale}}$ is the result after normalization correction. Mean normalization reduces the fluctuation of the continuous variable data, which helps the optimization reach a global minimum and makes the learner in the subsequent machine learning process more convincing and accurate, with reduced errors.
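A minimal Python sketch of the normalization correction described above: take log(1 + x), then subtract the mean and divide by the standard deviation. The patent does not say whether the population or the sample standard deviation is meant; the population form (`statistics.pstdev`) is assumed here, and the rent figures are made up.

```python
import math
import statistics

def normalization_correction(values):
    """Apply log(1 + x) skew correction, then mean normalization (x1 - mean) / std."""
    logged = [math.log1p(v) for v in values]   # x1 = log(1 + x)
    mean = statistics.fmean(logged)
    std = statistics.pstdev(logged)            # assumption: population standard deviation
    if std == 0:
        return [0.0 for _ in logged]           # constant column carries no information
    return [(x1 - mean) / std for x1 in logged]

# Made-up, positively skewed annual rents (one very large value).
annual_rent = [120.0, 150.0, 180.0, 240.0, 900.0]
rent_scaled = normalization_correction(annual_rent)
```

After the transformation the column has mean 0 and unit standard deviation, and the single large rent sits much closer to the rest than it does on the raw scale.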
Preferably, the assignment continuous optimization comprises: carrying out one-way analysis of variance on the discrete variable data, screening out the discrete variable data which have a large influence on the store's sales per unit area, converting the screened discrete variable data into the corresponding mean sales per unit area of the stores, and ordering the different mean values consecutively according to their magnitudes. Specifically, the larger the value obtained from the one-way analysis of variance of a discrete variable, the larger that variable's influence on the store's sales per unit area. The computer therefore performs one-way analysis of variance on multiple groups of discrete variable data, and the discrete characteristic analysis finds that several discrete variables, such as store nature and city grade, have a large influence on sales per unit area. For each such variable, ordered values 1, 2, 3, 4 can be defined for its categories according to the mean sales per unit area under each category, so that the influence on sales per unit area is described quantitatively. This is equivalent to assigning values to the discrete variable data so that they become continuous variable data. This method differs from One-Hot encoding; compared with One-Hot encoding it retains more information and markedly improves the breadth of acquisition and the depth of processing of important sales data, thereby ensuring the accuracy and representativeness of the subsequent machine learning.
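The two parts of this step can be sketched in Python as follows: a one-way ANOVA F statistic to judge how strongly a discrete variable influences sales per unit area, and an encoding that replaces each category with the rank (1, 2, 3, ...) of its mean sales per unit area. This is one reading of the description; the function names and the sample data are hypothetical, and the patent gives no screening threshold for the statistic.

```python
from collections import defaultdict
from statistics import fmean

def group_by_category(categories, target):
    """Collect the target values observed under each category."""
    groups = defaultdict(list)
    for category, value in zip(categories, target):
        groups[category].append(value)
    return groups

def one_way_anova_f(categories, target):
    """F statistic of a one-way ANOVA (needs >= 2 categories and more samples than categories)."""
    groups = group_by_category(categories, target)
    grand_mean = fmean(target)
    k, n = len(groups), len(target)
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups.values())
    ss_within = sum((v - fmean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def mean_rank_encoding(categories, target):
    """Map each category to the rank of its mean target value, lowest mean = 1."""
    groups = group_by_category(categories, target)
    ordered = sorted(groups, key=lambda c: fmean(groups[c]))
    return {category: rank for rank, category in enumerate(ordered, start=1)}

# Made-up sample: store nature clearly separates sales per unit area, region does not.
store_nature = ["street", "street", "mall", "mall", "outlet", "outlet"]
region = ["north", "south", "north", "south", "north", "south"]
sales_per_area = [2.0, 2.2, 3.1, 2.9, 1.0, 1.2]

f_nature = one_way_anova_f(store_nature, sales_per_area)
f_region = one_way_anova_f(region, sales_per_area)
encoding = mean_rank_encoding(store_nature, sales_per_area)
encoded_nature = [encoding[c] for c in store_nature]
```

In this toy sample `f_nature` is far larger than `f_region`, so only store nature would be kept and encoded. Unlike One-Hot encoding, the result is a single ordered numeric column rather than one column per category.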
Preferably, before the prediction result is output by using the prediction model, the method further comprises determining the business format type of the retail scenario, and automatically associating and matching the retail scenario with the corresponding specified prediction model according to that type. Specifically, different retail scenarios have different business formats, so machine learning produces a different prediction model for each retail scenario. Before a prediction model is used, the retail scenario of the store to be predicted must first be determined; the computer then automatically associates and matches the specified prediction model according to the input retail scenario. Compared with the traditional approach of generating a single unified prediction model through machine learning, this matching is more targeted and yields more accurate predictions. For an enterprise's stores, the output sales scale is more accurate, so the stores can arrange their operations reasonably and improve their sales per unit area.
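The matching of a retail scenario to its own prediction model can be pictured as a lookup in a registry keyed by business format. The Python sketch below is illustrative only: the format names and the placeholder "models" (simple multipliers on interior area) are invented, and a real system would register trained models instead.

```python
from typing import Callable, Dict

Features = Dict[str, float]
MODEL_REGISTRY: Dict[str, Callable[[Features], float]] = {}

def register_model(business_format: str):
    """Decorator that files a model under the business format it was trained for."""
    def decorator(model: Callable[[Features], float]):
        MODEL_REGISTRY[business_format] = model
        return model
    return decorator

@register_model("shopping_mall")
def mall_model(features: Features) -> float:
    return 3.0 * features["interior_area"]   # placeholder coefficient, not from the patent

@register_model("street_store")
def street_model(features: Features) -> float:
    return 2.0 * features["interior_area"]   # placeholder coefficient, not from the patent

def predict_sales_scale(business_format: str, features: Features) -> float:
    """Select the model matching the retail scenario, then predict with it."""
    try:
        model = MODEL_REGISTRY[business_format]
    except KeyError:
        raise ValueError(f"no prediction model registered for {business_format!r}") from None
    return model(features)

mall_forecast = predict_sales_scale("shopping_mall", {"interior_area": 100.0})
```

Failing loudly on an unknown business format, rather than falling back to a generic model, keeps the targeted matching described above explicit.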
Machine learning is an interdisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other disciplines. It studies how computers simulate or realize human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve their own performance. Specifically, the prediction model is established automatically on the basis of the existing feature data and earlier investigation. Machine learning includes bagging algorithms and boosting algorithms, and this embodiment preferably uses random forest regression, a bagging algorithm. A random forest is a strong learner composed of multiple classification and regression trees acting as weak learners, and it can automatically select and score the optimal features. When building a model for predicting stores, 20 processed feature data having a large influence on stores are input, namely the following features: 'store level', 'region', 'city level', 'city category', 'market level', 'distance from core enterprise', 'commercial site planning type', 'store nature', 'time of opening', 'if it is a shopping mall store', 'annual rent', 'in-house area_SUM', 'in-house area_B2F', 'in-house area_B1F', 'in-house area_1F', 'in-house area_2F', 'in-house area_3F', 'in-house area_4F', 'in-house area_5F', 'in-house area_6F', 'in-house area_7F', 'in-house area_8F', 'in-house area_interlayer', 'no-sales-level-effect-in-house', 'sales-all-year-including tax-recovery'. The computer then automatically selects the 16 feature data with the highest scores and uses 500 regression trees as weak learners; 80% of the sample set is selected as the training set and the remaining 20% as the test set. The score of the prediction model learned from the training set was 0.84.
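The workflow of this paragraph (score features, keep the highest-scoring ones, split 80/20, train a bagged tree ensemble, report a score) can be imitated with the Python standard library alone. The sketch below is a deliberately simplified stand-in, not the patent's implementation: each "tree" is a single-split regression stump rather than a full regression tree, the feature score is the accumulated reduction in squared error, the reported score is assumed to be R² (the patent does not name the metric), and the sizes are shrunk from 500 trees and 16 features to 30 stumps and 2 features on synthetic data.

```python
import random
from statistics import fmean

def sse(ys):
    """Sum of squared deviations from the mean."""
    m = fmean(ys)
    return sum((v - m) ** 2 for v in ys)

def fit_stump(X, y, feature_ids):
    """Best single split; returns (feature, threshold, left_mean, right_mean, gain)."""
    base = sse(y)
    best = (feature_ids[0], float("inf"), fmean(y), fmean(y), 0.0)  # fallback: no split
    for j in feature_ids:
        for t in sorted({row[j] for row in X})[:-1]:  # every value except the largest
            left = [v for row, v in zip(X, y) if row[j] <= t]
            right = [v for row, v in zip(X, y) if row[j] > t]
            gain = base - sse(left) - sse(right)
            if gain > best[4]:
                best = (j, t, fmean(left), fmean(right), gain)
    return best

def fit_forest(X, y, n_trees=30, max_features=2, seed=0):
    """Bagging: each stump sees a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    stumps, scores = [], [0.0] * p
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]            # bootstrap sample
        feats = rng.sample(range(p), min(max_features, p))    # random feature subset
        stump = fit_stump([X[i] for i in idx], [y[i] for i in idx], feats)
        stumps.append(stump)
        scores[stump[0]] += stump[4]   # feature score = accumulated error reduction
    return stumps, scores

def predict(stumps, row):
    """Average the predictions of all stumps."""
    return fmean([lo if row[j] <= t else hi for j, t, lo, hi, _ in stumps])

def r2_score(y_true, y_pred):
    mean = fmean(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1 - ss_res / ss_tot

# Synthetic data (made up): only feature 0 drives the target.
data_rng = random.Random(42)
X = [[data_rng.random() for _ in range(3)] for _ in range(100)]
y = [(10.0 if row[0] > 0.5 else 0.0) + data_rng.gauss(0, 0.5) for row in X]
split = len(X) * 8 // 10   # 80% training set, 20% test set

# Step 1: score the features on the training set and keep the top 2.
_, scores = fit_forest(X[:split], y[:split])
selected = sorted(range(3), key=lambda j: scores[j], reverse=True)[:2]
X_sel = [[row[j] for j in selected] for row in X]

# Step 2: train on the selected features and score on the held-out test set.
stumps, _ = fit_forest(X_sel[:split], y[:split])
score = r2_score(y[split:], [predict(stumps, row) for row in X_sel[split:]])
```

With a library such as scikit-learn, the same steps would normally be written with a random forest regressor and its feature importances; the hand-rolled version is used here only to keep the sketch dependency-free.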
Preferably, the prediction model in the embodiment of the present invention can be used to predict not only stores that are yet to be opened but also stores that are already open. For a store to be opened, the prediction model can show clearly at which location the store would have the largest sales scale, the highest sales per unit area, and so on. For an opened store, the computer compares the actual sales scale with the predicted sales scale in real time; while the prediction result is verified, the actual data are imported into the machine learning process, which completes the retrospective optimization of the prediction model. The prediction model is thus kept in a process of dynamic optimization and correction, which further improves its usability and accuracy and narrows its error range.
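The retrospective loop for opened stores (compare predictions with actual sales, then feed the actual data back into training) can be sketched in Python as below. A one-variable least-squares line stands in for the real prediction model, and all figures are invented for illustration.

```python
from statistics import fmean

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def mean_abs_error(model, xs, ys):
    """Average gap between predicted and actual sales scale."""
    slope, intercept = model
    return fmean([abs(slope * x + intercept - y) for x, y in zip(xs, ys)])

def retrospective_update(train_x, train_y, new_x, new_y):
    """Add actual results from opened stores to the training set and refit."""
    train_x, train_y = train_x + new_x, train_y + new_y
    return fit_line(train_x, train_y), train_x, train_y

# Made-up history: interior area -> annual sales scale.
train_x, train_y = [100.0, 200.0, 300.0], [310.0, 590.0, 910.0]
model = fit_line(train_x, train_y)

# Actual results reported later by two opened stores (also made up).
actual_x, actual_y = [400.0, 500.0], [1000.0, 1200.0]
error_before = mean_abs_error(model, actual_x, actual_y)

model, train_x, train_y = retrospective_update(train_x, train_y, actual_x, actual_y)
error_after = mean_abs_error(model, actual_x, actual_y)
```

Running the update each time new actual figures arrive keeps the model in the dynamic correction process the paragraph describes; here the error on the newly observed stores drops after the refit.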
Example two:
fig. 3 discloses an architectural diagram of a predictive model of store data according to an embodiment of the present invention, which includes:
the data acquisition module 1 comprises a web crawler unit 11 and an enterprise data acquisition unit 12, wherein the web crawler unit 11 is used for acquiring first store information around a store through a web crawler technology, the enterprise data acquisition unit 12 is used for acquiring second store information through an enterprise internal management system, and the data acquisition module 1 stores the first store information and the second store information into a database;
the data processing module 2 is used for cleaning and performing structured data processing on the first store information and the second store information to obtain different types of sales data, dividing the sales data into continuous variable data and discrete variable data, performing normalization correction on the continuous variable data, and performing assignment continuous optimization on the discrete variable data;
and the machine learning module 3 is used for scoring the feature data obtained after the normalization correction and the assignment continuous optimization, automatically selecting the feature data with high scores to perform training set learning to obtain a prediction model of the sales scale, and performing retrospective optimization on the prediction model by using the actual sales scale data of the store to be predicted.
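One way to picture how the three modules fit together is the Python sketch below, in which each module is a small dataclass wired from plain callables. The class and field names mirror the reference numerals of fig. 3, but the interfaces, the stand-in units and the sample records are assumptions made purely for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]

@dataclass
class DataAcquisitionModule:                              # module 1
    web_crawler_unit: Callable[[], List[Record]]          # unit 11
    enterprise_data_unit: Callable[[], List[Record]]      # unit 12
    database: Dict[str, Record] = field(default_factory=dict)

    def collect(self) -> Dict[str, Record]:
        """Store first and second store information, merged per store id."""
        for record in self.web_crawler_unit() + self.enterprise_data_unit():
            self.database.setdefault(record["store_id"], {}).update(record)
        return self.database

@dataclass
class DataProcessingModule:                               # module 2
    continuous_unit: Callable[[Record], Record]           # unit 21
    discrete_unit: Callable[[Record], Record]             # unit 22

    def process(self, database: Dict[str, Record]) -> List[Record]:
        return [self.discrete_unit(self.continuous_unit(dict(r))) for r in database.values()]

@dataclass
class MachineLearningModule:                              # module 3
    train: Callable[[List[Record]], Callable[[Record], float]]

    def build_model(self, features: List[Record]) -> Callable[[Record], float]:
        return self.train(features)

# Hypothetical stand-ins for the real units.
acquisition = DataAcquisitionModule(
    web_crawler_unit=lambda: [{"store_id": "S001", "competitor_count": 3}],
    enterprise_data_unit=lambda: [{"store_id": "S001", "interior_area": 420.0, "store_nature": "mall"}],
)
processing = DataProcessingModule(
    continuous_unit=lambda r: {**r, "interior_area": math.log1p(r["interior_area"])},
    discrete_unit=lambda r: {**r, "store_nature": {"outlet": 1, "street": 2, "mall": 3}[r["store_nature"]]},
)
learning = MachineLearningModule(train=lambda rows: (lambda r: 1000.0 * r["store_nature"]))

features = processing.process(acquisition.collect())
model = learning.build_model(features)
prediction = model(features[0])
```

Passing the units in as callables keeps each module replaceable, so a real crawler or a trained regressor could be substituted without changing the wiring.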
Specifically, the data collected by the web crawler unit 11 includes: city data characteristics, business circle data characteristics and geographic data characteristics, and the data collected by the enterprise data collection unit 12 includes member data characteristics and store project data characteristics. Almost all key features can be covered for the sales scale of the store to be forecasted through the web crawler unit 11 and the enterprise data collecting unit 12, wherein the detail features of each collection are described in the first embodiment and are not described in detail here.
Preferably, the data processing module 2 includes a continuous variable correcting unit 21, configured to perform logarithmic correction on the continuous variable data that is approximately normally distributed, and then perform mean normalization on the corrected continuous variable data; further, the data processing module 2 includes a discrete variable optimization unit 22, configured to perform unitary variance analysis on the discrete variable data, screen the discrete variable data that has a large influence on the store plateau effect, convert the screened discrete variable data into the store plateau effect mean value, and continuously sort the different store plateau effect mean values according to different quantitative values. Since the web crawler unit 11 and the enterprise data collection unit 12 are only used for collecting and acquiring data, the data input into the database needs to be further optimized and corrected, so that the data entering the data prediction module has universal representativeness and accuracy. Specifically, the continuous variable correcting unit 21 first performs log (1+ x) taking to alleviate the forward bias trend, where x represents continuous variable data before taking the log, linear analysis is performed again on the result obtained after taking the log, and it can be seen through computer analysis that the result has reached normal distribution, and on this basis, mean normalization is performed on the result: [ x _ scale ═ x1-mean)/std](ii) a The detailed normalization correction method is described in the first embodiment, and is not described again in detail. 
The discrete variable optimization unit 22 first performs one-way analysis of variance on the multiple sets of discrete variable data. Discrete feature analysis then shows that several discrete variables, such as store properties and city grade, have a large influence on store floor efficiency. A floor efficiency mean level of 1, 2, 3 or 4 can therefore be defined for each value of such a discrete variable according to the mean floor efficiency observed under that value. The influence of the discrete variable on floor efficiency is thus described quantitatively by this mean level, which is equivalent to assigning numeric values to the discrete variable data so that it becomes continuous variable data.
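The screening and assignment of discrete variables described above can be sketched as follows. The sketch computes a one-way analysis-of-variance F statistic for one discrete variable and then ranks its categories 1, 2, 3, 4 by the mean of the target measure. The target is taken here to be sales per unit floor area, which is an assumption about the translated term; the function names, the category labels and the sample values are likewise hypothetical, and the source gives no screening threshold, so none is applied.

```python
import statistics


def one_way_f(groups):
    """Return the one-way ANOVA F statistic for {category: [target values]}.

    Requires at least two categories and non-zero within-group variation.
    """
    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = statistics.fmean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(
        len(vals) * (statistics.fmean(vals) - grand_mean) ** 2
        for vals in groups.values()
    )
    ss_within = sum(
        (v - statistics.fmean(vals)) ** 2
        for vals in groups.values()
        for v in vals
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_encode(groups):
    """Map each category to 1, 2, 3, ... in ascending order of its target mean."""
    ordered = sorted(groups, key=lambda cat: statistics.fmean(groups[cat]))
    return {cat: rank for rank, cat in enumerate(ordered, start=1)}


# Hypothetical target samples grouped by city grade.
by_city_grade = {
    "tier1": [9.0, 10.0, 11.0],
    "tier2": [6.0, 7.0, 8.0],
    "tier3": [4.0, 5.0, 6.0],
    "tier4": [1.0, 2.0, 3.0],
}
f_stat = one_way_f(by_city_grade)    # large F suggests the variable matters
encoding = rank_encode(by_city_grade)  # category -> ordinal level
```

A variable with a large F statistic would be kept and replaced by its ordinal levels, which turns a categorical column into a numeric one that the later learning step can treat like a continuous feature.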
Preferably, when a computer predicts the sales scale of a store using a prediction model, a prediction model matching unit of the computer is required. Before a prediction result is output by the prediction model, the prediction model matching unit determines the business type of the retail scene and then automatically associates and matches that business type with the corresponding specified prediction model. Further, the machine learning module 3 includes a model replication optimization unit 31. Different retail scenes generate different prediction models, and different business types exist in different retail scenes. Once the retail scene of the store is input, the model matching unit matches it with the corresponding specified prediction model to complete the prediction of the store's sales scale, and the prediction result can be displayed by a prediction result display unit on the computer. To further improve the accuracy of the prediction model, the model replication optimization unit 31 compares the actual sales scale of a store that has already opened with its predicted sales scale, verifies the prediction result, and imports the actual data into the machine learning process. This completes the replication optimization of the prediction model, so that the model is continuously optimized and corrected, which further improves its usability and accuracy and narrows its error range.
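The matching of a business type to its own prediction model, and the feedback of actual sales into that model, can be sketched as follows. Everything here is hypothetical: the `SceneModel` class, the registry keys and the one-parameter linear model (sales = rate × floor area) are stand-ins chosen to keep the example short, and the source does not specify what kind of model is trained for each retail scene.

```python
from dataclasses import dataclass, field


@dataclass
class SceneModel:
    """Toy per-scene model: predicted sales = rate * floor area."""
    rate: float
    history: list = field(default_factory=list)  # (area, actual_sales) pairs

    def predict(self, area):
        return self.rate * area

    def refit(self, area, actual_sales):
        """Record an opened store's actual sales and re-estimate the rate.

        The rate is the least-squares slope through the origin over all
        recorded (area, actual_sales) pairs.
        """
        self.history.append((area, actual_sales))
        num = sum(a * s for a, s in self.history)
        den = sum(a * a for a, _ in self.history)
        self.rate = num / den


# Hypothetical registry: one model per business type.
registry = {
    "convenience": SceneModel(rate=2.0),
    "supermarket": SceneModel(rate=1.2),
}


def predict_sales(business_type, area):
    """Look up the model matched to the business type and predict with it."""
    model = registry.get(business_type)
    if model is None:
        raise KeyError(f"no prediction model registered for {business_type!r}")
    return model.predict(area)


before = predict_sales("convenience", 100.0)
# The store opens and reports actual sales; feed them back into the model.
registry["convenience"].refit(100.0, 250.0)
after = predict_sales("convenience", 100.0)
```

Keeping one model per business type means a correction learned from convenience stores does not alter predictions for supermarkets, which mirrors the separation between retail scenes described in the text.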
All the above-mentioned optional technical solutions can be combined arbitrarily to form the optional embodiments of the present invention, and are not described herein again.
It should be noted that the store data prediction model provided in the above embodiment is illustrated only by the above division of functional modules when predicting the sales scale of a store. In practical applications, the functions may be allocated to different functional modules as needed; that is, the internal structure of the store data prediction model may be divided into different functional modules to complete all or part of the functions described above. In addition, the store data prediction model and the store data prediction method provided by the above embodiments belong to the same concept; the specific implementation process is detailed in the method embodiment and is not repeated here.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (10)
1. A method for building a store data prediction model, characterized by comprising the following steps:
acquiring first store information around a store through a web crawler technology, acquiring second store information through an enterprise internal management system, and storing the first store information and the second store information into a database;
performing cleaning and structured data processing on the first store information and the second store information to obtain different types of sales data, dividing the sales data into continuous variable data and discrete variable data, performing normalization correction on the continuous variable data, and performing assignment continuous optimization on the discrete variable data;
and scoring the normalized, corrected and continuously optimized feature data using a machine learning method, automatically selecting the high-scoring feature data for training-set learning to obtain a prediction model of sales scale, and then repeatedly optimizing the prediction model using the actual sales scale data of the store to be predicted.
2. The method for building the store data prediction model according to claim 1, wherein the prediction model is used to output the predicted value of the sales scale of the store to be predicted according to the specific location and retail scene of the store to be predicted.
3. The method for building the store data prediction model according to claim 1, wherein the crawling paths of the web crawler technology include at least crawling map data, crawling online shopping data and crawling official statistics.
4. The method for building the store data prediction model according to claim 1, wherein the first store information includes city data features, business circle data features and geographic data features, and the second store information includes store member data features and store project data features.
5. The method for building the store data prediction model according to claim 1, wherein the normalization correction comprises: performing logarithmic correction on the continuous variable data that is approximately normally distributed, and then performing mean normalization on the corrected continuous variable data.
6. The method for building the store data prediction model according to claim 1, wherein the assignment continuous optimization comprises: performing one-way analysis of variance on the discrete variable data, screening out the discrete variable data that has a large influence on store floor efficiency (sales per unit area), converting the screened discrete variable data into the floor efficiency mean values corresponding to the store, and ordering the different floor efficiency mean values continuously according to their quantitative values.
7. The method for building the store data prediction model according to claim 2, further comprising the step of specifying a business type of the retail scene before outputting the prediction result by using the prediction model, and automatically associating and matching the business type with the corresponding specified prediction model.
8. A prediction model of store data, comprising:
a data acquisition module and a database, wherein the data acquisition module comprises a web crawler unit and an enterprise data acquisition unit, the web crawler unit is configured to acquire first store information around a store through a web crawler technology, the enterprise data acquisition unit is configured to acquire second store information through an enterprise internal management system, and the data acquisition module stores the first store information and the second store information into the database;
a data processing module, configured to perform cleaning and structured data processing on the first store information and the second store information to obtain different types of sales data, divide the sales data into continuous variable data and discrete variable data, perform normalization correction on the continuous variable data, and perform assignment continuous optimization on the discrete variable data;
and a machine learning module, configured to score the feature data after the normalization correction and the continuous optimization, automatically select the high-scoring feature data for training-set learning to obtain a prediction model of sales scale, and repeatedly optimize the prediction model using the actual sales scale data of the store to be predicted.
9. The store data prediction model according to claim 8, wherein the data processing module comprises a continuous variable correction unit configured to perform logarithmic correction on the continuous variable data that is approximately normally distributed and then perform mean normalization on the corrected continuous variable data.
10. The store data prediction model according to claim 8, wherein the data processing module comprises a discrete variable optimization unit configured to perform one-way analysis of variance on the discrete variable data, screen out the discrete variable data that has a large influence on store floor efficiency, convert the screened discrete variable data into the floor efficiency mean values corresponding to the store, and order the different floor efficiency mean values continuously according to their quantitative values.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910683454.6A CN110633401A (en) | 2019-07-26 | 2019-07-26 | Prediction model of store data and establishment method thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910683454.6A CN110633401A (en) | 2019-07-26 | 2019-07-26 | Prediction model of store data and establishment method thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110633401A (en) | 2019-12-31 |
Family
ID=68969004
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910683454.6A Pending CN110633401A (en) | 2019-07-26 | 2019-07-26 | Prediction model of store data and establishment method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110633401A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111275172A (en) * | 2020-01-21 | 2020-06-12 | 复旦大学 | Feedforward neural network structure searching method based on search space optimization |
CN113159934A (en) * | 2021-05-26 | 2021-07-23 | 中国工商银行股份有限公司 | Method and system for predicting passenger flow of network, electronic equipment and storage medium |
CN113837794A (en) * | 2021-08-26 | 2021-12-24 | 润联软件系统(深圳)有限公司 | Chain retail store sales prediction method based on space-time graph convolutional network |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106408181A (en) * | 2016-09-09 | 2017-02-15 | 广州速鸿信息科技有限公司 | Smart store system and method based on big data analysis |
US20170124492A1 (en) * | 2015-10-28 | 2017-05-04 | Fractal Industries, Inc. | System for automated capture and analysis of business information for reliable business venture outcome prediction |
CN107180275A (en) * | 2017-05-16 | 2017-09-19 | 厦门数图科技有限公司 | One kind standardization turnover predictor method and system |
CN107292672A (en) * | 2017-07-05 | 2017-10-24 | 上海数道信息科技有限公司 | System and method for is realized in a kind of catering industry sales forecast |
CN107908778A (en) * | 2017-12-04 | 2018-04-13 | 杭州华量软件有限公司 | A kind of wisdom market big data management system |
CN108629618A (en) * | 2017-03-22 | 2018-10-09 | 董泽平 | Product sales prediction method and system without model speculation foundation |
CN109523314A (en) * | 2018-11-12 | 2019-03-26 | 深圳云行智能科技有限公司 | A kind of supply chain management-control method and its system and storage medium based on AI technology |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170124492A1 (en) * | 2015-10-28 | 2017-05-04 | Fractal Industries, Inc. | System for automated capture and analysis of business information for reliable business venture outcome prediction |
CN106408181A (en) * | 2016-09-09 | 2017-02-15 | 广州速鸿信息科技有限公司 | Smart store system and method based on big data analysis |
CN108629618A (en) * | 2017-03-22 | 2018-10-09 | 董泽平 | Product sales prediction method and system without model speculation foundation |
US20180300738A1 (en) * | 2017-03-22 | 2018-10-18 | National Taiwan Normal University | Method and system for forecasting product sales on model-free prediction basis |
CN107180275A (en) * | 2017-05-16 | 2017-09-19 | 厦门数图科技有限公司 | One kind standardization turnover predictor method and system |
CN107292672A (en) * | 2017-07-05 | 2017-10-24 | 上海数道信息科技有限公司 | System and method for is realized in a kind of catering industry sales forecast |
CN107908778A (en) * | 2017-12-04 | 2018-04-13 | 杭州华量软件有限公司 | A kind of wisdom market big data management system |
CN109523314A (en) * | 2018-11-12 | 2019-03-26 | 深圳云行智能科技有限公司 | A kind of supply chain management-control method and its system and storage medium based on AI technology |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111275172A (en) * | 2020-01-21 | 2020-06-12 | 复旦大学 | Feedforward neural network structure searching method based on search space optimization |
CN111275172B (en) * | 2020-01-21 | 2023-09-01 | 复旦大学 | Feedforward neural network structure searching method based on search space optimization |
CN113159934A (en) * | 2021-05-26 | 2021-07-23 | 中国工商银行股份有限公司 | Method and system for predicting passenger flow of network, electronic equipment and storage medium |
CN113837794A (en) * | 2021-08-26 | 2021-12-24 | 润联软件系统(深圳)有限公司 | Chain retail store sales prediction method based on space-time graph convolutional network |
CN113837794B (en) * | 2021-08-26 | 2024-07-19 | 华润数字科技有限公司 | Chain retail store sales prediction method based on space-time diagram convolution network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Ali et al. | A data-driven approach for multi-scale GIS-based building energy modeling for analysis, planning and support decision making | |
Song et al. | A review of research on tourism demand forecasting: Launching the Annals of Tourism Research Curated Collection on tourism demand forecasting | |
CN110222267B (en) | Game platform information pushing method, system, storage medium and equipment | |
Tekouabou | Intelligent management of bike sharing in smart cities using machine learning and Internet of Things | |
US11068911B1 (en) | Automatically determining market rental rate index for properties | |
Yoo et al. | Variable selection for hedonic model using machine learning approaches: A case study in Onondaga County, NY | |
Deng et al. | Inter-company comparison using modified TOPSIS with objective weights | |
Bostancı et al. | Investigating the satisfaction of citizens in municipality services using fuzzy modelling | |
Mimis et al. | Property valuation with artificial neural network: the case of Athens | |
Jeong et al. | Integrating buildings into a rural landscape using a multi-criteria spatial decision analysis in GIS-enabled web environment | |
CN105868847A (en) | Shopping behavior prediction method and device | |
CN103544663A (en) | Method and system for recommending network public classes and mobile terminal | |
CN110633401A (en) | Prediction model of store data and establishment method thereof | |
Bottero et al. | Decision support systems for evaluating urban regeneration | |
CN109214863B (en) | Method for predicting urban house demand based on express delivery data | |
CN105808637A (en) | Personalized recommendation method and device | |
Ahtesham et al. | House price prediction using machine learning algorithm-the case of Karachi city, Pakistan | |
US10460406B1 (en) | Automatically determining market rental rates for properties | |
CN112149352B (en) | Prediction method for marketing activity clicking by combining GBDT automatic characteristic engineering | |
CN112668803B (en) | Automobile service chain enterprise shop-opening and site-selecting method based on LightGBM model | |
Miguez et al. | Selection of non-financial sustainability indicators as key elements for multi-criteria analysis of hotel chains | |
CN111898860A (en) | Site selection and operation strategy generation method for digital audio-visual place and storage medium | |
Swietek et al. | Visual Capital: Evaluating building-level visual landscape quality at scale | |
CN114329240A (en) | Site selection feature screening method and device, electronic equipment and storage medium | |
Lynn et al. | The british crime survey: A review of methodology |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20191231 |