CN110866696B - Training method and device for risk assessment model of shop drop - Google Patents
- Publication number
- CN110866696B (application CN201911122189.0A)
- Authority
- CN
- China
- Prior art keywords
- training
- model
- index
- sub
- shop
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0635—Risk analysis of enterprise or organisation activities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/067—Enterprise or organisation modelling
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Business, Economics & Management (AREA)
- Human Resources & Organizations (AREA)
- Engineering & Computer Science (AREA)
- Strategic Management (AREA)
- Entrepreneurship & Innovation (AREA)
- Economics (AREA)
- Game Theory and Decision Science (AREA)
- Educational Administration (AREA)
- Development Economics (AREA)
- Marketing (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The embodiment of the invention provides a training method and device for a shop drop risk assessment model. The method comprises: obtaining a training sample set, wherein the training samples comprise good samples and bad samples, and each sample comprises data for a plurality of indexes; and training an initial model based on the training samples to obtain a target model, the target model being used to assess shop drop risk. The trained model can directly evaluate the shop drop probability of a shop, which improves the efficiency and precision of shop drop evaluation.
Description
Technical Field
The application relates to the technical field of risk early warning, and in particular to a training method and device for a shop drop risk assessment model.
Background
In recent years, with the economy running steadily and consumption continuing to upgrade, the overall development trend of commercial property has remained good. However, the operation of commercial property involves many risks and difficulties, such as merchants surrendering their leases, adjustment of the business format, and property management. Among these, the "drop-out" (a merchant withdrawing from business) is a major operating risk faced by many commercial real estate enterprises. A drop-out not only reduces rental income but also increases maintenance cost and tenant re-recruitment cost, causing heavy losses to the enterprise. If drop-out risk can be monitored and warned of early, shops can be assisted in time so that they do not surrender their leases because of operating difficulties, or follow-up tenant recruitment can be started in advance. This reduces the loss caused by merchant withdrawal and strengthens the competitiveness of the enterprise.
The traditional shop drop analysis method relies mainly on manual analysis and prediction and usually involves multiple steps such as data collection, data organization, data analysis, result comparison, and report writing. This takes up much of the working time of marketing staff and is inefficient. As modern commerce has entered the big data era, these shortcomings have become more pronounced: it is very difficult to extract useful information from massive, high-velocity, low-value-density data by manual analysis, accuracy and timeliness are poor, and macro-level risk monitoring is not possible.
Disclosure of Invention
The training method and device for a shop drop risk assessment model provided by the application use the trained model for early warning of shop drop risk, which solves the low efficiency of traditional manual analysis.
To improve the accuracy and efficiency of shop drop evaluation, the embodiments of the invention provide the following technical solutions:
The embodiment of the invention provides a training method for a shop drop risk assessment model, which comprises the following steps: obtaining a training sample set, wherein the training samples comprise good samples and bad samples, and each sample comprises data for a plurality of indexes; training an initial model based on the training samples to obtain a target model; the target model is used to assess shop drop risk.
Optionally, the plurality of indexes are divided into a plurality of index dimensions, each index dimension corresponding to some of the indexes; the initial model is trained based on the indexes in the plurality of index dimensions.
Optionally, the step of training the initial model based on the training samples to obtain the target model includes:
dividing time intervals, wherein one time interval corresponds to one sub-model; training according to the same steps based on the training samples to obtain each sub-model; and integrating all the sub-models obtained through training to obtain a final target model.
The application also provides a shop drop risk assessment model training device, which includes: an acquisition module, used for acquiring training samples, wherein the training samples comprise good samples and bad samples, and each sample comprises data for a plurality of indexes; and a training module, used for training an initial model based on the training samples to obtain a target model; the target model is used to assess shop drop risk.
Compared with the prior art, the invention has the beneficial effects that:
1. Multiple indexes are acquired for normally operating shops and for shops that have dropped, and the model is trained on these indexes. The trained model can directly evaluate the shop drop probability of a shop, which improves the efficiency and accuracy of shop drop evaluation.
2. The indexes are divided into several targeted index dimensions, and the model is trained on the indexes in those dimensions. The influence of each index dimension on a shop can then be read from the trained model's evaluation result, so that the shop can improve its operating strategy accordingly and reduce its shop drop risk.
3. Before the training samples are input into the model, the discrimination capability of every index in the training samples is assessed by WOE binning. The resulting IV values are compared and only indexes with strong discrimination capability are retained, which improves the training efficiency and prediction accuracy of the model.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 shows an electronic device provided in an embodiment of the present application;
Fig. 2 is a flowchart of a training method for a shop drop risk assessment model provided in an embodiment of the present application;
Fig. 3 is a flowchart of the sub-steps of step S120;
Fig. 4 is a flowchart of the sub-steps of step S122;
Fig. 5 is a flowchart of the index screening step;
Fig. 6 is a block diagram of the training device for a shop drop risk assessment model provided by the present application;
Fig. 7 is a schematic structural diagram of a target model according to an embodiment of the present application;
Fig. 8 is the ROC curve in the experimental example;
Fig. 9 is the PR curve in the experimental example.
Reference numerals: 10-electronic device; 12-processor; 14-memory; 100-shop drop risk assessment model training device; 110-acquisition module; 120-training module; 130-checking module.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present invention.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
The invention is realized by the following technical scheme.
The technical scheme of the invention will be described in detail below with reference to the accompanying drawings.
Referring to fig. 1, an electronic device 10 provided in the embodiments of the present application includes a memory 14 and a processor 12, where the memory and the processor are directly or indirectly electrically connected to each other to implement data transmission or interaction, and the electronic device may be a server, a terminal device, or any device having data storage and processing capabilities.
The memory stores software function modules in the form of software or firmware. By running the software programs and modules stored in the memory, such as the shop drop risk assessment model training device 100 in the embodiment of the present invention, the processor executes various functional applications and data processing, thereby implementing the shop drop risk assessment model training method in the embodiment of the present invention.
Referring to fig. 2, fig. 2 is a flow chart of a training method of a shop drop risk assessment model according to an embodiment of the present application, and when the electronic device implements the training method of the shop drop risk assessment model, steps S110 to S130 are executed.
Step S110, a training sample is obtained.
In the embodiment of the application, training samples may be obtained from a database. The training samples include good samples (shops in normal operation) and bad samples (shops where a drop-out has occurred), and each good or bad sample contains data for a plurality of indexes.
In embodiments of the present application, the plurality of indexes may include, but are not limited to: primary business format, merchant property, contract days, shop unit type, agency mode, average turnover growth rate, average turnover change, average passenger-flow growth rate, maximum passenger-flow drawdown, media activities, share of operating area among shops of the same business format in the same plaza, city type, and share of turnover among shops of the same business format in the same plaza.
In this embodiment, optionally, based on the data held in the database, the indexes are divided into a plurality of index dimensions according to business logic and common index construction techniques, each index dimension corresponding to some of the indexes. For example, the indexes are divided into store characteristics, operating conditions, activity level, and competing factors. The indexes under store characteristics may be primary business format, merchant property, contract days, shop unit type, and agency mode; the indexes under operating conditions may be average turnover growth rate and average turnover change; the indexes under activity level may be average passenger-flow growth rate, maximum passenger-flow drawdown, and media activities; the indexes under competing factors may be share of operating area among shops of the same business format in the same plaza, city type, and share of turnover among shops of the same business format in the same plaza.
The relationship between each dimension and shop drop is as follows.
(1) Store characteristics. Business format and shop drop: catering shops are more likely to drop than lifestyle boutique shops and clothing shops. Store area and shop drop: shops with an area between 100 and 150 square meters show a lower likelihood of dropping, whereas a shop larger than 330 square meters may terminate its contract early as its business condition deteriorates. Contract days and shop drop: shops with a signed contract term of about three years have a relatively high shop drop probability.
(2) Operating conditions. Turnover growth rate and shop drop: when average monthly turnover falls by more than 5% month over month, the shop drop probability increases markedly. Maximum turnover drawdown and shop drop: the larger the maximum turnover drawdown over the last half year, the higher the shop drop probability. Continuous decrease of turnover and shop drop: if turnover declines for three consecutive months, the shop drop probability increases. Sales per unit area and shop drop: the two have a monotonic linear relationship, and the higher a store's sales per unit area, the lower its drop probability. Income-to-rent ratio and shop drop: the two have a monotonic linear relationship, and the higher the income-to-rent ratio, the lower the drop probability.
(3) Activity level. Passenger-flow growth rate and shop drop: the monthly average passenger-flow growth rate and shop drop have a monotonic linear relationship, and the higher the monthly average growth rate, the lower the drop probability. Maximum passenger-flow drawdown and shop drop: the two have a monotonic linear relationship, and the larger the maximum passenger-flow drawdown, the higher the drop probability. Passenger flow and shop drop: average passenger flow and shop drop have a monotonic linear relationship, and the higher the monthly average passenger flow, the lower the drop probability. Continuous decrease of passenger flow and shop drop: if a store's passenger flow declines for three consecutive months, the drop probability increases.
The initial model is trained based on the indexes in the plurality of index dimensions.
It is easy to understand that the above selection of indexes and division into dimensions is only an illustrative example. Different or additional indexes can be adopted based on different considerations, and the dimensions can be divided differently; the method of the invention imposes no hard requirement in this respect.
In the embodiment of the present application, sample acquisition also involves a sampling scheme. Because the bad samples (dropped shops) are not few in number, and the data provided in the database are already evenly distributed across cities and plazas, the embodiment of the present application splits the existing data into a training set and a test set by direct random division:
In the present embodiment, 90% of all samples are used as the training data set and the remaining 10% as the test data set (if new data become available, the split can be redone). To ensure the accuracy and stability of the model and to reduce coefficient deviation, a Bootstrap sampling method is used to generate 100 groups of training samples at a ratio of 1:4 with respect to the good samples. Each group is used to train the initial model, giving 100 sets of coefficients, and each coefficient is averaged over the 100 sets to obtain the final output value (coefficient) of the model.
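The resampling and coefficient averaging described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the `fit` callable, and the reading of "1:4" as one bad sample to four good samples are all assumptions made for the example.

```python
import random
from statistics import fmean
from typing import Callable, List, Sequence, Tuple

Sample = Sequence[float]
Group = Tuple[List[Sample], List[Sample]]


def bootstrap_groups(good: Sequence[Sample], bad: Sequence[Sample],
                     n_groups: int = 100, good_per_bad: int = 4,
                     seed: int = 0) -> List[Group]:
    """Draw n_groups training sets by sampling with replacement.

    Assumption: "1:4" is read as one bad sample to four good samples.
    """
    rng = random.Random(seed)
    groups = []
    for _ in range(n_groups):
        bad_draw = rng.choices(bad, k=len(bad))
        good_draw = rng.choices(good, k=good_per_bad * len(bad))
        groups.append((good_draw, bad_draw))
    return groups


def average_coefficients(
        groups: Sequence[Group],
        fit: Callable[[List[Sample], List[Sample]], List[float]]) -> List[float]:
    """Fit one model per group and average each coefficient across groups."""
    coef_sets = [fit(good, bad) for good, bad in groups]
    return [fmean(column) for column in zip(*coef_sets)]
```

Here `fit` stands for whatever routine trains the initial model on one group and returns its coefficient vector; averaging column by column gives one final value per coefficient.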
Step S120, training the initial model based on the training samples to obtain the target model.
In the embodiment of the application, the target model is obtained by training an initial model for a plurality of times, and is used for evaluating risk of shop falling. The initial model is trained based on a plurality of indexes of a plurality of index dimensions in the training sample, and then a target model is obtained.
Step S130, checking the obtained target model.
In the embodiment of the application, after the initial model has been trained on the indexes of the index dimensions in the training samples and the target model has been obtained, the target model can be checked to make sure that it evaluates well. For example, the overall model is validated against the reserved test data set: ROC and PR curves are drawn, and the classification AUC, accuracy, and recall are calculated to verify the reliability and accuracy of the model. As shown in Figs. 8 and 9, the target model performs well on both the training sample data and the test sample data, that is, the target model obtained by the method of the invention has good predictive capability for shop drop risk. This step is optional, not required.
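As a hedged illustration of this check, the sketch below computes the area under the ROC curve and one point of the PR curve from predicted probabilities. A PR curve plots precision against recall, so the sketch reports precision. The function names and the convention that label 1 marks a dropped shop are assumptions for the example.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. Label 1 = dropped shop (assumed convention).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall(labels: Sequence[int], scores: Sequence[float],
                     threshold: float) -> Tuple[float, float]:
    """One point of the PR curve: precision and recall at a score threshold."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Sweeping `threshold` over the observed scores and collecting the resulting points traces the PR curve; the pairwise AUC above is quadratic in the sample count and is meant only for small illustrative data sets.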
Referring to fig. 3, fig. 3 is a flow chart illustrating a sub-step of step S120. The initial model described in this embodiment includes a plurality of sub-models, where each sub-model corresponds to a time interval. Specifically, the method comprises the following steps:
Step S121, dividing time intervals, where one time interval corresponds to one sub-model.
In this embodiment, the month in which the shop drop occurs, or in which the contract of a normally operating shop ends, is taken as the cut-off point. For example, time intervals reaching back 1, 2, 3, 4, 5, and 6 months from the cut-off point are used, and each sub-model is trained on the indexes from the corresponding period of the shop's operation.
It will be appreciated that this division of time intervals and number of sub-models is only an illustrative example, and other arrangements are possible: for example, one time interval every two months, or looking back one year with one month per time interval, which yields 12 sub-models.
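A minimal sketch of the six look-back intervals follows. It assumes that each interval ends at the start of the cut-off month and covers the preceding k calendar months; the text does not state whether the cut-off month itself is included, and the function names are hypothetical.

```python
from datetime import date
from typing import Dict, Tuple


def month_start_before(cutoff: date, n_months: int) -> date:
    """First day of the month that lies n_months before the cut-off month."""
    index = cutoff.year * 12 + (cutoff.month - 1) - n_months
    return date(index // 12, index % 12 + 1, 1)


def lookback_windows(cutoff: date, max_months: int = 6) -> Dict[int, Tuple[date, date]]:
    """Map k -> (start, end) for the k-month window before the cut-off month.

    Assumption: the end date (start of the cut-off month) is exclusive.
    """
    end = date(cutoff.year, cutoff.month, 1)
    return {k: (month_start_before(cutoff, k), end)
            for k in range(1, max_months + 1)}
```

Each key k then identifies the sub-model whose training indexes are computed from data falling inside the k-month window.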
Step S122, training according to the same steps to obtain each sub-model based on the training samples.
Referring to fig. 4, each sub-model training step includes:
Step S1221, inputting a training sample.
In the embodiment of the invention, training samples are input into the sub-model, and the indexes in those training samples are the indexes for the time interval corresponding to that sub-model. For example, suppose 6 sub-models correspond to one month, two months, and so on up to six months. If a sub-model corresponds to the one month before the cut-off point, all indexes in its training samples should be the indexes of the store within that one month. If a sub-model corresponds to a time interval of two months, all indexes in its training samples should be the indexes of the store within those two months.
Step S1222, processing is performed based on the multiple indexes in the training sample, and evaluation values of multiple index dimensions are obtained.
In the embodiment of the application, the evaluation values of the index dimensions describe how the store is assessed in each index dimension, and the shop drop probabilities of the good samples and the bad samples can be obtained from these evaluation values.
Optionally, scoring is performed based on the indexes to obtain a scoring value for each index dimension; the scoring value of an index dimension is the sum of the scoring values of its indexes. Weights are then assigned to the index dimensions in combination with the scoring value of each dimension. The scoring value and the weight of each index dimension together form the evaluation value.
In the embodiment of the application, a sub-model is trained with a logistic regression algorithm based on the indexes of the index dimensions in the training samples. Take the store characteristics dimension as an example: it is modeled on the series of indexes belonging to that dimension. Store characteristics include 8 indexes in total: primary business format, secondary business format, tertiary business format, logarithm of operating area, merchant property, contract days, shop unit type, and agency mode. These 8 indexes are input as variables, and training with the logistic regression algorithm outputs a scoring value for the store characteristics dimension. The sub-model is trained in the same way, by logistic regression, on the other index dimensions.
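The per-dimension scoring can be illustrated with a small logistic regression fitted by batch gradient descent. This is a sketch under assumptions: the function names, learning rate, and epoch count are arbitrary choices for the example, and a production system would more likely use an established library.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X: Sequence[Sequence[float]], y: Sequence[int],
                 lr: float = 0.1, epochs: int = 500) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, k = len(X), len(X[0])
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def dimension_score(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Score of one index dimension: predicted drop probability for one shop."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))
```

In this reading, one such model is fitted per index dimension (for example, on the 8 store characteristics indexes), and `dimension_score` supplies the scoring value that is later weighted and summed.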
In the embodiment of the application, weight distribution is performed on each index dimension by adopting a genetic algorithm, and an objective function of the genetic algorithm is information entropy. And superposing the scoring values of all the index dimensions based on the obtained weights, thereby obtaining a submodel based on observation of a certain time point.
In some embodiments, each of the four index dimensions described above is first trained separately to obtain a score for that dimension. A genetic algorithm then determines the weight of each of the four dimensions, and the weighted scores are summed to give sub-model 1, that is, the model trained on the month before the shop's cut-off point (the time period corresponding to that sub-model). To obtain an optimal (locally optimal) weight distribution over the four index dimensions while keeping the sum of the four weights equal to 1, the weights are searched with a genetic algorithm as follows: set the optimization function of the genetic algorithm (a cross-entropy loss function is used here); set the initial population size, the maximum number of iterations, the replication probability, the mutation probability, and the crossover probability; iterate until the optimization function converges, at which point iteration stops automatically and the best weights found are returned. For example, in one test example the weights of the four index dimensions are 0.271, 0.239, 0.24, and 0.24, respectively.
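The weight search can be sketched with a very small genetic algorithm. The sketch is illustrative only: the operators (elitist replication of the better half, averaging crossover, Gaussian mutation), the parameter values, and the use of a fixed generation count in place of a convergence test are assumptions, not details given in the text.

```python
import math
import random
from typing import List, Sequence


def cross_entropy(weights: Sequence[float],
                  dim_scores: Sequence[Sequence[float]],
                  labels: Sequence[int]) -> float:
    """Mean cross-entropy of the weighted sum of dimension scores."""
    eps, total = 1e-12, 0.0
    for scores, y in zip(dim_scores, labels):
        p = sum(w * s for w, s in zip(weights, scores))
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def normalise(weights: Sequence[float]) -> List[float]:
    """Clip negatives to zero and rescale so the weights sum to 1."""
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped)
    k = len(clipped)
    return [w / total for w in clipped] if total > 0 else [1.0 / k] * k


def ga_weights(dim_scores, labels, pop_size=30, generations=50,
               crossover_p=0.8, mutation_p=0.2, seed=0) -> List[float]:
    rng = random.Random(seed)
    k = len(dim_scores[0])
    def loss(w):
        return cross_entropy(w, dim_scores, labels)
    # Start from equal weights plus random candidates.
    pop = [[1.0 / k] * k] + [normalise([rng.random() for _ in range(k)])
                             for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=loss)
        survivors = pop[:pop_size // 2]          # replication of the fitter half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = ([(x + y) / 2 for x, y in zip(a, b)]
                     if rng.random() < crossover_p else list(a))
            if rng.random() < mutation_p:
                child[rng.randrange(k)] += rng.gauss(0.0, 0.1)
            children.append(normalise(child))
        pop = survivors + children
    return min(pop, key=loss)
```

Because the best candidates survive each generation and the equal-weight vector is in the starting population, the returned weights are never worse than equal weighting on the data used for the search.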
Step S1223, checking the evaluation value obtained in step S1222, stopping training if the error of the evaluation value reaches the set error precision or the iteration number reaches the set value, otherwise returning to step S1221, inputting new sample data for repeated training until the training is finished.
Step S123, integrating all the sub-models trained in step S122 to obtain the final target model. In this embodiment, a genetic algorithm is used to assign a weight to each sub-model, and the scoring values of the sub-models are then combined according to these weights to obtain the final target model, namely a risk model that can predict the likelihood of a shop "drop-out". The model outputs the predicted probability that the merchant will drop within half a year, as a value from 0 to 1, where higher values indicate a higher drop probability.
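The two-level combination described above (dimension scores into a sub-model score, sub-model scores into the target probability) can be written as a pair of weighted sums. The function names and the example numbers are hypothetical; the weights are assumed to have been found beforehand and to sum to 1 at each level.

```python
from typing import Sequence


def weighted_sum(values: Sequence[float], weights: Sequence[float]) -> float:
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights))


def target_probability(dim_scores_by_window: Sequence[Sequence[float]],
                       dim_weights_by_window: Sequence[Sequence[float]],
                       window_weights: Sequence[float]) -> float:
    """Combine dimension scores per sub-model, then combine the sub-models."""
    sub_scores = [weighted_sum(scores, weights)
                  for scores, weights in zip(dim_scores_by_window,
                                             dim_weights_by_window)]
    return weighted_sum(sub_scores, window_weights)
```

With two sub-models scoring a shop at 0.7 and 0.3 and sub-model weights of 0.75 and 0.25, the combined probability is 0.6.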
In step S110, the indexes included in the samples are first determined preliminarily from business logic and prior knowledge. Their discrimination capability can then be analyzed, so that indexes with poor discrimination capability are removed and indexes with strong discrimination capability are retained.
Referring to fig. 5 in combination, fig. 5 is a flow chart illustrating a screening procedure for an index, where the specific screening process includes:
Step S1101, performing WOE binning on the preliminarily determined indexes to obtain the IV value of each index.
In the embodiment of the application, before the training samples enter the model, WOE (Weight of Evidence) binning is performed on all index data in the training samples. A variable with continuous values is divided into several discrete classes, each class being one bin; for a variable with discrete values, each value is one bin. This avoids interference from extreme, abnormal, and missing data and improves the training efficiency and prediction accuracy of the model. The IV (Information Value) of a single variable can then be calculated from its WOE bins to evaluate how well that single index distinguishes between shops.
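For reference, a common way to compute WOE and IV from already binned data is sketched below. The text does not give its exact formula, so the sign convention (good share over bad share), the additive smoothing for empty cells, and the function name are assumptions; the IV value itself does not depend on the sign convention.

```python
import math
from collections import Counter
from typing import Dict, Hashable, Sequence, Tuple


def woe_iv(bins: Sequence[Hashable], labels: Sequence[int],
           smoothing: float = 0.5) -> Tuple[Dict[Hashable, float], float]:
    """Return (WOE per bin, IV) for one binned index.

    labels: 0 = good sample, 1 = bad sample (assumed convention).
    smoothing is added to every count so empty cells do not produce log(0);
    with smoothing=0 every bin must contain both good and bad samples.
    """
    good = Counter(b for b, y in zip(bins, labels) if y == 0)
    bad = Counter(b for b, y in zip(bins, labels) if y == 1)
    categories = set(bins)
    n_good = sum(good.values()) + smoothing * len(categories)
    n_bad = sum(bad.values()) + smoothing * len(categories)
    woe, iv = {}, 0.0
    for c in categories:
        share_good = (good[c] + smoothing) / n_good
        share_bad = (bad[c] + smoothing) / n_bad
        woe[c] = math.log(share_good / share_bad)
        iv += (share_good - share_bad) * woe[c]
    return woe, iv
```

An index whose bins hold the same share of good and bad samples gets an IV of zero, while bins that separate the two groups push the IV up.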
In some embodiments, based on the existing data, 58 indexes are determined in total and divided into 4 major index dimensions (store characteristics, operating conditions, activity level, and competing factors). Judged by their IV values, most indexes show good predictive ability. Before the data enter the model, WOE binning is performed on all field data of the test set, and variables with continuous values are divided into several discrete classes. Discrete data are binned according to the number of distinct values: each discrete value forms one WOE bin. Continuous data are divided into five bins where possible. If a continuous variable cannot be divided into five bins (because a certain value occurs many times), that value is set as a special value and placed in a bin of its own, and the remaining values are divided into five bins. If this still fails, the two most frequent values are set as special values and placed in two separate bins, and the remaining data are divided into four bins.
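One possible reading of this three-stage rule is sketched below. It assumes equal-frequency (quantile) binning and treats a binning as feasible only when every bin is non-empty; neither detail is stated in the text, and the function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def quantile_edges(values: Sequence[float], n_bins: int) -> List[float]:
    """Cut points that split the sorted values into n_bins equal-frequency bins."""
    ordered = sorted(values)
    return [ordered[len(ordered) * i // n_bins] for i in range(1, n_bins)]


def bin_continuous(values: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Return (special_values, cut_points) following the three-stage rule."""
    by_frequency = [v for v, _ in Counter(values).most_common()]
    # Each stage: (number of special values, number of bins for the rest).
    for n_special, n_bins in ((0, 5), (1, 5), (2, 4)):
        special = by_frequency[:n_special]
        rest = [v for v in values if v not in special]
        if len(rest) < n_bins:
            continue
        edges = quantile_edges(rest, n_bins)
        # Feasible only if cut points are distinct and above the minimum,
        # so that no bin is empty.
        if len(set(edges)) == n_bins - 1 and edges[0] > min(rest):
            return special, edges
    raise ValueError("values cannot be binned under this rule")
```

For a variable in which half the observations are zero, the first stage fails because two cut points coincide at zero, so zero becomes a special value with its own bin and the remaining values are split into five bins.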
Step S1102, determining the distinguishing capability of each index according to the preset IV value interval based on the IV value of the index.
In the embodiment of the application, based on the IV value obtained for each index, preset IV value intervals are used to classify the indexes. The intervals may distinguish only weak and strong discrimination capability, or weak, medium, and strong discrimination capability, and can be set as required. The IV value represents the difference between the shares of good and bad samples across the value groups of an index: the larger the IV value, the better the index discriminates between good and bad samples. The preset IV value intervals may be, for example: IV in [0.02, 0.1), weak discrimination capability; IV in [0.1, 0.3), medium discrimination capability; IV greater than or equal to 0.3, strong discrimination capability. Indexes with weak discrimination capability are deleted, and indexes with medium or strong discrimination capability are retained. For example, the media activity index has an IV value of 0.000, which indicates that it has almost no discrimination capability, so it is screened out before the model is entered; the merchant property index has an IV value of 0.334, which indicates strong discrimination capability, so it is retained.
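The screening rule with the example intervals can be expressed in a few lines. The function names and the "negligible" label for IV values below 0.02 are assumptions; the text only states that such an index (IV of 0.000) is screened out.

```python
from typing import Dict, List


def discrimination_level(iv: float) -> str:
    """Classify an index by its IV value using the example intervals."""
    if iv >= 0.3:
        return "strong"
    if iv >= 0.1:
        return "medium"
    if iv >= 0.02:
        return "weak"
    return "negligible"  # label assumed; the text gives no name below 0.02


def screen_indexes(iv_by_index: Dict[str, float]) -> List[str]:
    """Keep only indexes with medium or strong discrimination capability."""
    return [name for name, iv in iv_by_index.items()
            if discrimination_level(iv) in ("medium", "strong")]
```

Applied to the two indexes quoted in the text, the media activity index (IV 0.000) is dropped and the merchant property index (IV 0.334) is kept.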
Referring to fig. 6, fig. 6 is a schematic structural diagram of a training device for a risk assessment model of a shop, which is provided by the present application, and the device includes an acquisition module, a training module and a verification module.
The acquisition module is used for acquiring training samples, wherein the training samples comprise good samples and bad samples, and each sample comprises a plurality of index data.
In the embodiment of the present application, the obtaining module is configured to perform step S110 in fig. 2, and the specific description of the obtaining module may refer to the specific description of step S110 in fig. 2.
The training module is used for training an initial model based on the training sample to obtain a target model; the target model is used for assessment of risk of shop drop.
In the embodiment of the present application, the training module is configured to perform step S120 in fig. 2, and the specific description of the training module may refer to the specific description of step S120 in fig. 2.
The inspection module is used for inspecting the obtained target model.
In the embodiment of the present application, the inspection module is used to perform step S130 in fig. 2, and the specific description of the inspection module may refer to the specific description of step S130 in fig. 2.
In some embodiments, the initial model comprises a plurality of sub-models, wherein each sub-model corresponds to a time interval; the training module comprises a dividing sub-module, a training sub-module and an integration sub-module.
The dividing sub-module is used for dividing time intervals, and one time interval corresponds to one sub-model. For example, taking the month in which the shop drop occurs or in which the normal operation contract ends as the cut-off point, the time intervals are obtained by counting back 1 month, 2 months, 3 months, 4 months, 5 months and 6 months respectively.
In the embodiment of the present application, the dividing sub-module is configured to perform step S121 in fig. 3, and the specific description of the dividing sub-module may refer to the specific description of step S121 in fig. 3.
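As a non-limiting illustration, the division of time intervals might be sketched as follows. The sketch assumes one possible reading of the description, namely that sub-model k uses a look-back window that starts k months before the cut-off month and ends at the cut-off month; the function names and the (year, month) representation are hypothetical.

```python
def months_back(year, month, k):
    """Return the (year, month) that lies k months before (year, month)."""
    index = year * 12 + (month - 1) - k   # months counted from year 0
    return index // 12, index % 12 + 1

def divide_intervals(cutoff_year, cutoff_month, n_intervals=6):
    """Map sub-model number k (1..n_intervals) to a (start, end) pair of months.

    Assumption: every window ends at the cut-off month and starts k months
    earlier, so the windows are cumulative rather than disjoint.
    """
    end = (cutoff_year, cutoff_month)
    return {
        k: (months_back(cutoff_year, cutoff_month, k), end)
        for k in range(1, n_intervals + 1)
    }
```

For a cut-off month of March 2019, `divide_intervals(2019, 3)` would give sub-model 1 the window starting in February 2019 and sub-model 6 the window starting in September 2018.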
The training sub-module is used for training each sub-model according to the same steps based on the training samples.
In the embodiment of the present application, the training sub-module is configured to perform step S122 in fig. 3, and the specific description of the training sub-module may refer to the specific description of step S122 in fig. 3.
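The claims describe the training of each sub-model as logistic regression that stops when the error reaches a set precision or the iteration count reaches a set value. A minimal sketch of such a loop is given below for illustration. The use of batch gradient descent, the mean log-loss as the error measure, the coding of bad samples as label 1, and all names and default parameter values are assumptions of this sketch, not details taken from the application.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_risk(weights, bias, x):
    """Estimated probability that a sample is a bad sample (label 1)."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def train_logistic(samples, labels, lr=0.5, tol=0.05, max_iter=5000):
    """Train one sub-model; stop on error precision or iteration limit.

    Returns (weights, bias, iterations, loss). If the iteration limit is hit,
    the returned loss is the one measured before the final update.
    """
    n = len(samples)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for iteration in range(1, max_iter + 1):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        loss = 0.0
        for x, y in zip(samples, labels):
            p = predict_risk(weights, bias, x)
            p = min(max(p, 1e-12), 1.0 - 1e-12)   # guard against log(0)
            loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            for j, xi in enumerate(x):
                grad_w[j] += (p - y) * xi
            grad_b += p - y
        loss /= n
        if loss <= tol:            # set error precision reached
            break
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias, iteration, loss
```

Because every sub-model is trained according to the same steps, the same function could be called once per time interval, each time with the index data belonging to that interval.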
The integration submodule is used for integrating all the submodels obtained through training to obtain a final target model.
In the embodiment of the present application, the integration sub-module is configured to perform step S123 in fig. 3, and the specific description of the integration sub-module may refer to the specific description of step S123 in fig. 3.
Referring to fig. 7 in combination, fig. 7 is a schematic structural diagram of a target model according to an embodiment of the present application.
In the embodiment of the application, store characteristics, operation conditions, activity level and competing factors are the index dimensions into which the plurality of indexes are divided. Model 1, model 2, model 3, model 4, model 5 and model 6 are six sub-models. The cut-off point is the date on which the shop operation contract expires, and each sub-model corresponds to its own time interval relative to that cut-off point.
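For illustration, the integration of the six sub-models into a final score might be sketched as a weighted superposition of the sub-model scores. Claim 3 states that the weights are assigned by a genetic algorithm; that search is not reproduced here, and the weights are simply taken as given. The function name and the normalisation by the sum of the weights are assumptions of this sketch.

```python
def integrate_scores(sub_scores, weights):
    """Combine sub-model scores into one target-model score.

    sub_scores: one score per sub-model (for example, model 1 to model 6).
    weights:    one non-negative weight per sub-model, assumed to have been
                obtained beforehand (for example, by a genetic algorithm).
    """
    if len(sub_scores) != len(weights):
        raise ValueError("one weight is required per sub-model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted superposition, normalised so the result stays on the score scale.
    return sum(s * w for s, w in zip(sub_scores, weights)) / total
```

With equal weights the result reduces to the plain average of the sub-model scores; unequal weights let the sub-models closer to or further from the cut-off point contribute more to the final assessment.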
Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (6)
1. A training method for a shop drop risk assessment model, the method comprising:
obtaining training samples, wherein the training samples comprise good samples and bad samples, and each sample comprises a plurality of index data;
the plurality of indexes are divided into a plurality of index dimensions, wherein each index dimension corresponds to a part of the plurality of indexes; training an initial model based on a plurality of indexes in a plurality of index dimensions;
training an initial model based on the training samples to obtain a target model; the target model is used for assessing the risk of shop drop;
the step of training the initial model based on the training samples to obtain the target model comprises:
dividing time intervals, taking the month in which the shop drop occurs or in which the normal operation contract ends as a cut-off point, and counting back a plurality of time intervals from the cut-off point, wherein one time interval corresponds to one sub-model;
training according to the same steps based on the training samples to obtain each sub-model;
and integrating all the sub-models obtained through training to obtain a final target model.
2. The method of claim 1, wherein training each sub-model according to the same steps based on training samples comprises:
inputting a training sample, wherein a plurality of index data in the training sample are index data in a time interval corresponding to the sub-model;
performing logistic regression training based on the multiple indexes in the training sample to obtain evaluation values of the multiple index dimensions;
and checking the obtained evaluation value, stopping training if the error of the evaluation value reaches the set error precision or the iteration number reaches the set value, otherwise, inputting new sample data to perform repeated training until the training is finished.
3. The method of claim 1, wherein the step of integrating all the sub-models trained comprises: assigning weights to all the sub-models through a genetic algorithm, and then superposing the score and the weight of each sub-model to obtain a final target model.
4. The method of claim 1, wherein the index in the sample is determined by:
for each preliminarily determined index, carrying out WOE binning, and obtaining an IV value of each index;
determining the strength of the distinguishing capability of each index according to preset IV value intervals, retaining the indexes with strong distinguishing capability and eliminating the indexes with weak distinguishing capability, wherein the indexes in the sample are the retained indexes.
5. A training device for a shop drop risk assessment model, characterized in that it comprises:
the acquisition module is used for acquiring training samples, wherein the training samples comprise good samples and bad samples, and each sample comprises a plurality of index data;
the plurality of indexes are divided into a plurality of index dimensions, wherein each index dimension corresponds to a part of the plurality of indexes; training an initial model based on a plurality of indexes in a plurality of index dimensions;
the training module is used for training an initial model based on the training samples to obtain a target model; the target model is used for assessing the risk of shop drop;
the initial model comprises a plurality of sub-models, wherein each sub-model corresponds to a time interval; the training module comprises:
the dividing sub-module is used for dividing time intervals, taking the month in which the shop drop occurs or in which the normal operation contract ends as a cut-off point, and counting back a plurality of time intervals from the cut-off point, wherein one time interval corresponds to one sub-model;
the training sub-module is used for training according to the same steps based on the training samples to obtain each sub-model;
and the integration sub-module is used for integrating all the sub-models obtained through training to obtain a final target model.
6. A shop drop risk assessment model, characterized in that it is trained by the training method of the shop drop risk assessment model according to any one of claims 1 to 3, wherein the shop drop risk assessment model is used for assessing the shop drop risk.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911122189.0A CN110866696B (en) | 2019-11-15 | 2019-11-15 | Training method and device for risk assessment model of shop drop |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911122189.0A CN110866696B (en) | 2019-11-15 | 2019-11-15 | Training method and device for risk assessment model of shop drop |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110866696A (en) | 2020-03-06 |
CN110866696B (en) | 2023-05-26 |
Family
ID=69654776
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911122189.0A Active CN110866696B (en) | 2019-11-15 | 2019-11-15 | Training method and device for risk assessment model of shop drop |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110866696B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112561320A (en) * | 2020-12-14 | 2021-03-26 | 中国建设银行股份有限公司 | Training method of mechanism risk prediction model, mechanism risk prediction method and device |
CN113487414A (en) * | 2021-07-07 | 2021-10-08 | 广东中盈盛达数字科技有限公司 | Method and system for dividing modeling interval of wind control scoring card |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104867017A (en) * | 2015-05-16 | 2015-08-26 | 成都数联铭品科技有限公司 | Electronic commerce client false evaluation identification system |
CN105608590A (en) * | 2016-03-04 | 2016-05-25 | 刘恺之 | Operation method for consumption ecosphere business model established based on CRM core |
WO2017115626A1 (en) * | 2015-12-29 | 2017-07-06 | ビーエルデーオリエンタル株式会社 | Participating store evaluation device and participation-type facility using participating store evaluation device |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105844530A (en) * | 2016-03-24 | 2016-08-10 | 深圳市前海安测信息技术有限公司 | Actuarial studying method and actuarial studying system based on health management |
CN106570631B (en) * | 2016-10-28 | 2021-01-01 | 南京邮电大学 | P2P platform-oriented operation risk assessment method and system |
CN108960431A (en) * | 2017-05-25 | 2018-12-07 | 北京嘀嘀无限科技发展有限公司 | The prediction of index, the training method of model and device |
CN108921398B (en) * | 2018-06-14 | 2020-12-11 | 口口相传(北京)网络技术有限公司 | Shop quality evaluation method and device |
CN108932585B (en) * | 2018-06-19 | 2022-02-22 | 腾讯科技(深圳)有限公司 | Merchant operation management method and equipment, storage medium and electronic equipment thereof |
CN109101989B (en) * | 2018-06-29 | 2021-06-29 | 创新先进技术有限公司 | Merchant classification model construction and merchant classification method, device and equipment |
CN109345368A (en) * | 2018-08-22 | 2019-02-15 | 中国平安人寿保险股份有限公司 | Credit estimation method, device, electronic equipment and storage medium based on big data |
CN109447461B (en) * | 2018-10-26 | 2022-05-03 | 北京三快在线科技有限公司 | User credit evaluation method and device, electronic equipment and storage medium |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104867017A (en) * | 2015-05-16 | 2015-08-26 | 成都数联铭品科技有限公司 | Electronic commerce client false evaluation identification system |
WO2017115626A1 (en) * | 2015-12-29 | 2017-07-06 | ビーエルデーオリエンタル株式会社 | Participating store evaluation device and participation-type facility using participating store evaluation device |
JPWO2017115626A1 (en) * | 2015-12-29 | 2018-10-18 | ビーエルデーオリエンタル株式会社 | Participating store evaluation device and participatory facility using the participating store evaluation device |
CN105608590A (en) * | 2016-03-04 | 2016-05-25 | 刘恺之 | Operation method for consumption ecosphere business model established based on CRM core |
Also Published As
Publication number | Publication date |
---|---|
CN110866696A (en) | 2020-03-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||