CN116151857A - Marketing model construction method and device - Google Patents

Publication number
CN116151857A
Authority
CN
China
Prior art keywords
user
training
data
model
marketing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310127830.XA
Other languages
Chinese (zh)
Inventor
刘敬学
朱奕
柳寻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Unionpay Co Ltd
Original Assignee
China Unionpay Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Unionpay Co Ltd
Priority to CN202310127830.XA
Publication of CN116151857A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention discloses a method and a device for constructing a marketing model. The method comprises the following steps: acquiring static data and dynamic data of a user, and determining data features of the user; inputting the data features into a first training model to obtain a first training result; if the first training result is judged to include a mixed region, mapping the data features of the users in the mixed region to a second training model, which is a sub-model of the first training model, to obtain a second training result, wherein the mixed region includes data features of at least two categories of users, and the number of output neurons of the second training model is the same as that of the first training model; taking the second training result as the first training result and repeating until the first training result does not include a mixed region; and constructing a marketing model according to the first training model and the second training model, wherein the marketing model is used for outputting the sensitivity of the user to a marketing strategy. The method increases the dimensionality of the training samples, improves the clustering accuracy of the constructed marketing model, and thereby improves the accuracy of its prediction results.

Description

Marketing model construction method and device
Technical Field
The invention relates to the technical field of model construction, in particular to a marketing model construction method and device.
Background
With the rapid development of the internet, retail has entered a new O2O (Online To Offline) era, in which offline business opportunities are combined with the internet so that the internet becomes a platform for offline transactions. For marketing activities in markets such as new retail, a network model is needed to produce prediction results such as a user's sensitivity to a marketing strategy, the user's preferences, and the probability that the user responds to the marketing strategy.
At present, a network model for a marketing strategy is generally based on a single training model, which is trained using basic user information such as identification, gender, and age as training samples to obtain the network model of the marketing strategy.
However, the prediction results of a network model constructed in this way have low accuracy, so how to improve the accuracy of the prediction results of a network model for a marketing strategy is a problem to be solved.
Disclosure of Invention
The embodiment of the invention provides a method and a device for constructing a marketing model, which are used for improving the accuracy of a predicted result of the marketing model corresponding to a marketing strategy.
In a first aspect, an embodiment of the present invention provides a method for constructing a marketing model, including:
acquiring static data and dynamic data of a user, and determining data characteristics of the user;
inputting the data characteristics of the user into a first training model to obtain a first training result;
judging whether the first training result includes a mixed region; if yes, for any mixed region, mapping the data features of the users in that mixed region to a second training model of the first training model to obtain a second training result; the mixed region includes data features of at least two categories of users; the number of output neurons of the second training model is the same as that of the first training model;
taking the second training result as a first training result until the first training result is judged to not comprise a mixing region;
constructing a marketing model according to the first training model and the second training model; the marketing model is used to output a user's sensitivity to a marketing strategy.
In the above technical solution, the data features of the user include static data and dynamic data of the user. The static data represents stable information about the user, such as gender, age, marital status, and birthplace; the dynamic data represents behavior change information of the user, such as shopping websites browsed, commodity links, and types of commodities paid for. The data features of the user are used as training samples of the model, and the accuracy of the prediction results of the constructed marketing model is improved by increasing the dimensionality of the training samples.
In the embodiment of the invention, the first training model and the second training model are SOM (Self-Organizing Map) models. The second training model is a sub-SOM model of the first training model; that is, the constructed marketing model has a tree structure. The number of output neurons of the second training model is the same as that of the first training model, so the two models use the same number of clusters, which improves the clustering accuracy of the constructed marketing model and further improves the accuracy of its prediction results.
In addition, when the first training result is judged not to include a mixed region, the first training result includes only single regions, and the SOM model corresponding to a single region is called a leaf model. A leaf model has no sub-SOM model, and a single region includes data features of only one category of users. The accuracy of the prediction results of the constructed marketing model is thereby improved.
Sensitivity represents prediction results such as the user's preferences and the probability of responding to the marketing strategy.
Optionally, determining the data characteristics of the user according to the static data and the dynamic data of the user includes:
extracting features of the static data and the dynamic data of the user to obtain the static features and the dynamic features of the user;
and determining the data characteristics of the user according to the static characteristics and the first weight of the user, the dynamic characteristics and the second weight of the user.
In the above technical solution, the first weight and the second weight are preset based on the marketing strategy. The relative proportions of the static data and the dynamic data are set through the first weight and the second weight, so that the constructed marketing model is targeted at the marketing strategy, and the accuracy of its prediction results for that marketing strategy is improved.
Optionally, before inputting the data features of the user into the first training model, the method further includes:
for the data characteristics of any user, calculating a first Euclidean distance between the data characteristics of the user and the data characteristics of other users;
and carrying out anomaly detection on the user according to the first Euclidean distance.
Optionally, carrying out anomaly detection on the user according to the first Euclidean distance includes:
calculating the proportion of the first Euclidean distances that are larger than a first threshold value;
judging whether the proportion is larger than a second threshold value; if yes, determining the user as an abnormal user; otherwise, determining the user as a normal user.
According to the technical scheme, before the data features of the user are input into the first training model, the user is subjected to anomaly detection, so that the clustering influence of the data features of the abnormal user on the SOM model is reduced, the clustering precision and stability of the SOM model are improved, and the accuracy of the established marketing model on the prediction result of the marketing strategy is improved.
Optionally, determining whether the first training result includes a blending region includes:
dividing the first training result into a plurality of areas;
for any region, if the data characteristics of at least two users in the region are in different preset classification ranges, the region is a mixed region, and the first training result is determined to comprise the mixed region;
if the data characteristics of all users in the area are in the same preset classification range, the area is a single area, and the first training result is determined to not comprise a mixed area; the single region includes data features of users of the same category.
Optionally, the method further comprises:
the first training model and the second training model perform the following training operations:
initializing the weight and the learning rate of the output neurons;
determining a winning neuron from the output neurons according to the characteristics of the data to be input;
and calculating the weight of the winning neuron according to the characteristics of the data to be input until the learning rate meets the training condition.
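The training operations above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes a winner-only update of the form w = w + lr * (x - w), a multiplicative learning-rate decay, and a training condition of the learning rate falling below a floor. The names `train_som`, `decay`, and `min_learning_rate` are hypothetical, and the initial weights are passed in so that the example is deterministic.

```python
import math


def nearest_neuron(sample, weights):
    """Return the index of the output neuron whose weight vector is closest to the sample."""
    distances = [math.dist(sample, w) for w in weights]
    return distances.index(min(distances))


def train_som(samples, initial_weights, learning_rate=0.5, decay=0.9, min_learning_rate=0.01):
    """Winner-only training loop (assumed update rule and stop condition)."""
    weights = [list(w) for w in initial_weights]
    # Assumed training condition: stop once the learning rate decays below the floor.
    while learning_rate >= min_learning_rate:
        for sample in samples:
            k = nearest_neuron(sample, weights)
            # Move the winning neuron's weight toward the input sample.
            weights[k] = [w + learning_rate * (x - w) for w, x in zip(weights[k], sample)]
        learning_rate *= decay
    return weights


samples = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
trained = train_som(samples, initial_weights=[[0.0, 0.0], [1.0, 1.0]])
winners = [nearest_neuron(s, trained) for s in samples]
print(winners)
```

In this toy run the two low-valued samples share one winning neuron and the two high-valued samples share the other. A full SOM would normally also update the winner's neighbours on the output grid, which is omitted here because the text describes only the winner's weight.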
Optionally, determining a winning neuron from the output neurons according to the characteristics of the data to be input includes:
normalizing the data characteristics to be input;
calculating the multiplication result of the normalized data characteristic and the weight of the output neuron;
determining an output neuron corresponding to the maximum multiplication result as a winning neuron;
or, alternatively, calculating a second Euclidean distance between the data features to be input and the weight of each output neuron;
and determining the output neuron corresponding to the smallest second Euclidean distance as the winning neuron.
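The two ways of choosing the winning neuron can be sketched as follows. This is an illustrative example only; the function names are hypothetical, and the neuron weights are normalized to unit length here as an assumption, because the dot-product rule and the distance rule are guaranteed to agree only when both the input and the weights have unit length.

```python
import math


def normalize(vector):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


def winner_by_dot_product(features, weights):
    """Normalize the input, multiply it with each neuron's weight, pick the largest result."""
    x = normalize(features)
    scores = [sum(a * b for a, b in zip(x, w)) for w in weights]
    return scores.index(max(scores))


def winner_by_distance(features, weights):
    """Pick the neuron whose weight has the smallest Euclidean distance to the input."""
    distances = [math.dist(features, w) for w in weights]
    return distances.index(min(distances))


# Three hypothetical output neurons with unit-length weights.
weights = [normalize(w) for w in ([1.0, 0.0], [1.0, 1.0], [0.0, 1.0])]
sample = [0.2, 0.9]
print(winner_by_dot_product(sample, weights), winner_by_distance(sample, weights))
```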
In a second aspect, an embodiment of the present invention provides a model building apparatus, including:
the determining module is used for acquiring static data and dynamic data of the user and determining data characteristics of the user;
the processing module is used for inputting the data characteristics of the user into a first training model to obtain a first training result;
judging whether the first training result includes a mixed region; if yes, for any mixed region, mapping the data features of the users in that mixed region to a second training model of the first training model to obtain a second training result; the mixed region includes data features of at least two categories of users; the number of output neurons of the second training model is the same as that of the first training model;
taking the second training result as a first training result until the first training result is judged to not comprise a mixing region;
constructing a marketing model according to the first training model and the second training model; the marketing model is used to output a user's sensitivity to a marketing strategy.
Optionally, the determining module is specifically configured to:
extracting features of the static data and the dynamic data of the user to obtain the static features and the dynamic features of the user;
and determining the data characteristics of the user according to the static characteristics and the first weight of the user, the dynamic characteristics and the second weight of the user.
Optionally, the processing module is further configured to:
before the data features of the users are input into the first training model, for the data features of any user, calculating a first Euclidean distance between the data features of that user and the data features of other users;
and carrying out anomaly detection on the user according to the first Euclidean distance.
Optionally, the processing module is specifically configured to:
calculating the proportion of the first Euclidean distances that are larger than a first threshold value;
judging whether the proportion is larger than a second threshold value; if yes, determining the user as an abnormal user; otherwise, determining the user as a normal user.
Optionally, the processing module is specifically configured to:
dividing the first training result into a plurality of areas;
for any region, if the data characteristics of at least two users in the region are in different preset classification ranges, the region is a mixed region, and the first training result is determined to comprise the mixed region;
if the data characteristics of all users in the area are in the same preset classification range, the area is a single area, and the first training result is determined to not comprise a mixed area; the single region includes data features of users of the same category.
In a third aspect, an embodiment of the present invention further provides a computer apparatus, including:
a memory for storing program instructions;
and the processor is used for calling the program instructions stored in the memory and executing the model construction method according to the obtained program.
In a fourth aspect, embodiments of the present invention also provide a computer-readable storage medium storing computer-executable instructions for causing a computer to perform the above-described model building method.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a system architecture according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a model building method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a marketing model according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of distribution of feature data of a user before inputting a marketing model according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of distribution of feature data of a user after inputting a marketing model according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a model building apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail below with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In order to better explain the technical solution of the invention, the terms used in the invention are explained below.
New retail: refers to the combination of online and offline channels, in which the offline store serves as an experience store for the online channel, thereby reducing customer-acquisition spending.
Feature extraction: refers to the method and process of using a computer to extract characteristic information from data such as images and text, converting that data into feature parameters.
SOM: an unsupervised learning method; a two-layer network consisting of an input layer and a competitive layer (output layer), used to perform classification and clustering. Here, classification is supervised and clustering is unsupervised.
With the development of the internet, retail merchants are increasingly moving from offline to online. New operating schemes and a new traffic matrix are formed online by means of social platforms, shopping platforms, and the like, so that retail enters the O2O new retail scenario.
Outliers: refers to extremely large or small values that lie far from the general level of the sequence.
Merchants or brands wish to promote sales by means of new operating schemes (for example, red envelopes, discounts, and spend-threshold reductions used to attract customers). Innovative operating schemes are tried in combination with market needs such as smart business districts and the new retail industry, and marketing, data, and technology capabilities are integrated to provide retail merchants, business districts, and brands with a comprehensive digital solution for the online commercial complex, thereby helping new retail merchants build marketing strategies.
For the marketing strategy, the data of the user needs to be analyzed so as to determine the information such as the sensitivity of the user to the marketing strategy, the preference of the user, the probability of responding to the marketing strategy and the like.
In the related art, user data is generally collected, features are extracted from it using a machine learning technique with feature extraction capability, the users are classified based on the extracted features, and a marketing strategy model is constructed.
The marketing strategy model analyzes and clusters the users so as to identify the target users for the marketing strategy. For example, for a marketing strategy of a 10%-off promotion, the input parameters are the data features of the users and the output parameter is each user's degree of interest, which is one of interested, generally interested, and not interested. That is, each input user is assigned to one of 3 categories: users who are interested in the promotion, users who are generally interested in the promotion, and users who are not interested in the promotion.
However, in the above technical solution, the user data from which features are extracted is not rich enough, the constructed model is generally a single model, and only algorithms such as data statistics and traditional cluster analysis are used, so the accuracy of the prediction results of the constructed marketing strategy model is low.
Therefore, a model construction method is needed to improve the accuracy of the prediction result of the marketing model and the accuracy of classification of users.
Fig. 1 illustrates a system architecture to which embodiments of the present invention are applicable, the system architecture including a server 100, the server 100 may include a processor 110, a communication interface 120, and a memory 130.
Wherein the communication interface 120 is used to obtain static data and dynamic data of the user.
The processor 110 is the control center of the server 100. It connects the various parts of the entire server 100 using various interfaces and lines, and performs the various functions of the server 100 and processes data by running or executing software programs and/or modules stored in the memory 130 and calling data stored in the memory 130. Optionally, the processor 110 may include one or more processing units.
The memory 130 may be used to store software programs and modules, and the processor 110 performs various functional applications and data processing by executing the software programs and modules stored in the memory 130. The memory 130 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, application programs required for at least one function, and the like; the storage data area may store data created according to business processes, etc. In addition, memory 130 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.
It should be noted that the structure shown in FIG. 1 is merely an example, and the embodiment of the present invention is not limited thereto.
Based on the above description, FIG. 2 schematically illustrates a flow chart of a model building method according to an embodiment of the present invention, where the flow may be executed by a model building apparatus.
As shown in FIG. 2, the process specifically includes:
Step 210, acquiring static data and dynamic data of a user, and determining data features of the user.
In the embodiment of the invention, the static data of the user represents basic information of the user, such as the user's age, gender, marital status, ethnicity, birthplace, birth date, height, weight, and region of residence. The dynamic data of the user represents behavior change information of the user, such as the shopping websites browsed by the user, the types of commodities searched for, the prices of the commodities searched for, and the types of commodities ordered and paid for.
Step 220, inputting the data features of the user into a first training model to obtain a first training result.
In the embodiment of the invention, the first training model is an SOM model. The number of input layer neurons of the first training model is determined by the dimension of the input vector; the number of competitive layer neurons represents the number of classifications. For example, if the number of classifications is 3, the users' degree of interest in the marketing strategy is classified into 3 categories: good, fair, and poor.
The input layer is a one-dimensional array of neurons with a plurality of nodes, and the neurons of the competitive layer are located on the nodes of a two-dimensional planar network, forming a two-dimensional node matrix. The neurons of the input layer and the neurons of the competitive layer are connected through weights.
The first training result represents a classification result for the user.
Step 230, judging whether the first training result includes a mixed region; if yes, for any mixed region, mapping the data features of the users in that mixed region to a second training model of the first training model to obtain a second training result.
In the embodiment of the invention, the number of output neurons of the second training model is the same as that of the first training model, which means that the number of classifications of the second training model is the same as that of the first training model.
The mixed region includes data features of at least two categories of users. For example, the mixed region includes data features of 2 categories of users: users interested in the marketing strategy and users not interested in the marketing strategy.
Step 240, taking the second training result as the first training result until the first training result is judged not to include a mixed region.
In the embodiment of the present invention, the first training result includes the following 3 cases: the first is that the first training result includes only a mixed region, the second is that the first training result includes a mixed region and a single region, and the third is that the first training result includes only a single region. Wherein a single region represents a data feature that includes only one category of users. For example, if all users in the first training result are interested in the marketing strategy, the first training result includes only a single region.
Step 250, constructing a marketing model according to the first training model and the second training model.
In the embodiment of the invention, the marketing model is used for outputting the sensitivity of the user to the marketing strategy.
In the related art, the training result of a marketing model includes data features of multiple categories of users. In the embodiment of the invention, the second training model is a sub-SOM model of the first training model, and the SOM model corresponding to a single region is a leaf model; that is, the training result of a leaf model includes data features of only one category of users. Therefore, the prediction results of the leaf model are more accurate than those of the marketing model in the related art, which further improves the accuracy of the prediction results of the constructed marketing model.
In step 210, feature extraction is performed on static data and dynamic data of a user to obtain static features and dynamic features of the user; and determining the data characteristics of the user according to the static characteristics and the first weight of the user, the dynamic characteristics and the second weight of the user.
The first weight and the second weight are illustratively preset based on the marketing strategy. For example, the first weight is 0.3, the second weight is 0.7, etc., and is not particularly limited herein.
The static feature is multiplied by the first weight to obtain a first result, and the dynamic feature is multiplied by the second weight to obtain a second result. The first result and the second result are then combined to obtain the data features of the user. For example, with a first weight of 0.3 and a second weight of 0.7, if the static feature of the user is [0.8, 0.9] and the dynamic feature of the user is [0.4, 0.5], the first result is [0.24, 0.27], the second result is [0.28, 0.35], and the data features of the user are [0.24, 0.27, 0.28, 0.35].
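The weighting and combination step can be sketched as follows, assuming that "combining" means concatenating the two weighted vectors, as the numbers in the example suggest. The function name `fuse_features` is hypothetical.

```python
def fuse_features(static_features, dynamic_features, first_weight, second_weight):
    """Scale each feature group by its weight, then concatenate the two groups."""
    first_result = [first_weight * x for x in static_features]
    second_result = [second_weight * x for x in dynamic_features]
    return first_result + second_result


# Static feature [0.8, 0.9] with weight 0.3, dynamic feature [0.4, 0.5] with weight 0.7.
data_features = fuse_features([0.8, 0.9], [0.4, 0.5], first_weight=0.3, second_weight=0.7)
print([round(x, 2) for x in data_features])  # [0.24, 0.27, 0.28, 0.35]
```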
In the embodiment of the invention, the relative proportions of the static data and the dynamic data are set through the first weight and the second weight, so that the constructed marketing model is targeted at the marketing strategy. The static data is used for statistical analysis, and the dynamic data is used to analyze the user's overall preference across categories and sensitivity to the marketing strategy, so that the accuracy of the prediction results of the constructed marketing model for the marketing strategy is improved.
The dynamic features and static features are fused by setting a weight function and are updated dynamically according to the first weight and the second weight, so that the features extracted from the user's dynamic and static data are more abstract. This overcomes the limitation that a marketing model built from static data alone or dynamic data alone suits only a single application scenario, and broadens the range of application scenarios of the marketing model.
Illustratively, before the data features of the users are input into the first training model, for the data features of any user, a first Euclidean distance between the data features of that user and the data features of each other user is calculated, and anomaly detection is performed on the user according to the first Euclidean distances.
For example, taking each user's data features as a data point, suppose there are 10 user data points in total (a1, a2, ..., a10). For data point a1, the first Euclidean distances (b2, b3, ..., b10) between a1 and the other data points (a2, a3, ..., a10) are calculated. Anomaly detection is then performed on data point a1 based on the first Euclidean distances (b2, b3, ..., b10).
Illustratively, the proportion of the first Euclidean distances that are greater than a first threshold is calculated, and it is judged whether this proportion is greater than a second threshold; if yes, the user is determined to be an abnormal user; otherwise, the user is determined to be a normal user. The first threshold and the second threshold may be values preset based on experience, for example, a first threshold of 0.1 and a second threshold of 0.8, which are not specifically limited herein.
For example, continuing the description above, among the first Euclidean distances (b2, b3, ..., b10), 8 are greater than the first threshold and 1 is not, so the proportion is 8/9, approximately 0.89. Since this proportion is greater than the second threshold (0.8), data point a1 is determined to be an abnormal data point, that is, an outlier, and the user corresponding to data point a1 is determined to be an abnormal user.
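The anomaly check described above can be sketched as follows. This is a minimal example with made-up data points; the function name `is_abnormal` is hypothetical, and the thresholds 0.1 and 0.8 are the example values given in the text.

```python
import math


def is_abnormal(index, points, first_threshold, second_threshold):
    """Flag points[index] when the share of other points farther away than
    first_threshold is greater than second_threshold."""
    distances = [math.dist(points[index], other)
                 for j, other in enumerate(points) if j != index]
    if not distances:
        return False  # a lone point has nothing to be compared with
    far_count = sum(1 for d in distances if d > first_threshold)
    return far_count / len(distances) > second_threshold


# Made-up data: one isolated point followed by a tight cluster of four points.
points = [[0.90, 0.95], [0.10, 0.12], [0.11, 0.10], [0.12, 0.11], [0.10, 0.10]]
flags = [is_abnormal(i, points, first_threshold=0.1, second_threshold=0.8)
         for i in range(len(points))]
print(flags)
```

Only the isolated first point is flagged here, since all of its distances exceed the first threshold, while each clustered point is far from just one of its four neighbours.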
In the embodiment of the invention, the outliers are isolated, namely the outliers are not used as training samples and do not participate in training in the model construction. Therefore, the clustering influence of the data features of the abnormal users on the SOM model is reduced, the clustering precision and stability of the SOM model are improved, and the accuracy of the prediction result of the constructed marketing model for the marketing strategy is improved.
In step 230, whether the first training result includes a mixed region is determined according to preset classification ranges. Specifically, the first training result is divided into a plurality of regions, and for any region, if the data features of at least two users in the region fall in different classification ranges, the region is determined to be a mixed region.
In the embodiment of the invention, the number of the second training models is determined according to the number of the mixing areas. Specifically, the number of second training models corresponds to the number of blending regions.
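The way sub-models are added, one per mixed region, can be sketched as a recursive construction. This is a simplified illustration under several assumptions that the text does not specify: a region is taken to be the set of samples mapped to the same output neuron, a user's category is taken from the preset range containing the mean of the feature vector, training uses a winner-only update, and initial weights are copies of evenly spaced samples. All function names are hypothetical, and `n_neurons` must be at least 2.

```python
import math


def category_of(features, upper_bounds=(0.4, 0.7, 1.0)):
    """Assumed rule: the category is the preset range containing the mean feature value."""
    mean_value = sum(features) / len(features)
    for category, upper in enumerate(upper_bounds):
        if mean_value <= upper:
            return category
    return len(upper_bounds) - 1


def winner(sample, weights):
    distances = [math.dist(sample, w) for w in weights]
    return distances.index(min(distances))


def train_som(samples, n_neurons, learning_rate=0.5, decay=0.9, min_learning_rate=0.01):
    """Winner-only training; initial weights are copies of evenly spaced samples."""
    last = len(samples) - 1
    weights = [list(samples[round(i * last / (n_neurons - 1))]) for i in range(n_neurons)]
    while learning_rate >= min_learning_rate:
        for sample in samples:
            k = winner(sample, weights)
            weights[k] = [w + learning_rate * (x - w) for w, x in zip(weights[k], sample)]
        learning_rate *= decay
    return weights


def build_tree(samples, n_neurons, depth=1, max_depth=5):
    """Train one SOM, then attach a child SOM (same neuron count) to every mixed region."""
    weights = train_som(samples, n_neurons)
    regions = {}
    for sample in samples:
        regions.setdefault(winner(sample, weights), []).append(sample)
    node = {"weights": weights, "children": {}}
    for neuron, members in regions.items():
        mixed = len({category_of(m) for m in members}) >= 2
        # Guards against endless recursion when a region cannot be split further.
        splittable = len(members) < len(samples) and depth < max_depth
        if mixed and splittable:
            node["children"][neuron] = build_tree(members, n_neurons, depth + 1, max_depth)
    return node


def tree_depth(node):
    return 1 + max((tree_depth(c) for c in node["children"].values()), default=0)


samples = [[0.1, 0.2], [0.2, 0.1], [0.5, 0.6], [0.9, 0.8], [0.8, 0.9]]
tree = build_tree(samples, n_neurons=2)
print(sorted(tree["children"]), tree_depth(tree))
```

In this toy run the root SOM produces one single region and one mixed region, so exactly one child SOM is created; the child separates the remaining two categories and becomes a leaf.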
In order to better illustrate the technical solution of the present invention, fig. 3 is a schematic diagram of a marketing model provided by an exemplary embodiment of the present invention. As shown in FIG. 3, the marketing model includes a 3-layer SOM model in total. Wherein, the layer 1 SOM model is a root SOM model; the layer 2 SOM model is a sub SOM model of the layer 1 SOM model; the layer 3 SOM model is a sub SOM model of the layer 2 SOM model and is also a leaf model.
As an example based on FIG. 3, before the data features of the users are input into the first training model, the weights of the root SOM model are initialized and a feature matrix is constructed. Suppose the feature matrix is constructed from the data features of 4 users, which are respectively user a1: [0.24, 0.27, 0.28, 0.35], user a2: [0.61, 0.65, 0.58, 0.53], user a3: [0.82, 0.73, 0.91, 0.75], and user a4: [0.72, 0.83, 0.81, 0.92]. The feature matrix is obtained as follows:
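$$
\begin{bmatrix}
0.24 & 0.27 & 0.28 & 0.35 \\
0.61 & 0.65 & 0.58 & 0.53 \\
0.82 & 0.73 & 0.91 & 0.75 \\
0.72 & 0.83 & 0.81 & 0.92
\end{bmatrix}
$$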
The feature matrix is input into the first training model, which is the root SOM model, and the root SOM model is trained to obtain a first training result. In the embodiment of the present invention, the first training result may be referred to as a first output plane, and the first output plane is the root plane.
Illustratively, the root plane includes a first region and a second region. The first region includes user a1: [0.24, 0.27, 0.28, 0.35] and user a2: [0.61, 0.65, 0.58, 0.53]; the second region includes user a3: [0.82, 0.73, 0.91, 0.75] and user a4: [0.72, 0.83, 0.81, 0.92]. The classification ranges are assumed to be 0 to 0.4, 0.4 to 0.7, and 0.7 to 1.
It can be seen that user a1 and user a2 in the first region fall in different classification ranges and therefore belong to different categories, so the first region is a mixed region. User a3 and user a4 in the second region fall in the same classification range and therefore belong to the same category, so the second region is a single region.
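The region check in this example can be sketched as follows. The text does not state how a multi-dimensional data feature is matched against a classification range, so this sketch assumes, purely for illustration, that the mean of the feature vector is compared with the range boundaries; `RANGE_EDGES`, `category_of`, and `is_mixed_region` are hypothetical names.

```python
from statistics import mean

# Boundaries between the example ranges 0-0.4, 0.4-0.7 and 0.7-1.
RANGE_EDGES = [0.4, 0.7]

def category_of(features):
    """Index of the classification range that contains the user's score."""
    score = mean(features)  # assumption: the mean summarizes the feature vector
    return sum(score >= edge for edge in RANGE_EDGES)

def is_mixed_region(region):
    """A region is mixed when its users fall in at least two different ranges."""
    return len({category_of(f) for f in region.values()}) > 1

first_region = {
    "a1": [0.24, 0.27, 0.28, 0.35],
    "a2": [0.61, 0.65, 0.58, 0.53],
}
second_region = {
    "a3": [0.82, 0.73, 0.91, 0.75],
    "a4": [0.72, 0.83, 0.81, 0.92],
}

print(is_mixed_region(first_region))   # True: a1 and a2 fall in different ranges
print(is_mixed_region(second_region))  # False: a3 and a4 fall in the same range
```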
For each mixed region, the feature data of the users in the mixed region is mapped to a second training model of the first training model. Here the second training model is a layer 2 SOM model, which is trained on the feature data of the users in the mixed region to obtain a second training result. In the embodiment of the present invention, the second training result may be referred to as a tree plane; that is, every output plane other than the root plane is a tree plane.
After the second training result is obtained, it is taken as the first training result, and step 230 is executed again until the first training result does not include a mixed region. For example, the training result of the layer 2 SOM model is taken as the first training result, and it is judged whether this result includes a mixed region; if so, the mixed region in the training result of the layer 2 SOM model is mapped to a layer 3 SOM model, and so on until the first training result does not include a mixed region.
Taking FIG. 3 as an example, assume that the training result of the layer 2 SOM model includes two mixed regions y1 and y2. The layer 3 SOM models then include an SOM model z1 and an SOM model z2, where SOM model z1 is trained on mixed region y1 and SOM model z2 is trained on mixed region y2. If the training results of SOM model z1 and SOM model z2 do not include a mixed region, SOM model z1 and SOM model z2 serve as leaf models in the marketing model.
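The layer-by-layer growth described above is naturally recursive: train a model, split its output plane into regions, and give each mixed region its own child model. The sketch below illustrates only that control flow. `toy_train` is a hypothetical stand-in for SOM training (it simply sorts users by mean feature and cuts them into two regions), the mean-based range test is an assumption, and `max_layers` is an added safety bound that the source does not mention.

```python
from statistics import mean

RANGE_EDGES = [0.4, 0.7]  # example ranges 0-0.4, 0.4-0.7, 0.7-1

def is_mixed(region):
    # Assumption: a user's category is the range containing the feature mean.
    cats = {sum(mean(f) >= e for e in RANGE_EDGES) for f in region.values()}
    return len(cats) > 1

def toy_train(users):
    """Hypothetical stand-in for SOM training: returns (model, regions)."""
    ordered = sorted(users, key=lambda u: mean(users[u]))
    half = max(1, len(ordered) // 2)
    regions = [
        {u: users[u] for u in ordered[:half]},
        {u: users[u] for u in ordered[half:]},
    ]
    return "som", [r for r in regions if r]

def build_hierarchy(users, train, layer=1, max_layers=5):
    """Train one model, then recurse into every mixed region of its output."""
    model, regions = train(users)
    node = {"layer": layer, "model": model, "children": []}
    for region in regions:
        if is_mixed(region) and layer < max_layers:
            # One child model per mixed region, as in the FIG. 3 example.
            node["children"].append(
                build_hierarchy(region, train, layer + 1, max_layers)
            )
    return node

def depth(node):
    """Number of layers in the resulting tree of models."""
    return 1 + max((depth(c) for c in node["children"]), default=0)

users = {
    "a1": [0.24, 0.27, 0.28, 0.35],
    "a2": [0.61, 0.65, 0.58, 0.53],
    "a3": [0.82, 0.73, 0.91, 0.75],
    "a4": [0.72, 0.83, 0.81, 0.92],
}
tree = build_hierarchy(users, toy_train)
```

With the four example users, the root produces one mixed region (a1, a2) and one single region (a3, a4), so the tree has one layer 2 model, which is also a leaf model.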
In the embodiment of the present invention, for any layer of SOM model shown in fig. 3, the SOM model performs the following training operations:
The weights and learning rate of the output neurons are initialized.
For example, the weights of the output neurons of the output layer are assigned random decimal values and normalized to obtain $W_j$, where $W_j$ denotes the weight of the j-th output neuron. An initial value is assigned to the learning rate η; for example, η may be an empirically preset value such as 0.15, which is not specifically limited herein.
A winning neuron is determined from the output neurons based on the characteristics of the data to be input.
Specifically, the features of the data to be input are normalized; the product of the normalized data features and the weight of each output neuron is calculated; and the output neuron corresponding to the maximum product is determined as the winning neuron. For example, the features of the data to be input are normalized to obtain $X_p$, where $X_p$ denotes the p-th normalized feature of the data to be input; the product of $X_p$ and $W_j$ is calculated, and the output neuron $W_{ij}$ corresponding to the maximum product is taken as the winning neuron, where $W_{ij}$ denotes the i-th neuron among the output neurons.
Alternatively, a second Euclidean distance between the features of the data to be input and the weight of each output neuron is calculated, and the output neuron corresponding to the smallest second Euclidean distance is determined as the winning neuron.
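The two selection rules can be sketched as below. For unit-length vectors the rules agree, because the squared distance equals 2 minus twice the inner product, so the largest product is also the smallest distance. The function names are hypothetical, and "multiplication result" is interpreted here as the inner product of the normalized input and the normalized weight vector, which is an assumption.

```python
import numpy as np

def normalize(v):
    """Scale a vector (or each row of a matrix) to unit Euclidean length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def winner_by_product(x, weights):
    """Index of the output neuron with the largest inner product."""
    return int(np.argmax(normalize(weights) @ normalize(x)))

def winner_by_distance(x, weights):
    """Index of the output neuron with the smallest Euclidean distance."""
    w = np.asarray(weights, dtype=float)
    x = np.asarray(x, dtype=float)
    return int(np.argmin(np.linalg.norm(w - x, axis=1)))

weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three illustrative output neurons
x = [2.0, 2.1]

print(winner_by_product(x, weights))                          # 2
print(winner_by_distance(normalize(x), normalize(weights)))   # 2
```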
The weight of the winning neuron is then calculated according to the features of the data to be input, and this is repeated until the learning rate meets the training condition.
Specifically, the weight of the winning neuron is calculated according to the following formula (1):

$$W_{ij}(t+1) = W_{ij}(t) + \eta(t)\,\bigl(x - W_{ij}(t)\bigr) \tag{1}$$

where $W_{ij}(t+1)$ denotes the weight of the winning neuron at time t+1, $W_{ij}(t)$ denotes the weight of the winning neuron at time t, $\eta(t)$ denotes the learning rate at time t, and $x$ denotes the features of the data to be input. The training condition is that the learning rate is 0 or that the learning rate is smaller than the learning threshold.
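Formula (1) and the stopping condition can be illustrated with the short sketch below. The source does not specify how the learning rate changes over time, so the geometric decay used here, together with the names `update_winner`, `train_winner`, `decay`, and `learning_threshold`, is an assumption made only to obtain a runnable example.

```python
import numpy as np

def update_winner(weight, x, eta):
    """Formula (1): W(t+1) = W(t) + eta(t) * (x - W(t))."""
    weight = np.asarray(weight, dtype=float)
    x = np.asarray(x, dtype=float)
    return weight + eta * (x - weight)

def train_winner(weight, x, eta0=0.15, decay=0.5, learning_threshold=0.01):
    """Repeat formula (1) until the learning rate drops below the threshold."""
    eta = eta0
    steps = 0
    while eta >= learning_threshold:
        weight = update_winner(weight, x, eta)
        eta *= decay  # assumed schedule: halve the learning rate each step
        steps += 1
    return weight, steps

w0 = np.array([0.0, 1.0])
x = np.array([1.0, 1.0])
w1 = update_winner(w0, x, 0.15)        # one step moves 15% of the way toward x
w_final, steps = train_winner(w0, x)   # learning rates 0.15, 0.075, 0.0375, 0.01875
```

Each update pulls the winning neuron's weight toward the input by a fraction η(t) of the remaining gap, so a decreasing learning rate makes later inputs disturb the learned weights less.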
In one application scenario, the marketing strategy is a coupon, and the marketing model is used to determine whether the user is sensitive to the coupon, and to classify the user into 4 categories, namely very sensitive, more sensitive, generally sensitive, and insensitive.
The static data of the user includes the user's age, gender, and the like, and the dynamic data of the user is the user's historical order data within a one-year time range. The proportion of the user's orders that take part in promotional activities is used to judge the category of the user: the higher the proportion of orders in which the user takes part in promotional activities, the higher the sensitivity, and the larger the proportion of the discount amount, the higher the sensitivity. The first weight of the static feature may be set to 0.3, and the second weight of the dynamic feature may be set to 0.7.
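A possible way to turn these quantities into a weighted data feature is sketched below. The order fields `amount` and `discount`, the two-element dynamic feature, and the choice to scale each part by its weight and concatenate the parts are all assumptions for illustration; the text here only fixes the example weights 0.3 and 0.7.

```python
def dynamic_features(orders):
    """Share of orders that used a promotion, and share of the amount discounted."""
    if not orders:
        return [0.0, 0.0]
    promo_share = sum(1 for o in orders if o["discount"] > 0) / len(orders)
    total = sum(o["amount"] for o in orders)
    discount_share = sum(o["discount"] for o in orders) / total if total else 0.0
    return [promo_share, discount_share]

def combine_features(static, dynamic, w_static=0.3, w_dynamic=0.7):
    """Scale each part by its weight and concatenate (assumed combination rule)."""
    return [w_static * s for s in static] + [w_dynamic * d for d in dynamic]

# Hypothetical one-year order history for a single user.
orders = [
    {"amount": 100.0, "discount": 20.0},
    {"amount": 100.0, "discount": 0.0},
]
static = [0.5]  # e.g. an age value already scaled to the range 0 to 1
data_feature = combine_features(static, dynamic_features(orders))
```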
Based on the above technical solution, FIG. 4 is a schematic distribution diagram of the feature data of users before being input into the marketing model according to an exemplary embodiment of the present invention. As shown in FIG. 4, the diagram includes a plurality of data points in 4 categories, presented as circle markers, triangle markers, quadrilateral markers, and five-pointed star markers respectively. The circle marker represents the very sensitive category, the triangle marker represents the more sensitive category, the quadrilateral marker represents the generally sensitive category, and the five-pointed star marker represents the insensitive category.
Fig. 5 is a schematic diagram illustrating distribution of feature data of a user after inputting a marketing model according to an exemplary embodiment of the present invention. It can be seen that after passing through the marketing model, each data point is clustered.
In the technical scheme, any area of the leaf model is a single area, so that the best matched neuron of each input vector (namely the data characteristic of the user) can be found in the single area, the accuracy and stability of user clustering are improved, and the accuracy of the prediction result of the established marketing model for the marketing strategy is improved. And cluster analysis is carried out through a plurality of SOM models, and various analysis modes such as user class preference, user grouping, marketing sensitivity and the like are synthesized, so that the accuracy of the marketing model on the prediction result of the marketing strategy is improved.
Based on the same technical concept, fig. 6 schematically illustrates a structural diagram of a model building apparatus provided by an embodiment of the present invention, where the apparatus may perform the flow of the model building method described above.
As shown in fig. 6, the apparatus specifically includes:
a determining module 610, configured to obtain static data and dynamic data of a user, and determine data characteristics of the user;
a processing module 620, configured to input the data features of the user into a first training model, to obtain a first training result;
judging whether the first training result comprises a mixing region or not; if yes, mapping the characteristic data of the user in the mixed region to a second training model of the first training model aiming at any mixed region to obtain a second training result; the mixing region includes data characteristics of at least two categories of users; the number of output neurons of the second training model is the same as that of the first training model;
taking the second training result as a first training result until the first training result is judged to not comprise a mixing region;
constructing a marketing model according to the first training model and the second training model; the marketing model is used to output a user's sensitivity to a marketing strategy.
Optionally, the determining module 610 is specifically configured to:
extracting features of the static data and the dynamic data of the user to obtain the static features and the dynamic features of the user;
and determining the data characteristics of the user according to the static characteristics and the first weight of the user, the dynamic characteristics and the second weight of the user.
Optionally, the processing module 620 is further configured to:
before the data features of the users are input into a first training model, calculating a first Euclidean distance between the data features of the users and the data features of other users according to the data features of any user;
and carrying out anomaly detection on the user according to the first Euclidean distance.
Optionally, the processing module 620 is specifically configured to:
calculating the proportion of the first Euclidean distances that are greater than a first threshold;
judging whether the quantity proportion is larger than a second threshold value or not; if yes, determining the user as an abnormal user; otherwise, the user is determined to be a normal user.
Optionally, the processing module 620 is specifically configured to:
dividing the first training result into a plurality of areas;
for any region, if the data characteristics of at least two users in the region are in different preset classification ranges, the region is a mixed region, and the first training result is determined to comprise the mixed region;
if the data characteristics of all users in the area are in the same preset classification range, the area is a single area, and the first training result is determined to not comprise a mixed area; the single region includes data features of users of the same category.
Based on the same technical concept, the embodiment of the invention further provides a computer device, including:
a memory for storing program instructions;
and the processor is used for calling the program instructions stored in the memory and executing the model construction method according to the obtained program.
Based on the same technical idea, the embodiment of the present invention also provides a computer-readable storage medium storing computer-executable instructions for causing a computer to execute the above-described model building method.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the spirit or scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to cover such modifications and variations.

Claims (10)

1. A method of constructing a marketing model, comprising:
acquiring static data and dynamic data of a user, and determining data characteristics of the user;
inputting the data characteristics of the user into a first training model to obtain a first training result;
judging whether the first training result comprises a mixing region or not; if yes, mapping the characteristic data of the user in the mixed region to a second training model of the first training model aiming at any mixed region to obtain a second training result; the mixing region includes data characteristics of at least two categories of users; the number of output neurons of the second training model is the same as that of the first training model;
taking the second training result as a first training result until the first training result is judged to not comprise a mixing region;
constructing a marketing model according to the first training model and the second training model; the marketing model is used to output a user's sensitivity to a marketing strategy.
2. The method of claim 1, wherein determining the data characteristics of the user from static data and dynamic data of the user comprises:
extracting features of the static data and the dynamic data of the user to obtain the static features and the dynamic features of the user;
and determining the data characteristics of the user according to the static characteristics and the first weight of the user, the dynamic characteristics and the second weight of the user.
3. The method of claim 1, wherein prior to entering the user's data features into the first training model, further comprising:
for the data characteristics of any user, calculating a first Euclidean distance between the data characteristics of the user and the data characteristics of other users;
and carrying out anomaly detection on the user according to the first Euclidean distance.
4. The method of claim 3, wherein anomaly detection for the user based on the first euclidean distance comprises:
calculating the proportion of the first Euclidean distances that are greater than a first threshold;
judging whether the quantity proportion is larger than a second threshold value or not; if yes, determining the user as an abnormal user; otherwise, the user is determined to be a normal user.
5. The method of claim 1, wherein determining whether the first training result includes a blending region comprises:
dividing the first training result into a plurality of areas;
for any region, if the data characteristics of at least two users in the region are in different preset classification ranges, the region is a mixed region, and the first training result is determined to comprise the mixed region;
if the data characteristics of all users in the area are in the same preset classification range, the area is a single area, and the first training result is determined to not comprise a mixed area; the single region includes data features of users of the same category.
6. The method of any one of claims 1 to 5, further comprising:
the first training model and the second training model perform the following training operations:
initializing the weight and the learning rate of the output neurons;
determining a winning neuron from the output neurons according to the characteristics of the data to be input;
and calculating the weight of the winning neuron according to the characteristics of the data to be input until the learning rate meets the training condition.
7. The method of claim 6, wherein determining winning neurons from the output neurons based on characteristics of data to be input comprises:
normalizing the data characteristics to be input;
calculating the multiplication result of the normalized data characteristic and the weight of the output neuron;
determining an output neuron corresponding to the maximum multiplication result as a winning neuron;
or
calculating a second Euclidean distance between the data characteristics to be input and the weights of the output neurons;
the output neuron corresponding to the smallest second euclidean distance is determined as the winning neuron.
8. A marketing model building apparatus, comprising:
the determining module is used for acquiring static data and dynamic data of the user and determining data characteristics of the user;
the processing module is used for inputting the data characteristics of the user into a first training model to obtain a first training result;
judging whether the first training result comprises a mixing region or not; if yes, mapping the characteristic data of the user in the mixed region to a second training model of the first training model aiming at any mixed region to obtain a second training result; the mixing region includes data characteristics of at least two categories of users; the number of output neurons of the second training model is the same as that of the first training model;
taking the second training result as a first training result until the first training result is judged to not comprise a mixing region;
constructing a marketing model according to the first training model and the second training model; the marketing model is used to output a user's sensitivity to a marketing strategy.
9. A computer device, comprising:
a memory for storing program instructions;
a processor for invoking program instructions stored in said memory to perform the method of any of claims 1-7 in accordance with the obtained program.
10. A computer-readable storage medium storing computer-executable instructions for causing a computer to perform the method of any one of claims 1 to 7.
Priority Applications (1)

Application Number: CN202310127830.XA
Priority Date: 2023-02-17
Filing Date: 2023-02-17
Title: Marketing model construction method and device

Publications (1)

Publication Number: CN116151857A (en)
Publication Date: 2023-05-23

Family

Family ID: 86338717
Family Applications (1): CN202310127830.XA, Pending, published as CN116151857A (en)

Country Status (1)

CN: CN116151857A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117291649A (en) * 2023-11-27 2023-12-26 云南电网有限责任公司信息中心 Intensive marketing data processing method and system
CN117291649B (en) * 2023-11-27 2024-02-23 云南电网有限责任公司信息中心 Intensive marketing data processing method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination