CN106022800A - User feature data processing method and device - Google Patents
- Publication number
- CN106022800A (CN201610323618.0A)
- Authority
- CN
- China
- Prior art keywords
- user
- data
- model
- client
- customer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/01—Customer relationship services
Landscapes
- Business, Economics & Management (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- Economics (AREA)
- Finance (AREA)
- Marketing (AREA)
- Strategic Management (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention provides a user feature data processing method. The method comprises the following steps: obtaining user behavior data and data item information data from a data source; carrying out data integration on the obtained user behavior data and the data item information data according to different service logics to obtain user feature data corresponding to the service logics; and carrying out processing on the user feature data by utilizing packaging models corresponding to the user feature data to obtain processing result data corresponding to the service logics. The plurality of packaging models are established, and processing is carried out on the user feature data by utilizing the packaging models corresponding to the user feature data to obtain the processing result data corresponding to the service logics, thereby providing full-amount data mining model encapsulation for enterprises, and providing more accurate user behavior feature information for the enterprises.
Description
Technical field
The present invention relates to the technical field of data mining analysis and data modeling, and in particular to a method and device for processing user feature data.
Background
As the customer-centric management philosophy takes hold, analyzing customers, understanding them, and guiding their needs have become important topics in enterprise operation. With data mining technology, an enterprise can make the fullest use of its customer resources, analyze and predict customer behavior, and classify customers. This helps with customer profitability analysis, finding potentially valuable customers, providing personalized service, and improving customer satisfaction and loyalty.
However, existing enterprise member management only performs statistical analysis of members' basic information, transaction data, and the like. It does not mine and analyze members' Internet behavior in depth, such as their Internet behavior preferences and social network preferences. Even enterprises with fairly mature IT systems only mine and analyze sampled data and lack the capability to mine massive big data.
Summary of the invention
The present invention provides a method and device for processing user feature data, which can provide an enterprise with model encapsulation for full-volume data mining and with more accurate user behavior feature information.
In one aspect, an embodiment of the present invention provides a method for processing user feature data, including:
obtaining user behavior data and data item information data from a data source;
performing data integration on the obtained user behavior data and data item information data according to different service logics, to obtain user feature data corresponding to the service logic;
processing the user feature data using the packaging model corresponding to the user feature data, to obtain processing result data corresponding to the service logic.
In another aspect, an embodiment of the present invention provides a device for processing user feature data, including:
an acquisition module, configured to obtain user behavior data and data item information data from a data source;
an integration module, configured to perform data integration on the obtained user behavior data and data item information data according to different service logics, to obtain user feature data corresponding to the service logic;
a processing module, configured to process the user feature data using the packaging model corresponding to the user feature data, to obtain processing result data corresponding to the service logic.
In the embodiments of the present invention, multiple packaging models are established, and the user feature data is processed using the packaging model corresponding to it to obtain processing result data corresponding to the service logic. This can provide an enterprise with model encapsulation for full-volume data mining and with more accurate user behavior feature information.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the present application and form a part of it. The schematic embodiments of the present application and their descriptions are used to explain the present application and do not unduly limit it. In the drawings:
Fig. 1A is a schematic flowchart of a method for processing user feature data according to an embodiment of the present invention;
Fig. 1B is a schematic flowchart of one implementation of step 102 in Fig. 1A;
Fig. 1C is a schematic flowchart of establishing the customer segmentation model of the customer investigation period according to an embodiment of the present invention;
Fig. 1D is a schematic flowchart of establishing the customer diffusion model of the customer maturity period according to an embodiment of the present invention;
Fig. 2 is a system architecture diagram of the pattern shop according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of establishing the customer segmentation model according to an embodiment of the present invention;
Fig. 4 is a schematic flowchart of establishing the customer value degree model according to an embodiment of the present invention;
Fig. 5 is a schematic flowchart of establishing the customer loyalty identification model according to an embodiment of the present invention;
Fig. 6 is a schematic flowchart of establishing the customer diffusion model according to an embodiment of the present invention;
Fig. 7 is a schematic flowchart of establishing the customer liveness model according to an embodiment of the present invention;
Fig. 8 is a schematic flowchart of establishing the customer social network analysis model according to an embodiment of the present invention;
Fig. 9 is a schematic flowchart of establishing the customer churn early-warning model according to an embodiment of the present invention;
Fig. 10 is a schematic diagram of the modules of the device for processing user feature data according to an embodiment of the present invention.
Detailed description of the invention
The embodiments of the present application are described in detail below with reference to the drawings and examples, so that the process by which the application applies technical means to solve technical problems and achieve technical effects can be fully understood and implemented accordingly.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include non-permanent memory in computer-readable media, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape and disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
Certain terms are used in the description and claims to refer to specific components. Those skilled in the art will understand that hardware manufacturers may use different names for the same component. This description and the claims do not distinguish components by differences in name, but by differences in function. "Comprising", as used throughout the description and claims, is an open-ended term and should be interpreted as "including but not limited to". "Substantially" means that, within an acceptable error range, those skilled in the art can solve the technical problem and basically achieve the technical effect. In addition, the term "coupled" here covers any direct or indirect means of electrical coupling. Therefore, if the text describes a first device as coupled to a second device, the first device may be electrically coupled to the second device directly, or indirectly through other devices or coupling means. The description that follows sets out preferred embodiments for implementing the application; it is intended to illustrate the general principles of the application and not to limit its scope. The scope of protection of the application is defined by the appended claims.
It should also be noted that the terms "include", "comprise", or any other variant are intended to cover non-exclusive inclusion, so that a commodity or system that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to the commodity or system. Without further limitation, an element defined by the statement "including a ..." does not exclude the existence of other identical elements in the commodity or system that includes the element.
Fig. 1A is a schematic flowchart of a method for processing user feature data according to an embodiment of the present invention. As shown in Fig. 1A:
Step 101: obtain user behavior data and data item information data from a data source.
In the embodiments of the present invention, the data source may include first-party data and third-party data. The data types of the data source may include structured, semi-structured, and unstructured data. Specifically, first-party data may be user behavior data held in the first party's own CRM system or own website; third-party data may be e-commerce or online media website data held by third-party merchants, and the like.
The user behavior data is data that can represent user behavior, and may specifically include user browsing behavior data, purchasing behavior data, and the like.
The data item information data is the basic information of the user, and may specifically include the user's name, age, gender, mobile phone number, unique identification number, and other basic information.
The embodiments of the present invention use data mining to obtain various kinds of user information effectively. For example, data mining can reveal whether consumers (users) who buy a certain commodity are male or female, and their education, income, hobbies, occupation, and so on. It can even reveal which users are likely to buy this commodity after buying related commodities, and what type of user will buy which model of it. With data mining, the effectiveness and response rate of advertisements sent to target users can be greatly improved, and the cost of distribution greatly reduced. At the same time, on the basis of user data mining, an enterprise can identify key users and evaluate market performance, formulate personalized marketing strategies, and broaden sales channels and scope, which provides a scientific basis for the enterprise's production strategy and development planning.
Step 102: according to different service logics, perform data integration on the obtained user behavior data and data item information data to obtain user feature data corresponding to the service logic.
In the embodiments of the present invention, the service logic is the logic for processing data.
The user feature data is the set of user feature attributes that each model takes as input data.
For example, the customer value degree service logic aims to distinguish high, medium, and low user behavior value. Integrating the obtained user behavior data and data item information data yields the user feature data that this service logic takes as input: last purchase time, cumulative purchase frequency, and cumulative consumption amount.
Step 102 may perform data integration on the obtained user behavior data and data item information data as follows. In a specific implementation, this includes:
receiving, linking ("getting through"), and integrating user behavior data and data item information data from multiple data sources, and finally updating them to a unified data platform. Specifically, as shown in Fig. 1B, step 102 may include:
1021: receive user behavior data and data item information data from multiple data sources. Specifically, these may be received from the BD-OS big data operating system.
1022: link the user behavior data and data item information data of the different data sources. Specifically, the data may be linked according to a preset linking relationship, by writing a shell script that links the data of the different data sources. The preset linking relationship may be set through matching identifiers in the user behavior data and/or data item information data of the different data sources. The data sources include a first data source and a second data source, and the specific method is as follows:
determine a first matching identifier group in the user behavior data and/or data item information data of the first data source, and a second matching identifier group in the user behavior data and/or data item information data of the second data source;
when a matching identifier in the first matching identifier group is determined to also be present in the second matching identifier group, link the user behavior data and data item information data of the first data source and the second data source.
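The matching rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `link_sources`, the field names (`phone`, `email`, `order_id`, and so on), and the sample records are all hypothetical, and the patent itself describes doing this with a shell script on the data platform rather than with in-memory dictionaries.

```python
def link_sources(first, second, candidate_keys):
    """Link two lists of records on the first identifier field they share.

    first, second: lists of dicts (one dict per record).
    candidate_keys: hypothetical identifier fields, in priority order.
    """
    # Matching identifier group of each source: candidate fields it contains.
    first_group = {k for k in candidate_keys if any(k in r for r in first)}
    second_group = {k for k in candidate_keys if any(k in r for r in second)}
    shared = [k for k in candidate_keys if k in first_group and k in second_group]
    if not shared:
        return []  # no common identifier, so the sources cannot be linked
    key = shared[0]
    # Index the first source by the shared identifier, then merge matches.
    index = {r[key]: r for r in first if key in r}
    linked = []
    for r in second:
        match = index.get(r.get(key))
        if match is not None:
            linked.append({**match, **r})
    return linked


# Hypothetical first-party (CRM) and third-party (e-commerce) records.
crm = [
    {"phone": "13800000001", "name": "Zhang", "age": 30},
    {"phone": "13800000002", "name": "Li", "age": 41},
]
orders = [
    {"phone": "13800000001", "order_id": "A1", "item_id": "P9"},
    {"phone": "13900000009", "order_id": "A2", "item_id": "P3"},
]
linked = link_sources(crm, orders, ["phone", "email"])
```

Here only the first order shares a phone number with a CRM record, so `linked` holds a single merged record that carries both the basic information and the order fields.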
An actual application scenario is used as an illustration below. For example, the first-party data (the first data source) comes from a CRM and stores data item information data, including basic user data such as ID, name, age, mobile phone number, marital status, whether the user has children, location, and mailbox.
The third-party data (the second data source) comes from percentage point e-commerce data, such as when the user added which commodities to the shopping cart, when the order was placed, and which payment method was used; that is, user behavior data such as order ID, order time, purchased commodity ID, commodity source, user ID, user name, and user mobile phone number.
First, by matching on mobile phone number, the users in the first-party data and the third-party data are linked and mapped to the same user.
Then the information of this user is integrated. After integration it comprises: ID, name, age, mobile phone number, marital status, whether the user has children, location, mailbox, order ID, order time, purchased commodity ID, commodity source, user ID, user name, and user mobile phone number. Specifically, the linked and integrated data may be as shown in Table 1, where the order ID and user ID can be denoised by linear dimensionality reduction.
Table 1
1023: denoise the linked data. Specifically, linear dimensionality reduction may be used to denoise the linked data in Table 1. There are many dimensionality reduction methods: depending on the characteristics of the data, linear or nonlinear dimensionality reduction may be used; depending on whether the supervision information of the data is considered and used, unsupervised or supervised dimensionality reduction may be used.
1024: store the processed data in the unified data platform. Specifically, the processed data may be stored in an HDFS file of the BD-OS big data operating system. Here, the processed data is the data obtained after steps 1022 and 1023 above have been performed.
Step 103: process the user feature data using the packaging model corresponding to the user feature data, to obtain processing result data corresponding to the service logic.
Specifically, before step 103, the method includes:
establishing the correspondence between user feature data and packaging models.
The packaging models described in the embodiments of the present invention include the customer segmentation model of the customer investigation period, the customer value model of the customer formation period, the customer loyalty identification model, the customer diffusion model of the customer maturity period, the customer liveness model, the social network analysis model, or the customer churn early-warning model of the customer decline period.
Specifically, in the embodiments of the present invention, the correspondence between user feature data and packaging models may be as follows:
The user feature data for the customer segmentation model comprises user ID, user name, user age, marital status, whether the user has children, and features such as household appliances, digital products, clothing and accessories, shoes, automobile products, and sports and outdoors. In the embodiments of the present invention, the customer segmentation model can be used to gain effective insight into customers and to discover customer category features effectively, thereby improving both efficiency and benefit.
The user feature data for the customer value degree model comprises user ID, last purchase time, cumulative purchase frequency, and cumulative consumption amount. In the embodiments of the present invention, the customer value degree model can be used as a means of measuring customer value and customer profitability.
The user feature data for the customer loyalty identification model comprises user ID, cumulative number of logins, cumulative dwell time, cumulative login days, and so on. In the embodiments of the present invention, the customer loyalty model can be used to quantify the quality level of customers and can identify an enterprise's loyal customers and ordinary customers.
The user feature data for the customer diffusion model comprises user ID, local life, beauty and body shaping, beauty, manicure, and so on. In the embodiments of the present invention, the customer diffusion model can be used to understand customer characteristics, help mine existing potential customers, and label different categories of potential customers according to customer data features and group features.
The user feature data for the customer liveness model comprises user ID, last posting time, last reply time, number of posts in the past half year, number of replies in the past half year, cumulative login days, cumulative dwell time, and so on. In the embodiments of the present invention, the customer liveness model can be used to calculate a website's active visits, and can also calculate the number of active users in different time intervals.
The user feature data for the social network analysis model comprises the poster ID, the replier ID, and so on. In the embodiments of the present invention, the customer social network analysis model can be used to analyze important users on a social network platform, identifying the opinion leaders, activists, and socialites in the social network formed by the platform.
The user feature data for the customer churn early-warning model comprises user ID, cumulative access days, cumulative access duration, number of purchases, total consumption, average daily visits, bounce rate, proportion of access days in the last month, and so on. In the embodiments of the present invention, the customer churn early-warning model can be used to analyze the group features of churned customers and to predict the probability of customer churn, so as to identify customers with a high churn probability. Combined with the customer value model, it can identify high-value customers who need focused maintenance and who show a tendency to churn, which is a very important analysis tool for maintaining high-end customers.
To this end, as shown in Fig. 1C, the embodiments of the present invention may also include establishing the customer segmentation model of the customer investigation period:
1051: using known user feature data, set the number of user groups and the center point of each user group according to user value and/or user behavior criteria. In the embodiments of the present invention, user feature data may include or correspond to user behavior data. For example, if the known user feature data of young girls is being under 18, liking Korean dramas, and so on, then when a piece of user behavior data "repeatedly browses Korean dramas" is obtained, the user corresponding to that behavior data may be a young girl.
Known user feature data refers to known user group categories and features, usually given from experience by the enterprise's business personnel or domain experts. The center point of each user group is then set according to the significant distinguishing features of the different groups.
For example, boys generally like "sports" and girls generally like "Taobao". When most of the known user feature data is "sports" or "Taobao", two user groups can be set, and the significant distinguishing feature of the two groups is whether the user likes "sports" or "Taobao".
1052: calculate the distance between each user and each user group center point by a distributed computing method. Specifically, the Euclidean distance is used to calculate the distance between each user and each user group center point.
1053: according to the distance between each user and each user group center point, assign the user to the user group whose center point is nearest.
The customer segmentation model can divide users into N classes, and is used to gain effective insight into customers and discover customer category features, thereby improving both efficiency and benefit.
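Steps 1052 and 1053 amount to a nearest-center assignment, which the following minimal Python sketch illustrates. It runs on a single machine rather than with the distributed computing the text mentions, and the function name `assign_to_groups`, the two-dimensional feature encoding (preference for "sports", preference for "Taobao"), and the sample values are all assumptions made for illustration.

```python
import math


def assign_to_groups(users, centers):
    """Assign each user to the group whose center point is nearest.

    users: {user_id: feature tuple}; centers: {group_name: feature tuple}.
    Distances are Euclidean, as in step 1052.
    """
    assignment = {}
    for user_id, features in users.items():
        assignment[user_id] = min(
            centers, key=lambda g: math.dist(features, centers[g])
        )
    return assignment


# Hypothetical centers set by business experts (step 1051).
centers = {"sports": (1.0, 0.0), "taobao": (0.0, 1.0)}
users = {"u1": (0.9, 0.2), "u2": (0.1, 0.8), "u3": (0.6, 0.3)}
groups = assign_to_groups(users, centers)
```

With these values, `u1` and `u3` fall into the "sports" group and `u2` into the "taobao" group.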
The embodiments of the present invention also include establishing the customer value model of the customer formation period:
according to each user's last consumption time, cumulative purchase frequency, and cumulative consumption amount, perform user value analysis on the user, and assign the user to the corresponding user value group.
Specifically, user value analysis can be carried out as follows:
First calculate the mean of each value variable across users, then compare each user's value of each value variable with that mean: a value higher than the mean is marked "+" and a lower one "-", thereby dividing users into different user value groups, as in Table 2 below. In Table 2, value variable M represents the customer's cumulative consumption amount (Monetary), value variable F represents the customer's cumulative purchase frequency (Frequency), and value variable R represents the last consumption time (Recency).
Table 2
M | F | R | Output result label
+ | + | + | Important value customer
- | + | + | General value customer
- | - | + | General development customer
- | - | - | General retention customer
+ | - | + | Important development customer
+ | - | - | Important retention customer
- | + | - | General retention customer
+ | + | - | Important maintenance customer
The customer value degree model can be used as a means of measuring customer value and customer profitability.
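The comparison against the mean and the lookup in Table 2 can be sketched as follows. This is a hedged illustration: the function name `value_groups` and the sample figures are hypothetical, R is assumed to be encoded as a number where a larger value means a more recent purchase, a value exactly equal to the mean is treated as "-" (the text does not say how ties are handled), and the label mapping simply follows Table 2 as printed.

```python
from statistics import mean

# (M, F, R) sign pattern -> label, following Table 2 as printed.
LABELS = {
    ("+", "+", "+"): "Important value customer",
    ("-", "+", "+"): "General value customer",
    ("-", "-", "+"): "General development customer",
    ("-", "-", "-"): "General retention customer",
    ("+", "-", "+"): "Important development customer",
    ("+", "-", "-"): "Important retention customer",
    ("-", "+", "-"): "General retention customer",
    ("+", "+", "-"): "Important maintenance customer",
}


def value_groups(users):
    """users: {user_id: {"M": amount, "F": frequency, "R": recency score}}."""
    variables = ("M", "F", "R")
    # Mean of each value variable across all users.
    means = {v: mean(u[v] for u in users.values()) for v in variables}
    result = {}
    for user_id, u in users.items():
        # "+" if above the mean, otherwise "-" (ties counted as "-": assumption).
        signs = tuple("+" if u[v] > means[v] else "-" for v in variables)
        result[user_id] = LABELS[signs]
    return result


# Hypothetical users; R is a numeric recency score (larger = more recent).
customers = {
    "a": {"M": 900, "F": 12, "R": 28},
    "b": {"M": 100, "F": 2, "R": 3},
    "c": {"M": 800, "F": 1, "R": 25},
}
labels = value_groups(customers)
```

In this sample the means are M = 600, F = 5, and R ≈ 18.7, so user `a` is above the mean on all three variables, `b` is below on all three, and `c` is above on M and R only.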
The embodiments of the present invention also include establishing the customer loyalty identification model:
calculate each user's loyalty score for a certain product. The loyalty score lies between 0 and 100; the larger the value, the higher the loyalty. The product includes, but is not limited to, a website or a brand.
The customer loyalty model can be used to quantify the quality level of customers and can identify an enterprise's loyal customers and ordinary customers.
Referring to Fig. 1D, the embodiments of the present invention may also include establishing the customer diffusion model of the customer maturity period:
1061: according to the imported seed user data, extract the full-dimension data of the seed users.
A seed user is a user with certain user feature data. The user feature data of a seed user is defined differently in different scenarios: in e-commerce it may be a user who likes to buy a certain class of goods; in media it may be a user who likes to read a certain type of novel, and so on.
Full-dimension data refers to all feature attributes of a seed user, such as the seed user's media or e-commerce preference data and browsing records.
1062: perform outlier handling, missing-value handling, and data normalization on the full-dimension data of the seed users, to obtain the feature points of the seed users.
Specifically, the processing of step 1062 is implemented as follows:
use the normal distribution method to identify seed user outliers;
then delete the missing values that are present;
finally, apply min-max standardization to normalize the data, so that the results are mapped to the interval [0, 1].
The user feature data corresponding to the mapped results are the feature points of the seed users.
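The three processing steps of 1062 can be sketched as below. Several points are assumptions rather than details from the text: the "normal distribution method" is interpreted as a k-sigma rule on each feature column, records flagged as outliers are dropped, a missing value is represented by `None`, and the function name `preprocess` and the sample rows are hypothetical.

```python
from statistics import mean, pstdev


def preprocess(rows, sigma=3.0):
    """Outlier removal, missing-value deletion, then min-max scaling to [0, 1].

    rows: list of equal-length feature lists; None marks a missing value.
    """
    n_cols = len(rows[0])
    # Step 1: per-column mean and standard deviation, ignoring missing values.
    stats = []
    for j in range(n_cols):
        col = [r[j] for r in rows if r[j] is not None]
        stats.append((mean(col), pstdev(col)))

    def is_outlier(row):
        # k-sigma rule: a value more than sigma standard deviations from the mean.
        return any(
            v is not None and sd > 0 and abs(v - mu) > sigma * sd
            for v, (mu, sd) in zip(row, stats)
        )

    kept = [r for r in rows if not is_outlier(r)]
    # Step 2: delete records that still contain missing values.
    kept = [r for r in kept if None not in r]
    # Step 3: min-max standardization of each column.
    lows = [min(r[j] for r in kept) for j in range(n_cols)]
    highs = [max(r[j] for r in kept) for j in range(n_cols)]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(r, lows, highs)]
        for r in kept
    ]


# Hypothetical seed-user features; sigma is lowered because the sample is tiny.
seed_rows = [[1, 10], [2, 20], [3, None], [2, 30], [100, 20]]
points = preprocess(seed_rows, sigma=1.5)
```

In this sample the record `[100, 20]` is dropped as an outlier and `[3, None]` is dropped for its missing value, leaving three feature points scaled to [0, 1]. With very small samples a 3-sigma threshold can never be exceeded, which is why the example passes a lower `sigma`.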
1063: according to the feature points of the seed users, perform cluster analysis on the seed users and extract the customer groups similar to the seed users.
1064: calculate the similarity between recently active users and the seed users. Specifically, the KNN algorithm is used to calculate the similarity between recently active users and the seed users. The larger the similarity value, the more similar the user; the smaller the similarity value, the less similar the user.
1065: sort by similarity in descending order and extract the N users with the highest similarity. The extracted user data is the group of users, diffused from the seed users, that have similar features; obtaining this group is the purpose of the diffusion model.
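Steps 1064 and 1065 can be illustrated with the sketch below. The text does not define how KNN yields a similarity value, so this sketch assumes one possible reading: a candidate's similarity is 1 / (1 + mean Euclidean distance to its k nearest seed feature points). The function names `knn_similarity` and `diffuse` and all sample values are hypothetical.

```python
import math


def knn_similarity(candidate, seed_points, k=2):
    """Similarity of one user to the seed users (assumed definition).

    Uses the mean distance to the k nearest seed feature points and maps it
    to (0, 1], so that a larger value means a more similar user.
    """
    nearest = sorted(math.dist(candidate, s) for s in seed_points)[:k]
    return 1.0 / (1.0 + sum(nearest) / len(nearest))


def diffuse(candidates, seed_points, n, k=2):
    """Return the IDs of the n candidates most similar to the seed users."""
    scored = [
        (knn_similarity(p, seed_points, k), user_id)
        for user_id, p in candidates.items()
    ]
    scored.sort(key=lambda t: t[0], reverse=True)  # descending similarity
    return [user_id for _, user_id in scored[:n]]


# Hypothetical normalized feature points.
seeds = [(0.9, 0.1), (0.8, 0.2), (1.0, 0.0)]
active = {"u1": (0.85, 0.15), "u2": (0.1, 0.9), "u3": (0.7, 0.3)}
top = diffuse(active, seeds, n=2)
```

Here `u1` lies closest to the seed points, `u3` next, and `u2` is far away, so the two extracted users are `u1` and `u3`.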
The customer diffusion model can be used to understand customer characteristics, help mine existing potential customers, and label different categories of potential customers according to customer data features and group features.
The embodiments of the present invention also include establishing the customer liveness model:
according to preset liveness factors, calculate the liveness of each user. The liveness lies between 0 and 100; the larger the value, the higher the liveness.
The liveness can be calculated as follows:
assign a weight to each preset liveness factor, normalize each liveness factor using min-max standardization, then take the logarithm and multiply by the weight, and finally obtain each user's liveness value by weighted summation.
The customer liveness model can be used to calculate a website's active visits, and can also calculate the number of active users in different time intervals.
For example, suppose the liveness factors are user feature data such as last posting time, last reply time, number of posts in the past half year, number of replies in the past half year, cumulative login days, and cumulative dwell time. A weight is assigned to each of these, each is normalized using min-max standardization, the logarithm is taken and multiplied by the weight, and finally each user's liveness value is obtained by weighted summation.
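The calculation described above can be sketched as follows. The text does not specify the logarithm base, how a normalized value of 0 is handled, or how the sum is scaled to 0-100, so this sketch makes explicit assumptions: it uses log2(1 + x), which maps [0, 1] onto [0, 1], requires the weights to sum to 1, and multiplies the weighted sum by 100. The function name `liveness_scores`, the factor names, and the sample values are hypothetical.

```python
import math


def liveness_scores(users, weights):
    """Weighted liveness score in [0, 100] for each user.

    users: {user_id: {factor: raw numeric value}}.
    weights: {factor: weight}, assumed to sum to 1.
    """
    lows = {f: min(u[f] for u in users.values()) for f in weights}
    highs = {f: max(u[f] for u in users.values()) for f in weights}
    scores = {}
    for user_id, u in users.items():
        total = 0.0
        for f, w in weights.items():
            span = highs[f] - lows[f]
            # Min-max standardization to [0, 1].
            x = (u[f] - lows[f]) / span if span else 0.0
            # log2(1 + x) is an assumed transform that avoids log(0).
            total += w * math.log2(1.0 + x)
        scores[user_id] = 100.0 * total
    return scores


# Hypothetical factors and weights.
weights = {"posts": 0.5, "replies": 0.3, "login_days": 0.2}
forum_users = {
    "a": {"posts": 40, "replies": 90, "login_days": 150},
    "b": {"posts": 0, "replies": 0, "login_days": 10},
    "c": {"posts": 20, "replies": 30, "login_days": 80},
}
scores = liveness_scores(forum_users, weights)
```

User `a` has the maximum on every factor and scores 100 (up to floating-point rounding), `b` has the minimum on every factor and scores 0, and `c` falls in between. Time-valued factors such as the last posting time would first need a numeric encoding, which is not shown here.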
The embodiment of the present invention also includes establishing a social network analysis model:
The out-degree, in-degree, and corresponding degree centrality of each user node in the social network are calculated in order to identify the opinion leaders, activists, and social butterflies in the network.
Social network analysis builds a social network graph in which nodes represent users and edges represent the relationships between users. The out-degree, in-degree, and corresponding degree centrality are then calculated, and the results are compared with set thresholds (which can generally be chosen empirically) to identify the opinion leaders, activists, and social butterflies in the network. These terms are defined as follows.
1) Opinion leader: a user who receives many replies from other users.
In-degree: the number of edges entering a user node. Each time a replier replies to a poster, one in-degree is recorded for the poster.
2) Activist: a user who replies to other users frequently.
Out-degree: the number of edges leaving a user node. Each time a replier replies to a poster, one out-degree is recorded for the replier.
3) Social butterfly: a user who exchanges replies with other users and forms an important node in the relationship network.
Betweenness centrality: calculated from the shortest paths between pairs of user nodes in the graph. The formula is as follows:
C_B(v) = Σ_{s≠v≠t} σst(v) / σst
σst: the number of all shortest paths from user node s to user node t
σst(v): the number of those shortest paths from user node s to t that pass through user node v
The more often a user node appears on these paths, the higher its betweenness centrality, and the user corresponding to a node with high betweenness centrality is an important node in the network. Specifically, when the betweenness centrality of a node is higher than a preset threshold, the user corresponding to that node is determined to be an important node in the network.
The customer social network analysis model can be used to analyze the important users on a social network platform and to identify the opinion leaders, activists, and social butterflies in the social network formed by the platform.
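As an illustration of the betweenness centrality measure described above, the sketch below computes, for every node v, the sum over node pairs (s, t) of σst(v) / σst on a small directed reply graph, using Brandes' algorithm (a standard technique that the source does not name). The edge list, the direction convention (replier to poster), and the threshold value are hypothetical.

```python
from collections import deque

def betweenness(nodes, edges):
    """Betweenness centrality for an unweighted directed graph.

    nodes: iterable of node ids; edges: iterable of unique (source, target) pairs.
    Returns {node: sum over s != v != t of sigma_st(v) / sigma_st}.
    """
    adj = {v: [] for v in nodes}
    for s, t in edges:
        adj[s].append(t)
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:                 # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:      # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies, farthest nodes first.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

# Hypothetical reply graph: an edge (x, y) means "x replied to y".
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("a", "d"), ("d", "c"), ("c", "e")]
centrality = betweenness(nodes, edges)

THRESHOLD = 2.0   # hypothetical preset threshold
important = [v for v, c in centrality.items() if c > THRESHOLD]
```

In this graph every shortest path into "e" passes through "c", so "c" is the only node above the threshold.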
The embodiment of the present invention also includes establishing a churn early-warning model for the customer decline period:
The group characteristics of churned customers are analyzed through modeling to predict the probability of customer churn. Groups with a high churn probability are flagged and, in combination with user value, the churn-prone groups that deserve priority maintenance are screened out. Specifically, the customer churn early-warning model is established as follows:
Given the group characteristics of known churned users, a logistic regression algorithm with a 0-1 dependent variable is used to determine the regression coefficient of each feature variable. The feature data of the users to be predicted is then fed into the fitted model to estimate each user's churn probability. The resulting probability is compared with a preset threshold, users above the threshold are flagged as churn risks, and, in combination with user value, the churn-prone groups that deserve priority maintenance are screened out.
The customer churn early-warning model can be used to analyze the group characteristics of churned customers, predict the probability of customer churn, and flag customers with a high churn probability. Combined with the customer value model, it can identify high-value customers with a tendency to churn who need priority maintenance, which is an important means of analysis for maintaining top-tier customers.
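The fit-then-score flow of the churn early-warning model can be sketched as below. This is a minimal, dependency-free illustration that fits the logistic regression by batch gradient descent; the source only states that regression coefficients are determined, so the optimizer, the learning rate, the two features, the training data, the 0.5 threshold, and the value scores are all assumptions of this sketch.

```python
import math

def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit an intercept and coefficients by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)                      # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def churn_probability(w, x):
    """Estimated P(Y = 1 | x) for one user."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Hypothetical known users: [recent visit share, purchases], both min-max scaled.
X = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.6], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
y = [0, 0, 0, 1, 1, 1]                       # 1 = churned
w = fit_logistic(X, y)

# Hypothetical users to be predicted, with an assumed value score per user.
candidates = {"u1": [0.15, 0.2], "u2": [0.85, 0.7], "u3": [0.25, 0.15]}
value = {"u1": 90, "u2": 80, "u3": 10}
THRESHOLD = 0.5                              # hypothetical preset threshold
flagged = [u for u, x in candidates.items() if churn_probability(w, x) > THRESHOLD]
priority = [u for u in flagged if value[u] >= 50]   # high-value churn risks
```

Here `flagged` holds the users whose estimated churn probability exceeds the threshold, and `priority` narrows that list to the users whose assumed value score is high.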
By establishing multiple packaging models and processing the user feature data with the packaging model corresponding to that data to obtain the processing result data corresponding to the service logic, the embodiment of the present invention can provide enterprises with full-volume data mining model encapsulation and with more accurate user behavior feature information.
The specific implementation of the present invention is described in detail below through specific embodiments.
Fig. 2 is the system architecture diagram of the model store of the embodiment of the present invention. As shown in Fig. 2, the model store system mainly consists of the following components:
1) Data acquisition: user behavior data and data item information data from multiple data sources are obtained through a data platform, for example the BD-OS big data operating system;
The user behavior data is data that can represent user behavior, and may specifically include user browsing behavior data, purchasing behavior data, and so on;
The data item information is the basic information of the user, and may specifically include the user's name, age, sex, and other basic information.
2) Data platform: implemented with a distributed processing framework and customized, improved big data mining techniques. It integrates the acquired user behavior data and data item information data according to different service logics to obtain the user feature data corresponding to each service logic;
In the embodiment of the present invention, the service logic is the logic used to process the data;
The user feature data is the set of user feature attributes that each model takes as its input data;
3) Model store: the input, parameters, output, and other information of a model are configured through a simple configuration page. The packaging model corresponding to the user feature data is used to process that data and obtain the processing result data corresponding to the service logic;
4) Scheduler: the packaged models are executed by a scheduler;
5) Early warning and monitoring: during model execution, a monitoring system monitors the system and gives a timely warning if an execution error occurs. Specifically, this is implemented with the Nagios network monitoring tool, which monitors the model execution tasks; when a task fails, a warning can be sent by email or SMS.
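The interaction of the components listed above can be sketched as a small pipeline. Everything here is hypothetical scaffolding for illustration: the `ModelStore` class, the `integrate` and `schedule` functions, the "frequent_buyer" service logic, and the alert callback (which merely stands in for the email or SMS warning) are not named in the source.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]
Model = Callable[[List[Record]], List[Record]]

class ModelStore:
    """Hypothetical registry holding one packaged model per service logic."""

    def __init__(self) -> None:
        self._models: Dict[str, Model] = {}

    def register(self, service_logic: str, model: Model) -> None:
        self._models[service_logic] = model

    def run(self, service_logic: str, features: List[Record]) -> List[Record]:
        return self._models[service_logic](features)

def integrate(behavior: List[Record], items: List[Record],
              fields: List[str]) -> List[Record]:
    """Join behavior data with data item information by uid and keep
    only the feature fields that one service logic needs."""
    info = {row["uid"]: row for row in items}
    merged = [{**info.get(row["uid"], {}), **row} for row in behavior]
    return [{key: row.get(key) for key in ["uid", *fields]} for row in merged]

def schedule(store: ModelStore, service_logic: str, features: List[Record],
             alert: Callable[[str], None]) -> Optional[List[Record]]:
    """Run a packaged model and report any failure through the alert callback."""
    try:
        return store.run(service_logic, features)
    except Exception as exc:
        alert(f"{service_logic} failed: {exc!r}")
        return None

# Hypothetical data and a trivial packaged model.
behavior = [{"uid": 1, "purchases": 12}, {"uid": 2, "purchases": 1}]
items = [{"uid": 1, "age": 30}, {"uid": 2, "age": 45}]
store = ModelStore()
store.register(
    "frequent_buyer",
    lambda rows: [{"uid": r["uid"], "label": r["purchases"] >= 10} for r in rows],
)
alerts: List[str] = []
features = integrate(behavior, items, ["purchases", "age"])
result = schedule(store, "frequent_buyer", features, alerts.append)
missing = schedule(store, "not_configured", features, alerts.append)
```

Requesting a service logic that has no registered model raises an error inside `run`, which `schedule` converts into an alert instead of stopping the caller.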
Fig. 3 describes the implementation of the customer segmentation model:
1. The model system divides users into groups. In general the analysis is carried out from two aspects, user value and/or user behavior, and can be performed with a clustering algorithm. The input data of the model system (that is, the user feature data corresponding to each model) is therefore determined by the concrete scenario, such as e-commerce user segmentation or media user segmentation. In the embodiment of the present invention, user value and user behavior refer to aspects such as the behavioral value of the user and the user's online browsing and shopping behavior, which are analyzed with a clustering algorithm.
2. After the concrete scenario is determined, the model system sets the number of segments and the center point of each group. The center points are derived from the characteristics of known user groups.
3. Using distributed computation, the Euclidean distance between each existing user and each group center point is calculated from the user's feature data, and the user is assigned to the class whose center is nearest. The model system is refreshed periodically, and new users are assigned to the known groups.
The Euclidean distance formula is d = sqrt((x1-x2)^2 + (y1-y2)^2), where d is the distance, x1, y1 are the two features of user 1, and x2, y2 are the two features of user 2.
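The assignment step can be sketched as follows, assuming the group center points have already been set. The group names, center coordinates, and user features are hypothetical, and the sketch runs in a single process rather than on a distributed framework.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign_segment(user, centers):
    """Return the name of the group whose center point is nearest to the user."""
    return min(centers, key=lambda name: euclidean(user, centers[name]))

# Hypothetical group centers and users, each described by two scaled features.
centers = {"high_value": (0.9, 0.8), "browsers": (0.2, 0.9), "dormant": (0.1, 0.1)}
users = {"u1": (0.8, 0.7), "u2": (0.3, 0.8), "u3": (0.0, 0.2)}
segments = {uid: assign_segment(features, centers) for uid, features in users.items()}
```

Because each user is assigned independently of the others, the same function could be applied per partition of the user data when the computation is distributed.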
Fig. 4 describes the implementation of the customer value model.
Customers are grouped by RFM according to each customer's most recent consumption time (Recency), cumulative purchase frequency (Frequency), and cumulative spending amount (Monetary), which divides them into groups of different value. The specific division is shown in Table 2. The inputs for the three value variables Recency, Frequency, and Monetary differ by business scenario.
The specific grouping rule is as follows:
First the mean of each value variable over all users is calculated, and each user's value for each variable is compared with that mean: a value above the mean is marked "+", and a value below it is marked "-". All customers are finally divided into 8 groups, as in Table 2:
Table 2
M | F | R | Output label |
+ | + | + | Important value customer |
- | + | + | General value customer |
- | - | + | General development customer |
- | - | - | General win-back customer |
+ | - | + | Important development customer |
+ | - | - | Important win-back customer |
- | + | - | General retention customer |
+ | + | - | Important retention customer |
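The grouping rule of Table 2 can be sketched as a lookup on the three signs. The label strings are this sketch's own English renderings of the Table 2 labels, and two details are assumptions: recency is encoded so that a larger number means a more recent purchase, and a value exactly equal to the mean is marked "-" (the source does not say how ties are handled).

```python
LABELS = {  # keys are (M, F, R) signs, in the column order of Table 2
    ("+", "+", "+"): "important value customer",
    ("-", "+", "+"): "general value customer",
    ("-", "-", "+"): "general development customer",
    ("-", "-", "-"): "general win-back customer",
    ("+", "-", "+"): "important development customer",
    ("+", "-", "-"): "important win-back customer",
    ("-", "+", "-"): "general retention customer",
    ("+", "+", "-"): "important retention customer",
}

def rfm_segments(customers):
    """customers: {uid: (recency, frequency, monetary)}.

    Assumption: recency is a number where larger means more recent.
    """
    n = len(customers)
    means = [sum(c[i] for c in customers.values()) / n for i in range(3)]

    def sign(value, mean):
        return "+" if value > mean else "-"   # assumption: ties count as "-"

    result = {}
    for uid, (r, f, m) in customers.items():
        key = (sign(m, means[2]), sign(f, means[1]), sign(r, means[0]))
        result[uid] = LABELS[key]
    return result

# Hypothetical customers: (day number of last purchase, purchase count, total spend).
customers = {
    "a": (300, 20, 5000),
    "b": (100, 2, 100),
    "c": (290, 3, 4000),
    "d": (120, 15, 200),
}
groups = rfm_segments(customers)
```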
In practical applications, the standards for setting the three value variables differ between business scenarios such as the fast-moving consumer goods (FMCG) industry and the automotive industry: FMCG products are replaced frequently, whereas products in the automotive industry are replaced comparatively rarely.
Fig. 5 describes the implementation of the customer loyalty model.
1) Principal component analysis is used to give a recommended weight for each loyalty variable;
2) The efficiency coefficient method is applied to each loyalty variable to normalize its value to the range 0-100: (value of the loyalty variable - minimum of that loyalty variable) / (maximum of that loyalty variable - minimum of that loyalty variable) * 100
3) The mapped value of each loyalty variable is multiplied by its weight and the products are summed to obtain a loyalty score.
4) The calculated loyalty score is mapped to the range 0-100, and loyalty parameters representing "high, medium, low" loyalty are passed in through the front-end configuration page in order to adjust the score bands for "high, medium, low" loyalty. Specifically, the loyalty parameters are set through parameter input on the front-end page according to the scenario. For example: high loyalty: 0.3; medium loyalty: 0.3; low loyalty: 0.4. The three parameters sum to 1.
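Steps 2) to 4) above can be sketched as follows. The weights are supplied directly rather than derived by principal component analysis, and the variable names and data are hypothetical. The source does not say whether the three loyalty parameters are shares of the 0-100 score range or shares of users; this sketch assumes they are shares of the score range, so 0.3 / 0.3 / 0.4 gives the bands [70, 100], [40, 70), and [0, 40).

```python
def efficiency_coefficient(values):
    """Map each value to 0-100: (value - min) / (max - min) * 100."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # constant variable carries no information
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) * 100.0 for v in values]

def loyalty_scores(variables, weights):
    """variables: {name: [value per user]}, weights: {name: weight}.

    Weights are rescaled to sum to 1 so that every score stays within 0-100.
    """
    total = sum(weights.values())
    n_users = len(next(iter(variables.values())))
    scores = [0.0] * n_users
    for name, column in variables.items():
        for i, mapped in enumerate(efficiency_coefficient(column)):
            scores[i] += mapped * weights[name] / total
    return scores

def tier(score, high=0.3, medium=0.3, low=0.4):
    """Assumption: each parameter is that tier's share of the 0-100 score range."""
    if abs(high + medium + low - 1.0) > 1e-9:
        raise ValueError("the three loyalty parameters must sum to 1")
    if score >= 100.0 * (1.0 - high):
        return "high"
    if score >= 100.0 * low:
        return "medium"
    return "low"

# Hypothetical loyalty variables for three users.
variables = {"repeat_purchases": [0, 5, 10], "visit_days": [10, 20, 30]}
weights = {"repeat_purchases": 0.6, "visit_days": 0.4}
scores = loyalty_scores(variables, weights)
tiers = [tier(s) for s in scores]
```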
Fig. 6 describes the implementation of the customer diffusion model.
1) Data import module: imports the uid, telephone number, and email of the seed users. Seed users are users who have certain user feature data, and they are defined differently in different scenarios: in e-commerce they may be users who like to buy a certain category of goods, and in media they may be users who like to read a certain kind of novel, and so on;
2) Seed user data extraction module: extracts the full-dimension data of the seed users. Full-dimension data means all feature attributes of a seed user, such as the seed user's media or e-commerce preference data and browsing records.
3) Data preprocessing module: handles outliers and missing values in the seed user data and normalizes the data;
4) Manual feature screening: the seed user feature attributes are selected, such as online time period, purchasing power, and browsing and purchase features for commodity and media categories; the most recent active time is selected; and the importance of each feature attribute is specified manually. This screening can also be done automatically, based on an automatic analysis of the users' feature attributes;
5) Automatic feature screening: if the user skips manual feature screening, the factor analysis module is enabled to extract the features of the seed users;
6) Clustering module: cluster analysis is applied to the seed users to extract the user groups similar to the seed users. Specifically, the representative user C of each cluster is extracted;
7) Similarity calculation module: calculates the similarity between each user who satisfies the recent-active-time condition and user C. Specifically, the KNN algorithm is used to calculate the similarity between the recently active users and the representative user. The larger the similarity value, the more similar the user is to the representative user; the smaller the value, the less similar.
8) Sorting and extraction module: users are sorted by similarity from largest to smallest, and the data of the top N users is extracted.
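Steps 6) to 8) can be sketched as a nearest-neighbor ranking against a representative user. Several choices here are assumptions of the sketch rather than statements from the source: the representative user C is taken to be the centroid of the seed users, similarity is defined as 1 / (1 + Euclidean distance), and the feature values are hypothetical and already normalized.

```python
import math

def centroid(points):
    """Mean feature vector of a list of users (assumed representative user C)."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def similarity(a, b):
    """Assumption: similarity = 1 / (1 + Euclidean distance), so 1.0 means identical."""
    return 1.0 / (1.0 + math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))))

def top_n_lookalikes(representative, candidates, n):
    """Rank candidate users by similarity to the representative, largest first."""
    ranked = sorted(
        candidates.items(),
        key=lambda item: similarity(item[1], representative),
        reverse=True,
    )
    return [(uid, similarity(features, representative)) for uid, features in ranked[:n]]

# Hypothetical seed users and recently active candidate users.
seeds = [(0.8, 0.9), (1.0, 0.7)]
candidates = {"u1": (0.9, 0.8), "u2": (0.5, 0.5), "u3": (0.0, 0.1)}
user_c = centroid(seeds)
lookalikes = top_n_lookalikes(user_c, candidates, n=2)
```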
Fig. 7 describes the implementation of the customer activity model.
1) Principal component analysis is used to give a recommended weight for each activity variable.
2) The efficiency coefficient method is applied to each activity variable to normalize its value to the range 0-100: (value of the activity variable - minimum of that activity variable) / (maximum of that activity variable - minimum of that activity variable) * 100.
3) The mapped value of each activity variable is multiplied by its weight and the products are summed to obtain an activity score.
4) The calculated activity score is mapped to the range 0-100, and activity parameters representing "high, medium, low" activity are passed in through the front-end configuration page in order to adjust the score bands for "high, medium, low" activity. Specifically, the activity parameters are set through parameter input on the front-end page according to the scenario. For example: high activity: 0.5; medium activity: 0.3; low activity: 0.2. The three parameters sum to 1.
Fig. 8 describes the social network model.
The social network model is established as follows:
1) Data preparation: data is extracted and a social intermediate table is generated. The data of each user in the social network is extracted, which may include user behavior data and data item information data, such as user id, posting time, post topic, post content, replying user id, reply time, and reply content. From this data the social intermediate table (containing poster ID, replier ID, and number of replies) is generated. The social intermediate table is used to generate the social network graph, in which nodes represent users, edges represent relationships, and edge values represent the number of replies.
2) Model building: the out-degree, in-degree, and corresponding degree centrality of each user node in the social network are calculated to identify the opinion leaders, activists, and social butterflies in the network.
3) Model storage: the social network model is generated and stored in the prescribed format.
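The step from the social intermediate table to in-degree and out-degree can be sketched as follows. Each reply adds one in-degree for the poster and one out-degree for the replier, as defined in this description; the table rows and the two thresholds are hypothetical.

```python
from collections import defaultdict

def degrees(reply_table):
    """reply_table: rows of (poster_id, replier_id, reply_count).

    Each reply counts one in-degree for the poster and one out-degree
    for the replier.
    """
    in_degree = defaultdict(int)
    out_degree = defaultdict(int)
    for poster, replier, count in reply_table:
        in_degree[poster] += count
        out_degree[replier] += count
    return dict(in_degree), dict(out_degree)

# Hypothetical social intermediate table.
table = [
    ("alice", "bob", 3),
    ("alice", "carol", 2),
    ("bob", "carol", 4),
    ("carol", "alice", 1),
]
in_deg, out_deg = degrees(table)

IN_THRESHOLD = 5     # hypothetical, chosen empirically
OUT_THRESHOLD = 5    # hypothetical, chosen empirically
opinion_leaders = [u for u, d in in_deg.items() if d >= IN_THRESHOLD]
activists = [u for u, d in out_deg.items() if d >= OUT_THRESHOLD]
```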
The technical terms used here are explained as follows.
A) Opinion leader: a user who receives many replies from other users.
In-degree: the number of edges entering a user node. Each time a replier replies to a poster, one in-degree is recorded for the poster.
B) Activist: a user who replies to other users frequently.
Out-degree: the number of edges leaving a user node. Each time a replier replies to a poster, one out-degree is recorded for the replier.
C) Social butterfly: a user who exchanges replies with other users and forms an important node in the relationship network.
Betweenness centrality: calculated from the shortest paths between pairs of user nodes in the graph. The formula is as follows:
C_B(v) = Σ_{s≠v≠t} σst(v) / σst
σst: the number of all shortest paths from user node s to user node t
σst(v): the number of those shortest paths from user node s to t that pass through user node v
The more often a user node appears on these paths, the higher its betweenness centrality, and the user corresponding to a node with high betweenness centrality is an important node in the network. Specifically, when the betweenness centrality of a node is higher than a preset threshold, the user corresponding to that node is determined to be an important node in the network.
Fig. 9 describes the customer churn early-warning model.
The group characteristics of churned customers are analyzed through modeling to predict the probability of customer churn. Groups with a high churn probability are flagged and, in combination with user value, the churn-prone groups that deserve priority maintenance are screened out.
The definition of customer churn usually differs by application scenario. For media, churn is generally defined by the time of the user's visits and browsing; for e-commerce and offline retailers, it is defined by the time of the user's actual orders and purchases.
Given the group characteristics of known churned users, the customer churn early-warning model uses a logistic regression algorithm with a 0-1 dependent variable to determine the regression coefficient of each feature variable. The feature data of the users to be predicted is then fed into the fitted model to estimate each user's churn probability. The resulting probability is compared with a preset threshold, and users above the threshold are flagged as churn risks.
The customer churn early-warning model is described in detail as follows:
1) Data preparation: data is extracted and a churn intermediate table is generated. User basic information, user behavior data (such as browsing behavior data), and other data are extracted to generate the churn intermediate table (built from cumulative visit days, cumulative visit duration, number of purchases, total consumption amount, average daily visits, bounce rate, and the share of visit days in the most recent month).
Variable processing and selection
(B1) Dependent variable: churn status (0/1)
A user with no activity record within a certain period of time is defined as churned. If the churn period cannot be specified, the model selects the most suitable time span for defining churn from the overall distribution of the average intervals between active times in the user data.
(B2) Explanatory variables
Customer basic information: membership level, consumption level
Customer browsing information: visit days, visit duration, average daily visits, share of product page views, bounce rate, share of visit days in the most recent month (or three months, or six months)
Customer purchase information: number of purchases, total consumption amount
Customer contact information: number of complaints, number of returns, marketing email click-through rate
(B3) Data standardization
Different evaluation indicators often have different dimensions and units, which affects the results of data analysis. To eliminate the effect of differing dimensions between indicators, the data usually needs to be standardized (normalized) before it enters the model, so that the indicators become comparable.
The min-max standardization method applies a linear transformation to the raw data so that it falls into a specific interval, such as [0, 1].
(B4) Selection of significant variables
All explanatory variables are put into the regression model (the full model), and their regression parameters are estimated by the maximum likelihood method. Insignificant variables are removed through significance tests on the parameters or according to the AIC criterion, which determines the relevant variables selected for the final model and the regression parameter of each variable.
2) Model building: customer churn characteristics are captured by a regression method. Specifically, logistic regression can be used to capture the churn characteristics. It converts the 0-1 dependent variable into a continuous dependent variable in order to handle the regression problem for 0-1 variables, which is also known as qualitative variable regression.
3) Analysis and prediction: the potential churn probability of each user is analyzed and predicted.
Each indicator value of the user to be predicted is substituted into the fitted final model to estimate the customer's churn probability P(Y=1|x). Users are sorted by churn probability from high to low, and groups with a high churn probability are flagged. The share of users assigned to each label is chosen as follows:
(C1) Cut-off values p_1 and p_2 may be specified (for example p_1=0.4, p_2=0.8). Each user is labeled according to the churn probability score and assigned to one of 3 groups: high risk, medium risk, or low risk.
(C2) Alternatively, the group shares c_1, c_2, c_3 may be specified (for example c_1=20%, c_2=30%, c_3=50%). All users are then sorted by churn probability from high to low; the top 20% are marked as high-risk users, the lowest 50% are marked as low-risk users, and the remaining users are assigned to the medium-risk group.
The specific cut-off values and shares above can be adjusted according to the scale and needs of the specific marketing and retention measures.
4) Model storage: the customer churn early-warning model is generated and stored in the prescribed format.
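The two labeling options (C1) and (C2) in step 3 can be sketched as follows. The default cut-off values and shares are the examples given above; whether a probability exactly equal to a cut-off belongs to the higher or lower group, and how fractional group sizes are rounded, are assumptions of this sketch, and the probabilities are hypothetical.

```python
def label_by_cutoff(probs, p1=0.4, p2=0.8):
    """(C1) Label each user by fixed cut-off values on the churn probability."""
    def label(p):
        if p >= p2:            # assumption: cut-offs are inclusive on the upper group
            return "high"
        if p >= p1:
            return "medium"
        return "low"
    return {uid: label(p) for uid, p in probs.items()}

def label_by_share(probs, c1=0.2, c2=0.3):
    """(C2) Label users by rank: top c1 share high, next c2 share medium, rest low."""
    ranked = sorted(probs, key=probs.get, reverse=True)
    n_high = round(len(ranked) * c1)       # assumption: round to the nearest user
    n_medium = round(len(ranked) * c2)
    labels = {}
    for rank, uid in enumerate(ranked):
        if rank < n_high:
            labels[uid] = "high"
        elif rank < n_high + n_medium:
            labels[uid] = "medium"
        else:
            labels[uid] = "low"
    return labels

# Hypothetical churn probabilities 0.05, 0.15, ..., 0.95 for users u0..u9.
probs = {f"u{i}": (i + 0.5) / 10 for i in range(10)}
by_cutoff = label_by_cutoff(probs)
by_share = label_by_share(probs)
```

The two options generally produce different group sizes: the cut-off rule depends on how the probabilities are distributed, while the share rule always yields the configured proportions.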
Referring to Fig. 10, which is a module diagram of the device for processing user feature data provided by the embodiment of the present invention, the device comprises an acquisition module 1001, an integration module 1002, and a processing module 1003, as follows:
The acquisition module 1001 is configured to obtain user behavior data and data item information data from a data source;
The integration module 1002 is configured to integrate the acquired user behavior data and data item information data according to different service logics to obtain the user feature data corresponding to the service logic;
The processing module 1003 is configured to process the user feature data with the packaging model corresponding to the user feature data to obtain the processing result data corresponding to the service logic.
Further, the integration module 1002 is configured to:
receive, link, and integrate the user behavior data and data item information data of multiple data sources, and update them in real time to a unified data platform.
Further, the device for processing user feature data also comprises:
an establishing module, configured to establish the correspondence between user feature data and packaging models;
The packaging models include the customer segmentation model for the customer investigation period, the customer value model for the customer formation period, the customer loyalty identification model, the customer diffusion model for the customer maturity period, the customer activity model, the social network analysis model, or the churn early-warning model for the customer decline period.
Further, the establishing module is also configured to establish the customer segmentation model for the customer investigation period, which specifically includes:
setting, from known user data and according to user value and/or user behavior criteria, the number of user groups and the center point of each user group;
calculating, by distributed computation using the Euclidean distance method, the distance between each user and the center point of each user group;
assigning, according to the distance between each user and the center point of each user group, each user to the user group whose center point is nearest to that user.
Further, the establishing module is also configured to establish the customer value model for the customer formation period, which specifically includes:
performing user value analysis on each user according to the user's most recent consumption time, cumulative purchase frequency, and cumulative spending amount, and assigning the user to the corresponding user value group.
Further, the establishing module is also configured to establish the customer loyalty identification model, which specifically includes:
calculating each user's loyalty score for a certain product, where the loyalty score lies between 0 and 100 and a larger value indicates higher loyalty, and where the product includes but is not limited to a website or a brand.
Further, the establishing module is also configured to establish the customer diffusion model for the customer maturity period, which specifically includes:
extracting the full-dimension data of the seed users according to the imported seed user data;
performing outlier handling, missing-value handling, and data normalization on the full-dimension data of the seed users to obtain the feature points of the seed users;
applying cluster analysis to the seed users according to their feature points to extract the user groups similar to the seed users;
calculating, with the KNN algorithm, the similarity between the recently active users and the representative user;
sorting by similarity from largest to smallest and extracting the data of the N users with the highest similarity.
Further, the establishing module is also configured to establish the customer activity model, which specifically includes:
calculating the activity level of each user from preset activity factors, where the activity level lies between 0 and 100 and a larger value indicates a more active user.
Further, the establishing module is also configured to establish the social network analysis model, which specifically includes:
calculating the out-degree, in-degree, and corresponding degree centrality of each user node in the social network to identify the opinion leaders, activists, and social butterflies in the network.
Further, the establishing module is also configured to establish the churn early-warning model for the customer decline period, which specifically includes:
analyzing the group characteristics of churned customers through modeling to predict the probability of customer churn, flagging the groups with a high churn probability, and, in combination with user value, screening out the churn-prone groups that deserve priority maintenance.
This device embodiment corresponds to the features of the method embodiment above. Each module can perform the corresponding part of the method flow described in the method embodiment, which is not repeated here.
The above description shows and describes several preferred embodiments of the present invention. As stated above, however, it should be understood that the invention is not limited to the forms disclosed herein. These forms should not be regarded as excluding other embodiments; the invention can be used in various other combinations, modifications, and environments, and can be modified within the scope of the inventive concept described herein through the above teachings or through the skill or knowledge of the related art. Modifications and changes made by those skilled in the art that do not depart from the spirit and scope of the invention shall all fall within the protection scope of the appended claims.
Claims (20)
1. A method for processing user feature data, characterized by comprising:
obtaining user behavior data and data item information data from a data source;
integrating the obtained user behavior data and data item information data according to different service logics to obtain the user feature data corresponding to the service logic;
processing the user feature data with the packaging model corresponding to the user feature data to obtain the processing result data corresponding to the service logic.
2. The method of claim 1, characterized in that integrating the obtained user behavior data and data item information data according to different service logics to obtain the user feature data corresponding to the service logic comprises:
receiving, linking, and integrating the user behavior data and data item information data of multiple data sources, and updating them in real time to a unified data platform.
3. The method of claim 1, characterized in that, before processing the user feature data with the packaging model corresponding to the user feature data to obtain the processing result data corresponding to the service logic, the method comprises:
establishing the correspondence between user feature data and packaging models;
wherein the packaging models include the customer segmentation model for the customer investigation period, the customer value model for the customer formation period, the customer loyalty identification model, the customer diffusion model for the customer maturity period, the customer activity model, the social network analysis model, or the churn early-warning model for the customer decline period.
4. The method of claim 3, characterized by further comprising establishing the customer segmentation model for the customer investigation period:
setting, from known user data and according to user value and/or user behavior criteria, the number of user groups and the center point of each user group;
calculating, by distributed computation using the Euclidean distance method, the distance between each user and the center point of each user group;
assigning, according to the distance between each user and the center point of each user group, each user to the user group whose center point is nearest to that user.
5. The method of claim 3, characterized by further comprising establishing the customer value model for the customer formation period:
performing user value analysis on each user according to the user's most recent consumption time, cumulative purchase frequency, and cumulative spending amount, and assigning the user to the corresponding user value group.
6. The method of claim 3, characterized by further comprising establishing the customer loyalty identification model:
calculating each user's loyalty score for a certain product, where the loyalty score lies between 0 and 100 and a larger value indicates higher loyalty, and where the product includes but is not limited to a website or a brand.
7. The method of claim 3, characterized by further comprising establishing the customer diffusion model for the customer maturity period:
extracting the full-dimension data of the seed users according to the imported seed user data;
performing outlier handling, missing-value handling, and data normalization on the full-dimension data of the seed users to obtain the feature points of the seed users;
applying cluster analysis to the seed users according to their feature points to extract the user groups similar to the seed users;
calculating, with the KNN algorithm, the similarity between the recently active users and the representative user;
sorting by similarity from largest to smallest and extracting the data of the N users with the highest similarity.
8. The method of claim 3, characterized by further comprising establishing the customer activity model:
calculating the activity level of each user from preset activity factors, where the activity level lies between 0 and 100 and a larger value indicates a more active user.
9. The method of claim 3, characterized by further comprising establishing the social network analysis model:
calculating the out-degree, in-degree, and corresponding degree centrality of each user node in the social network to identify the opinion leaders, activists, and social butterflies in the network.
10. The method of claim 3, characterized by further comprising establishing the churn early-warning model for the customer decline period:
analyzing the group characteristics of churned customers through modeling to predict the probability of customer churn, flagging the groups with a high churn probability, and, in combination with user value, screening out the churn-prone groups that deserve priority maintenance.
11. A device for processing user feature data, comprising:
An acquisition module, configured to obtain user behavior data and data item information
data from a data source;
An integration module, configured to perform data integration on the obtained user behavior
data and data item information data according to different service logics, to obtain user
feature data corresponding to the service logics;
A processing module, configured to process the user feature data using the packaging model
corresponding to the user feature data, to obtain result data corresponding to the service
logics.
12. The device according to claim 11, wherein the integration module is configured to:
Receive, connect, and integrate user behavior data and data item information data from
multiple data sources, and finally update them to a unified data platform.
13. The device according to claim 11, further comprising:
An establishing module, configured to establish the correspondence between user feature
data and packaging models;
The packaging models include a customer segmentation model for the customer exploration
period, a customer value model for the customer formation period, a customer loyalty
identification model, a customer diffusion model for the customer maturity period, a
customer activity model, a social network analysis model, or a churn early-warning model
for the customer decline period.
14. The device according to claim 13, wherein the establishing module is further configured
to establish the customer segmentation model for the customer exploration period,
specifically including:
Setting, from known user data and according to user value and/or user behavior criteria,
the number of user groups and the center point of each user group;
Calculating, by distributed computation using the Euclidean distance, the distance between
each user and the center point of each user group;
Assigning each user, according to its distance to each user group's center point, to the
user group whose center point is nearest.
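The assignment step of this segmentation model amounts to a nearest-center lookup. The sketch below shows it on a single machine; the distributed computation mentioned in the claim is not reproduced, and the function name `assign_to_groups`, the group labels, and the feature vectors are hypothetical.

```python
import math

def assign_to_groups(users, centers):
    """Assign each user to the group whose center is nearest in Euclidean distance.

    users:   {user_id: feature_vector}
    centers: {group_name: center_vector}, preset from known user data
    """
    return {
        uid: min(centers, key=lambda group: math.dist(vec, centers[group]))
        for uid, vec in users.items()
    }

centers = {"high_value": [9.0, 9.0], "low_value": [1.0, 1.0]}
users = {"u1": [8.0, 7.5], "u2": [0.5, 2.0], "u3": [6.0, 6.0]}
groups = assign_to_groups(users, centers)
```

Because each user is handled independently of the others, the same lookup could be partitioned across workers, which is presumably what makes the distributed variant straightforward.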
15. The device according to claim 13, wherein the establishing module is further configured
to establish the customer value model for the customer formation period, specifically
including:
Performing user value analysis on each user according to the user's most recent purchase
time, cumulative purchase frequency, and cumulative spending amount, and assigning the user
to the corresponding user value group.
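The three inputs of this value model (most recent purchase time, cumulative purchase frequency, cumulative spending) correspond to a recency, frequency, monetary (RFM) analysis. The sketch below assumes one simple grouping rule that the claim does not specify: each dimension is marked high or low relative to the mean across all customers. The name `rfm_segments` and the segment labels are illustrative.

```python
from datetime import date

def rfm_segments(customers, today):
    """Label each customer by recency, frequency, and monetary value.

    customers: {id: (last_purchase_date, purchase_count, total_spend)}
    Each dimension is '+' when better than the mean, '-' otherwise (assumed rule).
    """
    count = len(customers)
    recency = {cid: (today - rec[0]).days for cid, rec in customers.items()}
    avg_recency = sum(recency.values()) / count
    avg_frequency = sum(rec[1] for rec in customers.values()) / count
    avg_monetary = sum(rec[2] for rec in customers.values()) / count

    segments = {}
    for cid, (_, frequency, spend) in customers.items():
        # Fewer days since the last purchase is better, so recency is inverted.
        r = "R+" if recency[cid] <= avg_recency else "R-"
        f = "F+" if frequency >= avg_frequency else "F-"
        m = "M+" if spend >= avg_monetary else "M-"
        segments[cid] = r + f + m
    return segments

customers = {
    "u1": (date(2016, 5, 1), 10, 500.0),
    "u2": (date(2015, 5, 1), 1, 20.0),
}
segments = rfm_segments(customers, today=date(2016, 5, 16))
```

This yields up to eight value groups, from `R+F+M+` (recent, frequent, high-spending) to `R-F-M-`; other thresholds such as quantiles would work equally well in this sketch.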
16. The device according to claim 13, wherein the establishing module is further configured
to establish the customer loyalty identification model, specifically including:
Calculating, for each user, a loyalty score for a given product, the loyalty score ranging
from 0 to 100, where a higher value indicates higher loyalty, and the product includes but
is not limited to a website or a brand.
17. The device according to claim 13, wherein the establishing module is further configured
to establish the customer diffusion model for the customer maturity period, specifically
including:
Extracting the full-dimension data of the seed users according to the imported seed user data;
Performing outlier handling, missing-value handling, and data normalization on the
full-dimension data of the seed users to obtain the feature points of the seed users;
Performing cluster analysis on the seed users according to their feature points, and
extracting customer groups similar to the seed users;
Using the KNN algorithm to calculate the similarity between recently active users and
representative users;
Sorting by similarity in descending order, and extracting the data of the N users with the
highest similarity.
18. The device according to claim 13, wherein the establishing module is further configured
to establish the customer activity model, specifically including:
Calculating the activity score of each user according to preset activity factors, the
activity score ranging from 0 to 100, where a higher value indicates higher activity.
19. The device according to claim 13, wherein the establishing module is further configured
to establish the social network analysis model, specifically including:
Calculating the out-degree, the in-degree, and the corresponding degree centrality of each
user node in the social network, and determining the opinion-leader users, activist users,
and social-butterfly users in the social network.
20. The device according to claim 13, wherein the establishing module is further configured
to establish the churn early-warning model for the customer decline period, specifically
including:
Analyzing, by modeling, the group characteristics of churned customers, predicting the
probability of customer churn, identifying the group with a high churn probability, and, in
combination with user value, selecting the churn-prone group to be prioritized for retention.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610323618.0A CN106022800A (en) | 2016-05-16 | 2016-05-16 | User feature data processing method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106022800A (en) | 2016-10-12 |
Family
ID=57097486
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610323618.0A Pending CN106022800A (en) | 2016-05-16 | 2016-05-16 | User feature data processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106022800A (en) |
Cited By (47)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106384209A (en) * | 2016-10-27 | 2017-02-08 | 合肥工业大学 | Method and device for improving configuration of intelligent products based on operation data |
CN106411711A (en) * | 2016-10-20 | 2017-02-15 | 宁波江东大金佰汇信息技术有限公司 | Improved temporary social network determination system based on computer big data |
CN106503863A (en) * | 2016-11-10 | 2017-03-15 | 北京红马传媒文化发展有限公司 | Based on the Forecasting Methodology of the age characteristicss of decision-tree model, system and terminal |
CN106651605A (en) * | 2016-10-20 | 2017-05-10 | 宁波江东大金佰汇信息技术有限公司 | Computer big data-based temporary social network determining system |
CN106779803A (en) * | 2016-11-24 | 2017-05-31 | 久远谦长(北京)技术服务有限公司 | A kind of method that financial institution's flowing water is matched with carrier data |
CN106779808A (en) * | 2016-11-25 | 2017-05-31 | 上海斐讯数据通信技术有限公司 | Consumer space's behavior analysis system and method in a kind of commercial circle |
CN106776737A (en) * | 2016-11-18 | 2017-05-31 | 北京红马传媒文化发展有限公司 | A kind of method based on big data analyzing evaluation artist's viscosity |
CN106845706A (en) * | 2017-01-19 | 2017-06-13 | 浙江工商大学 | Online social network user relationship strength Forecasting Methodology |
CN107093092A (en) * | 2016-11-17 | 2017-08-25 | 北京小度信息科技有限公司 | Data analysing method and device |
CN107122425A (en) * | 2017-04-07 | 2017-09-01 | 广东精点数据科技股份有限公司 | The method and system evaluated corporate client |
CN107480187A (en) * | 2017-07-10 | 2017-12-15 | 北京京东尚科信息技术有限公司 | User's value category method and apparatus based on cluster analysis |
CN107730019A (en) * | 2017-09-29 | 2018-02-23 | 携程计算机技术(上海)有限公司 | User based on user's portrait retrieves method and system |
CN108122123A (en) * | 2016-11-29 | 2018-06-05 | 华为技术有限公司 | A kind of method and device for extending potential user |
CN108428155A (en) * | 2018-03-20 | 2018-08-21 | 南京邮电大学 | A kind of behavior processing analysis method based on service feature model |
CN108540993A (en) * | 2018-04-08 | 2018-09-14 | 中国联合网络通信集团有限公司 | User's Valuation Method and device |
CN108628866A (en) * | 2017-03-20 | 2018-10-09 | 大有秦鼎(北京)科技有限公司 | The method and apparatus of data fusion |
CN108764994A (en) * | 2018-05-24 | 2018-11-06 | 深圳前海桔子信息技术有限公司 | A kind of user behavior guidance method, device, server and storage medium |
CN108876394A (en) * | 2017-05-16 | 2018-11-23 | 北京京东尚科信息技术有限公司 | Identify the potential method and apparatus for being lost user of e-commerce platform |
WO2018223719A1 (en) * | 2017-06-09 | 2018-12-13 | 平安科技(深圳)有限公司 | Method for predicting insurance purchasing behavior of a user, device, computing apparatus, and medium |
CN109190959A (en) * | 2018-08-23 | 2019-01-11 | 杭州颜铺科技有限公司 | A kind of intelligent management system for beauty's industry |
CN109255640A (en) * | 2017-07-13 | 2019-01-22 | 阿里健康信息技术有限公司 | A kind of method, apparatus and system of determining user grouping |
WO2019037391A1 (en) * | 2017-08-24 | 2019-02-28 | 平安科技(深圳)有限公司 | Method and apparatus for predicting customer purchase intention, and electronic device and medium |
CN109657998A (en) * | 2018-12-25 | 2019-04-19 | 国信优易数据有限公司 | A kind of resource allocation methods, device, equipment and storage medium |
CN109841250A (en) * | 2017-11-24 | 2019-06-04 | 光宝科技股份有限公司 | The forecasting system method for building up and operating method of decoded state |
CN109919667A (en) * | 2019-02-21 | 2019-06-21 | 江苏苏宁银行股份有限公司 | A kind of method and apparatus of the IP of enterprise for identification |
WO2019119635A1 (en) * | 2017-12-18 | 2019-06-27 | 平安科技(深圳)有限公司 | Seed user development method, electronic device and computer-readable storage medium |
CN109978547A (en) * | 2017-12-28 | 2019-07-05 | 北京京东尚科信息技术有限公司 | Risk behavior control method and system, equipment and storage medium |
CN109993556A (en) * | 2017-12-30 | 2019-07-09 | 中国移动通信集团湖北有限公司 | User behavior analysis method, apparatus calculates equipment and storage medium |
CN110222975A (en) * | 2019-05-31 | 2019-09-10 | 北京奇艺世纪科技有限公司 | A kind of loss customer analysis method, apparatus, electronic equipment and storage medium |
WO2019169961A1 (en) * | 2018-03-06 | 2019-09-12 | 阿里巴巴集团控股有限公司 | Method and device for determining group of target users |
CN110276514A (en) * | 2019-05-06 | 2019-09-24 | 阿里巴巴集团控股有限公司 | Appraisal procedure, device and the equipment of business correlative factor |
CN110348914A (en) * | 2019-07-19 | 2019-10-18 | 中国银行股份有限公司 | Customer churn data analysing method and device |
CN110516155A (en) * | 2019-08-29 | 2019-11-29 | 深圳市云积分科技有限公司 | Marketing strategy generation method and system |
CN110610371A (en) * | 2018-06-14 | 2019-12-24 | 北京京东尚科信息技术有限公司 | Latent user analysis method, system, and computer-readable storage medium |
CN110659269A (en) * | 2019-08-15 | 2020-01-07 | 中国平安财产保险股份有限公司 | User access data processing method and device, computer equipment and storage medium |
CN110717085A (en) * | 2019-10-12 | 2020-01-21 | 浙江工商大学 | Opinion leader identification method based on virtual brand community |
CN110991875A (en) * | 2019-11-29 | 2020-04-10 | 广州市百果园信息技术有限公司 | Platform user quality evaluation system |
CN111311331A (en) * | 2020-02-26 | 2020-06-19 | 北京慧博科技有限公司 | RFM analysis method |
CN111695819A (en) * | 2020-06-16 | 2020-09-22 | 中国联合网络通信集团有限公司 | Method and device for scheduling seat personnel |
CN111738331A (en) * | 2020-06-19 | 2020-10-02 | 北京同邦卓益科技有限公司 | User classification method and device, computer-readable storage medium and electronic device |
CN111899036A (en) * | 2020-08-03 | 2020-11-06 | 上海同儒信息技术有限公司 | Client hierarchical classification management system based on big data |
CN112070548A (en) * | 2020-09-11 | 2020-12-11 | 上海风秩科技有限公司 | User layering method, device, equipment and storage medium |
CN112594937A (en) * | 2020-12-16 | 2021-04-02 | 珠海格力电器股份有限公司 | Control method and device of water heater, electronic equipment and storage medium |
CN112598442A (en) * | 2020-12-25 | 2021-04-02 | 中国建设银行股份有限公司 | Multidimensional operation analysis method and multidimensional operation analysis device for network traffic |
CN113793060A (en) * | 2021-09-27 | 2021-12-14 | 武汉众邦银行股份有限公司 | Customer rating method and device based on customer transaction data and storage medium |
TWI778568B (en) * | 2021-04-06 | 2022-09-21 | 富邦人壽保險股份有限公司 | Systems and methods for generating recommendation list |
CN115879984A (en) * | 2023-03-03 | 2023-03-31 | 北京一凌宸飞科技有限公司 | Network marketing method and system based on big data analysis |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103118111A (en) * | 2013-01-31 | 2013-05-22 | 北京百分点信息科技有限公司 | Information push method based on data from a plurality of data interaction centers |
CN103714139A (en) * | 2013-12-20 | 2014-04-09 | 华南理工大学 | Parallel data mining method for identifying a mass of mobile client bases |
CN104866969A (en) * | 2015-05-25 | 2015-08-26 | 百度在线网络技术(北京)有限公司 | Personal credit data processing method and device |
Cited By (64)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106411711A (en) * | 2016-10-20 | 2017-02-15 | 宁波江东大金佰汇信息技术有限公司 | Improved temporary social network determination system based on computer big data |
CN106651605A (en) * | 2016-10-20 | 2017-05-10 | 宁波江东大金佰汇信息技术有限公司 | Computer big data-based temporary social network determining system |
CN106651605B (en) * | 2016-10-20 | 2019-11-15 | 福州盛世凌云环保科技有限公司 | A kind of temporary social network based on computer big data determines system |
CN106384209A (en) * | 2016-10-27 | 2017-02-08 | 合肥工业大学 | Method and device for improving configuration of intelligent products based on operation data |
CN106503863A (en) * | 2016-11-10 | 2017-03-15 | 北京红马传媒文化发展有限公司 | Based on the Forecasting Methodology of the age characteristicss of decision-tree model, system and terminal |
CN107093092A (en) * | 2016-11-17 | 2017-08-25 | 北京小度信息科技有限公司 | Data analysing method and device |
CN107093092B (en) * | 2016-11-17 | 2021-07-09 | 北京星选科技有限公司 | Data analysis method and device |
CN106776737A (en) * | 2016-11-18 | 2017-05-31 | 北京红马传媒文化发展有限公司 | A kind of method based on big data analyzing evaluation artist's viscosity |
CN106779803A (en) * | 2016-11-24 | 2017-05-31 | 久远谦长(北京)技术服务有限公司 | A kind of method that financial institution's flowing water is matched with carrier data |
CN106779803B (en) * | 2016-11-24 | 2021-01-15 | 久远谦长(北京)技术服务有限公司 | Method for matching financial institution running water with operator data |
CN106779808A (en) * | 2016-11-25 | 2017-05-31 | 上海斐讯数据通信技术有限公司 | Consumer space's behavior analysis system and method in a kind of commercial circle |
CN108122123B (en) * | 2016-11-29 | 2021-08-20 | 华为技术有限公司 | Method and device for expanding potential users |
CN108122123A (en) * | 2016-11-29 | 2018-06-05 | 华为技术有限公司 | A kind of method and device for extending potential user |
CN106845706A (en) * | 2017-01-19 | 2017-06-13 | 浙江工商大学 | Online social network user relationship strength Forecasting Methodology |
CN108628866A (en) * | 2017-03-20 | 2018-10-09 | 大有秦鼎(北京)科技有限公司 | The method and apparatus of data fusion |
CN108628866B (en) * | 2017-03-20 | 2020-11-06 | 大有秦鼎(北京)科技有限公司 | Data fusion method and device |
CN107122425A (en) * | 2017-04-07 | 2017-09-01 | 广东精点数据科技股份有限公司 | The method and system evaluated corporate client |
CN108876394A (en) * | 2017-05-16 | 2018-11-23 | 北京京东尚科信息技术有限公司 | Identify the potential method and apparatus for being lost user of e-commerce platform |
WO2018223719A1 (en) * | 2017-06-09 | 2018-12-13 | 平安科技(深圳)有限公司 | Method for predicting insurance purchasing behavior of a user, device, computing apparatus, and medium |
CN107480187A (en) * | 2017-07-10 | 2017-12-15 | 北京京东尚科信息技术有限公司 | User's value category method and apparatus based on cluster analysis |
CN109255640A (en) * | 2017-07-13 | 2019-01-22 | 阿里健康信息技术有限公司 | A kind of method, apparatus and system of determining user grouping |
WO2019037391A1 (en) * | 2017-08-24 | 2019-02-28 | 平安科技(深圳)有限公司 | Method and apparatus for predicting customer purchase intention, and electronic device and medium |
CN107730019A (en) * | 2017-09-29 | 2018-02-23 | 携程计算机技术(上海)有限公司 | User based on user's portrait retrieves method and system |
CN107730019B (en) * | 2017-09-29 | 2021-06-11 | 携程计算机技术(上海)有限公司 | User retrieval method and system based on user portrait |
CN109841250B (en) * | 2017-11-24 | 2020-11-13 | 建兴储存科技股份有限公司 | Method for establishing prediction system of decoding state and operation method |
CN109841250A (en) * | 2017-11-24 | 2019-06-04 | 光宝科技股份有限公司 | The forecasting system method for building up and operating method of decoded state |
WO2019119635A1 (en) * | 2017-12-18 | 2019-06-27 | 平安科技(深圳)有限公司 | Seed user development method, electronic device and computer-readable storage medium |
CN109978547A (en) * | 2017-12-28 | 2019-07-05 | 北京京东尚科信息技术有限公司 | Risk behavior control method and system, equipment and storage medium |
CN109993556A (en) * | 2017-12-30 | 2019-07-09 | 中国移动通信集团湖北有限公司 | User behavior analysis method, apparatus calculates equipment and storage medium |
CN109993556B (en) * | 2017-12-30 | 2021-06-08 | 中国移动通信集团湖北有限公司 | User behavior analysis method and device, computing equipment and storage medium |
WO2019169961A1 (en) * | 2018-03-06 | 2019-09-12 | 阿里巴巴集团控股有限公司 | Method and device for determining group of target users |
CN108428155A (en) * | 2018-03-20 | 2018-08-21 | 南京邮电大学 | A kind of behavior processing analysis method based on service feature model |
CN108540993A (en) * | 2018-04-08 | 2018-09-14 | 中国联合网络通信集团有限公司 | User's Valuation Method and device |
CN108764994A (en) * | 2018-05-24 | 2018-11-06 | 深圳前海桔子信息技术有限公司 | A kind of user behavior guidance method, device, server and storage medium |
CN110610371A (en) * | 2018-06-14 | 2019-12-24 | 北京京东尚科信息技术有限公司 | Latent user analysis method, system, and computer-readable storage medium |
CN109190959A (en) * | 2018-08-23 | 2019-01-11 | 杭州颜铺科技有限公司 | A kind of intelligent management system for beauty's industry |
CN109190959B (en) * | 2018-08-23 | 2021-07-06 | 杭州颜铺科技有限公司 | Intelligent management system for beauty industry |
CN109657998A (en) * | 2018-12-25 | 2019-04-19 | 国信优易数据有限公司 | A kind of resource allocation methods, device, equipment and storage medium |
CN109657998B (en) * | 2018-12-25 | 2020-11-27 | 国信优易数据股份有限公司 | Resource allocation method, device, equipment and storage medium |
CN109919667B (en) * | 2019-02-21 | 2022-07-22 | 江苏苏宁银行股份有限公司 | Method and device for identifying enterprise IP |
CN109919667A (en) * | 2019-02-21 | 2019-06-21 | 江苏苏宁银行股份有限公司 | A kind of method and apparatus of the IP of enterprise for identification |
CN110276514B (en) * | 2019-05-06 | 2023-04-07 | 创新先进技术有限公司 | Method, device and equipment for evaluating business related factors |
CN110276514A (en) * | 2019-05-06 | 2019-09-24 | 阿里巴巴集团控股有限公司 | Appraisal procedure, device and the equipment of business correlative factor |
CN110222975A (en) * | 2019-05-31 | 2019-09-10 | 北京奇艺世纪科技有限公司 | A kind of loss customer analysis method, apparatus, electronic equipment and storage medium |
CN110348914A (en) * | 2019-07-19 | 2019-10-18 | 中国银行股份有限公司 | Customer churn data analysing method and device |
CN110659269A (en) * | 2019-08-15 | 2020-01-07 | 中国平安财产保险股份有限公司 | User access data processing method and device, computer equipment and storage medium |
CN110659269B (en) * | 2019-08-15 | 2024-04-02 | 中国平安财产保险股份有限公司 | User access data processing method, device, computer equipment and storage medium |
CN110516155A (en) * | 2019-08-29 | 2019-11-29 | 深圳市云积分科技有限公司 | Marketing strategy generation method and system |
CN110717085B (en) * | 2019-10-12 | 2021-08-06 | 浙江工商大学 | Opinion leader identification method based on virtual brand community |
CN110717085A (en) * | 2019-10-12 | 2020-01-21 | 浙江工商大学 | Opinion leader identification method based on virtual brand community |
CN110991875B (en) * | 2019-11-29 | 2023-09-26 | 广州市百果园信息技术有限公司 | Platform user quality evaluation system |
CN110991875A (en) * | 2019-11-29 | 2020-04-10 | 广州市百果园信息技术有限公司 | Platform user quality evaluation system |
CN111311331A (en) * | 2020-02-26 | 2020-06-19 | 北京慧博科技有限公司 | RFM analysis method |
CN111695819B (en) * | 2020-06-16 | 2023-06-02 | 中国联合网络通信集团有限公司 | Seat personnel scheduling method and device |
CN111695819A (en) * | 2020-06-16 | 2020-09-22 | 中国联合网络通信集团有限公司 | Method and device for scheduling seat personnel |
CN111738331A (en) * | 2020-06-19 | 2020-10-02 | 北京同邦卓益科技有限公司 | User classification method and device, computer-readable storage medium and electronic device |
CN111899036A (en) * | 2020-08-03 | 2020-11-06 | 上海同儒信息技术有限公司 | Client hierarchical classification management system based on big data |
CN112070548A (en) * | 2020-09-11 | 2020-12-11 | 上海风秩科技有限公司 | User layering method, device, equipment and storage medium |
CN112070548B (en) * | 2020-09-11 | 2024-02-20 | 上海秒针网络科技有限公司 | User layering method, device, equipment and storage medium |
CN112594937A (en) * | 2020-12-16 | 2021-04-02 | 珠海格力电器股份有限公司 | Control method and device of water heater, electronic equipment and storage medium |
CN112598442A (en) * | 2020-12-25 | 2021-04-02 | 中国建设银行股份有限公司 | Multidimensional operation analysis method and multidimensional operation analysis device for network traffic |
TWI778568B (en) * | 2021-04-06 | 2022-09-21 | 富邦人壽保險股份有限公司 | Systems and methods for generating recommendation list |
CN113793060A (en) * | 2021-09-27 | 2021-12-14 | 武汉众邦银行股份有限公司 | Customer rating method and device based on customer transaction data and storage medium |
CN115879984A (en) * | 2023-03-03 | 2023-03-31 | 北京一凌宸飞科技有限公司 | Network marketing method and system based on big data analysis |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106022800A (en) | User feature data processing method and device | |
US11157926B2 (en) | Digital content prioritization to accelerate hyper-targeting | |
Chitra et al. | Data mining techniques and its applications in banking sector | |
Yoseph et al. | The impact of big data market segmentation using data mining and clustering techniques | |
US8341101B1 (en) | Determining relationships between data items and individuals, and dynamically calculating a metric score based on groups of characteristics | |
EP2474945A1 (en) | Analyzing transactional data | |
Haenlein | A social network analysis of customer-level revenue distribution | |
CN108415913A (en) | Crowd's orientation method based on uncertain neighbours | |
CN108572988A (en) | A kind of house property assessment data creation method and device | |
Xu et al. | Potential buyer identification and purchase likelihood quantification by mining user-generated content on social media | |
Yuping et al. | New methods of customer segmentation and individual credit evaluation based on machine learning | |
CN112330373A (en) | User behavior analysis method and device and computer readable storage medium | |
Bouzidi et al. | LSTM-based automated learning with smart data to improve marketing fraud detection and financial forecasting | |
KR20100046421A (en) | Method and server for estimating preference of commodity | |
Vikram et al. | Data mining tools and techniques: a review | |
Martins et al. | Characterizing sponsored content in Facebook and Instagram | |
Thorleuchter et al. | Using Webcrawling of Publicly Available Websites to Assess E-commerce Relationships | |
Osaysa | Improving the quality of marketing analytics systems | |
Asmat et al. | Data mining framework for the identification of profitable customer based on recency, frequency, monetary (RFM) | |
Silpa et al. | Detection of Fake Online Reviews by using Machine Learning | |
Gupta et al. | Segmentation of retail customers based on cluster analysis in building successful CRM | |
Hu et al. | Research on long tail recommendation algorithm | |
CN107705135A (en) | A kind of method that potential commercial value is evaluated based on company's storage contact data | |
Ming | Application research of customer big data analysis for online shop based on smart cloud platform tools | |
Iqbal et al. | Association rule analysis-based identification of influential users in the social media |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20161012 |