CN106022800A - User feature data processing method and device - Google Patents


Info

Publication number
CN106022800A
Authority
CN
China
Prior art keywords
user
data
model
client
customer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610323618.0A
Other languages
Chinese (zh)
Inventor
苏萌
杜晓梦
刘译璟
苏海波
王双双
魏太云
金英
李慧
鲍亚翟
高书明
张凯
张文学
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baifendian Information Science & Technology Co Ltd
Original Assignee
Beijing Baifendian Information Science & Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baifendian Information Science & Technology Co Ltd
Priority to CN201610323618.0A
Publication of CN106022800A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/01 Customer relationship services

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Finance (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a user feature data processing method. The method comprises the following steps: obtaining user behavior data and data item information data from a data source; carrying out data integration on the obtained user behavior data and the data item information data according to different service logics to obtain user feature data corresponding to the service logics; and carrying out processing on the user feature data by utilizing packaging models corresponding to the user feature data to obtain processing result data corresponding to the service logics. The plurality of packaging models are established, and processing is carried out on the user feature data by utilizing the packaging models corresponding to the user feature data to obtain the processing result data corresponding to the service logics, thereby providing full-amount data mining model encapsulation for enterprises, and providing more accurate user behavior feature information for the enterprises.

Description

Method and device for processing user feature data
Technical Field
The present invention relates to the technical fields of data mining analysis and data modeling, and in particular to a method and device for processing user feature data.
Background
As the customer-centric management philosophy takes ever deeper root, analyzing customers, understanding them, and guiding their needs have become important topics in business operations. With data mining technology, an enterprise can make the fullest use of its customer resources, analyze and predict customer behavior, and classify customers. This helps with customer profitability analysis, discovering potentially valuable customers, delivering personalized services, and improving customer satisfaction and loyalty.
However, existing enterprise membership management only performs statistical analysis of members' basic information, transaction data, and the like. It does not mine members' Internet behavior in depth, such as their online behavior preferences and social network preferences. Even enterprises with fairly mature IT systems analyze only sampled data and lack the capability to mine massive big data.
Summary of the Invention
The present invention provides a method and device for processing user feature data, which can provide enterprises with model encapsulation for full-data mining and with more accurate user behavior feature information.
In one aspect, an embodiment of the present invention provides a method for processing user feature data, comprising:
obtaining user behavior data and data item information data from a data source;
performing data integration on the obtained user behavior data and data item information data according to different service logics, to obtain user feature data corresponding to each service logic;
processing the user feature data using the packaging model corresponding to the user feature data, to obtain processing result data corresponding to the service logic.
In another aspect, an embodiment of the present invention provides a device for processing user feature data, comprising:
an acquisition module, configured to obtain user behavior data and data item information data from a data source;
an integration module, configured to perform data integration on the obtained user behavior data and data item information data according to different service logics, to obtain user feature data corresponding to each service logic;
a processing module, configured to process the user feature data using the packaging model corresponding to the user feature data, to obtain processing result data corresponding to the service logic.
By establishing multiple packaging models and processing the user feature data with the packaging model corresponding to that data to obtain processing result data corresponding to the service logic, embodiments of the present invention can provide enterprises with model encapsulation for full-data mining and with more accurate user behavior feature information.
Brief Description of the Drawings
The drawings described here provide a further understanding of the application and form part of it. The illustrative embodiments of the application and their descriptions explain the application and do not unduly limit it. In the drawings:
Fig. 1A is a schematic flowchart of a method for processing user feature data according to an embodiment of the present invention;
Fig. 1B is a schematic flowchart of one implementation of step 102 in Fig. 1A;
Fig. 1C is a schematic flowchart of establishing the customer segmentation model for the customer probation period according to an embodiment of the present invention;
Fig. 1D is a schematic flowchart of establishing the customer diffusion model for the customer maturity period according to an embodiment of the present invention;
Fig. 2 is an architecture diagram of the pattern shop system according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of establishing the customer segmentation model according to an embodiment of the present invention;
Fig. 4 is a schematic flowchart of establishing the customer value model according to an embodiment of the present invention;
Fig. 5 is a schematic flowchart of establishing the customer loyalty identification model according to an embodiment of the present invention;
Fig. 6 is a schematic flowchart of establishing the customer diffusion model according to an embodiment of the present invention;
Fig. 7 is a schematic flowchart of establishing the customer liveness model according to an embodiment of the present invention;
Fig. 8 is a schematic flowchart of establishing the customer social network analysis model according to an embodiment of the present invention;
Fig. 9 is a schematic flowchart of establishing the customer churn early-warning model according to an embodiment of the present invention;
Fig. 10 is a module diagram of the device for processing user feature data according to an embodiment of the present invention.
Detailed Description
The embodiments of the application are described in detail below with reference to the drawings and examples, so that how the application applies technical means to solve technical problems and achieve technical effects can be fully understood and implemented accordingly.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include volatile memory in computer-readable media, in the form of random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
Certain terms are used throughout the description and claims to refer to specific components. Those skilled in the art will appreciate that hardware manufacturers may use different names for the same component. This description and the claims do not distinguish components by name but by function. "Comprising", as used throughout the description and claims, is an open-ended term and should be interpreted as "including but not limited to". "Substantially" means that, within an acceptable error range, a person skilled in the art can solve the technical problem and basically achieve the technical effect. In addition, "coupled" here includes any direct or indirect means of electrical coupling. Therefore, if the text states that a first device is coupled to a second device, the first device may be directly electrically coupled to the second device, or indirectly electrically coupled to it through other devices or coupling means. The description that follows sets out preferred embodiments of the application; it is intended to illustrate the general principles of the application and not to limit its scope. The scope of protection of the application is defined by the appended claims.
It should also be noted that the terms "include", "comprise", and any variants of them are intended to be non-exclusive, so that a product or system that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to the product or system. Without further limitation, an element introduced by "including a ..." does not exclude the presence of additional identical elements in the product or system that includes it.
Fig. 1A is a schematic flowchart of a method for processing user feature data according to an embodiment of the present invention. As shown in Fig. 1A:
Step 101: obtain user behavior data and data item information data from a data source.
In embodiments of the present invention, the data source may include first-party data and third-party data, and the data it holds may be structured, semi-structured, or unstructured. Specifically, first-party data may be user behavior data held in the first party's own CRM system or on its own website; third-party data may be e-commerce data, online media site data, and the like held by third-party merchants.
The user behavior data is data that represents user behavior, and may specifically include user browsing behavior data, purchasing behavior data, and the like.
The data item information data is the user's basic information, and may specifically include the user's name, age, gender, mobile phone number, unique identification number, and the like.
Embodiments of the present invention use data mining to obtain a wide range of information about users. For example, data mining can reveal whether the consumers (users) who buy a certain product are male or female, and what their education, income, hobbies, and occupation are. It can even reveal which related products different users buy after buying this product, and which kinds of users are likely to buy which model of it. With data mining, the effectiveness and response rate of advertising sent to target users are greatly improved, and the cost of distributing it is greatly reduced. At the same time, on the basis of user data mining, an enterprise can identify key users, evaluate market performance, formulate personalized marketing strategies, broaden its sales channels and reach, and obtain a scientific basis for its production strategy and development plans.
Step 102: perform data integration on the obtained user behavior data and data item information data according to different service logics, to obtain user feature data corresponding to each service logic.
In embodiments of the present invention, the service logic refers to the logic used to process the data.
The user feature data refers to the set of user feature attributes that each model takes as its input data.
For example, the customer value service logic aims to distinguish high, medium, and low user behavior value. Integrating the obtained user behavior data and data item information data yields the user feature data that this service logic takes as input: the most recent purchase time, the cumulative purchase frequency, and the cumulative consumption amount.
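The integration described in this example can be sketched as a simple aggregation over order records. This is a minimal illustration only: the field names (`user_id`, `order_date`, `amount`), the function name, and the sample orders are assumptions made for the sketch and do not come from the patent.

```python
from collections import defaultdict
from datetime import date


def build_value_features(orders):
    """Aggregate order rows into per-user customer value features."""
    features = defaultdict(
        lambda: {"last_purchase": None, "frequency": 0, "monetary": 0.0}
    )
    for order in orders:
        f = features[order["user_id"]]
        # Most recent purchase time
        if f["last_purchase"] is None or order["order_date"] > f["last_purchase"]:
            f["last_purchase"] = order["order_date"]
        # Cumulative purchase frequency and cumulative consumption amount
        f["frequency"] += 1
        f["monetary"] += order["amount"]
    return dict(features)


# Made-up sample orders (hypothetical field names)
orders = [
    {"user_id": "u1", "order_date": date(2016, 1, 5), "amount": 120.0},
    {"user_id": "u1", "order_date": date(2016, 3, 2), "amount": 80.0},
    {"user_id": "u2", "order_date": date(2016, 2, 10), "amount": 40.0},
]
features = build_value_features(orders)
```

Each entry of `features` then holds the three inputs the customer value service logic expects for one user.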
In step 102, the obtained user behavior data and data item information data may be integrated as follows. In a specific implementation, this includes:
receiving, linking, and integrating user behavior data and data item information data from multiple data sources, and updating the result to a unified data platform. Specifically, as shown in Fig. 1B, step 102 may include:
1021. Receive user behavior data and data item information data from multiple data sources. Specifically, the data may be received from the BD-OS big data operating system.
1022. Link the user behavior data and data item information data of the different data sources. Specifically, the data may be linked according to a preset linking relationship, by writing a shell script that links the data of the different data sources. The preset linking relationship may be configured through matching identifiers in the user behavior data and/or data item information data of the different data sources. The data sources include a first data source and a second data source, and the method is as follows:
determine a first matching identifier group in the user behavior data and/or data item information data of the first data source, and a second matching identifier group in the user behavior data and/or data item information data of the second data source;
when the first matching identifier group is found to contain a matching identifier that is also present in the second matching identifier group, link the user behavior data and data item information data of the first data source and the second data source.
An actual application scenario illustrates this. For example, the first-party data (the first data source) comes from a CRM system and stores data item information data, including the user's basic data such as user ID, name, age, mobile phone number, marital status, whether the user has children, location, and email address;
the third-party data (the second data source) comes from Baifendian e-commerce data, for example when a user added which products to the shopping cart, when the order was placed, and which payment method was used. It contains user behavior data such as order ID, order time, purchased product ID, product source, user ID, user name, and user mobile phone number;
first, by matching on mobile phone number, the users in the first-party data and the third-party data are linked and mapped to the same user;
then the user's information is integrated. After integration, the fields are user ID, name, age, mobile phone number, marital status, whether the user has children, location, email address, order ID, order time, purchased product ID, product source, user ID, user name, and user mobile phone number. Specifically, the linked and integrated data may be as shown in Table 1. The order ID and user ID may be denoised by linear dimensionality reduction.
Table 1
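The linking and integration steps above (build the two matching identifier groups, check for a shared identifier, then merge the records) can be sketched as follows. This is a hedged illustration rather than the patent's shell-script implementation; the function name, the field names, and the sample records are all made up for the example.

```python
def link_sources(crm_users, shop_orders):
    """Link CRM profiles and e-commerce orders that share a phone number."""
    crm_by_phone = {user["phone"]: user for user in crm_users}
    # First and second matching identifier groups (phone numbers here)
    first_group = set(crm_by_phone)
    second_group = {order["user_phone"] for order in shop_orders}
    shared = first_group & second_group

    linked = []
    for order in shop_orders:
        phone = order["user_phone"]
        if phone in shared:
            # Integrate the profile fields and the behavior fields of one user
            record = dict(crm_by_phone[phone])
            record.update(order)
            linked.append(record)
    return linked


# Made-up sample records (hypothetical field names)
crm_users = [
    {"crm_user_id": "c1", "name": "Li", "age": 30, "phone": "13800000001"},
    {"crm_user_id": "c2", "name": "Wang", "age": 41, "phone": "13800000002"},
]
shop_orders = [
    {"order_id": "o1", "product_id": "p9", "shop_user_id": "s7",
     "user_phone": "13800000001"},
    {"order_id": "o2", "product_id": "p3", "shop_user_id": "s8",
     "user_phone": "13900000009"},
]
linked = link_sources(crm_users, shop_orders)
```

Only the order whose phone number also appears in the CRM data is linked; the other order is left out because the two identifier groups do not share its phone number.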
1023. Denoise the linked data. Specifically, linear dimensionality reduction may be applied to the linked data in Table 1. There are many dimensionality reduction methods: depending on the characteristics of the data, linear or nonlinear dimensionality reduction may be used; depending on whether the supervision information in the data is considered and used, unsupervised or supervised dimensionality reduction may be used; and so on.
1024. Store the processed data in the unified data platform. Specifically, the processed data may be stored in HDFS files of the BD-OS big data operating system. Here, the processed data refers to the data obtained after steps 1022 and 1023 have been performed.
Step 103: process the user feature data using the packaging model corresponding to the user feature data, to obtain processing result data corresponding to the service logic.
Specifically, before step 103, the method includes:
establishing the correspondence between user feature data and packaging models.
The packaging models in embodiments of the present invention include the customer segmentation model for the customer probation period, the customer value model for the customer formation period, the customer loyalty identification model, the customer diffusion model for the customer maturity period, the customer liveness model, the social network analysis model, or the customer churn early-warning model for the customer decline period.
Specifically, in embodiments of the present invention, the correspondence between user feature data and packaging models may be as follows:
The user feature data corresponding to the customer segmentation model includes user ID, user name, user age, marital status, whether the user has children, household appliances, digital products, clothing and accessories, footwear, automotive products, sports and outdoors, and the like. In embodiments of the present invention, the customer segmentation model can be used to gain effective insight into customers and to discover their category features, thereby improving both efficiency and returns.
The user feature data corresponding to the customer value model includes user ID, most recent purchase time, cumulative purchase frequency, and cumulative consumption amount. In embodiments of the present invention, the customer value model can be used as a means of measuring customer value and customer profitability.
The user feature data corresponding to the customer loyalty identification model includes user ID, cumulative number of logins, cumulative dwell time, cumulative login days, and the like. In embodiments of the present invention, the customer loyalty model can be used to quantify customers' loyalty level and to identify the enterprise's loyal customers and ordinary customers.
The user feature data corresponding to the customer diffusion model includes user ID, local lifestyle services, beauty and body shaping, beauty, manicure, and the like. In embodiments of the present invention, the customer diffusion model can be used to understand customer characteristics, help mine and build a reserve of potential customers, and tag different categories of potential customers according to customer data features and group features.
The user feature data corresponding to the customer liveness model includes user ID, last post time, last reply time, number of posts in the past half year, number of replies in the past half year, cumulative login days, cumulative dwell time, and the like. In embodiments of the present invention, the customer liveness model can be used to calculate a website's active visits and the number of active users in different time intervals.
The user feature data corresponding to the social network analysis model includes poster ID, replier ID, and the like. In embodiments of the present invention, the customer social network analysis model can be used to analyze the important users on a social network platform and to identify the opinion leaders, active members, and social butterflies in the social network formed by the platform.
The user feature data corresponding to the customer churn early-warning model includes user ID, cumulative visit days, cumulative visit duration, number of purchases, total consumption, average daily visits, bounce rate, proportion of visit days in the most recent month, and the like. In embodiments of the present invention, the customer churn early-warning model can be used to analyze the group features of churned customers, predict the probability of customer churn, and identify customers with a high churn probability. Combined with the customer value model, it can identify high-value customers who need focused maintenance and show a tendency to churn, which is a very important analytical tool for maintaining high-end customers.
To this end, as shown in Fig. 1C, embodiments of the present invention may also include establishing the customer segmentation model for the customer probation period:
1051. Based on known user feature data, set the number of user groups and the center point of each user group according to user value and/or user behavior criteria. In embodiments of the present invention, user feature data may include or correspond to user behavior data. For example, if the known user feature data for young girls is "under 18, likes Korean dramas", then when a user's behavior data is "repeatedly browses Korean dramas", the corresponding user may be a young girl.
Known user feature data refers to known user group categories and features, usually given from experience by the enterprise's business staff or by domain experts. The center point of each user group is then set according to the features that differ significantly between groups.
For example, boys generally like "sports" and girls generally like "Taobao". When most of the known user feature data is "sports" or "Taobao", two user groups can be set, and the significant distinguishing feature of the two groups is whether the user likes "sports" or "Taobao".
1052. Calculate the distance between each user and each user group's center point by a distributed computing method. Specifically, the Euclidean distance is used.
1053. According to the distance between each user and each user group's center point, assign the user to the user group whose center point is nearest.
The customer segmentation model can divide users into N groups. It is used to gain effective insight into customers and to discover their category features, thereby improving both efficiency and returns.
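Steps 1052 and 1053 can be sketched on a single machine as a nearest-center assignment. The patent describes a distributed computation; this sketch only shows the per-user logic. The two-dimensional feature vectors, the group names, and the function names are assumptions chosen to mirror the "sports" and "Taobao" example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_to_groups(users, centers):
    """Assign each user to the group whose center point is nearest."""
    return {
        user_id: min(centers, key=lambda g: euclidean(vec, centers[g]))
        for user_id, vec in users.items()
    }


# Hypothetical features: (interest in sports, interest in Taobao)
centers = {"sports": (1.0, 0.0), "taobao": (0.0, 1.0)}
users = {"u1": (0.9, 0.2), "u2": (0.1, 0.8), "u3": (0.6, 0.5)}
groups = assign_to_groups(users, centers)
```

Because each user is handled independently once the center points are fixed, this loop is the part that a distributed implementation would spread across workers.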
Embodiments of the present invention also include establishing the customer value model for the customer formation period:
according to each user's most recent consumption time, cumulative purchase frequency, and cumulative consumption amount, perform user value analysis on the users and assign each user to the corresponding user value group.
Specifically, user value analysis may be performed as follows:
First calculate the mean of each value variable, then compare each user's value of that variable with the mean: higher than the mean is "+", lower is "-". Users are thereby divided into different user value groups, as in Table 2 below. In Table 2, value variable M is the customer's cumulative consumption amount (Monetary), value variable F is the customer's cumulative purchase frequency (Frequency), and value variable R is the most recent consumption time (Recency).
Table 2
M    F    R    Output label
+    +    +    Important value customer
-    +    +    General value customer
-    -    +    General development customer
-    -    -    General retention customer
+    -    +    Important development customer
+    -    -    Important retention customer
-    +    -    General retention customer
+    +    -    Important maintenance customer
The customer value model can be used as a means of measuring customer value and customer profitability.
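The comparison against the mean and the lookup in Table 2 can be sketched as below. The label wording follows Table 2 as given in this text. Several details are assumptions made for the sketch: recency is encoded as a number where larger means more recent, a value equal to the mean is treated as "-" (the text does not say how ties are handled), and the sample users are made up.

```python
# (M, F, R) sign pattern -> output label, following Table 2
VALUE_LABELS = {
    ("+", "+", "+"): "Important value customer",
    ("-", "+", "+"): "General value customer",
    ("-", "-", "+"): "General development customer",
    ("-", "-", "-"): "General retention customer",
    ("+", "-", "+"): "Important development customer",
    ("+", "-", "-"): "Important retention customer",
    ("-", "+", "-"): "General retention customer",
    ("+", "+", "-"): "Important maintenance customer",
}


def sign(value, mean):
    """'+' when above the mean, otherwise '-' (ties count as '-', an assumption)."""
    return "+" if value > mean else "-"


def label_users(users):
    """Label each user by comparing M, F and R with the mean over all users."""
    n = len(users)
    means = {k: sum(u[k] for u in users.values()) / n for k in ("m", "f", "r")}
    return {
        uid: VALUE_LABELS[tuple(sign(u[k], means[k]) for k in ("m", "f", "r"))]
        for uid, u in users.items()
    }


# Made-up users: m = cumulative amount, f = purchase count,
# r = day number of the last purchase (larger means more recent)
users = {
    "a": {"m": 500.0, "f": 10, "r": 100},
    "b": {"m": 50.0, "f": 1, "r": 10},
    "c": {"m": 400.0, "f": 2, "r": 90},
}
labels = label_users(users)
```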
Embodiments of the present invention also include establishing the customer loyalty identification model:
calculate each user's loyalty score for a given product. The loyalty score ranges from 0 to 100; the larger the value, the higher the loyalty. The product includes, but is not limited to, a website or a brand.
The customer loyalty model can be used to quantify customers' loyalty level and to identify the enterprise's loyal customers and ordinary customers.
Referring to Fig. 1D, embodiments of the present invention may also include establishing the customer diffusion model for the customer maturity period:
1061. Extract the full-dimension data of the seed users from the imported seed user data.
Seed users are users with certain user feature data. The user feature data of seed users is defined differently in different scenarios: in e-commerce, they may be users who like to buy a certain category of goods; in media, they may be users who like to read a certain genre of novel.
Full-dimension data refers to all feature attributes of the seed users, such as their preference data or browsing records in media or e-commerce.
1062. Perform outlier handling, missing value handling, and data normalization on the seed users' full-dimension data to obtain the seed users' feature points.
Specifically, the processing in step 1062 is implemented as follows:
use the normal distribution method to identify seed user outliers;
then delete any missing values;
finally, apply min-max normalization so that the results are mapped into the range [0, 1].
The user feature data corresponding to the mapped results are the feature points of the seed users.
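The three processing steps of step 1062 can be sketched as follows. The text only names "the normal distribution method" for outliers, so this sketch assumes a z-score rule (drop a row when any value lies more than `z_threshold` standard deviations from its column mean); that rule, the function name, and the sample rows are assumptions. Missing values are represented as `None`.

```python
import statistics


def preprocess(rows, z_threshold=3.0):
    """Drop outlier rows, drop rows with missing values, then min-max scale."""
    n_cols = len(rows[0])
    # Column statistics over the values that are present
    stats = []
    for j in range(n_cols):
        present = [r[j] for r in rows if r[j] is not None]
        stats.append((statistics.mean(present), statistics.pstdev(present)))

    def is_outlier(row):
        return any(
            v is not None and sd > 0 and abs(v - mean) > z_threshold * sd
            for v, (mean, sd) in zip(row, stats)
        )

    kept = [r for r in rows if not is_outlier(r)]          # outlier handling
    kept = [r for r in kept if None not in r]              # missing values
    lows = [min(r[j] for r in kept) for j in range(n_cols)]
    highs = [max(r[j] for r in kept) for j in range(n_cols)]
    return [                                               # min-max to [0, 1]
        tuple(
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(r, lows, highs)
        )
        for r in kept
    ]


# Made-up seed-user rows; the threshold is lowered because with so few
# rows no value can lie three standard deviations from the mean.
rows = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (2.0, None), (100.0, 20.0)]
feature_points = preprocess(rows, z_threshold=1.5)
```

In this sample the row containing 100.0 is dropped as an outlier, the row containing `None` is dropped as incomplete, and the remaining three rows are scaled per column.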
1063. According to the seed users' feature points, perform cluster analysis on the seed users and extract the user groups similar to the seed users.
1064. Calculate the similarity between recently active users and the seed users. Specifically, the KNN algorithm is used to calculate this similarity. The larger the similarity value, the more similar the users; the smaller the value, the less similar they are.
1065. Sort by similarity in descending order and extract the N users with the highest similarity. The extracted users are the group diffused from the seed users and sharing similar features with them, which is the purpose of the diffusion model.
The customer diffusion model can be used to understand customer characteristics, help mine and build a reserve of potential customers, and tag different categories of potential customers according to customer data features and group features.
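Steps 1064 and 1065 can be sketched as a nearest-neighbor scoring followed by a top-N selection. The text does not say how the KNN result is turned into a similarity value, so this sketch assumes one possible choice: the mean Euclidean distance d from a user to its k nearest seed feature points, mapped to a similarity 1 / (1 + d). That mapping, the function names, and the sample points are assumptions.

```python
import math


def knn_similarity(candidate, seeds, k=2):
    """Similarity in (0, 1]: higher when the k nearest seed points are closer."""
    nearest = sorted(math.dist(candidate, s) for s in seeds)[:k]
    return 1.0 / (1.0 + sum(nearest) / len(nearest))


def diffuse(candidates, seeds, n, k=2):
    """Return the n candidates most similar to the seed users, best first."""
    scored = [(uid, knn_similarity(vec, seeds, k)) for uid, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Made-up, already normalized feature points
seeds = [(0.9, 0.8), (1.0, 1.0)]
candidates = {"a": (0.95, 0.9), "b": (0.1, 0.2), "c": (0.8, 0.7)}
top_users = diffuse(candidates, seeds, n=2)
```

Here `top_users` lists the two recently active users closest to the seed users, which corresponds to the diffused group described in step 1065.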
Embodiments of the present invention also include establishing the customer liveness model:
calculate each user's liveness according to preset liveness factors. Liveness ranges from 0 to 100; the larger the value, the higher the liveness.
Liveness may be calculated as follows:
assign a weight to each preset liveness factor, normalize each liveness factor by min-max normalization, take the logarithm, multiply by the weight, and finally obtain each user's liveness value by weighted summation.
The customer liveness model can be used to calculate a website's active visits and the number of active users in different time intervals.
For example, suppose the liveness factors are user feature data such as last post time, last reply time, number of posts in the past half year, number of replies in the past half year, cumulative login days, and cumulative dwell time. A weight is assigned to each of these items, each item is normalized by min-max normalization, the logarithm is taken and multiplied by the weight, and finally the weighted sum gives each user's liveness value.
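The calculation above can be sketched as follows. Two details are not specified in the text and are assumptions here: the logarithm is taken as log(1 + x), because a min-max normalized value can be 0 and log(0) is undefined, and the weighted sum is divided by its largest possible value and multiplied by 100 so that the result lies between 0 and 100. The factor names, weights, and sample users are made up.

```python
import math


def liveness_scores(users, weights):
    """Weighted sum of log(1 + min-max normalized factor), scaled to 0-100."""
    lows = {f: min(u[f] for u in users.values()) for f in weights}
    highs = {f: max(u[f] for u in users.values()) for f in weights}
    # Largest possible weighted sum: every factor normalized to 1
    max_sum = sum(weights.values()) * math.log1p(1.0)

    scores = {}
    for uid, u in users.items():
        total = 0.0
        for factor, weight in weights.items():
            span = highs[factor] - lows[factor]
            norm = (u[factor] - lows[factor]) / span if span else 0.0
            total += weight * math.log1p(norm)   # log(1 + x), an assumption
        scores[uid] = 100.0 * total / max_sum
    return scores


# Made-up factors and weights
weights = {"posts_half_year": 0.6, "login_days": 0.4}
users = {
    "u1": {"posts_half_year": 40, "login_days": 120},
    "u2": {"posts_half_year": 0, "login_days": 10},
    "u3": {"posts_half_year": 20, "login_days": 65},
}
scores = liveness_scores(users, weights)
```

Time-based factors such as the last post time would first need a numeric encoding (for instance days since a reference date) before they can be normalized this way.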
An embodiment of the present invention further includes establishing a social network analysis model:
The out-degree, in-degree, and corresponding degree centrality of each user node in the social network are calculated to identify the opinion leaders, active participants, and socialites in the social network.
Social network analysis builds a social network graph in which nodes represent users and edges represent the relationships between users. The out-degree, in-degree, and corresponding centrality are then calculated, and the results are compared with set thresholds (which can generally be chosen empirically) to identify the opinion leaders, active participants, and socialites in the network. These three terms are defined as follows.
1) Opinion leader: a user who receives many replies from other users.
In-degree: the number of edges entering a user node. Each time a replier replies to a poster, one in-degree is recorded for the poster.
2) Active participant: a user who replies to other users frequently.
Out-degree: the number of edges leaving a user node. Each time a replier replies to a poster, one out-degree is recorded for the replier.
3) Socialite: a user with whom other users exchange replies and who forms an important node in the relationship network.
Betweenness centrality: measures how often a user node lies on the shortest paths between other user nodes in the graph. It is calculated as follows:
$$C_B(v) = \sum_{s \neq v \neq t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}}$$
$\sigma_{st}$: the number of all shortest paths from user node s to user node t
$\sigma_{st}(v)$: the number of those shortest paths from user node s to user node t that pass through user node v
The more often a user node appears on these paths, the higher its betweenness centrality. The user corresponding to a node with high betweenness centrality is a user at an important node of the network. Specifically, when a betweenness centrality value is higher than a preset threshold, the user corresponding to that value is determined to be a user at an important node of the network.
The customer social network analysis model can be used to analyze the important users on a social network platform and to identify the opinion leaders, active participants, and socialites in the social network formed by the platform.
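The betweenness centrality formula above can be illustrated with a short sketch. It uses Brandes' algorithm on an unweighted directed graph, which is one standard way to evaluate the sum and is not named in the source. The adjacency data and the threshold of 1.0 are hypothetical, and an edge is assumed to point from the replier to the poster.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm for an unweighted directed graph.

    adj maps each node to the list of nodes it points to.
    Returns {v: sum over ordered pairs (s, t), s != v != t,
             of sigma_st(v) / sigma_st}.
    """
    nodes = set(adj) | {w for targets in adj.values() for w in targets}
    cb = {v: 0.0 for v in nodes}
    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in nodes}
        dist = {v: -1 for v in nodes}
        preds = {v: [] for v in nodes}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj.get(v, ()):
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back toward s.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                cb[w] += delta[w]
    return cb

# Hypothetical reply graph: a and d reply to b, b replies to c.
replies = {"a": ["b"], "d": ["b"], "b": ["c"]}
cb = betweenness(replies)

# Users whose betweenness exceeds a preset (here hypothetical) threshold.
important = sorted(v for v, c in cb.items() if c > 1.0)
```

In this small graph only user b lies on a shortest path between two other users (a to c and d to c), so b is the only node with non-zero betweenness.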
An embodiment of the present invention further includes establishing a churn early-warning model for the customer decline period:
The group characteristics of churned customers are analyzed through modeling to predict the probability of customer churn. Groups with a high churn probability are flagged and, in combination with user value, the churn-prone groups that deserve focused maintenance are selected. Specifically, the customer churn early-warning model is established as follows:
Given the group characteristics of known churned users, a logistic regression algorithm for a 0-1 variable is used to determine the regression coefficient of each feature variable. The feature data of the users to be predicted are then fed into the fitted model to estimate each user's churn probability. The resulting probability is compared with a preset threshold; users above the threshold are flagged as churn risks and, in combination with user value, the churn-prone groups that deserve focused maintenance are selected.
The customer churn early-warning model can be used to analyze the group characteristics of churned customers, predict the probability of churn, and flag customers with a high churn probability. Combined with the customer value model, it can identify high-value customers with a tendency to churn who need focused maintenance, which is an important means of analysis for maintaining high-end customers.
By establishing multiple encapsulated models and processing the user feature data with the encapsulated model corresponding to that data to obtain result data corresponding to the business logic, embodiments of the present invention can provide enterprises with encapsulated models for full-volume data mining and with more accurate user behavior feature information.
The specific implementation of the present invention is described in detail below through specific embodiments.
Fig. 2 is an architecture diagram of the model store system according to an embodiment of the present invention. As shown in Fig. 2, the model store system mainly consists of the following components:
1) Data acquisition: user behavior data and data item information data from multiple data sources are obtained through a data platform such as the BD-OS big data operating system;
The user behavior data are data that can represent user behavior, and may specifically include user browsing behavior data, purchasing behavior data, and the like;
The data item information is the basic information of a user, and may specifically include the user's name, age, gender, and the like.
2) Data platform: implemented with a distributed processing framework and customized, improved big data mining technology; used to integrate the acquired user behavior data and data item information data according to different business logic, obtaining the user feature data corresponding to the business logic;
In embodiments of the present invention, the business logic refers to the logic for processing data;
The user feature data are the set of user feature attributes that serve as the input data of each model;
3) Model store: the input, parameters, output, and other information of a model are configured through a simple configuration page; the model store processes the user feature data with the encapsulated model corresponding to that data and obtains the result data corresponding to the business logic;
4) Scheduler: the encapsulated models are executed by a scheduler;
5) Early warning and monitoring: while a model is running, the monitoring system monitors the system, and if an execution error occurs the system issues a timely warning. Specifically, this is implemented with the Nagios network monitoring tool, which effectively monitors model execution tasks; when a task fails, a warning can be sent by email or SMS.
Fig. 3 describes the implementation of the customer segmentation model:
1. The purpose of the model is to divide users into groups. In general, the analysis considers two aspects, user value and/or user behavior, and can be carried out with a clustering algorithm. The input data of the model (that is, the user feature data corresponding to each model) are therefore determined by the specific scenario, for example e-commerce user segmentation or media user segmentation. In embodiments of the present invention, user value and user behavior refer to aspects such as the user's behavioral value and the user's online browsing and shopping behavior, which are analyzed with a clustering algorithm.
2. Once the specific scenario is determined, the number of segments and the center point of each group are set in the model; the center points are derived from the characteristics of known user groups.
3. For existing users, the distance between each user and each group center is calculated from the user's feature data with the Euclidean distance method, using distributed computation, and each user is assigned to the group whose center is nearest. The model is refreshed periodically so that new users are assigned to the known groups.
The Euclidean distance is d = sqrt((x1 - x2)^2 + (y1 - y2)^2), where d is the distance, x1 and y1 are the two features of user 1, and x2 and y2 are the two features of user 2.
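The assignment step can be sketched on a single machine as follows. The group names, center points, and user feature vectors are hypothetical, features are assumed to be already scaled to comparable ranges, and the distributed computation mentioned in the description is not reproduced here.

```python
import math

def assign_to_groups(users, centers):
    """Assign each user to the group whose center is nearest in Euclidean distance.

    users:   {user_id: feature tuple}
    centers: {group_name: center tuple of the same length}
    """
    assignment = {}
    for uid, features in users.items():
        assignment[uid] = min(
            centers, key=lambda g: math.dist(features, centers[g])
        )
    return assignment

# Hypothetical centers derived from known user groups, and users to assign.
centers = {"high_value": (0.9, 0.8), "low_value": (0.1, 0.2)}
users = {"u1": (0.8, 0.9), "u2": (0.2, 0.1), "u3": (0.6, 0.6)}
groups = assign_to_groups(users, centers)
```

Because the centers are fixed in advance, a new user can be assigned at refresh time by running the same function on that user's feature vector alone.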
Fig. 4 describes the implementation of the customer value model:
Customers are grouped by RFM according to the time of their most recent purchase (Recency), their cumulative purchase frequency (Frequency), and their cumulative spending amount (Monetary), which divides them into groups of different value; the division is shown in Table 2. The inputs for the three value variables Recency, Frequency, and Monetary differ between business scenarios.
The specific grouping rule is as follows:
First, the mean of each value variable over all users is calculated, and each user's value for that variable is compared with the mean: higher than the mean is marked +, lower is marked -. All customers are thereby divided into 8 groups, as shown in Table 2:
Table 2
M   F   R   Output label
+   +   +   Important value customer
-   +   +   General value customer
-   -   +   General development customer
-   -   -   General win-back customer
+   -   +   Important development customer
+   -   -   Important win-back customer
-   +   -   General retention customer
+   +   -   Important retention customer
In practice, the criteria for the three value variables differ between business scenarios such as the fast-moving consumer goods (FMCG) industry and the automobile industry: FMCG products are replaced frequently, while automobiles are replaced much less often.
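The grouping rule of Table 2 can be sketched as follows. The customer data are hypothetical and the segment labels are paraphrased renderings of the eight output labels in Table 2. Two assumptions are made here: R is supplied as a recency score for which a larger value means a more recent purchase, and a value equal to the mean is treated as "-".

```python
from statistics import mean

# Labels keyed by the (M, F, R) signs, in the column order of Table 2.
SEGMENTS = {
    ("+", "+", "+"): "important value",
    ("-", "+", "+"): "general value",
    ("-", "-", "+"): "general development",
    ("-", "-", "-"): "general win-back",
    ("+", "-", "+"): "important development",
    ("+", "-", "-"): "important win-back",
    ("-", "+", "-"): "general retention",
    ("+", "+", "-"): "important retention",
}

def sign(value, average):
    # Higher than the mean is "+", otherwise "-" (ties count as "-" here).
    return "+" if value > average else "-"

def rfm_segments(customers):
    """customers: {id: (recency_score, frequency, monetary)} -> {id: label}."""
    r_avg = mean(c[0] for c in customers.values())
    f_avg = mean(c[1] for c in customers.values())
    m_avg = mean(c[2] for c in customers.values())
    result = {}
    for cid, (r, f, m) in customers.items():
        key = (sign(m, m_avg), sign(f, f_avg), sign(r, r_avg))
        result[cid] = SEGMENTS[key]
    return result

# Hypothetical customers: (recency score, purchase count, total spend).
customers = {
    "c1": (9, 20, 5000),
    "c2": (8, 18, 300),
    "c3": (2, 1, 100),
    "c4": (1, 2, 6000),
}
labels = rfm_segments(customers)
```

For a different industry only the inputs change, for example the length of the window over which frequency and spending are accumulated; the sign comparison and the lookup stay the same.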
Fig. 5 describes the implementation of the customer loyalty model:
1) Principal component analysis is used to provide a recommended weight for each loyalty variable;
2) Each loyalty variable is normalized with the efficacy coefficient method, which maps its value to the range 0-100: (value of the loyalty variable - minimum of that variable) / (maximum of that variable - minimum of that variable) * 100;
3) The mapped value of each loyalty variable is multiplied by its weight and the products are summed to obtain a loyalty score;
4) The calculated loyalty score is mapped to the range 0-100, and loyalty parameters representing high, medium, and low loyalty are passed in from the front-end configuration page in order to adjust the score bands for high, medium, and low loyalty. Specifically, the loyalty parameters are set through front-end page input according to the scenario, for example: high loyalty: 0.3; medium loyalty: 0.3; low loyalty: 0.4; the three parameters sum to 1.
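Steps 2) to 4) can be sketched as follows. The weights are taken as given inputs rather than derived by principal component analysis, and the sample data are hypothetical. The source does not say how the three loyalty parameters define the score bands; the sketch assumes one possible reading, namely that each parameter is the share of the 0-100 score range given to its band.

```python
def efficacy(values):
    """Map raw values to 0-100: (x - min) / (max - min) * 100."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) * 100.0 for v in values]

def loyalty_scores(rows, weights):
    """rows: {user: [variable values]}; weights: one per variable, summing to 1.

    With weights that sum to 1 the weighted sum already lies in 0-100.
    """
    ids = list(rows)
    columns = list(zip(*(rows[uid] for uid in ids)))
    mapped = [efficacy(col) for col in columns]
    return {
        uid: sum(w * col[i] for w, col in zip(weights, mapped))
        for i, uid in enumerate(ids)
    }

def tier(score, high=0.3, medium=0.3, low=0.4):
    """Assumed reading: each parameter is the share of the 0-100 range of its band."""
    assert abs(high + medium + low - 1.0) < 1e-9
    if score >= 100.0 * (low + medium):
        return "high"
    if score >= 100.0 * low:
        return "medium"
    return "low"

# Hypothetical loyalty variables per user: [visits, total spend].
rows = {"u1": [10, 200], "u2": [5, 100], "u3": [0, 0]}
scores = loyalty_scores(rows, weights=[0.6, 0.4])
tiers = {uid: tier(s) for uid, s in scores.items()}
```

Under this reading the default parameters place scores below 40 in the low band, scores from 40 up to 70 in the medium band, and higher scores in the high band.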
Fig. 6 describes the implementation of the customer diffusion model:
1) Data import module: imports the uid, telephone number, and email of the seed users. Seed users are users who have certain user feature data, and their definition depends on the scenario: in e-commerce they may be users who like to buy a certain category of goods, and in media they may be users who like to read a certain kind of novel, and so on;
2) Seed user data extraction module: extracts the full-dimension data of the seed users. Full-dimension data are all the feature attributes of the seed users, for example their media or e-commerce preference data and browsing records.
3) Data preprocessing module: handles outliers and missing values in the seed user data and normalizes the data;
4) Manual feature screening: seed user feature attributes are selected, such as online time periods, purchasing power, and browsing and purchase features by commodity or media category; the most recent active time is selected; and the importance of each feature attribute is specified manually. This step can also be performed automatically, by automatically analyzing the users' feature attributes;
5) Automatic feature screening: if the user skips manual feature screening, the factor analysis module is enabled to extract the features of the seed users;
6) Clustering module: cluster analysis is applied to the seed users to extract the user group similar to the seed users; specifically, the representative user C of the cluster is extracted;
7) Similarity calculation module: calculates the similarity between user C and the users who satisfy the recent-active-time condition. Specifically, the KNN algorithm is used to calculate the similarity between the recently active users and the representative user; the larger the similarity value, the more similar the users, and the smaller the value, the less similar.
8) Ranking and extraction module: sorts the users by similarity from largest to smallest and extracts the top-N user data.
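Steps 6) to 8) can be sketched as follows. This is a simplified illustration under stated assumptions rather than the method of the source: the representative user C is taken to be the mean feature vector of the seed users, similarity is defined as 1 / (1 + Euclidean distance), and the ranking is a plain nearest-neighbour search around C. The feature vectors are hypothetical and assumed to be already normalized.

```python
import math

def centroid(vectors):
    """Mean feature vector, used here as the representative user C."""
    n = len(vectors)
    return tuple(sum(col) / n for col in zip(*vectors))

def similarity(a, b):
    # Assumption: similarity falls as Euclidean distance grows; 1.0 means identical.
    return 1.0 / (1.0 + math.dist(a, b))

def expand(seed_vectors, candidates, top_n):
    """Rank candidates by similarity to C; return the top_n (id, similarity) pairs."""
    rep = centroid(seed_vectors)
    ranked = sorted(
        ((uid, similarity(vec, rep)) for uid, vec in candidates.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:top_n]

# Hypothetical seed users and recently active candidate users.
seeds = [(0.9, 0.8), (0.7, 0.8), (0.8, 0.8)]
candidates = {"a": (0.8, 0.8), "b": (0.1, 0.2), "c": (0.7, 0.9), "d": (0.5, 0.5)}
top = expand(seeds, candidates, top_n=2)
top_ids = [uid for uid, _ in top]
```

The candidates passed in are assumed to have been filtered by the recent-active-time condition beforehand, as step 7) requires.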
Fig. 7 describes the implementation of the customer activity model:
1) Principal component analysis is used to provide a recommended weight for each activity variable.
2) Each activity variable is normalized with the efficacy coefficient method, which maps its value to the range 0-100: (value of the activity variable - minimum of that variable) / (maximum of that variable - minimum of that variable) * 100.
3) The mapped value of each activity variable is multiplied by its weight and the products are summed to obtain an activity score.
4) The calculated activity score is mapped to the range 0-100, and activity parameters representing high, medium, and low activity are passed in from the front-end configuration page in order to adjust the score bands for high, medium, and low activity. Specifically, the activity parameters are set through front-end page input according to the scenario, for example: high activity: 0.5; medium activity: 0.3; low activity: 0.2; the three parameters sum to 1.
Fig. 8 describes the social network model:
The social network model is established as follows:
1) Data preparation: data are extracted and a social intermediate table is generated. The data of each user in the social network are extracted, which may include user behavior data and data item information data such as user ID, posting time, post topic, post content, replying user ID, reply time, and reply content. From these the social intermediate table is generated (containing poster ID, replier ID, and number of replies). The social intermediate table is used to generate the social network graph, in which nodes represent users, edges represent relationships, and the value on an edge represents the number of replies.
2) Model establishment: the out-degree, in-degree, and corresponding degree centrality of each user node in the social network are calculated to identify the opinion leaders, active participants, and socialites in the social network.
3) Model storage: the social network model is generated and stored in the prescribed format.
The technical terms used here are explained as follows.
A) Opinion leader: a user who receives many replies from other users.
In-degree: the number of edges entering a user node. Each time a replier replies to a poster, one in-degree is recorded for the poster.
B) Active participant: a user who replies to other users frequently.
Out-degree: the number of edges leaving a user node. Each time a replier replies to a poster, one out-degree is recorded for the replier.
C) Socialite: a user with whom other users exchange replies and who forms an important node in the relationship network.
Betweenness centrality: measures how often a user node lies on the shortest paths between other user nodes in the graph. It is calculated as follows:
$$C_B(v) = \sum_{s \neq v \neq t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}}$$
$\sigma_{st}$: the number of all shortest paths from user node s to user node t
$\sigma_{st}(v)$: the number of those shortest paths from user node s to user node t that pass through user node v
The more often a user node appears on these paths, the higher its betweenness centrality. The user corresponding to a node with high betweenness centrality is a user at an important node of the network. Specifically, when a betweenness centrality value is higher than a preset threshold, the user corresponding to that value is determined to be a user at an important node of the network.
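The data preparation and degree calculation for this model can be sketched as follows. The rows of the social intermediate table and the threshold of 5 are hypothetical; in the description the thresholds are set empirically. Identifying socialites additionally requires betweenness centrality, which this sketch does not compute.

```python
from collections import Counter

def degrees(reply_table):
    """reply_table rows: (poster_id, replier_id, reply_count).

    Each reply is an edge from the replier to the poster, so it adds one
    to the poster's in-degree and one to the replier's out-degree.
    """
    in_deg, out_deg = Counter(), Counter()
    for poster, replier, count in reply_table:
        in_deg[poster] += count
        out_deg[replier] += count
    return in_deg, out_deg

# Hypothetical social intermediate table: (poster ID, replier ID, reply count).
reply_table = [
    ("alice", "bob", 3),
    ("alice", "carol", 2),
    ("bob", "carol", 1),
    ("carol", "bob", 4),
]
in_deg, out_deg = degrees(reply_table)

# Compare with (hypothetical) thresholds to pick out the two roles.
opinion_leaders = sorted(u for u, d in in_deg.items() if d >= 5)
active_participants = sorted(u for u, d in out_deg.items() if d >= 5)
```

Here alice receives five replies and is flagged as an opinion leader, while bob writes seven replies and is flagged as an active participant.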
Fig. 9 describes the customer churn early-warning model:
The group characteristics of churned customers are analyzed through modeling to predict the probability of customer churn. Groups with a high churn probability are flagged and, in combination with user value, the churn-prone groups that deserve focused maintenance are selected.
The definition of customer churn generally differs between application scenarios: for media, churn is usually defined by the time of the user's visits and browsing; for e-commerce and offline retailers, churn is defined by the time of the user's actual orders and purchases.
Given the group characteristics of known churned users, the customer churn early-warning model uses a logistic regression algorithm for a 0-1 variable to determine the regression coefficient of each feature variable. The feature data of the users to be predicted are then fed into the fitted model to estimate each user's churn probability, the resulting probability is compared with a preset threshold, and users above the threshold are flagged as churn risks.
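The fitting and prediction described above can be sketched in plain Python as follows. The training data, the two feature names, the learning rate, the number of iterations, and the 0.5 threshold are all hypothetical; the coefficients are estimated by gradient ascent on the log-likelihood, which is one simple way to obtain a maximum likelihood fit and is not prescribed by the source.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit coefficients (intercept first) by batch gradient ascent on the log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = yi - p
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj + lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def churn_probability(w, x):
    """Estimated P(Y = 1 | x), where Y = 1 means the user churns."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Hypothetical normalized features: (share of visit days, purchase count).
X = [(0.9, 0.8), (0.8, 0.6), (0.7, 0.9), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2)]
y = [0, 0, 0, 1, 1, 1]  # 1 = known churned user

w = fit_logistic(X, y)
p_inactive = churn_probability(w, (0.15, 0.1))
p_active = churn_probability(w, (0.85, 0.9))
flagged = p_inactive > 0.5  # compare with a preset (hypothetical) threshold
```

A user with few visit days and purchases receives a high estimated churn probability and is flagged, while an active user stays below the threshold.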
The customer churn early-warning model is described in detail as follows:
1) Data preparation: data are extracted and a churn intermediate table is generated. User basic information, user behavior data (such as browsing behavior data), and other data are extracted to generate the churn intermediate table (containing cumulative visit days, cumulative visit duration, number of purchases, total spending, average daily visits, bounce rate, and the share of visit days in the most recent month).
Variable processing and selection
(B1) Dependent variable: churn status (0/1)
A user with no activity record within a certain period is defined as churned. If the churn period cannot be specified, the model selects the most suitable time span for defining churn from the distribution of the average interval between active times across the user data.
(B2) Explanatory variables
Customer basic information: membership level, consumption level
Customer browsing information: visit days, visit duration, average daily visits, share of product page views, bounce rate, share of visit days in the most recent month (or three months, or six months)
Customer purchase information: number of purchases, total spending
Customer contact information: number of complaints, number of returns, click-through rate of marketing emails
(B3) Data standardization
Different evaluation indicators often have different dimensions and units, which can affect the results of data analysis. To eliminate the effect of dimension between indicators, the data usually need to be standardized (normalized) before entering the model so that the indicators become comparable.
The min-max standardization method applies a linear transformation to the original data so that the values fall within a specific interval, such as [0,1].
(B4) Selection of significant variables
All explanatory variables are put into the regression model (the full model), and the regression parameters are estimated by the maximum likelihood method. Non-significant variables are removed through significance tests on the parameters or according to the AIC criterion, which determines the variables included in the final model and the regression parameter of each.
2) Model establishment: customer churn characteristics are captured by a regression method. Specifically, logistic regression can be used: the 0-1 dependent variable is converted into a continuous dependent variable so that the regression problem for a 0-1 variable, also called qualitative variable regression, can be handled.
3) Analysis and prediction: the potential churn probability of users is analyzed and predicted.
The indicator data of each user to be predicted are substituted into the fitted final model to estimate the customer's churn probability P(Y=1|x). Users are sorted from high to low by churn probability score, and groups with a high churn probability are flagged. The share of users under each label is chosen as follows:
(C1) Cut-off values p_1 and p_2 may be specified (for example p_1 = 0.4, p_2 = 0.8); each user is labeled according to the churn probability score and assigned to one of three groups: high risk, medium risk, or low risk.
(C2) Alternatively, the share of each group c_1, c_2, c_3 may be specified (for example c_1 = 20%, c_2 = 30%, c_3 = 50%); all users are then sorted from high to low by churn probability, the top 20% are labeled high-risk users, the bottom 50% are labeled low-risk users, and the remaining users are assigned to the medium-risk group.
The specific cut-off values and shares can be adjusted according to the number and needs of the specific marketing and retention measures.
4) Model storage: the customer churn early-warning model is generated and stored in the prescribed format.
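The two labeling options (C1) and (C2) in step 3) can be sketched as follows. The churn probabilities are hypothetical, the default parameters are the example values from the description, and treating the cut-off values as inclusive lower bounds is an assumption made here.

```python
def label_by_cutoffs(probabilities, p_1=0.4, p_2=0.8):
    """(C1) Label by fixed cut-offs: p >= p_2 high, p >= p_1 medium, else low."""
    labels = {}
    for uid, p in probabilities.items():
        if p >= p_2:
            labels[uid] = "high"
        elif p >= p_1:
            labels[uid] = "medium"
        else:
            labels[uid] = "low"
    return labels

def label_by_shares(probabilities, c_1=0.2, c_2=0.3):
    """(C2) Label by rank: top c_1 share high, next c_2 share medium, rest low."""
    ranked = sorted(probabilities, key=probabilities.get, reverse=True)
    n_high = round(len(ranked) * c_1)
    n_medium = round(len(ranked) * c_2)
    labels = {}
    for rank, uid in enumerate(ranked):
        if rank < n_high:
            labels[uid] = "high"
        elif rank < n_high + n_medium:
            labels[uid] = "medium"
        else:
            labels[uid] = "low"
    return labels

# Hypothetical churn probabilities for ten users: u0 -> 0.0, ..., u9 -> 0.9.
probabilities = {f"u{i}": i / 10 for i in range(10)}
by_cutoffs = label_by_cutoffs(probabilities)
by_shares = label_by_shares(probabilities)
```

The two options can label the same user differently: with these values user u7 is medium risk under both, but u4 is medium risk under the cut-offs and low risk under the shares.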
Referring to Fig. 10, which is a module diagram of the user feature data processing device provided by an embodiment of the present invention, the device includes an acquisition module 1001, an integration module 1002, and a processing module 1003, as follows:
The acquisition module 1001 is configured to obtain user behavior data and data item information data from data sources;
The integration module 1002 is configured to integrate the acquired user behavior data and data item information data according to different business logic, obtaining the user feature data corresponding to the business logic;
The processing module 1003 is configured to process the user feature data with the encapsulated model corresponding to the user feature data and obtain the result data corresponding to the business logic.
Further, the integration module 1002 is configured to:
Receive, connect, and integrate user behavior data and data item information data from multiple data sources, and update them in real time to a unified data platform.
Further, the user feature data processing device also includes:
An establishing module, configured to establish the correspondence between user feature data and encapsulated models;
The encapsulated models include a customer segmentation model for the customer investigation period, a customer value model for the customer formation period, a customer loyalty identification model, a customer diffusion model for the customer maturity period, a customer activity model, a social network analysis model, or a churn early-warning model for the customer decline period.
Further, the establishing module is also configured to establish the customer segmentation model for the customer investigation period, specifically including:
Setting, from known user data and according to user value and/or user behavior criteria, the number of user groups and the center point of each user group;
Calculating the distance between each user and the center point of each user group with the Euclidean distance method, through distributed computation;
Assigning each user, according to the user's distance to each group center, to the user group whose center is nearest.
Further, the establishing module is also configured to establish the customer value model for the customer formation period, specifically including:
Performing user value analysis on each user according to the time of the user's most recent purchase, cumulative purchase frequency, and cumulative spending amount, and assigning the user to the corresponding user value group.
Further, the establishing module is also configured to establish the customer loyalty identification model, specifically including:
Calculating each user's loyalty score for a given product. The loyalty score lies between 0 and 100; the larger the value, the higher the loyalty. The product includes but is not limited to a website or a brand.
Further, the establishing module is also configured to establish the customer diffusion model for the customer maturity period, specifically including:
Extracting the full-dimension data of the seed users from the imported seed user data;
Handling outliers and missing values in the full-dimension data of the seed users and normalizing the data to obtain the feature points of the seed users;
Applying cluster analysis to the seed users according to their feature points, and extracting the user group similar to the seed users;
Using the KNN algorithm to calculate the similarity between the recently active users and the representative user;
Sorting by similarity from largest to smallest and extracting the N user data with the highest similarity.
Further, the establishing module is also configured to establish the customer activity model, specifically including:
Calculating the activity level of each user from preset activity factors. The activity level lies between 0 and 100; the larger the value, the more active the user.
Further, the establishing module is also configured to establish the social network analysis model, specifically including:
Calculating the out-degree, in-degree, and corresponding degree centrality of each user node in the social network to identify the opinion leaders, active participants, and socialites in the social network.
Further, the establishing module is also configured to establish the churn early-warning model for the customer decline period, specifically including:
Analyzing the group characteristics of churned customers through modeling to predict the probability of customer churn, flagging groups with a high churn probability, and, in combination with user value, selecting the churn-prone groups that deserve focused maintenance.
This device embodiment corresponds to the features of the method embodiment above; each module can perform the corresponding part of the method flow described in the foregoing method embodiment, which is not repeated here.
The above description shows and describes several preferred embodiments of the present invention. As stated above, however, it should be understood that the invention is not limited to the forms disclosed herein; these should not be regarded as excluding other embodiments, and the invention can be used in various other combinations, modifications, and environments and can be modified within the scope of the inventive concept described herein through the above teachings or the skill or knowledge of the related art. Modifications and changes made by those skilled in the art that do not depart from the spirit and scope of the present invention shall all fall within the protection scope of the appended claims.

Claims (20)

1. A method for processing user feature data, characterized by comprising:
obtaining user behavior data and data item information data from data sources;
integrating, according to different business logic, the acquired user behavior data and data item information data to obtain user feature data corresponding to the business logic;
processing the user feature data with the encapsulated model corresponding to the user feature data to obtain result data corresponding to the business logic.
2. The method of claim 1, characterized in that integrating, according to different business logic, the acquired user behavior data and data item information data to obtain user feature data corresponding to the business logic comprises:
receiving, connecting, and integrating user behavior data and data item information data from multiple data sources, and updating them in real time to a unified data platform.
3. The method of claim 1, characterized in that, before processing the user feature data with the encapsulated model corresponding to the user feature data to obtain result data corresponding to the business logic, the method comprises:
establishing the correspondence between user feature data and encapsulated models;
the encapsulated models include a customer segmentation model for the customer investigation period, a customer value model for the customer formation period, a customer loyalty identification model, a customer diffusion model for the customer maturity period, a customer activity model, a social network analysis model, or a churn early-warning model for the customer decline period.
4. The method of claim 3, characterized by further comprising establishing the customer segmentation model for the customer investigation period:
setting, from known user data and according to user value and/or user behavior criteria, the number of user groups and the center point of each user group;
calculating the distance between each user and the center point of each user group with the Euclidean distance method, through distributed computation;
assigning each user, according to the user's distance to each group center, to the user group whose center is nearest.
5. The method of claim 3, characterized by further comprising establishing the customer value model for the customer formation period:
performing user value analysis on each user according to the time of the user's most recent purchase, cumulative purchase frequency, and cumulative spending amount, and assigning the user to the corresponding user value group.
6. The method of claim 3, characterized by further comprising establishing the customer loyalty identification model:
calculating each user's loyalty score for a given product, wherein the loyalty score lies between 0 and 100, a larger value indicates higher loyalty, and the product includes but is not limited to a website or a brand.
7. The method of claim 3, characterized by further comprising establishing the customer diffusion model for the customer maturity period:
extracting the full-dimension data of the seed users from the imported seed user data;
handling outliers and missing values in the full-dimension data of the seed users and normalizing the data to obtain the feature points of the seed users;
applying cluster analysis to the seed users according to their feature points, and extracting the user group similar to the seed users;
using the KNN algorithm to calculate the similarity between the recently active users and the representative user;
sorting by similarity from largest to smallest and extracting the N user data with the highest similarity.
8. The method of claim 3, characterized by further comprising establishing the customer activity model:
calculating the activity level of each user from preset activity factors, wherein the activity level lies between 0 and 100 and a larger value indicates a more active user.
9. The method of claim 3, characterized by further comprising establishing the social network analysis model:
calculating the out-degree, in-degree, and corresponding degree centrality of each user node in the social network to identify the opinion leaders, active participants, and socialites in the social network.
10. The method of claim 3, characterized by further comprising establishing the churn early-warning model for the customer decline period:
analyzing the group characteristics of churned customers through modeling to predict the probability of customer churn, flagging groups with a high churn probability, and, in combination with user value, selecting the churn-prone groups that deserve focused maintenance.
11. A device for processing user feature data, characterized in that it comprises:
An acquisition module, configured to acquire user behavior data and data item information data from data sources;
An integration module, configured to integrate the acquired user behavior data and data item information data according to different business logic, to obtain user feature data corresponding to the business logic;
A processing module, configured to process the user feature data using the packaged model corresponding to the user feature data, to obtain result data corresponding to the business logic.
12. The device as claimed in claim 11, characterized in that the integration module is configured to:
Receive, connect, and integrate user behavior data and data item information data from multiple data sources, and update them to a unified data platform.
13. The device as claimed in claim 11, characterized in that it further comprises:
An establishing module, configured to establish the correspondence between user feature data and packaged models;
The packaged models include a customer segmentation model for the customer probation period, a customer value model for the customer formation period, a customer loyalty identification model, a customer diffusion model for the customer maturity period, a customer activity model, a social network analysis model, or a churn early-warning model for the customer decline period.
14. The device as claimed in claim 13, characterized in that the establishing module is further configured to establish the customer segmentation model for the customer probation period, specifically including:
Setting, from known user data and according to user value and/or user behavior criteria, the number of user groups and the center point of each user group;
Calculating, by distributed computing using the Euclidean distance method, the distance between each user and the center point of each user group;
Assigning each user, according to its distance to the center point of each user group, to the user group whose center point is nearest to it.
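The assignment step of claim 14 amounts to nearest-center classification under Euclidean distance. The sketch below shows that step in a single process. The claim calls for distributed computation, which is not attempted here, and the group names and feature layout are hypothetical.

```python
import math

def assign_to_groups(users, centers):
    """Map each user id to the name of the group whose center is nearest.

    users maps a user id to a feature vector; centers maps a group name
    to the preset center point of that group (same dimensionality).
    """
    return {
        uid: min(centers, key=lambda group: math.dist(point, centers[group]))
        for uid, point in users.items()
    }
```

Because every user is processed independently of the others, this step could be split across workers by partitioning the `users` mapping, which is presumably what makes it suitable for distributed computation.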
15. The device as claimed in claim 13, characterized in that the establishing module is further configured to establish the customer value model for the customer formation period, specifically including:
Performing user value analysis on each user according to the user's most recent consumption time, cumulative purchase frequency, and cumulative spending amount, and assigning the user to the corresponding user value group.
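The three inputs named in claim 15 (most recent consumption time, cumulative purchase frequency, cumulative spending amount) correspond to what is commonly called RFM analysis. The patent does not give the grouping rule, so the sketch below assumes a simple one: each dimension is compared with a fixed threshold, and the three resulting flags form the group code. The thresholds and the code format are hypothetical.

```python
from datetime import date

def rfm_segment(last_purchase, frequency, monetary, today,
                r_days=30, f_min=5, m_min=1000.0):
    """Return a group code such as "R1F0M1" from three threshold checks.

    R is 1 if the last purchase was within r_days of today,
    F is 1 if the purchase count is at least f_min,
    M is 1 if the cumulative spend is at least m_min.
    """
    recent = (today - last_purchase).days <= r_days
    frequent = frequency >= f_min
    big_spender = monetary >= m_min
    return f"R{int(recent)}F{int(frequent)}M{int(big_spender)}"
```

With three binary flags this yields eight possible user value groups, from `R0F0M0` to `R1F1M1`.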
16. The device as claimed in claim 13, characterized in that the establishing module is further configured to establish the customer loyalty identification model, specifically including:
Calculating each user's loyalty score for a given product, wherein the loyalty score is between 0 and 100 and a larger value indicates higher loyalty, and the product includes but is not limited to a website or a brand.
17. The device as claimed in claim 13, characterized in that the establishing module is further configured to establish the customer diffusion model for the customer maturity period, specifically including:
Extracting the full-dimension data of the seed users according to the imported seed user data;
Performing outlier handling, missing-value handling, and data normalization on the full-dimension data of the seed users to obtain the feature points of the seed users;
Performing cluster analysis on the seed users according to their feature points, and extracting customer groups similar to the seed users;
Using the KNN algorithm to calculate the similarity between recently active users and the representative users;
Sorting by similarity in descending order, and extracting the data of the N users with the highest similarity.
18. The device as claimed in claim 13, characterized in that the establishing module is further configured to establish the customer activity model, specifically including:
Calculating the activity score of each user according to preset activity factors, wherein the activity score is between 0 and 100 and a larger value indicates higher activity.
19. The device as claimed in claim 13, characterized in that the establishing module is further configured to establish the social network analysis model, specifically including:
Calculating the out-degree, the in-degree, and the corresponding degree centrality of each user node in the social network, and identifying the opinion leader users, activist users, and socialite users in the social network.
20. The device as claimed in claim 13, characterized in that the establishing module is further configured to establish the churn early-warning model for the customer decline period, specifically including:
Analyzing the group characteristics of churned customers through modeling, predicting the probability of customer churn, identifying the group with a high churn probability, and, in combination with user value, selecting the churn-prone group to be prioritized for retention.
CN201610323618.0A 2016-05-16 2016-05-16 User feature data processing method and device Pending CN106022800A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610323618.0A CN106022800A (en) 2016-05-16 2016-05-16 User feature data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610323618.0A CN106022800A (en) 2016-05-16 2016-05-16 User feature data processing method and device

Publications (1)

Publication Number Publication Date
CN106022800A true CN106022800A (en) 2016-10-12

Family

ID=57097486

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610323618.0A Pending CN106022800A (en) 2016-05-16 2016-05-16 User feature data processing method and device

Country Status (1)

Country Link
CN (1) CN106022800A (en)

Cited By (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106384209A (en) * 2016-10-27 2017-02-08 合肥工业大学 Method and device for improving configuration of intelligent products based on operation data
CN106411711A (en) * 2016-10-20 2017-02-15 宁波江东大金佰汇信息技术有限公司 Improved temporary social network determination system based on computer big data
CN106503863A (en) * 2016-11-10 2017-03-15 北京红马传媒文化发展有限公司 Based on the Forecasting Methodology of the age characteristicss of decision-tree model, system and terminal
CN106651605A (en) * 2016-10-20 2017-05-10 宁波江东大金佰汇信息技术有限公司 Computer big data-based temporary social network determining system
CN106779803A (en) * 2016-11-24 2017-05-31 久远谦长(北京)技术服务有限公司 A kind of method that financial institution's flowing water is matched with carrier data
CN106779808A (en) * 2016-11-25 2017-05-31 上海斐讯数据通信技术有限公司 Consumer space's behavior analysis system and method in a kind of commercial circle
CN106776737A (en) * 2016-11-18 2017-05-31 北京红马传媒文化发展有限公司 A kind of method based on big data analyzing evaluation artist's viscosity
CN106845706A (en) * 2017-01-19 2017-06-13 浙江工商大学 Online social network user relationship strength Forecasting Methodology
CN107093092A (en) * 2016-11-17 2017-08-25 北京小度信息科技有限公司 Data analysing method and device
CN107122425A (en) * 2017-04-07 2017-09-01 广东精点数据科技股份有限公司 The method and system evaluated corporate client
CN107480187A (en) * 2017-07-10 2017-12-15 北京京东尚科信息技术有限公司 User's value category method and apparatus based on cluster analysis
CN107730019A (en) * 2017-09-29 2018-02-23 携程计算机技术(上海)有限公司 User based on user's portrait retrieves method and system
CN108122123A (en) * 2016-11-29 2018-06-05 华为技术有限公司 A kind of method and device for extending potential user
CN108428155A (en) * 2018-03-20 2018-08-21 南京邮电大学 A kind of behavior processing analysis method based on service feature model
CN108540993A (en) * 2018-04-08 2018-09-14 中国联合网络通信集团有限公司 User's Valuation Method and device
CN108628866A (en) * 2017-03-20 2018-10-09 大有秦鼎(北京)科技有限公司 The method and apparatus of data fusion
CN108764994A (en) * 2018-05-24 2018-11-06 深圳前海桔子信息技术有限公司 A kind of user behavior guidance method, device, server and storage medium
CN108876394A (en) * 2017-05-16 2018-11-23 北京京东尚科信息技术有限公司 Identify the potential method and apparatus for being lost user of e-commerce platform
WO2018223719A1 (en) * 2017-06-09 2018-12-13 平安科技(深圳)有限公司 Method for predicting insurance purchasing behavior of a user, device, computing apparatus, and medium
CN109190959A (en) * 2018-08-23 2019-01-11 杭州颜铺科技有限公司 A kind of intelligent management system for beauty's industry
CN109255640A (en) * 2017-07-13 2019-01-22 阿里健康信息技术有限公司 A kind of method, apparatus and system of determining user grouping
WO2019037391A1 (en) * 2017-08-24 2019-02-28 平安科技(深圳)有限公司 Method and apparatus for predicting customer purchase intention, and electronic device and medium
CN109657998A (en) * 2018-12-25 2019-04-19 国信优易数据有限公司 A kind of resource allocation methods, device, equipment and storage medium
CN109841250A (en) * 2017-11-24 2019-06-04 光宝科技股份有限公司 The forecasting system method for building up and operating method of decoded state
CN109919667A (en) * 2019-02-21 2019-06-21 江苏苏宁银行股份有限公司 A kind of method and apparatus of the IP of enterprise for identification
WO2019119635A1 (en) * 2017-12-18 2019-06-27 平安科技(深圳)有限公司 Seed user development method, electronic device and computer-readable storage medium
CN109978547A (en) * 2017-12-28 2019-07-05 北京京东尚科信息技术有限公司 Risk behavior control method and system, equipment and storage medium
CN109993556A (en) * 2017-12-30 2019-07-09 中国移动通信集团湖北有限公司 User behavior analysis method, apparatus calculates equipment and storage medium
CN110222975A (en) * 2019-05-31 2019-09-10 北京奇艺世纪科技有限公司 A kind of loss customer analysis method, apparatus, electronic equipment and storage medium
WO2019169961A1 (en) * 2018-03-06 2019-09-12 阿里巴巴集团控股有限公司 Method and device for determining group of target users
CN110276514A (en) * 2019-05-06 2019-09-24 阿里巴巴集团控股有限公司 Appraisal procedure, device and the equipment of business correlative factor
CN110348914A (en) * 2019-07-19 2019-10-18 中国银行股份有限公司 Customer churn data analysing method and device
CN110516155A (en) * 2019-08-29 2019-11-29 深圳市云积分科技有限公司 Marketing strategy generation method and system
CN110610371A (en) * 2018-06-14 2019-12-24 北京京东尚科信息技术有限公司 Latent user analysis method, system, and computer-readable storage medium
CN110659269A (en) * 2019-08-15 2020-01-07 中国平安财产保险股份有限公司 User access data processing method and device, computer equipment and storage medium
CN110717085A (en) * 2019-10-12 2020-01-21 浙江工商大学 Opinion leader identification method based on virtual brand community
CN110991875A (en) * 2019-11-29 2020-04-10 广州市百果园信息技术有限公司 Platform user quality evaluation system
CN111311331A (en) * 2020-02-26 2020-06-19 北京慧博科技有限公司 RFM analysis method
CN111695819A (en) * 2020-06-16 2020-09-22 中国联合网络通信集团有限公司 Method and device for scheduling seat personnel
CN111738331A (en) * 2020-06-19 2020-10-02 北京同邦卓益科技有限公司 User classification method and device, computer-readable storage medium and electronic device
CN111899036A (en) * 2020-08-03 2020-11-06 上海同儒信息技术有限公司 Client hierarchical classification management system based on big data
CN112070548A (en) * 2020-09-11 2020-12-11 上海风秩科技有限公司 User layering method, device, equipment and storage medium
CN112594937A (en) * 2020-12-16 2021-04-02 珠海格力电器股份有限公司 Control method and device of water heater, electronic equipment and storage medium
CN112598442A (en) * 2020-12-25 2021-04-02 中国建设银行股份有限公司 Multidimensional operation analysis method and multidimensional operation analysis device for network traffic
CN113793060A (en) * 2021-09-27 2021-12-14 武汉众邦银行股份有限公司 Customer rating method and device based on customer transaction data and storage medium
TWI778568B (en) * 2021-04-06 2022-09-21 富邦人壽保險股份有限公司 Systems and methods for generating recommendation list
CN115879984A (en) * 2023-03-03 2023-03-31 北京一凌宸飞科技有限公司 Network marketing method and system based on big data analysis

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103118111A (en) * 2013-01-31 2013-05-22 北京百分点信息科技有限公司 Information push method based on data from a plurality of data interaction centers
CN103714139A (en) * 2013-12-20 2014-04-09 华南理工大学 Parallel data mining method for identifying a mass of mobile client bases
CN104866969A (en) * 2015-05-25 2015-08-26 百度在线网络技术(北京)有限公司 Personal credit data processing method and device

Cited By (64)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106411711A (en) * 2016-10-20 2017-02-15 宁波江东大金佰汇信息技术有限公司 Improved temporary social network determination system based on computer big data
CN106651605A (en) * 2016-10-20 2017-05-10 宁波江东大金佰汇信息技术有限公司 Computer big data-based temporary social network determining system
CN106651605B (en) * 2016-10-20 2019-11-15 福州盛世凌云环保科技有限公司 A kind of temporary social network based on computer big data determines system
CN106384209A (en) * 2016-10-27 2017-02-08 合肥工业大学 Method and device for improving configuration of intelligent products based on operation data
CN106503863A (en) * 2016-11-10 2017-03-15 北京红马传媒文化发展有限公司 Based on the Forecasting Methodology of the age characteristicss of decision-tree model, system and terminal
CN107093092A (en) * 2016-11-17 2017-08-25 北京小度信息科技有限公司 Data analysing method and device
CN107093092B (en) * 2016-11-17 2021-07-09 北京星选科技有限公司 Data analysis method and device
CN106776737A (en) * 2016-11-18 2017-05-31 北京红马传媒文化发展有限公司 A kind of method based on big data analyzing evaluation artist's viscosity
CN106779803A (en) * 2016-11-24 2017-05-31 久远谦长(北京)技术服务有限公司 A kind of method that financial institution's flowing water is matched with carrier data
CN106779803B (en) * 2016-11-24 2021-01-15 久远谦长(北京)技术服务有限公司 Method for matching financial institution running water with operator data
CN106779808A (en) * 2016-11-25 2017-05-31 上海斐讯数据通信技术有限公司 Consumer space's behavior analysis system and method in a kind of commercial circle
CN108122123B (en) * 2016-11-29 2021-08-20 华为技术有限公司 Method and device for expanding potential users
CN108122123A (en) * 2016-11-29 2018-06-05 华为技术有限公司 A kind of method and device for extending potential user
CN106845706A (en) * 2017-01-19 2017-06-13 浙江工商大学 Online social network user relationship strength Forecasting Methodology
CN108628866A (en) * 2017-03-20 2018-10-09 大有秦鼎(北京)科技有限公司 The method and apparatus of data fusion
CN108628866B (en) * 2017-03-20 2020-11-06 大有秦鼎(北京)科技有限公司 Data fusion method and device
CN107122425A (en) * 2017-04-07 2017-09-01 广东精点数据科技股份有限公司 The method and system evaluated corporate client
CN108876394A (en) * 2017-05-16 2018-11-23 北京京东尚科信息技术有限公司 Identify the potential method and apparatus for being lost user of e-commerce platform
WO2018223719A1 (en) * 2017-06-09 2018-12-13 平安科技(深圳)有限公司 Method for predicting insurance purchasing behavior of a user, device, computing apparatus, and medium
CN107480187A (en) * 2017-07-10 2017-12-15 北京京东尚科信息技术有限公司 User's value category method and apparatus based on cluster analysis
CN109255640A (en) * 2017-07-13 2019-01-22 阿里健康信息技术有限公司 A kind of method, apparatus and system of determining user grouping
WO2019037391A1 (en) * 2017-08-24 2019-02-28 平安科技(深圳)有限公司 Method and apparatus for predicting customer purchase intention, and electronic device and medium
CN107730019A (en) * 2017-09-29 2018-02-23 携程计算机技术(上海)有限公司 User based on user's portrait retrieves method and system
CN107730019B (en) * 2017-09-29 2021-06-11 携程计算机技术(上海)有限公司 User retrieval method and system based on user portrait
CN109841250B (en) * 2017-11-24 2020-11-13 建兴储存科技股份有限公司 Method for establishing prediction system of decoding state and operation method
CN109841250A (en) * 2017-11-24 2019-06-04 光宝科技股份有限公司 The forecasting system method for building up and operating method of decoded state
WO2019119635A1 (en) * 2017-12-18 2019-06-27 平安科技(深圳)有限公司 Seed user development method, electronic device and computer-readable storage medium
CN109978547A (en) * 2017-12-28 2019-07-05 北京京东尚科信息技术有限公司 Risk behavior control method and system, equipment and storage medium
CN109993556A (en) * 2017-12-30 2019-07-09 中国移动通信集团湖北有限公司 User behavior analysis method, apparatus calculates equipment and storage medium
CN109993556B (en) * 2017-12-30 2021-06-08 中国移动通信集团湖北有限公司 User behavior analysis method and device, computing equipment and storage medium
WO2019169961A1 (en) * 2018-03-06 2019-09-12 阿里巴巴集团控股有限公司 Method and device for determining group of target users
CN108428155A (en) * 2018-03-20 2018-08-21 南京邮电大学 A kind of behavior processing analysis method based on service feature model
CN108540993A (en) * 2018-04-08 2018-09-14 中国联合网络通信集团有限公司 User's Valuation Method and device
CN108764994A (en) * 2018-05-24 2018-11-06 深圳前海桔子信息技术有限公司 A kind of user behavior guidance method, device, server and storage medium
CN110610371A (en) * 2018-06-14 2019-12-24 北京京东尚科信息技术有限公司 Latent user analysis method, system, and computer-readable storage medium
CN109190959A (en) * 2018-08-23 2019-01-11 杭州颜铺科技有限公司 A kind of intelligent management system for beauty's industry
CN109190959B (en) * 2018-08-23 2021-07-06 杭州颜铺科技有限公司 Intelligent management system for beauty industry
CN109657998A (en) * 2018-12-25 2019-04-19 国信优易数据有限公司 A kind of resource allocation methods, device, equipment and storage medium
CN109657998B (en) * 2018-12-25 2020-11-27 国信优易数据股份有限公司 Resource allocation method, device, equipment and storage medium
CN109919667B (en) * 2019-02-21 2022-07-22 江苏苏宁银行股份有限公司 Method and device for identifying enterprise IP
CN109919667A (en) * 2019-02-21 2019-06-21 江苏苏宁银行股份有限公司 A kind of method and apparatus of the IP of enterprise for identification
CN110276514B (en) * 2019-05-06 2023-04-07 创新先进技术有限公司 Method, device and equipment for evaluating business related factors
CN110276514A (en) * 2019-05-06 2019-09-24 阿里巴巴集团控股有限公司 Appraisal procedure, device and the equipment of business correlative factor
CN110222975A (en) * 2019-05-31 2019-09-10 北京奇艺世纪科技有限公司 A kind of loss customer analysis method, apparatus, electronic equipment and storage medium
CN110348914A (en) * 2019-07-19 2019-10-18 中国银行股份有限公司 Customer churn data analysing method and device
CN110659269A (en) * 2019-08-15 2020-01-07 中国平安财产保险股份有限公司 User access data processing method and device, computer equipment and storage medium
CN110659269B (en) * 2019-08-15 2024-04-02 中国平安财产保险股份有限公司 User access data processing method, device, computer equipment and storage medium
CN110516155A (en) * 2019-08-29 2019-11-29 深圳市云积分科技有限公司 Marketing strategy generation method and system
CN110717085B (en) * 2019-10-12 2021-08-06 浙江工商大学 Opinion leader identification method based on virtual brand community
CN110717085A (en) * 2019-10-12 2020-01-21 浙江工商大学 Opinion leader identification method based on virtual brand community
CN110991875B (en) * 2019-11-29 2023-09-26 广州市百果园信息技术有限公司 Platform user quality evaluation system
CN110991875A (en) * 2019-11-29 2020-04-10 广州市百果园信息技术有限公司 Platform user quality evaluation system
CN111311331A (en) * 2020-02-26 2020-06-19 北京慧博科技有限公司 RFM analysis method
CN111695819B (en) * 2020-06-16 2023-06-02 中国联合网络通信集团有限公司 Seat personnel scheduling method and device
CN111695819A (en) * 2020-06-16 2020-09-22 中国联合网络通信集团有限公司 Method and device for scheduling seat personnel
CN111738331A (en) * 2020-06-19 2020-10-02 北京同邦卓益科技有限公司 User classification method and device, computer-readable storage medium and electronic device
CN111899036A (en) * 2020-08-03 2020-11-06 上海同儒信息技术有限公司 Client hierarchical classification management system based on big data
CN112070548A (en) * 2020-09-11 2020-12-11 上海风秩科技有限公司 User layering method, device, equipment and storage medium
CN112070548B (en) * 2020-09-11 2024-02-20 上海秒针网络科技有限公司 User layering method, device, equipment and storage medium
CN112594937A (en) * 2020-12-16 2021-04-02 珠海格力电器股份有限公司 Control method and device of water heater, electronic equipment and storage medium
CN112598442A (en) * 2020-12-25 2021-04-02 中国建设银行股份有限公司 Multidimensional operation analysis method and multidimensional operation analysis device for network traffic
TWI778568B (en) * 2021-04-06 2022-09-21 富邦人壽保險股份有限公司 Systems and methods for generating recommendation list
CN113793060A (en) * 2021-09-27 2021-12-14 武汉众邦银行股份有限公司 Customer rating method and device based on customer transaction data and storage medium
CN115879984A (en) * 2023-03-03 2023-03-31 北京一凌宸飞科技有限公司 Network marketing method and system based on big data analysis

Similar Documents

Publication Publication Date Title
CN106022800A (en) User feature data processing method and device
US11157926B2 (en) Digital content prioritization to accelerate hyper-targeting
Chitra et al. Data mining techniques and its applications in banking sector
Yoseph et al. The impact of big data market segmentation using data mining and clustering techniques
US8341101B1 (en) Determining relationships between data items and individuals, and dynamically calculating a metric score based on groups of characteristics
EP2474945A1 (en) Analyzing transactional data
Haenlein A social network analysis of customer-level revenue distribution
CN108415913A (en) Crowd's orientation method based on uncertain neighbours
CN108572988A (en) A kind of house property assessment data creation method and device
Xu et al. Potential buyer identification and purchase likelihood quantification by mining user-generated content on social media
Yuping et al. New methods of customer segmentation and individual credit evaluation based on machine learning
CN112330373A (en) User behavior analysis method and device and computer readable storage medium
Bouzidi et al. LSTM-based automated learning with smart data to improve marketing fraud detection and financial forecasting
KR20100046421A (en) Method and server for estimating preference of commodity
Vikram et al. Data mining tools and techniques: a review
Martins et al. Characterizing sponsored content in Facebook and Instagram
Thorleuchter et al. Using Webcrawling of Publicly Available Websites to Assess E-commerce Relationships
Osaysa Improving the quality of marketing analytics systems
Asmat et al. Data mining framework for the identification of profitable customer based on recency, frequency, monetary (RFM)
Silpa et al. Detection of Fake Online Reviews by using Machine Learning
Gupta et al. Segmentation of retail customers based on cluster analysis in building successful CRM
Hu et al. Research on long tail recommendation algorithm
CN107705135A (en) A kind of method that potential commercial value is evaluated based on company's storage contact data
Ming Application research of customer big data analysis for online shop based on smart cloud platform tools
Iqbal et al. Association rule analysis-based identification of influential users in the social media

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20161012