CN106844427A - Classification method and device for electric vehicle users
- Publication number: CN106844427A
- Application number: CN201611132772.6A
- Authority: CN (China)
- Prior art keywords: variable, user, classified, variables, correlation
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
Abstract
The invention discloses a classification method and device for electric vehicle users. The method includes: obtaining user data to be classified; obtaining a preset number of classification variables from the user data to be classified according to a preset classification condition; and classifying the preset number of classification variables based on a preset classification model to obtain a classification result, where the classification result characterizes the type of the user data. The invention solves the technical problem that prior-art classification methods for electric vehicle users have coarse classification granularity and lead to a poor user experience.
Description
Technical field
The present invention relates to the field of electric vehicles, and in particular to a classification method and device for electric vehicle users.
Background
Beijing was among the first cities in mainland China to promote the development and demonstration operation of electric vehicles. Planning charging facilities rationally, standardizing network operation, and improving the user experience have become the key bottlenecks to overcome. Making full use of the data accumulated from electric vehicle operation to mine, along each dimension, the characteristics of user charging behavior and of charging station operation is a desirable way to break through these bottlenecks.
User classification is the process of dividing the overall user base into several user groups, on the basis of collected and organized user behavior information, according to clear differences in users' demand characteristics, behavioral habits, and similar aspects. Each user group consists of users who are similar in some respect, while users belonging to different groups differ markedly.
In practice, user classification is a prerequisite for all marketing activities. In the era of personalized user experience in particular, scientific techniques are needed to segment user behavior. Behavior segmentation built on large volumes of user behavior data is an indispensable means of scientific customer identification, risk management, and personalized marketing and service that overseas banks adopted long ago, and it belongs to the business intelligence field that is currently very popular in developed countries. Many financial institutions have shifted from a product-centered development strategy to a customer-centered one, and a key step in this shift is to collect enough information to segment customers and to communicate with each customer group in a targeted and effective way.
However, prior-art classification methods for electric vehicle users have coarse classification granularity. They cannot classify electric vehicle users finely and cannot provide more precise services to different users, which results in a poor user experience.
No effective solution has yet been proposed for the problem that prior-art classification methods for electric vehicle users have coarse classification granularity and lead to a poor user experience.
Summary of the invention
Embodiments of the present invention provide a classification method and device for electric vehicle users, to at least solve the technical problem that prior-art classification methods for electric vehicle users have coarse classification granularity and lead to a poor user experience.
According to one aspect of the embodiments of the present invention, a classification method for electric vehicle users is provided, including: obtaining user data to be classified; obtaining a preset number of classification variables from the user data to be classified according to a preset classification condition; and classifying the preset number of classification variables based on a preset classification model to obtain a classification result, where the classification result characterizes the type of the user data.
Further, the classification result includes: stable users, unstable users, value users, non-value users, churned users, and non-churned users.
Further, obtaining a preset number of classification variables from the user data to be classified according to a preset classification condition includes: processing the user data to be classified according to the preset classification condition to obtain multiple variables; determining multiple available variables from the multiple variables; and obtaining the preset number of classification variables from the multiple available variables.
Further, determining multiple available variables from the multiple variables includes: performing correlation analysis between each variable and any other variable to obtain the correlation result of each variable with any other variable, where the correlation result includes at least a correlation coefficient and a test value; judging whether the correlation result of each variable with any other variable meets a preset condition; and, if the correlation results of a first variable with any other variable meet the preset condition, determining that the first variable is an available variable.
Further, judging whether the correlation result of each variable with any other variable meets the preset condition includes: judging whether the correlation coefficient is within a first preset range and whether the test value is within a second preset range; and, if the correlation coefficient is within the first preset range and the test value is within the second preset range, determining that the correlation result of each variable with any other variable meets the preset condition.
Further, after obtaining a preset number of classification variables from the user data to be classified according to a preset classification condition, the method also includes: standardizing the preset number of classification variables to obtain standardized classification variables; and classifying the classification variables in a preset order among the standardized classification variables to obtain the classification result.
Further, after obtaining a preset number of classification variables from the user data to be classified according to a preset classification condition, the method also includes: sorting the standardized classification variables to obtain sorted classification variables; and generating distribution information for the standardized classification variables from the classification variables at preset positions among the sorted classification variables.
According to another aspect of the embodiments of the present invention, a classification device for electric vehicle users is also provided, including: a first acquisition unit, configured to obtain user data to be classified; a second acquisition unit, configured to obtain a preset number of classification variables from the user data to be classified according to a preset classification condition; and a classification unit, configured to classify the preset number of classification variables based on a preset classification model to obtain a classification result, where the classification result characterizes the type of the user data.
Further, the classification result includes: stable users, unstable users, value users, non-value users, churned users, and non-churned users.
Further, the second acquisition unit includes: a processing module, configured to process the user data to be classified according to the preset classification condition to obtain multiple variables; a determining module, configured to determine multiple available variables from the multiple variables; and an obtaining module, configured to obtain the preset number of classification variables from the multiple available variables.
Further, the determining module includes: a processing submodule, configured to perform correlation analysis between each variable and any other variable to obtain the correlation result of each variable with any other variable, where the correlation result includes at least a correlation coefficient and a test value; a judging submodule, configured to judge whether the correlation result of each variable with any other variable meets a preset condition; and a determining submodule, configured to determine that a first variable is an available variable if the correlation results of the first variable with any other variable meet the preset condition.
Further, the judging submodule includes: a judging sub-submodule, configured to judge whether the correlation coefficient is within a first preset range and whether the test value is within a second preset range; and a determining sub-submodule, configured to determine that the correlation result of each variable with any other variable meets the preset condition if the correlation coefficient is within the first preset range and the test value is within the second preset range.
Further, the device also includes: a processing unit, configured to standardize the preset number of classification variables to obtain standardized classification variables. The classification unit is further configured to classify the classification variables in a preset order among the sorted classification variables to obtain the classification result.
Further, the device also includes: a sorting unit, configured to sort the standardized classification variables to obtain sorted classification variables; and a generating unit, configured to generate distribution information for the standardized classification variables from the classification variables at preset positions among the sorted classification variables.
In the embodiments of the present invention, user data to be classified is obtained; a preset number of classification variables is obtained from the user data to be classified according to a preset classification condition; and the preset number of classification variables is classified based on a preset classification model to obtain a classification result, where the classification result characterizes the type of the user data. It is easy to see that classifying with the preset classification model and obtaining a classification result achieves fine-grained classification of electric vehicle users, which solves the technical problem that prior-art classification methods for electric vehicle users have coarse classification granularity and lead to a poor user experience. The solution provided by the above embodiments of the present invention can therefore improve the user experience and reduce the trouble that inconvenient charging card use causes users.
Brief description of the drawings
The drawings described here are provided for further understanding of the present invention and form part of this application. The exemplary embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of a classification method for electric vehicle users according to an embodiment of the present invention; and
Fig. 2 is a schematic diagram of a classification device for electric vehicle users according to an embodiment of the present invention.
Detailed description
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings of the embodiments. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second", and so on in the description, the claims, and the above drawings are used to distinguish similar objects and do not describe a specific order or sequence. Data used in this way can be interchanged where appropriate, so that the embodiments of the present invention described here can be implemented in orders other than those illustrated or described here. In addition, the terms "comprise" and "have" and any variants of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units that are explicitly listed, and may include other steps or units that are not explicitly listed or that are inherent to the process, method, product, or device.
Embodiment 1
According to an embodiment of the present invention, a method embodiment for classifying electric vehicle users is provided. It should be noted that the steps shown in the flowchart of the drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowchart, in some cases the steps shown or described can be executed in a different order.
Fig. 1 is a flowchart of a classification method for electric vehicle users according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S102: obtain user data to be classified.
Specifically, the user data can be usage data of electric vehicle users, and can include the electric vehicle's charging time, charged energy, charging card balance, and so on.
Step S104: obtain a preset number of classification variables from the user data to be classified according to a preset classification condition.
Specifically, the preset classification condition can be the purpose of classifying the electric vehicle users, for example, identifying users who frequently charge with a charging card, or identifying users who have held their card for a long time. The preset number can be the number of classification variables, determined from the classification purpose, that serve as the basis for user classification; for example, it can be 5.
In an optional scheme, after the raw data of electric vehicle users, that is, the user data to be classified, is obtained, 5 classification variables can be selected from the raw data according to the classification purpose as the basis for user classification, so that the electric vehicle users are classified using the 5 selected variables.
Step S106: classify the preset number of classification variables based on a preset classification model to obtain a classification result, where the classification result characterizes the type of the user data.
Optionally, in the above embodiment of the present invention, the classification result includes: stable users, unstable users, value users, non-value users, churned users, and non-churned users.
Specifically, the preset classification model can be a K_means algorithm model. K_means clustering is a commonly used classification method whose most important feature is that the time the algorithm takes to converge is proportional to the number of observations in the data being analyzed. K_means clustering is therefore often used to process large data sets. Among the classification results, "value, non-churned users" are considered the best, whereas "value, churned users" are the outcome to avoid. "Value, non-churned users" need further maintenance: they can be rewarded through schemes such as a points system to increase their loyalty. "Value, churned users" may have been screened out during data preprocessing; if they appear later, the reasons for their churn, such as why the charging card was abandoned or personal reasons, need to be found through visits, surveys, and similar means, so that charging facilities can be maintained strategically.
In an optional scheme, users can be classified by the K_means algorithm model according to the 5 selected classification variables. In the classification results, value users (charging cards) account for 11% and churned users (charging cards) account for 15%. In the full data set, value users (charging cards) are therefore relatively few and are all concentrated among long-term users, because one measure of value is total electricity used, and long-term users accumulate more electricity more easily. Churned users are almost all concentrated among short-term users, which indicates that the behavior of short-term card users is rather unstable. "Non-value users" can be further divided into "long-term non-value users" (9% of the full data) and "short-term non-value users" (80% of the full data). "Short-term non-value users" have probably held their charging cards only briefly, so their value remains to be seen, whereas "long-term non-value users" probably arise because users who hold several charging cards do not use any single card regularly. Improving charging card quality is therefore critical to analyzing user charging behavior in a standardized way.
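The K_means classification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patent's implementation: the function name `kmeans`, the choice of the first k rows as starting centroids, and the iteration cap are all hypothetical, and the source does not say how its model is initialized or how many clusters it uses.

```python
import math


def kmeans(points, k, max_iter=100):
    """Cluster points (lists of floats) into k groups with Lloyd's algorithm.

    Assumption: the first k points seed the centroids so the run is
    deterministic; the source does not describe its initialization.
    """
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Each row passed to `kmeans` would hold one card's classification variables, for example the 5 selected variables after standardization. Every pass touches each observation once per centroid, which is the cost behavior the text refers to when it ties running time to the number of observations.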
According to the above embodiment of the present invention, user data to be classified is obtained; a preset number of classification variables is obtained from the user data to be classified according to a preset classification condition; and the preset number of classification variables is classified based on a preset classification model to obtain a classification result, where the classification result characterizes the type of the user data. It is easy to see that classifying with the preset classification model and obtaining a classification result achieves fine-grained classification of electric vehicle users, which solves the technical problem that prior-art classification methods for electric vehicle users have coarse classification granularity and lead to a poor user experience. The solution provided by the above embodiment of the present invention can therefore improve the user experience and reduce the trouble that inconvenient charging card use causes users.
Optionally, in the above embodiment of the present invention, step S104, obtaining a preset number of classification variables from the user data to be classified according to a preset classification condition, includes:
Step S1042: process the user data to be classified according to the preset classification condition to obtain multiple variables.
Specifically, the multiple variables can be multiple derived variables.
In an optional scheme, the original variables in the user data to be classified can be processed according to the classification purpose to generate multiple derived variables, as shown in Tables 1 to 4.
Table 1: Derived variables
Table 2: Derived variables
Table 3: Derived variables
OBS | Derived variable | Definition |
35 | pub_sum_PQ_rate | pub_sum_PQ/(pub_sum_PQ+tax_sum_PQ) |
36 | pub_tax_sum_PQ_rate | pub_sum_PQ/tax_sum_PQ |
37 | pub_tax_mean_PQ_rate | tax_mean_PQ/pub_mean_PQ |
38 | pub_tax_max_PQ_rate | tax_max_PQ/pub_max_PQ |
39 | pub_tax_mean_time_rate | tax_mean_time_rate/pub_mean_time_rate |
40 | pub_tax_max_time_rate | tax_max_time_rate/pub_max_time_rate |
41 | pub_tax_min_time_rate | tax_min_time_rate/pub_min_time_rate |
42 | tax_freq_rate | tax_freq/(pub_freq+tax_freq) |
43 | pub_freq_rate | pub_freq/(pub_freq+tax_freq) |
44 | pub_tax_freq_rate | pub_freq/tax_freq |
45 | AC_sum_PQ_rate | AC_sum_PQ/(AC_sum_PQ+DC_sum_PQ) |
46 | DC_sum_PQ_rate | DC_sum_PQ/(AC_sum_PQ+DC_sum_PQ) |
47 | AC_DC_sum_PQ_rate | AC_sum_PQ/DC_sum_PQ |
48 | AC_DC_mean_PQ_rate | AC_mean_PQ/DC_mean_PQ |
49 | AC_DC_max_PQ_rate | AC_max_PQ/DC_max_PQ |
50 | AC_DC_mean_time_rate | AC_mean_time_rate/DC_mean_time_rate |
51 | AC_DC_max_time_rate | AC_max_time_rate/DC_max_time_rate |
52 | AC_DC_min_time_rate | AC_min_time_rate/DC_min_time_rate |
53 | AC_freq_rate | AC_freq/(AC_freq+DC_freq) |
54 | DC_freq_rate | DC_freq/(AC_freq+DC_freq) |
55 | AC_DC_freq_rate | AC_freq/DC_freq |
56 | min_recently | min(AC_min_recently,DC_min_recently) |
57 | max_recently | min(AC_max_recently,DC_max_recently) |
Table 4: Derived variables
OBS | Derived variable | Definition |
58 | card_month | max_recently-min_recently |
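As a rough illustration of how derived variables like those in Table 3 could be computed for one charging card, the following Python sketch builds a handful of the listed ratios from raw totals. The function names, the dictionary layout of the input, and the choice to return `None` when a denominator is zero are assumptions made for this example; the source does not describe how it handles those cases.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is 0.

    Assumption: the source does not say how zero denominators are treated.
    """
    return numerator / denominator if denominator else None


def derive_variables(raw):
    """Compute a few of the Table 3 derived variables for one card.

    `raw` is a hypothetical dict holding the original per-card variables.
    """
    ac, dc = raw["AC_sum_PQ"], raw["DC_sum_PQ"]
    ac_n, dc_n = raw["AC_freq"], raw["DC_freq"]
    return {
        "AC_sum_PQ_rate": safe_ratio(ac, ac + dc),
        "DC_sum_PQ_rate": safe_ratio(dc, ac + dc),
        "AC_DC_sum_PQ_rate": safe_ratio(ac, dc),
        "AC_freq_rate": safe_ratio(ac_n, ac_n + dc_n),
        "DC_freq_rate": safe_ratio(dc_n, ac_n + dc_n),
        "min_recently": min(raw["AC_min_recently"], raw["DC_min_recently"]),
    }
```

Keeping each definition as a single expression mirrors the table layout, so a new row in the table maps to one new dictionary entry.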
Step S1044: determine multiple available variables from the multiple variables.
Specifically, an available variable can be a variable that is not correlated with the others.
In an optional scheme, a correlation test can be applied to the derived variables to identify the uncorrelated available variables, which avoids choosing classification variables that are correlated with one another.
Step S1046: obtain the preset number of classification variables from the multiple available variables.
In an optional scheme, 5 of the uncorrelated available variables can be selected as the classification variables on which user classification is based, for example: sum_PURCHASE_PQ (total electricity used by the user); mean_use_time_rate (average use/occupancy); min_recently (time from the most recent charge to the cutoff date); card_balance_mean (average card balance); and card_month (total time since the card was opened).
Optionally, in the above embodiment of the present invention, step S1044, determining multiple available variables from the multiple variables, includes:
Step S122: perform correlation analysis between each variable and any other variable to obtain the correlation result of each variable with any other variable, where the correlation result includes at least a correlation coefficient and a test value.
Specifically, a correlation test is a hypothesis test of whether a correlation exists between two variables. In this hypothesis test, ρ is the correlation coefficient parameter and P is the test value.
Step S124: judge whether the correlation result of each variable with any other variable meets a preset condition.
Specifically, the preset condition can be a condition, set in advance by the user according to the classification purpose, for determining that two derived variables are uncorrelated; it includes a condition on the correlation coefficient and a condition on the test value.
Step S126: if the correlation results of a first variable with any other variable meet the preset condition, determine that the first variable is an available variable.
In an optional scheme, correlation analysis can be performed between one derived variable and each of the other derived variables. The correlation coefficient and test value between that derived variable and each other derived variable are calculated by the correlation test, and the calculated values are compared against preset values to determine whether the derived variable is correlated with each of the others. If the derived variable is uncorrelated with every other derived variable, it can be used as an available variable.
Optionally, in the above embodiment of the present invention, step S124, judging whether the correlation result of each variable with any other variable meets the preset condition, includes:
Step S1242: judge whether the correlation coefficient is within the first preset range and whether the test value is within the second preset range.
Specifically, the first preset range can be 0, and the second preset range can be values greater than or equal to 0.05.
In an optional scheme, the null hypothesis H0 and the alternative hypothesis H1 of the correlation test are H0: ρ = 0 and H1: ρ ≠ 0. That is, ρ = 0 indicates that two derived variables are uncorrelated, and ρ ≠ 0 indicates that they are correlated. Generally, P < 0.05 indicates that the linear relationship between the two derived variables is significant; however, the size of the P value does not indicate the strength of the correlation, and the P value is affected by the sample size.
Step S1244: if the correlation coefficient is within the first preset range and the test value is within the second preset range, determine that the correlation result of each variable with any other variable meets the preset condition.
In an optional scheme, if ρ ≠ 0 and P < 0.05 for a derived variable and another derived variable, the two derived variables can be determined to be correlated, and significantly so. If ρ = 0 and P ≥ 0.05 for a derived variable and every other derived variable, the derived variable is determined to be uncorrelated with every other derived variable and can be used as an available variable.
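The screening in steps S122 to S126 can be sketched in Python as follows. This is a hedged illustration, not the patent's procedure: the function names are hypothetical, the P value comes from a Fisher z-transform normal approximation rather than an exact test, and the sketch reads "ρ = 0 and P ≥ 0.05" as "the test does not reject H0: ρ = 0 at the 0.05 level", because a sample coefficient is rarely exactly zero. Columns are assumed to be non-constant and to hold more than 3 observations.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def p_value(r, n):
    """Approximate two-sided P value for H0: rho = 0.

    Assumption: the Fisher z-transform with a normal approximation stands in
    for whatever exact test the source uses.
    """
    if abs(r) >= 1.0:
        return 0.0
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def available_variables(columns, alpha=0.05):
    """Keep each variable whose test against every other one gives P >= alpha."""
    names = list(columns)
    keep = []
    for a in names:
        others = [b for b in names if b != a]
        if all(
            p_value(pearson_r(columns[a], columns[b]), len(columns[a])) >= alpha
            for b in others
        ):
            keep.append(a)
    return keep
```

Here `columns` maps each derived variable name to its list of per-card values. A variable is dropped as soon as it is significantly correlated with any other variable, which matches the requirement that an available variable be uncorrelated with every other derived variable.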
Optionally, in the above embodiment of the present invention, after step S104, obtaining a preset number of classification variables from the user data to be classified according to a preset classification condition, the method further includes:
Step S108: standardize the preset number of classification variables to obtain standardized classification variables.
In an optional scheme, to avoid large differences between the variances of the classification variables, the classification variables can be standardized. The standardization procedure in SAS (Statistical Analysis System) is as follows:
proc fastclus data=<data set>;
var <variables>;
run;
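For readers without SAS, the standardization in step S108 can be illustrated in Python. This is a minimal sketch that assumes z-score scaling with the population standard deviation; the source does not state which standardization formula it applies, and the function name `standardize` is hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: (value - mean) / standard deviation.

    Assumption: z-score scaling is used; the source names no formula.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant column: no spread to scale
    return [(v - mu) / sigma for v in values]
```

Applying `standardize` to each classification variable separately puts them on a common scale, so a variable measured in large units, such as total electricity, does not dominate the distance calculation in the later clustering step.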
Step S110: classify the classification variables in a preset order among the standardized classification variables to obtain the classification result.
In an optional scheme, after the 5 classification variables are standardized, users can be classified by the K_means algorithm model according to the 5 standardized classification variables, which improves the accuracy of user classification.
Optionally, in the above embodiment of the present invention, after step S104, obtaining a preset number of classification variables from the user data to be classified according to a preset classification condition, the method further includes:
Step S112: sort the standardized classification variables to obtain sorted classification variables.
Step S114: generate distribution information for the standardized classification variables from the classification variables at preset positions among the sorted classification variables.
Specifically, the preset positions can be the 1%, 25%, 50%, 75%, 90%, and 99% positions set in advance.
In an optional scheme, the distribution of the classification variables to be classified can be examined: the standardized values of each classification variable are arranged in ascending order, and the values at the 1%, 25%, 50%, 75%, 90%, and 99% positions are listed one by one. The distribution of these values gives a better understanding of the classification variables and helps establish the basis for classification.
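Steps S112 and S114 amount to reading values off a sorted list at fixed percentage positions, which can be sketched in Python as below. The function name `distribution_summary` and the nearest-rank rule for choosing a position are assumptions; the source lists the positions but not the percentile convention.

```python
def distribution_summary(values, positions=(1, 25, 50, 75, 90, 99)):
    """Return the value at each percentage position of the ascending data.

    Assumption: the nearest-rank rule selects the position. Positions are
    whole-number percentages so the rank is computed with integer arithmetic.
    """
    ordered = sorted(values)
    n = len(ordered)
    summary = {}
    for pct in positions:
        rank = max(1, (pct * n + 99) // 100)  # 1-based ceiling of pct% of n
        summary[pct] = ordered[rank - 1]
    return summary
```

Calling the function once per standardized classification variable yields a small table of six values per variable, which is enough to spot skew or extreme tails before settling on the basis for classification.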
Embodiment 2
According to an embodiment of the present invention, a device embodiment for classifying electric vehicle users is provided.
Fig. 2 is a schematic diagram of a classification device for electric vehicle users according to an embodiment of the present invention. As shown in Fig. 2, the device includes:
A first acquisition unit 21, configured to obtain user data to be classified.
Specifically, the user data can be usage data of electric vehicle users, and can include the electric vehicle's charging time, charged energy, charging card balance, and so on.
A second acquisition unit 23, configured to obtain a preset number of classification variables from the user data to be classified according to a preset classification condition.
Specifically, the preset classification condition can be the purpose of classifying the electric vehicle users, for example, identifying users who frequently charge with a charging card, or identifying users who have held their card for a long time. The preset number can be the number of classification variables, determined from the classification purpose, that serve as the basis for user classification; for example, it can be 5.
In an optional scheme, after the raw data of electric vehicle users, that is, the user data to be classified, is obtained, 5 classification variables can be selected from the raw data according to the classification purpose as the basis for user classification, so that the electric vehicle users are classified using the 5 selected variables.
A classification unit 25, configured to classify the preset number of classification variables based on a preset classification model to obtain a classification result, where the classification result characterizes the type of the user data.
Optionally, in the above embodiment of the present invention, the classification result includes: stable users, unstable users, value users, non-value users, churned users, and non-churned users.
Specifically, the preset classification model can be a K_means algorithm model. K_means clustering is a commonly used classification method whose most important feature is that the time the algorithm takes to converge is proportional to the number of observations in the data being analyzed. K_means clustering is therefore often used to process large data sets. Among the classification results, "value, non-churned users" are considered the best, whereas "value, churned users" are the outcome to avoid. "Value, non-churned users" need further maintenance: they can be rewarded through schemes such as a points system to increase their loyalty. "Value, churned users" may have been screened out during data preprocessing; if they appear later, the reasons for their churn, such as why the charging card was abandoned or personal reasons, need to be found through visits, surveys, and similar means, so that charging facilities can be maintained strategically.
In an optional scheme, the users can be classified by the K-means algorithm model according to the 5 selected classification variables. In the classification results, value-type users (charging cards) account for 11% and churned users (charging cards) account for 15%. It can be seen that, in the full data set, value-type users (charging cards) are few and are all concentrated among long-term users; this is because one of the measures for judging the value type is total electricity used, and long-term users accumulate more electricity more easily. Churned users are almost all concentrated among short-term users, which indicates that the behavior of short-term card users is rather unstable. "Non-value-type users" can be further divided into "long-term non-value-type users" (9% of the full data) and "short-term non-value-type users" (80% of the full data). For "short-term non-value-type users", the card may have been in use only briefly, so its value remains to be seen; "long-term non-value-type users" may result from a user holding several charging cards and using any single card irregularly. It can thus be seen that improving charging card quality is of great importance for standardizing the analysis of user charging behavior.
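By way of illustration only, the clustering step described above could be sketched as follows. This is a minimal K-means (Lloyd's algorithm) written for this example, not the implementation used in the embodiment; the function name `kmeans`, the two-variable rows, and the choice of two clusters are all assumptions made for brevity, and the sample values are made up.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: returns (labels, centroids) for a list of numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # k distinct rows as starting centroids
    labels = None
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:        # assignments stopped changing: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:                 # an empty cluster keeps its old centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Made-up, already standardized rows: (total electricity used, card age) per card.
users = [(-1.0, -0.9), (-1.1, -1.0), (-0.9, -1.1), (1.0, 1.1), (1.1, 0.9), (0.9, 1.0)]
labels, centroids = kmeans(users, k=2)
```

Each pass over the data costs time proportional to the number of rows, which is consistent with the remark that K-means suits large data sets. Interpreting a cluster as "value-type" or "churned" still has to be done by inspecting its centroid.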
According to the above embodiment of the present invention, the first acquisition unit obtains the user data to be classified; the second acquisition unit obtains a preset number of classification variables from that data according to the preset classification condition; and the classification unit classifies the preset number of classification variables based on the preset classification model to obtain a classification result, where the classification result characterizes the type of the user data. It is readily seen that classification can be performed based on the preset classification model to obtain a classification result, thereby achieving fine-grained classification of electric vehicle users and solving the technical problem that prior-art classification methods for electric vehicle users have coarse classification granularity and give a poor user experience. The scheme provided by the above embodiment can therefore improve the user experience and reduce the trouble caused to users by inconvenient charging card use.
Optionally, in the above embodiment of the present invention, the second acquisition unit includes:
A processing module, configured to process the user data to be classified according to the preset classification condition to obtain multiple variables.
Specifically, the multiple variables may be multiple derived variables.
In an optional scheme, the original variables in the user data to be classified can be processed according to the classification purpose to generate multiple derived variables, as shown in Tables 1 to 4.
A determining module, configured to determine multiple available variables from the multiple variables.
Specifically, the available variables may be uncorrelated variables.
In an optional scheme, a correlation test can be run on the multiple derived variables by a correlation test analyzer to determine the uncorrelated available variables, so that the selected classification variables are not correlated with one another.
An acquisition module, configured to obtain the preset number of classification variables from the multiple available variables.
In an optional scheme, 5 of the uncorrelated available variables can be selected as the classification variables on which user classification is based, for example: sum_PURCHASE_PQ (total electricity used by the user); Mean_use_time_rate (average usage/occupancy); Min_recently (time from the last charge to the cutoff date); card_balance_Mean (average balance on the card); Card_month (total time since the card was opened).
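By way of illustration only, derived variables of the kind listed above might be computed from raw charging records as sketched below. The record layout (keys `date`, `purchase_pq`, `use_time_rate`, `balance`), the function name `derive_variables`, the use of days for Min_recently, and the use of calendar months for Card_month are all assumptions of this sketch; the text does not specify them, and the sample records are made up.

```python
from datetime import date

def derive_variables(records, opened_on, cutoff):
    """Aggregate one card's charging records into the five example variables.

    `records` is a list of dicts with hypothetical keys 'date', 'purchase_pq',
    'use_time_rate' and 'balance'; these field names are assumptions.
    """
    n = len(records)
    last_charge = max(r["date"] for r in records)
    return {
        "sum_PURCHASE_PQ": sum(r["purchase_pq"] for r in records),
        "Mean_use_time_rate": sum(r["use_time_rate"] for r in records) / n,
        "Min_recently": (cutoff - last_charge).days,   # days since the last charge
        "card_balance_Mean": sum(r["balance"] for r in records) / n,
        # Whole calendar months between card opening and the cutoff date.
        "Card_month": (cutoff.year - opened_on.year) * 12 + (cutoff.month - opened_on.month),
    }

records = [
    {"date": date(2016, 3, 1), "purchase_pq": 20.0, "use_time_rate": 0.5, "balance": 100.0},
    {"date": date(2016, 5, 15), "purchase_pq": 30.0, "use_time_rate": 0.7, "balance": 60.0},
]
derived = derive_variables(records, opened_on=date(2016, 1, 10), cutoff=date(2016, 6, 30))
```

One such dictionary per card gives one row of the table that is later standardized and clustered.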
Optionally, in the above embodiment of the present invention, the determining module includes:
A processing submodule, configured to perform correlation analysis between each variable and any other variable to obtain a correlation result between each variable and the other variable, where the correlation result includes at least a correlation coefficient and a test value.
Specifically, a correlation test is a hypothesis test that checks whether a correlation exists between two variables. In this hypothesis test, ρ is the correlation coefficient parameter and P is the test value.
A judging submodule, configured to judge whether the correlation result between each variable and any other variable satisfies a preset condition.
Specifically, the preset condition may be a condition, set in advance by the user according to the classification purpose, for determining that two derived variables are uncorrelated; it includes a condition on the correlation coefficient and a condition on the test value.
A determining submodule, configured to determine that a first variable is an available variable if the correlation results between the first variable and any other variable satisfy the preset condition.
In an optional scheme, correlation analysis can be performed between each derived variable and every other derived variable. The correlation coefficient and the test value between a derived variable and each of the other derived variables can be calculated by correlation test analysis; the calculated correlation coefficient and test value are then compared against preset values to determine whether the derived variable is correlated with each of the others. If the derived variable is uncorrelated with every other derived variable, it can be used as an available variable.
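By way of illustration only, the pairwise calculation described above could be sketched as follows. The sketch assumes the Pearson coefficient as the correlation measure and approximates the two-sided P value with the Fisher z-transformation and a normal distribution; neither choice is stated in the text. The names `pearson_test`, `pairwise_tests`, `v1` to `v3`, and the sample values are made up for the example. Each column needs more than three observations and must not be constant.

```python
import math
from itertools import combinations

def pearson_test(x, y):
    """Return (r, p): sample Pearson coefficient and an approximate two-sided P value."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    if abs(r) >= 1.0:                       # perfectly linear: atanh would be undefined
        return r, 0.0
    z = math.atanh(r) * math.sqrt(n - 3)    # Fisher z; roughly N(0, 1) when rho = 0
    return r, math.erfc(abs(z) / math.sqrt(2))

def pairwise_tests(columns):
    """Run the test once for every unordered pair of named columns."""
    return {(a, b): pearson_test(columns[a], columns[b])
            for a, b in combinations(columns, 2)}

columns = {
    "v1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "v2": [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 20.2],  # tracks v1 closely
    "v3": [5, 3, 8, 1, 9, 2, 7, 4, 6, 5],                             # no clear trend
}
results = pairwise_tests(columns)
```

Here `results[("v1", "v2")]` carries a coefficient close to 1 with a very small P value, while `results[("v1", "v3")]` carries a coefficient near 0 with a large P value.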
Optionally, in the above embodiment of the present invention, the judging submodule includes:
A judging sub-submodule, configured to judge whether the correlation coefficient is within a first preset range and whether the test value is within a second preset range.
Specifically, the first preset range may be 0, and the second preset range may be the range greater than or equal to 0.05.
In an optional scheme, the null hypothesis H0 and the alternative hypothesis H1 of the correlation test are, respectively, H0: ρ = 0 and H1: ρ ≠ 0; that is, ρ = 0 means that the two derived variables are uncorrelated, and ρ ≠ 0 means that they are correlated. Generally, when P < 0.05, the linear relationship between the two derived variables is significant; however, the size of the P value does not indicate the strength of the correlation, and the P value is affected by the sample size.
A determining sub-submodule, configured to determine that the correlation result between each variable and any other variable satisfies the preset condition if the correlation coefficient is within the first preset range and the test value is within the second preset range.
In an optional scheme, if ρ ≠ 0 and P < 0.05 for one derived variable and another, the two derived variables can be determined to be correlated, and significantly so; if ρ = 0 and P ≥ 0.05 between a derived variable and each of the other derived variables, the derived variable is determined to be uncorrelated with each of the others and can be used as an available variable.
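By way of illustration only, the decision rule above could be applied to precomputed pairwise results as sketched below. Because a coefficient calculated from a sample is almost never exactly 0, the sketch replaces "coefficient equal to 0" with a small tolerance `r_tol`; that tolerance, the function name `available_variables`, the variable names, and the (coefficient, P value) pairs are all assumptions made for the example.

```python
def available_variables(names, results, r_tol=0.1, alpha=0.05):
    """Keep variables whose correlation with every other variable meets the preset condition.

    `results[(a, b)]` holds a precomputed (coefficient, P value) pair for a and b.
    `r_tol` stands in for "coefficient equal to 0" and is an assumption of this sketch.
    """
    def uncorrelated(a, b):
        key = (a, b) if (a, b) in results else (b, a)
        r, p = results[key]
        return abs(r) <= r_tol and p >= alpha

    return [a for a in names if all(uncorrelated(a, b) for b in names if b != a)]

names = ["v1", "v2", "v3", "v4"]
results = {                      # made-up (coefficient, P value) pairs
    ("v1", "v2"): (0.92, 0.001),  # significantly correlated
    ("v1", "v3"): (0.03, 0.80),
    ("v1", "v4"): (0.05, 0.70),
    ("v2", "v3"): (0.02, 0.90),
    ("v2", "v4"): (-0.04, 0.75),
    ("v3", "v4"): (0.01, 0.95),
}
selected = available_variables(names, results)
```

Under this rule both members of a correlated pair are excluded, since a variable qualifies only if it is uncorrelated with each of the others; in the example only `v3` and `v4` remain.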
Optionally, in the above embodiment of the present invention, the device further includes:
A processing unit, configured to standardize the preset number of classification variables to obtain standardized classification variables.
In an optional scheme, to avoid large differences between the variances of the classification variables, the classification variables can be standardized. The procedure in SAS (Statistical Analysis System) is given as follows. Note that PROC FASTCLUS is the SAS procedure for K-means-style clustering; in SAS, the standardization itself is typically carried out beforehand with a procedure such as PROC STANDARD or PROC STDIZE.
proc fastclus data=<data set>;
    var <variables>;
run;
The classification unit is further configured to classify the preset number of standardized classification variables to obtain the classification result.
In an optional scheme, after the 5 classification variables have been standardized, the users can be classified by the K-means algorithm model according to the 5 standardized classification variables, which improves the accuracy of user classification.
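By way of illustration only, standardization before clustering could be sketched as follows. The sketch assumes z-score standardization (subtract the mean, divide by the sample standard deviation), which the text does not specify; the function name `standardize` and the sample columns are made up.

```python
import statistics

def standardize(column):
    """Z-score a column: subtract its mean, divide by its sample standard deviation."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)
    return [(v - mean) / sd for v in column]

# Two made-up columns on very different scales.
energy_kwh = [100.0, 500.0, 900.0]
card_months = [1.0, 2.0, 3.0]
z_energy = standardize(energy_kwh)
z_months = standardize(card_months)
```

After standardization both columns take the values -1, 0 and 1, so neither dominates the distance calculation in K-means merely because of its unit or range.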
Optionally, in the above embodiment of the present invention, the device further includes:
A sorting unit, configured to sort the standardized classification variables to obtain sorted classification variables.
A generating unit, configured to generate distribution information of the standardized classification variables according to the classification variables at preset positions in the sorted classification variables.
Specifically, the preset positions may be the preset 1%, 25%, 50%, 75%, 90%, and 99% positions.
In an optional scheme, the distribution of the classification variables to be used for classification can be examined; that is, with the standardized classification variables sorted in ascending order, the values at the 1%, 25%, 50%, 75%, 90%, and 99% positions are listed one by one. The distribution of these values gives a better understanding of the classification variables and helps in choosing the classification basis.
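By way of illustration only, reading off the values at the preset positions could be sketched as follows. The sketch assumes the nearest-rank rule for locating a percentage position in the sorted list; the text only names the positions, so the rule, the function name `values_at_positions`, and the sample data are assumptions.

```python
def values_at_positions(values, positions=(1, 25, 50, 75, 90, 99)):
    """Sort ascending and read off the value at each percentage position (nearest rank)."""
    ordered = sorted(values)
    n = len(ordered)
    summary = {}
    for pct in positions:
        rank = max(1, (pct * n + 99) // 100)   # integer ceiling of pct * n / 100, 1-based
        summary[pct] = ordered[rank - 1]
    return summary

# Made-up column: the integers 100 down to 1, deliberately unsorted.
distribution = values_at_positions(list(range(100, 0, -1)))
```

With 100 observations, the value at each percentage position is simply the observation of the same rank, so the summary lists 1, 25, 50, 75, 90 and 99.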
The above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between units or modules through some interfaces, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the scheme of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and is sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The storage medium includes various media that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements shall also be regarded as falling within the protection scope of the present invention.
Claims (14)
1. A classification method for electric vehicle users, characterized by comprising:
obtaining user data to be classified;
obtaining a preset number of classification variables from the user data to be classified according to a preset classification condition;
classifying the preset number of classification variables based on a preset classification model to obtain a classification result, wherein the classification result characterizes the type of the user data.
2. The method according to claim 1, characterized in that the classification result includes: stable users, unstable users, value-type users, non-value-type users, churned users, and non-churned users.
3. The method according to claim 2, characterized in that obtaining a preset number of classification variables from the user data to be classified according to a preset classification condition comprises:
processing the user data to be classified according to the preset classification condition to obtain multiple variables;
determining multiple available variables from the multiple variables;
obtaining the preset number of classification variables from the multiple available variables.
4. The method according to claim 3, characterized in that determining multiple available variables from the multiple variables comprises:
performing correlation analysis between each variable and any other variable to obtain a correlation result between each variable and the other variable, wherein the correlation result includes at least a correlation coefficient and a test value;
judging whether the correlation result between each variable and any other variable satisfies a preset condition;
if the correlation results between a first variable and any other variable satisfy the preset condition, determining that the first variable is an available variable.
5. The method according to claim 4, characterized in that judging whether the correlation result between each variable and any other variable satisfies the preset condition comprises:
judging whether the correlation coefficient is within a first preset range and whether the test value is within a second preset range;
if the correlation coefficient is within the first preset range and the test value is within the second preset range, determining that the correlation result between each variable and any other variable satisfies the preset condition.
6. The method according to any one of claims 1 to 5, characterized in that, after obtaining the preset number of classification variables from the user data to be classified according to the preset classification condition, the method further comprises:
standardizing the preset number of classification variables to obtain standardized classification variables;
classifying the standardized classification variables to obtain the classification result.
7. The method according to claim 6, characterized in that, after obtaining the preset number of classification variables from the user data to be classified according to the preset classification condition, the method further comprises:
sorting the standardized classification variables to obtain sorted classification variables;
generating distribution information of the standardized classification variables according to the classification variables at preset positions in the sorted classification variables.
8. A classification device for electric vehicle users, characterized by comprising:
a first acquisition unit, configured to obtain user data to be classified;
a second acquisition unit, configured to obtain a preset number of classification variables from the user data to be classified according to a preset classification condition;
a classification unit, configured to classify the preset number of classification variables based on a preset classification model to obtain a classification result, wherein the classification result characterizes the type of the user data.
9. The device according to claim 8, characterized in that the classification result includes: stable users, unstable users, value-type users, non-value-type users, churned users, and non-churned users.
10. The device according to claim 9, characterized in that the second acquisition unit includes:
a processing module, configured to process the user data to be classified according to the preset classification condition to obtain multiple variables;
a determining module, configured to determine multiple available variables from the multiple variables;
an acquisition module, configured to obtain the preset number of classification variables from the multiple available variables.
11. The device according to claim 10, characterized in that the determining module includes:
a processing submodule, configured to perform correlation analysis between each variable and any other variable to obtain a correlation result between each variable and the other variable, wherein the correlation result includes at least a correlation coefficient and a test value;
a judging submodule, configured to judge whether the correlation result between each variable and any other variable satisfies a preset condition;
a determining submodule, configured to determine that a first variable is an available variable if the correlation results between the first variable and any other variable satisfy the preset condition.
12. The device according to claim 11, characterized in that the judging submodule includes:
a judging sub-submodule, configured to judge whether the correlation coefficient is within a first preset range and whether the test value is within a second preset range;
a determining sub-submodule, configured to determine that the correlation result between each variable and any other variable satisfies the preset condition if the correlation coefficient is within the first preset range and the test value is within the second preset range.
13. The device according to any one of claims 8 to 12, characterized in that the device further includes:
a processing unit, configured to standardize the preset number of classification variables to obtain standardized classification variables;
the classification unit being further configured to classify the standardized classification variables to obtain the classification result.
14. The device according to claim 13, characterized in that the device further includes:
a sorting unit, configured to sort the standardized classification variables to obtain sorted classification variables;
a generating unit, configured to generate distribution information of the standardized classification variables according to the classification variables at preset positions in the sorted classification variables.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611132772.6A CN106844427A (en) | 2016-12-09 | 2016-12-09 | The sorting technique and device of automobile user |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611132772.6A CN106844427A (en) | 2016-12-09 | 2016-12-09 | The sorting technique and device of automobile user |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106844427A (en) | 2017-06-13 |
Family
ID=59139848
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611132772.6A Pending CN106844427A (en) | 2016-12-09 | 2016-12-09 | The sorting technique and device of automobile user |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106844427A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109657705A (en) * | 2018-12-03 | 2019-04-19 | 国网天津市电力公司电力科学研究院 | A kind of automobile user clustering method and device based on random forests algorithm |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101587349A (en) * | 2008-05-22 | 2009-11-25 | 上海宝信软件股份有限公司 | The using standard classified variable is realized the method for quality analysis |
CN105825232A (en) * | 2016-03-15 | 2016-08-03 | 国网北京市电力公司 | Classification method and device for electromobile users |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Rahman | Service quality, corporate image and customer’s satisfaction towards customers perception: an exploratory study on telecom customers in Bangladesh | |
CN107066616A (en) | Method, device and electronic equipment for account processing | |
CN105490823B (en) | data processing method and device | |
Bose et al. | Exploring business opportunities from mobile services data of customers: An inter-cluster analysis approach | |
CN112559900B (en) | Product recommendation method and device, computer equipment and storage medium | |
CN112308462A (en) | Power consumer classification method and device | |
CN105825232A (en) | Classification method and device for electromobile users | |
CN109376766A (en) | A kind of portrait prediction classification method, device and equipment | |
CN107358456A (en) | Data show method and apparatus | |
CN111062806B (en) | Personal finance credit risk evaluation method, system and storage medium | |
CN107358360A (en) | The abnormal traffic data screening method of anti money washing system | |
CN107194815B (en) | Client segmentation method and system | |
CN111639102A (en) | Client data resource sharing method and device and electronic equipment | |
CN110019774A (en) | Label distribution method, device, storage medium and electronic device | |
CN106844427A (en) | The sorting technique and device of automobile user | |
CN110046951A (en) | A kind of trading activity judgment method and system | |
CN107403263B (en) | Method for identifying electricity consumption demand of large-power customer | |
Apparao et al. | Financial statement fraud detection by data mining | |
CN106952111A (en) | Personalized recommendation method and device | |
CN115689708A (en) | Screening method, risk assessment method, device, equipment and medium of training data | |
Biscarri et al. | A Mining Framework to Detect Non-technical Losses in Power Utilities. | |
Diwandari et al. | Analysis of customer purchase behavior using association rules in e-shop | |
CN107563599A (en) | Patent valve estimating system based on big data | |
Vachane | Online Products Fake Reviews Detection System Using Machine Learning | |
CN112907308A (en) | Data detection method and device and computer readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170613 |