CN106844427A - Classification method and device for electric vehicle users - Google Patents

Publication number
CN106844427A
CN106844427A CN201611132772.6A CN201611132772A CN106844427A CN 106844427 A CN106844427 A CN 106844427A CN 201611132772 A CN201611132772 A CN 201611132772A CN 106844427 A CN106844427 A CN 106844427A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611132772.6A
Other languages
Chinese (zh)
Inventor
李香龙
潘鸣宇
孙舟
王伟贤
田贺平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Beijing Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Beijing Electric Power Co Ltd
Application filed by State Grid Corp of China SGCC and State Grid Beijing Electric Power Co Ltd
Priority to CN201611132772.6A
Publication of CN106844427A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification

Abstract

The invention discloses a classification method and device for electric vehicle users. The method includes: obtaining user data to be classified; obtaining a preset number of classification variables from the user data to be classified according to a preset classification condition; and classifying the preset number of classification variables based on a preset classification model to obtain a classification result, where the classification result characterizes the type of the user data. The invention solves the technical problem that prior-art classification methods for electric vehicle users have coarse classification granularity and result in poor user experience.

Description

Classification method and device for electric vehicle users
Technical field
The present invention relates to the field of electric vehicles, and in particular to a classification method and device for electric vehicle users.
Background art
Beijing is among the first cities in mainland China to promote electric vehicle development and demonstration operation. Rationally planning charging facilities, standardizing network operation, and improving user experience have become bottlenecks in need of a key breakthrough. Making full use of the data accumulated from electric vehicle operation, and mining information such as the characteristics of users' charging behavior and the operating characteristics of charging stations from multiple dimensions, has become a necessary technique for breaking through these bottlenecks.
User classification is the process of dividing the overall user base into several user groups according to clear differences in users' demand characteristics, behavioral habits, and similar aspects, on the basis of collecting and organizing user behavior information. Each user group thus consists of users who are similar in some respect, while users belonging to different groups differ markedly.
In practice, user classification is the premise of all marketing activities. Especially in the era of personalized user experience, scientific techniques are needed to segment user behavior. User behavior segmentation based on large volumes of user behavior data is an indispensable means of scientific customer identification, risk management, personalized marketing, and service that overseas banks adopted long ago, and it belongs to the business intelligence field that is currently very popular in developed countries. Many financial institutions have shifted from a product-centered development strategy to a customer-centered one. A key step in this strategic shift is to collect enough information to segment customers and to communicate with different customer groups in a targeted and effective way.
However, prior-art classification methods for electric vehicle users have coarse classification granularity. They cannot classify electric vehicle users finely and cannot provide more precise services for different users, which results in poor user experience.
No effective solution has yet been proposed for the problem that prior-art classification methods for electric vehicle users have coarse classification granularity and result in poor user experience.
Summary of the invention
Embodiments of the present invention provide a classification method and device for electric vehicle users, to at least solve the technical problem that prior-art classification methods for electric vehicle users have coarse classification granularity and result in poor user experience.
According to one aspect of the embodiments of the present invention, a classification method for electric vehicle users is provided, including: obtaining user data to be classified; obtaining a preset number of classification variables from the user data to be classified according to a preset classification condition; and classifying the preset number of classification variables based on a preset classification model to obtain a classification result, where the classification result characterizes the type of the user data.
Further, the classification results include: stable users, unstable users, value-type users, non-value-type users, churned users, and non-churned users.
Further, obtaining the preset number of classification variables from the user data to be classified according to the preset classification condition includes: processing the user data to be classified according to the preset classification condition to obtain multiple variables; determining multiple available variables from the multiple variables; and obtaining the preset number of classification variables from the multiple available variables.
Further, determining multiple available variables from the multiple variables includes: performing correlation analysis between each variable and any other variable to obtain a correlation result for each variable and any other variable, where the correlation result includes at least a correlation coefficient and a test value; judging whether the correlation result of each variable and any other variable satisfies a preset condition; and if the correlation result of a first variable and any other variable satisfies the preset condition, determining that the first variable is an available variable.
Further, judging whether the correlation result of each variable and any other variable satisfies the preset condition includes: judging whether the correlation coefficient is in a first preset range and whether the test value is in a second preset range; and if the correlation coefficient is in the first preset range and the test value is in the second preset range, determining that the correlation result of each variable and any other variable satisfies the preset condition.
Further, after obtaining the preset number of classification variables from the user data to be classified according to the preset classification condition, the method also includes: standardizing the preset number of classification variables to obtain standardized classification variables; and classifying the standardized classification variables in a preset order to obtain the classification result.
Further, after obtaining the preset number of classification variables from the user data to be classified according to the preset classification condition, the method also includes: sorting the standardized classification variables to obtain sorted classification variables; and generating distribution information of the standardized classification variables according to the classification variables at preset positions among the sorted classification variables.
According to another aspect of the embodiments of the present invention, a classification device for electric vehicle users is also provided, including: a first acquisition unit, configured to obtain user data to be classified; a second acquisition unit, configured to obtain a preset number of classification variables from the user data to be classified according to a preset classification condition; and a classification unit, configured to classify the preset number of classification variables based on a preset classification model to obtain a classification result, where the classification result characterizes the type of the user data.
Further, the classification results include: stable users, unstable users, value-type users, non-value-type users, churned users, and non-churned users.
Further, the second acquisition unit includes: a processing module, configured to process the user data to be classified according to the preset classification condition to obtain multiple variables; a determining module, configured to determine multiple available variables from the multiple variables; and an obtaining module, configured to obtain the preset number of classification variables from the multiple available variables.
Further, the determining module includes: a processing submodule, configured to perform correlation analysis between each variable and any other variable to obtain a correlation result for each variable and any other variable, where the correlation result includes at least a correlation coefficient and a test value; a judging submodule, configured to judge whether the correlation result of each variable and any other variable satisfies a preset condition; and a determining submodule, configured to determine that a first variable is an available variable if the correlation result of the first variable and any other variable satisfies the preset condition.
Further, the judging submodule includes: a judging sub-submodule, configured to judge whether the correlation coefficient is in a first preset range and whether the test value is in a second preset range; and a determining sub-submodule, configured to determine that the correlation result of each variable and any other variable satisfies the preset condition if the correlation coefficient is in the first preset range and the test value is in the second preset range.
Further, the device also includes: a processing unit, configured to standardize the preset number of classification variables to obtain standardized classification variables. The classification unit is further configured to classify the standardized classification variables in a preset order to obtain the classification result.
Further, the device also includes: a sorting unit, configured to sort the standardized classification variables to obtain sorted classification variables; and a generating unit, configured to generate distribution information of the standardized classification variables according to the classification variables at preset positions among the sorted classification variables.
In the embodiments of the present invention, user data to be classified is obtained; a preset number of classification variables is obtained from the user data to be classified according to a preset classification condition; and the preset number of classification variables is classified based on a preset classification model to obtain a classification result, where the classification result characterizes the type of the user data. Because classification is performed with a preset classification model, electric vehicle users can be classified finely, which solves the technical problem that prior-art classification methods for electric vehicle users have coarse classification granularity and result in poor user experience. The scheme provided by the above embodiments can therefore improve user experience and reduce the trouble caused to users by inconvenient use of charging cards.
Brief description of the drawings
The drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of a classification method for electric vehicle users according to an embodiment of the present invention; and
Fig. 2 is a schematic diagram of a classification device for electric vehicle users according to an embodiment of the present invention.
Detailed description
To help those skilled in the art better understand the scheme of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments, without creative work, fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second", and so on in the description, the claims, and the above drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
Embodiment 1
According to an embodiment of the present invention, a method embodiment for classifying electric vehicle users is provided. It should be noted that the steps shown in the flowchart of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in a different order.
Fig. 1 is a flowchart of a classification method for electric vehicle users according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step S102: obtain user data to be classified.
Specifically, the user data may be usage data of electric vehicle users and may include the charging time of the electric vehicle, the amount of electricity charged, the balance on the charging card, and so on.
Step S104: obtain a preset number of classification variables from the user data to be classified according to a preset classification condition.
Specifically, the preset classification condition may be the purpose of classifying electric vehicle users, for example identifying users who frequently charge with a charging card, or identifying users whose cards have been open for a long time. The preset number may be the number of classification variables, determined according to the classification purpose, that serve as the basis for user classification; for example, it may be 5.
In an optional scheme, after the raw data of electric vehicle users is obtained, that is, after the user data to be classified is obtained, 5 classification variables serving as the basis for user classification may be selected from the raw data according to the classification purpose, so that electric vehicle users can be classified by these 5 variables.
Step S106: classify the preset number of classification variables based on a preset classification model to obtain a classification result, where the classification result characterizes the type of the user data.
Optionally, in the above embodiment of the present invention, the classification results include: stable users, unstable users, value-type users, non-value-type users, churned users, and non-churned users.
Specifically, the preset classification model may be a K_means algorithm model. K_means clustering is a common classification method whose most important feature is that the convergence time of the algorithm is proportional to the number of observations in the data to be analyzed. K_means clustering is therefore often used to process larger data sets. Among the classification results, "value-type non-churned users" are considered the best, whereas "value-type churned users" need to be avoided. "Value-type non-churned users" need further maintenance; they can be rewarded through a points system or similar means to improve their loyalty. "Value-type churned users" may have been screened out during data preprocessing; if such users appear later, the reasons for their churn need to be identified through visits, surveys, and similar means, for example reasons for abandoning the charging card or personal reasons, so that charging facilities can be maintained strategically.
In an optional scheme, users may be classified with the K_means algorithm model according to the 5 selected classification variables. In the classification results, value-type users (charging cards) account for 11% and churned users (charging cards) account for 15%. In the full data set, value-type users (charging cards) are therefore few and are all concentrated among long-term users, because one measure of the value type is total electricity used, and long-term users accumulate more electricity more easily. Churned users are almost all concentrated among short-term users, which shows that the behavior of short-term card users is less stable. "Non-value-type users" can be further divided into "long-term non-value-type users" (9% of the full data set) and "short-term non-value-type users" (80% of the full data set). "Short-term non-value-type users" may arise because the charging card has been in use only briefly, so their value remains to be seen; "long-term non-value-type users" may arise because a user holds multiple charging cards and does not use any single card regularly. Improving charging card quality is therefore crucial for standardizing the analysis of users' charging behavior.
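As a minimal sketch of how a K_means model might group users by their classification variables, the pure-Python function below alternates an assignment step and an update step until the assignments stop changing. The function name `kmeans`, the seed, the sample points, and the choice of two clusters are illustrative assumptions and are not taken from the embodiment.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """Plain K-means on a list of equal-length tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)      # initial centroids: k sampled observations
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each observation joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:                    # an empty cluster keeps its old centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:           # converged: assignments did not change
            break
        labels = new_labels
    return centroids, labels

# Hypothetical values of two classification variables for six charging cards.
users = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (10.0, 10.0), (10.1, 9.9), (9.9, 10.2)]
centroids, labels = kmeans(users, k=2)
```

Each pass over the data costs time proportional to the number of observations, which is consistent with the text's remark about large data sets. Mapping cluster numbers to names such as "value-type" would still require inspecting the cluster centroids.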
According to the above embodiment of the present invention, user data to be classified is obtained; a preset number of classification variables is obtained from the user data to be classified according to a preset classification condition; and the preset number of classification variables is classified based on a preset classification model to obtain a classification result, where the classification result characterizes the type of the user data. Because classification is performed with a preset classification model, electric vehicle users can be classified finely, which solves the technical problem that prior-art classification methods for electric vehicle users have coarse classification granularity and result in poor user experience. The scheme provided by the above embodiment can therefore improve user experience and reduce the trouble caused to users by inconvenient use of charging cards.
Optionally, in the above embodiment of the present invention, step S104, obtaining a preset number of classification variables from the user data to be classified according to a preset classification condition, includes:
Step S1042: process the user data to be classified according to the preset classification condition to obtain multiple variables.
Specifically, the multiple variables may be multiple derived variables.
In an optional scheme, the original variables in the user data to be classified may be processed according to the classification purpose to generate multiple derived variables, as shown in Tables 1 to 4.
Table 1. Derived variables
Table 2. Derived variables
Table 3. Derived variables
OBS  Variable                 Definition
35   pub_sum_PQ_rate          pub_sum_PQ/(pub_sum_PQ+tax_sum_PQ)
36   pub_tax_sum_PQ_rate      pub_sum_PQ/tax_sum_PQ
37   pub_tax_mean_PQ_rate     tax_mean_PQ/pub_mean_PQ
38   pub_tax_max_PQ_rate      tax_max_PQ/pub_max_PQ
39   pub_tax_mean_time_rate   tax_mean_time_rate/pub_mean_time_rate
40   pub_tax_max_time_rate    tax_max_time_rate/pub_max_time_rate
41   pub_tax_min_time_rate    tax_min_time_rate/pub_min_time_rate
42   tax_freq_rate            tax_freq/(pub_freq+tax_freq)
43   pub_freq_rate            pub_freq/(pub_freq+tax_freq)
44   pub_tax_freq_rate        pub_freq/tax_freq
45   AC_sum_PQ_rate           AC_sum_PQ/(AC_sum_PQ+DC_sum_PQ)
46   DC_sum_PQ_rate           DC_sum_PQ/(AC_sum_PQ+DC_sum_PQ)
47   AC_DC_sum_PQ_rate        AC_sum_PQ/DC_sum_PQ
48   AC_DC_mean_PQ_rate       AC_mean_PQ/DC_mean_PQ
49   AC_DC_max_PQ_rate        AC_max_PQ/DC_max_PQ
50   AC_DC_mean_time_rate     AC_mean_time_rate/DC_mean_time_rate
51   AC_DC_max_time_rate      AC_max_time_rate/DC_max_time_rate
52   AC_DC_min_time_rate      AC_min_time_rate/DC_min_time_rate
53   AC_freq_rate             AC_freq/(AC_freq+DC_freq)
54   DC_freq_rate             DC_freq/(AC_freq+DC_freq)
55   AC_DC_freq_rate          AC_freq/DC_freq
56   min_recently             min(AC_min_recently,DC_min_recently)
57   max_recently             min(AC_max_recently,DC_max_recently)
Table 4. Derived variables
OBS  Variable                 Definition
58   card_month               max_recently-min_recently
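The ratio definitions in Table 3 can be illustrated with a short sketch that computes a few derived variables from per-card aggregates. The helper names `safe_ratio` and `derive_variables`, the sample values, and the choice to return `None` when a denominator is zero are assumptions made for illustration; the source does not say how zero denominators are handled.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero (assumed policy)."""
    return numerator / denominator if denominator else None

def derive_variables(raw):
    """Compute a handful of Table 3 derived variables from raw per-card aggregates."""
    ac, dc = raw["AC_sum_PQ"], raw["DC_sum_PQ"]
    ac_freq, dc_freq = raw["AC_freq"], raw["DC_freq"]
    return {
        "AC_sum_PQ_rate": safe_ratio(ac, ac + dc),            # OBS 45
        "DC_sum_PQ_rate": safe_ratio(dc, ac + dc),            # OBS 46
        "AC_DC_sum_PQ_rate": safe_ratio(ac, dc),              # OBS 47
        "AC_freq_rate": safe_ratio(ac_freq, ac_freq + dc_freq),  # OBS 53
        "min_recently": min(raw["AC_min_recently"], raw["DC_min_recently"]),  # OBS 56
    }

# Hypothetical aggregates for one charging card.
raw = {
    "AC_sum_PQ": 30.0, "DC_sum_PQ": 10.0,
    "AC_freq": 3, "DC_freq": 1,
    "AC_min_recently": 5, "DC_min_recently": 12,
}
derived = derive_variables(raw)
```

Guarding the division matters in practice because a card that has never used DC charging would otherwise raise an error on the AC/DC ratios.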
Step S1044: determine multiple available variables from the multiple variables.
Specifically, the available variables may be variables that are not correlated with one another.
In an optional scheme, a correlation test may be applied to the multiple derived variables to determine the uncorrelated available variables, which avoids correlation among the selected classification variables.
Step S1046: obtain the preset number of classification variables from the multiple available variables.
In an optional scheme, 5 classification variables serving as the basis for user classification may be selected from the uncorrelated available variables, for example: sum_PURCHASE_PQ (total electricity used by the user); mean_use_time_rate (average use/occupancy); min_recently (time from the most recent charge to the cutoff date); card_balance_mean (average balance on the card); and card_month (total time since the card was opened).
Optionally, in the above embodiment of the present invention, step S1044, determining multiple available variables from the multiple variables, includes:
Step S122: perform correlation analysis between each variable and any other variable to obtain a correlation result for each variable and any other variable, where the correlation result includes at least a correlation coefficient and a test value.
Specifically, a correlation test is a hypothesis test that checks whether a correlation exists between two variables. In this hypothesis test, ρ is the correlation coefficient parameter and P is the test value.
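The text does not name the coefficient that is used. Assuming the Pearson correlation coefficient, the sample estimate r of ρ for n paired observations, the usual test statistic, and the two-sided test value P can be written as follows, where T with n - 2 degrees of freedom denotes a Student t variable under the hypothesis ρ = 0:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad
t = r \sqrt{\frac{n - 2}{1 - r^2}},
\qquad
P = 2 \Pr\left( T_{n-2} \ge |t| \right)
```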
Step S124: judge whether the correlation result of each variable and any other variable satisfies a preset condition.
Specifically, the preset condition may be a condition, set in advance by the user according to the classification purpose, for determining that two derived variables are uncorrelated. It includes a condition on the correlation coefficient and a condition on the test value.
Step S126: if the correlation result of a first variable and any other variable satisfies the preset condition, determine that the first variable is an available variable.
In an optional scheme, correlation analysis may be performed between one derived variable and each of the other derived variables. The correlation coefficient and test value between that derived variable and each other derived variable can be calculated with a correlation test. The calculated correlation coefficient and test value are then compared against the preset values to determine whether the derived variable is correlated with each of the other derived variables. If the derived variable is uncorrelated with every other derived variable, it can be used as an available variable.
Optionally, in the above embodiment of the present invention, step S124, judging whether the correlation result of each variable and any other variable satisfies a preset condition, includes:
Step S1242: judge whether the correlation coefficient is in a first preset range and whether the test value is in a second preset range.
Specifically, the first preset range may be 0, and the second preset range may be the range of values greater than or equal to 0.05.
In an optional scheme, the null hypothesis H0 and the alternative hypothesis H1 of the correlation test are H0: ρ = 0 and H1: ρ ≠ 0. That is, ρ = 0 indicates that two derived variables are uncorrelated, and ρ ≠ 0 indicates that they are correlated. Usually, when P < 0.05, the linear relationship between the two derived variables is significant. However, the size of the P value does not represent the strength of the correlation, and the P value is affected by the sample size.
Step S1244: if the correlation coefficient is in the first preset range and the test value is in the second preset range, determine that the correlation result of each variable and any other variable satisfies the preset condition.
In an optional scheme, if ρ ≠ 0 and P < 0.05 for one derived variable and another derived variable, the two derived variables can be determined to be correlated, and the correlation is significant. If ρ = 0 and P ≥ 0.05 for one derived variable and each of the other derived variables, the derived variable is determined to be uncorrelated with every other derived variable and can be used as an available variable.
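The screening rule in steps S122 to S1244 can be sketched as follows, under several assumptions that go beyond the text: the coefficient is taken to be Pearson's r; the P value is approximated with the Fisher z-transform and a normal distribution instead of the exact Student t test, to keep the example within the Python standard library; and, because a sample coefficient is almost never exactly 0, a variable is kept when P ≥ 0.05 against every other variable. The function names and sample columns are hypothetical.

```python
import math
from statistics import NormalDist

def pearson_r(x, y):
    """Sample Pearson correlation coefficient (assumes neither column is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def p_value(r, n):
    """Approximate two-sided P value via the Fisher z-transform (requires n > 3)."""
    r = max(min(r, 0.999999), -0.999999)   # keep atanh finite for |r| close to 1
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def available_variables(columns, alpha=0.05):
    """Keep each variable whose correlation with every other variable is not significant."""
    keep = []
    for name, values in columns.items():
        others = (v for other, v in columns.items() if other != name)
        if all(p_value(pearson_r(values, v), len(values)) >= alpha for v in others):
            keep.append(name)
    return keep

# Hypothetical derived variables: y is an exact multiple of x, z is unrelated to both.
columns = {
    "x": [1, 2, 3, 4, 5, 6, 7, 8],
    "y": [2, 4, 6, 8, 10, 12, 14, 16],
    "z": [1, -1, -1, 1, 1, -1, -1, 1],
}
selected = available_variables(columns)
```

Here `x` and `y` are dropped because they are significantly correlated with each other, and only `z` remains. The rule as stated removes both members of a correlated pair; keeping one representative of each pair would be a different design choice.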
Optionally, in the above embodiment of the present invention, after step S104, obtaining a preset number of classification variables from the user data to be classified according to a preset classification condition, the method also includes:
Step S108: standardize the preset number of classification variables to obtain standardized classification variables.
In an optional scheme, the classification variables may be standardized to avoid large differences between their variances. The standardization procedure in SAS (Statistical Analysis System) is as follows:
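/* "dataset" and "variables" are placeholders for the input data set and the classification variables */
proc fastclus data=dataset;
    var variables;
run;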
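For comparison, standardization itself can be sketched in a few lines. The listing above names PROC FASTCLUS, which in SAS is a clustering procedure, so the exact standardization used is not stated; the sketch below assumes the common z-score form (subtract the mean, divide by the population standard deviation). The function name and the sample values are hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return z-scores: (value - mean) / population standard deviation (assumed form)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]   # a constant column carries no spread to rescale
    return [(v - mu) / sigma for v in values]

# Hypothetical values of one classification variable, e.g. months since card opening.
card_month = [2, 4, 4, 4, 5, 5, 7, 9]
z_scores = standardize(card_month)
```

After this step every classification variable has mean 0 and unit variance, so a variable measured in large units such as total electricity does not dominate the distance calculation in K_means.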
Step S110: classify the standardized classification variables in a preset order to obtain the classification result.
In an optional scheme, after the 5 classification variables are standardized, users may be classified with the K_means algorithm model according to the 5 standardized classification variables, which improves the accuracy of user classification.
Optionally, in the above embodiment of the present invention, after step S104, obtaining a preset number of classification variables from the user data to be classified according to a preset classification condition, the method also includes:
Step S112: sort the standardized classification variables to obtain sorted classification variables.
Step S114: generate distribution information of the standardized classification variables according to the classification variables at preset positions among the sorted classification variables.
Specifically, the preset positions may be the 1%, 25%, 50%, 75%, 90%, and 99% positions set in advance.
In an optional scheme, the distribution of the classification variables to be used may be examined first. That is, the standardized classification variables are sorted in ascending order, and the values at the 1%, 25%, 50%, 75%, 90%, and 99% positions are listed one by one. The distribution of these values gives a better understanding of the classification variables and helps establish the basis for classification.
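Steps S112 and S114 can be sketched as a sort followed by a lookup at each preset position. The text does not say how a percentage position maps to an index, so the nearest-rank rule used below is an assumption, as are the function names and the sample data.

```python
import math

def value_at_position(sorted_values, percent):
    """Nearest-rank rule (assumed): value at the given percent position of an ascending list."""
    rank = max(1, math.ceil(percent * len(sorted_values) / 100))
    return sorted_values[rank - 1]

def distribution_summary(values, positions=(1, 25, 50, 75, 90, 99)):
    """Sort ascending, then report the value found at each preset position."""
    ordered = sorted(values)
    return {p: value_at_position(ordered, p) for p in positions}

# Hypothetical variable with the values 1..100 supplied in descending order.
summary = distribution_summary(list(range(100, 0, -1)))
```

Comparing the 99% value with the 75% and 90% values is a quick way to spot a long upper tail before the variable is passed to the clustering step.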
Embodiment 2
According to an embodiment of the present invention, a device embodiment for classifying electric vehicle users is provided.
Fig. 2 is a schematic diagram of a classification device for electric vehicle users according to an embodiment of the present invention. As shown in Fig. 2, the device includes:
A first acquisition unit 21, configured to obtain user data to be classified.
Specifically, the user data may be usage data of electric vehicle users and may include the charging time of the electric vehicle, the amount of electricity charged, the balance on the charging card, and so on.
A second acquisition unit 23, configured to obtain a preset number of classification variables from the user data to be classified according to a preset classification condition.
Specifically, the preset classification condition may be the purpose of classifying electric vehicle users, for example identifying users who frequently charge with a charging card, or identifying users whose cards have been open for a long time. The preset number may be the number of classification variables, determined according to the classification purpose, that serve as the basis for user classification; for example, it may be 5.
In an optional scheme, after the raw data of electric vehicle users is obtained, that is, after the user data to be classified is obtained, 5 classification variables serving as the basis for user classification may be selected from the raw data according to the classification purpose, so that electric vehicle users can be classified by these 5 variables.
Taxon 25, for being classified to the classified variable of predetermined number based on default disaggregated model, is classified As a result, wherein, classification results are used to characterize the type of user data.
Optionally, in the above embodiment of the present invention, classification results include:Stabilization user, volatile user, value type are used Family, non-value type user, loss user and non-streaming appraxia family.
Specifically, the preset classification model may be a K_means algorithm model. K_means clustering is a commonly used classification tool whose most important feature is that the time the algorithm takes to converge is proportional to the number of observations in the data being analyzed. K_means clustering is therefore often used to process large-scale data. Among the classification results, "value-type non-churned users" are considered the best, whereas "value-type churned users" are to be avoided. "Value-type non-churned users" are the users who need further maintenance; they can be rewarded, for example through a points system, to increase their loyalty. "Value-type churned users" may not appear at present because this part of the data was screened out during data preprocessing, but they may appear later. If they do, the reasons for their churn, such as why the charging card was discarded or reasons on the user's own side, need to be found through visits, surveys and similar means, so that the charging facilities can be maintained strategically.
In an optional scheme, users can be classified by the K_means algorithm model according to the 5 selected classification variables. In the classification results, value-type users (charging cards) account for 11% and churned users (charging cards) account for 15%. It can be seen that, in the full data set, value-type users (charging cards) are few and are all concentrated among long-term users. This is because one measure of value type is the total electricity used, and long-term users accumulate more electricity more easily. Churned users are almost all concentrated among short-term users, which shows that the behavior of short-term card users is rather unstable. "Non-value-type users" can be further divided into "long-term non-value-type users" (9% of the full data set) and "short-term non-value-type users" (80% of the full data set). The value of "short-term non-value-type users" remains to be seen, probably because their charging cards have been in use for only a short time; "long-term non-value-type users" are probably the result of users who hold several charging cards and do not regularly use a single card. It can be seen that improving charging card quality is of great significance for standardizing the analysis of user charging behavior.
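The clustering step described above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the patent: the text names only the K_means model, so the squared Euclidean distance, the deterministic farthest-point seeding, the function names `squared_distance` and `kmeans`, and the toy user rows are all hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Cluster `points` into k groups; return (labels, centroids).

    Seeding is deterministic (an assumption): the first centroid is the
    first point, and each further centroid is the point farthest from
    the centroids chosen so far.
    """
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each user goes to the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # no assignment changed: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    # Hypothetical users, each described by 5 standardized classification variables.
    users = [
        (1.2, 0.8, -0.9, 0.5, 1.1),
        (1.0, 0.9, -1.0, 0.4, 1.3),
        (-0.8, -0.7, 1.1, -0.6, -0.9),
        (-1.1, -0.9, 0.9, -0.4, -1.2),
    ]
    labels, centroids = kmeans(users, k=2)
    print(labels)
```

Each pass touches every observation once per centroid, which is consistent with the statement that running time grows in proportion to the number of observations.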
According to the above embodiment of the present invention, the first acquisition unit obtains the user data to be classified, the second acquisition unit obtains a preset number of classification variables from the user data to be classified according to a preset classification condition, and the classification unit classifies the preset number of classification variables based on a preset classification model to obtain a classification result, where the classification result is used to characterize the type of the user data. It is easy to see that classification can be performed based on the preset classification model to obtain the classification result, which achieves fine-grained classification of electric vehicle users and solves the technical problem in the prior art that methods for classifying electric vehicle users have coarse classification granularity and give a poor user experience. The scheme provided by the above embodiment of the present invention can therefore improve the user experience and reduce the trouble caused to users by inconvenient use of charging cards.
Optionally, in the above embodiment of the present invention, the second acquisition unit includes:
A processing module, configured to process the user data to be classified according to the preset classification condition to obtain multiple variables.
Specifically, the multiple variables may be multiple derived variables.
In an optional scheme, the original variables in the user data to be classified can be processed according to the classification purpose to generate multiple derived variables, as shown in Table 1 to Table 4.
A determining module, configured to determine multiple usable variables from the multiple variables.
Specifically, the usable variables may be variables that are not correlated with one another.
In an optional scheme, a correlation test analysis can be performed on the multiple derived variables to determine the uncorrelated usable variables, which avoids correlation among the selected classification variables.
An acquisition module, configured to obtain the preset number of classification variables from the multiple usable variables.
In an optional scheme, 5 of the uncorrelated usable variables can be selected as the classification variables on which user classification is based, for example: sum_PURCHASE_PQ (total electricity used by the user); Mean_use_time_rate (average use/occupancy rate); Min_recently (time from the most recent charge to the cutoff date); card_balance_mean (average balance on the card); Card_month (total time since the card was opened).
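To make the idea of derived variables concrete, the sketch below computes four of the five example variables from a list of raw charging records for one card. Everything about the input is an assumption made for illustration: the record fields `purchase_pq`, `charge_date` and `balance`, the use of days for Min_recently and whole months for Card_month, and the function name `derive_variables` do not come from the text. Mean_use_time_rate is left out because its exact definition is not given here.

```python
from datetime import date


def derive_variables(records, card_open_date, cutoff):
    """Derive example classification variables for one charging card.

    `records` is a non-empty list of dicts with the hypothetical keys
    'purchase_pq' (electricity charged), 'charge_date' and 'balance'.
    """
    total_pq = sum(r["purchase_pq"] for r in records)
    last_charge = max(r["charge_date"] for r in records)
    mean_balance = sum(r["balance"] for r in records) / len(records)
    # Whole calendar months between card opening and the cutoff date.
    months_open = (cutoff.year - card_open_date.year) * 12 + (
        cutoff.month - card_open_date.month
    )
    return {
        "sum_PURCHASE_PQ": total_pq,
        "Min_recently": (cutoff - last_charge).days,
        "card_balance_mean": mean_balance,
        "Card_month": months_open,
    }


if __name__ == "__main__":
    # Hypothetical raw records for a single card.
    raw = [
        {"purchase_pq": 20.0, "charge_date": date(2016, 10, 1), "balance": 100.0},
        {"purchase_pq": 30.0, "charge_date": date(2016, 11, 15), "balance": 50.0},
    ]
    print(derive_variables(raw, date(2016, 1, 10), date(2016, 12, 1)))
```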
Optionally, in the above embodiment of the present invention, the determining module includes:
A processing submodule, configured to perform correlation analysis on each variable and any other variable to obtain a correlation result for each variable and any other variable, where the correlation result includes at least a correlation coefficient and a test value.
Specifically, a correlation test is a hypothesis test of whether a correlation exists between two variables. In this hypothesis test, ρ is the correlation coefficient parameter and P is the test value.
A judging submodule, configured to judge whether the correlation result of each variable and any other variable meets a preset condition.
Specifically, the preset condition may be a condition, set in advance by the user according to the classification purpose, for determining that two derived variables are uncorrelated, including a condition on the correlation coefficient and a condition on the test value.
A determining submodule, configured to determine that a first variable is a usable variable if the correlation results of the first variable and any other variable meet the preset condition.
In an optional scheme, correlation analysis can be performed between one derived variable and each of the other derived variables. The correlation coefficient and test value between this derived variable and each other derived variable can be calculated by correlation test analysis and compared against the preset values to determine whether this derived variable is correlated with each of the other derived variables. If this derived variable is uncorrelated with every other derived variable, it can be used as a usable variable.
Optionally, in the above embodiment of the present invention, the judging submodule includes:
A judging sub-submodule, configured to judge whether the correlation coefficient is in a first preset range and whether the test value is in a second preset range.
Specifically, the first preset range may be 0, and the second preset range may be the range greater than or equal to 0.05.
In an optional scheme, the null hypothesis H0 and the alternative hypothesis H1 in the correlation test are H0: ρ = 0 and H1: ρ ≠ 0. That is, ρ = 0 indicates that two derived variables are uncorrelated, and ρ ≠ 0 indicates that they are correlated. Generally, P < 0.05 indicates that the linear relationship between the two derived variables is significant. However, the size of the P value does not indicate the strength of the correlation, and it is affected by the sample size.
A determining sub-submodule, configured to determine that the correlation result of each variable and any other variable meets the preset condition if the correlation coefficient is in the first preset range and the test value is in the second preset range.
In an optional scheme, if ρ ≠ 0 and P < 0.05 for one derived variable and another derived variable, it can be determined that the two derived variables are correlated and the correlation is significant. If ρ = 0 and P ≥ 0.05 for one derived variable and every other derived variable, it is determined that this derived variable is uncorrelated with every other derived variable, and it can be used as a usable variable.
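The screening rule above can be sketched as follows. Several points are assumptions rather than details from the text: the correlation coefficient is taken to be Pearson's r, the P value is approximated with the Fisher z-transformation and a normal distribution instead of an exact test, and, because a sample coefficient is rarely exactly 0, the sketch treats P ≥ 0.05 alone as the operational sign that H0: ρ = 0 is not rejected. The names `pearson_r`, `approx_p_value` and `usable_variables` are hypothetical.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient (inputs must not be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided P value for H0: rho = 0 via the Fisher z approximation (n > 3)."""
    r = max(min(r, 0.999999999), -0.999999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def usable_variables(columns, alpha=0.05):
    """Return the names of variables not significantly correlated with any other."""
    names = list(columns)
    usable = []
    for name in names:
        others = [o for o in names if o != name]
        if all(
            approx_p_value(pearson_r(columns[name], columns[o]), len(columns[name]))
            >= alpha
            for o in others
        ):
            usable.append(name)
    return usable


if __name__ == "__main__":
    # Hypothetical derived variables observed for 8 users.
    derived = {
        "a": [1, 2, 3, 4, 5, 6, 7, 8],
        "b": [2, 4, 6, 8, 10, 12, 14, 16],   # exact multiple of "a"
        "c": [1, -1, -1, 1, 1, -1, -1, 1],   # unrelated to "a" and "b"
    }
    print(usable_variables(derived))
```

Because the P value depends on the sample size, the same coefficient can pass the screen on a small sample and fail it on a large one, which is the caveat the text raises about interpreting P.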
Optionally, in the above embodiment of the present invention, the device further includes:
A processing unit, configured to standardize the preset number of classification variables to obtain standardized classification variables.
In an optional scheme, to avoid large differences between the variances of the classification variables, the classification variables can be standardized. The standardization procedure in SAS (Statistical Analysis System) is as follows:
proc fastclus data=<data set>;
var <variables>;
run;
The classification unit is further configured to classify the standardized classification variables to obtain the classification result.
In an optional scheme, after the 5 classification variables have been standardized, users can be classified by the K_means algorithm model according to the 5 standardized classification variables, which improves the accuracy of user classification.
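For readers without SAS, the standardization step can be sketched in Python as a z-score transform: each classification variable is centered on its mean and divided by its standard deviation, so that all variables enter the clustering on a comparable scale. The choice of the z-score, the use of the sample standard deviation (divisor n - 1), and the names `standardize` and `standardize_columns` are assumptions; the text says only that the variables are standardized.

```python
import math


def standardize(values):
    """Return z-scores of `values` using the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    std = math.sqrt(variance)
    if std == 0:
        # A constant variable carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def standardize_columns(columns):
    """Standardize every classification variable in a {name: values} dict."""
    return {name: standardize(vals) for name, vals in columns.items()}


if __name__ == "__main__":
    # Hypothetical raw values on very different scales.
    raw = {
        "sum_PURCHASE_PQ": [120.0, 950.0, 40.0, 300.0],
        "Card_month": [3, 24, 1, 12],
    }
    for name, scores in standardize_columns(raw).items():
        print(name, [round(s, 3) for s in scores])
```

Without this step, a variable measured in large units, such as total electricity, would dominate the Euclidean distances used by K_means.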
Optionally, in the above embodiment of the present invention, the device further includes:
A sorting unit, configured to sort the standardized classification variables to obtain sorted classification variables.
A generating unit, configured to generate distribution information of the standardized classification variables according to the classification variables at preset positions among the sorted classification variables.
Specifically, the preset positions may be the 1%, 25%, 50%, 75%, 90% and 99% positions set in advance.
In an optional scheme, the distribution of the classification variables to be used for classification can be examined. That is, the standardized classification variables are sorted in ascending order, and the values at the 1%, 25%, 50%, 75%, 90% and 99% positions are listed one by one. The distribution of these values gives a better understanding of the classification variables and helps establish the basis for classification.
The above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For a part not described in detail in one embodiment, reference may be made to the related descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection between units or modules through some interfaces, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the scheme of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform all or some of the steps of the methods of the embodiments of the present invention. The storage medium includes various media that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk or an optical disc.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and modifications without departing from the principle of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (14)

1. A method for classifying electric vehicle users, characterized by comprising:
Obtaining user data to be classified;
Obtaining a preset number of classification variables from the user data to be classified according to a preset classification condition;
Classifying the preset number of classification variables based on a preset classification model to obtain a classification result, wherein the classification result is used to characterize the type of the user data.
2. The method according to claim 1, characterized in that the classification result comprises: stable users, unstable users, value-type users, non-value-type users, churned users and non-churned users.
3. The method according to claim 2, characterized in that obtaining a preset number of classification variables from the user data to be classified according to a preset classification condition comprises:
Processing the user data to be classified according to the preset classification condition to obtain multiple variables;
Determining multiple usable variables from the multiple variables;
Obtaining the preset number of classification variables from the multiple usable variables.
4. The method according to claim 3, characterized in that determining multiple usable variables from the multiple variables comprises:
Performing correlation analysis on each variable and any other variable to obtain a correlation result for each variable and the any other variable, wherein the correlation result comprises at least a correlation coefficient and a test value;
Judging whether the correlation result of each variable and the any other variable meets a preset condition;
If the correlation results of a first variable and the any other variable meet the preset condition, determining that the first variable is a usable variable.
5. The method according to claim 4, characterized in that judging whether the correlation result of each variable and the any other variable meets a preset condition comprises:
Judging whether the correlation coefficient is in a first preset range and whether the test value is in a second preset range;
If the correlation coefficient is in the first preset range and the test value is in the second preset range, determining that the correlation result of each variable and the any other variable meets the preset condition.
6. The method according to any one of claims 1 to 5, characterized in that, after the preset number of classification variables are obtained from the user data to be classified according to the preset classification condition, the method further comprises:
Standardizing the preset number of classification variables to obtain standardized classification variables;
Classifying the standardized classification variables to obtain the classification result.
7. The method according to claim 6, characterized in that, after the preset number of classification variables are obtained from the user data to be classified according to the preset classification condition, the method further comprises:
Sorting the standardized classification variables to obtain sorted classification variables;
Generating distribution information of the standardized classification variables according to the classification variables at preset positions among the sorted classification variables.
8. A device for classifying electric vehicle users, characterized by comprising:
A first acquisition unit, configured to obtain user data to be classified;
A second acquisition unit, configured to obtain a preset number of classification variables from the user data to be classified according to a preset classification condition;
A classification unit, configured to classify the preset number of classification variables based on a preset classification model to obtain a classification result, wherein the classification result is used to characterize the type of the user data.
9. The device according to claim 8, characterized in that the classification result comprises: stable users, unstable users, value-type users, non-value-type users, churned users and non-churned users.
10. The device according to claim 9, characterized in that the second acquisition unit comprises:
A processing module, configured to process the user data to be classified according to the preset classification condition to obtain multiple variables;
A determining module, configured to determine multiple usable variables from the multiple variables;
An acquisition module, configured to obtain the preset number of classification variables from the multiple usable variables.
11. The device according to claim 10, characterized in that the determining module comprises:
A processing submodule, configured to perform correlation analysis on each variable and any other variable to obtain a correlation result for each variable and the any other variable, wherein the correlation result comprises at least a correlation coefficient and a test value;
A judging submodule, configured to judge whether the correlation result of each variable and the any other variable meets a preset condition;
A determining submodule, configured to determine that a first variable is a usable variable if the correlation results of the first variable and the any other variable meet the preset condition.
12. The device according to claim 11, characterized in that the judging submodule comprises:
A judging sub-submodule, configured to judge whether the correlation coefficient is in a first preset range and whether the test value is in a second preset range;
A determining sub-submodule, configured to determine that the correlation result of each variable and the any other variable meets the preset condition if the correlation coefficient is in the first preset range and the test value is in the second preset range.
13. The device according to any one of claims 8 to 12, characterized in that the device further comprises:
A processing unit, configured to standardize the preset number of classification variables to obtain standardized classification variables;
The classification unit is further configured to classify the standardized classification variables to obtain the classification result.
14. The device according to claim 13, characterized in that the device further comprises:
A sorting unit, configured to sort the standardized classification variables to obtain sorted classification variables;
A generating unit, configured to generate distribution information of the standardized classification variables according to the classification variables at preset positions among the sorted classification variables.
CN201611132772.6A 2016-12-09 2016-12-09 The sorting technique and device of automobile user Pending CN106844427A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611132772.6A CN106844427A (en) 2016-12-09 2016-12-09 The sorting technique and device of automobile user

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611132772.6A CN106844427A (en) 2016-12-09 2016-12-09 The sorting technique and device of automobile user

Publications (1)

Publication Number Publication Date
CN106844427A true CN106844427A (en) 2017-06-13

Family

ID=59139848

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611132772.6A Pending CN106844427A (en) 2016-12-09 2016-12-09 The sorting technique and device of automobile user

Country Status (1)

Country Link
CN (1) CN106844427A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109657705A (en) * 2018-12-03 2019-04-19 国网天津市电力公司电力科学研究院 A kind of automobile user clustering method and device based on random forests algorithm

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101587349A (en) * 2008-05-22 2009-11-25 上海宝信软件股份有限公司 The using standard classified variable is realized the method for quality analysis
CN105825232A (en) * 2016-03-15 2016-08-03 国网北京市电力公司 Classification method and device for electromobile users

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170613
