CN106779929A - Product recommendation method, apparatus and computing device - Google Patents

Product recommendation method, apparatus and computing device

Info

Publication number
CN106779929A
Authority
CN
China
Prior art keywords
classifier
gini
node
products
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611103409.1A
Other languages
Chinese (zh)
Other versions
CN106779929B (en)
Inventor
王碰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Knownsec Information Technology Co Ltd
Original Assignee
Beijing Knownsec Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Knownsec Information Technology Co Ltd
Priority to CN201611103409.1A
Publication of CN106779929A
Application granted
Publication of CN106779929B
Active legal status
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/06: Buying, selling or leasing transactions
    • G06Q30/0601: Electronic shopping [e-shopping]
    • G06Q30/0631: Item recommendations

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a product recommendation method. The method is performed in a computing device that stores multiple consumption records of multiple users, each consumption record including a user's characteristic information and the product consumed. The method includes: obtaining the characteristic information of a target user; determining a first recommended product from the target user's characteristic information using a first classifier, where the first classifier is trained with all stored consumption records as samples; determining a second recommended product from the target user's characteristic information using a second classifier, where the second classifier is trained with the stored consumption records whose consumed product is not the first recommended product as samples; and taking the first recommended product and the second recommended product as the products recommended to the user. The invention also discloses a product recommendation apparatus that can implement the above method, and a computing device including the apparatus.

Description

Product recommendation method, apparatus and computing device
Technical field
The present invention relates to the field of data mining, and in particular to a product recommendation method, apparatus and computing device.
Background
With the development of technology, and especially the rise of online shopping, more and more products are available to consumers. Faced with such a variety of products, consumers must spend a great deal of time choosing, and still find it hard to locate the product that suits them. Consumers sometimes turn to sales staff (customer service) for help. However, most sales staff lack systematic technical knowledge and an in-depth understanding of the products, cannot fully understand the consumer's needs, and can only recommend products based on experience, so the products they recommend often fail to satisfy the consumer.
Therefore, a product recommendation method is needed to help consumers select products, or to help sales staff recommend products to consumers.
Summary of the invention
To this end, the present invention provides a product recommendation method, apparatus and computing device that solve or at least alleviate the problems described above.
According to one aspect of the present invention, a product recommendation method is provided. The method is performed in a computing device that stores multiple consumption records of multiple users, each consumption record including a user's characteristic information and the product consumed. The method includes: obtaining the characteristic information of a target user; determining a first recommended product from the target user's characteristic information using a first classifier, where the first classifier is trained with all stored consumption records as samples; determining a second recommended product from the target user's characteristic information using a second classifier, where the second classifier is trained with the stored consumption records whose consumed product is not the first recommended product as samples; and taking the first recommended product and the second recommended product as the products recommended to the user.
Optionally, in the product recommendation method of the invention, the characteristic information includes one or more of the following attributes: user ID, user nature, whether the user is in the same city as the sales company, whether security products have been used, which security products have been used, number of DoS attacks, number of CC attacks, number of ARP attacks, number of DNS attacks, number of database attacks, number of times Trojans or viruses were implanted, number of domain hijackings, number of times content was tampered with, number of privilege attacks, and number of attacks of other types.
Optionally, in the product recommendation method of the invention, the first classifier and the second classifier are classification trees, and they are trained according to the following steps. For each node: take the attribute with the largest GINI index gain (the decrease in the GINI index from before the split to after it) as the best split attribute, take the split condition with the smallest post-split GINI index as the best split condition, and split the node according to the best split attribute and the best split condition, producing two child nodes. When a preset termination condition is met, stop splitting nodes.
Optionally, in the product recommendation method of the invention, the GINI index before the split is calculated according to the following formula:
$$\mathrm{GINI}(D) = 1 - \sum_{i=1}^{k} p_i^{2}$$
where D is the sample set contained in the node, k is the number of consumed-product categories in the sample set, and $p_i$ is the proportion of the samples in D whose consumed product is i;
The GINI index after the split is calculated according to the following formula:
$$\mathrm{GINI}_{A,j}(D) = \frac{|D_1|}{|D|}\,\mathrm{GINI}(D_1) + \frac{|D_2|}{|D|}\,\mathrm{GINI}(D_2)$$
where A denotes the split attribute, j denotes the split condition, $D_1$ and $D_2$ are the sample sets contained in the two child nodes obtained by splitting the node according to split attribute A and split condition j, and $|D_1|$ and $|D_2|$ are the numbers of samples in $D_1$ and $D_2$;
The GINI index gain is calculated according to the following formula:
$$\Delta\mathrm{GINI}(A) = \mathrm{GINI}(D) - \mathrm{GINI}_{A}(D)$$
where A is the split attribute and $\mathrm{GINI}_{A}(D)$ is the minimum of $\mathrm{GINI}_{A,j}(D)$ over all split conditions j.
Optionally, in the product recommendation method of the invention, the termination condition can be any one of the following: the samples in the node all have the same consumed-product category; the depth of the tree has reached a preset depth threshold; the number of samples in the node is less than a preset first threshold; the difference between the square of the number of samples in the node and the sum of the squares of the numbers of samples in the two child nodes after the split is less than a preset second threshold.
Optionally, in the product recommendation method of the invention, after node splitting stops, the method further includes pruning the first classifier in order from the leaf nodes to the root node. For a node that has leaf nodes: calculate the first misjudgment rate of each leaf node of the node; calculate the second misjudgment rate of the node; if the second misjudgment rate is greater than at least one first misjudgment rate, take the consumed-product category of the leaf node with the smallest first misjudgment rate as the consumed-product category of the node, and cut off all leaf nodes of the node. If the first misjudgment rate is greater than the second misjudgment rate, cut off the two leaf nodes of the node and make the node a leaf node.
Optionally, in the product recommendation method of the invention, the first misjudgment rate of leaf node f is (E_f + a)/N_f and the second misjudgment rate is (E + 2*a)/N, where E_f is the number of misclassified samples in leaf node f, N_f is the number of samples in leaf node f, E is the number of misclassified samples in the node, N is the number of samples in the node, and a is a penalty factor.
Optionally, in the product recommendation method of the invention, a = 0.5.
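The two misjudgment rates can be illustrated with a minimal Python sketch. The function names and the tuple layout are hypothetical, and the sketch covers only the first pruning rule stated above (prune when the node's second misjudgment rate exceeds at least one leaf's first misjudgment rate); it is an illustration under these assumptions, not the patented implementation.

```python
def leaf_misjudgment_rate(errors, total, a=0.5):
    """First misjudgment rate of a leaf f: (E_f + a) / N_f."""
    return (errors + a) / total


def node_misjudgment_rate(errors, total, a=0.5):
    """Second misjudgment rate of the parent node: (E + 2*a) / N."""
    return (errors + 2 * a) / total


def prune_decision(node, leaves, a=0.5):
    """Decide whether to cut the leaves of a node.

    node   -- (E, N): misclassified and total samples in the node
    leaves -- list of (E_f, N_f, category) tuples, one per leaf
    Returns (True, category) if the leaves should be cut, where category
    is taken from the leaf with the smallest first misjudgment rate;
    otherwise returns (False, None).
    """
    first = [leaf_misjudgment_rate(e, n, a) for e, n, _category in leaves]
    second = node_misjudgment_rate(node[0], node[1], a)
    if any(second > rate for rate in first):
        best = min(range(len(leaves)), key=lambda i: first[i])
        return True, leaves[best][2]
    return False, None
```

With a = 0.5, a leaf holding 4 samples of which 1 is misclassified has a first misjudgment rate of (1 + 0.5)/4 = 0.375.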
According to one aspect of the present invention, a product recommendation apparatus is provided. The apparatus resides in a computing device that stores multiple consumption records of multiple users, each consumption record including a user's characteristic information and the product consumed. The apparatus includes: an information obtaining module, adapted to obtain the characteristic information of a target user; a first recommending module, adapted to determine a first recommended product from the target user's characteristic information using a first classifier, where the first classifier is trained with all stored consumption records as samples; and a second recommending module, adapted to determine a second recommended product from the target user's characteristic information using a second classifier, where the second classifier is trained with the stored consumption records whose consumed product is not the first recommended product as samples, and to take the first recommended product and the second recommended product as the products recommended to the user.
According to one aspect of the present invention, a computing device is provided that includes the product recommendation apparatus described above.
According to the technical solution of the invention, multiple classifiers are trained with the existing consumption records as samples, multiple recommended products are determined with these classifiers according to the target user's characteristic information, and these products are recommended to the target user. The technical solution recommends products to the target user in a systematic, data-driven way, which spares the consumer from choosing blindly among many products and avoids the drawback of sales staff with insufficient technical knowledge recommending products based on subjective experience. In addition, the solution can recommend multiple products to the target user and thus offers a range of choices, giving consumers more freedom when selecting products and sales staff more freedom when recommending them, and providing a good user experience.
In addition, when training the classifiers, the technical solution applies several steps such as post-pruning and confidence calculation, which makes the classifiers more accurate and the recommended products better suited to the target user.
Brief description of the drawings
To achieve the above and related purposes, certain illustrative aspects are described herein in conjunction with the following description and the accompanying drawings. These aspects indicate the various ways in which the principles disclosed herein can be practiced, and all aspects and their equivalents are intended to fall within the scope of the claimed subject matter. The above and other objects, features and advantages of the disclosure will become more apparent from the following detailed description read in conjunction with the drawings. Throughout the disclosure, the same reference numerals generally refer to the same parts or elements.
Fig. 1 shows a schematic diagram of a network system 100 according to an embodiment of the invention;
Fig. 2 shows a structural diagram of a computing device 200 according to an embodiment of the invention;
Fig. 3 shows a structural diagram of a product recommendation apparatus 300 according to an embodiment of the invention;
Fig. 4 shows a structural diagram of a product recommendation apparatus 300 according to another embodiment of the invention;
Fig. 5A shows a schematic diagram of node splitting according to an embodiment of the invention;
Fig. 5B shows a schematic diagram of pruning according to an embodiment of the invention; and
Fig. 6 shows a flow chart of a product recommendation method 600 according to an embodiment of the invention.
Detailed description
Exemplary embodiments of the disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope can be conveyed completely to those skilled in the art.
Fig. 1 shows a schematic diagram of a network system 100 according to an embodiment of the invention. The network system 100 shown in Fig. 1 includes a computing device 200, a database 110 and multiple clients 120~150. It should be noted that the network system 100 in Fig. 1 is only exemplary; in practice, the network system 100 can contain different numbers of computing devices, databases and clients, and the invention does not limit the number of computing devices, databases and clients included in the network system.
The computing device 200 is a device that can manage a sales platform and provide a product recommendation service for sales staff and consumers. It can be implemented as a server, such as a file server, database server, application server or web server, or as a personal computer in a desktop or notebook configuration. The clients 120~150 can be any device that can access the Internet, such as a PC, notebook computer, mobile phone, tablet computer, TV box or wearable device. The clients 120~150 can access the Internet by wire, or wirelessly via 3G, 4G, WiFi, personal hotspot, IEEE 802.11x, Bluetooth and the like.
A user can log in to the sales platform via a client 120~150 and freely choose goods on the platform. After the user completes a purchase, the computing device 200 can obtain the user's consumption information, which includes but is not limited to the user's user name and the product consumed, and enters the obtained data into the database 110. It should be noted that the database 110 can reside in the computing device 200 as a local database, or can be set up outside the computing device 200 as a remote database; the invention does not limit how the database is deployed.
Besides obtaining users' consumption information from the sales platform, the computing device 200 can also crawl users' characteristic information from the Internet. The specific attributes included in the characteristic information can be determined by the type of product sold on the sales platform. For example, if the sales platform sells security products, the characteristic information can include the user's nature (whether the user is an individual or a legal entity and, if a legal entity, of which type), whether security products have been used, the number of DoS (Denial of Service) attacks, the number of CC (Challenge Collapsar) attacks, and so on. After crawling a user's characteristic information, the computing device 200 stores it in the database 110. In the database 110, users' characteristic information and consumed products are represented as multiple feature records keyed by user ID; each user corresponds to one or more feature records, and each feature record contains characteristic information and a consumed product. However, not every feature record contains values for both fields, that is, the value of the characteristic information or of the consumed product may be empty. For example, for one user only the characteristic information has been crawled, but the user has never bought a product, so no consumed-product information is available; in that user's feature record the characteristic information is non-empty and the consumed product is empty. For another user, no characteristic information has been crawled but a consumed product has been collected; in that user's feature record the characteristic information is empty and the consumed product is non-empty. To distinguish among feature records, a feature record in which neither the characteristic information nor the consumed product is empty is referred to herein as a consumption record.
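The distinction between feature records and consumption records can be sketched in Python as follows. The `FeatureRecord` type and its field names are hypothetical and only illustrate the rule that a consumption record is a feature record with both fields non-empty.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeatureRecord:
    user_id: int
    features: Optional[dict]   # crawled characteristic information, or None
    product: Optional[str]     # consumed product, or None


def consumption_records(records):
    """Keep only the records whose features and product are both non-empty."""
    return [r for r in records if r.features and r.product]


records = [
    FeatureRecord(1, {"dos_attacks": 3}, None),      # features only
    FeatureRecord(2, None, "product A"),             # product only
    FeatureRecord(3, {"dos_attacks": 1}, "product B"),
]
usable = consumption_records(records)  # only the record of user 3 remains
```

Only records selected this way would serve as training samples for the classifiers.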
It should also be noted that the user ID is a number that identifies a user in the database, and each user corresponds to exactly one user ID. The user ID can be an integer in the database that increases by one starting from 1; that is, the ID of the first user collected by the computing device 200 is 1, the ID of the second user collected is 2, and so on. When the computing device 200 collects the characteristic information or consumption information of a user, it extracts from the collected information an identifier that uniquely identifies the user, such as a user name, ID card number, mobile phone number or organization code, and checks whether the database already contains a user with that identifier. If so, the user's ID is the ID already associated with that user in the database; if not, the largest user ID currently in the database is obtained and increased by 1 to give the new user's ID.
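The look-up-or-increment rule for user IDs can be sketched as follows. The in-memory `UserRegistry` class is a hypothetical stand-in for the database 110, shown only to make the rule concrete.

```python
class UserRegistry:
    """Maps a unique identifier (user name, phone number, ...) to a user ID."""

    def __init__(self):
        self._ids = {}  # identifier -> user ID

    def get_or_create(self, identifier):
        # A known user keeps the ID already stored for it.
        if identifier in self._ids:
            return self._ids[identifier]
        # A new user gets the current maximum ID plus 1 (1 for the first user).
        new_id = max(self._ids.values(), default=0) + 1
        self._ids[identifier] = new_id
        return new_id


registry = UserRegistry()
first = registry.get_or_create("alice")    # new user
second = registry.get_or_create("bob")     # new user
again = registry.get_or_create("alice")    # existing user, same ID as before
```

A real database would typically enforce the same rule with a unique constraint on the identifier and an auto-incrementing key.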
Fig. 2 shows a structural diagram of the computing device 200 according to an embodiment of the invention. In a basic configuration 202, the computing device 200 typically includes a system memory 206 and one or more processors 204. A memory bus 208 can be used for communication between the processor 204 and the system memory 206.
Depending on the desired configuration, the processor 204 can be any type of processor, including but not limited to a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination of them. The processor 204 can include one or more levels of cache, such as a level-1 cache 210 and a level-2 cache 212, a processor core 214 and registers 216. An example processor core 214 can include an arithmetic logic unit (ALU), a floating-point unit (FPU), a digital signal processing core (DSP core), or any combination of them. An example memory controller 218 can be used together with the processor 204, or in some implementations the memory controller 218 can be an internal part of the processor 204.
Depending on the desired configuration, the system memory 206 can be any type of memory, including but not limited to volatile memory (such as RAM), non-volatile memory (such as ROM or flash memory), or any combination of them. The system memory 206 can include an operating system 220, one or more applications 222 and program data 224. In some embodiments, the applications 222 can be arranged to operate on the operating system using the program data 224.
The computing device 200 can also include an interface bus 240 that facilitates communication from various interface devices (for example, output devices 242, peripheral interfaces 244 and communication devices 246) to the basic configuration 202 via a bus/interface controller 230. Example output devices 242 include a graphics processing unit 248 and an audio processing unit 250, which can be configured to communicate with various external devices such as a display or loudspeakers via one or more A/V ports 252. Example peripheral interfaces 244 can include a serial interface controller 254 and a parallel interface controller 256, which can be configured to communicate via one or more I/O ports 258 with external devices such as input devices (for example, keyboard, mouse, pen, voice input device, touch input device) or other peripherals (such as printers and scanners). An example communication device 246 can include a network controller 260, which can be arranged to facilitate communication with one or more other computing devices 262 over a network communication link via one or more communication ports 264.
A network communication link is one example of a communication medium. A communication medium is typically embodied as computer-readable instructions, data structures or program modules in a modulated data signal such as a carrier wave or other transport mechanism, and can include any information delivery medium. A "modulated data signal" is a signal in which one or more of its characteristics are set or changed in such a way as to encode information in the signal. As non-limiting examples, communication media can include wired media such as a wired network or a dedicated-line network, and various wireless media such as acoustic, radio-frequency (RF), microwave, infrared (IR) or other wireless media. The term computer-readable medium as used herein can include both storage media and communication media.
In the present invention, the applications 222 of the computing device 200 include a product recommendation apparatus 300. The apparatus 300 can reside in a browser of the computing device 200 as a search-engine plug-in, or be installed in the computing device 200 as independent software; the invention does not limit the form in which the apparatus 300 exists in the computing device 200. The apparatus 300 can obtain the characteristic information of a target user, where a target user is a user with an intention to purchase, and recommend products suited to the target user according to that characteristic information. The apparatus 300 can help both sales staff and consumers: it helps sales staff recommend products to customers in a more targeted way, and it helps consumers select the most suitable product.
Fig. 3 shows a structural diagram of the product recommendation apparatus 300 according to an embodiment of the invention. As shown in Fig. 3, the apparatus 300 includes an information obtaining module 310, a first recommending module 320 and a second recommending module 330, where the first recommending module 320 includes a first classifier 322 and the second recommending module 330 includes a second classifier 332.
The information obtaining module 310 is adapted to obtain the characteristic information of a target user. It can look up the target user's characteristic information in the feature records stored in the computing device 200. When the computing device has no feature record for the target user, a data entry interface can be presented to the target user to ask for the characteristic information and receive the values the target user enters. The specific attributes included in the characteristic information can be determined by the type of product sold on the sales platform. When the products sold are security products, according to one embodiment, the characteristic information includes but is not limited to the following attributes: user ID, user nature (whether the user is an individual or a legal entity and, if a legal entity, of which type, such as a partnership or a limited company), whether the user is in the same city as the sales company, whether security products have been used, which security products have been used, number of DoS attacks suffered, number of CC attacks, number of ARP (Address Resolution Protocol) attacks, number of DNS (Domain Name System) attacks, number of database attacks, number of times Trojans or viruses were implanted, number of domain hijackings, number of times content was tampered with, number of privilege attacks, number of attacks of other types, and so on. The target user's characteristic information forms a characteristic information vector.
After the information obtaining module 310 obtains the target user's characteristic information, it passes the characteristic information to the first recommending module 320. The first recommending module 320 determines the first recommended product from the target user's characteristic information using the first classifier 322. The first classifier 322 is trained with all consumption records stored in the computing device 200 as samples; its input is characteristic information and its output is a consumed-product category. After receiving the target user's characteristic information, the first recommending module 320 forms it into a characteristic information vector and uses that vector as the input of the first classifier 322. From this input the first classifier 322 outputs one consumed-product category, which is the first recommended product.
After the first recommending module 320 determines the first recommended product, the second recommending module 330 determines the second recommended product from the target user's characteristic information using the second classifier 332. The second classifier 332 is trained with the stored consumption records whose consumed product is not the first recommended product as samples; its input is characteristic information and its output is a consumed-product category. The second recommending module uses the target user's characteristic information vector as the input of the second classifier 332, which outputs one consumed-product category, the second recommended product. The second recommending module 330 then outputs the first recommended product and the second recommended product as the products recommended to the user.
With the first recommending module 320 and the second recommending module 330, the apparatus 300 can recommend two products to the user. Based on the above description, those skilled in the art will recognize that the apparatus 300 can also include a third recommending module, a fourth recommending module, ..., and an N-th recommending module (N being any positive integer), where the N-th classifier in the N-th recommending module is trained with the consumption records stored in the computing device 200 whose consumed product is none of the first through (N-1)-th recommended products as samples, so that the apparatus 300 can recommend any number of products to the user. How many products the apparatus 300 recommends can be set by the target user, sales staff or those skilled in the art according to the actual situation, and the invention does not limit this.
According to one embodiment, as shown in Fig. 4, the first classifier 322 and the second classifier 332 are trained by a classifier training module 340. It should be noted that the classifier training module 340 can train the first classifier 322 before the target user's characteristic information is obtained, or can train it after the target user's characteristic information is obtained; the invention does not limit when the classifier training module 340 trains the first classifier 322. Those skilled in the art will of course appreciate that training the first classifier in advance, before the target user's characteristic information is obtained, speeds up product recommendation by the apparatus 300 and thus gives a better user experience.
According to another embodiment, the classifier training module 340 can also be placed outside the apparatus 300, as shown by the dashed box in Fig. 4; its function and processing logic are the same as when it is placed inside the apparatus 300.
It should also be noted that the first classifier 322 and the second classifier 332 can be any model capable of classification, such as a random forest model, a classification tree model or a k-means clustering model; the invention does not limit the specific form of the first classifier 322 and the second classifier 332, and the two classifiers can take the same form or different forms. The classifier training module 340 can be configured with the training logic that corresponds to the specific form of the first classifier 322 and the second classifier 332. According to one embodiment, the first classifier 322 and the second classifier 332 are classification trees, and the classifier training module 340 is adapted to train the first classifier 322 and the second classifier 332 according to the following steps:
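The cascade of N recommending modules can be sketched in Python as follows. The `train` argument stands for any classifier training routine; `train_majority` is a deliberately trivial, hypothetical placeholder (it ignores the features and predicts the most frequent product) used only to keep the sketch self-contained, and is not the classification tree described in this document.

```python
from collections import Counter


def train_majority(samples):
    """Placeholder trainer: always predicts the most frequent product."""
    counts = Counter(product for _features, product in samples)
    top = counts.most_common(1)[0][0]
    return lambda features: top


def recommend(samples, target_features, n=2, train=train_majority):
    """Return up to n recommended products for the target user.

    samples -- list of (features, product) consumption records
    The k-th classifier is trained only on the records whose product
    is none of the products recommended by the previous classifiers.
    """
    recommended = []
    remaining = list(samples)
    for _ in range(n):
        if not remaining:          # no records left to train on
            break
        classifier = train(remaining)
        product = classifier(target_features)
        recommended.append(product)
        remaining = [(f, p) for f, p in remaining if p != product]
    return recommended
```

Because each stage removes the records of the product just recommended, the same product cannot be recommended twice, and the loop ends early once every product has been recommended.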
Initially, the classification tree has only one node, the root node, and the sample set contained in the root node consists of all samples that take part in training. Training the classification tree is a process of splitting nodes. When splitting, for each node: take the attribute with the largest GINI index gain (the decrease in the GINI index from before the split to after it) as the best split attribute, take the split condition with the smallest post-split GINI index as the best split condition, and split the node according to the best split attribute and the best split condition, producing two child nodes. When the preset termination condition is met, stop splitting nodes.
According to one embodiment, the GINI index before splitting is calculated according to the following formula:
$$\mathrm{GINI}(D) = 1 - \sum_{i=1}^{k} P_i^2 \qquad (1)$$
where $D$ is the sample set of the node, $k$ is the number of consumed-product categories in the sample set, and $P_i$ is the proportion of samples in $D$ whose consumed product is $i$.
The GINI index after splitting is calculated according to the following formula:
$$\mathrm{GINI}_{A,j}(D) = \frac{|D_1|}{|D|}\,\mathrm{GINI}(D_1) + \frac{|D_2|}{|D|}\,\mathrm{GINI}(D_2) \qquad (2)$$
where $A$ denotes the split attribute, $j$ denotes the split condition, $D_1$ and $D_2$ are the sample sets of the two child nodes obtained by splitting the node according to split attribute $A$ and split condition $j$, and $|D|$, $|D_1|$, $|D_2|$ are the numbers of samples in $D$, $D_1$, $D_2$.
The GINI index increment is calculated according to the following formula:
$$\Delta\mathrm{GINI}(A) = \mathrm{GINI}(D) - \mathrm{GINI}_A(D) \qquad (3)$$
where $A$ is the split attribute and $\mathrm{GINI}_A(D)$ is the minimum of $\mathrm{GINI}_{A,j}(D)$ over all split conditions $j$ of attribute $A$.
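Formulas (1) to (3) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `gini`, `gini_after_split`, and `gini_increment` are hypothetical, and a sample set is represented simply as a list of consumed-product labels.

```python
from collections import Counter


def gini(labels):
    """Formula (1): 1 minus the sum of squared category proportions in D."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumption: an empty sample set is treated as pure
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_after_split(left, right):
    """Formula (2): size-weighted GINI of the two child sample sets D1, D2."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def gini_increment(labels, candidate_splits):
    """Formula (3): GINI(D) minus the smallest post-split GINI of an attribute.

    candidate_splits is a list of (left_labels, right_labels) pairs, one pair
    per split condition j of the attribute.
    """
    best_after = min(gini_after_split(l, r) for l, r in candidate_splits)
    return gini(labels) - best_after
```

For a node holding three samples of product A and one of product B, `gini(["A", "A", "A", "B"])` evaluates to 0.375.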
According to one embodiment, the termination condition may be any one of the following: the consumed-product categories of the samples in the node are all the same; the depth of the tree has reached a preset depth threshold; the number of samples in the node is less than a preset first threshold; the difference between the square of the number of samples in the node and the sum of the squares of the numbers of samples in the two child nodes after splitting is less than a preset second threshold. Which termination condition to use, and the values of the first and second thresholds, can be set by those skilled in the art according to the actual situation; the present invention does not limit this.
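A minimal sketch of these termination checks follows. The text lets the implementer choose any one condition; this hypothetical `should_stop` helper checks all four for illustration, and the default threshold values are placeholders, not values from the source.

```python
def should_stop(labels, depth, child_sizes=None,
                max_depth=10, min_samples=2, min_square_gap=1):
    """Return True if any of the four termination conditions holds.

    labels: consumed-product labels of the samples in the node.
    child_sizes: optional (n1, n2) sample counts of a candidate split.
    All threshold defaults are assumed values chosen for illustration.
    """
    # Condition 1: all samples share one consumed-product category.
    if len(set(labels)) <= 1:
        return True
    # Condition 2: the tree depth has reached the preset depth threshold.
    if depth >= max_depth:
        return True
    # Condition 3: fewer samples than the preset first threshold.
    if len(labels) < min_samples:
        return True
    # Condition 4: n^2 - (n1^2 + n2^2) is below the preset second threshold.
    if child_sizes is not None:
        n1, n2 = child_sizes
        if len(labels) ** 2 - (n1 ** 2 + n2 ** 2) < min_square_gap:
            return True
    return False
```

Because n1 + n2 = n, the quantity in condition 4 equals 2·n1·n2, so it is small exactly when a split sends almost all samples to one child.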
For example, a node T contains the 4 samples shown in the following table:
The table above contains four samples 1 to 4; three attributes: whether a security product has been used, number of DoS attacks, and number of CC attacks; and two classes of consumed products: A and B.
The attribute "whether a security product has been used" has only two values, "yes" and "no", so the attribute itself can serve as the split condition. The attribute "number of DoS attacks" has four values; to separate these four values there are in general three ways of dividing them, i.e. three split conditions, which may for example be DOS≤1, DOS≤2, and DOS≤3. The attribute "number of CC attacks" has four values and likewise three split conditions, for example CC≤1, CC≤2, and CC≤3.
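According to formula (1), the GINI index of node T is:
$$\mathrm{GINI}(T) = 1 - \left(\tfrac{3}{4}\right)^2 - \left(\tfrac{1}{4}\right)^2 = 0.375$$
1) If node T is split on the attribute "whether a security product has been used", D1 = {sample 1, sample 3}, D2 = {sample 2, sample 4}; according to formula (2),
$$\mathrm{GINI}_{\text{security}}(T) = \tfrac{2}{4} \times 0 + \tfrac{2}{4} \times 0.5 = 0.25$$
2) If node T is split on the attribute "number of DoS attacks":
For split condition DOS≤1, D1 = {sample 1}, D2 = {sample 2, sample 3, sample 4}; according to formula (2),
$$\mathrm{GINI}_{\text{DOS},\,\text{DOS}\le 1}(T) = \tfrac{1}{4} \times 0 + \tfrac{3}{4} \times \tfrac{4}{9} = \tfrac{1}{3}$$
For split condition DOS≤2, D1 = {sample 1, sample 2}, D2 = {sample 3, sample 4}; according to formula (2),
$$\mathrm{GINI}_{\text{DOS},\,\text{DOS}\le 2}(T) = \tfrac{2}{4} \times 0.5 + \tfrac{2}{4} \times 0 = 0.25$$
For split condition DOS≤3, D1 = {sample 1, sample 2, sample 3}, D2 = {sample 4}; according to formula (2),
$$\mathrm{GINI}_{\text{DOS},\,\text{DOS}\le 3}(T) = \tfrac{3}{4} \times \tfrac{4}{9} + \tfrac{1}{4} \times 0 = \tfrac{1}{3}$$
$$\mathrm{GINI}_{\text{DOS}}(T) = \min\{\mathrm{GINI}_{\text{DOS},\,\text{DOS}\le 1}(T),\ \mathrm{GINI}_{\text{DOS},\,\text{DOS}\le 2}(T),\ \mathrm{GINI}_{\text{DOS},\,\text{DOS}\le 3}(T)\} = 0.25$$
3) If node T is split on the attribute "number of CC attacks":
For split condition CC≤1, D1 = {sample 4}, D2 = {sample 1, sample 2, sample 3}; according to formula (2),
$$\mathrm{GINI}_{\text{CC},\,\text{CC}\le 1}(T) = \tfrac{1}{4} \times 0 + \tfrac{3}{4} \times \tfrac{4}{9} = \tfrac{1}{3}$$
For split condition CC≤2, D1 = {sample 3, sample 4}, D2 = {sample 1, sample 2}; according to formula (2),
$$\mathrm{GINI}_{\text{CC},\,\text{CC}\le 2}(T) = \tfrac{2}{4} \times 0 + \tfrac{2}{4} \times 0.5 = 0.25$$
For split condition CC≤3, D1 = {sample 1, sample 3, sample 4}, D2 = {sample 2}; according to formula (2),
$$\mathrm{GINI}_{\text{CC},\,\text{CC}\le 3}(T) = \tfrac{3}{4} \times 0 + \tfrac{1}{4} \times 0 = 0$$
$$\mathrm{GINI}_{\text{CC}}(T) = \min\{\mathrm{GINI}_{\text{CC},\,\text{CC}\le 1}(T),\ \mathrm{GINI}_{\text{CC},\,\text{CC}\le 2}(T),\ \mathrm{GINI}_{\text{CC},\,\text{CC}\le 3}(T)\} = 0$$
According to formula (3), the GINI index increments of the three attributes are respectively:
$$\Delta\mathrm{GINI}(\text{security}) = 0.375 - 0.25 = 0.125,\quad \Delta\mathrm{GINI}(\text{DOS}) = 0.375 - 0.25 = 0.125,\quad \Delta\mathrm{GINI}(\text{CC}) = 0.375 - 0 = 0.375$$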
Because the GINI index increment of the attribute "number of CC attacks" is the largest, the number of CC attacks is taken as the best split attribute. The GINI index after splitting node T by split condition CC≤3 is the smallest, so CC≤3 is taken as the best split condition. After node T is split by split condition CC≤3, the two child nodes T1 and T2 shown in Fig. 5A are obtained. The consumed product of the samples in child node T1 is A and the consumed product of the samples in child node T2 is B; that is, within each of the two child nodes the consumed-product categories of the samples are all the same, so nodes T1 and T2 are not split further and are leaf nodes.
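The split search in this example can be sketched as follows. The table of node T is not reproduced in the text, so the sample values below are a hypothetical reconstruction chosen only to be consistent with the sample sets D1 and D2 listed for each split condition and with the products of the two child nodes; in particular, which samples answered "yes" for security-product use is an assumption. The names `SAMPLES`, `CONDITIONS`, `split_gini`, and `best_split` are illustrative.

```python
from collections import Counter

# Hypothetical reconstruction of node T (attribute values are assumed).
SAMPLES = [
    {"id": 1, "used_security": True,  "dos": 1, "cc": 3, "product": "A"},
    {"id": 2, "used_security": False, "dos": 2, "cc": 4, "product": "B"},
    {"id": 3, "used_security": True,  "dos": 3, "cc": 2, "product": "A"},
    {"id": 4, "used_security": False, "dos": 4, "cc": 1, "product": "A"},
]


def gini(rows):
    """GINI index of a sample set, formula (1)."""
    n = len(rows)
    if n == 0:
        return 0.0
    counts = Counter(r["product"] for r in rows)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_gini(rows, condition):
    """GINI index after splitting rows by a boolean condition, formula (2)."""
    left = [r for r in rows if condition(r)]
    right = [r for r in rows if not condition(r)]
    n = len(rows)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Candidate split conditions per attribute, as listed in the example.
CONDITIONS = {
    "used_security": {"used_security": lambda r: r["used_security"]},
    "dos": {f"DOS<={t}": (lambda r, t=t: r["dos"] <= t) for t in (1, 2, 3)},
    "cc": {f"CC<={t}": (lambda r, t=t: r["cc"] <= t) for t in (1, 2, 3)},
}


def best_split(rows):
    """Pick the attribute with the largest GINI increment, formula (3),
    together with its split condition of smallest post-split GINI."""
    base = gini(rows)
    best = None
    for attribute, conditions in CONDITIONS.items():
        name, after = min(
            ((n, split_gini(rows, c)) for n, c in conditions.items()),
            key=lambda pair: pair[1],
        )
        increment = base - after
        if best is None or increment > best[2]:
            best = (attribute, name, increment)
    return best


attribute, condition, increment = best_split(SAMPLES)
print(attribute, condition, round(increment, 3))  # prints: cc CC<=3 0.375
```

Under these assumed values the search selects the number of CC attacks with condition CC≤3, matching the conclusion of the example.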
According to one embodiment, to avoid over-fitting, after the classifier training module 340 has generated the first classifier 322 and the second classifier 332 by the foregoing method, it also needs to prune both of them. There are various pruning algorithms, such as error-based pruning and pessimistic pruning; the present invention does not limit the specific pruning algorithm used by the classifier training module 340. According to one embodiment, the classifier training module 340 is adapted to prune the first classifier 322 and the second classifier 332 in order from leaf nodes to the root node according to the following steps: for a node that has leaf nodes, calculate the first misjudgment rate of each leaf node of the node; calculate the second misjudgment rate of the node; if the second misjudgment rate is greater than at least one first misjudgment rate, take the consumed-product category of the leaf node with the smallest first misjudgment rate as the consumed-product category of the node, and cut off all leaf nodes of the node.
According to one embodiment, the first misjudgment rate and the second misjudgment rate are calculated as follows:
First misjudgment rate of leaf node f = $(E_f + a)/N_f$ (4)
Second misjudgment rate = $(E + 2a)/N$ (5)
where $E_f$ is the number of misclassified samples in leaf node f, $N_f$ is the number of samples in leaf node f, $E$ is the number of misclassified samples in the node, $N$ is the number of samples in the node, and $a$ is a penalty factor. According to one embodiment, $a = 0.5$.
For example, as shown in Fig. 5B, subtree T3 has three leaf nodes, each labeled with two numbers: the number on the left is the number of correctly classified samples in the leaf node, and the number on the right is the number of misclassified samples in the leaf node. Pruning proceeds from bottom to top, i.e. in order from leaf nodes to the root node. Therefore, it is first determined whether node T4 needs to be pruned.
Node T4 has two leaf nodes T6 and T7. According to formula (4), the first misjudgment rate of leaf node T6 is (2+0.5)/(3+2) = 0.5, and the first misjudgment rate of leaf node T7 is (0+0.5)/(4+0) = 0.125. According to formula (5), the second misjudgment rate of node T4 is (2+0+2×0.5)/(3+2+4+0) = 1/3. The second misjudgment rate is greater than the first misjudgment rate of leaf node T7, so node T4 needs to be pruned, and its category after pruning is the same as that of the former leaf node T7. After node T4 has been pruned, pruning continues with node T3 in the same way.
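The pruning decision for node T4 can be checked with a short sketch. The function names are hypothetical; the node-level error count E is taken as the sum of the leaf error counts, as in the worked example, and the pruning rule is the one stated in the text (prune when the second misjudgment rate exceeds at least one first misjudgment rate).

```python
def leaf_misjudgment_rate(errors, total, a=0.5):
    """Formula (4): first misjudgment rate of a leaf, (E_f + a) / N_f."""
    return (errors + a) / total


def node_misjudgment_rate(errors, total, a=0.5):
    """Formula (5): second misjudgment rate of a node, (E + 2a) / N."""
    return (errors + 2 * a) / total


def prune_decision(leaves, a=0.5):
    """leaves: list of (correct, wrong) sample counts, one pair per leaf.

    Returns (prune, first_rates, second_rate, index_of_best_leaf).
    """
    first_rates = [leaf_misjudgment_rate(w, c + w, a) for c, w in leaves]
    total = sum(c + w for c, w in leaves)
    errors = sum(w for _, w in leaves)
    second_rate = node_misjudgment_rate(errors, total, a)
    prune = any(second_rate > rate for rate in first_rates)
    # The node takes the category of the leaf with the smallest first rate.
    best_leaf = first_rates.index(min(first_rates))
    return prune, first_rates, second_rate, best_leaf


# Leaves T6 (3 correct, 2 wrong) and T7 (4 correct, 0 wrong) of node T4.
prune, first_rates, second_rate, best_leaf = prune_decision([(3, 2), (4, 0)])
print(prune, first_rates, best_leaf)  # prints: True [0.5, 0.125] 1
```

The second misjudgment rate evaluates to 3/9, which exceeds 0.125, so T4 is pruned and inherits the category of T7 (index 1 in the list).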
According to one embodiment, the classifier training module 340 may also use K-fold cross-validation (K-CV) to determine the confidence of the first classifier 322 or the second classifier 332. For example, in K-fold cross-validation, all samples participating in training the first classifier 322 are divided evenly into K parts; any (K-1) parts are used as the training set and the remaining 1 part as the test set. This yields K sub-classifiers. For each sub-classifier, the classification accuracy on the samples of its corresponding test part is taken as the classification accuracy of that sub-classifier. The average classification accuracy of the K sub-classifiers is taken as the confidence of the first classifier 322. It should be noted that K may be any positive integer greater than 1; the present invention does not limit the value of K. To make K-fold cross-validation convincing, K is generally set to a somewhat larger number, for example K = 10. The confidence of the second classifier 332 can be calculated in the same way. After the classifier training module 340 has determined the confidence of the first classifier 322 and the second classifier 332, the confidence of each recommended product can be output together with the first recommended product from the first recommending module 320 and the second recommended product from the second recommending module 330 (the confidence of the classifier used for a recommendation serves as the confidence of the product recommended by it), for reference by the target user or sales staff. In general, the higher the confidence, the better the recommended product suits the target user.
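The confidence computation can be sketched as below. This is a sketch under assumptions: samples are (features, product) pairs, `train` is any callable that returns a prediction function, the folds are formed by simple striding without shuffling, and `train_majority` is a hypothetical stand-in for the classification-tree training described earlier.

```python
from collections import Counter


def k_fold_confidence(samples, train, k=10):
    """Mean test accuracy of K sub-classifiers, used as classifier confidence.

    samples: list of (features, product) pairs.
    train: callable taking a list of samples and returning predict(features).
    """
    if k < 2 or len(samples) < k:
        raise ValueError("need K >= 2 and at least K samples")
    # Split into K parts of (nearly) equal size; striding is an assumption.
    folds = [samples[i::k] for i in range(k)]
    accuracies = []
    for i, test_fold in enumerate(folds):
        training = [s for j, fold in enumerate(folds) if j != i for s in fold]
        predict = train(training)
        correct = sum(1 for features, product in test_fold
                      if predict(features) == product)
        accuracies.append(correct / len(test_fold))
    return sum(accuracies) / k


def train_majority(training):
    """Stand-in trainer: always predicts the most common product."""
    product = Counter(p for _, p in training).most_common(1)[0][0]
    return lambda features: product


data = [((i,), "A") for i in range(8)] + [((i,), "B") for i in range(2)]
confidence = k_fold_confidence(data, train_majority, k=5)
```

Any trainer with the same call shape, for example a classification-tree trainer, could be passed in place of `train_majority`.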
Fig. 6 shows a flow chart of a product recommendation method 600 according to an embodiment of the invention; the method is adapted to be executed in the aforementioned device 300. As shown in Fig. 6, the method starts at step S610.
In step S610, the feature information of the target user is obtained. The specific attributes included in the feature information can be determined according to the type of product sold on the sales platform. According to one embodiment, when the products sold are security products, the feature information includes but is not limited to the following attributes: user ID, user nature, whether the user is in the same city as the sales company, whether a security product has been used, which security products have been used, number of DoS attacks suffered, number of CC attacks, number of ARP attacks, number of DNS attacks, number of database attacks, number of Trojan or virus implantations, number of domain hijackings, number of tamperings, number of privilege attacks, number of other types of attacks, and so on.
Then, in step S620, the first recommended product is determined using the first classifier according to the feature information of the target user. The first classifier is trained using all of the stored consumption records as samples. According to one embodiment, for the training method of the first classifier, refer to the foregoing description of the classifier training module 340, which is not repeated here.
Then, in step S630, the second recommended product is determined using the second classifier according to the feature information of the target user. The second classifier is trained using, as samples, the stored consumption records whose consumed product is not the first recommended product. According to one embodiment, for the training method of the second classifier, refer to the foregoing description of the classifier training module 340, which is not repeated here.
Then, in step S640, the first recommended product and the second recommended product are taken as the products recommended to the user.
According to one embodiment, step S640 may be followed by steps S650, S660 (not shown in Fig. 6), and so on. In step S650, the third recommended product is determined using a third classifier according to the feature information of the target user; the third classifier is trained using, as samples, the consumption records whose consumed product is neither the first recommended product nor the second recommended product. In step S660, the fourth recommended product is determined using a fourth classifier according to the feature information of the target user; the fourth classifier is trained using, as samples, the consumption records whose consumed product is not any of the first to third recommended products. By analogy, the method 600 may include multiple steps of determining an Nth recommended product using an Nth classifier, where the Nth classifier (N being any positive integer) is trained using, as samples, the consumption records whose consumed product is not any of the first to (N-1)th recommended products, so that the method 600 can recommend any number of products to the user. How many products the method 600 recommends can be set by the target user, sales staff, or those skilled in the art according to the actual situation; the present invention does not limit this.
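The cascade of classifiers in steps S620 to S660 can be sketched as a loop. This is an illustrative sketch only: consumption records are assumed to be (features, consumed_product) pairs, `recommend` is a hypothetical name, and `train_majority` is a trivial stand-in for whatever classifier training is actually used (for example the classification tree described above).

```python
from collections import Counter


def train_majority(records):
    """Stand-in trainer: always predicts the most common consumed product."""
    product = Counter(p for _, p in records).most_common(1)[0][0]
    return lambda features: product


def recommend(records, target_features, n, train=train_majority):
    """Return up to n recommended products for the target user.

    The Nth classifier is trained only on records whose consumed product
    is not among the first to (N-1)th recommended products.
    """
    recommended = []
    for _ in range(n):
        remaining = [(f, p) for f, p in records if p not in recommended]
        if not remaining:
            break  # no records left to train a further classifier
        classifier = train(remaining)
        recommended.append(classifier(target_features))
    return recommended


records = [({}, "A")] * 3 + [({}, "B")] * 2 + [({}, "C")]
print(recommend(records, {}, 2))  # prints: ['A', 'B']
```

Because each classifier never sees records of the products already recommended, no product can be recommended twice.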
A6: The method of A3, wherein, after node splitting stops, the method further comprises: pruning the first classifier in order from leaf nodes to the root node: for a node that has leaf nodes, calculating the first misjudgment rate of each leaf node of the node; calculating the second misjudgment rate of the node; and, if the second misjudgment rate is greater than at least one first misjudgment rate, taking the consumed-product category of the leaf node with the smallest first misjudgment rate as the consumed-product category of the node and cutting off all leaf nodes of the node.
A7: The method of A6, wherein the first misjudgment rate of leaf node f = $(E_f + a)/N_f$ and the second misjudgment rate = $(E + 2a)/N$, where $E_f$ is the number of misclassified samples in leaf node f, $N_f$ is the number of samples in leaf node f, $E$ is the number of misclassified samples in the node, $N$ is the number of samples in the node, and $a$ is a penalty factor.
A8: The method of A7, wherein $a = 0.5$.
B13: The device of B11, wherein the termination condition may be any one of the following: the consumed-product categories of the samples in the node are all the same; the depth of the tree has reached a preset depth threshold; the number of samples in the node is less than a preset first threshold; the difference between the square of the number of samples in the node and the sum of the squares of the numbers of samples in the two child nodes after splitting is less than a preset second threshold.
B14: The device of B11, wherein the classifier training module is further adapted to prune the first classifier and the second classifier in order from leaf nodes to the root node: for a node that has leaf nodes, calculating the first misjudgment rate of each leaf node of the node; calculating the second misjudgment rate of the node; and, if the second misjudgment rate is greater than at least one first misjudgment rate, taking the consumed-product category of the leaf node with the smallest first misjudgment rate as the consumed-product category of the node and cutting off all leaf nodes of the node.
B15: The device of B14, wherein the classifier training module is adapted to calculate the first misjudgment rate and the second misjudgment rate according to the following formulas: first misjudgment rate of leaf node f = $(E_f + a)/N_f$, second misjudgment rate = $(E + 2a)/N$, where $E_f$ is the number of misclassified samples in leaf node f, $N_f$ is the number of samples in leaf node f, $E$ is the number of misclassified samples in the node, $N$ is the number of samples in the node, and $a$ is a penalty factor.
B16: The device of B15, wherein $a = 0.5$.
In the description provided herein, the algorithms and displays are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the examples of the present invention. From the above description, the structure required to construct such systems is apparent. In addition, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein can be implemented in a variety of programming languages, and that the above description of a specific language is given to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be appreciated that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, in the above description of exemplary embodiments of the invention, features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof. However, this method of disclosure is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art should understand that the modules, units, or components of the device in the examples disclosed herein may be arranged in a device as described in the embodiment, or alternatively may be located in one or more devices different from the device in the example. The modules in the foregoing examples may be combined into one module or may furthermore be divided into multiple sub-modules.
Those skilled in the art will appreciate that the modules in the device of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of the embodiments may be combined into one module, unit, or component, and may furthermore be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
Furthermore, those skilled in the art will understand that although some embodiments described herein include some features included in other embodiments but not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
Furthermore, some of the embodiments are described herein as a method or a combination of method elements that can be implemented by a processor of a computer system or by other means of carrying out the function. Thus, a processor with the necessary instructions for carrying out such a method or method element forms a means for carrying out the method or method element. Furthermore, an element of an apparatus embodiment described herein is an example of a means for carrying out the function performed by the element for the purpose of carrying out the invention.
As used herein, unless otherwise specified, the use of the ordinal adjectives "first", "second", "third", etc. to describe a common object merely indicates that different instances of like objects are being referred to, and is not intended to imply that the objects so described must be in a given sequence, whether temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of the above description, will appreciate that other embodiments can be devised within the scope of the invention as described herein. Furthermore, it should be noted that the language used in this specification has been selected principally for readability and instructional purposes, and not to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. With respect to the scope of the invention, the disclosure made herein is illustrative and not restrictive, and the scope of the invention is defined by the appended claims.

Claims (10)

1. A product recommendation method, executed in a computing device, the computing device storing multiple consumption records of multiple users, each consumption record comprising the feature information and the consumed product of a user, the method comprising:
obtaining the feature information of a target user;
determining a first recommended product using a first classifier according to the feature information of the target user, wherein the first classifier is trained using all of the stored consumption records as samples;
determining a second recommended product using a second classifier according to the feature information of the target user, wherein the second classifier is trained using, as samples, the stored consumption records whose consumed product is not the first recommended product;
taking the first recommended product and the second recommended product as the products recommended to the target user.
2. The method of claim 1, wherein the feature information comprises one or more of the following attributes: user ID, user nature, whether the user is in the same city as the sales company, whether a security product has been used, which security products have been used, number of DoS attacks, number of CC attacks, number of ARP attacks, number of DNS attacks, number of database attacks, number of Trojan or virus implantations, number of domain hijackings, number of tamperings, number of privilege attacks, number of other types of attacks.
3. The method of claim 1, wherein the first classifier and the second classifier are classification trees, and the first classifier and the second classifier are trained according to the following steps:
for each node: taking the attribute with the largest GINI index increment before and after splitting as the best split attribute, taking the split condition with the smallest GINI index after splitting as the best split condition, and splitting the node according to the best split attribute and the best split condition to produce two child nodes;
when the set termination condition is met, stopping node splitting.
4. The method of claim 3, wherein the GINI index before splitting is calculated according to the following formula:
$$\mathrm{GINI}(D) = 1 - \sum_{i=1}^{k} P_i^2$$
where $D$ is the sample set of the node, $k$ is the number of consumed-product categories in the sample set, and $P_i$ is the proportion of samples in $D$ whose consumed product is $i$;
the GINI index after splitting is calculated according to the following formula:
$$\mathrm{GINI}_{A,j}(D) = \frac{|D_1|}{|D|}\,\mathrm{GINI}(D_1) + \frac{|D_2|}{|D|}\,\mathrm{GINI}(D_2)$$
where $A$ denotes the split attribute, $j$ denotes the split condition, $D_1$ and $D_2$ are the sample sets of the two child nodes obtained by splitting the node according to split attribute $A$ and split condition $j$, and $|D_1|$, $|D_2|$ are the numbers of samples in $D_1$, $D_2$;
the GINI index increment is calculated according to the following formula:
$$\Delta\mathrm{GINI}(A) = \mathrm{GINI}(D) - \mathrm{GINI}_A(D)$$
where $A$ is the split attribute and $\mathrm{GINI}_A(D)$ is the minimum of $\mathrm{GINI}_{A,j}(D)$.
5. The method of claim 3, wherein the termination condition may be any one of the following:
the consumed-product categories of the samples in the node are all the same;
the depth of the tree has reached a preset depth threshold;
the number of samples in the node is less than a preset first threshold;
the difference between the square of the number of samples in the node and the sum of the squares of the numbers of samples in the two child nodes after splitting is less than a preset second threshold.
6. A product recommendation device, residing in a computing device, the computing device storing multiple consumption records of multiple users, each consumption record comprising the feature information and the consumed product of a user, the device comprising:
an information obtaining module, adapted to obtain the feature information of a target user;
a first recommending module, adapted to determine a first recommended product using a first classifier according to the feature information of the target user, wherein the first classifier is trained using all of the stored consumption records as samples;
a second recommending module, adapted to determine a second recommended product using a second classifier according to the feature information of the target user, wherein the second classifier is trained using, as samples, the stored consumption records whose consumed product is not the first recommended product; and to take the first recommended product and the second recommended product as the products recommended to the user.
7. The device of claim 6, wherein the feature information comprises one or more of the following attributes: user ID, user nature, whether the user is in the same city as the sales company, whether a security product has been used, which security products have been used, number of DoS attacks, number of CC attacks, number of ARP attacks, number of DNS attacks, number of database attacks, number of Trojan or virus implantations, number of domain hijackings, number of tamperings, number of privilege attacks, number of other types of attacks.
8. The device of claim 6, wherein the first classifier and the second classifier are classification trees, and the device further comprises a classifier training module adapted to train the first classifier and the second classifier according to the following steps:
for each node: taking the attribute with the largest GINI index increment before and after splitting as the best split attribute, taking the split condition with the smallest GINI index after splitting as the best split condition, and splitting the node according to the best split attribute and the best split condition to produce two child nodes;
when the set termination condition is met, stopping node splitting.
9. The device of claim 8, wherein the classifier training module is further adapted to:
calculate the GINI index before splitting according to the following formula:
$$\mathrm{GINI}(D) = 1 - \sum_{i=1}^{k} P_i^2$$
where $D$ is the sample set of the node, $k$ is the number of consumed-product categories in the sample set, and $P_i$ is the proportion of samples in $D$ whose consumed product is $i$;
calculate the GINI index after splitting according to the following formula:
$$\mathrm{GINI}_{A,j}(D) = \frac{|D_1|}{|D|}\,\mathrm{GINI}(D_1) + \frac{|D_2|}{|D|}\,\mathrm{GINI}(D_2)$$
where $A$ denotes the split attribute, $j$ denotes the split condition, $D_1$ and $D_2$ are the sample sets of the two child nodes obtained by splitting the node according to split attribute $A$ and split condition $j$, and $|D_1|$, $|D_2|$ are the numbers of samples in $D_1$, $D_2$;
calculate the GINI index increment according to the following formula:
$$\Delta\mathrm{GINI}(A) = \mathrm{GINI}(D) - \mathrm{GINI}_A(D)$$
where $A$ is the split attribute and $\mathrm{GINI}_A(D)$ is the minimum of $\mathrm{GINI}_{A,j}(D)$.
10. A computing device, comprising the product recommendation device of any one of claims 6-9.
CN201611103409.1A 2016-12-05 2016-12-05 Product recommendation method and device and computing equipment Active CN106779929B (en)

Publications (2)

Publication Number Publication Date
CN106779929A true CN106779929A (en) 2017-05-31
CN106779929B CN106779929B (en) 2020-12-29



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 311501, Unit 1, Building 5, Courtyard 1, Futong East Street, Chaoyang District, Beijing 100102

Applicant after: Beijing Zhichuangyu Information Technology Co., Ltd.

Address before: 100097 Jinwei Building 803, 55 Lanindichang South Road, Haidian District, Beijing

Applicant before: Beijing Knows Chuangyu Information Technology Co.,Ltd.

GR01 Patent grant