CN109272165A - Registration probability estimation method, device, storage medium and electronic device

Publication number
CN109272165A
Authority
CN
China
Prior art keywords
data
user
user behavior
behavior data
prediction model
Legal status
Granted
Application number
CN201811156192.XA
Other languages
Chinese (zh)
Other versions
CN109272165B (en)
Inventor
沙韬伟
邓金秋
Current Assignee
Manbang Information Technology Co ltd
Original Assignee
Jiangsu Manyun Software Technology Co Ltd
Application filed by Jiangsu Manyun Software Technology Co Ltd
Priority to CN201811156192.XA
Publication of CN109272165A
Application granted
Publication of CN109272165B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3438 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment; monitoring of user actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Data Mining & Analysis (AREA)
  • Human Resources & Organizations (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • Operations Research (AREA)
  • Tourism & Hospitality (AREA)
  • Computer Hardware Design (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a registration probability estimation method, device, storage medium and electronic device. The registration probability estimation method includes: obtaining first user behavior data from a user operation log stream; inputting the first user behavior data into a trained first prediction model, and obtaining the data of multiple hidden layers of the first prediction model as second user behavior data; performing cross construction on at least part of the first user behavior data according to calculated importance values to obtain third user behavior data; and splicing the second user behavior data and the third user behavior data to obtain fourth user behavior data. The invention combines a recurrent neural network with traditional feature extraction, collects user behavior data in real time from the user operation log stream while ensuring fast result feedback, models user behavior while retaining the good extensibility of the algorithm framework, and can effectively predict the probability of user behaviors such as registration, purchase and click.

Description

Registration probability estimation method, device, storage medium and electronic device
Technical field
The present invention relates to the computer field, and in particular to a registration probability estimation method, device, storage medium and electronic device based on behavior information.
Background
In content aggregation apps such as information apps, for example vehicle and cargo matching platforms and shopping platforms, a user's preference for a certain category of commodities or goods can be estimated from a large amount of historical user behavior data using specific algorithms. For app registration, however, the period between a user's first and last login to the app is relatively short. How to reduce the model computation time for each user operation and increase the feedback frequency is therefore an important problem. Conventional models perform only moderately in this respect: they have difficulty accurately predicting a user's probability of registering with an app, and so cannot learn how much the user likes that app.
Summary of the invention
In view of the problems in the prior art, the object of the present invention is to provide a registration probability estimation method, device, storage medium and electronic device that effectively predict the probability of user behaviors such as registration, purchase and click.
According to one aspect of the present invention, a registration probability estimation method is provided, the method comprising:
obtaining first user behavior data from a user operation log stream;
inputting the first user behavior data into a trained first prediction model, and obtaining the data of multiple hidden layers of the first prediction model as second user behavior data;
performing cross construction on at least part of the first user behavior data according to calculated importance values to obtain third user behavior data;
splicing the second user behavior data and the third user behavior data to obtain fourth user behavior data;
inputting the fourth user behavior data into a second prediction model, and taking the output of the second prediction model as the estimated value of the user's registration probability.
In one embodiment of the present invention, the user operation log stream includes basic user information, user behavior information and information about the user's device.
In one embodiment of the present invention, the first prediction model is an RNN model that includes one input layer, multiple hidden layers and one output layer, each hidden layer being a GRU unit.
In one embodiment of the present invention, the second prediction model is a logistic regression model.
In one embodiment of the present invention, the first prediction model and the second prediction model are trained on sample data, the sample data including user behavior data and user registration status.
In one embodiment of the present invention, the step of performing cross construction on at least part of the first user behavior data according to calculated importance values to obtain third user behavior data further comprises:
dividing the first user behavior data into first feature data and second feature data according to the calculated importance values;
performing cross construction on the second feature data to form third feature data;
forming the third user behavior data from the first feature data and the third feature data.
In one embodiment of the present invention, the importance values of the first user behavior data are calculated by variance evaluation, so as to divide the first user behavior data into first feature data and second feature data.
In one embodiment of the present invention, the importance values of the first user behavior data are calculated by the xgboost algorithm, so as to divide the first user behavior data into first feature data and second feature data.
In one embodiment of the present invention, the importance values of the first user behavior data are calculated by cross entropy, so as to divide the first user behavior data into first feature data and second feature data.
According to another aspect of the present invention, a registration probability estimation device is provided, the device comprising:
an acquisition module for obtaining first user behavior data from a user operation log stream;
a first prediction model module for inputting the first user behavior data into a trained first prediction model, and obtaining the data of multiple hidden layers of the first prediction model as second user behavior data;
a data construction module for performing cross construction on at least part of the first user behavior data according to calculated importance values to obtain third user behavior data;
a data processing module for splicing the second user behavior data and the third user behavior data to obtain fourth user behavior data;
a second prediction model module for inputting the fourth user behavior data into a second prediction model, and taking the output of the second prediction model as the estimated value of the user's registration probability.
According to another aspect of the present invention, a storage medium is provided on which a computer program is stored, the computer program performing the steps described above when run by a processor.
According to another aspect of the present invention, an electronic device is provided. The electronic device includes a processor and a storage medium on which a computer program is stored, the computer program performing the steps described above when run by the processor.
The registration probability estimation method proposed by the invention combines a recurrent neural network with traditional feature extraction, collects user behavior data in real time from the user operation log stream while ensuring fast result feedback, models user behavior while retaining the good extensibility of the algorithm framework, and can effectively predict the probability of user behaviors such as registration, purchase and click.
Brief description of the drawings
Other features, objects and advantages of the present invention will become more apparent from the detailed description of non-limiting embodiments given with reference to the following drawings.
Fig. 1 is a flowchart of the registration probability estimation method in one embodiment of the invention.
Fig. 2 is a flowchart of the registration probability estimation method in another embodiment of the invention.
Fig. 3 is a structural diagram of the registration probability estimation device in one embodiment of the invention.
Fig. 4 is a structural diagram of the registration probability estimation device in another embodiment of the invention.
Fig. 5 is a structural diagram of a computer-readable storage medium in one embodiment of the invention.
Fig. 6 is a structural diagram of an electronic device in one embodiment of the invention.
Detailed description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments can, however, be implemented in a variety of forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures or characteristics may be combined in any suitable manner in one or more embodiments.
In addition, the drawings are merely schematic illustrations of the disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and repeated descriptions of them are omitted. Some of the block diagrams shown in the drawings are functional entities that do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
To overcome the deficiencies of the prior art, the present invention provides a registration probability estimation method, device, storage medium and electronic device that effectively predict the probability of user behaviors such as registration, purchase and click, where the registration probability reflects how much the user likes a given app. Figs. 1 and 2 show flowcharts of the method, Figs. 3 and 4 show the structure of the device, Fig. 5 shows a computer-readable storage medium, and Fig. 6 shows an electronic device, each in an embodiment of the invention.
According to one aspect of the present invention, a registration probability estimation method is provided. As shown in Fig. 1, the registration probability estimation method includes:
S110: obtaining first user behavior data from a user operation log stream.
Specifically, the user operation log stream records a large amount of raw feature data, which is usually aggregated from historical user behavior information, basic user information, user device information and the like. The first user behavior data is usually obtained by preprocessing this raw feature data, which may specifically include the user's device type, the user's number of browsing actions in the last seven days, the place from which the user usually logs in, and so on.
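The preprocessing in S110 is not spelled out in the text. The following is a minimal sketch of one way raw log events could be aggregated into the kinds of features named above. The event fields (`action`, `city`, `device`), the function name and the assumption that the events passed in already cover only the last seven days are all hypothetical and are not taken from the patent.

```python
from collections import Counter

def build_first_behavior_data(events, device_types):
    """Aggregate raw log events into a small feature record.

    `events` is assumed to hold only events from the last seven days;
    every field name used here is a hypothetical illustration.
    """
    # Number of browsing actions in the window.
    browse_count = sum(1 for e in events if e["action"] == "browse")
    # Most frequent login city, or None when there are no events.
    cities = Counter(e["city"] for e in events)
    usual_city = cities.most_common(1)[0][0] if cities else None
    # One-hot encoding of the device type seen in the latest event.
    device = events[-1]["device"] if events else None
    device_one_hot = [1 if device == d else 0 for d in device_types]
    return {
        "browse_count_7d": browse_count,
        "usual_city": usual_city,
        "device_one_hot": device_one_hot,
    }
```

A real pipeline would read these events from the log stream and apply whatever encoding the first prediction model was trained with.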
S120: inputting the first user behavior data into a trained first prediction model, and obtaining the data of multiple hidden layers of the first prediction model as second user behavior data.
Specifically, the first user behavior data has already been preprocessed at this point and can be input directly into the first prediction model. In one embodiment of the present invention, the first prediction model is an RNN model that includes one input layer, multiple hidden layers and one output layer, each hidden layer being a GRU unit. An RNN model is a recurrent neural network model; its principle is to add temporal features to a neural network. By adding feedback edges to the hidden layer, the input of each hidden layer includes not only the features of the current sample but also the information carried over from the previous time step. Each GRU unit contains two gates, a reset gate and an update gate. The outputs of both gates pass through a sigmoid function, so their values lie in [0, 1]. The candidate hidden state uses the reset gate to control the inflow of the previous hidden state, which contains past information. If the reset gate is approximately 0, the previous hidden state is discarded. The reset gate therefore provides a mechanism for discarding past hidden states that are irrelevant to the future; in other words, it determines how much past information is forgotten. The hidden state Ht uses the update gate Zt to combine the previous hidden state Ht-1 with the candidate hidden state. The update gate controls how important the past hidden state is at the current time step. If the update gate stays approximately 1, the past hidden state is preserved over time and passed on to the current time step. This design helps cope with the vanishing gradient problem in recurrent neural networks and better captures dependencies that span long intervals in time series data. The reset gate helps capture short-term dependencies in time series data, and the update gate helps capture long-term dependencies.
The parameters of the recurrent GRU network and of the LR model are updated offline according to the user operation data and user clickstream data stored offline in HDFS, together with the record of whether each user actually registered. HDFS, the Hadoop Distributed File System, is a distributed file system designed to run on commodity hardware.
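The gate behavior described above can be illustrated with a minimal scalar GRU step. This is a sketch under stated assumptions, not the patented model: the weights in `p` are hypothetical placeholders for trained parameters, the state is a single number rather than a vector, and returning the hidden state at every time step as the "second user behavior data" is only one possible reading of "the data of multiple hidden layers". It follows the convention in the text that an update gate near 1 keeps the past hidden state, that is, h_t = z * h_prev + (1 - z) * h_cand.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x_t, h_prev, p):
    """One step of a scalar GRU cell; p holds hypothetical weights."""
    # Reset gate: near 0 means the previous hidden state is dropped
    # when forming the candidate state.
    r = sigmoid(p["w_r"] * x_t + p["u_r"] * h_prev + p["b_r"])
    # Update gate: near 1 means the previous hidden state is kept.
    z = sigmoid(p["w_z"] * x_t + p["u_z"] * h_prev + p["b_z"])
    # Candidate hidden state built from the input and the gated past.
    h_cand = math.tanh(p["w_h"] * x_t + p["u_h"] * (r * h_prev) + p["b_h"])
    # Blend the past state and the candidate state with the update gate.
    return z * h_prev + (1.0 - z) * h_cand

def run_gru(sequence, p, h0=0.0):
    """Run the cell over a sequence and collect every hidden state."""
    states = []
    h = h0
    for x_t in sequence:
        h = gru_step(x_t, h, p)
        states.append(h)
    return states
```

With a large positive `b_z` the update gate saturates at 1 and the initial state is carried through unchanged, which is the long-term memory behavior the paragraph describes.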
S130: performing cross construction on at least part of the first user behavior data according to calculated importance values to obtain third user behavior data.
Because the first user behavior data contains many kinds of information, the importance of these kinds of information needs to be distinguished. Specifically, the importance values of the various types of data in the first user behavior data can be calculated and distinguished by methods such as variance evaluation, the xgboost algorithm and cross entropy.
S140: splicing the second user behavior data and the third user behavior data to obtain fourth user behavior data.
Specifically, if the second user behavior data is [1,0,1,0,0] and the third user behavior data is [0,0,0,1,1], then [1,0,1,0,0] and [0,0,0,1,1] are spliced to obtain the fourth user behavior data: [1,0,1,0,0,0,0,0,1,1]. The fourth user behavior data may of course also be computed from the second user behavior data and the third user behavior data in other ways; the present invention is not limited in this respect.
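The splicing in S140 amounts to concatenating the two vectors. A minimal sketch using the example values from the text follows; the function name `splice` is an illustrative choice.

```python
def splice(second, third):
    """Concatenate the second and third user behavior data."""
    return list(second) + list(third)

fourth = splice([1, 0, 1, 0, 0], [0, 0, 0, 1, 1])
print(fourth)  # [1, 0, 1, 0, 0, 0, 0, 0, 1, 1]
```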
S150: inputting the fourth user behavior data into a second prediction model, and taking the output of the second prediction model as the estimated value of the user's registration probability.
In one embodiment of the present invention, the second prediction model is a logistic regression model. The first prediction model and the second prediction model are trained on sample data, which includes user behavior data and user registration status. Logistic regression is a common classification model in machine learning, used mainly for binary classification problems. It maps the feature space to a probability: in a logistic regression model, y is a categorical variable taking values in {0, 1}, and the model is mainly used to study the probability that a given event occurs.
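As a minimal sketch of how a logistic regression model turns the spliced feature vector into a probability, the function below applies a sigmoid to a weighted sum. The weights and bias are hypothetical stand-ins for parameters that would come from training on the sample data; no training procedure is shown.

```python
import math

def predict_registration_probability(features, weights, bias):
    """Return P(y = 1 | features) under a logistic regression model."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    # Numerically stable sigmoid that avoids overflow for large |score|.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)
```

For example, with all weights and the bias equal to zero the score is 0 and the estimate is 0.5, meaning the model has no evidence either way.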
The registration probability estimation method proposed by the invention combines a recurrent neural network with traditional feature extraction, collects user behavior data in real time from the user operation log stream while ensuring fast result feedback, models user behavior while retaining the good extensibility of the algorithm framework, and can effectively predict the probability of user behaviors such as registration, purchase and click.
Because the first user behavior data contains many kinds of information, the importance of these kinds of information needs to be distinguished. Fig. 2 is a flowchart of the registration probability estimation method in another embodiment of the invention. As shown in Fig. 2, in this embodiment step S130 further comprises:
S1310: dividing the first user behavior data into first feature data and second feature data according to the calculated importance values.
S1320: performing cross construction on the second feature data, whose importance values meet a preset requirement, to form third feature data, while leaving unchanged the first feature data, whose importance values do not meet the preset requirement. For example, suppose there are two types of second feature data whose importance values meet the preset requirement: age (divided into two groups, over 20 and under 20) and gender (divided into two groups, male and female). Crossing these two types of second feature data yields four groups of third feature data: over 20 and male, over 20 and female, under 20 and male, and under 20 and female.
S1330: forming the third user behavior data from the first feature data and the third feature data. This avoids losing a large amount of user information.
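The age and gender example in S1320 and the assembly in S1330 can be sketched as follows. The group labels, the one-hot encoding of the crossed feature and the function names are illustrative assumptions; the patent does not prescribe an encoding.

```python
from itertools import product

def cross_features(groups_a, groups_b):
    """List every combination of two grouped features."""
    return [f"{a}&{b}" for a, b in product(groups_a, groups_b)]

def encode_cross(value_a, value_b, groups_a, groups_b):
    """One-hot encode a sample over the crossed feature groups."""
    combos = list(product(groups_a, groups_b))
    return [1 if (value_a, value_b) == c else 0 for c in combos]

def build_third_behavior_data(first_features, third_features):
    """S1330: keep the first feature data and append the crossed features."""
    return list(first_features) + list(third_features)

ages = ["age>20", "age<20"]
genders = ["male", "female"]
print(cross_features(ages, genders))
# ['age>20&male', 'age>20&female', 'age<20&male', 'age<20&female']
```

Crossing two features with two groups each gives four combined groups, matching the four groups of third feature data listed in the text.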
Furthermore, the importance values of the first user behavior data can be calculated by variance evaluation, so as to divide the first user behavior data into first feature data and second feature data.
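A minimal sketch of a variance-based split is shown below. It assumes that a feature's importance value is its population variance over the samples and that the "preset requirement" is a simple threshold; both are assumptions, since the text does not define them. Features at or above the threshold become second feature data (to be crossed), and the rest become first feature data.

```python
from statistics import pvariance

def split_by_variance(columns, threshold):
    """Split named feature columns by variance against a threshold.

    Returns (first_feature_data, second_feature_data); the threshold is
    a hypothetical stand-in for the preset requirement in the text.
    """
    first, second = {}, {}
    for name, values in columns.items():
        importance = pvariance(values)
        if importance >= threshold:
            second[name] = values
        else:
            first[name] = values
    return first, second
```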
Optionally, the importance values of the first user behavior data are calculated by the xgboost algorithm, so as to divide the first user behavior data into first feature data and second feature data. Xgboost applies a second-order Taylor expansion to the loss function and adds a regularization term to the objective function when seeking the overall optimal solution, so as to balance the decrease of the objective function against the complexity of the model and avoid overfitting. The present invention uses the importance computation (importance) in xgboost to calculate the importance values of the first user behavior data.
Optionally, the importance values of the first user behavior data are calculated by cross entropy, so as to divide the first user behavior data into first feature data and second feature data. Cross entropy can serve as a loss function in neural networks (machine learning). Suppose there are two probability distributions p and q over a sample set, where p is the distribution of the true labels and q is the distribution of the labels predicted by the trained model; the cross entropy loss function measures the similarity between p and q. Accordingly, the similarity between items of the first user behavior data is computed in order to perform a binary classification of the first user behavior data, and the classification result determines whether the importance value of each item of first user behavior data is the maximum or the minimum. Another advantage of cross entropy as a loss function is that, when a sigmoid function is used, it avoids the slowdown in learning that a mean squared error loss function suffers during gradient descent, because the learning rate is controlled by the output error. The sigmoid function is an S-shaped function common in biology, also called the S-shaped growth curve. In information science, because both it and its inverse are monotonically increasing, the sigmoid function is often used as the threshold function of a neural network, mapping a variable to a value between 0 and 1.
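The cross entropy between a true distribution p and a predicted distribution q, as described above, is H(p, q) = -sum_i p_i * log(q_i). A minimal sketch follows; the `eps` clamp is an added safeguard against log(0) and is not part of the definition. How the resulting values are mapped to importance values is not specified in the text and is not shown here.

```python
import math

def cross_entropy(p, q, eps=1e-12):
    """Cross entropy H(p, q) between two discrete distributions."""
    # Clamp q away from zero so that log is always defined.
    return -sum(pi * math.log(max(qi, eps)) for pi, qi in zip(p, q))
```

The value is smallest when q matches p, so a lower cross entropy indicates that the predicted distribution is closer to the true labels.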
According to another aspect of the present invention, a registration probability estimation device is provided. Fig. 3 is a structural diagram of the registration probability estimation device in one embodiment of the invention. As shown in Fig. 3, the registration probability estimation device 200 includes an acquisition module 201, a first prediction model module 202, a data construction module 203, a data processing module 204 and a second prediction model module 205. The acquisition module 201 obtains first user behavior data from a user operation log stream. The first prediction model module 202 inputs the first user behavior data into a trained first prediction model and obtains the data of multiple hidden layers of the first prediction model as second user behavior data. The data construction module 203 performs cross construction on at least part of the first user behavior data according to calculated importance values to obtain third user behavior data. The data processing module 204 splices the second user behavior data and the third user behavior data to obtain fourth user behavior data. The second prediction model module 205 inputs the fourth user behavior data into a second prediction model and takes the output of the second prediction model as the estimated value of the user's registration probability. The function of each module in the registration probability estimation device of this embodiment, and the specific steps and principles from the acquisition of the first user behavior data by the acquisition module 201 through to the estimation of the user's registration probability by the second prediction model module 205, have been explained in the embodiments above and are not repeated here. The invention combines a recurrent neural network with traditional feature extraction, collects user behavior data in real time from the user operation log stream while ensuring fast result feedback, models user behavior while retaining the good extensibility of the algorithm framework, and can effectively predict the probability of user behaviors such as registration, purchase and click.
Fig. 4 is a structural diagram of the registration probability estimation device in another embodiment of the invention. As shown in Fig. 4, the registration probability estimation device 200 likewise includes an acquisition module 201, a first prediction model module 202, a data construction module 203, a data processing module 204 and a second prediction model module 205. In addition, the data construction module 203 may further include a distinguishing module 2031, a cross construction module 2032 and a data integration module 2033. The acquisition module 201 obtains first user behavior data from a user operation log stream. The first prediction model module 202 inputs the first user behavior data into a trained first prediction model and obtains the data of multiple hidden layers of the first prediction model as second user behavior data. The data construction module 203 performs cross construction on at least part of the first user behavior data according to calculated importance values to obtain third user behavior data. The data processing module 204 splices the second user behavior data and the third user behavior data to obtain fourth user behavior data. The second prediction model module 205 inputs the fourth user behavior data into a second prediction model and takes the output of the second prediction model as the estimated value of the user's registration probability. The distinguishing module 2031 divides the first user behavior data into first feature data and second feature data according to the calculated importance values. The cross construction module 2032 performs cross construction on the second feature data, whose importance values meet the preset requirement, to form third feature data. The data integration module 2033 forms the third user behavior data from the first feature data and the third feature data. The invention combines a recurrent neural network with traditional feature extraction, collects user behavior data in real time from the user operation log stream while ensuring fast result feedback, models user behavior while retaining the good extensibility of the algorithm framework, and can effectively predict the probability of user behaviors such as registration, purchase and click.
In an exemplary embodiment of the present invention, a kind of computer readable storage medium is additionally provided, meter is stored thereon with Calculation machine program, the program may be implemented registration probability described in any one above-mentioned embodiment and estimate when being executed by such as processor The step of method.In some possible embodiments, various aspects of the invention are also implemented as a kind of program product Form comprising program code, when described program product is run on the terminal device, said program code is for making the end End equipment executes the step of the illustrative embodiments various according to the present invention of the above-mentioned registration probability predictor method description of this specification Suddenly.
Fig. 5 is the structural schematic diagram of computer readable storage medium in one embodiment of the invention.Fig. 5 is described according to this hair The program product 300 for realizing the above method of bright embodiment can use portable compact disc read only memory (CD-ROM) it and including program code, and can be run on terminal device, such as PC.However, program of the invention Product is without being limited thereto, and in this document, readable storage medium storing program for executing can be any tangible medium for including or store program, the program Execution system, device or device use or in connection can be commanded.
Described program product 300 can be using any combination of one or more readable mediums.Readable medium can be can Read signal medium or readable storage medium storing program for executing.Readable storage medium storing program for executing for example can be but be not limited to electricity, magnetic, optical, electromagnetic, infrared The system of line or semiconductor, device or device, or any above combination.The more specific example of readable storage medium storing program for executing is (non- The list of exhaustion) include: electrical connection with one or more conducting wires, portable disc, hard disk, random access memory (RAM), Read-only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, the read-only storage of portable compact disc Device (CD-ROM), light storage device, magnetic memory device or above-mentioned any appropriate combination.
The computer-readable storage medium may include a data signal propagated in baseband or as part of a carrier wave, in which readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. The readable medium may also be any readable medium other than a readable storage medium, and that readable medium can send, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device. The program code contained on the readable storage medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of the above.
The program code for performing the operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
The present invention combines feature extraction by a recurrent neural network with traditional features. It acquires user behavior data in real time from the user operation log stream and ensures fast result feedback. While keeping the algorithm framework readily extensible, it models user behavior and can effectively predict the probability of user behaviors such as registration, purchase, and click.
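The final stage of the method summarized above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the function names are hypothetical, cross construction is assumed to mean pairwise products, and the second prediction model is taken to be a logistic regression (as in claim 3) whose weights and bias are assumed to have been trained already.

```python
import math
from itertools import combinations


def cross_features(values):
    """Pairwise products: one simple, assumed form of cross construction."""
    return [a * b for a, b in combinations(values, 2)]


def build_fourth_data(second_data, first_features, second_features):
    """Splice hidden-layer data with the crossed traditional features.

    second_data     -- hidden-layer outputs of the first prediction model
    first_features  -- features kept unchanged after the importance split
    second_features -- features that are crossed with one another
    """
    # Third user behavior data: kept features plus the crossed features.
    third_data = list(first_features) + cross_features(second_features)
    # Fourth user behavior data: concatenation of second and third data.
    return list(second_data) + third_data


def estimate_probability(fourth_data, weights, bias):
    """Logistic regression output, read as the estimated registration probability."""
    if len(weights) != len(fourth_data):
        raise ValueError("weights must match the spliced feature vector")
    z = bias + sum(w * x for w, x in zip(weights, fourth_data))
    return 1.0 / (1.0 + math.exp(-z))
```

For example, build_fourth_data([0.5, -0.5], [1.0], [2.0, 3.0]) yields the four-element vector [0.5, -0.5, 1.0, 6.0], which estimate_probability then maps to a value between 0 and 1.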
In an exemplary embodiment of the present invention, an electronic device is also provided. The electronic device may include a processor and a memory for storing instructions executable by the processor. The processor is configured to perform, by executing the executable instructions, the steps of the registration probability estimation method described in any of the above embodiments.
Those of ordinary skill in the art will understand that aspects of the invention may be implemented as a system, a method, or a program product. Accordingly, aspects of the invention may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may collectively be referred to herein as a "circuit", "module", or "system".
The electronic device 400 according to this embodiment of the invention is described below with reference to Fig. 6. The electronic device 400 shown in Fig. 6 is only an example and should not impose any limitation on the functions or scope of use of embodiments of the invention.
As shown in Fig. 6, the electronic device 400 takes the form of a general-purpose computing device. The components of the electronic device 400 may include, but are not limited to: at least one processing unit 410, at least one storage unit 420, a bus 430 connecting the different system components (including the storage unit 420 and the processing unit 410), a display unit 440, and so on.
The storage unit stores program code that can be executed by the processing unit 410, so that the processing unit 410 performs the steps of the various exemplary embodiments of the invention described in the registration probability estimation method section of this specification. For example, the processing unit 410 may perform the steps shown in Fig. 1.
The storage unit 420 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM) 4201 and/or a cache memory unit 4202, and may further include a read-only memory unit (ROM) 4203.
The storage unit 420 may also include a program/utility 4204 having a set of (at least one) program modules 4205. Such program modules 4205 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data. Each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 430 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 400 may also communicate with one or more external devices 500 (such as a keyboard, a pointing device, or a Bluetooth device), with one or more devices that enable a user to interact with the electronic device 400, and/or with any device (such as a router or a modem) that enables the electronic device 400 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 450. The electronic device 400 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 460. The network adapter 460 may communicate with the other modules of the electronic device 400 through the bus 430. It should be understood that, although not shown in the drawings, other hardware and/or software modules may be used in conjunction with the electronic device 400, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
From the above description of the embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. Accordingly, the technical solution according to embodiments of the invention may be embodied in the form of a software product. The software product may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and includes several instructions that cause a computing device (which may be a personal computer, a server, a network device, etc.) to perform the registration probability estimation method according to embodiments of the invention.
The present invention combines feature extraction by a recurrent neural network with traditional features. It acquires user behavior data in real time from the user operation log stream and ensures fast result feedback. While keeping the algorithm framework readily extensible, it models user behavior and can effectively predict the probability of user behaviors such as registration, purchase, and click.
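As a hedged illustration of the recurrent feature extraction mentioned above, the following Python sketch runs a single GRU cell over a sequence of behavior vectors and collects the hidden state after every step. It is not taken from the patent: the parameter dictionary layout and function names are hypothetical, each unrolled step is treated as one hidden layer (one possible reading of claim 3), and the update convention h' = (1 - z) * h + z * candidate is one of the two common GRU formulations.

```python
import math


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def _affine(W, x, U, h, b):
    """Compute W @ x + U @ h + b using plain Python lists."""
    return [
        sum(w * xi for w, xi in zip(W[k], x))
        + sum(u * hi for u, hi in zip(U[k], h))
        + b[k]
        for k in range(len(b))
    ]


def gru_step(x, h, p):
    """One GRU update; p holds the (assumed already trained) parameters."""
    z = [_sigmoid(v) for v in _affine(p["Wz"], x, p["Uz"], h, p["bz"])]  # update gate
    r = [_sigmoid(v) for v in _affine(p["Wr"], x, p["Ur"], h, p["br"])]  # reset gate
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(v) for v in _affine(p["Wh"], x, p["Uh"], rh, p["bh"])]
    # Blend the previous state with the candidate state.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def hidden_states(sequence, p, hidden_size):
    """Run the GRU over a behavior sequence and keep every hidden state."""
    h = [0.0] * hidden_size
    states = []
    for x in sequence:
        h = gru_step(x, h, p)
        states.append(h)
    return states


def flatten(states):
    """Flatten the per-step hidden states into one feature vector."""
    return [value for state in states for value in state]
```

The flattened vector returned by flatten(hidden_states(...)) plays the role of the second user behavior data that is later spliced with the crossed traditional features.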
The above is a further detailed description of the present invention in conjunction with specific preferred embodiments, and the specific implementation of the invention should not be regarded as limited to these descriptions. Those of ordinary skill in the art to which the invention belongs may make a number of simple deductions or substitutions without departing from the inventive concept, and all of these shall be regarded as falling within the protection scope of the invention.

Claims (11)

1. A registration probability estimation method, characterized by comprising:
obtaining first user behavior data from a user operation log stream;
inputting the first user behavior data into a trained first prediction model, and obtaining the data of multiple hidden layers of the first prediction model as second user behavior data;
performing cross construction on at least part of the first user behavior data according to calculated importance values to obtain third user behavior data;
concatenating the second user behavior data and the third user behavior data to obtain fourth user behavior data;
inputting the fourth user behavior data into a second prediction model, and taking the output of the second prediction model as the estimated value of the user's registration probability.
2. The registration probability estimation method according to claim 1, characterized in that the user operation log stream includes basic user information, user behavior information, and the user's device information.
3. The registration probability estimation method according to claim 1, characterized in that the first prediction model is an RNN model, the RNN model comprising one input layer, multiple hidden layers, and one output layer, each hidden layer being a GRU unit; and the second prediction model is a logistic regression model.
4. The registration probability estimation method according to claim 1, characterized in that the first prediction model and the second prediction model are trained on sample data, the sample data comprising user behavior data and user registration status.
5. The registration probability estimation method according to claim 1, characterized in that the step of performing cross construction on at least part of the first user behavior data according to calculated importance values to obtain third user behavior data comprises:
dividing the first user behavior data into first feature data and second feature data according to the calculated importance values;
performing cross construction on the second feature data to form third feature data;
the first feature data and the third feature data constituting the third user behavior data.
6. The registration probability estimation method according to claim 5, characterized in that the importance values of the first user behavior data are calculated by variance evaluation, so as to divide the first user behavior data into first feature data and second feature data.
7. The registration probability estimation method according to claim 5, characterized in that the importance values of the first user behavior data are calculated by the xgboost algorithm, so as to divide the first user behavior data into first feature data and second feature data.
8. The registration probability estimation method according to claim 5, characterized in that the importance values of the first user behavior data are calculated by cross entropy, so as to divide the first user behavior data into first feature data and second feature data.
9. A registration probability estimation device, characterized by comprising:
an obtaining module, for obtaining first user behavior data from a user operation log stream;
a first prediction model module, for inputting the first user behavior data into a trained first prediction model, and obtaining the data of multiple hidden layers of the first prediction model as second user behavior data;
a data construction module, for performing cross construction on at least part of the first user behavior data according to calculated importance values to obtain third user behavior data;
a data processing module, for concatenating the second user behavior data and the third user behavior data to obtain fourth user behavior data;
a second prediction model module, for inputting the fourth user behavior data into a second prediction model, and taking the output of the second prediction model as the estimated value of the user's registration probability.
10. A storage medium, characterized in that a computer program is stored on the storage medium, and the computer program, when run by a processor, performs the steps of any one of claims 1 to 8.
11. An electronic device, characterized in that the electronic device comprises:
a processor;
a storage medium, on which a computer program is stored, the computer program performing the steps of any one of claims 1 to 8 when run by the processor.
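The importance-based split and cross construction recited in claims 5 and 6 can be illustrated with a short Python sketch. It rests on several assumptions that the claims do not fix: importance is taken to be the population variance of each feature column, features at or above a hypothetical threshold form the first feature data, the remaining features form the second feature data, and cross construction is taken to mean pairwise products. All function and feature names are illustrative only.

```python
from itertools import combinations
from statistics import pvariance


def split_by_variance(columns, threshold):
    """Split feature names into (first, second) groups by column variance.

    columns   -- dict mapping feature name to its observed sample values
    threshold -- assumed cut-off; which side is crossed is an assumption
    """
    first, second = [], []
    for name, values in columns.items():
        target = first if pvariance(values) >= threshold else second
        target.append(name)
    return first, second


def cross_construct(row, names):
    """Third feature data: pairwise products of the second feature data."""
    return {f"{a}*{b}": row[a] * row[b] for a, b in combinations(names, 2)}


def third_user_behavior_data(row, columns, threshold):
    """First feature data plus third feature data for a single user row."""
    first, second = split_by_variance(columns, threshold)
    result = {name: row[name] for name in first}
    result.update(cross_construct(row, second))
    return result
```

An importance score from xgboost (claim 7) or cross entropy (claim 8) could replace pvariance in split_by_variance without changing the rest of the sketch.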
CN201811156192.XA 2018-09-30 2018-09-30 Registration probability estimation method and device, storage medium and electronic equipment Active CN109272165B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811156192.XA CN109272165B (en) 2018-09-30 2018-09-30 Registration probability estimation method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811156192.XA CN109272165B (en) 2018-09-30 2018-09-30 Registration probability estimation method and device, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN109272165A true CN109272165A (en) 2019-01-25
CN109272165B CN109272165B (en) 2021-04-20

Family

ID=65195482

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811156192.XA Active CN109272165B (en) 2018-09-30 2018-09-30 Registration probability estimation method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN109272165B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110288096A (en) * 2019-06-28 2019-09-27 江苏满运软件科技有限公司 Prediction model training and prediction technique, device, electronic equipment, storage medium
CN110674188A (en) * 2019-09-27 2020-01-10 支付宝(杭州)信息技术有限公司 Feature extraction method, device and equipment
CN112950353A (en) * 2021-02-08 2021-06-11 北京淇瑀信息科技有限公司 User strategy generation method and device based on 7-day movement support model and electronic equipment

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080027890A1 (en) * 2006-07-31 2008-01-31 Microsoft Corporation Bayesian probability accuracy improvements for web traffic predictions
CN106407694A (en) * 2016-09-28 2017-02-15 湖南老码信息科技有限责任公司 Neurasthenia prediction method and prediction system based on incremental neural network model
CN106503805A (en) * 2016-11-14 2017-03-15 合肥工业大学 A kind of bimodal based on machine learning everybody talk with sentiment analysis system and method
CN107153887A (en) * 2017-04-14 2017-09-12 华南理工大学 A kind of mobile subscriber's behavior prediction method based on convolutional neural networks
CN107168945A (en) * 2017-04-13 2017-09-15 广东工业大学 A kind of bidirectional circulating neutral net fine granularity opinion mining method for merging multiple features
CN107180284A (en) * 2017-07-07 2017-09-19 北京航空航天大学 A kind of SPOC student based on learning behavior feature shows weekly Forecasting Methodology and device
CN107222787A (en) * 2017-06-02 2017-09-29 中国科学技术大学 Video resource popularity prediction method
CN107330445A (en) * 2017-05-31 2017-11-07 北京京东尚科信息技术有限公司 The Forecasting Methodology and device of user property
CN108090607A (en) * 2017-12-13 2018-05-29 中山大学 A kind of social media user's ascribed characteristics of population Forecasting Methodology based on the fusion of multi-model storehouse
CN108121795A (en) * 2017-12-20 2018-06-05 北京奇虎科技有限公司 User's behavior prediction method and device
CN108256757A (en) * 2018-01-10 2018-07-06 链家网(北京)科技有限公司 A kind of source of houses conclusion of the business predictor method based on xgboost and estimate platform

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080027890A1 (en) * 2006-07-31 2008-01-31 Microsoft Corporation Bayesian probability accuracy improvements for web traffic predictions
CN106407694A (en) * 2016-09-28 2017-02-15 湖南老码信息科技有限责任公司 Neurasthenia prediction method and prediction system based on incremental neural network model
CN106503805A (en) * 2016-11-14 2017-03-15 合肥工业大学 A kind of bimodal based on machine learning everybody talk with sentiment analysis system and method
CN107168945A (en) * 2017-04-13 2017-09-15 广东工业大学 A kind of bidirectional circulating neutral net fine granularity opinion mining method for merging multiple features
CN107153887A (en) * 2017-04-14 2017-09-12 华南理工大学 A kind of mobile subscriber's behavior prediction method based on convolutional neural networks
CN107330445A (en) * 2017-05-31 2017-11-07 北京京东尚科信息技术有限公司 The Forecasting Methodology and device of user property
CN107222787A (en) * 2017-06-02 2017-09-29 中国科学技术大学 Video resource popularity prediction method
CN107180284A (en) * 2017-07-07 2017-09-19 北京航空航天大学 A kind of SPOC student based on learning behavior feature shows weekly Forecasting Methodology and device
CN108090607A (en) * 2017-12-13 2018-05-29 中山大学 A kind of social media user's ascribed characteristics of population Forecasting Methodology based on the fusion of multi-model storehouse
CN108121795A (en) * 2017-12-20 2018-06-05 北京奇虎科技有限公司 User's behavior prediction method and device
CN108256757A (en) * 2018-01-10 2018-07-06 链家网(北京)科技有限公司 A kind of source of houses conclusion of the business predictor method based on xgboost and estimate platform

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
宋巍,等: "基于兴趣偏好的微博用户性别推断研究", 《电子学报》 *
谢敏敏: "利用数据挖掘技术提高电力客户的满意度", 《电力讯息》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110288096A (en) * 2019-06-28 2019-09-27 江苏满运软件科技有限公司 Prediction model training and prediction technique, device, electronic equipment, storage medium
CN110288096B (en) * 2019-06-28 2021-06-08 满帮信息咨询有限公司 Prediction model training method, prediction model training device, prediction model prediction method, prediction model prediction device, electronic equipment and storage medium
CN110674188A (en) * 2019-09-27 2020-01-10 支付宝(杭州)信息技术有限公司 Feature extraction method, device and equipment
CN112950353A (en) * 2021-02-08 2021-06-11 北京淇瑀信息科技有限公司 User strategy generation method and device based on 7-day movement support model and electronic equipment

Also Published As

Publication number Publication date
CN109272165B (en) 2021-04-20

Similar Documents

Publication Publication Date Title
CN110796190B (en) Exponential modeling with deep learning features
CN110832499B (en) Weak supervision action localization through sparse time pooling network
US11138376B2 (en) Techniques for information ranking and retrieval
CN108846077A (en) Semantic matching method, device, medium and the electronic equipment of question and answer text
CN109863488A (en) The device/server of Neural Network Data input system is disposed
CN110472675A (en) Image classification method, image classification device, storage medium and electronic equipment
US11902043B2 (en) Self-learning home system and framework for autonomous home operation
CN110059794A (en) Man-machine recognition methods and device, electronic equipment, storage medium
US20220114497A1 (en) Smart copy optimization in customer acquisition and customer management platforms
CN109272165A (en) Register probability predictor method, device, storage medium and electronic equipment
Xiao et al. Multi-sensor data fusion for sign language recognition based on dynamic Bayesian network and convolutional neural network
US11586838B2 (en) End-to-end fuzzy entity matching
CN109710760A (en) Clustering method, device, medium and the electronic equipment of short text
US20210081800A1 (en) Method, device and medium for diagnosing and optimizing data analysis system
CN113642740A (en) Model training method and device, electronic device and medium
US11727686B2 (en) Framework for few-shot temporal action localization
CN110348581B (en) User feature optimizing method, device, medium and electronic equipment in user feature group
CN114925938B (en) Electric energy meter running state prediction method and device based on self-adaptive SVM model
CN116208399A (en) Network malicious behavior detection method and device based on metagraph
CN114492465B (en) Dialogue generation model training method and device, dialogue generation method and electronic equipment
US20220027400A1 (en) Techniques for information ranking and retrieval
CN110020195A (en) Article recommended method and device, storage medium, electronic equipment
WO2024120504A1 (en) Data processing method and related device
EP4095785A1 (en) Classification and prediction of online user behavior using hmm and lstm
US20230419104A1 (en) High dimensional dense tensor representation for log data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210401

Address after: No.123, Kaifa Avenue, Guiyang Economic and Technological Development Zone, 550000, Guizhou Province

Applicant after: Man Bang Information Consulting Co.,Ltd.

Address before: 210012 3-5 / F, building 4, 170-1 software Avenue, Yuhuatai District, Nanjing City, Jiangsu Province

Applicant before: JIANGSU MANYUN SOFTWARE TECHNOLOGY Co.,Ltd.

GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: No.123, Kaifa Avenue, Guiyang Economic and Technological Development Zone, 550000, Guizhou Province

Patentee after: Manbang Information Technology Co.,Ltd.

Address before: No.123, Kaifa Avenue, Guiyang Economic and Technological Development Zone, 550000, Guizhou Province

Patentee before: Man Bang Information Consulting Co.,Ltd.
