CN110046981A - Credit evaluation method, device, and storage medium - Google Patents

Credit evaluation method, device, and storage medium

Info

Publication number
CN110046981A
Authority
CN
China
Prior art keywords
information
feature
space
user
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810036839.9A
Other languages
Chinese (zh)
Other versions
CN110046981B (en)
Inventor
叶方华
张宗一
凌国惠
郑子彬
温志远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201810036839.9A
Publication of CN110046981A
Application granted
Publication of CN110046981B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03: Credit; Loans; Processing thereof

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

An embodiment of the invention discloses a credit evaluation method, device, and storage medium. The embodiment obtains heterogeneous data related to a user and classifies the heterogeneous data to obtain multiple pieces of category information; obtains the vector information corresponding to each piece of category information and, according to the vector information, obtains the feature space corresponding to each piece of category information; maps the feature space corresponding to each piece of category information to a kernel space, obtaining multiple kernel spaces; performs multi-kernel linear combination on the multiple kernel spaces to obtain a composite kernel space; and obtains the credit evaluation result of the user according to the composite kernel space. Because the scheme evaluates the user's credit from the multiple pieces of category information into which the user's heterogeneous data is classified, it overcomes the defects of processing all of the user's information uniformly and of evaluating the user's credit from a feature matrix formed by directly concatenating all features, making the evaluation result more reliable.

Description

Credit evaluation method, device, and storage medium
Technical field
The present invention relates to the field of Internet technology, and in particular to a credit evaluation method, device, and storage medium.
Background
With rapid economic development and the growing scale of credit lending, credit has become a focus of public attention. As the risk of credit lending gradually increases, evaluating users' credit is of great significance for effectively identifying credit risk, avoiding adverse effects such as financial crises, keeping credit lending and financial markets operating normally, and even sustaining steady growth of the national economy.
At present, evaluating a user's credit usually involves mining different types of information about the user, such as transaction history, educational background, and assets, and then processing this information uniformly: all types of information are structured in a unified way and trained with the same model to extract features; the resulting features are then directly concatenated into a feature matrix; finally, the user's credit score is calculated from the feature matrix.
Because the user's information is processed uniformly, structuring all types of information in a unified way easily leads to problems such as missing and erroneous information. Moreover, training all types of information with the same model is likely to cause errors to accumulate or cancel out, so the credit score finally calculated from the directly concatenated feature matrix is very inaccurate.
Summary of the invention
Embodiments of the present invention provide a credit evaluation method, device, and storage medium, intended to improve the accuracy of credit evaluation.
To solve the above technical problem, embodiments of the present invention provide the following technical solutions:
A credit evaluation method, comprising:
Obtaining heterogeneous data related to a user, and classifying the heterogeneous data to obtain multiple pieces of category information;
Obtaining the vector information corresponding to each piece of category information among the multiple pieces of category information, and obtaining the feature space corresponding to each piece of category information according to the vector information;
Mapping the feature space corresponding to each piece of category information to a kernel space, obtaining multiple kernel spaces;
Performing multi-kernel linear combination on the multiple kernel spaces to obtain a composite kernel space;
Obtaining the credit evaluation result of the user according to the composite kernel space.
A credit evaluation device, comprising:
An information acquisition unit, configured to obtain heterogeneous data related to a user and classify the heterogeneous data to obtain multiple pieces of category information;
A feature acquisition unit, configured to obtain the vector information corresponding to each piece of category information among the multiple pieces of category information, and to obtain the feature space corresponding to each piece of category information according to the vector information;
A first mapping unit, configured to map the feature space corresponding to each piece of category information to a kernel space, obtaining multiple kernel spaces;
A synthesis unit, configured to perform multi-kernel linear combination on the multiple kernel spaces to obtain a composite kernel space;
An evaluation unit, configured to obtain the credit evaluation result of the user according to the composite kernel space.
Optionally, the feature acquisition unit includes:
An extraction subunit, configured to extract candidate category information from the multiple pieces of category information;
An acquisition subunit, configured to construct the directed weighted network corresponding to each piece of candidate category information and obtain the node vectors of the directed weighted network;
A first generation subunit, configured to generate the feature space corresponding to the category information according to the node vectors.
Optionally, the candidate category information includes transaction information, and the acquisition subunit includes:
A record acquisition module, configured to obtain the transfer records and mobile payment records of the transaction information;
A construction module, configured to construct the directed weighted network corresponding to the transfer records;
A node vector acquisition module, configured to obtain the node vectors of the directed weighted network;
A feature vector acquisition module, configured to obtain the feature vector of the mobile payment records;
The first generation subunit is specifically configured to generate the feature space corresponding to the transaction information according to the node vectors and the feature vector.
Optionally, the node vector acquisition module includes:
A first calculation submodule, configured to calculate the estimated connection probability and the empirical connection probability between every two nodes in the directed weighted network;
A second calculation submodule, configured to calculate the distribution difference between the estimated connection probability and the empirical connection probability to obtain a first objective function;
A third calculation submodule, configured to calculate the estimated context probability and the empirical context probability between every two nodes in the directed weighted network;
A fourth calculation submodule, configured to calculate the distribution difference between the estimated context probability and the empirical context probability to obtain a second objective function;
An acquisition submodule, configured to obtain the node vectors of the directed weighted network according to the first objective function and the second objective function.
Optionally, the acquisition submodule is specifically configured to:
Optimize the first objective function with a stochastic gradient descent algorithm to obtain the low-dimensional node vectors under the first objective function;
Optimize the second objective function with a stochastic gradient descent algorithm to obtain the low-dimensional node vectors under the second objective function;
Concatenate the low-dimensional node vectors under the first objective function with the low-dimensional node vectors under the second objective function to obtain the node vectors of the directed weighted network.
Optionally, the feature vector acquisition module includes:
A feature acquisition submodule, configured to obtain the multi-dimensional payment features of the mobile payment records;
An encoding submodule, configured to encode the multi-dimensional payment features to obtain payment encoding information;
A generation submodule, configured to generate the feature vector of the mobile payment records according to the payment encoding information.
Optionally, the encoding submodule is specifically configured to:
Convert the non-numeric payment features among the multi-dimensional payment features into numeric payment features;
Discretize the converted numeric payment features together with the numeric payment features already present among the multi-dimensional payment features to obtain the payment encoding information.
Optionally, the candidate category information includes behavior information, and the acquisition subunit is specifically configured to:
Obtain the multi-dimensional behavior features of the behavior information;
Construct the directed weighted network corresponding to each dimension of the behavior features;
Obtain the node vectors of each directed weighted network;
The first generation subunit is specifically configured to generate the feature space corresponding to the behavior information according to the node vectors of each directed weighted network.
Optionally, the category information includes attribute information, and the feature acquisition unit includes:
A feature acquisition subunit, configured to obtain the multi-dimensional attribute features of the attribute information among the multiple pieces of category information;
An encoding subunit, configured to encode the multi-dimensional attribute features to obtain attribute encoding information;
A second generation subunit, configured to generate the feature space corresponding to the attribute information according to the attribute encoding information.
Optionally, the encoding subunit is specifically configured to:
Convert the non-numeric attribute features among the multi-dimensional attribute features into numeric attribute features;
Generate the attribute encoding information according to the converted numeric attribute features and the numeric attribute features already present among the multi-dimensional attribute features.
Optionally, the synthesis unit is specifically configured to:
Normalize the multiple kernel spaces to obtain multiple normalized kernel spaces;
Obtain the weight corresponding to each piece of category information among the multiple pieces of category information;
Perform multi-kernel linear combination according to the weight corresponding to each piece of category information and each normalized kernel space to obtain the composite kernel space.
Optionally, the evaluation unit is specifically configured to:
Calculate the credit score of the user from the composite kernel space using a preset regression model.
Optionally, the credit evaluation device further includes:
An information set acquisition unit, configured to obtain a training sample set and divide the training sample set into multiple category information sets;
A second mapping unit, configured to map the multiple category information sets to kernel spaces;
An objective function generation unit, configured to generate an objective function according to the kernel spaces and a preset regression function;
A model generation unit, configured to process the objective function with a Lagrange duality algorithm to generate the regression model.
A storage medium storing a plurality of instructions, the instructions being suitable for loading by a processor to perform the steps of the above credit evaluation method.
Embodiments of the present invention can classify the heterogeneous data related to a user into multiple pieces of category information; obtain the feature space corresponding to each piece of category information; map the feature space corresponding to each piece of category information to a kernel space; and perform multi-kernel linear combination on the resulting kernel spaces to obtain a composite kernel space, from which the credit evaluation result of the user is obtained. Because the scheme evaluates the user's credit from the multiple pieces of category information into which the user's heterogeneous data is classified, it overcomes the defects of processing all of the user's information uniformly and of evaluating the user's credit from a feature matrix formed by directly concatenating all features, making the evaluation result more reliable.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a scenario of the credit evaluation system provided by an embodiment of the present invention;
Fig. 2 is a flow diagram of the credit evaluation method provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of a directed weighted network constructed from transfer records, provided by an embodiment of the present invention;
Fig. 4 is another flow diagram of the credit evaluation method provided by an embodiment of the present invention;
Fig. 5 is another flow diagram of the credit evaluation method provided by an embodiment of the present invention;
Fig. 6 is a structural diagram of the feature space of transaction information provided by an embodiment of the present invention;
Fig. 7 is a structural diagram of the feature space of behavior information provided by an embodiment of the present invention;
Fig. 8 is a structural diagram of the feature space of attribute information provided by an embodiment of the present invention;
Fig. 9 is a structural diagram of the credit evaluation device provided by an embodiment of the present invention;
Fig. 10 is another structural diagram of the credit evaluation device provided by an embodiment of the present invention;
Fig. 11 is another structural diagram of the credit evaluation device provided by an embodiment of the present invention;
Fig. 12 is another structural diagram of the credit evaluation device provided by an embodiment of the present invention;
Fig. 13 is a structural diagram of the server provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
In the following description, specific embodiments of the invention are described with reference to steps and symbols of operations performed by one or more computers, unless otherwise stated. These steps and operations, which are at times referred to as being computer-executed, include the manipulation by a computer processing unit of electronic signals that represent data in a structured form. This manipulation transforms the data or maintains it at locations in the computer's memory system, which reconfigures or otherwise alters the operation of the computer in a manner well understood by those skilled in the art. The data structures in which the data is maintained are physical locations of the memory that have particular properties defined by the data format. However, describing the principles of the invention in these terms does not imply any limitation; those skilled in the art will appreciate that the various steps and operations described below may also be implemented in hardware.
Embodiments of the present invention provide a credit evaluation method, device, and storage medium.
Referring to Fig. 1, Fig. 1 is a schematic diagram of a scenario of the credit evaluation system provided by an embodiment of the present invention. The credit evaluation system may include a credit evaluation device, which may be integrated in a server and is mainly used to obtain the heterogeneous data of a user. The server may receive the heterogeneous data reported by a terminal in real time or at preset intervals, and may also store the heterogeneous data in a database. The heterogeneous data may include information such as the user's transfer records, mobile payment records, like frequency, message frequency, call frequency, gender, age, and city of residence, as well as the transfer records, call frequency, and credit of the user's friends. The heterogeneous data is then classified to obtain multiple pieces of category information, for example category information A, category information B, and category information C. Next, the vector information corresponding to each piece of category information is obtained, and the feature space corresponding to each piece of category information is obtained according to the vector information, for example feature space A, feature space B, and feature space C. The feature space corresponding to each piece of category information is then mapped to a kernel space, obtaining multiple kernel spaces, for example kernel space A, kernel space B, and kernel space C. Multi-kernel linear combination is then performed on the multiple kernel spaces to obtain a composite kernel space. Finally, the credit evaluation result of the user, which may be a credit score, a credit rating, or the like, is obtained according to the composite kernel space.
In addition, the credit evaluation system may also include a terminal, such as a tablet computer, mobile phone, notebook computer, or desktop computer, that has a storage unit, is equipped with a microprocessor, and has computing capability. The terminal is mainly used to report the user's heterogeneous data to the server, and the server may store the received heterogeneous data in a database.
It should be noted that the scenario of the credit evaluation system shown in Fig. 1 is only an example. The credit evaluation system and scenario described in the embodiments of the present invention are intended to explain the technical solutions more clearly and do not limit them. Those of ordinary skill in the art will appreciate that, as credit evaluation systems evolve and new business scenarios emerge, the technical solutions provided by the embodiments of the present invention are equally applicable to similar technical problems.
Each is described in detail below.
This embodiment is described from the perspective of the credit evaluation device, which may be integrated in a network device such as a server or gateway.
A credit evaluation method, comprising: obtaining heterogeneous data related to a user, and classifying the heterogeneous data to obtain multiple pieces of category information; obtaining the vector information corresponding to each piece of category information, and obtaining the feature space corresponding to each piece of category information according to the vector information; mapping the feature space corresponding to each piece of category information to a kernel space, obtaining multiple kernel spaces; performing multi-kernel linear combination on the multiple kernel spaces to obtain a composite kernel space; and obtaining the credit evaluation result of the user according to the composite kernel space.
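The kernel mapping and linear combination steps of this method can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the polynomial kernel, the toy feature values, the category weights, and the function names `poly_kernel`, `normalize_kernel`, and `combine_kernels` are all hypothetical, and the regression step that turns the composite kernel into a credit score is omitted.

```python
import math

def poly_kernel(X, degree=2):
    """Polynomial kernel matrix (1 + x.y)^degree; the kernel choice is an assumption."""
    return [[(1.0 + sum(a * b for a, b in zip(x, y))) ** degree for y in X] for x in X]

def normalize_kernel(K):
    """Scale K so every diagonal entry equals 1: K[i][j] / sqrt(K[i][i] * K[j][j])."""
    n = len(K)
    return [[K[i][j] / math.sqrt(K[i][i] * K[j][j]) for j in range(n)] for i in range(n)]

def combine_kernels(kernels, weights):
    """Weighted sum of kernel matrices, one weight per category."""
    n = len(kernels[0])
    return [[sum(w * K[i][j] for K, w in zip(kernels, weights)) for j in range(n)]
            for i in range(n)]

# Toy feature spaces for three users (rows) in three categories; values are made up.
feature_spaces = {
    "transaction": [[0.2, 0.1], [0.9, 0.4], [0.3, 0.8]],
    "behavior": [[1.0], [0.5], [0.0]],
    "attribute": [[0.0, 1.0, 0.3], [1.0, 0.0, 0.6], [0.0, 1.0, 0.9]],
}
# Hypothetical per-category weights; the source does not give concrete values.
category_weights = {"transaction": 0.5, "behavior": 0.3, "attribute": 0.2}

names = list(feature_spaces)
kernels = [normalize_kernel(poly_kernel(feature_spaces[name])) for name in names]
composite = combine_kernels(kernels, [category_weights[name] for name in names])
```

Each category keeps its own kernel matrix until the final weighted sum, which is what distinguishes this approach from concatenating all features into a single matrix.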
Referring to Fig. 2, Fig. 2 is a flow diagram of the credit evaluation method provided by an embodiment of the present invention. The credit evaluation method may include:
In step S101, heterogeneous data related to a user is obtained and classified to obtain multiple pieces of category information.
The user may be an individual or an enterprise. When the user is an individual, the heterogeneous data may include the user's gender, account number, age, city of residence, comment frequency, message frequency, call frequency, transfer records, and mobile payment records, as well as the gender, call frequency, credit, age, and transfer records of the user's friends. When the user is an enterprise, the heterogeneous data may include the duration of operation, place of business, income records, transfer records, mobile payment records, and so on. The following description takes an individual user as an example.
Optionally, in one embodiment, the credit evaluation device may obtain the user's heterogeneous data from the Internet using web crawler technology; in another embodiment, the credit evaluation device may obtain the user's heterogeneous data through an application programming interface (API) opened by a social platform (for example, WeChat, Weibo, or QQ).
It should be noted that, to protect user privacy, embodiments of the present invention may apply processing such as transcoding and desensitization to the user's heterogeneous data. For example, the user's account number may be hashed to obtain a longer character string that represents the account number. The heterogeneous data used in the embodiments of the present invention is therefore information that has already undergone transcoding, desensitization, and similar processing, which protects user privacy.
After the user's heterogeneous data is obtained, it may be classified to obtain multiple pieces of category information. The categories may be derived from the attribute features, purpose, source, acquisition method, or platform of the heterogeneous data; the specific classification method is not limited here. For example, the user's heterogeneous data may be divided into several different pieces of category information such as transaction information, behavior information, and attribute information. Transaction information may include the user's transfer records and mobile payment records and the transfer records and mobile payment records of the user's friends. Behavior information may include the user's like frequency, comment frequency, message frequency, voice call frequency, and video call frequency, as well as the message frequency and voice call frequency of the user's friends. Attribute information may include the user's gender, place of origin, age, and city of residence, as well as the gender, city of residence, and age of the user's friends. It can be understood that one piece of category information is one type of heterogeneous data, and multiple pieces of category information correspond to multiple different types of heterogeneous data.
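The classification described above can be sketched as a simple field-to-category lookup. The source leaves the classification method open, so the field names, the mapping `FIELD_CATEGORY`, and the function `classify` below are hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping from field name to category; the source does not fix the scheme.
FIELD_CATEGORY = {
    "transfer_records": "transaction",
    "mobile_payment_records": "transaction",
    "like_frequency": "behavior",
    "message_frequency": "behavior",
    "voice_call_frequency": "behavior",
    "gender": "attribute",
    "age": "attribute",
    "city_of_residence": "attribute",
}

def classify(heterogeneous_data: dict) -> dict:
    """Group a user's raw fields into category information keyed by category name."""
    categories = defaultdict(dict)
    for field, value in heterogeneous_data.items():
        # Fields without a known category are kept aside rather than dropped.
        category = FIELD_CATEGORY.get(field, "unclassified")
        categories[category][field] = value
    return dict(categories)

user_data = {"age": 30, "like_frequency": 5, "transfer_records": [("u1", "u2", 100.0)]}
category_info = classify(user_data)
```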
It should be noted that missing heterogeneous data may be filled in using information from similar users. For example, when the gender of user A is missing, the similarity between users may be measured by Euclidean distance to determine the user most similar to user A, and the gender of that similar user is then used as the gender of user A.
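A minimal sketch of this completion step follows, assuming each user is already described by a numeric feature vector. The function name `impute_from_most_similar` and the toy data are hypothetical; the source does not say which features enter the distance computation.

```python
import math

def impute_from_most_similar(features, labels, target):
    """Return the label of the user closest to `target` by Euclidean distance.

    features: dict mapping user id -> numeric feature vector
    labels:   dict mapping user id -> known label, or None when missing
    Users whose own label is missing are skipped as donors.
    """
    best_user, best_distance = None, math.inf
    for user, vector in features.items():
        if user == target or labels.get(user) is None:
            continue
        distance = math.dist(features[target], vector)
        if distance < best_distance:
            best_user, best_distance = user, distance
    return None if best_user is None else labels[best_user]

features = {"A": [0.0, 0.0], "B": [0.1, 0.0], "C": [5.0, 5.0]}
genders = {"A": None, "B": "F", "C": "M"}
genders["A"] = impute_from_most_similar(features, genders, "A")
```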
In step S102, the vector information corresponding to each piece of category information among the multiple pieces of category information is obtained, and the feature space corresponding to each piece of category information is obtained according to the vector information.
After the user's heterogeneous data is divided into multiple pieces of category information, the credit evaluation device may process each piece of category information separately to obtain its corresponding feature space. A feature space may be a vector space composed of vector information derived from the category information, and the vector information may include feature vectors, node vectors, and so on.
Because different pieces of category information are heterogeneous, the feature space corresponding to each piece of category information may be obtained using a different algorithm.
In some embodiments, the step of obtaining the vector information corresponding to each piece of category information and obtaining the feature space corresponding to each piece of category information according to the vector information may include:
(1) extracting candidate category information from the multiple pieces of category information;
(2) constructing the directed weighted network corresponding to each piece of candidate category information, and obtaining the node vectors of the directed weighted network;
(3) generating the feature space corresponding to the category information according to the node vectors.
Specifically, to process the multiple pieces of category information more efficiently, the credit evaluation device first extracts candidate category information from them. The candidate category information may include one or more pieces of category information whose feature space can be calculated through a directed weighted network. For example, among category information A, B, C, D, and E, if the feature spaces of category information A, B, and C can be calculated by constructing directed weighted networks, then category information A, B, and C may be extracted as the candidate category information.
The credit evaluation device then constructs the directed weighted network corresponding to each piece of candidate category information. For example, the directed weighted network G_A corresponding to category information A is G_A = (V_A, E_A, W_A), where V_A is the set of nodes of G_A, each node representing a user; E_A is the set of edges of G_A, each edge representing the category information A that exists between two users; and W_A represents the weights of the edges in E_A.
Next, after obtaining the directed weighted network corresponding to each piece of candidate category information, the credit evaluation device may use the Large-scale Information Network Embedding (LINE) algorithm to calculate the node vectors of the directed weighted network based on first-order proximity and second-order proximity, that is, to represent the nodes of the directed weighted network as low-dimensional vectors. In a directed weighted network, two nodes that are connected are highly similar (high first-order similarity), and two nodes that are not connected but share many common neighbors are also relatively similar (relatively high second-order similarity). The LINE algorithm learns both kinds of similarity well, and therefore preserves the information contained in the original directed weighted network.
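To make the first-order proximity concrete, the sketch below evaluates the first-order objective as given in the published LINE algorithm: the joint probability of an edge is the sigmoid of the dot product of its two node vectors, and the loss is the negative weighted log-likelihood over observed edges. Applying it to directed edges, the toy data, and the function name `first_order_objective` are assumptions; the source does not spell out its objective functions in this passage, and the training procedure is not shown.

```python
import math

def first_order_objective(embeddings, weights):
    """O1 = -sum over edges (i, j) of w_ij * log(sigmoid(u_i . u_j)).

    embeddings: dict mapping node -> low-dimensional vector
    weights:    dict mapping (source, target) -> edge weight
    A lower value means connected nodes have more similar vectors.
    """
    total = 0.0
    for (i, j), w in weights.items():
        dot = sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
        probability = 1.0 / (1.0 + math.exp(-dot))  # estimated connection probability
        total -= w * math.log(probability)
    return total

edge_weights = {("a", "b"): 2.0}
loss_untrained = first_order_objective({"a": [0.0, 0.0], "b": [0.0, 0.0]}, edge_weights)
loss_aligned = first_order_objective({"a": [1.0, 1.0], "b": [1.0, 1.0]}, edge_weights)
```

In this toy case the aligned vectors give a smaller loss than the all-zero vectors, which is the direction an optimizer would move the node vectors of connected users.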
Finally, credit evaluation device can be raw according to the node vector of the corresponding oriented cum rights network of candidate categories information At the corresponding feature space of candidate categories information, for example, can directly set node vector to the spy of candidate categories information Space is levied, the processing such as can also optimize or screen to node vector, treated that node vector is set as candidate categories by general The feature space of information.
It should be noted that for a candidate categories information one or more oriented cum rights networks can be constructed, when one When a candidate categories information structuring goes out multiple oriented cum rights networks, can calculate separately the node of each oriented cum rights network to Amount, obtains multiple node vectors, using multiple node vectors as the corresponding node vector of a candidate categories information, Ke Yigen The corresponding feature space of candidate categories information is generated according to the node vector.
In some embodiments, the candidate category information may include transaction information, and the step of constructing the directed weighted network corresponding to each candidate category information, obtaining the node vectors of the directed weighted network, and generating the feature space corresponding to the category information from the node vectors may include:
(a) obtaining the transfer records and mobile payment records of the transaction information;
(b) constructing the directed weighted network corresponding to the transfer records;
(c) obtaining the node vectors of the directed weighted network;
(d) obtaining the feature vector of the mobile payment records;
(e) generating the feature space corresponding to the transaction information from the node vectors and the feature vector.
Take transaction information as the candidate category information. Transaction information may include transfer records, mobile payment records, and so on. The credit evaluation device can extract the transfer records and mobile payment records from the transaction information; there may be one or more transfer records and one or more mobile payment records.
It should be noted that, to protect user privacy, category information such as transaction information can be transcoded and desensitized in embodiments of the present invention. The category information analyzed in the embodiments is therefore information that has already undergone transcoding, desensitization, and similar processing.
Then, the credit evaluation device constructs the directed weighted network G corresponding to the transfer records as G = (V, E, W), where V denotes the nodes of G, each node representing a user; E denotes the edges of G, each edge representing a transfer record between two users; and W denotes the edge weights, each weight representing a transfer amount.
For example, as shown in Fig. 3, suppose the directed weighted network G built from the transfer records contains users u1, u2, u3, u4, u5, and u6. G is of course not limited to 6 users; this number is used only for ease of description and should not be understood as a limit on the number of users. However many users there are, G is constructed in the same way and can be understood from this example. In Fig. 3, the tail of each arrow marks the user who sends the amount, and the head of the arrow marks the user who receives it. The labels a, b, c, d, e, f, g, and h on the edges between users denote the transferred amounts. From Fig. 3 it can be seen that user u1 transferred amount a to user u2, user u5 transferred amount e to user u1, user u3 transferred amount f to user u5, and so on.
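The construction of G = (V, E, W) described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `build_transfer_network`, the numeric amounts standing in for a, e, and f, and the choice to sum repeated transfers in the same direction into one edge weight are all assumptions.

```python
from collections import defaultdict


def build_transfer_network(transfers):
    """Build a directed weighted network from (payer, payee, amount) records.

    Returns the sorted node list V and a dict mapping each directed
    edge (payer, payee) to its weight. Repeated transfers along the same
    direction are summed (an assumption made for this sketch).
    """
    weights = defaultdict(float)
    nodes = set()
    for payer, payee, amount in transfers:
        weights[(payer, payee)] += amount
        nodes.update((payer, payee))
    return sorted(nodes), dict(weights)


# Hypothetical amounts standing in for a, e and f in Fig. 3.
records = [
    ("u1", "u2", 100.0),  # u1 transfers amount a to u2
    ("u5", "u1", 50.0),   # u5 transfers amount e to u1
    ("u3", "u5", 30.0),   # u3 transfers amount f to u5
    ("u1", "u2", 20.0),   # a second transfer along the same edge
]
V, W = build_transfer_network(records)
```

Because the edges are keyed by the ordered pair (payer, payee), a transfer from u1 to u2 does not create an edge from u2 to u1, which matches the arrow direction in Fig. 3.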
Secondly, the credit evaluation device obtains the node vectors of the directed weighted network constructed from the transfer records. Here the LINE algorithm, based on first-order proximity (First-order) and second-order proximity (Second-order), can be used to represent each network node as a low-dimensional vector, yielding the node vectors.
It should be noted that the similarity between two nodes reflects the similarity of the users' transfer records. If credit ratings are given for an initial set of sample users (that is, after the heterogeneous data of the users is obtained, part of the information is randomly selected as sample information, and the credit levels of the corresponding users are labeled manually), the credit ratings of other users can be estimated from the similarity between their nodes and those of the sample users. After network embedding with the LINE algorithm, every node in the directed weighted network G is represented by a low-dimensional vector, so the similarity between users can be measured with the Euclidean distance.
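As a rough illustration of measuring similarity to labeled sample users with the Euclidean distance, the sketch below assigns an unlabeled user the credit level of the closest sample user. The nearest-neighbor rule, the function name `nearest_labeled`, and the example vectors and levels are assumptions made for illustration; the text only states that Euclidean distance can measure similarity.

```python
import math


def nearest_labeled(user_vec, labeled):
    """Return the credit level of the sample user closest to user_vec.

    labeled maps a sample user id to (embedding vector, credit level).
    Closeness is the Euclidean distance between embedding vectors.
    """
    closest = min(labeled.values(), key=lambda item: math.dist(user_vec, item[0]))
    return closest[1]


# Hypothetical low-dimensional node vectors and manual labels.
samples = {
    "s1": ([0.0, 0.0], "good"),
    "s2": ([5.0, 5.0], "poor"),
}
level = nearest_labeled([1.0, 0.0], samples)
```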
In addition, mobile payment records are the records generated when a user pays online, for example through a social platform (such as WeChat) or an online payment platform (such as Alipay or a shopping website). A mobile payment record may include the user identifier, the type of goods paid for, the payment amount, the shop paid, the timestamp of the transaction, and so on.
After obtaining the mobile payment records, the credit evaluation device can obtain their feature vector. For example, it can first extract the payment features from the mobile payment records, which may include the user identifier, the type of goods paid for, the payment amount, the shop paid, the transaction timestamp, and so on, then convert these payment features to numerical values and generate the feature vector from the numerical features.
Finally, the credit evaluation device can generate the feature space corresponding to the transaction information from the node vectors of the transfer records and the feature vector of the mobile payment records. For example, the node vectors of the transfer records and the feature vector of the mobile payment records can be concatenated to obtain the feature space corresponding to the transaction information.
Optionally, the step of obtaining the node vectors of the directed weighted network may include:
Calculating the estimated connection probability and the empirical connection probability between every two nodes in the directed weighted network;
Calculating the distribution difference between the estimated connection probability and the empirical connection probability to obtain a first objective function;
Calculating the estimated context probability and the empirical context probability between every two nodes in the directed weighted network;
Calculating the distribution difference between the estimated context probability and the empirical context probability to obtain a second objective function;
Obtaining the node vectors of the directed weighted network from the first objective function and the second objective function.
Specifically, the credit evaluation device first applies the first-order (First-order) approximation of the LINE algorithm to the directed weighted network corresponding to the transfer records:
The estimated connection probability between every two nodes in the directed weighted network is calculated. This estimated connection probability is defined in the low-dimensional space and can be expressed as formula (1):

$$p_1(v_i, v_j) = \frac{1}{1 + \exp\left(-u_i^{T} u_j\right)} \tag{1}$$
where $p_1(v_i, v_j)$ denotes the estimated connection probability between node $v_i$ and node $v_j$; $v_i$ and $v_j$ are two nodes in the directed weighted network joined by an edge, that is, the edge $(v_i, v_j)$; $u_i$ is the low-dimensional vector representation of node $v_i$ obtained by the LINE algorithm; $u_j$ is the low-dimensional vector representation of node $v_j$ obtained by the LINE algorithm; $T$ denotes the vector transpose; and exp denotes the exponential function with the natural constant e as its base.
The empirical connection probability expresses the probability that two nodes in the directed weighted network are connected to each other, and is defined in the high-dimensional (original) space. The empirical connection probability between every two nodes in the directed weighted network can be calculated as formula (2):

$$\hat{p}_1(v_i, v_j) = \frac{w_{i,j}}{W} \tag{2}$$

where $\hat{p}_1(v_i, v_j)$ denotes the empirical connection probability, $w_{i,j}$ is the weight of the edge between node $v_i$ and node $v_j$ in the directed weighted network, and $W$ is the sum of all edge weights in the directed weighted network, that is, $W = \sum_{(v_i, v_j) \in E} w_{i,j}$.
Then, the distribution difference between the estimated connection probability and the empirical connection probability is calculated to obtain the first objective function, as in formula (3):

$$O_1 = d\big(\hat{p}_1(\cdot,\cdot),\; p_1(\cdot,\cdot)\big) \tag{3}$$

where $O_1$ denotes the first objective function, $\hat{p}_1(\cdot,\cdot)$ denotes the empirical connection probability, $p_1(\cdot,\cdot)$ denotes the estimated connection probability, and $d(\cdot,\cdot)$ denotes the KL divergence (Kullback-Leibler divergence) between the two distributions. KL divergence is also known as relative entropy; for two discrete distributions $P$ and $Q$ it is defined as formula (4):

$$D_{KL}(P \,\|\, Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)} \tag{4}$$
It should be noted that, once the estimated and empirical connection probabilities are available, one requirement of network embedding is that the embedded space should retain as much of the information of the original directed weighted network as possible, so the distribution of distances between nodes should also be preserved. If two nodes are connected in the original directed weighted network, the distance between their vectors after embedding should therefore also be small. The classical KL divergence can be used here to characterize the difference between the two distributions.
The empirical connection probability in the high-dimensional space represents the original connection information of the directed weighted network (its adjacency matrix), while the estimated connection probability in the low-dimensional space is defined on the vector space obtained after the nodes of the directed weighted network are embedded as vectors. Minimizing the first objective function minimizes the distribution difference between the estimated connection probability in the low-dimensional space and the empirical connection probability in the high-dimensional space. The first objective function O1 characterizes first-order similarity: two nodes that are connected in the original directed weighted network should have vector representations that are close in the low-dimensional space.
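The first-order objective can be sketched numerically as below. This is a minimal sketch under stated assumptions: it uses the sigmoid form of the standard LINE formulation for the estimated connection probability, the function name `first_order_objective` and the toy embeddings are hypothetical, and because the sigmoid values are not normalized over all edges, the result is a KL-style score rather than a true divergence.

```python
import math


def first_order_objective(edges, emb):
    """KL-style difference between empirical and estimated edge probabilities.

    edges maps (i, j) to the edge weight w_ij; emb maps each node to its
    low-dimensional vector (a list of floats).
    """
    total_weight = sum(edges.values())                 # W, the sum of all edge weights
    loss = 0.0
    for (i, j), w in edges.items():
        inner = sum(a * b for a, b in zip(emb[i], emb[j]))
        p_est = 1.0 / (1.0 + math.exp(-inner))         # estimated connection probability
        p_emp = w / total_weight                       # empirical connection probability
        loss += p_emp * math.log(p_emp / p_est)        # one term of the KL sum
    return loss


edges = {("u1", "u2"): 1.0}
far = first_order_objective(edges, {"u1": [0.0, 0.0], "u2": [0.0, 0.0]})
near = first_order_objective(edges, {"u1": [2.0, 0.0], "u2": [2.0, 0.0]})
```

In this toy case the score drops as the two connected nodes receive more strongly aligned vectors, which is the behavior the first-order similarity is meant to reward.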
Further, the credit evaluation device applies the second-order (Second-order) approximation of the LINE algorithm to the directed weighted network corresponding to the transfer records:
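The estimated context probability between every two nodes in the directed weighted network is calculated, as in formula (5):

$$p_2(v_j \mid v_i) = \frac{\exp\left(u_j'^{\,T} u_i\right)}{\sum_{k=1}^{|V|} \exp\left(u_k'^{\,T} u_i\right)} \tag{5}$$

where $p_2(v_j \mid v_i)$ denotes the estimated probability that node $v_j$ appears as a context of node $v_i$, $|V|$ denotes the number of nodes in the directed weighted network, and $T$ denotes the vector transpose. Following the standard LINE formulation, $u_k'$ denotes the vector of node $v_k$ when it acts as a context.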
The empirical context probability between every two nodes in the directed weighted network is calculated, as in formula (6):

$$\hat{p}_2(v_j \mid v_i) = \frac{w_{i,j}}{d_i} \tag{6}$$

where $w_{i,j}$ is the weight of the edge $(v_i, v_j)$ in the directed weighted network and $d_i$ is the out-degree of node $v_i$. In a directed weighted network, the out-degree of a node refers to the nodes it connects to; correspondingly, the in-degree of a node refers to the nodes that connect to it. For formula (6) to be a probability distribution over the neighbors of $v_i$, each connection is counted with its edge weight, so the out-degree $d_i$ of node $v_i$ can be expressed as $d_i = \sum_{k \in N(i)} w_{i,k}$, where $N(i)$ denotes the set of nodes that $v_i$ connects to.
Then, the distribution difference between the estimated context probability and the empirical context probability is calculated to obtain the second objective function, as in formula (7):

$$O_2 = \sum_{v_i \in V} \lambda_i \, d\big(\hat{p}_2(\cdot \mid v_i),\; p_2(\cdot \mid v_i)\big) \tag{7}$$

where $\lambda_i$ can be defined as the degree of node $v_i$ (including in-degree and out-degree).
It should be noted that minimizing the second objective function minimizes the distribution difference between the estimated context probability in the low-dimensional space and the empirical context probability in the high-dimensional space. The second objective function O2 characterizes second-order similarity: two nodes that share many common neighbor nodes in the original directed weighted network should have vector representations that are close in the low-dimensional space.
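A small numerical sketch of the second-order objective follows. It assumes the softmax form of the standard LINE formulation, a separate context vector per node (`ctx`), and a node weight equal to the weighted out-degree; the text above allows the node weight to include in-degree as well, so that choice, like the function names, is an assumption of this sketch.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def second_order_objective(edges, emb, ctx):
    """Degree-weighted KL difference between empirical and estimated context probabilities.

    edges maps (i, j) to w_ij; emb holds node vectors and ctx holds the
    vectors used when a node acts as a context.
    """
    nodes = sorted(emb)
    out_degree = {}
    for (i, _), w in edges.items():
        out_degree[i] = out_degree.get(i, 0.0) + w     # weighted out-degree of i
    loss = 0.0
    for (i, j), w in edges.items():
        normalizer = sum(math.exp(dot(ctx[k], emb[i])) for k in nodes)
        p_est = math.exp(dot(ctx[j], emb[i])) / normalizer   # softmax over all nodes
        p_emp = w / out_degree[i]                            # empirical context probability
        loss += out_degree[i] * p_emp * math.log(p_emp / p_est)
    return loss


zeros = {n: [0.0, 0.0] for n in ("a", "b", "c")}
value = second_order_objective({("a", "b"): 2.0}, zeros, zeros)
```

With all-zero vectors the softmax is uniform over the three nodes, so the single edge contributes 2 * log(3); training would lower this value by raising the score of the true neighbor.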
After the first objective function and the second objective function are obtained, node vectors can be obtained from the first objective function and from the second objective function. The two sets of node vectors are then concatenated to obtain the node vectors of the directed weighted network, which implicitly carry the information of the transfer records.
Optionally, the step of obtaining the node vectors of the directed weighted network from the first objective function and the second objective function may include:
Optimizing the first objective function with a stochastic gradient descent algorithm to obtain the low-dimensional node vectors under the first objective function;
Optimizing the second objective function with a stochastic gradient descent algorithm to obtain the low-dimensional node vectors under the second objective function;
Concatenating the low-dimensional node vectors under the first objective function with the low-dimensional node vectors under the second objective function to obtain the node vectors of the directed weighted network.
Specifically, to obtain accurate low-dimensional node vectors, the first objective function and the second objective function can each be optimized after they are obtained, and the node vectors of the directed weighted network are then derived from the optimized objective functions. For example, the first objective function can be optimized with the stochastic gradient descent algorithm (Stochastic Gradient Descent, SGD) to obtain the low-dimensional node vectors under the first objective function, and the second objective function can likewise be optimized with SGD to obtain the low-dimensional node vectors under the second objective function. Finally, the node vectors from the optimized first objective function and those from the optimized second objective function are concatenated side by side to obtain the node vectors of the directed weighted network.
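The optimize-then-concatenate procedure can be sketched as below. This is a deliberately simplified illustration: it performs plain stochastic gradient steps on the weighted log-likelihood of the first-order sigmoid model only, and omits the negative sampling and edge sampling that practical LINE implementations rely on. The function names, learning rate, dimension, and epoch count are assumptions, and edge weights are assumed to be scaled to a small range.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sgd_first_order(edges, dim=2, lr=0.1, epochs=50, seed=0):
    """Learn node vectors by SGD so that connected nodes get a high sigmoid score."""
    rng = random.Random(seed)
    nodes = sorted({n for edge in edges for n in edge})
    emb = {n: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for n in nodes}
    edge_list = list(edges.items())
    for _ in range(epochs):
        rng.shuffle(edge_list)                       # visit edges in random order
        for (i, j), w in edge_list:
            inner = sum(a * b for a, b in zip(emb[i], emb[j]))
            scale = w * (1.0 - sigmoid(inner))       # gradient factor of w * log(sigmoid(inner))
            ui, uj = emb[i][:], emb[j][:]            # copies, so both updates use old values
            for k in range(dim):
                emb[i][k] += lr * scale * uj[k]
                emb[j][k] += lr * scale * ui[k]
    return emb


def splice(first, second):
    """Concatenate, per node, the vector from the first objective with that from the second."""
    return {n: first[n] + second[n] for n in first}


edges = {("u1", "u2"): 1.0}
first_vectors = sgd_first_order(edges)
# The second-order vectors would come from a separate optimization of the
# second objective; the same vectors are reused here only to show the splicing.
node_vectors = splice(first_vectors, first_vectors)
```

The spliced vector for each node is twice as long as either input, with the first-objective part on the left and the second-objective part on the right.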
Optionally, the step of obtaining the feature vector of the mobile payment records may include:
Obtaining the multi-dimensional payment features of the mobile payment records;
Encoding the multi-dimensional payment features to obtain payment encoding information;
Generating the feature vector of the payment records from the payment encoding information.
Specifically, the credit evaluation device obtains the multi-dimensional payment features of the mobile payment records, which may include any several of the user identifier, the type of goods paid for, the payment amount, the shop paid, the transaction timestamp, and so on. For example, a mobile payment record can be expressed as: user, category, money, shop_name, time_stamp, etc., where user is the user identifier, which may be a string; category is the type of goods paid for, which may be a string; money is the payment amount, which may be a float; shop_name is the shop paid, which may be a string; and time_stamp is the transaction timestamp, which may be of timestamp type.
After the multi-dimensional payment features of the mobile payment records are obtained, they can be encoded to obtain payment encoding information. The payment encoding information may be numerical, and the encoding scheme can be set flexibly according to actual needs; its specifics are not limited here. Finally, the feature vector of the payment records can be generated from the payment encoding information of each payment record; this feature vector may contain the feature vectors corresponding to one or more payment records.
Optionally, the step of encoding the multi-dimensional payment features to obtain payment encoding information may include:
Converting the non-numerical payment features among the multi-dimensional payment features into numerical payment features;
Discretizing the converted numerical payment features together with the numerical payment features already present among the multi-dimensional payment features to obtain the payment encoding information.
When the credit evaluation device encodes the multi-dimensional payment features, it can convert them to numerical form, obtaining a numerical value for each dimension of payment feature. For example, a mapping between non-numerical values and numerical values can be preset, with different non-numerical values corresponding to different numerical values; the numerical value of each non-numerical payment feature is then looked up in this mapping, converting the non-numerical payment features among the multi-dimensional payment features into numerical payment features.
Alternatively, the non-numerical payment features among the multi-dimensional payment features are first converted into numerical payment features by labeling, yielding fully numerical multi-dimensional payment features. For example, a mobile payment record can be expressed as: user identifier user (string), type of goods paid for category (string), payment amount money (float), shop paid shop_name (string), transaction timestamp time_stamp (timestamp type), and so on. In this record, the string-typed user, category, and shop_name can be label-encoded, that is, mapped from strings to numerical values, to obtain the corresponding numerical payment features.
After the non-numerical payment features among the multi-dimensional payment features are converted into numerical payment features, all of the multi-dimensional payment features of the mobile payment records are numerical, and they can then be discretized. For example, the float-typed payment amount money in the record above can be discretized into 10 equal-width levels (the difference between the maximum and minimum of all payment amounts is divided into 10 levels), with each level assigned a numerical payment feature; the timestamp-typed transaction timestamp time_stamp can be divided at a granularity of 10 minutes, with each interval assigned a numerical payment feature. Finally, the payment encoding information of each dimension of payment feature is obtained from the discretized multi-dimensional payment features.
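The encoding steps above (label encoding of string features, equal-width discretization of amounts into 10 levels, and 10-minute bucketing of timestamps) can be sketched as follows. The helper names and the assumption that timestamps are Unix seconds are illustrative choices, not details given in the text.

```python
def label_encode(values):
    """Map each distinct string to an integer id, in order of first appearance."""
    mapping = {}
    codes = [mapping.setdefault(v, len(mapping)) for v in values]
    return codes, mapping


def discretize_equal_width(amounts, levels=10):
    """Split the range [min, max] into equal-width levels; return level ids 0..levels-1."""
    lo, hi = min(amounts), max(amounts)
    width = (hi - lo) / levels or 1.0      # fall back to 1.0 when all amounts are equal
    return [min(int((a - lo) / width), levels - 1) for a in amounts]


def bucket_timestamp(ts_seconds, minutes=10):
    """Id of the fixed-length interval containing a timestamp given in seconds."""
    return int(ts_seconds // (minutes * 60))


shop_codes, shop_mapping = label_encode(["shopA", "shopB", "shopA"])
amount_levels = discretize_equal_width([0.0, 50.0, 100.0])
```

The maximum amount is clamped into the last level so that exactly 10 level ids are produced rather than 11.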
It should be noted that, when analyzing a user's mobile payment records, all of the records can first be preprocessed to obtain target payment features, from which the feature vector of the mobile payment records is generated. The target payment features may include the average payment amount, the most frequent payment amount, the time period in which the user pays most often, and so on. For example, over all mobile payment records of the user, one can compute the average payment amount avg_num, the most frequent payment amount most_num (here payment amounts are discretized into equal-width intervals of 100, and most_num is the average of the amounts in the interval with the highest frequency), and the time period most_time in which the user pays most often. These statistics form the feature vector of the payment records.
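A minimal sketch of these summary statistics is given below. The names avg_num, most_num, and most_time come from the text; the function name `payment_summary`, the input layout, and the use of the hour of day as the "time period" are assumptions.

```python
from collections import Counter


def payment_summary(payments, bin_width=100):
    """Summarize one user's payments as [avg_num, most_num, most_time].

    payments is a list of (amount, hour_of_day) pairs.
    """
    amounts = [amount for amount, _ in payments]
    avg_num = sum(amounts) / len(amounts)

    # Equal-width bins of size bin_width; pick the most frequent bin.
    bins = Counter(int(amount // bin_width) for amount in amounts)
    top_bin = bins.most_common(1)[0][0]
    in_top_bin = [a for a in amounts if int(a // bin_width) == top_bin]
    most_num = sum(in_top_bin) / len(in_top_bin)

    # Hour of day in which the user pays most often.
    most_time = Counter(hour for _, hour in payments).most_common(1)[0][0]
    return [avg_num, most_num, most_time]


features = payment_summary([(120.0, 20), (150.0, 20), (900.0, 9)])
```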
In summary, after the node vectors of the directed weighted network corresponding to the transfer records and the feature vector of the mobile payment records are obtained, they can be concatenated side by side to generate the feature space corresponding to the transaction information. In this side-by-side concatenation, the node vector may be placed on the left and the feature vector on the right, or the feature vector on the left and the node vector on the right.
In some embodiments, the candidate category information includes behavior information, and the step of constructing the directed weighted network corresponding to each candidate category information, obtaining the node vectors of the directed weighted network, and generating the feature space corresponding to the category information from the node vectors may include:
(a) obtaining the multi-dimensional behavior features of the behavior information;
(b) constructing the directed weighted network corresponding to each dimension of behavior feature;
(c) obtaining the node vectors of each directed weighted network;
(d) generating the feature space corresponding to the behavior information from the node vectors of each directed weighted network.
Take behavior information as the candidate category information. Behavior information may include the user's like frequency, comment frequency, message frequency, voice call frequency, video call frequency, and so on. The credit evaluation device can extract multi-dimensional behavior features from the behavior information, which may include any several of the like frequency, comment frequency, message frequency, call frequency, and so on, where the call frequency includes the voice call frequency and the video call frequency.
It should be noted that, to protect user privacy, category information such as behavior information can be transcoded and desensitized in embodiments of the present invention. The category information analyzed in the embodiments is therefore information that has already undergone transcoding, desensitization, and similar processing.
For each dimension of behavior feature of the behavior information, the credit evaluation device constructs a corresponding directed weighted network. For example, if the behavior features include the like frequency, comment frequency, message frequency, and call frequency, then a directed weighted network A can be constructed for the like frequency, representing how often users like each other's posts in the circle of friends; a directed weighted network B can be constructed for the comment frequency, representing how often users comment on each other's posts in the circle of friends; a directed weighted network C can be constructed for the message frequency, representing how often users message each other on social platforms (such as WeChat and QQ); and a directed weighted network D can be constructed for the call frequency, representing how often users hold video or voice calls with each other.
After the directed weighted network corresponding to each dimension of behavior feature is obtained, the credit evaluation device can obtain the node vectors of each directed weighted network with the LINE algorithm, based on the first-order (First-order) and second-order (Second-order) approximations. The procedure is similar to the one described above for obtaining the node vectors of the directed weighted network corresponding to the transfer records, and is not repeated here.
After the node vectors of the directed weighted network corresponding to each dimension of behavior feature are obtained, the credit evaluation device can generate the feature space corresponding to the behavior information from the node vectors of each directed weighted network. For example, the node vector a of directed weighted network A, the node vector b of directed weighted network B, the node vector c of directed weighted network C, and the node vector d of directed weighted network D can be concatenated to obtain the feature space corresponding to the behavior information.
In some embodiments, the category information includes attribute information, and the step of obtaining the vector information corresponding to each of the multiple category information and obtaining the feature space corresponding to each category information from the vector information may include:
(1) obtaining the multi-dimensional attribute features of the attribute information among the multiple category information;
(2) encoding the multi-dimensional attribute features to obtain attribute encoding information;
(3) generating the feature space corresponding to the attribute information from the attribute encoding information.
Specifically, take attribute information as the category information. Attribute information may include the user's gender, hometown, city of residence, age, and so on. The credit evaluation device can obtain multi-dimensional attribute features from the attribute information, which may include any several of the gender, hometown, city of residence, age, and so on.
For example, the multi-dimensional attribute features of a user's attribute information can be expressed as: user, gender, home, domicile, age, etc., where user is the user identifier, which may be a string; gender is the user's gender, which may be a string; home is the user's hometown, which may be a string; domicile is the user's city of residence, which may be a string; and age is the user's current age, which may be an integer. The user's gender, hometown, city of residence, and so on may be provided by the user when registering a social account or an account on another website platform, or may be obtained in other ways. The user's age may be calculated from the age and the time filled in when the account was registered, or may be obtained in other ways.
It should be noted that, to protect user privacy, category information such as attribute information can be transcoded and desensitized in embodiments of the present invention. The category information analyzed in the embodiments is therefore information that has already undergone transcoding, desensitization, and similar processing.
After the multi-dimensional attribute features of the attribute information are obtained, the credit evaluation device can encode them to obtain attribute encoding information. The attribute encoding information may be numerical, and the encoding scheme can be set flexibly according to actual needs; its specifics are not limited here. Finally, the feature space corresponding to the attribute information can be generated from the attribute encoding information; for example, the attribute encoding information can directly form the feature space.
Optionally, the step of encoding the multi-dimensional attribute features to obtain attribute encoding information may include:
Converting the non-numerical attribute features among the multi-dimensional attribute features into numerical attribute features;
Generating the attribute encoding information from the converted numerical attribute features and the numerical attribute features already present among the multi-dimensional attribute features.
When the credit evaluation device encodes the multi-dimensional attribute features, it can convert them to numerical form, obtaining a numerical value for each dimension of attribute feature. For example, a mapping between non-numerical values and numerical values can be preset, with different non-numerical values corresponding to different numerical values; the numerical value of each non-numerical attribute feature is then looked up in this mapping, converting the non-numerical attribute features among the multi-dimensional attribute features into numerical attribute features.
Alternatively, the non-numerical attribute features among the multi-dimensional attribute features are turned into numerical attribute features by encoding: for each dimension of non-numerical attribute feature, the number n of distinct values is counted, and the values of that dimension are then encoded in the order 1 to n, with identical values receiving identical codes (for example, male is encoded as 1 and female as 0), so that non-numerical values are converted into numerical values. For example, the string-typed user identifier user, gender gender, hometown home, city of residence domicile, and so on can be mapped to numerical values to obtain the corresponding numerical attribute features.
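The count-and-number scheme above can be sketched as follows. The field names gender, home, domicile, and age come from the text; the helper names, the example values, and the choice to assign codes 1 to n in sorted order are assumptions made for illustration.

```python
def ordinal_codes(values):
    """Assign codes 1..n to the n distinct values of one attribute, in sorted order."""
    return {value: code for code, value in enumerate(sorted(set(values)), start=1)}


def encode_users(rows, categorical=("gender", "home", "domicile")):
    """Turn attribute records into numerical vectors; age is already numerical."""
    tables = {name: ordinal_codes([row[name] for row in rows]) for name in categorical}
    vectors = [[tables[name][row[name]] for name in categorical] + [row["age"]]
               for row in rows]
    return vectors, tables


# Hypothetical, already desensitized attribute records.
rows = [
    {"gender": "male", "home": "Guangzhou", "domicile": "Shenzhen", "age": 30},
    {"gender": "female", "home": "Beijing", "domicile": "Shenzhen", "age": 25},
]
vectors, tables = encode_users(rows)
```

Identical values always receive identical codes because each attribute has a single lookup table; the specific integer assigned to a value is arbitrary.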
After the non-numerical attribute features among the multi-dimensional attribute features are converted into numerical attribute features, all of the multi-dimensional attribute features of the attribute information are numerical, and the attribute encoding information is obtained. An attribute feature vector of the attribute information can be generated from the attribute encoding information, and this attribute feature vector forms the feature space corresponding to the attribute information.
By applying a different processing method to each category information to obtain its corresponding feature space, the above improves the flexibility with which feature spaces are obtained.
In step S103, the feature space corresponding to each category information is mapped to a kernel space, yielding multiple kernel spaces.
After the feature space corresponding to each category information is obtained, the credit evaluation device can map each feature space to a kernel space, obtaining the kernel space corresponding to each category information. The kernel space may be a space of infinite dimension, and may contain the pairwise inner products between feature vectors or between node vectors. For example, after the feature space of the transaction information, the feature space of the behavior information, and the feature space of the attribute information are obtained, each can be mapped to a kernel space, yielding the kernel space of the transaction information, the kernel space of the behavior information, and the kernel space of the attribute information.
Specifically, a feature space can be mapped to a kernel space by a kernel function, that is, the vectors that characterize users in the feature space are mapped to the kernel space. A kernel function is defined as follows:
Let $\chi$ be the input space, which may be a subset of the Euclidean space $\mathbb{R}^n$ or a discrete set, and let $H$ be the kernel space, which may be a Hilbert space. If there exists a mapping from $\chi$ to $H$, $\phi(x): \chi \to H$, such that for all $x, z \in \chi$ the function $k(x, z)$ satisfies the condition:
$$k(x, z) = \phi(x) \cdot \phi(z)$$
then $k(x, z)$ is called a kernel function and $\phi(x)$ is the mapping function, where $\phi(x) \cdot \phi(z)$ is the inner product of $\phi(x)$ and $\phi(z)$.
The kernel function may be a linear kernel, a polynomial kernel, a Gaussian kernel, a radial basis function kernel, a sigmoid kernel, a composite kernel, and so on. The following description takes the Gaussian kernel as an example. The Gaussian kernel may be as follows:
Then, the feature space obtained for each category of information is taken as the input space $\chi$ and processed by the Gaussian kernel function to obtain the corresponding kernel space. For example, the feature space of the transaction information, the feature space of the behavior information, and the feature space of the attribute information are each processed by the Gaussian kernel function and mapped to the kernel space of the transaction information, the kernel space of the behavior information, and the kernel space of the attribute information, respectively.
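As an illustration of this mapping, the following minimal Python sketch computes the kernel matrix of a small feature space. It assumes the standard Gaussian form $k(x, z) = \exp(-\lVert x - z \rVert^2 / (2\sigma^2))$; the function names, the bandwidth `sigma`, and the sample vectors are hypothetical and are not taken from the embodiment.

```python
import math

def gaussian_kernel(x, z, sigma=1.0):
    """Standard Gaussian kernel (assumed form): exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_matrix(vectors, kernel):
    """Pairwise kernel values K[i][j] = k(x_i, x_j) for one category's feature space."""
    return [[kernel(xi, xj) for xj in vectors] for xi in vectors]

# Hypothetical feature vectors of three users for one category of information.
transaction_features = [[0.2, 1.0], [0.1, 0.9], [3.0, 4.0]]
K_transaction = kernel_matrix(transaction_features, gaussian_kernel)
```

Users whose feature vectors are close receive a kernel value near 1, while distant users receive a value near 0, so the matrix records pairwise similarity rather than raw coordinates.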
In step S104, multi-kernel linear combination processing is performed on the multiple kernel spaces to obtain a synthetic kernel space.
After the multiple kernel spaces, composed of the kernel space corresponding to each category of information, are obtained, multi-kernel linear combination processing can be performed on them; for example, a linear combination of the multiple kernel spaces is computed to obtain the synthetic kernel space.
In some embodiments, the step of performing multi-kernel linear combination processing on the multiple kernel spaces to obtain the synthetic kernel space may include:
(1) normalizing the multiple kernel spaces to obtain multiple normalized kernel spaces;
(2) obtaining the weight value corresponding to each category of information among the multiple categories of information;
(3) performing multi-kernel linear combination processing according to the weight value corresponding to each category of information and each normalized kernel space, to obtain the synthetic kernel space.
When the feature space corresponding to each category of information is obtained, the value range of each feature dimension is not restricted in any way. As a result, some features may have a large value range while others have a small one, which affects the resulting synthetic kernel space. To reduce this effect, the kernel spaces can be normalized.
Specifically, the kernel space is normalized as follows:
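Let a finite subset of the input space $\chi$ be $S = \{x_1, \dots, x_n\}$, let the feature mapping be $\phi(x): \chi \to H$ and the kernel function be $k(x, z) = \phi(x) \cdot \phi(z)$, and let $\phi(S) = \{\phi(x_1), \dots, \phi(x_n)\}$ be the image of $S$ under the mapping $\phi$. The elements of the kernel matrix $K$ are $K_{ij} = k(x_i, x_j)$, $i, j = 1, \dots, n$. The norm of the feature vector $\phi(x)$ is:
$$\lVert \phi(x) \rVert = \sqrt{\phi(x) \cdot \phi(x)} = \sqrt{k(x, x)}$$
The normalized feature vector is:
$$\hat{\phi}(x) = \frac{\phi(x)}{\lVert \phi(x) \rVert}$$
The normalized kernel space is:
$$\hat{k}(x, z) = \hat{\phi}(x) \cdot \hat{\phi}(z) = \frac{k(x, z)}{\sqrt{k(x, x)\, k(z, z)}}$$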
After each kernel space is normalized, the normalized kernel space corresponding to it is obtained; therefore, normalizing the multiple kernel spaces yields multiple normalized kernel spaces.
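A minimal Python sketch of this normalization follows, assuming the usual normalized-kernel form $\hat{k}(x, z) = k(x, z) / \sqrt{k(x, x)\, k(z, z)}$ applied to a kernel matrix. The function name and the sample matrix are hypothetical.

```python
import math

def normalize_kernel_matrix(K):
    """Return K_hat with K_hat[i][j] = K[i][j] / sqrt(K[i][i] * K[j][j])."""
    n = len(K)
    return [[K[i][j] / math.sqrt(K[i][i] * K[j][j]) for j in range(n)]
            for i in range(n)]

# Hypothetical kernel matrix: linear kernel on the vectors (1, 0) and (2, 2).
K = [[1.0, 2.0],
     [2.0, 8.0]]
K_hat = normalize_kernel_matrix(K)
```

After normalization every diagonal entry equals 1, so a category whose features have a large value range no longer dominates the later weighted sum.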
Because the user's heterogeneous data is processed from different categories of information, and the different categories of information are heterogeneous and diverse, the resulting kernel spaces have different characteristics. Kernel spaces with different characteristics can therefore be linearly combined to gain the advantages of each kind of kernel space and achieve better mapping performance.
Multi-kernel linear combination processing is performed on the multiple kernel spaces to construct the synthetic kernel space, which may be a linearly weighted sum kernel. Specifically, the weight value corresponding to each category of information is obtained first, and multi-kernel linear combination processing is then performed according to these weight values and each normalized kernel space to obtain the synthetic kernel space.
For example, let the kernel spaces corresponding to the transaction information, behavior information, and attribute information be $k_1(x, z)$, $k_2(x, z)$, and $k_3(x, z)$, and let the normalized kernel spaces be $\hat{k}_1(x, z)$, $\hat{k}_2(x, z)$, and $\hat{k}_3(x, z)$, respectively. Multi-kernel linear combination processing of these three normalized kernel spaces gives the synthetic kernel space $K(x, z)$:
$$K(x, z) = \sum_{i=1}^{3} \beta_i\, \hat{k}_i(x, z)$$
where $\beta_i$ indicates the importance of the kernel space corresponding to the $i$-th category of information to the user's credit evaluation, that is, the weight value corresponding to the $i$-th category of information. Obtaining the synthetic kernel space by multi-kernel linear combination of the multiple kernel spaces avoids the information loss and large errors caused by directly concatenating the features of heterogeneous data.
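The weighted combination can be sketched in Python as follows. The sketch assumes that each category has already produced a normalized kernel matrix over the same users; the matrices and the weights in `betas` are illustrative values, not values from the embodiment.

```python
def combine_kernels(kernel_matrices, weights):
    """Weighted sum of same-sized kernel matrices: K = sum_i beta_i * K_i."""
    if len(kernel_matrices) != len(weights):
        raise ValueError("one weight is required per kernel matrix")
    n = len(kernel_matrices[0])
    combined = [[0.0] * n for _ in range(n)]
    for K_i, beta_i in zip(kernel_matrices, weights):
        for r in range(n):
            for c in range(n):
                combined[r][c] += beta_i * K_i[r][c]
    return combined

# Hypothetical normalized kernel matrices for two users.
K_trans = [[1.0, 0.8], [0.8, 1.0]]   # transaction information
K_behav = [[1.0, 0.2], [0.2, 1.0]]   # behavior information
K_attr = [[1.0, 0.5], [0.5, 1.0]]    # attribute information
betas = [0.5, 0.3, 0.2]              # illustrative weights only
K_syn = combine_kernels([K_trans, K_behav, K_attr], betas)
```

Because each category keeps its own kernel, heterogeneous features are never placed in one shared vector; only their similarity values are mixed.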
In step S105, the credit evaluation result of the user is obtained according to the synthetic kernel space.
After multi-kernel linear combination processing is performed on the multiple kernel spaces and the synthetic kernel space is obtained, the credit evaluation result of the user can be obtained according to the synthetic kernel space, where the credit evaluation result may be a credit score, a credit rating, or the like.
In some embodiments, the step of obtaining the credit evaluation result of the user according to the synthetic kernel space may include: calculating the credit score of the user according to the synthetic kernel space by means of a preset regression model.
The credit evaluation device can preset a regression model, which is mainly used to calculate the user's credit evaluation result from the synthetic kernel space. Taking a credit score as the credit evaluation result, the obtained synthetic kernel space can be input into the regression model, which outputs the user's credit score. It can be specified that a higher credit score indicates better credit and a lower credit score indicates worse credit.
Optionally, taking a credit rating as the credit evaluation result, the user's credit can be further graded according to the obtained credit score; for example, the credit rating corresponding to the obtained credit score can be determined. It can be specified that a higher credit score corresponds to a higher credit rating and a lower credit score to a lower credit rating, and that a higher credit rating indicates better credit and a lower credit rating indicates worse credit.
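One possible way to derive a credit rating from a credit score is a threshold table, sketched below in Python. The score scale, the threshold values, and the rating labels are all hypothetical; the embodiment only requires that a higher score correspond to a higher rating.

```python
def credit_rating(score, thresholds=((80.0, "A"), (60.0, "B"), (40.0, "C")), lowest="D"):
    """Map a credit score to a rating; higher scores give higher ratings."""
    for minimum, rating in thresholds:  # thresholds are ordered from high to low
        if score >= minimum:
            return rating
    return lowest

rating_of_user = credit_rating(72.5)
```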
In some embodiments, before the step of calculating the credit score of the user according to the synthetic kernel space by means of the preset regression model, the credit evaluation method may further include:
(1) obtaining a training sample set and dividing it into multiple category information sets;
(2) mapping the multiple category information sets to kernel spaces;
(3) generating an objective function according to the kernel spaces and a preset regression function;
(4) processing the objective function with a Lagrangian duality algorithm to generate the regression model.
Specifically, the users' historical data can be collected first and used as the training sample set. For example, all of the historical data can be used as the training sample set, or a portion of the collected historical data can be selected, at random or according to a preset rule, as the training sample set. The credit level of the user corresponding to each training sample is then manually labeled for use in training the regression model.
The support vector regression (Support Vector Regression, SVR) model corresponding to a single category of information is described below:
Let the training sample set be $\{(x_i, y_i)\}_{i=1}^{n}$ with $x_i \in \mathbb{R}^d$, where $x_i$ is the input value, $y_i$ is the output value, $d$ is the dimension, and $n$ is the number of training samples. In $\varepsilon$-SVR, the input $x_i$ is first mapped to a feature space by a nonlinear mapping $\phi$, so that in the feature space the output value $y_i$ can be fitted with the linear function $f(x) = \omega \cdot \phi(x) + b$, such that $|f(x_i) - y_i| \le \varepsilon$ holds for all training samples and $f(x)$ is as smooth as possible. This gives the regression function of the $\varepsilon$-SVR algorithm:
Here, the above formula expresses the functional distance between the regression function and the actual training points of the training samples in the model. The loss function $L$ defined in the formula allows the model a certain amount of error: points within the error range are regarded as lying on the model, while points outside the error range should be as close as possible to the fitted regression function. The constant $C$ in the formula is the penalty parameter.
Similar to SVM, the slack variables $\xi$ and $\xi^*$ are introduced in SVR, and the above regression function is converted to:
Using the Lagrangian duality method, the above primal problem is converted into its dual problem, giving a new regression function as follows:
where $\alpha_i$ and $\alpha_i^*$ are the Lagrange multipliers corresponding to the two constraints in the primal problem. Solving the above quadratic programming problem gives the optimal $\alpha$ and $\alpha^*$, from which the values of $\omega$ and $b$ of the primal problem can be obtained.
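For reference, the description above matches the standard textbook formulation of $\varepsilon$-SVR, sketched below. This is the usual form and is an assumption; the notation of the embodiment's own formulas may differ. The regression problem with the $\varepsilon$-insensitive loss is:

$$\min_{\omega, b}\ \frac{1}{2}\lVert \omega \rVert^2 + C \sum_{i=1}^{n} L\big(f(x_i), y_i\big), \qquad L\big(f(x_i), y_i\big) = \max\big(0,\ |f(x_i) - y_i| - \varepsilon\big)$$

With slack variables $\xi_i$ and $\xi_i^*$ it becomes:

$$\min_{\omega, b, \xi, \xi^*}\ \frac{1}{2}\lVert \omega \rVert^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \quad \text{s.t.} \quad y_i - f(x_i) \le \varepsilon + \xi_i,\quad f(x_i) - y_i \le \varepsilon + \xi_i^*,\quad \xi_i, \xi_i^* \ge 0$$

Its dual problem, with $k(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$, is:

$$\max_{\alpha, \alpha^*}\ -\frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\, k(x_i, x_j) - \varepsilon \sum_{i=1}^{n} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{n} y_i (\alpha_i - \alpha_i^*)$$

$$\text{s.t.} \quad \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) = 0, \qquad 0 \le \alpha_i, \alpha_i^* \le C$$

In this form $\omega = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*)\, \phi(x_i)$, so the fitted function can be written as $f(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*)\, k(x_i, x) + b$, which depends on the data only through the kernel function.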
The regression function for a single category of information extends easily to multiple categories of information: the training sample set is first divided into multiple category information sets, the category information sets are each mapped to a kernel space in the manner described above, and an objective function is generated according to the kernel spaces and the regression function. The objective function of the regression model for multi-category information can therefore be defined as follows:
This problem is likewise a quadratic programming problem. After the optimal values of the parameters $\alpha$, $\alpha^*$, $\beta$, and $b$ are obtained by the Lagrangian duality algorithm, the fitting function of the training sample set is obtained. This fitting function is the regression model, which can be as follows:
After the regression model is trained, when the credit of a new user $x_{new}$ needs to be evaluated, the credit of the user $x_{new}$ can be calculated according to the following formula:
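A minimal Python sketch of scoring a new user with a trained model follows. It assumes the usual multiple-kernel SVR predictor $f(x_{new}) = \sum_i (\alpha_i - \alpha_i^*) \sum_m \beta_m\, k_m(x_i, x_{new}) + b$ with a Gaussian kernel per category; this form, the function names, and all numeric values are assumptions for illustration, and the coefficients would come from training rather than being set by hand.

```python
import math

def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian kernel; it satisfies k(x, x) = 1, so it is already normalized."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def synthetic_kernel(x, z, betas):
    """x and z hold one vector per category; returns sum_m beta_m * k_m(x_m, z_m)."""
    return sum(beta * gaussian_kernel(xm, zm) for beta, xm, zm in zip(betas, x, z))

def predict_credit(x_new, train_users, coeffs, betas, b):
    """f(x_new) = sum_i coeffs[i] * K(x_i, x_new) + b, where coeffs[i] = alpha_i - alpha_i^*."""
    return sum(c * synthetic_kernel(x_i, x_new, betas)
               for c, x_i in zip(coeffs, train_users)) + b

# Each user: (transaction vector, behavior vector, attribute vector); hypothetical values.
train_users = [
    ([0.2, 1.0], [0.5], [1.0, 30.0]),
    ([3.0, 4.0], [0.1], [0.0, 45.0]),
]
coeffs = [1.0, -0.5]     # hypothetical alpha_i - alpha_i^* values
betas = [0.5, 0.3, 0.2]  # hypothetical category weights
b = 0.5                  # hypothetical bias
score = predict_credit(([0.3, 1.1], [0.4], [1.0, 31.0]), train_users, coeffs, betas, b)
```

Only kernel values between the new user and the training users are needed at prediction time, which is why the synthetic kernel space can serve directly as the model input.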
Therefore, based on the above regression model, the user's credit can be evaluated accurately from multiple categories, which effectively overcomes the drawback of the conventional approach of processing everything uniformly as a single category and makes the evaluation result more accurate.
As can be seen from the above, the embodiment of the present invention can classify the heterogeneous data related to a user into multiple categories of information; then obtain the feature space corresponding to each category of information, map each feature space to a kernel space, and perform multi-kernel linear combination processing on the resulting kernel spaces to obtain a synthetic kernel space; the credit evaluation result of the user can then be obtained according to the synthetic kernel space. This scheme evaluates the user's credit accurately from the multiple categories of information into which the user's heterogeneous data is classified, and effectively overcomes the drawback of processing the user's information uniformly and directly concatenating all features into a feature matrix to evaluate the user's credit, so that the evaluation result is more reliable.
The method described in the above embodiment is described in further detail below by way of example.
Take the case where the credit evaluation device is a server and a credit score is computed for user A. The basic data of user A is divided into three different categories, namely transaction information, behavior information, and attribute information. Each category is learned separately: the information of each category is mapped from its feature space to a kernel space, the kernel spaces of the categories are linearly combined to obtain a synthetic kernel space, and finally the credit of user A is scored by a regression model.
Referring to Fig. 4, Fig. 4 is a schematic flowchart of the credit evaluation method provided by an embodiment of the present invention. The method flow may include:
S201, the server obtains the heterogeneous data of user A.
The heterogeneous data may include the gender, age, and city of residence of user A, the comment frequency, message-sending frequency, and call frequency of user A, transfer records, mobile payment records, the gender and age of user A's friends, the call frequency between user A's friends and user A, the transfer records between user A's friends and user A, the credit of user A's friends, and so on.
The server can obtain the heterogeneous data of user A from the Internet by web crawling, or through the open API of a social platform (for example, WeChat, Weibo, or QQ).
It can be understood that the server can also obtain the heterogeneous data of user A in other ways; the specific acquisition method is not limited here.
It should be noted that, in the embodiment of the present invention, the user's heterogeneous data can be processed by transcoding, desensitization, and the like, so as to protect user privacy.
S202, the server divides the heterogeneous data of user A into transaction information, behavior information, and attribute information.
After obtaining the heterogeneous data of user A, the server can divide it into three different categories of information, namely transaction information, behavior information, and attribute information, as shown in Fig. 5.
The transaction information may include transfer records, mobile payment records, and so on; the behavior information may include the like frequency, comment frequency, message-sending frequency, voice call frequency, video call frequency, and so on; and the attribute information may include gender, hometown, age, city of residence, and so on.
S203, the server obtains the transaction feature space of the transaction information, the behavior feature space of the behavior information, and the attribute feature space of the attribute information.
It should be noted that the transaction feature space is the feature space of the transaction information, the behavior feature space is the feature space of the behavior information, and the attribute feature space is the feature space of the attribute information. These names are used only to distinguish the feature spaces of the different categories of information.
After obtaining the transaction information, behavior information, and attribute information, the server can obtain the transaction feature space of the transaction information. Specifically, the server first extracts the transfer records from the transaction information, which may include one or more transfer records. It then constructs the directed weighted network $G$ corresponding to the transfer records: $G = (V, E, W)$, where $V$ denotes the nodes of the directed weighted network $G$, each node representing a user; $E$ denotes the edges of $G$, each edge representing a transfer record between two users; and $W$ denotes the edge weights, representing the transfer amounts.
For example, if a transfer record states that user 1 transferred 100 yuan to user 2, the record can have the following fields: u1, trans_num, u2, where u1 is user 1, trans_num is the amount transferred from user 1 to user 2, and u2 is user 2.
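The construction of the directed weighted network from such records can be sketched in Python as follows. The record layout follows the `u1, trans_num, u2` fields above; summing the amounts of repeated transfers between the same ordered pair of users is an assumption of this sketch, as are the function name and the sample records.

```python
from collections import defaultdict

def build_transfer_network(records):
    """Build the nodes and weighted directed edges of G from (u1, trans_num, u2) records."""
    nodes = set()
    weights = defaultdict(float)  # (payer, payee) -> total transferred amount
    for u1, trans_num, u2 in records:
        nodes.update((u1, u2))
        weights[(u1, u2)] += trans_num  # assumption: repeated transfers are summed
    return nodes, dict(weights)

records = [
    ("user1", 100.0, "user2"),
    ("user2", 30.0, "user3"),
    ("user1", 50.0, "user2"),
]
V, W = build_transfer_network(records)
E = set(W)  # the directed edges are the keys of the weight map
```

The edge direction is kept, so a transfer from user 1 to user 2 does not create an edge from user 2 to user 1.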
After obtaining the transfer records, the server can use the above-mentioned LINE algorithm to calculate the node vectors of the transfer records based on first-order proximity (First-order) and second-order proximity (Second-order).
The server extracts the mobile payment records from the transaction information; these may include one or more mobile payment records. It then extracts multi-dimensional payment features from the mobile payment records, such as the user identifier, the type of goods paid for, the payment amount, the store paid, and the transaction timestamp, and numerically encodes the multi-dimensional payment features to obtain payment encoding information. Finally, the feature vector of the payment records can be formed from the payment encoding information.
After obtaining the node vector of the transfer records and the feature vector of the mobile payment records, the server can concatenate the node vector and the feature vector side by side (left-right concatenation) to obtain the transaction feature space corresponding to the transaction information (that is, the feature space of the transaction information), as shown in Fig. 6.
To obtain the behavior feature space of the behavior information, the server first extracts multi-dimensional behavior features from the behavior information, such as the like frequency, comments, message-sending frequency, and call frequency, and then constructs the directed weighted network corresponding to each behavior feature dimension: for example, the directed weighted network $Net_{tu}$ corresponding to the like frequency, the directed weighted network $Net_{cm}$ corresponding to comments, the directed weighted network $Net_{msg}$ corresponding to the message-sending frequency, and the directed weighted network $Net_{vv}$ corresponding to the call frequency. The frequency of each behavior feature dimension can be used as the corresponding weight value, so all four networks are directed weighted networks. To make use of the behavior information and social relationships that the user exhibits in the different networks, and to better mine the user's social information, the node vectors of the behavior information can be calculated by the LINE algorithm.
After obtaining the multi-dimensional behavior features, the server can use the above-mentioned LINE algorithm, based on first-order proximity and second-order proximity, to calculate the like-frequency node vector of the directed weighted network $Net_{tu}$, the comment node vector of $Net_{cm}$, the message-sending node vector of $Net_{msg}$, and the call-frequency node vector of $Net_{vv}$. As shown in Fig. 7, the like-frequency node vector, comment node vector, message-sending node vector, and call-frequency node vector can be concatenated side by side to obtain the behavior feature space corresponding to the behavior information (that is, the feature space of the behavior information).
To obtain the attribute feature space of the attribute information, the server first extracts multi-dimensional attribute features from the attribute information, such as gender, hometown, city of residence, and age, and then numerically encodes them to obtain attribute encoding information, from which the attribute feature vector of the attribute information (that is, the feature vector of the attribute information) can be generated. For example, as shown in Fig. 8, the multi-dimensional attribute features of user A's attribute information can be expressed as: user identifier user, gender gender, hometown home, city of residence domicile, current age age, and so on. The encoded information obtained by numerically encoding these multi-dimensional attribute features is concatenated side by side to obtain the feature space of the attribute information.
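The numeric encoding and side-by-side concatenation can be sketched in Python as follows. The field names `gender`, `home`, `domicile`, and `age` follow the example above; the integer label encoding, the vocabularies, and the sample values are assumptions of this sketch, since the embodiment does not fix a particular encoding here.

```python
def encode_attributes(user, categories):
    """Encode one user's attributes as a numeric vector, in field order.

    Categorical attributes are mapped to their index in a fixed vocabulary
    (integer label encoding, an assumption); numeric attributes are kept as numbers.
    """
    vector = []
    for name, value in user.items():
        if name in categories:
            vector.append(float(categories[name].index(value)))
        else:
            vector.append(float(value))
    return vector

# Hypothetical vocabularies for the non-numeric attribute features.
categories = {
    "gender": ["female", "male"],
    "home": ["Guangzhou", "Shenzhen"],
    "domicile": ["Guangzhou", "Shenzhen"],
}
user_a = {"gender": "male", "home": "Guangzhou", "domicile": "Shenzhen", "age": 30}
attribute_vector = encode_attributes(user_a, categories)
```

Stacking the vectors of all users row by row then gives the attribute feature space.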
S204, the server maps the transaction feature space to a transaction kernel space, the behavior feature space to a behavior kernel space, and the attribute feature space to an attribute kernel space.
It should be noted that the transaction kernel space is the kernel space of the transaction information, the behavior kernel space is the kernel space of the behavior information, and the attribute kernel space is the kernel space of the attribute information. These names are used only to distinguish the kernel spaces of the different categories of information.
After obtaining the transaction feature space, behavior feature space, and attribute feature space, the server can map each of them to a kernel space. Specifically, using the above-mentioned kernel function, the transaction feature space can be mapped to the transaction kernel space, the behavior feature space to the behavior kernel space, and the attribute feature space to the attribute kernel space, as shown in Fig. 5.
S205, the server performs multi-kernel linear combination processing on the transaction kernel space, behavior kernel space, and attribute kernel space to obtain a synthetic kernel space.
After obtaining the transaction kernel space, behavior kernel space, and attribute kernel space, the server can normalize each of them and then perform multi-kernel linear combination processing on the normalized transaction kernel space, behavior kernel space, and attribute kernel space to obtain the synthetic kernel space, as shown in Fig. 5.
The normalization and the multi-kernel linear combination processing are similar to those described above and are not repeated here.
S206, the server calculates the credit score of user A according to the synthetic kernel space by means of the regression model.
After obtaining the synthetic kernel space from the transaction kernel space, behavior kernel space, and attribute kernel space, the server can input the synthetic kernel space into the above-mentioned regression model, which outputs the credit score of user A, as shown in Fig. 5.
In the embodiment of the present invention, a multi-category description of user A's credit is generated based on user A's transaction information, behavior information, attribute information, and so on. Using a multi-category learning mechanism, the feature spaces of the transaction information, behavior information, and attribute information are mapped to kernel spaces, the transaction kernel space, behavior kernel space, and attribute kernel space are linearly combined to obtain a synthetic kernel space, and finally the credit score of user A is calculated from the synthetic kernel space by the regression model. This overcomes the low fault tolerance and low accuracy that result in the prior art from processing user information as a single category, and improves the accuracy of the credit score for user A.
To better implement the credit evaluation method provided by the embodiment of the present invention, an embodiment of the present invention further provides a device based on the above credit evaluation method. The terms have the same meanings as in the above credit evaluation method, and for specific implementation details reference may be made to the description in the method embodiment.
Referring to Fig. 9, Fig. 9 is a schematic structural diagram of the credit evaluation device provided by an embodiment of the present invention. The credit evaluation device may include an information acquisition unit 301, a feature acquisition unit 302, a first mapping unit 303, a synthesis unit 304, an evaluation unit 305, and so on.
The information acquisition unit 301 is configured to obtain heterogeneous data related to a user and classify the heterogeneous data to obtain multiple categories of information.
The user may be an individual, an enterprise, or the like. When the user is an individual, the heterogeneous data may include the user's gender, account number, age, city of residence, comment frequency, message-sending frequency, call frequency, transfer records, and mobile payment records, as well as the gender, call frequency, credit, age, and transfer records of the user's friends, and so on. The following description takes an individual user as an example.
Optionally, in one embodiment, the information acquisition unit 301 can obtain the user's heterogeneous data from the Internet by web crawling; in another embodiment, the information acquisition unit 301 can obtain the user's heterogeneous data through the open application programming interface (Application Programming Interface, API) of a social platform (for example, WeChat, Weibo, or QQ).
It should be noted that, to protect user privacy, the embodiment of the present invention can process the user's heterogeneous data by transcoding, desensitization, and the like. For example, the user's account number can be hashed to obtain a longer character string that represents the account number. The user heterogeneous data collected in the embodiment of the present invention is therefore information that has already undergone transcoding, desensitization, and similar processing, which protects user privacy.
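The hash-based desensitization of an account number can be sketched in Python as follows. The choice of SHA-256, the salt value, and the function name are assumptions for illustration; the embodiment only states that the account number is hashed into a longer character string.

```python
import hashlib

def desensitize_account(account_number, salt="example-salt"):
    """Replace an account number with a SHA-256 hex digest (64 characters)."""
    data = (salt + account_number).encode("utf-8")
    return hashlib.sha256(data).hexdigest()

token = desensitize_account("123456789")
```

The same account number always yields the same token, so records of one user can still be linked after desensitization without storing the original number.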
After obtaining the user's heterogeneous data, the information acquisition unit 301 can classify it to obtain multiple categories of information. The different categories of information can be obtained by classifying according to the attribute features, purpose, source, acquisition method, or platform of the heterogeneous data; the specific classification method is not limited here. For example, the user's heterogeneous data can be divided into multiple different categories of information such as transaction information, behavior information, and attribute information. The transaction information may include the transfer records and mobile payment records of the user and of the user's friends, and so on. The behavior information may include the user's like frequency, comment frequency, message-sending frequency, voice call frequency, and video call frequency, the message-sending frequency and voice call frequency of the user's friends, and so on. The attribute information may include the gender, hometown, age, and city of residence of the user, the gender, city of residence, and age of the user's friends, and so on. It can be understood that one category of information is one type of heterogeneous data, and multiple categories of information can correspond to multiple different types of heterogeneous data.
It should be noted that missing heterogeneous data can be completed using the information of similar users. For example, when the gender of user A is missing, the similarity between users can be measured by Euclidean distance to determine the user most similar to user A, and the gender of that similar user is then used as the gender of user A.
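This completion step can be sketched in Python as follows. The dictionary layout with `features` and `gender` keys, the function names, and the sample users are hypothetical; the sketch simply copies the gender of the user whose feature vector has the smallest Euclidean distance to that of user A.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fill_missing_gender(target, others):
    """Return the gender of the user nearest to `target` among users with a known gender."""
    candidates = [u for u in others if u["gender"] is not None]
    nearest = min(candidates, key=lambda u: euclidean(u["features"], target["features"]))
    return nearest["gender"]

user_a = {"features": [0.9, 0.1], "gender": None}
other_users = [
    {"features": [1.0, 0.0], "gender": "female"},
    {"features": [0.0, 1.0], "gender": "male"},
    {"features": [0.9, 0.1], "gender": None},  # skipped: gender also missing
]
user_a["gender"] = fill_missing_gender(user_a, other_users)
```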
The feature acquisition unit 302 is configured to obtain the vector information corresponding to each category of information among the multiple categories of information, and to obtain the feature space corresponding to each category of information according to the vector information.
After the user's heterogeneous data is divided into multiple categories of information, the feature acquisition unit 302 can process each category of information separately to obtain its corresponding feature space. The feature space may be a vector space composed of the vector information of the category information, and the vector information may include feature vectors, node vectors, and so on.
Because the different categories of information are heterogeneous, different algorithms can be used to obtain the feature space corresponding to each category of information.
In some embodiments, as shown in Fig. 10, the feature acquisition unit 302 may include:
an extraction subunit 3021, configured to extract candidate category information from the multiple categories of information;
an acquisition subunit 3022, configured to construct the directed weighted network corresponding to each piece of candidate category information and obtain the node vectors of the directed weighted network;
a first generation subunit 3023, configured to generate the feature space corresponding to the category information according to the node vectors.
Specifically, to improve the efficiency of processing the multiple categories of information, the extraction subunit 3021 first extracts candidate category information from the multiple categories of information. The candidate category information may include one or more categories of information whose feature spaces can be calculated by means of a directed weighted network. For example, among category information A, B, C, D, and E, if the feature spaces of category information A, B, and C can be calculated by constructing directed weighted networks, then category information A, B, and C can be extracted as the candidate category information.
Then the acquisition subunit 3022 constructs the directed weighted network corresponding to each piece of candidate category information. For example, the directed weighted network $G_A$ corresponding to category information A is constructed as $G_A = (V_A, E_A, W_A)$, where $V_A$ denotes the nodes of $G_A$, each of which can represent a user; $E_A$ denotes the edges of $G_A$, each of which can represent category information A existing between two users; and $W_A$ denotes the weights of the edges $E_A$.
Next, after obtaining the directed weighted network corresponding to each piece of candidate category information, the acquisition subunit 3022 can use the large-scale information network embedding algorithm (Large-scale Information Network Embedding, LINE) to calculate the node vectors of the directed weighted network based on first-order proximity (First-order) and second-order proximity (Second-order), that is, to represent the network nodes of the directed weighted network as low-dimensional vectors. In a directed weighted network, two directly connected nodes have high similarity (high first-order similarity), and two nodes that are not connected but share many common neighbors also have relatively high similarity (relatively high second-order similarity). The LINE algorithm learns both kinds of similarity well, and thus preserves the information contained in the original directed weighted network.
Finally, the first generation subunit 3023 can generate the feature space corresponding to the candidate category information according to the node vectors of the corresponding directed weighted network. For example, the node vectors can be used directly as the feature space of the candidate category information, or the node vectors can first be optimized, filtered, or otherwise processed and the processed node vectors used as the feature space of the candidate category information.
It should be noted that one or more directed weighted networks can be constructed for one piece of candidate category information. When multiple directed weighted networks are constructed for one piece of candidate category information, the node vectors of each directed weighted network can be calculated separately to obtain multiple node vectors, which together serve as the node vector corresponding to that candidate category information; the feature space corresponding to the candidate category information can then be generated according to this node vector.
In some embodiments, the candidate category information includes transaction information, and the acquisition subunit 3022 may include:
A record acquisition module, configured to obtain the transfer records and the mobile payment records in the transaction information;
A construction module, configured to construct the directed weighted network corresponding to the transfer records;
A node vector acquisition module, configured to obtain the node vectors of the directed weighted network;
A feature vector acquisition module, configured to obtain the feature vectors of the mobile payment records;
The first generation subunit 3023 is specifically configured to generate the feature space corresponding to the transaction information according to the node vectors and the feature vectors.
Take the case where the candidate category information is transaction information as an example. The transaction information may include transfer records, mobile payment records and the like, and the record acquisition module may extract the transfer records and the mobile payment records from the transaction information. The transfer records may include one or more transfer records, and the mobile payment records may include one or more mobile payment records.
It should be noted that, to protect user privacy, category information such as transaction information may undergo processing such as transcoding and desensitization in the embodiments of the present invention. The category information collected in the embodiments of the present invention is therefore information that has already been transcoded and desensitized, which achieves the purpose of protecting user privacy.
Then, the construction module constructs the directed weighted network $G$ corresponding to the transfer records as $G = (V, E, W)$, where $V$ denotes the nodes of the directed weighted network $G$, each node representing a user; $E$ denotes the edges of $G$, each edge representing a transfer record between two users; and $W$ denotes the edge weights, which represent the transfer amounts.
For example, as shown in Fig. 3, taking transfer records as an example, assume that the directed weighted network G contains user u1, user u2, user u3, user u4, user u5 and user u6. The directed weighted network G is of course not limited to 6 users; this embodiment uses 6 users only for ease of description and this should not be understood as a limit on the number of users. Whatever the number of users, the construction process of the directed weighted network G is similar and can be understood from this example. In Fig. 3, the tail of an arrow indicates the user who transfers the amount out, and the head of the arrow indicates the user who receives the amount. The labels a, b, c, d, e, f, g and h on the edges between users indicate the transferred amounts. It can be seen from Fig. 3 that user u1 transferred amount a to user u2, user u5 transferred amount e to user u1, user u3 transferred amount f to user u5, and so on.
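The construction of G = (V, E, W) from transfer records can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `build_transfer_network`, the `(payer, payee, amount)` record layout, the concrete amounts standing in for a, e and f, and the choice to sum repeated transfers between the same pair of users are all assumptions.

```python
from collections import defaultdict

def build_transfer_network(records):
    """Build G = (V, E, W) from (payer, payee, amount) transfer records."""
    weights = defaultdict(float)   # W: directed edge -> transfer amount
    nodes = set()                  # V: one node per user
    for payer, payee, amount in records:
        nodes.add(payer)
        nodes.add(payee)
        # Assumption: repeated transfers on the same edge are summed.
        weights[(payer, payee)] += amount
    edges = sorted(weights)        # E: directed edges (payer -> payee)
    return sorted(nodes), edges, dict(weights)

# Hypothetical amounts standing in for a, e and f in Fig. 3.
records = [
    ("u1", "u2", 120.0),
    ("u5", "u1", 30.0),
    ("u3", "u5", 75.5),
    ("u1", "u2", 80.0),
]
V, E, W = build_transfer_network(records)
```

The edge direction follows the arrow convention of Fig. 3: the first user of each pair pays and the second receives, so `("u1", "u2")` and `("u2", "u1")` are distinct edges.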
Next, the node vector acquisition module obtains the node vectors of the directed weighted network constructed for the transfer records. Here the LINE algorithm may be used, based on first-order proximity and second-order proximity, to represent the network nodes as low-dimensional vectors and thereby obtain the node vectors.
It should be noted that the similarity between two nodes can represent the similarity between the users' transfer records. If credit evaluations are given for the sample users corresponding to some initial nodes (that is, after the heterogeneous data of the users is obtained, part of the information is randomly selected as sample information, and the credit levels of the users corresponding to the sample information are labeled manually), then the credit evaluations of other users can be measured through the similarity between the other nodes and the sample users. In other words, once network embedding with the LINE algorithm is complete, each node in the directed weighted network G is represented by a low-dimensional vector, so the similarity between users can be measured by Euclidean distance.
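As a small illustration of measuring user similarity by Euclidean distance, the sketch below assigns an unlabeled user the credit label of the closest labeled sample user. The nearest-sample rule, the function names, the two-dimensional vectors and the labels are assumptions chosen for illustration; the text above only states that Euclidean distance between node vectors measures similarity.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two node vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_labeled_credit(user_vec, labeled):
    """Return the credit label of the labeled sample user closest to user_vec.

    `labeled` maps a sample user to (node_vector, manually_assigned_label).
    """
    closest = min(labeled, key=lambda u: euclidean(user_vec, labeled[u][0]))
    return labeled[closest][1]

# Hypothetical 2-dimensional node vectors and manual labels.
labeled = {
    "u1": ([0.1, 0.9], "good"),
    "u4": ([0.8, 0.2], "poor"),
}
estimate = nearest_labeled_credit([0.2, 0.8], labeled)
```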
In addition, a mobile payment record is a record produced when a user pays through a social platform (such as WeChat) or through an online payment platform (such as Alipay or a shopping website). A mobile payment record may include the user identifier, the type of goods paid for, the payment amount, the shop paid, the transaction timestamp, and so on.
After the mobile payment records are obtained, the feature vector acquisition module may obtain the feature vectors of the mobile payment records. For example, the payment features in the mobile payment records may be extracted first; the payment features may include the user identifier, the type of goods paid for, the payment amount, the shop paid, the transaction timestamp, and so on. These payment features are then converted to numeric form, and the feature vectors are generated from the numeric features.
Finally, the first generation subunit 3023 may generate the feature space corresponding to the transaction information according to the node vectors obtained from the transfer records and the feature vectors of the mobile payment records. For example, the node vectors of the transfer records and the feature vectors of the mobile payment records may be concatenated to obtain the feature space corresponding to the transaction information.
Optionally, the node vector acquisition module may include:
A first calculation submodule, configured to calculate the estimated connection probability and the empirical connection probability between every two nodes in the directed weighted network;
A second calculation submodule, configured to calculate the distribution difference between the estimated connection probability and the empirical connection probability to obtain a first objective function;
A third calculation submodule, configured to calculate the estimated context probability and the empirical context probability between every two nodes in the directed weighted network;
A fourth calculation submodule, configured to calculate the distribution difference between the estimated context probability and the empirical context probability to obtain a second objective function;
An acquisition submodule, configured to obtain the node vectors of the directed weighted network according to the first objective function and the second objective function.
Specifically, the LINE algorithm is first used to model first-order proximity on the directed weighted network corresponding to the transfer records:
The first calculation submodule calculates the estimated connection probability between every two nodes in the directed weighted network. The estimated connection probability is defined in the low-dimensional space and may be as shown in formula (1):
Here, $p_1(v_i, v_j)$ denotes the estimated connection probability between node $v_i$ and node $v_j$; $v_i$ and $v_j$ are two nodes in the directed weighted network that are joined by an edge, namely the edge $(v_i, v_j)$; $u_i$ is the vector representation of node $v_i$ in the low-dimensional space obtained by the LINE algorithm; $u_j$ is the vector representation of node $v_j$ in the low-dimensional space obtained by the LINE algorithm; $T$ denotes vector transposition; and $\exp$ denotes the exponential function with the natural constant $e$ as its base.
The empirical connection probability represents the probability that two nodes in the directed weighted network are connected to each other, and is defined in the high-dimensional space. The first calculation submodule calculates the empirical connection probability between every two nodes in the directed weighted network, which may be as shown in formula (2):
Here, $\hat{p}_1(v_i, v_j)$ denotes the empirical connection probability, $w_{i,j}$ is the weight of the edge between node $v_i$ and node $v_j$ in the directed weighted network, and $W$ is the sum of all edge weights in the directed weighted network, that is, $W = \sum_{(i,j) \in E} w_{i,j}$.
Then, the second calculation submodule calculates the distribution difference between the estimated connection probability and the empirical connection probability to obtain the first objective function, which may be as shown in formula (3):
Here, $O_1$ denotes the first objective function, $\hat{p}_1(\cdot,\cdot)$ denotes the empirical connection probability, $p_1(\cdot,\cdot)$ denotes the estimated connection probability, and $d(\cdot,\cdot)$ denotes the KL divergence (Kullback-Leibler divergence) between the distributions of the estimated connection probability and the empirical connection probability. The KL divergence is also called relative entropy, and its definition may be as shown in formula (4):
It should be noted that one requirement of network embedding is that the information a node carries in the original space of the directed weighted network be preserved as far as possible in the embedded space, so the distribution of distances between nodes should also be preserved. Consequently, if two nodes are connected in the original directed weighted network, the distance between their corresponding vectors after embedding should also be small. To characterize the difference between the two distributions, the classical KL divergence may be used here.
The empirical connection probability in the high-dimensional space represents the original connection information of the directed weighted network (that is, the adjacency matrix), and the estimated connection probability in the low-dimensional space represents the vector space obtained after the nodes of the directed weighted network are vectorized. Minimizing the first objective function minimizes the distribution difference between the estimated connection probability in the low-dimensional space and the empirical connection probability in the high-dimensional space. The first objective function $O_1$ characterizes first-order proximity; that is, two nodes that are connected in the original directed weighted network should also have close vector representations in the low-dimensional space.
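Formulas (1) to (4) are referenced above but are not reproduced in this text. As a hedged reconstruction, the forms below agree with the symbol definitions given above and with the published LINE formulation; the hat notation for the empirical probability and the generic distributions $P$ and $Q$ in formula (4) are assumed notation.

```latex
\begin{align}
p_1(v_i, v_j) &= \frac{1}{1 + \exp\left(-u_i^{T} u_j\right)} \tag{1} \\
\hat{p}_1(v_i, v_j) &= \frac{w_{i,j}}{W}, \qquad W = \sum_{(i,j) \in E} w_{i,j} \tag{2} \\
O_1 &= d\left(\hat{p}_1(\cdot,\cdot),\; p_1(\cdot,\cdot)\right) \tag{3} \\
d(P, Q) &= \sum_{x} P(x) \log \frac{P(x)}{Q(x)} \tag{4}
\end{align}
```

Under these assumed forms, substituting (1), (2) and (4) into (3) and dropping the terms that do not depend on the node vectors leaves $O_1 = -\sum_{(i,j) \in E} w_{i,j} \log p_1(v_i, v_j)$, which is the quantity an optimizer would actually minimize.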
Further, the LINE algorithm is used to model second-order proximity on the directed weighted network corresponding to the transfer records:
The third calculation submodule calculates the estimated context probability between every two nodes in the directed weighted network, which may be as shown in formula (5):
Here, $p_2(v_j \mid v_i)$ denotes the estimated context probability of node $v_j$ given node $v_i$, $|V|$ denotes the number of nodes in the directed weighted network, and $T$ denotes vector transposition.
The third calculation submodule calculates the empirical context probability between every two nodes in the directed weighted network, which may be as shown in formula (6):
Here, $w_{i,j}$ is the weight of an edge in the directed weighted network and $d_i$ is the out-degree of node $v_i$. In a directed weighted network, the out-degree of a node is the number of nodes that the node connects to; correspondingly, the in-degree of a node is the number of nodes that connect to that node. The out-degree $d_i$ of node $v_i$ may be expressed as follows:
Then, the fourth calculation submodule calculates the distribution difference between the estimated context probability and the empirical context probability to obtain the second objective function, which may be as shown in formula (7):
Here, $\lambda_i$ may be defined as the degree of each node (including in-degree and out-degree), namely:
It should be noted that minimizing the second objective function minimizes the distribution difference between the estimated context probability in the low-dimensional space and the empirical context probability in the high-dimensional space. The second objective function $O_2$ characterizes second-order proximity; that is, two nodes that share many common neighbor nodes in the original directed weighted network should also have close vector representations in the low-dimensional space.
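Formulas (5) to (7) are likewise referenced but not reproduced in this text. The reconstruction below follows the published LINE formulation and should be read as an assumption: $u'_j$ (the vector of $v_j$ when it acts as a context), $N(i)$ (the out-neighbors of $v_i$) and the weighted form of the out-degree are notation taken from LINE rather than from the text, which describes the out-degree as a count of connected nodes. The exact expression for $\lambda_i$ is not recoverable, so it is left symbolic.

```latex
\begin{align}
p_2(v_j \mid v_i) &= \frac{\exp\left({u'_j}^{T} u_i\right)}{\sum_{k=1}^{|V|} \exp\left({u'_k}^{T} u_i\right)} \tag{5} \\
\hat{p}_2(v_j \mid v_i) &= \frac{w_{i,j}}{d_i}, \qquad d_i = \sum_{k \in N(i)} w_{i,k} \tag{6} \\
O_2 &= \sum_{i \in V} \lambda_i \, d\left(\hat{p}_2(\cdot \mid v_i),\; p_2(\cdot \mid v_i)\right) \tag{7}
\end{align}
```

Here $d(\cdot,\cdot)$ is the same KL divergence used for the first objective function, applied to the context distribution of each node $v_i$ and weighted by $\lambda_i$.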
After the first objective function and the second objective function are obtained, the acquisition submodule may obtain the node vectors under the first objective function and the node vectors under the second objective function, and then concatenate the two to obtain the node vectors of the directed weighted network. These node vectors implicitly contain the information of the transfer records.
Optionally, the acquisition submodule is specifically configured to:
Optimize the first objective function by a stochastic gradient descent algorithm to obtain the low-dimensional node vectors under the first objective function;
Optimize the second objective function by the stochastic gradient descent algorithm to obtain the low-dimensional node vectors under the second objective function;
Concatenate the low-dimensional node vectors under the first objective function with the low-dimensional node vectors under the second objective function to obtain the node vectors of the directed weighted network.
Specifically, to obtain accurate low-dimensional node vectors, after the first objective function and the second objective function are obtained, the acquisition submodule may optimize the two objective functions separately and then obtain the node vectors of the directed weighted network from the optimized objective functions. For example, the first objective function may be optimized by the stochastic gradient descent algorithm (Stochastic Gradient Descent, SGD) to obtain the low-dimensional node vectors under the first objective function, and the second objective function may be optimized by the SGD algorithm to obtain the low-dimensional node vectors under the second objective function. Finally, the node vectors under the first objective function and the node vectors under the second objective function are concatenated side by side to obtain the node vectors of the directed weighted network.
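The optimize-then-concatenate procedure can be sketched in pure Python as below. This is a simplified illustration under stated assumptions, not the patented implementation: it uses negative sampling with uniformly drawn negatives and weight-proportional edge sampling, which are practical techniques from the LINE literature that the text above does not mention, and the names `train` and `node_vectors`, the dimension, the step count and the learning rate are all hypothetical.

```python
import math
import random

def sigmoid(x):
    x = max(-30.0, min(30.0, x))            # clamp so exp() stays in range
    return 1.0 / (1.0 + math.exp(-x))

def train(weights, nodes, order, dim=4, steps=2000, lr=0.05, negatives=2, seed=0):
    """SGD on one LINE-style objective; returns {node: low-dimensional vector}."""
    rng = random.Random(seed)
    emb = {v: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for v in nodes}
    # First order scores node vectors against each other; second order
    # scores a node vector against a separate context vector.
    ctx = emb if order == 1 else {v: [0.0] * dim for v in nodes}
    edges = list(weights)
    edge_weights = [weights[e] for e in edges]
    for _ in range(steps):
        # Draw one edge with probability proportional to its weight.
        src, dst = rng.choices(edges, weights=edge_weights)[0]
        targets = [(dst, 1.0)]              # the observed edge is a positive example
        for _ in range(negatives):          # assumption: uniform negative sampling
            neg = rng.choice(nodes)
            if neg not in (src, dst):
                targets.append((neg, 0.0))
        for tgt, label in targets:
            score = sum(a * b for a, b in zip(emb[src], ctx[tgt]))
            g = lr * (label - sigmoid(score))
            old_src = list(emb[src])
            for k in range(dim):
                emb[src][k] += g * ctx[tgt][k]
                ctx[tgt][k] += g * old_src[k]
    return emb

def node_vectors(weights, nodes, dim=4):
    """Concatenate first-order and second-order vectors side by side."""
    first = train(weights, nodes, order=1, dim=dim)
    second = train(weights, nodes, order=2, dim=dim)
    return {v: first[v] + second[v] for v in nodes}

# Hypothetical transfer network: edge -> transfer amount.
weights = {("u1", "u2"): 200.0, ("u5", "u1"): 30.0, ("u3", "u5"): 75.5}
nodes = ["u1", "u2", "u3", "u5"]
vectors = node_vectors(weights, nodes)
```

Each resulting vector has length `2 * dim`: the first half comes from the first-order objective and the second half from the second-order objective.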
Optionally, the feature vector acquisition module may include:
A feature acquisition submodule, configured to obtain the multidimensional payment features of the mobile payment records;
An encoding submodule, configured to encode the multidimensional payment features to obtain payment encoding information;
A generation submodule, configured to generate the feature vectors of the payment records according to the payment encoding information.
Specifically, the feature acquisition submodule obtains the multidimensional payment features of the mobile payment records. The multidimensional payment features may include any number of payment features among the user identifier, the type of goods paid for, the payment amount, the shop paid, the transaction timestamp, and so on. For example, a mobile payment record may be expressed as: user, category, money, shop_name, time_stamp, and so on, where user denotes the user identifier, which may be of string type; category denotes the type of goods paid for, which may be of string type; money denotes the payment amount, which may be of floating-point type; shop_name denotes the shop paid, which may be of string type; and time_stamp denotes the transaction timestamp, which may be of timestamp type.
After the multidimensional payment features of the mobile payment records are obtained, the encoding submodule may encode the multidimensional payment features to obtain the payment encoding information. The payment encoding information may be numeric information, and the encoding scheme may be set flexibly according to actual needs; its specific content is not limited here. Finally, the generation submodule may generate the feature vectors of the payment records according to the payment encoding information of each payment record; the feature vectors of the payment records may include the feature vectors corresponding to one or more payment records.
Optionally, the encoding submodule is specifically configured to:
Convert the non-numeric payment features among the multidimensional payment features into numeric payment features;
Discretize the converted numeric payment features together with the numeric payment features already present among the multidimensional payment features to obtain the payment encoding information.
When the encoding submodule encodes the multidimensional payment features, it may convert them to numeric form to obtain the numeric value corresponding to each dimension of payment feature. For example, a mapping between non-numeric values and numeric values may be preset, with different non-numeric values corresponding to different numeric values. The numeric values corresponding to the non-numeric payment features among the multidimensional payment features are then obtained according to the mapping, and the non-numeric payment features are thereby converted into numeric payment features.
Alternatively, the non-numeric payment features among the multidimensional payment features may first be converted into numeric payment features to obtain numeric multidimensional payment features. For example, a mobile payment record may be expressed as: the user identifier user (string type), the type of goods paid for category (string type), the payment amount money (floating-point type), the shop paid shop_name (string type), the transaction timestamp time_stamp (timestamp type), and so on. In this record, the string-type user identifier user, the string-type goods type category and the string-type shop shop_name may be label-encoded, that is, the string type is mapped to a numeric type, to obtain the corresponding numeric payment features.
After the non-numeric payment features among the multidimensional payment features are converted into numeric payment features, all multidimensional payment features of the mobile payment records are numeric payment features, and the encoding submodule may discretize them. For example, the floating-point payment amount money in the above record may be discretized at equal intervals into 10 levels (the difference between the maximum and minimum of all payment amounts is divided into 10 levels), and the numeric payment feature corresponding to each level is determined; the timestamp-type transaction timestamp time_stamp may be divided at a granularity of 10 minutes, and the numeric payment feature corresponding to each granule is determined. Finally, the payment encoding information corresponding to each dimension of payment feature is obtained from the discretized multidimensional payment features.
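The conversion and discretization steps can be sketched as follows. The field names come from the record layout described above; the helper names, the first-seen label numbering, the use of epoch seconds for `time_stamp` and the sample records are assumptions made for illustration.

```python
def label_encode(values):
    """Map each distinct non-numeric value to an integer label (first seen -> 0)."""
    mapping = {}
    for v in values:
        mapping.setdefault(v, len(mapping))
    return [mapping[v] for v in values]

def equal_width_levels(values, levels=10):
    """Split [min, max] into `levels` equal-width bins; return each value's bin index."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / levels or 1.0   # avoid a zero width when all values are equal
    return [min(int((v - lo) / width), levels - 1) for v in values]

def ten_minute_slot(epoch_seconds):
    """Index of the 10-minute granule containing the timestamp (assumed epoch seconds)."""
    return epoch_seconds // 600

def encode(records):
    """Return one numeric row per record: [user, category, money, shop_name, time_stamp]."""
    users = label_encode([r["user"] for r in records])
    cats = label_encode([r["category"] for r in records])
    money = equal_width_levels([r["money"] for r in records])
    shops = label_encode([r["shop_name"] for r in records])
    slots = [ten_minute_slot(r["time_stamp"]) for r in records]
    return [list(row) for row in zip(users, cats, money, shops, slots)]

# Hypothetical mobile payment records.
records = [
    {"user": "u1", "category": "food", "money": 5.0, "shop_name": "shop_a", "time_stamp": 0},
    {"user": "u2", "category": "books", "money": 105.0, "shop_name": "shop_b", "time_stamp": 1199},
    {"user": "u1", "category": "food", "money": 55.0, "shop_name": "shop_a", "time_stamp": 1200},
]
encoded = encode(records)
```

With these three records the amounts span 5.0 to 105.0, so each of the 10 levels is 10.0 wide, and the maximum amount is placed in the last level rather than in an eleventh one.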
It should be noted that, when analyzing a user's mobile payment records, all of the mobile payment records may first be preprocessed to obtain target payment features, and the feature vector of the mobile payment records may be generated from the target payment features. The target payment features may include the average spending amount, the most frequently occurring spending amount, the time period in which the user spends most frequently, and so on. For example, over all of the user's mobile payment records, the following statistics may be computed: the average spending amount avg_num; the most frequently occurring spending amount, for which the spending amounts are discretized at equal intervals in units of 100 and the average most_num of the spending amounts within the interval with the highest frequency is taken; and the time period most_time in which the user spends most frequently. The feature vector of the payment records is then formed from this statistical information.
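A minimal sketch of these statistics is given below. The names avg_num, most_num and most_time come from the text; the `hour` field used as the time period, the function name and the sample records are assumptions.

```python
from collections import Counter

def payment_statistics(records):
    """Return [avg_num, most_num, most_time] for one user's payment records."""
    amounts = [r["money"] for r in records]
    avg_num = sum(amounts) / len(amounts)

    # Discretize the amounts at equal intervals in units of 100.
    bins = Counter(int(m // 100) for m in amounts)
    top_bin = bins.most_common(1)[0][0]
    in_top = [m for m in amounts if int(m // 100) == top_bin]
    most_num = sum(in_top) / len(in_top)   # average amount inside the busiest interval

    # Assumption: the time period is the hour of day, stored in a hypothetical "hour" field.
    most_time = Counter(r["hour"] for r in records).most_common(1)[0][0]
    return [avg_num, most_num, most_time]

# Hypothetical records for a single user.
records = [
    {"money": 50.0, "hour": 12},
    {"money": 70.0, "hour": 12},
    {"money": 380.0, "hour": 20},
]
features = payment_statistics(records)
```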
In summary, after the node vectors of the directed weighted network corresponding to the transfer records and the feature vectors of the mobile payment records are obtained, the first generation subunit 3023 may concatenate the node vectors of the transfer records and the feature vectors of the mobile payment records side by side to generate the feature space corresponding to the transaction information. In this side-by-side concatenation, the node vector may be placed on the left and the feature vector on the right, or the node vector on the right and the feature vector on the left.
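Side-by-side concatenation amounts to joining the two vectors of each user into one row of the feature space. The sketch below assumes both inputs are dictionaries keyed by the same users; the function names and the sample values are hypothetical.

```python
def splice(node_vector, feature_vector, node_left=True):
    """Concatenate one user's node vector and payment feature vector."""
    if node_left:
        return node_vector + feature_vector
    return feature_vector + node_vector

def transaction_feature_space(node_vectors, feature_vectors, node_left=True):
    """One row per user; assumes both dictionaries hold the same users."""
    return {
        user: splice(node_vectors[user], feature_vectors[user], node_left)
        for user in node_vectors
    }

# Hypothetical node vectors (from transfer records) and payment feature vectors.
node_vectors = {"u1": [0.1, 0.2], "u2": [0.3, 0.4]}
feature_vectors = {"u1": [3, 1], "u2": [0, 2]}
space = transaction_feature_space(node_vectors, feature_vectors)
```

Whichever order is chosen, it should be the same for every user so that each column of the feature space keeps a single meaning.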
In some embodiments, the candidate category information includes behavior information, and the acquisition subunit 3022 is specifically configured to:
Obtain the multidimensional behavior features of the behavior information;
Construct the directed weighted network corresponding to each dimension of behavior feature;
Obtain the node vectors of each directed weighted network;
The first generation subunit 3023 is specifically configured to generate the feature space corresponding to the behavior information according to the node vectors of each directed weighted network.
Take the case where the candidate category information is behavior information as an example. The behavior information may include the user's like frequency, comment frequency, message frequency, voice call frequency, video call frequency, and so on. The acquisition subunit 3022 may extract multidimensional behavior features from the behavior information; the multidimensional behavior features may include any number of behavior features among the like frequency, the comment frequency, the message frequency, the call frequency, and so on, where the call frequency includes the voice call frequency, the video call frequency and the like.
It should be noted that, to protect user privacy, category information such as behavior information may undergo processing such as transcoding and desensitization in the embodiments of the present invention. The category information collected in the embodiments of the present invention is therefore information that has already been transcoded and desensitized, which achieves the purpose of protecting user privacy.
For each dimension of behavior feature of the behavior information, the acquisition subunit 3022 constructs the corresponding directed weighted network. For example, suppose the behavior features include the like frequency, the comment frequency, the message frequency and the call frequency. In this case, a directed weighted network A corresponding to the like frequency may be constructed, which is the network of how often users like each other's posts in Moments; a directed weighted network B corresponding to the comment frequency may be constructed, which is the network of users commenting on each other's posts in Moments; a directed weighted network C corresponding to the message frequency may be constructed, which is the network of how often users send messages to each other on social platforms (such as WeChat and QQ); and a directed weighted network D corresponding to the call frequency may be constructed, which is the network of how often users make video or voice calls to each other.
After the directed weighted network corresponding to each dimension of behavior feature is obtained, the acquisition subunit 3022 may use the LINE algorithm, based on first-order proximity and second-order proximity, to obtain the node vectors of each directed weighted network. Obtaining the node vectors of each directed weighted network by the LINE algorithm is similar to obtaining the node vectors of the directed weighted network corresponding to the transfer records as described above, and is not repeated here.
After the node vectors of the directed weighted network corresponding to each dimension of behavior feature are obtained, the first generation subunit 3023 may generate the feature space corresponding to the behavior information according to the node vectors of each directed weighted network. For example, the node vector a of directed weighted network A, the node vector b of directed weighted network B, the node vector c of directed weighted network C and the node vector d of directed weighted network D may be concatenated to obtain the feature space corresponding to the behavior information.
In some embodiments, as shown in Fig. 11, the category information includes attribute information, and the feature acquisition unit 302 includes:
A feature acquisition subunit 3024, configured to obtain the multidimensional attribute features of the attribute information among the multiple pieces of category information;
An encoding subunit 3025, configured to encode the multidimensional attribute features to obtain attribute encoding information;
A second generation subunit 3026, configured to generate the feature space corresponding to the attribute information according to the attribute encoding information.
Specifically, take the case where the category information is attribute information as an example. The attribute information may include the user's gender, hometown, city of residence, age, and so on. The feature acquisition subunit 3024 may obtain multidimensional attribute features from the attribute information; the multidimensional attribute features may include any number of attribute features among gender, hometown, city of residence, age, and so on.
For example, the multidimensional attribute features of a user's attribute information may be expressed as: user, gender, home, domicile, age, and so on, where user denotes the user identifier, which may be of string type; gender denotes the user's gender, which may be of string type; home denotes the user's hometown, which may be of string type; domicile denotes the user's city of residence, which may be of string type; and age denotes the user's current age, which may be of integer type. The user's gender, hometown, city of residence and the like may be provided by the user when registering a social account or an account on another website platform, or may be obtained through other channels. The user's age may be calculated from the age filled in at account registration and the time elapsed since then, or may be obtained through other channels.
It should be noted that, to protect user privacy, category information such as attribute information may undergo processing such as transcoding and desensitization in the embodiments of the present invention. The category information collected in the embodiments of the present invention is therefore information that has already been transcoded and desensitized, which achieves the purpose of protecting user privacy.
After the multidimensional attribute features of the attribute information are obtained, the encoding subunit 3025 may encode the multidimensional attribute features to obtain the attribute encoding information. The attribute encoding information may be numeric information, and the encoding scheme may be set flexibly according to actual needs; its specific content is not limited here. Finally, the second generation subunit 3026 may generate the feature space corresponding to the attribute information according to the attribute encoding information; for example, the attribute encoding information may directly form the feature space.
Optionally, the encoding subunit 3025 is specifically configured to:
Convert the non-numeric attribute features among the multidimensional attribute features into numeric attribute features;
Generate the attribute encoding information according to the converted numeric attribute features and the numeric attribute features already present among the multidimensional attribute features.
When the encoding subunit 3025 encodes the multidimensional attribute features, it may convert them to numeric form to obtain the numeric value corresponding to each dimension of attribute feature. For example, a mapping between non-numeric values and numeric values may be preset, with different non-numeric values corresponding to different numeric values. The numeric values corresponding to the non-numeric attribute features among the multidimensional attribute features are then obtained according to the mapping, and the non-numeric attribute features are thereby converted into numeric attribute features.
Alternatively, the non-numeric attribute features among the multidimensional attribute features may be turned into numeric attribute features by encoding. That is, for each dimension of non-numeric attribute feature, the number n of distinct values is counted, and the values of that dimension are then encoded in the order 1 to n, with identical values receiving identical codes (for example, male is encoded as 1 and female as 0), so that the non-numeric type is converted to a numeric type. For example, the string-type user identifier user, gender gender, hometown home, city of residence domicile and the like may be mapped to numeric types to obtain the corresponding numeric attribute features.
After the non-numeric attribute features among the multidimensional attribute features are converted into numeric attribute features, all multidimensional attribute features of the attribute information are numeric attribute features, and the attribute encoding information can be obtained. The attribute feature vector of the attribute information may be generated according to the attribute encoding information, and this attribute feature vector forms the feature space corresponding to the attribute information.
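The per-dimension encoding can be sketched as follows. The sketch numbers the n distinct values of each non-numeric dimension from 1 to n, as the text describes; numbering the values in sorted order is an assumption (the text does not fix the order, and its male/female example uses 1 and 0 instead). The function names and the sample users are hypothetical.

```python
def encode_dimension(values):
    """Number the n distinct values of one dimension 1..n; equal values share a code."""
    codes = {v: i for i, v in enumerate(sorted(set(values)), start=1)}
    return [codes[v] for v in values]

def attribute_feature_space(users,
                            text_dims=("gender", "home", "domicile"),
                            numeric_dims=("age",)):
    """Return one attribute feature vector per user."""
    columns = [encode_dimension([u[d] for u in users]) for d in text_dims]
    columns += [[u[d] for u in users] for d in numeric_dims]   # numeric dims kept as-is
    return [list(row) for row in zip(*columns)]

# Hypothetical (already desensitized) attribute records.
users = [
    {"gender": "male", "home": "Guangzhou", "domicile": "Shenzhen", "age": 31},
    {"gender": "female", "home": "Chengdu", "domicile": "Shenzhen", "age": 27},
    {"gender": "male", "home": "Chengdu", "domicile": "Beijing", "age": 45},
]
space = attribute_feature_space(users)
```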
As described above, a different processing approach is used for each piece of category information to obtain its corresponding feature space, which improves the flexibility of feature space acquisition.
The first mapping unit 303 is configured to map the feature space corresponding to each piece of category information to a kernel space, to obtain multiple kernel spaces.
After the feature space corresponding to each piece of category information is obtained, the first mapping unit 303 may map each feature space to a kernel space to obtain the kernel space corresponding to each piece of category information. The kernel space may be an infinite-dimensional space, and may include the pairwise inner products between feature vectors, the pairwise inner products between node vectors, and so on. For example, after the feature space of the transaction information, the feature space of the behavior information and the feature space of the attribute information are obtained, each of these feature spaces may be mapped to a kernel space to obtain the kernel space of the transaction information, the kernel space of the behavior information and the kernel space of the attribute information.
Specifically, feature space can be mapped to nuclear space by kernel function by the first map unit 303, i.e., by feature sky Between middle characterization user DUAL PROBLEMS OF VECTOR MAPPING to nuclear space.Kernel function is defined as follows:
Let χ be the input space, which may be a subset of the Euclidean space R^n or a discrete set, and let H be the kernel space, which may be a Hilbert space. If there exists a mapping φ(x): χ → H from χ to H such that, for all x, z ∈ χ, the function k(x, z) satisfies the condition:
k(x, z) = φ(x)·φ(z)
then k(x, z) is called a kernel function and φ(x) is the mapping function, where φ(x)·φ(z) is the inner product of φ(x) and φ(z).
The kernel function may be a linear kernel function, a polynomial kernel function, a Gaussian kernel function, a radial basis kernel function, a sigmoid kernel function, a composite kernel function, or the like. The following description takes the Gaussian kernel function as an example. The Gaussian kernel function may be as follows:
k(x, z) = exp(-‖x - z‖^2 / (2σ^2))
where σ > 0 is the width parameter of the Gaussian kernel.
The feature space obtained for each category of information is then taken as the input space χ and mapped to the corresponding kernel space by the Gaussian kernel function. For example, the feature space of the transaction information, the feature space of the behavior information, and the feature space of the attribute information are each processed by the Gaussian kernel function and mapped to the kernel space of the transaction information, the kernel space of the behavior information, and the kernel space of the attribute information, respectively.
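As a rough illustration of this mapping, the sketch below computes the kernel matrix of a small set of feature vectors. It assumes the common form of the Gaussian kernel, k(x, z) = exp(-‖x - z‖^2 / (2σ^2)); the function names, the default value of sigma, and the representation of a kernel space as a matrix of pairwise kernel values are assumptions made for the example.

```python
import math


def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian kernel between two equal-length numeric vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def kernel_matrix(samples, sigma=1.0):
    """Pairwise kernel values K[i][j] = k(x_i, x_j) for one category of information."""
    return [[gaussian_kernel(xi, xj, sigma) for xj in samples] for xi in samples]
```

One such matrix would be computed per category of information (for example transaction, behavior, and attribute), each from that category's own feature vectors.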
The synthesis unit 304 is configured to perform multi-kernel linear combination on the multiple kernel spaces to obtain a synthesized kernel space.
After the multiple kernel spaces, made up of the kernel space corresponding to each category of information, are obtained, the synthesis unit 304 may perform multi-kernel linear combination on them, for example by computing a linear combination of the kernel spaces, to obtain the synthesized kernel space.
In some embodiments, the synthesis unit 304 is specifically configured to:
normalize the multiple kernel spaces to obtain multiple normalized kernel spaces;
obtain the weight value corresponding to each category of information among the multiple categories of information;
perform multi-kernel linear combination according to the weight value corresponding to each category of information and each normalized kernel space, to obtain the synthesized kernel space.
When the feature space corresponding to each category of information is obtained, the value range of each feature dimension is not restricted. Some features may therefore have a large value range and others a small one, which affects the resulting synthesized kernel space. To reduce this effect, the synthesis unit 304 may normalize the kernel spaces.
Specifically, the kernel spaces are normalized as follows:
Let S = {x_1, ..., x_n} be a finite subset of the input space χ, let the feature mapping be φ(x): χ → H, and let the kernel function be k(x, z) = φ(x)·φ(z). Let φ(S) = {φ(x_1), ..., φ(x_n)} be the image of S under the mapping φ. The elements of the kernel matrix K are K_ij = k(x_i, x_j), i, j = 1, ..., n. The norm of the feature vector φ(x) is:
‖φ(x)‖ = sqrt(φ(x)·φ(x)) = sqrt(k(x, x))
The normalized feature vector is:
φ̂(x) = φ(x) / ‖φ(x)‖
The normalized kernel space is:
k̂(x, z) = φ̂(x)·φ̂(z) = k(x, z) / sqrt(k(x, x)·k(z, z))
After each kernel space is normalized, the normalized kernel space corresponding to it is obtained. Normalizing the multiple kernel spaces therefore yields multiple normalized kernel spaces.
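The normalization can be illustrated on a kernel matrix. The sketch below assumes the usual normalization k(x, z) / sqrt(k(x, x)·k(z, z)), so that every sample has unit norm in the kernel space; the function name is hypothetical, and every diagonal entry of the input is assumed to be positive.

```python
import math


def normalize_kernel_matrix(K):
    """Scale K so that each diagonal entry becomes 1 (unit-norm feature vectors)."""
    n = len(K)
    # norms[i] is the norm of the mapped sample, sqrt(k(x_i, x_i)).
    norms = [math.sqrt(K[i][i]) for i in range(n)]
    return [[K[i][j] / (norms[i] * norms[j]) for j in range(n)] for i in range(n)]
```

For a Gaussian kernel, k(x, x) is already 1, so this step leaves the matrix unchanged; it matters for kernels such as the linear or polynomial kernel, whose values depend on the scale of the features.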
Because the heterogeneous data of the user is processed separately for each category of information, and the categories of information are heterogeneous and diverse, the resulting kernel spaces have different characteristics. Kernel spaces with different characteristics can therefore be linearly combined so that the combination draws on the advantages of each kernel space and achieves better mapping performance.
Multi-kernel linear combination is performed on the multiple kernel spaces to construct the synthesized kernel space, which may be a linearly weighted sum kernel. Specifically, the synthesis unit 304 first obtains the weight value corresponding to each category of information, and then performs multi-kernel linear combination according to these weight values and the normalized kernel spaces to obtain the synthesized kernel space.
For example, let the kernel spaces corresponding to the transaction information, the behavior information, and the attribute information be k_1(x, z), k_2(x, z), and k_3(x, z), respectively, and let the normalized kernel spaces be k̂_1(x, z), k̂_2(x, z), and k̂_3(x, z), respectively. Multi-kernel linear combination of these three normalized kernel spaces gives the synthesized kernel space K(x, z):
K(x, z) = Σ_{i=1}^{3} β_i k̂_i(x, z)
where β_i indicates how important the kernel space corresponding to the i-th category of information is to the credit evaluation of the user, that is, the weight value corresponding to the i-th category of information. Obtaining the synthesized kernel space by multi-kernel linear combination of the multiple kernel spaces avoids the information loss and large errors that arise when the features of heterogeneous data are simply concatenated.
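The weighted combination can be sketched as an element-wise weighted sum of the normalized kernel matrices. The function name and the example weights are hypothetical; the text does not state how the weight values are chosen, so they are simply passed in.

```python
def combine_kernels(kernel_matrices, weights):
    """Linearly weighted sum of equally sized kernel matrices."""
    if len(kernel_matrices) != len(weights):
        raise ValueError("one weight is required per kernel matrix")
    n = len(kernel_matrices[0])
    return [
        [sum(w * K[i][j] for w, K in zip(weights, kernel_matrices)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical normalized kernel matrices for two users and three categories.
k_transaction = [[1.0, 0.5], [0.5, 1.0]]
k_behavior = [[1.0, 0.25], [0.25, 1.0]]
k_attribute = [[1.0, 0.0], [0.0, 1.0]]
synthesized = combine_kernels([k_transaction, k_behavior, k_attribute], [0.5, 0.25, 0.25])
```

Because each category keeps its own kernel, the feature vectors of different categories never need to share a dimension or a scale.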
The evaluation unit 305 is configured to obtain the credit evaluation result of the user according to the synthesized kernel space.
After multi-kernel linear combination has been performed on the multiple kernel spaces and the synthesized kernel space has been obtained, the evaluation unit 305 may obtain the credit evaluation result of the user according to the synthesized kernel space. The credit evaluation result may be a credit score, a credit rating, or the like.
In some embodiments, the evaluation unit 305 is specifically configured to calculate the credit score of the user according to the synthesized kernel space by means of a preset regression model.
The evaluation unit 305 may preset a regression model, which is mainly used to calculate the credit evaluation result of the user according to the synthesized kernel space. Taking a credit score as the credit evaluation result, the evaluation unit 305 may input the obtained synthesized kernel space into the regression model, which outputs the credit score of the user. It may be specified that a higher credit score indicates better credit and a lower credit score indicates worse credit.
Optionally, taking a credit rating as the credit evaluation result, the evaluation unit 305 may further rate the credit of the user according to the obtained credit score, for example by determining the credit rating that corresponds to the score. It may be specified that a higher credit score corresponds to a higher credit rating and a lower credit score to a lower credit rating, and that a higher credit rating indicates better credit and a lower credit rating indicates worse credit.
In some embodiments, as shown in Figure 12, the credit evaluation device further includes:
an information set acquisition unit 306, configured to obtain a training sample set and divide the training sample set into multiple category information sets;
a second mapping unit 307, configured to map the multiple category information sets to kernel spaces;
an objective function generation unit 308, configured to generate an objective function according to the kernel spaces and a preset regression function;
a model generation unit 309, configured to process the objective function by a Lagrange duality algorithm to generate the regression model.
Specifically, the information set acquisition unit 306 may first collect historical data of users and use it as the training sample set. For example, all of the historical data may be used as the training sample set, or a part of the collected historical data may be selected, either at random or according to a preset rule, as the training sample set. The credit level of each user in the training sample set is then labeled manually for use in training the regression model.
The support vector regression (Support Vector Regression, SVR) model corresponding to a single category of information is described below:
Let the training sample set be {(x_i, y_i)}, i = 1, ..., n, with x_i ∈ R^d and y_i ∈ R, where x_i is the input value, y_i is the output value, d is the dimension, and n is the number of training samples. In ε-SVR, the input x_i is first mapped to a feature space by a nonlinear mapping φ, so that in the feature space the output value y_i can be fitted with a linear function f(x) = ω·φ(x) + b, such that |f(x_i) - y_i| ≤ ε holds for all training samples and f(x) is as smooth as possible. This gives the regression function of the ε-SVR algorithm, which in its standard form is:
min_{ω, b} (1/2)‖ω‖^2 + C Σ_{i=1}^{n} L(f(x_i), y_i), where L(f(x_i), y_i) = max(0, |f(x_i) - y_i| - ε)
The above formula involves the functional distance between the regression function and the actual training points of the training samples. The loss function L defined in the formula allows the model a certain error: points within the error range are regarded as lying on the model, while points outside the error range should be as close as possible to the fitted regression function. The constant C in the formula is the penalty parameter.
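The behavior of the loss function can be illustrated with the standard ε-insensitive loss, max(0, |f(x_i) - y_i| - ε), which is assumed here to be the loss L the text refers to. The function names are hypothetical.

```python
def epsilon_insensitive_loss(prediction, target, epsilon):
    """Zero inside the tolerance band of width epsilon, linear outside it."""
    return max(0.0, abs(prediction - target) - epsilon)


def total_penalty(predictions, targets, epsilon, C):
    """Penalty term C * sum of losses over the training samples."""
    return C * sum(
        epsilon_insensitive_loss(p, t, epsilon) for p, t in zip(predictions, targets)
    )
```

A prediction within ε of its target contributes nothing to the penalty, which is what it means for a point within the error range to be regarded as lying on the model.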
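As in SVM, slack variables ξ and ξ* are introduced in SVR, and the above regression function is converted, in its standard form, into:
min_{ω, b, ξ, ξ*} (1/2)‖ω‖^2 + C Σ_{i=1}^{n} (ξ_i + ξ_i*), subject to y_i - ω·φ(x_i) - b ≤ ε + ξ_i, ω·φ(x_i) + b - y_i ≤ ε + ξ_i*, and ξ_i, ξ_i* ≥ 0, i = 1, ..., n
The problem is solved by the Lagrange duality method: the above primal problem is converted into its dual problem, which gives a new regression function. With k(x_i, x_j) = φ(x_i)·φ(x_j), the standard dual is:
max_{α, α*} -(1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} (α_i - α_i*)(α_j - α_j*) k(x_i, x_j) - ε Σ_{i=1}^{n} (α_i + α_i*) + Σ_{i=1}^{n} y_i (α_i - α_i*), subject to Σ_{i=1}^{n} (α_i - α_i*) = 0 and 0 ≤ α_i, α_i* ≤ C, i = 1, ..., n
where α_i and α_i* are the Lagrange multipliers corresponding to the two constraints of the primal problem. Solving the above quadratic programming problem gives the optimal α and α*, from which the values of ω and b of the primal problem can be obtained.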
The regression function for a single category of information described above extends readily to multiple categories of information. The information set acquisition unit 306 first divides the training sample set into multiple category information sets; the second mapping unit 307 maps each category information set to a kernel space in the manner described above; and the objective function generation unit 308 generates an objective function according to the kernel spaces and the regression function. The objective function of the regression model for multiple categories of information can therefore be defined as follows:
The above problem is likewise a quadratic programming problem. After the model generation unit 309 obtains the optimal values of the parameters α, α*, β, and b by the Lagrange duality algorithm, the fitting function of the training sample set is obtained. This fitting function is the regression model and, with K(x_i, x) denoting the synthesized kernel space, may be as follows:
f(x) = Σ_{i=1}^{n} (α_i - α_i*) K(x_i, x) + b
After the regression model has been trained, when the credit of a new user x_new needs to be evaluated, the credit of the user x_new can be calculated according to the following formula:
f(x_new) = Σ_{i=1}^{n} (α_i - α_i*) K(x_i, x_new) + b
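Scoring a new user with a trained model can be sketched as follows. The sketch assumes the usual kernel expansion of support vector regression, f(x_new) = Σ_i (α_i - α_i*) K(x_i, x_new) + b, with K taken as a weighted sum of per-category Gaussian kernels. The function names, the representation of a user as a tuple of per-category feature vectors, and all numeric values are hypothetical, and the parameters α, α*, β, and b are assumed to have been obtained already by training.

```python
import math


def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian kernel; k(x, x) = 1, so it is already normalized."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def synthesized_kernel(u, v, betas, sigma=1.0):
    """Weighted sum of per-category kernels; u and v hold one vector per category."""
    return sum(
        beta * gaussian_kernel(u_m, v_m, sigma) for beta, u_m, v_m in zip(betas, u, v)
    )


def predict_credit_score(x_new, train_users, alpha, alpha_star, b, betas):
    """Evaluate the kernel expansion of the trained regression model at x_new."""
    return b + sum(
        (a - a_star) * synthesized_kernel(x_i, x_new, betas)
        for x_i, a, a_star in zip(train_users, alpha, alpha_star)
    )


# Hypothetical trained model with a single training user and three categories.
train_users = [([1.0, 2.0], [0.5], [1.0, 1.0, 2.0])]
score = predict_credit_score(
    x_new=([1.0, 2.0], [0.5], [1.0, 1.0, 2.0]),
    train_users=train_users,
    alpha=[2.0],
    alpha_star=[0.5],
    b=0.1,
    betas=[0.5, 0.25, 0.25],
)
```

Only training users with α_i different from α_i* contribute to the sum, so in practice the loop runs over the support vectors rather than the whole training set.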
Therefore, on the basis of the above regression model, the credit of a user can be evaluated from multiple categories of information. This overcomes the drawback of the traditional approach, in which all information is processed uniformly as a single category, so that the evaluation result is more accurate.
As can be seen from the above, in this embodiment of the present invention the information acquisition unit 301 classifies the heterogeneous data related to a user into multiple categories of information; the feature acquisition unit 302 then obtains the feature space corresponding to each category of information; the first mapping unit 303 maps each of these feature spaces to a kernel space; the synthesis unit 304 performs multi-kernel linear combination on the resulting kernel spaces to obtain a synthesized kernel space; and the evaluation unit 305 obtains the credit evaluation result of the user according to the synthesized kernel space. In this solution, the credit of the user is evaluated on the basis of the multiple categories of information obtained by classifying the user's heterogeneous data. This overcomes the drawback of processing all of the user's information uniformly and evaluating the user's credit from a feature matrix formed by directly concatenating all features, so that the evaluation result is more reliable.
An embodiment of the present invention further provides a server. Figure 13 shows a schematic structural diagram of the server involved in this embodiment. Specifically:
The server may include a processor 401 having one or more processing cores, a memory 402 comprising one or more computer-readable storage media, a power supply 403, an input unit 404, and other components. Those skilled in the art will understand that the server structure shown in Figure 13 does not limit the server, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components. Among them:
The processor 401 is the control center of the server. It connects the various parts of the entire server through various interfaces and lines, and performs the various functions of the server and processes data by running or executing the software programs and/or modules stored in the memory 402 and calling the data stored in the memory 402, thereby monitoring the server as a whole. Optionally, the processor 401 may include one or more processing cores. Preferably, the processor 401 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor need not be integrated into the processor 401.
The memory 402 may be used to store software programs and modules, and the processor 401 performs various functional applications and data processing by running the software programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area. The program storage area may store the operating system, the application programs required by at least one function (such as a sound playback function or an image playback function), and the like; the data storage area may store data created through use of the server, and the like. In addition, the memory 402 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device. Correspondingly, the memory 402 may further include a memory controller to provide the processor 401 with access to the memory 402.
The server further includes a power supply 403 that supplies power to the components. Preferably, the power supply 403 may be logically connected to the processor 401 through a power management system, so that functions such as charging, discharging, and power-consumption management are handled by the power management system. The power supply 403 may further include any of the following components: one or more DC or AC power sources, a recharging system, a power-failure detection circuit, a power converter or inverter, a power status indicator, and the like.
The server may further include an input unit 404, which may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.
Although not shown, the server may further include a display unit and the like, which are not described here. Specifically, in this embodiment, the processor 401 of the server loads the executable files corresponding to the processes of one or more application programs into the memory 402 according to the following instructions, and runs the application programs stored in the memory 402 to implement various functions, as follows:
Heterogeneous data related to a user is obtained and classified to obtain multiple categories of information; the vector information corresponding to each category of information among the multiple categories of information is obtained, and the feature space corresponding to each category of information is obtained according to the vector information; the feature space corresponding to each category of information is mapped to a kernel space, obtaining multiple kernel spaces; multi-kernel linear combination is performed on the multiple kernel spaces to obtain a synthesized kernel space; and the credit evaluation result of the user is obtained according to the synthesized kernel space.
Optionally, the step of obtaining the vector information corresponding to each category of information among the multiple categories of information and obtaining the feature space corresponding to each category of information according to the vector information may include: extracting candidate category information from the multiple categories of information; constructing the directed weighted network corresponding to each piece of candidate category information and obtaining the node vectors of the directed weighted network; and generating the feature space corresponding to the category information according to the node vectors.
Optionally, the candidate category information includes transaction information, and the step of constructing the directed weighted network corresponding to each piece of candidate category information, obtaining the node vectors of the directed weighted network, and generating the feature space corresponding to the category information according to the node vectors may include:
obtaining the transfer records and mobile payment records of the transaction information; constructing the directed weighted network corresponding to the transfer records; obtaining the node vectors of the directed weighted network; obtaining the feature vectors of the mobile payment records; and generating the feature space corresponding to the transaction information according to the node vectors and the feature vectors.
Optionally, the candidate category information includes behavior information, and the step of constructing the directed weighted network corresponding to each piece of candidate category information, obtaining the node vectors of the directed weighted network, and generating the feature space corresponding to the category information according to the node vectors may include:
obtaining the multidimensional behavior features of the behavior information; constructing the directed weighted network corresponding to each dimension of behavior feature;
obtaining the node vectors of each directed weighted network; and generating the feature space corresponding to the behavior information according to the node vectors of each directed weighted network.
Optionally, the category information includes attribute information, and the step of obtaining the vector information corresponding to each category of information among the multiple categories of information and obtaining the feature space corresponding to each category of information according to the vector information may include:
obtaining the multidimensional attribute features of the attribute information among the multiple categories of information; encoding the multidimensional attribute features to obtain attribute encoding information; and generating the feature space corresponding to the attribute information according to the attribute encoding information.
Optionally, the step of performing multi-kernel linear combination on the multiple kernel spaces to obtain the synthesized kernel space may include: normalizing the multiple kernel spaces to obtain multiple normalized kernel spaces; obtaining the weight value corresponding to each category of information among the multiple categories of information; and performing multi-kernel linear combination according to the weight value corresponding to each category of information and each normalized kernel space, to obtain the synthesized kernel space.
Optionally, the step of obtaining the credit evaluation result of the user according to the synthesized kernel space may include: calculating the credit score of the user according to the synthesized kernel space by means of a preset regression model.
For the specific implementation of each of the above operations, refer to the preceding embodiments; details are not repeated here.
As can be seen from the above, this embodiment of the present invention classifies the heterogeneous data related to a user into multiple categories of information, obtains the feature space corresponding to each category of information, maps each of these feature spaces to a kernel space, and performs multi-kernel linear combination on the resulting kernel spaces to obtain a synthesized kernel space, so that the credit evaluation result of the user can be obtained according to the synthesized kernel space. In this solution, the credit of the user is evaluated on the basis of the multiple categories of information obtained by classifying the user's heterogeneous data. This overcomes the drawback of processing all of the user's information uniformly and evaluating the user's credit from a feature matrix formed by directly concatenating all features, so that the evaluation result is more reliable.
Those of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments may be completed by instructions, or by instructions controlling the relevant hardware. The instructions may be stored in a computer-readable storage medium and loaded and executed by a processor.
To this end, an embodiment of the present invention provides a storage medium in which a plurality of instructions are stored. The instructions can be loaded by a processor to perform the steps of any credit evaluation method provided by the embodiments of the present invention. For example, the instructions may perform the following steps:
Heterogeneous data related to a user is obtained and classified to obtain multiple categories of information; the vector information corresponding to each category of information among the multiple categories of information is obtained, and the feature space corresponding to each category of information is obtained according to the vector information; the feature space corresponding to each category of information is mapped to a kernel space, obtaining multiple kernel spaces; multi-kernel linear combination is performed on the multiple kernel spaces to obtain a synthesized kernel space; and the credit evaluation result of the user is obtained according to the synthesized kernel space.
Optionally, the step of obtaining the vector information corresponding to each category of information among the multiple categories of information and obtaining the feature space corresponding to each category of information according to the vector information may include: extracting candidate category information from the multiple categories of information; constructing the directed weighted network corresponding to each piece of candidate category information and obtaining the node vectors of the directed weighted network; and generating the feature space corresponding to the category information according to the node vectors.
Optionally, the candidate category information includes transaction information, and the step of constructing the directed weighted network corresponding to each piece of candidate category information, obtaining the node vectors of the directed weighted network, and generating the feature space corresponding to the category information according to the node vectors may include:
obtaining the transfer records and mobile payment records of the transaction information; constructing the directed weighted network corresponding to the transfer records; obtaining the node vectors of the directed weighted network; obtaining the feature vectors of the mobile payment records; and generating the feature space corresponding to the transaction information according to the node vectors and the feature vectors.
Optionally, the candidate category information includes behavior information, and the step of constructing the directed weighted network corresponding to each piece of candidate category information, obtaining the node vectors of the directed weighted network, and generating the feature space corresponding to the category information according to the node vectors may include:
obtaining the multidimensional behavior features of the behavior information; constructing the directed weighted network corresponding to each dimension of behavior feature;
obtaining the node vectors of each directed weighted network; and generating the feature space corresponding to the behavior information according to the node vectors of each directed weighted network.
Optionally, the category information includes attribute information, and the step of obtaining the vector information corresponding to each category of information among the multiple categories of information and obtaining the feature space corresponding to each category of information according to the vector information may include:
obtaining the multidimensional attribute features of the attribute information among the multiple categories of information; encoding the multidimensional attribute features to obtain attribute encoding information; and generating the feature space corresponding to the attribute information according to the attribute encoding information.
Optionally, the step of performing multi-kernel linear combination on the multiple kernel spaces to obtain the synthesized kernel space may include: normalizing the multiple kernel spaces to obtain multiple normalized kernel spaces; obtaining the weight value corresponding to each category of information among the multiple categories of information; and performing multi-kernel linear combination according to the weight value corresponding to each category of information and each normalized kernel space, to obtain the synthesized kernel space.
Optionally, the step of obtaining the credit evaluation result of the user according to the synthesized kernel space may include: calculating the credit score of the user according to the synthesized kernel space by means of a preset regression model.
For the specific implementation of each of the above operations, refer to the preceding embodiments; details are not repeated here.
The storage medium may include a read-only memory (ROM, Read Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, an optical disc, or the like.
Because the instructions stored in the storage medium can perform the steps of any credit evaluation method provided by the embodiments of the present invention, they can achieve the beneficial effects achievable by any such method. For details, refer to the preceding embodiments; they are not repeated here.
The credit evaluation method, device, and storage medium provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. Meanwhile, those skilled in the art may, following the idea of the present invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (15)

1. A credit evaluation method, characterized by comprising:
obtaining heterogeneous data related to a user, and classifying the heterogeneous data to obtain multiple categories of information;
obtaining the vector information corresponding to each category of information among the multiple categories of information, and obtaining the feature space corresponding to each category of information according to the vector information;
mapping the feature space corresponding to each category of information to a kernel space, to obtain multiple kernel spaces;
performing multi-kernel linear combination on the multiple kernel spaces to obtain a synthesized kernel space;
obtaining the credit evaluation result of the user according to the synthesized kernel space.
2. The credit evaluation method according to claim 1, characterized in that the step of obtaining the vector information corresponding to each category of information among the multiple categories of information and obtaining the feature space corresponding to each category of information according to the vector information comprises:
extracting candidate category information from the multiple categories of information;
constructing the directed weighted network corresponding to each piece of candidate category information, and obtaining the node vectors of the directed weighted network;
generating the feature space corresponding to the category information according to the node vectors.
3. The credit evaluation method according to claim 2, characterized in that the candidate category information includes transaction information, and the step of constructing the directed weighted network corresponding to each piece of candidate category information, obtaining the node vectors of the directed weighted network, and generating the feature space corresponding to the category information according to the node vectors comprises:
obtaining the transfer records and mobile payment records of the transaction information;
constructing the directed weighted network corresponding to the transfer records;
obtaining the node vectors of the directed weighted network;
obtaining the feature vectors of the mobile payment records;
generating the feature space corresponding to the transaction information according to the node vectors and the feature vectors.
4. The credit evaluation method according to claim 3, characterized in that the step of obtaining the node vectors of the directed weighted network comprises:
calculating the estimated connection probability and the empirical connection probability between every two nodes in the directed weighted network;
calculating the distribution difference between the estimated connection probability and the empirical connection probability, to obtain a first objective function;
calculating the estimated context probability and the empirical context probability between every two nodes in the directed weighted network;
calculating the distribution difference between the estimated context probability and the empirical context probability, to obtain a second objective function;
obtaining the node vectors of the directed weighted network according to the first objective function and the second objective function.
5. The credit evaluation method according to claim 4, characterized in that the step of obtaining the node vectors of the directed weighted network according to the first objective function and the second objective function comprises:
optimizing the first objective function by a stochastic gradient descent algorithm, to obtain the low-dimensional node vectors under the first objective function;
optimizing the second objective function by a stochastic gradient descent algorithm, to obtain the low-dimensional node vectors under the second objective function;
concatenating the low-dimensional node vectors under the first objective function with the low-dimensional node vectors under the second objective function, to obtain the node vectors of the directed weighted network.
6. The credit estimation method according to claim 3, wherein the step of obtaining the feature vector of the mobile payment records comprises:
Obtaining multi-dimensional payment features of the mobile payment records;
Encoding the multi-dimensional payment features to obtain payment encoding information;
Generating the feature vector of the mobile payment records according to the payment encoding information.
7. The credit estimation method according to claim 6, wherein the step of encoding the multi-dimensional payment features to obtain payment encoding information comprises:
Converting the non-numeric payment features among the multi-dimensional payment features into numeric payment features;
Discretizing the converted numeric payment features together with the numeric payment features already present among the multi-dimensional payment features to obtain the payment encoding information.
8. The credit estimation method according to claim 2, wherein the candidate category information includes behavior information, and the step of constructing a directed weighted network corresponding to each piece of candidate category information, obtaining the node vector of the directed weighted network, and generating the feature space corresponding to the classification information according to the node vector comprises:
Obtaining multi-dimensional behavior features of the behavior information;
Constructing a directed weighted network corresponding to each dimension of the behavior features;
Obtaining the node vector of each directed weighted network;
Generating the feature space corresponding to the behavior information according to the node vectors of the directed weighted networks.
9. The credit estimation method according to claim 1, wherein the classification information includes attribute information, and the step of obtaining the vector information corresponding to each piece of classification information in the multiple pieces of classification information and obtaining the feature space corresponding to each piece of classification information according to the vector information comprises:
Obtaining multi-dimensional attribute features of the attribute information in the multiple pieces of classification information;
Encoding the multi-dimensional attribute features to obtain attribute encoding information;
Generating the feature space corresponding to the attribute information according to the attribute encoding information.
10. The credit estimation method according to claim 9, wherein the step of encoding the multi-dimensional attribute features to obtain attribute encoding information comprises:
Converting the non-numeric attribute features among the multi-dimensional attribute features into numeric attribute features;
Generating the attribute encoding information according to the converted numeric attribute features and the numeric attribute features already present among the multi-dimensional attribute features.
11. The credit estimation method according to any one of claims 1 to 10, wherein the step of performing multi-kernel linear combination on the multiple kernel spaces to obtain a composite kernel space comprises:
Normalizing the multiple kernel spaces to obtain multiple normalized kernel spaces;
Obtaining the weight value corresponding to each piece of classification information in the multiple pieces of classification information;
Performing multi-kernel linear combination according to the weight value corresponding to each piece of classification information and each normalized kernel space to obtain the composite kernel space.
12. The credit estimation method according to any one of claims 1 to 10, wherein the step of obtaining the credit evaluation result of the user according to the composite kernel space comprises:
Calculating the credit score of the user from the composite kernel space by a preset regression model.
13. The credit estimation method according to claim 12, wherein before the step of calculating the credit score of the user from the composite kernel space by the preset regression model, the method further comprises:
Obtaining a training sample set and dividing the training sample set into multiple classification information sets;
Mapping the multiple classification information sets into kernel spaces;
Generating an objective function according to the kernel spaces and a preset regression function;
Processing the objective function by a Lagrangian duality algorithm to generate the regression model.
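The claim leaves the preset regression function unspecified. As a stand-in, the sketch below uses kernel ridge regression, whose Lagrangian dual has the closed-form solution alpha = (K + lambda * I)^-1 y, with a prediction given by the kernel-weighted sum of the alpha values. This is an illustrative substitute, not necessarily the regression model of the patent; the kernel matrix, the scores, and the function names are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def fit_dual(K, y, lam):
    """Dual coefficients of kernel ridge regression: (K + lam * I) alpha = y."""
    n = len(y)
    A = [[K[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
    return solve(A, y)


def predict(alpha, kernel_row):
    """Score = sum_i alpha_i * K(x_i, x), using kernel values against the training users."""
    return sum(a * k for a, k in zip(alpha, kernel_row))


# Hypothetical composite kernel over three training users and their known scores.
K = [[1.0, 0.5, 0.2],
     [0.5, 1.0, 0.3],
     [0.2, 0.3, 1.0]]
scores = [600.0, 650.0, 720.0]

alpha = fit_dual(K, scores, lam=1e-9)
fitted = [predict(alpha, K[i]) for i in range(3)]
```

With a near-zero regularizer the model reproduces the training scores; a larger `lam` trades training fit for smoother predictions on new users.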
14. A credit evaluation device, comprising:
An information acquisition unit, configured to obtain heterogeneous data related to a user and classify the heterogeneous data to obtain multiple pieces of classification information;
A feature acquisition unit, configured to obtain the vector information corresponding to each piece of classification information in the multiple pieces of classification information and obtain the feature space corresponding to each piece of classification information according to the vector information;
A first mapping unit, configured to map the feature space corresponding to each piece of classification information to a kernel space to obtain multiple kernel spaces;
A synthesis unit, configured to perform multi-kernel linear combination on the multiple kernel spaces to obtain a composite kernel space;
An evaluation unit, configured to obtain the credit evaluation result of the user according to the composite kernel space.
15. A storage medium, wherein the storage medium stores a plurality of instructions, the instructions being adapted to be loaded by a processor to perform the steps of the credit estimation method according to any one of claims 1 to 13.
CN201810036839.9A 2018-01-15 2018-01-15 Credit evaluation method, device and storage medium Active CN110046981B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810036839.9A CN110046981B (en) 2018-01-15 2018-01-15 Credit evaluation method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810036839.9A CN110046981B (en) 2018-01-15 2018-01-15 Credit evaluation method, device and storage medium

Publications (2)

Publication Number Publication Date
CN110046981A CN110046981A (en) 2019-07-23
CN110046981B CN110046981B (en) 2022-03-08

Family

ID=67272847

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810036839.9A Active CN110046981B (en) 2018-01-15 2018-01-15 Credit evaluation method, device and storage medium

Country Status (1)

Country Link
CN (1) CN110046981B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110717377A (en) * 2019-08-26 2020-01-21 平安科技(深圳)有限公司 Face driving risk prediction model training and prediction method thereof and related equipment
CN110796269A (en) * 2019-09-30 2020-02-14 北京明略软件系统有限公司 Method and device for generating model, and method and device for processing information
CN111553800A (en) * 2020-04-30 2020-08-18 上海商汤智能科技有限公司 Data processing method and device, electronic equipment and storage medium
CN112446777A (en) * 2019-09-03 2021-03-05 腾讯科技(深圳)有限公司 Credit evaluation method, device, equipment and storage medium
CN113362162A (en) * 2021-06-29 2021-09-07 深圳壹账通智能科技有限公司 Risk control identification method and device based on network behavior data, electronic equipment and medium
CN114039868A (en) * 2021-11-09 2022-02-11 广东电网有限责任公司江门供电局 Value added service management method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008126998A1 (en) * 2007-04-17 2008-10-23 Hyun Uk Shin Interpersonal loan brokerage system and method
CN104850939A (en) * 2015-04-28 2015-08-19 信而量数据科技(上海)有限公司 Information management system and method based on personal credit data
CN106952052A (en) * 2017-04-06 2017-07-14 东北林业大学 Enterprise supplier evaluation method based on hybrid-weight kernel principal component analysis
CN107133865A (en) * 2016-02-29 2017-09-05 阿里巴巴集团控股有限公司 Method and device for obtaining a credit score and outputting feature vector values
CN107292463A (en) * 2016-03-30 2017-10-24 阿里巴巴集团控股有限公司 Method and system for project evaluation of application programs
CN107481132A (en) * 2017-08-02 2017-12-15 上海前隆信息科技有限公司 Credit evaluation method and system, storage medium and terminal device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TANG J et al.: "LINE: Large-scale information network embedding", Proceedings of the 24th International Conference on World Wide Web *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110717377A (en) * 2019-08-26 2020-01-21 平安科技(深圳)有限公司 Face driving risk prediction model training and prediction method thereof and related equipment
CN112446777A (en) * 2019-09-03 2021-03-05 腾讯科技(深圳)有限公司 Credit evaluation method, device, equipment and storage medium
CN112446777B (en) * 2019-09-03 2023-11-17 腾讯科技(深圳)有限公司 Credit evaluation method, device, equipment and storage medium
CN110796269A (en) * 2019-09-30 2020-02-14 北京明略软件系统有限公司 Method and device for generating model, and method and device for processing information
CN110796269B (en) * 2019-09-30 2023-04-18 北京明略软件系统有限公司 Method and device for generating model, and method and device for processing information
CN111553800A (en) * 2020-04-30 2020-08-18 上海商汤智能科技有限公司 Data processing method and device, electronic equipment and storage medium
CN111553800B (en) * 2020-04-30 2023-08-25 上海商汤智能科技有限公司 Data processing method and device, electronic equipment and storage medium
CN113362162A (en) * 2021-06-29 2021-09-07 深圳壹账通智能科技有限公司 Risk control identification method and device based on network behavior data, electronic equipment and medium
CN114039868A (en) * 2021-11-09 2022-02-11 广东电网有限责任公司江门供电局 Value added service management method and device
CN114039868B (en) * 2021-11-09 2023-08-18 广东电网有限责任公司江门供电局 Value added service management method and device

Also Published As

Publication number Publication date
CN110046981B (en) 2022-03-08

Similar Documents

Publication Publication Date Title
TWI788529B (en) Credit risk prediction method and device based on LSTM model
CN110046981A (en) Credit evaluation method, device and storage medium
EP3985578A1 (en) Method and system for automatically training machine learning model
CN110363449B (en) Risk identification method, device and system
CN109919316A (en) Method, apparatus, device and storage medium for obtaining network representation learning vectors
CN108288067A (en) Training method of image-text matching model, bidirectional search method and related apparatus
CN110516910A (en) Big-data-based insurance policy underwriting model training method and underwriting risk assessment method
CN110347719A (en) Enterprise foreign trade risk early-warning method and system based on big data
US11604994B2 (en) Explainable machine learning based on heterogeneous data
Chen et al. Calibrating a Land Parcel Cellular Automaton (LP-CA) for urban growth simulation based on ensemble learning
CN110297911A (en) Internet of Things (IOT) calculates the method and system that cognition data are managed and protected in environment
CN107615275A (en) Estimating computing resources for running data mining services
CN107358247A (en) Method and device for determining churned users
CN110348721A (en) Financial default risk prediction method, device and electronic equipment based on GBST
US20220100772A1 (en) Context-sensitive linking of entities to private databases
US20220100967A1 (en) Lifecycle management for customized natural language processing
Li et al. Predicting business risks of commercial banks based on BP-GA optimized model
CN114693409A (en) Product matching method, device, computer equipment, storage medium and program product
Wang et al. Time-series forecasting of mortality rates using transformer
CN109740947A (en) Expert mining method, system, storage medium and electronic terminal based on patent data
Niham et al. Utilization of Big Data in Libraries by Using Data Mining
Nguyen et al. Estimating county health indices using graph neural networks
CN116975622A (en) Training method and device of target detection model, and target detection method and device
TWM623354U (en) investment recommendation system
Yang Intelligent informatization early warning analysis of agricultural economy based on support vector sequential regression model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant