This patent application is a divisional application of the patent application with application number CN201510369824.0, filed on June 29, 2015, and entitled 'a method and equipment for determining network relationship stability and recommending internet services'.
Detailed Description
To achieve the purpose of the present application, embodiments of the present application provide a method and a device for determining network relationship stability and recommending internet services based on network relationship data. Network relationship data associated with a first user identifier is obtained, where the network relationship data is used to represent that a first user corresponding to the first user identifier has established a network connection with at least one second user through an internet platform, and the network relationship data includes the at least one second user identifier. The network relationship stability of the first user on the internet platform is then determined according to the at least one second user identifier contained in the acquired network relationship data, where the network relationship stability is used to represent the probability that the first user presents a credit risk: the higher the network relationship stability, the lower that probability. In this way, the embodiments of the present application estimate the network relationship stability of the first user on the internet platform by collecting the network relationship data associated with the first user identifier, and use that stability to judge whether the user presents a credit risk. This provides a security guarantee for subsequently generated internet services and, in particular, reduces the fraud risk in internet financial services.
It should be noted that the internet business operation described in the embodiments of the present application may refer to a business operation related to the internet, and may also refer to a business operation related to an internet financial service. The internet financial services herein may include, but are not limited to, peer-to-peer (P2P) credit services and third-party payment services.
The present application is described in further detail below with reference to the accompanying drawings. It is to be understood that the described embodiments are only some, rather than all, of the embodiments of the present application. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
Fig. 1 is a schematic flowchart of a method for determining network relationship stability based on network relationship data according to an embodiment of the present application. The method may be as follows.
Step 101: network relationship data associated with a first user identifier is obtained.
The network relationship data is used to represent that a first user corresponding to the first user identifier has established a network contact with at least one second user through an internet platform, and the network relationship data includes the at least one second user identifier.
In step 101, as internet technology develops, users use the internet more and more and leave more and more traces on it; such traces may be referred to as network data. These traces include network relationship data, that is, data generated when a user establishes network connections with other users in a social network.
For example: if a first user establishes a friend relationship with a plurality of other users on a certain social network site, a corresponding relationship between the first user identifier and the other user identifiers establishing the friend relationship with the first user identifier is stored in a server of the social network site, and the corresponding relationship can be called network relationship data.
It should be noted that the network relationship data includes, in addition to the friend relationship established between two different users, data of which the friend relationship changes, and the like. In a broad sense, the network relationship data may contain all data associated with the network relationship.
It should be further noted that the network relationship data in the embodiment of the present invention may also be referred to as context relationship data in real life, and then the network relationship architecture may also be referred to as a context relationship architecture of the user.
Step 102: the network relationship stability of the first user on the internet platform is determined according to the at least one second user identifier contained in the acquired network relationship data.
The network relationship stability is used to represent the probability that the first user presents a credit risk: the higher the network relationship stability, the lower that probability.
In step 102, when the network relationship data associated with the first user identifier is acquired, the acquired network relationship data is analyzed from different dimensions, so as to determine the network relationship stability of the first user on the internet platform.
Specifically, firstly, according to the at least one second user identifier included in the collected network relationship data, the network relationship characteristics of the first user on the internet platform are determined.
The network relation feature is used for representing the state of the first user establishing network contact on the Internet platform.
The dimensions described in the embodiments of the present application include, but are not limited to, the following three: the first dimension represents the stability of the network relationship structure of the first user; the second dimension represents the stability of the network relationship structure of the second users that have established a strong network relationship with the first user; and the third dimension represents changes in the network relationship structure of the first user.
The three dimensions may be determined in advance, or may be obtained by analyzing a large amount of data, and the determination method is not limited herein.
When analyzing the acquired network relationship data from different dimensions, the network relationship features contained in the different dimensions can be analyzed.
Specifically, regarding the first dimension, which represents the stability of the network relationship structure of the first user: in a social network, if two different users A and B have each established a friend relationship with a user C, then user A and user B may also establish a friend relationship at some future time. This pattern is referred to as triadic closure; fig. 2 is a schematic diagram of a triadic closure structure formed in a social network. In addition, if more than 3 users have established friend relationships with each other, these friend relationships indicate a stable network relationship structure. In this way, by analyzing the network relationship data of the first user, at least one first feature characterizing the stability of the network relationship structure of the first user can be determined.
Namely, at least one first feature used for representing the stability of the network relationship structure of the first user is determined according to the at least one second user identifier contained in the acquired network relationship data.
The first characteristic comprises at least one of the number of users establishing friend relationship with the first user, the number of connections established between the first user and friends thereof, and the stability of the network relationship structure of the first user.
For example: determining the number of users of second users establishing a friend relationship with the first user according to the at least one second user identifier contained in the acquired network relationship data; the number of connections established between the first user and at least one second user establishing a friend relationship with the first user can be determined according to the at least one second user identifier contained in the acquired network relationship data; the stability of the network relationship structure of the first user can also be calculated according to the at least one second user identifier contained in the acquired network relationship data.
It should be noted that, the method for calculating the stability of the network relationship structure of the first user may adopt the following method, or may adopt other methods, which is not limited herein:
In the embodiment of the present application, a network clustering method is used to calculate the stability of the network relationship structure of the first user:
where RCS is the calculated network relationship structure stability of the first user; n is the number of second users that have established a friend relationship with the first user; and m is the number of friend relationships established with the friends of the first user.
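The formula that these symbol definitions refer to does not appear in this text. As a minimal sketch, assuming the "network clustering method" is the standard local clustering coefficient, RCS = 2m / (n(n - 1)), and assuming m counts the friend relationships that exist among the first user's friends, the calculation could look as follows. The `friends` mapping, the function name, and the sample data are hypothetical.

```python
from itertools import combinations


def structure_stability(friends, user):
    """Local clustering coefficient of `user` (assumed form of RCS).

    friends: dict mapping a user identifier to the set of identifiers
             it has a friend relationship with (assumed symmetric).
    """
    neighbours = friends.get(user, set())
    n = len(neighbours)  # number of second users befriended by `user`
    if n < 2:
        return 0.0  # fewer than two friends: no pair can be connected
    # m: friend relationships that exist among the user's own friends
    m = sum(
        1
        for a, b in combinations(sorted(neighbours), 2)
        if b in friends.get(a, set())
    )
    return 2.0 * m / (n * (n - 1))


# Hypothetical data: A is friends with B, C and D; only B and C know each other.
friends = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
rcs_a = structure_stability(friends, "A")  # one closed pair out of three
```

Under this assumption the value lies between 0 (none of the first user's friends know each other) and 1 (all of them do), which matches the triadic closure intuition described earlier.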
For the second dimension, which represents the stability of the network relationship structure of the second users that have established a strong network relationship with the first user: the greater the number of second users that have established a strong network relationship with the first user in the social network (or the greater the proportion of such second users among all second users that have established a friend relationship with the first user), the higher the stability of the network relationship structure of those second users. By transitivity, this also indicates a higher stability of the network relationship structure of the first user in the social network.
And the strong network relation is used for representing that the network interaction times between the first user and the second user in a set time period are greater than a set numerical value.
For example: establishing a friend relationship between a user A and a user B, and establishing a friend relationship between the user A and a user C, wherein the number of times of interaction between the user A and the user B through a network in a set time period reaches 100 times; in the same time period, the number of times of network interaction between the user a and the user C is less than 10, and at this time, it can be determined that the user B belongs to the user who establishes the strong network relationship with the user a, and the user C does not belong to the user who establishes the strong network relationship with the user a.
Specifically, at least one second feature used for characterizing the stability of the network relationship structure of the second user who establishes a strong network relationship with the first user is determined according to the at least one second user identifier included in the acquired network relationship data.
The second characteristic includes at least one of the number of second users establishing a strong network relationship with the first user and an average value of the network relationship structural stability of the second users establishing a strong network relationship with the first user.
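The strong-relationship test and the two second features can be sketched as follows. This is an illustration only: the threshold of 50 interactions, the function name, and the per-user structure stability values are hypothetical, and the stability values are assumed to have been computed beforehand.

```python
def second_features(interaction_counts, stability_by_user, threshold):
    """Return (strong users, their count, mean structure stability).

    interaction_counts: second user identifier -> number of network
        interactions with the first user in the set time period.
    stability_by_user: second user identifier -> precomputed network
        relationship structure stability of that second user.
    threshold: the "set numerical value" for a strong relationship.
    """
    strong = {u for u, c in interaction_counts.items() if c > threshold}
    count = len(strong)
    mean_stability = (
        sum(stability_by_user[u] for u in strong) / count if count else 0.0
    )
    return strong, count, mean_stability


# Hypothetical data echoing the example above: B interacts 100 times, C 8 times.
counts_for_a = {"B": 100, "C": 8, "D": 60}
stability = {"B": 0.5, "C": 0.2, "D": 0.25}
strong, strong_count, strong_mean = second_features(counts_for_a, stability, 50)
```

Here users B and D exceed the threshold, so the two second features are a count of 2 and a mean stability of 0.375.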
For the third dimension, which represents changes in the network relationship structure of the first user: the first user establishes friend relationships with a plurality of different second users, and as time passes the number of users in contact with the first user differs from one time period to another. Fig. 3 shows how the network relationship structure of the first user changes. As can be seen from fig. 3, the number of users that remain in contact with the first user gradually decreases as the time period increases.
It should be noted that the time length corresponding to T1 in fig. 3 is greater than the time length corresponding to T2; the time length corresponding to T2 is greater than the time length corresponding to T3; the time length corresponding to T3 is longer than the time length corresponding to T4.
The stability of the network relationship structure of the first user can be further judged by observing the change of the network relationship structure of the first user by means of time factors.
Specifically, at least one third feature for representing that the network relationship structure of the first user changes is determined according to the at least one second user identifier included in the acquired network relationship data.
The third feature includes at least one of the number of users establishing network contact with the first user in different time periods and the number of users whose number of contacts with the first user is smaller than a set number.
After the network relationship data of the first user is collected, counting the number of users establishing network contact (namely performing one-time network contact with the first user) with the first user in different time periods according to the time period, and further determining the proportion of the number of the users establishing the network contact to the total number of the users establishing the friend relationship with the first user.
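The counting described above can be sketched as follows. The sketch assumes each time period is a look-back window measured in days and that the contact log stores how many days ago each contact happened; the log format, window lengths, and names are hypothetical.

```python
from collections import Counter


def third_features(contact_log, friend_ids, periods, min_contacts):
    """Per-period contact counts and ratios, plus rarely contacted friends.

    contact_log: list of (second_user_id, days_ago) tuples.
    friend_ids: set of all second users befriended by the first user.
    periods: period name -> window length in days (assumed look-back).
    min_contacts: the "set number" of contacts.
    """
    total = len(friend_ids)
    per_period = {}
    for name, length_days in periods.items():
        active = {
            u for u, days_ago in contact_log
            if u in friend_ids and days_ago <= length_days
        }
        ratio = len(active) / total if total else 0.0
        per_period[name] = (len(active), ratio)
    contacts = Counter(u for u, _ in contact_log if u in friend_ids)
    rarely_contacted = sum(1 for u in friend_ids if contacts[u] < min_contacts)
    return per_period, rarely_contacted


# Hypothetical data; window lengths decrease from T1 to T4.
friend_ids = {"B", "C", "D", "E"}
contact_log = [("B", 3), ("B", 40), ("C", 20), ("D", 200), ("B", 400)]
periods = {"T1": 365, "T2": 90, "T3": 30, "T4": 7}
per_period, rarely_contacted = third_features(
    contact_log, friend_ids, periods, min_contacts=2
)
```

With this data, three of the four friends were in contact within T1 but only one within T4, and three friends have fewer than two recorded contacts.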
In another embodiment of the present application, after obtaining network relationship features of different dimensions, the method further includes:
screening the obtained network relationship features, and selecting the network relationship features that contribute greatly to the network relationship stability of the first user on the internet platform.
It should be noted that the network relationship feature with a relatively large contribution here can be understood as the network relationship feature that can best represent the stability of the network relationship of the first user on the internet platform.
The parameter used for measuring the contribution size in the embodiment of the present application may be an information value, that is, an influence degree value of the network relationship characteristic on the network relationship stability.
Specifically, the manner of calculating the information value of each network relationship feature in the embodiment of the present application may adopt the following manner:
In the first step, samples are collected.
The samples collected here include positive samples (i.e., user information with high network relationship stability) and negative samples (i.e., user information with low network relationship stability).
For example: the positive sample may contain user information with lower credit risk in internet financial services; the negative examples may contain information of users with higher credit risk in internet financial services.
In the second step, each obtained network relationship feature is discretized.
Specifically, each network relationship feature is subjected to discrete processing in a mode of an equal frequency interval or an equal width interval. In the process of discrete processing, a chi-square analysis method is adopted to judge whether the discrete intervals need to be combined.
The basic idea of chi-squared analysis here is that for accurate discretization, the class distributions within one interval should be identical, so that if adjacent intervals have similar class distributions, then the adjacent intervals are merged into one interval. The discrete processing comprises the following specific steps:
Each sample is assigned to an interval according to its value of the network relationship feature being discretized; the chi-square value of each pair of adjacent intervals is calculated, and the pair of adjacent intervals with the smallest chi-square value is merged. When the number of intervals after merging equals the preset number of intervals, the discrete processing ends.
Here, the chi-squared value for each set of adjacent intervals may be calculated as:
where $m = 2$, representing the two adjacent intervals considered in each calculation; $k = 2$, representing the number of sample classes, i.e. positive and negative samples; $A_{ij}$ represents the number of sample points of the $j$-th class in the $i$-th interval; and $E_{ij}$ represents the expected frequency of $A_{ij}$, which may be calculated as:
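The two formulas referred to above do not appear in this text. Given the symbol definitions, they are consistent with the standard chi-square statistic used in ChiMerge-style discretization. The following is a reconstruction under that assumption, not a quotation of the original; $R_i$, $C_j$ and $N$ are symbols introduced here for the row total, class total and overall total.

```latex
\chi^2 = \sum_{i=1}^{m} \sum_{j=1}^{k} \frac{(A_{ij} - E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{R_i \, C_j}{N},
\qquad
R_i = \sum_{j=1}^{k} A_{ij},
\quad
C_j = \sum_{i=1}^{m} A_{ij},
\quad
N = \sum_{i=1}^{m} \sum_{j=1}^{k} A_{ij}
```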
It should be noted that the number of intervals set in the embodiment of the present application may be determined as needed, for example, 5.
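The merging procedure can be sketched as follows. The sketch assumes that the initial equal-frequency or equal-width binning has already produced per-interval counts of positive and negative samples, and that the expected frequency of a cell is the usual row total times class total divided by the overall total. Function names and sample counts are hypothetical, and cells with zero expected frequency are skipped as a design choice.

```python
def chi2_adjacent(a, b):
    """Chi-square value of two adjacent intervals.

    a, b: [positive_count, negative_count] for each interval.
    """
    total = sum(a) + sum(b)
    chi2 = 0.0
    for row in (a, b):
        for j in range(2):
            class_total = a[j] + b[j]
            expected = sum(row) * class_total / total
            if expected > 0:  # skip empty classes to avoid division by zero
                chi2 += (row[j] - expected) ** 2 / expected
    return chi2


def merge_intervals(counts, target):
    """Merge the adjacent pair with the smallest chi-square value
    until only `target` intervals remain."""
    counts = [list(c) for c in counts]
    while len(counts) > target:
        scores = [
            chi2_adjacent(counts[i], counts[i + 1])
            for i in range(len(counts) - 1)
        ]
        i = scores.index(min(scores))
        counts[i] = [
            counts[i][0] + counts[i + 1][0],
            counts[i][1] + counts[i + 1][1],
        ]
        del counts[i + 1]
    return counts


# Hypothetical initial intervals: [positive, negative] counts per interval.
initial = [[10, 0], [9, 1], [2, 8], [0, 10]]
merged = merge_intervals(initial, target=3)
```

The first two intervals have nearly identical class distributions, so they produce the smallest chi-square value and are merged first.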
In the third step, the weight value of each interval obtained after the discrete processing of each network relationship feature is calculated.
The weight value represents an influence value of each interval of each network relationship characteristic on the network relationship stability of the first user.
Assuming that 5 intervals are obtained for each network relationship feature, the weight value calculated for each interval can be expressed as:
$\mathrm{WOE}(j) = \ln(P_{0j} / P_{1j})$;
where $\mathrm{WOE}(j)$ denotes the weight value of the $j$-th interval, $P_{1j}$ denotes the ratio of the number of negative samples in the $j$-th interval to the total number of negative samples, and $P_{0j}$ denotes the ratio of the number of positive samples in the $j$-th interval to the total number of positive samples.
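A minimal sketch of the weight calculation follows, assuming every interval contains at least one positive and one negative sample (otherwise the logarithm is undefined and some smoothing would be needed). The function name and the sample counts are hypothetical.

```python
import math


def woe_per_interval(pos_counts, neg_counts):
    """Weight value WOE(j) = ln(P0j / P1j) for each interval j.

    pos_counts[j], neg_counts[j]: numbers of positive and negative
    samples that fall into interval j.
    """
    total_pos = sum(pos_counts)
    total_neg = sum(neg_counts)
    weights = []
    for pos, neg in zip(pos_counts, neg_counts):
        p0 = pos / total_pos  # share of all positive samples in interval j
        p1 = neg / total_neg  # share of all negative samples in interval j
        weights.append(math.log(p0 / p1))
    return weights


# Hypothetical counts for one feature discretized into 5 intervals.
pos_counts = [40, 30, 20, 8, 2]
neg_counts = [5, 10, 20, 25, 40]
woe = woe_per_interval(pos_counts, neg_counts)
```

A positive weight marks an interval dominated by positive samples, a negative weight one dominated by negative samples, and a weight of zero an interval where both shares are equal.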
In the fourth step, the network relationship features that contribute greatly to the network relationship stability of the first user on the internet platform are screened out by using the network relationship feature matrix and the information value of each network relationship feature.
Specifically, the information value of each network relationship feature is calculated by the following method:
where IV denotes the information value of a network relationship feature.
After the information value of each network relationship feature is obtained through calculation, the network relationship features whose information value is greater than a set threshold are selected as the network relationship features that contribute greatly to the network relationship stability of the first user on the internet platform.
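The information value formula itself does not appear in this text. As a sketch, assuming the commonly used definition IV = Σ_j (P0j − P1j) · WOE(j), the screening step could look as follows. The feature names, counts, and the threshold of 0.1 are hypothetical, and every interval is assumed to contain at least one sample of each class.

```python
import math


def information_value(pos_counts, neg_counts):
    """Assumed definition: IV = sum_j (P0j - P1j) * ln(P0j / P1j)."""
    total_pos, total_neg = sum(pos_counts), sum(neg_counts)
    iv = 0.0
    for pos, neg in zip(pos_counts, neg_counts):
        p0, p1 = pos / total_pos, neg / total_neg
        iv += (p0 - p1) * math.log(p0 / p1)
    return iv


def select_features(binned, threshold):
    """Keep the features whose information value exceeds `threshold`.

    binned: feature name -> (pos_counts, neg_counts) per interval.
    """
    return [
        name
        for name, (pos, neg) in binned.items()
        if information_value(pos, neg) > threshold
    ]


# Hypothetical features: one separates the classes well, one barely at all.
binned = {
    "friend_count": ([40, 30, 20, 8, 2], [5, 10, 20, 25, 40]),
    "noise": ([20, 20, 20, 20, 20], [21, 19, 20, 20, 20]),
}
selected = select_features(binned, threshold=0.1)
```

Each term of the sum is non-negative, so a feature whose intervals have identical class shares has an information value of zero and is filtered out by any positive threshold.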
Secondly, a network relationship stability index of the first user on the internet platform is calculated according to the network relationship features.
Specifically, according to the determined at least one first feature, the determined at least one second feature and the determined at least one third feature, a network relationship stability index of the first user on the internet platform is calculated and obtained based on a logistic regression model.
For example: assuming that the number of the determined network relationship features is 9, calculating a network relationship stability index of the first user on the internet platform by using the following method:
where SCORE denotes the network relationship stability index of the first user on the internet platform, and $\theta_0$ to $\theta_9$ denote the logistic regression coefficients.
The logistic regression coefficients can be obtained from a training sample set by maximum likelihood estimation.
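The SCORE formula does not appear in this text. A minimal sketch, assuming the standard logistic form SCORE = 1 / (1 + exp(−(θ0 + θ1·x1 + … + θ9·x9))) with x1 to x9 being the nine selected features, is shown below. The coefficient and feature values are hypothetical, and the fitting of the coefficients by maximum likelihood is not shown.

```python
import math


def stability_score(features, theta):
    """Logistic score from feature values (assumed form of SCORE).

    theta[0] is the intercept; theta[1:] pair with `features` in order.
    """
    if len(theta) != len(features) + 1:
        raise ValueError("expected one coefficient per feature plus an intercept")
    z = theta[0] + sum(t * x for t, x in zip(theta[1:], features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients (theta_0 .. theta_9) and two hypothetical users.
theta = [-1.0, 0.8, 0.5, 0.3, 0.2, 0.4, 0.1, 0.6, 0.2, 0.3]
weak_user = [0.1] * 9
strong_user = [0.9] * 9
weak_score = stability_score(weak_user, theta)
strong_score = stability_score(strong_user, theta)
```

Under this assumption the index lies strictly between 0 and 1, and with positive coefficients larger feature values give a larger index.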
And finally, determining the network relationship stability of the first user on the Internet platform by using the network relationship stability index.
Specifically, after the network relationship stability index of the first user is obtained through calculation, a higher network relationship stability index indicates a higher network relationship stability of the first user on the internet platform and therefore a lower credit risk for the first user; conversely, a lower index indicates a higher credit risk for the first user.
According to the technical scheme of the embodiment of the application, network relationship data associated with a first user identifier is obtained, where the network relationship data is used to represent that a first user corresponding to the first user identifier has established a network contact with at least one second user through an internet platform, and the network relationship data includes the at least one second user identifier. The network relationship stability of the first user on the internet platform is then determined according to the at least one second user identifier contained in the acquired network relationship data, where the network relationship stability is used to represent the probability that the first user presents a credit risk: the higher the network relationship stability, the lower that probability. In this way, the embodiment of the application estimates the network relationship stability of the first user on the internet platform by collecting the network relationship data associated with the first user identifier, and uses that stability to judge whether the user presents a credit risk. This provides a security guarantee for subsequently generated internet services and, in particular, reduces the fraud risk in internet financial services.
Fig. 4 is a flowchart illustrating an internet service recommendation method according to an embodiment of the present application. The method may be as follows.
Step 401: the network relationship stability of the first user on the internet platform is determined.
The network relationship stability is used to represent the probability that the first user presents a credit risk: the higher the network relationship stability, the lower that probability.
It should be noted that, in the embodiment of the present application, the manner of determining the network relationship stability may be performed in the manner shown in fig. 1, or may be performed in other manners, which is not limited herein.
Step 402: the internet service type corresponding to the network relationship stability of the first user is determined according to the correspondence between network relationship stability and internet service type.
In step 402, in order to reduce the default risk of internet services, a corresponding network relationship stability is determined for each of the different internet services, so that a matching internet service can be determined for the first user according to the calculated network relationship stability of the first user.
Step 403: at least one internet service contained in the determined internet service type is recommended to the first user.
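Steps 402 and 403 can be sketched as a lookup in a correspondence table. The text does not specify the form of the correspondence, so the sketch assumes each internet service type has a minimum required stability index; the table contents, type names, and function name are hypothetical.

```python
def recommend(stability_index, service_table):
    """Return (service_type, services) matching the stability index.

    service_table: list of (min_index, service_type, services) rows;
    the row with the highest min_index not exceeding the user's index wins.
    """
    eligible = [row for row in service_table if stability_index >= row[0]]
    if not eligible:
        return None, []
    _, service_type, services = max(eligible, key=lambda row: row[0])
    return service_type, services


# Hypothetical correspondence between stability and service type.
service_table = [
    (0.0, "basic", ["third-party payment"]),
    (0.6, "standard", ["small P2P credit"]),
    (0.85, "premium", ["larger P2P credit"]),
]
service_type, services = recommend(0.7, service_table)
```

A user with an index of 0.7 qualifies for the "basic" and "standard" rows, and the sketch recommends the services of the highest qualifying type.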
It should be noted that the internet service described in the embodiment of the present application is not limited to the internet financial service. The use of determining the network relationship stability provided by the embodiment of the present application is not limited to internet service recommendation, and may also relate to other aspects, which are not described in detail herein.
Fig. 5 is a schematic structural diagram of an apparatus for determining network relationship stability according to an embodiment of the present application. The apparatus comprises: an obtaining unit 51 and a determining unit 52, wherein:
an obtaining unit 51, configured to obtain network relationship data associated with a first user identifier, where the network relationship data is used to represent that a first user corresponding to the first user identifier has established a network connection with at least one second user through an internet platform, and the network relationship data includes the at least one second user identifier;
a determining unit 52, configured to determine, according to the at least one second user identifier included in the collected network relationship data, a network relationship stability of the first user on the internet platform, where the network relationship stability is used to represent a probability that the first user has a credit risk, and the higher the network relationship stability is, the lower the probability that the first user has the credit risk is.
Specifically, the determining unit 52 is configured to determine, according to the at least one second user identifier included in the acquired network relationship data, a network relationship feature of the first user on the internet platform, where the network relationship feature is used to characterize the state in which the first user establishes network contact on the internet platform;
calculating to obtain a network relationship stability index of the first user on the internet platform according to the network relationship characteristics;
and determining the network relationship stability of the first user on the Internet platform by using the network relationship stability index.
Specifically, the determining unit 52 determining the network relationship feature of the first user on the internet platform according to the at least one second user identifier included in the acquired network relationship data specifically includes:
and determining at least one first feature for representing the stability of the network relationship structure of the first user according to the at least one second user identifier included in the acquired network relationship data, wherein the first feature includes at least one of the number of users who establish a friend relationship with the first user, the number of connections established between the first user and friends thereof, and the stability of the network relationship structure of the first user.
Specifically, the determining unit 52 determining the network relationship feature of the first user on the internet platform according to the at least one second user identifier included in the acquired network relationship data specifically includes:
and determining at least one second feature for representing the network relationship structure stability of a second user establishing a strong network relationship with the first user according to the at least one second user identifier included in the acquired network relationship data, wherein the strong network relationship is used for representing that the number of network interactions between the first user and the second user is greater than a set value in a set time period, and the second feature includes at least one of the number of the second users establishing the strong network relationship with the first user and the average value of the network relationship structure stability of the second users establishing the strong network relationship with the first user.
Specifically, the determining unit 52 determining the network relationship feature of the first user on the internet platform according to the at least one second user identifier included in the acquired network relationship data specifically includes:
and determining at least one third feature for representing that the network relationship structure of the first user changes according to the at least one second user identifier contained in the acquired network relationship data, wherein the third feature contains at least one of the number of users establishing network contact with the first user in different time periods and the number of users establishing network contact with the first user for which the number of contacts is less than a set number.
Specifically, the determining unit 52 calculating, according to the network relationship feature, a network relationship stability index of the first user on the internet platform specifically includes:
and calculating to obtain a network relationship stability index of the first user on the Internet platform based on a logistic regression model according to the determined at least one first feature, the determined at least one second feature and the determined at least one third feature.
It should be noted that the apparatus provided in the embodiment of the present application may be implemented by using computer hardware, or may be implemented by using software, which is not limited herein.
Fig. 6 is a schematic structural diagram of an internet service recommendation device according to an embodiment of the present application. The apparatus comprises: a determining unit 61 and a recommending unit 62, wherein:
the determining unit 61 is configured to determine a network relationship stability of the first user on the internet platform, where the network relationship stability is used to represent a probability that the first user has a credit risk, and the higher the network relationship stability is, the lower the probability that the first user has a credit risk is;
a recommending unit 62, configured to determine, according to a correspondence between a network relationship stability and an internet service type, an internet service type corresponding to the network relationship stability of the first user; and recommending at least one internet service contained in the determined internet service type to the first user.
It should be noted that the apparatus provided in the embodiment of the present application may be implemented by using computer hardware, or may be implemented by using software, which is not limited herein.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, apparatus (device), or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present invention and their equivalents, the present invention is intended to include such modifications and variations.