CN103577876B

CN103577876B - Method for identifying trusted and untrusted users based on a feedforward neural network

Info

Publication number: CN103577876B
Application number: CN201310547349.2A
Authority: CN
Inventors: 王英; 左万利; 田中生; 王鑫; 彭涛; 王萌萌; 赵秋月
Original assignee: Jilin University
Current assignee: Jilin University
Priority date: 2013-11-07
Filing date: 2013-11-07
Publication date: 2016-10-05
Anticipated expiration: 2033-11-07
Also published as: CN103577876A

Abstract

The invention discloses a method for identifying trusted and untrusted users based on a feedforward neural network. It aims to overcome problems in the prior art such as insufficient accuracy, insufficient identification basis, lack of flexibility, and coarse granularity in social network analysis. The method comprises the following steps: 1. obtaining special users and determining the users included in the training set; 2. analyzing and quantifying user features, representing each user as a user feature vector; 3. building the feedforward neural network; 4. training the feedforward neural network; 5. identifying trusted and untrusted users with the trained feedforward neural network, by: 1) obtaining user information in the social network; 2) quantifying the user information to generate a user feature vector; 3) inputting the user feature vector into the feedforward neural network and identifying the user as trusted or untrusted according to the value of the output node.

Description

Method for identifying trusted and untrusted users based on a feedforward neural network

Technical field

The present invention relates to a method for identifying trusted and untrusted users in the field of social networks, and more specifically to a method for identifying trusted and untrusted users based on a feedforward neural network.

Background art

Social networks now provide network users with a more convenient and efficient platform for exchanging information and sharing resources. However, because social networks are open and virtual, a large amount of untrue or even false information fills cyberspace. Dishonesty is becoming increasingly serious, with adverse effects on society and disruption of social order, so identifying false information on the network has become an important and active research problem. Since network users are the medium through which information is published and propagated, the first task in judging whether information is true or false is to identify the trusted and untrusted users in the social network.

Existing work on identifying trusted and untrusted users mostly revolves around computing user reputation and falls into three categories: methods based on user behavior, methods based on the topological structure of the social network, and methods based on both user behavior and network topology. However, the basis of these computations is relatively simple:

1. The first two categories each consider only a single factor influencing reputation.

Based on user behavior: evaluating the reputation of users in a social network according to video-sharing behavior and keyword-citation behavior; using set theory to combine individual grouping with social cognition for reputation assessment; discovering connections between users who do not know each other from users' latest friend lists in online social networks and their overall attitude toward specific users; jointly assessing user reputation from the number of tags, the reputation of the tagging users, temporal dynamics, and the reputation associated with terms in the text; and designing a reputation evaluation function from the comments on a transaction, the reputation of the comment publishers, the time the comments were published, the transaction amount, and contextual factors.

Based on social network topology: assessing reputation from the relationships between users; for online auctions, building an S-graph that reflects the social connections between buyers, computing assessment parameters, and inferring deliberately omitted feedback from participants' explicit and implicit feedback in order to assess their reputation; and assessing reputation from the transitivity of trust and influence.

2. Although the last category considers both user behavior and topological structure, it does not comprehensively introduce users' individual factors, nor does it address the interaction behavior between users, and so it ignores users' own characteristics.

Based on both user behavior and social network topology: measuring social trust by combining three data elements, namely social status, social proximity, and social similarity.

Although existing work achieves the assessment of user reputation in social networks to some extent, its assessment basis is limited: it cannot make comprehensive use of the user information that the social network provides, the assessment methods have low flexibility, and the results lack objectivity, accuracy, and credibility. To solve these problems, the invention proposes using the topological structure of the social network, users' individual factors, and the interaction behavior between users as the assessment basis, and using a feedforward neural network to identify trusted and untrusted users in the social network. This not only overcomes the deficiency of considering a single influencing factor but also considers users' individual factors more comprehensively, improving the accuracy and objectivity of the identification method. Here, the topological structure and the users' individual factors refer mainly to the local social network in which a user is located; local trust metrics have been shown to be more accurate than global reputation metrics. In addition, the neural network can discover previously undefined assessment bases during learning, which further improves the flexibility of the assessment method and the accuracy of identification.

Summary of the invention

The technical problem to be solved by the invention is to overcome problems of the prior art such as insufficient accuracy, insufficient flexibility, and lack of intelligence, by providing a method for identifying trusted and untrusted users based on a feedforward neural network.

To solve the above technical problem, the invention adopts the following technical scheme: the method for identifying trusted and untrusted users based on a feedforward neural network comprises the following steps:

1. Obtain special users and determine the training-set users:

(1) Build an initial user set according to the level of users in the social network. The initial user set is defined as:

M = { μ ∈ M | μ ∈ OSW_s, Φ_μ → Φ_tag }

Wherein: M denotes the initial user set; OSW_s denotes the set of users in field s; μ denotes a user in the set OSW_s; Φ_μ denotes the user's personal profile; Φ_tag denotes the tag carried by higher-level users;

(2) Prune the initial user set according to the users' time-related information, filtering out inactive users and keeping the remaining users as special users;

(3) Build the initial social network according to the trust relationships between the special users and the other users in the social network;

(4) Obtain seed users according to the trust relationships between the special users and the other users in the social network, and update the initial social network. The seed users are the users included in the training set. The selection condition is:

ProUser = { μ_p ∈ ProUser | |μ_s → μ_p| ≥ 2 }

Wherein: ProUser denotes the seed user set; μ_s denotes a determined special user; μ_p denotes a user in the set ProUser; |μ_s → μ_p| denotes the number of special users who trust user μ_p.
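The seed-user condition in step (4) can be illustrated with a minimal Python sketch. It assumes the trust relation is available as an in-memory dictionary; the function name `select_seed_users`, the dictionary layout, and the user names alice, bob, carol, and dave are hypothetical and only serve the illustration.

```python
from collections import Counter

def select_seed_users(trusts, special_users, min_trusters=2):
    """Return the users trusted by at least `min_trusters` special users.

    trusts: dict mapping a user name to the set of users that user trusts
            (an assumed in-memory form of the trust relation).
    """
    counts = Counter()
    for special in special_users:
        for trusted in trusts.get(special, set()):
            counts[trusted] += 1  # one more special user trusts `trusted`
    return {user for user, n in counts.items() if n >= min_trusters}

# Hypothetical trust lists for three special users.
trusts = {
    "freak369": {"alice", "bob"},
    "bryan_carey": {"bob", "carol"},
    "popsrocks": {"bob", "carol", "dave"},
}
seeds = select_seed_users(trusts, ["freak369", "bryan_carey", "popsrocks"])
```

Here bob and carol satisfy |μ_s → μ_p| ≥ 2, while alice and dave are each trusted by only one special user and are left out.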

2. Analyze and quantify user features, representing each user as a user feature vector:

(1) Analyze the user's features from three aspects: the topological structure of the user's social network, the user's individual factors, and the interaction behavior between users;

(2) Quantify the user features and represent the user as a user feature vector composed of multiple features.

3. Build the feedforward neural network:

(1) Determine the number of input nodes of the feedforward neural network according to the dimension of the user feature vector;

(2) Determine the number of layers of the feedforward neural network and the number of nodes in each layer according to the complexity of trusted-user identification;

(3) Determine the types of the nodes in the hidden layer and the output layer according to the structure and performance requirements of the feedforward neural network.

4. Train the feedforward neural network:

(1) According to the k value of the k-fold cross-validation algorithm, divide the training set into k subsets such that the intersection of any two subsets is empty;

(2) Train the feedforward neural network k times, each time choosing a different subset as the training set and the remaining k-1 subsets as the test set;

(3) Assign a weight to the number of training iterations determined in each of the k training runs according to the corresponding recognition accuracy, and take the weighted sum of the k iteration counts as the final number of training iterations;

(4) Train the feedforward neural network on the complete training set for the final number of training iterations.

5. Identify trusted and untrusted users with the trained feedforward neural network:

(1) Obtain and quantify the three aspects of information (the topological structure of the user's social network, the user's individual factors, and the interaction behavior between users) and represent the user as a user feature vector;

(2) Input the user feature vector into the feedforward neural network to obtain the output value for the user being identified;

(3) Identify the user as trusted or untrusted according to the output value of the feedforward neural network.

Obtaining the special users described in the technical scheme comprises the following steps:

(1) Sort the users in the initial user set in descending order of their time-related information;

(2) Taking the user in the middle position as the dividing line, split the sorted users into two parts and compute the mean of the time-related information of each part, which gives the initial cluster center of the non-special users and the initial cluster center of the special users:

Ω_{c} = \frac{\sum_{i=1}^{|Θ|/2} Γ_{i}}{|Θ|/2}, \quad Ω_{s} = \frac{\sum_{i=|Θ|/2}^{|Θ|} Γ_{i}}{|Θ|/2}

Wherein: Θ denotes the initial user set; |Θ| denotes the number of users in the initial user set; Γ_i denotes a user's time-related information; Ω_c and Ω_s denote the initial cluster center of the non-special users and the initial cluster center of the special users, respectively;

(3) Compute the distance from each user in the initial user set to the two cluster centers:

dist(Γ_{i}, Ω_{c}) = \left( |Γ_{i1} - Ω_{c1}|^{h} + \cdots + |Γ_{ir} - Ω_{cr}|^{h} \right)^{\frac{1}{h}}

dist(Γ_{i}, Ω_{s}) = \left( |Γ_{i1} - Ω_{s1}|^{h} + \cdots + |Γ_{ir} - Ω_{sr}|^{h} \right)^{\frac{1}{h}}

Wherein: Γ_i denotes a user's time-related information; Ω_c and Ω_s denote the cluster center of the non-special users and the cluster center of the special users, respectively; Γ_ir, Ω_cr, and Ω_sr denote the components of the respective vectors;

(4) Assign each user to the cluster whose center is nearer;

(5) Recompute the cluster center of the non-special users and the cluster center of the special users:

Ω_{c}^{'} = \frac{\sum_{i=1}^{|Θ_{c}^{'}|} Γ_{i}}{|Θ_{c}|}, \quad Ω_{s}^{'} = \frac{\sum_{i=1}^{|Θ_{s}^{'}|} Γ_{i}}{|Θ_{s}|}

Wherein: Ω_c' and Ω_s' denote the cluster center of the ordinary (non-special) users and the cluster center of the special users, respectively; Θ_c and Θ_s denote the ordinary-user set and the special-user set obtained by clustering; |Θ_c| and |Θ_s| denote the number of users in the ordinary-user set and in the special-user set; Γ_i denotes a user's time-related information;

(6) Check whether the membership of the two clusters has changed. If it has, recompute the distance from each user in the user set to the two cluster centers; otherwise, terminate.
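Steps (1) to (6) amount to a two-cluster k-means procedure with a Minkowski distance of order h. The following Python sketch shows one way this could look. It assumes each user's time-related information is a short list of numbers, it follows the index ranges of the formula in step (2) by taking the first half of the descending order as the non-special cluster, and it adds an iteration cap `max_iter` as a safeguard that the text does not mention. The function names and sample values are hypothetical.

```python
def minkowski(a, b, h=2):
    """Minkowski distance of order h between two equal-length vectors."""
    return sum(abs(x - y) ** h for x, y in zip(a, b)) ** (1.0 / h)

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def split_special_users(time_info, h=2, max_iter=100):
    """time_info: dict user -> list of time-related numbers (assumed form)."""
    # Step (1): sort users in descending order of time-related information.
    users = sorted(time_info, key=lambda u: time_info[u], reverse=True)
    half = len(users) // 2
    # Step (2): the two halves give the initial cluster centers.
    center_c = mean([time_info[u] for u in users[:half]])  # non-special
    center_s = mean([time_info[u] for u in users[half:]])  # special
    assignment = {}
    for _ in range(max_iter):
        # Steps (3) and (4): assign each user to the nearer center.
        new_assignment = {}
        for u in users:
            d_c = minkowski(time_info[u], center_c, h)
            d_s = minkowski(time_info[u], center_s, h)
            new_assignment[u] = "special" if d_s < d_c else "non_special"
        # Step (6): stop once no user changes cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step (5): recompute both centers from the current clusters.
        c = [time_info[u] for u in users if assignment[u] == "non_special"]
        s = [time_info[u] for u in users if assignment[u] == "special"]
        if c:
            center_c = mean(c)
        if s:
            center_s = mean(s)
    return {u for u in users if assignment[u] == "special"}

# Hypothetical one-dimensional time-related values.
sample = {"a": [400], "b": [380], "c": [10], "d": [5], "e": [20], "f": [350]}
special = split_special_users(sample)
```

With these sample values the users with small values (c, d, e) end up in the special cluster and the procedure converges after the second pass.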

Quantifying the user features described in the technical scheme comprises the following steps:

(1) Quantify the topological structure of the social network: centrality. User centrality is divided into two parts, out-degree (Out-Link) and in-degree (In-Link), which are quantified according to the user's trust relationships in the social network using the following formulas:

Out-Link = |Trustee| / (|Trustee| + |Trustor|)

In-Link = |Trustor| / (|Trustee| + |Trustor|)

Wherein: |Trustee| denotes the number of users whom the user trusts, and |Trustor| denotes the number of users who trust the user;

(2) Quantify the user's individual factors: activity and influence. Activity is quantified as the percentage of the user's published Review count in the total Review count of all selected users, using the following formula:

Activity = μ_{RW} / \sum_{μ ∈ ProUser} μ_{RW}

Wherein: μ_RW denotes the number of Reviews published by the user, and ProUser denotes the users included in the training set.

Influence is quantified according to the two attributes Member Visits and Total Visits, using the following formulas:

MVP = μ_{MV} / \sum_{μ ∈ ProUser} μ_{MV}, \quad TVP = μ_{TV} / \sum_{μ ∈ ProUser} μ_{TV}

Wherein: MVP denotes the percentage of the user's Member Visits value in the total Member Visits value of all selected users; μ_MV denotes the user's Member Visits value; TVP denotes the percentage of the user's Total Visits value in the total Total Visits value of all selected users; μ_TV denotes the user's Total Visits value; ProUser denotes the users included in the training set;

(3) Quantify the interaction behavior between users: support rate and opposition rate. The user's support rate and opposition rate are quantified from the Review Ratings of the Reviews the user has published, in terms of the following quantities:

Wherein: r_s denotes the Reviews whose Review Rating is VeryHelpful or MostHelpful; r_o denotes the Reviews whose Review Rating is OffTopic, NotHelpful, SomewhatHelpful, or Helpful; |R_w| denotes the number of Reviews by the user.
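The three quantization steps can be gathered into one function that produces the seven-component feature vector. The Python sketch below is illustrative only: the dictionary keys and the function name `feature_vector` are hypothetical, values are returned as fractions rather than percentages, and the support and opposition rates are assumed to be the counts of r_s and r_o Reviews divided by |R_w|, in line with the description of these rates later in this document.

```python
SUPPORT = {"VeryHelpful", "MostHelpful"}
OPPOSE = {"OffTopic", "NotHelpful", "SomewhatHelpful", "Helpful"}

def ratio(part, whole):
    """part / whole, or 0.0 when the denominator is zero."""
    return part / whole if whole else 0.0

def feature_vector(user, totals):
    """Build (Out-Link, In-Link, Activity, MVP, TVP, Support-rate, Oppose-rate).

    user:   dict with 'trustee' and 'trustor' counts, 'member_visits',
            'total_visits', and 'ratings' (one Review Rating per Review).
    totals: sums of the Review counts and visit values over the training set.
    Both layouts are assumptions made for this sketch.
    """
    links = user["trustee"] + user["trustor"]
    ratings = user["ratings"]
    n_reviews = len(ratings)  # |R_w|
    n_support = sum(1 for r in ratings if r in SUPPORT)
    n_oppose = sum(1 for r in ratings if r in OPPOSE)
    return (
        ratio(user["trustee"], links),                          # Out-Link
        ratio(user["trustor"], links),                          # In-Link
        ratio(n_reviews, totals["reviews"]),                    # Activity
        ratio(user["member_visits"], totals["member_visits"]),  # MVP
        ratio(user["total_visits"], totals["total_visits"]),    # TVP
        ratio(n_support, n_reviews),                            # Support-rate
        ratio(n_oppose, n_reviews),                             # Oppose-rate
    )

# Hypothetical user and training-set totals.
user = {
    "trustee": 30, "trustor": 10,
    "member_visits": 50, "total_visits": 200,
    "ratings": ["VeryHelpful", "MostHelpful", "Helpful", "OffTopic"],
}
totals = {"reviews": 40, "member_visits": 500, "total_visits": 1000}
vec = feature_vector(user, totals)
```

For this hypothetical user the vector is approximately (0.75, 0.25, 0.1, 0.1, 0.2, 0.5, 0.5).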

Building the feedforward neural network described in the technical scheme comprises the following steps:

(1) Determine the number of input nodes of the feedforward neural network according to the dimension of the user feature vector. There are seven input nodes: Out-Link, In-Link, Activity, MVP, TVP, Support-rate, and Oppose-rate;

(2) Determine the number of layers of the feedforward neural network and the number of nodes in each layer according to the complexity of trusted-user identification. There are three layers: an input layer with seven input nodes, a hidden layer with two hidden nodes, and an output layer with one output node;

(3) According to the structure and performance requirements of the feedforward neural network, the nodes in the hidden layer and the output layer are sigmoid threshold units.
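The 7-2-1 structure described above can be sketched in plain Python as follows. The sketch only shows the forward pass; the small random initial weights, the seed, and the function names `init_network` and `forward` are assumptions for illustration, and the bias terms play the role of the node thresholds.

```python
import math
import random

def sigmoid(x):
    """Sigmoid function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def init_network(n_in=7, n_hidden=2, seed=0):
    """Create weights for an n_in -> n_hidden -> 1 network.

    Each unit is stored as (weights, bias). The initial range
    [-0.05, 0.05] is an assumption, not taken from the text.
    """
    rng = random.Random(seed)

    def small():
        return rng.uniform(-0.05, 0.05)

    hidden = [([small() for _ in range(n_in)], small())
              for _ in range(n_hidden)]
    output = ([small() for _ in range(n_hidden)], small())
    return hidden, output

def forward(network, x):
    """Propagate a feature vector through the hidden and output layers."""
    hidden, (w_out, b_out) = network
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
         for ws, b in hidden]
    return sigmoid(sum(w * hi for w, hi in zip(w_out, h)) + b_out)

net = init_network()
# Hypothetical feature vector in the order
# (Out-Link, In-Link, Activity, MVP, TVP, Support-rate, Oppose-rate).
y = forward(net, [0.75, 0.25, 0.1, 0.1, 0.2, 0.5, 0.5])
```

Because the output node is a sigmoid unit, `y` always lies strictly between 0 and 1, which is what makes a comparison against 1 with a threshold meaningful at identification time.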

Training the feedforward neural network described in the technical scheme comprises the following steps:

(1) According to the k value of the k-fold cross-validation algorithm (k = 5 in this method), divide the training set into k subsets such that the intersection of any two subsets is empty;

(2) Train the feedforward neural network k times, each time choosing a different subset as the training set and the remaining k-1 subsets as the test set. Watch for "oscillation" of the recognition accuracy during training, and stop training when the recognition accuracy stabilizes;

(3) Record the recognition accuracy and the number of training iterations of each of the k training runs, assign each iteration count a weight according to its recognition accuracy, and compute the weighted sum of the k iteration counts as the final number of training iterations;

(4) Train the feedforward neural network on the complete training set for the final number of training iterations, obtaining the thresholds of the nodes and the weights of the edges.
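Steps (1) and (3) can be sketched in Python as below. The text does not state how the weights are derived from the recognition accuracies, so this sketch assumes each weight is the run's accuracy divided by the sum of all k accuracies; the round-robin split and the function names are likewise hypothetical.

```python
def split_k_folds(users, k=5):
    """Partition users into k pairwise-disjoint subsets (round-robin)."""
    return [users[i::k] for i in range(k)]

def final_training_iterations(iterations, accuracies):
    """Accuracy-weighted sum of the k recorded iteration counts.

    Assumption: weights are the accuracies normalised to sum to 1.
    """
    total = sum(accuracies)
    weights = [a / total for a in accuracies]
    return round(sum(w * n for w, n in zip(weights, iterations)))

folds = split_k_folds(list(range(10)), k=5)

# Hypothetical records from five training runs.
iterations = [100, 200, 300, 200, 200]
accuracies = [0.8, 0.8, 0.8, 0.8, 0.8]
final_n = final_training_iterations(iterations, accuracies)
```

With equal accuracies the weights are all 1/5, so the result is simply the mean iteration count; runs with higher accuracy would pull the final count toward their own iteration counts.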

Identifying trusted and untrusted users as described in the technical scheme comprises the following steps:

(1) Obtain the three aspects of information for the user to be identified (the topological structure of the user's social network, the user's individual factors, and the interaction behavior between users) and quantify them into a user feature vector of the following form: (Out-Link, In-Link, Activity, MVP, TVP, Support-rate, Oppose-rate);

(2) Input the feature vector of the user to be identified into the feedforward neural network and obtain the output value for that user;

(3) If the difference between the output value of the user to be identified and 1 is less than a preset threshold, the user is identified as a trusted user; otherwise, the user is identified as an untrusted user.
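The decision rule in step (3) is a single comparison. A minimal Python sketch follows; the default threshold of 0.5 is an assumption, since the text only says the threshold is preset, and the function name `classify` is hypothetical.

```python
def classify(output_value, threshold=0.5):
    """Label a user from the network's output value.

    The user is trusted when the output is within `threshold` of 1.
    The default threshold is an assumed value.
    """
    return "trusted" if abs(1.0 - output_value) < threshold else "untrusted"

label_high = classify(0.9)  # output close to 1
label_low = classify(0.2)   # output far from 1
```

A smaller threshold makes the rule stricter, so fewer users are labeled trusted.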

Compared with the prior art, the beneficial effects of the invention are as follows:

1. The method for identifying trusted and untrusted users based on a feedforward neural network gives a clear, explicit, and computable representation of user information from three aspects: the social network topology, the user's individual factors, and the interaction behavior between users. It converts the discrete non-numeric information of users in the social network into numeric information, which enriches the basis for identifying trusted and untrusted users, and it represents each user as a user feature vector, providing the feedforward neural network with multi-dimensional input data that is easy to compute with. The social network is parsed hierarchically into social network topology, users' individual factors, and interaction behavior between users, which further clarifies the purpose of the parsing, refines its granularity, and lays the foundation for analyzing and quantifying user features. Each level of the hierarchically parsed social network is independent: quantifying the user features at one level does not affect the other levels. This gives the method high cohesion and low coupling, makes the quantization methods easy to improve, and compensates for deficiencies of existing methods such as a single identification basis and coarse parsing granularity.

2. The method for identifying trusted and untrusted users based on a feedforward neural network improves the way the training set of the feedforward neural network is constructed. Conventional construction methods distinguish trusted from untrusted users only by the users' existing labels. However, the time dependence of social networks, the dynamic changes of users, and update delays in the social network mean that labels are not timely. Time-related information therefore needs to be introduced to screen users and solve the problem of stale labels. A determine-user method is introduced when building the training set: it combines time-related information with clustering to make the labels timely and deletes users whose labels have expired, giving the feedforward neural network a more standardized and accurate training set. Because the determine-user method is based on clustering, the users in the processed training set are distributed more tightly in the user feature space, which reduces the number of training-set nodes lying on the boundary between positive and negative examples and avoids the "oscillation" of the recognition rate during cross-training.

3. The method for identifying trusted and untrusted users based on a feedforward neural network combines trusted-user identification with machine learning. Identifying trusted users is a human thinking activity carried out by the nervous system in the brain, so a feedforward neural network is used to simulate the nervous system: the nodes of the feedforward neural network correspond to neurons, and the input and output values of a node correspond to the electrical signals a neuron receives and transmits. In addition, the input values of the input nodes come from the hierarchical parsing of the social network and belong to different categories, corresponding to the raw, diverse inputs received by the nervous system in the brain. The trusted-user identification problem is defined in "attribute-value" form, which suits solution by a feedforward neural network. Moreover, compared with existing identification methods, this is closer to the way the problem is actually solved, and the hidden layer of the feedforward neural network can discover previously undefined identification bases during training, which not only enriches the identification basis but also improves the flexibility of the method and the accuracy of identification.

In summary, the invention addresses the hierarchical, diverse, and time-sensitive nature of social networks and the biological character of trusted-user identification. It maps the biological nervous system onto a feedforward neural network by analogy, and combines a clustering-based algorithm with the corresponding machine learning algorithm to build the training set and identify trusted and untrusted users.

Brief description of the drawings

The invention is further described below with reference to the drawings:

Fig. 1 is a schematic block diagram of the functions of, and connections between, the modules of the computer program that implements the method for identifying trusted and untrusted users based on a feedforward neural network;

Fig. 2 is a flow chart of the clustering-based determine-user method of the invention;

Fig. 3 is a schematic block diagram of the feedforward neural network of the invention;

Fig. 4 shows the initial user set of the invention. Each node in the figure represents one initial user, the label on a node is the user name of the user it represents, and the nodes are laid out in no particular pattern;

Fig. 5 shows the result of pruning the initial user set with the determine-user method of the invention. The Lead node and the UnLead node are virtual nodes that represent the cluster centers; nodes connected to the Lead node represent special users, and nodes connected to the UnLead node represent the rejected non-special users;

Fig. 6 is a schematic block diagram of the initial social network of the invention. The users jointly trusted by the special users freak369, bryan_carey, and popsrocks fall into four classes: users jointly trusted by freak369 and bryan_carey; users jointly trusted by freak369 and popsrocks; users jointly trusted by bryan_carey and popsrocks; and users jointly trusted by freak369, bryan_carey, and popsrocks. These four classes correspond to the four user sets at the center of Fig. 6;

Fig. 7 is a schematic block diagram of (part of) the social network of the invention;

Fig. 8 is a schematic diagram of training the feedforward neural network of the invention by cross-validation;

Fig. 9 is a schematic diagram of the results of identifying trusted and untrusted users with the feedforward neural network of the invention. Nodes with an edge to node Cre and node Ucr represent trusted and untrusted users, respectively; nodes with an edge to node ANNCre and node ANNUCre represent users identified by the feedforward neural network as trusted and untrusted, respectively; a node with edges to both node Cre and node ANNUCre represents a trusted user whom the feedforward neural network identified as untrusted;

Detailed description of the invention

The invention is explained in detail below with reference to the drawings:

The technical problem to be solved by the method for identifying trusted and untrusted users based on a feedforward neural network is to overcome the deficiencies of the prior art. The invention addresses the key issues in identifying trusted and untrusted users with a feedforward neural network: building the training set, obtaining special users, determining the training-set users, quantifying user features, building the feedforward neural network, training it, and applying it. It proposes and implements a series of new techniques and methods for trusted-user identification, effectively solves problems such as an insufficient identification basis and coarse granularity of social network analysis, improves identification accuracy, and provides technical support for combining trusted-user identification with machine learning. The feedforward neural network is biologically inspired: by analogy, the working principle of the biological nervous system is mapped into computer science, with perceptrons or other types of units simulating the neurons of the biological nervous system and the input and output values of those units simulating the reception and transmission of electrical signals by neurons. On careful analysis, the feedforward neural network technique used in this method closely simulates the way a biological nervous system solves practical problems. This shows mainly in three correspondences: the different types of input values of the input units correspond to the diversity of information in the practical problem; the weights of the units correspond to the value of the information in the practical problem; and the hidden layer's automatic discovery of new identification bases corresponds to the relationships among pieces of information in the practical problem.

Referring to Fig. 1, a computer program was written to implement the method for identifying trusted and untrusted users based on a feedforward neural network. It comprises five functional modules: a module for obtaining special users and determining the users included in the training set, a module for quantifying user features, a module for building the feedforward neural network, a module for training the feedforward neural network, and a module for identifying trusted and untrusted users. The functions of the modules are as follows:

1. Module for obtaining special users and determining the users included in the training set

This module is divided into two main parts: the determine-user method and obtaining seed users. The determine-user method includes building the initial user set, obtaining user levels, and obtaining users' time-related information. Obtaining seed users includes obtaining users' trust relationships and building the initial social network from those relationships. First, the initial user set is built according to user level; next, the determine-user method yields the users from which the initial social network is built; finally, the initial social network is built from the users' trust relationships, and the seed users are obtained from the structure of the initial social network.

(1) Obtain user levels

Before a more standardized and accurate training set can be built, the existing users need an initial screening, based on the users' social level in the social network. User levels are obtained mainly with a Web Crawler combined with regular-expression matching to extract the level information of users in the social network. The initial screening is performed on the obtained level information, and the users who remain form the initial user set.

(2) Determine-user method

In conventional methods, the higher-level users in the initial user set are regarded as trusted users. However, social networks are time-sensitive: some higher-level users have been inactive for a long time, which lowers their credibility so that they no longer meet the conditions for trusted users in the training set. After the initial user set is obtained, the determine-user method is therefore needed to retain active users and separate out inactive ones. The determine-user method is based on clustering and uses users' time-related information to ensure that users are current; it aims to remove noise nodes from the training set.

(3) Build the initial social network

The determine-user method yields the special users. A Web Crawler is used to obtain the trust relationships of the special users, and the initial social network is built from those relationships. The initial social network is a directed network whose edge directions are determined by who trusts and who is trusted. Because the social network is large, storing it as an adjacency matrix would waste considerable space, so this method stores the social network in files: the users who trust a given user and the users that user trusts are stored in a file named after the user.

(4) Obtain seed users

The initial social network contains special users, users trusted by special users, and users who trust special users; these three classes of users are connected by trusting and being trusted. Within the initial social network, this method considers only the special users and the users they trust. It selects the users trusted by special users by examining the relationships between users, passes them through the determine-user method again, and provides the users that are retained to the training set as seed users.
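The file-based storage described under "(3)" above, with one file per user named after that user, could be sketched in Python as follows. The text does not specify the file format, so the one-record-per-line layout with a `trustee` or `trustor` prefix, the `.txt` suffix, and the function names `save_user` and `load_user` are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path

def save_user(directory, user, trustees, trustors):
    """Write one user's trust relations to a file named after the user.

    Assumed line format: '<role> <name>', where role is 'trustee'
    (someone the user trusts) or 'trustor' (someone who trusts the user).
    """
    lines = [f"trustee {name}" for name in sorted(trustees)]
    lines += [f"trustor {name}" for name in sorted(trustors)]
    path = Path(directory) / f"{user}.txt"
    path.write_text("\n".join(lines), encoding="utf-8")
    return path

def load_user(directory, user):
    """Read back the (trustees, trustors) sets for one user."""
    trustees, trustors = set(), set()
    text = (Path(directory) / f"{user}.txt").read_text(encoding="utf-8")
    for line in text.splitlines():
        role, name = line.split(" ", 1)
        (trustees if role == "trustee" else trustors).add(name)
    return trustees, trustors

# Round trip in a temporary directory with hypothetical neighbours.
with tempfile.TemporaryDirectory() as tmp:
    save_user(tmp, "freak369", {"alice", "bob"}, {"carol"})
    loaded = load_user(tmp, "freak369")
```

Storing only each user's neighbours keeps the space proportional to the number of trust edges, whereas an adjacency matrix grows with the square of the number of users.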

2. quantify user characteristics module

Described quantization user characteristics module is divided into quantization community network topological structure, quantifies user's individual factors And quantifying interbehavior three part between user, the work of this three part is both for user's expansion that training set comprises , it is intended to the categorical data of user is quantified as numeric data, merges various species basis of characterization, obtain User characteristics vector, is expressed as user feedforward neural network and can recognize that the pattern of process.

(1) community network topological structure is quantified

Use Web Crawler to obtain the trusting relationship that training set comprises user, build society according to trusting relationship Meeting network, by user kernel metrization community network topological structure.Core degree represent a user and other User's contact number, to contact its credibility of more user the highest with other users.This method center Heart degree comprises out-degree (Out-Link) and in-degree (In-Link) two parts, and Trustee represents the user of users to trust, Trustor represents the user trusting user, quantifies with the ratio of Trustee same Trustee, Trustor sum Degree (Out-Link), quantifies in-degree (In-Link) with the ratio of Trustor same Trustee, Trustor sum.

(2) user's individual factors is quantified

User's individual factors comprises liveness and power of influence two parts, embodies user respectively to certain realm information Sensitivity and the personal view of the user influence degree to other users.With user Review in this method The percentage ratio of the summation of value and all user's of choosing Review values is to quantify liveness (Activity), with user Member Visits value and Total Visits value carry out quantization influence power.Power of influence comprises two parts content: MVP And TVP.What MVP was user's Member Visits value with all Member Visits values choosing user is total The percentage ratio of sum, TVP is the summation of user's Total Visits value and all Total Visits values choosing user Percentage ratio.

(3) Quantifying interaction behavior between users

Interaction behavior between users comprises two parts, the support rate (support-rate) and the oppose rate (oppose-rate). The higher a user's support rate, the higher that user's credibility; the higher a user's oppose rate, the lower that user's credibility. Each Review a user publishes can receive different Review Ratings. The support rate is the number of Reviews rated as supported as a percentage of the user's total number of Reviews, and the oppose rate is the number of Reviews rated as opposed as a percentage of that total.
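The two rates can be sketched as simple counts over a user's Reviews. The `Rating` enum below is hypothetical: the text only says that Reviews receive different ratings, so the `OTHER` value and the handling of a user with no Reviews are assumptions.

```java
// Sketch only: the Rating categories are assumed, not taken from the patent.
final class InteractionSketch {
    enum Rating { SUPPORT, OPPOSE, OTHER }

    /** Returns {support-rate, oppose-rate} for the ratings of one user's Reviews. */
    static double[] rates(Rating[] reviewRatings) {
        if (reviewRatings.length == 0) {
            return new double[] {0.0, 0.0}; // assumption: no Reviews means both rates are 0
        }
        int support = 0;
        int oppose = 0;
        for (Rating r : reviewRatings) {
            if (r == Rating.SUPPORT) {
                support++;
            } else if (r == Rating.OPPOSE) {
                oppose++;
            }
        }
        double total = reviewRatings.length;
        return new double[] {support / total, oppose / total};
    }

    public static void main(String[] args) {
        Rating[] ratings = {Rating.SUPPORT, Rating.SUPPORT, Rating.OPPOSE, Rating.OTHER};
        double[] r = rates(ratings);
        System.out.println("support-rate = " + r[0] + ", oppose-rate = " + r[1]);
    }
}
```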

3. Feedforward neural network construction module

This module is mainly responsible for determining the layer structure of the feedforward neural network and the number and type of nodes in each layer.

(1) Determining the number of input nodes

The feedforward neural network in this method has seven input nodes, corresponding to the seven components of the user feature vector.

(2) Determining the number of layers and the number of nodes in each layer

The network in this method has three layers: an input layer, a hidden layer and an output layer. The input layer is as described in step (1); the hidden layer contains two hidden nodes; the output layer contains one output node.

(3) Determining the node types

Input nodes use perceptron units and are only responsible for receiving the input user feature vector. Hidden nodes and output nodes use sigmoid units, which map their output into the interval (0, 1) through a squashing function. The squashing function is f(x) = 1/(1 + e^(-x)), where x is the value of the sigmoid unit before the squashing function is applied.
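For reference, the squashing function f(x) = 1/(1 + e^(-x)) is a one-liner in Java. The class and method names here are hypothetical; the sketch only shows that outputs stay inside (0, 1) and increase with x.

```java
// Sketch only: names are hypothetical.
final class SquashSketch {
    /** Sigmoid squashing function f(x) = 1 / (1 + e^(-x)). */
    static double squash(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    public static void main(String[] args) {
        for (double x : new double[] {-5.0, -1.0, 0.0, 1.0, 5.0}) {
            System.out.println("f(" + x + ") = " + squash(x));
        }
    }
}
```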

4. Feedforward neural network training module

This module trains the feedforward neural network using cross-validation.

(1) Training the feedforward neural network with k-fold cross-validation

The training set is divided into five subsets, and the intersection of any two subsets is empty. In each round, one subset is selected as the training set and the remaining four subsets serve as the test set; the number of training iterations at which the best recognition accuracy is reached is recorded. After five rounds there are five iteration counts. Each count is assigned a weight according to the corresponding recognition accuracy, and the weighted sum of the five counts is taken as the final number of training iterations.

(2) The trained feedforward neural network

The feedforward neural network is then trained on the complete training set for the final number of training iterations, which yields the threshold of each node and the weight of each edge.
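The fold split and the weighted iteration count described above can be sketched as follows. The text does not say how the weights are derived from the recognition accuracies, so this sketch assumes each weight is the fold's accuracy divided by the sum of all accuracies; the class name, method names, and the shuffling seed are likewise hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Sketch only: the weighting scheme is an assumption, not stated in the patent.
final class CrossValidationSketch {
    /** Splits sample indices 0..n-1 into k disjoint folds whose sizes differ by at most one. */
    static List<List<Integer>> split(int n, int k, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));
        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        for (int i = 0; i < n; i++) {
            folds.get(i % k).add(indices.get(i));
        }
        return folds;
    }

    /** Weighted sum of per-fold iteration counts, with weight_i = accuracy_i / sum(accuracies). */
    static long finalIterations(int[] iterations, double[] accuracies) {
        double accuracySum = 0.0;
        for (double a : accuracies) {
            accuracySum += a;
        }
        double weighted = 0.0;
        for (int i = 0; i < iterations.length; i++) {
            weighted += iterations[i] * accuracies[i] / accuracySum;
        }
        return Math.round(weighted);
    }

    public static void main(String[] args) {
        System.out.println(split(23, 5, 42L));
        // Made-up iteration counts and accuracies for five rounds.
        int[] iterations = {900, 1100, 1000, 1200, 800};
        double[] accuracies = {0.90, 0.92, 0.91, 0.89, 0.93};
        System.out.println("final iterations = " + finalIterations(iterations, accuracies));
    }
}
```

Normalizing the weights keeps the final count within the range of the five recorded counts.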

5. Trusted and untrusted user identification module

This module mainly comprises two parts: converting a user into a user feature vector, and identifying the user as trusted or untrusted.

(1) Converting a user into a user feature vector

A web crawler obtains the information of the user to be identified from the social network. This information is then passed to the user feature quantification module, which produces a user feature vector built from three kinds of identification evidence: social network topology, individual factors, and interaction behavior between users.

(2) Identifying the user as trusted or untrusted

The user feature vector is input into the trained feedforward neural network, and the user is identified as trusted or untrusted according to the output value of the network's output node.

Referring to Fig. 2, the steps of the user determination method of the present invention are as follows:

(1) set k cluster centers (k = 2 in the present invention);

(2) sort the users in descending order of their time-related information;

(3) divide the sorted users into k equal parts;

(4) calculate the distance from each user to each cluster center;

(5) assign each user to the cluster whose center is nearest;

(6) recalculate the k cluster centers;

(7) check whether the membership of the k clusters has changed;

(8) if it has changed, recalculate the distance from each user to each cluster center; if not, terminate.
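The steps above amount to a two-cluster k-means procedure, which can be sketched in Java as follows. Several simplifications are assumptions made for illustration: each user is reduced to a single numeric time-related value, distance is the absolute difference, the initial centers are the means of the two halves of the sorted users, and ties go to the first cluster. All names are hypothetical.

```java
import java.util.Arrays;

// Sketch only: one-dimensional two-cluster k-means under the assumptions stated above.
final class TwoMeansSketch {
    /** Returns a label per input value: 0 for the high-value cluster, 1 for the low-value cluster. */
    static int[] cluster(double[] values) {
        int n = values.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two users");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);                      // ascending; the top half is the last n/2 entries
        int half = n / 2;
        double[] center = {
            mean(sorted, n - half, n),            // initial center of the high-value part
            mean(sorted, 0, n - half)             // initial center of the low-value part
        };
        int[] label = new int[n];
        Arrays.fill(label, -1);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int i = 0; i < n; i++) {         // assign each user to the nearest center
                int nearest = Math.abs(values[i] - center[0]) <= Math.abs(values[i] - center[1]) ? 0 : 1;
                if (nearest != label[i]) {
                    label[i] = nearest;
                    changed = true;
                }
            }
            double[] sum = new double[2];
            int[] count = new int[2];
            for (int i = 0; i < n; i++) {         // recompute both centers
                sum[label[i]] += values[i];
                count[label[i]]++;
            }
            for (int c = 0; c < 2; c++) {
                if (count[c] > 0) {
                    center[c] = sum[c] / count[c];
                }
            }
        }
        return label;
    }

    private static double mean(double[] a, int from, int to) {
        double s = 0.0;
        for (int i = from; i < to; i++) {
            s += a[i];
        }
        return s / (to - from);
    }

    public static void main(String[] args) {
        double[] recency = {10, 9, 8, 1, 2, 0};   // made-up time-related scores
        System.out.println(Arrays.toString(cluster(recency)));
    }
}
```

The loop stops as soon as a full pass leaves every label unchanged, which mirrors steps (7) and (8).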

Referring to Fig. 3, the structure of the feedforward neural network of the present invention is as follows:

The feedforward neural network used in this method consists of three parts: an input layer, a hidden layer and an output layer.

(1) The input to the input layer is the user feature vector, and each dimension of the vector corresponds to one input unit: Out-Link, In-Link, Activity, MVP, TVP, Support-rate and Oppose-rate correspond to the seven input units.

(2) The hidden layer contains two hidden units. Each hidden unit has a threshold H-Threshold_i (i = 1, 2), which influences the output value of the hidden unit. Every edge connecting an input unit to a hidden unit carries a weight ω_ij (i ∈ {1, 2, 3, 4, 5, 6, 7}, j ∈ {1, 2}), where ω_ij represents the contribution of the i-th input unit to the output of the j-th hidden unit.

(3) The output layer contains one output unit, whose threshold O-Threshold influences the output value of the output unit. Each edge connecting a hidden unit to the output unit carries a weight ω_hi (i ∈ {1, 2}), where ω_hi represents the contribution of the i-th hidden unit to the output of the output unit.

The output values of the hidden units and the output unit are all processed by the sigmoid function. The sigmoid function has range 0 to 1 and is monotonically increasing, so it maps any input value to an output in the range 0 to 1.
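A forward pass through the 7-2-1 structure just described can be sketched as below. The sketch assumes each threshold is added to the weighted sum before the sigmoid is applied; the array names echo the declarations used in the embodiment, but the class, the method, and the sample numbers are hypothetical.

```java
// Sketch only: a forward pass for a 7-2-1 network with additive thresholds (assumed).
final class FeedForwardSketch {
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /**
     * @param x             user feature vector (7 components)
     * @param iptHidWeights weights[i][j] from input unit i to hidden unit j
     * @param hidThreshold  threshold of each hidden unit
     * @param hidOptWeights weight from each hidden unit to the output unit
     * @param optThreshold  threshold of the output unit
     */
    static double forward(double[] x, double[][] iptHidWeights, double[] hidThreshold,
                          double[] hidOptWeights, double optThreshold) {
        double net = optThreshold;
        for (int j = 0; j < hidThreshold.length; j++) {
            double sum = hidThreshold[j];
            for (int i = 0; i < x.length; i++) {
                sum += x[i] * iptHidWeights[i][j];
            }
            net += sigmoid(sum) * hidOptWeights[j];  // hidden output times its outgoing weight
        }
        return sigmoid(net);
    }

    public static void main(String[] args) {
        // Made-up feature vector: Out-Link, In-Link, Activity, MVP, TVP, Support-rate, Oppose-rate.
        double[] x = {0.25, 0.75, 0.2, 0.05, 0.1, 0.5, 0.25};
        double[][] w1 = new double[7][2];
        for (double[] row : w1) {
            row[0] = 0.1;
            row[1] = -0.1;
        }
        double y = forward(x, w1, new double[] {0.0, 0.0}, new double[] {1.0, 1.0}, -1.0);
        System.out.println("network output = " + y);
    }
}
```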

Embodiment:

Referring to Fig. 1, the steps of the trusted and untrusted user identification method based on a feedforward neural network of the present invention are as follows:

1. Obtain the special users and the users contained in the training set

(1) use a web crawler to crawl user level information and build the initial user set from it; the result is shown in Fig. 4;

(2) prune the initial user set with the user determination method to obtain the special users; the result is shown in Fig. 5;

(3) use a web crawler to crawl the users' trust relationships and build the initial social network from them; the result is shown in Fig. 6;

(4) obtain the seed users from the topology of the initial social network and the user determination method, use a web crawler to crawl the trust relationships of the seed users, and build the social network; the result is shown in Fig. 7.
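Claim 1 gives the seed-user condition as |μ_s → μ_p| ≥ 2, that is, a seed user is trusted by at least two special users. A minimal Java sketch of that selection follows. The representation of the trust relation as a map from each special user to the set of users that user trusts, and all names, are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Sketch only: the data layout and names are hypothetical.
final class SeedUserSketch {
    /** Returns the users trusted by at least minTrustors special users. */
    static Set<String> seedUsers(Map<String, Set<String>> trustedBySpecialUser, int minTrustors) {
        Map<String, Integer> trustorCount = new HashMap<>();
        for (Set<String> trustees : trustedBySpecialUser.values()) {
            for (String trustee : trustees) {
                trustorCount.merge(trustee, 1, Integer::sum);
            }
        }
        Set<String> seeds = new TreeSet<>();
        for (Map.Entry<String, Integer> e : trustorCount.entrySet()) {
            if (e.getValue() >= minTrustors) {
                seeds.add(e.getKey());
            }
        }
        return seeds;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> trust = new HashMap<>();
        trust.put("special1", new TreeSet<>(Set.of("a", "b")));
        trust.put("special2", new TreeSet<>(Set.of("b", "c")));
        trust.put("special3", new TreeSet<>(Set.of("a", "b")));
        System.out.println(seedUsers(trust, 2));
    }
}
```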

2. Quantify user features

Use a web crawler to crawl three kinds of information about each user (social network topology, individual factors, and interaction behavior between users) and quantify it; each user is then expressed as a user feature vector. The user feature vector is as follows:

3. Build the feedforward neural network

Referring to Fig. 3, the initialization part of the program that builds the feedforward neural network is as follows.

The input layer, hidden layer and output layer of the feedforward neural network are constructed by the following code:

Input = new double[inputSize];

Hidden = new double[hiddenSize];

Output = new double[outputSize];

The hidden node thresholds and the output node threshold of the feedforward neural network are declared by the following code:

HidThresHold = new double[hiddenSize];

OptThresHold = new double[outputSize];

The weights of the edges between input nodes and hidden nodes, and between hidden nodes and output nodes, are declared by the following code:

IptHidWeights = new double[inputSize][hiddenSize];

HidOptWeights = new double[hiddenSize][outputSize];

The above thresholds and weights are assigned initial values by a random function, in the range [-0.05, 0.05]; the code is as follows:

CreatRandomWeight(IptHidWeights);

CreatRandomWeight(HidOptWeights);
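The body of CreatRandomWeight is not given in the text. One plausible shape, shown here purely as an assumption, fills a weight matrix with uniform random values of magnitude at most 0.05. The class and method names are hypothetical, and because `Random.nextDouble()` returns values in [0, 1), the upper bound is exclusive in this sketch.

```java
import java.util.Random;

// Sketch only: a guess at what a helper like CreatRandomWeight might do.
final class RandomInitSketch {
    /** Fills every entry of w with a uniform random value in [-limit, limit). */
    static void fillUniform(double[][] w, double limit, Random rng) {
        for (double[] row : w) {
            for (int j = 0; j < row.length; j++) {
                row[j] = (rng.nextDouble() * 2.0 - 1.0) * limit;
            }
        }
    }

    public static void main(String[] args) {
        int inputSize = 7;
        int hiddenSize = 2;
        int outputSize = 1;
        double[][] iptHidWeights = new double[inputSize][hiddenSize];
        double[][] hidOptWeights = new double[hiddenSize][outputSize];
        Random rng = new Random();
        fillUniform(iptHidWeights, 0.05, rng);
        fillUniform(hidOptWeights, 0.05, rng);
        System.out.println(java.util.Arrays.deepToString(iptHidWeights));
    }
}
```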

4. Train the feedforward neural network

The feedforward neural network is trained using cross-validation; the training process is shown in Fig. 8.

After training, the thresholds of the nodes and the weights of the edges of the feedforward neural network are as follows:

Hidden node thresholds: -0.9997232652135897, -0.9997855324086498;

Output node threshold: -0.9985211755028933;

Weights of the edges between input nodes and hidden nodes:

0.0897245066322621, 0.030246991857638347, 0.08625975560191126,

0.05646571427266416, 0.12009404029133104, 0.030833920608579368,

0.07035358302802967, 0.13070006844338009, 0.07024489700918604,

0.09999458520853907, 0.4235673648091106, 0.4004164285673208,

2.1125328654365805, 2.113744324370964

Weights of the edges between hidden nodes and the output node:

1.0038892999564673, 0.9847503518050365

5. Identify trusted and untrusted users

(1) obtain the information of the user to be identified from the social network, quantify it, and express it as a user feature vector;

(2) use the user feature vector as the input of the feedforward neural network and identify trusted and untrusted users according to the output value of the output node; the result is shown in Fig. 9. Taking one user feature vector as an example, the identification process is as follows:

Output value of hidden node 1 (HN1OutPut):

Out-Link*W11 + In-Link*W21 + Activity*W31 + MVP*W41 + TVP*W51 + Support-rate*W61 + Oppose-rate*W71 + H-Threshold1

Output value of hidden node 2 (HN2OutPut):

Out-Link*W12 + In-Link*W22 + Activity*W32 + MVP*W42 + TVP*W52 + Support-rate*W62 + Oppose-rate*W72 + H-Threshold2

Output value of the output node (ONOutput):

HN1OutPut*Wh1 + HN2OutPut*Wh2 + O-Threshold

Substituting the corresponding values into the formulas above gives the following results:

Output value of hidden node 1 (HN1OutPut): 1.051599979598784;

Output value of hidden node 2 (HN2OutPut): 1.052814435066734;

Output value of the output node (ONOutput): 1.09336101901679;

The difference between ONOutput and the target output value 1 is less than 0.2, so this user is judged to be a trusted user.
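The final decision is a tolerance check against the target output value 1. A minimal Java sketch, using the ONOutput value and the 0.2 tolerance from the example above (the class and method names are hypothetical):

```java
// Sketch only: names are hypothetical; the numbers come from the worked example.
final class TrustDecisionSketch {
    /** A user is trusted when the output differs from the target value 1 by less than the tolerance. */
    static boolean isTrusted(double output, double tolerance) {
        return Math.abs(output - 1.0) < tolerance;
    }

    public static void main(String[] args) {
        double onOutput = 1.09336101901679;
        System.out.println(isTrusted(onOutput, 0.2) ? "trusted user" : "untrusted user");
    }
}
```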

Claims

1. A method for identifying trusted and untrusted users based on a feedforward neural network, characterized in that the method comprises the following steps:

(1) obtain the special users and determine the users contained in the training set:

1) build the initial user set according to the level of each user in the social network; the initial user set is defined as:

Μ = {μ ∈ Μ | μ ∈ OSW_s, Φ_μ → Φ_tag}

2) prune the initial user set according to the users' time-related information, filtering out inactive users and taking the remaining users as the special users;

3) build the initial social network from the trust relationships between the special users and the other users in the social network;

4) obtain the seed users from the trust relationships between the special users and the other users in the social network and update the initial social network; the seed users are the users contained in the training set, and are selected by the condition:

ProUser = {μ_p ∈ ProUser | |μ_s → μ_p| ≥ 2}

where ProUser denotes the set of seed users, μ_s denotes a determined special user, μ_p denotes a user in the set ProUser, and |μ_s → μ_p| denotes the number of special users who trust user μ_p;

(2) analyze and quantify user features and express each user as a user feature vector:

1) analyze each user's features using three kinds of information: the user's social network topology, the user's individual factors, and interaction behavior between users;

2) quantify the user features and express each user as a user feature vector composed of multiple features;

(3) build the feedforward neural network:

1) determine the number of input nodes of the feedforward neural network according to the dimension of the user feature vector;

2) determine the number of layers of the feedforward neural network and the number of nodes in each layer according to the complexity of trusted-user identification;

3) determine the types of the nodes in the hidden layer and the output layer according to the structure and performance requirements of the feedforward neural network;

(4) train the feedforward neural network:

1) divide the training set into k subsets according to the k value of the k-fold cross-validation algorithm, the intersection of any two subsets being empty;

2) train the feedforward neural network k times, each time choosing a different subset as the training set and the remaining k-1 subsets as the test set;

3) assign a weight to the number of training iterations determined in each of the k training runs according to the corresponding recognition accuracy, and take the weighted sum of the k iteration counts as the final number of training iterations;

4) train the feedforward neural network on the complete training set for the final number of training iterations;

(5) identify trusted and untrusted users with the trained feedforward neural network:

1) obtain and quantify three kinds of information (the user's social network topology, the user's individual factors, and interaction behavior between users), and express the user as a user feature vector;

2) input the user feature vector into the feedforward neural network to perform trusted and untrusted user identification and obtain the output value for the user being identified;

3) identify the user as trusted or untrusted according to the output value of the feedforward neural network.

2. The method for identifying trusted and untrusted users based on a feedforward neural network according to claim 1, characterized in that obtaining the special users comprises the following steps:

(1) sort the users in the initial user set in descending order of their time-related information;

Ω_c = \frac{\sum_{i=1}^{|Θ|/2} Γ_i}{|Θ|/2}, \quad Ω_s = \frac{\sum_{i=|Θ|/2}^{|Θ|} Γ_i}{|Θ|/2}

\mathrm{dist}(Γ_i, Ω_c) = \left( |Γ_{i1} - Ω_{c1}|^h + \dots + |Γ_{ir} - Ω_{cr}|^h \right)^{1/h}

\mathrm{dist}(Γ_i, Ω_s) = \left( |Γ_{i1} - Ω_{s1}|^h + \dots + |Γ_{ir} - Ω_{sr}|^h \right)^{1/h}

(4) assign each user to the cluster whose center is nearer;

Ω_c' = \frac{\sum_{i=1}^{|Θ_c'|} Γ_i}{|Θ_c|}, \quad Ω_s' = \frac{\sum_{i=1}^{|Θ_s'|} Γ_i}{|Θ_s|}

(6) check whether the membership of the two clusters has changed; if so, recalculate the distance from each user in the user set to the two cluster centers; otherwise, terminate.

3. The method for identifying trusted and untrusted users based on a feedforward neural network according to claim 1, characterized in that quantifying the user features comprises the following steps:

(1) quantify the topology of the social network by the core degree: the user's core degree is divided into two parts, out-degree (Out-Link) and in-degree (In-Link), which are quantified from the user's trust relationships in the social network using the following formulas:

Out-Link = |Trustee| / (|Trustee| + |Trustor|)

In-Link = |Trustor| / (|Trustee| + |Trustor|)

(2) quantify the user's individual factors, namely activity and influence: activity is quantified as the number of Reviews published by the user as a percentage of the total number of Reviews published by all selected users, using the following formula:

Activity = μ_{RW} / \sum_{μ \in ProUser} μ_{RW}

where μ_RW denotes the number of Reviews published by the user and ProUser denotes the users contained in the training set;

influence is quantified from the two attributes Member Visits and Total Visits, using the following formulas:

MVP = μ_{MV} / \sum_{μ \in ProUser} μ_{MV}, \quad TVP = μ_{TV} / \sum_{μ \in ProUser} μ_{TV}

where MVP denotes the user's Member Visits value as a percentage of the sum of the Member Visits values of all selected users, μ_MV denotes the user's Member Visits value, TVP denotes the user's Total Visits value as a percentage of the sum of the Total Visits values of all selected users, μ_TV denotes the user's Total Visits value, and ProUser denotes the users contained in the training set;

4. The method for identifying trusted and untrusted users based on a feedforward neural network according to claim 1, characterized in that building the feedforward neural network comprises the following steps:

(2) determine the number of layers of the feedforward neural network and the number of nodes in each layer according to the complexity of trusted-user identification; there are three layers: an input layer with seven input nodes, a hidden layer with two hidden nodes, and an output layer with one output node;

5. The method for identifying trusted and untrusted users based on a feedforward neural network according to claim 1, characterized in that training the feedforward neural network comprises the following steps:

(1) divide the training set into k subsets according to the k value of the k-fold cross-validation algorithm, where k is 5 in this method and the intersection of any two subsets is empty;

6. The method for identifying trusted and untrusted users based on a feedforward neural network according to claim 1, characterized in that identifying trusted and untrusted users with the trained feedforward neural network comprises the following steps:

(1) obtain three kinds of information about the user to be identified in the social network (social network topology, individual factors, and interaction behavior between users) and quantify it as a user feature vector of the following form: <Out-Link, In-Link, Activity, MVP, TVP, Support-rate, Oppose-rate>;

(3) if the difference between the output value for the user to be identified and 1 is less than a predetermined threshold, the user is identified as a trusted user; otherwise, the user is identified as an untrusted user.