CN109670855A - Scoring method and device for information flow platform authors - Google Patents

Publication number
CN109670855A
Authority
CN
China
Prior art keywords
author
log
user
information flow
flow platform
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811299493.8A
Other languages
Chinese (zh)
Inventor
陈翔
张济显
唐传洋
韩振岭
张颖
李伟力
赵国振
范强
任宝鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Application filed by Beijing Qihoo Technology Co Ltd
Priority to CN201811299493.8A
Publication of CN109670855A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q30/0282: Rating or review of business operators or products
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/06: Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063: Operations research, analysis or management
    • G06Q10/0639: Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393: Score-carding, benchmarking or key performance indicator [KPI] analysis

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Physics & Mathematics (AREA)
  • Game Theory and Decision Science (AREA)
  • Marketing (AREA)
  • Finance (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Educational Administration (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a scoring method and device for authors on an information flow platform. The method comprises: obtaining raw user logs of the information flow platform from multiple different channels; parsing the raw user logs with a rule parsing engine to obtain parsed user logs, where the rule parsing engine is built from the log parsing rules of each channel; obtaining author logs from the back-end database of the information flow platform; computing, from the parsed user logs and the author logs, evaluation indicators of each author's output performance within a specific time period, where the evaluation indicators include quality, productivity, popularity, professionalism, and credit rating; and computing a weighted sum of the quality, productivity, popularity, professionalism, and credit rating to obtain the author's evaluation score. The invention achieves multi-source data fusion, ensures the accuracy, stability, and availability of the data, and ensures that authors are evaluated fairly, objectively, and accurately.

Description

Scoring method and device for information flow platform authors
Technical field
The present invention relates to the field of Internet technology, and in particular to a scoring method for information flow platform authors, a scoring device for information flow platform authors, a computer storage medium, and a computing device.
Background
An information flow (feed) is a content stream that can be browsed by scrolling. Information flow platforms currently have very large user bases. To give users a good reading experience, platform authors need to be evaluated on the basis of user behavior data so that author quality can be controlled.
In today's enterprises, understanding users more fully requires collecting different user behavior data from multiple products and dimensions, which in turn allows authors to be evaluated more comprehensively. However, companies differ in scale and technical reserves, and multi-source data is huge in volume, governed by chaotic rules, processed through cumbersome pipelines, subject to delays and anomalies, and shaped by differing business requirements. As a result, it is difficult to quickly build a practical, accurate, and stable multi-source data fusion system.
In addition, existing author evaluation systems use machine learning algorithms to evaluate author performance. A specific implementation is as follows: the data of a subset of authors is selected as a data set; scores are labeled for the authors according to user behavior data; the data set is split into a training set and a test set; different regression models are trained on the training set; the best regression model is selected using the test set; and the selected model is used to predict scores for the data to be evaluated. However, evaluating author performance with a regression model in effect learns the rule by which labels were assigned to authors. A regression model always has some error rate and cannot guarantee 100% accuracy, so fairness to all authors cannot be guaranteed.
Therefore, an author evaluation method is needed that ensures the accuracy and stability of multi-source data and ensures fairness to authors.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a scoring method for information flow platform authors, a scoring device for information flow platform authors, a computer storage medium, and a computing device that overcome the above problems or at least partially solve them.
According to one aspect of the embodiments of the present invention, a scoring method for information flow platform authors is provided, comprising:
obtaining raw user logs of the information flow platform from multiple different channels;
parsing the raw user logs with a rule parsing engine to obtain parsed user logs, where the rule parsing engine is built from the log parsing rules of each channel;
obtaining author logs from the back-end database of the information flow platform; and
scoring the authors of the information flow platform according to the parsed user logs and the author logs.
Optionally, the different channels include a mobile device application (APP) client and/or a personal computer (PC) APP client.
Optionally, after the author logs are obtained from the back-end database of the information flow platform, the method further comprises:
saving the parsed user logs and the author logs to the Hadoop Distributed File System.
Optionally, the parsed user logs and the author logs are associated through the uniform resource locator (URL) of the article/video published by the author.
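As an illustration of associating the two log sources by URL, the following minimal Python sketch groups parsed user events under the author who published the article or video at each URL. The field names `url` and `author_id` and the function name are assumptions made for this example; the source does not specify a log schema.

```python
from collections import defaultdict
from typing import Dict, List


def group_events_by_author(
    user_logs: List[Dict[str, str]],
    author_logs: List[Dict[str, str]],
) -> Dict[str, List[Dict[str, str]]]:
    """Attach each parsed user event to the author who published its URL."""
    # Author logs say which author published which URL (hypothetical schema).
    url_to_author = {row["url"]: row["author_id"] for row in author_logs}
    events_by_author: Dict[str, List[Dict[str, str]]] = defaultdict(list)
    for event in user_logs:
        author_id = url_to_author.get(event["url"])
        if author_id is not None:  # skip events whose URL no author log covers
            events_by_author[author_id].append(event)
    return dict(events_by_author)
```

Events whose URL does not appear in any author log are dropped here; whether to drop or retain them is a design choice the source does not address.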
Optionally, after the authors of the information flow platform are scored, the method further comprises:
saving the scoring results for the authors of the information flow platform to a MySQL data table.
Optionally, scoring the authors of the information flow platform according to the parsed user logs and the author logs comprises:
computing, from the parsed user logs and the author logs, evaluation indicators of each author's output performance within a specific time period, and calculating the author's evaluation score from the evaluation indicators.
Optionally, the evaluation indicators include quality, productivity, popularity, professionalism, and credit rating;
calculating the author's evaluation score from the evaluation indicators comprises:
computing a weighted sum of the quality, productivity, popularity, professionalism, and credit rating to obtain the author's evaluation score.
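The weighted summation of the five evaluation indicators can be sketched as follows. The weight values and dictionary keys are hypothetical, since the source does not state any weights; the sketch only shows the shape of the calculation.

```python
from typing import Dict

# Hypothetical weights; the source does not give concrete values.
WEIGHTS: Dict[str, float] = {
    "quality": 0.3,
    "productivity": 0.2,
    "popularity": 0.2,
    "professionalism": 0.2,
    "credit": 0.1,
}


def evaluation_score(indicators: Dict[str, float],
                     weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of the five evaluation indicators for one author."""
    missing = set(weights) - set(indicators)
    if missing:
        raise KeyError(f"missing indicators: {sorted(missing)}")
    return sum(weights[name] * indicators[name] for name in weights)
```

Because the score is a fixed formula rather than a learned model, two authors with identical indicators always receive the same score.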
Optionally, the quality of each author is computed from the user evaluation parameters of the articles/videos published by that author, where the user evaluation parameters include one or more of reading/viewing duration, user click data, user share data, user comment data, user favorite data, and user like data.
Optionally, the quality Q(X) of each author is computed according to the following formula:
Q(X) = conversion rate + reading/viewing duration + log(average performance) + log(best performance);
where conversion rate = click rate + share rate + comment rate + favorite rate + like rate - dislike rate,
average performance = average clicks + average shares + average comments + average favorites + average likes,
best performance = highest clicks + highest shares + highest comments + highest favorites + highest likes,
and the conversion rate, reading/viewing duration, log(average performance), and log(best performance) are each normalized.
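A minimal Python sketch of the Q(X) formula follows. Two points are assumptions rather than part of the source: normalization is done by min-max scaling across the authors being scored (the source only says the terms are normalized), and `log1p` is used in place of a plain logarithm so that zero counts do not fail. All field names are hypothetical.

```python
import math
from typing import Dict, List

RATES = ("click_rate", "share_rate", "comment_rate", "favorite_rate", "like_rate")
COUNTS = ("clicks", "shares", "comments", "favorites", "likes")


def min_max(values: List[float]) -> List[float]:
    """Assumed normalization: scale values to [0, 1] across all authors."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def quality_scores(authors: List[Dict[str, float]]) -> List[float]:
    """Q(X) per author: the four normalized terms added together."""
    conversion = [sum(a[r] for r in RATES) - a["dislike_rate"] for a in authors]
    duration = [a["duration"] for a in authors]
    # log1p (log(1 + x)) is an assumption so that zero counts are tolerated.
    avg_perf = [math.log1p(sum(a["avg_" + c] for c in COUNTS)) for a in authors]
    best_perf = [math.log1p(sum(a["max_" + c] for c in COUNTS)) for a in authors]
    terms = [min_max(t) for t in (conversion, duration, avg_perf, best_perf)]
    return [sum(per_author) for per_author in zip(*terms)]
```

Taking the logarithm of the count-based terms before normalizing keeps a single viral article from dominating the score.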
Optionally, productivity characterizes the output efficiency of an author;
the productivity of each author is computed from the number of articles/videos the author publishes and the author's publishing efficiency.
Optionally, the productivity P1 of each author is computed according to the following formula:
P1 = log(publication count) + publishing efficiency;
where the publication count is the total number of articles or videos published by the author,
publishing efficiency is the sum of two ratios: the number of days in the specified time period on which the author published an article or video to the total number of days in the period, and the number of weeks in the period in which the author published an article or video to the total number of weeks in the period,
and log(publication count) and publishing efficiency are both normalized.
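The two terms of the P1 formula can be sketched in Python before normalization is applied. The sketch assumes that a "week" means an ISO calendar week touched by the period and uses `log1p` so that an author with no publications does not fail; neither choice is stated in the source, and the function name is hypothetical.

```python
import math
from datetime import date, timedelta
from typing import List, Tuple


def productivity_terms(published: List[date], start: date, end: date) -> Tuple[float, float]:
    """Return (log publication count, publishing efficiency), both un-normalized."""
    in_period = [d for d in published if start <= d <= end]
    total_days = (end - start).days + 1
    # Assumption: a "week" is an ISO calendar week touched by the period.
    period_weeks = {
        (start + timedelta(days=i)).isocalendar()[:2] for i in range(total_days)
    }
    active_days = set(in_period)
    active_weeks = {d.isocalendar()[:2] for d in in_period}
    efficiency = len(active_days) / total_days + len(active_weeks) / len(period_weeks)
    # log1p is an assumption so that zero publications are tolerated.
    return math.log1p(len(in_period)), efficiency
```

Both returned values would then be normalized across authors and added to give P1. Counting active days and active weeks separately rewards authors who publish steadily over those who publish everything in one burst.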
Optionally, the specified time period is one month.
Optionally, the popularity of each author is computed from the user follow count, user view count, and user share count of the articles/videos published by that author.
Optionally, the popularity P2 of each author is computed according to the following formula:
P2 = log(user follow count) + log(user view count) + log(user share count);
where log(user follow count), log(user view count), and log(user share count) are each normalized.
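The P2 formula can be sketched in the same way. Min-max scaling across authors and the use of `log1p` are assumptions, as the source does not say how the normalization is performed; the function names are hypothetical.

```python
import math
from typing import List, Sequence


def min_max(values: List[float]) -> List[float]:
    """Assumed normalization: scale values to [0, 1] across all authors."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def popularity_scores(follows: Sequence[int],
                      views: Sequence[int],
                      shares: Sequence[int]) -> List[float]:
    """P2 per author: sum of three normalized log terms."""
    # log1p is an assumption so that zero counts are tolerated.
    terms = [
        min_max([math.log1p(v) for v in column])
        for column in (follows, views, shares)
    ]
    return [sum(per_author) for per_author in zip(*terms)]
```

Each position in the three input sequences refers to the same author, so the result list is aligned with the inputs.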
Optionally, professionalism characterizes an author's influence in different fields;
the professionalism of each author is computed from the author's quality and productivity in the different fields.
Optionally, the professionalism P3 of each author in each field is computed according to the following formula:
P3 = (sum of quality and productivity in a given field) / (sum of quality and productivity across all fields).
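A short Python sketch of the P3 ratio is given below. The input layout (a mapping from field name to a quality/productivity pair) and the handling of an all-zero denominator are assumptions for illustration only.

```python
from typing import Dict, Tuple


def professionalism(per_field: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """P3 per field: (quality + productivity) in the field over the total for all fields."""
    totals = {field: q + p for field, (q, p) in per_field.items()}
    grand_total = sum(totals.values())
    if grand_total == 0:
        # Assumption: an author with no measurable output has zero professionalism.
        return {field: 0.0 for field in totals}
    return {field: total / grand_total for field, total in totals.items()}
```

For example, an author with (quality, productivity) of (3.0, 1.0) in one field and (1.0, 1.0) in each of two others gets P3 values of 0.5, 0.25, and 0.25, which sum to 1.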
Optionally, the credit rating C of each author is computed according to the following formula:
C = 100 - review deductions - user complaint deductions;
where the criteria for review deductions include at least one of the following:
violating laws and regulations, violating social morality, and containing harmful information.
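The credit rating formula is simple arithmetic, sketched below. Clamping the result at zero is an assumption added for the example; the source does not say what happens when deductions exceed 100, and the function name is hypothetical.

```python
from typing import Iterable


def credit_rating(review_deductions: Iterable[float],
                  complaint_deductions: Iterable[float]) -> float:
    """C = 100 - review deductions - user complaint deductions."""
    score = 100.0 - sum(review_deductions) - sum(complaint_deductions)
    # Assumption: the rating is not allowed to go below zero.
    return max(0.0, score)
```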
Optionally, computing, from the parsed user logs and the author logs, evaluation indicators of each author's output performance within a specific time period and calculating the author's evaluation score from the evaluation indicators comprises:
computing, from the parsed user logs and the author logs, the evaluation indicators of each author's article output performance and the evaluation indicators of the author's video output performance within the specific time period;
calculating the author's overall article score and overall video score from the article evaluation indicators and the video evaluation indicators, respectively; and
computing a weighted sum of the author's overall article score and overall video score to obtain the author's comprehensive evaluation score.
Optionally, computing, from the parsed user logs and the author logs, the evaluation indicators of each author's article output performance and video output performance within the specific time period comprises:
computing, from the parsed user logs and the author logs, the evaluation indicators of each author's article output performance and video output performance in each field within the specific time period;
and calculating the author's overall article score and overall video score from the article evaluation indicators and the video evaluation indicators, respectively, comprises:
calculating the author's article score and video score in each field from the author's article evaluation indicators and video evaluation indicators in that field, respectively;
computing a weighted sum of the author's article scores across fields to obtain the author's overall article score; and
computing a weighted sum of the author's video scores across fields to obtain the author's overall video score.
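The two-level aggregation described above (per-field scores into overall article and video scores, then into one comprehensive score) can be sketched as follows. The field weights and the article/video weights are hypothetical; the source does not give values.

```python
from typing import Dict


def weighted_sum(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum over the fields present in `scores`."""
    return sum(weights[field] * score for field, score in scores.items())


def comprehensive_score(article_by_field: Dict[str, float],
                        video_by_field: Dict[str, float],
                        field_weights: Dict[str, float],
                        article_weight: float = 0.5,
                        video_weight: float = 0.5) -> float:
    """Combine per-field article and video scores into one author score."""
    article_total = weighted_sum(article_by_field, field_weights)
    video_total = weighted_sum(video_by_field, field_weights)
    return article_weight * article_total + video_weight * video_total
```

With article scores of 80 (tech) and 60 (sports), a video score of 40 (tech), and equal weights of 0.5 everywhere, the overall article score is 70, the overall video score is 20, and the comprehensive score is 45.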
According to another aspect of the embodiments of the present invention, a scoring device for information flow platform authors is also provided, comprising:
a user log obtaining module, adapted to obtain raw user logs of the information flow platform from multiple different channels;
a user log parsing module, adapted to parse the raw user logs with a rule parsing engine to obtain parsed user logs, where the rule parsing engine is built from the log parsing rules of each channel;
an author log obtaining module, adapted to obtain author logs from the back-end database of the information flow platform; and
an author scoring statistics module, adapted to score the authors of the information flow platform according to the parsed user logs and the author logs.
Optionally, the different channels include a mobile device application (APP) client and/or a personal computer (PC) APP client.
Optionally, the device further comprises:
a log data saving module, adapted to save the parsed user logs and the obtained author logs to the Hadoop Distributed File System.
Optionally, the parsed user logs and the author logs are associated through the uniform resource locator (URL) of the article/video published by the author.
Optionally, the device further comprises:
a scoring result saving module, adapted to save the scoring results for the authors of the information flow platform to a MySQL data table after the author scoring statistics module has scored the authors.
Optionally, the author scoring statistics module is further adapted to:
compute, from the parsed user logs and the author logs, evaluation indicators of each author's output performance within a specific time period, and calculate the author's evaluation score from the evaluation indicators.
Optionally, the evaluation indicators include quality, productivity, popularity, professionalism, and credit rating;
the author scoring statistics module is further adapted to:
compute a weighted sum of the quality, productivity, popularity, professionalism, and credit rating to obtain the author's evaluation score.
Optionally, the author scoring statistics module is further adapted to:
compute the quality of each author from the user evaluation parameters of the articles/videos published by that author, where the user evaluation parameters include one or more of reading/viewing duration, user click data, user share data, user comment data, user favorite data, and user like data.
Optionally, the author scoring statistics module is further adapted to:
compute the quality Q(X) of each author according to the following formula:
Q(X) = conversion rate + reading/viewing duration + log(average performance) + log(best performance);
where conversion rate = click rate + share rate + comment rate + favorite rate + like rate - dislike rate,
average performance = average clicks + average shares + average comments + average favorites + average likes,
best performance = highest clicks + highest shares + highest comments + highest favorites + highest likes,
and the conversion rate, reading/viewing duration, log(average performance), and log(best performance) are each normalized.
Optionally, productivity characterizes the output efficiency of an author;
the author scoring statistics module is further adapted to:
compute the productivity of each author from the number of articles/videos the author publishes and the author's publishing efficiency.
Optionally, the author scoring statistics module is further adapted to:
compute the productivity P1 of each author according to the following formula:
P1 = log(publication count) + publishing efficiency;
where the publication count is the total number of articles or videos published by the author,
publishing efficiency is the sum of two ratios: the number of days in the specified time period on which the author published an article or video to the total number of days in the period, and the number of weeks in the period in which the author published an article or video to the total number of weeks in the period,
and log(publication count) and publishing efficiency are both normalized.
Optionally, the specified time period is one month.
Optionally, the author scoring statistics module is further adapted to:
compute the popularity of each author from the user follow count, user view count, and user share count of the articles/videos published by that author.
Optionally, the author scoring statistics module is further adapted to:
compute the popularity P2 of each author according to the following formula:
P2 = log(user follow count) + log(user view count) + log(user share count);
where log(user follow count), log(user view count), and log(user share count) are each normalized.
Optionally, professionalism characterizes an author's influence in different fields;
the author scoring statistics module is further adapted to:
compute the professionalism of each author from the author's quality and productivity in the different fields.
Optionally, the author scoring statistics module is further adapted to:
compute the professionalism P3 of each author in each field according to the following formula:
P3 = (sum of quality and productivity in a given field) / (sum of quality and productivity across all fields).
Optionally, the author scoring statistics module is further adapted to:
compute the credit rating C of each author according to the following formula:
C = 100 - review deductions - user complaint deductions;
where the criteria for review deductions include at least one of the following:
violating laws and regulations, violating social morality, and containing harmful information.
Optionally, the author scoring statistics module is further adapted to:
compute, from the parsed user logs and the author logs, the evaluation indicators of each author's article output performance and the evaluation indicators of the author's video output performance within the specific time period;
calculate the author's overall article score and overall video score from the article evaluation indicators and the video evaluation indicators, respectively; and
compute a weighted sum of the author's overall article score and overall video score to obtain the author's comprehensive evaluation score.
Optionally, the author scoring statistics module is further adapted to:
compute, from the parsed user logs and the author logs, the evaluation indicators of each author's article output performance and video output performance in each field within the specific time period;
calculate the author's article score and video score in each field from the author's article evaluation indicators and video evaluation indicators in that field, respectively;
compute a weighted sum of the author's article scores across fields to obtain the author's overall article score; and
compute a weighted sum of the author's video scores across fields to obtain the author's overall video score.
According to yet another aspect of the embodiments of the present invention, a computer storage medium is also provided. The computer storage medium stores computer program code which, when run on a computing device, causes the computing device to execute the scoring method for information flow platform authors according to any of the above.
According to still another aspect of the embodiments of the present invention, a computing device is also provided, comprising:
a processor; and
a memory storing computer program code;
where the computer program code, when run by the processor, causes the computing device to execute the scoring method for information flow platform authors according to any of the above.
In the scoring method and device for information flow platform authors proposed by the embodiments of the present invention, after raw user logs of the information flow platform are obtained from multiple different channels, the multi-source raw user logs are first parsed by a rule parsing engine built from the log format and parsing rules of each channel, yielding parsed user logs. Authors are then scored according to the parsed user logs and the author logs obtained from the back-end database. Parsing user logs of different sources and formats with a rule parsing engine addresses the large data volume, dirty data, and chaotic parsing rules that arise when user logs come from multiple channels. This achieves multi-source data fusion and ensures the accuracy, stability, and availability of the data.
Further, evaluation indicators of each author's output performance within a specific time period are computed from the parsed user logs and the author logs. The evaluation indicators include quality, productivity, popularity, professionalism, and credit rating, and the author's evaluation score is calculated from them. Evaluating an author's output performance with these five indicators provides a fair, objective, and accurate evaluation system that copes with different information sources and is broadly applicable.
The above is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented in accordance with the specification, and so that the above and other objects, features, and advantages of the present invention are more readily apparent, specific embodiments of the present invention are set out below.
From the following detailed description of specific embodiments of the present invention, taken in conjunction with the accompanying drawings, those skilled in the art will better understand the above and other objects, advantages, and features of the present invention.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered limiting of the present invention. Throughout the drawings, the same reference numerals denote the same components. In the drawings:
Fig. 1 shows a schematic diagram of an application scenario of a prior-art method that evaluates authors using a machine learning algorithm;
Fig. 2 shows a flowchart of a scoring method for information flow platform authors according to an embodiment of the present invention;
Fig. 3 shows a flow diagram of a scoring method for information flow platform authors according to another embodiment of the present invention;
Fig. 4 shows a flow diagram of the data flow in the scoring method for information flow platform authors shown in Fig. 3;
Fig. 5 shows a flow diagram of calculating an author's evaluation indicators and final evaluation score in a scoring method for information flow platform authors according to another embodiment of the present invention;
Fig. 6 shows a structural diagram of a scoring device for information flow platform authors according to an embodiment of the present invention; and
Fig. 7 shows a structural diagram of a scoring device for information flow platform authors according to another embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and its scope fully conveyed to those skilled in the art.
Information flow platforms currently have very large user bases. To give users a good reading experience, platform authors need to be evaluated on the basis of user behavior data so that author quality can be controlled. Author evaluations can in turn be used to recommend and distribute content to users and to provide incentives for authors. However, in the prior art, the different user behavior data obtained from multiple products and dimensions is huge in volume, governed by chaotic rules, processed through cumbersome pipelines, subject to delays and anomalies, and shaped by differing business requirements. As a result, accurate and stable multi-source data fusion cannot be achieved and data availability is poor.
Further, in the prior art, author performance is usually evaluated with machine learning algorithms. Fig. 1 shows a schematic diagram of an application scenario of a prior-art method that evaluates authors using a machine learning algorithm, in which the author to be evaluated is a public account and the quality scoring of public accounts is implemented with a regression model. As shown in Fig. 1, in this author evaluation method, scores are first labeled for authors according to user behavior data, and the author data is divided into training sample data and test sample data. Manual feature engineering analysis is then performed on the training sample data: each statistical indicator of an author is screened according to feature importance, and the screened training sample data is fed into regression models for learning, which completes the training and yields multiple regression models. Next, manual feature engineering analysis is likewise performed on the test sample data, the regression models are evaluated, and the best regression model is selected. The regression model is then iteratively optimized to obtain the final regression model. Finally, the final regression model is used to predict score values for the public accounts to be evaluated. However, the inventors found that evaluating author performance with a regression model in effect learns the rule by which labels were assigned to authors. A regression model always has some error rate and cannot guarantee 100% accuracy, so the result is unfair to the authors whose scores are predicted incorrectly.
In order to solve the above technical problems, the embodiment of the present invention proposes the methods of marking of information flow platform author a kind of.Fig. 2 shows The flow chart of the methods of marking of information flow platform author according to an embodiment of the invention is gone out.Referring to fig. 2, this method at least may be used To include the following steps S202 to step S208.
Step S202 obtains user's original log of information flow platform from multiple and different channels.
Step S204 parses user's original log by rule parsing engine, the user journal after being parsed, Wherein the rule parsing engine is constructed according to the respective log resolution rules of different channels.
Step S206 obtains author's log from the background data base of information flow platform.
Step S208, according to after parsing user journal and author's log score the author of information flow platform.
In the scoring method proposed by the embodiment of the invention, after raw user logs of the information flow platform are obtained from multiple different channels, the multi-source raw user logs are first parsed by a rule parsing engine built according to the log format and parsing rules of each channel, yielding parsed user logs. The authors are then scored according to the parsed user logs and the author logs obtained from the back-end database. Using the rule parsing engine to parse user logs of different sources and formats solves the problems that arise when user logs come from multiple channels, namely large data volume, dirty data, and confused parsing rules during data parsing. This achieves multi-source data fusion and ensures the accuracy, stability, and availability of the data.
To evaluate authors more comprehensively, more complete user behavior data must be obtained from multiple different channels. When user data comes from different channels, the recording form (or format) of the user logs differs from channel to channel, and so do the parsing rules; conventional log parsing therefore inevitably runs into dirty data and confused parsing rules. The present invention solves this problem by building the rule parsing engine in advance according to the log parsing rules of each channel. After the user logs are collected from multiple different channels in step S202, the rule parsing engine parses them in step S204 according to the parsing rules for each log source, achieving multi-source data fusion and ensuring the accuracy of the user data.
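A rule parsing engine of this kind can be sketched as a registry that maps each channel to its own parsing rule and converts every raw line into one unified record. This is a minimal sketch only: the two log formats, the field names, and the class name RuleParsingEngine are assumptions made for illustration and are not specified in this description.

```python
import json
from typing import Callable, Dict

# Hypothetical unified record with keys: channel, user_id, url, action.
Record = Dict[str, str]
Parser = Callable[[str], Record]


def parse_android(line: str) -> Record:
    # Assumed format: tab-separated "user_id<TAB>url<TAB>action".
    user_id, url, action = line.rstrip("\n").split("\t")
    return {"channel": "android", "user_id": user_id, "url": url, "action": action}


def parse_windows(line: str) -> Record:
    # Assumed format: one JSON object per line with keys uid, link, event.
    raw = json.loads(line)
    return {"channel": "windows", "user_id": raw["uid"],
            "url": raw["link"], "action": raw["event"]}


class RuleParsingEngine:
    """Dispatches each raw log line to the rule registered for its channel."""

    def __init__(self) -> None:
        self._rules: Dict[str, Parser] = {}

    def register(self, channel: str, parser: Parser) -> None:
        self._rules[channel] = parser

    def parse(self, channel: str, line: str) -> Record:
        if channel not in self._rules:
            raise KeyError(f"no parsing rule registered for channel {channel!r}")
        return self._rules[channel](line)


engine = RuleParsingEngine()
engine.register("android", parse_android)
engine.register("windows", parse_windows)
```

Supporting a further channel then only requires registering one more rule; the downstream statistics see the same record shape regardless of the source.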
The different channels mentioned above may include mobile application (APP) clients and/or PC APP clients, for example a mobile APP client on the Android system, a mobile APP client on the iOS system, a PC APP client on the Windows operating system, a PC APP client on the Linux operating system, and so on. The user logs record data such as users' clicks on, views of, and comments on articles/videos.
In step S206 above, author logs are obtained from the back-end database of the information flow platform. The author logs record information such as the quantity, time, and field of the articles/videos that each author publishes.
In an alternative embodiment of the invention, after the raw user logs have been parsed into parsed user logs and the author logs have been obtained from the back-end database, the scoring method may further include the following step:
Save the parsed user logs and the obtained author logs to the Hadoop distributed file system.
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It is deployed with the Hadoop tooling, and its main advantage is improved read efficiency for clients. An HDFS cluster consists of one NameNode running on the master and multiple DataNodes running on the slaves. The NameNode manages the file system namespace and client access to the file system, while the DataNodes manage the stored data. Files are stored on DataNodes as blocks. A replica count is set for each block, and replicas of the same block are stored on different DataNodes for redundancy, which prevents data loss when the disk of a single DataNode fails. HDFS is therefore highly fault-tolerant and is designed to be deployed on low-cost hardware. It also provides high-throughput access to application data, gives massive data failure-tolerant storage, and makes applications that process large data sets much more convenient.
In the embodiment of the invention, saving the parsed user logs and the obtained author logs in HDFS provides highly fault-tolerant, read-efficient processing for massive data, which ensures the stability of the data.
Further, when the parsed user logs and the author logs are stored in HDFS, the parsed user logs are associated with the author logs through the uniform resource locator (URL) of the articles/videos (also called news) published by the author, which improves the efficiency of reading and processing data during subsequent log statistics. Note that "news" here is to be understood in the broad sense of information, such as hot news, entertainment information, and social information, and not only news events broadcast on television or the Internet.
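The URL-based association can be illustrated with a small in-memory join. This is a sketch under assumptions: the record keys url and author_id and the function name associate_by_url are hypothetical, and a real deployment would perform the join over files in HDFS rather than over Python lists.

```python
from collections import defaultdict
from typing import Dict, List


def associate_by_url(user_logs: List[dict], author_logs: List[dict]) -> Dict[str, List[dict]]:
    """Group parsed user-log records under the author who published each URL."""
    # Each author-log entry is assumed to carry the URL of one published item.
    url_to_author = {entry["url"]: entry["author_id"] for entry in author_logs}
    per_author: Dict[str, List[dict]] = defaultdict(list)
    for record in user_logs:
        author_id = url_to_author.get(record["url"])
        if author_id is not None:  # skip records whose URL has no known author
            per_author[author_id].append(record)
    return dict(per_author)
```

With the records grouped per author, the later statistics steps can read all user behavior for one author in a single pass.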
In step S208 above, the authors of the information flow platform are scored according to the parsed user logs and the author logs, so that the authors' output performance can be managed.
In an alternative embodiment of the invention, step S208 may be implemented as the following step:
Compute, from the parsed user logs and the author logs, the evaluation indicators of each author's output performance within a specific time period, and calculate the author's evaluation score from those indicators.
The specific time period may be set to any target period according to statistical needs, such as the most recent week, month, or year.
Further, the evaluation indicators mentioned above may include quality Q(X), production capacity P1, popularity P2, professionalism P3, and credit rating C. In this case, the author's evaluation score may be calculated as a weighted sum of these five indicators according to the following formula:
Score = a1 × Q(X) + a2 × P1 + a3 × P2 + a4 × P3 + a5 × C;
where a1, a2, a3, a4, and a5 are the weights of quality Q(X), production capacity P1, popularity P2, professionalism P3, and credit rating C, respectively.
In a specific embodiment, a1, a2, a3, a4, and a5 may all be set to 1, in which case the formula for the author's evaluation score becomes:
Score = Q(X) + P1 + P2 + P3 + C.
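The weighted sum above can be written as a short function. In this sketch the dictionary keys and the function name evaluation_score are illustrative assumptions; omitting the weights corresponds to the specific embodiment in which a1 to a5 are all 1.

```python
from typing import Dict, Optional

# Hypothetical keys standing for Q(X), P1, P2, P3 and C, in that order.
INDICATORS = ("quality", "capacity", "popularity", "professionalism", "credit")


def evaluation_score(indicators: Dict[str, float],
                     weights: Optional[Dict[str, float]] = None) -> float:
    """Score = a1*Q(X) + a2*P1 + a3*P2 + a4*P3 + a5*C."""
    if weights is None:
        # Specific embodiment: every weight is set to 1.
        weights = {name: 1.0 for name in INDICATORS}
    return sum(weights[name] * indicators[name] for name in INDICATORS)
```

For example, indicator values of 0.5, 0.25, 0.25, 1.0 and 1.0 with unit weights sum to 3.0.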
Evaluating an author's output performance with the five indicators of quality, production capacity, popularity, professionalism, and credit rating provides a fair, objective, and accurate evaluation system that copes well with different information sources and is broadly applicable.
In an alternative embodiment of the invention, after the authors of the information flow platform are scored in step S208, the method may further include the following step:
Save the scoring results for the authors of the information flow platform in a MySQL data table.
MySQL is a widely used relational database management system. A relational database stores data in separate tables, which improves processing speed and management flexibility. Saving the calculated results in a MySQL data table facilitates subsequent reading and use of the evaluation data and ensures the availability of the data.
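The save-and-read-back pattern can be sketched as follows. To keep the example self-contained it uses Python's built-in sqlite3 module as a stand-in for MySQL, so it is not the storage layer described here; the table name author_score, its columns, and both function names are assumptions. The INSERT OR REPLACE statement is SQLite syntax, and a MySQL table would need the equivalent upsert statement of that database.

```python
import sqlite3
from typing import Dict, Optional


def save_scores(conn: sqlite3.Connection, scores: Dict[str, float]) -> None:
    """Create the (hypothetical) score table if needed and upsert one row per author."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS author_score ("
        "author_id TEXT PRIMARY KEY, score REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO author_score (author_id, score) VALUES (?, ?)",
        list(scores.items()),
    )
    conn.commit()


def load_score(conn: sqlite3.Connection, author_id: str) -> Optional[float]:
    """Return the stored score for an author, or None if the author is unknown."""
    row = conn.execute(
        "SELECT score FROM author_score WHERE author_id = ?", (author_id,)
    ).fetchone()
    return None if row is None else row[0]
```

Keying the table on the author identifier means a recomputed score overwrites the previous one, so readers always see the latest result.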
The computation of the five evaluation indicators, quality Q(X), production capacity P1, popularity P2, professionalism P3, and credit rating C, is described further below.
(1) Quality Q(X)
Quality Q(X) characterizes how good the articles/videos published by each author are. In an optional embodiment, an author's quality may be computed from the user evaluation parameters of the articles/videos the author has published. These user evaluation parameters may include one or more of article/video reading/viewing duration, user click data, user share data, user comment data, user favorite data, and user like data.
In a preferred embodiment, the quality Q(X) of each author may be computed according to the following formula:
Q(X) = conversion rate + reading/viewing duration + log(average performance) + log(best performance).
In the formula above, the conversion rate is the conversion data of all articles/videos published by the author, defined as: conversion rate = click rate + share rate + comment rate + favorite rate + like rate - dislike rate.
The reading/viewing duration is the time users spend reading/viewing all articles/videos published by the author.
The average performance is the average of the user evaluation data over all articles/videos published by the author, defined as: average performance = average clicks + average shares + average comments + average favorites + average likes.
The best performance is the highest user evaluation data among all articles/videos published by the author, defined as: best performance = highest clicks + highest shares + highest comments + highest favorites + highest likes.
The conversion rate, reading/viewing duration, average performance, and best performance must all be normalized. Normalization is a way of removing dimensions: a dimensional expression is transformed into a dimensionless one, that is, a scalar, which simplifies calculation and reduces magnitude.
Further, because quantities such as clicks, shares, and comments are usually large, the logarithm of the average performance and of the best performance is taken before normalization to reduce their order of magnitude and further simplify calculation. Specifically, the logarithm may be taken to base e or base 10.
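The quality formula, with logarithms taken before normalization, can be sketched as below. The description does not say which normalization is used, so min-max scaling across all authors is an assumption here, as are the field names, the base-10 logarithm, and the requirement that the performance values be positive.

```python
import math
from typing import Dict, List


def min_max(values: List[float]) -> List[float]:
    """Min-max normalize to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def quality_scores(authors: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Q(X) = conversion rate + duration + log(average) + log(best), each term normalized."""
    names = list(authors)
    columns = [
        [authors[n]["conversion_rate"] for n in names],
        [authors[n]["duration"] for n in names],
        # Logarithms are taken before normalization; inputs are assumed > 0.
        [math.log10(authors[n]["average_performance"]) for n in names],
        [math.log10(authors[n]["best_performance"]) for n in names],
    ]
    normalized = [min_max(column) for column in columns]
    return {name: sum(col[i] for col in normalized) for i, name in enumerate(names)}
```

Under this assumed normalization each of the four terms lies between 0 and 1, so Q(X) lies between 0 and 4 and no single large count dominates the result.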
(2) Production capacity P1
Production capacity P1 characterizes the author's output efficiency. In an optional embodiment, an author's production capacity may be computed from the number of articles/videos the author has published and the author's publishing efficiency.
In a preferred embodiment, the production capacity P1 of the author may be computed according to the following formula:
P1 = log(published quantity) + publishing efficiency.
In the formula above, the published quantity is the total number of articles or videos published by the author.
The publishing efficiency is the sum of two ratios: the number of days in a designated time period on which the author published an article or video divided by the total number of days in that period, plus the number of weeks in the designated time period in which the author published an article or video divided by the total number of weeks in that period. The designated time period may be the same as or different from the specific time period mentioned above.
In a specific embodiment, the designated time period may be one month. In this case, the publishing efficiency can be expressed as:
publishing efficiency = days with publications in the month / days in the month + weeks with publications in the month / weeks in the month.
The published quantity and the publishing efficiency must both be normalized. Further, the logarithm of the published quantity is taken before normalization to reduce its order of magnitude.
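The two terms of the production capacity formula can be computed as follows before normalization. This sketch leaves the normalization step out, uses a base-10 logarithm, and assumes a published quantity greater than zero; the function and parameter names are illustrative only.

```python
import math


def publishing_efficiency(active_days: int, total_days: int,
                          active_weeks: int, total_weeks: int) -> float:
    """Share of days with publications plus share of weeks with publications."""
    return active_days / total_days + active_weeks / total_weeks


def raw_production_capacity(published_count: int, efficiency: float) -> float:
    """log(published quantity) + publishing efficiency, before normalization."""
    return math.log10(published_count) + efficiency
```

For instance, an author who published on 15 days of a 30-day month and in 2 of its 4 weeks has a publishing efficiency of 0.5 + 0.5 = 1.0. Counting weeks as well as days rewards authors who publish steadily over those who publish in one burst.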
(3) Popularity P2
Popularity P2 (also called heat) characterizes how welcome or how closely followed an author is. In an optional embodiment, an author's popularity may be computed from the user follow count, the user view count (or media view count), and the user share count (or media share count) of the articles/videos the author has published.
In a preferred embodiment, the popularity P2 of the author may be computed according to the following formula:
P2 = log(user follow count) + log(user view count) + log(user share count).
The user follow count, user view count, and user share count must all be normalized. Further, their logarithms are taken before normalization to reduce their order of magnitude.
(4) Professionalism P3
Professionalism P3 characterizes the author's influence in different fields. In an optional embodiment, an author's professionalism may be computed from the author's quality and production capacity in each field.
In a preferred embodiment, the professionalism P3 of the author in each field may be computed according to the following formula:
P3 = (sum of quality and production capacity in a given field) / (sum of quality and production capacity over all fields).
The fields may be divided according to actual needs, for example into current affairs, sports, entertainment, science and technology, and so on.
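The per-field ratio can be sketched as follows. The input layout, a mapping from field name to a (quality, production capacity) pair, and the handling of an author with no output in any field are assumptions made for this illustration.

```python
from typing import Dict, Tuple


def professionalism(per_field: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """P3 per field = (quality + capacity in the field) / (same sum over all fields)."""
    total = sum(quality + capacity for quality, capacity in per_field.values())
    if total == 0:
        # Assumed convention: an author with no output has zero professionalism everywhere.
        return {field: 0.0 for field in per_field}
    return {field: (quality + capacity) / total
            for field, (quality, capacity) in per_field.items()}
```

An author with a combined value of 2 in sports and 6 in technology would get P3 values of 0.25 and 0.75; the values over all fields always add up to 1.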
(5) Credit rating C
Credit rating C is intended to encourage original work and to penalize violations.
In a preferred embodiment, the credit rating C of each author may be computed according to the following formula:
C = 100 - review deductions - user complaint deductions.
The criteria for review deductions may include at least one of: violating laws and regulations, violating social ethics, and containing harmful information (such as pornographic content).
The deduction values for review deductions and user complaint deductions may be set by the platform itself. For example, 10 points are deducted for violating laws and regulations, 5 points for containing harmful information, and 3 points for each user complaint.
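Using the example deduction values just given, the credit rating can be computed as below. The violation labels illegal and harmful and the function name are hypothetical. No deduction value is given above for violating social ethics, so it is left out of the default table and would have to be supplied by the platform. Whether C is clamped at zero is not stated, so this sketch does not clamp.

```python
from typing import Dict, Iterable, Optional

# Example values from the text: 10 points for violating laws and regulations,
# 5 points for harmful information, 3 points per user complaint.
DEFAULT_REVIEW_DEDUCTIONS = {"illegal": 10, "harmful": 5}
DEFAULT_COMPLAINT_DEDUCTION = 3


def credit_rating(violations: Iterable[str], complaint_count: int,
                  review_deductions: Optional[Dict[str, int]] = None,
                  complaint_deduction: int = DEFAULT_COMPLAINT_DEDUCTION) -> int:
    """C = 100 - review deductions - user complaint deductions."""
    table = DEFAULT_REVIEW_DEDUCTIONS if review_deductions is None else review_deductions
    review_total = sum(table[violation] for violation in violations)
    return 100 - review_total - complaint_deduction * complaint_count
```

An author with one violation of each listed kind and two complaints would receive 100 - 15 - 6 = 79.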
Various implementations of each part of the embodiment shown in Fig. 2 have been described above. The implementation of the scoring method for information flow platform authors is described in detail below through a specific embodiment.
Fig. 3 shows a flow diagram of the scoring method for information flow platform authors according to another embodiment of the invention. The scoring method of this embodiment is described with reference to Fig. 3. As shown in Fig. 3, the scoring method may include the following steps:
Step 1: log collection.
In this step, raw user logs are collected from multiple different channels and, at the same time, author logs are collected from the back-end database of the information flow platform.
In this embodiment, channels 1 to 4 shown in Fig. 3 may be, respectively, a mobile APP client on the Android system, a mobile APP client on the iOS system, a PC APP client on the Windows operating system, and a PC APP client on the Linux operating system. Note that the number and names of the channels from which user logs are collected in Fig. 3 are only illustrative; the invention is not limited to them.
Step 2: log parsing.
Because user data comes from diverse sources, the rule parsing engine must be built according to the log formats and log parsing rules of the different channels. In this step, the rule parsing engine parses the collected raw user logs according to the parsing rules for each log source, achieving multi-source data fusion and ensuring the accuracy of the parsed data.
Step 3: log storage.
After the raw user logs are parsed, the parsed user logs and the collected author logs are stored in HDFS, which provides highly fault-tolerant, read-efficient processing for massive data and ensures the stability of the data. During log storage, the user logs and author logs are associated through the news URL. Note that "news" here is to be understood as information published by authors in the broad sense, such as hot news, entertainment information, and social information, and not only news events broadcast on television or the Internet.
Step 4: log statistics.
In this step, the author evaluation system model constructed by the invention is used to compute, from the parsed user logs and the author logs, the evaluation indicators of each author's output performance within the specific time period.
In the author evaluation system model of the invention, the evaluation indicators of an author's output performance cover five dimensions: quality Q(X), production capacity P1, popularity P2, professionalism P3, and credit rating C. Each is computed as described above.
Step 5: score calculation.
In this step, the author's evaluation score is calculated as a weighted sum of quality Q(X), production capacity P1, popularity P2, professionalism P3, and credit rating C.
Further, the evaluation indicators obtained from log statistics and the calculated evaluation scores of the authors are saved in a MySQL data table, which facilitates subsequent reading and use of the evaluation data and ensures the availability of the data.
Further, after the authors' evaluation result data is stored, a corresponding application interface may be developed on the platform. The score data in the MySQL data table is read through this interface and used to recommend and distribute content to users, or to provide incentives to authors through suitable operational measures.
By building the rule parsing engine, the embodiment of the invention achieves multi-source data fusion and effectively solves the problems of large data volume, dirty data, and confused rules during data parsing, ensuring the accuracy, stability, and availability of the data. It also provides a general evaluation system for information flow platform authors that ensures fair, objective, and accurate evaluation of authors, effectively supports algorithmic content distribution, and can increase authors' enthusiasm for creating content through suitable operational measures.
Fig. 4 further shows a schematic diagram of the data flow in the scoring method for information flow platform authors shown in Fig. 3. The data flow of the scoring method of the embodiment of the invention is described below with reference to Fig. 4.
As shown in Fig. 4, after the user logs and author logs are collected, the platform first parses the collected user logs with the rule parsing engine and stores the parsed user logs and the collected author logs in HDFS as periodic raw data.
Then, the data analysis part obtains the periodic raw data (that is, the parsed user logs and the author logs) from HDFS, runs the author evaluation system model on it to obtain score data (including the evaluation indicators of each author's output performance and the author's final evaluation score), and saves the score data in a MySQL data table as evaluation model data in the data source.
Finally, a corresponding application interface is developed on the platform. The score data in the MySQL data table is read through this interface and used to recommend and distribute content to users, or to provide incentives to authors through suitable operational measures.
Fig. 5 shows a flow diagram of calculating an author's evaluation indicators and final evaluation score in the scoring method for information flow platform authors according to another embodiment of the invention. This process is described below with reference to Fig. 5.
As shown in Fig. 5, after the raw user logs and author logs are obtained and the raw user logs are parsed into parsed user logs, the evaluation indicators of each author's article output performance and of each author's video output performance within the specific time period are first computed separately from the parsed user logs and the author logs. Specifically, the evaluation indicators of the author's article output performance within the specific time period, including quality Q(X), production capacity P1, popularity P2, professionalism P3, and credit rating C, are computed from the parsed user logs and author logs corresponding to the articles the author published. Likewise, the evaluation indicators of the author's video output performance within the specific time period, including the same five indicators, are computed from the parsed user logs and author logs corresponding to the videos the author published.
Then, the author's overall article evaluation score and overall video evaluation score are calculated from the evaluation indicators of the author's article output performance and video output performance, respectively.
Finally, the author's overall article evaluation score and overall video evaluation score are combined in a weighted sum to obtain the author's comprehensive evaluation score. The weights of the two scores may be set according to actual needs; the invention places no limit on them.
More preferably, still referring to Fig. 5, computing the evaluation indicators of each author's article output performance within the specific time period may be implemented as: computing, from the parsed user logs and author logs corresponding to the articles the author published, the evaluation indicators of the author's article output performance in each field within the specific time period. Calculating each author's overall article evaluation score may then be implemented as: calculating the author's article evaluation score in each field from the evaluation indicators of the author's article output performance in that field, and then combining the author's article evaluation scores across fields in a weighted sum to obtain the author's overall article evaluation score. The fields may include current affairs, sports, entertainment, science and technology, and so on. The weight of the article evaluation score in each field may be set according to actual needs; the invention places no limit on it.
Similarly, computing the evaluation indicators of each author's video output performance within the specific time period may be implemented as: computing, from the parsed user logs and author logs corresponding to the videos the author published, the evaluation indicators of the author's video output performance in each field within the specific time period. Calculating each author's overall video evaluation score may then be implemented as: calculating the author's video evaluation score in each field from the evaluation indicators of the author's video output performance in that field, and then combining the author's video evaluation scores across fields in a weighted sum to obtain the author's overall video evaluation score. The fields may include current affairs, sports, entertainment, science and technology, and so on. The weight of the video evaluation score in each field may be set according to actual needs; the invention places no limit on it.
Evaluating an author's article output and video output separately and then evaluating the author as a whole yields a more objective and accurate evaluation of the author. Further, computing the evaluation indicators and overall evaluation scores of the author's article and video output field by field further simplifies the calculation.
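The two-level aggregation of Fig. 5 (fields first, then articles and videos) can be sketched as two nested weighted sums. All weights are left to actual needs in this description, so the weight values in the example and the names used here are assumptions for illustration.

```python
from typing import Dict


def weighted_sum(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Sum of weight * score over every key present in scores."""
    return sum(weights[key] * value for key, value in scores.items())


def comprehensive_score(article_by_field: Dict[str, float],
                        video_by_field: Dict[str, float],
                        field_weights: Dict[str, float],
                        media_weights: Dict[str, float]) -> float:
    """Aggregate per-field scores per media type, then combine the two totals."""
    article_total = weighted_sum(article_by_field, field_weights)
    video_total = weighted_sum(video_by_field, field_weights)
    return weighted_sum({"article": article_total, "video": video_total}, media_weights)
```

For example, with article scores of 2.0 and 4.0 and video scores of 1.0 and 3.0 in two fields, field weights of 0.5 each give totals of 3.0 and 2.0, and media weights of 0.5 each give a comprehensive score of 2.5.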
Based on the same inventive concept, an embodiment of the invention also provides a scoring apparatus for information flow platform authors, which supports the scoring method provided by any one of the above embodiments or a combination of them. Fig. 6 shows a structural diagram of the scoring apparatus according to an embodiment of the invention. Referring to Fig. 6, the apparatus may include at least: a user log acquisition module 610, a user log parsing module 620, an author log acquisition module 630, and an author scoring statistics module 640.
The functions of the components of the scoring apparatus and the connections between them are now described:
The user log acquisition module 610 is adapted to obtain raw user logs of the information flow platform from multiple different channels.
The user log parsing module 620 is connected to the user log acquisition module 610 and is adapted to parse the raw user logs with a rule parsing engine to obtain parsed user logs, where the rule parsing engine is built according to the log parsing rules of each channel.
The author log acquisition module 630 is adapted to obtain author logs from the back-end database of the information flow platform.
The author scoring statistics module 640 is connected to the user log parsing module 620 and the author log acquisition module 630, and is adapted to score the authors of the information flow platform according to the parsed user logs and the author logs.
In an alternative embodiment, the different channels mentioned above include mobile application (APP) clients and/or PC APP clients.
In an alternative embodiment, as shown in Fig. 7, the scoring apparatus shown in Fig. 6 may further include a log data saving module 750. The log data saving module 750 is connected to the user log parsing module 620, the author log acquisition module 630, and the author scoring statistics module 640, and is adapted to save the parsed user logs and the obtained author logs to the Hadoop distributed file system. The author scoring statistics module 640 then obtains the parsed user logs and author logs from the Hadoop distributed file system to carry out author scoring statistics.
In an alternative embodiment, when the parsed user logs and the author logs are stored in HDFS, they are associated through the uniform resource locator (URL) of the articles/videos published by the author.
In an alternative embodiment, still referring to Fig. 7, the scoring apparatus may further include a scoring result saving module 760. The scoring result saving module 760 is connected to the author scoring statistics module 640 and is adapted to save the scoring results for the authors of the information flow platform in a MySQL data table after the author scoring statistics module 640 has scored the authors.
In an alternative embodiment, the author scoring statistics module 640 is further adapted to:
Compute, from the parsed user logs and the author logs, the evaluation indicators of each author's output performance within a specific time period, and calculate the author's evaluation score from those indicators.
In an alternative embodiment, the evaluation indicators of an author's output performance include quality, production capacity, popularity, professionalism, and credit rating.
Correspondingly, the author scoring statistics module 640 is further adapted to:
Combine each author's quality, production capacity, popularity, professionalism, and credit rating in a weighted sum to obtain the author's evaluation score.
In an alternative embodiment, the author scoring statistics module 640 is further adapted to:
Compute an author's quality from the user evaluation parameters of the articles/videos the author has published, where the user evaluation parameters include one or more of reading/viewing duration, user click data, user share data, user comment data, user favorite data, and user like data.
In an alternative embodiment, the author scoring statistics module 640 is further adapted to:
Compute the quality Q(X) of each author according to the following formula:
Q(X) = conversion rate + reading/viewing duration + log(average performance) + log(best performance);
where conversion rate = click rate + share rate + comment rate + favorite rate + like rate - dislike rate,
average performance = average clicks + average shares + average comments + average favorites + average likes,
best performance = highest clicks + highest shares + highest comments + highest favorites + highest likes,
and the conversion rate, reading/viewing duration, log(average performance), and log(best performance) are all normalized.
In an alternative embodiment, production capacity characterizes the author's output efficiency. Correspondingly, the author scoring statistics module 640 is further adapted to:
Compute an author's production capacity from the number of articles/videos the author has published and the author's publishing efficiency.
In an alternative embodiment, the author scoring statistics module 640 is further adapted to:
Compute the production capacity P1 of each author according to the following formula:
P1 = log(published quantity) + publishing efficiency;
where the published quantity is the total number of articles or videos published by the author,
the publishing efficiency is the sum of two ratios: the number of days in a designated time period on which the author published an article or video divided by the total number of days in that period, plus the number of weeks in the designated time period in which the author published an article or video divided by the total number of weeks in that period,
and log(published quantity) and the publishing efficiency are both normalized.
In an alternative embodiment, the designated time period mentioned above is one month.
In an alternative embodiment, the author scoring statistics module 640 is further adapted to:
Compute an author's popularity from the user follow count, user view count, and user share count of the articles/videos the author has published.
In an alternative embodiment, the author scoring statistics module 640 is further adapted to:
Compute the popularity P2 of each author according to the following formula:
P2 = log(user follow count) + log(user view count) + log(user share count);
where log(user follow count), log(user view count), and log(user share count) are all normalized.
In an alternative embodiment, professionalism characterizes the author's influence in different fields. Correspondingly, the author scoring statistics module 640 is further adapted to:
Compute an author's professionalism from the author's quality and production capacity in each field.
In an alternative embodiment, the author scoring statistics module 640 is further adapted to:
Compute the professionalism P3 of each author in each field according to the following formula:
P3 = (sum of quality and production capacity in a given field) / (sum of quality and production capacity over all fields).
In an alternative embodiment, the author scoring statistics module 640 is further adapted to:
Compute the credit rating C of each author according to the following formula:
C = 100 - review deductions - user complaint deductions;
where the criteria for review deductions include at least one of the following:
violating laws and regulations, violating social ethics, containing harmful information.
In an alternative embodiment, the author scoring statistics module 640 is further adapted to:
Compute, from the parsed user logs and the author logs, the evaluation indices of each author's article output performance and the evaluation indices of each author's video output performance within a specific time period;
Calculate the author's overall article evaluation score and overall video evaluation score from the evaluation indices of the author's article output performance and video output performance, respectively;
Compute a weighted sum of the author's overall article evaluation score and overall video evaluation score to obtain the author's comprehensive evaluation score.
In an alternative embodiment, the author scoring statistics module 640 is further adapted to:
Compute, from the parsed user logs and the author logs, the evaluation indices of each author's article output performance and video output performance in each field within a specific time period;
Calculate the author's article evaluation score and video evaluation score in each field from the evaluation indices of the author's article output performance and video output performance in that field, respectively;
Compute a weighted sum of the author's article evaluation scores across the fields to obtain the author's overall article evaluation score;
Compute a weighted sum of the author's video evaluation scores across the fields to obtain the author's overall video evaluation score.
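The two-level aggregation described in the embodiments above (per-field scores into article and video totals, then those totals into one comprehensive score) can be sketched as below. All scores and weights are hypothetical; the source does not disclose the weights.

```python
def weighted_sum(scores, weights):
    """Sum of scores[k] * weights[k] over the keys of `scores`."""
    return sum(scores[k] * weights[k] for k in scores)

# Hypothetical per-field scores for one author and hypothetical field weights.
article_by_field = {"tech": 80.0, "sports": 40.0}
video_by_field = {"tech": 60.0, "sports": 20.0}
field_weights = {"tech": 0.75, "sports": 0.25}

# Level 1: combine per-field scores into one article score and one video score.
article_overall = weighted_sum(article_by_field, field_weights)
video_overall = weighted_sum(video_by_field, field_weights)

# Level 2: combine the article and video scores into the comprehensive score.
overall = weighted_sum(
    {"article": article_overall, "video": video_overall},
    {"article": 0.5, "video": 0.5},
)
```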
Based on the same inventive concept, an embodiment of the present invention further provides a computer storage medium. The computer storage medium stores computer program code which, when run on a computing device, causes the computing device to execute the scoring method for authors of an information flow platform according to any one of the above embodiments or a combination thereof.
Based on the same inventive concept, an embodiment of the present invention further provides a computing device. The computing device may include:
a processor; and
a memory storing computer program code;
when the computer program code is run by the processor, the computing device is caused to execute the scoring method for authors of an information flow platform according to any one of the above embodiments or a combination thereof.
According to any one of the above optional embodiments or a combination of multiple optional embodiments, the embodiments of the present invention can achieve the following beneficial effects:
In the scoring method and device for authors of an information flow platform proposed by the embodiments of the present invention, after raw user logs of the information flow platform are obtained from a plurality of different channels, the multi-source raw user logs are first parsed with a rule parsing engine built according to the respective log formats and parsing rules of the different channels, yielding parsed user logs; the authors are then scored according to the parsed user logs and the author logs obtained from the back-end database. Parsing user logs of different sources and formats with the rule parsing engine addresses the problems that arise during parsing when user logs come from multiple channels, namely large data volumes, dirty data, and chaotic parsing rules; it achieves multi-source data fusion and ensures the accuracy, stability, and availability of the data.
Further, evaluation indices of each author's output performance within a specific time period are computed from the parsed user logs and the author logs, the evaluation indices including quality, production capacity, popularity, professionalism, and credit rating, and the author's evaluation score is calculated from these indices. Evaluating an author's output performance with the five indices of quality, production capacity, popularity, professionalism, and credit rating provides a fair, objective, and accurate evaluation system that copes effectively with different information sources and is broadly applicable.
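One way to picture a rule parsing engine of the kind described above is a registry that maps each channel to its own parsing rule and emits records in one common schema. The sketch below is only an illustration: the two log formats (JSON lines for the mobile client, tab-separated fields for the PC client), the field names, and the function names are all hypothetical and are not taken from the source.

```python
import json

def parse_mobile(line):
    """Hypothetical mobile-app format: one JSON object per line."""
    rec = json.loads(line)
    return {"url": rec["url"], "action": rec["act"], "user": rec["uid"]}

def parse_pc(line):
    """Hypothetical PC-client format: tab-separated user, action, URL."""
    user, action, url = line.rstrip("\n").split("\t")
    return {"url": url, "action": action, "user": user}

# The "engine" here is a registry of per-channel parsing rules.
RULES = {"mobile": parse_mobile, "pc": parse_pc}

def parse_logs(raw_logs):
    """raw_logs: iterable of (channel, line). Yields records in one common schema.
    Lines with no matching rule, or that fail to parse, are skipped as dirty data."""
    for channel, line in raw_logs:
        rule = RULES.get(channel)
        if rule is None:
            continue
        try:
            yield rule(line)
        except (ValueError, KeyError, TypeError):
            continue

raw = [
    ("mobile", '{"url": "https://example.com/a/1", "act": "click", "uid": "u1"}'),
    ("pc", "u2\tshare\thttps://example.com/a/1"),
    ("pc", "malformed line"),
    ("tv", "no rule for this channel"),
]
parsed = list(parse_logs(raw))
```

Keeping each channel's rule in its own function means a new channel is supported by registering one more entry, without touching the existing rules.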
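The weighted combination of the five indices might look like the sketch below. The weights are hypothetical, since the source does not disclose them, and the sketch assumes every index has already been scaled to a comparable range (for example, the credit rating divided by 100).

```python
# Hypothetical weights; the source does not disclose the values used.
WEIGHTS = {
    "quality": 0.3,
    "production_capacity": 0.2,
    "popularity": 0.2,
    "professionalism": 0.2,
    "credit": 0.1,
}

def evaluation_score(indices, weights=WEIGHTS):
    """Weighted sum of the five evaluation indices for one author."""
    missing = set(weights) - set(indices)
    if missing:
        raise KeyError(f"missing indices: {sorted(missing)}")
    return sum(weights[name] * indices[name] for name in weights)

# Assumption: all indices pre-scaled to [0, 1].
score = evaluation_score({
    "quality": 0.8,
    "production_capacity": 0.5,
    "popularity": 0.6,
    "professionalism": 0.4,
    "credit": 0.9,
})
```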
Those skilled in the art will clearly understand that, for the specific working processes of the systems, devices, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; for brevity, they are not repeated here.
In addition, the functional units in the embodiments of the present invention may be physically independent of one another, two or more functional units may be integrated together, or all functional units may be integrated in one processing unit. The integrated functional units may be implemented in hardware, or in software or firmware.
Those of ordinary skill in the art will understand that, if the integrated functional unit is implemented in software and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in whole or in part, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions that, when run, cause a computing device (such as a personal computer, a server, or a network device) to execute all or some of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
Alternatively, all or some of the steps of the foregoing method embodiments may be carried out by hardware directed by program instructions (a computing device such as a personal computer, a server, or a network device). The program instructions may be stored in a computer-readable storage medium; when the program instructions are executed by the processor of the computing device, the computing device executes all or some of the steps of the methods of the embodiments of the present invention.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that, within the spirit and principles of the present invention, the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced with equivalents, and such modifications or replacements do not cause the corresponding technical solutions to depart from the scope of protection of the present invention.
According to one aspect of the embodiments of the present invention, there is provided A1. a scoring method for authors of an information flow platform, comprising:
obtaining raw user logs of the information flow platform from a plurality of different channels;
parsing the raw user logs with a rule parsing engine to obtain parsed user logs, wherein the rule parsing engine is built according to the respective log parsing rules of the different channels;
obtaining author logs from the back-end database of the information flow platform;
scoring the authors of the information flow platform according to the parsed user logs and the author logs.
A2. The method according to A1, wherein the different channels include a mobile device application (APP) client and/or a PC APP client.
A3. The method according to A1 or A2, wherein, after obtaining the author logs from the back-end database of the information flow platform, the method further comprises:
saving the parsed user logs and the author logs to the Hadoop Distributed File System.
A4. The method according to A3, wherein the parsed user logs and the author logs are associated through the uniform resource locator (URL) of the article/video published by the author.
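The URL-based association in A4 amounts to a join between the two logs on the content URL. A minimal in-memory sketch follows; the record fields and function name are hypothetical, and a production system working on Hadoop-scale data would perform the same join with a distributed job rather than Python dictionaries.

```python
from collections import defaultdict

def join_by_url(user_records, author_records):
    """user_records: dicts with "url" and "action".
    author_records: dicts with "url" and "author".
    Returns dict author -> dict action -> count."""
    url_to_author = {r["url"]: r["author"] for r in author_records}
    counts = defaultdict(lambda: defaultdict(int))
    for rec in user_records:
        author = url_to_author.get(rec["url"])
        if author is not None:  # ignore events for URLs missing from the author log
            counts[author][rec["action"]] += 1
    return {a: dict(c) for a, c in counts.items()}

# Hypothetical records.
author_log = [
    {"url": "https://example.com/a/1", "author": "alice"},
    {"url": "https://example.com/v/2", "author": "bob"},
]
user_log = [
    {"url": "https://example.com/a/1", "action": "click"},
    {"url": "https://example.com/a/1", "action": "click"},
    {"url": "https://example.com/v/2", "action": "share"},
    {"url": "https://example.com/unknown", "action": "click"},
]
per_author = join_by_url(user_log, author_log)
```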
A5. The method according to any one of A1-A4, wherein, after scoring the authors of the information flow platform, the method further comprises:
saving the scoring results for the authors of the information flow platform to a MySQL data table.
A6. The method according to any one of A1-A5, wherein scoring the authors of the information flow platform according to the parsed user logs and the author logs comprises:
computing, from the parsed user logs and the author logs, evaluation indices of each author's output performance within a specific time period, and calculating the author's evaluation score from the evaluation indices.
A7. The method according to A6, wherein the evaluation indices include quality, production capacity, popularity, professionalism, and credit rating;
calculating the author's evaluation score from the evaluation indices comprises:
computing a weighted sum of the quality, production capacity, popularity, professionalism, and credit rating to obtain the author's evaluation score.
A8. The method according to A7, wherein the author's quality is computed from user evaluation parameters of the articles/videos published by each author, wherein the user evaluation parameters include one or more of reading/viewing duration, user click data, user share data, user comment data, user favorite data, and user like data.
A9. The method according to A8, wherein the author's quality Q(X) is computed according to the following formula:
Q(X) = conversion rate + reading/viewing duration + log(average performance) + log(best performance);
where conversion rate = click rate + share rate + comment rate + favorite rate + like rate - dislike rate,
average performance = average clicks + average shares + average comments + average favorites + average likes,
best performance = highest clicks + highest shares + highest comments + highest favorites + highest likes,
and the conversion rate, reading/viewing duration, log(average performance), and log(best performance) are each normalized.
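The quality formula in A9 can be sketched as follows. As before, min-max normalization across authors and log(1 + x) are assumptions, because the source specifies neither; the input layout and names are hypothetical.

```python
import math

ACTIONS = ("click", "share", "comment", "favorite", "like")

def min_max(values):
    """Scale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

def raw_terms(author):
    """author: dict with "rates" (including "dislike"), "duration", "avg", "best".
    Returns the four un-normalized terms of Q(X)."""
    rates = author["rates"]
    conversion = sum(rates[a] for a in ACTIONS) - rates["dislike"]
    average = sum(author["avg"][a] for a in ACTIONS)
    best = sum(author["best"][a] for a in ACTIONS)
    # Assumption: log1p keeps zero counts finite.
    return [conversion, author["duration"], math.log1p(average), math.log1p(best)]

def quality_scores(authors):
    """authors: dict name -> author dict. Returns dict name -> Q(X)."""
    names = list(authors)
    rows = [raw_terms(authors[n]) for n in names]
    columns = [min_max([row[i] for row in rows]) for i in range(4)]
    return {n: sum(col[k] for col in columns) for k, n in enumerate(names)}

def uniform(value):
    """Helper for the example: the same value for every action."""
    return {a: value for a in ACTIONS}

# Hypothetical inputs for two authors.
authors = {
    "low": {"rates": {**uniform(0.01), "dislike": 0.01}, "duration": 10.0,
            "avg": uniform(1), "best": uniform(2)},
    "high": {"rates": {**uniform(0.1), "dislike": 0.0}, "duration": 60.0,
             "avg": uniform(10), "best": uniform(100)},
}
q = quality_scores(authors)
```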
A10. The method according to A7, wherein the production capacity characterizes the author's output efficiency;
the author's production capacity is computed from the number of articles/videos published by each author and the publication efficiency.
A11. The method according to A10, wherein the author's production capacity P1 is computed according to the following formula:
P1 = log(publication count) + publication efficiency;
where the publication count is the total number of articles or videos published by the author,
the publication efficiency is the sum of (a) the ratio of the number of days in a specified time period on which the author published an article or video to the total number of days in that period and (b) the ratio of the number of weeks in that period in which the author published an article or video to the total number of weeks in that period,
and log(publication count) and the publication efficiency are both normalized.
A12. The method according to A11, wherein the specified time period is a month.
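The two raw terms of P1 in A11 can be computed from a list of publication dates, as sketched below. Counting weeks as ISO calendar weeks and using log(1 + count) are assumptions; the normalization across authors is left out, so the function returns the un-normalized terms. Function names are hypothetical.

```python
import math
from datetime import date, timedelta

def publication_efficiency(pub_dates, start, end):
    """Active-day ratio plus active-week ratio over [start, end] (inclusive).
    Assumption: weeks are ISO calendar weeks."""
    all_days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    all_weeks = {d.isocalendar()[:2] for d in all_days}
    active_days = {d for d in pub_dates if start <= d <= end}
    active_weeks = {d.isocalendar()[:2] for d in active_days}
    return len(active_days) / len(all_days) + len(active_weeks) / len(all_weeks)

def raw_production_terms(pub_dates, start, end):
    """Returns (log term, efficiency term) before normalization across authors."""
    in_period = [d for d in pub_dates if start <= d <= end]
    return math.log1p(len(in_period)), publication_efficiency(in_period, start, end)

# Hypothetical author: four items in November 2018, two of them on the same day.
start, end = date(2018, 11, 1), date(2018, 11, 30)
dates = [date(2018, 11, 1), date(2018, 11, 1), date(2018, 11, 2), date(2018, 11, 15)]
log_term, efficiency = raw_production_terms(dates, start, end)
```

In this example the author is active on 3 of 30 days and in 2 of the 5 ISO weeks that overlap the month, so the efficiency term is 3/30 + 2/5.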
A13. The method according to A7, wherein the author's popularity is computed from the user follow count, user page views, and user share count of the articles/videos published by each author.
A14. The method according to A13, wherein the author's popularity P2 is computed according to the following formula:
P2 = log(user follow count) + log(user page views) + log(user share count);
where log(user follow count), log(user page views), and log(user share count) are each normalized.
A15. The method according to any one of A7-A12, wherein the professionalism characterizes the author's influence in different fields;
the author's professionalism is computed from each author's quality and production capacity in different fields.
A16. The method according to A15, wherein the author's professionalism P3 in each field is computed according to the following formula:
P3 = (sum of quality and production capacity in a given field) / (sum of quality and production capacity across all fields).
A17. The method according to A7, wherein the credit rating C of each author is computed according to the following formula:
C = 100 - audit deductions - user complaint deductions;
where the criteria for the audit deductions include at least one of the following:
violating laws and regulations, violating social ethics, or containing harmful information.
A18. The method according to any one of A6-A17, wherein computing, from the parsed user logs and the author logs, evaluation indices of each author's output performance within a specific time period, and calculating the author's evaluation score from the evaluation indices, comprises:
computing, from the parsed user logs and the author logs, the evaluation indices of each author's article output performance and the evaluation indices of each author's video output performance within the specific time period;
calculating the author's overall article evaluation score and overall video evaluation score from the evaluation indices of the article output performance and the evaluation indices of the video output performance, respectively;
computing a weighted sum of the author's overall article evaluation score and overall video evaluation score to obtain the author's comprehensive evaluation score.
A19. The method according to A18, wherein computing, from the parsed user logs and the author logs, the evaluation indices of each author's article output performance and video output performance within the specific time period comprises:
computing, from the parsed user logs and the author logs, the evaluation indices of each author's article output performance and video output performance in each field within the specific time period;
and calculating the author's overall article evaluation score and overall video evaluation score from the evaluation indices of the article output performance and the evaluation indices of the video output performance, respectively, comprises:
calculating the author's article evaluation score and video evaluation score in each field from the evaluation indices of the author's article output performance and video output performance in that field, respectively;
computing a weighted sum of the author's article evaluation scores across the fields to obtain the author's overall article evaluation score;
computing a weighted sum of the author's video evaluation scores across the fields to obtain the author's overall video evaluation score.
According to another aspect of the embodiments of the present invention, there is also provided B20. a scoring device for authors of an information flow platform, comprising:
a user log obtaining module, adapted to obtain raw user logs of the information flow platform from a plurality of different channels;
a user log parsing module, adapted to parse the raw user logs with a rule parsing engine to obtain parsed user logs, wherein the rule parsing engine is built according to the respective log parsing rules of the different channels;
an author log obtaining module, adapted to obtain author logs from the back-end database of the information flow platform; and
an author scoring statistics module, adapted to score the authors of the information flow platform according to the parsed user logs and the author logs.
B21. The device according to B20, wherein the different channels include a mobile device application (APP) client and/or a PC APP client.
B22. The device according to B20 or B21, further comprising:
a log data saving module, adapted to save the parsed user logs and the obtained author logs to the Hadoop Distributed File System.
B23. The device according to B22, wherein the parsed user logs and the author logs are associated through the uniform resource locator (URL) of the article/video published by the author.
B24. The device according to any one of B20-B23, further comprising:
a scoring result saving module, adapted to save the scoring results for the authors of the information flow platform to a MySQL data table after the author scoring statistics module has scored the authors of the information flow platform.
B25. The device according to any one of B20-B24, wherein the author scoring statistics module is further adapted to:
compute, from the parsed user logs and the author logs, evaluation indices of each author's output performance within a specific time period, and calculate the author's evaluation score from the evaluation indices.
B26. The device according to B25, wherein the evaluation indices include quality, production capacity, popularity, professionalism, and credit rating;
the author scoring statistics module is further adapted to:
compute a weighted sum of the quality, production capacity, popularity, professionalism, and credit rating to obtain the author's evaluation score.
B27. The device according to B26, wherein the author scoring statistics module is further adapted to:
compute the author's quality from user evaluation parameters of the articles/videos published by each author, wherein the user evaluation parameters include one or more of reading/viewing duration, user click data, user share data, user comment data, user favorite data, and user like data.
B28. The device according to B27, wherein the author scoring statistics module is further adapted to:
compute the author's quality Q(X) according to the following formula:
Q(X) = conversion rate + reading/viewing duration + log(average performance) + log(best performance);
where conversion rate = click rate + share rate + comment rate + favorite rate + like rate - dislike rate,
average performance = average clicks + average shares + average comments + average favorites + average likes,
best performance = highest clicks + highest shares + highest comments + highest favorites + highest likes,
and the conversion rate, reading/viewing duration, log(average performance), and log(best performance) are each normalized.
B29. The device according to B26, wherein the production capacity characterizes the author's output efficiency;
the author scoring statistics module is further adapted to:
compute the author's production capacity from the number of articles/videos published by each author and the publication efficiency.
B30. The device according to B29, wherein the author scoring statistics module is further adapted to:
compute the author's production capacity P1 according to the following formula:
P1 = log(publication count) + publication efficiency;
where the publication count is the total number of articles or videos published by the author,
the publication efficiency is the sum of (a) the ratio of the number of days in a specified time period on which the author published an article or video to the total number of days in that period and (b) the ratio of the number of weeks in that period in which the author published an article or video to the total number of weeks in that period,
and log(publication count) and the publication efficiency are both normalized.
B31. The device according to B30, wherein the specified time period is a month.
B32. The device according to B26, wherein the author scoring statistics module is further adapted to:
compute the author's popularity from the user follow count, user page views, and user share count of the articles/videos published by each author.
B33. The device according to B32, wherein the author scoring statistics module is further adapted to:
compute the author's popularity P2 according to the following formula:
P2 = log(user follow count) + log(user page views) + log(user share count);
where log(user follow count), log(user page views), and log(user share count) are each normalized.
B34. The device according to any one of B26-B31, wherein the professionalism characterizes the author's influence in different fields;
the author scoring statistics module is further adapted to:
compute the author's professionalism from each author's quality and production capacity in different fields.
B35. The device according to B34, wherein the author scoring statistics module is further adapted to:
compute the author's professionalism P3 in each field according to the following formula:
P3 = (sum of quality and production capacity in a given field) / (sum of quality and production capacity across all fields).
B36. The device according to B26, wherein the author scoring statistics module is further adapted to:
compute the credit rating C of each author according to the following formula:
C = 100 - audit deductions - user complaint deductions;
where the criteria for the audit deductions include at least one of the following:
violating laws and regulations, violating social ethics, or containing harmful information.
B37. The device according to any one of B25-B36, wherein the author scoring statistics module is further adapted to:
compute, from the parsed user logs and the author logs, the evaluation indices of each author's article output performance and the evaluation indices of each author's video output performance within the specific time period;
calculate the author's overall article evaluation score and overall video evaluation score from the evaluation indices of the article output performance and the evaluation indices of the video output performance, respectively;
compute a weighted sum of the author's overall article evaluation score and overall video evaluation score to obtain the author's comprehensive evaluation score.
B38. The device according to B37, wherein the author scoring statistics module is further adapted to:
compute, from the parsed user logs and the author logs, the evaluation indices of each author's article output performance and video output performance in each field within the specific time period;
calculate the author's article evaluation score and video evaluation score in each field from the evaluation indices of the author's article output performance and video output performance in that field, respectively;
compute a weighted sum of the author's article evaluation scores across the fields to obtain the author's overall article evaluation score;
compute a weighted sum of the author's video evaluation scores across the fields to obtain the author's overall video evaluation score.
According to yet another aspect of the embodiments of the present invention, there is also provided C39. a computer storage medium, the computer storage medium storing computer program code which, when run on a computing device, causes the computing device to execute the scoring method for authors of an information flow platform according to any one of A1-A19.
According to still another aspect of the embodiments of the present invention, there is also provided D40. a computing device, comprising:
a processor; and
a memory storing computer program code;
when the computer program code is run by the processor, the computing device is caused to execute the scoring method for authors of an information flow platform according to any one of A1-A19.

Claims (10)

1. A scoring method for authors of an information flow platform, comprising:
obtaining raw user logs of the information flow platform from a plurality of different channels;
parsing the raw user logs with a rule parsing engine to obtain parsed user logs, wherein the rule parsing engine is built according to the respective log parsing rules of the different channels;
obtaining author logs from the back-end database of the information flow platform;
scoring the authors of the information flow platform according to the parsed user logs and the author logs.
2. The method according to claim 1, wherein the different channels include a mobile device application (APP) client and/or a PC APP client.
3. The method according to claim 1 or 2, wherein, after obtaining the author logs from the back-end database of the information flow platform, the method further comprises:
saving the parsed user logs and the author logs to the Hadoop Distributed File System.
4. The method according to claim 3, wherein the parsed user logs and the author logs are associated through the uniform resource locator (URL) of the article/video published by the author.
5. The method according to any one of claims 1-4, wherein, after scoring the authors of the information flow platform, the method further comprises:
saving the scoring results for the authors of the information flow platform to a MySQL data table.
6. The method according to any one of claims 1-5, wherein scoring the authors of the information flow platform according to the parsed user logs and the author logs comprises:
computing, from the parsed user logs and the author logs, evaluation indices of each author's output performance within a specific time period, and calculating the author's evaluation score from the evaluation indices.
7. The method according to claim 6, wherein the evaluation indices include quality, production capacity, popularity, professionalism, and credit rating;
calculating the author's evaluation score from the evaluation indices comprises:
computing a weighted sum of the quality, production capacity, popularity, professionalism, and credit rating to obtain the author's evaluation score.
8. A scoring device for authors of an information flow platform, comprising:
a user log obtaining module, adapted to obtain raw user logs of the information flow platform from a plurality of different channels;
a user log parsing module, adapted to parse the raw user logs with a rule parsing engine to obtain parsed user logs, wherein the rule parsing engine is built according to the respective log parsing rules of the different channels;
an author log obtaining module, adapted to obtain author logs from the back-end database of the information flow platform; and
an author scoring statistics module, adapted to score the authors of the information flow platform according to the parsed user logs and the author logs.
9. A computer storage medium, the computer storage medium storing computer program code which, when run on a computing device, causes the computing device to execute the scoring method for authors of an information flow platform according to any one of claims 1-7.
10. A computing device, comprising:
a processor; and
a memory storing computer program code;
when the computer program code is run by the processor, the computing device is caused to execute the scoring method for authors of an information flow platform according to any one of claims 1-7.
CN201811299493.8A 2018-11-02 2018-11-02 The methods of marking and device of information flow platform author Pending CN109670855A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811299493.8A CN109670855A (en) 2018-11-02 2018-11-02 The methods of marking and device of information flow platform author

Publications (1)

Publication Number Publication Date
CN109670855A true CN109670855A (en) 2019-04-23

Family

ID=66141771

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811299493.8A Pending CN109670855A (en) 2018-11-02 2018-11-02 The methods of marking and device of information flow platform author

Country Status (1)

Country Link
CN (1) CN109670855A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8762225B1 (en) * 2004-09-30 2014-06-24 Google Inc. Systems and methods for scoring documents
CN104657488A (en) * 2015-03-05 2015-05-27 中南大学 Method for calculating author influence based on citation propagation network
CN106682097A (en) * 2016-12-01 2017-05-17 北京奇虎科技有限公司 Method and device for processing log data
CN108280073A (en) * 2017-01-05 2018-07-13 北大方正集团有限公司 The influence power analysis method and system of news client
CN107911721A (en) * 2017-12-01 2018-04-13 北京蓝水科技文化有限公司 The quantitatively evaluating Index and system of a kind of internet films and television programs

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110471898A (en) * 2019-08-22 2019-11-19 长江师范学院 Dissemination method can be traced in a kind of information credit management method and Figures
CN110795658A (en) * 2019-09-25 2020-02-14 北京三快在线科技有限公司 User scoring method and device, electronic equipment and computer storage medium
CN111104486A (en) * 2019-12-25 2020-05-05 郑州师范学院 Modern literature comparison and explanation system
CN111738608A (en) * 2020-06-28 2020-10-02 中国联合网络通信集团有限公司 Channel scoring method and system
CN116910628A (en) * 2023-09-12 2023-10-20 联通在线信息科技有限公司 Creator expertise portrait assessment method and system
CN116910628B (en) * 2023-09-12 2024-02-06 联通在线信息科技有限公司 Creator expertise portrait assessment method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination