Detailed Description of Embodiments
To provide a better understanding of the above technical solutions, the technical solutions of the embodiments of this specification are described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the embodiments of this specification and the specific features therein are detailed illustrations of the technical solutions of the embodiments of this specification rather than limitations on the technical solutions of this specification, and that, where no conflict arises, the embodiments of this specification and the technical features therein may be combined with one another.
Fig. 1 is a schematic diagram of an application scenario of internet content risk evaluation according to an embodiment of this specification. A user operates on a client 10, for example publishing or forwarding internet content such as posts and comments of various kinds; the client 10 sends the internet content to a service server 20 of a website or app; the service server 20 provides the internet content to a security management and control server 30 for security management; and the security management and control server 30 performs operations such as content identification, risk control, review, and quality inspection on the internet content.
In a first aspect, an embodiment of this specification provides an internet content risk evaluation method. Referring to Fig. 2, the method includes:
S201: Determine the category of the acquired internet content.
The data of the internet content is analyzed to determine whether the category of the internet content is video (including live streaming), audio, picture, or text. The category is determined so that the internet content can subsequently be input into the risk identification models of the corresponding category.
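The category determination in S201 can be sketched as follows. This is a minimal illustration, assuming the category is inferred from well-known file signatures at the start of the raw bytes; the function name `classify_content` and the choice of signatures are hypothetical and not taken from the specification, which does not say how the data is analyzed.

```python
def classify_content(data: bytes) -> str:
    """Guess the content category from leading file signatures (illustrative only)."""
    # JPEG, PNG and GIF signatures.
    if data.startswith((b"\xff\xd8\xff", b"\x89PNG\r\n\x1a\n", b"GIF87a", b"GIF89a")):
        return "picture"
    # ISO base media files (e.g. MP4) carry "ftyp" at byte offset 4.
    # Simplification: such containers can also hold audio-only or image data.
    if data[4:8] == b"ftyp":
        return "video"
    # MP3 files with an ID3v2 tag, and RIFF/WAVE audio.
    if data.startswith(b"ID3") or (data.startswith(b"RIFF") and data[8:12] == b"WAVE"):
        return "audio"
    # Anything that decodes as UTF-8 is treated as text (an assumption).
    try:
        data.decode("utf-8")
        return "text"
    except UnicodeDecodeError:
        return "unknown"
```

A production system would more likely rely on metadata supplied by the service server or on a dedicated media-type library; the sketch only shows that the category is decided before any risk model is invoked.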
S202: Input the internet content into a plurality of risk identification models of the corresponding category, and obtain a plurality of risk scores output respectively by the plurality of risk identification models.
As described above, the categories of internet content include video, audio, picture, text, and so on. A plurality of risk identification models are trained in advance for each category of content. For example, for content of the picture category, a pornographic picture identification model, a politically sensitive picture identification model, an illegal advertisement (for example, QR code) picture identification model, and the like may be provided in advance. Each model may be trained using a linear model, such as a linear regression model or an analysis-of-variance model; other algorithms (for example, deep learning) may of course also be used to train the identification models.
S203: Calculate an integrated risk score of the internet content according to the plurality of risk scores.
Fig. 3 is a schematic diagram of the principle of calculating the integrated risk score in an optional mode. The internet content is input into the plurality of identification models corresponding to its category (model 1, model 2, ..., model 10); the models output risk values X1, X2, ..., X10 respectively; and the integrated risk score is obtained from these ten risk values using a certain algorithm.
In an optional mode, a risk weight is set for each of the plurality of risk identification models, and the plurality of risk scores are weighted by their respective risk weights to obtain the integrated risk score. Continuing the example of Fig. 3, assume that the weights set for model 1, model 2, ..., model 10 are b1, b2, ..., b10 respectively; then, when the integrated risk score is calculated, each risk value is weighted by its corresponding weight.
Assume that, for internet content of the picture category, there are N risk models whose corresponding model scores are denoted Xi; then:
where Xresult denotes the integrated risk score; the parameter a is an adjustment parameter that can be obtained by training on a large number of samples; and bi is the risk weight corresponding to Xi, set in advance for each model according to a large number of samples.
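The weighting described above can be sketched as follows. How the adjustment parameter a enters the calculation is not stated in the text; the sketch assumes it is an additive offset, so that Xresult = a + b1·X1 + ... + bN·XN. That form, and the function name `integrated_risk_score`, are assumptions made for illustration only.

```python
from typing import Sequence


def integrated_risk_score(
    scores: Sequence[float],
    weights: Sequence[float],
    a: float = 0.0,
) -> float:
    """Weighted combination of per-model risk scores.

    ASSUMPTION: the adjustment parameter `a` is treated as an additive offset;
    the specification only says it is an adjustment parameter obtained by training.
    """
    if len(scores) != len(weights):
        raise ValueError("each risk score Xi needs exactly one risk weight bi")
    return a + sum(b * x for b, x in zip(weights, scores))
```

For example, `integrated_risk_score([0.5, 1.0], [2.0, 1.0], a=0.5)` combines two model scores with weights 2.0 and 1.0 and an offset of 0.5.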
Fig. 4 is a flowchart of an example of the internet content risk evaluation method. In this example:
S401: The service server inputs the internet content to the content identification module;
S402: After performing identification, the content identification module provides the risk score of each model to the integrated risk score calculation module;
S403: The integrated risk score calculation module calculates the integrated risk score, and selects content to be reviewed according to the integrated risk score and sends it to the review platform;
S404: The review platform reviews the content;
S405: The review platform returns the review result to the service server;
S406: The review platform may send sampled content to the quality inspection platform for spot checks;
S407: The quality inspection platform returns the quality inspection result to the service server.
In an optional mode, after the integrated risk score is calculated, the method further includes: according to a preset minimum risk score threshold, determining internet content whose integrated risk score exceeds the minimum risk score threshold to be high-risk content. When internet content to be reviewed is sent to the internet content review platform, the content to be reviewed is extracted preferentially from the high-risk content. It can be understood that internet content with a higher integrated risk score is content of higher risk; to avoid its being exposed on the internet, or being exposed for too long, it needs to be prioritized for checking, filtering, or similar processing. In the embodiments of this specification, rather than checking internet content in chronological order, risk is judged on the basis of the content itself, and high-risk content is sent to the review platform for review with priority, which can reduce the exposure time of high-risk content.
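The thresholding and prioritization described above can be sketched as a filter followed by a sort. This is a minimal sketch; the name `review_queue` and the representation of content as `(content_id, score)` pairs are hypothetical.

```python
from typing import List, Tuple

ScoredItem = Tuple[str, float]  # (content id, integrated risk score)


def review_queue(scored_items: List[ScoredItem], min_risk_threshold: float) -> List[ScoredItem]:
    """Keep items whose score exceeds the threshold, highest risk first."""
    high_risk = [item for item in scored_items if item[1] > min_risk_threshold]
    # Order by risk rather than by arrival time, so the riskiest content is reviewed first.
    return sorted(high_risk, key=lambda item: item[1], reverse=True)
```

Because the ordering key is the score rather than a timestamp, an older high-risk item is not pushed back by newer low-risk items.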
Suppose 1,000,000 historical review records are taken at random (of which 166673 were flagged as problematic). After they are arranged in descending order of the risk score obtained according to the embodiments of this specification, the relationship between the review volume and the flagged volume is as shown in Table 1 below.
Table 1
Review volume | Review ratio | Flagged volume | Flagged share of total flagged
300000 | 30% | 105003 | 63%
400000 | 40% | 123338 | 74%
500000 | 50% | 135005 | 81%
600000 | 60% | 141672 | 85%
700000 | 70% | 148338 | 89%
800000 | 80% | 158339 | 95%
900000 | 90% | 165006 | 99%
1000000 | 100% | 166673 | 100%
Here, flagged content (content "marked black") is content that the content safety review has judged to be problematic and that the business side must handle, for example by deleting it.
As the above data show, the first 30% of review tasks cover 63% of the content risk, so the exposure time of high-risk content can be effectively reduced.
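The percentages in the last column of Table 1 can be re-derived from the flagged volumes. The following is a small check using only the figures in the table; the names `TABLE_1` and `coverage_percent` are introduced here for illustration.

```python
TOTAL_FLAGGED = 166673  # flagged records among the 1,000,000 sampled

# (review volume, flagged volume found within that volume), copied from Table 1
TABLE_1 = [
    (300000, 105003),
    (400000, 123338),
    (500000, 135005),
    (600000, 141672),
    (700000, 148338),
    (800000, 158339),
    (900000, 165006),
    (1000000, 166673),
]


def coverage_percent(flagged_found: int, total_flagged: int = TOTAL_FLAGGED) -> int:
    """Share of all flagged content found so far, rounded to a whole percent."""
    return round(100 * flagged_found / total_flagged)


for volume, flagged in TABLE_1:
    print(f"{volume:>8} reviewed -> {coverage_percent(flagged)}% of flagged content")
```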
In an optional mode, the number of items to be reviewed may be determined according to a recall target for dangerous content. When content to be reviewed is selected from the high-risk content, that number of risk content items is selected, either at random or in descending order of integrated risk score, and sent to the internet content review platform. Take Table 1 above as an example: assuming the recall requirement is 90%, the task is essentially complete at a review volume of 70% (at a review volume of 70%, flagged content accounts for 89% of all flagged content, close to 90%), which saves nearly 30% of the review volume.
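One way to turn a recall target into a number of items to review is to walk through historical records in descending risk order and stop once the target share of flagged content has been reached. This is a sketch under assumptions: the specification does not say how the number is derived, and the name `pending_count` and the boolean-list input are hypothetical.

```python
from typing import Sequence


def pending_count(flags_in_risk_order: Sequence[bool], recall_target: float) -> int:
    """Smallest number of top-ranked items that reaches the recall target.

    `flags_in_risk_order[i]` is True if the i-th item (sorted by descending
    integrated risk score) was flagged in a past review.
    """
    total_flagged = sum(flags_in_risk_order)
    if total_flagged == 0:
        return 0  # nothing to recall
    found = 0
    for k, flagged in enumerate(flags_in_risk_order, start=1):
        found += flagged
        if found / total_flagged >= recall_target:
            return k
    return len(flags_in_risk_order)
```

The better the ranking concentrates flagged items at the front, the smaller the returned count for a given recall target, which is where the saving in review volume comes from.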
The existing review approach simply arranges the content requiring manual review in chronological (or reverse chronological) order for review. The consequence of this approach is that review is not ordered by degree of risk, so content of higher risk may be processed late, and high-risk content remains exposed for longer. In extreme cases, because new risk content is produced continuously, historical high-risk content may never be processed, and the risk keeps spreading. The internet content risk evaluation method provided by the embodiments of this specification weights the risk values given by the individual risk models to obtain an integrated risk score, which better reflects the risk of the internet content itself; high-risk content is then selected on the basis of the integrated risk score and reviewed with priority, reducing the time for which dangerous content remains exposed. With the embodiments of this specification, high-risk content is more likely to be ranked near the front and therefore more likely to be reviewed first, so that, while the recall target is still met, the flagging workload can be reduced and efficiency improved.
In a second aspect, based on the same inventive concept, an embodiment of this specification provides an internet content risk evaluation apparatus. Referring to Fig. 5, the apparatus includes:
a category determination unit 501, configured to determine the category of the acquired internet content;
a risk identification unit 502, configured to input the internet content into a plurality of risk identification models of the corresponding category and obtain a plurality of risk scores output respectively by the plurality of risk identification models; and
a risk integration unit 503, configured to calculate an integrated risk score of the internet content according to the plurality of risk scores.
In an optional mode, the risk integration unit 503 is specifically configured to weight the plurality of risk scores by their respective risk weights to obtain the integrated risk score, wherein the risk weights are set in advance for each risk identification model.
In an optional mode, the category determination unit 501 is specifically configured to analyze the data of the internet content and determine that the category of the internet content is video, audio, picture, and/or text.
In an optional mode, the apparatus further includes a high-risk content determination unit 504, configured to determine, according to a preset minimum risk score threshold, internet content whose integrated risk score exceeds the minimum risk score threshold to be high-risk content.
In an optional mode, the apparatus further includes a to-be-reviewed content determination unit 505, configured to extract content to be reviewed preferentially from the high-risk content when internet content to be reviewed is sent to the internet content review platform.
In an optional mode, the apparatus further includes a to-be-reviewed number determination unit 506, configured to determine the number of items to be reviewed according to a recall target for dangerous content; the to-be-reviewed content determination unit 505 selects that number of risk content items from the high-risk content, either at random or in descending order of integrated risk score, and sends them to the internet content review platform.
In a third aspect, based on the same inventive concept as the internet content risk evaluation method in the foregoing embodiments, the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of any one of the internet content risk evaluation methods described above.
In a fourth aspect, based on the same inventive concept as the internet content risk evaluation method in the foregoing embodiments, the present invention further provides a server. As shown in Fig. 6, the server includes a memory 604, a processor 602, and a computer program stored on the memory 604 and executable on the processor 602; when executing the program, the processor 602 implements the steps of any one of the internet content risk evaluation methods described above.
In Fig. 6, a bus architecture is represented by a bus 600. The bus 600 may include any number of interconnected buses and bridges, and links together various circuits including one or more processors, represented by the processor 602, and memory, represented by the memory 604. The bus 600 may also link together various other circuits such as peripheral devices, voltage regulators, and power management circuits, all of which are well known in the art and are therefore not described further here. A bus interface 606 provides an interface between the bus 600 and a receiver 601 and a transmitter 603. The receiver 601 and the transmitter 603 may be the same element, namely a transceiver, which provides a unit for communicating with various other apparatuses over a transmission medium. The processor 602 is responsible for managing the bus 600 and for general processing, and the memory 604 may be used to store data used by the processor 602 when performing operations.
This specification is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the embodiments of this specification. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of this specification have been described, those skilled in the art can make further changes and modifications to these embodiments once they grasp the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of this specification.
Obviously, those skilled in the art can make various changes and variations to this specification without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of this specification and their technical equivalents, this specification is also intended to include them.