CN109558555A - Microblog water army detection method and detection system based on artificial immunity danger theory - Google Patents
- Publication number: CN109558555A
- Application number: CN201810950560.1A
- Authority: CN (China)
- Prior art keywords: microblog, user, microblogging, danger, antigen
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The invention belongs to the technical field of microblog networks and discloses a microblog water army detection method and detection system based on artificial immunity danger theory. It applies the ideas of artificial immunity to the detection of microblog user behavior features. Microblog user data are obtained with a focused web crawler; water army behavior is characterized and defined through an analysis of user behavior features, which identifies the feature attributes that distinguish the new type of water army from normal users; finally, the signal processing mechanism of artificial immunity danger theory is applied to water army detection, using the dendritic cell algorithm (DCA) of danger theory to detect water army users on the microblog platform. The data are collected by a Python-based focused web crawler and stored in a database in structured form. This approach makes the data set easier to obtain, collects the various kinds of user behavior data in a reasonable way, and offers a short crawling period and high data quality.
Description
Technical field
The invention belongs to the technical field of microblog networks, and in particular relates to a microblog water army detection method and detection system based on artificial immunity danger theory.
Background art
At present, the prior art commonly used in the industry is as follows:
A microblog water army is the general term for producers of junk information who, driven by profit, manipulate software robots or water army accounts to manufacture and spread false opinions and spam on microblogs, with aims such as undermining the authenticity of data, misleading public opinion and harming citizens' interests. Some data mining techniques are used in microblog water army detection: they define highly discriminative features or behavior patterns to uncover hidden water armies.
The main water army detection methods at present are as follows:
Detection based on content features: includes text classification, text sentiment analysis and text orientation analysis. Water armies are identified by computing the similarity between microblog content and junk information, or between comment content and spam comments.
Detection based on environmental features: analyzes the network-level features of water armies by obtaining TCP footprint information, IP blacklist information, robot website command tracking and routing information from the network environment, and thereby tracks water armies.
Detection based on user features: analyzes the changing relationship features and behavior features of network users, selects relevant feature attributes to train a classifier, and then uses the trained classifier to detect microblog water armies.
In summary, the problems of the prior art are:
Content-feature-based detection: as the network environment has grown more complex and platforms have imposed real-name registration, water armies have shifted from accounts generated and operated in batches by software to a new type operated by real users. The junk information produced by the latter resembles the content of normal users and no longer has clearly recognizable features, so this method cannot effectively discover the new type of water army.
Environmental-feature-based detection: network environment features such as TCP footprint information, IP blacklist information and routing information cannot be disguised by modification, so the recognition accuracy of this method is relatively high. However, network environment data sets are difficult to obtain, so the approach is hard to generalize.
User-feature-based detection: this method is good at discovering hidden water armies and is better suited to detection on social network platforms, but it suffers from incomplete feature description, low efficiency when processing large volumes of multi-indicator data, and the need for a large training data set.
The difficulty and significance of solving the above technical problems:
(1) Because water armies increasingly conceal themselves, simple content-feature-based detection misses the new type of water army, most of which spreads its messages under the cover of normal text features, so its practicality is low. The present invention mines the specific behavior patterns of microblog water armies from the ways microblog users register, post, repost, comment and like, analyzes water army behavior features in depth, and extracts the important attributes that distinguish water army accounts from normal ones. These attributes play an important role in characterizing microblog water armies.
(2) Traditional environmental-feature-based detection faces great difficulty in data acquisition and is hard to generalize. The present invention adopts a focused web crawler strategy: it obtains access to Sina Weibo through a simulated login, defines a URL search strategy, saves the HTML obtained under the specified links, and finally parses the HTML into structured data stored in a database. This data acquisition strategy crawls efficiently, can be designed to crawl specific content of specified pages on demand, and generalizes well, providing good data support for water army detection.
(3) Water army behavior is becoming increasingly complex. Detection that relies on local or single-aspect features describes the behavior incompletely and is prone to identification errors. The present invention starts from microblog users' basic behavior (registration information, registration time, etc.), posting behavior (posting microblogs, etc.), following behavior (following, followers, etc.) and forwarding behavior (reposting, commenting, liking), studies the behavior features of microblog users more comprehensively and in greater depth, and applies the results to microblog water army detection. The invention describes water army features more comprehensively, which reduces identification errors; choosing features more comprehensively plays an important role in microblog water army detection.
(4) Traditional user-feature-based classification methods need large training data sets, detect inefficiently and have limited applicability. The present invention applies the signal processing mechanism of artificial immunity danger theory to water army detection, using the dendritic cell algorithm (DCA) of danger theory to detect water army users on the microblog platform. The DCA algorithm does not depend on a knowledge base, is computationally efficient, and can reduce the false positive and false negative rates. Building on these characteristics, the invention offers high computational efficiency, needs no training data set, and achieves relatively high detection accuracy.
Summary of the invention
In view of the problems in the prior art, the present invention provides a microblog water army detection method and detection system based on artificial immunity danger theory. The object of the invention is to introduce the ideas of artificial immunity danger theory into the analysis of user behavior features so as to identify microblog water army users effectively. By analyzing the behavior features of Sina Weibo water armies, feature attributes such as the total number of microblogs, account level, verification status, sunlight credit and follower count are selected. The analysis results for these attributes serve as the feature signals that distinguish water army users from normal users, and Sina Weibo water armies are identified with the dendritic cell algorithm (Dendritic Cell Algorithm, DCA).
In a social network environment, the user anomalies and network security problems caused by various kinds of user behavior are highly similar to the intrusion detection problems to which artificial immune systems are applied. For example, the dendritic cell algorithm (DCA) of artificial immunity danger theory has been used to build an integrated intrusion detection (RSAI-IID) model, and to detect mass-mailed spam and Web server anomalies. The dendritic cell algorithm is computationally efficient, can reduce the false positive and false negative rates, and needs no training data set.
The invention is realized as follows. A microblog water army detection method based on artificial immunity danger theory comprises:
obtaining microblog user behavior data with a focused web crawler, and applying artificial immunity to the detection of microblog user behavior features;
analyzing user behavior features to define water army behavior, and identifying the feature attributes that distinguish the new type of water army from normal users;
detecting water army user behavior on the microblog platform with the dendritic cell algorithm (DCA) of artificial immunity danger theory.
Further, the microblog water army detection method based on artificial immunity danger theory specifically comprises:
Step 1, acquisition of microblog data: a focused web crawler crawls the user information of the microblog platform. The experimental data of the invention are obtained in two ways, by calling the Sina Weibo API and by a focused web crawler written in Python, and are then preprocessed by removing duplicates and empty values.
Step 2, selection of features: 14 user behavior features are extracted from a user's microblog account: follower count, following count, total number of microblogs, number of original microblogs, verification status, account level, presence of a profile introduction, registration time, sunlight credit, mutual-follow count, number of topics participated in, comment count, repost count and like count. Through repeated comparative experiments and summarization, these 14 original features are fused into 6 indicators: sunlight credit, activity, identity evaluation, influence, follower-to-following ratio and original microblog ratio.
Step 3, definition of antigen signals: the 6 indicators, sunlight credit (SC), activity (AT), identity evaluation (IE), influence (CI), follower-to-following ratio (FF) and original microblog ratio (OM), are normalized with a mapping function (formula not reproduced in this text), where x is the original signal value: when x ∈ [m, n] a linear mapping is applied, and when x ∈ [n, ∞) the signal takes the maximum value 10.
Step 4, microblog water army detection based on the DCA algorithm: microblog users are treated as antigens. First the number of antigens sampled and the dendritic cell population are initialized. Unidentified microblog users are selected at random from the detection sample, and the pathogen-associated molecular pattern signal, danger signal, safe signal and inflammatory signal corresponding to each user are taken as input signals. The concentrations of CSM, SEM and MAT are calculated from the formula below and the corresponding weight matrix, and the CSM, SEM and MAT concentrations obtained by the DC cells presenting the same antigen are accumulated.
The calculation formula of the DCA algorithm is as follows: (formula not reproduced in this text)
In the formula, (1+IS) is the amplification factor; the values of the input signals PAMP, DS and SS are CP, CD and CS and their weights are WP, WD and WS, respectively; the values of the output signals CSM, SEM and MAT are C[CSM], C[SEM] and C[MAT], respectively.
The CSM, SEM and MAT values are calculated from the input signal values and the weight matrix and accumulated. If CSM exceeds the migration threshold, SEM and MAT are compared, and the state of the DC and the state of the antigens it has collected are labelled according to the result. Once the number of times an antigen has been judged reaches the antigen discrimination threshold, the mature context antigen value MCAV is calculated as MCAV=MAT/(SEM+MAT), where SEM and MAT are the values of the output signals SEM and MAT. MCAV is then compared with the anomaly threshold: if MCAV is larger, the antigen is labelled abnormal and the user is a water army account; otherwise the antigen is labelled normal.
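The accumulation and decision logic of step 4 can be sketched in Python as follows. This is a single-cell simplification under stated assumptions, not the patented implementation: it takes already-computed (CSM, SEM, MAT) values for one user, omits the dendritic cell population, random sampling and the antigen discrimination threshold, and assumes the accumulated SEM and MAT totals are non-negative. The names `DendriticCell`, `mcav` and `classify` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DendriticCell:
    """One DC that accumulates output signals for a single antigen (user)."""
    migration_threshold: float
    csm: float = 0.0
    sem: float = 0.0
    mat: float = 0.0

    def expose(self, csm, sem, mat):
        # Accumulate the three output signal concentrations.
        self.csm += csm
        self.sem += sem
        self.mat += mat

    def has_migrated(self):
        return self.csm > self.migration_threshold


def mcav(sem_total, mat_total):
    """MCAV = MAT / (SEM + MAT); returns 0.0 when both totals are zero."""
    total = sem_total + mat_total
    return mat_total / total if total else 0.0


def classify(outputs, migration_threshold, anomaly_threshold):
    """Label one user from a sequence of (csm, sem, mat) output values."""
    cell = DendriticCell(migration_threshold)
    for csm, sem, mat in outputs:
        cell.expose(csm, sem, mat)
        if cell.has_migrated():
            break
    if not cell.has_migrated():
        return "undecided"  # assumption: not enough signal to decide yet
    value = mcav(cell.sem, cell.mat)
    return "water army" if value > anomaly_threshold else "normal"
```

The thresholds are parameters because the text does not state their values; they would have to be tuned on labelled data.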
Further, in step 1 the crawling method comprises simulated login, obtaining user address links, and HTML parsing:
(1) Simulated login: the crawler logs in after authentication at the login address succeeds;
(2) Obtaining user address links: Sina Weibo divides users by verification type into ordinary users without Sina verification, individually verified users marked with a yellow or gold V, and verified enterprises and organizations marked with a blue V. The home pages and second-level pages of users with different verification types have different URL link templates;
(3) HTML parsing: after the simulated login and the definition of the target URLs, the HTML of each URL is parsed using the urllib and urllib2 libraries that ship with Python, or the required information is located in the HTML page using Scrapy, an advanced crawler development framework for Python; the web page information is then scraped.
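The parsing step can be sketched with Python 3's standard library (`urllib` and `urllib2` are the Python 2 module names; in Python 3, fetching lives in `urllib.request`). The sample HTML, the `class="count"` attribute and the `data-field` names below are hypothetical stand-ins, not Weibo's actual page markup, and no network request is made.

```python
from html.parser import HTMLParser


class CountParser(HTMLParser):
    """Collect integers from <span class="count" data-field="..."> tags.

    The tag layout is a hypothetical example, not Weibo's real markup.
    """

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # data-field of the span being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and attrs.get("class") == "count":
            self._current = attrs.get("data-field")

    def handle_data(self, data):
        text = data.strip()
        if self._current and text.isdigit():
            self.fields[self._current] = int(text)

    def handle_endtag(self, tag):
        if tag == "span":
            self._current = None


def parse_counts(html):
    """Return a dict of field name -> integer found in the HTML string."""
    parser = CountParser()
    parser.feed(html)
    return parser.fields


SAMPLE = (
    '<div><span class="count" data-field="fans">1200</span>'
    '<span class="count" data-field="follows">80</span>'
    '<span class="other">ignored</span></div>'
)
print(parse_counts(SAMPLE))
```

A real crawler would replace `SAMPLE` with the saved page HTML and adjust the tag and attribute tests to whatever the target page actually uses.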
Further, in step 2 the fusion method comprises:
1) Sunlight credit (SC) is divided into five grades: very low (300-419), low (420-450), average (451-570), good (571-690) and excellent (691-900), represented in the fusion by the values 1-5 respectively;
2) Activity (AT) involves the total number of microblogs M, the number of topics participated in T, the registration time Z and the current time N, where N-Z is measured in days. It is calculated as follows:
AT=(0.7M+0.3T)/(N-Z);
3) Identity evaluation (IE) involves the presence of a profile introduction I, the verification status C and the account level G, with weights 0.2, 0.4 and 0.4 respectively. It is calculated as follows:
IE=0.2I+0.4C+0.4G;
4) Influence (CI) involves the comment count J, repost count R and like count F (the numbers of times the user's microblogs have been commented on, reposted and liked), with weights 0.3, 0.5 and 0.2 respectively. It is calculated as follows:
CI=0.3J+0.5R+0.2F;
5) The follower-to-following ratio (FF) is the ratio of each user's follower count Fans to following count Followers. It is calculated as follows:
FF=Fans/Followers;
6) The original microblog ratio (OM) is the proportion of original microblogs Weibo_Original in the total number of microblogs M posted by the user. It is calculated as follows: OM=(Weibo_Original)/M.
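The six fusion formulas can be sketched in Python as below. The weights and credit bands come from the text; the `RawProfile` field names, the treatment of scores outside 300-900, and the guards against division by zero (an account registered today, zero followings, zero microblogs) are assumptions added for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RawProfile:
    credit_score: int    # sunlight credit score, nominally 300-900
    posts: int           # M, total number of microblogs
    topics: int          # T, topics participated in
    registered: date     # Z, registration date
    has_bio: int         # I, 1 if a profile introduction exists, else 0
    verified: int        # C, 1 if verified, else 0
    level: int           # G, account level
    comments: int        # J
    reposts: int         # R
    likes: int           # F
    followers: int       # Fans
    following: int       # Followers
    original_posts: int  # Weibo_Original


def credit_grade(score):
    """Map the sunlight credit score to grades 1-5 using the bands in the text."""
    for grade, upper in enumerate((419, 450, 570, 690, 900), start=1):
        if score <= upper:
            return grade
    return 5  # assumption: scores above 900 are clamped to the top grade


def fuse(p, today):
    """Fuse the raw features of profile p into the six indicators."""
    days = max((today - p.registered).days, 1)  # assumption: at least one day
    return {
        "SC": credit_grade(p.credit_score),
        "AT": (0.7 * p.posts + 0.3 * p.topics) / days,
        "IE": 0.2 * p.has_bio + 0.4 * p.verified + 0.4 * p.level,
        "CI": 0.3 * p.comments + 0.5 * p.reposts + 0.2 * p.likes,
        # assumption: with zero followings, fall back to the follower count
        "FF": p.followers / p.following if p.following else float(p.followers),
        # assumption: with zero microblogs, the original ratio is 0
        "OM": p.original_posts / p.posts if p.posts else 0.0,
    }
```

Mutual-follow count appears in the list of 14 features but not in any of the six formulas given here, so the sketch leaves it out.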
Further, in step 3 the indicators used to detect Sina Weibo water armies are mapped to the 4 input signals of the DCA algorithm as follows:
Pathogen-associated molecular pattern (PAMP): indicates that the user's behavior is abnormal and shows water army characteristics; defined as PAMP={<SC,IE,FF>};
Danger signal (DS): indicates a relatively high likelihood that the user's behavior is abnormal; normal behavior has merely changed, but water army behavior is possible; defined as DS={<AT,CI,OM>};
Safe signal (SS): indicates a relatively high likelihood that the user is normal and in a normal state; defined as SS={<SC,IE>};
Inflammatory signal (IS): indicates that the current user generally exhibits anomalies, and amplifies the PAMP, DS and SS signals; defined as IS={<CI>}.
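The normalization and the indicator-to-signal grouping can be sketched as follows. The grouping table is taken from the definitions above; everything else is an assumption, since the mapping function itself is not reproduced in this text: the lower end of the linear map is taken to be 0, values below m are clamped to 0, and the indicators of each signal are combined by a plain average. How each indicator is oriented (for example, whether a low sunlight credit should raise PAMP) is also not stated, so the sketch does not invert any indicator.

```python
def normalize(x, m, n, top=10.0):
    """Piecewise map: x <= m -> 0, m < x < n -> linear in (0, top), x >= n -> top.

    Only the linear segment on [m, n] and the cap of 10 come from the text;
    the 0 lower bound is an assumption.
    """
    if n <= m:
        raise ValueError("expected m < n")
    if x <= m:
        return 0.0
    if x >= n:
        return top
    return top * (x - m) / (n - m)


# Grouping of indicators into DCA input signals, as defined in the text.
SIGNAL_INDICATORS = {
    "PAMP": ("SC", "IE", "FF"),
    "DS": ("AT", "CI", "OM"),
    "SS": ("SC", "IE"),
    "IS": ("CI",),
}


def build_signals(normalized):
    """Average the normalized indicators of each signal (averaging is an assumption)."""
    return {
        name: sum(normalized[key] for key in keys) / len(keys)
        for name, keys in SIGNAL_INDICATORS.items()
    }
```

The bounds m and n would be chosen per indicator, for instance from the observed range of each indicator in the crawled data.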
Another object of the present invention is to provide a computer program that runs the described microblog water army detection method based on artificial immunity danger theory.
Another object of the present invention is to provide a terminal that carries at least a controller implementing the described microblog water army detection method based on artificial immunity danger theory.
Another object of the present invention is to provide a computer-readable storage medium comprising instructions which, when run on a computer, cause the computer to execute the described microblog water army detection method based on artificial immunity danger theory.
Another object of the present invention is to provide a microblog water army detection system based on artificial immunity danger theory that implements the described detection method, the system comprising:
a microblog data acquisition module, which crawls the user information of the microblog platform with a focused web crawler;
a feature selection module, which extracts 14 user behavior features from a user's microblog account (follower count, following count, total number of microblogs, number of original microblogs, verification status, account level, presence of a profile introduction, registration time, sunlight credit, mutual-follow count, number of topics participated in, comment count, repost count and like count) and, through repeated comparative experiments and summarization, fuses the 14 original features into 6 indicators: sunlight credit, activity, identity evaluation, influence, follower-to-following ratio and original microblog ratio;
an antigen signal definition module, which normalizes the 6 indicators: sunlight credit (SC), activity (AT), identity evaluation (IE), influence (CI), follower-to-following ratio (FF) and original microblog ratio (OM);
a DCA-based microblog water army detection module, which treats microblog users as antigens. It first initializes the number of antigens sampled and the dendritic cell population, selects unidentified microblog users at random from the detection sample, and takes the pathogen-associated molecular pattern signal, danger signal, safe signal and inflammatory signal corresponding to each user as input signals. The concentrations of CSM, SEM and MAT are calculated from the DCA formula and its corresponding weight matrix, and the CSM, SEM and MAT concentrations obtained by the DC cells presenting the same antigen are accumulated. If CSM exceeds the migration threshold, SEM and MAT are compared, and the state of the DC and the state of the antigens it has collected are labelled according to the result. Once the number of times an antigen has been judged reaches the antigen discrimination threshold, the mature context antigen value MCAV is calculated and compared with the anomaly threshold: if MCAV is larger, the antigen is labelled abnormal and the user is a water army account; otherwise the antigen is labelled normal.
Another object of the present invention is to provide a microblog network platform that carries at least the described microblog water army detection system based on artificial immunity danger theory.
In step 4, the signal weight matrix of the DCA algorithm is as follows: (weight matrix not reproduced in this text)
In the weight matrix, a larger weight means that the corresponding signal has a greater influence on the output, and a negative weight means that the signal has a negative influence on the output. The matrix is used to convert input signal values into output signal values: (1+IS) is the amplification factor, and WP, WD and WS are the weights of the input signals when calculating the output signals (C[CSM], C[SEM] and C[MAT]), taken from the weight matrix. For example, when calculating C[CSM], WP=8, WD=4 and WS=-6.
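The weighted signal fusion can be sketched as below. The patent's own formula and full weight matrix are not reproduced in this text, so the sketch assumes one commonly published form of the DCA fusion, a weighted sum normalized by the sum of absolute weights and scaled by (1+IS)/2; the patent's formula may differ. Only the CSM row (8, 4, -6) comes from the text; the SEM and MAT rows are hypothetical placeholders.

```python
# Rows are (WP, WD, WS). Only the CSM row is given in the text.
WEIGHTS = {
    "CSM": (8.0, 4.0, -6.0),
    "SEM": (0.0, 0.0, 1.0),   # hypothetical placeholder
    "MAT": (2.0, 1.0, -1.0),  # hypothetical placeholder
}


def fuse_output(weights, c_p, c_d, c_s, inflammation):
    """Assumed DCA fusion: weighted sum / sum of |weights| * (1 + IS) / 2."""
    w_p, w_d, w_s = weights
    weighted = w_p * c_p + w_d * c_d + w_s * c_s
    scale = abs(w_p) + abs(w_d) + abs(w_s)
    return weighted / scale * (1.0 + inflammation) / 2.0


def output_signals(c_p, c_d, c_s, inflammation=0.0):
    """Compute C[CSM], C[SEM] and C[MAT] for one set of input signal values."""
    return {
        name: fuse_output(row, c_p, c_d, c_s, inflammation)
        for name, row in WEIGHTS.items()
    }
```

With the CSM row, a strong safe signal lowers C[CSM] because WS is negative, which matches the remark above that a negative weight has a negative influence on the output.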
Further, an experimental evaluation is carried out after step 4. The above steps determine whether each microblog user in the experimental sample is a water army account. Based on the detection results and the ground-truth data, three indicators, the accuracy rate (PR), the recall rate (RR) and the harmonic mean F1, are used to evaluate the method. The higher the accuracy rate, recall rate and harmonic mean, the better the water army detection. Each indicator is calculated as follows.
The accuracy rate PR is calculated as follows: (formula not reproduced in this text)
In the formula, class+=TP/(TP+FP) and class-=TN/(TN+FN) denote the classifier's classification accuracy for microblog water army users and for normal users respectively; TP and TN are the numbers of water army users and non-water-army users detected in the sample, and FN and FP are the numbers of actual water army users and non-water-army users that were misidentified. PR denotes the classifier's average accuracy, which is determined jointly by class+ and class-.
The recall rate is calculated as: RR=TP/(TP+FN).
The harmonic mean is calculated as: F1=(2*PR*RR)/(PR+RR).
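The three evaluation indicators can be computed as in the sketch below. The formulas for class+, class-, RR and F1 are the ones given above; because the PR formula itself is not reproduced in this text, the sketch assumes PR is the plain average of class+ and class-, which is consistent with its description as an average accuracy but remains an assumption. The function assumes every denominator is non-zero.

```python
def evaluate(tp, fp, tn, fn):
    """Return PR, RR and F1 from the four confusion-matrix counts."""
    class_pos = tp / (tp + fp)        # accuracy on water army users
    class_neg = tn / (tn + fn)        # accuracy on normal users
    pr = (class_pos + class_neg) / 2  # assumption: simple average
    rr = tp / (tp + fn)
    f1 = 2 * pr * rr / (pr + rr)
    return {"PR": pr, "RR": rr, "F1": f1}


# Hypothetical counts, for illustration only.
scores = evaluate(tp=40, fp=10, tn=45, fn=5)
print(scores)
```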
In summary, the advantages and positive effects of the present invention are as follows:
The invention applies the ideas of artificial immunity to the detection of microblog user behavior features. A focused web crawler obtains microblog user data easily and quickly; water army behavior is characterized and defined through an analysis of user behavior features, yielding feature attributes that effectively distinguish the new type of water army from normal users; finally, the signal processing mechanism of artificial immunity danger theory is applied to water army detection, using the core algorithm of danger theory, the dendritic cell algorithm (DCA), to detect water army users on the microblog platform.
The invention draws on the ideas of the biological immune system and proposes detecting microblog water army users with the DCA algorithm of danger theory. It analyzes the behavior features of water army users on Sina Weibo and judges whether water army behavior is present from the differences between normal users and water army users in features such as reposts, comments and sunlight credit. Compared with traditional water army detection methods, the invention has the following advantages:
(1) The invention analyzes water army behavior features in depth and defines the user features on that basis. The approach is dynamic and adaptive and, compared with traditional detection based on content features, more effectively discovers the new type of water army that has no clearly recognizable features.
(2) The invention obtains microblog user data with a Python-based focused web crawler and stores the data in a database in structured form. This approach makes the data set easier to obtain, collects the various kinds of user behavior data in a reasonable way, and offers a short crawling period and high data quality.
(3) The invention comprehensively analyzes 14 user behavior features of a user's microblog account, such as follower count, following count and total number of microblogs, fuses them into six indicators, and detects microblog water armies with the DCA algorithm of artificial immunity danger theory. The feature analysis is more comprehensive, data processing is efficient, and no large data set is needed for training.
(4) The invention uses the data set obtained with the focused web crawler strategy and detects microblog water armies with the DCA algorithm, comparing three indicators in the experimental results: accuracy rate, recall rate and harmonic mean. The content-feature-based and user-feature-based detection algorithms mentioned above are used for comparison. The experimental results show that the content-feature-based algorithm identifies the new type of water army with low accuracy; the user-feature-based algorithm achieves higher accuracy, but its recall is low when processing large volumes of data; the detection method of the invention (the DCA algorithm) has good applicability and higher accuracy in detecting the new type of water army.
Brief description of the drawings
Fig. 1 is a flowchart of the microblog water army detection method based on artificial immunity danger theory provided by an embodiment of the present invention.
Fig. 2 is a schematic diagram of the microblog water army detection system based on artificial immunity danger theory provided by an embodiment of the present invention.
In the figure: 1, microblog data acquisition module; 2, feature selection module; 3, antigen signal definition module; 4, DCA-based microblog water army detection module.
Fig. 3 compares the experimental results of the microblog water army detection method based on artificial immunity danger theory provided by an embodiment of the present invention with those of other water army detection methods.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to embodiments. It should be understood that the specific embodiments described here merely illustrate the invention and are not intended to limit it.
DT (Danger Theory): danger theory, one of the theories studied in artificial immune systems;
DCA (Dendritic Cell Algorithm): dendritic cell algorithm;
PAMP (Pathogen-Associated Molecular Patterns): pathogen-associated molecular pattern;
DS (Danger Signal): danger signal;
SS (Safe Signal): safe signal;
IS (Inflammatory Signal): inflammatory signal;
CSM (Costimulatory Molecules): costimulatory molecules.
DCA algorithm principle: the DCA algorithm is modeled mainly on the function of dendritic cells as antigen-presenting cells. It has four kinds of input signal: (1) the PAMP signal (pathogen-associated molecular pattern); (2) the DS signal (danger signal); (3) the SS signal (safe signal), the signal produced by natural cell death, which represents normal behavior in the system; (4) the IS signal (inflammatory signal). After the input signals are fused by the relevant function and the weight matrix, three signals are output: (1) CSM (costimulatory molecules): this value is used to decide when an immature DC cell begins to differentiate; when CSM exceeds the migration threshold, the immature DC begins to differentiate into a semi-mature or mature DC; (2) the semi-mature DC signal (semi-mature): indicates how safe the current cell environment is, and all antigens collected by such a DC are presented as safe antigens; (3) the mature DC signal (mature): indicates how dangerous the current cell environment is, and all antigens collected by such a DC are presented as dangerous antigens. When an antigen reaches the discrimination count, the mature context antigen value MCAV, which represents the antigen's degree of abnormality, is calculated (MCAV = number of times the antigen was labelled dangerous / total number of times the antigen was labelled).
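The count-based MCAV definition in the paragraph above can be sketched as a small helper. The input format, a sequence of (antigen id, context) pairs with the context strings "mature" and "semi-mature", is a hypothetical choice made for this example.

```python
from collections import Counter


def mcav_by_antigen(presentations):
    """Return MCAV per antigen: times labelled dangerous / total times labelled.

    presentations: iterable of (antigen_id, context) pairs, where context is
    "mature" (dangerous) or "semi-mature" (safe).
    """
    total = Counter()
    mature = Counter()
    for antigen_id, context in presentations:
        total[antigen_id] += 1
        if context == "mature":
            mature[antigen_id] += 1
    return {antigen_id: mature[antigen_id] / total[antigen_id] for antigen_id in total}
```

An antigen whose MCAV exceeds the anomaly threshold would then be labelled abnormal, as described in step 4.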
Below with reference to instance analysis, the invention will be further described.
Microblog water army detection method provided in an embodiment of the present invention based on artificial immunity danger theory, comprising:
Step 1, the acquisition of microblog data
The present invention uses focused web crawler, realizes that the user information for microblogging crawls.Crawling process mainly includes mould
It is quasi- to log in, obtain station address link and HTML code parsing.
(1) simulation logs in: selecting " https: //login.sina.com.cn/signup/signin.php " as simulation
The address logged in is gone weibo.com or weibo.cn to authenticate, is then obtained using the cookie after the network address authenticates successfully
The cookie of weibo.com or weibo.cn certification, realizes Session session, and Session mechanism passes through Cookie and URL weight
It is realistic now to log in.
(2) obtain station address link: the division according to Sina weibo to user authentication type, have ordinary user (without
Sina's certification), personal authentication user's (being identified as yellow V or gold V etc.), enterprise institution authenticate user (being identified as blue V) etc., inhomogeneity
The user home page or the second level page of type certification have different URL link templates, and to protect privacy of user, hereinafter " * * * " is marked
Know the UID or sensitive data that position is user, is exemplified below.
Https: //weibo.com/u/***: unverified and personal authentication individual subscriber homepage link;
Https: //weibo.com/***: having authenticated the individual subscriber homepage link of enterprise, group, mechanism;
Https: //weibo.com/p/xxxxx***/info? mod=pedit_more: the message details of all types of user
Page link, wherein " xxxxx " is the character string of 5 0-9, different user has different character strings.It is captured by Fiddler
Ajax request is to the path for storing this page, by analysis it is found that the path string in link is that 5 bit digitals add user
UID composition, therefore can be linked according to this rule creation user's details page.
(3) HTML code parsing: after the simulated login and the definition of the target URL, the crawler saves the HTML under that link. Because the content of the HTML file is complex, the position of the required information has to be located by keyword screening. The urllib and urllib2 libraries that ship with Python can be used to perform various parsing operations on the HTML of the URL, or Scrapy, an advanced crawler development framework for Python, can be used to locate information in the HTML page. Scrapy can be adapted to the user's needs and can scrape web page information quickly and accurately.
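Keyword-based positioning in a saved page can be illustrated with a short sketch. Note that it uses Python's standard-library `html.parser` rather than urllib2 or Scrapy, purely to stay self-contained; the class name, the HTML fragment, and the "label followed by value" layout are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class KeywordLocator(HTMLParser):
    """Collect the text node that follows a label containing a keyword."""

    def __init__(self, keyword: str):
        super().__init__()
        self.keyword = keyword
        self.found = []              # values located after the keyword
        self._expect_value = False   # True right after a matching label

    def handle_data(self, data: str) -> None:
        text = data.strip()
        if not text:
            return
        if self._expect_value:
            self.found.append(text)
            self._expect_value = False
        elif self.keyword in text:
            self._expect_value = True


# Hypothetical fragment of a saved details page
html = (
    "<ul><li><span>Nickname:</span><span>alice</span></li>"
    "<li><span>Registered:</span><span>2015-03-01</span></li></ul>"
)
locator = KeywordLocator("Registered")
locator.feed(html)
print(locator.found)
```

A real page would need selectors tuned to its actual markup; the point here is only the keyword-then-value screening step.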
Step 2, feature selection
The present invention extracts 14 user behavior features: number of fans, number of followed accounts, total number of microblogs, number of original microblogs, whether certified, microblog level, whether a profile is present, registration time, sunlight credit, number of mutual follows, number of topics participated in, number of comments, number of forwards, and number of likes. Through repeated comparative experiments and summarization, the 14 original features are fused into 6 indexes: sunlight credit, liveness, identity evaluation, influence, fan-to-following ratio, and original microblog ratio. The fusion process is as follows:
(1) Sunlight credit (SC) is divided into 5 levels: extremely low (300-419), low (420-450), general (451-570), good (571-690), and excellent (691-900), represented in the fusion by the values 1-5 respectively;
(2) Liveness (AT) involves several attribute variables: total number of microblogs (M), number of topics participated in (T), registration time (Z), and current time (N), where the result of "N-Z" is measured in days. It is calculated as follows:
AT=(0.7M+0.3T)/(N-Z);
(3) Identity evaluation (IE) involves several attribute variables: whether a profile is present (I), whether certified (C), and level (G). The weights of the attributes are 0.2, 0.4, and 0.4 respectively. It is calculated as follows:
IE=0.2I+0.4C+0.4G;
(4) Influence (CI) involves several attribute variables: number of comments (J), number of forwards (R), and number of likes (F), i.e. the numbers of times the microblogs posted by the user have been commented on, forwarded, and liked. The weights of the attributes are 0.3, 0.5, and 0.2 respectively. It is calculated as follows:
CI=0.3J+0.5R+0.2F;
(5) Fan-to-following ratio (FF) is the ratio of each user's number of fans (Fans) to the number of accounts the user follows (Followers). It is calculated as follows:
FF=Fans/Followers;
(6) Original microblog ratio (OM) is the proportion of original microblogs (Weibo_Original) in the total number of microblogs (M) sent by the user. It is calculated as follows:
OM=(Weibo_Original)/M;
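The six fusion formulas can be sketched in Python as follows. The dictionary keys, the function names, and the handling of zero denominators (returning 0.0) are assumptions for illustration; the text does not say how a zero account age, zero followed accounts, or zero microblogs are treated.

```python
from datetime import date


def sc_level(score: int) -> int:
    """Map a sunlight credit score (300-900) to the levels 1-5."""
    if not 300 <= score <= 900:
        raise ValueError("sunlight credit score must be in 300-900")
    for level, upper in enumerate((419, 450, 570, 690, 900), start=1):
        if score <= upper:
            return level


def safe_div(a: float, b: float) -> float:
    """Assumption: a zero denominator yields 0.0 (not specified in the text)."""
    return a / b if b else 0.0


def fuse(u: dict, today: date) -> dict:
    """Fuse raw user features into the 6 indexes SC, AT, IE, CI, FF, OM."""
    days = (today - u["registered"]).days  # N - Z, in days
    return {
        "SC": sc_level(u["credit"]),
        "AT": safe_div(0.7 * u["posts"] + 0.3 * u["topics"], days),
        "IE": 0.2 * u["has_profile"] + 0.4 * u["certified"] + 0.4 * u["level"],
        "CI": 0.3 * u["comments"] + 0.5 * u["forwards"] + 0.2 * u["likes"],
        "FF": safe_div(u["fans"], u["following"]),
        "OM": safe_div(u["original_posts"], u["posts"]),
    }
```

Here `has_profile` and `certified` are expected as 0 or 1, matching the binary attributes I and C in the formula for IE.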
Step 3, antigen signal definition
The 6 indexes, sunlight credit (SC), liveness (AT), identity evaluation (IE), influence (CI), fan-to-following ratio (FF), and original microblog ratio (OM), are normalized. The mapping function is as follows:
where x is the original signal value; when x ∈ [m, n] a linear mapping is applied, and when x ∈ [n, ∞) the signal takes the maximum value 10.
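A possible reading of this mapping is sketched below. Only the linear segment on [m, n] and the cap at 10 come from the text; that the linear segment runs from 0 at m to 10 at n, and that values below m map to 0, are assumptions, and the function name is hypothetical.

```python
def normalize(x: float, m: float, n: float, max_value: float = 10.0) -> float:
    """Piecewise mapping: linear on [m, n], capped at max_value for x >= n."""
    if n <= m:
        raise ValueError("n must be greater than m")
    if x >= n:
        return max_value
    if x <= m:
        # Assumption: values at or below m map to 0
        return 0.0
    return max_value * (x - m) / (n - m)
```

The bounds m and n would be chosen per index, for example from the observed range of each index in the crawled data.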
The indexes used to detect the Sina Weibo water army are mapped to the 4 input signals of the DCA algorithm as follows:
Pathogen-associated molecular pattern (PAMP): indicates that the user's behavior is abnormal and shows features of water army behavior; defined as PAMP={<SC, IE, FF>};
Danger signal (DS): indicates that the user's behavior is likely to be abnormal; this may merely be a change in normal behavior, but water army behavior is possible; defined as DS={<AT, CI, OM>};
Safe signal (SS): indicates that the user is likely to be normal and in a normal state; defined as SS={<SC, IE>};
Inflammatory signal (IS): indicates that abnormality is generally present for the current user and amplifies the PAMP, DS, and SS signals; defined as IS={<CI>}.
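The grouping of indexes into signals can be written down as a lookup table. How the indexes inside one group are combined into a single signal value is not stated in the text; the sketch below assumes a plain average of the normalized indexes, and the names `SIGNAL_INDEXES` and `build_signals` are hypothetical.

```python
# Index groups taken from the signal definitions above
SIGNAL_INDEXES = {
    "PAMP": ("SC", "IE", "FF"),
    "DS": ("AT", "CI", "OM"),
    "SS": ("SC", "IE"),
    "IS": ("CI",),
}


def build_signals(normalized: dict) -> dict:
    """Combine normalized indexes into the 4 DCA input signals.

    Assumption: each signal is the mean of its indexes.
    """
    return {
        name: sum(normalized[key] for key in keys) / len(keys)
        for name, keys in SIGNAL_INDEXES.items()
    }
```

Since SC and IE feed both PAMP and SS, a practical implementation would also have to decide the direction in which each index counts toward each signal, which this sketch leaves open.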
Step 4, microblog water army detection based on the DCA algorithm
The flow of the DCA algorithm as applied in the present invention is shown in Figure 1.
The specific implementation is as follows: microblog users are treated as antigens. First, the number of antigens to sample and the dendritic cell (DC) population are initialized. An unidentified microblog user is then selected at random from the microblog user detection sample, and the user's pathogen-associated molecular pattern signal, danger signal, safe signal, and inflammatory signal are taken as input signals. The concentrations of CSM, SEM, and MAT are calculated according to the following formula and its corresponding weight matrix, and the CSM, SEM, and MAT concentrations obtained by the DCs presenting the same antigen are accumulated.
The calculation formula of DCA algorithm is as follows:
The corresponding signal weight matrix of DCA algorithm is as follows:
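A minimal sketch of how the output concentrations might be computed is given here, assuming the weighted-sum form commonly given in the DCA literature: each output is the weighted sum of PAMP, DS, and SS, divided by the sum of the absolute weights and multiplied by (1+IS)/2. The division by absolute weights and the factor 1/2 are assumptions, and the values in `WEIGHTS` are illustrative placeholders, not the weight matrix of the present method.

```python
# Illustrative weights only; each row is an output, columns are (PAMP, DS, SS)
WEIGHTS = {
    "CSM": (2.0, 1.0, 2.0),
    "SEM": (0.0, 0.0, 3.0),
    "MAT": (2.0, 1.0, -3.0),
}


def dc_outputs(pamp: float, ds: float, ss: float, inflammation: float,
               weights: dict = WEIGHTS) -> dict:
    """Return the CSM, SEM, and MAT concentrations for one signal sample."""
    outputs = {}
    for name, (wp, wd, ws) in weights.items():
        weighted = wp * pamp + wd * ds + ws * ss
        scale = abs(wp) + abs(wd) + abs(ws)
        # (1 + IS) acts as the amplification term
        outputs[name] = weighted / scale * (1 + inflammation) / 2
    return outputs
```

With these placeholder weights the safe signal raises SEM and lowers MAT, so a user with a strong safe signal is pushed toward the semi-mature context.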
If CSM is greater than the migration threshold, SEM and MAT are compared, and the state of the DC and of the antigens collected by the DC is marked according to the result. If the total number of judgments for an antigen reaches the antigen discrimination threshold, the mature context antigen value (MCAV) is calculated as MCAV=MAT/(SEM+MAT) and compared with the anomaly threshold: if MCAV is larger, the antigen is marked as abnormal, i.e. the microblog user belongs to the water army; otherwise it is marked as normal.
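The decision step can be sketched as follows. This is a simplified reading, not the full algorithm: each tuple is assumed to be the accumulated (CSM, SEM, MAT) of one DC that sampled the user, a DC counts as one judgment once its CSM exceeds the migration threshold, and SEM and MAT are summed over the judging DCs before MCAV=MAT/(SEM+MAT) is evaluated. The function name and all threshold values are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def classify_antigen(dc_results: Iterable[Tuple[float, float, float]],
                     migration_threshold: float,
                     min_judgments: int,
                     anomaly_threshold: float) -> Optional[str]:
    """Label one antigen (user) from the outputs of the DCs that sampled it."""
    sem_total = mat_total = 0.0
    judgments = 0
    for csm, sem, mat in dc_results:
        if csm > migration_threshold:  # this DC migrates and gives a judgment
            judgments += 1
            sem_total += sem
            mat_total += mat
    if judgments < min_judgments or sem_total + mat_total == 0:
        return None  # not enough evidence to decide yet
    mcav = mat_total / (sem_total + mat_total)
    return "water army" if mcav > anomaly_threshold else "normal"
```

Returning `None` for undecided users keeps them in the pool, so they can be sampled again until the discrimination threshold is reached.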
As shown in Figure 2, an embodiment of the present invention provides a microblog water army detection system based on artificial immunity danger theory, comprising:
Microblog data acquisition module 1, which crawls microblog user information with a focused web crawler;
Feature selection module 2, which extracts 14 user behavior features (number of fans, number of followed accounts, total number of microblogs, number of original microblogs, whether certified, microblog level, whether a profile is present, registration time, sunlight credit, number of mutual follows, number of topics participated in, number of comments, number of forwards, and number of likes) and, through repeated comparative experiments and summarization, fuses the 14 original features into 6 indexes: sunlight credit, liveness, identity evaluation, influence, fan-to-following ratio, and original microblog ratio;
Antigen signal definition module 3, which normalizes the 6 indexes: sunlight credit SC, liveness AT, identity evaluation IE, influence CI, fan-to-following ratio FF, and original microblog ratio OM;
DCA-based microblog water army detection module 4, which treats microblog users as antigens and first initializes the number of antigens to sample and the dendritic cell (DC) population; selects an unidentified microblog user at random from the microblog user detection sample and takes the user's pathogen-associated molecular pattern signal, danger signal, safe signal, and inflammatory signal as input signals; calculates the concentrations of CSM, SEM, and MAT according to the DCA calculation formula and its corresponding weight matrix, and accumulates the CSM, SEM, and MAT concentrations obtained by the DCs presenting the same antigen; if CSM is greater than the migration threshold, compares SEM and MAT and marks the state of the DC and of the antigens collected by the DC according to the result; if the total number of judgments for an antigen reaches the antigen discrimination threshold, calculates the mature context antigen value MCAV and compares it with the anomaly threshold: if MCAV is larger, the antigen is marked as abnormal and the microblog user belongs to the water army; otherwise it is marked as normal.
The present invention is further described below with reference to experimental results.
As shown in Figure 3, the present invention obtains user data through the microblog data crawling module and detects the user data with the DCA-based water army detection module. The evaluation indexes of the experiment are precision (PR), recall (RR), and their harmonic mean F1; the higher the three indexes, the better the water army detection effect.
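For reference, the three evaluation indexes can be computed from labeled predictions as sketched below, using the standard definitions of precision, recall, and F1. The function name and the convention that 1 marks a water army user are assumptions; zero denominators are reported as 0.0.

```python
def evaluate(y_true, y_pred):
    """Return (precision, recall, F1) with 1 = water army, 0 = normal."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

F1 is the harmonic mean of the other two, so it is high only when precision and recall are both high.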
Traditional water army detection methods based on content features and on user features were compared experimentally with the DCA-based water army detection module on 3 indexes: precision, recall, and harmonic mean. The validity of the algorithm was verified and compared using real Sina Weibo user data. The experimental results show that the method of the present invention can effectively detect water army users in Sina Weibo and has a higher detection accuracy.
In the above embodiments, implementation may be wholly or partly in software, hardware, firmware, or any combination thereof. When implemented wholly or partly as a computer program product, the computer program product includes one or more computer instructions. When the computer program instructions are loaded or executed on a computer, the processes or functions described in the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, floppy disk, hard disk, or magnetic tape), an optical medium (for example, DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
The foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (10)
1. A microblog water army detection method based on artificial immunity danger theory, characterized in that the method comprises:
obtaining microblog user behavior data with a focused web crawler, and detecting microblog user behavior features using artificial immunity;
analyzing user behavior features to define water army behavior, and distinguishing the characteristic attributes of the new type of network water army from those of normal users;
detecting water army user behavior in the microblog using the dendritic cell algorithm (DCA) of the artificial immunity danger theory.
2. The microblog water army detection method based on artificial immunity danger theory according to claim 1, characterized in that the method specifically comprises:
Step 1, acquisition of microblog data: a focused web crawler is used to crawl microblog user information;
Step 2, feature selection: 14 user behavior features are extracted (number of fans, number of followed accounts, total number of microblogs, number of original microblogs, whether certified, microblog level, whether a profile is present, registration time, sunlight credit, number of mutual follows, number of topics participated in, number of comments, number of forwards, and number of likes), and through repeated comparative experiments and summarization the 14 original features are fused into 6 indexes: sunlight credit, liveness, identity evaluation, influence, fan-to-following ratio, and original microblog ratio;
Step 3, antigen signal definition: the 6 indexes, sunlight credit SC, liveness AT, identity evaluation IE, influence CI, fan-to-following ratio FF, and original microblog ratio OM, are normalized; the mapping function is as follows: where x is the original signal value; when x ∈ [m, n] a linear mapping is applied, and when x ∈ [n, ∞) the signal takes the maximum value 10;
Step 4, microblog water army detection based on the DCA algorithm: microblog users are treated as antigens; first, the number of antigens to sample and the dendritic cell (DC) population are initialized; an unidentified microblog user is selected at random from the microblog user detection sample, and the user's pathogen-associated molecular pattern signal, danger signal, safe signal, and inflammatory signal are taken as input signals;
the concentrations of CSM, SEM, and MAT are calculated according to the following formula and its corresponding weight matrix, and the CSM, SEM, and MAT concentrations obtained by the DCs presenting the same antigen are accumulated;
The calculation formula of the DCA algorithm is as follows:
In the formula, (1+IS) is the amplification signal; the values and weights of the input signals PAMP, DS, and SS are CP, CD, CS and WP, WD, WS respectively; the values of the output signals CSM, SEM, and MAT are C[CSM], C[SEM], and C[MAT] respectively;
The CSM, SEM, and MAT values are calculated from the input signal values and the weight matrix and are accumulated. If CSM is greater than the migration threshold, SEM and MAT are compared, and the state of the DC and of the antigens collected by the DC is marked according to the result. If the total number of judgments for an antigen reaches the antigen discrimination threshold, the mature context antigen value MCAV is calculated as MCAV=MAT/(SEM+MAT), where SEM and MAT are the values of the output signals SEM and MAT; MCAV is compared with the anomaly threshold: if MCAV is larger, the antigen is marked as abnormal and the microblog user belongs to the water army; otherwise it is marked as normal.
3. The microblog water army detection method based on artificial immunity danger theory according to claim 2, characterized in that, in step 1, the crawling method comprises simulated login, obtaining user address links, and HTML code parsing;
(1) Simulated login: logging in after authentication at the login address succeeds;
(2) Obtaining user address links: Sina Weibo divides users by certification type into ordinary users without Sina certification, personally certified users marked with a yellow V or gold V, and enterprise or institution certified users marked with a blue V; the user homepage or second-level pages of each certification type follow different URL link templates;
(3) HTML code parsing: after the simulated login and the definition of the target URL, the urllib and urllib2 libraries that ship with Python are used to perform various parsing operations on the HTML of the URL, or Scrapy, an advanced crawler development framework for Python, is used to locate information in the HTML page; the web page information is then scraped.
4. The microblog water army detection method based on artificial immunity danger theory according to claim 2, characterized in that, in step 2, the fusion method comprises:
1) Sunlight credit SC is divided into the levels extremely low 300-419, low 420-450, general 451-570, good 571-690, and excellent 691-900, represented in the fusion by the values 1-5 respectively;
2) Liveness AT involves the total number of microblogs M, the number of topics participated in T, the registration time Z, and the current time N, where the result of "N-Z" is measured in days; it is calculated as follows:
AT=(0.7M+0.3T)/(N-Z);
3) Identity evaluation IE involves whether a profile is present I, whether certified C, and level G; the weights of the attributes are 0.2, 0.4, and 0.4 respectively; it is calculated as follows:
IE=0.2I+0.4C+0.4G;
4) Influence CI involves the number of comments J, the number of forwards R, and the number of likes F, i.e. the numbers of times the microblogs posted by the user have been commented on, forwarded, and liked; the weights of the attributes are 0.3, 0.5, and 0.2 respectively; it is calculated as follows:
CI=0.3J+0.5R+0.2F;
5) Fan-to-following ratio FF is the ratio of each user's number of fans Fans to the number of accounts the user follows Followers; it is calculated as follows:
FF=Fans/Followers;
6) Original microblog ratio OM is the proportion of original microblogs Weibo_Original in the total number of microblogs M sent by the user; it is calculated as follows:
OM=(Weibo_Original)/M.
5. The microblog water army detection method based on artificial immunity danger theory according to claim 2, characterized in that, in step 3, the mapping between the indexes used to detect the Sina Weibo water army and the 4 input signals of the DCA algorithm comprises:
Pathogen-associated molecular pattern PAMP: indicates that the user's behavior is abnormal and shows features of water army behavior; defined as PAMP={<SC, IE, FF>};
Danger signal DS: indicates that the user's behavior is likely to be abnormal; this may merely be a change in normal behavior, but water army behavior is possible; defined as DS={<AT, CI, OM>};
Safe signal SS: indicates that the user is likely to be normal and in a normal state; defined as SS={<SC, IE>};
Inflammatory signal IS: indicates that abnormality is generally present for the current user and amplifies the PAMP, DS, and SS signals; defined as IS={<CI>}.
6. A computer program, characterized in that the computer program runs the microblog water army detection method based on artificial immunity danger theory according to any one of claims 1 to 5.
7. A terminal, characterized in that the terminal carries at least a controller that implements the microblog water army detection method based on artificial immunity danger theory according to any one of claims 1 to 5.
8. A computer-readable storage medium comprising instructions that, when run on a computer, cause the computer to execute the microblog water army detection method based on artificial immunity danger theory according to any one of claims 1 to 5.
9. A microblog water army detection system based on artificial immunity danger theory that implements the microblog water army detection method based on artificial immunity danger theory according to claim 1, characterized in that the system comprises:
a microblog data acquisition module, which crawls microblog user information with a focused web crawler;
a feature selection module, which extracts 14 user behavior features (number of fans, number of followed accounts, total number of microblogs, number of original microblogs, whether certified, microblog level, whether a profile is present, registration time, sunlight credit, number of mutual follows, number of topics participated in, number of comments, number of forwards, and number of likes) and, through repeated comparative experiments and summarization, fuses the 14 original features into 6 indexes: sunlight credit, liveness, identity evaluation, influence, fan-to-following ratio, and original microblog ratio;
an antigen signal definition module, which normalizes the 6 indexes: sunlight credit SC, liveness AT, identity evaluation IE, influence CI, fan-to-following ratio FF, and original microblog ratio OM;
a DCA-based microblog water army detection module, which treats microblog users as antigens and first initializes the number of antigens to sample and the dendritic cell (DC) population; selects an unidentified microblog user at random from the microblog user detection sample and takes the user's pathogen-associated molecular pattern signal, danger signal, safe signal, and inflammatory signal as input signals; calculates the concentrations of CSM, SEM, and MAT according to the DCA calculation formula and its corresponding weight matrix, and accumulates the CSM, SEM, and MAT concentrations obtained by the DCs presenting the same antigen; if CSM is greater than the migration threshold, compares SEM and MAT and marks the state of the DC and of the antigens collected by the DC according to the result; if the total number of judgments for an antigen reaches the antigen discrimination threshold, calculates the mature context antigen value MCAV and compares it with the anomaly threshold: if MCAV is larger, the antigen is marked as abnormal and the microblog user belongs to the water army; otherwise it is marked as normal.
10. A microblog network platform, characterized in that the microblog network platform carries at least the microblog water army detection system based on artificial immunity danger theory according to claim 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810950560.1A CN109558555B (en) | 2018-08-20 | 2018-08-20 | Microblog water army detection method and detection system based on artificial immune hazard theory |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109558555A true CN109558555A (en) | 2019-04-02 |
CN109558555B CN109558555B (en) | 2020-05-05 |
Family
ID=65864492
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810950560.1A Active CN109558555B (en) | 2018-08-20 | 2018-08-20 | Microblog water army detection method and detection system based on artificial immune hazard theory |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109558555B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |