CN108985391A - Behavior-based steganographer detection method - Google Patents
- Publication number
- CN108985391A (application CN201810996553.5A)
- Authority
- CN
- China
- Prior art keywords
- image
- steganographer
- behavior
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a behavior-based steganographer detection method, comprising: selecting a certain number of users from a social platform, crawling N consecutive images for each user, and using the images of one part of the users as training data and the rest as test data; randomly selecting images from the training data to simulate steganographer behavior and generate steganographer data; extracting behavioral features from the training data and the steganographer data respectively, and training a binary classifier with the extracted features; testing the binary classifier with the test data, and using the tested classifier to detect newly input images, thereby determining whether the user who sent them is a normal user or a steganographer. With this method, steganographers can be detected accurately.
Description
Technical field
The present invention relates to the fields of social network security and steganalysis, and in particular to a behavior-based steganographer detection method.
Background
The purpose of steganalysis is to detect whether an image has been modified by steganography. Steganalysis of a single image is usually treated as a binary classification problem that distinguishes cover images from stego images, and one of its key issues is designing effective features that reflect the influence of message embedding on the statistical properties of the cover. The rich model steganalysis features and the selection-channel attack model proposed by Fridrich et al. greatly improved single-image steganalysis performance. In recent years, with the development of deep learning, CNNs, RNNs, ResNets, and GANs have also been increasingly used for steganalysis.
Although steganalysis keeps advancing, current research is all based on laboratory conditions: the images are generally natural images, and training the classifier requires a matching embedding rate and embedding algorithm. In practice this requirement usually cannot be met. First, the embedding rate and embedding algorithm of an image are unknown. In addition, on a real social platform, the content and the noise sources of the images users send are diverse, which exposes this kind of supervised learning to various mismatch problems. Even steganalysis features with tens of thousands of dimensions struggle to be effective in real scenarios. To address this situation, Ker proposed the concept of steganographer detection, in which detection is carried out per user who sends images rather than per single image. Steganographer detection generally uses unsupervised learning. Ker first proposed detecting steganographers by clustering, and in 2014 applied the local outlier factor (LOF) from anomaly detection to steganographer detection. In 2016, Li et al. proposed methods based on hierarchical clustering and clustering ensembles. Zheng et al. attempted to extract steganalysis features with a deep neural network for steganographer detection. Although these methods avoid the mismatch problem of supervised learning, the features they use are all traditional low-dimensional steganalysis features, and in essence they still decide based on whether a steganographic modification has been made, so their performance varies with the data. Fig. 1 shows experimental results of the LOF method proposed by Ker on BossBase and Twitter data. The abscissa is the embedding rate, and the ordinate is the average rank of the steganographer's LOF value within a group of 100; the closer the rank is to the top, the better the detection. As the figure shows, the results differ widely between the BossBase and Twitter data, so the method is strongly affected by the image source. When the embedding rate is low the average rank reaches 50, which means the steganographer is essentially not detected.
A complete communication process using stego images should include selecting the cover images, allocating the embedding rate, selecting the embedding algorithm, and finally embedding the message and sending the images. In a social scenario, multi-dimensional behavioral information is involved, such as the frequency of communication, the communication partner, and the content correlation of the images sent. Current steganography, however, focuses on security in a single dimension only, namely making cover and stego images indistinguishable. We surveyed hundreds of steganography programs, and almost all of them focus only on improving the steganographic algorithm without considering the information leaked by the user's other behavior during the whole communication process, for example the correlation of the images sent. Existing steganography software does not select covers on the user's behalf; the more user-friendly programs either pick a cover at random for the user or let the user use an image they have just taken. A user without professional knowledge, however, is likely to choose images at random as covers in order to save time and effort. In this case, using the user's behavioral information to detect steganographers fundamentally changes steganographer detection.
Summary of the invention
The object of the present invention is to provide a behavior-based steganographer detection method that can accurately detect steganographers.
The object of the present invention is achieved through the following technical solution:
A behavior-based steganographer detection method, comprising:
selecting a certain number of users from a social platform, crawling N consecutive images for each user, and using the images of one part of the users as training data and the rest as test data;
selecting the images of some users as the steganographer database, and randomly drawing a certain number of images from it to simulate steganographer behavior and generate steganographer data;
extracting behavioral features from the training data and the steganographer data respectively, and training a binary classifier with the extracted features;
testing the binary classifier with the test data, and using the tested binary classifier to detect newly input images, thereby determining whether the user who sent the newly input images is a normal user or a steganographer.
As can be seen from the above technical solution, features that reflect the correlation between images are used as behavioral features and, combined with a binary classifier, allow steganographers to be detected accurately. Moreover, the diversity of behavioral information offers multiple perspectives for steganographer detection: on the one hand, steganographers can be detected more reliably; on the other hand, it encourages steganography software to take behavioral information into account, leading to more user-friendly and more secure steganography software.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 shows the experimental results, given in the background section, of the LOF method proposed by Ker on BossBase and Twitter data;
Fig. 2 is a flowchart of a behavior-based steganographer detection method provided by an embodiment of the present invention;
Fig. 3 is a flowchart of behavioral feature extraction provided by an embodiment of the present invention;
Fig. 4 shows experimental results of the scheme of the present invention provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
An embodiment of the present invention provides a behavior-based steganographer detection method, which mainly includes the following steps:
1. Select a certain number of users from a social platform, crawl N consecutive images for each user, and use the images of one part of the users as training data and the rest as test data.
For example, Twitter can be chosen as the social platform and tweepy as the crawling tool. In practice, tweepy can be used to crawl 2000 Twitter users; users with fewer than 100 images are screened out, 700 users are retained, and 100 consecutive images are kept for each user.
In the embodiment of the present invention, after N consecutive images have been crawled for each user, each image is cropped to a specified size m × n using the resize function of MATLAB. For example, the size can be set to 512 × 512.
The ratio between training data and test data can be set according to the actual situation.
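The user filtering and the per-user split described in step 1 can be sketched as follows. This is only an illustration under stated assumptions: the function names `select_users` and `split_users`, the 50/50 default split, and the dictionary layout (user id mapped to a list of images in posting order) are hypothetical and do not come from the patent, which leaves the split ratio open.

```python
import random

def select_users(user_images, n_images=100, n_users=700):
    """Drop users with fewer than n_images images; keep n_images consecutive images each."""
    eligible = {
        user: images[:n_images]          # first n_images, assumed to be in posting order
        for user, images in user_images.items()
        if len(images) >= n_images
    }
    kept = sorted(eligible)[:n_users]    # deterministic choice of at most n_users users
    return {user: eligible[user] for user in kept}

def split_users(users, train_fraction=0.5, seed=0):
    """Split by user, so that no user contributes to both training and test data."""
    ids = sorted(users)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_fraction)  # train_fraction is an assumed, tunable ratio
    return ids[:cut], ids[cut:]
```

Splitting by user rather than by image keeps every image sequence intact, which matters because the later features are computed over whole sequences.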
2. From the crawled images, select the images of some users as the steganographer database, and randomly draw a certain number of images from it to simulate steganographer behavior and generate steganographer data.
Because real steganographer data cannot be obtained in practice, in the embodiment of the present invention a portion of the crawled images is randomly selected to simulate steganographers in the experiments, in order to verify the effectiveness of the method.
Likewise, the number of images randomly drawn from the steganographer database can be set to 100.
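A simulated steganographer in the sense of step 2 can be sketched as a random draw from a pooled image database. The helper names `build_pool` and `simulate_steganographer` are hypothetical, and drawing without replacement is an assumption; the patent only states that a certain number of images is selected at random.

```python
import random

def build_pool(users):
    """Pool the images of the users chosen as the steganographer database."""
    return [image for images in users.values() for image in images]

def simulate_steganographer(pool, n_images=100, rng=None):
    """Draw n_images distinct images at random; the draw order is taken as the sending order."""
    rng = rng or random.Random()
    return rng.sample(pool, n_images)   # assumption: sampling without replacement
```

Because the images are drawn independently of one another, consecutive images in the simulated sequence have no particular relation, which is the behavioral trace the features are meant to capture.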
3. Extract behavioral features from the training data and the steganographer data respectively, and train a binary classifier with the extracted features.
In the embodiment of the present invention, the training data consists of the images of one part of the users and the steganographer data likewise consists of the steganographers' images, and features are extracted from both in the same way. For each user or steganographer, features that reflect the correlation between images are extracted from the corresponding image sequence as behavioral features. The extraction procedure is shown in Fig. 3, and its main steps are as follows:
1) For the image sequence of each user or steganographer, compute the difference between the gray-level histograms of adjacent images to form the difference matrix:
$$d_{i,i-1} = \mathrm{abs}(h_i - h_{i-1})$$
where $h_i$ and $h_{i-1}$ denote the gray-level histograms of the $i$-th and $(i-1)$-th images.
2) Quantize the difference matrix: first take the logarithm, then truncate to the interval $[0, T]$:
$$d' = \mathrm{trunc}_T\big(\mathrm{round}(\log d_{i,i-1})\big)$$
3) Using first-order and second-order statistics over all difference matrices, compute the frequency feature $P$ and the co-occurrence matrix $C$ of $d'$:
$$P = [p_1, \ldots, p_{T+1}]$$
Each entry of $P$ is the frequency with which one of the $T+1$ quantized values occurs among the elements of $d'$, and each entry $c_{l,j}$ of $C$ is the frequency with which two adjacent elements $d'_k$ and $d'_{k+1}$ take the values $l$ and $j$. Here $d'_k$ and $d'_{k+1}$ denote the $k$-th and $(k+1)$-th elements of $d'$, and $m$ and $n$ are the length and width of the image. The entries $c_{l,j}$ and $c_{j,l}$ of the co-occurrence matrix $C$ represent similar correlation and are therefore merged, giving the merged co-occurrence matrix $C'$.
The frequency feature $P$ and the merged co-occurrence matrix $C'$ are concatenated into the final feature, i.e. the behavioral feature:
$$F = [P \;\; C']$$
The feature dimension is:
$$|F| = (T+1) + \frac{(T+1)(T+2)}{2}$$
For example, with $T = 12$ the behavioral feature has dimension $|F| = 13 + 91 = 104$.
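The feature extraction in step 3 can be sketched in plain Python as follows. Several details are assumptions, not statements from the patent: images are given as flat lists of 8-bit gray values, the logarithm is the natural logarithm, a zero difference is mapped to 0 before truncation, and both statistics are normalized to relative frequencies. The function names are hypothetical.

```python
import math

def gray_histogram(pixels, bins=256):
    """Gray-level histogram of one image given as a flat list of integers in [0, bins)."""
    hist = [0] * bins
    for value in pixels:
        hist[value] += 1
    return hist

def quantized_difference(h_cur, h_prev, T=12):
    """d' = trunc_T(round(log |h_cur - h_prev|)), computed per histogram bin."""
    quantized = []
    for a, b in zip(h_cur, h_prev):
        d = abs(a - b)
        q = round(math.log(d)) if d > 0 else 0   # assumption: log(0) is mapped to 0
        quantized.append(min(max(q, 0), T))      # truncate to [0, T]
    return quantized

def behavior_feature(image_sequence, T=12):
    """Concatenate first-order frequencies P and the merged co-occurrence matrix C'."""
    hists = [gray_histogram(image) for image in image_sequence]
    freq = [0] * (T + 1)
    cooc = [[0] * (T + 1) for _ in range(T + 1)]
    n_first = n_second = 0
    for prev, cur in zip(hists, hists[1:]):      # adjacent images in the sequence
        dq = quantized_difference(cur, prev, T)
        for value in dq:
            freq[value] += 1
        n_first += len(dq)
        for a, b in zip(dq, dq[1:]):             # adjacent elements d'_k, d'_{k+1}
            cooc[a][b] += 1
        n_second += len(dq) - 1
    P = [count / n_first for count in freq]
    C_merged = []
    for l in range(T + 1):
        for j in range(l, T + 1):                # merge c[l][j] with c[j][l]
            total = cooc[l][j] if l == j else cooc[l][j] + cooc[j][l]
            C_merged.append(total / n_second)
    return P + C_merged
```

With `T=12` the returned list has 13 first-order entries and 91 merged co-occurrence entries, 104 in total, which agrees with the dimension stated in the text.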
4. Test the binary classifier with the test data, and use the tested binary classifier to detect newly input images, thereby determining whether the user who sent them is a normal user or a steganographer.
After the binary classifier has been trained as described above, it is tested with the test data; a classifier that passes the test can then be used to classify and detect steganographers.
Those skilled in the art will understand that behavioral features must also be extracted in the test phase and when detecting newly input images. The extracted behavioral features are fed to the binary classifier, which outputs whether they correspond to a normal user or a steganographer.
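The text does not name a specific binary classifier. Purely to illustrate the train-then-predict flow on behavioral feature vectors, the sketch below uses a nearest-centroid rule as a stand-in; it is not the classifier of the patent, and the names `train_nearest_centroid` and `predict` are hypothetical.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def train_nearest_centroid(normal_features, stego_features):
    """Stand-in 'training': store one centroid per class."""
    return centroid(normal_features), centroid(stego_features)

def predict(model, feature):
    """Return 1 for steganographer, 0 for normal user."""
    c_normal, c_stego = model
    d_normal = sum((a - b) ** 2 for a, b in zip(feature, c_normal))
    d_stego = sum((a - b) ** 2 for a, b in zip(feature, c_stego))
    return 1 if d_stego < d_normal else 0
```

Any off-the-shelf binary classifier could take the place of this rule; the only interface the method relies on is a feature vector in and a two-class label out.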
In addition, for steganographers who have some degree of behavioral security awareness, the mismatch between training data and test data is reduced by changing the composition of the training data. Steganographers are divided into behavioral security levels according to the proportion of random images among the images they send. P% denotes a steganographer with no behavioral security awareness, i.e. all images sent are random; (P-Q)% denotes a steganographer with some behavioral security awareness, among whose images Q% are sent in the order of a normal user and (P-Q)% are randomly selected; and so on, giving multiple behavioral security levels. For example, 100% denotes a steganographer with no behavioral security awareness, whose images are all random; 90% denotes a steganographer with some awareness, for whom 10% of the images are sent in the order of a normal user and 90% are randomly selected. Continuing in this way gives 10 levels of steganographer.
When training the classifier, steganographers of the different security levels are mixed in equal numbers to form the training set, so that steganographers of unknown behavioral security level can be detected accurately. For example, the 1000 steganographers in the training set comprise 100 "10%" steganographers, 100 "20%" steganographers, ..., and 100 "100%" steganographers.
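The mixed training set can be sketched as follows. How random and normally ordered images are interleaved within one sequence is not specified in the text, so the sketch assumes that a random subset of positions in a normal user's sequence is overwritten with images drawn from a pool. The function names are hypothetical, and levels are given as integer percentages of random images.

```python
import random

def steganographer_at_level(normal_sequence, pool, random_percent, rng):
    """Overwrite random_percent % of a normal sequence with random pool images."""
    n = len(normal_sequence)
    n_random = n * random_percent // 100
    random_positions = set(rng.sample(range(n), n_random))  # assumed placement rule
    return [
        rng.choice(pool) if i in random_positions else image
        for i, image in enumerate(normal_sequence)
    ]

def mixed_training_set(normal_sequences, pool, levels=range(10, 101, 10),
                       per_level=100, seed=0):
    """Equal numbers of simulated steganographers at every security level."""
    rng = random.Random(seed)
    training = []
    for level in levels:
        for _ in range(per_level):
            base = rng.choice(normal_sequences)
            training.append((level, steganographer_at_level(base, pool, level, rng)))
    return training
```

With the defaults this yields 10 levels of 100 steganographers each, matching the 1000-steganographer example in the text.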
Experiments were also carried out to demonstrate the detection performance of the above scheme of the present invention. The results are shown in Table 1 and Fig. 4.
Table 1: Results of mixed training and testing
The experiment in Table 1 addresses steganographers who have some degree of behavioral security awareness: a mixed classifier is trained by changing the composition of the training set and is then tested separately on steganographers of each security level (10%, 20%, ..., 100%), giving the missed detection probability for each level.
Fig. 4 illustrates that the method is robust to the number of images. Experiments were run with different numbers of images, while ensuring that the number of images selected per user was the same for the test set and the training set and for normal users and steganographers. With 10, 20, ..., 100 images per user respectively, the average error probability (the mean of the false alarm rate and the missed detection rate) was obtained.
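The average error probability mentioned above is the mean of the false alarm rate and the missed detection rate. A minimal sketch, assuming labels and predictions are encoded as 1 for steganographer and 0 for normal user (the encoding and the function name `error_rates` are assumptions):

```python
def error_rates(labels, predictions):
    """Return (false alarm rate, missed detection rate, average error probability)."""
    normal_preds = [p for y, p in zip(labels, predictions) if y == 0]
    stego_preds = [p for y, p in zip(labels, predictions) if y == 1]
    p_fa = sum(1 for p in normal_preds if p == 1) / len(normal_preds)  # normal flagged
    p_md = sum(1 for p in stego_preds if p == 0) / len(stego_preds)    # steganographer missed
    return p_fa, p_md, (p_fa + p_md) / 2
```

For instance, one false alarm among four normal users and two misses among four steganographers give rates of 0.25 and 0.5 and an average error probability of 0.375.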
Through the above description of the embodiments, those skilled in the art can clearly understand that the above embodiments can be implemented in software, or in software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the above embodiments can be embodied as a software product, which can be stored on a non-volatile storage medium (such as a CD-ROM, USB flash drive, or portable hard disk) and includes instructions that cause a computer device (such as a personal computer, server, or network device) to execute the methods described in the embodiments of the present invention.
The above is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited to it. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (4)
1. A behavior-based steganographer detection method, characterized by comprising:
selecting a certain number of users from a social platform, crawling N consecutive images for each user, and using the images of one part of the users as training data and the rest as test data;
selecting the images of some users as the steganographer database, and randomly drawing a certain number of images from it to simulate steganographer behavior and generate steganographer data;
extracting behavioral features from the training data and the steganographer data respectively, and training a binary classifier with the extracted features;
testing the binary classifier with the test data, and using the tested binary classifier to detect newly input images, thereby determining whether the user who sent the newly input images is a normal user or a steganographer.
2. The behavior-based steganographer detection method according to claim 1, characterized in that, after N consecutive images have been crawled for each user, each image is cropped to a specified size using the resize function of MATLAB.
3. The behavior-based steganographer detection method according to claim 1, characterized in that extracting the behavioral features comprises:
the training data consists of the images of one part of the users and the steganographer data likewise consists of the steganographers' images, and features are extracted from both in the same way; for each user or steganographer, features that reflect the correlation between images are extracted from the corresponding image sequence as behavioral features; the extraction procedure is as follows:
for the image sequence of each user or steganographer, computing the difference between the gray-level histograms of adjacent images to form the difference matrix:
$$d_{i,i-1} = \mathrm{abs}(h_i - h_{i-1})$$
where $h_i$ and $h_{i-1}$ denote the gray-level histograms of the $i$-th and $(i-1)$-th images;
quantizing the difference matrix by first taking the logarithm and then truncating to the interval $[0, T]$:
$$d' = \mathrm{trunc}_T\big(\mathrm{round}(\log d_{i,i-1})\big)$$
using first-order and second-order statistics over all difference matrices, computing the frequency feature $P$ and the co-occurrence matrix $C$ of $d'$:
$$P = [p_1, \ldots, p_{T+1}]$$
where $d'_k$ and $d'_{k+1}$ denote the $k$-th and $(k+1)$-th elements of $d'$, and $m$ and $n$ are the length and width of the image;
the entries $c_{l,j}$ and $c_{j,l}$ of the co-occurrence matrix $C$ represent similar correlation and are merged, giving the merged co-occurrence matrix $C'$;
the frequency feature $P$ and the merged co-occurrence matrix $C'$ are concatenated into the final feature, i.e. the behavioral feature:
$$F = [P \;\; C']$$
the feature dimension is:
$$|F| = (T+1) + \frac{(T+1)(T+2)}{2}$$
4. The behavior-based steganographer detection method according to claim 1, characterized in that the method further comprises: dividing steganographers into behavioral security levels according to the proportion of random images among the images they send, wherein P% denotes a steganographer with no behavioral security awareness, i.e. all images sent are random; (P-Q)% denotes a steganographer with some behavioral security awareness, among whose images Q% are sent in the order of a normal user and (P-Q)% are randomly selected; and so on, giving multiple behavioral security levels; when training the classifier, steganographers of the different security levels are mixed in equal numbers to form the training set, so that steganographers of unknown behavioral security level can be detected accurately.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810996553.5A CN108985391A (en) | 2018-08-29 | 2018-08-29 | Behavior-based steganographer detection method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108985391A | 2018-12-11 |
Family
ID=64547297
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810996553.5A Pending CN108985391A (en) | Behavior-based steganographer detection method | 2018-08-29 | 2018-08-29 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108985391A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110059754A (en) * | 2019-04-22 | 2019-07-26 | 厦门大学 | Batch data steganography method, terminal device and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106203492A (en) * | 2016-06-30 | 2016-12-07 | 中国科学院计算技术研究所 | System and method for image steganalysis |
CN107808100A (en) * | 2017-10-25 | 2018-03-16 | 中国科学技术大学 | Steganalysis method for specific test samples |
Non-Patent Citations (2)
Title |
---|
Li Li et al.: "Side Channel Steganalysis: When Behavior is Considered in Steganographer Detection", Multimedia Tools and Applications * |
毛金莲: "《智能图像检索关键技术研究》", 30 June 2015, 北京理工大学出版社 * |
Similar Documents
Publication | Title |
---|---|
Qi et al. | Exploiting multi-domain visual information for fake news detection |
Rana et al. | Deepfake detection: A systematic literature review |
Ning et al. | Invisible poison: A blackbox clean label backdoor attack to deep neural networks |
CN112749608B (en) | Video auditing method, device, computer equipment and storage medium |
CN107835113A (en) | Abnormal user detection method in a kind of social networks based on network mapping |
Al-Qershi et al. | Evaluation of copy-move forgery detection: datasets and evaluation metrics |
CN110197389A (en) | A kind of user identification method and device |
Lago et al. | Visual and textual analysis for image trustworthiness assessment within online news |
Gu et al. | AnchorMF: towards effective event context identification |
Khan et al. | Digital forensics and cyber forensics investigation: security challenges, limitations, open issues, and future direction |
CN109522692B (en) | Webpage machine behavioral value method and system |
Guo et al. | Crowdstory: Fine-grained event storyline generation by fusion of multi-modal crowdsourced data |
Checco et al. | All that glitters is gold—an attack scheme on gold questions in crowdsourcing |
CN116957049A (en) | Unsupervised internal threat detection method based on countermeasure self-encoder |
Li et al. | Steganographic security analysis from side channel steganalysis and its complementary attacks |
Wenger et al. | Data isotopes for data provenance in dnns |
CN110163013A (en) | A kind of method and apparatus detecting sensitive information |
Wang et al. | Analyzing image-based political propaganda in referendum campaigns: from elements to strategies |
CN108985391A (en) | Behavior-based steganographer detection method |
Li et al. | TCM-KNN scheme for network anomaly detection using feature-based optimizations |
Hendrix et al. | Media forensics in the age of disinformation |
Duan et al. | Fed‐DNN‐Debugger: Automatically Debugging Deep Neural Network Models in Federated Learning |
CN114861177A (en) | Method and device for detecting suspicious account on social network |
CN114092141A (en) | Wool party identification method, device, equipment and storage medium |
Boenisch | Differential privacy: general survey and analysis of practicability in the context of machine learning |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181211 |