CN108985391A - Behavior-based steganographer detection method - Google Patents

Behavior-based steganographer detection method

Info

Publication number
CN108985391A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810996553.5A
Other languages
Chinese (zh)
Inventor
张卫明
俞能海
李莉
姚远志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC
Priority to CN201810996553.5A
Publication of CN108985391A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The invention discloses a behavior-based steganographer detection method, comprising: selecting a certain number of users from a social platform and crawling N consecutive images for each user, using the images of some users as training data and the rest as test data; randomly selecting images from the training data to simulate steganographer behavior and generate steganographer data; extracting behavioral features from the training data and the steganographer data respectively, and training a binary classifier with the extracted features; testing the binary classifier with the test data, and using the tested binary classifier to examine newly input images, thereby determining whether the user who sent them is a normal user or a steganographer. With this method, steganographers can be detected accurately.

Description

Behavior-based steganographer detection method
Technical field
The present invention relates to the fields of social network security and steganalysis, and in particular to a behavior-based steganographer detection method.
Background
The purpose of steganalysis is to detect whether an image has been modified by steganography. Steganalysis of a single image is usually treated as a binary classification problem that distinguishes cover images from stego images, in which designing effective features that reflect the influence of message embedding on the statistical properties of the cover is one of the key issues. The rich-model steganalysis features and the selection-channel attack model proposed by Fridrich et al. greatly improved the performance of single-image steganalysis. In recent years, with the development of deep learning, CNN, RNN, ResNet and GAN have also been increasingly used for steganalysis.
Although steganalysis keeps advancing, current research is all based on laboratory conditions: the images are generally natural images, and the embedding rate and embedding algorithm must match when the classifier is trained. In practice this requirement usually cannot be met. First, the embedding rate and embedding algorithm of an image are unknown. In addition, on a real social platform the content and the noise sources of the images sent by users are diverse, so this kind of supervised learning faces various mismatch problems. Even steganalysis features with tens of thousands of dimensions can hardly be effective in real scenarios. In response, Ker proposed the concept of steganographer detection, in which detection is carried out per user who sends images rather than per single image. Steganographer detection generally uses unsupervised learning. Ker first proposed detecting steganographers by clustering, and in 2014 applied the local outlier factor (LOF) from anomaly detection to steganographer detection. In 2016, Li et al. proposed hierarchical clustering and clustering ensemble methods. Zheng et al. attempted to extract steganalysis features with a deep neural network for steganographer detection. Although these methods avoid the mismatch problem of supervised learning, the features they use are all traditional low-dimensional steganalysis features, and in essence the decision still depends on whether a steganographic modification has been made; their performance also varies across different data.
Fig. 1 shows the experimental results of the local outlier factor (LOF) method proposed by Ker on BossBase and Twitter data. The abscissa represents the embedding rate, and the ordinate represents the average rank of the steganographer's LOF value among 100 steganographers; the higher the rank, the better the result. As the figure shows, the method behaves very differently on BossBase and Twitter data and is strongly affected by the image source. When the embedding rate is low, the average rank reaches 50, which amounts to essentially no detection.
A complete communication process using stego images should include selecting the cover images, allocating the embedding rate, selecting the embedding algorithm, and finally embedding into the images and sending them. In a social scenario, multi-dimensional behavioral information is involved, such as the frequency of communication, the communication partner, and the content correlation of the sent images. Current steganography, however, focuses on security in a single dimension only, namely making cover and stego indistinguishable. We investigated hundreds of steganography programs; almost all of them attend only to improving the steganographic algorithm and do not consider the information leaked by the user's other behaviors during the whole communication process, for example the correlation of the sent images. Existing steganography software has no function for selecting covers on the user's behalf; the relatively friendly programs can randomly select a cover for the user or let the user use an image captured in real time. A user without professional knowledge, however, is likely to select images at random as covers in order to save time and effort. In this case, using the user's behavioral information to detect steganographers will fundamentally change steganographer detection.
Summary of the invention
The object of the present invention is to provide a behavior-based steganographer detection method that can accurately detect steganographers.
The object of the present invention is achieved through the following technical solution:
A behavior-based steganographer detection method, comprising:
selecting a certain number of users from a social platform, crawling N consecutive images for each user, and using the images of some users as training data and the rest as test data;
selecting the images of some users as the steganographer database, and randomly selecting a certain number of images from it to simulate steganographer behavior and generate steganographer data;
extracting behavioral features from the training data and the steganographer data respectively, and training a binary classifier with the extracted features;
testing the binary classifier with the test data, and using the tested binary classifier to examine newly input images, thereby determining whether the user who sent them is a normal user or a steganographer.
As can be seen from the technical solution provided above, features that reflect the correlation between images are used as behavioral features and, combined with a binary classifier, allow steganographers to be detected accurately. Meanwhile, the diversity of behavioral information offers multiple perspectives for steganographer detection: on the one hand, steganographers can be detected more reliably; on the other hand, steganography software can be prompted to take behavioral information into account and be designed to be more user-friendly and more secure.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 shows the experimental results, provided in the background of the invention, of the LOF method proposed by Ker on BossBase and Twitter data;
Fig. 2 is a flowchart of a behavior-based steganographer detection method provided by an embodiment of the present invention;
Fig. 3 is a flowchart of behavioral feature extraction provided by an embodiment of the present invention;
Fig. 4 shows the experimental results of the scheme of the present invention provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
An embodiment of the present invention provides a behavior-based steganographer detection method, which mainly includes the following steps:
1. Select a certain number of users from a social platform, crawl N consecutive images for each user, and use the images of some users as training data and the rest as test data.
For example, Twitter can be chosen as the social platform and tweepy as the crawling tool. In practice, tweepy can be used to crawl 2000 Twitter users; users with fewer than 100 images are screened out, 700 users are retained, and 100 consecutive images are retained for each user.
In the embodiment of the present invention, after the N consecutive images of each user are crawled, each image is cropped to a specified size m × n using the resize function of MATLAB. For example, the size can be set to 512 × 512.
The split ratio between training data and test data can be set according to the actual situation.
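The user filtering and train/test split of step 1 can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `filter_and_split`, the `train_fraction` parameter and the toy data are assumptions, and the crawling itself (tweepy) is left out.

```python
import random

def filter_and_split(user_images, n_images=100, train_fraction=0.5, seed=0):
    """Drop users with fewer than n_images images, keep the first n_images
    consecutive images of each remaining user, then split users into
    training and test sets. train_fraction is a hypothetical parameter;
    the source only says the ratio is set as needed."""
    kept = {user: images[:n_images]
            for user, images in user_images.items()
            if len(images) >= n_images}
    users = sorted(kept)
    random.Random(seed).shuffle(users)
    n_train = int(len(users) * train_fraction)
    train = {u: kept[u] for u in users[:n_train]}
    test = {u: kept[u] for u in users[n_train:]}
    return train, test

# Toy data: user k has 30 * k placeholder "images" (plain integers here).
toy = {f"user{k}": list(range(30 * k)) for k in range(1, 9)}
train, test = filter_and_split(toy, n_images=100, train_fraction=0.5)
print(len(train), len(test))  # 2 3
```

Splitting by user rather than by image keeps every user's image sequence intact, which matters because the features are computed over consecutive images of one user.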
2. From the crawled images, select the images of some users as the steganographer database, and randomly select a certain number of images from it to simulate steganographer behavior and generate steganographer data.
Since real steganographer data cannot be obtained in practice, to verify the validity of the method, the embodiment of the present invention randomly selects part of the crawled images to simulate steganographers for the experiments.
Likewise, the number of images randomly selected from the steganographer database can also be set to 100.
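A simulated steganographer of step 2 is simply a sequence of images drawn at random from the pooled database. The sketch below assumes sampling without replacement, which the text does not specify; `simulate_steganographer` and the tuple-based image identifiers are hypothetical.

```python
import random

def simulate_steganographer(pool, n_images=100, seed=None):
    """Return n_images images drawn at random from the pool.
    Assumption: sampling is without replacement."""
    rng = random.Random(seed)
    return rng.sample(pool, n_images)

# The pool stands for the images of the users selected as the database;
# each image is represented here by a (user, index) placeholder.
pool = [(user, index) for user in range(5) for index in range(100)]
sequence = simulate_steganographer(pool, n_images=100, seed=1)
print(len(sequence))  # 100
```

Because the draw ignores the original posting order and mixes users, adjacent images in the simulated sequence are largely unrelated, which is the behavior the features are meant to expose.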
3. Extract behavioral features from the training data and the steganographer data respectively, and train a binary classifier with the extracted features.
In the embodiment of the present invention, the training data consists of the images of some users and the steganographer data consists of the images of the steganographers; features are extracted from both in the same way. For each user or steganographer, features that reflect the correlation between images are extracted from the corresponding image sequence as behavioral features. The extraction is shown in Fig. 3, and the main process is as follows:
1) For the image sequence of each user or steganographer, compute the difference between the gray-level histograms of every two adjacent images to form the difference matrix:
$d_{i,i-1} = \mathrm{abs}(h_i - h_{i-1})$;
where $h_i$ and $h_{i-1}$ denote the gray-level histograms of the i-th and (i-1)-th images, respectively;
2) Quantize the difference matrix: first apply logarithmic quantization, then truncate to the interval [0, T], expressed as:
$d' = \mathrm{trunc}_T(\mathrm{round}(\log d_{i,i-1}))$;
3) Using first-order and second-order statistics over all difference matrices, compute the frequency feature P and the co-occurrence matrix C of the distribution of d':
$P = [p_1, \ldots, p_{T+1}]$;
where $d'_k$ and $d'_{k+1}$ denote the k-th and (k+1)-th elements of d', and m and n are the length and width of the image. In the co-occurrence matrix C, the entries $c_{l,j}$ and $c_{j,l}$ represent similar correlations between pixels, so they are merged to obtain the merged co-occurrence matrix $C'$.
The frequency feature P and the merged co-occurrence matrix C' are concatenated to give the final feature, i.e., the behavioral feature:
$F = [P \; C']$;
The feature dimension is as follows:
$|F| = (T+1) + \frac{(T+1)(T+2)}{2}$;
Assuming T = 12, the dimension of the behavioral feature is |F| = 104.
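The three extraction steps above can be sketched in plain Python as follows. Several details are assumptions, because the exact formulas for the entries of P and C are not legible in this text: the natural logarithm is used, a zero difference is mapped to 0, C counts adjacent pairs (d'_k, d'_{k+1}), and both P and C' are normalized to sum to 1. All function names are hypothetical.

```python
import math
import random

def gray_histogram(image, bins=256):
    """Gray-level histogram of a 2-D list of integers in [0, bins - 1]."""
    hist = [0] * bins
    for row in image:
        for value in row:
            hist[value] += 1
    return hist

def quantized_difference(h_cur, h_prev, T=12):
    """d' = trunc_T(round(log|h_cur - h_prev|)), element by element.
    Assumptions: natural log, and a zero difference is mapped to 0."""
    out = []
    for a, b in zip(h_cur, h_prev):
        d = abs(a - b)
        q = round(math.log(d)) if d > 0 else 0
        out.append(min(max(q, 0), T))
    return out

def behavior_feature(images, T=12):
    """Concatenate the frequency feature P (T+1 values) with the merged
    co-occurrence matrix C' ((T+1)(T+2)/2 values)."""
    hists = [gray_histogram(img) for img in images]
    counts = [0] * (T + 1)
    cooc = [[0] * (T + 1) for _ in range(T + 1)]
    for prev, cur in zip(hists, hists[1:]):
        dq = quantized_difference(cur, prev, T)
        for v in dq:                      # first-order statistics
            counts[v] += 1
        for a, b in zip(dq, dq[1:]):      # second-order statistics
            cooc[a][b] += 1
    n1 = sum(counts) or 1
    n2 = sum(map(sum, cooc)) or 1
    P = [c / n1 for c in counts]
    merged = []
    for l in range(T + 1):
        for j in range(l, T + 1):         # merge c[l][j] with c[j][l]
            v = cooc[l][j] if l == j else cooc[l][j] + cooc[j][l]
            merged.append(v / n2)
    return P + merged

# Toy sequence of five random 8 x 8 gray images.
rng = random.Random(0)
images = [[[rng.randrange(256) for _ in range(8)] for _ in range(8)]
          for _ in range(5)]
F = behavior_feature(images, T=12)
print(len(F))  # 104
```

With T = 12 the vector has 13 frequency entries plus 13 · 14 / 2 = 91 merged co-occurrence entries, which matches the stated dimension of 104.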
4. Test the binary classifier with the test data, and use the tested binary classifier to examine newly input images, thereby determining whether the user who sent them is a normal user or a steganographer.
The trained binary classifier obtained in the above manner is then tested with the test data; a binary classifier that passes the test can be used for classifying and detecting steganographers.
Those skilled in the art will understand that behavioral features must also be extracted in the test phase and when examining newly input images; the extracted behavioral features are then fed to the binary classifier, which outputs whether they correspond to a normal user or a steganographer.
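The text does not say which binary classifier is used. As a stand-in only, the sketch below uses a nearest-centroid rule on toy two-dimensional features to show the train-then-predict flow; the functions `train_centroids` and `predict`, the label convention (0 for a normal user, 1 for a steganographer) and the data are all assumptions.

```python
def train_centroids(features, labels):
    """Compute one mean feature vector (centroid) per class label."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(f))
        for k, v in enumerate(f):
            acc[k] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, f):
    """Return the label whose centroid is closest in squared distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(f, c))
    return min(centroids, key=lambda y: dist2(centroids[y]))

# Toy features: label 0 = normal user, label 1 = steganographer.
X = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.3]]
y = [0, 0, 1, 1]
model = train_centroids(X, y)
pred_a = predict(model, [0.15, 0.85])
pred_b = predict(model, [0.85, 0.15])
print(pred_a, pred_b)  # 0 1
```

In a real pipeline the inputs would be the 104-dimensional behavioral features, and any standard binary classifier could take the place of this rule.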
In addition, for steganographers who have some behavioral security awareness, the mismatch between training data and test data is reduced by changing the composition of the training data. Steganographers are divided into behavioral security levels according to the proportion of random images among the images they send. P% denotes a steganographer with no behavioral security awareness, i.e., all sent images are random; (P-Q)% denotes a steganographer with some behavioral security awareness, for whom Q% of the sent images follow the order of a normal user and the remaining (P-Q)% are randomly selected images; and so on, giving multiple behavioral security levels. For example, 100% denotes a steganographer with no behavioral security awareness, i.e., all sent images are random; 90% denotes a steganographer with some behavioral security awareness, for whom 10% of the sent images follow the order of a normal user and 90% are randomly selected images. Proceeding in this way gives steganographers of 10 levels.
When training the classifier, steganographers of the different security levels are mixed in equal proportions to form the training set, so that steganographers of unknown behavioral security level can be detected accurately. For example, the 1000 steganographers in the training set comprise 100 "10%" steganographers, 100 "20%" steganographers, ..., and 100 "100%" steganographers.
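The mixed training set can be sketched as below. How the ordered and random images are arranged within one sequence is not specified, so this sketch assumes the ordered part comes first and the random part follows; `simulate_level`, `mixed_training_set` and the placeholder data are hypothetical.

```python
import random

LEVELS = range(10, 101, 10)  # security levels "10%", "20%", ..., "100%"

def simulate_level(normal_seq, pool, level, rng):
    """Build one steganographer sequence in which level percent of the
    images are random draws from the pool and the rest keep the order
    of a normal user. Assumption: the ordered images come first."""
    n = len(normal_seq)
    n_random = n * level // 100
    return normal_seq[: n - n_random] + rng.sample(pool, n_random)

def mixed_training_set(normal_seqs, pool, per_level, seed=0):
    """Return (level, sequence) pairs with per_level steganographers
    for each of the ten levels."""
    rng = random.Random(seed)
    out = []
    for level in LEVELS:
        for _ in range(per_level):
            base = rng.choice(normal_seqs)
            out.append((level, simulate_level(base, pool, level, rng)))
    return out

# Placeholder data: three normal users and a separate random-image pool.
normal_seqs = [[(f"u{u}", i) for i in range(100)] for u in range(3)]
pool = [("p", i) for i in range(500)]
training = mixed_training_set(normal_seqs, pool, per_level=2)
print(len(training))  # 20
```

Setting `per_level=100` would reproduce the 1000-steganographer composition given in the example above.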
To verify the detection performance of the above scheme of the present invention, related experiments were also carried out. The experimental results are shown in Table 1 and Fig. 4.
Table 1: Results of mixed training and testing
The experiment of Table 1 targets steganographers with some behavioral security awareness: a mixed classifier is trained by changing the composition of the training set, and is then tested separately on steganographers of each security level (10%, 20%, ..., 100%) to obtain the missed-detection probability.
Fig. 4 illustrates that the method is robust to the number of images: experiments are run with different numbers of images, while keeping the number of images selected per user the same across the test set and the training set, and across normal users and steganographers. With 10, 20, ..., 100 images per user, the average error probability (the mean of the false-alarm rate and the missed-detection rate) is obtained.
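The average error probability described above is the mean of the false-alarm rate and the missed-detection rate. A minimal sketch, assuming label 0 for a normal user and 1 for a steganographer (the function name and the toy labels are illustrative, not measured results):

```python
def average_error(y_true, y_pred):
    """Return (false-alarm rate + missed-detection rate) / 2.
    Labels: 0 = normal user, 1 = steganographer."""
    normal = [p for t, p in zip(y_true, y_pred) if t == 0]
    stego = [p for t, p in zip(y_true, y_pred) if t == 1]
    false_alarm = sum(1 for p in normal if p == 1) / len(normal)
    missed = sum(1 for p in stego if p == 0) / len(stego)
    return (false_alarm + missed) / 2

y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 0]
error = average_error(y_true, y_pred)
print(error)  # 0.375
```

Here one of four normal users is flagged (false-alarm rate 0.25) and two of four steganographers are missed (missed-detection rate 0.5), giving 0.375.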
From the above description of the embodiments, those skilled in the art can clearly understand that the above embodiments can be implemented in software, or in software together with a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the above embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive or a removable hard disk) and includes instructions that cause a computer device (such as a personal computer, a server or a network device) to execute the methods described in the embodiments of the present invention.
The foregoing is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (4)

1. A behavior-based steganographer detection method, characterized by comprising:
selecting a certain number of users from a social platform, crawling N consecutive images for each user, and using the images of some users as training data and the rest as test data;
selecting the images of some users as the steganographer database, and randomly selecting a certain number of images from it to simulate steganographer behavior and generate steganographer data;
extracting behavioral features from the training data and the steganographer data respectively, and training a binary classifier with the extracted features;
testing the binary classifier with the test data, and using the tested binary classifier to examine newly input images, thereby determining whether the user who sent them is a normal user or a steganographer.
2. The behavior-based steganographer detection method according to claim 1, characterized in that, after the N consecutive images of each user are crawled, each image is cropped to a specified size using the resize function of MATLAB.
3. The behavior-based steganographer detection method according to claim 1, characterized in that extracting the behavioral features comprises:
the training data consists of the images of some users and the steganographer data consists of the images of the steganographers, and features are extracted from both in the same way; for each user or steganographer, features that reflect the correlation between images are extracted from the corresponding image sequence as behavioral features; the extraction is as follows:
for the image sequence of each user or steganographer, computing the difference between the gray-level histograms of every two adjacent images to form the difference matrix:
$d_{i,i-1} = \mathrm{abs}(h_i - h_{i-1})$;
where $h_i$ and $h_{i-1}$ denote the gray-level histograms of the i-th and (i-1)-th images, respectively;
quantizing the difference matrix: first applying logarithmic quantization, then truncating to the interval [0, T], expressed as:
$d' = \mathrm{trunc}_T(\mathrm{round}(\log d_{i,i-1}))$;
using first-order and second-order statistics over all difference matrices, computing the frequency feature P and the co-occurrence matrix C of the distribution of d':
$P = [p_1, \ldots, p_{T+1}]$;
where $d'_k$ and $d'_{k+1}$ denote the k-th and (k+1)-th elements of d', and m and n are the length and width of the image; in the co-occurrence matrix C, the entries $c_{l,j}$ and $c_{j,l}$ represent similar correlations between pixels, so they are merged to obtain the merged co-occurrence matrix $C'$;
concatenating the frequency feature P and the merged co-occurrence matrix C' to give the final feature, i.e., the behavioral feature:
$F = [P \; C']$;
the feature dimension being as follows:
$|F| = (T+1) + \frac{(T+1)(T+2)}{2}$.
4. The behavior-based steganographer detection method according to claim 1, characterized in that the method further comprises: dividing steganographers into behavioral security levels according to the proportion of random images among the images they send, where P% denotes a steganographer with no behavioral security awareness, i.e., all sent images are random; (P-Q)% denotes a steganographer with some behavioral security awareness, for whom Q% of the sent images follow the order of a normal user and the remaining (P-Q)% are randomly selected images; and so on, giving multiple behavioral security levels; when training the classifier, steganographers of the different security levels are mixed in equal proportions to form the training set, so that steganographers of unknown behavioral security level can be detected accurately.
CN201810996553.5A 2018-08-29 2018-08-29 Behavior-based steganographer detection method Pending CN108985391A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810996553.5A CN108985391A (en) 2018-08-29 2018-08-29 Behavior-based steganographer detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810996553.5A CN108985391A (en) 2018-08-29 2018-08-29 Behavior-based steganographer detection method

Publications (1)

Publication Number Publication Date
CN108985391A true CN108985391A (en) 2018-12-11

Family

ID=64547297

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810996553.5A Pending CN108985391A (en) 2018-08-29 2018-08-29 Behavior-based steganographer detection method

Country Status (1)

Country Link
CN (1) CN108985391A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110059754A (en) * 2019-04-22 2019-07-26 厦门大学 Batch data steganography method, terminal device and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106203492A (en) * 2016-06-30 2016-12-07 中国科学院计算技术研究所 System and method for image steganalysis
CN107808100A (en) * 2017-10-25 2018-03-16 中国科学技术大学 Steganalysis method for specific test samples

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LI LI等: "Side Channel Steganalysis: When Behavior is Considered in Steganographer Detection", 《MULTIMEDIA TOOLS AND APPLICATIONS》 *
毛金莲: "《智能图像检索关键技术研究》", 30 June 2015, 北京理工大学出版社 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181211