CN110928848A - File fragment classification method and system - Google Patents

File fragment classification method and system

Info

Publication number
CN110928848A
CN110928848A
Authority
CN
China
Prior art keywords
file
file fragment
training
data set
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911146348.0A
Other languages
Chinese (zh)
Inventor
尹凌
奚桂锴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN201911146348.0A priority Critical patent/CN110928848A/en
Publication of CN110928848A publication Critical patent/CN110928848A/en
Priority to PCT/CN2020/128860 priority patent/WO2021098620A1/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/16File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/162Delete operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Abstract

The invention relates to a file fragment classification method comprising the following steps: constructing a file fragment data set, comprising a training set and a test set, from a file data set; preprocessing the constructed file fragment data set; constructing a deep convolutional neural network model; training and evaluating the constructed model with the preprocessed training and test sets; and predicting the file type of a file fragment with the model. The invention also relates to a file fragment classification system. The method requires no manually designed features or other prior knowledge: it learns features automatically from the input file fragments, and the designed deep convolutional neural network is applicable to classification tasks for file fragments of different sizes, yielding better classification performance.

Description

File fragment classification method and system
Technical Field
The invention relates to a file fragment classification method and a file fragment classification system.
Background
When a suspect deletes a file stored on a disk, residual file contents often remain on the disk. For forensic investigators to find evidence in these file fragments, the fragments must be reassembled and spliced into complete files.
Directly attempting to splice a large number of file fragments pairwise consumes an enormous amount of computation. If the file type of the file to which each fragment belongs (i.e., the type of the fragment) is known in advance, the number of combinations that must be tried can be greatly reduced.
One class of existing file fragment classification methods identifies files of different types by magic numbers and similar signatures. Magic numbers typically appear in the file header and footer, and files of different types carry different magic numbers at different positions. However, because files on a disk are often stored in fragmented form, and the fragments belonging to one file are not always stored contiguously and in order, header and footer information is often insufficient to identify fragments of different file types.
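For illustration only (this sketch is not part of the patent), signature-based identification can be expressed in a few lines; the signatures listed are well-known header magic numbers, and the function name is hypothetical. The sketch also shows why the approach fails on interior fragments, which contain neither header nor footer bytes:

```python
from typing import Optional

MAGIC_PREFIXES = {  # well-known header signatures (magic numbers)
    b"\xFF\xD8\xFF": "jpg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"%PDF-": "pdf",
    b"PK\x03\x04": "zip",
}

def identify_by_magic(data: bytes) -> Optional[str]:
    """Return the file type matching a header signature, or None.

    Works when `data` is the start of a whole file; an interior
    fragment usually starts with arbitrary bytes and matches nothing."""
    for prefix, ftype in MAGIC_PREFIXES.items():
        if data.startswith(prefix):
            return ftype
    return None
```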
Another class is content-based file fragment classification, which predicts the file type of a fragment directly from an analysis of its content, without relying on file signatures or magic numbers. Existing content-based methods mainly take a statistical perspective: they extract statistical features of each fragment, such as unigram and bigram frequency distributions and entropy, and build traditional machine learning models such as LDA, SVM, and KNN to identify the type of each fragment. Such feature-engineering approaches depend heavily on feature design, which is time-consuming and requires substantial domain expertise, and they have not yet achieved strong classification performance.
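The kind of hand-crafted statistical features described above can be illustrated with a minimal sketch (again not from the patent; the function name is hypothetical): a 256-bin byte unigram histogram plus Shannon entropy, which would then be fed to a classifier such as an SVM or KNN:

```python
import numpy as np

def unigram_entropy_features(fragment: bytes) -> np.ndarray:
    """Byte unigram frequency histogram (256 bins) plus Shannon entropy."""
    counts = np.bincount(np.frombuffer(fragment, dtype=np.uint8), minlength=256)
    freqs = counts / max(len(fragment), 1)
    nonzero = freqs[freqs > 0]
    entropy = -np.sum(nonzero * np.log2(nonzero))  # bits per byte
    return np.concatenate([freqs, [entropy]])      # 257-dimensional feature
```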
Among content-based methods, existing deep learning approaches are still immature, and their classification performance lags behind that of methods based on traditional machine learning models. Moreover, existing deep learning studies require a different neural network architecture for each fragment size, which limits their applicability.
Disclosure of Invention
Accordingly, there is a need for a method and system for classifying file fragments.
The invention provides a file fragment classification method comprising the following steps: a. constructing a file fragment data set, comprising a training set and a test set, from a file data set; b. preprocessing the constructed file fragment data set; c. constructing a deep convolutional neural network model; d. training and evaluating the constructed model with the preprocessed training and test sets; e. predicting the file type of a file fragment with the model.
Wherein, the step a specifically comprises:
decompressing all zip compressed package files contained in the public file data set govdocs1, and dividing the files in the decompressed folders into different categories according to the file types to which the files belong;
dividing the selected files of each file type under study into two groups, to generate file fragments for the training set and the test set, respectively;
and slicing each file according to the selected fragment size to generate a large number of file fragments, deleting the first fragment of each file, and deleting the last fragment of each file when it is smaller than the specified fragment size.
The step b specifically comprises the following steps:
converting each file fragment in the generated training and test sets from a one-dimensional byte sequence into a two-dimensional grayscale image through a simple reshape;
and normalizing each two-dimensional grayscale image: computing the maximum and minimum values of the pixel at each position over the training set, then scaling the corresponding pixels in both the training and test sets according to these training-set statistics so that the gray values fall between -1 and 1.
The deep convolutional neural network model comprises L convolution blocks, a global average pooling layer, and two fully connected layers.
The convolution block includes: a convolutional layer, a residual unit and a maximum pooling layer;
the number of convolution blocks L is limited by the size of the converted grayscale image:

$$L_{\max} = \min\left(\log_2 \max(w, h) - 1,\ \log_2 \min(w, h)\right)$$

where $L_{\max}$ is the maximum number of convolution blocks that can be stacked in the model, and w and h are the width and height, respectively, of the converted two-dimensional grayscale image.
The convolutional layer uses d 1×1 convolution kernels; assuming C feature maps of size I×J are input to the convolution block, this layer upsamples the number of channels of the input feature maps.
The residual unit comprises two convolutional layers joined by a skip connection, following the residual learning approach.
The max pooling layer spatially downsamples each input feature map to $\frac{1}{4}$ of its original size, i.e., from $I \times J$ to $\frac{I}{2} \times \frac{J}{2}$.
The step d specifically comprises the following steps:
and evaluating the deep convolutional neural network on the preprocessed test set, the evaluation metrics comprising the average classification accuracy over the file fragment categories, the macro-averaged F1 score, and the micro-averaged F1 score.
The invention provides a file fragment classification system, which comprises a fragment data set construction module, a preprocessing module, a model construction module, a training evaluation module and a file type prediction module, wherein: the fragment data set construction module is used for constructing a file fragment data set by using a file data set, and the file fragment data set comprises: training and testing sets; the preprocessing module is used for preprocessing the constructed file fragment data set; the model construction module is used for constructing a deep convolutional neural network model; the training evaluation module is used for training and evaluating the constructed deep convolutional neural network model by utilizing the preprocessed training set and the preprocessed test set; the file type prediction module is used for predicting the file type of the file fragment by utilizing the deep convolutional neural network model.
The application provides a file fragment classification method and system in which prediction requires only converting the input file fragment into a two-dimensional grayscale image and feeding it to the model; the conversion itself incurs no extra computation. The method judges entirely from the content of the fragment, with no other prior knowledge, and learns features directly and automatically from the input fragments instead of requiring manual feature extraction before modeling. In addition, the designed deep convolutional neural network is applicable to classification tasks for fragments of different sizes: its residual structure allows a deeper network model to be built, handles fragment classification tasks of different sizes, effectively improves classification accuracy, and yields better classification performance.
Drawings
FIG. 1 is a flow chart of a file fragment classification method of the present invention;
FIG. 2 is a schematic diagram of the process for converting a file fragment into a grayscale image according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a deep convolutional neural network model according to an embodiment of the present invention;
FIG. 4 is a diagram of a convolution block in a deep convolutional neural network model according to an embodiment of the present invention;
FIG. 5 is a diagram of residual error units in a deep convolutional neural network model according to an embodiment of the present invention.
FIG. 6 is a diagram of the hardware architecture of the file fragment classification system of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
FIG. 1 is a flowchart of the file fragment classification method according to the preferred embodiment of the present invention.
In step S1, a file fragment data set, comprising a training set and a test set, is constructed from the file data set. Specifically:
In this embodiment, the file fragment data set is generated from the public file data set govdocs1, which contains 1000 zip archives. All zip archives are decompressed, and the files in the decompressed folders are divided into categories according to their file types.
For the file fragment types under study, a certain number of files are selected for the experiments. The selected files of each file type are split 6:4 into two groups, used to generate file fragments for the training set and the test set, respectively.
Each file is sliced according to the selected fragment size to generate a large number of file fragments. To avoid the file signature in the file header, which could be used to identify the file type, the first fragment of each file is deleted; the last fragment of each file is also deleted when it is smaller than the specified fragment size. For both the training and test sets, the number of fragments per file type is limited by random sampling, balancing the data set as far as possible and yielding a large number of training and test fragments of the different file types.
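A minimal sketch of this slicing step follows (the function name and default fragment size are illustrative, not taken from the patent); it drops the first fragment, which may carry a header signature, and any undersized tail fragment:

```python
from typing import List

def slice_file(path: str, fragment_size: int = 512) -> List[bytes]:
    """Slice one file into fixed-size fragments, dropping the first
    fragment (possible header signature) and any undersized tail."""
    with open(path, "rb") as f:
        data = f.read()
    fragments = [data[i:i + fragment_size]
                 for i in range(0, len(data), fragment_size)]
    fragments = fragments[1:]                  # drop the header fragment
    if fragments and len(fragments[-1]) < fragment_size:
        fragments = fragments[:-1]             # drop the undersized tail
    return fragments
```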
In step S2, the constructed file fragment data set, i.e., the training and test sets, is preprocessed. Specifically:
Each file fragment in the generated training and test sets is converted: a one-dimensional file fragment becomes a two-dimensional grayscale image through a simple reshape (see FIG. 2). A file fragment consists of a byte sequence, and each byte corresponds to one pixel of the grayscale image. When converting a fragment (a one-dimensional byte sequence) into a two-dimensional grayscale image, the image shape should be as close to a square as possible, which facilitates building a model deep enough for fragment classification.
In the present embodiment, a 512-byte file fragment is converted into a 16×32 two-dimensional grayscale image (16×32 = 512), and a 4096-byte fragment into a 64×64 image (64×64 = 4096).
Finally, each two-dimensional grayscale image is normalized: the maximum and minimum values of the pixel at each position are computed over the training set, and the corresponding pixels in both the training and test sets are scaled according to these training-set statistics so that the gray values fall between -1 and 1.
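A minimal NumPy sketch of this preprocessing, with helper names that are illustrative rather than from the patent:

```python
import numpy as np

def to_grayscale(fragment: bytes, h: int, w: int) -> np.ndarray:
    """Reshape a 1-D byte sequence into an h x w grayscale image,
    e.g. 512 bytes -> 16x32, 4096 bytes -> 64x64."""
    return np.frombuffer(fragment, dtype=np.uint8).reshape(h, w).astype(np.float32)

def fit_minmax(train_images: np.ndarray):
    """Per-position pixel min/max over the training set (shape N x h x w)."""
    return train_images.min(axis=0), train_images.max(axis=0)

def scale_to_unit(images: np.ndarray, lo: np.ndarray, hi: np.ndarray) -> np.ndarray:
    """Scale each pixel position to [-1, 1] using training-set statistics."""
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant pixels
    return 2.0 * (images - lo) / span - 1.0
```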
In step S3, the deep convolutional neural network model is constructed. Specifically:
As shown in FIG. 3, the deep convolutional neural network model comprises L convolution blocks, a global average pooling layer, and two fully connected layers. ReLU (Rectified Linear Unit) in FIG. 3 denotes the rectified linear activation function.
The structure of each convolution block is shown in FIG. 4 and comprises three parts: a convolutional layer, a residual unit, and a max pooling layer. The convolutional layer uses d 1×1 convolution kernels; assuming C feature maps of size I×J are input to the block, this layer upsamples the number of channels from C to d. The residual unit performs feature learning, and the max pooling layer spatially downsamples each input feature map to $\frac{1}{4}$ of its original size, i.e., from $I \times J$ to $\frac{I}{2} \times \frac{J}{2}$; the number of feature maps remains unchanged.
The number of convolution blocks L is limited by the size of the converted grayscale image, as follows:

$$L_{\max} = \min\left(\log_2 \max(w, h) - 1,\ \log_2 \min(w, h)\right)$$

where $L_{\max}$ is the maximum number of convolution blocks that can be stacked in the model, and w and h are the width and height, respectively, of the converted two-dimensional grayscale image.
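As a quick check of this bound (a sketch with an illustrative function name), the two image shapes used in this embodiment give $L_{\max} = 4$ and $L_{\max} = 5$:

```python
import math

def max_blocks(w: int, h: int) -> int:
    """L_max = min(log2(max(w, h)) - 1, log2(min(w, h)))."""
    return int(min(math.log2(max(w, h)) - 1, math.log2(min(w, h))))

print(max_blocks(32, 16))  # 512-byte fragment as 16x32 -> 4 blocks
print(max_blocks(64, 64))  # 4096-byte fragment as 64x64 -> 5 blocks
```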
The structure of the residual unit is shown in FIG. 5: it comprises two convolutional layers joined by a skip connection, following the residual learning approach. Both convolutional layers use d 3×3 convolution kernels to learn features of the input feature map. The input feature map passes through the ReLU activation function before entering each of the two convolutional layers.
Both fully connected layers of the model have 2048 neurons.
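Putting the pieces together, the following PyTorch sketch assembles the described architecture (L convolution blocks of 1×1 convolution, residual unit, and 2×2 max pooling, then global average pooling and two 2048-neuron fully connected layers). It is a sketch under stated assumptions, not the patented implementation: the per-block channel counts (doubling from a base of 64) and the final classification layer on top of the two fully connected layers are assumptions the text leaves open:

```python
import torch
import torch.nn as nn

class ResidualUnit(nn.Module):
    """Two 3x3 conv layers with a skip connection; ReLU precedes each conv."""
    def __init__(self, d: int):
        super().__init__()
        self.conv1 = nn.Conv2d(d, d, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(d, d, kernel_size=3, padding=1)

    def forward(self, x):
        out = self.conv1(torch.relu(x))
        out = self.conv2(torch.relu(out))
        return x + out                         # residual (skip) connection

class ConvBlock(nn.Module):
    """1x1 conv upsampling channels from c_in to d, residual unit, 2x2 max pool."""
    def __init__(self, c_in: int, d: int):
        super().__init__()
        self.expand = nn.Conv2d(c_in, d, kernel_size=1)
        self.res = ResidualUnit(d)
        self.pool = nn.MaxPool2d(2)            # halves width and height

    def forward(self, x):
        return self.pool(self.res(self.expand(x)))

class FragmentNet(nn.Module):
    """L conv blocks, global average pooling, two 2048-neuron FC layers."""
    def __init__(self, num_classes: int, num_blocks: int, base_d: int = 64):
        super().__init__()
        dims = [1] + [base_d * 2 ** i for i in range(num_blocks)]  # assumed channels
        self.blocks = nn.Sequential(
            *[ConvBlock(dims[i], dims[i + 1]) for i in range(num_blocks)])
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(dims[-1], 2048), nn.ReLU(),
            nn.Linear(2048, 2048), nn.ReLU(),
            nn.Linear(2048, num_classes))      # output layer is an assumption

    def forward(self, x):                      # x: (N, 1, h, w) grayscale batch
        return self.head(self.blocks(x))
```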
Although the present application constructs the model structures shown in FIG. 3, FIG. 4, and FIG. 5 based on practical considerations, and gives parameters for the relevant parts of the model, the invention should not be limited to this model structure or to these parameters.
In step S4, the constructed deep convolutional neural network model is trained and evaluated with the preprocessed training and test sets. The evaluation metrics include the average classification accuracy over the file fragment categories, the macro-averaged F1 score, and the micro-averaged F1 score. Specifically:
In this embodiment:
The deep convolutional neural network is trained with Adam-based gradient descent. The initial learning rate is set to 0.001 and is multiplied by 0.2 every 5 epochs, and the total number of training epochs is set to 40. In addition, early stopping is used: when the evaluation metrics on the test set do not improve for 5 consecutive epochs, training stops early and the current model parameters are taken as the optimal parameters of the network.
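A minimal training-loop sketch under the stated hyperparameters; the data loaders and the `evaluate` callback (returning the tracked test-set metric) are assumptions, and the learning-rate schedule is read as multiplication by 0.2 every 5 epochs:

```python
import copy
import torch

def train(model, train_loader, test_loader, evaluate, device="cpu"):
    """Adam, initial lr 0.001, lr x0.2 every 5 epochs, up to 40 epochs,
    early stopping after 5 epochs without test-metric improvement."""
    model.to(device)
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.2)
    best_metric, best_state, stale = float("-inf"), None, 0
    for epoch in range(40):
        model.train()
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
        scheduler.step()
        metric = evaluate(model, test_loader)   # e.g. average accuracy
        if metric > best_metric:
            best_metric = metric
            best_state = copy.deepcopy(model.state_dict())
            stale = 0
        else:
            stale += 1
            if stale >= 5:                      # early stopping
                break
    model.load_state_dict(best_state)
    return model
```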
In step S5, the file type of a file fragment is predicted with the deep convolutional neural network model. Specifically:
Given a file fragment to be predicted, the fragment is converted into a two-dimensional grayscale image as in step S2, and the converted image is then normalized.
Specifically, the gray value of each pixel is scaled to between -1 and 1 according to the maximum and minimum values of the pixel at the corresponding position over the training-set images; the normalized two-dimensional grayscale image is then input to the deep convolutional neural network model to predict the file type of the fragment.
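A minimal prediction sketch, reusing the hypothetical `to_grayscale` and `scale_to_unit` helpers from the preprocessing sketch above:

```python
import numpy as np
import torch

def predict_type(model, fragment: bytes, h: int, w: int,
                 lo: np.ndarray, hi: np.ndarray, class_names: list) -> str:
    """Convert a fragment to a grayscale image, normalize it with the
    training-set min/max, and predict its file type."""
    img = to_grayscale(fragment, h, w)             # 1-D bytes -> h x w image
    img = scale_to_unit(img[None], lo, hi)[0]      # scale to [-1, 1]
    x = torch.from_numpy(img).float()[None, None]  # shape (1, 1, h, w)
    model.eval()
    with torch.no_grad():
        logits = model(x)
    return class_names[int(logits.argmax(dim=1))]
```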
Referring to FIG. 6, the hardware architecture of the file fragment classification system 10 of the present invention is shown. The system comprises: a fragment data set construction module 101, a preprocessing module 102, a model construction module 103, a training evaluation module 104, and a file type prediction module 105.
The fragment data set construction module 101 is configured to construct a file fragment data set, comprising a training set and a test set, from a file data set. Specifically:
In this embodiment, the fragment data set construction module 101 generates the file fragment data set from the public file data set govdocs1, which contains 1000 zip archives. All zip archives are decompressed, and the files in the decompressed folders are divided into categories according to their file types.
For the file fragment types under study, a certain number of files are selected for the experiments. The selected files of each file type are split 6:4 into two groups, used to generate file fragments for the training set and the test set, respectively.
The fragment data set construction module 101 slices each file according to the selected fragment size to generate a large number of file fragments. To avoid the file signature in the file header, which could be used to identify the file type, the first fragment of each file is deleted; the last fragment of each file is also deleted when it is smaller than the specified fragment size. For both the training and test sets, the number of fragments per file type is limited by random sampling, balancing the data set as far as possible and yielding a large number of training and test fragments of the different file types.
The preprocessing module 102 is configured to preprocess the constructed file fragment data set, i.e., the training and test sets. Specifically:
The preprocessing module 102 converts each file fragment in the generated training and test sets: a one-dimensional file fragment becomes a two-dimensional grayscale image through a simple reshape (see FIG. 2). A file fragment consists of a byte sequence, and each byte corresponds to one pixel of the grayscale image. When converting a fragment (a one-dimensional byte sequence) into a two-dimensional grayscale image, the image shape should be as close to a square as possible, which facilitates building a model deep enough for fragment classification.
In this embodiment, the preprocessing module 102 converts a 512-byte file fragment into a 16×32 two-dimensional grayscale image (16×32 = 512), and a 4096-byte fragment into a 64×64 image (64×64 = 4096).
Finally, the preprocessing module 102 normalizes each two-dimensional grayscale image: the maximum and minimum values of the pixel at each position are computed over the training set, and the corresponding pixels in both the training and test sets are scaled according to these training-set statistics so that the gray values fall between -1 and 1.
The model construction module 103 is configured to construct the deep convolutional neural network model. Specifically:
As shown in FIG. 3, the deep convolutional neural network model comprises L convolution blocks, a global average pooling layer, and two fully connected layers. ReLU (Rectified Linear Unit) in FIG. 3 denotes the rectified linear activation function.
The structure of each convolution block is shown in FIG. 4 and comprises three parts: a convolutional layer, a residual unit, and a max pooling layer. The convolutional layer uses d 1×1 convolution kernels; assuming C feature maps of size I×J are input to the block, this layer upsamples the number of channels from C to d. The residual unit performs feature learning, and the max pooling layer spatially downsamples each input feature map to $\frac{1}{4}$ of its original size, i.e., from $I \times J$ to $\frac{I}{2} \times \frac{J}{2}$; the number of feature maps remains unchanged.
The number of convolution blocks L is limited by the size of the converted grayscale image, as follows:

$$L_{\max} = \min\left(\log_2 \max(w, h) - 1,\ \log_2 \min(w, h)\right)$$

where $L_{\max}$ is the maximum number of convolution blocks that can be stacked in the model, and w and h are the width and height, respectively, of the converted two-dimensional grayscale image.
The structure of the residual unit is shown in FIG. 5: it comprises two convolutional layers joined by a skip connection, following the residual learning approach. Both convolutional layers use d 3×3 convolution kernels to learn features of the input feature map. The input feature map passes through the ReLU activation function before entering each of the two convolutional layers.
Both fully connected layers of the model have 2048 neurons.
Although the present application constructs the model structures shown in FIG. 3, FIG. 4, and FIG. 5 based on practical considerations, and gives parameters for the relevant parts of the model, the invention should not be limited to this model structure or to these parameters.
The training evaluation module 104 is configured to train and evaluate the constructed deep convolutional neural network model with the preprocessed training and test sets. The evaluation metrics include the average classification accuracy over the file fragment categories, the macro-averaged F1 score, and the micro-averaged F1 score. Specifically:
In this embodiment:
The training evaluation module 104 trains the deep convolutional neural network with Adam-based gradient descent. The initial learning rate is set to 0.001 and is multiplied by 0.2 every 5 epochs, and the total number of training epochs is set to 40. In addition, early stopping is used: when the evaluation metrics on the test set do not improve for 5 consecutive epochs, training stops early and the current model parameters are taken as the optimal parameters of the network.
The file type prediction module 105 is configured to predict the file type of a file fragment with the deep convolutional neural network model. Specifically:
Given a file fragment to be predicted, the file type prediction module 105 converts the fragment into a two-dimensional grayscale image and then normalizes the converted image.
Specifically, the file type prediction module 105 scales the gray value of each pixel to between -1 and 1 according to the maximum and minimum values of the pixel at the corresponding position over the training-set images, and then inputs the normalized two-dimensional grayscale image into the deep convolutional neural network model to predict the file type of the fragment.
Although the present invention has been described with reference to the presently preferred embodiments, it will be understood by those skilled in the art that the foregoing description is illustrative only and is not intended to limit the scope of the invention, as claimed.

Claims (10)

1. A file fragment classification method is characterized by comprising the following steps:
a. constructing a file fragment data set by using the file data set, wherein the file fragment data set comprises: training and testing sets;
b. preprocessing the constructed file fragment data set;
c. constructing a deep convolutional neural network model;
d. training and evaluating the constructed deep convolutional neural network model by utilizing the preprocessed training set and test set;
e. and predicting the file type of the file fragment by using the deep convolutional neural network model.
2. The method according to claim 1, wherein said step a specifically comprises:
decompressing all zip compressed package files contained in the public file data set govdocs1, and dividing the files in the decompressed folders into different categories according to the file types to which the files belong;
dividing the selected files of each file type under study into two groups, to generate file fragments for the training set and the test set, respectively;
slicing each file according to the selected fragment size to generate a large number of file fragments, deleting the first fragment of each file, and deleting the last fragment of each file when it is smaller than the specified fragment size.
3. The method according to claim 2, wherein said step b specifically comprises:
converting each file fragment in the generated training and test sets from a one-dimensional byte sequence into a two-dimensional grayscale image through a simple reshape;
and normalizing each two-dimensional grayscale image: computing the maximum and minimum values of the pixel at each position over the training set, then scaling the corresponding pixels in both the training and test sets according to these training-set statistics so that the gray values fall between -1 and 1.
4. The method of claim 3, wherein the deep convolutional neural network model comprises L convolutional blocks, a global average pooling layer and two fully connected layers.
5. The method of claim 4, wherein the convolution block comprises: a convolutional layer, a residual unit, and a max pooling layer;
the number of convolution blocks L is limited by the size of the converted grayscale image:

$$L_{\max} = \min\left(\log_2 \max(w, h) - 1,\ \log_2 \min(w, h)\right)$$

where $L_{\max}$ is the maximum number of convolution blocks that can be stacked in the model, and w and h are the width and height, respectively, of the converted two-dimensional grayscale image.
6. The method of claim 5, wherein the convolutional layer uses d 1×1 convolution kernels and, assuming C feature maps of size I×J are input to the convolution block, upsamples the number of channels of the input feature maps.
7. The method of claim 6, wherein the residual unit comprises two convolutional layers joined by a skip connection, following the residual learning approach.
8. The method of claim 7, wherein the max pooling layer spatially downsamples each input feature map to $\frac{1}{4}$ of its original size, i.e., from $I \times J$ to $\frac{I}{2} \times \frac{J}{2}$.
9. The method according to claim 8, wherein said step d specifically comprises:
and evaluating the deep convolutional neural network on the preprocessed test set, the evaluation metrics comprising the average classification accuracy over the file fragment categories, the macro-averaged F1 score, and the micro-averaged F1 score.
10. A file fragment classification system is characterized by comprising a fragment data set building module, a preprocessing module, a model building module, a training evaluation module and a file type prediction module, wherein:
the fragment data set construction module is used for constructing a file fragment data set by using a file data set, and the file fragment data set comprises: training and testing sets;
the preprocessing module is used for preprocessing the constructed file fragment data set;
the model construction module is used for constructing a deep convolutional neural network model;
the training evaluation module is used for training and evaluating the constructed deep convolutional neural network model by utilizing the preprocessed training set and the preprocessed test set;
the file type prediction module is used for predicting the file type of the file fragment by utilizing the deep convolutional neural network model.
CN201911146348.0A 2019-11-21 2019-11-21 File fragment classification method and system Pending CN110928848A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201911146348.0A CN110928848A (en) 2019-11-21 2019-11-21 File fragment classification method and system
PCT/CN2020/128860 WO2021098620A1 (en) 2019-11-21 2020-11-13 File fragment classification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911146348.0A CN110928848A (en) 2019-11-21 2019-11-21 File fragment classification method and system

Publications (1)

Publication Number Publication Date
CN110928848A (en) 2020-03-27

Family

ID=69851521

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911146348.0A Pending CN110928848A (en) 2019-11-21 2019-11-21 File fragment classification method and system

Country Status (2)

Country Link
CN (1) CN110928848A (en)
WO (1) WO2021098620A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021098620A1 (en) * 2019-11-21 2021-05-27 中国科学院深圳先进技术研究院 File fragment classification method and system

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116055174A (en) * 2023-01-10 2023-05-02 吉林大学 Internet of vehicles intrusion detection method based on improved MobileNet V2
CN116975863A (en) * 2023-07-10 2023-10-31 福州大学 Malicious code detection method based on convolutional neural network

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108319518A (en) * 2017-12-08 2018-07-24 中国电子科技集团公司电子科学研究院 File fragmentation sorting technique based on Recognition with Recurrent Neural Network and device
CN108694414A (en) * 2018-05-11 2018-10-23 哈尔滨工业大学深圳研究生院 Digital evidence obtaining file fragmentation sorting technique based on digital picture conversion and deep learning
CN109359090A (en) * 2018-08-27 2019-02-19 中国科学院信息工程研究所 File fragmentation classification method and system based on convolutional neural networks
US20190114511A1 (en) * 2017-10-16 2019-04-18 Illumina, Inc. Deep Learning-Based Techniques for Training Deep Convolutional Neural Networks

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102682024B (en) * 2011-03-11 2014-02-05 中国科学院高能物理研究所 Method for recombining incomplete JPEG file fragmentation
CN105224984B (en) * 2014-05-31 2018-03-13 华为技术有限公司 A kind of data category recognition methods and device based on deep neural network
CN110928848A (en) * 2019-11-21 2020-03-27 中国科学院深圳先进技术研究院 File fragment classification method and system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190114511A1 (en) * 2017-10-16 2019-04-18 Illumina, Inc. Deep Learning-Based Techniques for Training Deep Convolutional Neural Networks
CN108319518A (en) * 2017-12-08 2018-07-24 中国电子科技集团公司电子科学研究院 File fragmentation sorting technique based on Recognition with Recurrent Neural Network and device
CN108694414A (en) * 2018-05-11 2018-10-23 哈尔滨工业大学深圳研究生院 Digital evidence obtaining file fragmentation sorting technique based on digital picture conversion and deep learning
CN109359090A (en) * 2018-08-27 2019-02-19 中国科学院信息工程研究所 File fragmentation classification method and system based on convolutional neural networks

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021098620A1 (en) * 2019-11-21 2021-05-27 中国科学院深圳先进技术研究院 File fragment classification method and system

Also Published As

Publication number Publication date
WO2021098620A1 (en) 2021-05-27

Similar Documents

Publication Publication Date Title
EP3483767B1 (en) Device for detecting variant malicious code on basis of neural network learning, method therefor, and computer-readable recording medium in which program for executing same method is recorded
WO2021098620A1 (en) File fragment classification method and system
US9424493B2 (en) Generic object detection in images
CN112686331B (en) Forged image recognition model training method and forged image recognition method
KR102469261B1 (en) Adaptive artificial neural network selection techniques
CN110046550B (en) Pedestrian attribute identification system and method based on multilayer feature learning
CN107679572B (en) Image distinguishing method, storage device and mobile terminal
CN104661037B (en) The detection method and system that compression image quantization table is distorted
CN108717512A (en) A kind of malicious code sorting technique based on convolutional neural networks
CN111935487B (en) Image compression method and system based on video stream detection
CN115455171B (en) Text video mutual inspection rope and model training method, device, equipment and medium
CN110781333A (en) Method for processing unstructured monitoring data of cable-stayed bridge based on machine learning
CN113869361A (en) Model training method, target detection method and related device
CN112052687A (en) Semantic feature processing method, device and medium based on deep separable convolution
CN109508639B (en) Road scene semantic segmentation method based on multi-scale porous convolutional neural network
CN115292538A (en) Map line element extraction method based on deep learning
CN115343676B (en) Feature optimization method for positioning technology of redundant substances in sealed electronic equipment
Li et al. A spectral clustering based filter-level pruning method for convolutional neural networks
CN113408571B (en) Image classification method and device based on model distillation, storage medium and terminal
CN115439706A (en) Multi-sense-of-the-spot attention mechanism and system based on target detection
KR102242904B1 (en) Method and apparatus for estimating parameters of compression algorithm
CN114648760A (en) Image segmentation method, image segmentation device, electronic device, and storage medium
CN114139696A (en) Model processing method and device based on algorithm integration platform and computer equipment
CN113673322A (en) Character expression posture lie detection method and system based on deep learning
CN111813975A (en) Image retrieval method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination