CN105095756A - Method and device for detecting portable document format document - Google Patents

Method and device for detecting portable document format document

Info

Publication number
CN105095756A
Authority
CN
China
Prior art keywords
pdf document
document
training
malicious
feature value
Prior art date
Legal status
Pending
Application number
CN201510391902.7A
Other languages
Chinese (zh)
Inventor
苟孟洛
Current Assignee
Beijing Kingsoft Internet Security Software Co Ltd
Original Assignee
Beijing Kingsoft Internet Security Software Co Ltd
Application filed by Beijing Kingsoft Internet Security Software Co Ltd
Priority to CN201510391902.7A
Publication of CN105095756A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55: Detecting local intrusion or implementing counter-measures
    • G06F21/56: Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562: Static detection

Abstract

The invention provides a method and a device for detecting a Portable Document Format (PDF) document. The method comprises: extracting feature values from the file structure of training PDF documents, the training PDF documents including malicious PDF documents that contain attack code; learning the feature values with a machine learning algorithm to generate a detection model; and predicting, with the detection model, whether a PDF document to be detected is a malicious PDF document. The invention predicts whether a document is malicious using static analysis only, thereby improving the security of PDF documents.

Description

Method and device for detecting a Portable Document Format document
Technical Field
The present invention relates to the field of information security technology, and in particular to a method and device for detecting Portable Document Format (PDF) documents.
Background
With the rapid development of the Internet and the growing adoption of office automation, PDF has become an open standard for electronic document distribution worldwide. Because PDF documents are highly practical and broadly compatible, they have become an effective carrier for targeted phishing attacks. Malicious code can seriously damage computers, so detecting PDF documents that contain malicious code has become an important goal in the field of computer security.
However, existing detection methods cannot effectively detect the harmfulness of PDF documents, so the security of PDF documents remains poor.
Summary of the Invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, a first object of the present invention is to propose a method for detecting a Portable Document Format (PDF) document. The method predicts whether a PDF document is malicious on the basis of static analysis, thereby improving the security of PDF documents.
A second object of the present invention is to propose a device for detecting a PDF document.
To achieve the above objects, the method for detecting a PDF document according to an embodiment of the first aspect of the present invention comprises: extracting feature values from the file structure of training PDF documents, the training PDF documents including malicious PDF documents that contain attack code; learning the feature values with a machine learning algorithm to generate a detection model; and predicting, with the detection model, whether a PDF document to be detected is a malicious PDF document.
In the method of this embodiment, feature values are extracted from the file structure of the training PDF documents, the feature values are learned with a machine learning algorithm to generate a detection model, and the detection model is then used to predict whether a PDF document to be detected is malicious. Because the feature values are extracted entirely during static parsing and no dynamic analysis is involved, the prediction is made on the basis of static analysis alone, which in turn improves the security of PDF documents.
To achieve the above objects, the device for detecting a PDF document according to an embodiment of the second aspect of the present invention comprises: an extraction module, configured to extract feature values from the file structure of training PDF documents, the training PDF documents including malicious PDF documents that contain attack code; a generation module, configured to learn the feature values extracted by the extraction module with a machine learning algorithm to generate a detection model; and a detection module, configured to predict, with the detection model generated by the generation module, whether a PDF document to be detected is a malicious PDF document.
In the device of this embodiment, the extraction module extracts feature values from the file structure of the training PDF documents, the generation module learns the feature values with a machine learning algorithm to generate a detection model, and the detection module then uses the detection model to predict whether a PDF document to be detected is malicious. Because the extraction module extracts the feature values entirely during static parsing and no dynamic analysis is involved, the prediction is made on the basis of static analysis alone, which in turn improves the security of PDF documents.
Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from the description or be learned through practice of the invention.
Brief Description of the Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and easy to understand from the following description of the embodiments taken together with the accompanying drawings, in which:
Fig. 1 is a flowchart of one embodiment of the PDF document detection method of the present invention;
Fig. 2 is a schematic diagram of another embodiment of the PDF document detection method of the present invention;
Fig. 3 is a schematic diagram of one embodiment of the training PDF documents of the present invention;
Fig. 4 is a schematic diagram of one embodiment of the prediction result of the detection model of the present invention;
Fig. 5 is a schematic structural diagram of one embodiment of the PDF document detection device of the present invention;
Fig. 6 is a schematic structural diagram of another embodiment of the PDF document detection device of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, serve only to explain the present invention, and shall not be construed as limiting it. On the contrary, the embodiments of the present invention cover all changes, modifications and equivalents falling within the spirit and scope of the appended claims.
Fig. 1 is a flowchart of one embodiment of the PDF document detection method of the present invention. As shown in Fig. 1, the method may comprise:
Step 101: extract feature values from the file structure of the training PDF documents.
The training PDF documents include malicious PDF documents that contain attack code.
Specifically, feature values may be extracted from the file structure of the training PDF documents with the PdfStreamDumper tool, or the extraction may be automated with custom code.
In this embodiment, the feature values may include metadata (/Metadata), file operation behavior (for example, "/OpenAction") and page count (/Pages/Count). The embodiments of the present invention are of course not limited to these: the invention places no restriction on the feature values, provided they are contained in the file structure of a PDF document. For example, the feature values may also include "/JavaScript"; further examples are not enumerated here.
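The static extraction of Step 101 can be illustrated with a minimal Python sketch. It is neither the PdfStreamDumper tool nor the implementation used in this application: the function name `extract_features`, the dictionary keys, the decision to count `/JS` together with `/JavaScript`, and the "largest `/Count`" page heuristic are all assumptions made for illustration. The sketch scans raw bytes only, so names hidden inside compressed object streams or obfuscated with `#xx` escapes would be missed.

```python
import re

def extract_features(pdf_bytes: bytes) -> dict:
    """Scan raw PDF bytes for structural keywords without executing anything."""
    # 1 if a /Metadata entry is present anywhere in the file, else 0.
    has_metadata = 1 if re.search(rb"/Metadata\b", pdf_bytes) else 0
    # Number of /OpenAction entries (actions triggered when the file is opened).
    open_actions = len(re.findall(rb"/OpenAction\b", pdf_bytes))
    # Number of JavaScript markers; /JS and /JavaScript are counted together.
    javascript = len(re.findall(rb"/(?:JS|JavaScript)\b", pdf_bytes))
    # Heuristic page count: the largest /Count value found (0 if none).
    counts = [int(m) for m in re.findall(rb"/Count\s+(\d+)", pdf_bytes)]
    return {
        "metadata": has_metadata,
        "open_action": open_actions,
        "javascript": javascript,
        "page_count": max(counts, default=0),
    }

# Hand-written, hypothetical one-page document fragment used only as input.
SAMPLE = (
    b"%PDF-1.4\n"
    b"1 0 obj << /Type /Catalog /Pages 2 0 R /OpenAction 4 0 R >> endobj\n"
    b"2 0 obj << /Type /Pages /Count 1 /Kids [3 0 R] >> endobj\n"
    b"4 0 obj << /S /JavaScript /JS (app.alert(1)) >> endobj\n"
)

if __name__ == "__main__":
    print(extract_features(SAMPLE))
```

Because the function only reads bytes and never renders or executes the document, it stays within the static-analysis premise of the method.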
Step 102: learn the feature values with a machine learning algorithm to generate a detection model.
Specifically, the training PDF documents comprise at least two training PDF documents. Learning the feature values with a machine learning algorithm to generate the detection model may comprise: generating training data from the feature values extracted from the file structures of the at least two training PDF documents; and normalizing the training data and learning it with the machine learning algorithm to generate the detection model.
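The text does not say how the training data is normalized. One common choice, assumed here purely for illustration, is per-feature min-max scaling into the range [-1, 1]. The helper names `fit_min_max` and `scale` are hypothetical and do not come from the application.

```python
def fit_min_max(rows):
    """Return one (min, max) pair per feature column of the training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]

def scale(row, ranges, lower=-1.0, upper=1.0):
    """Linearly map each feature of one row into [lower, upper]."""
    scaled = []
    for value, (lo, hi) in zip(row, ranges):
        if hi == lo:
            # A constant feature carries no information; use the midpoint.
            scaled.append((lower + upper) / 2)
        else:
            scaled.append(lower + (upper - lower) * (value - lo) / (hi - lo))
    return scaled

if __name__ == "__main__":
    # Columns: metadata flag, JavaScript count, page count (illustrative rows).
    training = [[1, 0, 110], [0, 1, 1]]
    ranges = fit_min_max(training)
    print([scale(r, ranges) for r in training])
```

The ranges are fitted on the training rows only and then reused for the document to be detected, so that training and test features share the same scale.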
In this embodiment, the core of the machine learning algorithm is implemented with the Libsvm software. Libsvm is a simple, easy-to-use, fast and effective software package for support vector machine (SVM) pattern recognition and regression; here it serves as a classifier for two-class classification.
Step 103: predict, with the detection model, whether the PDF document to be detected is a malicious PDF document.
Specifically, step 103 may comprise: obtaining, with the detection model, the percentage to which the PDF document to be detected belongs to the malicious PDF documents; when this percentage falls within a predetermined interval, determining that the PDF document to be detected is a malicious PDF document; and when this percentage does not fall within the predetermined interval, determining that the PDF document to be detected is not a malicious PDF document.
The predetermined interval may be set in a specific implementation according to implementation requirements and/or system performance; this embodiment places no restriction on the size of the predetermined interval.
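The interval test of Step 103 reduces to a simple comparison. The sketch below is an illustration only: the function name `is_malicious` is hypothetical, and the default interval [70%, ∞) is borrowed from the worked example later in this description, since this embodiment leaves the interval open.

```python
import math

def is_malicious(malicious_percent, interval=(70.0, math.inf)):
    """Return True if the percentage lies in the half-open interval [low, high)."""
    low, high = interval
    return low <= malicious_percent < high

if __name__ == "__main__":
    for percent in (100.0, 70.0, 69.9):
        print(percent, is_malicious(percent))
```

Keeping the interval as a parameter reflects the statement that it can be tuned to implementation requirements and system performance.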
In the above embodiment, feature values are extracted from the file structure of the training PDF documents, the feature values are learned with a machine learning algorithm to generate a detection model, and the detection model is then used to predict whether a PDF document to be detected is malicious. Because the feature values are extracted entirely during static parsing and no dynamic analysis is involved, the prediction is made on the basis of static analysis alone, which in turn improves the security of PDF documents.
Fig. 2 is a schematic diagram of another embodiment of the PDF document detection method of the present invention. In Fig. 2, the training PDF documents (TrainingPDFFile) and the PDF document to be detected (TestPDFFile) are all PDF documents collected at random from external sources, and TrainingPDFFile includes malicious PDF documents that contain attack code. After feature extraction, the feature values extracted from TrainingPDFFile are learned by the machine learning algorithm, which involves a series of steps such as parameter training and data normalization, and a detection model (Model) is finally generated. This detection model is then used to predict whether TestPDFFile is a malicious PDF document.
The core of the machine learning algorithm is implemented with the Libsvm software, a simple, easy-to-use, fast and effective software package for SVM pattern recognition and regression, used here as a classifier for two-class classification.
To verify the feasibility of the detection model, a number of PDF documents may be collected at random from the network as TrainingPDFFile. TrainingPDFFile includes malicious PDF documents that contain attack code and may also include normal PDF documents. The malicious documents are exploit files that contain malicious code and target publicly disclosed vulnerabilities with Common Vulnerabilities and Exposures (CVE) numbers. Fig. 3 is a schematic diagram of one embodiment of the training PDF documents of the present invention.
Three feature values that distinguish malicious documents from normal documents relatively easily are selected from these samples, as follows:
(1) Metadata (/Metadata): the result is "1" if the PDF document has metadata and "0" if it does not. Malicious PDF documents generally omit metadata to reduce file size, whereas normal PDF documents generally include it.
(2) Open action (/OpenAction/JS): malicious PDF documents generally contain JavaScript code, and the result given here is the number of JavaScript code occurrences.
(3) Page count (/Pages/Count): a malicious document generally has one page. When a malicious document is opened, it does not jump to a particular page of the document, so the two feature values "/TYPE" and "/Pages/count" cannot be found.
The feature values of TrainingPDFFile are then extracted, and the training data is generated as follows (label 1 denotes a normal document, label 2 a malicious document):
1 1:1 2:0 3:110
1 1:1 2:0 3:363
1 1:0 2:0 3:23
1 1:0 2:0 3:26
1 1:0 2:0 3:7
2 1:0 2:1 3:1
2 1:0 2:1 3:1
2 1:0 2:1 3:1
2 1:0 2:1 3:1
2 1:0 2:1 3:1
2 1:1 2:1 3:1
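The training rows above appear to follow the LIBSVM sparse text format, `label index:value ...`, with feature 1 the metadata flag, feature 2 the JavaScript count and feature 3 the page count. A minimal sketch of writing and reading such rows is given below; the helper names `to_libsvm_line` and `parse_libsvm_line` are hypothetical and are not part of Libsvm.

```python
def to_libsvm_line(label, features):
    """Format a label and a dense feature list as 'label 1:v1 2:v2 ...'."""
    pairs = " ".join(f"{i}:{v}" for i, v in enumerate(features, start=1))
    return f"{label} {pairs}"

def parse_libsvm_line(line):
    """Parse 'label i:v ...' into (label, {index: value})."""
    label, *pairs = line.split()
    features = {}
    for pair in pairs:
        index, value = pair.split(":")
        features[int(index)] = float(value)
    return int(label), features

if __name__ == "__main__":
    # Normal document: has metadata, no JavaScript, 110 pages.
    print(to_libsvm_line(1, [1, 0, 110]))
    # Malicious document: no metadata, one JavaScript occurrence, 1 page.
    print(parse_libsvm_line("2 1:0 2:1 3:1"))
```

Zero-valued features are written out explicitly here, matching the rows in the text, although the sparse format also permits omitting them.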
In the same way, test data is obtained from TestPDFFile. The TestPDFFile chosen is the known malicious PDF document CVE2009-0027, which is provisionally labeled as a normal document (label 1), giving the following test data:
1 1:0 2:1 3:1
The data is then normalized, a parameter search is performed, the model is trained, and the generated model is used for prediction; these operations are carried out in Libsvm. For convenience of testing, the whole procedure uses default parameters, and the easy.py command can be used to perform a simple prediction. The prediction result obtained is shown in Fig. 4, which is a schematic diagram of one embodiment of the prediction result of the detection model of the present invention.
Here, train.txt stores the feature values extracted from TrainingPDFFile, test.txt stores the feature values extracted from TestPDFFile, and the output file test.txt.predict records the detailed prediction results and the corresponding accuracy. As can be seen from Fig. 4, the percentage to which TestPDFFile belongs to the normal PDF documents is 0%; that is, the percentage to which TestPDFFile belongs to the malicious PDF documents is 100%. Assuming the predetermined interval is [70%, ∞), this percentage falls within the predetermined interval, so TestPDFFile (CVE2009-0027) can be determined to be a malicious document, which verifies the validity of the detection model proposed by the present invention.
The present invention provides a method for detecting PDF documents and verifies its feasibility. However, because the detection model is generated from the features of known samples and is used to predict unknown samples, the model will need to be refined gradually as attack techniques develop and new malicious PDF attack methods emerge. For the present, the detection model remains effective and viable: it is only necessary to collect enough PDF files from external sources and extract their feature values. The more samples are collected and the more feature values are extracted, the more accurate the predictions of the trained detection model will be.
Fig. 5 is a schematic structural diagram of one embodiment of the PDF document detection device of the present invention. The device in this embodiment can implement the flow of the embodiment shown in Fig. 1. As shown in Fig. 5, the device may comprise an extraction module 51, a generation module 52 and a detection module 53.
The extraction module 51 is configured to extract feature values from the file structure of training PDF documents, the training PDF documents including malicious PDF documents that contain attack code. Specifically, the extraction module 51 may extract the feature values with the PdfStreamDumper tool, or the extraction may be automated with custom code.
In this embodiment, the feature values may include metadata (/Metadata), file operation behavior (for example, "/OpenAction") and page count (/Pages/Count). The embodiments of the present invention are of course not limited to these: the invention places no restriction on the feature values, provided they are contained in the file structure of a PDF document. For example, the feature values may also include "/JavaScript"; further examples are not enumerated here.
The generation module 52 is configured to learn the feature values extracted by the extraction module 51 with a machine learning algorithm to generate a detection model. In this embodiment, the core of the machine learning algorithm is implemented with the Libsvm software, a simple, easy-to-use, fast and effective software package for SVM pattern recognition and regression, used here as a classifier for two-class classification.
The detection module 53 is configured to predict, with the detection model generated by the generation module 52, whether a PDF document to be detected is a malicious PDF document.
In the above device, the extraction module 51 extracts feature values from the file structure of the training PDF documents, the generation module 52 learns the feature values with a machine learning algorithm to generate a detection model, and the detection module 53 then uses the detection model to predict whether a PDF document to be detected is malicious. Because the extraction module 51 extracts the feature values entirely during static parsing and no dynamic analysis is involved, the prediction is made on the basis of static analysis alone, which in turn improves the security of PDF documents.
Fig. 6 is a schematic structural diagram of another embodiment of the PDF document detection device of the present invention. It differs from the device shown in Fig. 5 in that, in the embodiment shown in Fig. 6, the training PDF documents comprise at least two training PDF documents, and the generation module 52 may comprise a data generation submodule 521 and a model generation submodule 522.
The data generation submodule 521 is configured to generate training data from the feature values that the extraction module 51 extracts from the file structures of the at least two training PDF documents.
The model generation submodule 522 is configured to normalize the training data generated by the data generation submodule 521 and to learn it with a machine learning algorithm to generate the detection model.
In this embodiment, the detection module 53 may comprise an obtaining submodule 531 and a determining submodule 532.
The obtaining submodule 531 is configured to obtain, with the detection model, the percentage to which the PDF document to be detected belongs to the malicious PDF documents.
The determining submodule 532 is configured to determine that the PDF document to be detected is a malicious PDF document when the percentage obtained by the obtaining submodule 531 falls within a predetermined interval, and to determine that the PDF document to be detected is not a malicious PDF document when that percentage does not fall within the predetermined interval.
The predetermined interval may be set in a specific implementation according to implementation requirements and/or system performance; this embodiment places no restriction on the size of the predetermined interval.
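The module structure of Fig. 5 and Fig. 6 can be pictured as a small pipeline. The sketch below is one possible arrangement under stated assumptions, not the claimed device: the class name `PdfDetector`, its fields, and the convention that a trained model is a callable returning a malicious percentage are all hypothetical, and the 70% lower bound is taken from the worked example in this description. The extractor and trainer are injected, so any feature extractor or learner (for example an SVM wrapper) could be substituted.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

Features = Sequence[float]
Scorer = Callable[[Features], float]  # returns a malicious percentage, 0..100

@dataclass
class PdfDetector:
    extract: Callable[[bytes], Features]                   # extraction module 51
    train: Callable[[List[Features], List[int]], Scorer]   # generation module 52
    lower_bound: float = 70.0                               # interval [lower_bound, inf)
    _score: Optional[Scorer] = None

    def fit(self, documents: List[bytes], labels: List[int]) -> None:
        rows = [self.extract(d) for d in documents]  # data generation submodule 521
        self._score = self.train(rows, labels)       # model generation submodule 522

    def is_malicious(self, document: bytes) -> bool:
        if self._score is None:
            raise RuntimeError("call fit() before is_malicious()")
        percent = self._score(self.extract(document))  # obtaining submodule 531
        return percent >= self.lower_bound             # determining submodule 532

if __name__ == "__main__":
    # Toy stand-ins: count "/JS" markers and flag any document that has one.
    detector = PdfDetector(
        extract=lambda data: [float(data.count(b"/JS"))],
        train=lambda rows, labels: (lambda f: 100.0 if f[0] > 0 else 0.0),
    )
    detector.fit([b"/JS", b"plain"], [2, 1])
    print(detector.is_malicious(b"x /JS y"))
```

Injecting the two callables keeps the sketch independent of any particular parser or learning library.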
The above PDF document detection device predicts whether a document is malicious on the basis of static analysis, and can thereby improve the security of PDF documents.
It should be noted that, in the description of the present invention, unless otherwise stated, "multiple" means two or more.
Any process or method described in a flowchart or otherwise described herein may be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing the steps of a specific logical function or process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
It should be understood that each part of the present invention may be implemented in hardware, software, firmware or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following technologies well known in the art may be used: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing module, may each exist physically on their own, or two or more of them may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is to be understood that the above embodiments are exemplary and shall not be construed as limiting the present invention. Those of ordinary skill in the art may change, modify, replace and vary the above embodiments within the scope of the present invention.

Claims (8)

1. A method for detecting a Portable Document Format (PDF) document, characterized by comprising:
extracting feature values from the file structure of training PDF documents, the training PDF documents including malicious PDF documents that contain attack code;
learning the feature values with a machine learning algorithm to generate a detection model; and
predicting, with the detection model, whether a PDF document to be detected is a malicious PDF document.
2. The method according to claim 1, characterized in that the training PDF documents comprise at least two training PDF documents, and learning the feature values with a machine learning algorithm to generate a detection model comprises:
generating training data from the feature values extracted from the file structures of the at least two training PDF documents; and
normalizing the training data and learning it with the machine learning algorithm to generate the detection model.
3. The method according to claim 1, characterized in that predicting, with the detection model, whether the PDF document to be detected is a malicious PDF document comprises:
obtaining, with the detection model, the percentage to which the PDF document to be detected belongs to the malicious PDF documents;
when the percentage falls within a predetermined interval, determining that the PDF document to be detected is a malicious PDF document; and
when the percentage does not fall within the predetermined interval, determining that the PDF document to be detected is not a malicious PDF document.
4. The method according to any one of claims 1 to 3, characterized in that the feature values comprise: metadata, file operation behavior and page count.
5. A device for detecting a Portable Document Format (PDF) document, characterized by comprising:
an extraction module, configured to extract feature values from the file structure of training PDF documents, the training PDF documents including malicious PDF documents that contain attack code;
a generation module, configured to learn the feature values extracted by the extraction module with a machine learning algorithm to generate a detection model; and
a detection module, configured to predict, with the detection model generated by the generation module, whether a PDF document to be detected is a malicious PDF document.
6. The device according to claim 5, characterized in that the training PDF documents comprise at least two training PDF documents, and the generation module comprises a data generation submodule and a model generation submodule;
the data generation submodule is configured to generate training data from the feature values that the extraction module extracts from the file structures of the at least two training PDF documents; and
the model generation submodule is configured to normalize the training data generated by the data generation submodule and to learn it with a machine learning algorithm to generate the detection model.
7. The device according to claim 5, characterized in that the detection module comprises:
an obtaining submodule, configured to obtain, with the detection model, the percentage to which the PDF document to be detected belongs to the malicious PDF documents; and
a determining submodule, configured to determine that the PDF document to be detected is a malicious PDF document when the percentage obtained by the obtaining submodule falls within a predetermined interval, and to determine that the PDF document to be detected is not a malicious PDF document when the percentage obtained by the obtaining submodule does not fall within the predetermined interval.
8. The device according to any one of claims 5 to 7, characterized in that the feature values extracted by the extraction module comprise: metadata, file operation behavior and page count.
CN201510391902.7A 2015-07-06 2015-07-06 Method and device for detecting portable document format document Pending CN105095756A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510391902.7A CN105095756A (en) 2015-07-06 2015-07-06 Method and device for detecting portable document format document

Publications (1)

Publication Number Publication Date
CN105095756A true CN105095756A (en) 2015-11-25

Family

ID=54576164

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510391902.7A Pending CN105095756A (en) 2015-07-06 2015-07-06 Method and device for detecting portable document format document

Country Status (1)

Country Link
CN (1) CN105095756A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106778278A (en) * 2017-02-15 2017-05-31 中国科学院信息工程研究所 A kind of malice document detection method and device
CN107944273A (en) * 2017-12-14 2018-04-20 贵州航天计量测试技术研究所 A kind of malice PDF document detection method based on TF IDF algorithms and SVDD algorithms
CN109408810A (en) * 2018-09-28 2019-03-01 东巽科技(北京)有限公司 A kind of malice PDF document detection method and device
CN110119620A (en) * 2018-02-06 2019-08-13 卡巴斯基实验室股份制公司 System and method of the training for detecting the machine learning model of malice container
CN110990859A (en) * 2018-09-28 2020-04-10 第四范式(北京)技术有限公司 Method and system for executing machine learning under data privacy protection
CN111460446A (en) * 2020-03-06 2020-07-28 奇安信科技集团股份有限公司 Malicious file detection method and device based on model
CN112231701A (en) * 2020-09-29 2021-01-15 广州威尔森信息科技有限公司 PDF file processing method and device
CN112329012A (en) * 2019-07-19 2021-02-05 中国人民解放军战略支援部队信息工程大学 Detection method for malicious PDF document containing JavaScript and electronic equipment
CN112487422A (en) * 2020-10-28 2021-03-12 中国科学院信息工程研究所 Malicious document detection method and device, electronic equipment and storage medium
EP3918500B1 (en) * 2019-03-05 2024-04-24 Siemens Industry Software Inc. Machine learning-based anomaly detections for embedded software applications

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130160127A1 (en) * 2011-12-14 2013-06-20 Korea Internet & Security Agency System and method for detecting malicious code of pdf document type
CN103310150A (en) * 2012-03-13 2013-09-18 百度在线网络技术(北京)有限公司 Method and device for detecting portable document format (PDF) vulnerability
JP2014504765A (en) * 2011-01-21 2014-02-24 ファイヤアイ インク System and method for detecting malicious PDF network content

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2014504765A (en) * 2011-01-21 2014-02-24 ファイヤアイ インク System and method for detecting malicious PDF network content
US20130160127A1 (en) * 2011-12-14 2013-06-20 Korea Internet & Security Agency System and method for detecting malicious code of pdf document type
CN103310150A (en) * 2012-03-13 2013-09-18 百度在线网络技术(北京)有限公司 Method and device for detecting portable document format (PDF) vulnerability

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
苟孟洛: "Malicious PDF detection model based on machine learning algorithms" (基于机器学习算法的恶意PDF检测模型), 《计算机安全》 (Computer Security) *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106778278B (en) * 2017-02-15 2019-09-10 中国科学院信息工程研究所 Malicious document detection method and device
CN106778278A (en) * 2017-02-15 2017-05-31 中国科学院信息工程研究所 Malicious document detection method and device
CN107944273A (en) * 2017-12-14 2018-04-20 贵州航天计量测试技术研究所 Malicious PDF document detection method based on TF-IDF and SVDD algorithms
CN110119620B (en) * 2018-02-06 2023-05-23 卡巴斯基实验室股份制公司 System and method for training machine learning model for detecting malicious containers
CN110119620A (en) * 2018-02-06 2019-08-13 卡巴斯基实验室股份制公司 System and method for training machine learning model for detecting malicious containers
CN109408810A (en) * 2018-09-28 2019-03-01 东巽科技(北京)有限公司 Malicious PDF document detection method and device
CN110990859A (en) * 2018-09-28 2020-04-10 第四范式(北京)技术有限公司 Method and system for executing machine learning under data privacy protection
EP3918500B1 (en) * 2019-03-05 2024-04-24 Siemens Industry Software Inc. Machine learning-based anomaly detections for embedded software applications
CN112329012A (en) * 2019-07-19 2021-02-05 中国人民解放军战略支援部队信息工程大学 Detection method for malicious PDF document containing JavaScript and electronic equipment
CN111460446A (en) * 2020-03-06 2020-07-28 奇安信科技集团股份有限公司 Malicious file detection method and device based on model
CN111460446B (en) * 2020-03-06 2023-04-11 奇安信科技集团股份有限公司 Malicious file detection method and device based on model
CN112231701A (en) * 2020-09-29 2021-01-15 广州威尔森信息科技有限公司 PDF file processing method and device
CN112487422A (en) * 2020-10-28 2021-03-12 中国科学院信息工程研究所 Malicious document detection method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN105095756A (en) Method and device for detecting portable document format document
CN109936582B (en) Method and device for constructing malicious traffic detection model based on PU learning
WO2021096649A1 (en) Detecting unknown malicious content in computer systems
CN102291392B (en) Hybrid intrusion detection method based on Bagging algorithm
CN111652290B (en) Method and device for detecting adversarial sample
CN104503874A (en) Hard disk failure prediction method for cloud computing platform
US11580222B2 (en) Automated malware analysis that automatically clusters sandbox reports of similar malware samples
CN109902024A (en) Program path-sensitive grey-box testing method and device
Song et al. Permission Sensitivity-Based Malicious Application Detection for Android
CN110263539A (en) Android malicious application detection method and system based on parallel ensemble learning
CN109002810A (en) Model evaluation method, radar signal recognition method and corresponding apparatus
CN106874760A (en) Android malicious code classification method based on hierarchical SimHash
CN106850338A (en) R+1-class application protocol recognition method and device based on semantic analysis
CN105630656A (en) Log model based system robustness analysis method and apparatus
Kundu et al. Application of machine learning in hardware trojan detection
CN109492401B (en) Content carrier risk detection method, device, equipment and medium
CN110162472A (en) Test case generation method based on fuzz testing
Paramkusem et al. Classifying categories of SCADA attacks in a big data framework
Xu et al. Reentrancy vulnerability detection of smart contract based on bidirectional sequential neural network with hierarchical attention mechanism
Remmide et al. Detection of phishing URLs using temporal convolutional network
CN113098989A (en) Dictionary generation method, domain name detection method, device, equipment and medium
CN113572770B (en) Method and device for detecting domain name generated by domain name generation algorithm
Gaykar et al. A Hybrid Supervised Learning Approach for Detection and Mitigation of Job Failure with Virtual Machines in Distributed Environments.
KR102405799B1 (en) Method and system for providing continuous adaptive learning over time for real time attack detection in cyberspace
CN109886119B (en) Industrial control signal-based control function classification method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20151125