CN103795695A - Self-learning file identification method and system - Google Patents

Publication number
CN103795695A
Authority
CN
China
Prior art keywords
black
file
feature
identifier
material library
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201210429410.9A
Other languages
Chinese (zh)
Inventor
陈章群
赵闽
王鑫
傅盛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Juntian Electronic Technology Co Ltd
Original Assignee
Zhuhai Juntian Electronic Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2012-10-31
Filing date
2012-10-31
Publication date
2014-05-14
Application filed by Zhuhai Juntian Electronic Technology Co Ltd
Priority to CN201210429410.9A
Publication of CN103795695A

Abstract

The invention discloses a self-learning file identification method and system. The identification method comprises the following steps: (1) establishing a dynamically updatable white material library and black material library at the server side; (2) extracting material from a new sample at the client side and uploading the material to the server side; (3) identifying the new sample with an identifier at the server side; (4) returning the identification result for the new sample to the client side; and (5) updating the white material library and the black material library based on the identification result. The identification method further comprises a step of updating the identifier. By means of a trainer and an automatic identifier update module, the system uses the white material and black material information in the two libraries as its knowledge source, learns the feature information and change patterns of the various classes of white files and black files in a self-learning manner, and then automatically updates the identifier, so that the identifier improves progressively and the function of existing file identification systems is enhanced.

Description

Self-learning file identification method and system
Technical field
The present invention relates to the technical field of computer security, and in particular to a method and system for identifying unknown files to determine whether they are viruses.
Background
Computer and software technology has developed rapidly, and computer viruses have emerged along with it. A computer virus is specially crafted, man-made program code with self-replication capability, strong infectivity, a degree of latency, specific trigger conditions, and great destructiveness.
Given the harm viruses cause, the traditional virus detection method is signature matching. A virus library is set up on the user's computer. A virus signature and its offset are first taken from the virus library; the signature of the file under inspection is then extracted at that offset and compared with the virus signature. If they match, the file is judged to be that virus; otherwise the signature of the next virus is taken from the library, and so on until all viruses have been compared, at which point the file is judged safe. Traditional signature-based identification has several shortcomings: 1. the antivirus software must keep a signature library locally, and the accuracy of its judgments depends on whether that library is comprehensive and up to date; 2. the signature library needs frequent upgrades, and an out-of-date virus library cannot meet security needs; 3. the number of virus species grows quickly, so the local signature library expands rapidly, which lowers the scanning efficiency of the antivirus software and steadily increases its demand for system resources; 4. it has no ability to identify new viruses.
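The matching loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Signature` layout, the `scan` function, and the two demo entries are hypothetical and do not come from the patent or from any real antivirus product.

```python
from typing import NamedTuple, Optional, Sequence


class Signature(NamedTuple):
    """One virus library entry (hypothetical layout)."""
    name: str
    offset: int
    pattern: bytes


def scan(data: bytes, virus_library: Sequence[Signature]) -> Optional[str]:
    """Return the name of the first matching virus, or None if nothing matches."""
    for sig in virus_library:
        # Extract the file's bytes at the signature's offset and compare.
        candidate = data[sig.offset:sig.offset + len(sig.pattern)]
        if candidate == sig.pattern:
            return sig.name
    # Every signature was compared without a match: the file is judged safe.
    return None


# Made-up demo signatures, for illustration only.
VIRUS_LIBRARY = [
    Signature("Demo.A", 4, b"\xde\xad"),
    Signature("Demo.B", 0, b"EVIL"),
]

print(scan(b"\x00\x00\x00\x00\xde\xad\x00", VIRUS_LIBRARY))  # Demo.A
print(scan(b"hello world", VIRUS_LIBRARY))                    # None
```

The sketch also makes shortcomings 3 and 4 visible: scan time grows with the size of the library, and a file whose bytes match no stored pattern is always reported as safe.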
To overcome these defects of the conventional technique, the more recent approach is "cloud scanning" technology. Put simply, the client no longer maintains a virus library; instead it is mainly responsible for scanning for new local files, extracting part of the feature information of each new file, and uploading it to the cloud (the server side). The server side makes the virus judgment and returns the result. The cloud hosts many kinds of identifiers: some are heuristic, some judge behavior, and some are dedicated to particular viruses. They assess the safety of a sample in various ways, and an overall score is then produced by weighting or another algorithm, which finally determines whether the file is safe.
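One way to picture the weighted overall score is the following minimal sketch. The text only says that the result comes from "weighting or another algorithm", so the weighted average, the 0.5 threshold, the two toy identifiers, and the material keys `packed` and `writes_autorun` are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Material = Dict[str, object]
# An identifier maps uploaded material to a risk score in [0, 1] (assumed convention).
Identifier = Callable[[Material], float]


def heuristic_identifier(material: Material) -> float:
    """Toy heuristic: packed files are treated as suspicious."""
    return 0.9 if material.get("packed") else 0.1


def behavior_identifier(material: Material) -> float:
    """Toy behavior check: writing an autorun entry is treated as suspicious."""
    return 0.8 if material.get("writes_autorun") else 0.0


def overall_verdict(material: Material,
                    identifiers: List[Tuple[Identifier, float]],
                    threshold: float = 0.5) -> str:
    """Combine the per-identifier scores into one verdict by weighted average."""
    total_weight = sum(weight for _, weight in identifiers)
    score = sum(ident(material) * weight for ident, weight in identifiers) / total_weight
    return "black" if score >= threshold else "white"


IDENTIFIERS = [(heuristic_identifier, 1.0), (behavior_identifier, 3.0)]

print(overall_verdict({"packed": True, "writes_autorun": True}, IDENTIFIERS))  # black
print(overall_verdict({"packed": True}, IDENTIFIERS))                          # white
```

In this sketch the behavior identifier carries three times the weight of the heuristic one, so a file that is merely packed stays below the threshold.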
Evidently, the efficiency and accuracy of the identifiers are key, and the identifiers are updated every day. Because new virus forms appear daily, it is entirely possible that yesterday's unmodified identifier cannot identify today's viruses. In the prior art, however, identifiers are modified and updated manually: virus analysts re-scan and manually identify samples every day and adjust the results given by some identifiers, so that the detection rate improves day by day. This greatly increases the analysts' workload, so existing file identification systems need to be further improved.
Summary of the invention
The object of the present invention is to provide a self-learning file identification method and system that can automatically update and revise the identifier, reduce the workload of virus analysts, and improve the function of existing file identification systems. The technical solution that achieves this object is as follows:
A self-learning file identification method comprises the following steps:
(1) establishing a dynamically updatable white material library and black material library at the server side;
(2) extracting material from a new sample at the client and uploading it to the server side;
(3) identifying the new sample with the identifier at the server side;
(4) returning the identification result for the new sample to the client;
(5) updating the white material library and black material library according to the identification result for the new sample.
The method is characterized in that it further comprises a step of updating the identifier, which comprises: a. extracting the material in the white material library and black material library; b. summarizing the features of a given class of white files and the features of a given class of black files; c. extracting the usable identification rule information from the summarized results and providing it to the identifier; d. automatically updating the identifier according to the identification rule information.
As a specific technical solution, when the features of white files or black files are summarized in step b, the white file material and black file material are first classified, and features are then summarized separately for each class.
As a specific technical solution, in step b the features of a given class of white files or black files are summarized by comparing the materials within the same class with one another.
As a specific technical solution, the usable identification rule information in step c includes the frequency with which black files appear in a time period and the trend of changes in black files.
As a specific technical solution, the usable identification rule information in step c includes the source of white files and the frequency with which white files are misjudged.
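Steps a to d can be illustrated with a small Python sketch. The patent does not say how features are represented or compared, so several choices here are assumptions: each material is a `(class, feature set)` pair, "comparing materials within a class" is modeled as a set intersection, and a rule is the set of features shared by a black class but not shared by any white class. All names (`summarize`, `extract_rules`, `RuleIdentifier`, `train`) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List, Tuple

# A material is (class label, feature set); this representation is an assumption.
Material = Tuple[str, FrozenSet[str]]
Summary = Dict[str, FrozenSet[str]]


def summarize(materials: Iterable[Material]) -> Summary:
    """Step b: classify the materials, then keep the features shared within each class."""
    by_class: Dict[str, List[FrozenSet[str]]] = defaultdict(list)
    for label, features in materials:
        by_class[label].append(features)
    return {label: frozenset.intersection(*sets) for label, sets in by_class.items()}


def extract_rules(white: Summary, black: Summary) -> Summary:
    """Step c: keep black-class features that no white class also shares."""
    white_features = set().union(*white.values())
    rules = {label: feats - white_features for label, feats in black.items()}
    return {label: feats for label, feats in rules.items() if feats}


class RuleIdentifier:
    """Toy identifier: a sample is black if it contains every feature of some rule."""

    def __init__(self) -> None:
        self.rules: Summary = {}

    def update(self, rules: Summary) -> None:
        """Step d: apply new rule information without manual intervention."""
        self.rules.update(rules)

    def identify(self, features: Iterable[str]) -> str:
        present = frozenset(features)
        return "black" if any(rule <= present for rule in self.rules.values()) else "white"


def train(white_library: Iterable[Material], black_library: Iterable[Material],
          identifier: RuleIdentifier) -> None:
    """Steps a to d in sequence: read both libraries, summarize, extract, update."""
    identifier.update(extract_rules(summarize(white_library), summarize(black_library)))


white_library = [("installer", frozenset({"signed", "has_ui"})),
                 ("installer", frozenset({"signed", "has_ui", "network"}))]
black_library = [("dropper", frozenset({"packed", "writes_autorun", "network"})),
                 ("dropper", frozenset({"packed", "writes_autorun", "has_ui"}))]

identifier = RuleIdentifier()
train(white_library, black_library, identifier)
print(identifier.identify({"packed", "writes_autorun", "signed"}))  # black
print(identifier.identify({"signed", "has_ui"}))                    # white
```

Before `train` runs the toy identifier has no rules and reports every sample as white; after training it has picked up the rule `{"packed", "writes_autorun"}` from the black material library on its own.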
A self-learning file identification system comprises:
a client new sample scanning module, for discovering and locating files that newly appear on the client;
a new sample material extraction module, for extracting the material of a new sample and uploading it to the server side;
a new sample material receiving module, for receiving the material of the new sample at the server side;
an identifier, for identifying the new sample material and outputting an identification result;
a material library management module, for updating the white material library and black material library according to the identification result output by the identifier;
a white material library and a black material library, for storing white material and black material information.
The system is characterized in that it further comprises: a trainer, which obtains usable identification rule information from the material in the white material library and black material library and provides it to the identifier; and an automatic identifier update module, for automatically updating the identifier according to the identification rule information.
As a further technical solution, the trainer comprises: a white and black material extraction module, for extracting the material in the white material library and black material library; a classification module, for classifying the extracted material; a comparison and analysis module, for comparing the materials within the same class with one another and summarizing the features of a given class of white files or black files; and an identification rule information generation module, for generating identification rule information from the features of the given class of white files or black files.
The beneficial effect of the present invention is as follows: by providing a trainer and an automatic identifier update module, the system can take the white material and black material information in the white material library and black material library as its knowledge source, learn the feature information and change patterns of the various classes of white files and black files in a self-learning manner, and then automatically update the identifier accordingly, so that the identifier keeps improving and the function of existing file identification systems is enhanced.
Brief description of the drawings
Fig. 1 is a block diagram of the main components of the self-learning file identification system provided by the embodiment.
Fig. 2 is a detailed structural diagram of the trainer in the self-learning file identification system provided by the embodiment.
Fig. 3 is the main flow chart of the self-learning file identification method provided by the embodiment.
Fig. 4 is the sub-flow chart for updating the identifier in the self-learning file identification method provided by the embodiment.
Detailed description of the embodiment
As shown in Fig. 1, the self-learning file identification system provided by this embodiment comprises: on the client, a new sample scanning module and a new sample material extraction module; a network; and, at the server side, an identifier, a material library management module, a white material library, a black material library, a trainer, and an automatic identifier update module.
The new sample scanning module discovers and locates files that newly appear on the client. The new sample material extraction module extracts the material of a new sample and uploads it to the server side. The new sample material receiving module receives the material of the new sample at the server side. The identifier identifies the new sample material and outputs an identification result. The material library management module updates the white material library and black material library according to the identification result output by the identifier. The white material library and black material library store white material and black material information. The trainer obtains usable identification rule information from the material in the white material library and black material library and provides it to the identifier. The automatic identifier update module automatically updates the identifier according to the identification rule information.
As shown in Fig. 2, the trainer comprises a white and black material extraction module, a classification module, a comparison and analysis module, and an identification rule information generation module. The white and black material extraction module extracts the material in the white material library and black material library. The classification module classifies the extracted material. The comparison and analysis module compares the materials within the same class with one another and summarizes the features of a given class of white files or black files. The identification rule information generation module generates identification rule information from the features of the given class of white files or black files.
As shown in Fig. 3, the self-learning file identification method provided by this embodiment mainly comprises the following steps: (1) establishing a dynamically updatable white material library and black material library at the server side; (2) extracting material from a new sample at the client and uploading it to the server side; (3) identifying the new sample with the identifier at the server side; (4) returning the identification result for the new sample to the client; (5) updating the white material library and black material library according to the identification result for the new sample.
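The five steps of the main flow can be sketched as follows. The patent does not define what the uploaded "material" contains or how the libraries are stored, so the SHA-256 digest, size, and leading bytes used as material, the in-memory dictionaries, and the `Server` and `demo_identifier` names are all assumptions made for illustration.

```python
import hashlib
from typing import Callable, Dict

Material = Dict[str, object]


def extract_material(sample: bytes) -> Material:
    """Step 2 (client): extract a small amount of feature information from a new sample."""
    return {
        "sha256": hashlib.sha256(sample).hexdigest(),
        "size": len(sample),
        "head": sample[:16],
    }


class Server:
    """Server side: holds both material libraries and one identifier."""

    def __init__(self, identifier: Callable[[Material], str]) -> None:
        # Step 1: dynamically updatable white and black material libraries.
        self.white_library: Dict[str, Material] = {}
        self.black_library: Dict[str, Material] = {}
        self.identifier = identifier

    def handle_upload(self, material: Material) -> str:
        # Step 3: the identifier judges the uploaded material.
        result = self.identifier(material)
        # Step 5: file the material in the matching library. A real server could
        # do this after replying; it happens first here only to keep the sketch short.
        library = self.black_library if result == "black" else self.white_library
        library[str(material["sha256"])] = material
        # Step 4: the return value stands in for the reply sent to the client.
        return result


def demo_identifier(material: Material) -> str:
    """Stand-in identifier: flags material whose leading bytes start with b'EVIL'."""
    return "black" if bytes(material["head"]).startswith(b"EVIL") else "white"


server = Server(demo_identifier)
print(server.handle_upload(extract_material(b"EVIL payload")))  # black
print(server.handle_upload(extract_material(b"hello world")))   # white
```

Because every identified sample ends up in one of the two libraries, the libraries grow with each upload and can later serve as the knowledge source for updating the identifier.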
In addition, as shown in Fig. 4, the method provided by this embodiment further comprises a step of updating the identifier, which comprises: a. extracting the material in the white material library and black material library; b. summarizing the features of a given class of white files and the features of a given class of black files; c. extracting the usable identification rule information from the summarized results and providing it to the identifier; d. automatically updating the identifier according to the identification rule information.
When the features of white files or black files are summarized in step b, the white file material and black file material are first classified, and features are then summarized separately for each class. Within step b, the features of a given class of white files or black files are summarized by comparing the materials within the same class with one another. For black files, the usable identification rule information in step c includes the frequency with which black files appear in a time period, the trend of changes in black files, and the like. For white files, it includes the source of white files, the frequency with which white files are misjudged, and the like.
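The kinds of rule information named in step c can be computed from simple records, as in the sketch below. The patent lists these quantities without defining how they are measured, so the per-day counting, the "last day minus first day" trend, the per-source misjudgment rate, and the record formats are all assumptions.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable, Tuple


def black_frequency(sightings: Iterable[date]) -> Counter:
    """Frequency with which black files appear, counted per day."""
    return Counter(sightings)


def black_trend(frequency: Counter) -> int:
    """Crude trend: count on the last observed day minus count on the first."""
    days = sorted(frequency)
    if len(days) < 2:
        return 0
    return frequency[days[-1]] - frequency[days[0]]


def misjudgment_rate(white_records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Per-source fraction of white files that were wrongly identified as black."""
    totals: Counter = Counter()
    wrong: Counter = Counter()
    for source, was_misjudged in white_records:
        totals[source] += 1
        if was_misjudged:
            wrong[source] += 1
    return {source: wrong[source] / totals[source] for source in totals}


# Made-up records, for illustration only.
sightings = [date(2012, 10, 29),
             date(2012, 10, 30), date(2012, 10, 30),
             date(2012, 10, 31), date(2012, 10, 31), date(2012, 10, 31)]
white_records = [("vendor-site", False), ("vendor-site", False),
                 ("forum", True), ("forum", False)]

frequency = black_frequency(sightings)
print(black_trend(frequency))           # 2
print(misjudgment_rate(white_records))  # {'vendor-site': 0.0, 'forum': 0.5}
```

An identifier could, for example, weight a rule more heavily while the trend for its black file class is rising, or relax it for a white file source with a high misjudgment rate; how such information is actually consumed is not specified in the text.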

Claims (7)

1. A self-learning file identification method, comprising the following steps:
(1) establishing a dynamically updatable white material library and black material library at the server side;
(2) extracting material from a new sample at the client and uploading it to the server side;
(3) identifying the new sample with the identifier at the server side;
(4) returning the identification result for the new sample to the client;
(5) updating the white material library and black material library according to the identification result for the new sample;
characterized in that the method further comprises a step of updating the identifier, which comprises: a. extracting the material in the white material library and black material library; b. summarizing the features of a given class of white files and the features of a given class of black files; c. extracting the usable identification rule information from the summarized results and providing it to the identifier; d. automatically updating the identifier according to the identification rule information.
2. The self-learning file identification method according to claim 1, characterized in that: when the features of white files or black files are summarized in step b, the white file material and black file material are first classified, and features are then summarized separately for each class.
3. The self-learning file identification method according to claim 2, characterized in that: in step b, the features of a given class of white files or black files are summarized by comparing the materials within the same class with one another.
4. The self-learning file identification method according to claim 1, characterized in that: the usable identification rule information in step c includes the frequency with which black files appear in a time period and the trend of changes in black files.
5. The self-learning file identification method according to claim 4, characterized in that: the usable identification rule information in step c includes the source of white files and the frequency with which white files are misjudged.
6. A self-learning file identification system, comprising:
a client new sample scanning module, for discovering and locating files that newly appear on the client;
a new sample material extraction module, for extracting the material of a new sample and uploading it to the server side;
a new sample material receiving module, for receiving the material of the new sample at the server side;
an identifier, for identifying the new sample material and outputting an identification result;
a material library management module, for updating the white material library and black material library according to the identification result output by the identifier;
a white material library and a black material library, for storing white material and black material information;
characterized in that the system further comprises: a trainer, which obtains usable identification rule information from the material in the white material library and black material library and provides it to the identifier; and an automatic identifier update module, for automatically updating the identifier according to the identification rule information.
7. The self-learning file identification system according to claim 6, characterized in that the trainer comprises: a white and black material extraction module, for extracting the material in the white material library and black material library; a classification module, for classifying the extracted material; a comparison and analysis module, for comparing the materials within the same class with one another and summarizing the features of a given class of white files or black files; and an identification rule information generation module, for generating identification rule information from the features of the given class of white files or black files.
CN201210429410.9A 2012-10-31 2012-10-31 Self-learning file identification method and system Pending CN103795695A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210429410.9A CN103795695A (en) 2012-10-31 2012-10-31 Self-learning file identification method and system

Publications (1)

Publication Number Publication Date
CN103795695A (en) 2014-05-14

Family

ID=50670987

Country Status (1)

Country Link
CN (1) CN103795695A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101895521A (en) * 2009-05-22 2010-11-24 中国科学院研究生院 Network worm detection and characteristic automatic extraction method and system
CN101924762A (en) * 2010-08-18 2010-12-22 奇智软件(北京)有限公司 Cloud security-based active defense method
CN101923617A (en) * 2010-08-18 2010-12-22 奇智软件(北京)有限公司 Cloud-based sample database dynamic maintaining method
CN102081714A (en) * 2011-01-25 2011-06-01 潘燕辉 Cloud antivirus method based on server feedback
CN102682237A (en) * 2012-03-08 2012-09-19 珠海市君天电子科技有限公司 Virus judging method and system aiming at network downloading file

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
许蓉,等: ""云安全"检测技术安全性分析", 《计算机工程与设计》 *

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20140514
