CN103795695A - Self-learning file identification method and system - Google Patents
Abstract
The invention discloses a self-learning file identification method and system. The identification method comprises the following steps: (1) establishing dynamically updatable white and black material libraries on the server side; (2) extracting material from a new sample on the client side and uploading the material to the server side; (3) identifying the new sample with an identifier on the server side; (4) returning the identification result for the new sample to the client side; and (5) updating the white material library and the black material library based on the identification result. The identification method further comprises a step of updating the identifier. By means of a trainer and an automatic identifier update module, the system uses the white and black material information in the two libraries as its knowledge source, learns the characteristic information and change patterns of the various classes of white files and black files in a self-learning manner, and then updates the identifier automatically, so that the identifier improves progressively and the existing file identification system is enhanced.
Description
Technical field
The present invention relates to the field of computer security, and in particular to a method and system for identifying unknown files to determine whether they are viruses.
Background
Computers and software technology have developed rapidly, and viruses have appeared along with them. A computer virus is a deliberately written piece of program code that is capable of self-replication, highly infectious, able to lie dormant for a period, activated by specific triggers, and highly destructive.
Given the harm viruses cause, the traditional virus detection method is signature matching. A virus database is set up on the user's computer. A virus signature and its offset are first taken from the database; the signature of the file under inspection is then extracted at that offset and compared with the virus signature. If they match, the file is judged to be that virus; otherwise the signature of the next virus is taken from the database, and so on until every virus has been compared, at which point the file is judged safe. Traditional signature-based identification has several shortcomings: 1. the antivirus software must keep a signature database locally, and the accuracy of its judgments depends on whether that database is comprehensive and up to date; 2. the signature database needs frequent upgrades, and an outdated virus database cannot meet security needs; 3. the number of virus types grows quickly, so the local signature database expands rapidly, which lowers the scanning efficiency of the antivirus software and steadily raises its demand for system resources; 4. it cannot identify new viruses.
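The offset-based comparison loop described above can be sketched as follows. This is a minimal illustration only: the `Signature` record, the `match_signatures` function, and the sample database entries are hypothetical names and values chosen for the example, not part of the patent or of any real antivirus product.

```python
from typing import List, NamedTuple, Optional


class Signature(NamedTuple):
    name: str       # hypothetical virus label
    offset: int     # byte offset at which the signature is expected
    pattern: bytes  # signature bytes stored in the virus database


def match_signatures(data: bytes, virus_db: List[Signature]) -> Optional[str]:
    """Return the name of the first matching virus, or None if no entry matches."""
    for sig in virus_db:
        # Extract the bytes of the inspected file at the signature's offset.
        window = data[sig.offset:sig.offset + len(sig.pattern)]
        if window == sig.pattern:
            return sig.name
    # Every database entry was compared without a match: treat the file as safe.
    return None


# Made-up database entries for illustration only.
VIRUS_DB = [
    Signature("Example.A", 4, b"\xde\xad"),
    Signature("Example.B", 0, b"EVIL"),
]
```

The loop visits every database entry for each file, which is one reason scan cost grows as the local signature database expands.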
To address these shortcomings, "cloud scanning" technology has recently been adopted. Put simply, the client no longer maintains a virus database. Instead it is mainly responsible for scanning for and discovering new local files, extracting part of the characteristic information of each new file, and uploading it to the cloud (the server side), which makes the virus determination and returns the result. The cloud hosts many identifiers: some are heuristic, some judge behavior, and others are dedicated to particular kinds of virus. They assess the safety of a sample in various ways, an overall score is then produced by weighting or some other algorithm, and the safety of the file is finally judged.
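One way the weighted overall score might be computed is sketched below. The text does not specify the weighting scheme, so the weighted average, the 0.5 threshold, the identifier names, and the `overall_verdict` function are all assumptions made for illustration.

```python
from typing import Dict, Tuple


def overall_verdict(
    scores: Dict[str, float],   # per-identifier maliciousness score in [0, 1]
    weights: Dict[str, float],  # assumed per-identifier weight
    threshold: float = 0.5,     # assumed cut-off between "white" and "black"
) -> Tuple[str, float]:
    """Combine per-identifier scores into one verdict by weighted average."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("at least one identifier must have a positive weight")
    combined = sum(scores[name] * weights[name] for name in scores) / total_weight
    verdict = "black" if combined >= threshold else "white"
    return verdict, combined


# Hypothetical identifiers: one heuristic, one behavioral, one family-specific.
example_scores = {"heuristic": 0.9, "behavior": 0.6, "family_x": 0.0}
example_weights = {"heuristic": 2.0, "behavior": 1.0, "family_x": 1.0}
```

With these example values the combined score is (0.9 × 2 + 0.6 × 1 + 0 × 1) / 4 = 0.6, so the sample would be judged black under the assumed threshold.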
The efficiency and accuracy of the identifiers are therefore critical, and these identifiers are updated every day. Because new forms of virus appear daily, it is entirely possible that an identifier left unmodified since yesterday cannot identify today's viruses. In the prior art, however, modification and updating of the identifiers is done manually: virus analysts re-scan and manually identify samples every day and adjust the results given by some of the identifiers, so that the detection rate improves day by day. This greatly increases the analysts' workload, so the existing file identification system needs to be further improved.
Summary of the invention
The object of this invention is to provide a kind of file authentication method and system of self study, can automatically upgrade and revise assessor, reduce virus evaluation teacher's the workload of thinking, improve the function of existing file identification systems.The technical scheme that realizes above-mentioned purpose is as follows:
A self-learning file identification method comprises the following steps:
(1) establishing dynamically updatable white and black material libraries on the server side;
(2) extracting material from a new sample on the client side and uploading it to the server side;
(3) identifying the new sample with an identifier on the server side;
(4) returning the identification result for the new sample to the client side;
(5) updating the white material library and the black material library according to the identification result for the new sample;
The method is characterized in that it further comprises a step of updating the identifier, which comprises: a. extracting the material in the white material library and the black material library; b. summarizing the features of a given class of white files and the features of a given class of black files; c. extracting the usable identification rule information from the summarized results and providing it to the identifier; d. automatically updating the identifier according to the identification rule information.
In a specific technical solution, when the features of white files or black files are summarized in step b, the white file material and black file material are first classified, and the features are then summarized separately for each class.
In a specific technical solution, step b compares the materials within the same class with one another to summarize the features of a given class of white files or black files.
In a specific technical solution, the usable identification rule information in step c includes the frequency with which black files appear in a given time period and the trend of changes in black files.
In a specific technical solution, the usable identification rule information in step c includes the source of white files and the frequency with which white files are misjudged.
A self-learning file identification system comprises:
a client-side new sample scanning module, for discovering and locating newly appearing files on the client;
a new sample material extraction module, for extracting the material of a new sample and uploading it to the server side;
a new sample material receiving module, for receiving the material of the new sample on the server side;
an identifier, for identifying the new sample material and outputting an identification result;
a material library management module, for updating the white material library and the black material library according to the identification result output by the identifier;
a white material library and a black material library, for storing white material and black material information;
The system is characterized in that it further comprises: a trainer, which obtains usable identification rule information from the material in the white material library and the black material library and provides it to the identifier; and an automatic identifier update module, for automatically updating the identifier according to the identification rule information.
In a further technical solution, the trainer comprises: a white and black material extraction module, for extracting the material in the white material library and the black material library; a classification module, for classifying the extracted material; a comparison and analysis module, for comparing the materials within the same class with one another and summarizing the features of a given class of white files or black files; and an identification rule information generation module, for generating identification rule information according to the features of the given class of white files or black files.
The beneficial effect of the present invention is as follows: by providing a trainer and an automatic identifier update module, the system can use the white and black material information in the white and black material libraries as its knowledge source, learn the characteristic information and change patterns of the various classes of white files and black files in a self-learning manner, and then update the identifier automatically, so that the identifier improves continuously and the existing file identification system is enhanced.
Brief description of the drawings
Fig. 1 is a block diagram of the main components of the self-learning file identification system provided by the embodiment.
Fig. 2 is a detailed structural diagram of the trainer in the self-learning file identification system provided by the embodiment.
Fig. 3 is the main flow chart of the self-learning file identification method provided by the embodiment.
Fig. 4 is the sub-flow chart for updating the identifier in the self-learning file identification method provided by the embodiment.
Detailed description of the embodiment
As shown in Fig. 1, the self-learning file identification system provided by this embodiment comprises: on the client side, a new sample scanning module and a new sample material extraction module; a network; and, on the server side, an identifier, a material library management module, a white material library, a black material library, a trainer, and an automatic identifier update module.
The new sample scanning module discovers and locates newly appearing files on the client. The new sample material extraction module extracts the material of a new sample and uploads it to the server side. The new sample material receiving module receives the material of the new sample on the server side. The identifier identifies the new sample material and outputs an identification result. The material library management module updates the white material library and the black material library according to the identification result output by the identifier. The white material library and the black material library store white material and black material information. The trainer obtains usable identification rule information from the material in the white and black material libraries and provides it to the identifier. The automatic identifier update module automatically updates the identifier according to the identification rule information.
As shown in Fig. 2, the trainer comprises a white and black material extraction module, a classification module, a comparison and analysis module, and an identification rule information generation module. The white and black material extraction module extracts the material in the white material library and the black material library. The classification module classifies the extracted material. The comparison and analysis module compares the materials within the same class with one another and summarizes the features of a given class of white files or black files. The identification rule information generation module generates identification rule information according to the features of the given class of white files or black files.
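The four trainer modules can be pictured as a small pipeline. The sketch below is only one possible reading: the patent does not define what a "material" contains or how materials are compared, so the dictionary shape, the use of set intersection as the comparison step, and the function names `classify`, `compare`, and `train` are assumptions.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List, Tuple

# Assumed material shape: {"category": str, "features": set of str}
Material = dict


def classify(materials: Iterable[Material]) -> Dict[str, List[Material]]:
    """Classification module: group extracted materials by category."""
    groups = defaultdict(list)
    for material in materials:
        groups[material["category"]].append(material)
    return dict(groups)


def compare(group: List[Material]) -> FrozenSet[str]:
    """Comparison and analysis module: keep the features shared by every
    material in the same class (an assumed notion of a class feature)."""
    feature_sets = [set(material["features"]) for material in group]
    return frozenset(set.intersection(*feature_sets))


def train(
    white_library: Iterable[Material], black_library: Iterable[Material]
) -> Dict[Tuple[str, str], FrozenSet[str]]:
    """Rule generation module: one rule per (label, category) pair."""
    rules = {}
    for label, library in (("white", white_library), ("black", black_library)):
        for category, group in classify(library).items():
            rules[(label, category)] = compare(group)
    return rules
```

Set intersection is deliberately simple; a real trainer would likely need a comparison that tolerates noisy or partially matching features.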
As shown in Fig. 3, the self-learning file identification method provided by this embodiment mainly comprises the following steps: (1) establishing dynamically updatable white and black material libraries on the server side; (2) extracting material from a new sample on the client side and uploading it to the server side; (3) identifying the new sample with the identifier on the server side; (4) returning the identification result for the new sample to the client side; (5) updating the white material library and the black material library according to the identification result for the new sample.
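Steps (1) to (5) can be sketched end to end as follows. The material format (a SHA-256 digest, the file size, and the first 16 bytes), the marker-based stand-in for the identifier, and the names `extract_material` and `Server` are hypothetical; the patent does not specify any of them.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Tuple


def extract_material(data: bytes) -> dict:
    """Step 2 (client side): extract a small material record from a new sample."""
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
        "head": data[:16],  # assumed feature: the first 16 bytes of the file
    }


@dataclass
class Server:
    # Step 1: dynamically updatable white and black material libraries.
    white_library: Dict[str, dict] = field(default_factory=dict)
    black_library: Dict[str, dict] = field(default_factory=dict)
    # Placeholder rule standing in for a real identifier.
    black_markers: Tuple[bytes, ...] = (b"EVIL",)

    def identify(self, material: dict) -> str:
        """Step 3: identify the new sample from its uploaded material."""
        hit = any(marker in material["head"] for marker in self.black_markers)
        return "black" if hit else "white"

    def handle_upload(self, material: dict) -> str:
        verdict = self.identify(material)
        # Step 5: update the matching library with the identified material.
        library = self.black_library if verdict == "black" else self.white_library
        library[material["sha256"]] = material
        # Step 4: the verdict is what gets returned to the client.
        return verdict
```

Keying each library by digest means a sample uploaded twice is stored once, which is one simple way to keep the libraries from growing with duplicates.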
In addition, as shown in Fig. 4, the method provided by this embodiment further comprises a step of updating the identifier, which comprises: a. extracting the material in the white material library and the black material library; b. summarizing the features of a given class of white files and the features of a given class of black files; c. extracting the usable identification rule information from the summarized results and providing it to the identifier; d. automatically updating the identifier according to the identification rule information.
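Step d, the automatic update, might look like the sketch below. It assumes that rule information arrives as a mapping from file category to a set of features typical of black files in that category; this representation, the `Identifier` class, and its method names are illustrative assumptions rather than the patent's design.

```python
from typing import Dict, FrozenSet, Iterable


class Identifier:
    """Toy identifier whose rules can be replaced without manual editing."""

    def __init__(self) -> None:
        self.black_rules: Dict[str, FrozenSet[str]] = {}
        self.version = 0  # incremented on every automatic update

    def apply_rules(self, rule_info: Dict[str, FrozenSet[str]]) -> None:
        """Step d: install the rule information supplied by the trainer."""
        self.black_rules = dict(rule_info)
        self.version += 1

    def identify(self, category: str, features: Iterable[str]) -> str:
        rule = self.black_rules.get(category)
        # A sample is judged black only if it shows every feature in the rule.
        if rule and rule <= set(features):
            return "black"
        return "white"
```

Before any update the identifier has no rules and judges every sample white; after `apply_rules` the same sample can be judged black, which is the progressive improvement the method aims for.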
When the features of white files or black files are summarized in step b, the white file material and black file material are first classified, and the features are then summarized separately for each class. Step b compares the materials within the same class with one another to summarize the features of a given class of white files or black files. For black files, the usable identification rule information in step c includes the frequency with which black files appear in a given time period, the trend of changes in black files, and the like. For white files, it includes the source of white files, the frequency with which white files are misjudged, and the like.
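The kinds of rule information named above can be computed from simple records. The sketch below assumes monthly time periods, a trend defined as the change between the earliest and latest observed month, and white-file records given as (source, was_misjudged) pairs; these choices and the function names are illustrative assumptions.

```python
from collections import Counter, defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple


def black_frequency_by_month(sightings: Iterable[date]) -> Counter:
    """Count how many black files were seen in each (year, month) period."""
    return Counter((d.year, d.month) for d in sightings)


def black_trend(frequency: Counter) -> int:
    """Change in count between the earliest and latest observed month."""
    periods = sorted(frequency)
    if len(periods) < 2:
        return 0
    return frequency[periods[-1]] - frequency[periods[0]]


def white_misjudge_rate_by_source(
    records: Iterable[Tuple[str, bool]]
) -> Dict[str, float]:
    """Fraction of white files from each source that were misjudged."""
    totals: Dict[str, int] = defaultdict(int)
    errors: Dict[str, int] = defaultdict(int)
    for source, misjudged in records:
        totals[source] += 1
        if misjudged:
            errors[source] += 1
    return {source: errors[source] / totals[source] for source in totals}
```

A rising black-file count could be used to prioritize a class of files for retraining, and a high misjudgment rate for a white-file source could be used to relax rules for that source; both uses are suggestions, not behavior described in the patent.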
Claims (7)
1. A self-learning file identification method, comprising the following steps:
(1) establishing dynamically updatable white and black material libraries on the server side;
(2) extracting material from a new sample on the client side and uploading it to the server side;
(3) identifying the new sample with an identifier on the server side;
(4) returning the identification result for the new sample to the client side;
(5) updating the white material library and the black material library according to the identification result for the new sample;
characterized in that the method further comprises a step of updating the identifier, which comprises: a. extracting the material in the white material library and the black material library; b. summarizing the features of a given class of white files and the features of a given class of black files; c. extracting the usable identification rule information from the summarized results and providing it to the identifier; d. automatically updating the identifier according to the identification rule information.
2. The self-learning file identification method according to claim 1, characterized in that: when the features of white files or black files are summarized in step b, the white file material and black file material are first classified, and the features are then summarized separately for each class.
3. The self-learning file identification method according to claim 2, characterized in that: step b compares the materials within the same class with one another to summarize the features of a given class of white files or black files.
4. The self-learning file identification method according to claim 1, characterized in that: the usable identification rule information in step c includes the frequency with which black files appear in a given time period and the trend of changes in black files.
5. The self-learning file identification method according to claim 4, characterized in that: the usable identification rule information in step c includes the source of white files and the frequency with which white files are misjudged.
6. A self-learning file identification system, comprising:
a client-side new sample scanning module, for discovering and locating newly appearing files on the client;
a new sample material extraction module, for extracting the material of a new sample and uploading it to the server side;
a new sample material receiving module, for receiving the material of the new sample on the server side;
an identifier, for identifying the new sample material and outputting an identification result;
a material library management module, for updating the white material library and the black material library according to the identification result output by the identifier;
a white material library and a black material library, for storing white material and black material information;
characterized in that the system further comprises: a trainer, which obtains usable identification rule information from the material in the white material library and the black material library and provides it to the identifier; and an automatic identifier update module, for automatically updating the identifier according to the identification rule information.
7. The self-learning file identification system according to claim 6, characterized in that the trainer comprises: a white and black material extraction module, for extracting the material in the white material library and the black material library; a classification module, for classifying the extracted material; a comparison and analysis module, for comparing the materials within the same class with one another and summarizing the features of a given class of white files or black files; and an identification rule information generation module, for generating identification rule information according to the features of the given class of white files or black files.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210429410.9A CN103795695A (en) | 2012-10-31 | 2012-10-31 | Self-learning file identification method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103795695A true CN103795695A (en) | 2014-05-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20140514 |