CN103577766A - Safety management method and safety management system for electronic file - Google Patents

Info

Publication number
CN103577766A
Authority
CN
China
Prior art keywords
confidentiality
level
electronic document
vector
storage area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201210281084.1A
Other languages
Chinese (zh)
Inventor
董靖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN201210281084.1A priority Critical patent/CN103577766A/en
Publication of CN103577766A publication Critical patent/CN103577766A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6209 Protecting access to data via a platform, e.g. using keys or access control rules to a single file or object, e.g. in a secure envelope, encrypted and accessed using a key, or with access control rules appended to the object itself

Abstract

The invention relates to a security management method and a security management system for electronic documents. The method comprises: predefining, at a management end, security management rules for electronic documents and sending them to a client; and, under a predetermined condition, having the client perform the following steps: scanning the content of an electronic document and extracting its feature vector in a confidentiality space according to the feature-vector extraction rule and the weight matrix of the confidentiality space; calculating the distance between the feature vector and the center vector of each classified confidentiality level and determining the document's confidentiality-level vector according to the corresponding determination rule; determining the approved storage area corresponding to the confidentiality-level vector according to the mapping between confidentiality-level vectors and storage areas; judging whether the planned or current storage location of the electronic document lies within the approved storage area; and, if it does not, performing a predetermined first operation according to the abnormal-condition decision rule.

Description

Security management method and system for electronic documents
Technical field
The present invention relates to the field of information security technology, and in particular to a security management method and system for electronic documents.
Background art
With the development of computer technology, more and more enterprises, organizations, and government agencies rely on computers to handle all kinds of affairs, and in the process they continuously produce large numbers of electronic documents. Some of these documents are ordinary and may circulate freely without restriction, but others store sensitive data (also called confidential data) of the relevant party, such as financial data; trade secrets such as customer information, contracts, key code, formulas, and processes; and military and state secrets. The owner wants such sensitive data to circulate or be archived only within a specific scope.
Depending on the sensitive data an electronic document contains, its owner may divide electronic documents into several confidentiality levels, and documents of each level may circulate freely only within the storage area corresponding to that level.
The existing way of determining the confidentiality level is to predefine it: when an electronic document is created, its level is determined manually, for example as "confidential", and subsequent operations on and archiving of the document are then restricted to the storage area corresponding to that level. However, a document's confidentiality level is determined by its content, and as the content is modified and extended, the originally assigned level may no longer apply. Fixing the level of each document is therefore a flawed approach, and every time a level needs to be revised, the revision has to be made manually.
Data volumes are now growing explosively: some enterprises may generate several terabytes of data, corresponding to thousands of documents, in a single day, and screening and managing them manually is extremely inefficient. Nor is it practical to review and assign document confidentiality levels dynamically and promptly by hand.
Meanwhile, enterprises also hold large numbers of existing electronic documents whose confidentiality level has never been determined, and these documents also need security management (including archiving). Doing this entirely by hand would obviously involve a very heavy workload.
In addition, in some situations an electronic document needs to circulate outside its corresponding storage area (this happens frequently). This is generally handled through a manual approval step; for example, it can proceed only with the approval of the department head and/or an administrator. In practice, however, an office automation system usually has many, often tedious, items awaiting approval, so many manual approval steps become a mere formality, which undermines the effectiveness of security management.
In recent years, the emergence of new social media and the popularity of collaborative work and mobile office have made discovering, managing, and protecting sensitive data more difficult, and have raised the requirements for storing, authorizing, transferring, and controlling it.
According to a survey by the Ponemon Institute, the foremost threat to data security is negligent insiders rather than employees with ulterior motives. The IT professionals surveyed stated that 88% of data leaks were related to negligent insiders.
Common ways in which data leaks include insecure mobile devices and e-mail sent to the wrong recipient by mistake.
CN101957894A discloses a conditional electronic-file permission management system and method. The system comprises a management end and a client; an electronic file created at the client is sent to the management end, its content is compared against the keywords configured there, and according to the comparison result the file is automatically encrypted and classified by permission. However, this system encrypts and classifies files by keyword comparison. Because keyword comparison only judges whether a configured keyword is "present" or "absent", the classification is rather coarse and does not lend itself to flexible management of electronic files. Moreover, the system has to upload each electronic file to the management end for processing and therefore cannot process files in real time.
Natural language recognition technology is an important part of language information processing. It uses the theory and techniques of artificial intelligence to express and process a given natural-language mechanism in computer programs, thereby building artificial intelligence that can understand and recognize natural language. In recent years the technology has seen some use in search engines, but no application to the security management of electronic documents has been reported.
Summary of the invention
The object of the invention is to overcome, at least in part, the defects of the prior art and to provide a security management method and system for electronic documents.
According to a first aspect, the invention relates to a security management method for electronic documents, the method comprising the following steps:
Security management rules for electronic documents are predefined at the management end and sent to the client. The rules comprise at least: (t1) the extraction rule for the feature vector of an electronic document; (t2) the weight matrix of the confidentiality space; (t3) the center vector and predetermined radius of each classified confidentiality level in the confidentiality space; (t4) the determination rule for the confidentiality-level vector of an electronic document; (t5) the mapping from confidentiality-level vectors to storage areas; (t6) the abnormal-condition decision rule.
Under a predetermined condition, the client performs the following steps:
(a) scanning the content of the electronic document and extracting its feature vector in the confidentiality space according to the feature-vector extraction rule and the weight matrix of the confidentiality space;
(b) calculating, in the confidentiality space, the distance between the feature vector and the center vector of each classified confidentiality level, and determining the confidentiality-level vector of the electronic document according to the determination rule for the confidentiality-level vector;
(c) determining, according to the mapping from confidentiality-level vectors to storage areas, the approved storage area corresponding to the confidentiality-level vector of the electronic document;
(d) judging whether the planned storage location or the current storage location of the electronic document lies within the approved storage area;
(e) if the planned storage location or the current storage location does not lie within the approved storage area, performing a first predetermined operation according to the abnormal-condition decision rule.
According to a second aspect, the invention relates to a security management system for electronic documents, comprising a storage unit, a management end, and at least one client, wherein:
The storage unit is used to store electronic documents and comprises a plurality of storage areas;
The management end comprises at least:
A management-end communication unit, for communication between the management end and the client;
A rule management unit, for predefining the security management rules for electronic documents, the rules comprising at least: (t1) the extraction rule for the feature vector of an electronic document; (t2) the weight matrix of the confidentiality space; (t3) the center vector and predetermined radius of each classified confidentiality level in the confidentiality space; (t4) the determination rule for the confidentiality-level vector of an electronic document; (t5) the mapping from confidentiality-level vectors to storage areas; (t6) the abnormal-condition decision rule;
A management-end execution unit, for performing a predetermined management-end operation according to information from the client;
A form unit, for generating or revising, according to information from the client, forms (reports) on the distribution data and/or scan data of electronic documents;
The at least one client comprises at least:
A client communication unit, for communication between the client and the management end;
A document analysis unit, for scanning the content of an electronic document and extracting its feature vector in the confidentiality space according to the feature-vector extraction rule and the weight matrix of the confidentiality space;
A confidentiality-level determining unit, for calculating, in the confidentiality space, the distance between the feature vector and the center vector of each classified confidentiality level, and determining the confidentiality-level vector of the electronic document according to the determination rule for the confidentiality-level vector;
A decision unit, for determining, according to the mapping from confidentiality-level vectors to storage areas, the approved storage area corresponding to the confidentiality-level vector of the electronic document, and for judging whether the planned storage location or the current storage location of the electronic document lies within the approved storage area; when it does not, the decision unit determines the first predetermined operation and the predetermined management-end operation according to the abnormal-condition decision rule and generates an operation instruction;
A client execution unit, for performing the first predetermined operation according to the operation instruction of the decision unit.
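Steps (c) to (e), as carried out by the decision unit, can be pictured with a short sketch. It is only an illustration under stated assumptions: the confidentiality-level vector is reduced to a single level name, the mapping `APPROVED_AREAS`, the directory paths, and the action names `allow` and `reject` are all hypothetical, and "reject" merely stands in for whatever first predetermined operation the abnormal-condition decision rule would select.

```python
from pathlib import PurePosixPath

# Hypothetical mapping (t5): confidentiality level -> approved storage areas.
APPROVED_AREAS = {
    "secret": [PurePosixPath("/srv/secure")],
    "non-classified": [PurePosixPath("/srv/secure"), PurePosixPath("/srv/shared")],
}

def is_approved(level, location, mapping=APPROVED_AREAS):
    """Step (d): is the planned or current location inside an approved area?"""
    loc = PurePosixPath(location)
    return any(area == loc or area in loc.parents for area in mapping.get(level, []))

def check_document(level, location):
    """Step (e): 'reject' stands in for the first predetermined operation."""
    return "allow" if is_approved(level, location) else "reject"

print(check_document("secret", "/srv/secure/finance/q3.xlsx"))  # allow
print(check_document("secret", "/srv/shared/q3.xlsx"))          # reject
```

A level that has no entry in the mapping has no approved area, so every location is rejected for it; a real rule set would presumably define this case explicitly.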
The core of the invention is the use of natural language recognition technology to determine the confidentiality level of electronic documents dynamically. The significant advantage this brings is that the level is not decided simply by whether a keyword is "present" or "absent"; instead, the process of determining the level is abstracted into a mathematical model. This makes multiple classification of an electronic document (that is, determining its level from several angles) possible, and the determination is more efficient and reasonable and better suited to unified security management of electronic documents in complex systems.
Another significant advantage is that, because the level is determined automatically by natural language recognition, it can be re-determined dynamically and handled accordingly whenever the document content changes. This supports real-time management of sensitive data and better protects information security.
The method of the invention can effectively prevent leaks of sensitive data caused by staff carelessness and significantly reduces the manual labor required for data security management.
In keeping with the inventive concept, the above method and system can be further improved or varied, for example, but not limited to, as follows:
In a preferred embodiment, the method and system extract the feature vector of the electronic document according to keywords and/or regular expressions. Thus, when determining the confidentiality level, the invention uses not only dictionary-based keywords but also feature extraction based on regular expressions (for example, a company's contract numbering convention). This suits certain special requirements in practice and makes the security management of electronic documents more flexible and practical.
In a preferred embodiment, there is a single confidentiality space.
In another preferred embodiment, there may be multiple confidentiality spaces.
Preferably, when there are multiple confidentiality spaces, they are defined by different weight matrices, and one or several confidentiality levels correspond to one confidentiality space. Selecting multiple weight matrices to define multiple spaces reflects the different information gain that the same term carries for different types of confidentiality level, and so gives better discrimination between those types.
Further preferably, normalization is applied when the feature vector of the electronic document is extracted. This puts feature vectors on a common scale and makes documents of different lengths comparable.
Further preferably, the center vector and predetermined radius of each classified confidentiality level in the confidentiality space are obtained by learning on a sample set. Choosing suitable samples to form the sample set makes the classification results more reasonable and accurate.
Further preferably, the method or system may have several working modes: for example, a real-time mode, a background mode, or the ability to switch between the two. Choosing a suitable mode for the application makes effective use of system resources, and a mode in which real-time and background operation alternate helps further safeguard information security.
Further preferably, in real-time mode the predetermined condition is an operation that may cause sensitive data to leak, for example a user saving or copying the electronic document. Scanning of the electronic document starts when the predetermined condition occurs, so whether the current operation complies with the security rules is monitored in real time, which effectively safeguards information security.
Further preferably, in real-time mode the first predetermined operation comprises: refusing the save or copy operation, isolating the electronic document, requiring a document protection password to be entered, deleting the electronic document, or sending an alarm message to the management end. Different operations may be performed in different situations, and the system decides which operation to perform in which situation according to the abnormal-condition decision rule. As the simplest example, the client does not send an alarm message to the management end in every case, but only in the individual cases where it is necessary.
Further preferably, in background mode the predetermined condition is a scheduled or unscheduled scan carried out in the background. Background mode allows time-consuming operations such as scanning and calculation to run separately from normal business work, so system resources are used more efficiently.
Further preferably, in background mode the first predetermined operation comprises transferring the electronic document to a first designated area. Forcibly moving files that have security flaws allows them to be stored and managed separately, which facilitates later handling.
Further preferably, the first designated area is an isolation area or the approved storage area.
In some cases, if any one or more items of the security management rules change, the management end predefines new security management rules and sends them to the client, and the client performs steps (a) to (c) for each electronic document. It then judges whether the storage location of each electronic document lies within the approved storage area, and when the storage location of some document does not, a second predetermined operation is performed according to the abnormal-condition decision rule. In this way, when the rules change, the confidentiality levels of electronic documents are re-determined automatically and handled accordingly, so levels can be adjusted dynamically without large amounts of manual work.
Further preferably, the second predetermined operation comprises transferring the electronic document to a second designated area.
Further preferably, the second designated area is an isolation area or the approved storage area.
Further preferably, the management-end execution unit is an alarm unit, and the predetermined management-end operation is an alarm operation.
Brief description of the drawings
Further objects, functions, and advantages of the invention are illustrated by the following description of embodiments with reference to the accompanying drawings, in which:
Fig. 1 schematically shows the main steps of determining the confidentiality level of an electronic document according to the inventive concept.
Fig. 2 schematically shows part of the flow of security management of an electronic document according to a preferred embodiment of the invention.
Fig. 3 schematically shows a block diagram of the security management system for electronic documents according to a preferred embodiment of the invention.
Fig. 4 schematically shows a block diagram of the security management system for electronic documents according to another preferred embodiment of the invention.
Detailed description of the embodiments
The objects and functions of the invention, and the methods for achieving them, are explained with reference to exemplary embodiments. The invention is not, however, limited to the exemplary embodiments disclosed below; it can be implemented in different forms. The description is intended only to help those skilled in the relevant art gain a comprehensive understanding of the specific details of the invention.
The invention relates generally to a security management method and system for electronic documents. In essence, it applies natural language recognition technology to extract information from an electronic document automatically, for example the document content (such as payroll details, customer information, key code, contract numbers, bank accounts), the nature of the department the document relates to (such as finance, research and development, legal affairs), and the period the document relates to (such as fiscal year 2011), and stores the document in different storage areas according to the relevant security management rules. For example, documents containing sensitive data are stored in secure or high-confidentiality storage areas, and other documents in ordinary or low-confidentiality storage areas. The method and system can be applied in many settings, such as a single machine, a local area network, or a wide area network. The storage areas involved may be storage space on the local machine, the local area network, or the wide area network, such as a local hard disk, the disk array of a central server, a distributed storage system, or a cloud storage system. In particular, storage areas also include removable storage devices (such as USB flash drives, rewritable optical discs, and portable hard disks) attached to a local or networked computer, and the storage space of various other external devices.
In the present invention, the "confidentiality level" of an electronic document essentially means one of the categories into which electronic documents are divided according to some rule. Documents in different categories contain different sensitive data: for example, the sensitive data may differ in importance to its owner (as in the traditional classification into "ordinary, secret, confidential, top secret"), relate to different departments (documents belonging to research and development, finance, administration, and so on), or relate to different periods (documents belonging to fiscal year 2010, fiscal year 2011, and so on). It should be understood that in the present invention "confidentiality level" and "category" have the same meaning, as do "determining the confidentiality level" and "classifying".
Below, the principle by which the invention uses natural language recognition technology to determine the confidentiality level of an electronic document is first explained with reference to preferred embodiments.
For an electronic document, we use "terms" (entries) to denote all the features involved in analyzing and understanding the document, written "term". Preferably, the granularity of a term is one keyword or the object matched by one regular expression. Keywords are, for example, "ID card", "wage", or "purchase"; regular expressions are used, for example, to represent ID card numbers, currency amounts, dates, or codes. Obviously, for different parties the terms on which the confidentiality level is based may be the same or different. Preferably, the terms are determined empirically.
Let m unordered terms form the m-dimensional TERM vector $TERM = (term_1, term_2, \ldots, term_m)$. Correspondingly, each electronic document can be represented by a feature vector $a = (a_1, a_2, \ldots, a_m)$ corresponding to this TERM vector.
As shown in Fig. 1, according to the preferred embodiment of the invention, determining the confidentiality level of a given electronic document preferably comprises the following steps:
Step 101: Term frequency statistics.
In this step, the document is scanned and the number of times each term occurs in it is counted. Preferably, scanning supports both keyword matching and regular-expression matching. When the count is complete, the document is represented by the term-frequency vector $TF = (TF_1, TF_2, \ldots, TF_m)$, where $TF_i$ is the number of times the i-th term $term_i$ occurs in the document.
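Step 101 can be sketched in a few lines of Python. This is a minimal illustration rather than the patent's implementation: the `TERMS` list, the date pattern, and the function name `term_frequencies` are hypothetical, and the text is lower-cased before matching purely as a simplifying assumption.

```python
import re

# Hypothetical TERM vector: each entry is a literal keyword or a regular expression.
TERMS = [
    ("keyword", "wage"),
    ("keyword", "contract"),
    ("regex", r"\b\d{4}-\d{2}-\d{2}\b"),  # ISO-style dates (illustrative pattern)
]

def term_frequencies(text, terms=TERMS):
    """Return the term-frequency vector TF = (TF_1, ..., TF_m) for one document."""
    tf = []
    for kind, pattern in terms:
        if kind == "keyword":
            # Escape the keyword so that it is matched literally.
            pattern = re.escape(pattern)
        tf.append(len(re.findall(pattern, text)))
    return tf

sample = "Wage contract signed 2011-03-01; wage review due 2011-09-01."
print(term_frequencies(sample.lower()))  # [2, 1, 2]
```

Treating keywords as escaped patterns lets one matching routine serve both kinds of term, which mirrors the statement that scanning supports keyword matching and regular-expression matching.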
Step 102: Calculate the feature vector a.
In this step, a simple approach is to set a = TF, that is, $a_i = TF_i$, $i = 1, \ldots, m$.
However, because documents vary in length, classifying a document directly by the number of occurrences of $term_i$ may fail to give the desired result. Moreover, the amount of information each term provides for determining the confidentiality level may differ: a high frequency of some term in a document does not necessarily mean the document belongs to a high confidentiality level, whereas a single occurrence of some other term may be enough to place the document at a high level. Therefore, to reflect the different information gain carried by different terms, the occurrence count $TF_i$ of $term_i$ needs to be converted into a feature value $a_i$. Preferably, the following methods can be used to convert the term-frequency vector TF into the feature vector a:
1. TF*IDF method: let $a_i = TF_i \, w_i$;
2. TFC method: normalize the above result [formula given as image BSA00000761143000081, not reproduced in this text];
3. LTC method: reduce the influence of TF [formula given as image BSA00000761143000082, not reproduced in this text].
In the formulas above, $w_i$ denotes the weight of the i-th term $term_i$, $i = 1, \ldots, m$.
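The formula images for the TFC and LTC methods are not reproduced in this text. For orientation only, the forms usually given under these names in the text-categorization literature are shown below, written with the symbols defined above. Whether the patent's figures match them exactly, including the choice of an L2 norm and of $\log(TF_i + 1)$, is an assumption.

```latex
\begin{aligned}
\text{TF*IDF:}\quad & a_i = TF_i \, w_i \\
\text{TFC:}\quad & a_i = \frac{TF_i \, w_i}{\sqrt{\sum_{j=1}^{m} \left(TF_j \, w_j\right)^2}} \\
\text{LTC:}\quad & a_i = \frac{\log(TF_i + 1)\, w_i}{\sqrt{\sum_{j=1}^{m} \left(\log(TF_j + 1)\, w_j\right)^2}}
\end{aligned}
```

Under these forms every feature vector has unit length, which is one way to make documents of different lengths comparable.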
In the present invention, the m × m diagonal matrix whose diagonal elements are $w_i$, $i = 1, \ldots, m$, is called the weight matrix W of the confidentiality space. The simple approach and the TF*IDF method can then both be written $a = TF \times W$; for the simple approach, W is the identity matrix, that is, $w_i = 1$, $i = 1, \ldots, m$.
In practice, the values of $w_i$ are preferably determined by learning on a sample set. The documents in the sample set can be chosen empirically, with noise removed where necessary, to obtain the best learning effect.
In one preferred case, $w_i = \log(N / DF_i)$,
where:
N is the total number of documents in the sample set;
$DF_i$ is the number of documents in the sample set in which the i-th term $term_i$ occurs (the smaller $DF_i$ is, the more discriminating the term $term_i$ is).
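The weight formula $w_i = \log(N / DF_i)$ can be computed directly from the term-frequency vectors of the sample documents. The sketch below is illustrative: the function name `idf_weights` and the sample data are hypothetical, the natural logarithm is used because the source does not state a base, and assigning weight 0.0 to a term that occurs in no sample document is an assumption.

```python
import math

def idf_weights(sample_tf_vectors):
    """Compute w_i = log(N / DF_i) from the TF vectors of a sample set.

    sample_tf_vectors: list of TF vectors, one per sample document.
    Terms that occur in no sample document get weight 0.0 (an assumption;
    the source does not say how DF_i = 0 is handled).
    """
    n_docs = len(sample_tf_vectors)
    m = len(sample_tf_vectors[0])
    weights = []
    for i in range(m):
        # DF_i: number of sample documents in which term i occurs at least once.
        df = sum(1 for tf in sample_tf_vectors if tf[i] > 0)
        weights.append(math.log(n_docs / df) if df else 0.0)
    return weights

samples = [[2, 0, 1], [1, 0, 0], [3, 1, 0], [1, 0, 0]]
w = idf_weights(samples)
print(w)
```

In this sample the first term occurs in every document, so its weight is log(4/4) = 0: a term that appears everywhere contributes nothing to discrimination, which is the point of the parenthetical remark about $DF_i$.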
Thus, in the present invention the extraction rule for the feature vector of an electronic document is the rule by which the feature vector is computed from the document's content. It comprises the chosen TERM vector and the rule (formula) used to compute the feature vector in step 102 above.
Step 103: calculate similarity (distance), determine the level of confidentiality of electronic document
According to the preferred embodiment of the present invention, according to the proper vector a of electronic document, determine that the level of confidentiality under this electronic document carries out in vector space (being called level of confidentiality space in the present invention).Particularly, in level of confidentiality space, the proper vector of each electronic document can be regarded a point as, and each level of confidentiality (classification) can be regarded a region in level of confidentiality space as.When the corresponding point of proper vector of certain electronic document falls into certain corresponding region of level of confidentiality, think that this electronic document belongs to this level of confidentiality.Thus, determine that one piece of electronic document belongs to the corresponding point of proper vector which level of confidentiality namely judges this electronic document and belongs to which region in level of confidentiality space.
According to the preferred embodiment of the present invention, level of confidentiality space comprises concerning security matters level of confidentiality (being also concerning security matters level of confidentiality region) and non-concerning security matters level of confidentiality (being also non-concerning security matters level of confidentiality region).Wherein, concerning security matters level of confidentiality is the level of confidentiality under the document that contains sensitive data, and level of confidentiality space can comprise one or more concerning security matters levels of confidentiality (such as two, three, four concerning security matters levels of confidentiality etc.); Non-concerning security matters level of confidentiality is the level of confidentiality under the document that does not contain sensitive data.
In a preferred implementation, the extent of each classified level is determined by a center vector Q and a predetermined radius (or predetermined distance) r; a point that falls within the extent of no classified level belongs to the unclassified level. The level of an electronic document is determined by calculating the similarity (distance) between its feature vector and the center vector of each classified level. For example, when the distance from the feature vector a of an electronic document to the center vector Q of a classified level is less than or equal to r, the document is considered to belong to that level; when the distance is greater than r, the document is considered not to belong to that level.
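The center-and-radius test described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the Euclidean distance, the function names, and the example centers and radii are all assumptions chosen for readability.

```python
import math

def euclidean(a, q):
    # Assumed distance measure; the text leaves the choice of measure open.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, q)))

def classify(a, levels):
    """Return the names of all classified levels whose region contains point a.

    levels maps a level name to (center vector Q, predetermined radius r).
    An empty list means the document belongs to the unclassified level.
    """
    return [name for name, (q, r) in levels.items() if euclidean(a, q) <= r]

# Hypothetical classified levels in a 2-dimensional space.
levels = {
    "secret": ((1.0, 0.0), 0.5),
    "top_secret": ((0.0, 1.0), 0.3),
}

print(classify((0.9, 0.1), levels))  # inside the "secret" region only
print(classify((0.5, 0.5), levels))  # outside every region: unclassified
```

Returning a list rather than a single name keeps the case where a point falls in several regions visible, so that a separate exception rule can resolve it.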
Preferably, the distance between the feature vector $D = (a_1, a_2, \ldots, a_m)$ of the current document and the center vector $Q = (b_1, b_2, \ldots, b_m)$ of the target confidentiality level is calculated, for example, by one of the following methods:
1. Dot distance: $\mathrm{Sim}(D, Q) = D \cdot Q = \sum_{i=1}^{m} (a_i \times b_i)$;
2. Cosine distance (cosine similarity): $\mathrm{Sim}(D, Q) = \dfrac{D \cdot Q}{\|D\| \times \|Q\|} = \dfrac{\sum_{i=1}^{m} (a_i \times b_i)}{\sqrt{\sum_{i=1}^{m} a_i^2 \times \sum_{i=1}^{m} b_i^2}}$;
3. Dice distance: $\mathrm{Sim}(D, Q) = \dfrac{2 \times D \cdot Q}{\|D\|^2 + \|Q\|^2} = \dfrac{2 \sum_{i=1}^{m} (a_i \times b_i)}{\sum_{i=1}^{m} a_i^2 + \sum_{i=1}^{m} b_i^2}$;
4. Jaccard distance: $\mathrm{Sim}(D, Q) = \dfrac{D \cdot Q}{\|D\|^2 + \|Q\|^2 - D \cdot Q} = \dfrac{\sum_{i=1}^{m} (a_i \times b_i)}{\sum_{i=1}^{m} a_i^2 + \sum_{i=1}^{m} b_i^2 - \sum_{i=1}^{m} (a_i \times b_i)}$.
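The four measures listed above can be written directly as short functions. The sketch below assumes vectors are plain sequences of numbers of equal length and that neither vector is all zeros (otherwise the last three measures divide by zero); the function names are illustrative.

```python
import math

def dot(d, q):
    # Sum of component-wise products.
    return sum(a * b for a, b in zip(d, q))

def cosine(d, q):
    # Dot product normalized by the two vector lengths.
    return dot(d, q) / (math.sqrt(dot(d, d)) * math.sqrt(dot(q, q)))

def dice(d, q):
    # Twice the dot product over the sum of squared lengths.
    return 2 * dot(d, q) / (dot(d, d) + dot(q, q))

def jaccard(d, q):
    # Dot product over (sum of squared lengths minus the dot product).
    return dot(d, q) / (dot(d, d) + dot(q, q) - dot(d, q))

d = (1, 2, 0)
q = (2, 1, 1)
for measure in (dot, cosine, dice, jaccard):
    print(measure.__name__, measure(d, q))
```

Note that all four grow as the vectors become more alike, so when one of them is used in place of a geometric distance the comparison against a threshold has to be oriented accordingly.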
Preferably, the center vector Q and the predetermined radius r of a classified level can be determined by learning from a sample set, combined with experience. As noted above, the documents in the sample set can be chosen according to experience, with noise removed where necessary, to obtain the best learning effect. The documents in the sample set should cover every classified level as well as the unclassified level. A conventional classifier, such as a Rocchio classifier, can be used to determine the center vectors and predetermined radii. Those skilled in the art will understand that adjusting the weight matrix changes the position of the feature vector of each sample document in the confidentiality-level space, and therefore also changes the center vector and predetermined radius calculated for each classified level. The weight matrix can be selected according to experience and/or need so that the classified levels are sufficiently well separated in the confidentiality-level space and the level of an electronic document can be determined reliably. The predetermined radius r may be the same or different for different classified levels, to obtain a better classification result. Specific decision rules can also be set according to experience and/or need to handle exceptional situations, for example to decide the final level of an electronic document when the calculation shows that it belongs to several classified levels.
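As a rough illustration of learning Q and r from samples, the sketch below takes the centroid of the sample feature vectors of one classified level as its center and the largest sample-to-center distance as its radius. This is a simplified, positive-examples-only variant in the spirit of a Rocchio classifier, not the full algorithm; the `margin` parameter and the Euclidean distance are assumptions.

```python
import math

def centroid(vectors):
    # Component-wise mean of the sample feature vectors.
    n = len(vectors)
    return tuple(sum(column) / n for column in zip(*vectors))

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def learn_level(samples, margin=1.0):
    """Return (center vector Q, radius r) for one classified level.

    samples: feature vectors of documents known to belong to the level.
    margin:  hypothetical tuning factor; values above 1 enlarge the region.
    """
    q = centroid(samples)
    r = margin * max(distance(s, q) for s in samples)
    return q, r

samples = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
q, r = learn_level(samples)
print(q, r)
```

In practice the radius would be tuned against samples of the other levels as well, so that regions of different classified levels do not overlap more than the exception rules can handle.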
The above describes the simple case of determining the confidentiality level of an electronic document; more complex cases are discussed below.
According to a preferred embodiment of the present invention, the confidentiality level takes the form of a vector (called the confidentiality-level vector), which may be one-dimensional or multi-dimensional. Preferably, a mapping from regions of the confidentiality-level space to values of the confidentiality-level vector can be established according to experience and/or need; for example, one or more regions of the space may correspond to one value of the vector. Using a confidentiality-level vector (especially a multi-dimensional one) to determine the level of an electronic document makes management more flexible and better suited to complex scenarios.
When the confidentiality-level vector is one-dimensional, the situation resembles the ordinary one. For example, the confidentiality level of electronic documents is divided into three grades: ordinary, confidential, and secret. The corresponding vector can be written $\alpha = (a_1)$, where $a_1$ takes one of three values, for example from {1, 2, 3} (1 for "ordinary", 2 for "confidential", 3 for "secret"). The confidentiality level in the usual sense can thus be regarded as a special case of the confidentiality-level vector, namely a one-dimensional one.
In a slightly more complex case, the confidentiality level of electronic documents is divided according to two factors, importance and department, giving six cases in total: finance level 1 to finance level 3 and R&D level 1 to R&D level 3. The confidentiality-level vector can then be a two-dimensional vector $\alpha = (a_1, a_2)$, where $a_1$ corresponds to importance and takes one of three values, for example from {1, 2, 3}, and $a_2$ corresponds to the department and takes one of two values, for example from {finance, R&D}. The vector thus has six possible values.
In another case, the confidentiality-level vector can also be a two-dimensional vector $\alpha = (a_1, a_2)$, where $a_1$ corresponds to the financial confidentiality level and takes one of four values, for example from {0, 1, 2, 3}; a larger number represents a higher financial level, and 0 means the document is unrelated to finance. $a_2$ corresponds to the R&D confidentiality level and takes one of three values, for example from {0, 1, 2}; a larger number represents a higher R&D level, and 0 means the document is unrelated to R&D. For example, for a document whose content is "budget of a certain development project", the corresponding vector $\alpha$ may take the value (1, 2), indicating that its financial level is 1 and its R&D level is 2.
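The two-dimensional example above maps naturally onto a small immutable record. The sketch below is illustrative only; the type name, field names, and validation helper are assumptions, while the value ranges {0, 1, 2, 3} and {0, 1, 2} come from the text.

```python
from typing import NamedTuple

FINANCE_RANGE = range(0, 4)  # 0 = unrelated to finance, 3 = highest level
RND_RANGE = range(0, 3)      # 0 = unrelated to R&D, 2 = highest level

class LevelVector(NamedTuple):
    finance: int
    rnd: int

def make_level_vector(finance, rnd):
    # Reject values outside the ranges given in the example.
    if finance not in FINANCE_RANGE or rnd not in RND_RANGE:
        raise ValueError(f"invalid level vector ({finance}, {rnd})")
    return LevelVector(finance, rnd)

# "Budget of a certain development project": finance level 1, R&D level 2.
budget_doc = make_level_vector(1, 2)
print(budget_doc)
```

Because a `NamedTuple` is hashable, such a value can be used directly as a key in a lookup table.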
It should be understood that a value of the confidentiality-level vector is in effect a unique identifier for each class of electronic document containing different sensitive data, used to distinguish the classes from one another. It can therefore take any form that those skilled in the art consider appropriate; for example, each component may be a continuous or discrete number, a character string, and so on, provided that the classes of electronic document can be distinguished. Different entities can choose different value ranges for the confidentiality-level vector, for example according to their internal control systems. The terms "high confidentiality level" and "low confidentiality level" used in some examples of the method and system of the present invention are merely illustrative: for convenience of description, electronic documents are divided into only two levels (corresponding to a one-dimensional confidentiality-level vector with the two values {high, low}). This is not intended to limit the invention.
When term frequencies are counted for an electronic document, all terms are identified and counted in a single pass. For different types of confidentiality level, however, the information gain carried by a given term is not necessarily the same; for example, the information gain that an employee's name provides for finance-type levels generally differs from the information gain it provides for R&D-type levels. Different weight matrices can therefore be set so that the same term has a different information gain for different types of level. Because the weight matrices differ, the TERM vector of the same electronic document plays a different role in each calculation (in some cases finance-related terms are weighted more heavily, in others technology-related terms are), which yields better separation for the different types of level. In this case several confidentiality-level spaces are in fact formed, each defined by a different weight matrix, and one or more confidentiality levels correspond to each space.
Accordingly, in step 102 a feature vector of the electronic document is calculated in each confidentiality-level space. In step 103 it is determined, in each space, whether the electronic document belongs to the classified level (when the space contains only one classified level) or which classified level it belongs to (when the space contains several). The results from all spaces are then combined, and the value of the confidentiality-level vector of the electronic document is determined according to a predetermined multi-space decision rule. This rule is set according to experience and/or need, and is mainly used to determine the final value of the vector when an electronic document belongs to a classified level in several spaces.
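The multi-space procedure can be sketched as follows. Several simplifications are assumed and are not taken from the source: each weight matrix is diagonal and stored as one weight per term, the distance is Euclidean, and the multi-space decision rule simply uses one vector component per space, set to the highest matching level in that space or 0 if none matches. All numbers are invented for the example.

```python
import math

def feature_vector(term_vector, weights):
    # Diagonal weight matrix: scale each term count by its weight.
    return tuple(t * w for t, w in zip(term_vector, weights))

def distance(a, q):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, q)))

def level_in_space(term_vector, space):
    """Highest classified level whose region contains the document, else 0."""
    a = feature_vector(term_vector, space["weights"])
    hits = [lvl for lvl, (q, r) in space["levels"].items() if distance(a, q) <= r]
    return max(hits, default=0)

def level_vector(term_vector, spaces):
    # Assumed decision rule: one component of the level vector per space.
    return tuple(level_in_space(term_vector, s) for s in spaces)

# Term vector: (count of finance terms, count of technical terms).
finance_space = {"weights": (1.0, 0.1),
                 "levels": {1: ((3.0, 0.0), 1.0), 2: ((8.0, 0.0), 2.0)}}
rnd_space = {"weights": (0.1, 1.0),
             "levels": {1: ((0.0, 2.0), 1.0), 2: ((0.0, 4.0), 1.0)}}

budget_terms = (3, 4)
print(level_vector(budget_terms, [finance_space, rnd_space]))
```

With these invented parameters the same term vector lands in finance level 1 and R&D level 2, mirroring the "(1, 2)" budget example given earlier.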
It should be understood that there is no necessary relationship between the dimension of the confidentiality-level vector and the number of confidentiality-level spaces. With a single space the vector may still be multi-dimensional; conversely, with several spaces the vector may be one-dimensional. In one preferred case the dimension of the vector equals the number of spaces, which gives a clear mapping and makes electronic documents easier to manage.
Thus, in the present invention, the determination rule for the confidentiality-level vector of an electronic document refers to the rule by which the vector is determined from the feature vector of the electronic document. It comprises the rules used in step 103 above, including but not limited to the rule for determining the level from the feature vector, the exception-handling rule (where needed), the multi-space decision rule (when there are several confidentiality-level spaces), and the mapping from regions to confidentiality-level vector values (where needed).
The above briefly describes the principle by which the present invention determines the confidentiality level of an electronic document. Determining the level in this way makes the classification result more flexible and reasonable and better suited to complex scenarios.
The present invention is further described below with reference to preferred embodiments.
According to a preferred embodiment of the present invention, the safety management method for electronic documents comprises the following steps:
The safety management rules for electronic documents are predefined at the management end and sent to the client. These rules comprise at least the following:
(t1) the extraction rule for the feature vector of an electronic document;
(t2) the weight matrix of the confidentiality-level space;
(t3) the center vector and predetermined radius of each classified level in the confidentiality-level space;
(t4) the determination rule for the confidentiality-level vector of an electronic document;
(t5) the mapping from confidentiality-level vectors to storage areas;
(t6) the abnormal-condition decision rule.
The client performs the following steps in predetermined situations:
(a) scan the content of the electronic document and extract its feature vector in the confidentiality-level space according to the extraction rule for the feature vector and the weight matrix of the confidentiality-level space;
(b) in the confidentiality-level space, calculate the distance between the feature vector and the center vector of each classified level, and determine the confidentiality-level vector of the electronic document according to the determination rule for the confidentiality-level vector;
(c) according to the mapping from confidentiality-level vectors to storage areas, determine the approved storage area corresponding to the confidentiality-level vector of the electronic document;
(d) determine whether the intended storage location or the current storage location of the electronic document is contained in the approved storage area;
(e) if the intended storage location or the current storage location is not contained in the approved storage area, perform the first predetermined operation according to the abnormal-condition decision rule.
According to a preferred embodiment of the present invention, one key point of the invention is that the safety management rules for electronic documents are predefined at the management end and then sent to the client, and the client performs the relevant verification and security processing of electronic documents. There may be one or more clients. The benefit of having the client perform verification and security processing is that electronic documents need not be sent to the management end, which reduces the load on the network. Because the computation is done by the client, the computing load on the management end is also significantly reduced and its response to clients is faster.
The safety management rules for electronic documents cover several aspects. Preferably, they comprise at least the rules for determining the confidentiality level of an electronic document using natural language recognition technology (including the extraction rule for the feature vector, the weight matrix of the confidentiality-level space, the center vector and predetermined radius of each classified level, and the determination rule for the confidentiality-level vector), the rules relating confidentiality levels to storage locations (including the mapping from confidentiality-level vectors to storage areas), and the abnormal-condition decision rule. The abnormal-condition decision rule is described in detail below.
According to a preferred mode of the present invention, the above safety management rules are predefined at the management end and then sent to the client. The client manages the security of electronic documents according to these rules.
Fig. 2 shows part of the flow of safety management of an electronic document according to a preferred embodiment of the present invention.
As shown in Fig. 2, according to a preferred embodiment of the present invention, in certain predetermined situations (for example when an electronic document is saved or copied; details are given below), the client performs the following steps SA to SE to manage the security of the electronic document:
Step SA: Scan the content of the electronic document and extract its feature vector in the confidentiality-level space according to the extraction rule for the feature vector and the weight matrix of the confidentiality-level space.
Step SB: In the confidentiality-level space, calculate the distance between the feature vector and the center vector of each classified level, and determine the confidentiality-level vector of the electronic document according to the determination rule for the confidentiality-level vector.
Step SA corresponds essentially to steps 101 and 102 described above, and step SB corresponds essentially to step 103. At this point the confidentiality-level vector of the electronic document has been assigned a particular value. Because the relevant content has been described in detail above, it is not repeated here.
Step SC: According to the mapping from confidentiality-level vectors to storage areas, determine the approved storage area corresponding to the confidentiality-level vector of the electronic document.
In the present invention, an approved storage area is the specific storage area that corresponds to a value of the confidentiality-level vector. Establishing a mapping from confidentiality-level vectors to storage areas allows electronic documents to be stored at the intended locations accurately and efficiently. The storage space as a whole is in fact a set of concrete sub-storage areas, some or all of which are defined as the approved storage areas. Preferably, the relationship between confidentiality-level vector values and storage areas can be one-to-one (each value corresponds to its own approved storage area) or many-to-one (several values correspond to the same approved storage area). In some cases there may also be concrete sub-storage areas with no corresponding confidentiality-level vector value (for example, areas reserved as spare or quarantine areas). The approved storage area corresponding to a value of the confidentiality-level vector can be determined by looking it up in a mapping table.
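A mapping-table lookup of this kind can be sketched with a dictionary. The area identifiers and vector values below are invented for illustration; the table shows a many-to-one entry and an area (`area_quarantine`) that no vector value maps to.

```python
# Hypothetical mapping from level-vector values to approved storage areas.
APPROVED_AREA = {
    (0, 0): "area_general",
    (1, 0): "area_finance",       # many-to-one: two values share one area
    (2, 0): "area_finance",
    (1, 2): "area_rnd_budget",
}

# A sub-storage area with no level-vector value mapped to it.
RESERVED_AREAS = {"area_quarantine"}

def approved_area(level_vector):
    """Look up the approved storage area for a level-vector value."""
    try:
        return APPROVED_AREA[level_vector]
    except KeyError:
        raise LookupError(f"no approved storage area for {level_vector}") from None

print(approved_area((1, 2)))
```

Raising an error for an unmapped value, rather than falling back to a default area, is a design choice made here so that gaps in the table are noticed instead of silently misfiling documents.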
It should be understood that an "approved storage area" in the present invention is in effect an identifier that corresponds to (or "points to") a set of storage locations on the local machine and/or the network. In some cases the approved storage area corresponds directly to these storage locations. In slightly more complex cases, the physical addresses of the storage locations are obtained from the approved storage area by address resolution. In other cases the approved storage area is a mapping to the storage locations, and the physical addresses are preferably obtained through steps such as table lookup, computation, and/or resolution.
An approved storage area can preferably be expressed as a set of folders (or a set of folders and storage devices), but it is not limited to this; each approved storage area need only be a set of storage locations. These storage locations may lie in the same or different folders, on the same or different disk partitions, or even at different network locations. For example, in a distributed storage system or a cloud storage system, the storage locations corresponding to one approved storage area may reside on different network nodes.
Safety management can be achieved by dividing the whole storage space into areas with different protection modes, using controls such as different encryption algorithms, keys, and access privileges. For example, a storage area holding sensitive data can use a more complex encryption algorithm, a long key, and/or the strictest access privileges, whereas a storage area holding data of ordinary sensitivity can use a common encryption algorithm, a short key, and/or undifferentiated access privileges. Such safety management means are well known in the art, and those skilled in the art can use any means they consider appropriate to manage the security of storage areas.
Step SD: Determine whether the intended storage location or the current storage location of the electronic document is contained in the approved storage area.
In step SD, the intended storage location or the current storage location of the electronic document is obtained, and it is determined whether that location is contained in the approved storage area determined in step SC. Several cases arise, which are described in detail below.
It should be understood that the object of the present invention is the safety management of electronic documents. The invention can therefore operate in several modes, for example a real-time mode, a background mode, or switching between the two.
In real-time mode, scanning is preferably started when an operation occurs that may leak sensitive data. In this case the predetermined situations include, for example, a user saving or copying an electronic document. A "save" operation here may be a save to a local storage location or a network storage location (including peripherals such as removable storage devices connected locally or over the network), and refers mainly to saving the electronic document as a whole. A "copy" operation refers mainly to copying the electronic document as a whole. According to a preferred embodiment of the present invention, when such an operation occurs, scanning is started and the intended storage location of the electronic document (the target location of the save or copy operation) is checked. The operation is allowed to continue only when the intended storage location is contained in the approved storage area corresponding to the document's confidentiality level, in which case the intended storage location is considered sufficiently secure. Data security is thereby ensured in real time.
In background mode, the electronic documents of the relevant entity can be scanned periodically to check whether their storage locations comply with the rules. The scan can also be run at irregular times, for example when the system is idle. In this case the predetermined situation mentioned above refers to the regular or irregular scan carried out in the background. As needed and/or desired, a background scan can be started automatically or manually, and it can cover all electronic documents or only some of them, for example those in specified directories. When the method runs in background mode, time-consuming operations such as scanning and calculation are separated from normal business work and do not interfere with it. For performance reasons, if neither the content of a document nor the scanning rules have changed since the last scan, the document need not be rescanned and the result of the last scan can be used directly.
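The skip-if-unchanged optimization for background scans can be sketched with a small cache keyed by a content hash and a rule version. The class name, the `rule_version` parameter, and the use of SHA-256 are assumptions made for illustration.

```python
import hashlib

class ScanCache:
    """Remembers the last scan result for each document path."""

    def __init__(self):
        self._entries = {}  # path -> (fingerprint, result)

    @staticmethod
    def _fingerprint(content, rule_version):
        # Changes whenever the document content or the scanning rules change.
        return (hashlib.sha256(content).hexdigest(), rule_version)

    def lookup(self, path, content, rule_version):
        """Return the cached result, or None if a rescan is required."""
        entry = self._entries.get(path)
        if entry is not None and entry[0] == self._fingerprint(content, rule_version):
            return entry[1]
        return None

    def store(self, path, content, rule_version, result):
        self._entries[path] = (self._fingerprint(content, rule_version), result)

cache = ScanCache()
cache.store("/docs/plan.txt", b"project budget", 1, (1, 2))
print(cache.lookup("/docs/plan.txt", b"project budget", 1))   # cached result
print(cache.lookup("/docs/plan.txt", b"project budget!", 1))  # content changed
```

Hashing still requires reading the file; a cheaper but weaker variant could compare modification time and size instead.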
In addition, where needed and/or desired, the method can switch between real-time mode and background mode to manage electronic documents flexibly.
The "intended storage location" and the "current storage location" mentioned above may be physical addresses or may be expressed in other forms. When they are expressed in another form, they can first be resolved to physical addresses, after which it is determined whether they are contained in the approved storage area.
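When an approved storage area is expressed as a set of folders, the containment test of step SD can be sketched with path comparison. The sketch assumes POSIX-style paths that have already been resolved to absolute form; it does not follow symbolic links or collapse ".." segments, which a real check would need to do first. The function name and folder names are illustrative.

```python
from pathlib import PurePosixPath

def is_contained(location, area_roots):
    """True if location lies in (or is) one of the folders of the area."""
    path = PurePosixPath(location)
    roots = [PurePosixPath(root) for root in area_roots]
    # Compare whole path components, not string prefixes.
    return any(path == root or root in path.parents for root in roots)

finance_area = ["/secure/finance", "/mnt/vault/finance"]
print(is_contained("/secure/finance/q1/budget.doc", finance_area))
print(is_contained("/tmp/budget.doc", finance_area))
```

Comparing path components rather than raw string prefixes avoids treating `/secure/finance2/...` as being inside `/secure/finance`.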
According to a preferred embodiment of the present invention, a "yes" result in step SD means that the location where the scanned electronic document is to be stored, or where it is currently stored, matches its confidentiality level, and the method ends. In real-time mode this means that the client does not intervene in the user's action, and the operation that triggered the scan proceeds normally. For example, if the operation that triggered the scan was a "save", the save continues and the electronic document is processed in the usual way (for example encrypted) according to the safety management means of the approved storage area. In background mode it means that processing of this electronic document is complete and the client moves on to the next one; during this time the client can perform routine operations such as displaying progress information (for example the directory or document currently being processed and the overall progress).
Step SE: If the intended storage location or the current storage location is not contained in the approved storage area, perform the first predetermined operation according to the abnormal-condition decision rule.
When the result of step SD is "no", the method proceeds to step SE: the first predetermined operation is performed according to the abnormal-condition decision rule to manage the security of the electronic document. The method then ends.
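Steps SA to SE can be tied together in a short driver function. The sketch below passes each rule in as a callable or table so that the control flow is visible on its own; the dictionary keys and the toy rules are hypothetical stand-ins for rules (t1) to (t6), not the patent's actual data format.

```python
def manage_document(document, location, rules):
    """Run steps SA-SE for one document and return the resulting action."""
    feature = rules["extract_feature"](document)                    # step SA
    level = rules["determine_level"](feature)                       # step SB
    area = rules["area_for_level"][level]                           # step SC
    if rules["is_contained"](location, area):                       # step SD
        return "allow"
    return rules["first_predetermined_operation"](level, location)  # step SE

# Toy rules for illustration only.
toy_rules = {
    "extract_feature": lambda doc: doc.lower().count("budget"),
    "determine_level": lambda feature: "high" if feature > 0 else "low",
    "area_for_level": {"high": "/secure", "low": "/shared"},
    "is_contained": lambda location, area: location.startswith(area + "/"),
    "first_predetermined_operation": lambda level, location: "refuse",
}

print(manage_document("Project Budget 2012", "/shared/a.doc", toy_rules))
print(manage_document("Project Budget 2012", "/secure/a.doc", toy_rules))
```

Keeping the rules as data handed to the client matches the arrangement in which rules are defined at the management end and evaluated at the client.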
It should be understood that a "no" result in step SD indicates that the location where the electronic document is to be stored, or where it is currently stored, does not match its confidentiality level. For example, a high-level electronic document containing sensitive data is about to be stored (or is stored) in a less secure storage area corresponding to a low level. Alternatively, the sensitive data has been deleted from an electronic document that originally contained it, so that under the safety management rules the document should no longer be kept in the more secure storage area corresponding to high-level documents. In such cases, performing the first predetermined operation brings the storage location of the electronic document into agreement with its confidentiality level (or leaves the document for further manual or automatic processing).
Preferably, when the method runs in real-time mode (where scanning is started, for example, when a user saves or copies an electronic document), a "no" result in step SD indicates that the intended storage location is not within the approved storage area corresponding to the electronic document, so the intended operation is considered to violate the safety rules and should not be carried out. In this case the first predetermined operation comprises, for example, refusing the save or copy operation. This is a relatively strict way of handling the situation. Preferably, a prompt is displayed to inform the user that the operation has been refused. The save or copy operation that triggered the scan therefore cannot proceed, and data security is effectively protected.
The electronic document can also be handled in other ways. For example, the first predetermined operation may be isolating the electronic document, requiring input of a document protection password, deleting the electronic document, sending warning information to the management end, and/or prompting the user with the determined confidentiality level and/or approved storage area so that the user can take further action.
Preferably, when the method runs in background mode, a "no" result in step SD indicates that the current storage location of the electronic document is not within the approved storage area corresponding to the document, so the document is considered to violate the safety rules. In this case the first predetermined operation comprises transferring the electronic document to a first designated area. Preferably, the first designated area is the approved storage area corresponding to the electronic document, so that the document is moved to the area matching its confidentiality level and automatic safety management is achieved. The first designated area can also be a quarantine area; a document transferred there can subsequently be handled manually or by another method. In this way, electronic documents whose storage location is defective (that is, whose confidentiality level does not correspond to the approved storage area) are at least screened out and isolated, achieving at least semi-automatic management.
Besides the above, the first predetermined operation can take other forms, such as sending warning information to the management end, or displaying a prompt so that the matter can be handled manually at the client in real time.
It should be understood that, in both real-time mode and background mode, the first predetermined operation is not limited to a single operation (action). Several operations, including those described above, can be performed together; for example, relevant information can be sent to the management end while the current operation is refused or the electronic document is transferred. In other words, the first predetermined operation can comprise several operations. In background mode, information about the scan and/or about the storage of electronic documents can also be sent to the management end during and/or after the scan, as needed and/or desired, so that the management end can produce reports or log files.
As needed and/or desired, in some cases a predetermined management-end operation can be performed at the management end while the client performs the first predetermined operation. Preferred management-end operations include raising an alarm and generating log files; an alarm may consist of, for example, displaying a prompt, sending an e-mail to a specified address, or sending a text message to a specified number. The management end can also generate or modify reports on the distribution of electronic documents and/or on scan data, regularly, irregularly, or on demand, to support safe and effective management of electronic documents.
It should be understood that the client may carry out different operations (such as refusing, quarantining, or alerting) depending on the mode (for example real-time or background mode), the user or user group, the position of the device in the network topology, the confidentiality-level vector, and/or the operating scenario (such as saving or copying), and that the management end also carries out certain operations where necessary. For example, suppose the predetermined situation is a user performing a copy operation in real-time mode and the result of step SD is "No". The client may then act differently for different users: for a user with higher privileges, the client carries out a refusal operation; for a user with lower privileges, the client carries out a refusal operation and sends alert information to the management end, and the management end carries out an alarm operation at the same time. The rule by which it is decided which operation or operations the client and/or the management end should carry out is the abnormal-situation decision rule.
Thus, in the present invention, the abnormal-situation decision rule is the rule on which the client and/or the management end base their operations when the intended or existing storage location of an electronic document does not belong to the approved storage area. Specifically, it is the rule that determines, from information such as the operating mode, the operating scenario, the position of the device in the network topology, the confidentiality-level vector, and/or the user or user group, which operation or operations the client and the management end should each carry out. The abnormal-situation decision rule can be configured as needed, thereby controlling which operations the client and the management end carry out in which situations.
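One way to picture such a decision rule is as a lookup table from the situation to the operations of each side. The sketch below is only an illustration: the `Context` fields, the operation names, and the fallback entry are assumptions, and the table covers only three of the inputs the text lists (mode, scenario, user privilege).

```python
from typing import Dict, List, NamedTuple, Tuple


class Context(NamedTuple):
    mode: str        # "realtime" or "background"
    scenario: str    # e.g. "save", "copy", "scan"
    privilege: str   # assumed user attribute: "high" or "low"


# (client operations, management-end operations)
Operations = Tuple[List[str], List[str]]

# Hypothetical rule table; the first two rows mirror the copy example
# given in the description.
RULES: Dict[Context, Operations] = {
    Context("realtime", "copy", "high"): (["refuse"], []),
    Context("realtime", "copy", "low"): (["refuse", "send_alert"], ["raise_alarm"]),
    Context("background", "scan", "low"): (["quarantine", "send_alert"], ["raise_alarm"]),
}

# Assumed conservative default for situations the table does not list.
FALLBACK: Operations = (["refuse", "send_alert"], ["raise_alarm"])


def decide(ctx: Context) -> Operations:
    """Return the operations for the client and for the management end."""
    return RULES.get(ctx, FALLBACK)
```

For instance, `decide(Context("realtime", "copy", "high"))` yields a refusal at the client and no management-end operation, matching the higher-privilege case in the text.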
In some cases the security management rules for electronic documents may change. For example, as the entity concerned, such as an enterprise, develops, some data that used to be sensitive may cease to be sensitive after a period of time, while new sensitive data may appear. Electronic documents then need to be readjusted (for example, re-filed) according to the new rules.
In this case, the management end predefines new security management rules and sends them to the client. For each electronic document, the client scans its content and extracts the document's feature vector in the confidentiality-level space according to the extraction rule for feature vectors and the weight matrix of the confidentiality-level space. In the confidentiality-level space, the client calculates the distance between the feature vector and the center vector of each classified confidentiality level and determines the confidentiality-level vector of the electronic document according to the determination rule for confidentiality-level vectors. It then determines the approved storage area corresponding to that confidentiality-level vector according to the mapping from confidentiality-level vectors to storage areas. Finally, it judges whether the existing storage location of each electronic document is contained in the approved storage area; when the storage location of an electronic document is not contained in the approved storage area, a second predetermined operation is carried out according to the abnormal-situation decision rule.
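The re-scan after a rule change can be sketched as a loop over the stored documents. In this minimal illustration the classification under the new rules is passed in as a function, the confidentiality-level vector is simplified to a single string key, and storage areas are directory paths; the name `find_misplaced` and these simplifications are assumptions, not part of the patent.

```python
from pathlib import PurePath
from typing import Callable, Dict, Iterable, List, Tuple


def find_misplaced(documents: Iterable[PurePath],
                   classify: Callable[[PurePath], str],
                   approved_areas: Dict[str, PurePath]) -> List[Tuple[PurePath, PurePath]]:
    """Return (document, approved_area) pairs for documents that are stored
    outside the area approved for their newly determined level."""
    misplaced = []
    for doc in documents:
        level = classify(doc)            # feature extraction and level determination
        area = approved_areas[level]     # mapping from level to approved storage area
        if area not in doc.parents:      # containment check on the storage location
            misplaced.append((doc, area))
    return misplaced
```

Each returned pair is a candidate for the second predetermined operation, for example a transfer to the approved storage area or to a quarantine area. The check compares paths as given and does not resolve symbolic links.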
Preferably, the second predetermined operation is transferring the electronic document to a second designated area. The second designated area is, for example, a quarantine area or the approved storage area.
Obviously, besides the above, the second predetermined operation may also take other forms, for example sending alert information to the management end. In addition, relevant information, such as the processing progress, the electronic document currently being processed, and the result for each electronic document, may be shown to the user in real time during and/or after the scan.
Obviously, the second predetermined operation is not limited to a single operation (action); several operations, for example any of those described above, may be carried out at the same time, such as sending alert information to the management end while transferring the electronic document.
It should be understood that the above steps are the same as or similar to steps SA-SE described earlier, so they share the key points and advantages of steps SA-SE and may be varied in the same ways; these are not repeated here. In addition, the quarantine area mentioned here may be the same quarantine area as the one mentioned earlier, or the two may be different; for example, the first designated area may be a first quarantine area and the second designated area a second quarantine area.
It should be understood that the electronic documents to which the present invention relates are mainly electronic files of document type, including but not limited to files with the suffixes doc, txt, xls, ppt, and htm, PDF files, compressed files, e-mails with editable text, and files of web-page type.
According to another aspect, the present invention also relates to a security management system for electronic documents. The system and its variants are described in detail below with reference to Fig. 3 and Fig. 4, in which identical reference numerals denote identical parts.
Fig. 3 shows a preferred embodiment of the security management system for electronic documents according to the present invention, which mainly comprises a storage unit, a management end 210, and at least one client 220. It may, for example, take the form of a central server, one or more client computers, and a plurality of network storage devices.
The storage unit stores electronic documents and is divided into a plurality of storage areas, so that electronic documents of different confidentiality levels can be stored. Preferably, the storage unit comprises a network storage unit 240. In some cases it also comprises a client storage unit 230 located locally at the client, such as a local hard disk or a portable hard disk. In the embodiment shown in Fig. 3, the client 220 comprises a client storage unit 230. In other cases, however, the storage unit does not comprise a client storage unit. Fig. 4, for example, shows another preferred embodiment of the security management system according to the present invention, in which the client 220' has no client storage unit and the storage unit comprises only the network storage unit 240. As described above, security management can be achieved by using control measures such as different encryption algorithms, keys, and user access rights to divide the whole storage space of the storage unit into storage spaces with different protection modes.
As shown in Fig. 3, the management end 210 further comprises a rule management unit 211, a report unit 213, a management-end execution unit 214, and a management-end communication unit 215.
The rule management unit 211 predefines the security management rules for electronic documents. As described above, the security management rules comprise at least: (t1) the extraction rule for the feature vector of an electronic document; (t2) the weight matrix of the confidentiality-level space; (t3) the center vector and predetermined radius of each classified confidentiality level in the confidentiality-level space; (t4) the determination rule for the confidentiality-level vector of an electronic document; (t5) the mapping from confidentiality-level vectors to storage areas; and (t6) the abnormal-situation decision rule. These rules have been described in detail above and are not repeated here.
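As an illustration of how the six rule items might be bundled and shipped to a client, the sketch below packs them into one record and serializes it as JSON. The class name, the field names and types, and the use of JSON are all assumptions; the patent does not specify a data format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class SecurityRules:
    extraction_patterns: List[str]         # (t1) keywords or regular expressions
    weight_matrix: List[List[float]]       # (t2) one row per level-space dimension
    centers: Dict[str, List[float]]        # (t3) center vector per classified level
    radii: Dict[str, float]                # (t3) predetermined radius per level
    level_rule: str                        # (t4) name of the determination rule
    area_map: Dict[str, str]               # (t5) level -> approved storage area
    abnormal_rules: Dict[str, List[str]]   # (t6) situation -> operations

    def to_json(self) -> str:
        """Serialize the rule set for transmission to a client."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "SecurityRules":
        """Rebuild the rule set on the client side."""
        return cls(**json.loads(text))
```

Keeping every item in one record means a rule change can be distributed as a single replacement of the whole set.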
The report unit 213 generates and/or modifies reports on the distribution data and/or scan data of electronic documents according to information from the clients. As needed, the report unit generates and/or modifies reports periodically or irregularly, so that users can understand how electronic documents are distributed, which supports secure and effective management of electronic documents.
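A report of this kind could be as simple as per-level counts aggregated from the scan records the clients send. The sketch below assumes each record is a tuple of client identifier, confidentiality level, and a flag saying whether the document was found in its approved storage area; the record layout and the function name `build_report` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def build_report(records: Iterable[Tuple[str, str, bool]]) -> Dict[str, Dict[str, int]]:
    """Aggregate (client_id, level, compliant) records into per-level counts."""
    total: Counter = Counter()
    misplaced: Counter = Counter()
    for _client_id, level, compliant in records:
        total[level] += 1
        if not compliant:
            misplaced[level] += 1
    return {level: {"total": total[level], "misplaced": misplaced[level]}
            for level in sorted(total)}
```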
The management-end execution unit 214 carries out the predetermined management-end operation according to information from the client. As described above, the predetermined management-end operation is, for example, an alarm operation, in which case the management-end execution unit is an alarm unit. The alarm operation may be, for example, displaying a prompt, sending an e-mail to a designated address, or sending a text message to a designated number.
The management-end communication unit 215 handles communication between the management end and the clients. Preferably, it sends the security management rules for electronic documents to the clients and receives information from the clients.
In the present invention, there may be one or more clients 220. As shown in Fig. 3, each client 220 comprises at least a client communication unit 225, a document analysis unit 221, a confidentiality-level determination unit 222, a decision unit 223, and a client execution unit 224. As described above, some clients (but not all) also comprise a client storage unit 230.
The client communication unit 225 handles communication between the client 220 and the management end 210. Preferably, it is connected to the management-end communication unit 215, receives the security management rules for electronic documents from it, and, where required (as determined by the abnormal-situation decision rule), sends alert information and/or scan information (data) to it. It should be understood that communication between the management-end communication unit 215 and the client communication unit 225 may be wired or wireless and may run over a local area network or a wide area network. Different client communication units may communicate with the management-end communication unit in the same way or in different ways.
The document analysis unit 221 scans the content of an electronic document and extracts the document's feature vector in the confidentiality-level space according to the extraction rule for feature vectors and the weight matrix of the confidentiality-level space. When there are several confidentiality-level spaces, it calculates the feature vector of the electronic document in each of them. The document analysis unit 221 can receive the extraction rule for feature vectors and the weight matrix of the confidentiality-level space (one or more) from the client communication unit 225.
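A minimal sketch of such an extraction step is given below. It assumes that the extraction rule is a list of keywords or regular expressions (as in claim 3), that the weight matrix is applied as a linear map to the match counts, and that the result is normalized to unit length (as in claim 5). The exact formula is not given in this section, so the arithmetic here is an assumption for illustration only.

```python
import math
import re
from typing import List, Sequence


def extract_features(text: str, patterns: Sequence[str],
                     weights: Sequence[Sequence[float]]) -> List[float]:
    """Map a document's text to a normalized vector in the level space."""
    # Raw counts: one entry per keyword or regular expression.
    counts = [len(re.findall(p, text, flags=re.IGNORECASE)) for p in patterns]
    # Apply the weight matrix: each row yields one dimension of the level space.
    projected = [sum(w * c for w, c in zip(row, counts)) for row in weights]
    # Normalize so that long and short documents become comparable.
    norm = math.sqrt(sum(x * x for x in projected))
    return [x / norm for x in projected] if norm else projected
```

A document with no matches maps to the zero vector, which is left unnormalized to avoid a division by zero.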
The confidentiality-level determination unit 222 calculates, in the confidentiality-level space (one or more), the distance between the feature vector and the center vector of each classified confidentiality level, and determines the confidentiality-level vector of the electronic document according to the determination rule for confidentiality-level vectors. The confidentiality-level determination unit 222 can receive the center vector and predetermined radius of each classified confidentiality level and the determination rule for confidentiality-level vectors from the client communication unit 225.
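The distance test can be illustrated for a single confidentiality-level space as follows. The sketch assumes Euclidean distance and a "nearest center within its radius" determination rule; both are assumptions, since this section does not spell out the actual determination rule, and the function name is hypothetical.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def determine_level(feature: Sequence[float],
                    levels: Dict[str, Tuple[Sequence[float], float]]) -> Optional[str]:
    """Return the level whose center is nearest to `feature` among those
    whose predetermined radius contains it, or None if no level matches."""
    best_level, best_distance = None, math.inf
    for level, (center, radius) in levels.items():
        distance = math.dist(feature, center)   # Euclidean distance
        if distance <= radius and distance < best_distance:
            best_level, best_distance = level, distance
    return best_level
```

A return value of `None` stands for a document that falls inside no classified level, which a caller could treat as non-classified.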
The decision unit 223 determines the approved storage area corresponding to the confidentiality-level vector of the electronic document according to the mapping from confidentiality-level vectors to storage areas, and judges whether the intended or existing storage location of the electronic document is contained in the approved storage area. When the intended or existing storage location is not contained in the approved storage area, the decision unit determines the first predetermined operation and the predetermined management-end operation according to the abnormal-situation decision rule and generates an operation instruction. As described above, a predetermined operation at the management end is not required in every case in which the intended or existing storage location falls outside the approved storage area; under the configured abnormal-situation decision rule, the management end carries out a predetermined operation (for example an alarm) only in specific situations. The decision unit 223 can receive the mapping from confidentiality-level vectors to storage areas and the abnormal-situation decision rule from the client communication unit 225.
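The containment judgment can be sketched with path arithmetic, assuming that storage areas are directory trees and that a confidentiality-level vector is a tuple with one entry per level space. The mapping contents and the function names below are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, Tuple

LevelVector = Tuple[str, ...]   # one entry per confidentiality-level space


def approved_area(level_vector: LevelVector,
                  mapping: Dict[LevelVector, PurePosixPath]) -> PurePosixPath:
    """Look up the approved storage area for a level vector."""
    return mapping[level_vector]


def is_contained(location: PurePosixPath, area: PurePosixPath) -> bool:
    """True if `location` lies inside `area`, compared component by component."""
    try:
        location.relative_to(area)
        return True
    except ValueError:
        return False
```

Comparing path components rather than raw string prefixes avoids treating `/srv/secretive/a.doc` as if it were inside `/srv/secret`. The same check applies to an intended location (before a save or copy) and to an existing one (during a background scan).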
The client execution unit 224 carries out the first predetermined operation according to the operation instruction from the decision unit. As described above, the first predetermined operation is, for example, refusing the current operation, sending alert information to the management end, and/or transferring the current electronic document to a designated area (such as the approved storage area or a quarantine area). Where required, the client execution unit 224 passes alert information, scan information (data), and the like to the client communication unit 225, which forwards them to the management-end communication unit 215, and the corresponding units of the management end 210 complete the corresponding operations.
It should be understood that the features of the security management method for electronic documents described above all apply, individually or in combination, to this security management system; for brevity they are described only briefly here or omitted.

Claims (11)

1. A security management method for electronic documents, characterized in that the method comprises the following steps:
Predefining security management rules for electronic documents at a management end and sending them to a client, the security management rules comprising at least:
(t1) an extraction rule for the feature vector of an electronic document;
(t2) a weight matrix of a confidentiality-level space;
(t3) a center vector and a predetermined radius of each classified confidentiality level in the confidentiality-level space;
(t4) a determination rule for the confidentiality-level vector of an electronic document;
(t5) a mapping from confidentiality-level vectors to storage areas;
(t6) an abnormal-situation decision rule;
The client carrying out the following steps in a predetermined situation:
(a) scanning the content of an electronic document and extracting the feature vector of the electronic document in the confidentiality-level space according to the extraction rule for the feature vector and the weight matrix of the confidentiality-level space;
(b) calculating, in the confidentiality-level space, the distance between the feature vector and the center vector of each classified confidentiality level, and determining the confidentiality-level vector of the electronic document according to the determination rule for the confidentiality-level vector;
(c) determining the approved storage area corresponding to the confidentiality-level vector of the electronic document according to the mapping from confidentiality-level vectors to storage areas;
(d) judging whether the intended storage location or the existing storage location of the electronic document is contained in the approved storage area;
(e) if the intended storage location or the existing storage location is not contained in the approved storage area, carrying out a first predetermined operation according to the abnormal-situation decision rule.
2. A security management system for electronic documents, comprising a storage unit, a management end, and at least one client, characterized in that:
The storage unit is used to store electronic documents and comprises a plurality of storage areas;
The management end comprises at least:
A management-end communication unit, for communication between the management end and the client;
A rule management unit, for predefining security management rules for electronic documents, the security management rules comprising at least:
(t1) an extraction rule for the feature vector of an electronic document;
(t2) a weight matrix of a confidentiality-level space;
(t3) a center vector and a predetermined radius of each classified confidentiality level in the confidentiality-level space;
(t4) a determination rule for the confidentiality-level vector of an electronic document;
(t5) a mapping from confidentiality-level vectors to storage areas;
(t6) an abnormal-situation decision rule;
A management-end execution unit, for carrying out a predetermined management-end operation according to information from the client;
A report unit, for generating or modifying reports on the distribution data and/or scan data of electronic documents according to information from the client;
The at least one client comprises at least:
A client communication unit, for communication between the client and the management end;
A document analysis unit, for scanning the content of an electronic document and extracting the feature vector of the electronic document in the confidentiality-level space according to the extraction rule for the feature vector and the weight matrix of the confidentiality-level space;
A confidentiality-level determination unit, for calculating, in the confidentiality-level space, the distance between the feature vector and the center vector of each classified confidentiality level, and determining the confidentiality-level vector of the electronic document according to the determination rule for the confidentiality-level vector;
A decision unit, for determining the approved storage area corresponding to the confidentiality-level vector of the electronic document according to the mapping from confidentiality-level vectors to storage areas, and judging whether the intended storage location or the existing storage location of the electronic document is contained in the approved storage area, wherein, when the intended storage location or the existing storage location is not contained in the approved storage area, the decision unit determines a first predetermined operation and the predetermined management-end operation according to the abnormal-situation decision rule and generates an operation instruction;
A client execution unit, for carrying out the first predetermined operation according to the operation instruction from the decision unit.
3. the method for claim 1 or system as claimed in claim 2, is characterized in that, extracts the proper vector of described electronic document according to keyword and/or regular expression.
4. the method for claim 1 or system as claimed in claim 2, is characterized in that, described level of confidentiality space is one or more levels of confidentiality spaces; Preferably, when described level of confidentiality space is a plurality of levels of confidentiality space, described a plurality of levels of confidentiality space is to be defined by different weight matrix, the corresponding level of confidentiality space of certain or some levels of confidentiality.
5. the method for claim 1 or system as claimed in claim 2, is characterized in that, when extracting the proper vector of described electronic document, is normalized.
6. the method for claim 1 or system as claimed in claim 2, is characterized in that, the center vector of each concerning security matters level of confidentiality in described level of confidentiality space and predetermined radii are by obtaining at sample set learning.
7. the method for claim 1 or system as claimed in claim 2, it is characterized in that, described method or described system are real-time mode, background mode or can between real-time mode and background mode, switch, wherein, under real-time mode, described predetermined case is for example preserved or copy function described electronic document for user, and under background mode, described predetermined case refers to and on backstage, carries out regularly or the scanning of not timing.
8. method as claimed in claim 7 or system; it is characterized in that; under real-time mode; described the first predetermined operation comprises: refuse described preservation or copy function; isolate described electronic document; require input document protection password, delete described electronic document or send warning message to management end.
9. method as claimed in claim 7 or system, is characterized in that, under background mode, described the first predetermined operation comprises: described electronic document is transferred to the first appointed area; Preferably, described the first appointed area be isolated area or described in check and approve storage area.
10. the method for claim 1, it is characterized in that, described method further comprises: if any one or more content in the safety management rule of electronic document changes, the safety management rule that management end predefine is new also sends it to client, and client is for each electronic document execution step (a)-step (c); Judge subsequently described in each the memory location of electronic document is checked and approved storage area described in whether being contained in, when checking and approving storage area described in the memory location of some electronic documents is not contained in, according to described abnormal conditions decision rule, carry out the second predetermined operation; Preferably, described the second predetermined operation comprises: described electronic document is transferred to the second appointed area; Preferably, described the second appointed area be isolated area or described in check and approve storage area.
11. systems as claimed in claim 2, is characterized in that, described management end performance element is alarm unit, and described predetermined management end is operating as alarm operation.
CN201210281084.1A 2012-08-09 2012-08-09 Safety management method and safety management system for electronic file Pending CN103577766A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210281084.1A CN103577766A (en) 2012-08-09 2012-08-09 Safety management method and safety management system for electronic file

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210281084.1A CN103577766A (en) 2012-08-09 2012-08-09 Safety management method and safety management system for electronic file

Publications (1)

Publication Number Publication Date
CN103577766A true CN103577766A (en) 2014-02-12

Family

ID=50049527

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210281084.1A Pending CN103577766A (en) 2012-08-09 2012-08-09 Safety management method and safety management system for electronic file

Country Status (1)

Country Link
CN (1) CN103577766A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030236845A1 (en) * 2002-06-19 2003-12-25 Errikos Pitsos Method and system for classifying electronic documents
CN1629837A (en) * 2003-12-17 2005-06-22 国际商业机器公司 Method and apparatus for processing, browsing and classified searching of electronic document and system thereof

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103905469A (en) * 2014-04-30 2014-07-02 电子科技大学 Safety control system and method applied to smart power grid wireless sensor network and cloud computing
CN103905469B (en) * 2014-04-30 2017-01-04 电子科技大学 It is applied to intelligent grid radio sensing network and the safety control system of cloud computing and method
CN104850797A (en) * 2015-04-30 2015-08-19 北京奇虎科技有限公司 Device security management method and apparatus
CN110537185A (en) * 2017-04-20 2019-12-03 惠普发展公司,有限责任合伙企业 Document security
CN109714308A (en) * 2018-08-20 2019-05-03 平安普惠企业管理有限公司 The monitoring method of data, device, equipment and readable storage medium storing program for executing in the network architecture
CN109817291A (en) * 2018-12-25 2019-05-28 天津阿贝斯努科技有限公司 Clinical test document file management system and management method
CN109817291B (en) * 2018-12-25 2023-01-10 天津阿贝斯努科技有限公司 Clinical trial document management system and management method
CN111767733A (en) * 2020-06-11 2020-10-13 安徽旅贲科技有限公司 Document security classification discrimination method based on statistical word segmentation
CN114676142A (en) * 2022-05-30 2022-06-28 佳瑛科技有限公司 Method, system, and medium for storing electronic forms in encrypted manner
CN115643018A (en) * 2022-10-14 2023-01-24 浙江星汉信息技术股份有限公司 Electronic file sharing method and system based on block chain
CN115643018B (en) * 2022-10-14 2023-09-01 浙江星汉信息技术股份有限公司 Electronic file sharing method and system based on blockchain

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140212
