CN112733188B - Sensitive file management method - Google Patents

Info

Publication number
CN112733188B
Authority
CN
China
Prior art keywords
sensitive
sensitive file
file set
file
files
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110039654.5A
Other languages
Chinese (zh)
Other versions
CN112733188A (en)
Inventor
刘进江
葛旸
杨华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aerosun Corp
Original Assignee
Aerosun Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aerosun Corp
Priority to CN202110039654.5A
Publication of CN112733188A
Application granted
Publication of CN112733188B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566 Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2107 File encryption

Abstract

The invention discloses a sensitive file management method comprising the following steps: establishing a sensitive file database in advance and setting a sensitivity for each sensitive file stored in it; acquiring network traffic data within a preset time period, parsing it, screening out a file set of a target type, and identifying, according to the sensitive file database, whether the file set contains sensitive files; when sensitive files are found, performing cluster analysis on them to form a sensitive file set; obtaining the sensitivity of each sensitive file in the set and calculating the sensitivity aggregation degree of the set; when the sensitivity aggregation degree is greater than a preset threshold, establishing a life cycle for the sensitive file set and analyzing its circulation path; and issuing an alarm when the circulation path deviates from a preset circulation path. The method helps prevent leakage of sensitive files and effectively monitors their circulation.

Description

Sensitive file management method
Technical Field
The invention relates to the technical field of sensitive file management, in particular to a sensitive file management method.
Background
In current intelligent manufacturing systems, management and control of sensitive files is relatively lacking and generally relies on simple permission management. Once permission management is bypassed, or a file is exported by a user holding legitimate permissions, the file is no longer under control, and sensitive data is easily leaked while the file is read and forwarded. Such systems cannot effectively monitor sensitive files during circulation; when a sensitive file is leaked, they can neither accurately locate where the leak occurred nor promptly raise the security level at that location.
Disclosure of Invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art described above. To this end, the invention provides a sensitive file management method that can prevent leakage of sensitive files, effectively monitor their circulation, and improve their security.
To achieve the above objective, an embodiment of the present invention provides a sensitive file management method, including:
establishing a sensitive file database in advance, and setting a sensitivity for each sensitive file stored in the sensitive file database;
acquiring network traffic data within a preset time period, parsing it, screening out a file set of a target type, and identifying, according to the sensitive file database, whether the file set contains sensitive files;
when sensitive files are determined to exist, performing cluster analysis on the sensitive files to form a sensitive file set; obtaining the sensitivity of each sensitive file in the sensitive file set from the sensitive file database, calculating the sensitivity aggregation degree of the sensitive file set, and judging whether the sensitivity aggregation degree is greater than a preset sensitivity aggregation degree threshold;
when the sensitivity aggregation degree is determined to be greater than the preset threshold, establishing a life cycle for the sensitive file set and analyzing the circulation path of the sensitive file set, and issuing an alarm when the circulation path is determined to deviate from a preset circulation path.
According to some embodiments of the invention, when the sensitive file set is circulated, a working key is obtained according to the sensitive file set;
the sensitive file set is compressed and encrypted with the working key to obtain a first encrypted file;
the public key of the target circulation node is obtained, and the working key is encrypted with the public key to obtain an encrypted key ciphertext;
the first encrypted file and the encrypted key ciphertext are transmitted over the network to the target circulation node, where the target circulation node decrypts the encrypted key ciphertext with its private key to obtain the working key, and decrypts the first encrypted file with the working key to obtain the decrypted sensitive file set.
According to some embodiments of the invention, obtaining a working key according to the sensitive file set includes:
generating, with a random function, a random character string containing letters and digits;
and using the randomly generated character string as the working key.
According to some embodiments of the invention, obtaining the public key of the target circulation node and encrypting the working key with the public key to obtain the encrypted key ciphertext includes:
obtaining the public key of the target circulation node through a USB key operation;
and encrypting the working key with an asymmetric (public-key) encryption algorithm to obtain the encrypted key ciphertext.
According to some embodiments of the invention, identifying, according to the sensitive file database, whether the file set contains sensitive files includes:
performing feature extraction on each file in the file set to extract feature keywords;
standardizing the feature keywords to obtain standardized feature keywords, and judging whether each standardized feature keyword exists in the sensitive file database;
counting the number of standardized feature keywords that exist in the sensitive file database, and determining that the file set contains sensitive files when the number is greater than a preset number.
According to some embodiments of the present invention, performing cluster analysis on the sensitive files to form a sensitive file set when sensitive files are determined to exist includes:
determining the attribute relations among the sensitive files, and determining the correlation coefficients among the sensitive files according to the attribute relations;
and sorting the correlation coefficients among the sensitive files, determining the degree of correlation among the sensitive files according to the magnitude of the correlation coefficients, and establishing topological connections among the sensitive files to form the sensitive file set.
According to some embodiments of the invention, the method further includes:
monitoring the sensitive file database, recording information on accesses to the sensitive file database, and generating a sensitive file access table;
setting, according to the sensitivity of each sensitive file in the sensitive file database, the maximum number of accesses to that file within a preset time period;
querying the sensitive file access table to obtain the number of accesses to a target sensitive file within the preset time period, and issuing an alarm when the number of accesses is greater than the maximum number of accesses.
According to some embodiments of the present invention, the sensitive file set is encrypted before it is circulated;
the sensitive file set is decrypted when it reaches the target position; when decryption fails, or the decrypted sensitive file set is inconsistent with the sensitive file set before encryption, the circulation path of the sensitive file set is obtained and the circulation nodes are determined from the circulation path;
the risk levels of the circulation nodes are detected in turn and sorted, the circulation node with the highest risk level is selected, and the operation log of that circulation node on the sensitive file set is obtained;
the operation log is analyzed to judge whether abnormal behavior exists; when abnormal behavior exists, an alarm is issued and the circulation node is blocked; while the node is blocked, its risk level is reduced by addressing the abnormal behavior, and the circulation node is re-enabled when its risk level falls below a preset risk level.
According to some embodiments of the present invention, before the network traffic in the preset time period is analyzed, the method further includes:
performing virus detection on the network traffic; when a virus is determined to exist in the network traffic, analyzing the virus, calculating its virus value, determining the virus level from the virus value and a preset virus level table, and issuing an alarm of the level corresponding to the virus level.
According to some embodiments of the present invention, when virus detection is performed on the network traffic, the effective rate of the virus detection is calculated; when the effective rate is less than a preset effective rate, a notice that the detection is unqualified is issued and the network traffic is detected again;
calculating the effective rate of virus detection includes:
calculating a detection difficulty coefficient S of the virus data:
[formula not preserved in this text]
wherein M is the number of detected virus data; b is a detection scale factor; [symbol not preserved] is the average length of the detected virus data; a_i is the length of the i-th detected virus data; d is the average spacing between adjacent detected virus data; L is the length of the network traffic processed by wavelet analysis; N is the number of virus data detected in the network traffic processed by wavelet analysis;
calculating, according to the detection difficulty coefficient of the virus data, the effective rate K of virus detection:
[formula not preserved in this text]
wherein λ is the average duration of all virus data in the detected network traffic; λ_i is the duration for which the i-th virus data is detected; T_i is the noise value at the time the i-th virus data is detected.
The beneficial effects are as follows: monitoring every sensitive file individually is avoided. By performing cluster analysis on the sensitive files, a monitoring mechanism is established for sensitive file sets, and only sets whose sensitivity aggregation degree is greater than the preset threshold are monitored. This saves system resources, reduces the number of monitored objects and the complexity of monitoring, and improves monitoring efficiency. When a sensitive file is leaked, the data can be traced and the leak location found accurately, which shortens the search time; the security level at the leak location can then be raised promptly, reducing the risk of further leakage.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description and drawings.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a flow chart of a sensitive file management method according to one embodiment of the invention;
FIG. 2 is a block diagram of a sensitive file management system in accordance with one embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described below with reference to the accompanying drawings, it being understood that the preferred embodiments described herein are for illustration and explanation of the present invention only, and are not intended to limit the present invention.
As shown in FIG. 1, an embodiment of the first aspect of the present invention provides a sensitive file management method, including steps S1 to S4:
S1, establishing a sensitive file database in advance, and setting a sensitivity for each sensitive file stored in the sensitive file database;
S2, acquiring network traffic data within a preset time period, parsing it, screening out a file set of a target type, and identifying, according to the sensitive file database, whether the file set contains sensitive files;
S3, when sensitive files are determined to exist, performing cluster analysis on the sensitive files to form a sensitive file set; obtaining the sensitivity of each sensitive file in the sensitive file set from the sensitive file database, calculating the sensitivity aggregation degree of the sensitive file set, and judging whether the sensitivity aggregation degree is greater than a preset sensitivity aggregation degree threshold;
S4, when the sensitivity aggregation degree is determined to be greater than the preset threshold, establishing a life cycle for the sensitive file set and analyzing the circulation path of the sensitive file set, and issuing an alarm when the circulation path is determined to deviate from a preset circulation path.
The working principle of the technical scheme is as follows. A sensitive file database is established in advance, and a sensitivity is set for each sensitive file stored in it; the sensitivity reflects the importance of the file, and the more important the file, the higher its sensitivity. Network traffic data within a preset time period is acquired and parsed, and a file set of a target type is screened out. The target type may be, for example, the text type, in which case only text-type data is retained and the text-type file set is filtered out. Whether the file set contains sensitive files is then identified according to the sensitive file database. When sensitive files are determined to exist, cluster analysis is performed on them to form a sensitive file set; for example, if two sensitive files are detected in the text file set, cluster analysis is performed on those two files to form a sensitive file set. The sensitivity of each sensitive file in the set is obtained from the sensitive file database, the sensitivity aggregation degree of the set is calculated, and it is judged whether the sensitivity aggregation degree is greater than a preset threshold. The sensitivity aggregation degree is calculated from the sensitivities of the files in the set and represents the total sensitivity of the set.
When the sensitivity aggregation degree is greater than the preset threshold, the sensitive file set transmitted in the preset time period is of high importance and needs to be monitored and managed. A life cycle is then established for the sensitive file set, meaning that the set is monitored throughout the whole process from transmission to use, and the circulation path of the set, that is, the circulation nodes through which it passes, is analyzed. When the circulation path deviates from the preset circulation path, or the circulation area of the set deviates from the preset circulation area, a sensitive file leak may be occurring; an alarm is issued so that the leak can be stopped in time and losses reduced.
The beneficial effects of the technical scheme are as follows: monitoring every sensitive file individually is avoided. By performing cluster analysis on the sensitive files, a monitoring mechanism is established for sensitive file sets, and only sets whose sensitivity aggregation degree is greater than the preset threshold are monitored. This saves system resources, reduces the number of monitored objects and the complexity of monitoring, and improves monitoring efficiency. When a sensitive file is leaked, the data can be traced and the leak location found accurately, which shortens the search time; the security level at the leak location can then be raised promptly, reducing the risk of further leakage.
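The decision logic of steps S3 and S4 can be sketched as follows. This is a minimal illustration, not the patented implementation: the text only says the aggregation degree "represents the total sensitivity" of the set, so a plain sum is assumed here, and the function names, the prefix-based deviation test, and the return labels are all hypothetical.

```python
from typing import Dict, Sequence


def aggregation_degree(files: Sequence[str], sensitivity_db: Dict[str, float]) -> float:
    # Assumption: the aggregation degree is the sum of the per-file sensitivities.
    return sum(sensitivity_db[name] for name in files)


def path_deviates(observed: Sequence[str], preset: Sequence[str]) -> bool:
    # Assumption: a path deviates when the nodes visited so far are not
    # a prefix of the preset route.
    return list(observed) != list(preset[: len(observed)])


def check_file_set(files, sensitivity_db, threshold, observed_path, preset_path) -> str:
    """Return 'not monitored', 'ok' or 'alarm' for one sensitive file set."""
    if aggregation_degree(files, sensitivity_db) <= threshold:
        return "not monitored"  # below the threshold: no life cycle is established
    if path_deviates(observed_path, preset_path):
        return "alarm"          # circulation path left the preset route
    return "ok"
```

For example, with sensitivities 3 and 4 and a threshold of 5, the set is monitored, and a visit to a node outside the preset route yields "alarm".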
According to some embodiments of the invention, when the sensitive file set is circulated, a working key is obtained according to the sensitive file set;
the sensitive file set is compressed and encrypted with the working key to obtain a first encrypted file;
the public key of the target circulation node is obtained, and the working key is encrypted with the public key to obtain an encrypted key ciphertext;
the first encrypted file and the encrypted key ciphertext are transmitted over the network to the target circulation node, where the target circulation node decrypts the encrypted key ciphertext with its private key to obtain the working key, and decrypts the first encrypted file with the working key to obtain the decrypted sensitive file set.
The working principle and beneficial effects of the technical scheme are as follows. When the sensitive file set is circulated, a working key is obtained according to the set, and the set is compressed and encrypted with the working key to obtain a first encrypted file. The public key of the target circulation node is obtained and used to encrypt the working key, yielding an encrypted key ciphertext. The first encrypted file and the encrypted key ciphertext are transmitted over the network to the target circulation node, which decrypts the encrypted key ciphertext with its private key to recover the working key and then decrypts the first encrypted file with the working key to obtain the sensitive file set. Compressing and encrypting the sensitive file set with big data technology allows cluster hardware resources to be reused and improves compression, encryption, and transmission efficiency. Compression reduces the file size and therefore speeds up network transmission, while the two layers of encryption (the file under the working key, and the working key under the public key) further improve the security of the file data and avoid losses caused by its leakage. During circulation the sensitive file set can be opened only at the target circulation node and cannot be accessed at any other circulation node, which reduces the risk of leakage and improves the security of the sensitive file set.
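The envelope structure described above (compress, encrypt under a working key, wrap the working key with the recipient's public key) can be sketched with the Python standard library alone. The ciphers below are deliberately toy stand-ins chosen so that the sketch runs without third-party packages: unpadded RSA over two fixed Mersenne primes, and an XOR keystream derived with SHAKE-256. Neither is secure, and the patent does not name specific algorithms; a real system would use a vetted cryptographic library. All function names are hypothetical.

```python
import hashlib
import secrets
import zlib
from typing import Tuple

# Toy RSA key pair for the target circulation node, built from two known
# Mersenne primes. ILLUSTRATION ONLY: unpadded RSA with fixed primes is insecure.
P, Q = 2**107 - 1, 2**127 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent; needs Python 3.8+


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a SHAKE-256 keystream."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def send(file_set: bytes, public_key: Tuple[int, int]) -> Tuple[bytes, int]:
    """Sender side: returns (first encrypted file, encrypted key ciphertext)."""
    n, e = public_key
    working_key = secrets.token_bytes(16)           # fresh 128-bit working key
    first_encrypted_file = keystream_xor(working_key, zlib.compress(file_set))
    key_ciphertext = pow(int.from_bytes(working_key, "big"), e, n)
    return first_encrypted_file, key_ciphertext


def receive(first_encrypted_file: bytes, key_ciphertext: int,
            private_key: Tuple[int, int]) -> bytes:
    """Target node: unwrap the working key, then decrypt and decompress."""
    n, d = private_key
    working_key = pow(key_ciphertext, d, n).to_bytes(16, "big")
    return zlib.decompress(keystream_xor(working_key, first_encrypted_file))
```

Only the holder of the private exponent can recover the working key, which is what restricts opening of the file set to the target circulation node.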
According to some embodiments of the invention, obtaining a working key according to the sensitive file set includes:
generating, with a random function, a random character string containing letters and digits;
and using the randomly generated character string as the working key.
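A minimal sketch of this key generation step follows. The patent only says "a random function"; using Python's `secrets` module (a cryptographically strong source) and a default length of 32 characters are assumptions made for illustration, and the function name is hypothetical.

```python
import secrets
import string

# Letters and digits, as described in the text.
ALPHABET = string.ascii_letters + string.digits


def generate_working_key(length: int = 32) -> str:
    """Return a random string drawn from letters and digits."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```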
According to some embodiments of the invention, the obtaining a working key according to the sensitive file set includes:
querying a preset table that maps sensitivity aggregation degrees to working keys, according to the sensitivity aggregation degree of the sensitive file set, to obtain the working key.
According to some embodiments of the invention, obtaining the public key of the target circulation node and encrypting the working key with the public key to obtain the encrypted key ciphertext includes:
obtaining the public key of the target circulation node through a USB key operation;
and encrypting the working key with an asymmetric (public-key) encryption algorithm to obtain the encrypted key ciphertext.
The working principle and beneficial effects of the technical scheme are as follows: first, the public key of the target user is obtained through a USB key operation; then the working key is encrypted with an asymmetric (public-key) encryption algorithm to obtain the encrypted key ciphertext. The asymmetric encryption is an asymmetric encryption algorithm based on Diffie-Hellman key exchange; it uses public-key encryption and private-key decryption, a one-way encryption and decryption operation, thereby granting directed, authorized access to designated persons or nodes and preventing anyone else from decrypting the file.
According to some embodiments of the invention, identifying, according to the sensitive file database, whether the file set contains sensitive files includes:
performing feature extraction on each file in the file set to extract feature keywords;
standardizing the feature keywords to obtain standardized feature keywords, and judging whether each standardized feature keyword exists in the sensitive file database;
counting the number of standardized feature keywords that exist in the sensitive file database, and determining that the file set contains sensitive files when the number is greater than a preset number.
The working principle of the technical scheme is as follows: feature extraction is performed on each file in the file set to extract feature keywords. The feature keywords are then standardized: terms are mapped to their standard forms and the keyword format is normalized. For example, for a feature keyword such as "automobile appearance" extracted from a file, unnecessary words are removed. The result is a set of standardized feature keywords, each of which is checked against the sensitive file database. The number of standardized feature keywords found in the sensitive file database is counted, and when that number is greater than a preset number, the file set is determined to contain sensitive files.
The beneficial effects of the technical scheme are as follows: whether the file set contains sensitive files, and how many, can be judged accurately. Standardizing the feature keywords and removing useless words improves the efficiency of matching them against the keywords in the sensitive file database, shortens matching time, and improves the user experience. Requiring the number of matches to exceed a preset number before the file set is deemed to contain sensitive files improves the accuracy of that judgment.
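The keyword matching step can be sketched as below. The stop-word list, the lower-casing and punctuation stripping used as "standardization", and the function names are assumptions for illustration; the patent does not specify the normalization rules or how the database stores its keywords.

```python
import re
from typing import Iterable, Set

# Hypothetical list of words treated as unnecessary during standardization.
STOP_WORDS = {"the", "a", "an", "of", "and", "for", "in"}


def standardize(keyword: str) -> str:
    """Lower-case, strip punctuation, and drop stop words."""
    words = re.findall(r"[a-z0-9]+", keyword.lower())
    return " ".join(w for w in words if w not in STOP_WORDS)


def contains_sensitive(keywords: Iterable[str], db_keywords: Set[str],
                       preset_count: int) -> bool:
    """True when more than preset_count standardized keywords are in the database."""
    normalized = {standardize(k) for k in keywords}
    hits = sum(1 for k in normalized if k and k in db_keywords)
    return hits > preset_count
```

Here `db_keywords` stands for the keyword set held in the sensitive file database, assumed to be stored in the same standardized form.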
According to some embodiments of the present invention, performing cluster analysis on the sensitive files to form a sensitive file set when sensitive files are determined to exist includes:
determining the attribute relations among the sensitive files, and determining the correlation coefficients among the sensitive files according to the attribute relations;
and sorting the correlation coefficients among the sensitive files, determining the degree of correlation among the sensitive files according to the magnitude of the correlation coefficients, and establishing topological connections among the sensitive files to form the sensitive file set.
The working principle of the technical scheme is as follows: the attribute relations among the sensitive files are determined, and correlation coefficients among the files are derived from them. The correlation coefficients are sorted, the degree of correlation among the files is determined from the magnitude of the coefficients, and topological connections are established among the files to form the sensitive file set.
The beneficial effects of the technical scheme are as follows: the efficiency and quality of the cluster analysis are improved, and establishing the sensitive file set makes it convenient to calculate its sensitivity aggregation degree accurately.
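One way to realize this clustering is sketched below, under stated assumptions: each file is described by a set of attribute tags, the correlation coefficient is taken to be the Jaccard similarity of those sets, and files are linked whenever their coefficient reaches a threshold, with connected files forming one set. The patent does not define the coefficient or the linking rule, so these choices and all names are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set


def correlation(a: Set[str], b: Set[str]) -> float:
    # Assumption: Jaccard similarity of the attribute sets.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(files: Dict[str, Set[str]], min_corr: float) -> List[Set[str]]:
    """Group files whose pairwise correlation reaches min_corr (union-find)."""
    parent = {name: name for name in files}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Sort pairs by correlation, strongest first, as the text describes.
    pairs = sorted(
        ((correlation(files[a], files[b]), a, b) for a, b in combinations(files, 2)),
        reverse=True,
    )
    for corr, a, b in pairs:
        if corr < min_corr:
            break                      # remaining pairs are weaker still
        parent[find(a)] = find(b)      # establish a topological connection

    groups: Dict[str, Set[str]] = {}
    for name in files:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())
```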
According to some embodiments of the invention, the method further includes:
monitoring the sensitive file database, recording information on accesses to the sensitive file database, and generating a sensitive file access table;
setting, according to the sensitivity of each sensitive file in the sensitive file database, the maximum number of accesses to that file within a preset time period;
querying the sensitive file access table to obtain the number of accesses to a target sensitive file within the preset time period, and issuing an alarm when the number of accesses is greater than the maximum number of accesses.
The working principle of the technical scheme is as follows: the sensitive file database is monitored, accesses to it are recorded, and a sensitive file access table is generated. For each sensitive file, a maximum number of accesses within a preset time period is set according to the file's sensitivity. The access table is then queried for the number of accesses to a target sensitive file within the preset time period, and an alarm is issued when that number exceeds the maximum.
The beneficial effects of the technical scheme are as follows: target sensitive files in the sensitive file database are monitored effectively and the number of accesses to them is limited, which helps prevent their leakage and secures them at the level of the sensitive file database.
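The access-count check can be sketched as follows. The access table is modelled as (timestamp, file name) records, and the rule that a higher sensitivity gives a lower limit (`base_limit // sensitivity`) is an assumption; the patent only says the maximum is set according to the sensitivity. Sensitivities are assumed to be positive integers, and all names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def max_accesses(sensitivity: int, base_limit: int = 10) -> int:
    # Assumption: more sensitive files tolerate fewer accesses per period.
    return max(1, base_limit // sensitivity)


def over_accessed(access_table: Iterable[Tuple[float, str]],
                  sensitivity_db: Dict[str, int],
                  window_start: float, window_end: float) -> List[str]:
    """Return the files whose access count in the window exceeds their limit."""
    counts = Counter(
        name for ts, name in access_table if window_start <= ts < window_end
    )
    return sorted(
        name for name, n in counts.items()
        if n > max_accesses(sensitivity_db[name])
    )
```

Each name returned would trigger the alarm described above.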
According to some embodiments of the present invention, the sensitive file set is encrypted before being streamed;
decrypting the sensitive file set when the sensitive file set flows to the target position, and acquiring a flow path of the sensitive file set when decryption fails or the decrypted sensitive file set is inconsistent with the sensitive file set before encryption, and determining the flow node according to the flow path;
sequentially detecting risk levels of the circulation nodes, sequencing, screening out circulation nodes with highest risk levels, and acquiring operation logs of the circulation nodes on a sensitive file set;
analyzing the operation log, judging whether abnormal behaviors exist, carrying out alarm prompt when the abnormal behaviors exist, blocking the circulation node, reducing the risk level of the circulation node according to the abnormal behaviors during blocking, and switching on the circulation node when the risk level is smaller than a preset risk level.
The working principle of this technical scheme is as follows: the sensitive file set is encrypted before it is circulated and decrypted when it reaches the target position, which may be the final circulation node. If decryption fails, or the decrypted sensitive file set is inconsistent with the sensitive file set before encryption, a tamper event has occurred during circulation; the circulation path of the sensitive file set is then obtained and the circulation nodes are determined from that path. The risk level of each circulation node is detected in turn, the nodes are sorted by risk level, the node with the highest risk level is selected, and that node's operation log for the sensitive file set is acquired. The operation log is analyzed to judge whether abnormal behavior exists. When it does, an alarm prompt is issued and the circulation node is blocked; during the blocking period the risk level of the node is reduced according to the abnormal behavior, and the node is re-enabled once its risk level is smaller than the preset risk level.
The beneficial effects of this technical scheme are as follows: theft, leakage and similar events can be avoided while the sensitive file set is circulated, which improves the security of data transmission. When the sensitive file set is tampered with during circulation, use of the tampered set is avoided and loss is reduced. In addition, the circulation node at which the tamper event occurred is located and blocked, which prevents further leakage and tamper events through that node and reduces loss. During the blocking period the risk level of the node is reduced according to the abnormal behavior, and the node is re-enabled when its risk level is smaller than the preset risk level, so that the sensitive file set passes through the node only when the node is secure.
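The tamper check and node screening described above can be illustrated with a minimal Python sketch. It assumes that consistency with the pre-encryption file set is tested by comparing SHA-256 digests and that a failed decryption is represented by `None`; the text does not specify either detail. The names `CirculationNode`, `fingerprint`, `tamper_event` and `riskiest_node`, the integer risk scale, and the sample data are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CirculationNode:
    name: str
    risk_level: int  # assumed scale: a larger value means a riskier node


def fingerprint(file_set: bytes) -> str:
    """SHA-256 digest of the file set, recorded before encryption (assumption)."""
    return hashlib.sha256(file_set).hexdigest()


def tamper_event(before: str, decrypted: Optional[bytes]) -> bool:
    """A failed decryption (None) or a digest mismatch counts as tampering."""
    return decrypted is None or fingerprint(decrypted) != before


def riskiest_node(path: List[CirculationNode]) -> CirculationNode:
    """Sort the nodes on the circulation path by risk level and take the highest."""
    ranked = sorted(path, key=lambda node: node.risk_level, reverse=True)
    return ranked[0]


original = b"contract.docx|design.pdf"
before = fingerprint(original)
path = [
    CirculationNode("gateway", 2),
    CirculationNode("mail-relay", 7),
    CirculationNode("archive", 4),
]

received = b"contract.docx|design_v2.pdf"  # content altered in transit
if tamper_event(before, received):
    suspect = riskiest_node(path)
    print(f"tamper event: fetch operation log of {suspect.name}")
```

In this sketch only the highest-risk node is inspected first; a real deployment would presumably continue down the sorted list if that node's operation log shows nothing abnormal.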
In an embodiment, before analyzing the network traffic in the preset time period, the method further includes:
performing virus detection on the network traffic; when a virus is determined to exist in the network traffic, analyzing the virus and calculating its virus value, determining a virus grade according to the virus value and a preset virus grade table, and issuing an alarm of the level corresponding to the virus grade.
The working principle and beneficial effects of this technical scheme are as follows: virus detection is performed before the network traffic data in the preset time period is analyzed. If virus data is found, an alarm is issued at the level corresponding to the virus grade; the higher the virus grade, the higher the alarm level. The user can thus learn the virus grade promptly and accurately and take corresponding measures to eliminate the virus data, which prevents sensitive files from leaking because of virus data during analysis and ensures the security of the sensitive files.
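The lookup from virus value to virus grade to alarm level can be sketched as follows. The grade boundaries, grade names and alarm numbers are hypothetical placeholders for the "preset virus grade table"; the text does not give its contents or say how the virus value is computed.

```python
from bisect import bisect_right

# Hypothetical preset virus grade table: each bound is the lowest virus
# value that belongs to the grade at the same index.
GRADE_BOUNDS = [0.0, 30.0, 60.0, 85.0]
GRADES = ["low", "medium", "high", "critical"]
ALARM_FOR_GRADE = {"low": 1, "medium": 2, "high": 3, "critical": 4}


def virus_grade(virus_value: float) -> str:
    """Find the grade whose lower bound is the largest one not above the value."""
    index = bisect_right(GRADE_BOUNDS, virus_value) - 1
    return GRADES[max(index, 0)]


def alarm_level(virus_value: float) -> int:
    """A higher virus grade maps to a higher alarm level."""
    return ALARM_FOR_GRADE[virus_grade(virus_value)]


for value in (12.5, 30.0, 90.0):
    print(value, virus_grade(value), alarm_level(value))
```

Keeping the table as sorted lower bounds makes the lookup a binary search and keeps the monotonic relation between grade and alarm level explicit.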
In an embodiment, when virus detection is performed on the network traffic, the effective rate of the virus detection is calculated; when the effective rate is determined to be smaller than a preset effective rate, a detection-unqualified message is issued and the network traffic is detected again;
the calculating of the effective rate of virus detection comprises:
calculating a detection difficulty coefficient S of virus data:
wherein M is the number of detected virus data; b is a detection scale factor; $\bar{A}$ is the average length of the detected virus data; $A_i$ is the length of the i-th virus data detected; D is the average spacing between detected adjacent virus data; L is the length of the network traffic processed by wavelet analysis; N is the number of virus data detected in the network traffic processed by wavelet analysis;
according to the detection difficulty coefficient of virus data, calculating the effective rate K of virus detection:
wherein $\lambda$ is the average duration of all virus data detected in the network traffic; $\lambda_i$ is the duration for which the i-th virus data is detected; $T_i$ is the noise value at the time of detecting the i-th virus data.
The working principle and beneficial effects of this technical scheme are as follows: when virus detection is performed on the network traffic, the effective rate of the virus detection is calculated, and when the effective rate is smaller than the preset effective rate, a detection-unqualified message is issued and the network traffic is detected again. This ensures the accuracy of virus data detection in the network traffic, which helps eliminate virus data accurately and secures data transmission. The larger the detection scale factor, the lower the accuracy of the cluster centers obtained for the virus data and the more virus data is screened out. Processing the network traffic with wavelet analysis can effectively improve the accuracy of virus detection and, to some extent, reduce the difficulty of detecting virus data. The detection difficulty coefficient represents how difficult the virus data is to detect. Because noise is present during virus data detection, the effective rate of virus detection, that is, the credibility of the detected virus data, is calculated from the noise, the detection difficulty coefficient and related quantities, so that appropriate elimination measures can be selected for the detected virus data and the virus data can be eliminated accurately.
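The re-detection rule can be sketched as a simple loop. The formulas for S and K are not reproduced in this text, so the sketch takes the effective-rate calculation as an injected function rather than guessing it; `scan`, `effective_rate`, and the `max_rounds` cap are hypothetical and are not part of the described method.

```python
from typing import Callable, List, Tuple


def detect_until_effective(
    scan: Callable[[bytes], List[str]],
    effective_rate: Callable[[List[str]], float],
    traffic: bytes,
    preset_rate: float,
    max_rounds: int = 3,  # assumption: the text sets no limit on re-detection
) -> Tuple[List[str], float, int]:
    """Repeat virus detection until the effective rate K reaches the preset rate."""
    for round_no in range(1, max_rounds + 1):
        findings = scan(traffic)
        rate = effective_rate(findings)
        if rate >= preset_rate:
            return findings, rate, round_no
        print(f"round {round_no}: detection unqualified (K={rate:.2f})")
    raise RuntimeError("effective rate stayed below the preset rate")


# Stub detector and stub K values, used only to exercise the loop.
stub_rates = iter([0.55, 0.92])
result = detect_until_effective(
    scan=lambda traffic: ["signature-A"],
    effective_rate=lambda findings: next(stub_rates),
    traffic=b"captured traffic",
    preset_rate=0.8,
)
print(result)
```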
As shown in fig. 2, an embodiment of the second aspect of the present invention proposes a sensitive file management system, including:
the preset module is used for pre-establishing a sensitive file database and setting sensitivity to sensitive files stored in the sensitive file database;
the first judging module is used for acquiring network flow data in a preset time period, analyzing the network flow data, screening a file set of a target type, and identifying whether a sensitive file exists in the file set according to the sensitive file database;
the second judging module is used for carrying out cluster analysis on the sensitive files when the sensitive files are determined to exist, so as to form a sensitive file set; acquiring the sensitivity of each sensitive file in the sensitive file set according to the sensitive file database, calculating the sensitive aggregation degree of the sensitive file set, and judging whether the sensitive aggregation degree is larger than a preset sensitive aggregation degree threshold value or not;
and the alarm module is used for establishing a life cycle of the sensitive file set when the sensitive aggregation degree is determined to be larger than a preset sensitive aggregation degree threshold value, analyzing a circulation path of the sensitive file set, and sending an alarm prompt when the circulation path of the sensitive file set is determined to deviate from the preset circulation path.
The working principle of this technical scheme is as follows: the preset module establishes a sensitive file database in advance and sets a sensitivity for each sensitive file stored in it; the sensitivity is set according to the importance of the sensitive file, and the more important the file, the higher its sensitivity. The first judging module acquires and parses network traffic data in a preset time period and screens out a file set of a target type; the target type may be text, in which case only text-type data is retained and the text-type file set is screened out. The module then identifies, according to the sensitive file database, whether a sensitive file exists in the file set. When the second judging module determines that sensitive files exist, it performs cluster analysis on them to form a sensitive file set; for example, if two sensitive files are detected in the file set, cluster analysis is performed on those two files to form a sensitive file set. The module acquires the sensitivity of each sensitive file in the set from the sensitive file database, calculates the sensitive aggregation degree of the set, and judges whether it is greater than a preset sensitive aggregation degree threshold. The sensitive aggregation degree is calculated from the sensitivities of the sensitive files in the set and represents the total sensitivity of the set.
When the sensitive aggregation degree is determined to be greater than the preset sensitive aggregation degree threshold, the sensitive file set transmitted in the preset time period is of high importance and requires monitoring and management. The alarm module then establishes a life cycle for the sensitive file set, that is, it monitors the set throughout the whole process from transmission to application, and analyzes the circulation path of the set, that is, the circulation nodes through which it passes. When the circulation path is determined to deviate from the preset circulation path, or the circulation area of the set deviates from the preset circulation area, a leak of sensitive files may be occurring; an alarm prompt is issued so that the leak can be stopped in time and loss reduced.
The beneficial effects of this technical scheme are as follows: monitoring every sensitive file individually is avoided. A monitoring mechanism for sensitive file sets is established by performing cluster analysis on the sensitive files, and only sensitive file sets whose sensitive aggregation degree is greater than the preset threshold are monitored. This saves system resources, reduces the number of monitored objects and the complexity of monitoring, and improves monitoring efficiency. When a sensitive file is leaked, the data can be traced and the leak location found accurately, which shortens the search; the security level of the leak location can then be raised promptly, reducing the risk of further leakage.
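The threshold test and path check performed by the second judging module and the alarm module can be sketched as follows. Two points are assumptions: the sensitive aggregation degree is taken to be the plain sum of the per-file sensitivities (the text only says it represents the total sensitivity of the set), and a path is taken to deviate when it visits any node that is not on the preset path. The database contents, node names and function names are hypothetical.

```python
from typing import Dict, List

# Hypothetical sensitive file database: file name -> sensitivity.
SENSITIVITY: Dict[str, int] = {
    "payroll.xlsx": 9,
    "roadmap.docx": 6,
    "notice.txt": 1,
}


def aggregation_degree(file_set: List[str]) -> int:
    """Assumed definition: sum of the sensitivities of the files in the set."""
    return sum(SENSITIVITY[name] for name in file_set)


def path_deviates(actual: List[str], preset: List[str]) -> bool:
    """Assumed definition: any visited node outside the preset path is a deviation."""
    allowed = set(preset)
    return any(node not in allowed for node in actual)


def monitor(file_set: List[str], actual: List[str],
            preset: List[str], threshold: int) -> str:
    """Only sets above the threshold get a life cycle and a path check."""
    if aggregation_degree(file_set) <= threshold:
        return "not monitored"
    return "alarm" if path_deviates(actual, preset) else "ok"


preset_path = ["hr-server", "mail-gateway", "archive"]
print(monitor(["payroll.xlsx", "roadmap.docx"],
              ["hr-server", "mail-gateway", "usb-drive"], preset_path, 10))
print(monitor(["notice.txt"], ["hr-server", "usb-drive"], preset_path, 10))
```

The second call shows the resource-saving effect described above: a set below the threshold is never path-checked, even though its path leaves the preset route.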
According to some embodiments of the invention, the system further comprises:
the encryption and decryption module, used for encrypting the sensitive file set before the sensitive file set is circulated; decrypting the sensitive file set when it reaches the target position; and, when decryption fails or the decrypted sensitive file set is inconsistent with the sensitive file set before encryption, acquiring the circulation path of the sensitive file set and determining the circulation nodes according to the circulation path;
the circulation node detection module, used for detecting the risk level of each circulation node in turn, sorting the circulation nodes by risk level, selecting the circulation node with the highest risk level, and acquiring that node's operation log for the sensitive file set; analyzing the operation log to judge whether abnormal behavior exists; when abnormal behavior exists, issuing an alarm prompt and blocking the circulation node; during the blocking period, reducing the risk level of the circulation node according to the abnormal behavior; and re-enabling the circulation node when its risk level is smaller than a preset risk level.
The working principle of this technical scheme is as follows: the encryption and decryption module encrypts the sensitive file set before it is circulated and decrypts it when it reaches the target position, which may be the final circulation node. If decryption fails, or the decrypted sensitive file set is inconsistent with the sensitive file set before encryption, a tamper event has occurred during circulation; the circulation path of the sensitive file set is then obtained and the circulation nodes are determined from that path. The circulation node detection module detects the risk level of each circulation node in turn, sorts the nodes by risk level, selects the node with the highest risk level, and acquires that node's operation log for the sensitive file set. It analyzes the operation log to judge whether abnormal behavior exists. When it does, an alarm prompt is issued and the circulation node is blocked; during the blocking period the risk level of the node is reduced according to the abnormal behavior, and the node is re-enabled once its risk level is smaller than the preset risk level.
The beneficial effects of this technical scheme are as follows: theft, leakage and similar events can be avoided while the sensitive file set is circulated, which improves the security of data transmission. When the sensitive file set is tampered with during circulation, use of the tampered set is avoided and loss is reduced. In addition, the circulation node at which the tamper event occurred is located and blocked, which prevents further leakage and tamper events through that node and reduces loss. During the blocking period the risk level of the node is reduced according to the abnormal behavior, and the node is re-enabled when its risk level is smaller than the preset risk level, so that the sensitive file set passes through the node only when the node is secure.
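The block and re-enable behaviour of the circulation node detection module can be sketched as a small state holder. The integer risk scale, the `remediate` step size and all names are hypothetical; the text states only that the risk level is reduced according to the abnormal behavior during blocking and that the node is re-enabled once the level is below the preset value.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    name: str
    risk_level: int  # assumed scale: a larger value means a riskier node
    blocked: bool = False


def on_abnormal_behavior(node: NodeState) -> None:
    """Issue an alarm and block the node when its log shows abnormal behavior."""
    print(f"ALARM: abnormal behavior on {node.name}, node blocked")
    node.blocked = True


def remediate(node: NodeState, reduction: int, preset_level: int) -> None:
    """Lower the risk level during blocking; re-enable below the preset level."""
    node.risk_level = max(0, node.risk_level - reduction)
    if node.blocked and node.risk_level < preset_level:
        node.blocked = False
        print(f"{node.name} re-enabled at risk level {node.risk_level}")


node = NodeState("mail-relay", risk_level=7)
on_abnormal_behavior(node)
remediate(node, reduction=2, preset_level=4)  # level 5, still blocked
remediate(node, reduction=2, preset_level=4)  # level 3, re-enabled
```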
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (9)

1. A sensitive file management method, comprising:
establishing a sensitive file database in advance, and setting a sensitivity for each sensitive file stored in the sensitive file database;
acquiring and parsing network traffic data in a preset time period, screening out a file set of a target type, and identifying whether a sensitive file exists in the file set according to the sensitive file database;
when a sensitive file is determined to exist, performing cluster analysis on the sensitive files to form a sensitive file set; acquiring the sensitivity of each sensitive file in the sensitive file set according to the sensitive file database, calculating the sensitive aggregation degree of the sensitive file set, and judging whether the sensitive aggregation degree is greater than a preset sensitive aggregation degree threshold;
when the sensitive aggregation degree is determined to be greater than the preset sensitive aggregation degree threshold, establishing a life cycle of the sensitive file set, analyzing the circulation path of the sensitive file set, and issuing an alarm prompt when the circulation path of the sensitive file set is determined to deviate from a preset circulation path;
encrypting the sensitive file set before the sensitive file set is circulated;
decrypting the sensitive file set when it reaches the target position; when decryption fails or the decrypted sensitive file set is inconsistent with the sensitive file set before encryption, acquiring the circulation path of the sensitive file set and determining the circulation nodes according to the circulation path;
detecting the risk level of each circulation node in turn, sorting the circulation nodes by risk level, selecting the circulation node with the highest risk level, and acquiring that node's operation log for the sensitive file set;
analyzing the operation log to judge whether abnormal behavior exists; when abnormal behavior exists, issuing an alarm prompt and blocking the circulation node; during the blocking period, reducing the risk level of the circulation node according to the abnormal behavior; and re-enabling the circulation node when its risk level is smaller than a preset risk level.
2. The sensitive file management method of claim 1, wherein, when said sensitive file set is circulated, a working key is obtained from said sensitive file set;
the sensitive file set is compressed and encrypted using the working key to obtain a first encrypted file;
the public key of the target circulation node is obtained, and the working key is encrypted using the public key to obtain an encrypted key ciphertext;
and the first encrypted file and the encrypted key ciphertext are transmitted to the target circulation node through a network, wherein the target circulation node decrypts the encrypted key ciphertext using a private key held in the target circulation node to obtain the working key, and decrypts the first encrypted file using the working key to obtain the decrypted sensitive file set.
3. The method for managing sensitive files as claimed in claim 2, wherein said obtaining a working key from said set of sensitive files comprises:
randomly generating a character string containing letters and digits by using a Random function;
and taking the randomly generated character string as the working key.
4. The method for managing sensitive files as claimed in claim 2, wherein said obtaining the public key of the target circulation node, encrypting the working key using the public key, and obtaining the encrypted key ciphertext comprises:
obtaining the public key of the target circulation node through a USB key operation;
and encrypting the working key using an asymmetric (public-key) encryption algorithm to obtain the encrypted key ciphertext.
5. The sensitive file management method of claim 1, wherein identifying whether a sensitive file exists in said collection of files based on said sensitive file database comprises:
extracting the characteristics of the files in the file set respectively, and extracting characteristic keywords;
carrying out standardized processing on the feature keywords to obtain standardized feature keywords, and judging whether the standardized feature keywords exist in the sensitive file database;
counting the number of standardized feature keywords that exist in the sensitive file database, and determining that a sensitive file exists in the file set when the number is greater than a preset number.
6. The method for managing sensitive files as claimed in claim 1, wherein when determining that there is a sensitive file, performing cluster analysis on the sensitive file to form a sensitive file set, comprising:
determining attribute relations among the sensitive files, and determining correlation coefficients among the sensitive files according to the attribute relations;
and sequencing the correlation coefficients among the sensitive files, determining the correlation degree among the sensitive files according to the magnitude of the correlation coefficients, and establishing a topological connection relation among the sensitive files to form a sensitive file set.
7. The sensitive file management method according to claim 1, further comprising:
monitoring a sensitive file database, recording access information to the sensitive file database, and generating a sensitive file access table;
setting the maximum access times to the sensitive files in a preset time period according to the sensitivity of the sensitive files in the sensitive file database;
inquiring a sensitive file access table to obtain the access times of the target sensitive file in a preset time period, and sending out an alarm prompt when the access times are determined to be greater than the maximum access times.
8. The method for managing sensitive files as claimed in claim 1, further comprising, before parsing the network traffic for a predetermined period of time:
performing virus detection on the network traffic; when a virus is determined to exist in the network traffic, analyzing the virus and calculating its virus value, determining a virus grade according to the virus value and a preset virus grade table, and issuing an alarm of the level corresponding to the virus grade.
9. The sensitive file management method according to claim 8, wherein when virus detection is performed on said network traffic, an effective rate of the virus detection is calculated, and when said effective rate is determined to be less than a preset effective rate, a detection-unqualified message is issued and said network traffic is detected again;
the calculating of the effective rate of virus detection comprises:
calculating a detection difficulty coefficient S of virus data:
wherein M is the number of detected virus data; b is a detection scale factor; $\bar{A}$ is the average length of the detected virus data; $A_i$ is the length of the i-th virus data detected; D is the average spacing between detected adjacent virus data; L is the length of the network traffic processed by wavelet analysis; N is the number of virus data detected in the network traffic processed by wavelet analysis;
according to the detection difficulty coefficient of virus data, calculating the effective rate K of virus detection:
wherein $\lambda$ is the average duration of all virus data detected in the network traffic; $\lambda_i$ is the duration for which the i-th virus data is detected; $T_i$ is the noise value at the time of detecting the i-th virus data.
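As an illustration of the keyword-based identification recited in claim 5, the following Python sketch extracts keywords from each file, standardizes them, and counts how many distinct standardized keywords appear in the sensitive file database. Several details are assumptions: feature extraction is reduced to splitting on alphabetic words, standardization to lower-casing, and the database to a small hypothetical keyword set. The claim itself does not fix any of these.

```python
import re
from typing import Iterable, Set

# Hypothetical stand-in for the keywords held in the sensitive file database.
SENSITIVE_KEYWORDS: Set[str] = {"salary", "confidential", "blueprint"}


def extract_keywords(text: str) -> Set[str]:
    """Assumed feature extraction: every alphabetic word is a candidate keyword."""
    return set(re.findall(r"[A-Za-z]+", text))


def standardize(keyword: str) -> str:
    """Assumed standardization: case folding only."""
    return keyword.lower()


def has_sensitive_file(file_texts: Iterable[str], preset_number: int) -> bool:
    """True when more than preset_number distinct keywords hit the database."""
    matched: Set[str] = set()
    for text in file_texts:
        for keyword in extract_keywords(text):
            standardized = standardize(keyword)
            if standardized in SENSITIVE_KEYWORDS:
                matched.add(standardized)
    return len(matched) > preset_number


files = ["CONFIDENTIAL: Salary review for Q3", "Lunch menu for Friday"]
print(has_sensitive_file(files, preset_number=1))
```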
CN202110039654.5A 2021-01-13 2021-01-13 Sensitive file management method Active CN112733188B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110039654.5A CN112733188B (en) 2021-01-13 2021-01-13 Sensitive file management method

Publications (2)

Publication Number Publication Date
CN112733188A CN112733188A (en) 2021-04-30
CN112733188B true CN112733188B (en) 2023-09-22

Family

ID=75591479

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110039654.5A Active CN112733188B (en) 2021-01-13 2021-01-13 Sensitive file management method

Country Status (1)

Country Link
CN (1) CN112733188B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114781194B (en) * 2022-06-20 2022-09-09 航天晨光股份有限公司 Construction method of database based on metal hose

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001063528A1 (en) * 2000-02-23 2001-08-30 Ipdn Corporation Methods and devices for storing, distributing, and accessing intellectual property in digital form
US7870614B1 (en) * 2006-01-27 2011-01-11 Aspect Loss Prevention, LLC Sensitive data aliasing
CN105740661A (en) * 2014-12-11 2016-07-06 中国移动通信集团公司 Method and device for protecting application program
CN106713067A (en) * 2016-11-30 2017-05-24 广东电网有限责任公司信息中心 Sensitive file circulation monitoring method based on DPI
CN107577939A (en) * 2017-09-12 2018-01-12 中国石油集团川庆钻探工程有限公司 A kind of data leakage prevention method based on key technology
CN107733902A (en) * 2017-10-23 2018-02-23 中国移动通信集团广东有限公司 A kind of monitoring method and device of target data diffusion process
CN108133138A (en) * 2017-12-21 2018-06-08 北京明朝万达科技股份有限公司 A kind of sensitive information source tracing method of leakage, device and system
CN108667766A (en) * 2017-03-28 2018-10-16 腾讯科技(深圳)有限公司 File detection method and file detection device
CN109766525A (en) * 2019-01-14 2019-05-17 湖南大学 A kind of sensitive information leakage detection framework of data-driven
WO2019196224A1 (en) * 2018-04-09 2019-10-17 平安科技(深圳)有限公司 Regulation information processing method and apparatus, computer device and storage medium
CN110377479A (en) * 2019-05-24 2019-10-25 平安普惠企业管理有限公司 Sensitive field monitoring method, device and the computer equipment of journal file
CN111967024A (en) * 2020-07-10 2020-11-20 苏州浪潮智能科技有限公司 File sensitive data protection method and device
CN112115493A (en) * 2020-09-16 2020-12-22 安徽长泰信息安全服务有限公司 Data leakage protection system based on data acquisition

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8135948B2 (en) * 2006-01-27 2012-03-13 Imperva, Inc. Method and system for transparently encrypting sensitive information
TWI528218B (en) * 2013-11-29 2016-04-01 財團法人資訊工業策進會 Method for discriminating sensitive data and data loss prevention system using the method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
严敏 ; 何庆 ; .基于大数据平台敏感数据流转全生命周期监控的研究与应用.信息安全研究.2018,(第02期),第51-55页. *
李自清 ; .基于网络的数据库敏感数据加密模型研究.计算机测量与控制.2017,(第05期),第184-187、191页. *
许暖 ; .基于敏感数据流向分析的数据管控体系的研究.网络安全技术与应用.2020,(第03期),第67-68页. *
陈颖.基于数据驱动的敏感信息泄露检测系统.《基于数据驱动的敏感信息泄露检测系统》.2020,全文. *

Also Published As

Publication number Publication date
CN112733188A (en) 2021-04-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant