CN113392111A - Self-learning management system based on sensitive database - Google Patents

Info

Publication number
CN113392111A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110672561.6A
Other languages
Chinese (zh)
Other versions
CN113392111B (en)
Inventor
林德威
高董英
方志坚
黄芳芳
潘建笠
刘积娟
黄鹏
陈强
谢妙红
李建平
曾驰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Information and Telecommunication Co Ltd
Information and Telecommunication Branch of State Grid Fujian Electric Power Co Ltd
Great Power Science and Technology Co of State Grid Information and Telecommunication Co Ltd
Original Assignee
State Grid Information and Telecommunication Co Ltd
Information and Telecommunication Branch of State Grid Fujian Electric Power Co Ltd
Great Power Science and Technology Co of State Grid Information and Telecommunication Co Ltd
Application filed by State Grid Information and Telecommunication Co Ltd, Information and Telecommunication Branch of State Grid Fujian Electric Power Co Ltd, and Great Power Science and Technology Co of State Grid Information and Telecommunication Co Ltd
Priority to CN202110672561.6A
Publication of CN113392111A
Application granted
Publication of CN113392111B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2358 Change logging, detection, and notification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/38 Payment protocols; Details thereof
    • G06Q20/382 Payment protocols; Details thereof insuring higher security of transaction
    • G06Q20/3829 Payment protocols; Details thereof insuring higher security of transaction involving key management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Computer Security & Cryptography (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Storage Device Security (AREA)

Abstract

The invention provides a self-learning management system based on a sensitive database. The system comprises a database updating module, a storage module, a self-learning module and a processing module. An initial sensitive database is stored in the storage module; the self-learning module constructs sensitive data features from the initial sensitive database; the processing module classifies received data; and the database updating module stores the classified sensitive data in the storage module. Because the system can learn from and reclassify newly generated sensitive data, it improves the accuracy of sensitive data classification and addresses the rigidity, low efficiency and low security of existing sensitive data processing.

Description

Self-learning management system based on sensitive database
Technical Field
The invention relates to the technical field of data processing, in particular to a self-learning management system based on a sensitive database.
Background
Sensitive data is data whose leakage may cause serious harm to society or to individuals. It includes personal privacy data such as names, identification numbers, addresses, telephone numbers, bank accounts, mailboxes, passwords, medical information and educational background, as well as data that an enterprise or social organization should not publish, such as the enterprise's business situation, its network structure and its IP address lists. In particular, the spread of smart grid systems has increased the granularity of information collection and, with it, the risk that electricity consumption information is leaked.
In the prior art, sensitive data is generally divided and then classified according to judgment criteria set manually in advance. This management method is ill-suited to the current era of data proliferation: data is updated quickly and the combinations of different data types keep changing, so the original sensitive data management system cannot cover current identification scenarios and easily misses new kinds of sensitive data, which reduces the security and efficiency of data processing.
Disclosure of Invention
In view of the defects in the prior art, the invention aims to provide a self-learning management system based on a sensitive database. The system can learn from and reclassify newly generated sensitive data, which improves the accuracy of sensitive data classification and addresses the rigidity, low efficiency and low security of existing sensitive data processing.
To achieve this purpose, the invention adopts the following technical scheme: a self-learning management system based on a sensitive database comprises a database updating module, a storage module, a self-learning module and a processing module, wherein an initial sensitive database is stored in the storage module, the self-learning module is used for constructing sensitive data features according to the initial sensitive database, the processing module is used for classifying received data, and the database updating module is used for storing the classified sensitive data in the storage module;
the self-learning module comprises a first learning unit and a second learning unit; the first learning unit is used for constructing sensitive data features according to the initial sensitive database, and the second learning unit is used for constructing sensitive data features according to the updated sensitive database;
the first learning unit comprises a sensitive data classification subunit and a first feature construction subunit; the sensitive data classification subunit is configured with a sensitive data classification strategy, which comprises: classifying the sensitive data in the initial sensitive database into three levels, namely highly sensitive data, moderately sensitive data and mildly sensitive data;
then performing data label classification on the highly sensitive data, the moderately sensitive data and the mildly sensitive data, wherein the data labels comprise data source area, digital data, combined data, physical sign data, payment record data and login record data;
the first feature construction subunit is configured with a first feature construction strategy, which comprises: extracting the data source area of the highly sensitive data, and marking the data source area as a high-sensitivity area feature;
extracting the digital data and payment record data in the highly sensitive data, and marking the combination in which digital data and payment record data are labeled together as a high-sensitivity payment password feature;
extracting the combined data and login record data in the highly sensitive data, and marking the combination in which combined data and login record data are labeled together as a high-sensitivity login password feature;
extracting the digital data and login record data in the highly sensitive data, and marking the combination in which digital data and login record data are labeled together as a high-sensitivity login account feature;
extracting the physical sign data and payment record data in the highly sensitive data, and marking the combination in which physical sign data and payment record data are labeled together as a high-sensitivity payment physical sign feature;
extracting the physical sign data and login record data in the highly sensitive data, and marking the combination in which physical sign data and login record data are labeled together as a high-sensitivity login physical sign feature;
the processing module comprises a sensitive data dividing unit configured with a comparison strategy, which comprises: classifying the received data by data label, then comparing it with the high-sensitivity area feature, the high-sensitivity payment password feature, the high-sensitivity login account feature, the high-sensitivity payment physical sign feature and the high-sensitivity login physical sign feature respectively; when the data matches a feature, classifying it as primary highly sensitive data and adding the label of each matched feature to the data;
the database updating module comprises a cache unit, which is used for storing newly classified primary highly sensitive data within a first time period;
the storage module comprises a highly sensitive data storage unit configured with a relocation strategy, which comprises: transferring the data stored in the cache unit into the highly sensitive data storage unit once every first time period.
Further, the second learning unit comprises a second feature construction subunit configured with a second feature construction strategy, which comprises: extracting the data that has both the high-sensitivity area feature and the high-sensitivity payment password feature, and marking that data with a high-sensitivity concentrated payment area feature.
Further, the second feature construction strategy also comprises: extracting the data that has both the high-sensitivity login account feature and the high-sensitivity area feature, and marking that data with a high-sensitivity concentrated login area feature.
Further, the second feature construction strategy also comprises: extracting the data that has both the high-sensitivity payment password feature and the high-sensitivity login password feature, and marking that data with a high-sensitivity password usage feature.
Further, the second feature construction strategy also comprises: extracting the data that has both the high-sensitivity payment physical sign feature and the high-sensitivity login physical sign feature, and marking that data with a high-sensitivity physical sign feature.
Further, the comparison strategy also comprises: classifying the received data by data label, then comparing it with the high-sensitivity concentrated payment area feature, the high-sensitivity concentrated login area feature, the high-sensitivity password usage feature and the high-sensitivity physical sign feature respectively; when the data matches a feature, classifying it as secondary highly sensitive data and adding the label of each matched feature to the data.
Further, the second learning unit also comprises a feature subdivision subunit configured with a feature subdivision strategy, which comprises: splitting the high-sensitivity password usage feature; recording, for each password, its number of digits and the number of symbol types it combines; grouping the symbol-type counts by number of digits; for each number of digits, selecting the symbol-type count that occurs most frequently as the matching combination; and marking that combination as the type-count feature corresponding to that number of digits.
Further, the comparison strategy also comprises: classifying the received data by data label, then comparing it with the type-count features corresponding to the numbers of digits; when the data matches a feature, classifying it as subdivided sensitive data and adding the label of each matched feature to the data.
Further, the data labels also comprise video data, picture data and mobile phone shooting source data;
the first feature construction strategy also comprises: extracting the video data and mobile phone shooting source data in the highly sensitive data, and marking the combination in which video data and mobile phone shooting source data are labeled together as a high-sensitivity video feature;
and extracting the picture data and mobile phone shooting source data in the highly sensitive data, and marking the combination in which picture data and mobile phone shooting source data are labeled together as a high-sensitivity picture feature.
Further, the comparison strategy also comprises: classifying the received data by data label, then comparing it with the high-sensitivity video feature and the high-sensitivity picture feature; when the data matches a feature, classifying it as highly sensitive data and adding the label of each matched feature to the data.
The beneficial effects of the invention are as follows. The sensitive data classification strategy classifies the sensitive data in the initial sensitive database into highly sensitive data, moderately sensitive data and mildly sensitive data. The data at each level is then classified by data label, the labels comprising data source area, digital data, combined data, physical sign data, payment record data and login record data. Learning from these labels, the first feature construction strategy constructs the high-sensitivity area feature, the high-sensitivity payment password feature, the high-sensitivity login account feature, the high-sensitivity payment physical sign feature and the high-sensitivity login physical sign feature, so that received data can be classified as sensitive quickly and the efficiency of self-learning processing of sensitive data is improved.
By providing the second learning unit, the invention can construct sensitive data features from the updated sensitive database, additionally building the high-sensitivity concentrated payment area feature, the high-sensitivity concentrated login area feature, the high-sensitivity password usage feature and the high-sensitivity physical sign feature. This raises the sensitivity level assigned to the affected data and improves the accuracy with which highly sensitive data is classified. The added feature subdivision strategy subdivides the high-sensitivity password usage feature by number of digits and by number of symbol types, which improves the accuracy of password data identification. Adding video data, picture data and mobile phone shooting source data to the data labels yields the high-sensitivity video feature and the high-sensitivity picture feature, which makes sensitive data classification more comprehensive.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a schematic block diagram of a first embodiment of the present invention;
FIG. 2 is a schematic block diagram of a second embodiment of the present invention.
In the figures: 1. self-learning management system; 11. self-learning module; 111. first learning unit; 1111. sensitive data classification subunit; 1112. first feature construction subunit; 112. second learning unit; 1121. second feature construction subunit; 1122. feature subdivision subunit; 12. processing module; 121. sensitive data dividing unit; 13. database updating module; 131. cache unit; 14. storage module; 141. highly sensitive data storage unit.
Detailed Description
To make the technical means, inventive features, objectives and effects of the invention easy to understand, the invention is further described below with reference to specific embodiments.
In a first embodiment, referring to FIG. 1, a self-learning management system based on a sensitive database includes a database updating module 13, a storage module 14, a self-learning module 11 and a processing module 12. An initial sensitive database is stored in the storage module 14; the self-learning module 11 is configured to construct sensitive data features according to the initial sensitive database; the processing module 12 is configured to classify received data; and the database updating module 13 is configured to store the classified sensitive data in the storage module 14.
The self-learning module 11 comprises a first learning unit 111 and a second learning unit 112; the first learning unit 111 is configured to construct sensitive data features according to the initial sensitive database, and the second learning unit 112 is configured to construct sensitive data features according to the updated sensitive database.
The first learning unit 111 comprises a sensitive data classification subunit 1111 and a first feature construction subunit 1112. The sensitive data classification subunit 1111 is configured with a sensitive data classification strategy, which includes: classifying the sensitive data in the initial sensitive database into three levels, namely highly sensitive data, moderately sensitive data and mildly sensitive data;
and then performing data label classification on the highly sensitive data, the moderately sensitive data and the mildly sensitive data, wherein the data labels comprise data source area, digital data, combined data, physical sign data, payment record data and login record data.
The first feature construction subunit 1112 is configured with a first feature construction strategy, which includes: extracting the data source area of the highly sensitive data and marking it as a high-sensitivity area feature. High-sensitivity areas are generally areas where data security must be ensured, such as national research institutes, scientific research institutes and banks, and data output from these areas must be classified as highly sensitive data.
The digital data and payment record data in the highly sensitive data are extracted, and the combination in which digital data and payment record data are labeled together is marked as a high-sensitivity payment password feature. When digital data and payment record data occur together, the digital data is, with high probability, a payment password, so it must be classified as highly sensitive data.
The combined data and login record data in the highly sensitive data are extracted, and the combination in which combined data and login record data are labeled together is marked as a high-sensitivity login password feature. When combined data and login record data occur together, the combined data is, with high probability, a login password, so it must be classified as highly sensitive data.
The digital data and login record data in the highly sensitive data are extracted, and the combination in which digital data and login record data are labeled together is marked as a high-sensitivity login account feature. When digital data and login record data occur together, the digital data is, with high probability, a login account or a mobile phone number, so it must be classified as highly sensitive data.
The physical sign data and payment record data in the highly sensitive data are extracted, and the combination in which physical sign data and payment record data are labeled together is marked as a high-sensitivity payment physical sign feature. When physical sign data and payment record data occur together, the physical sign data is, with high probability, a physical sign password used during payment, such as a fingerprint password.
The physical sign data and login record data in the highly sensitive data are extracted, and the combination in which physical sign data and login record data are labeled together is marked as a high-sensitivity login physical sign feature. When physical sign data and login record data occur together, the physical sign data is, with high probability, a physical sign password used during login, such as a fingerprint password.
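The first feature construction strategy can be illustrated with a minimal Python sketch. It assumes that each record in the initial sensitive database already carries a sensitivity level, a set of data labels and an optional source area; the `Record` type, the label spellings and the feature identifiers are hypothetical names chosen for illustration and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Optional

# Label pairs named in the first feature construction strategy.
# All identifier spellings here are hypothetical.
PAIR_RULES = {
    frozenset({"digital", "payment_record"}): "high_payment_password",
    frozenset({"combined", "login_record"}): "high_login_password",
    frozenset({"digital", "login_record"}): "high_login_account",
    frozenset({"physical_sign", "payment_record"}): "high_payment_physical_sign",
    frozenset({"physical_sign", "login_record"}): "high_login_physical_sign",
}


@dataclass(frozen=True)
class Record:
    level: str                         # "high", "moderate" or "mild"
    labels: frozenset                  # data labels attached to the record
    source_area: Optional[str] = None  # data source area, if known


def build_first_level_features(records):
    """Collect source areas and label pairs observed in highly sensitive data."""
    high_areas = set()
    pair_features = {}
    for rec in records:
        if rec.level != "high":
            continue  # only highly sensitive data contributes features
        if rec.source_area is not None:
            high_areas.add(rec.source_area)
        for pair, name in PAIR_RULES.items():
            if pair <= rec.labels:  # both labels are present on the record
                pair_features[name] = pair
    return high_areas, pair_features


initial_db = [
    Record("high", frozenset({"digital", "payment_record"}), "bank"),
    Record("high", frozenset({"physical_sign", "login_record"})),
    Record("mild", frozenset({"combined", "login_record"}), "cafe"),
]
high_areas, pair_features = build_first_level_features(initial_db)
```

In this sketch the mildly sensitive record contributes nothing, so only the bank area and the two label pairs seen in highly sensitive records become features.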
The processing module 12 includes a sensitive data dividing unit 121 configured with a comparison strategy, which includes: classifying the received data by data label, then comparing it with the high-sensitivity area feature, the high-sensitivity payment password feature, the high-sensitivity login account feature, the high-sensitivity payment physical sign feature and the high-sensitivity login physical sign feature respectively; when the data matches a feature, classifying it as primary highly sensitive data and adding the label of each matched feature to the data.
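The comparison step can be sketched as follows, assuming that a feature is represented either as a set of high-sensitivity source areas or as a pair of data labels that must both be present. The five features are the ones listed in the comparison strategy; the identifier spellings, the sample areas and the `classify_primary` function are hypothetical.

```python
# The five features named in the comparison strategy.
# Identifier spellings and sample areas are hypothetical.
HIGH_AREAS = {"bank", "research_institute"}
PAIR_FEATURES = {
    "high_payment_password": frozenset({"digital", "payment_record"}),
    "high_login_account": frozenset({"digital", "login_record"}),
    "high_payment_physical_sign": frozenset({"physical_sign", "payment_record"}),
    "high_login_physical_sign": frozenset({"physical_sign", "login_record"}),
}


def classify_primary(labels, source_area=None):
    """Compare one received record with the features.

    Returns (level, matched_tags); level is None when nothing matches.
    """
    labels = frozenset(labels)
    matched = []
    if source_area in HIGH_AREAS:
        matched.append("high_area")
    matched += [name for name, pair in PAIR_FEATURES.items() if pair <= labels]
    level = "primary_high" if matched else None
    return level, sorted(matched)


level, tags = classify_primary({"digital", "payment_record"}, source_area="bank")
```

The matched tags are returned with the level so that they can be attached to the record, mirroring the step of adding the labels of the matched features to the data.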
The data labels also comprise video data, picture data and mobile phone shooting source data;
the first feature construction strategy also comprises: extracting the video data and mobile phone shooting source data in the highly sensitive data, and marking the combination in which video data and mobile phone shooting source data are labeled together as a high-sensitivity video feature;
and extracting the picture data and mobile phone shooting source data in the highly sensitive data, and marking the combination in which picture data and mobile phone shooting source data are labeled together as a high-sensitivity picture feature.
The comparison strategy also comprises: classifying the received data by data label, then comparing it with the high-sensitivity video feature and the high-sensitivity picture feature; when the data matches a feature, classifying it as highly sensitive data and adding the label of each matched feature to the data.
The database updating module 13 includes a cache unit 131, which is used for storing newly classified primary highly sensitive data within a first time period;
the storage module 14 includes a highly sensitive data storage unit 141 configured with a relocation strategy, which includes: transferring the data stored in the cache unit 131 into the highly sensitive data storage unit 141 once every first time period.
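The interaction between the cache unit and the relocation strategy can be sketched as below. This is only one possible reading, assuming the "first time" is a fixed interval in seconds and that the storage unit checks whether the interval has elapsed whenever it is polled; the class names, method names and injectable `clock` parameter are hypothetical.

```python
import time


class CacheUnit:
    """Holds newly classified primary highly sensitive records until relocation."""

    def __init__(self):
        self._items = []

    def store(self, record):
        self._items.append(record)

    def drain(self):
        # Hand over everything cached so far and start with an empty cache.
        items, self._items = self._items, []
        return items


class HighSensitivityStorageUnit:
    """Moves the cache contents into long-term storage once per interval."""

    def __init__(self, cache, first_time_s, clock=time.monotonic):
        self.records = []
        self._cache = cache
        self._first_time_s = first_time_s
        self._clock = clock
        self._last_move = clock()

    def relocate_if_due(self):
        now = self._clock()
        if now - self._last_move < self._first_time_s:
            return 0  # interval not yet elapsed; data stays in the cache
        moved = self._cache.drain()
        self.records.extend(moved)
        self._last_move = now
        return len(moved)
```

Passing the clock in as a parameter is a design choice made for this sketch: it lets the interval logic be exercised without waiting in real time.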
In a second embodiment, referring to FIG. 2, a second learning unit 112 is added on the basis of the first embodiment. The second learning unit 112 extracts features from the updated sensitive database, which further refines the subdivision of highly sensitive data and improves the accuracy of sensitive data classification. The second learning unit 112 includes a second feature construction subunit 1121 configured with a second feature construction strategy, which includes: extracting the data that has both the high-sensitivity area feature and the high-sensitivity payment password feature, and marking that data with a high-sensitivity concentrated payment area feature. This feature is common in places where payments are concentrated, such as banks or shopping malls, so data processing in such areas has a higher security priority.
The second feature construction strategy also includes: extracting the data that has both the high-sensitivity login account feature and the high-sensitivity area feature, and marking that data with a high-sensitivity concentrated login area feature. This feature is common in entertainment venues with many user terminals, such as internet cafes, where users log in to accounts frequently.
The second feature construction strategy also includes: extracting the data that has both the high-sensitivity payment password feature and the high-sensitivity login password feature, and marking that data with a high-sensitivity password usage feature. Extracting this feature makes it possible to identify password-related data so that passwords can be given priority encryption.
The second feature construction strategy also includes: extracting the data that has both the high-sensitivity payment physical sign feature and the high-sensitivity login physical sign feature, and marking that data with a high-sensitivity physical sign feature. Human physical sign data comes in many types; if such data is used for both payment and login, it is, with high probability, a physical sign password, such as a fingerprint password or a face recognition password.
The comparison strategy also comprises: classifying the received data by data label, then comparing it with the high-sensitivity concentrated payment area feature, the high-sensitivity concentrated login area feature, the high-sensitivity password usage feature and the high-sensitivity physical sign feature respectively; when the data matches a feature, classifying it as secondary highly sensitive data and adding the label of each matched feature to the data.
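The second-level features and the corresponding comparison can be sketched together, assuming each record already carries the tags of the first-level features it matched. The rule table follows the four combinations described for the second feature construction strategy; the identifier spellings and the `classify_secondary` function are hypothetical.

```python
# Each second-level feature is a pair of first-level feature tags that must
# both be present on a record. Identifier spellings are hypothetical.
SECOND_LEVEL_RULES = {
    frozenset({"high_area", "high_payment_password"}): "high_concentrated_payment_area",
    frozenset({"high_login_account", "high_area"}): "high_concentrated_login_area",
    frozenset({"high_payment_password", "high_login_password"}): "high_password_usage",
    frozenset({"high_payment_physical_sign", "high_login_physical_sign"}): "high_physical_sign",
}


def classify_secondary(first_level_tags):
    """Return (level, matched second-level tags) for one record."""
    tags = set(first_level_tags)
    matched = sorted(
        name for combo, name in SECOND_LEVEL_RULES.items() if combo <= tags
    )
    level = "secondary_high" if matched else None
    return level, matched


result = classify_secondary(["high_area", "high_payment_password", "high_login_account"])
```

A record from a high-sensitivity area that carries both a payment password and a login account therefore matches two second-level features at once.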
The second learning unit 112 also includes a feature subdivision subunit 1122 configured with a feature subdivision strategy, which includes: splitting the high-sensitivity password usage feature; recording, for each password, its number of digits and the number of symbol types it combines; grouping the symbol-type counts by number of digits; for each number of digits, selecting the symbol-type count that occurs most frequently as the matching combination; and marking that combination as the type-count feature corresponding to that number of digits.
The comparison strategy also comprises: classifying the received data by data label, then comparing it with the type-count features corresponding to the numbers of digits; when the data matches a feature, classifying it as subdivided sensitive data and adding the label of each matched feature to the data.
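The feature subdivision strategy and its comparison step can be sketched as follows. The text does not say which symbol types are counted, so the four classes used here (digits, lowercase letters, uppercase letters, other symbols) are an assumption, as are the function names; "number of digits" is read as the length of the password.

```python
from collections import Counter, defaultdict


def symbol_type_count(password):
    """Count how many symbol classes the password combines (assumed classes)."""
    kinds = set()
    for ch in password:
        if ch.isdigit():
            kinds.add("digit")
        elif ch.islower():
            kinds.add("lower")
        elif ch.isupper():
            kinds.add("upper")
        else:
            kinds.add("other")
    return len(kinds)


def build_type_count_features(passwords):
    """For each password length, keep the most frequent symbol-type count."""
    by_length = defaultdict(Counter)
    for pw in passwords:
        by_length[len(pw)][symbol_type_count(pw)] += 1
    return {
        length: counts.most_common(1)[0][0]
        for length, counts in by_length.items()
    }


def matches_subdivided(candidate, features):
    """True when the candidate's length and symbol-type count match a feature."""
    return features.get(len(candidate)) == symbol_type_count(candidate)


known = ["123456", "654321", "abc123", "Passw0rd", "Secret12", "12345678"]
features = build_type_count_features(known)
```

With this sample, six-character passwords are most often purely numeric and eight-character passwords most often combine three symbol classes, so `features` maps 6 to 1 and 8 to 3.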
The working principle is as follows. While processing data, the self-learning module 11 extracts features from the initial sensitive database stored in the storage module 14 and classifies the highly sensitive data by label and feature; the processing module 12 then classifies newly received data against those features, which improves the efficiency of self-learning classification of sensitive data. The data marked as highly sensitive is first cached in the database updating module 13 and, after a certain time, is stored in the storage module 14 in a single batch. With the second learning unit 112 added to the self-learning module 11, the sensitive data newly added to the storage module 14 can be learned from and classified again, which further improves the accuracy and granularity of sensitive data classification and the overall efficiency of self-learning management of sensitive data.
Finally, it should be noted that the above embodiments are only specific embodiments of the present invention, intended to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that, within the technical scope disclosed herein, anyone skilled in the art may still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features; such modifications, changes or substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present invention, and they shall be covered by its protection scope. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (10)

1. A self-learning management system based on a sensitive database, characterized by comprising a database updating module (13), a storage module (14), a self-learning module (11) and a processing module (12), wherein an initial sensitive database is stored in the storage module (14), the self-learning module (11) is used for constructing sensitive data features from the initial sensitive database, the processing module (12) is used for classifying received data, and the database updating module (13) is used for storing the classified sensitive data into the storage module (14);
the self-learning module (11) comprises a first learning unit (111) and a second learning unit (112); the first learning unit (111) is used for constructing the sensitive data features according to an initial sensitive database, and the second learning unit (112) is used for constructing the sensitive data features according to an updated sensitive database;
the first learning unit (111) comprises a sensitive data classification subunit (1111) and a first feature construction subunit (1112); the sensitive data classification subunit (1111) is configured with a sensitive data classification policy comprising: classifying the sensitive data in the initial sensitive database, the classification levels being highly sensitive data, moderately sensitive data and mildly sensitive data;
then performing data label classification on the highly sensitive data, the moderately sensitive data and the mildly sensitive data, wherein the data labels are data source area, digital data, combined data, physical sign data, payment record data and login record data;
the first feature construction subunit (1112) is configured with a first feature construction strategy comprising: extracting the data source area in the highly sensitive data and marking it as a high-sensitivity area feature;
extracting the digital data and the payment record data in the highly sensitive data, and marking data labeled as both digital data and payment record data as a high-sensitivity payment password feature;
extracting the combined data and the login record data in the highly sensitive data, and marking data labeled as both combined data and login record data as a high-sensitivity login password feature;
extracting the digital data and the login record data in the highly sensitive data, and marking data labeled as both digital data and login record data as a high-sensitivity login account feature;
extracting the physical sign data and the payment record data in the highly sensitive data, and marking data labeled as both physical sign data and payment record data as a high-sensitivity payment physical sign feature;
extracting the physical sign data and the login record data in the highly sensitive data, and marking data labeled as both physical sign data and login record data as a high-sensitivity login physical sign feature;
the processing module (12) comprises a sensitive data dividing unit (121), wherein the sensitive data dividing unit (121) is configured with a comparison strategy, and the comparison strategy comprises: classifying the received data by data label, then comparing it against the high-sensitivity area feature, the high-sensitivity payment password feature, the high-sensitivity login account feature, the high-sensitivity payment physical sign feature and the high-sensitivity login physical sign feature respectively; when the data matches a feature, classifying it as first-level highly sensitive data and adding the label of each matched feature to the data;
the database updating module (13) comprises a cache unit (131), wherein the cache unit (131) is used for storing newly classified first-level highly sensitive data for a first time period;
the storage module (14) comprises a highly sensitive data storage unit (141), the highly sensitive data storage unit (141) being configured with a relocation policy, the relocation policy comprising: transferring the data stored in the cache unit (131) into the highly sensitive data storage unit (141) once every first time period.
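The first feature construction strategy and the comparison strategy of claim 1 can be read as a lookup from label combinations to feature names. The Python sketch below illustrates that reading only. The record layout (`labels`, `source_area`), the string identifiers, and the rule that a feature matches when all of its labels are present are assumptions; the claim does not define how a comparison "meets" a feature. The sketch also checks all six constructed features, whereas the comparison strategy as claimed lists five.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

# Label combinations named in claim 1; the string identifiers are hypothetical.
FIRST_LEVEL_RULES: Dict[str, FrozenSet[str]] = {
    "high_payment_password": frozenset({"digital", "payment_record"}),
    "high_login_password": frozenset({"combined", "login_record"}),
    "high_login_account": frozenset({"digital", "login_record"}),
    "high_payment_sign": frozenset({"physical_sign", "payment_record"}),
    "high_login_sign": frozenset({"physical_sign", "login_record"}),
}


def learn_high_areas(highly_sensitive: Iterable[dict]) -> Set[str]:
    """Collect the source areas seen in the highly sensitive records."""
    return {r["source_area"] for r in highly_sensitive if r.get("source_area")}


def match_first_level(record: dict, high_areas: Set[str]) -> List[str]:
    """Return the first-level features whose required labels the record carries."""
    labels = set(record.get("labels", ()))
    matched = [name for name, needed in FIRST_LEVEL_RULES.items() if needed <= labels]
    if record.get("source_area") in high_areas:
        matched.insert(0, "high_area")
    return matched


def classify(record: dict, high_areas: Set[str]) -> dict:
    """Tag a received record; any match makes it first-level highly sensitive."""
    matched = match_first_level(record, high_areas)
    return {**record, "features": matched, "first_level_high": bool(matched)}
```

A record labeled as both digital data and payment record data, arriving from an area already seen in the highly sensitive set, would be tagged with both the area feature and the payment password feature.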
2. The self-learning management system based on a sensitive database according to claim 1, wherein the second learning unit (112) comprises a second feature construction subunit (1121), the second feature construction subunit (1121) being configured with a second feature construction strategy comprising: extracting the data that has both the high-sensitivity area feature and the high-sensitivity payment password feature, and marking that data as a high-sensitivity concentrated payment area feature.
3. The self-learning management system based on a sensitive database according to claim 2, wherein the second feature construction strategy further comprises: extracting the data that has both the high-sensitivity login account feature and the high-sensitivity area feature, and marking that data as a high-sensitivity concentrated login area feature.
4. The self-learning management system based on a sensitive database according to claim 3, wherein the second feature construction strategy further comprises: extracting the data that has both the high-sensitivity payment password feature and the high-sensitivity login password feature, and marking that data as a high-sensitivity password usage feature.
5. The self-learning management system based on a sensitive database according to claim 4, wherein the second feature construction strategy further comprises: extracting the data that has both the high-sensitivity payment physical sign feature and the high-sensitivity login physical sign feature, and marking that data as a high-sensitivity physical sign feature.
6. The self-learning management system based on a sensitive database according to claim 5, wherein the comparison strategy further comprises: classifying the received data by data label, then comparing it against the high-sensitivity concentrated payment area feature, the high-sensitivity concentrated login area feature, the high-sensitivity password usage feature and the high-sensitivity physical sign feature respectively; when the data matches a feature, classifying it as second-level highly sensitive data and adding the label of each matched feature to the data.
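Claims 2 to 6 build each second-level feature from a pair of first-level features. A minimal Python sketch of that pairing follows, assuming a record already carries a list of first-level feature names. The identifiers are hypothetical, and treating "has both features" as a subset test is an assumption rather than something the claims spell out.

```python
from typing import Dict, FrozenSet, Iterable, List

# Pairs of first-level features named in claims 2 to 5 (identifiers are hypothetical).
SECOND_LEVEL_RULES: Dict[str, FrozenSet[str]] = {
    "high_concentrated_payment_area": frozenset({"high_area", "high_payment_password"}),
    "high_concentrated_login_area": frozenset({"high_area", "high_login_account"}),
    "high_password_use": frozenset({"high_payment_password", "high_login_password"}),
    "high_sign": frozenset({"high_payment_sign", "high_login_sign"}),
}


def second_level_features(first_level: Iterable[str]) -> List[str]:
    """Return every second-level feature whose two first-level features are present."""
    have = set(first_level)
    return [name for name, needed in SECOND_LEVEL_RULES.items() if needed <= have]
```

Under claim 6, a record for which this list is non-empty would be classified as second-level highly sensitive data and tagged with the matched feature names.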
7. The self-learning management system based on a sensitive database according to claim 6, wherein the second learning unit (112) further comprises a feature subdivision subunit (1122), the feature subdivision subunit (1122) being configured with a feature subdivision strategy comprising: splitting the high-sensitivity password usage feature, recording the number of digits of each instance and the number of symbol types used in it, grouping the symbol-type counts by number of digits, selecting, for each number of digits, the symbol-type count that occurs most often as the matching combination, and marking that combination as the type-count feature corresponding to that number of digits.
8. The self-learning management system based on a sensitive database according to claim 7, wherein the comparison strategy further comprises: classifying the received data by data label, then comparing it against the type-count features corresponding to the number of digits; when the data matches a feature, classifying it as subdivided sensitive data and adding the label of each matched feature to the data.
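The feature subdivision of claims 7 and 8 can be sketched as a frequency count per length. The Python sketch below is one possible reading, not the claimed method itself: it assumes "number of digits" means the character length of a value, and that "symbol types" are the four character classes digit, lowercase, uppercase and other. Neither definition is given in the patent, and all function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable


def symbol_type_count(value: str) -> int:
    """Count how many character classes (digit, lower, upper, other) appear."""
    classes = set()
    for ch in value:
        if ch.isdigit():
            classes.add("digit")
        elif ch.islower():
            classes.add("lower")
        elif ch.isupper():
            classes.add("upper")
        else:
            classes.add("other")
    return len(classes)


def learn_type_count_by_length(samples: Iterable[str]) -> Dict[int, int]:
    """For each length, keep the symbol-type count that occurs most often."""
    by_length: Dict[int, Counter] = defaultdict(Counter)
    for s in samples:
        by_length[len(s)][symbol_type_count(s)] += 1
    # On a tie, most_common keeps the count that was seen first.
    return {length: counts.most_common(1)[0][0] for length, counts in by_length.items()}


def matches_subdivided(value: str, learned: Dict[int, int]) -> bool:
    """Claim 8 comparison: same length and same symbol-type count as learned."""
    return learned.get(len(value)) == symbol_type_count(value)
```

A received value that matches would be classified as subdivided sensitive data; a value whose length was never observed matches nothing.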
9. The self-learning management system based on a sensitive database according to claim 8, wherein the data labels further comprise video data, picture data and mobile phone capture source data;
the first feature construction strategy further comprises: extracting the video data and the mobile phone capture source data in the highly sensitive data, and marking data labeled as both video data and mobile phone capture source data as a high-sensitivity video feature;
and extracting the picture data and the mobile phone capture source data in the highly sensitive data, and marking data labeled as both picture data and mobile phone capture source data as a high-sensitivity picture feature.
10. The self-learning management system based on a sensitive database according to claim 9, wherein the comparison strategy further comprises: classifying the received data by data label, then comparing it against the high-sensitivity video feature and the high-sensitivity picture feature; when the data matches a feature, classifying it as highly sensitive data and adding the label of each matched feature to the data.
CN202110672561.6A 2021-06-17 2021-06-17 Self-learning management system based on sensitive database Active CN113392111B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110672561.6A CN113392111B (en) 2021-06-17 2021-06-17 Self-learning management system based on sensitive database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110672561.6A CN113392111B (en) 2021-06-17 2021-06-17 Self-learning management system based on sensitive database

Publications (2)

Publication Number Publication Date
CN113392111A (en) 2021-09-14
CN113392111B (en) 2022-04-29

Family

ID=77621795

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110672561.6A Active CN113392111B (en) 2021-06-17 2021-06-17 Self-learning management system based on sensitive database

Country Status (1)

Country Link
CN (1) CN113392111B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110040983A1 (en) * 2006-11-09 2011-02-17 Grzymala-Busse Withold J System and method for providing identity theft security
CN109344258A (en) * 2018-11-28 2019-02-15 中国电子科技网络信息安全有限公司 Intelligent self-adaptive sensitive data identification system and method
CN110580416A (en) * 2019-09-11 2019-12-17 国网浙江省电力有限公司信息通信分公司 Automatic sensitive data identification method based on artificial intelligence
CN112507376A (en) * 2020-12-01 2021-03-16 浙商银行股份有限公司 Sensitive data detection method and device based on machine learning

Also Published As

Publication number Publication date
CN113392111B (en) 2022-04-29

Similar Documents

Publication Publication Date Title
CN109658042B (en) Review method, device, equipment and storage medium based on artificial intelligence
EP2803031B1 (en) Machine-learning based classification of user accounts based on email addresses and other account information
US9510198B2 (en) Mobile terminal and user identity recognition method
CN112307472B (en) Abnormal user identification method and device based on intelligent decision and computer equipment
CN111539021A (en) Data privacy type identification method, device and equipment
CN109800304A (en) Processing method, device, equipment and medium for case notes
CN106650799A (en) Electronic evidence classification extraction method and system
CN107633022A (en) Personnel portrait analysis method, device and storage medium
CN102739774A (en) Method and system for obtaining evidence under cloud computing environment
CN104866775A (en) Bleaching method for financial data
WO2021136318A1 (en) Digital humanities-oriented email history eventline generating method and apparatus
CN103870502A (en) Enterprise mobile social network system based on face recognition
Neal et al. You are not acting like yourself: A study on soft biometric classification, person identification, and mobile device use
CN112734436A (en) Terminal and method for supporting face recognition
US20230291731A1 (en) Systems and methods for monitoring decentralized data storage
CN107742068A (en) Multi-source implicit identity authorization system and method for smart devices
CN113392111B (en) Self-learning management system based on sensitive database
CN111383072A (en) User credit scoring method, storage medium and server
CN116070248B (en) Data processing system and method for ensuring safety of power data
CN116778306A (en) Fake object detection method, related device and storage medium
WO2023000792A1 (en) Methods and apparatuses for constructing living body identification model and for living body identification, device and medium
CN115544558A (en) Sensitive information detection method and device, computer equipment and storage medium
CN113657443B (en) On-line Internet of things equipment identification method based on SOINN network
CN113780318B (en) Method, device, server and medium for generating prompt information
CN115964706A (en) Training data poisoning defense method under federal learning scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant