EA201991625A1 - METHOD AND SYSTEM FOR DATA CLASSIFICATION FOR DETECTING CONFIDENTIAL INFORMATION - Google Patents
Info
- Publication number
- EA201991625A1
- Authority
- EA
- Eurasian Patent Office
- Prior art keywords
- data
- confidential information
- tags
- classifying
- processing
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/906—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
Abstract
The present invention relates generally to the field of computational data processing, and in particular to methods of classifying data in order to detect confidential information.
A computer-implemented method for classifying data to detect confidential information is performed by at least one processor and comprises the steps of: receiving data presented in tabular format; processing the received data with an ensemble of neural networks, during which the data in each table cell is assigned a tag corresponding to a given type of confidential information, wherein a classification matrix is formed for each neural network and is used to compute an F-measure for each data type; processing the received data with check-digit algorithms in order to identify table cells whose data has a check digit; forming, from the tagged tables produced by each neural network and the matrix of F-measures corresponding to the neural networks, a final tagged table that takes into account the data having a check digit; and classifying the data of the final table into confidentiality classes by comparing the tags in the final table with specified tags of confidential information.
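The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the F-measures are combined or which check-digit algorithms are used, so the F-weighted vote, the choice of the Luhn algorithm, the two-level confidentiality class, and all function and tag names (`f_measures`, `merge_tags`, `CARD_NUMBER`, and so on) are assumptions made for illustration. The neural networks themselves are replaced by precomputed tag lists.

```python
from collections import defaultdict


def f_measures(confusion, labels):
    """Per-class F1 from a classification (confusion) matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    scores = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fp = sum(row[k] for row in confusion) - tp  # predicted k, but wrong
        fn = sum(confusion[k]) - tp                 # true k, but missed
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def luhn_valid(value):
    """True if value is an ASCII digit string that passes the Luhn check.

    Luhn is only one example of a check-digit algorithm (assumption).
    """
    if not (value.isascii() and value.isdigit()) or len(value) < 2:
        return False
    total = 0
    for pos, ch in enumerate(reversed(value)):
        digit = int(ch)
        if pos % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def merge_tags(cells, network_tags, network_f, check_tag="CARD_NUMBER"):
    """Build the final tag for each cell.

    A cell that passes the check-digit test gets check_tag directly.
    Otherwise each network votes for its tag, weighted by that network's
    F-measure for the tag (an assumed combination rule).
    """
    final = []
    for idx, cell in enumerate(cells):
        if luhn_valid(cell):
            final.append(check_tag)
            continue
        votes = defaultdict(float)
        for tags, scores in zip(network_tags, network_f):
            votes[tags[idx]] += scores.get(tags[idx], 0.0)
        final.append(max(votes, key=votes.get))
    return final


def confidentiality_class(final_tags, confidential_tags):
    """Compare final tags with the specified confidential tags."""
    hit = any(tag in confidential_tags for tag in final_tags)
    return "confidential" if hit else "public"


if __name__ == "__main__":
    labels = ["NAME", "CARD_NUMBER", "OTHER"]
    # Hypothetical classification matrices for two networks.
    f_a = f_measures([[8, 0, 2], [0, 9, 1], [1, 1, 8]], labels)
    f_b = f_measures([[5, 0, 5], [0, 6, 4], [3, 2, 5]], labels)

    cells = ["Ivan Petrov", "4111111111111111", "hello"]
    tags_a = ["NAME", "OTHER", "OTHER"]       # output of network A
    tags_b = ["OTHER", "CARD_NUMBER", "NAME"]  # output of network B

    final = merge_tags(cells, [tags_a, tags_b], [f_a, f_b])
    print(final)
    print(confidentiality_class(final, {"NAME", "CARD_NUMBER"}))
```

Letting the check-digit result override the networks is one reading of "taking into account the data having a check digit"; a softer variant would add it as an extra weighted vote instead.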
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
RU2019121020A RU2759786C1 (en) | 2019-07-05 | 2019-07-05 | Method and system for classifying data for identifying confidential information |
Publications (2)
Publication Number | Publication Date |
---|---|
EA201991625A1 (en) | 2021-01-29 |
EA038259B1 (en) | 2021-07-30 |
Family
ID=74114915
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EA201991625A EA038259B1 (en) | 2019-07-05 | 2019-07-31 | Method and system for classifying data in order to detect confidential information |
Country Status (3)
Country | Link |
---|---|
EA (1) | EA038259B1 (en) |
RU (1) | RU2759786C1 (en) |
WO (1) | WO2021006755A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113918577B (en) * | 2021-12-15 | 2022-03-11 | 北京新唐思创教育科技有限公司 | Data table identification method and device, electronic equipment and storage medium |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7480640B1 (en) * | 2003-12-16 | 2009-01-20 | Quantum Leap Research, Inc. | Automated method and system for generating models from data |
WO2006138502A2 (en) * | 2005-06-16 | 2006-12-28 | The Board Of Trustees Operating Michigan State University | Methods for data classification |
US8490194B2 (en) * | 2006-01-31 | 2013-07-16 | Robert Moskovitch | Method and system for detecting malicious behavioral patterns in a computer, using machine learning |
US8752181B2 (en) * | 2006-11-09 | 2014-06-10 | Touchnet Information Systems, Inc. | System and method for providing identity theft security |
US9082080B2 (en) * | 2008-03-05 | 2015-07-14 | Kofax, Inc. | Systems and methods for organizing data sets |
US8286255B2 (en) * | 2008-08-07 | 2012-10-09 | Sophos Plc | Computer file control through file tagging |
FR2956541B1 (en) * | 2010-02-18 | 2012-03-23 | Centre Nat Rech Scient | CRYPTOGRAPHIC METHOD FOR COMMUNICATING CONFIDENTIAL INFORMATION. |
US10169715B2 (en) * | 2014-06-30 | 2019-01-01 | Amazon Technologies, Inc. | Feature processing tradeoff management |
US10535017B2 (en) * | 2015-10-27 | 2020-01-14 | Legility Data Solutions, Llc | Apparatus and method of implementing enhanced batch-mode active learning for technology-assisted review of documents |
RU2647640C2 (en) * | 2015-12-07 | 2018-03-16 | федеральное государственное казенное военное образовательное учреждение высшего образования "Краснодарское высшее военное училище имени генерала армии С.М. Штеменко" Министерства обороны Российской Федерации | Method of automatic classification of confidential formalized documents in electronic document management system |
WO2019035765A1 (en) * | 2017-08-14 | 2019-02-21 | Dathena Science Pte. Ltd. | Methods, machine learning engines and file management platform systems for content and context aware data classification and security anomaly detection |
-
2019
- 2019-07-05 WO PCT/RU2019/000481 patent/WO2021006755A1/en active Application Filing
- 2019-07-05 RU RU2019121020A patent/RU2759786C1/en active
- 2019-07-31 EA EA201991625A patent/EA038259B1/en unknown
Also Published As
Publication number | Publication date |
---|---|
RU2759786C1 (en) | 2021-11-17 |
WO2021006755A1 (en) | 2021-01-14 |
EA038259B1 (en) | 2021-07-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11341417B2 (en) | Method and apparatus for completing a knowledge graph | |
CN109583325B (en) | Face sample picture labeling method and device, computer equipment and storage medium | |
CN109635838B (en) | Face sample picture labeling method and device, computer equipment and storage medium | |
CN111324784B (en) | Character string processing method and device | |
WO2017124942A1 (en) | Method and apparatus for abnormal access detection | |
Wang et al. | Fast and robust object detection using asymmetric totally corrective boosting | |
WO2016177069A1 (en) | Management method, device, spam short message monitoring system and computer storage medium | |
US20220222372A1 (en) | Automated data masking with false positive detection and avoidance | |
US11334773B2 (en) | Task-based image masking | |
Khullar et al. | f-FNC: Privacy concerned efficient federated approach for fake news classification | |
EA201991625A1 (en) | METHOD AND SYSTEM FOR DATA CLASSIFICATION FOR DETECTING CONFIDENTIAL INFORMATION | |
Ali et al. | Fake accounts detection on social media using stack ensemble system | |
Bhuyan et al. | SE_SPnet: Rice leaf disease prediction using stacked parallel convolutional neural network with squeeze‐and‐excitation | |
Chua et al. | Problem Understanding of Fake News Detection from a Data Mining Perspective | |
US20130322682A1 (en) | Profiling Activity Through Video Surveillance | |
Jairath et al. | Adaptive skin color model to improve video face detection | |
US8918406B2 (en) | Intelligent analysis queue construction | |
EP4227855A1 (en) | Graph explainable artificial intelligence correlation | |
CN115438658A (en) | Entity recognition method, recognition model training method and related device | |
US11775592B2 (en) | System and method for association of data elements within a document | |
EA201992491A1 (en) | METHOD AND SYSTEM FOR DATA CLASSIFICATION FOR DETECTING CONFIDENTIAL INFORMATION IN TEXT | |
CN112989869B (en) | Optimization method, device, equipment and storage medium of face quality detection model | |
CN114398887A (en) | Text classification method and device and electronic equipment | |
US20190057321A1 (en) | Classification | |
KR20170085876A (en) | Method for analyzing association of diseases using data mining |