SG11202101452RA - Methods, machine learning engines and file management platform systems for content and context aware data classification and security anomaly detection

Methods, machine learning engines and file management platform systems for content and context aware data classification and security anomaly detection

Info

Publication number
SG11202101452RA
Authority
SG
Singapore
Prior art keywords
methods
content
machine learning
management platform
file management
Prior art date
Application number
SG11202101452RA
Inventor
Christopher Muffat
Original Assignee
Dathena Science Pte Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-08-14
Filing date
2018-08-14
Publication date
2021-03-30
Application filed by Dathena Science Pte Ltd
Publication of SG11202101452RA

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Bioethics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SG10201706637Q 2017-08-14
PCT/SG2018/050411 WO2019035765A1 (en) 2017-08-14 2018-08-14 Methods, machine learning engines and file management platform systems for content and context aware data classification and security anomaly detection

Publications (1)

Publication Number Publication Date
SG11202101452RA (en) 2021-03-30

Family

ID=65362476

Family Applications (1)

Application Number Priority Date Filing Date Title
SG11202101452RA SG11202101452RA (en) 2017-08-14 2018-08-14 Methods, machine learning engines and file management platform systems for content and context aware data classification and security anomaly detection

Country Status (3)

Country Link
US (1) US12033040B2 (en)
SG (1) SG11202101452RA (en)
WO (1) WO2019035765A1 (en)

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11580170B2 (en) * 2018-11-01 2023-02-14 Google Llc Machine learning based automatic audience segment in ad targeting
US20200193231A1 (en) * 2018-12-17 2020-06-18 International Business Machines Corporation Training Model Generation
US11194691B2 (en) * 2019-05-31 2021-12-07 Gurucul Solutions, Llc Anomaly detection using deep learning models
US11005872B2 (en) 2019-05-31 2021-05-11 Gurucul Solutions, Llc Anomaly detection in cybersecurity and fraud applications
RU2759786C1 (en) * 2019-07-05 2021-11-17 Публичное Акционерное Общество "Сбербанк России" (Пао Сбербанк) Method and system for classifying data for identifying confidential information
US20210012239A1 (en) * 2019-07-12 2021-01-14 Microsoft Technology Licensing, Llc Automated generation of machine learning models for network evaluation
CN112242984B (en) * 2019-07-19 2023-05-30 伊姆西Ip控股有限责任公司 Method, electronic device and computer program product for detecting abnormal network request
CN112632269A (en) * 2019-09-24 2021-04-09 北京国双科技有限公司 Method and related device for training document classification model
CN110795564B (en) * 2019-11-01 2022-02-22 南京稷图数据科技有限公司 Text classification method lacking negative cases
CN111143842B (en) * 2019-12-12 2022-07-01 广州大学 Malicious code detection method and system
EP4107925A4 (en) * 2020-02-17 2023-06-07 Bigid Inc. Machine learning systems and methods for predicting personal information using file metadata
EP3889849A1 (en) * 2020-03-31 2021-10-06 Tata Consultancy Services Limited Method and system for specific actionable determination in an application
US11481648B2 (en) 2020-05-07 2022-10-25 Microsoft Technology Licensing, Llc Software categorization based on knowledge graph and machine learning techniques
US11461680B2 (en) * 2020-05-21 2022-10-04 Sap Se Identifying attributes in unstructured data files using a machine-learning model
CN113836345A (en) * 2020-06-23 2021-12-24 索尼公司 Information processing apparatus, information processing method, and computer-readable storage medium
CN113051395A (en) * 2020-09-15 2021-06-29 卢霞浩 Keyword clustering method and system based on cloud computing and big data
US11797770B2 (en) 2020-09-24 2023-10-24 UiPath, Inc. Self-improving document classification and splitting for document processing in robotic process automation
US11741087B2 (en) * 2021-01-04 2023-08-29 Servicenow, Inc. Automatically generated graphical user interface application with dynamic user interface segment elements
US20220237373A1 (en) * 2021-01-28 2022-07-28 Accenture Global Solutions Limited Automated categorization and summarization of documents using machine learning
CN113254634A (en) * 2021-02-04 2021-08-13 天津德尔塔科技有限公司 File classification method and system based on phase space
CN113761358A (en) * 2021-05-11 2021-12-07 中科天玑数据科技股份有限公司 Multi-channel hotspot discovery method and multi-channel hotspot discovery system
CN114124509B (en) * 2021-11-17 2024-06-18 浪潮云信息技术股份公司 Spark-based network abnormal flow detection method and system
CN115129959A (en) * 2022-08-25 2022-09-30 北京美络克思科技有限公司 Intelligent file identification method, device and system
CN115150196B (en) * 2022-09-01 2022-11-18 北京金睛云华科技有限公司 Ciphertext data-based anomaly detection method, device and equipment under normal distribution
WO2024107688A1 (en) * 2022-11-14 2024-05-23 Bp Corporation North America Inc. Systems and methods for providing searchable access to documents across separate document repositories
US11830270B1 (en) 2023-04-20 2023-11-28 FPT USA Corp. Machine learning systems for auto-splitting and classifying documents
CN116975863A (en) * 2023-07-10 2023-10-31 福州大学 Malicious code detection method based on convolutional neural network
CN118312998B (en) * 2024-04-09 2024-09-27 浙江志诚云信息科技有限公司 Large model system and method for data management and analysis
CN118364112A (en) * 2024-06-19 2024-07-19 杭州嘉识科技有限公司 Data processing method and system based on large model

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3974511B2 (en) * 2002-12-19 2007-09-12 インターナショナル・ビジネス・マシーンズ・コーポレーション Computer system for generating data structure for information retrieval, method therefor, computer-executable program for generating data structure for information retrieval, computer-executable program for generating data structure for information retrieval Stored computer-readable storage medium, information retrieval system, and graphical user interface system
US20130104251A1 (en) * 2005-02-01 2013-04-25 Newsilike Media Group, Inc. Security systems and methods for use with structured and unstructured data
DE112010004087T5 (en) * 2009-12-09 2012-10-18 International Business Machines Corporation A method, computer system and computer program for searching document data using a search term
US10354187B2 (en) * 2013-01-17 2019-07-16 Hewlett Packard Enterprise Development Lp Confidentiality of files using file vectorization and machine learning
US9626528B2 (en) 2014-03-07 2017-04-18 International Business Machines Corporation Data leak prevention enforcement based on learned document classification
US10217147B2 (en) * 2014-09-12 2019-02-26 Ebay Inc. Mapping products between different taxonomies
US9979748B2 (en) * 2015-05-27 2018-05-22 Cisco Technology, Inc. Domain classification and routing using lexical and semantic processing
US10795560B2 (en) * 2016-09-30 2020-10-06 Disney Enterprises, Inc. System and method for detection and visualization of anomalous media events
US10540516B2 (en) 2016-10-13 2020-01-21 Commvault Systems, Inc. Data protection within an unsecured storage environment
CN106897459A (en) * 2016-12-14 2017-06-27 中国电子科技集团公司第三十研究所 A kind of text sensitive information recognition methods based on semi-supervised learning
US20180300315A1 (en) * 2017-04-14 2018-10-18 Novabase Business Solutions, S.A. Systems and methods for document processing using machine learning
US10210244B1 (en) * 2018-02-12 2019-02-19 Asapp, Inc. Updating natural language interfaces by processing usage data
US10783877B2 (en) * 2018-07-24 2020-09-22 Accenture Global Solutions Limited Word clustering and categorization

Also Published As

Publication number Publication date
US20210319179A1 (en) 2021-10-14
WO2019035765A1 (en) 2019-02-21
US12033040B2 (en) 2024-07-09
WO2019035765A9 (en) 2019-03-21

Similar Documents

Publication Publication Date Title
SG11202101452RA (en) Methods, machine learning engines and file management platform systems for content and context aware data classification and security anomaly detection
SG11202106314VA (en) Methods for detecting and interpreting data anomalies, and related systems and devices
SG10201913241PA (en) Computer-implemented method and data processing system for testing device security
EP3190765A4 (en) Sensitive information processing method, device, server and security determination system
EP3329412A4 (en) System and method for in-situ classifier retraining for malware identification and model heterogeneity
EP3553725A4 (en) Business data processing method, verification method, apparatus and system
HK1215766A1 (en) Method and system for verifying identity, method for processing server data and server
SG11201800370VA (en) Method, system, electronic device, and medium for classifying license plates based on deep learning
GB201618161D0 (en) Improved method, system and software for searching, identifying, retrieving and presenting electronic documents
EP3118771A4 (en) Confidential data management method and device, and security authentication method and system
EP2947811A4 (en) Method, server, host and system for protecting data security
EP3198834A4 (en) Method and system for email privacy, security and information theft detection
EP3541006A4 (en) Reuse system, key creating device, data security device, on-vehicle computer, reuse method, and computer program
GB201915196D0 (en) A method and system for network access control based on traffic monitoring and vulnerability detection using process related information
EP3096235A4 (en) Information processing system, information processing server, information processing program, and fatigue evaluation method
EP3702956A4 (en) Gesture detection method, gesture processing device, and computer readable storage medium
EP3435252A4 (en) Optimization method, evaluation method, processing method, and device for data migration
EP3319058A4 (en) Anomaly detection method, anomaly detection program, and information processing device
EP3499793A4 (en) Data provision system, data security device, data provision method, and computer program
EP3142370A4 (en) Method, device and system for processing media resource information
EP3432278A4 (en) Identification device, identification method, identification program, and computer readable medium containing identification program
EP3428877A4 (en) Detection device, information processing device, detection method, detection program, and detection system
EP3432277A4 (en) Identification device, identification method, identification program, and computer readable medium containing identification program
SG11201808251XA (en) Access management method, information processing device, program, and recording medium
LT3563240T (en) Systems and methods for harvesting data associated with fraudulent content in a networked environment