CN117677959A - Identifying a classification hierarchy using a trained machine learning pipeline - Google Patents

Identifying a classification hierarchy using a trained machine learning pipeline

Info

Publication number
CN117677959A
Authority
CN
China
Prior art keywords
classification
machine learning
target data
data item
trained
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202280049145.3A
Other languages
English (en)
Chinese (zh)
Inventor
A·波莱里
R·库马尔
M·M·布罗恩
陈国栋
S·阿格拉瓦尔
R·S·布赫海姆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oracle International Corp
Original Assignee
Oracle International Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-06-10
Filing date
2022-06-08
Publication date
2024-03-08
Application filed by Oracle International Corp
Publication of CN117677959A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3335 Syntactic pre-processing, e.g. stopword elimination, stemming
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3347 Query execution using vector based model
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/231 Hierarchical techniques, i.e. dividing or merging pattern sets so as to obtain a dendrogram
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0499 Feedforward networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
CN202280049145.3A 2021-06-10 2022-06-08 Identifying a classification hierarchy using a trained machine learning pipeline Pending CN117677959A (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US17/303,918 US20220398445A1 (en) 2021-06-10 2021-06-10 Identifying a classification hierarchy using a trained machine learning pipeline
US17/303,918 2021-06-10
PCT/US2022/032705 WO2022261233A1 (en) 2021-06-10 2022-06-08 Identifying a classification hierarchy using a trained machine learning pipeline

Publications (1)

Publication Number Publication Date
CN117677959A (zh) 2024-03-08

Family

ID=82482578

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202280049145.3A Pending CN117677959A (zh) 2021-06-10 2022-06-08 Identifying a classification hierarchy using a trained machine learning pipeline

Country Status (5)

Country Link
US (1) US20220398445A1
EP (1) EP4352655A1
JP (1) JP2024528393A
CN (1) CN117677959A
WO (1) WO2022261233A1

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12093864B2 (en) * 2021-05-18 2024-09-17 Ebay Inc. Inventory item prediction and listing recommendation
US20220415524A1 (en) * 2021-06-29 2022-12-29 International Business Machines Corporation Machine learning-based adjustment of epidemiological model projections with flexible prediction horizon
WO2024015964A1 (en) * 2022-07-14 2024-01-18 SucceedSmart, Inc. Systems and methods for candidate database querying
US11841851B1 (en) * 2022-07-24 2023-12-12 SAS, Inc. Systems, methods, and graphical user interfaces for taxonomy-based classification of unlabeled structured datasets
US12056214B1 (en) * 2022-09-29 2024-08-06 Amazon Technologies, Inc. Systems for automatically correcting categories of items
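CN115859128B (zh) * 2023-02-23 2023-05-09 成都瑞安信信息安全技术有限公司 Analysis method and system based on interaction similarity of archive data
CN117271674A (zh) * 2023-08-28 2023-12-22 杭州数梦工场科技有限公司 Field type identification method and apparatus, electronic device, and storage medium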
WO2026044241A1 (en) * 2024-08-23 2026-02-26 D. E. Shaw Research, Llc Modeling molecules with transformer machine-learned models

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9483768B2 (en) * 2014-08-11 2016-11-01 24/7 Customer, Inc. Methods and apparatuses for modeling customer interaction experiences
US10282368B2 (en) * 2016-07-29 2019-05-07 Symantec Corporation Grouped categorization of internet content
US9928448B1 (en) * 2016-09-23 2018-03-27 International Business Machines Corporation Image classification utilizing semantic relationships in a classification hierarchy
US10679330B2 (en) * 2018-01-15 2020-06-09 Tata Consultancy Services Limited Systems and methods for automated inferencing of changes in spatio-temporal images
US20190354850A1 (en) * 2018-05-17 2019-11-21 International Business Machines Corporation Identifying transfer models for machine learning tasks
US20190294999A1 (en) * 2018-06-16 2019-09-26 Moshe Guttmann Selecting hyper parameters for machine learning algorithms based on past training results
US11693910B2 (en) * 2018-12-13 2023-07-04 Microsoft Technology Licensing, Llc Personalized search result rankings
US10937417B2 (en) * 2019-05-31 2021-03-02 Clinc, Inc. Systems and methods for automatically categorizing unstructured data and improving a machine learning-based dialogue system
US11494559B2 (en) * 2019-11-27 2022-11-08 Oracle International Corporation Hybrid in-domain and out-of-domain document processing for non-vocabulary tokens of electronic documents
US11481448B2 (en) * 2020-03-31 2022-10-25 Microsoft Technology Licensing, Llc Semantic matching and retrieval of standardized entities
US11687812B2 (en) * 2020-08-18 2023-06-27 Accenture Global Solutions Limited Autoclassification of products using artificial intelligence

Also Published As

Publication number Publication date
EP4352655A1 (en) 2024-04-17
US20220398445A1 (en) 2022-12-15
WO2022261233A1 (en) 2022-12-15
JP2024528393A (ja) 2024-07-30

Similar Documents

Publication Publication Date Title
CN117677959A (zh) Identifying a classification hierarchy using a trained machine learning pipeline
US12399905B2 (en) Context-sensitive linking of entities to private databases
US12086548B2 (en) Event extraction from documents with co-reference
US11494559B2 (en) Hybrid in-domain and out-of-domain document processing for non-vocabulary tokens of electronic documents
US12223448B2 (en) Issue tracking system using a similarity score to suggest and create duplicate issue requests across multiple projects
US11507747B2 (en) Hybrid in-domain and out-of-domain document processing for non-vocabulary tokens of electronic documents
US10963686B2 (en) Semantic normalization in document digitization
US11573995B2 (en) Analyzing the tone of textual data
US11481554B2 (en) Systems and methods for training and evaluating machine learning models using generalized vocabulary tokens for document processing
US11567948B2 (en) Autonomous suggestion of related issues in an issue tracking system
US20250322312A1 (en) Automated Data Hierarchy Extraction And Prediction Using A Machine Learning Model
US20220100967A1 (en) Lifecycle management for customized natural language processing
US12346741B2 (en) Computing environment provisioning
US20260080014A1 (en) Large Language Machine Learning Model Query Management
CN111886596B (zh) Machine translation locking using sequence-based lock/unlock classification
US20240242018A1 (en) Machine learning based prediction of document metadata
WO2022072237A1 (en) Lifecycle management for customized natural language processing
US20240330375A1 (en) Comparison of names
US20240242108A1 (en) Training of machine learning models for predicting document metadata
US12417239B2 (en) System, apparatus, and method for structuring documentary data for improved topic extraction and modeling
CN119452364A (zh) Guided augmentation of data sets for machine learning models
AU2022204724B1 (en) Supervised machine learning method for matching unsupervised data
US20250173369A1 (en) Triggering execution of machine learning based prediction of document metadata

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination