CN103782309B - Automatic data cleaning for machine learning classifiers - Google Patents
Automatic data cleaning for machine learning classifiers
- Publication number
- CN103782309B (application number CN201280019651.4A)
- Authority
- CN
- China
- Prior art keywords
- document
- group
- documentation
- sets
- unvds
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/338—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/355—Class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2155—Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Image Analysis (AREA)
Claims (26)
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201161445236P | 2011-02-22 | 2011-02-22 | |
US61/445236 | 2011-02-22 | ||
US13/046266 | 2011-03-11 | ||
US13/046,266 US8626682B2 (en) | 2011-02-22 | 2011-03-11 | Automatic data cleaning for machine learning classifiers |
PCT/US2012/025930 WO2012115958A2 (en) | 2011-02-22 | 2012-02-21 | Automatic data cleaning for machine learning classifiers |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103782309A (zh) | 2014-05-07 |
CN103782309B (zh) | 2017-06-16 |
Family
ID=46653595
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201280019651.4A Active CN103782309B (zh) | 2011-02-22 | 2012-02-21 | Automatic data cleaning for machine learning classifiers |
CN201280019647.8A Active CN104025130B (zh) | 2011-02-22 | 2012-02-21 | Method, system and apparatus for computing significance between entities |
Family Applications After (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201280019647.8A Active CN104025130B (zh) | 2011-02-22 | 2012-02-21 | Method, system and apparatus for computing significance between entities |
Country Status (4)
Country | Link |
---|---|
US (3) | US8626682B2 (zh) |
EP (2) | EP2678806A2 (zh) |
CN (2) | CN103782309B (zh) |
WO (2) | WO2012115958A2 (zh) |
Families Citing this family (87)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10999298B2 (en) | 2004-03-02 | 2021-05-04 | The 41St Parameter, Inc. | Method and system for identifying users and detecting fraud by use of the internet |
US11301585B2 (en) | 2005-12-16 | 2022-04-12 | The 41St Parameter, Inc. | Methods and apparatus for securely displaying digital images |
US8151327B2 (en) | 2006-03-31 | 2012-04-03 | The 41St Parameter, Inc. | Systems and methods for detection of session tampering and fraud prevention |
US9112850B1 (en) | 2009-03-25 | 2015-08-18 | The 41St Parameter, Inc. | Systems and methods of sharing information through a tag-based consortium |
US8682814B2 (en) | 2010-12-14 | 2014-03-25 | Symantec Corporation | User interface and workflow for performing machine learning |
US9015082B1 (en) * | 2010-12-14 | 2015-04-21 | Symantec Corporation | Data quality assessment for vector machine learning |
US9094291B1 (en) | 2010-12-14 | 2015-07-28 | Symantec Corporation | Partial risk score calculation for a data object |
US8626682B2 (en) * | 2011-02-22 | 2014-01-07 | Thomson Reuters Global Resources | Automatic data cleaning for machine learning classifiers |
US10754913B2 (en) * | 2011-11-15 | 2020-08-25 | Tapad, Inc. | System and method for analyzing user device information |
US8856130B2 (en) * | 2012-02-09 | 2014-10-07 | Kenshoo Ltd. | System, a method and a computer program product for performance assessment |
US9633201B1 (en) | 2012-03-01 | 2017-04-25 | The 41St Parameter, Inc. | Methods and systems for fraud containment |
US9521551B2 (en) | 2012-03-22 | 2016-12-13 | The 41St Parameter, Inc. | Methods and systems for persistent cross-application mobile device identification |
US9116982B1 (en) * | 2012-04-27 | 2015-08-25 | Google Inc. | Identifying interesting commonalities between entities |
WO2014022813A1 (en) | 2012-08-02 | 2014-02-06 | The 41St Parameter, Inc. | Systems and methods for accessing records via derivative locators |
US11126720B2 (en) * | 2012-09-26 | 2021-09-21 | Bluvector, Inc. | System and method for automated machine-learning, zero-day malware detection |
WO2014078569A1 (en) | 2012-11-14 | 2014-05-22 | The 41St Parameter, Inc. | Systems and methods of global identification |
US9146980B1 (en) * | 2013-06-24 | 2015-09-29 | Google Inc. | Temporal content selection |
US10902327B1 (en) | 2013-08-30 | 2021-01-26 | The 41St Parameter, Inc. | System and method for device identification and uniqueness |
US20150088798A1 (en) * | 2013-09-23 | 2015-03-26 | Mastercard International Incorporated | Detecting behavioral patterns and anomalies using metadata |
US11094015B2 (en) | 2014-07-11 | 2021-08-17 | BMLL Technologies, Ltd. | Data access and processing system |
US10091312B1 (en) | 2014-10-14 | 2018-10-02 | The 41St Parameter, Inc. | Data structures for intelligently resolving deterministic and probabilistic device identifiers to device profiles and/or groups |
US10649740B2 (en) * | 2015-01-15 | 2020-05-12 | International Business Machines Corporation | Predicting and using utility of script execution in functional web crawling and other crawling |
WO2016128491A1 (en) | 2015-02-11 | 2016-08-18 | British Telecommunications Public Limited Company | Validating computer resource usage |
CN104615442A (zh) * | 2015-02-13 | 2015-05-13 | 广东欧珀移动通信有限公司 | Method and device for updating a control usage statistics table, and method and device for software adjustment |
WO2017021153A1 (en) | 2015-07-31 | 2017-02-09 | British Telecommunications Public Limited Company | Expendable access control |
WO2017021155A1 (en) | 2015-07-31 | 2017-02-09 | British Telecommunications Public Limited Company | Controlled resource provisioning in distributed computing environments |
US11347876B2 (en) | 2015-07-31 | 2022-05-31 | British Telecommunications Public Limited Company | Access control |
WO2017032427A1 (en) | 2015-08-27 | 2017-03-02 | Longsand Limited | Identifying augmented features based on a bayesian analysis of a text document |
GB201517462D0 (en) * | 2015-10-02 | 2015-11-18 | Tractable Ltd | Semi-automatic labelling of datasets |
US10062084B2 (en) * | 2015-10-21 | 2018-08-28 | International Business Machines Corporation | Using ontological distance to measure unexpectedness of correlation |
US11200466B2 (en) * | 2015-10-28 | 2021-12-14 | Hewlett-Packard Development Company, L.P. | Machine learning classifiers |
US20170206904A1 (en) * | 2016-01-19 | 2017-07-20 | Knuedge Incorporated | Classifying signals using feature trajectories |
US10878341B2 (en) * | 2016-03-18 | 2020-12-29 | Fair Isaac Corporation | Mining and visualizing associations of concepts on a large-scale unstructured data |
EP3437007B1 (en) | 2016-03-30 | 2021-04-28 | British Telecommunications public limited company | Cryptocurrencies malware based detection |
US11023248B2 (en) | 2016-03-30 | 2021-06-01 | British Telecommunications Public Limited Company | Assured application services |
US11194901B2 (en) | 2016-03-30 | 2021-12-07 | British Telecommunications Public Limited Company | Detecting computer security threats using communication characteristics of communication protocols |
US11159549B2 (en) | 2016-03-30 | 2021-10-26 | British Telecommunications Public Limited Company | Network traffic threat identification |
US11153091B2 (en) | 2016-03-30 | 2021-10-19 | British Telecommunications Public Limited Company | Untrusted code distribution |
CA3008462A1 (en) * | 2016-04-05 | 2017-10-12 | Thomson Reuters Global Resources Unlimited Company | Self-service classification system |
US20170364804A1 (en) * | 2016-06-15 | 2017-12-21 | International Business Machines Corporation | Answer Scoring Based on a Combination of Specificity and Informativity Metrics |
US20170364519A1 (en) * | 2016-06-15 | 2017-12-21 | International Business Machines Corporation | Automated Answer Scoring Based on Combination of Informativity and Specificity Metrics |
US10657482B2 (en) | 2016-06-16 | 2020-05-19 | Adp, Llc | Dynamic organization structure model |
US10606849B2 (en) * | 2016-08-31 | 2020-03-31 | International Business Machines Corporation | Techniques for assigning confidence scores to relationship entries in a knowledge graph |
US10607142B2 (en) * | 2016-08-31 | 2020-03-31 | International Business Machines Corporation | Responding to user input based on confidence scores assigned to relationship entries in a knowledge graph |
CN108121737B (zh) * | 2016-11-29 | 2022-04-26 | 阿里巴巴集团控股有限公司 | Method, device and system for generating business object attribute identifiers |
WO2018107128A1 (en) * | 2016-12-09 | 2018-06-14 | U2 Science Labs, Inc. | Systems and methods for automating data science machine learning analytical workflows |
US11003716B2 (en) | 2017-01-10 | 2021-05-11 | International Business Machines Corporation | Discovery, characterization, and analysis of interpersonal relationships extracted from unstructured text data |
EP3382591B1 (en) | 2017-03-30 | 2020-03-25 | British Telecommunications public limited company | Hierarchical temporal memory for expendable access control |
EP3602380B1 (en) | 2017-03-30 | 2022-02-23 | British Telecommunications public limited company | Hierarchical temporal memory for access control |
EP3602369B1 (en) | 2017-03-30 | 2022-03-30 | British Telecommunications public limited company | Anomaly detection for computer systems |
WO2018206374A1 (en) * | 2017-05-08 | 2018-11-15 | British Telecommunications Public Limited Company | Load balancing of machine learning algorithms |
US11823017B2 (en) | 2017-05-08 | 2023-11-21 | British Telecommunications Public Limited Company | Interoperation of machine learning algorithms |
WO2018206406A1 (en) * | 2017-05-08 | 2018-11-15 | British Telecommunications Public Limited Company | Adaptation of machine learning algorithms |
EP3622450A1 (en) | 2017-05-08 | 2020-03-18 | British Telecommunications Public Limited Company | Management of interoperating machine leaning algorithms |
EP3622449A1 (en) * | 2017-05-08 | 2020-03-18 | British Telecommunications Public Limited Company | Autonomous logic modules |
US10489722B2 (en) * | 2017-07-27 | 2019-11-26 | Disney Enterprises, Inc. | Semiautomatic machine learning model improvement and benchmarking |
US10929383B2 (en) * | 2017-08-11 | 2021-02-23 | International Business Machines Corporation | Method and system for improving training data understanding in natural language processing |
US10585933B2 (en) | 2017-08-16 | 2020-03-10 | International Business Machines Corporation | System and method for classification of low relevance records in a database using instance-based classifiers and machine learning |
WO2019055553A1 (en) * | 2017-09-12 | 2019-03-21 | Schlumberger Technology Corporation | DYNAMIC REPRESENTATION OF RELATIONSHIPS OF EXPLORATION AND / OR PRODUCTION ENTITIES |
US11574287B2 (en) | 2017-10-10 | 2023-02-07 | Text IQ, Inc. | Automatic document classification |
US10162850B1 (en) | 2018-04-10 | 2018-12-25 | Icertis, Inc. | Clause discovery for validation of documents |
EP3811323A4 (en) | 2018-06-19 | 2022-03-09 | Thomson Reuters Enterprise Centre GmbH | SYSTEMS AND METHODS FOR DETERMINING STRUCTURED PROCESS OUTCOMES |
WO2020005986A1 (en) * | 2018-06-25 | 2020-01-02 | Diffeo, Inc. | Systems and method for investigating relationships among entities |
US11144581B2 (en) * | 2018-07-26 | 2021-10-12 | International Business Machines Corporation | Verifying and correcting training data for text classification |
US11120367B2 (en) * | 2018-07-30 | 2021-09-14 | International Business Machines Corporation | Validating training data of classifiers |
CN109635029B (zh) * | 2018-12-07 | 2023-10-13 | 深圳前海微众银行股份有限公司 | Data processing method, apparatus, device and medium based on a label indicator system |
US10936974B2 (en) | 2018-12-24 | 2021-03-02 | Icertis, Inc. | Automated training and selection of models for document analysis |
JP7261022B2 (ja) * | 2019-01-30 | 2023-04-19 | キヤノン株式会社 | Information processing system, terminal device and control method therefor, program, and storage medium |
US10726374B1 (en) | 2019-02-19 | 2020-07-28 | Icertis, Inc. | Risk prediction based on automated analysis of documents |
JP7148444B2 (ja) * | 2019-03-19 | 2022-10-05 | 株式会社日立製作所 | Sentence classification device, sentence classification method, and sentence classification program |
US11270078B2 (en) | 2019-05-18 | 2022-03-08 | Exxonmobil Upstream Research Company | Method and system for generating a surprisingness score for sentences within geoscience text |
US11157777B2 (en) | 2019-07-15 | 2021-10-26 | Disney Enterprises, Inc. | Quality control systems and methods for annotated content |
CN110674840B (zh) * | 2019-08-22 | 2022-03-25 | 中国司法大数据研究院有限公司 | Method for constructing a multi-party evidence association model, and method and device for extracting evidence chains |
US11010606B1 (en) | 2019-11-15 | 2021-05-18 | Maxar Intelligence Inc. | Cloud detection from satellite imagery |
US11386649B2 (en) | 2019-11-15 | 2022-07-12 | Maxar Intelligence Inc. | Automated concrete/asphalt detection based on sensor time delay |
US11250260B2 (en) | 2019-11-15 | 2022-02-15 | Maxar Intelligence Inc. | Automated process for dynamic material classification in remotely sensed imagery |
US11556825B2 (en) * | 2019-11-26 | 2023-01-17 | International Business Machines Corporation | Data label verification using few-shot learners |
US11645579B2 (en) | 2019-12-20 | 2023-05-09 | Disney Enterprises, Inc. | Automated machine learning tagging and optimization of review procedures |
US11086891B2 (en) * | 2020-01-08 | 2021-08-10 | Subtree Inc. | Systems and methods for tracking and representing data science data runs |
US20230162049A1 (en) * | 2020-04-03 | 2023-05-25 | Presagen Pty Ltd | Artificial intelligence (ai) method for cleaning data for training ai models |
US12093245B2 (en) | 2020-04-17 | 2024-09-17 | International Business Machines Corporation | Temporal directed cycle detection and pruning in transaction graphs |
CN113762519B (zh) * | 2020-06-03 | 2024-06-28 | 杭州海康威视数字技术股份有限公司 | Data cleaning method, apparatus and device |
US11288115B1 (en) | 2020-11-05 | 2022-03-29 | International Business Machines Corporation | Error analysis of a predictive model |
US11568319B2 (en) * | 2020-12-30 | 2023-01-31 | Hyland Uk Operations Limited | Techniques for dynamic machine learning integration |
CN112463915B (zh) * | 2021-02-02 | 2021-06-25 | 冠传网络科技(南京)有限公司 | Method, system and storage medium for mining social media comments on beauty products |
US11941020B2 (en) * | 2021-02-26 | 2024-03-26 | Micro Focus Llc | Displaying query results using machine learning model-determined query results visualizations |
US11361034B1 (en) | 2021-11-30 | 2022-06-14 | Icertis, Inc. | Representing documents using document keys |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101127029A (zh) * | 2007-08-24 | 2008-02-20 | 复旦大学 | Method for training an SVM classifier in large-scale data classification problems |
Family Cites Families (39)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5600831A (en) * | 1994-02-28 | 1997-02-04 | Lucent Technologies Inc. | Apparatus and methods for retrieving information by modifying query plan based on description of information sources |
US6862710B1 (en) * | 1999-03-23 | 2005-03-01 | Insightful Corporation | Internet navigation using soft hyperlinks |
EP1170676A1 (de) * | 2000-07-05 | 2002-01-09 | Abb Research Ltd. | Representation of an information structure of documents of the World Wide Web |
US20070192863A1 (en) * | 2005-07-01 | 2007-08-16 | Harsh Kapoor | Systems and methods for processing data flows |
US7043661B2 (en) * | 2000-10-19 | 2006-05-09 | Tti-Team Telecom International Ltd. | Topology-based reasoning apparatus for root-cause analysis of network faults |
US6693651B2 (en) * | 2001-02-07 | 2004-02-17 | International Business Machines Corporation | Customer self service iconic interface for resource search results display and selection |
US20030046297A1 (en) | 2001-08-30 | 2003-03-06 | Kana Software, Inc. | System and method for a partially self-training learning system |
US7188117B2 (en) * | 2002-05-17 | 2007-03-06 | Xerox Corporation | Systems and methods for authoritativeness grading, estimation and sorting of documents in large heterogeneous document collections |
US6990485B2 (en) | 2002-08-02 | 2006-01-24 | Hewlett-Packard Development Company, L.P. | System and method for inducing a top-down hierarchical categorizer |
US6886010B2 (en) * | 2002-09-30 | 2005-04-26 | The United States Of America As Represented By The Secretary Of The Navy | Method for data and text mining and literature-based discovery |
US7451152B2 (en) * | 2004-07-29 | 2008-11-11 | Yahoo! Inc. | Systems and methods for contextual transaction proposals |
EP1776668A4 (en) * | 2004-08-12 | 2009-05-06 | Jigsaw Data Corp | CONTACT INFORMATION MARKET |
US20060117252A1 (en) * | 2004-11-29 | 2006-06-01 | Joseph Du | Systems and methods for document analysis |
JP4640591B2 (ja) | 2005-06-09 | 2011-03-02 | 富士ゼロックス株式会社 | Document retrieval device |
US20070067320A1 (en) * | 2005-09-20 | 2007-03-22 | International Business Machines Corporation | Detecting relationships in unstructured text |
TWI468969B (zh) * | 2005-10-18 | 2015-01-11 | Intertrust Tech Corp | Method of authorizing access to electronic content and method of authorizing an action performed on that electronic content |
US8903810B2 (en) * | 2005-12-05 | 2014-12-02 | Collarity, Inc. | Techniques for ranking search results |
US7739279B2 (en) * | 2005-12-12 | 2010-06-15 | Fuji Xerox Co., Ltd. | Systems and methods for determining relevant information based on document structure |
US7716217B2 (en) * | 2006-01-13 | 2010-05-11 | Bluespace Software Corporation | Determining relevance of electronic content |
EP1903479B1 (en) | 2006-08-25 | 2014-03-12 | Motorola Mobility LLC | Method and system for data classification using a self-organizing map |
US20080109454A1 (en) * | 2006-11-03 | 2008-05-08 | Willse Alan R | Text analysis techniques |
US20080195567A1 (en) * | 2007-02-13 | 2008-08-14 | International Business Machines Corporation | Information mining using domain specific conceptual structures |
AU2008225256B2 (en) * | 2007-03-12 | 2009-07-30 | Vortex Technology Services Limited | Intentionality matching |
WO2008124536A1 (en) | 2007-04-04 | 2008-10-16 | Seeqpod, Inc. | Discovering and scoring relationships extracted from human generated lists |
WO2009019830A1 (ja) * | 2007-08-03 | 2009-02-12 | Panasonic Corporation | Related word presentation device |
JP5232449B2 (ja) * | 2007-11-21 | 2013-07-10 | Kddi株式会社 | Information retrieval device and computer program |
US8856182B2 (en) * | 2008-01-25 | 2014-10-07 | Avaya Inc. | Report database dependency tracing through business intelligence metadata |
US8082278B2 (en) * | 2008-06-13 | 2011-12-20 | Microsoft Corporation | Generating query suggestions from semantic relationships in content |
US8271422B2 (en) * | 2008-11-29 | 2012-09-18 | At&T Intellectual Property I, Lp | Systems and methods for detecting and coordinating changes in lexical items |
CN101770580B (zh) * | 2009-01-04 | 2014-03-12 | 中国科学院计算技术研究所 | Training method and classification method for a cross-domain text sentiment classifier |
US8166032B2 (en) * | 2009-04-09 | 2012-04-24 | MarketChorus, Inc. | System and method for sentiment-based text classification and relevancy ranking |
US8375032B2 (en) | 2009-06-25 | 2013-02-12 | University Of Tennessee Research Foundation | Method and apparatus for predicting object properties and events using similarity-based information retrieval and modeling |
JP2011013732A (ja) * | 2009-06-30 | 2011-01-20 | Sony Corp | Information processing apparatus, information processing method, and program |
US20110106807A1 (en) * | 2009-10-30 | 2011-05-05 | Janya, Inc | Systems and methods for information integration through context-based entity disambiguation |
US8762375B2 (en) * | 2010-04-15 | 2014-06-24 | Palo Alto Research Center Incorporated | Method for calculating entity similarities |
US8346776B2 (en) | 2010-05-17 | 2013-01-01 | International Business Machines Corporation | Generating a taxonomy for documents from tag data |
US9043360B2 (en) * | 2010-12-17 | 2015-05-26 | Yahoo! Inc. | Display entity relationship |
US8626682B2 (en) * | 2011-02-22 | 2014-01-07 | Thomson Reuters Global Resources | Automatic data cleaning for machine learning classifiers |
US9721039B2 (en) * | 2011-12-16 | 2017-08-01 | Palo Alto Research Center Incorporated | Generating a relationship visualization for nonhomogeneous entities |
-
2011
- 2011-03-11 US US13/046,266 patent/US8626682B2/en active Active
- 2011-05-13 US US13/107,665 patent/US9495635B2/en active Active
-
2012
- 2012-02-21 CN CN201280019651.4A patent/CN103782309B/zh active Active
- 2012-02-21 CN CN201280019647.8A patent/CN104025130B/zh active Active
- 2012-02-21 EP EP12708205.5A patent/EP2678806A2/en not_active Ceased
- 2012-02-21 WO PCT/US2012/025930 patent/WO2012115958A2/en active Application Filing
- 2012-02-21 WO PCT/US2012/025937 patent/WO2012115962A1/en active Application Filing
- 2012-02-21 EP EP12707689.1A patent/EP2678808A1/en not_active Withdrawn
-
2016
- 2016-11-14 US US15/351,256 patent/US10650049B2/en active Active
Non-Patent Citations (2)
Title |
---|
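Semi-supervised text classification method constrained by associated words; Han Hongqi et al.; Computer Engineering and Applications; 20101231; Vol. 46, No. 4; pp. 113-116 * |
Semi-supervised text classification: two-stage co-learning; Hao Xiulan et al.; Journal of Chinese Computer Systems; 20091031; Vol. 30, No. 10; pp. 1921-1926 * |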
Also Published As
Publication number | Publication date |
---|---|
US10650049B2 (en) | 2020-05-12 |
WO2012115958A3 (en) | 2012-10-18 |
US8626682B2 (en) | 2014-01-07 |
EP2678806A2 (en) | 2014-01-01 |
US20170220674A1 (en) | 2017-08-03 |
EP2678808A1 (en) | 2014-01-01 |
CN103782309A (zh) | 2014-05-07 |
CN104025130A (zh) | 2014-09-03 |
US9495635B2 (en) | 2016-11-15 |
US20120215777A1 (en) | 2012-08-23 |
US20120215727A1 (en) | 2012-08-23 |
CN104025130B (zh) | 2018-07-20 |
WO2012115958A2 (en) | 2012-08-30 |
WO2012115962A1 (en) | 2012-08-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103782309B (zh) | Automatic data cleaning for machine learning classifiers | |
JP7169369B2 (ja) | Method and system for generating data for machine learning algorithms | |
US10824959B1 (en) | Explainers for machine learning classifiers | |
Maleki et al. | A comprehensive literature review of the rank reversal phenomenon in the analytic hierarchy process | |
Shasha et al. | Unordered tree mining with applications to phylogeny | |
Bergmann et al. | Approximation of dispatching rules for manufacturing simulation using data mining methods | |
Basgalupp et al. | Predicting software maintenance effort through evolutionary-based decision trees | |
CN107748783A (zh) | Multi-label company description text classification method based on sentence vectors | |
Yousefnezhad et al. | A new selection strategy for selective cluster ensemble based on diversity and independency | |
Rasiman et al. | How effective is automated trace link recovery in model-driven development? | |
da Costa et al. | Clustering interval data through kernel-induced feature space | |
Tayal et al. | A new MapReduce solution for associative classification to handle scalability and skewness in vertical data structure | |
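JP5110950B2 (ja) | Multi-topic classification device, multi-topic classification method, and multi-topic classification program | |
JP2021179859A (ja) | Learning model generation system and learning model generation method | |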
Van Oirschot et al. | Using trace clustering for configurable process discovery explained by event log data | |
Guo et al. | A latent topic model for linked documents | |
JP2020077236A (ja) | Search program, search method, and search device | |
Chaturvedi | Data mining and it's application in EDM domain | |
Rawat et al. | Analyzing the performance of various clustering algorithms | |
Riesen | Graph edit distance | |
Czibula et al. | A Lagrangian relaxation-based heuristic to solve large extended graph partitioning problems | |
Rigutini et al. | A neural network approach for learning object ranking | |
JP6631139B2 (ja) | Search control program, search control method, and search server device | |
Filippidou et al. | Online partitioning of multi-labeled graphs | |
Irfan et al. | Evolving the taxonomy based on hierarchical clustering approach |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP01 | Change in the name or title of a patent holder | Address after: Switzerland; Patentee after: Thomson Reuters Global Resources Unlimited Company; Address before: Switzerland; Patentee before: Thomson Reuters Global Resources |
TR01 | Transfer of patent right | Effective date of registration: 20190819; Address after: England; Patentee after: Finance and Risk Organizations Limited; Address before: Switzerland; Patentee before: Thomson Reuters Global Resources Unlimited Company |