IN2014MU00169A - Google Patents

Info

Publication number
IN2014MU00169A
Authority
IN
India
Prior art keywords
documents
entity
merged
document
bucket
Prior art date
Application number
Other languages
English (en)
Inventor
Puneet Agarwal
Gautam Shroff
Pankaj Malhotra
Original Assignee
Tata Consultancy Services Ltd
Priority date
Filing date
Publication date
Application filed by Tata Consultancy Services Ltd
Priority to IN169MU2014 (IN2014MU00169A)
Priority to EP14186280.5A (EP2897054A3)
Priority to AU2014253497A (AU2014253497B2)
Priority to CA2868540A (CA2868540C)
Priority to MX2014013314A (MX355195B)
Priority to US14/533,866 (US10311093B2)
Priority to BR102014027639-4A (BR102014027639B1)
Publication of IN2014MU00169A

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification
    • G06F16/355: Creation or modification of classes or clusters
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/901: Indexing; Data structures therefor; Storage structures
    • G06F16/9024: Graphs; Linked lists
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/93: Document management systems
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/12: Use of codes for handling textual entities
    • G06F40/131: Fragmentation of text files, e.g. creating reusable text-blocks; Linking to fragments, e.g. using XInclude; Namespaces
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/194: Calculation of difference between files
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295: Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Priority Applications (7)

Application Number | Publication Number | Priority Date | Filing Date | Title
IN169MU2014 | IN2014MU00169A (en) | 2014-01-17 | 2014-01-17 |
EP14186280.5A | EP2897054A3 (en) | 2014-01-17 | 2014-09-24 | Entity resolution from documents
AU2014253497A | AU2014253497B2 (en) | 2014-01-17 | 2014-10-22 | Entity resolution from documents
CA2868540A | CA2868540C (en) | 2014-01-17 | 2014-10-24 | Entity resolution from documents
MX2014013314A | MX355195B (es) | 2014-01-17 | 2014-11-03 | Entity resolution from documents
US14/533,866 | US10311093B2 (en) | 2014-01-17 | 2014-11-05 | Entity resolution from documents
BR102014027639-4A | BR102014027639B1 (pt) | 2014-01-17 | 2014-11-05 | Method for resolving entities from a plurality of documents, and entity resolution system for resolving entities from a plurality of documents

Applications Claiming Priority (1)

Application Number | Publication Number | Priority Date | Filing Date | Title
IN169MU2014 | IN2014MU00169A (en) | 2014-01-17 | 2014-01-17 |

Publications (1)

Publication Number | Publication Date
IN2014MU00169A (en) | 2015-08-28

Family

ID=51625852

Family Applications (1)

Application Number | Publication Number | Priority Date | Filing Date | Title
IN169MU2014 | IN2014MU00169A (en) | 2014-01-17 | 2014-01-17 |

Country Status (7)

Country | Link
US (1) | US10311093B2 (en)
EP (1) | EP2897054A3 (en)
AU (1) | AU2014253497B2 (en)
BR (1) | BR102014027639B1 (en)
CA (1) | CA2868540C (en)
IN (1) | IN2014MU00169A (en)
MX (1) | MX355195B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN109165291B (zh) * | 2018-06-29 | 2021-07-09 | 厦门快商通信息技术有限公司 | Text matching method and electronic device
CN109635114A (zh) * | 2018-12-17 | 2019-04-16 | 北京百度网讯科技有限公司 | Method and apparatus for processing information
FR3104282B1 (fr) * | 2019-12-05 | 2024-01-19 | Codexo | Backup of documents in blocks
US12314666B2 (en) * | 2020-05-01 | 2025-05-27 | Salesforce, Inc. | Stable identification of entity mentions
CN111882165A (zh) * | 2020-07-01 | 2020-11-03 | 国网河北省电力有限公司经济技术研究院 | Device and method for splitting comprehensive project cost analysis data
US12198459B2 (en) * | 2021-11-24 | 2025-01-14 | Adobe Inc. | Systems for generating indications of relationships between electronic documents
EP4573477A1 (en) * | 2022-08-18 | 2025-06-25 | 9197-1168 Québec Inc. | Systems and methods for identifying documents and references
CN119646178B (zh) * | 2024-11-26 | 2025-08-05 | 湖北邮电规划设计有限公司 | Knowledge-graph-based enhanced document generation and retrieval method

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US7213198B1 (en) * | 1999-08-12 | 2007-05-01 | Google Inc. | Link based clustering of hyperlinked documents
US7900052B2 (en) * | 2002-11-06 | 2011-03-01 | International Business Machines Corporation | Confidential data sharing and anonymous entity resolution
US8683312B2 (en) * | 2005-06-16 | 2014-03-25 | Adobe Systems Incorporated | Inter-document links involving embedded documents
US8423425B2 (en) * | 2007-11-14 | 2013-04-16 | Panjiva, Inc. | Evaluating public records of supply transactions for financial investment decisions
US20090204590A1 (en) * | 2008-02-11 | 2009-08-13 | Queplix Corp. | System and method for an integrated enterprise search
US8805861B2 (en) * | 2008-12-09 | 2014-08-12 | Google Inc. | Methods and systems to train models to extract and integrate information from data sources
US20110119268A1 (en) * | 2009-11-13 | 2011-05-19 | Rajaram Shyam Sundar | Method and system for segmenting query urls
CN102906736B (zh) * | 2010-03-12 | 2018-03-23 | 爱立信(中国)通信有限公司 | System and method for matching entities, and synonym group organizer used therein
US9189473B2 (en) * | 2012-05-18 | 2015-11-17 | Xerox Corporation | System and method for resolving entity coreference
US9442929B2 (en) * | 2013-02-12 | 2016-09-13 | Microsoft Technology Licensing, Llc | Determining documents that match a query
US10140664B2 (en) * | 2013-03-14 | 2018-11-27 | Palantir Technologies Inc. | Resolving similar entities from a transaction database

Also Published As

Publication number | Publication date
BR102014027639A8 (pt) | 2021-08-24
US20150205803A1 (en) | 2015-07-23
BR102014027639A2 (pt) | 2016-05-24
BR102014027639B1 (pt) | 2022-05-03
AU2014253497B2 (en) | 2020-05-28
US10311093B2 (en) | 2019-06-04
EP2897054A2 (en) | 2015-07-22
AU2014253497A1 (en) | 2015-08-06
EP2897054A3 (en) | 2015-09-16
CA2868540A1 (en) | 2015-07-17
MX355195B (es) | 2018-04-06
MX2014013314A (es) | 2016-03-15
CA2868540C (en) | 2020-09-22

Similar Documents

Publication | Title
MX2014013314A (es) | Entity resolution from documents
WO2015191746A8 (en) | Systems and methods for a database of software artifacts
MX2014014048A (es) | Document identity resolution
EP2664997A3 (en) | System and method for resolving named entity coreference
PH12015000372A1 (en) | Conversion of documents of different types to a uniform and an editable or a searchable format
GB2550777A (en) | Classification and storage of documents
PH12016000485B1 (en) | Document processing
GB2549875A (en) | Automated content classification/filtering
MX2018009457A (es) | Methods and systems for processing point-cloud data with a line scanner
IN2014MU00919A (en) |
HK1253368A1 (zh) | Hierarchical classifier for image and text data
WO2015181639A3 (en) | Methods and computer-program products for organizing electronic documents
GB2536826A (en) | Matching of an input document to documents in a document collection
MY176481A (en) | Method and apparatus for classifying object based on social networking service, and storage medium
GB2513747A (en) | System and method for detecting malware in documents
IN2013CH06086A (en) |
AU2015364405A8 (en) | Methods for simultaneous source separation
GB2583636A8 (en) | Facilitation of domain and client-specific application program interface recommendations
IN2015CH01303A (en) |
GB2527230A (en) | Processing seismic attributes using mathematical morphology
CA2912019C (en) | Systems and methods for generating issue networks
GB2533243A (en) | Document-based search with facet information
WO2014012863A3 (en) | Method of automatically extracting features from a computer readable file
Gasarch | Classifying problems into complexity classes
GB201300134D0 (en) | Method and apparatus for analyzing a document