BR102014027639B1 - Method for resolving the entities of a plurality of documents, and entity resolution system for the entity resolution of a plurality of documents


Publication number
BR102014027639B1
Authority
BR
Brazil
Prior art keywords
documents
entity
document
textual
referential
Prior art date
Application number
BR102014027639-4A
Other languages
English (en)
Portuguese (pt)
Other versions
BR102014027639A8 (pt)
BR102014027639A2 (pt)
Inventor
Puneet Agarwal
Gautam Shroff
Pankaj Malhotra
Original Assignee
Tata Consultancy Services Limited
Priority date
2014-01-17
Filing date
2014-11-05
Publication date
2022-05-03
Application filed by Tata Consultancy Services Limited
Publication of BR102014027639A2
Publication of BR102014027639A8
Publication of BR102014027639B1

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/194: Calculation of difference between files
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/12: Use of codes for handling textual entities
    • G06F40/131: Fragmentation of text files, e.g. creating reusable text-blocks; Linking to fragments, e.g. using XInclude; Namespaces
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295: Named entity recognition
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification
    • G06F16/355: Creation or modification of classes or clusters
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/901: Indexing; Data structures therefor; Storage structures
    • G06F16/9024: Graphs; Linked lists
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/93: Document management systems

Landscapes

  • Engineering & Computer Science
  • Theoretical Computer Science
  • Physics & Mathematics
  • General Engineering & Computer Science
  • General Physics & Mathematics
  • Databases & Information Systems
  • Computational Linguistics
  • Health & Medical Sciences
  • Artificial Intelligence
  • Audiology, Speech & Language Pathology
  • General Health & Medical Sciences
  • Data Mining & Analysis
  • Business, Economics & Management
  • General Business, Economics & Management
  • Software Systems
  • Information Retrieval, Db Structures And Fs Structures Therefor
  • Image Analysis
BR102014027639-4A 2014-01-17 2014-11-05 Method for resolving the entities of a plurality of documents, and entity resolution system for the entity resolution of a plurality of documents BR102014027639B1 (pt)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN169/MUM/2014 2014-01-17
IN169MU2014 IN2014MU00169A 2014-01-17 2014-01-17

Publications (3)

Publication Number Publication Date
BR102014027639A2 (pt) 2016-05-24
BR102014027639A8 (pt) 2021-08-24
BR102014027639B1 (pt) 2022-05-03

Family

ID=51625852

Family Applications (1)

Application Number Title Priority Date Filing Date
BR102014027639-4A BR102014027639B1 (pt) 2014-01-17 2014-11-05 Method for resolving the entities of a plurality of documents, and entity resolution system for the entity resolution of a plurality of documents

Country Status (7)

Country Link
US (1) US10311093B2
EP (1) EP2897054A3
AU (1) AU2014253497B2
BR (1) BR102014027639B1
CA (1) CA2868540C
IN (1) IN2014MU00169A
MX (1) MX355195B

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109165291B (zh) * 2018-06-29 2021-07-09 厦门快商通信息技术有限公司 Text matching method and electronic device
CN109635114A (zh) * 2018-12-17 2019-04-16 北京百度网讯科技有限公司 Method and apparatus for processing information
FR3104282B1 (fr) * 2019-12-05 2024-01-19 Codexo Backup of documents in blocks
US12314666B2 (en) * 2020-05-01 2025-05-27 Salesforce, Inc. Stable identification of entity mentions
CN111882165A (zh) * 2020-07-01 2020-11-03 国网河北省电力有限公司经济技术研究院 Device and method for splitting comprehensive project cost analysis data
US12198459B2 (en) * 2021-11-24 2025-01-14 Adobe Inc. Systems for generating indications of relationships between electronic documents
CA3264743A1 (en) * 2022-08-18 2024-02-22 9197-1168 Québec Inc. Systems and methods for identifying documents and references
CN119646178B (zh) * 2024-11-26 2025-08-05 湖北邮电规划设计有限公司 Knowledge-graph-based enhanced document generation and retrieval method

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7213198B1 (en) * 1999-08-12 2007-05-01 Google Inc. Link based clustering of hyperlinked documents
CN1757188A (zh) * 2002-11-06 2006-04-05 国际商业机器公司 Confidential data sharing and anonymous entity resolution
US8683312B2 (en) * 2005-06-16 2014-03-25 Adobe Systems Incorporated Inter-document links involving embedded documents
WO2011085360A1 (en) * 2010-01-11 2011-07-14 Panjiva, Inc. Evaluating public records of supply transactions for financial investment decisions
US20090204590A1 (en) * 2008-02-11 2009-08-13 Queplix Corp. System and method for an integrated enterprise search
US8805861B2 (en) * 2008-12-09 2014-08-12 Google Inc. Methods and systems to train models to extract and integrate information from data sources
US20110119268A1 (en) * 2009-11-13 2011-05-19 Rajaram Shyam Sundar Method and system for segmenting query urls
US8949227B2 (en) * 2010-03-12 2015-02-03 Telefonaktiebolaget L M Ericsson (Publ) System and method for matching entities and synonym group organizer used therein
US9189473B2 (en) * 2012-05-18 2015-11-17 Xerox Corporation System and method for resolving entity coreference
US9442929B2 (en) * 2013-02-12 2016-09-13 Microsoft Technology Licensing, Llc Determining documents that match a query
US10140664B2 (en) * 2013-03-14 2018-11-27 Palantir Technologies Inc. Resolving similar entities from a transaction database

Also Published As

Publication number Publication date
CA2868540A1 (en) 2015-07-17
MX355195B (es) 2018-04-06
EP2897054A3 (en) 2015-09-16
IN2014MU00169A 2015-08-28
AU2014253497B2 (en) 2020-05-28
MX2014013314A (es) 2016-03-15
US20150205803A1 (en) 2015-07-23
BR102014027639A8 (pt) 2021-08-24
BR102014027639A2 (pt) 2016-05-24
US10311093B2 (en) 2019-06-04
CA2868540C (en) 2020-09-22
AU2014253497A1 (en) 2015-08-06
EP2897054A2 (en) 2015-07-22

Similar Documents

Publication Publication Date Title
BR102014027639B1 (pt) Method for resolving the entities of a plurality of documents, and entity resolution system for the entity resolution of a plurality of documents
US11182356B2 (en) Indexing for evolving large-scale datasets in multi-master hybrid transactional and analytical processing systems
JP6669892B2 (ja) Versioned hierarchical data structures in a distributed data store
US9576071B2 (en) Graph-based data models for partitioned data
US8527556B2 (en) Systems and methods to update a content store associated with a search index
EP3173965B1 (en) System and method for enablement of data masking for web documents
US8732127B1 (en) Method and system for managing versioned structured documents in a database
CN109690522B (zh) Data update method and apparatus based on a B+ tree index, and storage device
CN110569328B (zh) Entity linking method, electronic apparatus, and computer device
BR102014028893B1 (pt) Method for resolving entities of a plurality of documents; and entity resolution system for entity resolution of a plurality of documents
US9020916B2 (en) Database server apparatus, method for updating database, and recording medium for database update program
WO2023179787A1 (zh) Metadata management method and apparatus for a distributed file system
CN108292310A (zh) Techniques for digital entity correlation
US11514188B1 (en) System and method for serving subject access requests
US20180144061A1 (en) Edge store designs for graph databases
US8527480B1 (en) Method and system for managing versioned structured documents in a database
US20120221538A1 (en) Optimistic, version number based concurrency control for index structures with atomic, non-versioned pointer updates
CN103973810A (zh) Data processing method and apparatus based on an Internet Protocol (IP) disk
US11500837B1 (en) Automating optimizations for items in a hierarchical data store
US20230252012A1 (en) Method for indexing data
CN107077511A (zh) Apparatus and method for creating user-defined variable-size tags on records in an RDBMS
US10235422B2 (en) Lock-free parallel dictionary encoding
CN110020272A (zh) Caching method, apparatus, and computer storage medium
US11222003B1 (en) Executing transactions for a hierarchy of data objects stored in a non-transactional data store
CN119202115A (zh) Method, apparatus, device, and readable medium for address division completion

Legal Events

Date Code Title Description
B03A Publication of a patent application or of a certificate of addition of invention [chapter 3.1 patent gazette]
B06F Objections, documents and/or translations needed after an examination request according [chapter 6.6 patent gazette]
B06U Preliminary requirement: requests with searches performed by other patent offices: procedure suspended [chapter 6.21 patent gazette]
B03H Publication of an application: rectification [chapter 3.8 patent gazette]

Free format text: REFERRING TO RPI 2368 OF 24/05/2016, REGARDING ITEM (54).

B09A Decision: intention to grant [chapter 9.1 patent gazette]
B15K Others concerning applications: alteration of classification

Free format text: THE PREVIOUS CLASSIFICATIONS WERE: G06F 17/22, G06F 17/27, G06F 17/30

Ipc: G06F 16/35 (2006.01), G06F 16/901 (2006.01), G06F

B16A Patent or certificate of addition of invention granted [chapter 16.1 patent gazette]

Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 05/11/2014, SUBJECT TO THE LEGAL CONDITIONS.