CA3225020A1 - Automatic labeling of text data - Google Patents
Automatic labeling of text data
- Publication number
- CA3225020A1
- Authority
- CA
- Canada
- Prior art keywords
- label
- text
- candidate
- search
- class
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/38—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/383—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3346—Query execution using probabilistic model
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/096—Transfer learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Mathematical Physics (AREA)
- Molecular Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Library & Information Science (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- User Interface Of Digital Computer (AREA)
Applications Claiming Priority (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| IN202141029147 | 2021-06-29 | | |
| IN202141029147 | 2021-06-29 | | |
| US17/711,506 | 2022-04-01 | | |
| US17/711,506 US12197486B2 (en) | 2021-06-29 | 2022-04-01 | Automatic labeling of text data |
| PCT/US2022/030464 WO2023278070A1 (en) | 2021-06-29 | 2022-05-23 | Automatic labeling of text data |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CA3225020A1 (en) | 2023-01-05 |
Family
ID=82156528
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CA3225020A (Pending), published as CA3225020A1 (en) | 2021-06-29 | 2022-05-23 | Automatic labeling of text data |
Country Status (9)
| Country | Link |
|---|---|
| US (1) | US20240370484A1 |
| EP (1) | EP4364000A1 |
| JP (1) | JP2024524060A |
| KR (1) | KR20240023535A |
| AU (1) | AU2022304683A1 |
| BR (1) | BR112023027439A2 |
| CA (1) | CA3225020A1 |
| WO (1) | WO2023278070A1 |
| ZA (1) | ZA202400308B |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
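| CN116628556A (zh) * | 2023-06-14 | 2023-08-22 | 上海桥创科技有限公司 | Method and system for establishing product labels |
| CN120430300A (zh) * | 2025-07-09 | 2025-08-05 | 中国民用航空飞行学院 | Method, system, storage medium, and terminal for automatic error correction of notice-to-airmen (NOTAM) text |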
Families Citing this family (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20230385966A1 (en) * | 2022-05-31 | 2023-11-30 | Docusign, Inc. | Predictive text for contract generation in a document management system |
| US20240054285A1 (en) * | 2022-08-10 | 2024-02-15 | TOTVS, Inc. | Sentence pair ranking in natural language processing for a virtual assistant |
| CN116415154B (zh) * | 2023-06-12 | 2023-08-22 | 江西五十铃汽车有限公司 | GPT-based method and device for generating vehicle fault solutions |
| JP2025036355A (ja) * | 2023-08-30 | 2025-03-14 | 宏達國際電子股份有限公司 | Data classification method for screening outlier text data |
| CN116910279B (zh) * | 2023-09-13 | 2024-01-05 | 深圳市智慧城市科技发展集团有限公司 | Label extraction method, device, and computer-readable storage medium |
| CN121970062A (zh) * | 2023-10-24 | 2026-05-01 | 株式会社半导体能源研究所 | Information processing system and information processing method |
| KR102763213B1 (ko) * | 2024-04-04 | 2025-02-07 | 주식회사 리턴제로 | Electronic device and method for performing template-based data labeling according to domain |
| US12530377B2 (en) | 2024-05-22 | 2026-01-20 | Shopify Inc. | Additional searching based on confidence in a classification performed by a generative language machine learning model |
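| CN118689468A (zh) * | 2024-06-19 | 2024-09-24 | 北京百度网讯科技有限公司 | Large-model-based code generation method, apparatus, electronic device, and storage medium |
| KR102823763B1 (ko) * | 2024-12-10 | 2025-06-23 | 한화시스템 주식회사 | System and method for generating combat system data based on sentence syntax parsing |
| CN120541194B (zh) * | 2025-07-25 | 2025-10-24 | 浪潮通用软件有限公司 | Knowledge retrieval method, system, and computer device based on multi-dimensional labels |
| CN121303112A (zh) * | 2025-09-28 | 2026-01-09 | 北京首发展智能科技有限公司 | LLM-based label acquisition method, device, and medium |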
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10635727B2 (en) * | 2016-08-16 | 2020-04-28 | Ebay Inc. | Semantic forward search indexing of publication corpus |
-
2022
- 2022-05-23 KR KR1020237045327A patent/KR20240023535A/ko active Pending
- 2022-05-23 CA CA3225020A patent/CA3225020A1/en active Pending
- 2022-05-23 AU AU2022304683A patent/AU2022304683A1/en active Pending
- 2022-05-23 JP JP2023576164A patent/JP2024524060A/ja active Pending
- 2022-05-23 EP EP22732737.6A patent/EP4364000A1/en active Pending
- 2022-05-23 WO PCT/US2022/030464 patent/WO2023278070A1/en not_active Ceased
- 2022-05-23 BR BR112023027439A patent/BR112023027439A2/pt unknown
-
2024
- 2024-01-09 ZA ZA2024/00308A patent/ZA202400308B/en unknown
- 2024-07-19 US US18/777,830 patent/US20240370484A1/en active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| KR20240023535A (ko) | 2024-02-22 |
| AU2022304683A1 (en) | 2024-01-04 |
| JP2024524060A (ja) | 2024-07-05 |
| WO2023278070A1 (en) | 2023-01-05 |
| ZA202400308B (en) | 2025-10-29 |
| BR112023027439A2 (pt) | 2024-03-12 |
| US20240370484A1 (en) | 2024-11-07 |
| EP4364000A1 (en) | 2024-05-08 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12197486B2 (en) | | Automatic labeling of text data |
| US20240370484A1 (en) | | Automatic labeling of text data |
| CN112800170B (zh) | | Question matching method and apparatus, and question answering method and apparatus |
| CN110297868B (zh) | | Building enterprise-specific knowledge graphs |
| CN101523338B (zh) | | Search engine that applies feedback from users to improve search results |
| US11048705B2 (en) | | Query intent clustering for automated sourcing |
| CN106055549B (zh) | | Method and system for concept analysis operations utilizing accelerators |
| CN118132719A (zh) | | Intelligent dialogue method and system based on natural language processing |
| JP5391633B2 (ja) | | Recommending terms to specify an ontology space |
| US11017040B2 (en) | | Providing query explanations for automated sourcing |
| US20180232434A1 (en) | | Proactive and retrospective joint weight attribution in a streaming environment |
| CN112507715A (zh) | | Method, apparatus, device, and storage medium for determining association relationships between entities |
| US20180232702A1 (en) | | Using feedback to re-weight candidate features in a streaming environment |
| US20060242130A1 (en) | | Information retrieval using conjunctive search and link discovery |
| CN109829104A (zh) | | Information retrieval method and system using a pseudo-relevance feedback model based on semantic similarity |
| US20170371965A1 (en) | | Method and system for dynamically personalizing profiles in a social network |
| CN108090231A (zh) | | Topic model optimization method based on information entropy |
| US20170169355A1 (en) | | Ground Truth Improvement Via Machine Learned Similar Passage Detection |
| US20210319066A1 (en) | | Sub-Question Result Merging in Question and Answer (QA) Systems |
| JP2014120053A (ja) | | Question answering device, method, and program |
| US12406008B1 (en) | | Using intent-based rankings to generate large language model responses |
| CN113239071A (zh) | | Retrieval and query method and system for discipline and research-topic information of scientific and technological resources |
| CN118626611A (zh) | | Retrieval method, apparatus, electronic device, and readable storage medium |
| CN115391479B (zh) | | Ranking method, apparatus, electronic medium, and storage medium for document search |
| Jiang et al. | | Understanding a bag of words by conceptual labeling with prior weights |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | D00 | Search and/or examination requested or commenced | Free format text: ST27 STATUS EVENT CODE: A-1-1-D10-D00-D120 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: VOLUNTARY SUBMISSION OF PRIOR ART RECEIVED; Effective date: 20240715 |
| | MFA | Maintenance fee for application paid | Free format text: FEE DESCRIPTION TEXT: MF (APPLICATION, 3RD ANNIV.) - STANDARD; Year of fee payment: 3 |
| | U00 | Fee paid | Free format text: ST27 STATUS EVENT CODE: A-1-1-U10-U00-U101 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE REQUEST RECEIVED; Effective date: 20250425 |
| | U11 | Full renewal or maintenance fee paid | Free format text: ST27 STATUS EVENT CODE: A-1-1-U10-U11-U102 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE FEE PAYMENT PAID IN FULL; Effective date: 20250425 |
| | D00 | Search and/or examination requested or commenced | Free format text: ST27 STATUS EVENT CODE: A-1-1-D10-D00-D123 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: PRIOR ART DISCLOSURE DETERMINED COMPLIANT; Effective date: 20250509 |
| | W00 | Other event occurred | Free format text: ST27 STATUS EVENT CODE: A-1-1-W10-W00-W111 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: CORRESPONDENT DETERMINED COMPLIANT; Effective date: 20250509. Free format text: ST27 STATUS EVENT CODE: A-1-1-W10-W00-W100 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: LETTER SENT; Effective date: 20250509 |
| | MFA | Maintenance fee for application paid | Free format text: FEE DESCRIPTION TEXT: MF (APPLICATION, 4TH ANNIV.) - STANDARD; Year of fee payment: 4 |
| | U00 | Fee paid | Free format text: ST27 STATUS EVENT CODE: A-1-1-U10-U00-U101 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE REQUEST RECEIVED; Effective date: 20260421 |
| | U11 | Full renewal or maintenance fee paid | Free format text: ST27 STATUS EVENT CODE: A-1-1-U10-U11-U102 (AS PROVIDED BY THE NATIONAL OFFICE); EVENT TEXT: MAINTENANCE FEE PAYMENT PAID IN FULL; Effective date: 20260421 |