JP2024524060A - Automatic labeling of text data - Google Patents
Automatic labeling of text data
- Publication number
- JP2024524060A
- Authority
- JP
- Japan
- Prior art keywords
- label
- text
- candidate
- labeling
- examples
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/38—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/383—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3346—Query execution using probabilistic model
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/096—Transfer learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Mathematical Physics (AREA)
- Molecular Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Library & Information Science (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- User Interface Of Digital Computer (AREA)
Applications Claiming Priority (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| IN202141029147 | 2021-06-29 | ||
| IN202141029147 | 2021-06-29 | ||
| US17/711,506 | 2022-04-01 | ||
| US17/711,506 US12197486B2 (en) | 2021-06-29 | 2022-04-01 | Automatic labeling of text data |
| PCT/US2022/030464 WO2023278070A1 (en) | 2021-06-29 | 2022-05-23 | Automatic labeling of text data |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| JP2024524060A (ja) | 2024-07-05 |
| JP2024524060A5 | 2025-04-30 |
Family
ID=82156528
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2023576164A (Pending), published as JP2024524060A (ja) | 2021-06-29 | 2022-05-23 | Automatic labeling of text data |
Country Status (9)
| Country | Link |
|---|---|
| US (1) | US20240370484A1 |
| EP (1) | EP4364000A1 |
| JP (1) | JP2024524060A |
| KR (1) | KR20240023535A |
| AU (1) | AU2022304683A1 |
| BR (1) | BR112023027439A2 |
| CA (1) | CA3225020A1 |
| WO (1) | WO2023278070A1 |
| ZA (1) | ZA202400308B |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2025036437A (ja) * | 2024-06-19 | 2025-03-14 | Beijing Baidu Netcom Science Technology Co., Ltd. | Code generation method, apparatus, electronic device, and storage medium based on a large model |
Families Citing this family (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20230385966A1 (en) * | 2022-05-31 | 2023-11-30 | Docusign, Inc. | Predictive text for contract generation in a document management system |
| US20240054285A1 (en) * | 2022-08-10 | 2024-02-15 | TOTVS, Inc. | Sentence pair ranking in natural language processing for a virtual assistant |
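| CN116415154B (zh) * | 2023-06-12 | 2023-08-22 | Jiangxi Isuzu Motors Co., Ltd. | GPT-based method and device for generating vehicle fault solutions |
| JP2025036355A (ja) * | 2023-08-30 | 2025-03-14 | HTC Corporation | Data classification method for screening outlier text data |
| CN116910279B (zh) * | 2023-09-13 | 2024-01-05 | Shenzhen Smart City Technology Development Group Co., Ltd. | Label extraction method, device, and computer-readable storage medium |
| CN121970062A (zh) * | 2023-10-24 | 2026-05-01 | Semiconductor Energy Laboratory Co., Ltd. | Information processing system and information processing method |
| KR102763213B1 (ko) * | 2024-04-04 | 2025-02-07 | ReturnZero Inc. | Electronic device and method for performing template-based data labeling according to domain |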
| US12530377B2 (en) | 2024-05-22 | 2026-01-20 | Shopify Inc. | Additional searching based on confidence in a classification performed by a generative language machine learning model |
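| KR102823763B1 (ko) * | 2024-12-10 | 2025-06-23 | Hanwha Systems Co., Ltd. | System and method for generating combat system data based on sentence syntax analysis |
| CN120430300B (zh) * | 2025-07-09 | 2025-09-23 | Civil Aviation Flight University of China | Method, system, storage medium, and terminal for automatic error correction of NOTAM text |
| CN120541194B (zh) * | 2025-07-25 | 2025-10-24 | Inspur General Software Co., Ltd. | Knowledge retrieval method, system, and computer device based on multi-dimensional labels |
| CN121303112A (zh) * | 2025-09-28 | 2026-01-09 | 北京首发展智能科技有限公司 | LLM-based label acquisition method, device, and medium |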
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10635727B2 (en) * | 2016-08-16 | 2020-04-28 | Ebay Inc. | Semantic forward search indexing of publication corpus |
-
2022
- 2022-05-23 KR KR1020237045327A patent/KR20240023535A/ko active Pending
- 2022-05-23 CA CA3225020A patent/CA3225020A1/en active Pending
- 2022-05-23 AU AU2022304683A patent/AU2022304683A1/en active Pending
- 2022-05-23 JP JP2023576164A patent/JP2024524060A/ja active Pending
- 2022-05-23 EP EP22732737.6A patent/EP4364000A1/en active Pending
- 2022-05-23 WO PCT/US2022/030464 patent/WO2023278070A1/en not_active Ceased
- 2022-05-23 BR BR112023027439A patent/BR112023027439A2/pt unknown
-
2024
- 2024-01-09 ZA ZA2024/00308A patent/ZA202400308B/en unknown
- 2024-07-19 US US18/777,830 patent/US20240370484A1/en active Pending
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
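| JP2025036437A (ja) * | 2024-06-19 | 2025-03-14 | Beijing Baidu Netcom Science Technology Co., Ltd. | Code generation method, apparatus, electronic device, and storage medium based on a large model |
| JP7802144B2 (ja) | 2024-06-19 | 2026-01-19 | Beijing Baidu Netcom Science Technology Co., Ltd. | Code generation method, apparatus, electronic device, and storage medium based on a large model |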
Also Published As
| Publication number | Publication date |
|---|---|
| KR20240023535A (ko) | 2024-02-22 |
| AU2022304683A1 (en) | 2024-01-04 |
| CA3225020A1 (en) | 2023-01-05 |
| WO2023278070A1 (en) | 2023-01-05 |
| ZA202400308B (en) | 2025-10-29 |
| BR112023027439A2 (pt) | 2024-03-12 |
| US20240370484A1 (en) | 2024-11-07 |
| EP4364000A1 (en) | 2024-05-08 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12197486B2 (en) | | Automatic labeling of text data |
| US20240370484A1 (en) | | Automatic labeling of text data |
| CN110297868B (zh) | | Building enterprise-specific knowledge graphs |
| CN112800170B (zh) | | Question matching method and apparatus, and question reply method and apparatus |
| CN106055549B (zh) | | Method and system for concept analysis operations utilizing accelerators |
| Li et al. | | Mining opinion summarizations using convolutional neural networks in Chinese microblogging systems |
| JP5391633B2 (ja) | | Recommending terms to specify an ontology space |
| US11048705B2 (en) | | Query intent clustering for automated sourcing |
| Ghosal et al. | | Novelty detection: A perspective from natural language processing |
| CN107844533A (zh) | | Intelligent question answering system and analysis method |
| JP2014120053A (ja) | | Question answering device, method, and program |
| US10198497B2 (en) | | Search term clustering |
| CN112417170B (zh) | | Relation linking method for incomplete knowledge graphs |
| AN et al. | | Scoring Impressions and Associations for Improved Concept Map Excavating from Dominion Text Demonstration |
| CN114365122A (zh) | | Learning interpretable relationships between entities, relation words, and concepts via Bayesian structure learning on open-domain facts |
| Singh et al. | | Real-time event detection and classification in social text steam using embedding |
| CN111581326B (zh) | | Method for extracting answer information based on a heterogeneous external knowledge source graph structure |
| Kumaravel et al. | | PQPS: Prior‐Art Query‐Based Patent Summarizer Using RBM and Bi‐LSTM |
| Wang et al. | | Event assignment based on KBQA for government service hotlines |
| EP4641409A1 (en) | | Conversational agnostic matchmaking model architecture |
| Hao | | Naive Bayesian prediction of Japanese annotated corpus for textual semantic word formation classification |
| CN109902149B (zh) | | Query processing method and apparatus, and computer-readable medium |
| CN117581221A (zh) | | Automatic labeling of text data |
| Nouriinanloo | | Improving information retrieval and recommender systems with contextual data and re-ranking |
| US20250094882A1 (en) | | Suggesting Resources using a Latency-Efficient Machine-Trained Ranking Model |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 2025-04-21 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523 |
| 2025-04-21 | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621 |
| 2026-01-30 | A977 | Report on retrieval | Free format text: JAPANESE INTERMEDIATE CODE: A971007 |
| 2026-02-19 | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131 |