JP2024524060A - Automatic labeling of text data - Google Patents

Automatic labeling of text data

Info

Publication number
JP2024524060A
Authority
JP
Japan
Prior art keywords
label
text
candidate
labeling
examples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2023576164A
Other languages
English (en)
Japanese (ja)
Other versions
JP2024524060A5
Inventor
セワク,モヒト
キラン レディ ポルリ,ラヴィ
ブラム,ウィリアム
オン チャン,パク
リー,ウェイシェン
シリシュ アーチャーリャ,シャラダ
ラドニック,クリスチャン
アブラハム ベトサー,マイケル
ドリニック,ミレンコ
リウ,シホン
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Technology Licensing LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/711,506 (US12197486B2)
Application filed by Microsoft Technology Licensing LLC
Publication of JP2024524060A
Publication of JP2024524060A5
Legal status: Pending (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/38 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/383 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3346 Query execution using probabilistic model
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0475 Generative networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/096 Transfer learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Library & Information Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • User Interface Of Digital Computer (AREA)
JP2023576164A 2021-06-29 2022-05-23 Automatic labeling of text data Pending JP2024524060A (ja)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
IN202141029147 2021-06-29
IN202141029147 2021-06-29
US17/711,506 2022-04-01
US17/711,506 US12197486B2 (en) 2021-06-29 2022-04-01 Automatic labeling of text data
PCT/US2022/030464 WO2023278070A1 (en) 2021-06-29 2022-05-23 Automatic labeling of text data

Publications (2)

Publication Number Publication Date
JP2024524060A (ja) 2024-07-05
JP2024524060A5 2025-04-30

Family

ID=82156528

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2023576164A Pending JP2024524060A (ja) 2021-06-29 2022-05-23 Automatic labeling of text data

Country Status (9)

Country Link
US (1) US20240370484A1
EP (1) EP4364000A1
JP (1) JP2024524060A
KR (1) KR20240023535A
AU (1) AU2022304683A1
BR (1) BR112023027439A2
CA (1) CA3225020A1
WO (1) WO2023278070A1
ZA (1) ZA202400308B

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2025036437A (ja) * 2024-06-19 2025-03-14 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Code generation method, apparatus, electronic device, and storage medium based on a large model

Families Citing this family (12)

Publication number Priority date Publication date Assignee Title
US20230385966A1 (en) * 2022-05-31 2023-11-30 Docusign, Inc. Predictive text for contract generation in a document management system
US20240054285A1 (en) * 2022-08-10 2024-02-15 TOTVS, Inc. Sentence pair ranking in natural language processing for a virtual assistant
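CN116415154B (zh) * 2023-06-12 2023-08-22 江西五十铃汽车有限公司 GPT-based vehicle fault solution generation method and device
JP2025036355A (ja) * 2023-08-30 2025-03-14 宏達國際電子股▲ふん▼有限公司 Data classification method for screening out outlier text data
CN116910279B (zh) * 2023-09-13 2024-01-05 深圳市智慧城市科技发展集团有限公司 Label extraction method, device, and computer-readable storage medium
CN121970062A (zh) * 2023-10-24 2026-05-01 株式会社半导体能源研究所 Information processing system and information processing method
KR102763213B1 (ko) * 2024-04-04 2025-02-07 주식회사 리턴제로 Electronic device and method for performing template-based data labeling according to domain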
US12530377B2 (en) 2024-05-22 2026-01-20 Shopify Inc. Additional searching based on confidence in a classification performed by a generative language machine learning model
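KR102823763B1 (ko) * 2024-12-10 2025-06-23 한화시스템 주식회사 System and method for generating combat system data based on sentence syntax analysis
CN120430300B (zh) * 2025-07-09 2025-09-23 中国民用航空飞行学院 Method, system, storage medium, and terminal for automatic error correction of NOTAM text
CN120541194B (zh) * 2025-07-25 2025-10-24 浪潮通用软件有限公司 Knowledge retrieval method, system, and computer device based on multi-dimensional labels
CN121303112A (zh) * 2025-09-28 2026-01-09 北京首发展智能科技有限公司 LLM-based label acquisition method, device, and medium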

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US10635727B2 (en) * 2016-08-16 2020-04-28 Ebay Inc. Semantic forward search indexing of publication corpus

Cited By (2)

Publication number Priority date Publication date Assignee Title
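JP2025036437A (ja) * 2024-06-19 2025-03-14 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Code generation method, apparatus, electronic device, and storage medium based on a large model
JP7802144B2 (ja) 2024-06-19 2026-01-19 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Code generation method, apparatus, electronic device, and storage medium based on a large model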

Also Published As

Publication number Publication date
KR20240023535A (ko) 2024-02-22
AU2022304683A1 (en) 2024-01-04
CA3225020A1 (en) 2023-01-05
WO2023278070A1 (en) 2023-01-05
ZA202400308B (en) 2025-10-29
BR112023027439A2 (pt) 2024-03-12
US20240370484A1 (en) 2024-11-07
EP4364000A1 (en) 2024-05-08

Similar Documents

Publication Publication Date Title
US12197486B2 (en) Automatic labeling of text data
US20240370484A1 (en) Automatic labeling of text data
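CN110297868B (zh) Building enterprise-specific knowledge graphs
CN112800170B (zh) Question matching method and device, and question reply method and device
CN106055549B (zh) Method and system for concept analysis operations utilizing accelerators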
Li et al. Mining opinion summarizations using convolutional neural networks in Chinese microblogging systems
JP5391633B2 (ja) Recommending terms to specify an ontology space
US11048705B2 (en) Query intent clustering for automated sourcing
Ghosal et al. Novelty detection: A perspective from natural language processing
CN107844533A (zh) Intelligent question answering system and analysis method
JP2014120053A (ja) Question answering device, method, and program
US10198497B2 (en) Search term clustering
CN112417170B (zh) Relation linking method for incomplete knowledge graphs
AN et al. Scoring Impressions and Associations for Improved Concept Map Excavating from Dominion Text Demonstration
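CN114365122A (zh) Learning interpretable relationships between entities, relation words, and concepts via Bayesian structure learning on open-domain facts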
Singh et al. Real-time event detection and classification in social text steam using embedding
CN111581326B (zh) Method for extracting answer information based on the graph structure of heterogeneous external knowledge sources
Kumaravel et al. PQPS: Prior‐Art Query‐Based Patent Summarizer Using RBM and Bi‐LSTM
Wang et al. Event assignment based on KBQA for government service hotlines
EP4641409A1 (en) Conversational agnostic matchmaking model architecture
Hao Naive Bayesian prediction of Japanese annotated corpus for textual semantic word formation classification
CN109902149B (zh) Query processing method and device, and computer-readable medium
CN117581221A (zh) Automatic labeling of text data
Nouriinanloo Improving information retrieval and recommender systems with contextual data and re-ranking
US20250094882A1 (en) Suggesting Resources using a Latency-Efficient Machine-Trained Ranking Model

Legal Events

Date Code Title Description
A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20250421

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20250421

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20260130

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20260219