KR20240023535A - Automatic labeling of text data - Google Patents

Automatic labeling of text data

Info

Publication number
KR20240023535A
Authority
KR
South Korea
Prior art keywords
label
text
candidate
search
labeling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
KR1020237045327A
Other languages
English (en)
Korean (ko)
Inventor
Mohit Sewak
Ravi Kiran Reddy Poluri
William Blum
Pak On Chan
Weisheng Li
Sharada Shirish Acharya
Christian Rudnick
Michael Abraham Betser
Milenko Drinic
Sihong Liu
Original Assignee
Microsoft Technology Licensing, LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/711,506 (US12197486B2)
Application filed by Microsoft Technology Licensing, LLC
Publication of KR20240023535A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3346 Query execution using probabilistic model
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/38 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/383 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0475 Generative networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/096 Transfer learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Library & Information Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • User Interface Of Digital Computer (AREA)
KR1020237045327A 2021-06-29 2022-05-23 Automatic labeling of text data Pending KR20240023535A (ko)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
IN202141029147 2021-06-29
IN202141029147 2021-06-29
US17/711,506 2022-04-01
US17/711,506 US12197486B2 (en) 2021-06-29 2022-04-01 Automatic labeling of text data
PCT/US2022/030464 WO2023278070A1 (en) 2021-06-29 2022-05-23 Automatic labeling of text data

Publications (1)

Publication Number Publication Date
KR20240023535A (ko) 2024-02-22

Family

ID=82156528

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020237045327A Pending KR20240023535A (ko) 2021-06-29 2022-05-23 Automatic labeling of text data

Country Status (9)

Country Link
US (1) US20240370484A1
EP (1) EP4364000A1
JP (1) JP2024524060A
KR (1) KR20240023535A
AU (1) AU2022304683A1
BR (1) BR112023027439A2
CA (1) CA3225020A1
WO (1) WO2023278070A1
ZA (1) ZA202400308B

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
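KR102763213B1 (ko) * 2024-04-04 2025-02-07 Return Zero Co., Ltd. Electronic device and method for performing template-based data labeling according to domain
KR102823763B1 (ko) * 2024-12-10 2025-06-23 Hanwha Systems Co., Ltd. System and method for generating combat system data based on sentence syntax analysis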

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230385966A1 (en) * 2022-05-31 2023-11-30 Docusign, Inc. Predictive text for contract generation in a document management system
US20240054285A1 (en) * 2022-08-10 2024-02-15 TOTVS, Inc. Sentence pair ranking in natural language processing for a virtual assistant
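CN116415154B (zh) * 2023-06-12 2023-08-22 Jiangxi Isuzu Motors Co., Ltd. GPT-based method and device for generating vehicle fault solutions
JP2025036355A (ja) * 2023-08-30 2025-03-14 HTC Corporation Data classification method for screening outlier text data
CN116910279B (zh) * 2023-09-13 2024-01-05 Shenzhen Smart City Technology Development Group Co., Ltd. Label extraction method, device, and computer-readable storage medium
CN121970062A (zh) * 2023-10-24 2026-05-01 Semiconductor Energy Laboratory Co., Ltd. Information processing system and information processing method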
US12530377B2 (en) 2024-05-22 2026-01-20 Shopify Inc. Additional searching based on confidence in a classification performed by a generative language machine learning model
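CN118689468A (zh) * 2024-06-19 2024-09-24 Beijing Baidu Netcom Science and Technology Co., Ltd. Large-model-based code generation method, apparatus, electronic device, and storage medium
CN120430300B (zh) * 2025-07-09 2025-09-23 Civil Aviation Flight University of China Method, system, storage medium, and terminal for automatic error correction of NOTAM text
CN120541194B (zh) * 2025-07-25 2025-10-24 Inspur General Software Co., Ltd. Knowledge retrieval method, system, and computer device based on multi-dimensional labels
CN121303112A (zh) * 2025-09-28 2026-01-09 Beijing Shoufazhan Intelligent Technology Co., Ltd. LLM-based label acquisition method, device, and medium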

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10635727B2 (en) * 2016-08-16 2020-04-28 Ebay Inc. Semantic forward search indexing of publication corpus

Also Published As

Publication number Publication date
AU2022304683A1 (en) 2024-01-04
JP2024524060A (ja) 2024-07-05
CA3225020A1 (en) 2023-01-05
WO2023278070A1 (en) 2023-01-05
ZA202400308B (en) 2025-10-29
BR112023027439A2 (pt) 2024-03-12
US20240370484A1 (en) 2024-11-07
EP4364000A1 (en) 2024-05-08

Similar Documents

Publication Publication Date Title
US12197486B2 (en) Automatic labeling of text data
US20240370484A1 (en) Automatic labeling of text data
CN110297868B (zh) Building enterprise-specific knowledge graphs
US11347783B2 (en) Implementing a software action based on machine interpretation of a language input
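CN112800170B (zh) Question matching method and device, and question reply method and device
CN106055549B (zh) Method and system for concept analysis operations utilizing accelerators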
US11048705B2 (en) Query intent clustering for automated sourcing
US11144830B2 (en) Entity linking via disambiguation using machine learning techniques
US10984385B2 (en) Query building for search by ideal candidates
US10373075B2 (en) Smart suggestions for query refinements
CN112507715A (zh) Method, apparatus, device, and storage medium for determining association relationships between entities
US11017040B2 (en) Providing query explanations for automated sourcing
US20060242130A1 (en) Information retrieval using conjunctive search and link discovery
US20160189029A1 (en) Displaying Quality of Question Being Asked a Question Answering System
CN112889043A (zh) User-centric browser location
US20200175360A1 (en) Dynamic updating of a word embedding model
US12608556B2 (en) Intention recognition method, electronic device, and storage medium
WO2023278037A1 (en) Multiple semantic hypotheses for search query intent understanding
US12406008B1 (en) Using intent-based rankings to generate large language model responses
US12511322B1 (en) Large language model-assisted entity name resolution
CN111368555B (zh) Data identification method and apparatus, storage medium, and electronic device
EP4582968A1 (en) Efficient generation of application programming interface calls using language models, data types, and enriched schema
Shrivastava et al. ISEQL: Interactive sequence learning
EP4641409A1 (en) Conversational agnostic matchmaking model architecture
CN117581221A (zh) Automatic labeling of text data

Legal Events

Date Code Title Description
PA0105 International application

Patent event date: 20231228

Patent event code: PA01051R01D

Comment text: International Patent Application

PG1501 Laying open of application
A201 Request for examination
PA0201 Request for examination

Patent event code: PA02012R01D

Patent event date: 20250523

Comment text: Request for Examination of Application