CN109661664A - Information processing method and related apparatus - Google Patents

Information processing method and related apparatus

Info

Publication number
CN109661664A
Authority
CN
China
Prior art keywords
sentence
information
vector
coding
sentences
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201780054183.7A
Other versions
CN109661664B (zh)
Inventor
孔行
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-06-22
Filing date
2017-06-22
Publication date
2019-04-19
Application filed by Tencent Technology Shenzhen Co Ltd
Publication of CN109661664A
Application granted
Publication of CN109661664B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/126 Character encoding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3059 Digital compression and data reduction techniques where the original information is represented by a subset or similar information, e.g. lossy compression
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3082 Vector coding
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/60 General implementation details not specific to a particular type of compression
    • H03M7/6005 Decoder aspects
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/60 General implementation details not specific to a particular type of compression
    • H03M7/6011 Encoder aspects
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/70 Type of the data to be coded, other than image and sound
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/70 Type of the data to be coded, other than image and sound
    • H03M7/707 Structured documents, e.g. XML
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Molecular Biology (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Operations Research (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Machine Translation (AREA)

Abstract

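An embodiment of the present invention discloses an information processing method, comprising: obtaining text information to be processed and a sentence set; encoding the sentences in the sentence set with a first encoder to obtain a first encoding vector, and encoding the sentences in the sentence set with a second encoder to obtain a second encoding vector, where the first encoding vector is determined from the sentence and the second encoding vector is determined from features of the sentence; determining a sentence encoding vector from the first encoding vector and the second encoding vector; encoding the sentence encoding vector with a third encoder to obtain global information; and decoding the global information with a decoder to determine the probability value corresponding to each sentence in the text information to be processed. The present invention also provides an information processing apparatus. In addition to using a deep learning method, the present invention also adds manually extracted sentences for feature training, which effectively improves the learning capacity of the model and thereby the capability and effectiveness of information processing.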
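
The data flow named in the abstract (two sentence-level encoders, a combined sentence encoding vector, a third encoder producing global information, and a decoder producing one probability per sentence) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the abstract does not specify the encoder or decoder architectures, how the two vectors are combined, or which features are used. Here the first encoder is a character-hashing stand-in, the second uses two hand-crafted features (length and position), the combination is concatenation, the third encoder is a mean, and the decoder is a sigmoid over a dot product. All function names are hypothetical.

```python
import math


def first_encoder(sentence, dim=4):
    """Toy stand-in for the first encoder: hash characters into a unit-length vector."""
    vec = [0.0] * dim
    for i, ch in enumerate(sentence):
        vec[(ord(ch) + i) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid dividing by zero
    return [v / norm for v in vec]


def second_encoder(sentence, position, total):
    """Toy stand-in for the second encoder: hand-crafted features (assumed: length, position)."""
    return [len(sentence) / 100.0, 1.0 - position / max(total, 1)]


def third_encoder(sentence_vectors):
    """Toy document-level encoder: the mean sentence vector serves as global information."""
    n = len(sentence_vectors)
    dim = len(sentence_vectors[0])
    return [sum(v[d] for v in sentence_vectors) / n for d in range(dim)]


def decoder(global_info, sentence_vectors):
    """Score each sentence against the global information and map it to (0, 1) with a sigmoid."""
    probs = []
    for v in sentence_vectors:
        score = sum(g * x for g, x in zip(global_info, v))
        probs.append(1.0 / (1.0 + math.exp(-score)))
    return probs


def sentence_probabilities(sentences):
    """Run the full pipeline and return one probability value per input sentence."""
    if not sentences:
        return []
    total = len(sentences)
    vectors = []
    for pos, s in enumerate(sentences):
        v1 = first_encoder(s)               # vector determined from the sentence itself
        v2 = second_encoder(s, pos, total)  # vector determined from sentence features
        vectors.append(v1 + v2)             # assumed combination: list concatenation
    global_info = third_encoder(vectors)
    return decoder(global_info, vectors)


if __name__ == "__main__":
    doc = ["Deep models encode text.", "Short one.", "Features add position and length."]
    for sentence, p in zip(doc, sentence_probabilities(doc)):
        print(f"{p:.3f}  {sentence}")
```

In a real system each stand-in would presumably be a trained network and the probabilities would be used to select sentences; the sketch only shows how the five stages connect.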

Description

PCT national-phase application; the description has been published.

Claims (15)

  1. PCT national-phase application; the claims have been published.
CN201780054183.7A 2017-06-22 2017-06-22 Information processing method and related apparatus Active CN109661664B (zh)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/089586 WO2018232699A1 (zh) 2017-06-22 2017-06-22 Information processing method and related apparatus

Publications (2)

Publication Number Publication Date
CN109661664A (zh) 2019-04-19
CN109661664B (zh) 2021-04-27

Family

ID=64735906

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201780054183.7A Active CN109661664B (zh) 2017-06-22 2017-06-22 Information processing method and related apparatus

Country Status (3)

Country Link
US (1) US10789415B2 (zh)
CN (1) CN109661664B (zh)
WO (1) WO2018232699A1 (zh)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
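WO2018232699A1 (zh) 2017-06-22 2018-12-27 腾讯科技(深圳)有限公司 Information processing method and related apparatus
CN109740158B (zh) * 2018-12-29 2023-04-07 安徽省泰岳祥升软件有限公司 Text semantic parsing method and apparatus
CN110781674B (zh) * 2019-09-19 2023-10-27 北京小米智能科技有限公司 Information processing method and apparatus, computer device, and storage medium
CN113112993B (zh) * 2020-01-10 2024-04-02 阿里巴巴集团控股有限公司 Audio information processing method and apparatus, electronic device, and storage medium
CN111428024A (zh) * 2020-03-18 2020-07-17 北京明略软件系统有限公司 Method and apparatus for extracting text summaries, computer storage medium, and terminal
CN111507726B (zh) * 2020-04-07 2022-06-24 支付宝(杭州)信息技术有限公司 Message generation method, apparatus, and device
CN112069813B (zh) * 2020-09-10 2023-10-13 腾讯科技(深圳)有限公司 Text processing method, apparatus, device, and computer-readable storage medium
CN113642756B (zh) * 2021-05-27 2023-11-24 复旦大学 Deep-learning-based method for predicting commuted sentence terms
CN113254684B (zh) * 2021-06-18 2021-10-29 腾讯科技(深圳)有限公司 Method for determining content timeliness, related apparatus, device, and storage medium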

Citations (3)

Publication number Priority date Publication date Assignee Title
US20040117340A1 (en) * 2002-12-16 2004-06-17 Palo Alto Research Center, Incorporated Method and apparatus for generating summary information for hierarchically related information
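CN105930314A (zh) * 2016-04-14 2016-09-07 清华大学 System and method for generating text summaries based on an encoder-decoder deep neural network
CN106855853A (zh) * 2016-12-28 2017-06-16 成都数联铭品科技有限公司 Entity relationship extraction system based on a deep neural network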

Family Cites Families (7)

Publication number Priority date Publication date Assignee Title
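JP6591801B2 (ja) * 2015-06-29 2019-10-16 任天堂株式会社 Information processing program, information processing system, information processing apparatus, and information processing method
JP6646991B2 (ja) * 2015-10-01 2020-02-14 任天堂株式会社 Information processing system, information processing method, information processing apparatus, and information processing program
CN105512687A (zh) * 2015-12-15 2016-04-20 北京锐安科技有限公司 Method and system for training a sentiment classification model and analyzing text sentiment polarity
CN105740226A (zh) * 2016-01-15 2016-07-06 南京大学 Chinese word segmentation using tree-structured neural networks and bidirectional neural networks
CN106407178B (zh) * 2016-08-25 2019-08-13 中国科学院计算技术研究所 Conversation summary generation method and apparatus, server device, and terminal device
JP6734761B2 (ja) * 2016-11-07 2020-08-05 任天堂株式会社 Information processing system, information processing apparatus, control method for information processing apparatus, and information processing program
WO2018232699A1 (zh) 2017-06-22 2018-12-27 腾讯科技(深圳)有限公司 Information processing method and related apparatus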

Non-Patent Citations (1)

Title
我偏笑_NSNIRVANA: "A Brief Discussion of Intelligent Search and Conversational OS", Jianshu (简书), HTTPS://WWW.JIANSHU.COM/P/3A9F49834C4A *

Cited By (7)

Publication number Priority date Publication date Assignee Title
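CN112560398A (zh) * 2019-09-26 2021-03-26 百度在线网络技术(北京)有限公司 Text generation method and apparatus
CN111597814A (zh) * 2020-05-22 2020-08-28 北京慧闻科技(集团)有限公司 Human-computer interaction named entity recognition method, apparatus, device, and storage medium
CN111597814B (zh) * 2020-05-22 2023-05-26 北京慧闻科技(集团)有限公司 Human-computer interaction named entity recognition method, apparatus, device, and storage medium
CN112269872A (zh) * 2020-10-19 2021-01-26 北京希瑞亚斯科技有限公司 Resume parsing method and apparatus, electronic device, and computer storage medium
CN112269872B (zh) * 2020-10-19 2023-12-19 北京希瑞亚斯科技有限公司 Resume parsing method and apparatus, electronic device, and computer storage medium
CN114095033A (zh) * 2021-11-16 2022-02-25 上海交通大学 System and method for semantic lossless compression of target interaction relationships based on context-based graph convolution
CN114095033B (zh) * 2021-11-16 2024-05-14 上海交通大学 System and method for semantic lossless compression of target interaction relationships based on context-based graph convolution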

Also Published As

Publication number Publication date
CN109661664B (zh) 2021-04-27
WO2018232699A1 (zh) 2018-12-27
US10789415B2 (en) 2020-09-29
US20190370316A1 (en) 2019-12-05

Similar Documents

Publication Publication Date Title
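CN109661664B (zh) Information processing method and related apparatus
KR102382499B1 (ko) Translation method, target information determination method, related apparatus, and storage medium
Bahdanau et al. Learning to compute word embeddings on the fly
EP4206994A1 (en) Model compression method and apparatus
CN111597830A (zh) Translation method, apparatus, device, and storage medium based on multimodal machine learning
CN112395385B (zh) Artificial-intelligence-based text generation method and apparatus, computer device, and medium
CN109740158B (zh) Text semantic parsing method and apparatus
Khan et al. RNN-LSTM-GRU based language transformation
CN111401081A (zh) Neural network machine translation method, model, and model forming method
CN112016275A (zh) Intelligent error correction method and system for speech recognition text, and electronic device
CN110263304B (zh) Sentence encoding method, sentence decoding method, apparatus, storage medium, and device
CN110569505A (zh) Text input method and apparatus
CN112307168A (zh) Artificial-intelligence-based medical consultation session processing method and apparatus, and computer device
CN113569562A (zh) Method and system for reducing cross-modal and cross-lingual barriers in end-to-end speech translation
CN111949762B (zh) Method and system for context-based emotional dialogue, and storage medium
CN110348007A (zh) Text similarity determination method and apparatus
CN110874535A (zh) Dependency alignment component, dependency alignment training method, device, and medium
Kasai et al. End-to-end graph-based TAG parsing with neural networks
CN113935312A (zh) Long text matching method and apparatus, electronic device, and computer-readable storage medium
CN116776287A (zh) Multimodal sentiment analysis method and system fusing multi-granularity visual and textual features
WO2023165111A1 (zh) Method and system for identifying user intent trajectories in customer service hotlines
Paul et al. English to bengali neural machine translation system for the aviation domain
CN111783435A (zh) Shared vocabulary selection method, apparatus, and storage medium
CN113392629B (zh) Personal pronoun resolution method based on a pre-trained model
CN114925175A (zh) Artificial-intelligence-based summary generation method and apparatus, computer device, and medium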

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant