WO2023213233A1 - Task processing method, neural network training method, apparatus, device and medium - Google Patents
Task processing method, neural network training method, apparatus, device and medium Download PDF Info
- Publication number
- WO2023213233A1 WO2023213233A1 PCT/CN2023/091356 CN2023091356W WO2023213233A1 WO 2023213233 A1 WO2023213233 A1 WO 2023213233A1 CN 2023091356 W CN2023091356 W CN 2023091356W WO 2023213233 A1 WO2023213233 A1 WO 2023213233A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- scale
- matching
- graph representation
- node
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
Definitions
- Step 101 Obtain first data and second data.
- the first data and the second data are each one of image data, audio data, text data, molecular structure data, and sequence data;
- Step 104 Perform graph matching on the graph representation of the first scale of the first data and the graph representation of the first scale of the second data to obtain a first matching result;
- FIG. 2 shows a schematic diagram of a multi-scale graph representation according to one embodiment of the present disclosure.
- graph representations 202, 204, and 206 of three scales from high to low constitute a multi-scale graph representation.
- Each graph representation includes multiple nodes, and the graph representation 206 includes multiple adjacent edges.
- Graph representations 202, 204, 206 may be obtained by sparsifying dense data 208.
- the dense data 208 includes three sets of dense data corresponding to the three scales, respectively.
- the graph representations 202, 204, and 206 of the three scales can be obtained.
- detection-based approaches may be used to determine nodes in dense data.
- the detection-based method may include key point detection, target detection, or other types of detection, which are not limited here.
- node sparsification can be performed through a node sparsification network, and the node sparsification network can include a detection network, a saliency network, etc.
- in response to the node sparsification network being a detection network, the dense data is input into the detection network to obtain the sparse nodes corresponding to the dense data and the confidence of each node.
- in response to the node sparsification network being a saliency network, the dense data and the feature vector corresponding to the dense data are input into the saliency network to obtain the saliency score of each dense node corresponding to the dense data.
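The saliency-based sparsification described above can be sketched in a few lines of Python: score each dense node, rank by saliency, and keep only the top fraction as the sparse nodes of one scale. The patent gives no implementation; the function name, ranking rule, and keep ratio below are illustrative assumptions.

```python
# Hypothetical sketch of saliency-based node sparsification: given dense
# nodes with saliency scores, keep only the highest-scoring ones as the
# sparse nodes of one scale. Names and the keep ratio are illustrative.

def sparsify_by_saliency(dense_nodes, saliency_scores, keep_ratio=0.25):
    """Return the top `keep_ratio` fraction of nodes, ranked by saliency."""
    ranked = sorted(zip(dense_nodes, saliency_scores),
                    key=lambda pair: pair[1], reverse=True)
    keep = max(1, int(len(ranked) * keep_ratio))
    return [node for node, _ in ranked[:keep]]

# Example: 8 dense nodes identified by index, with mock saliency scores.
nodes = list(range(8))
scores = [0.1, 0.9, 0.3, 0.8, 0.05, 0.7, 0.2, 0.4]
sparse = sparsify_by_saliency(nodes, scores, keep_ratio=0.25)
print(sparse)  # the two most salient nodes: [1, 3]
```

A detection-network variant would differ only in producing node confidences instead of saliency scores; the same top-k selection applies.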
- the nodes of at least one scale may be obtained by merging low-scale nodes, and the low-scale nodes may be obtained by sparsifying dense data.
- a clustering method or a graph neural network method may be used: the multiple low-scale nodes obtained by sparsification are clustered, or a subgraph containing multiple low-scale nodes is input into the graph neural network, to obtain higher-scale nodes and/or node attributes.
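The clustering route for merging low-scale nodes into higher-scale nodes can be illustrated with a simple 1-D k-means-style grouping. The patent leaves the clustering method open; the grouping rule and the use of cluster centers as merged node positions below are illustrative assumptions.

```python
# Hypothetical sketch of obtaining higher-scale nodes by clustering
# low-scale nodes (simple 1-D k-means-style grouping by position).
# The clustering method and attribute merging are illustrative.

def merge_nodes(positions, n_clusters=2, iters=10):
    """Cluster 1-D node positions; return one merged node per cluster."""
    centers = [positions[i * len(positions) // n_clusters]
               for i in range(n_clusters)]
    for _ in range(iters):
        groups = [[] for _ in range(n_clusters)]
        for p in positions:
            idx = min(range(n_clusters), key=lambda c: abs(p - centers[c]))
            groups[idx].append(p)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers  # each center is the position of one higher-scale node

low_scale = [0.0, 0.2, 0.1, 5.0, 5.2, 4.9]
high_scale = merge_nodes(low_scale, n_clusters=2)
print(sorted(round(c, 1) for c in high_scale))  # [0.1, 5.0]
```

A graph-neural-network variant would instead pool the subgraph's node features into one higher-scale node embedding.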
- Multi-scale graphs can include adjacent edges when the relative relationships between nodes help characterize the data, for example: the distance between two targets in an image, the role relationship between two targets in an image, the association between preceding and following words in speech, or the interaction of different groups in a sequence.
- the nodes at the second scale are obtained by clustering dense data, and the nodes at the first scale are obtained by merging the nodes at the second scale.
- the attributes of the dependent edge may be determined based on the attributes of the two nodes connected to the dependent edge.
- the attributes of the dependent edge can be determined in various ways according to the vector-type attributes and/or the scalar-type attributes of the two nodes connected to the dependent edge, which are not limited here.
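One way to derive a dependent edge's attributes from the two nodes it connects is sketched below: the edge's vector attribute as the difference of the node vectors, and its scalar attribute from their distance and scalar gap. The patent explicitly leaves the combination open; these particular choices and names are illustrative assumptions.

```python
# Hypothetical sketch of deriving a dependent edge's attributes from the
# two nodes it connects. The exact combination rule is not fixed by the
# source; the difference/distance choices here are illustrative.
import math

def edge_attributes(node_a, node_b):
    """node_a/node_b: dicts with a 'vector' (list) and a 'scalar' (float)."""
    diff = [a - b for a, b in zip(node_a["vector"], node_b["vector"])]
    distance = math.sqrt(sum(d * d for d in diff))
    return {
        "vector": diff,  # vector-type attribute of the edge
        "scalar": distance + abs(node_a["scalar"] - node_b["scalar"]),
    }

a = {"vector": [1.0, 2.0], "scalar": 0.5}
b = {"vector": [4.0, 6.0], "scalar": 0.5}
edge = edge_attributes(a, b)
print(edge["scalar"])  # 5.0 (Euclidean distance; scalars are equal)
```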
- Step 306: Based on the matching results of the candidate matching edge pairs, determine the matching result between the graph representation of this scale of the first data and the graph representation of this scale of the second data.
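As an illustration of matching two same-scale graph representations, the sketch below scores every cross-graph node pair by cosine similarity of their vector attributes and greedily selects one-to-one matches. The patent's candidate-matching-edge-pair procedure is more elaborate; the function names and the greedy selection rule are illustrative assumptions.

```python
# Hypothetical sketch of same-scale graph matching: score cross-graph node
# pairs by cosine similarity of vector attributes, then greedily pick the
# best one-to-one matches. Illustrative, not the patented procedure.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def match_graphs(vectors_a, vectors_b):
    """Greedy one-to-one matching; returns a list of (i, j, similarity)."""
    pairs = sorted(
        ((i, j, cosine(u, v))
         for i, u in enumerate(vectors_a)
         for j, v in enumerate(vectors_b)),
        key=lambda t: t[2], reverse=True)
    used_a, used_b, result = set(), set(), []
    for i, j, s in pairs:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            result.append((i, j, s))
    return result

a = [[1.0, 0.0], [0.0, 1.0]]
b = [[0.0, 1.0], [1.0, 0.1]]
print(match_graphs(a, b))
```

Running the matcher at each scale and combining the per-scale scores would yield a multi-scale matching result in the spirit of the method described above.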
- the graph representation of each scale in the multi-scale graph representation may include at least one node, the node may include attributes, and the attributes of the node may include scalar type attributes and vector type attributes.
- the graph representation of at least one scale in the multi-scale graph representation may include at least one adjacent edge. Each adjacent edge is used to characterize the relative relationship between two nodes of the same scale. The adjacent edge has attributes, which include scalar-type attributes and vector-type attributes.
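The multi-scale structure described in the two points above can be captured in a small set of container types: per-scale node lists with scalar- and vector-type attributes, plus optional adjacent edges. All field and class names below are illustrative assumptions, not from the source.

```python
# Hypothetical container types for a multi-scale graph representation:
# each scale holds nodes (scalar + vector attributes) and optional
# adjacent edges between same-scale nodes. Names are illustrative.
from dataclasses import dataclass, field

@dataclass
class Node:
    scalar: float  # scalar-type attribute (e.g., a confidence)
    vector: list   # vector-type attribute (e.g., a feature vector)

@dataclass
class Edge:
    u: int                       # index of first node at this scale
    v: int                       # index of second node at this scale
    scalar: float = 0.0
    vector: list = field(default_factory=list)

@dataclass
class ScaleGraph:
    nodes: list = field(default_factory=list)   # list[Node]
    edges: list = field(default_factory=list)   # list[Edge]

@dataclass
class MultiScaleGraph:
    scales: list = field(default_factory=list)  # high-to-low list[ScaleGraph]

g = MultiScaleGraph(scales=[
    ScaleGraph(nodes=[Node(0.9, [0.1, 0.2])]),               # first (higher) scale
    ScaleGraph(nodes=[Node(0.8, [0.3, 0.4]), Node(0.7, [0.5, 0.6])],
               edges=[Edge(0, 1, scalar=1.0)]),              # second (lower) scale
])
print(len(g.scales), len(g.scales[1].edges))  # 2 1
```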
- the Nth round of training means that the network has undergone at least one round of training and thus has a certain inference ability, but it is not intended to limit the specific number of training rounds of the network.
- FIG. 10 shows a structural block diagram of a task processing device 1000 according to an embodiment of the present disclosure.
- the device 1000 includes: a first acquisition unit 1010 configured to acquire first data and second data, where the first data and the second data are each one of image data, audio data, text data, molecular structure data, and sequence data;
- the second acquisition unit 1020 is configured to acquire a graph representation of the first scale of each of the first data and the second data,
- the graph representation of the first scale includes at least one node of the first scale, wherein the node of the first scale has attributes, and the attributes of the nodes of the first scale include attributes of vector type;
- the third acquisition unit 1030 is configured to acquire a graph representation of a second scale of each of the first data and the second data, the second scale being lower than the first scale, the graph representation of the second scale including at least one node of the second scale, wherein the node of the second scale has attributes, and the attributes of the second-scale nodes include attributes of vector type;
- respective multi-scale graph representations, wherein each multi-scale graph representation is determined using the graph representation extraction network and includes a graph representation of the first scale and a graph representation of the second scale;
- the third graph matching unit 1130 is configured to perform graph matching on the graph representation of the first scale of the first sample data and the graph representation of the first scale of the second sample data to obtain a first current matching result that represents the matching degree at the first scale;
- the fourth graph matching unit 1140 is configured to perform graph matching on the graph representation of the second scale of the first sample data and the graph representation of the second scale of the second sample data to obtain a second current matching result that represents the matching degree at the second scale.
- FIG. 12 illustrates an example configuration of an electronic device 1200 that may be used to implement the methods described herein.
- Each of the above-described apparatus 1000 and apparatus 1100 may also be fully or at least partially implemented by an electronic device 1200 or similar device or system.
- Electronic device 1200 may be any of a variety of different types of devices. Examples of electronic devices 1200 include, but are not limited to: desktop computers, server computers, laptop or netbook computers, mobile devices (e.g., tablet computers, cellular or other wireless phones (e.g., smartphones), notepad computers, mobile stations), wearable devices (e.g., glasses, watches), entertainment devices (e.g., entertainment appliances, set-top boxes communicatively coupled to display devices, game consoles), televisions or other display devices, automotive computers, and the like.
- a display device 1208, such as a monitor, may be included for displaying information and images to a user.
- Other I/O devices 1210 may be devices that receive various inputs from the user and provide various outputs to the user, and may include touch input devices, gesture input devices, cameras, keyboards, remote controls, mice, printers, audio input/output devices, and so on.
- a cloud includes and/or represents a platform for resources.
- the platform abstracts the underlying functionality of the cloud's hardware (e.g., servers) and software resources.
- Resources may include applications and/or data that may be used while performing computing processing on a server remote from electronic device 1200 .
- Resources may also include services provided over the Internet and/or through subscriber networks such as cellular or Wi-Fi networks.
- the platform can abstract resources and functionality to connect electronic device 1200 with other electronic devices. Implementation of the functionality described herein can therefore be distributed throughout the cloud; for example, the functionality may be implemented partly on the electronic device 1200 and partly through a platform that abstracts the functionality of the cloud.
Abstract
Task processing method, neural network training method, apparatus, device, and medium. The task processing method includes: obtaining first data and second data; obtaining respective graph representations of a first scale and respective graph representations of a second scale of the first data and the second data, the second scale being lower than the first scale, the graph representation of each scale comprising nodes of that scale, the nodes of each scale comprising an attribute of vector type, nodes of at least one scale of each piece of data being obtained by sparsifying dense data corresponding to that data, and the graph representation of at least one scale of each piece of data comprising adjacent edges representing a relative relationship between nodes of that scale; performing graph matching at the first scale and the second scale on the first data and the second data, respectively, to obtain a first matching result and a second matching result; and determining a multi-scale matching result according to the first matching result and/or the second matching result, and further determining a task processing result.
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210488516.X | 2022-05-06 | ||
CN202210488516.XA CN117078977A (zh) | 2022-05-06 | 2022-05-06 | Task processing method, neural network training method, apparatus, device and medium |
CN202210488466.5 | 2022-05-06 | ||
CN202210488466.5A CN117077751A (zh) | 2022-05-06 | 2022-05-06 | Neural network training method, graph representation extraction method, and task processing method |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023213233A1 true WO2023213233A1 (fr) | 2023-11-09 |
Family
ID=88646265
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/091356 WO2023213233A1 (fr) | 2022-05-06 | 2023-04-27 | Procédé de traitement de tâche, procédé d'entraînement de réseau neuronal, appareil, dispositif et support |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2023213233A1 (fr) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020006961A1 (fr) * | 2018-07-03 | 2020-01-09 | 北京字节跳动网络技术有限公司 | Image extraction method and device |
EP3876114A2 (fr) * | 2020-12-25 | 2021-09-08 | Baidu Online Network Technology (Beijing) Co., Ltd. | Search term recommendation method, target model training method, search term recommendation apparatus, target model training apparatus, electronic device, and program product |
CN113536383A (zh) * | 2021-01-27 | 2021-10-22 | 支付宝(杭州)信息技术有限公司 | Method and apparatus for training a graph neural network based on privacy protection |
CN113609337A (zh) * | 2021-02-24 | 2021-11-05 | 腾讯科技(深圳)有限公司 | Pre-training method, training method, apparatus, device and medium for graph neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23799219; Country of ref document: EP; Kind code of ref document: A1 |