CN111737466A - Method for quantizing interactive information of deep neural network
- Publication number: CN111737466A
- Application number: CN202010558767.1A
- Authority: CN (China)
- Prior art keywords: unit, units, neural network, deep neural, sample
- Prior art date
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS › G06—COMPUTING; CALCULATING OR COUNTING › G06F—ELECTRIC DIGITAL DATA PROCESSING › G06F16/00—Information retrieval; Database structures therefor; File system structures therefor › G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data › G06F16/35—Clustering; Classification
- G—PHYSICS › G06—COMPUTING; CALCULATING OR COUNTING › G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS › G06N3/00—Computing arrangements based on biological models › G06N3/02—Neural networks › G06N3/04—Architecture, e.g. interconnection topology › G06N3/045—Combinations of networks
- G—PHYSICS › G06—COMPUTING; CALCULATING OR COUNTING › G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS › G06N3/00—Computing arrangements based on biological models › G06N3/02—Neural networks › G06N3/08—Learning methods
Abstract
The invention provides a method for quantifying deep neural network interaction information, comprising the following steps: S1, obtaining a sample from a natural language processing dataset, wherein the sample comprises a plurality of units, each unit corresponding to a word, and applying multiple rounds of aggregation to the units of the sample until they are aggregated into a single unit; S2, constructing, from the way units were aggregated during the multiple rounds of aggregation of the given sample in step S1, a tree diagram reflecting the inter-word interaction information modeled inside the deep neural network. The method can objectively quantify the interaction information among input-sample words modeled in the deep neural network and cluster adjacent units with significant interaction according to the interaction-information ratio, finally obtaining a tree-shaped hierarchical structure reflecting the inter-word interaction information modeled in the deep neural network, thereby providing a general method for further understanding deep neural networks.
Description
Technical Field
The invention relates to the technical field of deep learning, in particular to the application of deep neural networks in the field of natural language processing, and more particularly to a method for quantifying the interaction information of a deep neural network.
Background
At present, deep neural networks show excellent modeling capability on a variety of natural language processing tasks, but a deep neural network is generally regarded as a black-box model whose internal modeling logic is invisible. This makes it difficult to effectively evaluate the accuracy and reliability of the final decision results, so interpreting the internal modeling logic of neural networks has become an important research direction. In the field of natural language processing in particular, which interactions among input words the network models remains opaque, so decoupling and quantifying the interaction information among the words of an input sentence modeled by a deep neural network plays an important role in understanding the internal logic and decision mechanism of the network.
Disclosure of Invention
Therefore, the present invention is directed to overcoming the above-mentioned drawbacks of the prior art and providing a new method for quantifying deep neural network interaction information, for understanding the logic inherent in a deep neural network.
The invention discloses a method for quantifying deep neural network interaction information, which constructs a tree diagram quantifying the inter-word interaction information modeled by a deep neural network in a natural language processing task. The method comprises the following steps:
S1, obtaining a sample from a natural language processing dataset, wherein the sample comprises a plurality of units, each unit corresponding to a word, and applying multiple rounds of aggregation to the units of the sample until they are aggregated into a single unit; wherein each round of aggregation comprises: inputting the current sample into a deep neural network, which is used for a natural language processing task, and calculating the Shapley value of each unit of the current sample according to the output of the deep neural network; and calculating the interaction gain ratio between every two adjacent units based on the Shapley value of each unit, aggregating the two adjacent units with the largest interaction gain ratio into a new unit, and forming, together with the other units of the current sample, a new current sample for the next round of aggregation;
S2, constructing, from the way units were aggregated during the multiple rounds of aggregation of the given sample in step S1, a tree diagram reflecting the inter-word interaction information modeled inside the deep neural network. Preferably, a binary tree is constructed from bottom to top as follows: S31, all units of the sample form the first layer of leaf nodes of the binary tree; and S32, following the aggregation order, the new unit formed by each aggregation serves as the parent node of the two adjacent units that were aggregated, until the root node of the tree is formed.
The Shapley value of each unit in the current sample is a weighted average of that unit's marginal contributions over all possible sets formed from the other units of the current sample. The Shapley value of each unit in the current sample is determined as follows:
φ_v(a_i) = Σ_{S ⊆ N\{a_i}} [ |S|!·(|N|−|S|−1)! / |N|! ] · ( v(S ∪ {a_i}) − v(S) )

wherein v denotes the neural network, φ_v(a_i) denotes the Shapley value of the i-th unit a_i in the current sample, N denotes the set of all units in the current sample, |N| denotes the size of the set N, S denotes any possible set formed from units in the current sample other than the i-th unit a_i, |S| denotes the size of the set S, ! denotes the factorial, v(·) denotes the output of the deep neural network, v(S ∪ {a_i}) − v(S) denotes the marginal contribution of the i-th unit a_i to the set S, v(S ∪ {a_i}) denotes the output obtained when the set S plus the i-th unit a_i is input to the neural network, and v(S) denotes the output obtained when the set S is input to the neural network.
The interaction gain ratio between two adjacent units is the ratio of the interaction gain of the two adjacent units to the total interaction information involving the two units.
Preferably, the interaction gain ratio between two adjacent units is determined by:

r = B_between([S1],[S2]) / ( B_between([S1],[S2]) + B_between([S1]',[S1]) + B_between([S2],[S2]') + φ([S1]) + φ([S2]) )

wherein [S1] denotes the unit formed by aggregating all units in the set S1, and [S2] denotes the unit formed by aggregating all units in the set S2; [S1] and [S2] are two adjacent units; B_between([S1],[S2]) is the interaction gain between the two adjacent units [S1] and [S2]; [S1]' is the unit adjacent to [S1] on its left before the aggregation, and [S2]' is the unit adjacent to [S2] on its right before the aggregation; B_between([S1]',[S1]) is the interaction gain between units [S1]' and [S1], and B_between([S2],[S2]') is the interaction gain between units [S2] and [S2]'; the total interaction information involving units [S1] and [S2] consists of B_between([S1],[S2]), B_between([S1]',[S1]), B_between([S2],[S2]'), φ([S1]) and φ([S2]), wherein φ([S1]) and φ([S2]) are the Shapley values of units [S1] and [S2], respectively.
The interaction gain between two adjacent units is the difference between the interaction gain within the new unit formed by aggregating the two adjacent units and the interaction gains within the two adjacent units before the aggregation.
Preferably, the interaction gain between two adjacent units is determined by:

B_between([S1],[S2]) = B([S]) − B([S1]) − B([S2])

wherein [S] denotes the unit formed by aggregating all units in the set S, [S1] denotes the unit formed by aggregating all units in the set S1, [S2] denotes the unit formed by aggregating all units in the set S2, B(·) denotes the interaction gain within a unit, and B_between(·) denotes the interaction gain between units.
In some embodiments of the invention, the interaction gain within each unit is determined by:

B([S]) = φ_{(N\S)∪{[S]}}([S]) − Σ_{b∈S} φ_{(N\S)∪{b}}(b)

wherein [S] denotes the unit formed by aggregating all units in the set S, b is a unit in the set S, N\S denotes the set formed by the units in the set N other than those in the set S, φ_{(N\S)∪{[S]}}([S]) denotes the Shapley value of the unit [S] in the game whose players are the units in (N\S) ∪ {[S]}, and φ_{(N\S)∪{b}}(b) denotes the Shapley value of the unit b in the game whose players are the units in (N\S) ∪ {b}.
Compared with the prior art, the invention has the following advantages: it innovatively provides a way to quantitatively evaluate and understand the internal logic of a deep neural network. Drawing on game theory, the method objectively quantifies the interaction information among input-sample words modeled in the deep neural network, provides a dedicated index for evaluating the interaction gain ratio, and constructs a tree structure from it by clustering adjacent units with significant interaction according to the magnitude of that ratio. The result is a tree-shaped hierarchical structure reflecting the inter-word interaction information modeled in the deep neural network, which provides a general method for further understanding deep neural networks.
Drawings
Embodiments of the invention are further described below with reference to the accompanying drawings, in which:
FIG. 1 is a flow chart of the method for quantifying deep neural network interaction information according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a binary tree built from an example sample in the method for quantifying deep neural network interaction information according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail by embodiments with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Aiming at the black-box nature of current deep neural networks, the invention provides a method for quantifying the interaction information among input words modeled by a deep neural network, so as to objectively explain the inter-word interactions the network models and to help understand its internal logic and decision mechanism.
According to an embodiment of the present invention, as shown in FIG. 1, there is provided a method for quantifying deep neural network interaction information, comprising steps T1, T2, T3, T4 and T5, each of which is described in detail below.
In step T1, a deep neural network used for a natural language processing task is obtained.
In step T2, based on a given input sample, the Shapley value of each unit in the sample is calculated according to the output of the deep neural network feature layer; the given input sample is a sentence from a dataset related to the natural language processing domain and comprises a plurality of units, each unit of the initial given input sample consisting of one word. The output of the deep neural network feature layer may be the output of any feature layer of the deep neural network or the final output of the network.
The Shapley value, from cooperative game theory, is a scheme for fairly distributing the benefit obtained through a cooperation to each member according to that member's contribution. In the present invention, given a trained neural network v (i.e., the game), the input sample is N = {a_1, a_2, ..., a_n}, where n is the number of units contained in the input sample (each unit initially consisting of one word) and a_i (1 ≤ i ≤ n) denotes a single unit in the input sample (i.e., a player of the game). To obtain greater benefit, some of the players cooperate to form a coalition S (i.e., a set of some of the units), and the total benefit obtained by the coalition S in the game is v(S), i.e., the output value of the neural network when the input unit set is S. If a player a_i outside the coalition S also joins it, the total benefit finally obtained by the coalition is v(S ∪ {a_i}); then v(S ∪ {a_i}) − v(S) denotes the marginal contribution of player a_i to the coalition S. The Shapley value is the weighted average of a player's marginal contributions to the various possible coalitions S in the game v. The Shapley value of the i-th unit a_i in the input sample is denoted φ_v(a_i) and calculated by equation (1):

φ_v(a_i) = Σ_{S ⊆ N\{a_i}} [ |S|!·(|N|−|S|−1)! / |N|! ] · ( v(S ∪ {a_i}) − v(S) )    (1)

wherein v denotes the neural network, N denotes the set of all units in the current input sample, |N| denotes the size of the set N, S denotes any possible set formed from units in the current sample other than the i-th unit a_i, |S| denotes the size of the set S, ! denotes the factorial, and v(·) denotes the output of the deep neural network. Since the Shapley value distributes the benefit of the game to each member in proportion to its contribution, it measures the degree to which each unit of the input sample influences the final decision of the neural network: the larger the Shapley value, the larger the influence of that unit on the final decision, and vice versa.
Preferably, φ_v(a_i) is calculated by sampling. For example, for a given input sample containing n units, denoted as the set N = {a_1, a_2, ..., a_n}, suppose the Shapley value φ_v(a_i) of unit a_i is currently to be computed. The set N\{a_i} formed by the remaining units is sampled once to obtain a set S; the word vectors of the units not sampled (i.e., the units not included in S) are masked to zero vectors, giving the masked sample S, which is fed into the deep neural network to obtain v(S). Similarly, if the unit a_i is added to the sampled set S, feeding the masked sample S ∪ {a_i} into the network gives v(S ∪ {a_i}), and v(S ∪ {a_i}) − v(S) is the marginal contribution of the current unit a_i. According to one embodiment of the invention, after sampling M times (M ≤ 2^(n−1)), the average of the M marginal contributions is taken as the Shapley value φ_v(a_i) of unit a_i. In the same way, the Shapley value of every unit in the input sample is obtained.
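The sampling procedure above can be sketched in a few lines of Python. This is a minimal illustrative sketch rather than the patented implementation: the callable `model` and the array `embeddings` are assumed names, `model` stands for the output v(·) of the chosen feature layer on a (partially zero-masked) matrix of word vectors, and coalitions S ⊆ N\{a_i} are simply drawn at random before averaging marginal contributions, as described above. Note that equation (1) weights coalitions by size, which an exact enumeration or a permutation-based sampler would reproduce.

```python
import numpy as np

def masked_output(model, embeddings, keep):
    """v(S): network output on the sample with every word vector outside
    `keep` masked to the zero vector, per the masking scheme above."""
    x = np.zeros_like(embeddings)
    idx = list(keep)
    x[idx] = embeddings[idx]
    return model(x)

def sampled_shapley(model, embeddings, i, num_samples=100, rng=None):
    """Monte Carlo estimate of the Shapley value phi_v(a_i): average the
    marginal contribution v(S u {a_i}) - v(S) over sampled coalitions S."""
    rng = rng or np.random.default_rng(0)
    others = [j for j in range(embeddings.shape[0]) if j != i]
    total = 0.0
    for _ in range(num_samples):
        S = {j for j in others if rng.random() < 0.5}  # random coalition from N\{a_i}
        total += (masked_output(model, embeddings, S | {i})
                  - masked_output(model, embeddings, S))
    return total / num_samples
```

For a five-word sentence, `embeddings` would be a 5×d matrix of word vectors, and `sampled_shapley(model, embeddings, 0)` would estimate φ_v(a_1).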
In step T3, the interaction gain ratio between any two adjacent units in the given input sample is calculated. For a given game (in the present invention, the trained deep neural network v), several players may form an indivisible whole (in the present invention, a whole [S] formed by some units of the given input sample); that is, the whole [S] is treated as a single player, where S denotes the union of those players (in the present invention, the players are units and S is the set of those units). The interaction gain B([S]) contained in the player [S] is then:

B([S]) = φ_{(N\S)∪{[S]}}([S]) − Σ_{b∈S} φ_{(N\S)∪{b}}(b)    (2)

wherein b is an element of the set S and [S] is the unit formed by aggregating all units in the set S. v_{(N\S)∪{[S]}} denotes the game v in which the actual players are the members of N minus the members of S, plus the player [S] (in the present invention, the output of the deep neural network v when its input is the units of the given input sample set N minus the units of S, plus the unit [S]); similarly, v_{(N\S)∪{b}} denotes the game v in which the actual players are the members of N minus the members of S, plus the player b (the output of the network when its input is the units of N minus the units of S, plus the unit b). φ_{(N\S)∪{[S]}}([S]) denotes the Shapley value of the unit [S] under v_{(N\S)∪{[S]}}, and likewise φ_{(N\S)∪{b}}(b) denotes the Shapley value of the unit b under v_{(N\S)∪{b}}; Shapley values are calculated as in equation (1).
According to one example of the present invention, assume the set of all units in the input sample is N = {a_1, a_2, ..., a_n}, and consider a unit formed by aggregating two adjacent units. This unit (containing several words) and the remaining units of the sample together form an aggregated sample; the Shapley value of the aggregated unit can be calculated on this aggregated sample, and the interaction gain within the unit is then obtained according to equation (2).
because the units in the input samples are continuously aggregated, the number of the units contained in the aggregated samples is continuously reduced until the units are finally aggregated into a unit (namely, the whole input sample is taken as a unit). In this polymerization process, if two adjacent units [ S ]1],[S2]Are polymerized into a unit [ S ]]Then, the interactive gain index between the three is the following equation:
B([S])=B([S1])+B([S2])+Bbetween([S1],[S2]) (3)
B([S1]),B([S2]) Are respectively two [ S1]、[S1]Inter-gain within a cell, Bbetween([S1],[S2]) Is the interaction gain between two units, then Bbetween([S1],[S2]) This can be derived as follows:
Bbetween([S1],[S2])=B([S])-B([S1])-B([S2])
finally, the interaction gain B between two adjacent cells can be calculatedbetweenThe ratio r of all the interaction information interacting with these two units. When two units [ S ]1],[S2]Are aggregated into a unit [ S ]]Time, note unit [ S1]The left adjacent unit before being polymerized is [ S ]1]', unit [ S2]The unit adjacent to the right before being polymerized is [ S ]2]', unit [ S1],[S2]The gain of the interaction between is Bbetween([S1],[S2]) Unit [ S ]1]',[S1]The gain of the interaction between is Bbetween([S1]',[S1]) Unit [ S ]2],[S2]' the gain of interaction between Bbetween([S2],[S2]'), and the unit [ S ]1],[S2]The related total interaction information is Bbetween([S1],[S2])、Bbetween([S1]',[S1])、Bbetween([S2],[S2]')、φ([S1])、φ([S2]) Wherein phi ([ S ]1]),φ([S2]) Are respectively a unit [ S1]、[S2]A value of salpril of [ S ]1]、[S2]The interactive gain ratio between the two is:
in step T4, two adjacent cells with the maximum inter-gain ratio are aggregated to form a new cell, and the new cell and the remaining other cells in the sample together form a once aggregated sample; the steps T2 to T4 are repeated with the aggregated sample as the new given input sample, and the iteration is continued until the aggregated sample contains only one unit, which obviously contains all the words in the initial sample.
For example, according to an example of the present invention, if the set of all units in the input sample is N = {a_1, a_2, ..., a_n}, the units of the set can form (n−1) pairs of adjacent units. The two adjacent units with the largest interaction gain ratio, say a_i and a_{i+1}, are aggregated into a unit [{a_i, a_{i+1}}], which together with the remaining (n−2) units forms the aggregated sample N' = {a_1, ..., a_{i−1}, [{a_i, a_{i+1}}], a_{i+2}, ..., a_n}. Obviously, N' contains n−1 units. Repeating the iteration finally yields a sample N_root = [{a_1, a_2, ..., a_n}] containing only one unit, and this unit contains all the words a_1, a_2, ..., a_n of the given input sample.
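Steps T2 to T4 thus amount to a greedy bottom-up loop. A compact sketch continuing the code above (`gain_ratio` from the previous sketch), recording each merge so that the tree of step T5 can be rebuilt afterwards:

```python
def aggregate(model, embeddings):
    """Greedy aggregation of steps T2-T4: repeatedly merge the adjacent pair of
    units with the largest interaction gain ratio until a single unit remains.
    Returns the merge record, in order, for building the tree of step T5."""
    partition = [(w,) for w in range(embeddings.shape[0])]  # one unit per word
    merges = []
    while len(partition) > 1:
        ratios = [gain_ratio(model, embeddings, partition, i)
                  for i in range(len(partition) - 1)]
        i = max(range(len(ratios)), key=ratios.__getitem__)
        merges.append((partition[i], partition[i + 1]))
        partition[i:i + 2] = [partition[i] + partition[i + 1]]  # merge in place
    return merges
```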
In step T5, a binary tree embodying the tree-shaped hierarchy is built from the process of continually aggregating the units of the input sample. Obviously, the leaf nodes of the tree are the words of the input sample; each time two units are aggregated into one, an internal node of the tree is formed; and as the aggregation proceeds, the root node is finally formed. A binary tree can thus be built bottom-up.
According to one example of the invention, a binary tree is constructed for the given input sample {the, sun, is, coming, out}. The sample composed of the units the, sun, is, coming and out is input into a deep neural network trained on a natural language processing dataset, and the units of the sample are aggregated step by step according to the network's output. In the first aggregation, the units "the" and "sun" are aggregated into a new unit "the sun", which together with the remaining units "is", "coming" and "out" forms the new sample {the sun, is, coming, out}. In the second aggregation, the units "the sun" and "is" are aggregated into a new unit "the sun is", giving the aggregated sample {the sun is, coming, out}. In the third aggregation, the units "coming" and "out" are aggregated into a new unit "coming out", giving the aggregated sample {the sun is, coming out}. In the fourth aggregation, "the sun is" and "coming out" are aggregated into the unit "the sun is coming out", which is the root node of the tree. The binary tree corresponding to the sample {the, sun, is, coming, out} constructed from this aggregation process is shown in FIG. 2; displaying the inter-word interaction information modeled by the deep neural network through such a binary tree helps in understanding the internal logic of the network.
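A small helper can replay such a merge record into the nested structure of FIG. 2; the record below is the aggregation order of the example just described, with word indices 0 to 4 standing for {the, sun, is, coming, out}:

```python
def to_tree(words, merges):
    """Replay the merge record bottom-up: leaves are words, each merge creates a
    parent node over its two children, and the last merge yields the root."""
    nodes = {(i,): w for i, w in enumerate(words)}
    for left, right in merges:
        nodes[left + right] = (nodes.pop(left), nodes.pop(right))
    (root,) = nodes.values()
    return root

words = ["the", "sun", "is", "coming", "out"]
merges = [((0,), (1,)),         # 1st aggregation: the + sun
          ((0, 1), (2,)),       # 2nd: the sun + is
          ((3,), (4,)),         # 3rd: coming + out
          ((0, 1, 2), (3, 4))]  # 4th: the sun is + coming out (root)
print(to_tree(words, merges))
# ((('the', 'sun'), 'is'), ('coming', 'out'))
```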
The method provided by the invention explains the internal logic of a neural network through a hierarchical structure: it objectively quantifies the interaction information among input-sample words modeled in the deep neural network and clusters adjacent units with significant interaction according to the interaction-information ratio, finally obtaining a tree-shaped hierarchical structure reflecting the inter-word interaction information modeled in the deep neural network, and thereby providing a general approach for further understanding deep neural networks. The tree diagram can be constructed in this way for any deep neural network used for a natural language processing task, so as to understand the network's internal logic.
It should be noted that, although the steps are described in a specific order, the steps are not necessarily performed in the specific order, and in fact, some of the steps may be performed concurrently or even in a changed order as long as the required functions are achieved.
The present invention may be a system, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied therewith for causing a processor to implement various aspects of the present invention.
The computer readable storage medium may be a tangible device that retains and stores instructions for use by an instruction execution device. The computer readable storage medium may include, for example, but is not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing.
Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (11)
1. A method for quantifying deep neural network interaction information, used for constructing a tree diagram that quantifies the inter-word interaction information modeled by a deep neural network in a natural language processing task, characterized in that the method comprises the following steps:
S1, obtaining a sample from a natural language processing dataset, wherein the sample comprises a plurality of units, each unit corresponding to a word, and applying multiple rounds of aggregation to the units of the sample until they are aggregated into a single unit;
wherein each round of aggregation comprises:
inputting the current sample into a deep neural network, which is used for a natural language processing task, and calculating the Shapley value of each unit of the current sample according to the output of the deep neural network;
calculating the interaction gain ratio between every two adjacent units based on the Shapley value of each unit, aggregating the two adjacent units with the largest interaction gain ratio into a new unit, and forming, together with the other units of the current sample, a new current sample for the next round of aggregation;
S2, constructing, from the way units were aggregated during the multiple rounds of aggregation of the given sample in step S1, a tree diagram reflecting the inter-word interaction information modeled inside the deep neural network.
2. The method of claim 1, wherein the Shapley value of each unit in the current sample is a weighted average of that unit's marginal contributions over all possible sets formed from the other units of the current sample.
3. The method for quantifying deep neural network interaction information as claimed in claim 2, wherein the Shapley value of each unit in the current sample is determined by:

φ_v(a_i) = Σ_{S ⊆ N\{a_i}} [ |S|!·(|N|−|S|−1)! / |N|! ] · ( v(S ∪ {a_i}) − v(S) )

wherein v denotes the neural network, φ_v(a_i) denotes the Shapley value of the i-th unit a_i in the current sample, N denotes the set of all units in the current sample, |N| denotes the size of the set N, S denotes any possible set formed from units in the current sample other than the i-th unit a_i, |S| denotes the size of the set S, ! denotes the factorial, v(·) denotes the output of the deep neural network, v(S ∪ {a_i}) − v(S) denotes the marginal contribution of the i-th unit a_i to the set S, v(S ∪ {a_i}) denotes the output obtained when the set S plus the i-th unit a_i is input to the neural network, and v(S) denotes the output obtained when the set S is input to the neural network.
4. The method of claim 3, wherein the interaction gain ratio between two adjacent units is the ratio of the interaction gain of the two adjacent units to the total interaction information involving the two units.
5. The method of claim 4, wherein the interaction gain ratio between two adjacent units is determined by:

r = B_between([S1],[S2]) / ( B_between([S1],[S2]) + B_between([S1]',[S1]) + B_between([S2],[S2]') + φ([S1]) + φ([S2]) )

wherein [S1] denotes the unit formed by aggregating all units in the set S1, and [S2] denotes the unit formed by aggregating all units in the set S2; [S1] and [S2] are two adjacent units; B_between([S1],[S2]) is the interaction gain between the two adjacent units [S1] and [S2]; [S1]' is the unit adjacent to [S1] on its left before the aggregation, and [S2]' is the unit adjacent to [S2] on its right before the aggregation; B_between([S1]',[S1]) is the interaction gain between units [S1]' and [S1], and B_between([S2],[S2]') is the interaction gain between units [S2] and [S2]'; the total interaction information involving units [S1] and [S2] consists of B_between([S1],[S2]), B_between([S1]',[S1]), B_between([S2],[S2]'), φ([S1]) and φ([S2]), wherein φ([S1]) and φ([S2]) are the Shapley values of units [S1] and [S2], respectively.
6. The method of claim 5, wherein the interaction gain between two adjacent units is the difference between the interaction gain within the new unit formed by aggregating the two adjacent units and the interaction gains within the two adjacent units before the aggregation.
7. The method of claim 6, wherein the interaction gain between two adjacent units is determined by:

B_between([S1],[S2]) = B([S]) − B([S1]) − B([S2])

wherein [S] denotes the unit formed by aggregating all units in the set S, [S1] denotes the unit formed by aggregating all units in the set S1, [S2] denotes the unit formed by aggregating all units in the set S2, B(·) denotes the interaction gain within a unit, and B_between(·) denotes the interaction gain between units.
8. The method for quantifying deep neural network interaction information according to claim 7, wherein the interaction gain within each unit is determined by:

B([S]) = φ_{(N\S)∪{[S]}}([S]) − Σ_{b∈S} φ_{(N\S)∪{b}}(b)

wherein [S] denotes the unit formed by aggregating all units in the set S, b is a unit in the set S, N\S denotes the set formed by the units in the set N other than those in the set S, φ_{(N\S)∪{[S]}}([S]) denotes the Shapley value of the unit [S] in the game whose players are the units in (N\S) ∪ {[S]}, and φ_{(N\S)∪{b}}(b) denotes the Shapley value of the unit b in the game whose players are the units in (N\S) ∪ {b}.
9. The method for quantifying deep neural network interaction information according to any one of claims 1 to 8, wherein in step S2 the tree diagram is constructed as a binary tree as follows:
S31, all units of the sample form, from bottom to top, the first layer of leaf nodes of the binary tree;
and S32, following the aggregation order, the new unit formed by each aggregation serves as the parent node of the two adjacent units that were aggregated, until the root node of the tree is formed.
10. A computer-readable storage medium having a computer program embodied thereon, wherein the computer program is executable by a processor to implement the steps of the method of any one of claims 1 to 9.
11. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs which, when executed by the one or more processors, cause the electronic device to carry out the steps of the method according to any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010558767.1A | 2020-06-18 | 2020-06-18 | Method for quantizing interactive information of deep neural network
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010558767.1A | 2020-06-18 | 2020-06-18 | Method for quantizing interactive information of deep neural network
Publications (2)
Publication Number | Publication Date |
---|---|
CN111737466A (en) | 2020-10-02
CN111737466B (en) | 2022-11-29
Family ID: 72649650
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010558767.1A | Method for quantizing interactive information of deep neural network | 2020-06-18 | 2020-06-18
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111737466B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180253646A1 (en) * | 2017-03-05 | 2018-09-06 | International Business Machines Corporation | Hybrid aggregation for deep learning neural networks |
CN108875024A (en) * | 2018-06-20 | 2018-11-23 | 清华大学深圳研究生院 | File classification method, system, readable storage medium storing program for executing and electronic equipment |
CN109299262A (en) * | 2018-10-09 | 2019-02-01 | 中山大学 | A kind of text implication relation recognition methods for merging more granular informations |
CN109858032A (en) * | 2019-02-14 | 2019-06-07 | 程淑玉 | Merge more granularity sentences interaction natural language inference model of Attention mechanism |
CN110866113A (en) * | 2019-09-30 | 2020-03-06 | 浙江大学 | Text classification method based on sparse self-attention mechanism fine-tuning Bert model |
Non-Patent Citations (2)
Title |
---|
ZHANG Xiaoqing et al.: "Utility allocation strategy for virtualized resources based on cooperative game", Computer Science *
CHENG Shuyu et al.: "Research on natural language inference fusing attention and multi-granularity sentence interaction", Journal of Chinese Computer Systems *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112200320A (en) * | 2020-12-02 | 2021-01-08 | 成都数联铭品科技有限公司 | Model interpretation method, system, equipment and storage medium based on cooperative game method |
CN118378667A (*) | 2024-04-01 | 2024-07-23 | 佛山科学技术学院 | NAS neural network design method and system based on Shapley values
Also Published As
Publication number | Publication date |
---|---|
CN111737466B (en) | 2022-11-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |