CN115994221A - Memristor-based text emotion detection system and method - Google Patents

Memristor-based text emotion detection system and method

Info

Publication number
CN115994221A
Authority
CN
China
Prior art keywords
circuit
attention
memory unit
convolution
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310091466.6A
Other languages
Chinese (zh)
Inventor
周跃
肖和
胡小方
洪浩钦
段书凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest University
Original Assignee
Southwest University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest University
Priority to CN202310091466.6A
Publication of CN115994221A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to the technical field of text emotion recognition, and particularly discloses a memristor-based text emotion detection system and method. The linear mapping and matrix multiplication operations in the system are constructed with memristive crossbar arrays, and the remaining circuit modules are constructed with CMOS technology. Compared with a traditional computing architecture (GPU), the hardware scheme designed by the invention has lower power consumption and eliminates the extra circuit area overhead caused by separating memory and computing units. The invention further provides a lightweight network model named MLA-DSCN to perform the text emotion detection task; the improved MLA-DSCN not only attends to the global information of the input text but also improves information extraction at the local level, and reduces the number of network parameters threefold without sacrificing accuracy.

Description

Memristor-based text emotion detection system and method
Technical Field
The invention relates to the technical field of text emotion recognition, in particular to a memristor-based text emotion detection system and method.
Background
With the rapid development of Web 2.0 and 5G networks, social media on the Internet has affected many aspects of people's lives. As the number of users increases, a large amount of data in different formats, including photos, voice and text, is uploaded to the network for sharing. As a result, the wealth of available text data has recently received much attention.
Driven by the expansion of NLP tasks (emotion analysis, machine reading comprehension), deep neural networks (DNNs), including recurrent neural networks (RNNs) and long short-term memory (LSTM) networks, are widely used to process text data. While RNN and LSTM networks solve some of the problems in text information processing, they still do not fully capture context. Attention-based networks are therefore used for NLP tasks, and they outperform RNN and LSTM networks in processing contextual information. The Transformer, a network model that uses multi-head attention as both encoder and decoder, is becoming a computational paradigm in NLP. Recently, several works have applied pre-trained BERT models, a Transformer variant commonly used in NLP, to detect emotion in conversational tasks with good results. However, as the parameters of attention-based network models grow rapidly, more memory and computing resources are required when deploying them to embedded devices. On the other hand, the classical von Neumann architecture separates computation and memory units, a bottleneck known as the "memory wall", which increases energy overhead and computation latency when transferring large amounts of data. This indicates that practical text emotion detection systems are severely hampered by the memory wall. Meanwhile, social media and other applications already run on embedded devices such as cell phones, tablet computers and even smartwatches, so deploying a text emotion detection system on an embedded device is a very promising direction for embedded applications. To overcome the memory-wall bottleneck, however, a new computing architecture is needed for the embedded emotion detection system; the memristor-based neuromorphic computing system of this embodiment provides a feasible solution.
Memristors are ideal in-memory computing devices because they are two-terminal devices with dynamic resistance, low power consumption and easy integration. In addition, memristive crossbar arrays have proven useful for hardware deployment of convolutional neural networks (CNNs) and are widely used in various intelligent applications. However, only a few hardware solutions have been proposed for implementing attention-based architectures. A character recognition task has been completed with memristor circuits, demonstrating that memristors can be used to deploy attention-based networks in hardware, but much work remains. First, a memristor hardware implementation of a multi-layer attention network is still an open problem. Furthermore, implementing text emotion detection on embedded devices remains a significant challenge due to the large number of network parameters and the computing resource overhead.
In short, emotion detection in text conversations is widely used in the field of human-computer interaction. Unlike detecting a single sentence, detecting the latent emotion in a dialog requires modeling the dependency between context and local information. However, conventional emotion detection networks do not focus on the inherent links between the local and global parts of the conversation. Meanwhile, because emotion detection networks have many parameters and text information is large in scale, deploying a text emotion detection system on embedded devices is still challenging.
Disclosure of Invention
The invention provides a memristor-based text emotion detection system and method, which address the following technical problems: how to capture the inherent links between the local and global parts of a conversation, and how to reduce the number of network parameters and the system area.
In order to solve the technical problems, the invention provides a text emotion detection system based on a memristor, which comprises a preprocessing module, an attention calculating module and an emotion classification module;
the preprocessing module is used for receiving utterance text data and carrying out position encoding and token embedding on the utterance text data to obtain embedded text data;
the attention computation module includes a multi-layer attention circuit and a forward propagation circuit; the attention circuit of the first layer is used for extracting the multi-head attention of the embedded text data, the forward propagation circuit is used for propagating the output of the attention circuit of the current layer to the attention circuit of the next layer to be used as input until the attention circuit of the last layer, and the final attention is obtained and output to the emotion classification module;
the emotion classification module comprises a max pooling circuit, a third layer regularization circuit, a first point-wise convolution circuit, a depth convolution circuit, a Swish activation function circuit, a second point-wise convolution circuit, a fourth layer regularization circuit, a fully connected layer circuit and a second SoftMax activation function circuit which are sequentially connected, and is used for performing the corresponding max pooling, regularization, point-wise convolution, depth convolution, Swish activation, point-wise convolution, regularization, full connection and SoftMax activation operations on their respective inputs and finally outputting the emotion detection result;
The attention extraction operation and the weight mapping operation in the attention computation module, as well as the first point-wise convolution circuit, the depth convolution circuit, the second point-wise convolution circuit and the fully connected layer circuit, are all constructed from memristive crossbar arrays; every two columns of the memristive crossbar array correspond to a pair of positive and negative weights in the neural network, and the output of every two columns corresponds to one output voltage.
Specifically, the attention circuit comprises a multi-head attention circuit constructed from a memristive crossbar array, a first memory unit module circuit for storing voltage signals, a first multiply-accumulate circuit and a second multiply-accumulate circuit connected with the first memory unit module circuit, and a first SoftMax activation function circuit. The multi-head attention circuit performs multi-head attention extraction on the embedded text data and outputs multi-head Q, K, V signals; the first multiply-accumulate circuit multiply-accumulates the multi-head Q, K signals to obtain the total Q, K signal; the second multiply-accumulate circuit multiply-accumulates the multi-head V signals to obtain the total V signal; the first SoftMax activation function circuit applies the SoftMax function to the total Q, K signal, and the result is multiplied by the total V signal and then output to the forward propagation circuit;
The output of the attention circuit is expressed as:
Attention = Concat{softmax(Q_h·K_h^T/√d_h)·V_h}·W_Z,
where Concat{} denotes concatenation, softmax() denotes the softmax function, d_h = d/h is the dimension of each head, h is the number of heads, d is the dimension of the inputs Q, K, V, Q_h, K_h, V_h denote the Q, K, V values of the single-head outputs, and W_Z denotes the mapping matrix.
Specifically, the forward propagation circuit comprises a first weight mapping circuit, a first layer regularization circuit, a second weight mapping circuit, a third weight mapping circuit and a second layer regularization circuit which are sequentially connected, and further comprises a second memory unit module circuit; the second regularization circuit is used for connecting the attention circuit of the next layer, and the second memory unit module circuit is used for storing the attention obtained by each calculation in the forward propagation process;
the first weight mapping circuit, the second weight mapping circuit and the third weight mapping circuit are all constructed based on the memristive cross array;
the forward propagation process of the forward propagation circuit is expressed as:
Forward = LN{W_3·[W_2·(LN(W_1·Attention + b_1)) + b_2] + b_3},
where Forward denotes the output of forward propagation, Attention denotes the input of forward propagation, W_1, b_1 denote the weight and bias of the first weight mapping circuit, W_2, b_2 denote the weight and bias of the second weight mapping circuit, W_3, b_3 denote the weight and bias of the third weight mapping circuit, and LN() denotes the regularization function.
The output of a single channel of the depth convolution circuit is expressed as
DS_i^u = Σ_{j=1}^{k} W_dw^{u,j}·X_{i+j-1}^u,
where k is the kernel width of the depth convolution, W_dw^{u,j} denotes the kernel weights, X denotes the input of the depth convolution circuit, and U denotes the number of channels; the output of the depth convolution circuit is the collection of the U single-channel outputs above;
the output of the second point-wise convolution circuit is expressed as
V_pw2 = W_pw2·Swish{W_dw ⊛ [W_pw1·(W_P1·Max(Vatten_m,n))]},
where ⊛ denotes the depth convolution operation, W_pw1, W_pw2 and W_dw denote the convolution-kernel weights of the first point-wise convolution circuit, the second point-wise convolution circuit and the depth convolution circuit, respectively, W_P1 denotes the weights of the third layer regularization circuit, Max(Vatten_m,n) denotes the output of the max pooling circuit, and Swish() denotes the Swish function.
Specifically, the emotion classification module further comprises a third memory unit module circuit connected between the max pooling circuit and the third layer regularization circuit, a fourth memory unit module circuit connected between the first point-wise convolution circuit and the depth convolution circuit, and a fifth memory unit module circuit connected between the first point-wise convolution circuit and the fourth layer regularization circuit; the fifth memory unit module circuit is used for storing the output signal of the preceding circuit and providing it to the following circuit.
Specifically, the first memory unit module circuit, the second memory unit module circuit, the third memory unit module circuit, the fourth memory unit module circuit and the fifth memory unit module circuit are all constructed based on memory unit module circuits, the memory unit module circuits comprise a plurality of memory unit circuits which are connected in an array manner, and the number of the memory unit circuits is the same as that of voltage signals to be stored;
for the memory unit circuits of each row of the memory unit module circuit, all the voltage signal input ends are connected together, all the voltage signal output ends are connected together, all the control ends of the first MOS switches are connected together, and all the control ends of the second MOS switches are connected together; the memory unit module circuit sequentially inputs and outputs voltages according to the beats of the clock.
Specifically, the memory unit circuit comprises a first MOS switch, a second MOS switch, a first operational amplifier and a capacitor; the control terminals of the first MOS switch and the second MOS switch receive clock control signals; the inverting input of the first operational amplifier is connected to the output of the first MOS switch; the non-inverting input of the first operational amplifier is connected through the capacitor to ground; the output of the first operational amplifier is connected to the input of the second MOS switch; the input of the first MOS switch serves as the voltage signal input of the memory unit circuit, and the output of the second MOS switch serves as the voltage signal output of the memory unit circuit;
Specifically, the memristor cross array adopts a memristor model with a 2M structure.
Specifically, the text emotion detection system is trained on both the Friends and EmotionPush datasets, with weighted cross entropy used as the training loss.
The invention also provides a text emotion detection method based on the memristor, which comprises the following steps: constructing a text emotion detection system, training and testing the constructed text emotion detection system, inputting a text to be detected to the trained text emotion detection system, and outputting an emotion detection result by the text emotion detection system.
The invention provides a memristor-based text emotion detection system and method, whose main contributions are:
1. A text emotion detection system (called MTEDS) with software-hardware co-design is provided, realizing the hardware deployment of the network model MLA-DSCN. The linear mapping and matrix multiplication operations in the system are constructed with memristive crossbar arrays, and the remaining circuit modules are constructed with CMOS technology. Compared with a traditional computing architecture (GPU), the hardware scheme designed by the invention has lower power consumption and eliminates the extra circuit area overhead caused by separating memory and computing units;
2. A lightweight network model named MLA-DSCN is proposed to perform the text emotion detection task. By fusing a depth-wise separable convolution module into the network, the improved MLA-DSCN attends to the global information of the input text while improving information extraction at the local level, and reduces the number of network parameters threefold without sacrificing accuracy.
Drawings
FIG. 1 is a graph of memristance versus memory cell during programming provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of the software and hardware collaborative design of the MTEDS provided by the embodiment of the invention;
FIG. 3 is a frame diagram and an application diagram of an MTEDS system provided by an embodiment of the present invention;
FIG. 4 is a block diagram of a text emotion detection module provided by an embodiment of the present invention;
FIG. 5 is a circuit diagram of an MMLAN provided by an embodiment of the present invention;
FIG. 6 is a schematic diagram of a 2M structure memristor cross array provided by an embodiment of the present disclosure;
fig. 7 is a circuit diagram of an analog MUM provided by an embodiment of the present invention;
fig. 8 is a circuit diagram of an MEC provided by an embodiment of the invention;
FIG. 9 is an input and output presentation of the layers of the training phase network provided by an embodiment of the present invention;
FIG. 10 is an analysis diagram of an activation function circuit provided by an embodiment of the present invention;
FIG. 11 is a diagram showing performance of a MUM circuit according to embodiments of the present invention;
FIG. 12 is a graph of simulation results of memristors at different LHRs provided by an embodiment of the present invention;
FIG. 13 is a diagram illustrating a trade-off analysis for different memory bank sizes according to an embodiment of the present invention.
Detailed Description
The following examples are given for illustration only and are not to be construed as limiting the invention; the accompanying drawings are for reference and description only and likewise do not limit the scope of the invention, since many variations are possible without departing from its spirit and scope.
With the development of artificial intelligence, attention mechanisms have gradually become an important component of neural networks. Conventional single-head attention is calculated as
Attention(Q, K, V) = SoftMax(Q·K^T/√d)·V,   (1)
where the input feature information X is mapped after word embedding to the query matrix Q, the key matrix K and the value matrix V, d is the dimension of the inputs Q, K, V, and SoftMax() denotes the SoftMax function.
Multi-head attention concatenates (Concat) multiple single-head attention outputs and then maps them to the final output with the matrix W_Z. The calculation is given by formula (2), where h denotes the number of heads and d_h = d/h is the dimension of each head:
MultiHead(Q, K, V) = Concat{SoftMax(Q_h·K_h^T/√d_h)·V_h}·W_Z.   (2)
Furthermore, the multi-head attention mechanism is only a dimensional extension of the single-head attention mechanism; therefore, a hardware implementation strategy for single-head attention can be used to demonstrate the effectiveness of the multi-head attention mechanism.
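As a software-level reference for equations (1) and (2), the following minimal PyTorch sketch reproduces the single-head/multi-head computation; the model width and head count are illustrative assumptions, not values stated in the patent.

```python
import math
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """Concat{softmax(Q_h K_h^T / sqrt(d_h)) V_h} W_Z; width and head count are assumptions."""
    def __init__(self, d: int = 768, heads: int = 8):
        super().__init__()
        assert d % heads == 0
        self.h, self.d_h = heads, d // heads
        self.wq = nn.Linear(d, d)   # linear mapping to Q
        self.wk = nn.Linear(d, d)   # linear mapping to K
        self.wv = nn.Linear(d, d)   # linear mapping to V
        self.wz = nn.Linear(d, d)   # output mapping W_Z

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, d = x.shape
        def split(t):  # (b, n, d) -> (b, h, n, d_h)
            return t.view(b, n, self.h, self.d_h).transpose(1, 2)
        q, k, v = split(self.wq(x)), split(self.wk(x)), split(self.wv(x))
        scores = torch.softmax(q @ k.transpose(-2, -1) / math.sqrt(self.d_h), dim=-1)
        heads = (scores @ v).transpose(1, 2).reshape(b, n, d)   # Concat of single-head outputs
        return self.wz(heads)

out = MultiHeadAttention()(torch.randn(2, 32, 768))   # a toy batch of embedded text
```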
One of the benchmarks for emotion detection in conversation before the advent of attention mechanisms was a CNN-DCN-based self-encoding emotion classifier. As attention mechanisms achieved significant performance in different areas of learning, they began to be used for emotion detection tasks. Furthermore, pre-trained language models have been found more suitable for NLP tasks (emotion analysis, machine reading comprehension) because they are pre-trained on a sizeable unlabeled corpus to provide deep contextual embeddings, which makes methods built on them perform better. Under the same pre-training model architecture, sentence and dialogue embedding techniques and classifier selection have a substantial impact on the final emotion detection results. Traditional embedding methods in NLP include Bag-of-Words (BOW), term frequency-inverse document frequency (TFIDF) and WordPiece. Classification methods include Random Forest (RF) and TextCNN, whose initial word embedding is GloVe and whose classifier is implemented with the SELU activation function. Furthermore, CNNs have recently been widely used in attention-based deep learning, achieving SOTA results in most sub-field tasks.
Therefore, in this embodiment, a point-wise convolution with a kernel size of 1×1 is applied across the different channels for all input corpora, while in the depth convolution all features of the same channel undergo the convolution procedure. The output of the depth-wise separable convolution module is denoted DS_U, with depth convolution weights W_dw, where U is the output dimension of the depth convolution, k is the kernel width of the depth convolution, and X is the input feature:
DS_i^U = Σ_{j=1}^{k} W_dw^{U,j}·X_{i+j-1}.   (3)
In many works, the Swish activation function shown in equation (4) below achieves better results than the ReLU activation function. The hyper-parameter β determines the shape of the function: when β = 0.1 the Swish function is close to a linear function, and when β → ∞ it approaches the ReLU activation function; x denotes the input feature and sigmoid() denotes the sigmoid function.
Swish(x)=x·sigmoid(βx) (4)
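A short numerical sketch of this behaviour is given below (PyTorch is assumed for consistency with the training platform mentioned later; the sample points are arbitrary).

```python
import torch

def swish(x: torch.Tensor, beta: float = 1.0) -> torch.Tensor:
    # Swish(x) = x * sigmoid(beta * x)
    return x * torch.sigmoid(beta * x)

x = torch.linspace(-4.0, 4.0, 9)
print(swish(x, beta=0.1))    # nearly linear (about 0.5*x) for small beta
print(swish(x, beta=100.0))  # approaches ReLU for large beta
```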
Many memristor models have been proposed. To simulate an artificial neuron, a memristor should be able to handle pulsed signals. Document 1 (Y. Zhang, X. Wang, Y. Li and E. G. Friedman, "Memristive model for synaptic circuits," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 64, no. 7, pp. 767-771, Jul. 2017.) presents a voltage-controlled threshold model: the memristance R_m changes only when the applied voltage exceeds the voltage thresholds [V_T-, V_T+]. Furthermore, the threshold behaviour is expressed as a function of time t, where W[R_m(t)] denotes a window function and the memristance is bounded by the minimum and maximum values [R_on, R_off].
The memristor model is given by equations (5) and (6) (reproduced as formula images in the original), which express the rate of change of the memristance under an applied voltage, gated by the threshold window and the window function W[R_m(t)],
where μ_v denotes the average ion mobility, i_0 is a constant, i_on and i_off denote the currents of the corresponding on and off states, and D denotes the length of the memristor model. It is worth mentioning that these parameters together determine the characteristics of the memristor: the size of the switching ratio R_on/R_off directly affects the mapping relationship between the neural network and the memristors as well as the energy overhead of a single memristor, and the parameter configuration of the peripheral circuits is determined by it as well. To explore the effect of different switching ratios, Table 1 lists three different switching ratios of AIST-based memristors, where LHR denotes the switching ratio R_on/R_off. FIG. 1 further illustrates the memristance change over time, demonstrating how the memristance evolves during programming. Pulses with amplitudes of 1 V, 1.5 V and 2 V are applied across the memristor with a period of 0.5 s, so that the memristance can be programmed in stages in the ideal case. However, it is worth noting that actual memristor devices have difficulty achieving arbitrary memristance levels between the on and off states, and such limitations and effects are therefore considered in the performance analysis.
Table 1 simulation configuration of memristor model
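Since the Table 1 values are not reproduced in this text, the following behavioural Python sketch of the threshold programming described above uses assumed parameter values: the memristance drifts only when the pulse amplitude exceeds the thresholds and is clipped to [R_on, R_off].

```python
import numpy as np

def program_memristor(pulses, dt=0.5, r_on=1e3, r_off=1e5,
                      v_t_pos=0.9, v_t_neg=-0.9, rate=2e4):
    """Toy threshold model: the memristance moves only beyond the voltage thresholds."""
    r, trace = r_off, []                     # start in the high-resistance (off) state
    for v in pulses:
        if v > v_t_pos:
            r -= rate * (v - v_t_pos) * dt   # SET: drift toward R_on
        elif v < v_t_neg:
            r += rate * (v_t_neg - v) * dt   # RESET: drift toward R_off
        r = float(np.clip(r, r_on, r_off))   # memristance bounded by [R_on, R_off]
        trace.append(r)
    return trace

# pulse amplitudes of 1 V, 1.5 V and 2 V with a 0.5 s period, as in the experiment above
print(program_memristor([1.0] * 4 + [1.5] * 4 + [2.0] * 4))
```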
For embedded devices, hardware and software design are inseparable. Fig. 2 illustrates the co-design strategy of the invention. On the software side, an improved attention-based network model is designed on the PyTorch platform to achieve a higher F1 score and more accurate emotion detection; the trained model is then mapped to the corresponding hardware, providing a practical solution for embedded applications. However, because of the large number of parameters of the attention network and its unusually large number of multiply-accumulate operations, a new computing architecture is required: the memristive neuromorphic computing system, with its in-memory computing characteristics, is considered an ideal framework for the hardware deployment.
Furthermore, limited physical hardware resources mean that additional energy and time may be required for signal processing. Finally, the linear mapping enables the network model weights to be uploaded. Notably, the bit precision of the memristor model and the power consumption of individual neurons affect the overall system in an actual hardware implementation. Therefore, the trained network must be further quantized before it is used as input, to enable MTEDS deployment on embedded devices.
FIG. 3 shows the overall framework of the MTEDS and illustrates the process of emotion calculation and human interaction. This example illustrates that the MTEDS embedded in an embedded device can help a machine understand emotion in a human conversation and react to the corresponding emotion expression. The improved model MLA-DSCN is mainly realized by two parts: an attention computation module (MMLAN) and a memristive emotion classification Module (MEC). Fig. 4 illustrates the corresponding network model structure of MTEDS. The model is partitioned accordingly at the network layer to facilitate the design of hardware circuits.
From a macroscopic point of view, convolution has proven effective in addressing the inability of the attention mechanism to focus on local features, and a multi-layer attention network can better model the contextual information of text in NLP. To facilitate deployment on embedded devices, lighter convolution operations are a promising option in combination with multi-layer attention networks and are implemented for the first time in this embodiment. The proposed network MLA-DSCN allows the model to extract contextual feature information more fully and performs particularly well when processing dialog-based text rather than single sentences. From a microscopic perspective, one depth convolution layer and two point-wise convolution layers are added to the MEC. The invention does not use a GLU layer, because adding a GLU would reduce the amount of extracted feature information; the difference comes from the max pooling layer added at the top of the MEC, which means that the global information computed by the attention layers has already been distilled without additional operations to filter redundant information. Corresponding hardware implementations are designed on the basis of the improved network. In the emotion detection procedure, a dialog is first embedded and encoded during preprocessing. The embedded text information is then input into the MMLAN and fed into the MEC to identify the final detection result among the emotions neutral, joy, sadness and anger. The MTEDS can therefore be widely deployed on embedded devices and applied as an agent platform in fields such as social media, public-opinion analysis robots and home emotion management.
As shown in fig. 3, the memristor-based text emotion detection system provided by the embodiment of the invention comprises a preprocessing module (for data preprocessing, position embedding and word embedding), an attention computation module (for attention computation) and an emotion classification module (for emotion classification).
As shown in fig. 4, the attention computation module includes a multi-layer attention circuit and a forward propagation circuit; the attention circuit of the first layer is used for extracting the multi-head attention of the embedded text data, and the forward propagation circuit is used for propagating the output of the attention circuit of the current layer to the attention circuit of the next layer to be used as input until the attention circuit of the last layer, so that the final attention is obtained and output to the emotion classification module.
In this embodiment, the MMLAN is implemented by fusing memristor and CMOS technology. Fig. 5 illustrates the detailed hardware implementation strategy of the MMLAN. As shown in fig. 5, the attention circuit includes a multi-head attention circuit constructed from a memristive crossbar array, a first memory cell module circuit for storing voltage signals, a first multiply-accumulate circuit and a second multiply-accumulate circuit connected to the first memory cell module circuit, and a first SoftMax activation function circuit. The multi-head attention circuit performs multi-head attention extraction on the embedded text data and outputs multi-head Q, K, V signals; the first multiply-accumulate circuit multiply-accumulates the multi-head Q, K signals to obtain the total Q, K signal; the second multiply-accumulate circuit multiply-accumulates the multi-head V signals to obtain the total V signal; the first SoftMax activation function circuit applies the SoftMax function to the total Q, K signal, and the result is multiplied by the total V signal and then output to the forward propagation circuit.
As shown in fig. 4 and 5, the forward propagation circuit includes a first weight mapping circuit, a first layer regularization circuit, a second weight mapping circuit, a third weight mapping circuit, a second layer regularization circuit, and a second memory cell module circuit that are sequentially connected; the second regularization circuit is used for connecting the attention circuit of the next layer, and the second memory unit module circuit is used for storing the attention calculated each time in the forward propagation process. The first weight mapping circuit, the second weight mapping circuit and the third weight mapping circuit are all constructed based on memristive cross arrays.
It should be noted that, to reduce the energy consumed in digital-to-analog (DAC) and analog-to-digital (ADC) conversion, the MUM (memory cell module circuit) temporarily stores the input voltage and sequentially performs the two subsequent matrix multiplication operations. Three different memristive crossbar arrays are used to represent Q, K and V, where K and V are usually assigned the same weights in the attention calculation. In previous work, K and V were deployed on a single memristive crossbar array to reduce power consumption and circuit area. However, this embodiment uses three different memristive crossbar arrays and one MUM, which ensures that the stored intermediate voltage signal is not affected by layer-to-layer transmission, reduces the signal distortion caused by the parallel and serial connection of other circuit modules, further improves the speed at which the hardware processes large-scale feature information, and reduces circuit complexity. In addition, the linear mapping is also realized by loading the weights onto the memristive crossbar array. However, due to practical integration limits of memristive crossbar arrays, the designed hardware circuit cannot be deployed at the ideal dimension d of the software design. In other words, completing the feature extraction of one dialog requires multiple clock cycles. Thus, an additional MUM at the end of the MMLAN circuit is necessary to hold intermediate information, ensuring that a complete dialog has been processed before it is fed into the MEC circuit.
The output of the memristive crossbar array depends on the type of memristor model and the structure of the crossbar. The crossbar frameworks widely used in neural networks mainly comprise the 1M, 2M and 1T1M structures. In this example, a 2M structure (see document 2: M. Prezioso et al., "Training and operation of an integrated neuromorphic network based on metal-oxide memristors," Nature, vol. 521, no. 7550, pp. 61-64, 2015) is used to implement the linear mapping and matrix multiplication operations.
FIG. 6 illustrates how the input voltage signals V_i^h are applied to the array, where i denotes the length of the input signal and h is the number of heads, with one crossbar array for each complete attention layer. Every two columns of the memristive crossbar array correspond to the positive and negative weights in the neural network, and the outputs of every two columns correspond to one output voltage V_j, where j denotes the dimension of the output signal. In the MMLAN, the parameters of the attention layer are the Q, K and V mapping matrices. In addition, since memristors cannot represent negative values, each pair of memristors forms a neuron G_i,j whose conductance value represents one weight. Furthermore, the final outputs of every two crossbar columns are the currents I_j^+ and I_j^-, which are converted to a voltage by a current-to-voltage converter, and the intermediate voltage is stored in the MUM. Equations (7) and (8) show the overall voltage conversion process, where n and m denote the n-th row and the m-th column of the crossbar array, respectively, and R_f denotes the resistance of the feedback resistor in the circuit:
I_j^+ = Σ_n V_n·G_n,j^+,  I_j^- = Σ_n V_n·G_n,j^-,   (7)
V_j = R_f·(I_j^+ - I_j^-).   (8)
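The differential-pair mapping and column read-out of equations (7) and (8) can be checked numerically with the following NumPy sketch; the conductance window and feedback resistance are assumed values, not parameters from the patent.

```python
import numpy as np

def map_weights_to_conductance(w, g_min=1e-5, g_max=1e-3):
    """Map signed weights onto (G+, G-) pairs within an assumed conductance window."""
    scale = (g_max - g_min) / np.abs(w).max()
    g_pos = g_min + scale * np.clip(w, 0.0, None)    # positive part of each weight
    g_neg = g_min + scale * np.clip(-w, 0.0, None)   # negative part of each weight
    return g_pos, g_neg, scale

def crossbar_readout(v_in, g_pos, g_neg, r_f=1e3):
    """V_j = R_f * (I_j^+ - I_j^-) with I_j^+/- = sum_n v_n * G_{n,j}^+/-, as in (7)-(8)."""
    return r_f * (v_in @ g_pos - v_in @ g_neg)

w = np.random.randn(4, 3)                 # a small signed weight matrix
v = np.array([0.8, -0.2, 0.5, 0.1])       # input voltages, one per crossbar row
gp, gn, scale = map_weights_to_conductance(w)
print(crossbar_readout(v, gp, gn) / (1e3 * scale))  # recovers v @ w (the g_min offsets cancel)
print(v @ w)
```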
Fig. 7 shows the circuit diagram of the memory cell module circuit MUM. Specifically, the memory cell module circuit includes a plurality of memory cell circuits connected in an array, and the number of memory cell circuits equals the number of voltage signals to be stored. Each memory cell circuit comprises a first MOS switch, a second MOS switch, a first operational amplifier and a capacitor; the control terminal of the first MOS switch is connected to a clock control signal; the inverting input of the first operational amplifier is connected to the output of the first MOS switch; the non-inverting input of the first operational amplifier is connected through the capacitor to ground; the output of the first operational amplifier is connected to the input of the second MOS switch; the input of the first MOS switch serves as the voltage signal input of the memory cell circuit, and the output of the second MOS switch serves as its voltage signal output. For each row of memory cell circuits in the memory cell module circuit, all voltage signal inputs are connected together, all voltage signal outputs are connected together, all control terminals of the first MOS switches are connected together, and all control terminals of the second MOS switches are connected together. The memory cell module circuit inputs and outputs voltages sequentially in step with the clock.
Thanks to the MUM, the energy consumption of ADCs and DACs is avoided, and the intermediate voltage signal can be temporarily stored in a stable manner during the calculation. Four wires are connected to each memory cell, where CL_c (= CL) and CR_c (= CR) are clock signals controlling the two NMOS switches on either side of the operational amplifier. Each memory cell inputs and outputs its voltage sequentially according to the clock beat, transmitted through the leads RI_r and RO_r, respectively. It is worth mentioning that the capacitor C plays an important role in the MUM: it not only stores the intermediate voltage but also scales the input signal. In addition, the operational amplifier keeps the temporarily stored intermediate voltage in a stable state to avoid signal distortion.
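An idealized behavioural sketch of one MUM row is shown below (sample on the write phase of the clock, drive the held voltages on the read phase); it ignores leakage, op-amp offset and switch resistance, which the analog circuit must actually handle, so it is only a functional model.

```python
class MemoryUnitRow:
    """Ideal behavioural model of one MUM row (no leakage, offset or switch resistance)."""
    def __init__(self, n_cells: int):
        self.held = [0.0] * n_cells             # capacitor voltages of the cells

    def clock(self, write_enable: bool, read_enable: bool, v_in=None):
        if write_enable and v_in is not None:
            self.held = list(v_in)              # CL phase: sample the inputs onto the capacitors
        if read_enable:
            return list(self.held)              # CR phase: drive the held voltages to the outputs
        return None

row = MemoryUnitRow(4)
row.clock(write_enable=True, read_enable=False, v_in=[0.12, -0.03, 0.07, 0.0])
print(row.clock(write_enable=False, read_enable=True))   # [0.12, -0.03, 0.07, 0.0]
```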
The neural network implements a large number of forward propagation processes by introducing functions at different layers of the MMLAN, including linear mapping (corresponding to the first weight mapping circuit, the second weight mapping circuit, and the third weight mapping circuit), layer regularization (corresponding to the first layer regularization circuit and the second layer regularization circuit), and SoftMax activation function (corresponding to the first SoftMax activation function circuit). Accordingly, the respective functional circuits will be described in order below.
The SoftMax activation function is mainly used for multi-class classification; it converts the outputs of all classes into a probability distribution over [0, 1] whose sum is 1. As shown in fig. 5, the first SoftMax activation function circuit is composed of a plurality of exponential function circuits, and each SoftMax activation function circuit consists of an exponential function circuit (Exp), an inverter circuit, an operational amplifier and a divider. The exponential function circuit consists of two symmetrical transistors and successfully avoids temperature drift. Its output voltage V_exp is an exponential function of the input voltage V_in, as given by equation (9) (reproduced as a formula image in the original).
In addition, the output voltage of the SoftMax activation function circuit is expressed as
V_out,i = V_exp,i / Σ_{j=1}^{N} V_exp,j,   (10)
where N denotes the number of input voltages of the SoftMax activation function circuit.
The forward propagation process of the forward propagation circuit is expressed as:
Forward = LN{W_3·[W_2·(LN(W_1·Attention + b_1)) + b_2] + b_3},   (11)
where Forward denotes the output of forward propagation, Attention denotes the input of forward propagation, W_1, b_1 denote the weight and bias of the first weight mapping circuit, W_2, b_2 denote the weight and bias of the second weight mapping circuit, W_3, b_3 denote the weight and bias of the third weight mapping circuit, and LN() denotes the regularization function.
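As a software-level counterpart of equation (11), the following PyTorch sketch chains the same linear mappings and layer regularizations; the module names and dimensions are illustrative assumptions, not taken from the patent.

```python
import torch
import torch.nn as nn

class ForwardBlock(nn.Module):
    """Sketch of Forward = LN{W_3[W_2(LN(W_1*Attention + b_1)) + b_2] + b_3}."""
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.w1 = nn.Linear(d_model, d_hidden)   # first weight mapping (W_1, b_1)
        self.ln1 = nn.LayerNorm(d_hidden)        # first layer regularization
        self.w2 = nn.Linear(d_hidden, d_hidden)  # second weight mapping (W_2, b_2)
        self.w3 = nn.Linear(d_hidden, d_model)   # third weight mapping (W_3, b_3)
        self.ln2 = nn.LayerNorm(d_model)         # second layer regularization

    def forward(self, attention: torch.Tensor) -> torch.Tensor:
        x = self.ln1(self.w1(attention))
        x = self.w3(self.w2(x))
        return self.ln2(x)

# usage: 16 attention vectors of width 768 (dimensions are assumptions)
out = ForwardBlock(768, 2048)(torch.randn(16, 768))
```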
Unlike the memristive crossbar arrays described previously, this example adds one extra row to the crossbar arrays of the three weight circuits to represent the bias weights b_1, b_2 and b_3. In the circuit, Attention denotes the multi-layer attention extracted by the multi-head attention unit, and LN() denotes the regularization function applied after the linear mapping. The output of the linear mapping circuit can then also be expressed by equations (7) and (8).
The LN operation is a common data normalization operation that alleviates the gradient-vanishing problem during network training. It is given by
LN(x_i) = α·(x_i - μ_l)/√(σ_l² + ε) + β,   (12)
where α and β are the trainable gain and shift parameters, respectively, x_i denotes the i-th channel of the input feature signal, μ_l and σ_l² are the mean and variance of the samples over the different input channels, and ε is a constant whose initial value is close to 0.
The LN layer retains constant mean and variance values from the training phase for use in the inference phase. Therefore, to simplify the hardware implementation of LN, the conventional formula is rewritten as a linear expression (equations (13) and (14), reproduced as formula images in the original) of the form V_LN = a·V_x + V_b, where the input voltage V_x corresponds to an element of the input sample, V_LN denotes the regularized output voltage, V_b is the control signal for performing the LN operation, and the coefficient a is set by the memristance M_1 of the memristor.
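One way to read this linear rewrite is to fold the frozen training statistics into a single gain and offset per channel, as in the NumPy sketch below; whether the patent folds exactly these quantities into R_f/M_1 and V_b is an assumption, and the statistics used here are invented for illustration.

```python
import numpy as np

def fold_layernorm(alpha, beta, mu, sigma2, eps=1e-5):
    """Precompute the linear coefficients a, c such that LN(x) = a*x + c at inference time."""
    a = alpha / np.sqrt(sigma2 + eps)
    c = beta - a * mu
    return a, c

x = np.array([0.3, -0.1, 0.7, 0.2])
alpha, shift = np.ones(4), np.zeros(4)    # trainable gain and shift (values assumed)
mu, sigma2 = 0.25, 0.09                   # frozen training statistics (values assumed)
a, c = fold_layernorm(alpha, shift, mu, sigma2)
print(a * x + c)                          # equals alpha*(x - mu)/sqrt(sigma2 + eps) + shift
```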
Fig. 8 illustrates the hardware implementation of the MEC. The emotion classification module further comprises a third memory cell module circuit connected between the max pooling circuit and the third layer regularization circuit, a fourth memory cell module circuit connected between the first point-wise convolution circuit and the depth convolution circuit, and a fifth memory cell module circuit connected between the first point-wise convolution circuit and the fourth layer regularization circuit; the fifth memory cell module circuit is used for storing the output signal of the preceding circuit and providing it to the following circuit. The third, fourth and fifth memory cell module circuits are all constructed from the memory cell module circuit described above.
The emotion classifier is similar to a decoder: it receives the global information processed by the attention layers and then performs a depth-wise separable convolution operation to obtain more local information. The final emotion detection is completed by a fully connected layer and a SoftMax activation function. In addition, the MEC contains a max pooling circuit to perform maximum embedding of the voltages coming from the MMLAN; in other words, the most valuable information from the MMLAN is filtered out before the convolution operations. The max pooling circuit is made up of pairs of NMOS devices, and its size is adjusted according to the number of input features. Its output is described by equation (15), where M denotes the max pooling size:
V_max = Max(V_1, V_2, ..., V_M).   (15)
The feature signal of each of the K contextual dialogues, covering all the corpora in the dataset, is fed into the max pooling circuit. After maximum embedding, each dialogue outputs a signal of size U×d.
In addition, the embedded features are fed row by row into the third layer regularization circuit (LN circuit), followed by a point-wise convolution, a depth convolution and another point-wise convolution. Since the information processed by the point-wise convolution and the depth convolution comes from different dimensions, a memory cell module circuit is inserted between them to temporarily store the intermediate voltages and perform the dimensional conversion of the features. Furthermore, a Swish circuit is applied before the next point-wise convolution; it normalizes the output of the depth convolution while avoiding the gradient-vanishing problem caused by saturation of the sigmoid function. Equation (16) gives the expression of the Swish circuit with the hyper-parameter β = 1, i.e. V_sh = v_sh·sigmoid(v_sh), where V_bias denotes the reference voltage, v_sh the input voltage, V_sh the output voltage, and exp() the exponential function through which the sigmoid is realized.
The circuit output after the two point-wise convolutions and one depth convolution can be expressed as
V_dw = W_dw ⊛ [W_pw1·(W_P1·Max(Vatten_m,n))],   (17)
V_pw2 = W_pw2·Swish(V_dw),   (18)
where ⊛ denotes the depth convolution, W_pw1, W_pw2 and W_dw denote the weights of the two point-wise convolution kernels and the depth convolution kernel, respectively, W_P1 denotes the weights of the third layer regularization circuit, Max(Vatten_m,n) denotes the output of the max pooling circuit, and Swish() denotes the Swish function. Notably, the kernel size of the depth convolution is k_d2, so a total of d - k_d2 + 1 depth convolution operations are required, which are performed in parallel. The generated intermediate voltages are stored in the MUM. The output voltage of the regularization operation is then fed into the fully connected layer circuit (with weight W_4 and bias b_4) and the second SoftMax activation function circuit (also constructed from the SoftMax activation function circuit), and the result is
result = Max{SoftMax[W_4·LN(V_pw2) + b_4]},   (19)
where W_4 and b_4 denote the weight and bias of the fully connected layer circuit, and Max() takes the class with the maximum probability as the output.
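At the network (software) level, the classification chain above can be sketched in PyTorch as follows; this mirrors the order of operations only, and the channel count, kernel width, feature dimension and class count are assumptions, not values from the patent.

```python
import torch
import torch.nn as nn

class EmotionClassifierHead(nn.Module):
    """Software-level sketch of the MEC chain; d, channels, kernel and class count are assumptions."""
    def __init__(self, d: int = 768, channels: int = 64, kernel: int = 3, num_classes: int = 4):
        super().__init__()
        self.ln_in = nn.LayerNorm(d)                                  # third layer regularization
        self.pw1 = nn.Conv1d(1, channels, kernel_size=1)              # first point-wise convolution
        self.dw = nn.Conv1d(channels, channels, kernel_size=kernel,
                            padding=kernel // 2, groups=channels)     # depth convolution
        self.pw2 = nn.Conv1d(channels, 1, kernel_size=1)              # second point-wise convolution
        self.ln_out = nn.LayerNorm(d)                                 # fourth layer regularization
        self.fc = nn.Linear(d, num_classes)                           # fully connected layer

    def forward(self, attn_seq: torch.Tensor) -> torch.Tensor:
        pooled = attn_seq.max(dim=1).values            # max pooling over the token dimension
        x = self.ln_in(pooled).unsqueeze(1)            # (batch, 1, d)
        x = self.dw(self.pw1(x))
        x = x * torch.sigmoid(x)                       # Swish with beta = 1
        x = self.ln_out(self.pw2(x).squeeze(1))
        return torch.softmax(self.fc(x), dim=-1)       # emotion probabilities

probs = EmotionClassifierHead()(torch.randn(8, 20, 768))   # 8 dialogs of 20 attention vectors
```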
Correspondingly, the embodiment of the invention also provides a text emotion detection method based on the memristor, which comprises the following steps: the text emotion detection system is constructed, training and testing are carried out on the constructed text emotion detection system, a text to be detected is input to the text emotion detection system after training is completed, and the text emotion detection system outputs emotion detection results.
To analyze the performance and effectiveness of MTEDS, the present embodiment conducted a series of experiments by comparing the proposed system with other related dialog text emotion detection efforts.
EmotionLines is a dialog dataset consisting of two subsets, Friends and EmotionPush. The Friends dataset is a speech-based dataset compiled from multiparty conversations in a well-known comedy series. The text data in the EmotionPush dataset is collected from social network messengers (e.g., Facebook). The 4000 dialogs in each dataset comprise 1000 original English dialogs and 3000 augmented dialogs obtained by back-translation through French, German and Italian. Each dialog contains a large number of utterances, each labeled with one of seven emotions. This example selects four emotions, namely joy, sadness, anger and neutral, as the label candidates of the invention and regards them as the benchmark in the performance evaluation shown in Table 8. In addition, about 3000 utterances from 240 conversations are provided for the evaluation process; the distribution of the test data is the same as that of the training set.
This embodiment pre-processes the EmotionLines dataset. First, all utterances are tokenized and lowercased. By introducing special markers between utterances, all tokens of the same dialog are concatenated. It is worth mentioning that all tokens are embedded with the WordPiece method. After position encoding and token embedding, the text information is input into the corresponding multi-layer attention network and trained on the PyTorch platform.
FIG. 9 shows the inputs and outputs of the various layers of the network in the training phase, which relate to the size of the memristive crossbar arrays and the circuitry of the MTEDS inference phase. Dropout is used throughout the training phase to avoid overfitting, and its value affects the final network accuracy. In addition, to increase the generalization capability of the network, this embodiment trains the network on both the Friends and EmotionPush datasets. However, it is difficult to train on the two datasets simultaneously because their data distributions are imbalanced. Therefore, Weighted Cross Entropy (WCE) is used as the training loss to up-weight the minority-class data.
In addition, by incorporating weighted balancing into the WCE loss, the model can learn small-sample emotion data faster, allowing the training process to run more efficiently. An Adam optimizer with a predetermined learning rate is also employed. Table 2 shows some key parameters used in the training process, which were determined after extensive experimentation.
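A minimal PyTorch sketch of weighted cross entropy with per-class weights that up-weight the minority emotions is given below; the weight values and batch size are illustrative assumptions, not the parameters of Table 2.

```python
import torch
import torch.nn as nn

# per-class weights (neutral, joy, sadness, anger): larger weights for the rare classes (values assumed)
class_weights = torch.tensor([0.5, 1.0, 2.5, 2.0])
criterion = nn.CrossEntropyLoss(weight=class_weights)

logits = torch.randn(8, 4, requires_grad=True)   # model outputs for a batch of 8 utterances
labels = torch.randint(0, 4, (8,))               # ground-truth emotion indices
loss = criterion(logits, labels)
loss.backward()                                  # gradients are scaled by the class weights
```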
Table 2 network parameters for training phase
The predictive performance is evaluated mainly by the F1 score; when each datum is labeled with only one class, this is comparable to the Weighted Accuracy (WA). Furthermore, to compare the performance of the proposed model with other network topologies, the last 20% of the English dialog data in the Friends dataset is used as a validation set to evaluate the model.
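The headline metric can be computed as a support-weighted average of per-class F1 scores, as in the sketch below; this is a standard reading of the evaluation described above, since the exact formula is not reproduced in this text.

```python
import numpy as np

def weighted_f1(y_true, y_pred, num_classes=4):
    """Support-weighted average of the per-class F1 scores."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    f1s, support = [], []
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall) if precision + recall else 0.0)
        support.append(np.sum(y_true == c))
    return np.average(f1s, weights=support)

print(weighted_f1([0, 1, 2, 3, 0, 1], [0, 1, 2, 0, 0, 2]))
```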
In this embodiment, all circuits are implemented with memristors and CMOS technology. The parameters of the remaining electronic components of the circuits are shown in Table 3. The specific circuit structure and area of the MTEDS are determined from the trained network structure and the corresponding weights. The weights of the network layers are uploaded to the memristive crossbar arrays to complete the hardware deployment of the MMLAN and MEC for the emotion detection inference task. In addition, circuit simulation is performed with SPICE under the given parameters to simulate the hardware implementation of emotion detection. It is worth mentioning that the memristor model adopted has a high switching ratio R_off/R_on, which satisfies the requirements of memristive neural computing systems in practical scenarios. The circuit clock is set to 10 μs, which is sufficient for almost all of the circuits in this embodiment. However, due to the nonlinearity of the activation function circuits, the simulation time is set to 10 times the clock to obtain a stable and accurate output.
Table 3 circuit parameters and configuration thereof
Through network training, the MTEDS obtains the best text emotion detection results so far. The methods used in previous work were first tested separately on the validation dataset, and the F1 scores they obtained in the four emotion categories (neutral, joy, sadness and anger) were computed. The specific results are shown in Table 4. BOW and TFIDF are traditional baselines with F1 scores up to 0.81, but with low scores for both anger and sadness. TextCNN and the causal model TextCNN (C-TextCNN) use a weighted loss method, resulting in a fairly good balance. On the basis of the original TextCNN, both the previous and the target embeddings are mapped into the model using a causal corpus model.
TABLE 4 results of different neural networks on Friends validation dataset
In addition, pre-trained BERT base and large models are used to predict the results. As shown in Table 5, better performance than earlier works is achieved by fusing the depth-wise separable convolution module with a multi-layer attention network. The proposed model MLA-DSCN achieves a good balance across classes with large and small sample sizes and obtains superior results. Notably, the F1 scores on the small-sample classes (sadness and anger) are increased by 24% and 14%, respectively. Other similar works are also listed for comparison, since attention-based approaches achieve better results than other networks.
TABLE 5 Performance on EmotionPush and Friends test data sets
As shown in Table 6, the accuracy of the proposed model on the EmotionLines dataset is 72.09%, while the F1 score is 0.79. Based on the inference results on the test dataset, the proposed MTEDS clearly achieves SOTA performance in terms of the final F1 score. Furthermore, in this embodiment no additional dataset is required to pre-train the model in order to surpass the previous models. In other words, the proposed structure is shown to have excellent feature extraction capability and greater generalization capability, making the MTEDS more universally applicable in practical scenarios. Moreover, the designed hardware deployment scheme provides a feasible route for edge deployment of the MLA-DSCN: the whole network model has only 110.6M parameters, whereas the Friends-BERT-large model has 340M. The proposed system therefore further improves the feasibility and scalability of the network in practical embedded applications.
TABLE 6 comparison of prior methods based on EmotionLines dataset
After the deployment process of the MTEDS is completed, the stability and correctness of the proposed circuits are further examined for inference on the hardware. Since the designed circuits include MOS transistors, their nonlinear characteristics become the main factor affecting circuit stability. Therefore, within a designated clock period, transient analysis and DC analysis are carried out on the different circuits to test their stability under different input patterns.
Unlike previous activation function designs, the proposed activation function circuits fully reproduce the mathematical model of the activation functions and work well within the nominal input range [-2 V, 2 V]. This is due to the appropriate scheduling of particular component parameters and clock cycles in the trimming circuit. Because of the transistors in the activation function circuits, the circuits are highly nonlinear under transient analysis. Therefore, to ensure circuit stability, a clock signal of 10 times the length is applied to the output of each transient signal.
For the transient analysis, all voltages are limited to the range [-1 V, 1 V]. It is worth mentioning that during actual circuit operation, the intermediate voltage values in the circuit are typically at the mV level or even lower, which is usually achieved by scaling the input voltage with an operational amplifier. For the DC analysis, the analog output of the circuit is represented as a smooth curve and compared with the ideal output voltage curve.
A correlation analysis of the activation functions is shown in fig. 10, in which: (a) the input pulses, (b) a comparison of the DC analysis and inference output with the ideal SoftMax output, (c) the transient analysis results of the SoftMax circuit, (d) the DC analysis comparison for the Swish circuit, and (e) the transient analysis results of the Swish circuit. In FIG. 10, the sum of the ideal DC voltages V_total_ideal and the inferred circuit output V_total are almost identical, both equal to 1 V, which meets practical requirements. In addition, although the pulsed input causes slight fluctuations in the output signal of the SoftMax circuit, the circuit remains in an almost stable state during operation, and the sum of all its outputs equals 1 V. It can thus be concluded that the transistors and capacitors in the exponential function circuit are the main cause of such signal instability, which can be effectively alleviated by embedding a 1 nF capacitor in the exponential function circuit. The same test applies to the Swish circuit as to the SoftMax circuit, as shown in fig. 10(d). It is worth mentioning that, in the specific experiment, the bias voltage of the multiplication and division operations is trimmed to the mV level so that the Swish circuit operates in the most ideal case.
The capacitance of the memory cell is the most important factor affecting the intermediate voltage. To prevent mutual interference between the memory cells and the computation circuits and to avoid frequent conversions between analog and digital signals, a MUM circuit is employed in operations involving dimensional transformation and inter-layer transmission. To ensure the stability of each circuit branch, the memory cell needs to maintain the accuracy of the intermediate voltage signal across different clock cycles, i.e. the voltages at the input and output terminals should remain as equal as possible no matter how many clock cycles pass. Therefore, an operational amplifier is used to hold the voltage across the capacitor. On this basis, this embodiment examines the performance of the MUM at different capacitances. Fig. 11 shows the relationship between the input and output signals at five capacitance values (1 nF, 10 pF, 1 pF, 0.1 pF and 1 fF); the solid lines represent inputs and the dashed lines outputs. The input and output are found to be stable and consistent when the capacitance equals 1 nF. The final capacitance is therefore set to 1 nF, so that temporary storage of the intermediate voltage is realized accurately.
After the model and circuit co-design is completed, errors occur when the weights are loaded onto each neuron, which is the first place where nonlinear errors arise. A 2% error remains when the AIST-based model is used for linear weight mapping. In addition, because of the nonlinear characteristics of memristors themselves, fabricated memristors still cannot be tuned in perfectly linear steps in practical applications; this remains a problem in practical implementations, where the weights of the memristor model are assumed to change linearly. Furthermore, the weights trained on the PyTorch platform are stored as 32-bit floating-point numbers, whereas for practical embedded deployment the weights usually need to be quantized to 8-bit integers to meet the requirements of hardware such as ARM and FPGA. Therefore, in this embodiment the trained MLA-DSCN is also quantized to 8 bits to fit the interface with the designed hardware. It is worth mentioning that quantization not only realizes the interface between hardware and software but also greatly reduces the memory overhead, from 110.67 MB to 25.06 MB, while the F1 score drops by only 2.19%. This not only facilitates deployment on emerging in-memory computing architectures but also provides a viable solution for conventional embedded devices. Table 7 shows the performance of the final MTEDS.
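As a software-side illustration of the 8-bit quantization step (the model below is a small stand-in, not the actual MLA-DSCN, and the 7-class output size is assumed), PyTorch's post-training dynamic quantization converts 32-bit floating-point Linear weights to 8-bit integers in a single call:

```python
import torch
import torch.nn as nn

# Stand-in model: NOT the actual MLA-DSCN, just a linear stack used to illustrate
# the float32 -> int8 post-training quantization described in the text.
model_fp32 = nn.Sequential(
    nn.Linear(768, 768),
    nn.ReLU(),
    nn.Linear(768, 7),     # 7 emotion classes assumed for illustration only
)

# Post-training dynamic quantization: Linear weights are stored as 8-bit integers.
model_int8 = torch.quantization.quantize_dynamic(
    model_fp32, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 768)        # one 768-dimensional dialogue embedding
print(model_int8(x).shape)     # inference still works; the weights are now int8
```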
Table 7. Hardware deployment and accuracy loss at different bit-precision levels
[Table 7 is shown as an image in the original.]
Each neuron consists of two memristors. The voltage input to the circuit is at the mV level, so its amplitude is limited to [-3.3 mV, 3.3 mV] during computation. With the time period set to 10 µs, 50 pulse voltages with an amplitude of 3.3 mV are applied to the memristors to simulate the stability of the memristor model within the threshold voltage range and the corresponding maximum power consumption. As shown in Fig. 12, after the weight mapping is completed the memristor is in a steady state. The energy consumption of a memristive neuron is:
[The energy-consumption expression is shown as an image in the original.]
According to the simulation results, the on-off ratio of the memristor is not the factor that determines the power consumption of an individual neuron; rather, R_on directly affects the neuron's maximum power consumption. Furthermore, based on the specific MTEDS structure of Fig. 8, the area overhead of the entire system is calculated in Table 8. The energy consumption depends on the power overhead of each circuit module as well as on the computation time, and the computation time is related to the amount and size of the input information and to the size of the memristive crossbar array. In this embodiment the energy consumption of the MTEDS is evaluated per dialogue, because the corpus of each dialogue is embedded, via maximum pooling, into 768-dimensional feature information. The ideal design is therefore a memristive crossbar array of size 768 × 768, in which case the information of one dialogue can be processed with a single input in a single clock cycle. Table 8 lists the power consumption and area of the MTEDS on a whole-dialogue basis, with each circuit module abbreviated by its representative parameter. In integrated circuits, MOSFETs, transistors, resistors and other basic elements are all at the nanometer scale; when calculating the overall circuit-module area overhead, the size of each element is therefore set to 100 nm², and one operational amplifier consists of 8 MOSFETs. Furthermore, since the electrical components in the proposed circuit can be reused by scheduling clock signals, the memristive crossbar array and its peripheral circuits are the largest sources of energy and area overhead in the MTEDS. However, the size of the peripheral circuitry is constrained by the memristive crossbar array that is actually fabricated.
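A back-of-the-envelope estimate of the per-neuron energy under the pulse test described above can be sketched as follows; the E = V²·t/R form, the R_on value and the assumption that both memristors conduct at R_on are illustrative and are not taken from the embodiment, which only reports that R_on dominates the maximum power consumption.

```python
# Rough per-neuron energy estimate for the pulse test (50 pulses, 3.3 mV amplitude,
# 10 us period, two memristors per neuron). R_on and the worst-case conduction
# assumption are illustrative only.
V       = 3.3e-3      # pulse amplitude [V]
t_pulse = 10e-6       # pulse period [s]
n_pulse = 50
R_on    = 1e3         # assumed low-resistance state [Ohm]
n_mem   = 2           # memristors per neuron

p_max  = n_mem * V**2 / R_on                 # worst case: both devices at R_on
energy = p_max * t_pulse * n_pulse           # energy over the 50-pulse test
print(f"max power ~ {p_max*1e9:.2f} nW")
print(f"energy    ~ {energy*1e12:.2f} pJ over the 50-pulse test")
```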
Table 8. Main power and area consumption of the MTEDS
[Table 8 is shown as an image in the original.]
Therefore, the available device size must be considered in an actual hardware deployment. This embodiment considers two strategies: the first uses only one memristive crossbar array and stores the results of multiple operations through the MUM; the second uses multiple memristive crossbar arrays for parallel processing. Both schemes must account for the power consumption of the memristive crossbar array and of the additional peripheral support circuitry as the array size changes. In practice the ideal case cannot be reached, and the memristive crossbar arrays commonly available are mainly of 1K size, so the simulation experiment adopts parallel processing with multiple memristive crossbar arrays. In addition, to minimize power consumption and area overhead in the hardware design, the peripheral circuits are reused here, at the cost of more time periods to complete one inference. With further integration, larger memristive crossbar arrays are being produced, such as 8K memristor arrays (128 rows and 64 columns). The performance of the MTEDS at different crossbar-array sizes is reported in Fig. 13. The energy consumed to complete the inference of one dialogue is 1.9 mJ, with a total time cost of 17.15 ms. Finally, ten emotion-detection inferences were run on a GPU platform (GeForce RTX 3070) for comparison with the proposed MTEDS; the average time of these 10 inference runs is 6.42 ms. The average time to complete one inference on the GPU platform is therefore 6.42 ms / 240 dialogues = 26.75 µs, and the corresponding energy is 26.75 µs × 220 W = 5.885 mJ. The MTEDS thus incurs only about 32.3% of the GPU's energy overhead, but its runtime is about 640 times that of the GPU.
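The quoted ratios follow directly from the figures above; the short sketch below merely reproduces that arithmetic (all numbers are taken from the text).

```python
# Reproducing the energy/time comparison from the text.
mteds_energy_per_dialogue = 1.9e-3          # J (from the text)
mteds_time_per_dialogue   = 17.15e-3        # s (from the text)

gpu_batch_time   = 6.42e-3                  # s for 240 dialogues (from the text)
gpu_time_per_dlg = gpu_batch_time / 240     # ~26.75 us
gpu_power        = 220.0                    # W (from the text)
gpu_energy_per_dlg = gpu_time_per_dlg * gpu_power   # ~5.885 mJ

print(f"GPU time per dialogue   : {gpu_time_per_dlg*1e6:.2f} us")
print(f"GPU energy per dialogue : {gpu_energy_per_dlg*1e3:.3f} mJ")
print(f"MTEDS / GPU energy      : {mteds_energy_per_dialogue/gpu_energy_per_dlg:.1%}")   # ~32.3%
print(f"MTEDS / GPU runtime     : {mteds_time_per_dialogue/gpu_time_per_dlg:.0f}x")      # ~640x
```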
The experimental results demonstrate that the proposed MTEDS of this embodiment has lower energy and area overhead than standard neuromorphic computing structures. In addition, the memory capability of the memristor allows it to emulate neurons in an in-memory computing structure; this is due to the threshold model of the memristor, which keeps the memristor stable when the operating voltage is below the threshold voltage.
This embodiment provides a hardware implementation scheme for text emotion detection, namely the MTEDS. First, by fusing a depthwise separable convolution module into a multi-layer attention network, the improved network model MLA-DSCN not only attends to the global information of the input text but also improves its information-extraction capability at the local level, while reducing the number of network parameters by a factor of three without sacrificing accuracy. The proposed network achieves the current best F1 score for the text-based emotion detection task on the EmotionLines dataset. Furthermore, the hardware counterparts of the software, such as the sizes and weights, are also determined. The stability and power consumption of the proposed circuits were analyzed through SPICE simulation, verifying the feasibility of practical real-world application. In particular, under the same reference conditions the proposed MTEDS effectively reduces energy consumption and area overhead compared with a GPU, although the fully analog configuration of the circuit increases the computation time of the MTEDS. The MTEDS provided by this embodiment therefore offers not only strong text emotion detection capability but also a lighter structure and lower energy consumption, providing a good solution for deploying text emotion detection systems and methods on embedded devices.
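As an illustration of where a roughly three-fold parameter reduction can come from, the following PyTorch sketch compares the parameter count of a standard 1-D convolution with the classic depthwise-separable factorization; the 768-dimensional embedding and kernel width of 3 are assumed for illustration, and the MLA-DSCN's actual block uses a point-wise / depth-wise / point-wise arrangement, with the reported 3x saving measured over the whole network rather than a single layer.

```python
import torch.nn as nn

def n_params(module):
    """Total number of trainable parameters in a module."""
    return sum(p.numel() for p in module.parameters())

d, k = 768, 3   # embedding dimension and kernel width (assumed for illustration)

standard = nn.Conv1d(d, d, k)                  # ordinary 1-D convolution
separable = nn.Sequential(
    nn.Conv1d(d, d, k, groups=d),              # depth-wise convolution
    nn.Conv1d(d, d, 1),                        # point-wise convolution
)

print("standard convolution :", n_params(standard))    # ~1.77 M parameters
print("depthwise separable  :", n_params(separable))   # ~0.59 M parameters
print("reduction factor     : %.2f" % (n_params(standard) / n_params(separable)))  # ~3x
```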
The above examples are preferred embodiments of the present invention, but the embodiments of the present invention are not limited thereto. Any other changes, modifications, substitutions, combinations and simplifications made without departing from the spirit and principle of the present invention are equivalent replacements and are included within the protection scope of the present invention.

Claims (10)

1. A memristor-based text emotion detection system, characterized in that: the system comprises a preprocessing module, an attention calculation module and an emotion classification module;
the preprocessing module is used for receiving utterance text data and performing position encoding and token embedding on the utterance text data to obtain embedded text data;
the attention calculation module comprises multi-layer attention circuits and forward propagation circuits; the first-layer attention circuit is used for extracting multi-head attention from the embedded text data, and the forward propagation circuit is used for propagating the output of the current-layer attention circuit to the next-layer attention circuit as its input, until the last-layer attention circuit, so as to obtain the final attention and output it to the emotion classification module;
the emotion classification module comprises a maximum pooling circuit, a third layer regularization circuit, a first point-wise convolution circuit, a depth convolution circuit, a Swish activation function circuit, a second point-wise convolution circuit, a fourth layer regularization circuit, a fully connected layer circuit and a second SoftMax activation function circuit, connected in sequence, and is used for performing the corresponding maximum pooling, regularization, point-wise convolution, depth convolution, Swish activation function calculation, point-wise convolution, regularization, full connection and SoftMax activation function calculation operations on their respective inputs, finally outputting the emotion detection result;
the attention extraction operation and the weight mapping operation in the attention calculation module, as well as the first point-wise convolution circuit, the depth convolution circuit, the second point-wise convolution circuit and the fully connected layer circuit, are all constructed on the basis of memristive crossbar arrays; every two columns of the memristive crossbar array correspond to the positive and negative weights in the neural network, and the output of every two columns corresponds to one output voltage.
2. The memristor-based text emotion detection system of claim 1, wherein: the attention circuit comprises a multi-head attention circuit constructed on the basis of a memristive crossbar array, a first memory unit module circuit for storing voltage signals, a first multiply-accumulate circuit and a second multiply-accumulate circuit connected to the first memory unit module circuit, and a first SoftMax activation function circuit; the multi-head attention circuit is used for extracting multi-head attention from the embedded text data and outputting multi-head Q, K, V signals, the first multiply-accumulate circuit is used for multiply-accumulating the multi-head Q and K signals to obtain total Q and K signals, the second multiply-accumulate circuit is used for multiply-accumulating the multi-head V signals to obtain a total V signal, and the first SoftMax activation function circuit is used for applying the SoftMax function to the total Q and K signals; the calculation result is weight-multiplied by the total V signal and then output to the forward propagation circuit;
the output of the attention circuit is expressed as:

Attention = Concat{ softmax(Q_h · K_h^T / √d_h) · V_h } · W_Z  (concatenated over the h heads),

where Concat{} denotes concatenation, softmax() denotes the softmax function, d_h = d/h denotes the dimension of a head, h denotes the number of heads, d is the dimension of the input Q, K, V, Q_h, K_h, V_h denote the Q, K, V values output by a single head, and W_Z denotes the mapping matrix.
3. The memristor-based text emotion detection system of claim 2, wherein: the forward propagation circuit comprises a first weight mapping circuit, a first layer regularization circuit, a second weight mapping circuit, a third weight mapping circuit and a second layer regularization circuit connected in sequence, and further comprises a second memory unit module circuit; the second layer regularization circuit is used for connecting to the next-layer attention circuit, and the second memory unit module circuit is used for storing the attention obtained at each calculation during forward propagation;
the first weight mapping circuit, the second weight mapping circuit and the third weight mapping circuit are all constructed on the basis of memristive crossbar arrays;
the forward propagation process of the forward propagation circuit is expressed as:
Forward = LN{ W_3 · [ W_2 · ( LN( W_1 · Attention + b_1 ) ) + b_2 ] + b_3 },

where Forward denotes the output of the forward propagation, Attention denotes the input to the forward propagation, W_1, b_1 denote the weight and bias of the first weight mapping circuit, W_2, b_2 denote the weight and bias of the second weight mapping circuit, W_3, b_3 denote the weight and bias of the third weight mapping circuit, and LN() denotes the layer regularization function.
4. The memristor-based text emotion detection system of claim 3, wherein the single-channel output of the depth convolution circuit is expressed as:
[equation shown as an image in the original],
where k is the kernel width of the depth convolution, [kernel weight symbol] denotes the weight of the kernel, [input symbol] denotes the input of the depth convolution circuit, and U denotes the number of channels;
the output of the depth convolution circuit is expressed as:
[equation shown as an image in the original];
the output of the second point-wise convolution circuit is expressed as:
[equation shown as an image in the original];
where W_pw1, W_pw2 and W_dw denote the convolution-kernel weights of the first point-wise convolution circuit, the second point-wise convolution circuit and the depth convolution circuit respectively, W_P1 denotes the weight of the third layer regularization circuit, Max(Vatten_{m,n}) denotes the output of the maximum pooling circuit, and Swish() denotes the Swish function.
5. The memristor-based text emotion detection system of claim 4, wherein: the emotion classification module further comprises a third memory unit module circuit connected between the maximum pooling circuit and the third layer regularization circuit, a fourth memory unit module circuit connected between the first point-wise convolution circuit and the depth convolution circuit, and a fifth memory unit module circuit connected between the first point-wise convolution circuit and the fourth layer regularization circuit, each used for storing the output signal of the preceding circuit and providing it to the following circuit.
6. The memristor-based text emotion detection system of claim 5, wherein: the first memory unit module circuit, the second memory unit module circuit, the third memory unit module circuit, the fourth memory unit module circuit and the fifth memory unit module circuit are all constructed on the basis of memory unit module circuits, each memory unit module circuit comprises a plurality of memory unit circuits connected in an array, and the number of memory unit circuits is the same as the number of voltage signals to be stored;
for the memory unit circuits in each row of the memory unit module circuit, all voltage signal input terminals are connected together, all voltage signal output terminals are connected together, all control terminals of the first MOS switches are connected together, and all control terminals of the second MOS switches are connected together; the memory unit module circuit inputs and outputs voltages sequentially according to the clock beats.
7. The memristor-based text emotion detection system of claim 6, wherein: the memory unit circuit comprises a first MOS switch, a second MOS switch, a first operational amplifier and a capacitor; the control terminal of the first MOS switch is connected to the control terminal of the second MOS switch to form the clock control signal, the inverting input of the first operational amplifier is connected to the output terminal of the first MOS switch, the non-inverting input of the first operational amplifier is connected to the capacitor and then to ground, the output terminal of the first operational amplifier is connected to the input terminal of the second MOS switch, the input terminal of the first MOS switch serves as the voltage signal input terminal of the memory unit circuit, and the output terminal of the second MOS switch serves as the voltage signal output terminal of the memory unit circuit.
8. The memristor-based text emotion detection system of claim 7, wherein: the memristive crossbar array adopts a memristor model with a 2M structure.
9. The memristor-based text emotion detection system of claim 1, wherein: the network of the text emotion detection system is trained using the Friends and EmotionPush datasets simultaneously, with weighted cross-entropy used as the training loss.
10. A memristor-based text emotion detection method, characterized by comprising the following steps: constructing the text emotion detection system according to any one of claims 1 to 9, training and testing the constructed text emotion detection system, inputting the text to be detected into the trained text emotion detection system, and outputting the emotion detection result by the text emotion detection system.
CN202310091466.6A 2023-02-06 2023-02-06 Memristor-based text emotion detection system and method Pending CN115994221A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310091466.6A CN115994221A (en) 2023-02-06 2023-02-06 Memristor-based text emotion detection system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310091466.6A CN115994221A (en) 2023-02-06 2023-02-06 Memristor-based text emotion detection system and method

Publications (1)

Publication Number Publication Date
CN115994221A true CN115994221A (en) 2023-04-21

Family

ID=85990134

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310091466.6A Pending CN115994221A (en) 2023-02-06 2023-02-06 Memristor-based text emotion detection system and method

Country Status (1)

Country Link
CN (1) CN115994221A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116523011A (en) * 2023-07-03 2023-08-01 中国人民解放军国防科技大学 Memristor-based binary neural network layer circuit and binary neural network training method
CN116523011B (en) * 2023-07-03 2023-09-15 中国人民解放军国防科技大学 Memristor-based binary neural network layer circuit and binary neural network training method

Similar Documents

Publication Publication Date Title
Smagulova et al. A survey on LSTM memristive neural network architectures and applications
Cui et al. Continuous online sequence learning with an unsupervised neural network model
CN109284506B (en) User comment emotion analysis system and method based on attention convolution neural network
US11556712B2 (en) Span selection training for natural language processing
Alaloul et al. Data processing using artificial neural networks
CN107408111A (en) End-to-end speech recognition
CN113313240B (en) Computing device and electronic device
Ku et al. A study of the Lamarckian evolution of recurrent neural networks
CN115994221A (en) Memristor-based text emotion detection system and method
Thukroo et al. A review into deep learning techniques for spoken language identification
Wang et al. Mapping the BCPNN learning rule to a memristor model
Dutta et al. Applications of recurrent neural network: Overview and case studies
Hanson Backpropagation: some comments and variations
Yang et al. Analog circuit implementation of LIF and STDP models for spiking neural networks
Hashana et al. Deep Learning in ChatGPT-A Survey
CN111522926A (en) Text matching method, device, server and storage medium
Ma et al. Non-volatile memory array based quantization-and noise-resilient LSTM neural networks
CN116013309A (en) Voice recognition system and method based on lightweight transducer network
Sun et al. Learnable axonal delay in spiking neural networks improves spoken word recognition
US6490571B1 (en) Method and apparatus for neural networking using semantic attractor architecture
Pambudi et al. Effect of Sentence Length in Sentiment Analysis Using Support Vector Machine and Convolutional Neural Network Method
Lobacheva et al. Bayesian sparsification of gated recurrent neural networks
Hasan Memristor based low power high throughput circuits and systems design
Vo A Diagonally Weighted Binary Memristor Crossbar Architecture Based on Multilayer Neural Network for Better Accuracy Rate in Speech Recognition Application
Masoudian et al. Bitcoin Price Prediction using Long Short Term Memory Neural Networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination