CN114841333A - Nonvolatile memory, storage system and operation method suitable for neural network - Google Patents


Publication number
CN114841333A
Authority
CN
China
Prior art keywords
memory cell
memory
strings
cell array
pair
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210617589.4A
Other languages
Chinese (zh)
Inventor
周稳
贾建权
贾信磊
游开开
杨琨
韩佳茵
徐盼
靳磊
Current Assignee
Yangtze Memory Technologies Co Ltd
Original Assignee
Yangtze Memory Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Yangtze Memory Technologies Co Ltd filed Critical Yangtze Memory Technologies Co Ltd
Priority to CN202210617589.4A priority Critical patent/CN114841333A/en
Publication of CN114841333A publication Critical patent/CN114841333A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
        • G06 - COMPUTING; CALCULATING OR COUNTING
            • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
                • G06N 3/00 - Computing arrangements based on biological models
                    • G06N 3/02 - Neural networks
                        • G06N 3/04 - Architecture, e.g. interconnection topology
                            • G06N 3/045 - Combinations of networks
                        • G06N 3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
                            • G06N 3/063 - Physical realisation using electronic means
                        • G06N 3/08 - Learning methods
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
        • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
            • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
                • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

Some embodiments of the present application provide a nonvolatile memory, a storage system, and an operating method suitable for a neural network. The nonvolatile memory includes peripheral circuitry configured to: apply a bit line voltage to a bit line connected to a memory cell pair, the bit line voltage serving as one input to a neuron in the neural network; apply a read voltage to a word line connected to the memory cell pair; and determine the output of the neuron based on the conductance difference between the two memory cells of the memory cell pair, the conductance difference serving as the weight corresponding to that input of the neuron.

Description

Nonvolatile memory, storage system and operation method suitable for neural network
Technical Field
The present application relates to the field of semiconductor technology and, more particularly, to a nonvolatile memory suitable for a neural network, a nonvolatile memory system, and an operating method of a nonvolatile memory for implementing a neural network.
Background
The rapid development of artificial neural networks (referred to as neural networks for short) has led to a new wave of research in artificial intelligence. To accelerate neural network inference and training so that neural networks can be deployed on more terminal devices, computing power must be increased and power consumption reduced at the hardware level.
The basic principle of a neural network is to fit complex functional relationships by connecting multiple layers of abstract neurons: each neuron multiplies its input vector by stored synaptic weights (weights for short), accumulates the products, and produces an output through a nonlinear activation function.
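This multiply-accumulate-and-activate principle can be sketched as follows; the function name and the sigmoid activation are illustrative choices, not taken from the application:

```python
import math

def neuron_output(inputs, weights, bias=0.0):
    # Multiply-accumulate: weighted sum of the input vector
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Nonlinear activation (sigmoid, chosen purely as an example)
    return 1.0 / (1.0 + math.exp(-s))
```

A zero weighted sum yields an output of 0.5 under the sigmoid, and all outputs lie strictly between 0 and 1.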
However, as the parameter scale and computation load of neural networks grow rapidly, their hardware platforms face a mismatch between memory bandwidth and the computation speed of the compute units under massive data throughput.
Disclosure of Invention
Embodiments of the present application provide a nonvolatile memory applicable to a neural network, a nonvolatile memory system, and an operating method of a nonvolatile memory for implementing a neural network, which may at least partially solve the above-mentioned problems in the related art.
An aspect of the present application provides a nonvolatile memory suitable for a neural network, the nonvolatile memory including: a plurality of channel structures and isolation structures, wherein each channel structure is divided by an isolation structure into at least two sub-channel structures along the extending direction of the channel structure, and the two sub-channel structures in the same channel structure correspond to adjacent memory strings; the memory cells in the plurality of memory strings are divided into memory cell pairs, the two memory cells of each memory cell pair are respectively located in two memory strings and connected to the same word line, the two memory strings are respectively connected to two bit lines, and the plurality of memory cells connected by the same word line and located in different memory strings correspond to one neuron in the neural network; and peripheral circuitry configured to: apply a bit line voltage to the bit line connected to a memory cell pair, the bit line voltage serving as one input to the neuron in the neural network; apply a read voltage to the word line connected to the memory cell pair; and determine the output of the neuron based on the conductance difference between the two memory cells of the memory cell pair, the conductance difference serving as the weight corresponding to that input of the neuron.
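The read-out arithmetic described above can be modelled as follows. This is a simplified behavioural sketch (names and units are illustrative assumptions): each signed weight is realised as the conductance difference of a memory cell pair, and by Ohm's law the difference of the two branch currents realises the weighted input, with currents summed across the pairs belonging to one neuron:

```python
def neuron_current(bitline_voltages, conductance_pairs):
    """Behavioural model of the differential read-out: each weight is
    the conductance difference (g_pos - g_neg) of a memory cell pair.
    Each cell contributes I = V * G, so subtracting the two branch
    currents yields a signed multiply; summing over pairs accumulates."""
    total = 0.0
    for v, (g_pos, g_neg) in zip(bitline_voltages, conductance_pairs):
        total += v * (g_pos - g_neg)  # signed multiply-accumulate
    return total
```

A pair with equal conductances contributes a zero weight, which is how the differential scheme represents weights of either sign with cells whose conductances are individually non-negative.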
In some embodiments, the peripheral circuitry is further configured to: a programming operation is performed on at least one of the pair of memory cells to adjust the conductance difference.
In some embodiments, the plurality of memory strings form a memory cell array that includes a plurality of two-dimensional memory cell arrays, the memory strings in each two-dimensional memory cell array are connected to the same top select line, and each two-dimensional memory cell array includes memory cell pairs; the peripheral circuitry is further configured to: determine, in respective predetermined time periods, the outputs contributed to the corresponding neurons by memory cell pairs located in different two-dimensional memory cell arrays.
In some embodiments, the plurality of memory strings form a memory cell array that includes a plurality of two-dimensional memory cell arrays, the memory strings in each two-dimensional memory cell array are connected to the same top select line, and each two-dimensional memory cell array includes memory cell pairs; the peripheral circuitry is further configured to: adjust, in respective predetermined time periods, the conductance differences of memory cell pairs located in different two-dimensional memory cell arrays.
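A minimal behavioural sketch of the time-multiplexed read-out described above, assuming each two-dimensional memory cell array (selected via its top select line) is read in its own time slot; the data layout and names are illustrative assumptions:

```python
def read_time_multiplexed(subarrays, bitline_voltages):
    """One two-dimensional memory cell array (one top select line) is
    read per time slot. Each neuron is a list of (g_pos, g_neg) pairs;
    its output is the voltage-weighted sum of conductance differences."""
    results = []
    for subarray in subarrays:                      # one time slot each
        slot = [sum(v * (gp - gn)
                    for v, (gp, gn) in zip(bitline_voltages, neuron))
                for neuron in subarray]
        results.append(slot)
    return results
```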
In some embodiments, at the end away from the bit lines, the isolation structure extends along the direction in which the plurality of channel structures are arranged.
In some embodiments, the memory cell is a floating gate type memory cell or a charge trap type memory cell.
Another aspect of the present application provides a nonvolatile memory system suitable for a neural network, the nonvolatile memory system including: at least one nonvolatile memory as described in any of the preceding embodiments; and a controller connected to the at least one nonvolatile memory and configured to control the peripheral circuitry in the nonvolatile memory.
Another aspect of the present application also provides an operating method of a nonvolatile memory for implementing a neural network, the nonvolatile memory including a plurality of channel structures and isolation structures, wherein each channel structure is divided by an isolation structure into at least two sub-channel structures along the extending direction of the channel structure, the two sub-channel structures in the same channel structure corresponding to adjacent memory strings; the memory cells in the plurality of memory strings are divided into memory cell pairs, the two memory cells of each memory cell pair being respectively located in two memory strings and connected to the same word line, the two memory strings being respectively connected to two bit lines, and the plurality of memory cells connected by the same word line and located in different memory strings corresponding to one neuron in the neural network. The operating method includes: applying a bit line voltage to the bit line connected to a memory cell pair, the bit line voltage serving as one input to the neuron in the neural network; applying a read voltage to the word line connected to the memory cell pair; and determining the output of the neuron based on the conductance difference between the two memory cells of the memory cell pair, the conductance difference serving as the weight corresponding to that input of the neuron.
In some embodiments, the method of operation further comprises: a programming operation is performed on at least one of the pair of memory cells to adjust the conductance difference.
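The adjustment step can be sketched as a hypothetical program-verify loop. The model below assumes, purely for illustration, that each program pulse lowers the conductance of one cell of the pair by a fixed step; the function name, step size, and pulse budget are assumptions, not details from the application:

```python
def tune_weight(g_pos, g_neg, target_diff, step=1e-7, max_pulses=100):
    """Hypothetical program-verify loop: each pulse lowers the
    conductance of one cell of the pair by `step`, nudging the pair's
    conductance difference (the stored weight) toward `target_diff`.
    Returns the updated pair and the number of pulses applied."""
    for pulse in range(max_pulses):
        diff = g_pos - g_neg
        if abs(diff - target_diff) < step:   # verify: close enough
            return g_pos, g_neg, pulse
        if diff > target_diff:
            g_pos -= step   # program the "+" cell to reduce the diff
        else:
            g_neg -= step   # program the "-" cell to increase the diff
    return g_pos, g_neg, max_pulses
```

Because programming only moves a cell's conductance in one direction, tuning one or the other cell of the pair is what allows the signed weight to be adjusted both up and down.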
In some embodiments, the plurality of memory strings form a memory cell array that includes a plurality of two-dimensional memory cell arrays, the memory strings in each two-dimensional memory cell array are connected to the same top select line, and each two-dimensional memory cell array includes memory cell pairs; determining the output of the neuron includes: determining, in respective predetermined time periods, the outputs contributed to the corresponding neurons by memory cell pairs located in different two-dimensional memory cell arrays.
In some embodiments, the plurality of memory strings form a memory cell array that includes a plurality of two-dimensional memory cell arrays, the memory strings in each two-dimensional memory cell array are connected to the same top select line, and each two-dimensional memory cell array includes memory cell pairs; adjusting the conductance difference includes: adjusting, in respective predetermined time periods, the conductance differences of memory cell pairs located in different two-dimensional memory cell arrays.
In addition, the nonvolatile memory, the nonvolatile memory system, and the operating method provided by at least one embodiment of the present application implement the forward-propagation (inference) process of a neural network in hardware by using the conductance differences of memory cell pairs in the memory cell array of the nonvolatile memory as the weights. This approach has advantages in unit storage density and capacity, and facilitates larger-scale deep neural networks, thereby supporting more complex neural network functionality.
Drawings
Other features, objects, and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, with reference to the accompanying drawings. Wherein:
FIG. 1 is a functional block diagram of a non-volatile storage system connected to a host according to an embodiment of the present application;
FIG. 2 is a functional block diagram of a non-volatile memory according to an embodiment of the present application;
FIG. 3 is an equivalent circuit diagram of a three-dimensional memory cell array according to an embodiment of the present application;
FIG. 4A is a schematic diagram of a physical structure of adjacent memory strings according to an embodiment of the present application;
FIG. 4B is a schematic top view of a plurality of memory strings, taken along plane A of FIG. 4A;
FIG. 4C is a schematic diagram of a physical structure of adjacent memory strings in a top view according to another embodiment of the present application;
FIG. 5 is a schematic diagram of a three-layer neural network architecture according to an embodiment of the present application;
FIG. 6 is a flow chart of a training process for a neural network according to an embodiment of the present application;
FIG. 7 is a flow chart of a method of operation of a non-volatile memory for performing a neural network in accordance with an embodiment of the present application;
FIG. 8 is an equivalent circuit diagram of a two-dimensional memory cell array in the three-dimensional memory cell array shown in FIG. 3;
FIG. 9 is a waveform diagram of voltages applied to n word lines to determine the output of a neuron according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a method of operating a non-volatile memory for implementing a neural network in accordance with another embodiment of the present application; and
FIG. 11 is a graph of conductance of a memory cell versus time to perform a programming operation according to an embodiment of the present application.
Detailed Description
For a better understanding of the present application, various aspects of the present application will be described in more detail with reference to the accompanying drawings. It should be understood that the detailed description is merely illustrative of exemplary embodiments of the present application and does not limit the scope of the present application in any way. Like reference numerals refer to like elements throughout the specification. The expression "and/or" includes any and all combinations of one or more of the associated listed items.
It should be noted that in this specification the expressions first, second, third, etc. are used only to distinguish one feature from another and do not indicate any limitation of features, in particular any order of precedence. Thus, without departing from the teachings of this application, a first portion discussed in this application may also be referred to as a second portion, and a first channel structure may also be referred to as a second channel structure, and vice versa.
In the drawings, the thickness, size, and shape of the components have been slightly adjusted for convenience of explanation. The figures are purely diagrammatic and not drawn to scale. As used herein, the terms "approximately", "about", and the like are used as terms of approximation rather than as terms of degree, and are intended to account for the inherent deviations in measured or calculated values that would be recognized by one of ordinary skill in the art.
It will be further understood that terms such as "comprising", "including", "having", and/or "containing", when used in this specification, specify the presence of stated features, elements, and/or components, but do not preclude the presence or addition of one or more other features, elements, components, and/or groups thereof. Furthermore, when a statement such as "at least one of" appears after a list of features, it modifies the entire list rather than individual elements in the list. Furthermore, when describing embodiments of the present application, the use of "may" means "one or more embodiments of the present application". Also, the term "exemplary" is intended to refer to an example or illustration.
Unless otherwise defined, all terms (including engineering and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. In addition, unless explicitly defined or contradicted by context, the specific steps included in the methods described herein are not necessarily limited to the order described, but can be performed in any order or in parallel. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Further, in this application, when "connected" or "coupled" is used, it may mean either direct contact or indirect contact between the respective components, unless there is an explicit other limitation or can be inferred from the context.
Fig. 1 is a functional block diagram of a non-volatile storage system 10 connected to a host 20 according to an embodiment of the present application. As shown in fig. 1, the electronic device formed by the host 20 and the storage system 10 may be a mobile phone, a desktop computer, a laptop computer, a tablet computer, an on-board computer, a game console, a printer, a positioning device, a wearable electronic device, a smart sensor, a Virtual Reality (VR) device, an Augmented Reality (AR) device, or any other suitable electronic device.
Host 20 may comprise a processor of an electronic device and be configured to control the overall operation of storage system 10, as well as to send or receive data to and from storage system 10. The host 20 may be a Central Processing Unit (CPU) or may be a system-on-chip (SoC), such as an Application Processor (AP).
The storage system 10 may store data that is accessed by the host 20. According to an interface protocol by which the storage system 10 is connected to the host 20, the storage system 10 may be configured as, for example, a Universal Flash Storage (UFS) system, a Solid State Disk (SSD), a multimedia card in the form of MMC, eMMC, RS-MMC, and micro MMC, a secure digital card in the form of SD, mini SD, and micro SD, a Personal Computer Memory Card International Association (PCMCIA) card type storage system, a Peripheral Component Interconnect (PCI) type storage system, a PCI express (PCI-E) type storage system, a Compact Flash (CF) card, a smart media card, or a memory stick, or any other suitable storage system.
As shown in fig. 1, the memory system 10 may include one or more non-volatile memories 110 for storing data and a controller 120 for controlling the non-volatile memories 110.
The controller 120 is coupled to the non-volatile memory 110 and the host 20, and is configured to control the operation of the non-volatile memory 110, manage data stored in the non-volatile memory 110, and communicate with the host 20. The controller 120 may, for example, include a host interface 121, a processor 122, a flash interface 123.
The host interface 121 in the controller 120 may communicate with the host 20 according to a particular communication protocol. The interface protocol of the host interface 121 may include any one of the Universal Flash Storage (UFS) protocol, Serial Advanced Technology Attachment (SATA) protocol, Peripheral Component Interconnect (PCI) protocol, PCI express (PCI-E) protocol, Universal Serial Bus (USB) protocol, MultiMedia Card (MMC) protocol, Parallel Advanced Technology Attachment (PATA) protocol, Small Computer System Interface (SCSI) protocol, Serial Attached SCSI (SAS) protocol, and the like.
The processor 122 in the controller 120 may, for example, include one or more ARM cores. The processor 122 may control the internal operation of the non-volatile memory 110 and provide compatibility to the host 20 by running firmware called a Flash Translation Layer (FTL). Further, the processor 122 may also implement functions such as wear leveling, garbage collection, and bad block management by running other firmware.
The flash interface 123 in the controller 120 may be responsible for managing data reads from and writes to the non-volatile memory 110, for example according to flash commands that conform to the ONFI or Toggle standards. For example, for each non-volatile memory 110, commands, addresses, and data may be transferred to it through the flash interface 123. When multiple non-volatile memories 110 are present, a particular non-volatile memory 110 may be selected by, for example, a strobe signal before commands, addresses, and data are transmitted.
Each non-volatile memory 110 may be referred to as a die (die), which may also be referred to as a memory granule. Each die may be the smallest basic management unit for flash memory communications. Illustratively, the non-volatile memory 110 may be a 3D NAND type memory. One non-volatile memory 110 or a plurality of non-volatile memories 110 may be integrated into one package. For example, 4-8 non-volatile memories 110 may be packaged together. It should be noted that the number of the nonvolatile memory 110 packages can be designed according to the capacity requirement, and the specific number is not limited in this application.
Fig. 2 is a functional block diagram of a nonvolatile memory 210 according to an embodiment of the present application. Among them, the nonvolatile memory 210 may be one example of a plurality of nonvolatile memories 110 shown in fig. 1. As shown in fig. 2, the nonvolatile memory 210 may include a three-dimensional memory cell array 220 and peripheral circuits such as a page buffer 231, a row decoder 232, a column decoder 233, a voltage generator 234, a logic control module 235, an I/O module 236, and a data bus 237. It should be understood that the operations performed by the above described circuit modules described in this application may be performed by processing circuitry. Alternatively, the processing circuitry may include, but is not limited to, hardware of logic circuitry or a hardware/software combination of a processor executing software.
The three-dimensional memory cell array 220 may include a plurality of memory cells arranged in a three-dimensional array formation. The plurality of memory cells may be connected to a plurality of Bit Lines (BL) and a plurality of Word Lines (WL) in a predetermined connection manner. Illustratively, each memory cell may be any one of a single-level cell (SLC) capable of storing one bit of data, a multi-level cell (MLC) capable of storing two bits of data, a triple-level cell (TLC) capable of storing three bits of data, and a quad-level cell (QLC) capable of storing four bits of data. For example, multiple SLC memory cells connected by the same word line correspond to one page.
The page buffer (or referred to as "sense amplifier") 231 may be configured to read data from the three-dimensional memory cell array 220 or program (write) data to the three-dimensional memory cell array 220 according to a control signal from the logic control module 235. In one example, the page buffer 231 may store data to be programmed to one page in the three-dimensional memory cell array 220. In another example, the page buffer 231 may sense a low-power signal of data stored in the memory cells of the three-dimensional memory cell array 220 in a read operation and amplify a small voltage swing to an identifiable logic level.
The row decoder 232 may be configured to be controlled by the logic control module 235 and to select a page in the three-dimensional memory cell array 220. For example, the row decoder 232 may be configured to select a corresponding page by driving a word line using a voltage generated by the voltage generator 234.
The column decoder 233 may be configured to be controlled by the logic control module 235 and select a corresponding bit line by applying a bit line voltage generated by the voltage generator 234.
The voltage generator 234 may be configured to be controlled by the logic control module 235 and generate a word line voltage (e.g., a charging voltage, a ground voltage, a read voltage, a program voltage, a pass voltage, a verify voltage, etc.), a bit line voltage, a source line voltage, and the like, to be provided into the three-dimensional memory cell array 220.
The logic control module 235 may be coupled to each of the peripheral circuit modules described previously and configured to control the operation of the respective peripheral circuit modules, and the logic control module 235 may control to perform an operation method of a neural network to be described later.
I/O module 236 may be coupled to logical control module 235 to forward control commands received from host 20 (see fig. 1) or controller 120 (see fig. 1) to logical control module 235 and to forward status information received from logical control module 235 to controller 120 (see fig. 1). The I/O module 236 may also be coupled to a column decoder 233 via a data bus 237 to buffer and forward data to and from the three dimensional array of memory cells 220.
FIG. 3 is an equivalent circuit diagram of a three-dimensional memory cell array 320 according to an embodiment of the present application. The three-dimensional memory cell array 320 may be an example of a portion of the three-dimensional memory cell array 220 shown in FIG. 2. For example, the three-dimensional memory cell array 320 illustrated in FIG. 3 may be referred to as a memory block.
As shown in FIG. 3, a memory block may include multiple memory strings (e.g., Str11+, Str12+, Str13+, Strm3-, etc.). The multiple memory strings Str11+ to Strm3- may be arranged in a two-dimensional array in the xy plane. Each memory string (e.g., Str11+) may extend in the z-direction and may include a top select transistor TST1+, memory cells MC1+ to MCn+, and a bottom select transistor BST1+ connected in series with each other. Illustratively, each memory string (e.g., Str11+) may further include one or more dummy memory cells (not shown) disposed between the top select transistor TST1+ and the memory cell MC1+, and one or more dummy memory cells (not shown) between the memory cell MCn+ and the bottom select transistor BST1+. Note that the numbers of select transistors (e.g., TST1+ and BST1+), memory cells (e.g., MC1+ to MCn+), and dummy memory cells included in each memory string (e.g., Str11+) are not specifically limited in this application.
In one example, the multiple memory strings Str11+ to Strm3- in a memory block may be connected to a common source line ACS. Illustratively, the bottom select transistor (e.g., BST1+) at the end of each memory string Str11+ to Strm3- may be connected to the common source line ACS.
In one example, the plurality of memory cells (e.g., MC1+, MC1-, etc.) located at (approximately) the same height from the common source line ACS in the memory strings Str11+ to Strm3- may be connected to the same word line (e.g., WL1), so that, for example, a memory block includes a plurality of word lines WL1 to WLn. As previously described, the plurality of memory cells (e.g., MC1+, MC1-, etc.) connected to the same word line (e.g., WL1) may be controlled by that word line and may constitute one page, so that, for example, a memory block includes a plurality of pages corresponding to the word lines WL1 to WLn.
In one example, the top select transistors (e.g., TST1+) located at (approximately) the same height from the common source line ACS in multiple memory strings arranged in the x-direction (e.g., Str11+, Str11-, Strm1+, Strm1-) may be connected to the same top select line (e.g., TSL1), so that, for example, a memory block includes multiple top select lines TSL1, TSL2, TSL3 arranged in the y-direction.
In one example, the bottom select transistors (e.g., BST1+) located at (approximately) the same height from the common source line ACS in multiple memory strings arranged in the x-direction (e.g., Str11+, Str11-, Strm1+, Strm1-) may be connected to the same bottom select line (e.g., BSL1), so that, for example, the memory block includes a plurality of bottom select lines BSL1, BSL2, BSL3 arranged in the y-direction. In another example, the gate terminals of the bottom select transistors located at (approximately) the same height from the common source line ACS in the respective memory strings may be connected to each other and to the same bottom select line, so that, for example, the memory block includes a single bottom select line (not shown).
In one example, multiple memory strings arranged in the y-direction (e.g., Str11+, Str12+, Str13+) may be connected to the same bit line (e.g., BL1+). Illustratively, the top select transistor (e.g., TST1+) at the end of each memory string Str11+, Str12+, Str13+ may be connected to the bit line BL1+, so that, for example, the memory block includes a plurality of bit lines BL1+, BL1- to BLm+, BLm- arranged in the x-direction.
The physical structure of adjacent memory strings in a three-dimensional memory cell array is exemplarily described below. FIG. 4A is a schematic diagram of a physical structure of adjacent memory strings according to an embodiment of the present application. FIG. 4B is a schematic top view taken along plane A from FIG. 4A with a plurality of memory strings.
As shown in FIG. 4A, the channel structure may include two sub-channel structures 411, 412. Each sub-channel structure (e.g., 411) may, for example, have the shape of a (substantially) half cylinder, and the extending direction (z-direction) of the sub-channel structure 411 may be the direction in which the plurality of memory cells in a memory string are arranged. The sub-channel structure 411 may include a charge blocking layer 401, a charge trapping layer 402, a tunneling layer 403, and a channel layer 404 sequentially arranged in a direction perpendicular to its lateral surface. The charge blocking layer 401, the charge trapping layer 402, and the tunneling layer 403 may together be referred to as a functional layer 405. Illustratively, the materials of the charge blocking layer 401, the charge trapping layer 402, and the tunneling layer 403 may be silicon oxide, silicon nitride, and silicon oxynitride, respectively. The material of the channel layer 404 may be polysilicon. It should be noted that the shape of the sub-channel structure is not limited thereto, and may also be, for example, a semi-elliptical cylinder, a prism, or another irregular shape.
The isolation structure 420 may separate the two sub-channel structures 411 and 412 along a direction perpendicular to their extending direction (z-direction). Illustratively, viewed in the xy-plane, the top surface of the isolation structure 420 may be rectangular and disposed between the two semicircular top surfaces of the two sub-channel structures 411, 412. For example, the two semicircular top surfaces of the sub-channel structures 411, 412 may be symmetrically arranged with respect to the rectangular top surface of the isolation structure 420. Further, the isolation structure 420 may extend in the extending direction (z-direction) of the sub-channel structures 411, 412, so that the sub-channel structures 411, 412 are physically separated by the isolation structure 420. Illustratively, the material of the isolation structure 420 may include a dielectric material such as silicon oxide, silicon nitride, or silicon oxynitride to achieve electrical isolation between the two sub-channel structures 411, 412.
In one example, the word lines WL1 to WLn may be sequentially disposed at intervals along the extending direction (z-direction) of the sub-channel structure 411 or 412. Each word line (e.g., WL2) may surround a portion of the sub-channel structure 411 or 412 in its extending direction. Illustratively, the material of each word line (e.g., WL1 to WLn) may include tungsten, doped polysilicon, or any other suitable conductive material.
In one example, each word line (e.g., WL2) and the portion of the functional layer 405 and the channel layer 404 corresponding to the word line WL2 together constitute a memory cell. For example, applying a voltage to the word line WL2 may cause charges (e.g., electrons) in the channel layer 404 to be injected into the charge trapping layer 402, or cause charges in the charge trapping layer 402 to be withdrawn back to the channel layer 404 through cooperation of the word line WL2 and other control lines. As shown in fig. 4A, the channel layer 404 is shared by a plurality of memory cells arranged in the z-direction, in other words, a plurality of memory cells may be arranged in series in the z-direction (similar to a NAND gate).
It should be noted that, as described above, for a memory cell, the charge trapping layer 402 made of dielectric material (e.g., silicon nitride) can be similar to a trap, so that it is difficult for charges injected therein to escape, and thus the memory cell can be referred to as a charge trapping memory cell. In another example, the functional layer may sequentially include a charge blocking layer, a floating gate layer, and a tunneling layer, the floating gate layer may be made of, for example, a conductive material, and after charges are injected into the floating gate layer, the charge blocking layer and the tunneling layer made of a dielectric material and located at both sides of the floating gate layer may allow the charges to be stored in the floating gate layer without charge escape, so the memory cell may be referred to as a floating-gate type memory cell.
In some examples, as shown in fig. 4A and 4B, at the end away from the bit line (e.g., the bottom shown in fig. 4A), the isolation structure 420 extends along the direction (e.g., the x-direction) in which the plurality of channel structures (e.g., the channel structures including the sub-channel structures 411 and 412) are arranged. It is noted that, for the sub-channel structures (e.g., 411 and 412) included in one channel structure, the isolation structure 420 allows the bottom select transistors located in the two sub-channel structures to each act as a separate element. Moreover, the isolation structure 420 extending in the x-direction insulates the bottom select lines (e.g., BSL1 and BSL2) from each other at the end away from the bit line (e.g., the bottom shown in fig. 4A), so that the pluralities of bottom select transistors on the two sides of the direction (e.g., the x-direction) in which the isolation structure 420 extends can be individually controlled by the two bottom select lines (e.g., BSL1 and BSL2), respectively.
FIG. 4C is a schematic top view of the physical structure of adjacent memory strings according to another embodiment of the present application. As shown in fig. 4C, the isolation structure 420' may separate the four sub-channel structures 411', 412', 413', 414' from each other along directions perpendicular to their extending direction (z-direction). Illustratively, viewed in the xy-plane, the top surface of the isolation structure 420' may be, for example, cross-shaped, and the top surfaces of the four sub-channel structures 411', 412', 413', 414' may each be, for example, (approximately) a quarter circle. The four sub-channel structures 411', 412', 413', 414' may be arranged between adjacent arms of the cross shape, respectively. Further, the isolation structure 420' may extend along the extending direction (z-direction) of the sub-channel structures (e.g., 411'), so that the sub-channel structures 411', 412', 413', 414' are physically separated by the isolation structure 420'. When the material of the isolation structure 420' is a dielectric material, electrical isolation between the sub-channel structures 411', 412', 413', 414' can be achieved.
It is understood that fig. 4A to 4C illustrate the case where the channel structure includes two sub-channel structures and four sub-channel structures, respectively. However, the channel structure may also comprise other numbers of sub-channel structures, such as 3, 5, 6, 7, etc.
In one example, the respective functional layers and channel layers of the plurality of sub-channel structures in the channel structure may be formed during the same process (e.g., a thin film deposition process), such that the physical structure differences (e.g., film thickness differences of the functional layers and/or the channel layers) of the plurality of sub-channel structures included in the same channel structure are small.
The neural network is exemplarily described below. A neural network may be composed of an input layer, an output layer, and one or more hidden layers between them, each layer including one or more neurons. According to the connection relations of the neurons in the neural network, the input traverses each layer through mathematical transformations and is converted into a probability for each output.
Fig. 5 is a schematic diagram of a three-layer neural network architecture 500 according to an embodiment of the present application. FIG. 5 shows an input layer with three input neurons (I1, I2, I3), an output layer with two output neurons (O1, O2), and a hidden layer with four hidden neurons (H1, H2, H3, H4). The circles in the neural network 500 represent neurons, and the connecting lines represent variable weights between each neuron of the previous layer and one neuron of the current layer. A neuron (e.g., hidden neuron H1) can be implemented as a mathematical function that receives a plurality of inputs (I1, I2, I3), weights them, and then accumulates them to produce an output (O_H1 = I1·ω1 + I2·ω2 + I3·ω3). Further, the output O_H1 generated by a hidden neuron (e.g., H1) may be provided as one input to each output neuron (e.g., O1). The outputs generated by the output neurons may be provided as the outputs of the neural network 500.
In some embodiments, the weights (e.g., ω1, ω2, ω3) may be adjusted using a training process. Furthermore, a neuron (e.g., a hidden neuron or an output neuron) may have a threshold such that an output (i.e., an input to a later layer or an output of the neural network) is generated only when the accumulated weighted input exceeds the threshold. Alternatively, the output of the neuron may be calculated by some non-linear function (e.g., the Sigmoid function). Although fig. 5 shows one hidden layer, a complex Deep Neural Network (DNN) may have many such hidden layers.
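The weighted-accumulate behavior of the neurons described above can be sketched numerically. The following is an illustrative NumPy model of the 3-4-2 network of FIG. 5; the random weight values and the use of the Sigmoid function at every layer are assumptions for illustration only, not part of the patent disclosure.

```python
import numpy as np

def sigmoid(x):
    # Non-linear activation applied to the accumulated weighted inputs
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical weights: 3 inputs -> 4 hidden neurons -> 2 output neurons
rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(4, 3))   # omega values for the hidden layer
W_output = rng.normal(size=(2, 4))   # omega values for the output layer

def forward(inputs):
    # Each hidden neuron: O_H = sum_i(I_i * omega_i), then activation
    hidden = sigmoid(W_hidden @ inputs)
    # Hidden outputs serve as the inputs to the output neurons
    return sigmoid(W_output @ hidden)

probs = forward(np.array([0.2, 0.5, 0.8]))
```

Each entry of `probs` plays the role of one output neuron's value.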
Problems such as pattern recognition can be solved based on the trained neural network. For example, a trained neural network can be used to infer a fruit class in an image. Fig. 6 is a flow chart 600 of a training process for a neural network according to an embodiment of the present application. For example, the training process may be based on supervised learning rules. The training process 600 is described in detail below in conjunction with the neural network 500 shown in fig. 5.
In step 601, the input neurons I1, I2, I3 (refer to fig. 5) receive the training input. Illustratively, the training input is a set of images, each image including a fruit category to be identified.
In step 602, the input neurons I1, I2, I3 may be connected, using the current weights, to the hidden neurons H1, H2, H3, H4 of the next layer. Further, the hidden neurons H1, H2, H3, H4 are connected to the output neurons O1, O2 of the next layer, and the outputs of the hidden neurons H1, H2, H3, H4 serve as the inputs of the output neurons O1, O2. Further, the output neurons O1, O2 serve as the output of the neural network 500. In other words, the training input propagates from the input layer through all hidden layers in this manner until it reaches the output layer. Illustratively, as previously described, in the example of identifying fruit categories using a neural network, the hidden layer and the output layer use the current weights to calculate the probability that the fruit in an image belongs to a particular category, and return the label of the target fruit category in step 603.
In step 604, it is determined whether the probability of the particular fruit category output by the neural network using the current weights matches the label with sufficient accuracy; if so, the training is complete (step 605). If the results are not accurate enough, the neural network adjusts the weights in step 606 and then loops back to run the input data again with the adjusted weights.
When step 605 determines the weights of the neural network, the weights may be used to infer the fruit categories in images as described above, and the determined weights may be stored in the non-volatile memory 320 (see FIG. 3).
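A minimal sketch of the loop in steps 601-606, using plain gradient descent on a single linear layer as a stand-in for whatever weight-adjustment rule step 606 actually uses; the data, learning rate, and tolerance below are made up for illustration.

```python
import numpy as np

def train(inputs, targets, lr=0.5, tol=1e-3, max_iters=10000):
    """Sketch of steps 601-606: propagate, check accuracy, adjust weights."""
    rng = np.random.default_rng(1)
    w = rng.normal(size=inputs.shape[1])          # current weights
    for _ in range(max_iters):
        outputs = inputs @ w                       # step 602: forward propagation
        error = outputs - targets                  # step 604: compare with labels
        if np.mean(error ** 2) < tol:              # accurate enough -> step 605
            return w
        w -= lr * inputs.T @ error / len(targets)  # step 606: adjust weights
    return w

X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])   # exactly reproducible with weights [2, 1]
w = train(X, y)
```

The loop exits as soon as the outputs match the labels within the tolerance, mirroring the "accurate enough" branch of step 604.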
FIG. 7 is a flow chart of a method 700 of operating a non-volatile memory for implementing a neural network in accordance with an embodiment of the present application. Fig. 8 is an equivalent circuit diagram of a two-dimensional memory cell array 820-1 in the three-dimensional memory cell array 320 shown in fig. 3. The method 700 of operation is illustratively described below in conjunction with fig. 7 and 8. The method of operation 700 may utilize, for example, a portion of the three-dimensional memory cell array 320 illustrated in fig. 8 (i.e., the two-dimensional memory cell array 820-1) to perform a forward propagation or inference process in the training process of a neural network.
As shown in fig. 8, the two-dimensional memory cell array 820-1 may be equivalent to a plurality of neurons in a hidden layer or an output layer (hereinafter referred to as the current layer) of a neural network (e.g., the neural network 500 illustrated in fig. 5). Each memory string in the two-dimensional memory cell array 820-1 can be controlled by the same top select line TSL1, so that the plurality of memory strings Str11+ to Strm1- shown in FIG. 8 can, under the control of the same top select line TSL1, be connected to the plurality of bit lines (e.g., BL1+, BL1- to BLm+, BLm-).
In step 701, as shown in FIG. 8, a plurality of memory cells (e.g., MC1+ to MCm-) connected to the same word line (e.g., WL1) and located in different memory strings (e.g., Str11+ to Strm1-) may correspond to one neuron in the current layer, and the two-dimensional memory cell array 820-1 may correspond to n neurons in the current layer. The memory cells MC1+ to MCm- form m memory cell pairs, i.e., the memory cell pair MC1+ and MC1- to the memory cell pair MCm+ and MCm-.
Further, the plurality of bit lines (e.g., BL1+ to BLm-) may be divided into a plurality of bit line pairs, e.g., the bit line pair BL1+ and BL1- to the bit line pair BLm+ and BLm-. In this step, a bit line voltage V_BL may be applied to each bit line pair (e.g., BL1+ and BL1-), and the bit line voltage V_BL applied to a bit line pair (e.g., BL1+ and BL1-) may correspond to one input to a neuron in the current layer. In other words, the bit line voltage V_BL applied to each bit line pair corresponds to one input to one neuron in the current layer. Illustratively, one neuron in the current layer may receive m inputs (i.e., the bit line voltages V_BL on the m bit line pairs). Optionally, the bit line voltages V_BL applied to the two bit lines of each bit line pair (e.g., BL1+ and BL1-) may be equal. In another example, the bit line voltage V_BL may first be applied to the m bit lines BL1+, BL2+ to BLm+ of the bit line pairs BL1+ and BL1- to BLm+ and BLm- (m inputs), and then applied to the m bit lines BL1-, BL2- to BLm- (another m inputs).
In step 702, the conductance difference G1+ - G1- of a memory cell pair (e.g., MC1+ and MC1-) may serve as the weight of the corresponding input to the neuron (i.e., the bit line voltage V_BL1 applied to the bit line pair BL1+ and BL1-). It should be noted that, as described above, for a floating-gate type or charge-trap type memory cell, the conductance value (i.e., the reciprocal of the resistance value) exhibited by each memory cell differs according to the amount of charge injected into the charge trapping layer or the floating gate layer. In this step, according to these characteristics of the memory cells and the electrical connection characteristics of the memory cell array, a read voltage may be applied to the word line to which the memory cell pair is connected, thereby sensing the currents flowing through both memory cells of the memory cell pair and providing the execution condition for determining the output of the neuron in the subsequent step 703.
Fig. 9 is a waveform diagram of voltages applied to the n word lines to determine the outputs of the neurons according to an embodiment of the present application. As shown in fig. 9, a read pulse voltage Vread (gray) is sequentially applied to the word lines WL1 to WLn, so as to sequentially determine the current difference of each memory cell pair connected to the word line WL1 (e.g., the current differences of the memory cell pair MC1+ and MC1- to the memory cell pair MCm+ and MCm-), up to the current differences of the memory cell pairs connected to the word line WLn. In one example, since the bit line voltages V_BL1+ to V_BLm- have already been applied in step 701 to the bit lines BL1+ to BLm- connected to the memory strings Str11+ to Strm1-, when the read pulse voltage Vread is applied to the word line WL1 at time t1, the currents of the memory strings Str11+ to Strm1- flowing through each memory cell pair MC1+ and MC1- to MCm+ and MCm- connected to the word line WL1 can be sensed at the other ends of the memory strings Str11+ to Strm1- (e.g., the ends not connected to the bit lines).
In step 703, the current values (e.g., I_MC1+ and I_MC1-) of the two memory strings (e.g., Str11+ and Str11-) of each memory cell pair (e.g., MC1+ and MC1-) may be subjected to difference processing (i.e., I_MC1+ - I_MC1-), for example by a subtractor (not shown), to obtain the product of each input (e.g., V_BL1+ and V_BL1-) and the corresponding weight (e.g., G1+ - G1-), i.e., I_MC1+ - I_MC1- = G1+ × V_BL1+ - G1- × V_BL1-. In other words, since the word line WL1 connects the plurality of memory cell pairs MC1+ and MC1- to MCm+ and MCm-, the products of the plurality of inputs (e.g., V_BL1+ to V_BLm-) and the plurality of corresponding weights (e.g., (G1+ - G1-) to (Gm+ - Gm-)), i.e., (G1+ × V_BL1+ - G1- × V_BL1-) to (Gm+ × V_BLm+ - Gm- × V_BLm-), can be determined. Further, summing the above products may be accomplished by sensing the total current in the common source line ACS, thereby determining the output of the neuron corresponding to the word line WL1, i.e., Σ_{i=1}^{m} (Gi+ × V_BLi+ - Gi- × V_BLi-), and thereby determining the n outputs of the n neurons of the current layer in the neural network.
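The multiply-accumulate of steps 701-703 can be modeled numerically as follows. The conductance and voltage values are illustrative, and the final summation stands in for sensing the total current on the common source line ACS.

```python
import numpy as np

# Illustrative values: m = 4 bit line pairs feeding one neuron (word line WL1)
V_plus  = np.array([0.3, 0.5, 0.2, 0.4])   # V_BL1+ ... V_BLm+ (the inputs)
V_minus = V_plus.copy()                    # equal voltage on both lines of a pair
G_plus  = np.array([2.0, 1.0, 3.0, 0.5])   # conductances of MC1+ ... MCm+
G_minus = np.array([1.5, 2.0, 1.0, 0.5])   # conductances of MC1- ... MCm-

# Steps 702/703: per-pair current difference = input x (G+ - G-) weight
I_diff = G_plus * V_plus - G_minus * V_minus

# Summing on the common source line ACS yields the neuron output
neuron_output = I_diff.sum()
```

Note that the pair (G_plus[1], G_minus[1]) realizes a negative weight, which a single memory cell's conductance could not represent.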
In one example, while the read pulse voltage Vread is applied to the word line WL1 at time t1, a bias pulse voltage Vbias may be applied to the word lines WL2 to WLn. For example, the bias pulse voltage Vbias is greater than the read pulse voltage Vread. In other words, for the memory string Str11+, the memory cell MC1+ may be referred to as the selected memory cell, and the memory cells in the memory string Str11+ other than the memory cell MC1+ may be referred to as unselected memory cells. When the bias pulse voltage Vbias applied to the unselected memory cells is greater than the read pulse voltage Vread applied to the selected memory cell MC1+, the influence of the unselected memory cells on the sensed current flowing through the selected memory cell MC1+ can be reduced, thereby improving the accuracy of the obtained neuron output.
It should be noted that the voltage waveforms of the word lines WL1 to WLn shown in fig. 9 are only exemplary. In other examples, the read pulse voltage Vread need not be applied to the word lines WL1 to WLn one by one in the time domain; in other words, the read pulse voltage Vread may be omitted for any one or more word lines, so that the memory cell pairs connected to those word lines do not correspond to neurons of the current layer in the neural network.
In some examples, as described previously, in the case where the end of the isolation structure 420 away from the bit line (e.g., the bottom shown in fig. 4A) extends along the direction (e.g., the x-direction) in which the plurality of channel structures (e.g., the channel structures including the sub-channel structures 411 and 412) are arranged, the pluralities of bottom select transistors on the two sides of that direction can be individually controlled by the two bottom select lines (e.g., BSL1 and BSL2), respectively. In the process of computing the neural network output using the memory cell array of the nonvolatile memory, the nonvolatile memory performs a read operation. Therefore, when the top select transistors corresponding to the top select line TSL1 are turned on to control the two-dimensional memory cell array to perform the read operation (i.e., to determine the output of the current layer of the neural network), the bottom select transistors corresponding to the bottom select line BSL1 can be turned off, thereby avoiding read disturb caused by the connection between each memory string and the common source line ACS during the read operation, and further improving the accuracy of the neural network.
In an exemplary embodiment of the present application, implementing a neural network algorithm in hardware by representing the weights of neurons as the conductance differences of memory cell pairs in the memory cell array of a non-volatile memory allows the conductances of the two memory cells in a memory cell pair to be independently adjusted (e.g., increased or decreased). Meanwhile, the weight range can be enlarged to accommodate more weight states, which is conducive to more precise neural network operations. Moreover, negative weights can be realized, so that the weights can represent richer functions such as neural inhibition. The scheme also benefits from the high storage density and large capacity of the memory cells, which is conducive to executing larger-scale deep neural networks and thereby supports more complex neural network functions.
In addition, since the weight corresponding to one input of a neuron in the current layer is represented by the conductance difference of one memory cell pair, physical structure differences (e.g., differences in layer thickness) between the two memory cells of the pair may affect the consistency of their conductance characteristics, and thus the accuracy of the corresponding conductance difference. By adopting the physical structure of adjacent memory strings illustrated in fig. 4A or 4C and using the two memory cells connected to the same word line in adjacent memory strings as a memory cell pair, the influence of the physical structure differences of the two memory cells on the accuracy of their conductance difference can be effectively reduced, which is beneficial to improving the accuracy of executing the operation method 700 and the anti-interference characteristics of the neural network. At the same time, the density of the memory cells can be increased to double the density of the neuron weights represented by the conductance differences of the memory cell pairs.
FIG. 10 is a schematic diagram of a method of operating a non-volatile memory for implementing a neural network according to another embodiment of the present application. As shown in fig. 10, a plurality of memory cells in a plurality of memory strings connected by the top select line TSL1 may constitute one two-dimensional memory cell array (e.g., the first memory cell array 1020-1). Similarly, a plurality of memory strings in the second memory cell array 1020-2 may be connected to one top select line, a plurality of memory strings in the third memory cell array 1020-3 may be connected to another top select line, and a plurality of memory strings in the pth memory cell array may be connected to yet another top select line. In other words, the first memory cell array 1020-1, the second memory cell array 1020-2, the third memory cell array 1020-3 to the pth memory cell array 1020-p may be controlled by p top select lines, respectively. Therein, the first memory cell array 1020-1 may be the same as the portion 820 of the three-dimensional memory cell array illustrated in fig. 8.
In one example, for the first memory cell array 1020-1, by applying a turn-on voltage to the top select line TSL1 (i.e., a voltage at which the plurality of top select transistors connected to the top select line TSL1 are turned on), the plurality of memory strings Str11+ to Strm1- in the first memory cell array 1020-1 can receive the bit line voltages (e.g., V_BL1+ to V_BLm-). That is, as described in detail above, the first memory cell array 1020-1 may correspond to a portion of the plurality of neurons in the current layer of the neural network and may generate the corresponding outputs within a predetermined time (e.g., a first time period).
Further, by applying a turn-on voltage to the top select lines corresponding to one or more of the second memory cell array 1020-2 through the pth memory cell array 1020-p, at least a portion of these arrays can, as further portions of the plurality of neurons in the current layer of the neural network, generate their corresponding outputs in parallel within the same predetermined time (i.e., the first time period), thereby effectively improving the efficiency of performing the forward propagation or inference process of the neural network algorithm using the non-volatile memory.
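The array-level parallelism described here can be sketched as batched matrix products. The dimensions and values below are illustrative, with each slice of `G_diff` standing for the conductance-difference weights of one two-dimensional memory cell array whose top select line is turned on.

```python
import numpy as np

p, m, n = 4, 8, 6          # p arrays, m bit line pairs, n word lines each
rng = np.random.default_rng(2)
G_diff = rng.normal(size=(p, n, m))   # conductance-difference weights per array
V_in   = rng.uniform(size=(p, m))     # bit line voltages applied to each array

# Turning on several top select lines in the same time window lets all
# selected arrays produce their n neuron outputs in parallel
outputs = np.einsum('pnm,pm->pn', G_diff, V_in)
```

Each row of `outputs` corresponds to the n neuron outputs that one array contributes during the first time period.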
As indicated above, during the training of the neural network, if the probability output by the neural network does not match the label with sufficient accuracy, the individual weights in the neural network need to be adjusted. Since the conductance difference of a memory cell pair (e.g., G1+ - G1-) serves as the weight of one input of a neuron, the conductance difference can be adjusted by adjusting the conductance of one or both memory cells of the pair. It should be noted that adjusting the weights in the neural network according to whether the probability output by the neural network matches the label with sufficient accuracy may be referred to as Back Propagation (BP) of the neural network.
FIG. 11 is a graph of conductance of a memory cell versus time to perform a programming operation according to an embodiment of the present application. As shown in fig. 11, as the execution time for performing a program operation on a memory cell increases, the conductance of the memory cell decreases accordingly. Specifically, for example, as the time for performing a program operation on the memory cell increases, more charge in the channel layer of the memory cell is injected into the charge trapping layer 402 (see fig. 4), thereby causing the conductance of the memory cell to decrease, i.e., adjusting the conductance of the memory cell. In another example, the conductance of the memory cell may also be adjusted by increasing the pulse strength at which the programming operation is performed on the memory cell, such that more charge in the channel layer of the memory cell is injected into the charge trapping layer 402 (see FIG. 4).
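The trend in FIG. 11 can be captured by a simple decay model. The exponential form and the time constant below are assumptions for illustration only; the text states only that conductance decreases monotonically as programming time increases.

```python
import math

def conductance_after_program(g0, t_program, tau=1.0e-4):
    # Assumed model: conductance decays toward zero as programming time grows,
    # reflecting more charge injected into the charge trapping layer
    return g0 * math.exp(-t_program / tau)

g_start = 2.0                                    # initial conductance
g_short = conductance_after_program(g_start, 5e-5)   # short program pulse
g_long  = conductance_after_program(g_start, 2e-4)   # longer program pulse
```

A longer (or stronger) programming operation thus maps to a lower conductance, which is the knob used to tune a weight.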
Referring again to FIG. 8, the execution of a programming operation on one memory cell MC1+ of the two-dimensional memory cell array 820-1 is described below by way of example.
In one example, the memory cell MC1+ may be referred to as the selected memory cell. When the selected memory cell is programmed, a program voltage (e.g., 15-20 V) is applied to the word line WL1 connected to the selected memory cell, the top select transistor TST1+ of the memory string Str11+ in which the memory cell MC1+ is located is turned on, and a ground voltage, for example, is applied to the bit line BL1+ connected to the memory string Str11+. Under the action of the high voltage on the word line WL1, charges (e.g., electrons) tunnel into the charge trapping layer, thereby changing the conductance of the memory cell MC1+. Optionally, a program inhibit voltage (e.g., 2 V) may be applied to the other bit lines BL1- to BLm- to prevent charge tunneling, thereby inhibiting the memory cells MC1- to MCm- in the memory strings Str11- to Strm1- from being programmed. Note that the above programming operation on the memory cell MC1+ may be performed within a predetermined time (e.g., a second time period).
In one example, referring again to FIG. 10, the conductances of memory cells in one or more of the second memory cell array 1020-2 through the pth memory cell array 1020-p may be adjusted in parallel, for example within the second time period, thereby effectively improving the efficiency of performing the back propagation operations of the neural network algorithm using the non-volatile memory.
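The back-propagation adjustment of a memory cell pair's conductance difference can be sketched as follows. Since a programming operation only lowers a cell's conductance (by injecting charge), the sketch lowers G+ to decrease the weight and lowers G- to increase it; the step size is a hypothetical programming-pulse granularity, and conductance limits and erase operations are omitted.

```python
def adjust_weight(g_plus, g_minus, target_weight, step=0.05):
    """Iteratively program one cell of the pair until the conductance
    difference (the neuron weight) is within one step of the target."""
    weight = g_plus - g_minus
    while abs(weight - target_weight) > step:
        if weight > target_weight:
            g_plus -= step   # program MC+: inject charge, lower its conductance
        else:
            g_minus -= step  # program MC-: raises the (possibly negative) weight
        weight = g_plus - g_minus
    return g_plus, g_minus

# Drive the weight of a pair from +1.0 down to a negative target value
gp, gm = adjust_weight(2.0, 1.0, -0.5)
```

The example illustrates why the differential scheme supports negative weights: the weight sign flips without any cell's conductance itself becoming negative.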
The above description is only an embodiment of the present application and an illustration of the technical principles applied. It will be appreciated by a person skilled in the art that the scope of protection covered by the present application is not limited to the embodiments with a specific combination of the features described above, but also covers other embodiments with any combination of the features described above or their equivalents without departing from the technical idea. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (11)

1. A non-volatile memory adapted for use in a neural network, comprising:
a memory cell array comprising a plurality of channel structures and isolation structures, wherein each channel structure is divided into at least two sub-channel structures by the isolation structures along a direction parallel to the extending direction of the channel structure, two sub-channel structures in the same channel structure correspond to adjacent memory strings, the memory cells in the plurality of memory strings are divided into memory cell pairs, the two memory cells in each memory cell pair are respectively located in two memory strings and connected to the same word line, the two memory strings are respectively connected to two bit lines, and a plurality of memory cells connected to the same word line and located in different memory strings correspond to one neuron in the neural network; and
a peripheral circuit configured to:
applying a bit line voltage to the bit lines connected to the memory cell pair, the bit line voltage serving as one input to a neuron in the neural network;
applying a read voltage to a word line connected to the pair of memory cells; and
determining an output of the neuron based on a difference in conductance of two of the pair of memory cells, the difference in conductance being a weight corresponding to the input of the neuron.
2. The non-volatile memory of claim 1, wherein the peripheral circuitry is further configured to:
performing a programming operation on at least one memory cell of the pair of memory cells to adjust the conductance difference value.
3. The non-volatile memory of claim 1, wherein the plurality of memory strings form a memory cell array, the memory cell array comprising a plurality of two-dimensional memory cell arrays, the plurality of memory strings in each two-dimensional memory cell array connected to a same top select line, and each two-dimensional memory cell array comprising the pair of memory cells, the peripheral circuitry further configured to:
and determining the output of the storage unit in different two-dimensional storage unit arrays to the corresponding neuron in a preset time period.
4. The non-volatile memory of claim 2, wherein the plurality of memory strings form a memory cell array, the memory cell array comprising a plurality of two-dimensional memory cell arrays, the plurality of memory strings in each two-dimensional memory cell array connected to a same top select line, and each two-dimensional memory cell array comprising the pair of memory cells, the peripheral circuitry further configured to:
adjusting the conductance difference of the memory cell pairs in different two-dimensional memory cell arrays within a predetermined time period.
5. The non-volatile memory of claim 1, wherein the isolation structure extends along a direction in which a plurality of channel structures are arranged at an end away from the bit line.
6. The non-volatile memory according to any one of claims 1 to 5, wherein the memory cell is a floating-gate type memory cell or a charge trap type memory cell.
7. A non-volatile storage system adapted for use in a neural network, comprising:
at least one non-volatile memory as claimed in any one of claims 1 to 6; and
a controller, coupled to the at least one non-volatile memory, configured to control peripheral circuitry in the non-volatile memory.
8. An operating method of a non-volatile memory for executing a neural network, wherein the non-volatile memory includes a plurality of channel structures and isolation structures, each channel structure is divided into at least two sub-channel structures by the isolation structures along a direction parallel to the extending direction of the channel structure, two sub-channel structures in the same channel structure correspond to adjacent memory strings, memory cells in the plurality of memory strings are divided into memory cell pairs, the two memory cells in each memory cell pair are respectively located in two memory strings and connected to the same word line, the two memory strings are respectively connected to two bit lines, and a plurality of memory cells connected to the same word line and located in different memory strings correspond to one neuron in the neural network, the operating method comprising:
applying a bit line voltage to the bit lines connected to the pair of memory cells, the bit line voltage serving as one input to a neuron in the neural network;
applying a read voltage to a word line connected to the pair of memory cells; and
determining an output of the neuron based on a difference in conductance of two of the pair of memory cells, the difference in conductance being a weight corresponding to the input of the neuron.
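The readout in claim 8 amounts to a differential multiply-accumulate: each weight is stored as the conductance difference of the two cells in a pair, and the bit line voltage is the input. The following Python sketch is an illustrative behavioral model only, not part of the claimed subject matter; the function name, units, and values are assumptions.

```python
# Behavioral model of the differential-pair readout in claim 8 (illustrative;
# names and values are hypothetical, not taken from the patent).
# Weight = G+ - G- of a memory cell pair on one word line; input = bit line voltage.

def neuron_output(inputs, g_pos, g_neg):
    """Sum of input * (G+ - G-) over all memory cell pairs feeding one neuron."""
    assert len(inputs) == len(g_pos) == len(g_neg)
    return sum(v * (gp - gn) for v, gp, gn in zip(inputs, g_pos, g_neg))

# Example: three inputs (V), weights encoded as conductance differences (uS).
out = neuron_output(inputs=[0.5, 1.0, 0.2],
                    g_pos=[3.0, 1.0, 2.0],
                    g_neg=[1.0, 2.0, 2.0])
# out = 0.5*2.0 + 1.0*(-1.0) + 0.2*0.0 = 0.0
```

Because each weight is a difference of two non-negative conductances, the scheme can represent negative weights without negative cell currents, which is the usual motivation for pairing cells on a shared word line.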
9. The method of operation of claim 8, wherein the method of operation further comprises:
performing a programming operation on at least one memory cell of the pair of memory cells to adjust the conductance difference value.
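Claim 9's programming operation can be pictured as an incremental program-verify loop that programs one cell of the pair until the conductance difference reaches a target weight. The sketch below is a hypothetical illustration: the step size, tolerance, and the assumption that programming monotonically lowers a cell's conductance are modeling choices, not details from the patent.

```python
# Hypothetical program-verify loop for claim 9 (illustrative model only).
# Assumption: each program pulse lowers the programmed cell's conductance
# by a fixed step; real cells need per-pulse verify reads.

def tune_weight(g_pos, g_neg, target, step=0.1, tol=0.05, max_pulses=100):
    """Program one cell of the pair until (G+ - G-) is within tol of target."""
    for _ in range(max_pulses):
        diff = g_pos - g_neg
        if abs(diff - target) <= tol:
            break
        if diff > target:
            g_pos -= step   # program the positive cell to shrink the difference
        else:
            g_neg -= step   # program the negative cell to grow the difference
    return g_pos, g_neg

gp, gn = tune_weight(g_pos=3.0, g_neg=1.0, target=1.2)
# (gp - gn) converges to ~1.2 within the tolerance
```

Programming only ever decreases a cell's conductance in this model, so the loop always has a cell it can program regardless of the sign of the required adjustment, which mirrors why the differential pair is convenient to tune.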
10. The method of operation of claim 8, wherein the plurality of memory strings form a memory cell array, the memory cell array comprising a plurality of two-dimensional memory cell arrays, the plurality of memory strings in each two-dimensional memory cell array connected to a same top select line and each two-dimensional memory cell array comprising the pair of memory cells, the determining the output of the neuron comprising:
determining, within a predetermined time period, the outputs of the memory cell pairs in different two-dimensional memory cell arrays to the corresponding neurons.
11. The operating method of claim 9, wherein the plurality of memory strings form a memory cell array, the memory cell array comprises a plurality of two-dimensional memory cell arrays, the plurality of memory strings in each two-dimensional memory cell array are connected to a same top select line, and each two-dimensional memory cell array comprises the pair of memory cells, and adjusting the conductance difference comprises:
adjusting the conductance difference of the memory cell pairs in different two-dimensional memory cell arrays within a predetermined time period.
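Claims 10 and 11 both describe operating the two-dimensional arrays in separate predetermined time periods, i.e. time-multiplexing the arrays via their top select lines. A minimal sketch of such a one-hot schedule follows; the slot-based scheme and array indexing are illustrative assumptions, not claim language.

```python
# Illustrative one-hot time-multiplexing of 2D memory cell arrays
# (claims 10/11): in each time slot exactly one array's top select
# line is driven, so its pairs are read or programmed in that period.

def schedule_arrays(num_arrays, slot):
    """Return per-array enable flags for a given time slot (one-hot)."""
    return [i == slot % num_arrays for i in range(num_arrays)]

# Four 2D arrays: in slot 2 only array 2's top select line is enabled.
assert schedule_arrays(4, 2) == [False, False, True, False]
```

Cycling the slot index then visits every array in turn, so all conductance differences can be read or adjusted without two arrays ever driving the shared bit lines at once.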
CN202210617589.4A 2022-06-01 2022-06-01 Nonvolatile memory, storage system and operation method suitable for neural network Pending CN114841333A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210617589.4A CN114841333A (en) 2022-06-01 2022-06-01 Nonvolatile memory, storage system and operation method suitable for neural network

Publications (1)

Publication Number Publication Date
CN114841333A 2022-08-02

Family

ID=82573808

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210617589.4A Pending CN114841333A (en) 2022-06-01 2022-06-01 Nonvolatile memory, storage system and operation method suitable for neural network

Country Status (1)

Country Link
CN (1) CN114841333A (en)

Similar Documents

Publication Publication Date Title
KR102645142B1 (en) Storage devices, methods and non-volatile memory devices for performing garbage collection using estimated valid pages
US20200311512A1 (en) Realization of binary neural networks in nand memory arrays
US11170290B2 (en) Realization of neural networks with ternary inputs and binary weights in NAND memory arrays
US11568200B2 (en) Accelerating sparse matrix multiplication in storage class memory-based convolutional neural network inference
US11705191B2 (en) Non-volatile memory die with deep learning neural network
US11328204B2 (en) Realization of binary neural networks in NAND memory arrays
US20210192325A1 (en) Kernel transformation techniques to reduce power consumption of binary input, binary weight in-memory convolutional neural network inference engine
US11625586B2 (en) Realization of neural networks with ternary inputs and ternary weights in NAND memory arrays
EP3789925A1 (en) Non-volatile memory die with deep learning neural network
US20210342671A1 (en) Vertical mapping and computing for deep neural networks in non-volatile memory
CN110047549A (en) Storage system and its operating method
US20200184335A1 (en) Non-volatile memory die with deep learning neural network
US20220398439A1 (en) Compute in memory three-dimensional non-volatile nand memory for neural networks with weight and input level expansions
US11861237B2 (en) Storage device accessible on a cell-by-cell basis and method of operating the same
CN114841333A (en) Nonvolatile memory, storage system and operation method suitable for neural network
US11237955B2 (en) Memory device, method of operating memory device, and computer system including memory device
CN114613414A (en) Managing program convergence associated with memory cells of a memory subsystem
CN114863978A (en) Nonvolatile memory, erasing operation method thereof and nonvolatile memory system
US20240069790A1 (en) Memory controller, a storage device, and an operating method of the storage device
EP4180971A1 (en) Storage device and operating method
US20240177772A1 (en) Memory device performing multiplication using logical states of memory cells
US20230144659A1 (en) Memory device, method of operating the same, and method of operating storage device including the same
CN115083479A (en) Flash memory denoising method, memory controller and data processing system
KR20240030819A (en) Storage Device and Method of Operating Storage Controller
CN115708074A (en) Storage device and host device for optimizing model for calculating delay time of storage device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination