CN110083480B - Configurable multifunctional data processing unit - Google Patents

Publication number
CN110083480B
Authority
CN
China
Prior art keywords
data
memory
application processor
module
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910305943.8A
Other languages
Chinese (zh)
Other versions
CN110083480A (en)
Inventor
李鸽子
杜源
景蔚亮
陈小刚
陈邦明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Xinchu Integrated Circuit Co Ltd
Original Assignee
Shanghai Xinchu Integrated Circuit Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Xinchu Integrated Circuit Co Ltd
Priority to CN201910305943.8A
Publication of CN110083480A
Application granted
Publication of CN110083480B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/062 Securing storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/78 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data
    • G06F21/79 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data in semiconductor storage media, e.g. directly-addressable memories
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1044 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices with specific ECC/EDC distribution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0679 Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a configurable multifunctional data processing unit applied to a data center server that comprises a main processor, a memory, a flash memory and a mechanical hard disk, wherein the data processing unit is connected between the memory and the flash memory. The data processing unit comprises a plurality of data processing subunits, and each data processing subunit comprises: an application processor; a memory connected with the application processor, the application processor being used for loading the data to be processed from the memory of the server into the memory of the subunit; and a plurality of auxiliary processing modules, respectively connected with the application processor. The application processor and/or the auxiliary processing modules process the data to be processed in the memory of the subunit to form a processing result and store the processing result in the flash memory. The technical scheme has the beneficial effect of solving the problems of high memory power consumption and reduced processing performance that arise because an existing central server needs a plurality of memories and main processors to process data.

Description

Configurable multifunctional data processing unit
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a configurable multifunctional data processing unit.
Background
With the development of information technology and the advent of the Internet of Things, numerous terminal devices generate large amounts of data that must be processed and stored by a central server. The structure of a server in an existing data center is shown in FIG. 1 and includes a main processor 1, a memory 2, a flash memory 4 and a mechanical hard disk 5. The main processor 1 is generally an x86-architecture processor manufactured in an advanced process (14 nm, 10 nm or 7 nm), such as the Xeon series. The memory 2 is a DRAM, also manufactured in an advanced process to obtain higher access speed and performance. The flash memory 4 may be a solid state disk or an eMMC and is used for storing data, and the mechanical hard disk 5 is generally used for backing up and storing data. At present, large amounts of data are generally processed in the memory by the processors of the data center server. On the one hand, this consumes a large amount of power, the processors and memory are relatively expensive, and as the data volume grows, more and more processors and memories are required, which increases the cost of the central server. On the other hand, many existing algorithms for machine learning, artificial intelligence and data mining are executed in the memory 2 by the main processor 1; because the memory is volatile, the results obtained by these algorithms are lost if the central server suddenly loses power or the system crashes, which inconveniences the user.
Disclosure of Invention
In order to solve the above problems that arise in the prior art when the central server processes data, a configurable multifunctional data processing unit is provided. It processes the data to be processed through a plurality of data processing subunits in the data processing unit, which reduces the data processing load on the memory of the central server, and the data processing subunits can store the processing results in a flash memory.
The specific technical scheme is as follows:
The configurable multifunctional data processing unit is applied to a data center server that comprises a main processor, a memory, a flash memory and a mechanical hard disk, wherein the data processing unit is connected between the memory and the flash memory;
the data processing unit comprises a plurality of data processing subunits;
each of the data processing subunits comprises:
an application processor;
a memory connected with the application processor, the application processor being used for loading the data to be processed from the memory of the data center server into the memory of the subunit;
a plurality of auxiliary processing modules, respectively connected with the application processor;
the application processor and/or the auxiliary processing modules are used for processing the data to be processed in the memory of the subunit to form a processing result, and storing the processing result in the flash memory.
Preferably, the flash memory includes a first encryption module, where the first encryption module is configured to encrypt or decrypt the data to be processed;
each auxiliary processing module comprises a programmable logic gate array and a self-learning module;
the application processor and/or the programmable logic gate array are/is configured to form a second encryption module, and the data to be processed is encrypted or decrypted through the second encryption module;
the application processor is further configured to determine whether the security level of the data to be processed meets a preset security level;
if yes, encrypting the data to be processed through the first encryption module, and then encrypting the data to be processed again through the second encryption module;
if not, encrypting the data to be processed only through the first encryption module.
Preferably, the flash memory includes a first error correction module, where the first error correction module is configured to correct errors of the data to be processed;
each auxiliary processing module comprises a programmable logic gate array and a self-learning module;
the application processor and/or the programmable logic gate array are configured to form a second error correction module, and error correction is performed on the data to be processed through the second error correction module;
the application processor is also used for judging the type of the data to be processed;
if the data is data of a first key grade type, the data to be processed is corrected by the first error correction module, and the corrected data is then corrected again by the second error correction module;
if the data is data of a second key grade type, the data to be processed is corrected only by the first error correction module.
Preferably, each of the memories includes a nonvolatile memory and a volatile memory, and the flash memory comprises an error correction controller;
the application processor is further configured to determine a type of the data to be processed:
if the data is data of a third key grade type, the application processor stores the data to be processed into the nonvolatile memory;
the data to be processed is first corrected by the first error correction module, and if the number of error bits reported by the error correction controller is larger than the number of bits that the first error correction module can correct,
the application processor calls the second error correction module to correct the data to be processed.
Preferably, each of the auxiliary processing modules includes a programmable logic gate array and a self-learning module;
each memory comprises a nonvolatile memory and a volatile memory;
the application processor and/or the programmable logic gate array are/is further configured to form a garbage collection module; the flash memory stores data files;
the application processor and/or the programmable logic gate array are/is further configured to record how often the data file accessed by the user is accessed;
and the garbage collection module is used for deleting the data file with the lowest access frequency according to the record.
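The garbage collection behaviour described above can be sketched as follows. This is a minimal illustration only: the names `AccessTracker` and `collect_garbage` are hypothetical, the flash memory is modelled as a plain set of file names, and ties are broken alphabetically, which the text does not specify.

```python
from collections import Counter


class AccessTracker:
    """Hypothetical stand-in for the record of how often each data file is accessed."""

    def __init__(self):
        self.counts = Counter()

    def record_access(self, name):
        # Called each time a user accesses the named data file.
        self.counts[name] += 1


def collect_garbage(tracker, stored_files):
    """Delete the least frequently accessed file from stored_files and return its name.

    stored_files is a set of file names standing in for the flash memory contents.
    Files that were never accessed count as zero accesses. Returns None if empty.
    """
    if not stored_files:
        return None
    # Sorting first makes the tie-break deterministic (assumption: alphabetical).
    victim = min(sorted(stored_files), key=lambda n: tracker.counts[n])
    stored_files.discard(victim)
    del tracker.counts[victim]
    return victim


tracker = AccessTracker()
files = {"a.log", "b.mp4", "c.db"}
for name, n in (("a.log", 3), ("b.mp4", 1), ("c.db", 5)):
    for _ in range(n):
        tracker.record_access(name)
removed = collect_garbage(tracker, files)
```

A real implementation would presumably run on the application processor or be synthesized into the programmable logic gate array; the sketch only shows the selection rule.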
Preferably, the application processor and/or the programmable logic gate array are further used for grading the access frequency of the data files according to the access record;
if the access frequency of the data file is a first level, and the data file is smaller than a first preset value, storing the current data file in the volatile memory;
and if the access frequency of the data file is a second level, and the data file is larger than a second preset value, storing the current data file in the nonvolatile memory.
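The placement rule above might look like the following sketch. Several points are assumptions not stated in the text: that the first level means "frequently accessed", the threshold used to grade frequency, the concrete preset sizes, and the fallback of leaving a file in flash when neither condition matches. All names are hypothetical.

```python
from enum import Enum


class Tier(Enum):
    VOLATILE = "volatile memory"
    NONVOLATILE = "nonvolatile memory"
    FLASH = "flash memory (not moved)"


def grade(access_count, hot_threshold=100):
    """Grade access frequency: level 1 if at or above the threshold, else level 2.

    Assumption: the text does not define the levels; level 1 is treated as 'hot'.
    """
    return 1 if access_count >= hot_threshold else 2


def place(access_count, size_bytes,
          first_preset=1 << 20, second_preset=1 << 20, hot_threshold=100):
    """Pick a storage tier for a data file from its access level and size."""
    level = grade(access_count, hot_threshold)
    if level == 1 and size_bytes < first_preset:
        return Tier.VOLATILE
    if level == 2 and size_bytes > second_preset:
        return Tier.NONVOLATILE
    # Cases the text leaves open: keep the file where it is.
    return Tier.FLASH
```

For example, under these assumed defaults `place(500, 4096)` selects the volatile memory and `place(3, 8 << 20)` selects the nonvolatile memory.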
Preferably, each of the auxiliary processing modules includes a programmable logic gate array and a self-learning module;
the application processor and/or the programmable logic gate array are configured to form a detection module;
the detection module is used for detecting application programs executed by different users in different time periods to form feature records, and storing the feature records in the nonvolatile memory;
the application processor and/or the programmable logic gate array are/is used for judging whether the current application program accesses the flash memory or not according to the characteristic record;
if yes, executing the current application program through the application processor;
if not, the current application program is executed by the main processor.
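A minimal sketch of this dispatch decision, assuming the feature record is keyed by user and hour of day and simply remembers which applications were observed accessing the flash memory. The class and function names are hypothetical, and the real feature record may hold much richer data.

```python
from collections import defaultdict


class FeatureRecord:
    """Hypothetical feature record: which apps accessed flash, per user and time slot."""

    def __init__(self):
        # (user, hour) -> set of application names observed accessing the flash memory
        self._flash_apps = defaultdict(set)

    def observe(self, user, hour, app, accessed_flash):
        """Record one observation made by the detection module."""
        if accessed_flash:
            self._flash_apps[(user, hour)].add(app)

    def accesses_flash(self, user, hour, app):
        """Return True if this app was seen accessing flash for this user and slot."""
        return app in self._flash_apps.get((user, hour), set())


def choose_executor(record, user, hour, app):
    """Route flash-accessing applications to the application processor, others to the main processor."""
    if record.accesses_flash(user, hour, app):
        return "application processor"
    return "main processor"


record = FeatureRecord()
record.observe("alice", 9, "log-indexer", accessed_flash=True)
record.observe("alice", 9, "calculator", accessed_flash=False)
executor = choose_executor(record, "alice", 9, "log-indexer")
```

Running flash-bound work on the subunit keeps it close to the storage it touches, which is the motivation the text gives for the data processing unit as a whole.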
Preferably, the system further comprises at least one client, wherein the client is connected with the central server and is used for sending voice instructions input by a user to the central server;
the central server converts the voice instruction into search characters and returns search contents related to the search characters to the client according to the search characters;
The client selects a designated content from the search content and sends the designated content to the central server;
the application processor and/or the programmable logic gate array and the self-learning module are used for marking the instruction content;
the application processor and/or the programmable logic gate array and the self-learning module are also configured to perform feature extraction on the marked instruction content, and correlate the instruction content having common features.
Preferably, the self-learning module is a nonvolatile self-learning module based on a novel memory.
Preferably, the nonvolatile memory is any one of a phase change memory, a resistance change memory, a magnetic memory, and a ferroelectric memory.
The technical scheme has the following advantages or beneficial effects: the data processing subunits process the data to be processed and store the resulting processing results in the flash memory, which solves the prior-art problems of high memory power consumption and reduced processing performance caused by the central server needing a plurality of memories and main processors to process data.
Drawings
Embodiments of the present invention will now be described more fully with reference to the accompanying drawings. The drawings, however, are for illustration and description only and are not intended as a definition of the limits of the invention.
FIG. 1 is a schematic diagram of a portion of a central server in the background art;
FIG. 2 is a schematic diagram illustrating a configuration of a configurable multi-function data processing unit according to an embodiment of the present invention;
FIG. 3 is a schematic diagram showing a structure of a flash memory according to an embodiment of the invention;
FIG. 4 is a schematic structural diagram of an auxiliary processing module in an embodiment of a configurable multi-function data processing unit according to the present invention.
Reference numerals:
1. a main processor; 2. a memory; 3. a data processing unit; 4. a flash memory; 5. a mechanical hard disk; 31. a data processing subunit; 311. an application processor; 312. a nonvolatile memory; 313. a volatile memory; 314. an auxiliary processing module; 41. a first encryption module; 42. a first error correction module; 43. an error correction controller; 3141. a programmable logic gate array; 3142. a self-learning module.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that, without conflict, the embodiments of the present invention and features of the embodiments may be combined with each other.
The invention is further described below with reference to the drawings and specific examples, which are not intended to be limiting.
The technical scheme of the invention comprises a configurable multifunctional data processing unit.
As shown in FIG. 2, an embodiment of a configurable multifunctional data processing unit is applied to a data center server, wherein the data center server comprises a main processor 1, a memory 2, a flash memory 4 and a mechanical hard disk 5, and a data processing unit 3 is connected between the memory 2 and the flash memory 4;
the data processing unit 3 includes a plurality of data processing subunits 31;
each data processing subunit 31 comprises:
an application processor 311;
a memory connected with the application processor 311, the application processor 311 being used for loading the data to be processed from the memory 2 into the memory of the subunit;
a plurality of auxiliary processing modules 314, respectively connected with the application processor 311;
the application processor 311 and the auxiliary processing module 314 are used for processing the data to be processed in the memory of the subunit to form a processing result, and storing the processing result in the flash memory 4;
alternatively, the data to be processed in the memory of the subunit may be processed by the application processor 311 or the auxiliary processing module 314 alone to form a processing result, which is stored in the flash memory 4.
In the prior art, the data center server processes data through its own processors and memory 2, which consumes a large amount of power; moreover, the larger the data volume, the more memories 2 and main processors 1 must be provided, which increases the cost of the data center server.
To address these problems, in the invention a data processing unit 3 is arranged between the memory 2 and the flash memory 4 of the data center server, wherein the data processing unit 3 comprises a plurality of data processing subunits 31;
each data processing subunit 31 comprises an application processor 311;
a memory connected with the application processor 311, the application processor 311 being used for loading the data to be processed from the memory 2 into the memory of the subunit;
a plurality of auxiliary processing modules 314, respectively connected with the application processor 311;
the data to be processed in the memory of the subunit is processed by the application processor 311 and the auxiliary processing module 314 to form a processing result, and the processing result is stored in the flash memory 4;
alternatively, the data to be processed in the memory of the subunit is processed by the application processor 311 or the auxiliary processing module 314 alone to form a processing result, and the processing result is stored in the flash memory 4.
The plurality of data processing subunits 31 can process a large amount of data and effectively relieve the data processing load on the main processor 1 and the memory 2 of the data center server.
In the above technical solution, it should be noted that, in each of the data processing subunits 31, the performance of the application processor 311 is lower than that of the main processor 1 in the data center server, but its power consumption is also lower. The application processor 311 may be an application processor (AP) of the kind used in mobile devices such as mobile phones or tablets, for example an application processor manufactured in a non-advanced process by the huashi semiconductor company or by the carrier company, or it may be a processor with an x86 architecture, such as the 286 or 386 series.
In a preferred embodiment, as shown in fig. 3 to 4, the flash memory 4 includes a first encryption module 41, and the first encryption module 41 is used for encrypting or decrypting data to be processed;
Each auxiliary processing module 314 includes a programmable gate array 3141 and a self-learning module 3142;
the application processor 311 and the programmable logic gate array 3141 are configured to form a second encryption module, and encrypt or decrypt data to be processed through the second encryption module;
alternatively, the second encryption module may be formed by configuring the application processor 311 or the programmable logic gate array 3141 alone, and the data to be processed is encrypted or decrypted by the second encryption module;
the application processor 311 is further configured to determine whether the security level of the data to be processed meets a preset security level;
if yes, encrypting the data to be processed through the first encryption module 41, and then encrypting the data to be processed through the second encryption module again;
if not, the data to be processed is encrypted only by the first encryption module 41.
In the above technical solution, the programmable gate array 3141 (FPGA) may be a volatile programmable gate array based on SRAM, or a nonvolatile programmable gate array (nvFPGA) based on a novel memory (such as a phase change memory, a resistive random access memory, a magnetic memory or a ferroelectric memory);
preferably, the programmable gate array 3141 is nonvolatile. The programmable gate array 3141 can be programmed to implement different functions, such as different interface protocols for connecting to flash memory 4 cards that use different protocols. For example, one part of the programmable gate array 3141 is programmed to implement the eMMC 4.0 protocol, another part the eMMC 5.0 protocol, and a further part the UFS protocol, so that a server can be connected with eMMC products from different manufacturers that have different protocol interfaces.
eMMC (Embedded Multi Media Card) is an embedded memory standard specification defined by the MMC Association, aimed mainly at products such as mobile phones and tablet computers.
It should be noted that, at present, in order to ensure the security of stored data and prevent user data from being stolen, data is generally encrypted with an encryption and decryption algorithm before it is stored. When data is stored in a current solid state disk or eMMC card, the encryption and decryption are generally performed by a hardware module, that is, by an encryption module built into the solid state disk or eMMC card. However, encryption and decryption with the built-in module are vulnerable to attack, and once an attacker obtains the secret key, the consequences can be severe.
Therefore, the first encryption module 41 in the flash memory 4 performs encryption and decryption in the same way as the hardware module of the original solid state disk or eMMC, and the application processor 311 and the programmable logic gate array 3141 in the configurable multifunctional data processing subunit 31 are then used to encrypt the data a second time before it is stored;
alternatively, the data may be stored after being encrypted a second time by the application processor 311 or the programmable logic gate array 3141 alone;
in this way the security of the user data is ensured. The encryption and decryption algorithm may be implemented by the application processor 311 together with the programmable logic gate array 3141,
or by the application processor 311 or the programmable logic gate array 3141 alone;
this approach is very flexible: whichever implementation is used, the encryption and decryption algorithm can be switched as needed, so that the security of user data is ensured to the greatest extent.
The invention provides a multi-level security policy that comprises at least two levels of security. The first-level security algorithm is implemented by the hardware encryption and decryption module in the solid state disk or eMMC card, namely the first encryption module 41;
the second-level security algorithm is implemented by configuring the application processor 311 and the programmable logic gate array 3141 in the data processing subunit 31 to form a second encryption module;
the multi-level security policy can apply different security levels according to the security that different user data requires. If the user data is sensitive data (such as user transaction data), the required security level is high and the two-level security policy is adopted; if the user data is general data, the required security is lower and only the first security level is adopted.
For specific sensitive data, if the user requires a high data reading speed, the invention can also, with the user's consent, apply only the first security level and skip the second, so that the user's data is read faster and different user requirements are met.
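The multi-level policy above can be illustrated with a small sketch. The XOR transform below is only a placeholder for the two encryption modules and is not secure; the function names, the key parameters and the `user_prefers_fast_read` flag are all hypothetical and stand in for the user-consent opt-out described in the text.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Placeholder cipher (NOT secure): XOR with a repeating key; applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def select_stages(is_sensitive: bool, user_prefers_fast_read: bool = False) -> list:
    """Return the encryption stages to apply, in order."""
    stages = ["first encryption module"]
    # The second level applies to sensitive data unless the user opted for read speed.
    if is_sensitive and not user_prefers_fast_read:
        stages.append("second encryption module")
    return stages


def encrypt(data, key1, key2, is_sensitive, user_prefers_fast_read=False):
    """Apply the first stage, then the second stage if the policy selects it."""
    stages = select_stages(is_sensitive, user_prefers_fast_read)
    out = xor_bytes(data, key1)
    if len(stages) == 2:
        out = xor_bytes(out, key2)
    return out, stages


def decrypt(blob, key1, key2, stages):
    """Undo the stages in reverse order."""
    out = blob
    if len(stages) == 2:
        out = xor_bytes(out, key2)
    return xor_bytes(out, key1)


blob, used = encrypt(b"transaction #1", b"k1", b"key-two", is_sensitive=True)
plain = decrypt(blob, b"k1", b"key-two", used)
```

Keeping the stage list alongside the stored data is one possible way to know how many layers to remove on read; the text does not say how this is tracked.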
In a preferred embodiment, the flash memory 4 includes a first error correction module 42 for correcting errors in the data to be processed;
each auxiliary processing module 314 includes a programmable gate array 3141 and a self-learning module 3142;
the application processor 311 and/or the programmable gate array 3141 are configured to form a second error correction module, and error correction is performed on the data to be processed by the second error correction module.
The application processor 311 and the programmable gate array 3141 described above may be configured to form the second error correction module together, or each separately and independently.
That is, the second error correction module may also be formed by configuring the application processor 311 or the programmable gate array 3141 alone, and error correction is performed on the data to be processed by the second error correction module.
The application processor 311 is further configured to determine a type of data to be processed;
If the data is the first key grade type data, correcting the error of the data to be processed through the first error correction module 42, and correcting the error of the data to be processed through the second error correction module again;
in the case of a second key class type data, the error correction is performed on the data to be processed only by the first error correction module 42.
In a preferred embodiment, each memory includes a nonvolatile memory 312 and a volatile memory 313, and the application processor 311 is further configured to determine the type of data to be processed:
if the data is data of a third key grade type, the application processor 311 stores the data to be processed in the nonvolatile memory 312;
the data to be processed is first corrected by the first error correction module 42; the flash memory 4 comprises an error correction controller 43, and if the number of error bits reported by the error correction controller 43 is larger than the number of bits that the first error correction module 42 can correct,
the application processor 311 calls the second error correction module to correct the error of the data to be processed.
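The escalation from the first to the second error correction module can be sketched as a simple threshold check. The capability figures and the function name are hypothetical, and the "uncorrectable" outcome for errors beyond the second module's capability is an assumption, since the text does not describe that case.

```python
def choose_corrector(reported_error_bits, first_capability_bits, second_capability_bits):
    """Decide which error correction module handles a block.

    reported_error_bits: error-bit count reported by the error correction controller.
    first_capability_bits: bits the first (flash-side) module can correct.
    second_capability_bits: bits the second (configured) module can correct (assumed known).
    """
    if reported_error_bits <= first_capability_bits:
        return "first error correction module"
    if reported_error_bits <= second_capability_bits:
        # The application processor calls the second module only when the first is exceeded.
        return "second error correction module"
    return "uncorrectable"
```

For instance, with an assumed first-module capability of 8 bits and second-module capability of 40 bits, a report of 12 error bits would be escalated to the second module.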
In the above technical solution, in order to ensure the reliability of data, a current flash memory 4 or mechanical hard disk 5 generally has an error checking and correcting (ECC, Error Correcting Code) module for checking and correcting data errors. The ECC module of a current memory may be implemented in hardware or in software. Hardware ECC has the advantage of high speed but the disadvantage of occupying additional memory area. With software ECC, the ECC algorithm is executed by the main processor 1, which does not increase the memory area, but execution on the processor is slower than the hardware implementation, that is, the performance is lower.
Whether ECC is implemented in hardware or in software, the number of bits in which errors occur increases over time while the error correction capability of the ECC is fixed, so the ECC cannot always correct all erroneous bits.
To improve data reliability, the invention provides a method that applies different levels of ECC and different backup methods according to how critical the data is.
Firstly, the application processor 311, the programmable gate array 3141 and the self-learning module 3142 of the present invention divide the user's data into critical data, general data and non-critical data according to the user's behavior and whether the data is critical or not.
For the critical data of different users, two levels of ECC are used for error correction so as to ensure the accuracy of the users' critical data. The first level is the ECC module already present in the storage network, where the storage network mainly comprises a plurality of flash memories 4 and a plurality of mechanical hard disks 5;
the second level is an ECC algorithm implemented by the application processor 311 together with the programmable gate array 3141, or by the application processor 311 or the programmable gate array 3141 alone;
that is, for the critical data of the user, errors are corrected not only by the hardware ECC module present in the storage network itself, but also by the ECC implemented by the application processor 311 and the programmable logic gate array 3141.
For the critical data (< 3%) of the user, 3 to 5 copies are first backed up to different flash memory 4 products in the storage network (for example, eMMCs from different manufacturers), and two-level ECC is then applied for error correction, so as to ensure the correctness of the user's critical data.
For non-critical data (> 70%), such as movies and music, errors in a limited number of bits do not greatly affect the user experience, so error correction by the hardware ECC module already present in the flash memory 4 is sufficient to meet the user's needs.
For general data (3% to 70%), to preserve data integrity in the event of a sudden power failure or server crash, the user data is first backed up in two to three copies: for example, one copy is written to the flash memory 4, and another copy is written, together with the corresponding ECC algorithm, to the volatile memory 313 and the nonvolatile memory 312.
Then, in an offline period (for example, 1 AM to 4 AM), the application processor 311 moves the corresponding ECC algorithm from the volatile memory 313 and the nonvolatile memory 312 to the storage network. When the controller of the flash memory 4 or the mechanical hard disk 5 reports that the number of erroneous bits exceeds the number of bits that its ECC module can correct, the application processor 311 reads the corresponding ECC algorithm from the storage network and uses it to correct the errors, ensuring the accuracy of the user data.
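The tiered protection described above can be summarized as a small policy table. The following Python sketch is illustrative only: the names `Criticality`, `ProtectionPolicy`, `POLICIES` and `needs_second_level_ecc` are hypothetical, the copy counts are picked from the ranges stated in the text, and the zero extra copies for non-critical data is an assumption because the text does not specify a backup count for that class.

```python
from dataclasses import dataclass
from enum import Enum


class Criticality(Enum):
    CRITICAL = "critical"          # < 3% of user data
    GENERAL = "general"            # 3% to 70%
    NON_CRITICAL = "non-critical"  # > 70%, e.g. movies and music


@dataclass(frozen=True)
class ProtectionPolicy:
    backup_copies: int         # extra copies kept for this class
    always_second_ecc: bool    # second-level ECC applied on every correction
    fallback_second_ecc: bool  # second-level ECC only when level one is exceeded


# Copy counts are assumptions chosen from the ranges in the text (3-5 and 2-3).
POLICIES = {
    Criticality.CRITICAL: ProtectionPolicy(3, True, False),
    Criticality.GENERAL: ProtectionPolicy(2, False, True),
    Criticality.NON_CRITICAL: ProtectionPolicy(0, False, False),
}


def needs_second_level_ecc(level, error_bits, first_level_capacity):
    """Decide whether the second-level ECC should run for this data."""
    policy = POLICIES[level]
    if policy.always_second_ecc:
        return True
    # General data escalates only when the storage controller reports more
    # erroneous bits than its own ECC module can correct.
    return policy.fallback_second_ecc and error_bits > first_level_capacity
```

For example, `needs_second_level_ecc(Criticality.GENERAL, 9, 8)` escalates to the second level, while the same error count on non-critical data does not.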
In a preferred embodiment, each auxiliary processing module 314 includes a programmable gate array 3141 and a self-learning module 3142;
each memory includes a nonvolatile memory 312 and a volatile memory 313;
the application processor 311 and the programmable logic gate array 3141 are further configured to form a garbage collection module;
the garbage collection module may also be configured to be formed solely by the application processor 311 or the programmable gate array 3141;
the flash memory 4 stores data files;
the application processor 311 and the programmable gate array 3141 are further configured to record how frequently the data file accessed by the user is accessed;
the access frequency of the data file accessed by the user may also be recorded by the application processor 311 or the programmable logic gate array 3141 alone;
the garbage collection module is used for deleting the data files with the lowest access frequency according to the records.
In the above technical solution, when data is updated, the updated data is generally written to a new address rather than over the original address, the original data is marked invalid, and the mapping between the data and its address is updated. As more data is written, more and more addresses hold invalid data, and this invalid data becomes garbage, so a garbage collection (GC) mechanism is needed.
Garbage collection merges the valid pages of the blocks being reclaimed into a new block and erases the old blocks, leaving more free blocks.
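As an illustration of the merge-and-erase idea, the sketch below models each block as a list of `(data, is_valid)` pages. The function name `collect_garbage` and this data layout are assumptions made for the example; a real flash translation layer also updates address mappings and tracks erase counts.

```python
def collect_garbage(blocks, pages_per_block):
    """Merge valid pages into new blocks and report how many blocks were freed.

    blocks: list of blocks, each a list of (data, is_valid) tuples.
    Returns (new_blocks, freed_block_count).
    """
    # Gather every page that is still valid, in block order.
    valid_pages = [page for block in blocks for page in block if page[1]]
    # Pack the valid pages densely into as few new blocks as possible.
    new_blocks = [
        valid_pages[i:i + pages_per_block]
        for i in range(0, len(valid_pages), pages_per_block)
    ]
    # All old blocks are erased; the net gain is the difference in block count.
    freed = len(blocks) - len(new_blocks)
    return new_blocks, freed
```

With two half-invalid blocks of two pages each, the two surviving pages fit into one new block and one block is freed.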
The invention provides a global garbage collection mechanism: the application processor 311 and the programmable logic gate array 3141 in the data processing subunit 31 implement a garbage collection algorithm and use it to collect the invalid data in the flash memory 4 of the storage network.
Assume that the flash memory 4 in the storage network consists of N (N > 2) eMMCs that come from different vendors and use different interface protocols. After analyzing user behavior for a certain time, the application processor 311 and the programmable logic gate array 3141 in the data processing subunit 31 obtain how frequently the user reads each piece of data.
After a certain period of time, M of the N eMMCs (N > M > 1) need garbage collection. If the analysis by the application processor 311 and the programmable logic gate array 3141 finds that some files in these M eMMCs have been read and written frequently in the recent period (for example, the last week), those files are transferred to the volatile memory 313 or the nonvolatile memory 312 before garbage collection. If the M eMMCs contain no such recently active files, no extra operation is needed and garbage collection is performed directly.
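The pre-collection step can be sketched as a simple filter over the eMMCs that are due for garbage collection. Everything here is hypothetical: the function `plan_pre_gc_moves`, the dictionary layout, and the `hot_threshold` parameter (the text only says "read and written frequently").

```python
def plan_pre_gc_moves(files_by_emmc, emmcs_needing_gc, recent_access_counts,
                      hot_threshold):
    """List the (emmc, file) pairs to move to memory 313/312 before GC.

    files_by_emmc: dict mapping eMMC id -> list of file names stored on it.
    recent_access_counts: dict mapping file name -> accesses in the recent
        period (for example, the last week).
    hot_threshold: assumed cut-off for "frequently read and written".
    """
    moves = []
    for emmc in emmcs_needing_gc:
        for name in files_by_emmc.get(emmc, []):
            if recent_access_counts.get(name, 0) >= hot_threshold:
                moves.append((emmc, name))
    # An empty list means garbage collection can start directly.
    return moves
```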
Meanwhile, in offline mode and before garbage collection, the application processor 311 can count the data errors and the number of bits that the ECC module can correct.
The service life of a block can be estimated from the error counts gathered by the application processor 311 in offline mode: if the application processor 311 finds that the number of erroneous bits on a block is large, the number of remaining erase cycles of that block is small. The application processor 311 of the invention thus gives garbage collection a self-test function.
In offline mode the application processor 311 also counts the number of bits that the ECC module in the storage network can correct; if the number of erroneous bits is greater than the number of bits the ECC can correct, whether to use the second-level ECC for error correction is decided according to the importance of the data.
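A minimal sketch of such an offline scan follows. The names `BlockStats` and `offline_scan`, and in particular the `wear_warning_bits` threshold used to flag a block as close to the end of its life, are assumptions; the text gives no numeric criterion.

```python
from dataclasses import dataclass


@dataclass
class BlockStats:
    block_id: int
    error_bits: int  # erroneous bits counted on this block during the scan


def offline_scan(stats, correctable_bits, wear_warning_bits, important_blocks):
    """Return {block_id: (near_end_of_life, use_second_level_ecc)}.

    correctable_bits: bits the first-level ECC module can correct.
    wear_warning_bits: assumed threshold above which a block is treated as
        having few erase cycles left.
    important_blocks: set of block ids holding data deemed important.
    """
    report = {}
    for s in stats:
        near_end_of_life = s.error_bits >= wear_warning_bits
        beyond_first_level = s.error_bits > correctable_bits
        # The second level is used only when level one is exceeded AND the
        # data on the block is important.
        use_second_level = beyond_first_level and s.block_id in important_blocks
        report[s.block_id] = (near_end_of_life, use_second_level)
    return report
```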
In a preferred embodiment, the application processor 311 and the programmable gate array 3141 are configured to rank the access frequency of the data file according to the access record;
the access frequency of the data file may also be ranked by the application processor 311 or the programmable gate array 3141 alone according to the access record;
if the access frequency of the data file is the first level and the data file is smaller than a first preset value, the current data file is stored in the volatile memory 313;
if the access frequency of the data file is the second level and the data file is larger than a second preset value, the current data file is stored in the nonvolatile memory 312.
In the above technical solution, to speed up data reading, different reading methods are used according to how critical the user data is.
For reads of critical data, the data must still pass through the two-level security and the two-level ECC algorithm;
for reads of non-critical data, the second-level security and the second-level ECC algorithm can be skipped and the data moved directly from the storage network to the memory 2 to speed up reading.
For the user's general data, part of it can skip the encryption and decryption operations and the second-level ECC algorithm, according to the user's behavior and needs, and be moved directly from the storage network to the memory 2; the rest of the general data is moved to the storage network after passing through the second-level security and the second-level ECC algorithm according to the original method.
The application processor 311, the programmable gate array 3141 and the self-learning module 3142 divide the user's data into hot data and cold data,
wherein hot data is the data accessed most often and cold data is the data accessed least often;
data that the user reads most frequently, that is read randomly, and whose size D is less than X (D < X) is stored directly in the volatile memory 313; for data that the user reads most frequently and whose size D is between X and Y (X < D < Y), the second-level ECC algorithm may be skipped while the second-level security is still used; relatively hot data whose size D is greater than Y (D > Y) is stored in the nonvolatile memory 312, because reading data from the nonvolatile memory 312 takes less time than reading it from the storage network, so storing the data there reduces the read latency.
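The size and hotness rules can be written as a small decision function. This is a sketch under stated assumptions: the function name `choose_read_path`, the string labels, and the `"warm"` label for "relatively hot" data are hypothetical; the sizes exactly equal to X or Y are not covered by the text and fall through to the default path here; and the D > Y case is mapped to the nonvolatile memory 312, following the preferred embodiment that stores larger second-level files there.

```python
def choose_read_path(hotness, size, x, y, random_read=True):
    """Pick where data is served from, given hotness and size D.

    hotness: "hot" (read most frequently), "warm" (relatively hot) or "cold".
    size: data size D; x and y are the thresholds X and Y with x < y.
    """
    if hotness == "hot" and random_read and size < x:
        # Small, hot, randomly read data lives in volatile memory 313.
        return "volatile_memory_313"
    if hotness == "hot" and x < size < y:
        # Mid-sized hot data skips second-level ECC but keeps second-level security.
        return "storage_network_skip_second_ecc"
    if hotness in ("hot", "warm") and size > y:
        # Large, relatively hot data is kept in nonvolatile memory 312.
        return "nonvolatile_memory_312"
    # Everything else follows the normal storage-network path.
    return "storage_network_default"
```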
In a preferred embodiment, each auxiliary processing module 314 includes a programmable gate array 3141 and a self-learning module 3142;
the application processor 311 and the programmable gate array 3141 are configured to form a detection module;
the detection module is used for detecting application programs executed by different users in different time periods to form feature records, and storing the feature records in the nonvolatile memory;
the application processor 311 and the programmable gate array 3141 are configured to determine whether the current application program accesses the flash memory 4 according to the feature record;
if yes, executing the current application program by the application processor 311;
if not, the current application program is executed by the main processor 1.
In the above technical solution, current self-learning algorithms for user behavior features are generally executed by the main processor 1, the graphics processor and the memory 2 in the server, which causes considerable power consumption and cost.
The present invention proposes an offline self-learning method based on the architecture of the configurable multifunctional data processing unit 3, which is further described and applied here. Firstly, the application processor 311 provided by the present invention executes a monitoring program that monitors the behavior of different applications of different users in different time periods. After a period of self-learning, the behavior features are divided into two cases: a high demand on the main processor 1 (CPU), and frequent access to the flash memory 4. The behavior characteristics of the different applications of the different users in the different time periods are then written into the volatile memory 313 and the nonvolatile memory 312, as shown in Table 1.
For example, the application that user A runs from 8:00 to 9:00 in the morning is A_X, whose behavior is characterized by frequent access to the flash memory 4, so application A_X is executed by the application processor 311 and the programmable logic gate array 3141;
the application that user A runs from 11:00 to 12:00 in the morning is A_Y, whose behavior is characterized by a high demand on the main processor 1, so application A_Y should be executed by the main processor 1.
For user B, the application run from 8:00 to 12:00 in the morning is B_Y, whose behavior is characterized by a high demand on the main processor 1, so it should be executed by the main processor 1;
the application that user B runs from 13:00 to 21:00 is B_Z, whose behavior is characterized by frequent access to the flash memory 4, so it should be executed by the application processor 311 and the programmable gate array 3141.
If a user's behavior characteristic is found to be a high demand on the processor, the corresponding application program is executed by the main processor 1 when that user logs in; if the user is found to access the storage network frequently, the corresponding application program is executed by the application processor 311 and the programmable gate array 3141 of the present invention.
The specific steps are as follows:
Step one: the application processor 311 and/or the programmable gate array 3141 execute a monitoring program to monitor the different applications run by different users at different times;
Step two: after a period of self-learning, the behavior characteristics of the different applications of the different users in the different time periods are recorded;
Step three: a user logs in;
Step four: the behavior characteristics of the application program are determined according to the logged-in user, the time period and the application program;
Step five: the corresponding execution mode is selected according to the behavior characteristics;
if the behavior characteristic of the logged-in user's application is frequent access to the storage network, the application is executed by the application processor 311 and/or the programmable logic gate array 3141;
if the behavior characteristic of the logged-in user's application is a high demand on the main processor 1, the application is executed by the main processor 1. It should be noted that the application processor 311 and the programmable gate array 3141 may execute simultaneously, or only one of them may execute.
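Steps three to five amount to a lookup in the recorded features. The sketch below encodes the user A and user B examples from the text; the record layout, the names `FEATURE_RECORDS` and `choose_executor`, and the choice to fall back to the main processor 1 when no record matches are assumptions.

```python
from datetime import time

# (user, start, end, application, learned behavior feature)
FEATURE_RECORDS = [
    ("A", time(8, 0), time(9, 0), "A_X", "storage"),    # frequent flash access
    ("A", time(11, 0), time(12, 0), "A_Y", "cpu"),      # high CPU demand
    ("B", time(8, 0), time(12, 0), "B_Y", "cpu"),
    ("B", time(13, 0), time(21, 0), "B_Z", "storage"),
]


def choose_executor(user, now, app):
    """Select the execution unit for an application of a logged-in user."""
    for rec_user, start, end, rec_app, feature in FEATURE_RECORDS:
        if rec_user == user and rec_app == app and start <= now < end:
            if feature == "storage":
                # Storage-heavy applications run next to the flash memory.
                return "application_processor_311"
            return "main_processor_1"
    # Assumption: with no learned record, use the main processor.
    return "main_processor_1"
```

For instance, `choose_executor("B", time(14, 0), "B_Z")` selects the application processor 311, while user B's morning application B_Y stays on the main processor 1.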
In a preferred embodiment, the system further comprises at least one client, wherein the client is connected with the central server and is used for sending the voice command input by the user to the central server;
the central server converts the voice instruction into search characters and returns search contents related to the search characters to the client according to the search characters;
the client selects a designated content from the search content and sends the designated content to the central server;
the application processor 311, the programmable gate array 3141 and the self-learning module 3142 are used to mark the designated content;
the application processor 311, the programmable gate array 3141 and the self-learning module 3142 are further configured to perform feature extraction on the marked designated content, and to associate designated content having common features.
In the above technical solution, in a current smart home, a user who wants to obtain certain content, such as a game or a movie, generally inputs the content to be searched by voice. The GPU converts the user's voice input into corresponding text and the text is then searched. If content meeting the user's requirement is found, it is returned to the user's client, and the user makes a selection.
NVIDIA's SHIELD product, for example, takes content as voice input, uses the GPU to convert the speech into text for searching, and at the same time uses the CPU/GPU to execute a corresponding machine learning algorithm that learns from the content the user searches for. (The central processing unit (CPU) is a very-large-scale integrated circuit that serves as the computing core and control unit of a computer; its main function is to interpret computer instructions and process data in computer software. The graphics processing unit (GPU), also known as display core, visual processor or display chip, is a microprocessor dedicated to image- and graphics-related computation on personal computers, workstations, game consoles and some mobile devices such as tablet computers and smartphones.)
Executing the self-learning algorithm on the CPU/GPU in this way is still volatile self-learning, and the CPU/GPU and the DRAM incur considerable power consumption and cost.
The invention provides a method for classifying and analyzing content with the configurable multifunctional data processing unit 3. The basic principle is that the application processor 311 and/or the programmable logic gate array 3141, together with the self-learning module 3142, classify and analyze the content that the user selects after searching, and do so in an offline mode.
The method specifically comprises the following steps:
Step one: the user of a client inputs the content to be searched by voice;
Step two: the GPU on the server side converts the user's voice input into corresponding text;
it should be noted here that the system on the server does not know the exact answer the user wants. For example, if the user searches for "the best-sounding music", the system only returns music that users like, which is not necessarily the answer this user wants most;
Step three: the CPU in the server searches the information provided by the content provider and returns the search results to the user's client. The content returned to the client is a basic profile, such as a list, as shown in Table 2 below:
Content | Introduction
Content A | Introduction to content A
Content B | Introduction to content B
Content C | Introduction to content C
... | ...
Table 2
The user narrows down this list by a second click or voice input to reach the content that the user wants most.
It should be noted that at this point only some of the content may be present in the content provider's memory 2 while the rest is not, so the system moves the other content, stored in the flash memory 4 or the mechanical hard disk 5, into the memory 2 while the user makes the second click or voice input.
For example, suppose the list returned to the user contains five candidates, A, B, C, D and E, but only A, C and E are present in the memory 2; the system then moves B and D, stored in the flash memory 4, into the memory 2 while the user makes the second click or voice input.
Step four: the user selects the corresponding content through a second click or voice input;
Step five: the application processor 311 and/or the programmable gate array 3141 and the self-learning module 3142 learn from the user's input and final selection, label the content the user finally selected, and classify and analyze it together with other content carrying corresponding labels.
It should be noted that this learning, labeling, classification and analysis by the application processor 311 and/or the programmable gate array 3141 and the self-learning module 3142 is performed in an offline mode.
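The prefetch that accompanies step three (moving candidates that are not yet in the memory 2 while the user is still choosing) reduces to a set difference. The function name `plan_prefetch` is hypothetical, and the sketch only decides what to move; it does not model the transfer itself.

```python
def plan_prefetch(candidates, in_memory):
    """Return the candidates that must be moved from flash/disk into memory 2.

    candidates: list of content ids returned to the user, in list order.
    in_memory: set of content ids already present in memory 2.
    """
    return [c for c in candidates if c not in in_memory]
```

With the five candidates from the example and A, C and E already in memory, `plan_prefetch(["A", "B", "C", "D", "E"], {"A", "C", "E"})` yields `["B", "D"]`.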
The following is an example. If user A inputs "the most recently watched action movie" by voice, the GPU first converts the speech into text, and after the search the results are returned to the client as a list, as shown in Table 3:
Content | Introduction
Action movie A | Brief introduction to action movie A
Action movie B | Brief introduction to action movie B
Action movie C | Brief introduction to action movie C
Action movie D | Brief introduction to action movie D
Table 3
The user selects "action movie B" through a second click or voice input, and the main processor 1 then plays the selected "action movie B". The application processor 311 and/or the programmable gate array 3141 and the self-learning module 3142 find that action movie B stars actor X, and therefore label action movie B together with other movies of actor X, which makes searching easier for other users and provides a better service to the user. It should be noted that the application processor 311 and the programmable gate array 3141 may execute simultaneously, or only one of them may execute.
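Labeling selected content and associating it with other content that shares a feature can be illustrated with a small inverted index. The class `ContentIndex`, its methods and the `"actor:X"` style feature strings are hypothetical; the text does not describe how the self-learning module 3142 stores its labels.

```python
from collections import defaultdict


class ContentIndex:
    """Toy index mapping extracted features to the titles that carry them."""

    def __init__(self):
        self._by_feature = defaultdict(set)

    def tag(self, title, features):
        """Label a title with each of its extracted features."""
        for feature in features:
            self._by_feature[feature].add(title)

    def related(self, title):
        """Return other titles sharing at least one feature with `title`."""
        out = set()
        for titles in self._by_feature.values():
            if title in titles:
                out |= titles
        out.discard(title)
        return sorted(out)
```

After tagging "Action movie B" and another film with the same actor feature, `related("Action movie B")` returns that other film, which is the association used to serve later searches.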
In a preferred embodiment, the self-learning module 3142 is a non-volatile self-learning module 3142 based on a new type of memory.
In a preferred embodiment, the nonvolatile memory 312 is any one of a phase change memory, a resistive random access memory, a magnetic memory, and a ferroelectric memory.
In the above solution, the volatile memory 313 in the memory of the processing subunit is a DRAM that processes the relevant data together with the application processor 311. The volatile memory 313 (DRAM) differs from the DRAM in the memory 2: the DRAM in the memory 2 is manufactured with an advanced process and, because it works together with the processor and therefore needs a high read speed, uses the DDR4 or DDR5 protocol; the volatile memory 313 in the processing subunit is also a DRAM, but one manufactured with an older, depreciated process, and its protocol is DDR or DDR2.
The nonvolatile memory 312 in the processing subunit may be a new type of nonvolatile memory, such as a phase change memory, a resistive random access memory, a magnetic memory or a ferroelectric memory, or may be an SLC flash memory; the nonvolatile memory 312 reads faster than the flash memory 4 and the mechanical hard disk 5.
A large number of flash memory 4 products (such as SSDs or eMMCs) and mechanical hard disks 5 form a storage network, which may contain flash memory 4 products with different protocols and interfaces, such as products with eMMC 4.0, eMMC 5.0 and UFS interface protocols.
The foregoing description is only illustrative of the preferred embodiments of the present invention and is not to be construed as limiting the scope of the invention, and it will be appreciated by those skilled in the art that equivalent substitutions and obvious variations may be made using the description and illustrations of the present invention, and are intended to be included within the scope of the present invention.

Claims (8)

1. A configurable multifunctional data processing unit, applied to a data center server comprising a main processor, a memory, a flash memory and a mechanical hard disk, characterized in that the data center server further comprises the multifunctional data processing unit, which is connected between the memory and the flash memory;
the multifunctional data processing unit comprises a plurality of data processing subunits;
each of the data processing subunits comprises:
an application processor;
The memory is connected with the application processor, and the application processor is used for loading the data to be processed in the memory into the memory;
the auxiliary processing modules are respectively connected with the application processor;
the application processor and/or the auxiliary processing module are/is used for processing the data to be processed in the memory to form a processing result, and storing the processing result in the flash memory;
the flash memory comprises a first error correction module, wherein the first error correction module is used for correcting errors of the data to be processed;
each auxiliary processing module comprises a programmable logic gate array and a self-learning module;
the application processor and/or the programmable logic gate array are/is configured to form a second error correction module, and error correction is performed on the data to be processed through the second error correction module;
the application processor is also used for judging the type of the data to be processed;
if the data is data of a first criticality level, the data to be processed is corrected by the first error correction module, and the corrected data is then corrected again by the second error correction module;
if the data is data of a second criticality level, the data to be processed is corrected only by the first error correction module;
each of the memories includes a nonvolatile memory and a volatile memory,
the application processor is further configured to determine a type of the data to be processed:
if the data is data of a third criticality level, the application processor stores the data to be processed into the nonvolatile memory;
the data to be processed is first corrected by the first error correction module, wherein the flash memory comprises an error correction controller, and if the number of error bits reported by the error correction controller is greater than the number of bits that the first error correction module is able to correct,
the application processor calls the second error correction module to correct the data to be processed.
2. The multifunctional data processing unit of claim 1, wherein the flash memory comprises a first encryption module for encrypting the data to be processed;
the application processor and/or the programmable logic gate array are/is configured to form a second encryption module, and the data to be processed is encrypted through the second encryption module;
The application processor is further configured to determine whether the security level of the data to be processed meets a preset security level;
if yes, encrypting the data to be processed through the first encryption module, and then encrypting the data to be processed again through the second encryption module;
if not, encrypting the data to be processed only through the first encryption module.
3. The multifunctional data processing unit of claim 1, wherein the application processor and/or the programmable logic gate array are further configured to form a garbage collection module; the flash memory stores data files;
the application processor and/or the programmable logic gate array are/is further configured to record how often the data file accessed by the user is accessed;
and the garbage collection module is used for deleting the data file with the lowest access frequency according to the record.
4. A multi-function data processing unit according to claim 3, wherein the application processor and/or the programmable gate array is arranged to rank how frequently the data file is accessed according to the record;
If the access frequency of the data file is a first level, and the data file is smaller than a first preset value, storing the current data file in the volatile memory;
and if the access frequency of the data file is a second level, and the data file is larger than a second preset value, storing the current data file in the nonvolatile memory.
5. A multi-function data processing unit according to claim 1, wherein,
the application processor and/or the programmable logic gate array are/is configured to form a detection module;
the detection module is used for detecting application programs executed by different users in different time periods to form feature records, and storing the feature records in the nonvolatile memory;
the application processor and/or the programmable logic gate array are/is used for judging whether the current application program accesses the flash memory or not according to the characteristic record;
if yes, executing the current application program through the application processor;
if not, the current application program is executed by the main processor.
6. The multifunctional data processing unit of claim 1, wherein the data center server is connected to at least one client for transmitting voice commands input by a user to the center server;
the central server converts the voice instruction into search characters and returns search contents related to the search characters to the client according to the search characters;
the client selects a designated content from the search content and sends the designated content to the central server;
the application processor and/or the programmable logic gate array and the self-learning module are used for marking the specified content;
the application processor and/or the programmable logic gate array and the self-learning module are also configured to perform feature extraction on the marked specified content, and associate the specified content with common features.
7. The multi-function data processing unit of any one of claims 1, 2, 5 and 6, wherein the self-learning module is a non-volatile self-learning module based on a new type of memory.
8. The multifunctional data processing unit of claim 1 or 5, wherein the nonvolatile memory is any one of a phase change memory, a resistive random access memory, a magnetic memory, and a ferroelectric memory.
CN201910305943.8A 2019-04-16 2019-04-16 Configurable multifunctional data processing unit Active CN110083480B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910305943.8A CN110083480B (en) 2019-04-16 2019-04-16 Configurable multifunctional data processing unit

Publications (2)

Publication Number Publication Date
CN110083480A CN110083480A (en) 2019-08-02
CN110083480B true CN110083480B (en) 2023-08-18

Family

ID=67415383

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910305943.8A Active CN110083480B (en) 2019-04-16 2019-04-16 Configurable multifunctional data processing unit

Country Status (1)

Country Link
CN (1) CN110083480B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115278138A (en) * 2022-07-08 2022-11-01 浙江威固信息技术有限责任公司 Solid state disk, image storage device and remote image processing configuration method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104158875A (en) * 2014-08-12 2014-11-19 上海新储集成电路有限公司 Method and system for sharing and reducing tasks of data center server
CN109599155A (en) * 2018-12-10 2019-04-09 上海新储集成电路有限公司 A kind of intelligent service system and method applied to medical data center

Also Published As

Publication number Publication date
CN110083480A (en) 2019-08-02

Similar Documents

Publication Publication Date Title
US20210165571A1 (en) Memory system and method of controlling nonvolatile memory
US10296452B2 (en) Data separation by delaying hot block garbage collection
US10403369B2 (en) Memory system with file level secure erase and operating method thereof
KR20110089728A (en) Error control method of solid state drive
CN101339806A (en) Apparatus and method to prevent data loss in nonvolatile memory
KR20200113992A (en) Apparatus and method for reducing cell disturb in open block of the memory system during receovery procedure
US20210382660A1 (en) Apparatus and method for performing recovery operation of memory system
US11818248B2 (en) Encoder and decoder using physically unclonable functions
KR20200122685A (en) Apparatus and method for handling different types of data in memory system
CN112988615A (en) Key value storage device and method of operation
CN110083480B (en) Configurable multifunctional data processing unit
US11403018B2 (en) Method and apparatus for performing block management regarding non-volatile memory
US20190310921A1 (en) Data storage device and operation method optimized for recovery performance, and storage system having the same
US9037792B1 (en) Systems and methods for providing caching for applications with solid-state storage devices
KR20210137679A (en) Memory controller
KR20210045114A (en) Memory system for efficiently manage memory block and method operation thereof
KR20210080967A (en) Controller and operation method thereof
KR20210038096A (en) Memory system, data processing system and method for operation the same
KR20210039185A (en) Apparatus and method for providing multi-stream operation in memory system
US10656846B2 (en) Operating method of memory system
KR20210051803A (en) Memory system and controller
US11836073B2 (en) Storage device operating data counter system
US11704444B2 (en) Managing encryption keys per logical block on a persistent memory device
US20230153023A1 (en) Storage device and method performing processing operation requested by host
US11366611B2 (en) Apparatus for transmitting map information in a memory system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant