CN113487006A - Portable artificial intelligence auxiliary computing equipment - Google Patents

Publication number
CN113487006A
Authority
CN
China
Prior art keywords
chip
intelligent
information
controller
bus
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110777201.2A
Other languages
Chinese (zh)
Other versions
CN113487006B (en)
Inventor
梁龙飞
陈小刚
阿西木约麦尔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai New Helium Brain Intelligence Technology Co ltd
Original Assignee
Shanghai New Helium Brain Intelligence Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai New Helium Brain Intelligence Technology Co ltd
Priority to CN202110777201.2A
Publication of CN113487006A
Application granted
Publication of CN113487006B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06K GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K19/00 Record carriers for use with machines and with at least a part designed to carry digital markings
    • G06K19/06 Record carriers for use with machines and with at least a part designed to carry digital markings characterised by the kind of the digital marking, e.g. shape, nature, code
    • G06K19/067 Record carriers with conductive marks, printed circuits or semiconductor circuit elements, e.g. credit or identity cards also with resonating or responding marks without active components
    • G06K19/07 Record carriers with conductive marks, printed circuits or semiconductor circuit elements, e.g. credit or identity cards also with resonating or responding marks without active components with integrated circuit chips
    • G06K19/077 Constructional details, e.g. mounting of circuits in the carrier
    • G06K19/0772 Physical layout of the record carrier
    • G06K19/07732 Physical layout of the record carrier the record carrier having a housing or construction similar to well-known portable memory devices, such as SD cards, USB or memory sticks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Neurology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Transfer Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a portable artificial intelligence auxiliary computing device comprising a USB connector, a controller chip and at least one intelligent chip. The USB connector is connected to the controller chip through a USB bus, and the controller chip is connected to the intelligent chip through an intelligent storage bus. The intelligent chip is an artificial neural network computing chip with an intelligent storage interface; the configuration resources inside the intelligent chip are uniformly addressed and are read and written through the intelligent storage interface. Compared with the prior art, the invention brings access to the key data required by the AI chip, such as the network structure and the weight array, together with the management of the AI chip's computing resources, into a standard storage model. The front-end equipment of the Internet of Things thus interacts with the AI chip through the standard access mode of a storage system, which gives the device a clear low-cost advantage when it is introduced to the market in Internet of Things edge computing and front-end computing scenarios.

Description

Portable artificial intelligence auxiliary computing equipment
Technical Field
The invention relates to the technical field of edge computing, and in particular to a portable artificial intelligence auxiliary computing device.
Background
The concept of the Internet of Things was proposed as early as the last century. Over many years of development it has produced a large number of products and standards, and countless special-purpose chips have been developed to meet various application requirements. As the infrastructure of the Internet of Things matures, the number and variety of devices connected to the Internet grow rapidly every day. However, constrained by power consumption, volume, cost and other conditions, front-end devices often lack the computing power needed for neural network computation. The most common scenario in the Internet of Things is still that the front end collects data and uploads it to the cloud or a computing center for processing and analysis, which keeps artificial intelligence technology from entering the front end of the Internet of Things.
At present, with breakthroughs in edge computing technology, much control can be performed at the front end of the Internet of Things without being handed to the cloud, which greatly improves processing efficiency, speeds up response and lightens the load on the cloud. As data sources, the front-end devices of the Internet of Things have a strong need for intelligence, which provides a variety of application scenarios for artificial intelligence (AI) technology, and the emergence of dedicated storage-computation integrated AI chips with extremely high computing efficiency offers a good solution for front-end computing and edge computing.
Intelligence is a general demand in the field of the Internet of Things. Directly replacing existing equipment to achieve an intelligent upgrade would undoubtedly be very costly; the ideal approach is to upgrade the existing front-end equipment so as to reduce the cost. However, front-end devices are numerous and varied, which makes upgrading difficult. On the one hand, applying an AI chip to front-end equipment involves modifying a large number of customized software and hardware structures. In terms of hardware, for example, the AI chip has to cooperate with processors that have different bus interfaces, frequencies and architectures; in terms of system software, it has to deal with operating systems of different types, versions and tailored configurations. Adding intelligent computing functions to edge devices therefore costs more than adding them to cloud computing equipment. On the other hand, AI chips are implemented in many different ways and there is currently no unified standard, either for the hardware interface or for the software programming model, and different AI chips are suited to different purposes and application scenarios. It is therefore difficult to add an AI chip to front-end circuit boards in a uniform way. In addition, new types of AI chip products keep appearing, and adopting a new AI chip requires the system software and hardware to be redesigned to varying degrees, which makes the AI retrofit of front-end equipment extremely difficult.
Data interaction between the AI chip and the front-end device of the Internet of Things is also a difficult problem. The trained neural network data, including the network structure and the weight array, must be transferred to the front-end device for use, but because front-end systems differ, how to complete this transfer conveniently remains to be solved.
If promoting intelligent technology on front-end devices of the Internet of Things requires the user's solution to be redesigned, and especially if the hardware system has to be replaced, the speed at which the technology reaches the market is seriously affected. An important problem is therefore in what form intelligent technology should be applied to front-end computing and edge computing in the Internet of Things so that its advantages are fully exploited while the cost of the upgrade is reduced and the upgrade is accelerated.
Disclosure of Invention
The invention aims to overcome the above defects of the prior art by providing a portable artificial intelligence auxiliary computing device. It brings access to the key data required by the AI chip, such as the network structure and the weight array, together with the management of the AI chip's computing resources, into a standard storage model, so that the front-end equipment of the Internet of Things interacts with the AI chip through the standard access mode of a storage system. This gives the device a clear low-cost advantage when it is introduced to the market in Internet of Things edge computing and front-end computing scenarios.
The purpose of the invention can be realized by the following technical scheme:
A portable artificial intelligence auxiliary computing device comprises a USB connector, a controller chip and at least one intelligent chip;
the USB connector is connected to the controller chip through a USB bus, and the controller chip is connected to the intelligent chip through an intelligent storage bus;
the intelligent chip is an artificial neural network computing chip with an intelligent storage interface, the configuration resources inside the intelligent chip are uniformly addressed, and the intelligent chip is read and written through the intelligent storage interface;
the controller chip is provided with a USB bus interface, an intelligent storage bus interface and an extended memory bus interface; it is connected to the USB bus through the USB bus interface, to the intelligent storage bus through the intelligent storage bus interface, and to an extended memory bus through the extended memory bus interface so that an external extended memory can be attached.
Furthermore, the intelligent storage bus is a bus with general-purpose nonvolatile memory chip interfaces; it carries at least one nonvolatile memory chip interface, of at least one type.
Furthermore, the nonvolatile memory chip interface may be of a type such as ONFI, Toggle DDR, eMMC or SDIO, or another similar interface, so that several intelligent chips with different structures can be mounted on a single intelligent storage bus at the same time.
Furthermore, the configuration resources in the intelligent chip include: segmentation information of the calculation acceleration units, connection information between calculation acceleration arrays, weight information in the calculation acceleration arrays, input and output data of the artificial neural network, and chip configuration information;
the calculation acceleration unit is the core unit for accelerating artificial neural network computation; the storage-computation integrated intelligent chip comprises a large number of storage-computation integrated devices, each of which is a calculation acceleration unit; these devices form an array and operate in parallel to accelerate multiply-add computation, and they can be logically divided into several arrays of different sizes, the calculation acceleration units being divided into different calculation acceleration arrays according to the segmentation information;
after the calculation acceleration units are divided into several arrays of different sizes, the different calculation acceleration arrays are cascaded according to the connection information between them, and signals are passed from the output of the front-stage array to the input of the rear-stage array;
the weight array is essential data of the artificial neural network; in the storage-computation integrated intelligent chip, the weight array represents the state of each storage-computation integrated device, and the state of each such device in a calculation acceleration array is set according to the weight information of that array; modifying the weight array modifies the artificial neural network algorithm, and reading out the weight array yields the current artificial neural network algorithm;
the input and output data of the artificial neural network comprise the input data fed into the network and the output data it produces; a trained artificial neural network is deployed on the intelligent chip by writing its segmentation information, connection information and weight information; the input data are then written into the intelligent chip, and the intelligent chip outputs the prediction or classification result of the artificial neural network;
the chip configuration information includes chip model information, capacity information of the configuration resources in the chip and address information of the configuration resources in the chip; from the chip configuration information the following can be obtained: the chip manufacturer, the product model, the chip type, the types of configuration resources in the chip, the quantity of each type of configuration resource, the start address and structure information of each type of configuration resource as mapped in the address space, the state information of the chip, a small number of control registers, and so on.
Furthermore, the configuration resources in the intelligent chip are expressed as arrays or tables, and different types of configuration resources are mapped to different address segments.
Furthermore, the controller chip is connected to the front-end device of the Internet of Things as a USB device and, during USB bus enumeration, presents the portable artificial intelligence auxiliary computing device as a mass storage device, thereby creating a virtual storage device that the front-end device can read and write; the controller chip converts the front-end device's read and write accesses to the virtual storage device into read and write accesses to the configuration resources of the intelligent chip.
Furthermore, the block space of the virtual storage device comprises a partition table and at least one partition; the partition table describes the partition information, the partitions correspond one-to-one to the intelligent chips, and the size of each partition's address space is calculated from the quantity of configuration resources of the corresponding intelligent chip;
the partition comprises an information segment, a buffer segment and a resource segment; the information segment stores the quantity of configuration resources of the intelligent chip, the buffer segment buffers the input data and output data of the intelligent chip, and the resource segment comprises a plurality of resource blocks, each corresponding to the smallest unit of the calculation acceleration array that can be allocated in the intelligent chip; the resource blocks are arranged in resource rows and resource columns so that the resource blocks can exchange data with one another.
Furthermore, each resource block comprises an input connection table, a weight table and an output connection table; the data in the input connection table describe the input interface of the resource block, the weight table holds the weight information of the calculation acceleration array corresponding to the resource block, and the data in the output connection table describe the output interface of the resource block.
Furthermore, the controller chip comprises a USB bus controller, an intelligent storage bus controller, a microcontroller, a code storage area and a buffer storage area; the microcontroller is connected to the USB bus controller and the intelligent storage bus controller, the code storage area is an embedded nonvolatile memory, and the buffer storage area is an embedded volatile memory.
Furthermore, the controller chip further comprises an extended memory controller, which is used to extend the buffer storage area with an external memory.
Furthermore, the portable artificial intelligence auxiliary computing device further comprises a housing provided with a circuit board, and the controller chip, the intelligent chip, the USB bus and the intelligent storage bus are integrated on the circuit board.
Compared with the prior art, the invention brings access to the key data required by the AI chip, such as the network structure and the weight array, together with the management of the AI chip's computing resources, into a standard storage model. The front-end equipment of the Internet of Things thus interacts with the AI chip through the standard access mode of a storage system, which gives the device a clear low-cost advantage when it is introduced to the market in Internet of Things edge computing and front-end computing scenarios.
Drawings
FIG. 1 is a schematic structural view of the present invention;
FIG. 2 is a block space structure diagram of a virtual storage device;
FIG. 3 is a schematic diagram of a controller chip;
Reference numerals:
1, USB connector; 2, controller chip; 3, intelligent chip; 4, USB bus; 5, intelligent storage bus; 6, virtual storage device; 7, partition; 8, partition table; 9, housing; 10, circuit board;
2-1, USB bus controller; 2-2, intelligent storage bus controller; 2-3, microcontroller; 2-4, code storage area; 2-5, buffer storage area; 2-6, extended memory controller; I1, USB bus interface; I2, intelligent storage bus interface; I3, extended memory bus interface;
7-1, information segment; 7-2, buffer segment; 7-3, resource segment; 7-3-1, resource block; 7-3-2, resource row; 7-3-3, resource column; T1, input connection table; T2, weight table; T3, output connection table.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. The present embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation manner and a specific operation process are given, but the scope of the present invention is not limited to the following embodiments.
In the drawings, structurally identical elements are denoted by the same reference numerals, and structurally or functionally similar elements are denoted by similar reference numerals. The size and thickness of each component shown in the drawings are chosen arbitrarily, and the invention does not limit the size or thickness of any component. Where appropriate, parts of the drawings are exaggerated for clarity of illustration.
Example 1:
A portable artificial intelligence auxiliary computing device, similar in form and structure to a USB flash drive, comprises, as shown in FIG. 1, a USB connector 1, a controller chip 2 and at least one intelligent chip 3. The USB connector 1 is connected to the controller chip 2 through a USB bus 4, and the controller chip 2 is connected to the intelligent chip 3 through an intelligent storage bus 5. A housing 9 is provided with a circuit board 10, on which the controller chip 2, the intelligent chip 3, the USB bus 4 and the intelligent storage bus 5 are integrated.
The intelligent chip 3 is an artificial neural network computing chip with an intelligent storage interface; the configuration resources inside the intelligent chip 3 are uniformly addressed, and the intelligent chip 3 is read and written through the intelligent storage interface.
In the prior art, an artificial neural network computing chip has a built-in core and can actively access external memory like a computer's CPU, so it usually serves as the computing center of the system architecture rather than being accessed by an external processor. This application recognizes that connecting the intelligent chip 3 to front-end devices of the Internet of Things as a computing center is difficult, and therefore provides the controller chip 2: the storage resources inside the intelligent chip 3 are uniformly addressed, and the intelligent chip 3 is connected to the controller chip 2 through a storage-like access interface. Through the USB connector 1, the controller chip 2 can then easily be connected to different types of front-end devices of the Internet of Things, which realizes the interaction between the front-end device and the intelligent chip 3.
The intelligent storage bus 5 is a bus with general-purpose nonvolatile memory chip interfaces; it carries at least one nonvolatile memory chip interface, of at least one type. The nonvolatile memory chip interface may be of a type such as ONFI, Toggle DDR, eMMC or SDIO, or another similar interface, so that several intelligent chips 3 with different structures can be mounted on a single intelligent storage bus 5 at the same time.
As a storage-computation integrated artificial neural network computing chip, the intelligent chip 3 has the following configuration resources: segmentation information of the calculation acceleration units, connection information between calculation acceleration arrays, weight information in the calculation acceleration arrays, input and output data of the artificial neural network, and chip configuration information. Once the configuration resources are uniformly addressed, they can be read and written through the intelligent storage bus 5. Different segmentation information, connection information and weight information correspond to different network structures and weight arrays of the neural network, so modifying the configuration resources modifies the artificial neural network in the intelligent chip 3.
The calculation acceleration unit is the core unit for accelerating artificial neural network computation. The storage-computation integrated intelligent chip 3 comprises a large number of storage-computation integrated devices, each of which is a calculation acceleration unit. These devices form an array and operate in parallel to accelerate multiply-add computation. To match different neural network architectures, the devices can be logically divided into several arrays of different sizes; the calculation acceleration units are divided into different calculation acceleration arrays according to the segmentation information.
After the calculation acceleration units are divided into several arrays of different sizes, the different calculation acceleration arrays are cascaded according to the connection information between them, and signals are passed from the output of the front-stage array to the input of the rear-stage array.
The weight array is essential data of the artificial neural network. In the storage-computation integrated intelligent chip 3, the weight array represents the state of each storage-computation integrated device, and the state of each such device in a calculation acceleration array is set according to the weight information of that array. Modifying the weight array modifies the artificial neural network algorithm, and reading out the weight array yields the current artificial neural network algorithm.
The input and output data of the artificial neural network comprise the input data fed into the network and the output data it produces. A trained artificial neural network is deployed on the intelligent chip 3 by writing its segmentation information, connection information and weight information; the input data are then written into the intelligent chip 3, and the intelligent chip 3 outputs the prediction or classification result of the artificial neural network.
The chip configuration information includes chip model information, capacity information of the configuration resources in the chip and address information of the configuration resources in the chip. From the chip configuration information the following can be obtained: the chip manufacturer, the product model, the chip type, the types of configuration resources in the chip, the quantity of each type of configuration resource, the start address and structure information of each type of configuration resource as mapped in the address space, the state information of the chip, a small number of control registers, and so on.
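The segmentation, cascading and weight-loading steps described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the function names `mac_array` and `run_cascade` are hypothetical, each array is modelled as an ideal integer multiply-accumulate with no activation function, and nothing here reflects how the actual chip stores or evaluates weights.

```python
def mac_array(weights, inputs):
    """Model one calculation acceleration array: a multiply-accumulate per output."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def run_cascade(arrays, inputs):
    """Pass the output of each front-stage array to the input of the next stage."""
    signal = list(inputs)
    for weights in arrays:
        # Every row of a stage must be as wide as the signal it receives.
        if any(len(row) != len(signal) for row in weights):
            raise ValueError("array input width does not match the signal")
        signal = mac_array(weights, signal)
    return signal


# Two arrays of different sizes (the "segmentation"), cascaded in order
# (the "connection information"), each holding its own weight table.
stage1 = [[1, 0, 2], [0, 1, -1]]   # 3 inputs -> 2 outputs
stage2 = [[1, 1]]                  # 2 inputs -> 1 output
result = run_cascade([stage1, stage2], [1, 2, 3])
```

In this model, changing the list of stages corresponds to changing the segmentation and connection information, and changing the nested lists corresponds to rewriting the weight array.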
Intelligent chips 3 of different scales contain different numbers and sizes of calculation acceleration units, and their configuration resources, such as the segmentation information, connection information and weight information, differ accordingly. Even intelligent chips 3 with the same kinds of configuration resources may hold different quantities of them, just as identical USB flash drives may hold different data. This information is obtained by reading the chip configuration information.
The configuration resources in the intelligent chip 3 are expressed as arrays or tables, with different types of configuration resources mapped to different address segments. Expressing them this way allows different types of configuration resources to be placed in the same two-dimensional table, so that the resource type corresponding to an address can be determined directly from the high-order address bits, which simplifies hardware mapped access. For example, suppose each row of the table is 128 bytes and all configuration resources occupy 64 rows, of which rows 0-15 hold connection data and rows 16-63 hold weight data. The address then uses 13 bits: the upper 6 bits are the row address and the lower 7 bits are the column address. When the top 2 bits of the row address are 00 the address refers to connection data, and otherwise to weight data, so the type of configuration resource can be determined from these two address bits alone.
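The address split in this example can be written out as a short decoding routine. The sketch below follows the figures given in the text (128-byte rows, 64 rows, a 13-bit address); the function name `decode_address` and the returned labels are illustrative assumptions, not part of the patent.

```python
ROW_BYTES = 128   # one table row is 128 bytes, so 7 column-address bits
ROW_COUNT = 64    # 64 rows in total, so 6 row-address bits


def decode_address(addr: int) -> tuple[str, int, int]:
    """Split a 13-bit address into (resource type, row, column)."""
    if not 0 <= addr < ROW_BYTES * ROW_COUNT:
        raise ValueError(f"address {addr:#x} is outside the 13-bit space")
    row = addr >> 7        # upper 6 bits: row address
    column = addr & 0x7F   # lower 7 bits: column address
    # The two most significant address bits alone identify the type:
    # 00 selects rows 0-15 (connection data), anything else rows 16-63 (weights).
    kind = "connection" if (addr >> 11) == 0 else "weight"
    return kind, row, column
```

For instance, `decode_address(16 * 128)` falls on the first byte of row 16 and is classified as weight data, while every address below that is connection data.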
The controller chip 2 is connected to the front-end device of the Internet of Things as a USB device. During enumeration on the USB bus 4 (USB bus enumeration is the process of identifying and addressing the USB devices connected to the bus), it presents the device information of the portable artificial intelligence auxiliary computing device as that of a mass storage device, thereby creating a virtual storage device 6 that the front-end device can read and write. The controller chip 2 converts the front-end device's read and write accesses to the virtual storage device 6 into read and write accesses to the configuration resources of the intelligent chip 3.
The size of the virtual storage device 6 created by the controller chip 2 depends on the number of intelligent chips 3 in the portable artificial intelligence auxiliary computing device and on the quantity of configuration resources of each intelligent chip 3.
From the point of view of the front-end device of the Internet of Things, the virtual storage device 6 contains several partitions, corresponding to the several intelligent chips 3, and the partitions contain files, corresponding to the artificial neural networks obtained by segmentation. From the point of view of an application program on the front-end device, the segmentation information, connection information, weight information and so on are data in those files. The files can be opened through the file system API and their data read and written: reading them reads out the artificial neural network, and modifying the segmentation information, connection information, weight information and so on changes the segmentation state, connection relations and weight data of the calculation acceleration units in the intelligent chip 3, and thereby changes the artificial neural network in the intelligent chip 3.
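From the application side, this amounts to ordinary file input and output. The following sketch shows the idea with a temporary directory standing in for the mounted partition; the file name `weights.bin`, the signed 8-bit weight encoding and the helper names are all assumptions made for illustration, since the text does not specify a file layout.

```python
import struct
import tempfile
from pathlib import Path


def write_weights(path: Path, weights: list[int]) -> None:
    """Overwrite the weight file with signed 8-bit values (assumed encoding)."""
    path.write_bytes(struct.pack(f"{len(weights)}b", *weights))


def read_weights(path: Path) -> list[int]:
    """Read the weight file back as a list of signed 8-bit values."""
    data = path.read_bytes()
    return list(struct.unpack(f"{len(data)}b", data))


# Stand-in for the mounted partition; on a real system this would be the
# mount point that the operating system assigns to the virtual storage device.
with tempfile.TemporaryDirectory() as mount_point:
    weight_file = Path(mount_point) / "weights.bin"   # hypothetical file name
    write_weights(weight_file, [3, -1, 4, -5])        # modify the network
    loaded = read_weights(weight_file)                # read the network back
```

The point of the design is visible here: the application needs no chip-specific driver or SDK, only the file operations that every operating system already provides.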
When the front-end device of the Internet of Things reads or writes the virtual storage device 6, the controller chip 2 converts the operation into read-write instructions for the intelligent chip 3 and completes the access through the intelligent storage bus 5. For example, a change to the segmentation information is applied by writing the configuration registers or configuration space of the intelligent chip 3, and a change to the connection information is applied by writing the configuration of the switching matrix in the intelligent chip 3.
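The translation step can be pictured as a small dispatch table in the controller firmware. This is only a sketch: the command names, the `FakeSmartStorageBus` class and the handling of weight writes are assumptions made for illustration, because the description does not define the command set of the intelligent storage bus.

```python
from dataclasses import dataclass, field

# Hypothetical command names; the description does not define them.
WRITE_CONFIG_REGISTER = "WRITE_CONFIG_REGISTER"
WRITE_SWITCH_MATRIX = "WRITE_SWITCH_MATRIX"
WRITE_WEIGHTS = "WRITE_WEIGHTS"  # assumed; the source only says "and the like"

# Which chip-side target each kind of configuration resource is written to.
KIND_TO_COMMAND = {
    "segmentation": WRITE_CONFIG_REGISTER,  # configuration registers / space
    "connection": WRITE_SWITCH_MATRIX,      # switching matrix configuration
    "weight": WRITE_WEIGHTS,
}


@dataclass
class FakeSmartStorageBus:
    """Stand-in for the intelligent storage bus; it only records commands."""
    log: list = field(default_factory=list)

    def issue(self, chip_id, command, address, payload):
        self.log.append((chip_id, command, address, bytes(payload)))


def translate_write(bus, chip_id, kind, address, payload):
    """Convert a host write to the virtual storage device into a chip command."""
    try:
        command = KIND_TO_COMMAND[kind]
    except KeyError:
        raise ValueError(f"unknown configuration resource kind: {kind!r}") from None
    bus.issue(chip_id, command, address, payload)
    return command
```

For example, `translate_write(FakeSmartStorageBus(), 0, "connection", 0x40, b"\x01")` records one switching-matrix write for chip 0.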
As shown in Fig. 2, the block space of the virtual storage device 6 includes a partition table 8 and at least one partition 7. The partitions 7 correspond one-to-one to the intelligent chips 3, and the size of the address space of each partition 7 is calculated from the amount of configuration resources of the corresponding intelligent chip 3. The partition table 8 is a virtual partition table that meets the GPT or MBR format requirements: when the front-end device of the Internet of Things reads the block addresses of the partition table 8 through the USB connector 1, the contents of a partition table constructed according to that format are returned. The partition table 8 is configured as read-only to prevent the front-end device of the Internet of Things from changing the partition information.
The partition table 8 need not actually be stored. Each time the device is powered on, the controller chip 2 enumerates the intelligent storage bus 5, obtains the number of intelligent chips 3 on the bus and the configuration resources of each intelligent chip 3, and computes the partition table 8 from them.
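As an illustration of computing the partition table at power-on, the sketch below packs an MBR sector from a list of per-chip resource counts. The MBR field layout (four 16-byte entries starting at byte 446, signature 0x55 0xAA at bytes 510 and 511) is the standard one; the sizing rule, the partition type byte and the starting block address are assumptions, since the description only states that partition size depends on the amount of configuration resources.

```python
import struct

SECTOR_SIZE = 512          # classic MBR sector size
MBR_MAX_PARTITIONS = 4     # an MBR holds four primary partition entries
PARTITION_TYPE = 0xDA      # assumed type byte; the description names none


def partition_sectors(resource_count, sectors_per_resource=8,
                      info_sectors=1, buffer_sectors=64):
    """Assumed sizing rule: fixed information and buffer sections plus a
    fixed number of sectors per configuration resource."""
    return info_sectors + buffer_sectors + resource_count * sectors_per_resource


def build_mbr(resource_counts, first_lba=1):
    """Return (512-byte MBR, [(start_lba, sector_count), ...]), one entry per chip."""
    if not 1 <= len(resource_counts) <= MBR_MAX_PARTITIONS:
        raise ValueError("this sketch supports 1 to 4 chips (MBR primary partitions)")
    mbr = bytearray(SECTOR_SIZE)
    layout = []
    lba = first_lba
    for i, count in enumerate(resource_counts):
        sectors = partition_sectors(count)
        entry = struct.pack(
            "<B3sB3sII",
            0x00,              # status: not bootable
            b"\xff\xff\xff",   # CHS start: placeholder, LBA addressing is used
            PARTITION_TYPE,
            b"\xff\xff\xff",   # CHS end: placeholder
            lba,               # first sector of the partition (LBA)
            sectors,           # number of sectors in the partition
        )
        offset = 446 + 16 * i
        mbr[offset:offset + 16] = entry
        layout.append((lba, sectors))
        lba += sectors
    mbr[510:512] = b"\x55\xaa"  # boot signature
    return bytes(mbr), layout
```

Because the table is a pure function of the enumeration result, the controller can regenerate it on every power-on and serve it read-only without storing it.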
The partition 7 comprises a plurality of logical sectors, which are divided into three sections: an information section 7-1, a buffer section 7-2 and a resource section 7-3:
one or more logical sectors starting from the first logical sector form the information section 7-1, which stores the amount of configuration resources of the intelligent chip 3 for the front-end device of the Internet of Things to read;
the logical sectors following the information section 7-1 form the buffer section 7-2, which buffers the input data and output data of the intelligent chip 3; the number of logical sectors in the buffer section 7-2 is defined in the information section 7-1; input data for the intelligent chip 3 is written to the input buffer area of the buffer section 7-2 and forwarded to the intelligent chip 3 by the controller chip 2, and output data of the intelligent chip 3 is read and buffered by the controller chip 2 and sent to the front-end device of the Internet of Things when it reads the output buffer area of the buffer section 7-2;
the logical sectors following the buffer section 7-2 form the resource section 7-3, which consists of a plurality of resource blocks 7-3-1; each resource block 7-3-1 corresponds to the minimum allocatable unit of the computation acceleration array in the intelligent chip 3 and includes an input connection table T1, a weight value table T2 and an output connection table T3; the data in the input connection table T1 describes the input interface of the resource block 7-3-1, the weight value table T2 holds the weight information of the computation acceleration array corresponding to the resource block 7-3-1, and the data in the output connection table T3 describes the output interface of the resource block 7-3-1.
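The three-section layout can be sketched as a mapping from a logical sector number within a partition to the section, resource block and table it belongs to. The sketch assumes fixed table sizes and that the tables of a resource block are stored contiguously in the order T1, T2, T3; the description does not state either, and all names below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionLayout:
    info_sectors: int     # sectors in the information section 7-1
    buffer_sectors: int   # sectors in the buffer section 7-2
    block_count: int      # number of resource blocks 7-3-1
    t1_sectors: int       # assumed size of the input connection table T1
    t2_sectors: int       # assumed size of the weight value table T2
    t3_sectors: int       # assumed size of the output connection table T3

    @property
    def block_sectors(self):
        return self.t1_sectors + self.t2_sectors + self.t3_sectors

    @property
    def total_sectors(self):
        return (self.info_sectors + self.buffer_sectors
                + self.block_count * self.block_sectors)

    def locate(self, sector):
        """Map a logical sector of the partition to its section and offset."""
        if not 0 <= sector < self.total_sectors:
            raise ValueError("sector outside the partition")
        if sector < self.info_sectors:
            return ("information", sector)
        sector -= self.info_sectors
        if sector < self.buffer_sectors:
            return ("buffer", sector)
        sector -= self.buffer_sectors
        block, offset = divmod(sector, self.block_sectors)
        if offset < self.t1_sectors:
            return ("resource", block, "T1", offset)
        offset -= self.t1_sectors
        if offset < self.t2_sectors:
            return ("resource", block, "T2", offset)
        return ("resource", block, "T3", offset - self.t2_sectors)
```

With `PartitionLayout(1, 4, 2, 1, 6, 1)`, sector 0 is the information section, sectors 1 to 4 are the buffer section, and sector 5 is the first sector of table T1 in resource block 0.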
An intelligent chip 3 that cannot exchange data fully between all resource blocks 7-3-1 can identify the resource blocks 7-3-1 by physical position. For example, resource blocks 7-3-1 in the common two-dimensional grid (2D mesh) arrangement can be identified by row and column, forming resource rows 7-3-2 and resource columns 7-3-3, so that data can be exchanged between resource blocks 7-3-1 cooperatively.
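Row-and-column identification in a 2D mesh can be sketched as follows. The row-major numbering and the restriction of data exchange to the four adjacent blocks are assumptions made for illustration; the description only says that blocks are identified by row and column.

```python
def block_to_row_col(index, columns):
    """Convert a resource block index to (row, column), assuming row-major order."""
    if columns <= 0:
        raise ValueError("columns must be positive")
    if index < 0:
        raise ValueError("index must be non-negative")
    return divmod(index, columns)


def row_col_to_block(row, col, columns):
    """Inverse of block_to_row_col."""
    return row * columns + col


def mesh_neighbors(index, rows, columns):
    """Indices of the blocks adjacent to `index` (up, down, left, right)."""
    row, col = block_to_row_col(index, columns)
    if row >= rows:
        raise ValueError("index outside the mesh")
    neighbors = []
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < rows and 0 <= c < columns:
            neighbors.append(row_col_to_block(r, c, columns))
    return neighbors
```

In a mesh of 3 rows and 4 columns, block 5 sits at row 1, column 1, and its neighbors are blocks 1, 9, 4 and 6.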
To implement the above functions, the internal structure of the controller chip 2, shown in Fig. 3, includes a USB bus controller 2-1, an intelligent storage bus controller 2-2, a microcontroller 2-3, a code storage area 2-4 and a buffer storage area 2-5. The microcontroller 2-3 is connected to the USB bus controller 2-1 and the intelligent storage bus controller 2-2. The modules in the controller chip 2 use common chip circuits and can be designed according to common knowledge in the industry. Their functions are described as follows:
The microcontroller 2-3 may be a common reduced instruction set (RISC) processor, such as ARM or RISC-V. It performs cooperative control within the movable artificial intelligence auxiliary computing device, including data parsing and sending and the generation of and response to bus access requests. The microcontroller 2-3 is also responsible for parsing the USB commands of the front-end device of the Internet of Things and presenting the virtual storage device 6 to it.
The intelligent storage bus controller 2-2 implements bus access control for the mounted intelligent chips 3. Because intelligent chips 3 currently have no mature interface standard and mostly use customized interfaces, the bus timing is implemented according to the interface definition of the selected intelligent chip 3. It may optionally follow standards similar to a memory bus, such as DDR or SRAM, or standards similar to a storage bus, such as ONFI, Toggle DDR, eMMC or SDIO.
The code storage area 2-4 is an embedded nonvolatile memory that stores code for the microcontroller 2-3 and may use technologies such as OTP, NOR flash or phase-change memory. The buffer storage area 2-5 is an embedded volatile memory, generally SRAM, that buffers input and output data when the front-end device of the Internet of Things accesses the intelligent chip 3.
The controller chip 2 further comprises an extended memory controller 2-6, which is used to extend external memory as a buffer storage area. When the capacity of the buffer storage area 2-5 is insufficient, as in intelligent video recognition applications where the input data volume is large, an external memory device may be required to store one or more frames of input video data.
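A rough capacity check shows why video input can exceed an embedded SRAM buffer. All figures here (frame size, pixel depth, a 256 KiB SRAM) are illustrative assumptions, not values from the description.

```python
def frame_bytes(width, height, bytes_per_pixel):
    """Size of one uncompressed video frame in bytes."""
    return width * height * bytes_per_pixel


def needs_extended_memory(width, height, bytes_per_pixel, frames, sram_bytes):
    """True if buffering `frames` frames would overflow the embedded buffer."""
    return frames * frame_bytes(width, height, bytes_per_pixel) > sram_bytes


# Illustrative numbers only: one 640x480 RGB frame is 921,600 bytes,
# which is more than an assumed 256 KiB (262,144 byte) SRAM buffer.
ASSUMED_SRAM_BYTES = 256 * 1024
vga_rgb_overflows = needs_extended_memory(640, 480, 3, 1, ASSUMED_SRAM_BYTES)
small_gray_overflows = needs_extended_memory(64, 64, 1, 2, ASSUMED_SRAM_BYTES)
```

Under these assumptions a single VGA frame already overflows the buffer, while two 64x64 grayscale frames (8,192 bytes) fit comfortably.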
The controller chip 2 is provided with a USB bus interface I1, an intelligent storage bus interface I2 and an extended memory bus interface I3, the controller chip 2 is connected with the USB bus 4 through the USB bus interface I1, the controller chip 2 is connected with the intelligent storage bus 5 through the intelligent storage bus interface I2, and the controller chip 2 is connected with the extended memory bus through the extended memory bus interface I3 so as to be connected with an external extended memory.
The USB connector 1 provides plug-and-play operation. The USB interface suits many types of front-end devices of the Internet of Things and provides an interface technology and access method compatible with storage devices.
An AI chip is a computing device. It generally forms a heterogeneous computing arrangement with the processor in a system, and the two cooperate by accessing shared data. However, there is currently no standard heterogeneous computing solution suited to front-end devices of the Internet of Things. Such a system usually has to be designed case by case by the system hardware designer, as in the intelligent dual-chip scheme combining with the Carabin age or the dual-chip scheme of a main processor plus a codec used in the early security field, or else a chip company integrates the processor and the AI core and designs the integrated architecture itself.
The AI chip is logically abstracted into a model similar to a storage device, and the front-end device of the Internet of Things accesses the intelligent chip 3 through partitions, a file system and file read-write operations. Partitioning, reading and writing are routine operations in embedded systems, so the implementation cost is low. Combined with the USB hardware interface, this gives developers a friendly environment in both hardware and software and significantly reduces the cost of bringing intelligent technology to market.
USB bus enumeration and storage system access have become unified industry standards. After the device is inserted into a front-end device of the Internet of Things, most system software on the front-end device can identify it automatically and make it ready for use. The operating system kernel of the front-end device normally does not need to be modified, and no additional driver or supporting software package needs to be installed. Only a small modification of the application program is needed, and upgrading application programs is a basic capability of such front-end devices, so the market introduction cost is low.
For AI chips of different types, the controller chip 2 converts buses and protocols so that the same access bus and protocol are presented to the front-end device of the Internet of Things. As long as a new AI chip is supported by the controller chip 2, it can provide intelligent services to most front-end devices of the Internet of Things. When an update is needed, it is enough to replace the plug-and-play movable artificial intelligence auxiliary computing device, so adopting new technology is very convenient.
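The bus and protocol conversion resembles an adapter layer: each chip type gets its own driver, while the host-facing access path stays the same. The sketch below is an illustration only; the two chip classes, their interfaces and the 16-byte page size are hypothetical and do not describe any real AI chip.

```python
from abc import ABC, abstractmethod


class SmartChipDriver(ABC):
    """Uniform interface the controller firmware would use for every chip type."""

    @abstractmethod
    def read(self, address, length):
        ...

    @abstractmethod
    def write(self, address, data):
        ...


class MemoryLikeChip(SmartChipDriver):
    """Hypothetical chip with a byte-addressed, memory-bus style interface."""

    def __init__(self, size):
        self._mem = bytearray(size)

    def read(self, address, length):
        return bytes(self._mem[address:address + length])

    def write(self, address, data):
        if address + len(data) > len(self._mem):
            raise ValueError("write beyond end of chip address space")
        self._mem[address:address + len(data)] = data


class PagedChip(SmartChipDriver):
    """Hypothetical chip with a page-based, storage-bus style interface."""

    PAGE = 16  # assumed page size in bytes

    def __init__(self, pages):
        self._pages = [bytearray(self.PAGE) for _ in range(pages)]

    def read(self, address, length):
        out = bytearray()
        for a in range(address, address + length):
            page, offset = divmod(a, self.PAGE)
            out.append(self._pages[page][offset])
        return bytes(out)

    def write(self, address, data):
        for i, value in enumerate(data):
            page, offset = divmod(address + i, self.PAGE)
            self._pages[page][offset] = value


def host_write_then_read(driver, address, data):
    """The host-facing path is identical regardless of the chip behind it."""
    driver.write(address, data)
    return driver.read(address, len(data))
```

Supporting a new chip type then means adding one driver class, while the access path seen by the front-end device stays unchanged.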
The neural network structure can be designed and trained in the cloud and downloaded to the device through an ordinary PC. The device is then inserted into the front-end device of the Internet of Things, where it is used for prediction or classification, so offline training and transfer are very convenient. Alternatively, a communication module can be added so that the trained artificial neural network is downloaded from the cloud over the network and written to the inserted movable artificial intelligence auxiliary computing device to initialize or update it. The whole process uses only routine file system operations and requires no additional technical support related to intelligent algorithms.
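Deployment by routine file system operations can be as simple as a file copy followed by a read-back check. This is a minimal sketch: the file names and the mount directory are hypothetical, and a temporary directory stands in for the mounted device.

```python
import os
import shutil
import tempfile


def deploy_network(trained_file, device_mount, target_name="network0.bin"):
    """Copy a trained network file onto the mounted device and compare the copy.

    Note: the read-back may be served from the OS cache rather than the device.
    """
    target = os.path.join(device_mount, target_name)
    shutil.copyfile(trained_file, target)
    with open(trained_file, "rb") as src, open(target, "rb") as dst:
        if src.read() != dst.read():
            raise OSError("copy does not match the trained network file")
    return target


def demo_deploy():
    # A temporary directory plays the role of the device's mount point.
    with tempfile.TemporaryDirectory() as workdir:
        trained = os.path.join(workdir, "trained.bin")
        mount = os.path.join(workdir, "mount")
        os.mkdir(mount)
        with open(trained, "wb") as f:
            f.write(b"\x01\x02\x03\x04")
        target = deploy_network(trained, mount)
        with open(target, "rb") as f:
            return os.path.basename(target), f.read()
```

The same copy could run on a PC before the device is moved to the front-end device, or on the front-end device itself after a download from the cloud.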
In this method, the key data required by an AI chip, such as access to the network structure and weight arrays and the management of the AI chip's computing resources, is brought into a standard storage model, and interaction between the front-end device of the Internet of Things and the AI chip takes place through standard storage system access. This differs markedly from mainstream heterogeneous computing or auxiliary computing acceleration. Mainstream schemes are typically accelerator cards or acceleration chips with customized interfaces; they usually require access over a specific bus, a basic software package and specific driver support, and developers often have to call specific API functions to submit computing tasks and retrieve results. The method therefore has a clear cost advantage for market introduction in Internet of Things edge and front-end computing scenarios.
The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.

Claims (10)

1. A movable artificial intelligence auxiliary computing device, comprising: a USB connector (1), a controller chip (2) and at least one intelligent chip (3);
the USB connector (1) is connected with the controller chip (2) through a USB bus (4), and the controller chip (2) is connected with the intelligent chip (3) through an intelligent storage bus (5);
the intelligent chip (3) is an artificial neural network computing chip with an intelligent memory interface, the configuration resources in the intelligent chip (3) are uniformly addressed, and the intelligent chip (3) is read and written through the intelligent memory interface;
the configuration resources in the intelligent chip (3) comprise: segmentation information of the calculation acceleration units, connection information between the calculation acceleration arrays, weight information in the calculation acceleration arrays, input and output data of the artificial neural network, and chip configuration information;
the controller chip (2) is connected to the front-end device of the Internet of Things as a USB device and, when the USB bus (4) is enumerated, provides the device information of the movable artificial intelligence auxiliary computing device as that of a mass storage device, thereby presenting a virtual storage device (6) that the front-end device of the Internet of Things can read and write, and the controller chip (2) converts the read-write accesses of the front-end device of the Internet of Things to the virtual storage device (6) into read-write accesses to the configuration resources of the intelligent chip (3).
2. The movable artificial intelligence auxiliary computing device according to claim 1, wherein the intelligent storage bus (5) is a bus with a general-purpose non-volatile memory chip interface, the number of non-volatile memory chip interfaces on the intelligent storage bus (5) is at least one, and the number of types of non-volatile memory chip interface is at least one.
3. The movable artificial intelligence auxiliary computing device according to claim 2, wherein the type of the non-volatile memory chip interface comprises one or more of ONFI, Toggle DDR, eMMC and SDIO.
4. The movable artificial intelligence auxiliary computing device according to claim 1, wherein the configuration resources within the intelligent chip (3) are specifically:
the segmentation information of the calculation acceleration units is the segmentation information of the storage and calculation integrated devices inside the intelligent chip (3), the intelligent chip (3) comprises a plurality of storage and calculation integrated devices, each storage and calculation integrated device is a calculation acceleration unit, and the calculation acceleration units are segmented into different calculation acceleration arrays according to the segmentation information of the calculation acceleration units;
the connection information among the calculation acceleration arrays is the connection information among arrays formed by dividing an internal storage and calculation integrated device of the intelligent chip (3), different calculation acceleration arrays are cascaded according to the connection information among the calculation acceleration arrays, and the output end of the front-stage array is connected with the input end of the rear-stage array;
the weight information in the calculation acceleration array is the weight information of each calculation acceleration unit, and the state of each storage and calculation integrated device in the calculation acceleration array is set according to the weight information in the calculation acceleration array;
the input and output data of the artificial neural network comprise input data input into the artificial neural network and output data output by the artificial neural network;
the chip configuration information includes: the chip model information, the capacity information of the resources configured in the chip and the address information of the resources configured in the chip.
5. The movable artificial intelligence auxiliary computing device according to claim 4, wherein the configuration resources within the intelligent chip (3) are expressed in array or table form, and different types of configuration resources are mapped to different address segments.
6. A movable artificial intelligence auxiliary computing device according to claim 4, characterized in that the virtual storage device (6) comprises at least 1 partition (7) in the block space, the partitions (7) are in one-to-one correspondence with the intelligent chips (3), and the size of the address space of a partition (7) is calculated according to the amount of configuration resources of the intelligent chip (3) corresponding to the partition (7);
the partition (7) comprises an information section (7-1), a buffer section (7-2) and a resource section (7-3), wherein the information section (7-1) is used for storing the amount of configuration resources of the intelligent chip (3), the buffer section (7-2) is used for buffering input data and output data of the intelligent chip (3), the resource section (7-3) comprises a plurality of resource blocks (7-3-1), and each resource block (7-3-1) corresponds to the minimum unit of the intelligent chip (3) to which the calculation acceleration array can be allocated.
7. The movable artificial intelligence auxiliary computing device according to claim 6, wherein the resource block (7-3-1) comprises an input connection table (T1), a weight value table (T2) and an output connection table (T3), the data in the input connection table (T1) describes the input interface of the resource block (7-3-1), the weight value table (T2) is the weight information of the calculation acceleration array to which the resource block (7-3-1) corresponds, and the data in the output connection table (T3) describes the output interface of the resource block (7-3-1).
8. The movable artificial intelligence auxiliary computing device according to claim 6, wherein the controller chip (2) comprises: a USB bus controller (2-1), an intelligent storage bus controller (2-2), a microcontroller (2-3), a code storage area (2-4) and a buffer storage area (2-5); the microcontroller (2-3) is connected with the USB bus controller (2-1) and the intelligent storage bus controller (2-2), the code storage area (2-4) is an embedded nonvolatile memory, and the buffer storage area (2-5) is an embedded volatile memory.
9. The movable artificial intelligence auxiliary computing device according to claim 8, wherein the controller chip (2) further comprises an extended memory controller (2-6), and the extended memory controller (2-6) is used for extending external memory as a buffer storage area.
10. The movable artificial intelligence auxiliary computing device according to claim 1, further comprising a housing (9), wherein a circuit board (10) is disposed on the housing (9), and the controller chip (2), the intelligent chip (3), the USB bus (4) and the intelligent storage bus (5) are integrated on the circuit board (10).
CN202110777201.2A 2021-07-09 2021-07-09 Portable artificial intelligence auxiliary computing equipment Active CN113487006B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110777201.2A CN113487006B (en) 2021-07-09 2021-07-09 Portable artificial intelligence auxiliary computing equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110777201.2A CN113487006B (en) 2021-07-09 2021-07-09 Portable artificial intelligence auxiliary computing equipment

Publications (2)

Publication Number Publication Date
CN113487006A true CN113487006A (en) 2021-10-08
CN113487006B CN113487006B (en) 2022-08-09

Family

ID=77937718

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110777201.2A Active CN113487006B (en) 2021-07-09 2021-07-09 Portable artificial intelligence auxiliary computing equipment

Country Status (1)

Country Link
CN (1) CN113487006B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050120146A1 (en) * 2003-12-02 2005-06-02 Super Talent Electronics Inc. Single-Chip USB Controller Reading Power-On Boot Code from Integrated Flash Memory for User Storage
US20070130436A1 (en) * 1999-10-19 2007-06-07 Super Talent Electronics, Inc. Electronic Data Storage Medium With Fingerprint Verification Capability
CN101276322A (en) * 2008-05-23 2008-10-01 首都师范大学 Embedded type intelligent chip with formulation discover function
CN109947573A (en) * 2019-03-26 2019-06-28 北京智芯微电子科技有限公司 Intelligence suitable for electric system edge calculations accelerates chip
CN109981724A (en) * 2019-01-28 2019-07-05 上海左岸芯慧电子科技有限公司 A kind of internet-of-things terminal based on block chain, artificial intelligence system and processing method
CN111858023A (en) * 2019-04-26 2020-10-30 英特尔公司 Architectural enhancements for computing systems with artificial intelligence logic disposed locally to memory
CN112631968A (en) * 2020-12-22 2021-04-09 无锡江南计算技术研究所 Dynamic evolvable intelligent processing chip structure
CN112988082A (en) * 2021-05-18 2021-06-18 南京优存科技有限公司 Chip system for AI calculation based on NVM and operation method thereof
CN113065643A (en) * 2021-04-12 2021-07-02 天津中科虹星科技有限公司 Apparatus and method for performing multi-task convolutional neural network prediction

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
吴艳霞,等.: "深度学习FPGA加速器的进展与趋势", 《计算机学报》 *
陈智.: "DSP芯片AI推理设备的设计与实现", 《福建电脑》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114492709A (en) * 2021-12-30 2022-05-13 湖南中科存储科技有限公司 Dual-chip USB flash disk with backup function
CN116306855A (en) * 2023-05-17 2023-06-23 之江实验室 Data processing method and device based on memory and calculation integrated system
CN116306855B (en) * 2023-05-17 2023-09-01 之江实验室 Data processing method and device based on memory and calculation integrated system

Also Published As

Publication number Publication date
CN113487006B (en) 2022-08-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant