CN113168552A - Artificial intelligence application development system, computer device and storage medium - Google Patents

Info

Publication number
CN113168552A
Authority
CN
China
Prior art keywords
neural network
subsystem
model
network model
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201980066985.9A
Other languages
Chinese (zh)
Inventor
朱焱
汤鉴
姜浩
蔡权雄
牛昕宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Corerain Technologies Co Ltd
Original Assignee
Shenzhen Corerain Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Corerain Technologies Co Ltd filed Critical Shenzhen Corerain Technologies Co Ltd
Publication of CN113168552A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons

Abstract

An artificial intelligence application development system (100), a computer device, and a storage medium, belonging to the field of artificial intelligence. The system (100) comprises: a neural network generation subsystem (101) for constructing, training, and verifying a neural network model; a neural network hardware execution subsystem (102) for receiving data input into the neural network model and outputting a result after calculation by the model; and a deployment subsystem (103) for compiling the neural network model generated by the neural network generation subsystem (101) and deploying it to the neural network hardware execution subsystem (102). A neural network model is constructed and trained through the visual neural network generation subsystem (101), and the trained model is automatically deployed by the deployment subsystem (103) to the neural network hardware execution subsystem (102) for execution, thereby lowering the barrier to entry for artificial intelligence application development and improving development efficiency.

Description

Artificial intelligence application development system, computer device and storage medium

Technical Field
The present application relates to the field of artificial intelligence technology, and in particular, to an artificial intelligence application development system, a computer device, and a storage medium.
Background
With the arrival of the big data era, data is growing explosively. Faced with massive data, artificial intelligence deep learning (neural network) techniques, which improve feature completeness, are increasingly preferred over traditional manual feature extraction, as they avoid its complexity and inefficiency. As deep learning plays an ever more important role in fields such as image recognition, speech recognition, and intelligent management, application scenarios impose increasingly strict requirements on data annotation, algorithm model construction, model training, algorithm deployment, and the performance and power consumption of hardware devices. This places high demands on the development skills of application developers and deters many of them; for newcomers to the field in particular, the cost is high and development efficiency is low.
Disclosure of Invention
An object of the embodiments of the present application is to provide an artificial intelligence application development system, a computer device, and a storage medium that lower the barrier to entry for artificial intelligence application development and improve development efficiency.
In order to solve the above technical problem, an embodiment of the present application provides an artificial intelligence application development system, which adopts the following technical solutions:
the artificial intelligence application development system comprises:
the neural network generation subsystem is used for constructing, training and verifying a neural network model;
the neural network hardware execution subsystem is used for receiving data input into the neural network model, and outputting a result after calculation by the neural network model;
and the deployment subsystem is used for compiling the neural network model generated by the neural network generation subsystem and then deploying the neural network model to the neural network hardware execution subsystem.
Further, the neural network generation subsystem is further configured to provide training data for the neural network model and label the training data.
Further, the neural network hardware execution subsystem is realized based on an FPGA.
Further, the deployment subsystem includes:
the compiling module is used for analyzing the neural network model and generating a structure file and a data file of the model;
the operation module is used for distributing hardware computing resources according to the structure file and the data file of the model;
and the driving module is used for calling corresponding hardware computing resources according to the distribution result of the operation module, and the hardware computing resources comprise the neural network hardware execution subsystem realized based on the FPGA.
Further, the allocating, by the operation module, of hardware computing resources according to the structure file and the data file of the model includes:
acquiring the information of each computing node according to the structure file and the data file of the model;
allocating hardware computing resources to each computing node based on the information for each computing node.
Furthermore, the neural network hardware execution subsystem realized based on the FPGA comprises an FPGA core module and an extension module.
Further, the FPGA core module includes a core chip, a memory chip, an SAMTEC interface, and a JTAG interface.
Further, the extension module comprises a network interface, a UART port, a GPIO port, and an SAMTEC interface, and the FPGA core module and the extension module are connected and communicate via the SAMTEC interface.
In order to solve the above technical problem, an embodiment of the present application further provides a computer device, which adopts the following technical solutions:
the computer device comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the functions of the artificial intelligence application development system when executing the computer program.
In order to solve the above technical problem, an embodiment of the present application further provides a computer-readable storage medium, which adopts the following technical solutions:
the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the functionality of the artificial intelligence application development system of any one of the embodiments set forth herein.
Compared with the prior art, the embodiments of the present application mainly have the following beneficial effects. The system comprises a neural network generation subsystem for constructing, training, and verifying a neural network model; a neural network hardware execution subsystem for receiving data input into the neural network model and outputting a result after calculation by the model; and a deployment subsystem for compiling the neural network model generated by the neural network generation subsystem and deploying it to the neural network hardware execution subsystem. A neural network model is constructed and trained through the visual neural network generation subsystem, and the trained model is automatically deployed by the deployment subsystem to the neural network hardware execution subsystem for execution, lowering the barrier to entry for artificial intelligence application development and improving development efficiency.
Drawings
To illustrate the solution of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below depict only some embodiments of the present application; those skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 illustrates a schematic structural diagram of one embodiment of an artificial intelligence application development system 100 in accordance with the present application;
FIG. 2 illustrates a schematic structural diagram of one embodiment of a deployment subsystem 103 of the artificial intelligence application development system in accordance with the present application;
FIG. 3 illustrates a schematic structural diagram of one embodiment of the neural network hardware execution subsystem 102 of the artificial intelligence application development system in accordance with the present application;
FIG. 4 is a schematic block diagram of one embodiment of a computer device according to the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof, in the description and claims of this application and the description of the above figures are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the above-described drawings are used for distinguishing between different objects and not for describing a particular order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.
As shown in FIG. 1, FIG. 1 illustrates a schematic structural diagram of one embodiment of an artificial intelligence application development system according to the present application. The artificial intelligence application development system 100 includes:
and the neural network generation subsystem 101 is used for constructing, training and verifying a neural network model.
The neural network model can be constructed in two ways: a deep learning neural network algorithm automatically generated from labelled data, or a neural network algorithm model customized by the user according to requirements. Training means iteratively training the constructed model on the labelled data until its loss value converges to a minimum. Verification means checking the trained model's effect on validation data: the user can upload image data, voice data, and the like as model input; the model performs detection and recognition and outputs a result; and the model's effect, such as recognition accuracy and recognition speed, can then be verified. In this embodiment, the neural network generation subsystem 101 may provide a visual interface through WEB technology to help developers quickly develop, train, and verify a neural network model; that is, a developer accesses the interface provided by the neural network generation subsystem 101 through a web page to obtain the corresponding services, such as building the neural network model. Providing visual construction, training, and verification of the neural network model through a web page improves developers' efficiency.
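As a rough illustration of the construct, train, and verify loop described above, the following minimal sketch trains a one-layer model (logistic regression) on labelled toy data until the loss converges, then checks its accuracy on validation data. The model, data, and hyperparameters are all hypothetical stand-ins for what the subsystem would generate; nothing here is taken from the patent's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Construct": the simplest possible model, a single logistic unit.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)   # labelled training data
w = np.zeros(2)
b = 0.0

def predict(X, w, b):
    """Sigmoid output of the one-layer model."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

# "Train": iterate gradient steps so the loss converges toward a minimum.
lr = 0.5
for _ in range(300):
    p = predict(X, w, b)
    w -= lr * (X.T @ (p - y)) / len(y)
    b -= lr * np.mean(p - y)

# "Verify": measure recognition accuracy on held-out validation data.
Xv = rng.normal(size=(100, 2))
yv = (Xv[:, 0] + Xv[:, 1] > 0).astype(float)
acc = np.mean((predict(Xv, w, b) > 0.5) == yv)
print(f"validation accuracy: {acc:.2f}")
```

The subsystem described above automates exactly these three stages behind a web interface, so the developer never writes this loop by hand.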
And the neural network hardware execution subsystem 102 is used for receiving the data input into the neural network model, and outputting a result after calculation by the neural network model.
The neural network hardware execution subsystem 102 may be a general-purpose processor (such as a CPU) that stores and can execute the neural network model, or a special-purpose processor (such as an FPGA) in which the neural network model is solidified. The subsystem provides hardware computing resources, and may also provide a network interface or other interfaces to receive and store externally input data. The data is then fed into the neural network model for computation, that is, feature extraction, classification or clustering, regression or prediction, and the like, so as to obtain a prediction or recognition result.
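The receive, compute, and output contract of the execution subsystem can be sketched in a few lines. The `HardwareExecutor` class and its methods below are hypothetical stand-ins for the CPU- or FPGA-backed executor, not an API from the patent; the "computation" is reduced to a single linear layer followed by an argmax classification.

```python
import numpy as np

class HardwareExecutor:
    """Stand-in for a processor holding a fixed (solidified) model."""

    def __init__(self, weights, bias):
        self.weights = np.asarray(weights)
        self.bias = float(bias)
        self.buffer = None

    def receive(self, data):
        # In hardware this would arrive via a network or other interface
        # and be stored before computation begins.
        self.buffer = np.asarray(data)
        return self

    def run(self):
        # Compute: feature extraction / classification collapsed into one
        # linear layer; the class with the highest score is the result.
        scores = self.buffer @ self.weights + self.bias
        return int(np.argmax(scores))

executor = HardwareExecutor(weights=[[2.0, -1.0], [0.5, 3.0]], bias=0.0)
label = executor.receive([1.0, 1.0]).run()
print("predicted class:", label)
```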
And the deployment subsystem 103 is used for compiling the neural network model generated by the neural network generation subsystem 101 and then deploying the neural network model to the neural network hardware execution subsystem 102.
The neural network model includes a neural network graph (the network structure) and the parameters corresponding to that structure. The structure takes layers as computing units, including but not limited to convolutional layers, pooling layers, ReLU (activation function) layers, and fully connected layers. Besides receiving the data stream output by the previous layer, each layer in the structure has a number of parameters, including but not limited to weights and biases. In this embodiment, the neural network model is compiled into a model file (comprising a structure file and a data file) by a compiler (e.g., TVM), and the hardware resources required by the model, such as computing units, cache units, and pipeline units capable of timing optimization, are automatically allocated according to the model file; that is, the hardware resources are called from the neural network hardware execution subsystem 102 and the model is then executed.
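The split into a structure file (the graph of layers) and a data file (the weights and biases) described above can be sketched as follows. The JSON schema, layer names, and parameter keys are purely illustrative assumptions, not the patent's or TVM's actual formats.

```python
import io
import json

import numpy as np

# Structure file: the graph of layers (the computing units), as JSON.
structure = {
    "layers": [
        {"name": "conv1", "type": "conv", "kernel": [3, 3], "out_channels": 8},
        {"name": "relu1", "type": "relu"},
        {"name": "pool1", "type": "maxpool", "window": [2, 2]},
        {"name": "fc1", "type": "dense", "units": 10},
    ]
}
structure_file = json.dumps(structure, indent=2)

# Data file: the parameters (weights, biases) of each parameterised layer.
rng = np.random.default_rng(0)
params = {
    "conv1/weight": rng.normal(size=(8, 1, 3, 3)),
    "conv1/bias": np.zeros(8),
    "fc1/weight": rng.normal(size=(10, 8)),
    "fc1/bias": np.zeros(10),
}
data_file = io.BytesIO()            # stands in for a file on disk
np.savez(data_file, **params)
data_file.seek(0)

# The deployment side reads both back to drive resource allocation.
graph = json.loads(structure_file)
weights = np.load(data_file)
print([layer["type"] for layer in graph["layers"]])
```

Keeping the structure and the parameter data in separate files is what lets the deployment subsystem allocate hardware resources from the lightweight structure description alone, streaming the bulky weights in afterwards.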
In an embodiment of the present application, an artificial intelligence application development system is provided, comprising a neural network generation subsystem for constructing, training, and verifying a neural network model; a neural network hardware execution subsystem for receiving data input into the neural network model and outputting a result after calculation by the model; and a deployment subsystem for compiling the neural network model generated by the neural network generation subsystem and deploying it to the neural network hardware execution subsystem. A neural network model is constructed and trained through the visual neural network generation subsystem, and the trained model is automatically deployed by the deployment subsystem to the neural network hardware execution subsystem for execution, lowering the barrier to entry for artificial intelligence application development and improving development efficiency.
Further, the neural network generation subsystem 101 is further configured to provide training data for the neural network model and label the training data.
In this embodiment, the neural network generation subsystem 101 may also provide developers with functional modules for creating databases, uploading data, and labelling data, so as to prepare data for subsequent neural network model training; labelled data enables the model to be trained more quickly.
Further, the neural network hardware execution subsystem 102 is implemented based on an FPGA.
Unlike the fixed hardware structures of a GPU or an ASIC, an FPGA is programmable: developers can connect the logic blocks inside the FPGA through programming according to their own needs, implementing the corresponding functions freely and flexibly. Moreover, GPU acceleration adapts the algorithm model to the hardware structure, whereas FPGA acceleration adapts the hardware structure to the algorithm model, that is, a corresponding hardware structure is designed (or called) according to the algorithm model; this acceleration approach can accelerate deep learning neural network algorithm models more effectively. The FPGA also has a better energy-efficiency ratio than the GPU. Although an ASIC outperforms an FPGA in performance and power consumption, its design and manufacture require many rounds of verification and physical design, resulting in a long development cycle; furthermore, an ASIC is dedicated hardware designed for a certain class of applications, and its structure cannot be changed once produced. Deep learning neural network algorithms, however, are currently in a stage of rapid development, and for application scenarios that are widely used but whose algorithms are not yet mature, it is very difficult to design a high-performance general-purpose ASIC that suits them all. The FPGA is therefore better suited to accelerating deep learning neural network algorithm models at this stage, and the neural network hardware execution subsystem 102 in this embodiment may use an FPGA to accelerate the execution of the deep learning neural network.
Further, as shown in FIG. 2, FIG. 2 illustrates a schematic structural diagram of an embodiment of a deployment subsystem 103 of the artificial intelligence application development system according to the present application. The deployment subsystem 103 includes:
a compiling module 1031, configured to analyze the neural network model and generate a structure file and a data file of the model;
an operation module 1032 for allocating hardware computing resources according to the structure file and the data file of the model;
a driving module 1033, configured to invoke corresponding hardware computing resources according to the allocation result of the running module, where the hardware computing resources include the neural network hardware execution subsystem implemented based on the FPGA.
In this embodiment, the compiling module 1031 may invoke a neural network compiler (e.g., TVM) to analyze the neural network model generated by the neural network generation subsystem 101, extract the model's network structure and weight data, and store them in files, yielding the model's structure file and data file; the file format may be json, xml, or the like. The operation module 1032 can automatically allocate hardware computing resources, including computing units, cache units, and pipeline units capable of timing optimization, according to the structure file and data file. The driving module 1033 then calls the corresponding hardware computing resources provided by the FPGA-based neural network hardware execution subsystem 102 to perform the computation and output the result. The neural network's output is a feature value, which can be understood as an abstract representation of the input picture or data; this feature value is then converted into meaningful output through further calculation, such as picture categories with their probabilities in a classification problem, or the target categories, probabilities, and coordinates contained in a picture in a detection problem. Through the three modules of the deployment subsystem 103, automatic compilation of the neural network model, flexible scheduling of hardware computing resources, and performance optimization can be achieved.
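One way to picture the hand-off between the three modules (compile, then allocate, then drive) is the toy pipeline below. Every class, method, and resource name is hypothetical, and the "hardware" is simulated in plain Python, so this only mirrors the data flow between the modules, not the FPGA implementation.

```python
class CompilerModule:
    """Stand-in for module 1031: parse the model into structure + data."""

    def compile(self, model):
        structure = [{"name": n, "op": op} for n, op in model]
        # Only parameterised layers (not e.g. ReLU) carry weight data.
        data = {n: f"weights({n})" for n, op in model if op != "relu"}
        return structure, data


class RunModule:
    """Stand-in for module 1032: map each node to hardware resources."""

    def allocate(self, structure, data):
        return {node["name"]: {"unit": f"cu{i}", "cache": f"buf{i}"}
                for i, node in enumerate(structure)}


class DriverModule:
    """Stand-in for module 1033: invoke the allocated resources."""

    def invoke(self, allocation, inputs):
        # A real driver would dispatch to the FPGA; here we just record
        # which compute unit each node would run on.
        trace = [f"{name}@{res['unit']}" for name, res in allocation.items()]
        return {"result": inputs, "trace": trace}


model = [("conv1", "conv"), ("relu1", "relu"), ("fc1", "dense")]
structure, data = CompilerModule().compile(model)
allocation = RunModule().allocate(structure, data)
out = DriverModule().invoke(allocation, inputs=[1.0, 2.0])
print(out["trace"])  # → ['conv1@cu0', 'relu1@cu1', 'fc1@cu2']
```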
Further, the allocating, by the operation module, of hardware computing resources according to the structure file and the data file of the model includes:
acquiring the information of each computing node according to the structure file and the data file of the model;
allocating hardware computing resources to each computing node based on the information for each computing node.
In this embodiment, the structure of the neural network model takes layers as computing units, including but not limited to an input layer, convolutional layers, pooling layers, ReLU (activation function) layers, and fully connected layers; different types and numbers of layers are combined into neural network structures with different functions. Besides receiving the data stream output by the previous layer, each layer has a number of parameters, including but not limited to weights and biases. The model's network structure and parameter data can be stored in files and read out as node information when each node of each layer is computed. According to this node information, the hardware resources required by the node can be allocated dynamically: for example, based on the node's computing function and data type, corresponding computing and storage units are allocated to perform the computation, and the result is cached in a register cache unit so that the next layer can read it quickly, saving data-copying time and accelerating the computation; the computation can also be timing-optimized through a pipeline unit. The computational efficiency of the neural network can thus be improved.
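The node-by-node allocation just described might look like the following sketch, where node information read from the structure and data files drives the assignment of compute units and cache slots. The sizes, on-chip cache budget, and naming scheme are invented for illustration and do not come from the patent.

```python
def node_info(structure, params):
    """Read each node's computing function and parameter size from the
    model's structure file and data file."""
    for node in structure:
        yield {"name": node["name"],
               "op": node["op"],
               "param_bytes": params.get(node["name"], 0)}

def allocate(nodes, cache_bytes=4096):
    """Assign a compute unit per node; cache each output so the next
    layer can read it without copying."""
    plan, used = [], 0
    for i, n in enumerate(nodes):
        used += n["param_bytes"]
        plan.append({"node": n["name"],
                     "compute_unit": f"cu{i}",
                     "cache_slot": f"slot{i}",      # next layer reads here
                     "fits_on_chip": used <= cache_bytes})
    return plan

structure = [{"name": "conv1", "op": "conv"},
             {"name": "relu1", "op": "relu"},
             {"name": "fc1", "op": "dense"}]
params = {"conv1": 1152, "fc1": 4000}   # parameter bytes per node

plan = allocate(node_info(structure, params))
print([(p["node"], p["fits_on_chip"]) for p in plan])
# → [('conv1', True), ('relu1', True), ('fc1', False)]
```

In this toy budget the fully connected layer's parameters overflow the on-chip cache, which is exactly the kind of decision the operation module would make per node before the driver invokes the hardware.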
Further, as shown in fig. 3, fig. 3 is a schematic structural diagram of an embodiment of the neural network hardware execution subsystem 102 of the artificial intelligence application development system according to the present application. The FPGA-based neural network hardware execution subsystem 102 includes an FPGA core module 1021 and an expansion module 1022. The FPGA core module 1021 comprises a core chip 10211, a memory chip 10212, a SAMTEC interface 10214 and a 6-pin JTAG interface 10213; the expansion module 1022 includes a network interface 10222, a 3-pin UART port 10223, a 40-pin GPIO port 10224, and a SAMTEC interface 10221, and the FPGA core module 1021 and the expansion module 1022 are connected and communicate through the SAMTEC interface 10214 of the core module 1021 and the SAMTEC interface 10221 of the expansion module 1022.
In this embodiment, the core chip provides computing resources and implements the neural network computation; an Intel Arria 10 SoC FPGA may be used as the core chip. The memory chip stores parameter data such as the neural network's weights and intermediate calculation data. The JTAG interface may be used for data transfer between the core module 1021 and other devices, for example to download an initial program to the FPGA. The network interface of the expansion module 1022 is used for communicating with an upper computer, downloading programs, transmitting data, and so on; for example, it may be used to acquire data input into the neural network model over the network, and it may be an RJ45 Ethernet interface (the RJ45 may be replaced by a USB-C or USB port, giving the expansion interface high versatility). The UART port is used for debugging the expansion module 1022 and printing related debugging information. The GPIO port can provide additional I/O interfaces for remote serial communication or control; for example, a camera or a microphone may be controlled through the GPIO port. The core module 1021 and the expansion module 1022 are connected and communicate through the SAMTEC interfaces, so that the core module 1021 can call the resources of the expansion module 1022 to implement the corresponding functions.
It will be understood by those skilled in the art that all or part of the subsystems in the system of the embodiments described above may be implemented by a computer program instructing the associated hardware. The computer program may be stored in a computer-readable storage medium and, when executed, implements the functions of the embodiments comprising the subsystems described above. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a Random Access Memory (RAM).
It should be understood that, although the subsystems in the structural diagrams of the drawings are shown sequentially as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated otherwise herein, their execution is not strictly sequential and may proceed in other orders. Moreover, at least some of the subsystems in the schematic diagrams may comprise multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and not necessarily in sequence; they may be performed in turn or alternately with other steps, or with at least part of the sub-steps or stages of other steps.
In order to solve the technical problem, an embodiment of the present application further provides a computer device. Referring to fig. 4, fig. 4 is a block diagram of a basic structure of a computer device according to the present embodiment.
The computer device 2 comprises a memory 21, a processor 22, and a network interface 23, communicatively connected to each other through a system bus. It is noted that only a computer device 2 with components 21-23 is shown, but it should be understood that not all of the shown components are required; more or fewer components may be implemented instead. As will be understood by those skilled in the art, the computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, microprocessors, Application-Specific Integrated Circuits (ASICs), Field-Programmable Gate Arrays (FPGAs), Digital Signal Processors (DSPs), embedded devices, and the like.
The computer device can be a desktop computer, a notebook, a palmtop computer, a cloud server, or another computing device. It can interact with a user through a keyboard, mouse, remote controller, touch panel, voice control device, or the like.
The memory 21 includes at least one type of readable storage medium including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 21 may be an internal storage unit of the computer device 2, such as a hard disk or a memory of the computer device 2. In other embodiments, the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the computer device 2. Of course, the memory 21 may also comprise both an internal storage unit of the computer device 2 and an external storage device thereof. In this embodiment, the memory 21 is generally used for storing an operating system installed in the computer device 2 and various types of application software, such as program codes of an artificial intelligence application development system. Further, the memory 21 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 22 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 22 is typically used to control the overall operation of the computer device 2. In this embodiment, the processor 22 is configured to run the program code stored in the memory 21 or process data, for example, run the program code of the artificial intelligence application development system.
The network interface 23 may comprise a wireless network interface or a wired network interface, and the network interface 23 is generally used for establishing communication connection between the computer device 2 and other electronic devices.
The present application further provides another embodiment: a computer-readable storage medium storing a program of the artificial intelligence application development system, the program being executable by at least one processor so that the at least one processor executes the program's steps and realizes the corresponding functions.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or alternatively by hardware, though in many cases the former is the better implementation. Based on this understanding, the technical solutions of the present application may be embodied as a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disk) and including instructions that enable a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.
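As a purely illustrative sketch (not part of the claimed subject matter), the software flow described in this application — a generation subsystem producing a model, a deployment subsystem compiling it into a structure file and a data file, and a hardware execution subsystem computing a result from input data — might look as follows. All function names and the dictionary-based model format are hypothetical assumptions for illustration only.

```python
# Hypothetical sketch of the three subsystems; the patent does not
# prescribe any API, model format, or allocation policy.

def generate_model():
    # Stand-in for constructing/training/validating a model: a single
    # fully-connected layer y = W @ x represented as plain Python lists.
    return {"structure": [{"op": "matmul", "node": "fc1"}],
            "data": {"fc1": [[1.0, 2.0], [3.0, 4.0]]}}

def compile_model(model):
    # The deployment subsystem parses the model into a structure file
    # (topology) and a data file (weights).
    return model["structure"], model["data"]

def execute(structure, data, x):
    # Stand-in for the hardware execution subsystem: walk the structure
    # file and apply each node's weights to the input data.
    for node in structure:
        if node["op"] == "matmul":
            w = data[node["node"]]
            x = [sum(wij * xj for wij, xj in zip(row, x)) for row in w]
    return x

structure, weights = compile_model(generate_model())
print(execute(structure, weights, [1.0, 1.0]))  # -> [3.0, 7.0]
```

In this sketch the "compiled" artifacts are plain Python objects; in the described system they would be files consumed by the runtime and driver modules on the FPGA side.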
It should be understood that the above-described embodiments are merely illustrative of some, but not all, embodiments of the present application, and that the appended drawings illustrate preferred embodiments without limiting the scope of the application. This application may be embodied in many different forms; these embodiments are provided so that the disclosure of the application will be thorough. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their features. All equivalent structures made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, fall within the protection scope of the present application.

Claims (10)

  1. An artificial intelligence application development system, comprising:
    a neural network generation subsystem configured to construct, train, and validate a neural network model;
    a neural network hardware execution subsystem configured to receive data input into the neural network model and to output a result after computation by the neural network model; and
    a deployment subsystem configured to compile the neural network model generated by the neural network generation subsystem and then deploy the compiled model to the neural network hardware execution subsystem.
  2. The system of claim 1, wherein the neural network generation subsystem is further configured to provide training data for the neural network model and label the training data.
  3. The system of claim 1, wherein the neural network hardware execution subsystem is implemented based on an FPGA.
  4. The system of claim 3, wherein the deployment subsystem comprises:
    a compiling module configured to parse the neural network model and generate a structure file and a data file of the model;
    a runtime module configured to allocate hardware computing resources according to the structure file and the data file of the model; and
    a driving module configured to invoke corresponding hardware computing resources according to the allocation result of the runtime module, wherein the hardware computing resources comprise the FPGA-based neural network hardware execution subsystem.
  5. The system of claim 4, wherein the runtime module allocating hardware computing resources according to the structure file and the data file of the model comprises:
    acquiring information of each computing node according to the structure file and the data file of the model; and
    allocating hardware computing resources to each computing node based on the information of each computing node.
  6. The system of claim 5, wherein the FPGA-based neural network hardware execution subsystem includes an FPGA core module and an expansion module.
  7. The system of claim 6, wherein the FPGA core module includes a core chip, a memory chip, a SAMTEC interface, and a JTAG interface.
  8. The system of claim 7, wherein the expansion module includes a network interface, a UART port, a GPIO port, and a SAMTEC interface, and wherein the FPGA core module and the expansion module are connected and communicate through the SAMTEC interface.
  9. A computer device, comprising a memory and a processor, the memory having a computer program stored therein, wherein the processor, when executing the computer program, implements the functions of the artificial intelligence application development system according to any one of claims 1 to 8.
  10. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the functions of the artificial intelligence application development system according to any one of claims 1 to 8.
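The allocation step recited in claim 5 — deriving information for each computing node from the structure file and the data file, then assigning hardware computing resources to each node — could be sketched as follows. This is an illustrative assumption only: the node-cost measure, the greedy placement policy, and all names are hypothetical and not the claimed method.

```python
# Hypothetical sketch of per-node resource allocation (cf. claim 5).
# The structure file lists compute nodes; the data file holds each
# node's parameters. A simple greedy policy places each node on the
# hardware unit with the most remaining capacity.

def node_info(structure_file, data_file):
    # Step 1: acquire information of each computing node, here a
    # crude "cost" proportional to the node's parameter count.
    return [{"name": n["name"], "cost": len(data_file.get(n["name"], []))}
            for n in structure_file]

def allocate(nodes, units):
    # Step 2: allocate hardware computing resources to each node.
    free = dict(units)                    # unit name -> remaining capacity
    plan = {}
    for node in sorted(nodes, key=lambda n: -n["cost"]):  # largest first
        unit = max(free, key=free.get)    # unit with most free capacity
        free[unit] -= node["cost"]
        plan[node["name"]] = unit
    return plan

structure = [{"name": "conv1"}, {"name": "fc1"}]
data = {"conv1": [0] * 6, "fc1": [0] * 2}
print(allocate(node_info(structure, data), {"fpga0": 8, "fpga1": 4}))
# -> {'conv1': 'fpga0', 'fc1': 'fpga1'}
```

A real runtime module would derive far richer node information (operator type, tensor shapes, bandwidth needs) and account for data-flow dependencies between nodes; the greedy policy above merely illustrates the two recited steps.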
CN201980066985.9A 2019-08-21 2019-08-21 Artificial intelligence application development system, computer device and storage medium Pending CN113168552A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/101684 WO2021031137A1 (en) 2019-08-21 2019-08-21 Artificial intelligence application development system, computer device and storage medium

Publications (1)

Publication Number Publication Date
CN113168552A (en) 2021-07-23

Family

ID=74659981

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980066985.9A Pending CN113168552A (en) 2019-08-21 2019-08-21 Artificial intelligence application development system, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN113168552A (en)
WO (1) WO2021031137A1 (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108762768A * 2018-05-17 2018-11-06 烽火通信科技股份有限公司 Intelligent network service deployment method and system
CN108881446A * 2018-06-22 2018-11-23 深源恒际科技有限公司 Deep-learning-based artificial intelligence platform system
CN108921289A * 2018-06-20 2018-11-30 郑州云海信息技术有限公司 FPGA heterogeneous acceleration method, apparatus and system
CN109564505A * 2016-01-27 2019-04-02 伯尼塞艾公司 Artificial intelligence engine configured to work with a pedagogical programming language for training trained artificial intelligence models

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106022472A (en) * 2016-05-23 2016-10-12 复旦大学 Embedded deep learning processor
CN107992299B (en) * 2017-11-27 2021-08-10 郑州云海信息技术有限公司 Neural network hyper-parameter extraction and conversion method, system, device and storage medium
CN108920177A * 2018-06-28 2018-11-30 郑州云海信息技术有限公司 Method for mapping a deep learning model configuration file to an FPGA configuration file


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113900734A (en) * 2021-10-11 2022-01-07 北京百度网讯科技有限公司 Application program file configuration method, device, equipment and storage medium
CN113900734B (en) * 2021-10-11 2023-09-22 北京百度网讯科技有限公司 Application program file configuration method, device, equipment and storage medium
CN114282641A (en) * 2022-03-07 2022-04-05 麒麟软件有限公司 Construction method of universal heterogeneous acceleration framework
CN114282641B (en) * 2022-03-07 2022-07-05 麒麟软件有限公司 Construction method of universal heterogeneous acceleration framework
WO2024046458A1 (en) * 2022-09-02 2024-03-07 深圳忆海原识科技有限公司 Hierarchical system, operation method and apparatus, and electronic device and storage medium

Also Published As

Publication number Publication date
WO2021031137A1 (en) 2021-02-25

Similar Documents

Publication Publication Date Title
CN112433819A (en) Heterogeneous cluster scheduling simulation method and device, computer equipment and storage medium
CN113168552A (en) Artificial intelligence application development system, computer device and storage medium
CN111090464B (en) Data stream processing method and related equipment
CN102207904B (en) Device and method for emulating a reconfigurable processor
US11568232B2 (en) Deep learning FPGA converter
CN114004352B (en) Simulation implementation method, neural network compiler and computer readable storage medium
Oliveira et al. Advanced stochastic petri net modeling with the mercury scripting language
CN114548384A (en) Method and device for constructing a spiking neural network model under abstract resource constraints
CN115269204B (en) Memory optimization method and device for neural network compilation
CN110750312A (en) Hardware resource configuration method and device, cloud side equipment and storage medium
CN113535399A (en) NFV resource scheduling method, device and system
CN109492703A (en) Gait recognition method, system and terminal device
Agha Actors programming for the mobile cloud
CN116841564B (en) Data processing method, device, equipment and computer readable storage medium
WO2020169182A1 (en) Method and apparatus for allocating tasks
CN111831285B (en) Code conversion method, system and application for memory computing platform
KR102132450B1 (en) Method and apparatus for testing javascript interpretation engine using machine learning
US10922208B2 (en) Observer for simulation test and verification
CN112416301A (en) Deep learning model development method and device and computer readable storage medium
WO2021068247A1 (en) Neural network scheduling method and apparatus, computer device, and readable storage medium
WO2023017884A1 (en) Method and system for predicting latency of deep learning model by device
CN115600664A (en) Operator processing method, electronic device and storage medium
CN113272813B (en) Custom data stream hardware simulation method, device, equipment and storage medium
CN114428761A (en) Neural network warping method and device based on FPGA
CN114817124A (en) Inter-multi-core microcontroller mapping method, device and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination