WO2019136755A1 - Method and system for optimizing design model of artificial intelligence processing device, storage medium, and terminal - Google Patents

Method and system for optimizing design model of artificial intelligence processing device, storage medium, and terminal

Info

Publication number
WO2019136755A1
WO2019136755A1 PCT/CN2018/072668 CN2018072668W WO2019136755A1 WO 2019136755 A1 WO2019136755 A1 WO 2019136755A1 CN 2018072668 W CN2018072668 W CN 2018072668W WO 2019136755 A1 WO2019136755 A1 WO 2019136755A1
Authority
WO
WIPO (PCT)
Prior art keywords
artificial intelligence
processing device
network model
intelligence processing
data
Prior art date
Application number
PCT/CN2018/072668
Other languages
French (fr)
Chinese (zh)
Inventor
肖梦秋
Original Assignee
深圳鲲云信息科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳鲲云信息科技有限公司
Priority to PCT/CN2018/072668
Publication of WO2019136755A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Definitions

  • The present invention relates to the technical field of software processing, and in particular to a method, system, storage medium, and terminal for optimizing the design model of an artificial intelligence processing device.
  • Deep learning stems from the study of artificial neural networks.
  • A multilayer perceptron with multiple hidden layers is a deep learning structure. Deep learning combines low-level features into more abstract high-level representations of attribute categories or features in order to discover distributed feature representations of data.
  • Deep learning is a machine learning method based on representation learning of data. An observation (e.g., an image) can be represented in many ways, such as a vector of per-pixel intensity values, or more abstractly as a series of edges, regions of a particular shape, and the like. Some representations make it easier to learn tasks from examples (e.g., face recognition or facial expression recognition).
  • The advantage of deep learning is that it replaces hand-crafted features with efficient algorithms for unsupervised or semi-supervised feature learning and hierarchical feature extraction.
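  • Like machine learning methods in general, deep learning methods are divided into supervised and unsupervised learning, and the models built under different learning frameworks differ considerably. For example, the convolutional neural network (CNN) is a deep model trained with supervised learning, whereas the deep belief net (DBN) is a model trained with unsupervised learning.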
  • CNNs have become a research hotspot in many scientific fields, especially pattern classification. Because the network avoids complicated pre-processing of images and can take the original image directly as input, it has been widely applied.
  • In general, the basic structure of a CNN includes two kinds of layers. The first is the feature extraction layer: the input of each neuron is connected to a local receptive field of the previous layer, and the local feature is extracted. Once the local feature is extracted, its positional relationship to other features is also determined. The second is the feature mapping layer: each computing layer of the network consists of multiple feature maps, each feature map is a plane, and all neurons on a plane have equal weights.
  • The feature mapping structure uses a sigmoid function with a small influence-function kernel as the activation function of the convolutional network, which makes the feature maps shift-invariant. In addition, because the neurons on one feature map share weights, the number of free parameters of the network is reduced.
  • Each convolutional layer in the convolutional neural network is followed by a computing layer that performs local averaging and secondary extraction. This characteristic two-stage feature extraction structure reduces the feature resolution.
  • CNNs are mainly used to recognize two-dimensional patterns that are invariant to shift, scaling, and other forms of distortion. Because the feature detection layer of a CNN learns from training data, explicit feature extraction is avoided when a CNN is used; features are instead learned implicitly from the training data. Furthermore, because the neurons on the same feature map have the same weights, the network can learn in parallel, which is a major advantage of convolutional networks over networks in which the neurons are connected to one another.
  • With its special structure of local weight sharing, the convolutional neural network has unique advantages in speech recognition and image processing. Its layout is closer to that of an actual biological neural network, and weight sharing reduces the complexity of the network. In particular, an image with a multidimensional input vector can be fed directly into the network, which avoids the complexity of data reconstruction during feature extraction and classification.
  • An object of the present invention is to provide a method, system, storage medium, and terminal for optimizing the design model of an artificial intelligence processing device, which optimize a deep learning algorithm so that it can run on the artificial intelligence processing device.
  • The present invention provides an artificial intelligence processing device design model optimization method, which includes the following steps: freezing (solidifying) the deep learning network model data based on the recognition accuracy of the artificial intelligence processing device; quantizing the frozen deep learning network model data based on the recognition accuracy of the artificial intelligence processing device; and generating a deep learning data graph from the frozen deep learning network model data and the quantized deep learning network model data.
  • In an embodiment of the present invention, the method further includes evaluating the deep learning network model according to the frozen deep learning network model data and the quantized deep learning network model data.
  • In an embodiment of the present invention, the deep learning network model data is 32-bit fixed-point data or 32-bit floating-point data.
  • In an embodiment of the present invention, the deep learning network model is a model trained with TensorFlow.
  • Correspondingly, the present invention provides an artificial intelligence processing device design model optimization system, including a freezing (solidifying) module, a quantization module, and a generation module;
  • the freezing module is configured to freeze the deep learning network model data based on the recognition accuracy of the artificial intelligence processing device;
  • the quantization module is configured to quantize the frozen deep learning network model data based on the recognition accuracy of the artificial intelligence processing device;
  • the generation module is configured to generate a deep learning data graph from the frozen deep learning network model data and the quantized deep learning network model data.
  • In an embodiment of the present invention, the system further includes an evaluation module configured to evaluate the deep learning network model according to the frozen deep learning network model data and the quantized deep learning network model data.
  • In an embodiment of the present invention, the deep learning network model data is 32-bit fixed-point data or 32-bit floating-point data.
  • In an embodiment of the present invention, the deep learning network model is a model trained with TensorFlow.
  • The present invention provides a storage medium storing a computer program that, when executed by a processor, implements the above artificial intelligence processing device design model optimization method.
  • Finally, the present invention provides a terminal, including a processor and a memory;
  • the memory is configured to store a computer program;
  • the processor is configured to execute the computer program stored in the memory so that the terminal performs the above artificial intelligence processing device design model optimization method.
  • As described above, the artificial intelligence processing device design model optimization method, system, storage medium, and terminal of the present invention have the following beneficial effects: (1) the deep learning algorithm is optimized so that it can run on the artificial intelligence processing device; (2) the optimization is efficient and highly practical.
  • FIG. 1 is a flowchart of the artificial intelligence processing device design model optimization method of the present invention in an embodiment;
  • FIG. 2 is a schematic structural diagram of the artificial intelligence processing device design model optimization system of the present invention in an embodiment;
  • FIG. 3 is a schematic structural diagram of the terminal of the present invention in an embodiment.
  • The artificial intelligence processing device design model optimization method, system, storage medium, and terminal of the present invention optimize the deep learning algorithm so that it can run on the artificial intelligence processing device; the optimization is efficient and highly practical.
  • As shown in FIG. 1, in an embodiment, the artificial intelligence processing device design model optimization method of the present invention includes the following steps:
  • Step S1: freeze (solidify) the deep learning network model data based on the recognition accuracy of the artificial intelligence processing device.
  • In an embodiment of the present invention, the deep learning network model data is 32-bit fixed-point data or 32-bit floating-point data. The data width can also be extended to suit different needs.
  • In an embodiment of the present invention, the artificial intelligence processing device includes a programmable logic device, such as an FPGA.
  • Specifically, to adapt the deep learning network model data to the recognition accuracy of the artificial intelligence processing device, the precision of the deep learning network model data must be compressed. Therefore, a freezing operation is first performed on the deep learning network model data.
  • Freezing (also rendered as curing or solidifying) fixes the graph structure of the deep learning network model together with the weights of the model.
  • Step S2: quantize the frozen deep learning network model data based on the recognition accuracy of the artificial intelligence processing device.
  • Quantization is the process of approximating the continuous values of a signal (or a large number of possible discrete values) by a finite number (or a smaller number) of discrete values. Quantization is mainly used in the conversion from continuous signals to digital signals: the continuous signal is sampled into a discrete signal, and the discrete signal is quantized into a digital signal. Note that a discrete signal usually does not need to be quantized, but it may not be discrete in its range of values, in which case it still needs to be quantized.
  • The present invention quantizes the frozen deep learning network model data using a chosen quantization algorithm.
  • Quantization is mature prior art and is therefore not described further here.
  • Step S3: generate a deep learning data graph from the frozen deep learning network model data and the quantized deep learning network model data.
  • Specifically, after freezing and quantization, the deep learning data graph is generated from the frozen deep learning network model data and the quantized deep learning network model data.
  • For example, a TensorFlow graph is generated.
  • TensorFlow is Google's second-generation artificial intelligence learning system, developed on the basis of DistBelief. Its name derives from its operating principle.
  • Tensor means an N-dimensional array.
  • Flow means computation based on a data flow graph; TensorFlow refers to tensors flowing from one end of the data flow graph to the other as the computation proceeds.
  • TensorFlow is a system that feeds complex data structures into artificial neural networks for analysis and processing.
  • In an embodiment of the present invention, the artificial intelligence processing device design model optimization method further includes evaluating the deep learning network model according to the frozen deep learning network model data and the quantized deep learning network model data.
  • Specifically, the deep learning network model is evaluated to determine whether it is adapted to the artificial intelligence processing device; if it is not, this can be corrected by adjusting the freezing and/or quantization algorithm.
  • The artificial intelligence processing device design model optimization system of the present invention includes a freezing (solidifying) module 21, a quantization module 22, and a generation module 23.
  • The freezing module 21 is configured to freeze the deep learning network model data based on the recognition accuracy of the artificial intelligence processing device.
  • In an embodiment of the present invention, the deep learning network model data is 32-bit fixed-point data or 32-bit floating-point data. The data width can also be extended to suit different needs.
  • In an embodiment of the present invention, the artificial intelligence processing device includes a programmable logic device, such as an FPGA.
  • Specifically, to adapt the deep learning network model data to the recognition accuracy of the artificial intelligence processing device, the precision of the deep learning network model data must be compressed. Therefore, a freezing operation is first performed on the deep learning network model data.
  • Freezing (also rendered as curing or solidifying) fixes the graph structure of the deep learning network model together with the weights of the model.
  • The quantization module 22 is connected to the freezing module 21 and is configured to quantize the frozen deep learning network model data based on the recognition accuracy of the artificial intelligence processing device.
  • Quantization is the process of approximating the continuous values of a signal (or a large number of possible discrete values) by a finite number (or a smaller number) of discrete values. Quantization is mainly used in the conversion from continuous signals to digital signals: the continuous signal is sampled into a discrete signal, and the discrete signal is quantized into a digital signal. Note that a discrete signal usually does not need to be quantized, but it may not be discrete in its range of values, in which case it still needs to be quantized.
  • The present invention quantizes the frozen deep learning network model data using a chosen quantization algorithm.
  • Quantization is mature prior art and is therefore not described further here.
  • The generation module 23 is connected to the freezing module 21 and the quantization module 22 and is configured to generate a deep learning data graph from the frozen deep learning network model data and the quantized deep learning network model data.
  • Specifically, after freezing and quantization, the deep learning data graph is generated from the frozen deep learning network model data and the quantized deep learning network model data.
  • For example, a TensorFlow graph is generated.
  • TensorFlow is Google's second-generation artificial intelligence learning system, developed on the basis of DistBelief. Its name derives from its operating principle.
  • Tensor means an N-dimensional array.
  • Flow means computation based on a data flow graph; TensorFlow refers to tensors flowing from one end of the data flow graph to the other as the computation proceeds.
  • TensorFlow is a system that feeds complex data structures into artificial neural networks for analysis and processing.
  • In an embodiment of the present invention, the artificial intelligence processing device design model optimization system further includes an evaluation module configured to evaluate the deep learning network model according to the frozen deep learning network model data and the quantized deep learning network model data.
  • Specifically, the deep learning network model is evaluated to determine whether it is adapted to the artificial intelligence processing device; if it is not, this can be corrected by adjusting the freezing and/or quantization algorithm.
  • It should be noted that the division of the above system into modules is only a division of logical functions; in an actual implementation the modules may be fully or partially integrated into one physical entity, or may be physically separate.
  • These modules may all be implemented as software called by a processing element, or all be implemented in hardware; alternatively, some modules may be implemented as software called by a processing element and others in hardware.
  • For example, module x may be a separately provided processing element, may be integrated in a chip of the above device, or may be stored in the memory of the above device in the form of program code and called by a processing element of the above device to perform the functions of module x.
  • The other modules are implemented similarly.
  • In addition, all or some of these modules may be integrated together or implemented independently.
  • The processing element described here may be an integrated circuit with signal processing capability. In implementation, each step of the above method, or each of the above modules, may be completed by an integrated logic circuit of hardware in the processing element or by instructions in the form of software.
  • For example, the above modules may be one or more integrated circuits configured to implement the above method, such as one or more application-specific integrated circuits (ASICs), one or more microprocessors (digital signal processors, DSPs), or one or more field-programmable gate arrays (FPGAs).
  • The processing element may be a general-purpose processor, such as a central processing unit (CPU) or another processor that can call program code.
  • As another example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SoC).
  • The storage medium of the present invention stores a computer program that, when executed by a processor, implements the above artificial intelligence processing device design model optimization method.
  • The storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
  • The terminal of the present invention includes a processor 31 and a memory 32.
  • The memory 32 is configured to store a computer program.
  • The memory 32 includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
  • The processor 31 is connected to the memory 32 and is configured to execute the computer program stored in the memory 32 so that the terminal performs the above artificial intelligence processing device design model optimization method.
  • The processor 31 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or a field-programmable gate array (FPGA).
  • In summary, the artificial intelligence processing device design model optimization method, system, storage medium, and terminal of the present invention optimize the deep learning algorithm so that it can run on the artificial intelligence processing device; the optimization is efficient and highly practical. Therefore, the present invention effectively overcomes various shortcomings of the prior art and has high industrial utilization value.
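
The quantization and evaluation steps described above can be sketched as follows. The example uses symmetric uniform quantization and a toy linear classifier as a stand-in for recognition accuracy; both are assumptions, since the source leaves the quantization algorithm and the evaluation procedure open. The names `quantize`, `dequantize`, and `choose_bit_width` are hypothetical. `choose_bit_width` tries candidate bit widths from smallest to largest and keeps the first one that meets the target score, which loosely mirrors the idea of adjusting the quantization when the model is not adapted to the device.

```python
import numpy as np

def quantize(weights, num_bits):
    """Symmetric uniform quantization to signed integers of num_bits bits."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int32)
    return q, scale

def dequantize(q, scale):
    """Map the integer levels back to approximate real values."""
    return q.astype(np.float64) * scale

def choose_bit_width(weights, evaluate, target, candidates=(4, 8, 16)):
    """Return (bits, q, scale) for the smallest candidate meeting the target,
    or None if no candidate does."""
    for bits in candidates:
        q, scale = quantize(weights, bits)
        if evaluate(dequantize(q, scale)) >= target:
            return bits, q, scale
    return None

# Toy stand-in for "recognition accuracy": agreement with the labels
# produced by the unquantized weights of a linear classifier.
rng = np.random.default_rng(0)
weights = rng.standard_normal(8)
inputs = rng.standard_normal((200, 8))
labels = inputs @ weights > 0

def accuracy(w):
    return float(np.mean((inputs @ w > 0) == labels))

result = choose_bit_width(weights, accuracy, target=0.99)
```

Each quantized value differs from the original by at most half a quantization step (`scale / 2`), so a wider bit width gives a smaller step and a closer approximation at the cost of more storage.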

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

A method and system for optimizing a design model of an artificial intelligence processing device, a storage medium, and a terminal. The method comprises the following steps: freezing (solidifying) deep learning network model data on the basis of a recognition accuracy of an artificial intelligence processing device (S1); quantizing the frozen deep learning network model data on the basis of the recognition accuracy of the artificial intelligence processing device (S2); and generating a deep learning data graph according to the frozen deep learning network model data and the quantized deep learning network model data (S3). According to the method and system for optimizing the design model of the artificial intelligence processing device, the storage medium, and the terminal, a deep learning algorithm is optimized so as to be run on the artificial intelligence processing device.

Description

Artificial intelligence processing device design model optimization method, system, storage medium, and terminal
Technical field
The present invention relates to the technical field of software processing, and in particular to a method, system, storage medium, and terminal for optimizing the design model of an artificial intelligence processing device.
Background
The concept of deep learning stems from the study of artificial neural networks. A multilayer perceptron with multiple hidden layers is a deep learning structure. Deep learning combines low-level features into more abstract high-level representations of attribute categories or features in order to discover distributed feature representations of data.
Deep learning is a machine learning method based on representation learning of data. An observation (e.g., an image) can be represented in many ways, such as a vector of per-pixel intensity values, or more abstractly as a series of edges, regions of a particular shape, and the like. Some representations make it easier to learn tasks from examples (e.g., face recognition or facial expression recognition). The advantage of deep learning is that it replaces hand-crafted features with efficient algorithms for unsupervised or semi-supervised feature learning and hierarchical feature extraction.
Like machine learning methods in general, deep learning methods are divided into supervised and unsupervised learning, and the models built under different learning frameworks differ considerably. For example, the convolutional neural network (CNN) is a deep model trained with supervised learning, whereas the deep belief net (DBN) is a model trained with unsupervised learning.
At present, CNNs have become a research hotspot in many scientific fields, especially pattern classification. Because the network avoids complicated pre-processing of images and can take the original image directly as input, it has been widely applied. In general, the basic structure of a CNN includes two kinds of layers. The first is the feature extraction layer: the input of each neuron is connected to a local receptive field of the previous layer, and the local feature is extracted. Once the local feature is extracted, its positional relationship to other features is also determined. The second is the feature mapping layer: each computing layer of the network consists of multiple feature maps, each feature map is a plane, and all neurons on a plane have equal weights. The feature mapping structure uses a sigmoid function with a small influence-function kernel as the activation function of the convolutional network, which makes the feature maps shift-invariant. In addition, because the neurons on one feature map share weights, the number of free parameters of the network is reduced. Each convolutional layer in the convolutional neural network is followed by a computing layer that performs local averaging and secondary extraction; this characteristic two-stage feature extraction structure reduces the feature resolution.
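
The weight sharing and the two-stage extraction described above (a convolution followed by local averaging) can be illustrated with a minimal NumPy sketch. It is not part of the claimed method; the image size, kernel size, and the function names `conv2d_valid` and `average_pool` are assumptions chosen for illustration.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (no padding, stride 1)."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow), dtype=np.float64)
    for i in range(oh):
        for j in range(ow):
            # Every output neuron sees only a local receptive field
            # and reuses the same kernel weights.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def average_pool(feature_map, size=2):
    """Local averaging over non-overlapping size x size windows."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    trimmed = feature_map[:h, :w]
    return trimmed.reshape(h // size, size, w // size, size).mean(axis=(1, 3))

rng = np.random.default_rng(0)
image = rng.random((6, 6))
kernel = rng.standard_normal((3, 3))  # 9 weights shared by the whole plane

feature_map = sigmoid(conv2d_valid(image, kernel))  # shape (4, 4)
pooled = average_pool(feature_map)                  # shape (2, 2)

shared_params = kernel.size                       # weights with sharing
unshared_params = feature_map.size * kernel.size  # weights without sharing
```

With one shared 3x3 kernel, the 4x4 feature map needs 9 weights instead of the 144 that would be required if each output neuron had its own weights, and the averaging layer halves the resolution in each dimension.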
CNNs are mainly used to recognize two-dimensional patterns that are invariant to shift, scaling, and other forms of distortion. Because the feature detection layer of a CNN learns from training data, explicit feature extraction is avoided when a CNN is used; features are instead learned implicitly from the training data. Furthermore, because the neurons on the same feature map have the same weights, the network can learn in parallel, which is a major advantage of convolutional networks over networks in which the neurons are connected to one another. With its special structure of local weight sharing, the convolutional neural network has unique advantages in speech recognition and image processing. Its layout is closer to that of an actual biological neural network, and weight sharing reduces the complexity of the network. In particular, an image with a multidimensional input vector can be fed directly into the network, which avoids the complexity of data reconstruction during feature extraction and classification.
Therefore, how to optimize deep learning algorithms so that they can be implemented in hardware has become one of the current hot research topics.
Summary of the invention
In view of the above shortcomings of the prior art, an object of the present invention is to provide a method, system, storage medium, and terminal for optimizing the design model of an artificial intelligence processing device, which optimize a deep learning algorithm so that it can run on the artificial intelligence processing device.
To achieve the above object and other related objects, the present invention provides an artificial intelligence processing device design model optimization method, which includes the following steps: freezing (solidifying) the deep learning network model data based on the recognition accuracy of the artificial intelligence processing device; quantizing the frozen deep learning network model data based on the recognition accuracy of the artificial intelligence processing device; and generating a deep learning data graph from the frozen deep learning network model data and the quantized deep learning network model data.
In an embodiment of the present invention, the method further includes evaluating the deep learning network model according to the frozen deep learning network model data and the quantized deep learning network model data.
In an embodiment of the present invention, the deep learning network model data is 32-bit fixed-point data or 32-bit floating-point data.
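
As an illustration of the two data formats named above, the following sketch converts 32-bit floating-point values to a signed 32-bit fixed-point representation and back. The Q15.16 split (16 fractional bits) and the helper names `float_to_fixed32` and `fixed32_to_float` are assumptions; the source does not specify a fixed-point format.

```python
import numpy as np

FRAC_BITS = 16  # assumed Q15.16 split; not specified in the source

def float_to_fixed32(values, frac_bits=FRAC_BITS):
    """Round real values to signed 32-bit fixed point with frac_bits
    fractional bits, saturating at the int32 range."""
    scaled = np.round(np.asarray(values, dtype=np.float64) * (1 << frac_bits))
    info = np.iinfo(np.int32)
    return np.clip(scaled, info.min, info.max).astype(np.int32)

def fixed32_to_float(fixed, frac_bits=FRAC_BITS):
    """Interpret signed 32-bit fixed-point integers as real values."""
    return fixed.astype(np.float64) / (1 << frac_bits)

weights = np.array([0.5, -1.25, 3.1415927], dtype=np.float32)
fixed = float_to_fixed32(weights)
restored = fixed32_to_float(fixed)
```

With 16 fractional bits the round-trip error of an in-range value is at most 2^-17; choosing more fractional bits trades representable range for resolution.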
In an embodiment of the present invention, the deep learning network model is a model trained with TensorFlow.
Correspondingly, the present invention provides an artificial intelligence processing device design model optimization system, including a freezing (solidifying) module, a quantization module, and a generation module;
the freezing module is configured to freeze the deep learning network model data based on the recognition accuracy of the artificial intelligence processing device;
the quantization module is configured to quantize the frozen deep learning network model data based on the recognition accuracy of the artificial intelligence processing device;
the generation module is configured to generate a deep learning data graph from the frozen deep learning network model data and the quantized deep learning network model data.
In an embodiment of the present invention, the system further includes an evaluation module configured to evaluate the deep learning network model according to the frozen deep learning network model data and the quantized deep learning network model data.
In an embodiment of the present invention, the deep learning network model data is 32-bit fixed-point data or 32-bit floating-point data.
In an embodiment of the present invention, the deep learning network model is a model trained with TensorFlow.
The present invention provides a storage medium storing a computer program that, when executed by a processor, implements the above artificial intelligence processing device design model optimization method.
Finally, the present invention provides a terminal, including a processor and a memory;
the memory is configured to store a computer program;
the processor is configured to execute the computer program stored in the memory so that the terminal performs the above artificial intelligence processing device design model optimization method.
As described above, the artificial intelligence processing device design model optimization method, system, storage medium, and terminal of the present invention have the following beneficial effects:
(1) the deep learning algorithm is optimized so that it can run on the artificial intelligence processing device;
(2) the optimization is efficient and highly practical.
Brief description of the drawings
FIG. 1 is a flowchart of the artificial intelligence processing device design model optimization method of the present invention in an embodiment;
FIG. 2 is a schematic structural diagram of the artificial intelligence processing device design model optimization system of the present invention in an embodiment;
FIG. 3 is a schematic structural diagram of the terminal of the present invention in an embodiment.
Description of reference numerals
21                     Freezing (solidifying) module
22                     Quantization module
23                     Generation module
31                     Processor
32                     Memory
具体实施方式Detailed ways
Embodiments of the present invention are described below by way of specific examples, and those skilled in the art can readily understand other advantages and effects of the present invention from the content disclosed in this specification. The present invention may also be implemented or applied through other, different embodiments, and the details in this specification may be modified or changed in various ways, based on different viewpoints and applications, without departing from the spirit of the present invention.
It should be noted that the drawings provided with these embodiments illustrate the basic concept of the present invention only schematically. The drawings therefore show only the components related to the present invention and are not drawn according to the number, shape, and size of the components in an actual implementation. In an actual implementation, the form, number, and proportion of each component may be changed freely, and the component layout may be more complex.
The method, system, storage medium, and terminal of the present invention for optimizing a design model of an artificial intelligence processing device optimize a deep learning algorithm so that it can run on the artificial intelligence processing device; the optimization is efficient and highly practical.
As shown in FIG. 1, in one embodiment, the method of the present invention for optimizing a design model of an artificial intelligence processing device comprises the following steps:
Step S1: freezing the deep learning network model data based on the recognition accuracy of the artificial intelligence processing device.
In one embodiment of the present invention, the deep learning network model data is 32-bit fixed-point data or 32-bit floating-point data. The data width may of course be extended to suit different requirements.
In one embodiment of the present invention, the artificial intelligence processing device comprises a programmable logic device, such as an FPGA.
Specifically, in order to match the deep learning network model data to the recognition accuracy of the artificial intelligence processing device, the precision of the deep learning network model data needs to be compressed. To that end, a freezing operation is first performed on the deep learning network model data.
Specifically, freezing means binding the graph structure of the deep learning network model and the weights of that model together.
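The freezing operation described above can be illustrated with a minimal sketch. The patent does not specify a data structure or an API for freezing, so the `Node` class, the `freeze` function, and the `checkpoint` dictionary below are hypothetical names chosen for illustration only. The sketch assumes that a model consists of a graph whose variable nodes refer to separately stored weights, and that freezing replaces each variable node with a constant node carrying its trained value.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One operation in a toy computation graph (hypothetical structure)."""
    name: str
    op: str                                   # e.g. "input", "variable", "const", "mul"
    inputs: List[str] = field(default_factory=list)
    value: Optional[List[float]] = None       # only set for "const" nodes


def freeze(graph: List[Node], checkpoint: Dict[str, List[float]]) -> List[Node]:
    """Return a new graph in which every variable node is replaced by a
    constant node holding the trained weight taken from the checkpoint."""
    frozen = []
    for node in graph:
        if node.op == "variable":
            # Bind the stored weight into the graph itself.
            frozen.append(Node(node.name, "const", [], list(checkpoint[node.name])))
        else:
            frozen.append(Node(node.name, node.op, list(node.inputs), node.value))
    return frozen


# Graph structure and weights start out as two separate artifacts.
graph = [
    Node("x", "input"),
    Node("w", "variable"),
    Node("y", "mul", ["x", "w"]),
]
checkpoint = {"w": [0.5, -1.25, 2.0]}

frozen_graph = freeze(graph, checkpoint)
```

After the call, `frozen_graph` is a single self-contained artifact: it no longer depends on `checkpoint`, which is the property the subsequent quantization step relies on.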
Step S2: quantizing the frozen deep learning network model data based on the recognition accuracy of the artificial intelligence processing device.
Specifically, the frozen deep learning network model data requires further quantization processing.
In the field of digital signal processing, quantization is the process of approximating the continuous values of a signal (or a large number of possible discrete values) by a finite number of (or fewer) discrete values. Quantization is mainly applied in the conversion of continuous signals to digital signals. A continuous signal becomes a discrete signal after sampling, and the discrete signal becomes a digital signal after quantization. Note that a discrete signal does not ordinarily need to be quantized; however, a signal may still be non-discrete in its value range, in which case quantization is still required.
Specifically, the present invention quantizes the frozen deep learning network model data using a quantization algorithm. Quantization is mature prior art for those skilled in the art and is therefore not described further here.
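Because the text leaves the quantization algorithm open, the following is only one possible illustration: a uniform affine quantizer that maps floating-point weights onto integer codes of a chosen bit width. The function names `quantize` and `dequantize` and the 8-bit default are assumptions made for this sketch and are not taken from the patent.

```python
def quantize(weights, num_bits=8):
    """Map floats onto integer codes in [0, 2**num_bits - 1] (uniform affine scheme)."""
    qmax = 2 ** num_bits - 1
    lo, hi = min(weights), max(weights)
    # Step size between adjacent codes; guard against a constant tensor.
    scale = (hi - lo) / qmax if hi > lo else 1.0
    codes = [min(qmax, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate float values from the integer codes."""
    return [lo + c * scale for c in codes]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, scale, lo = quantize(weights, num_bits=8)
restored = dequantize(codes, scale, lo)

# Worst-case rounding error introduced by quantization.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

With round-to-nearest, `max_error` is bounded by half the step size `scale`, so reducing `num_bits` trades storage and arithmetic width against precision.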
Step S3: generating a deep learning data graph from the frozen deep learning network model data and the quantized deep learning network model data.
Specifically, in order to generate data that matches the artificial intelligence processing device, a deep learning data graph must also be generated from the frozen deep learning network model data and the quantized deep learning network model data. For example, when the deep learning network model is a model trained with TensorFlow, a TensorFlow graph is generated. TensorFlow is a second-generation artificial intelligence learning system developed by Google on the basis of DistBelief, and its name derives from its operating principle. Tensor means an N-dimensional array, and Flow means computation based on a data flow graph; TensorFlow thus denotes the computation in which tensors flow from one end of the flow graph to the other. TensorFlow is a system that feeds complex data structures into artificial neural networks for analysis and processing.
In one embodiment of the present invention, the method for optimizing a design model of an artificial intelligence processing device further comprises evaluating the deep learning network model based on the frozen deep learning network model data and the quantized deep learning network model data. The deep learning network is evaluated to determine whether it matches the artificial intelligence processing device; when the two do not match, the mismatch can be corrected by adjusting the freezing and/or quantization algorithm.
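The evaluate-and-adjust loop described above can be sketched as follows. The patent does not state what is adjusted or how the match is measured, so this sketch makes two assumptions: the adjustable parameter is the quantization bit width, and the match criterion is that classification accuracy on a small validation set drops by no more than `max_drop`. The toy linear classifier and all function names are hypothetical.

```python
def quantize_dequantize(weights, num_bits):
    """Round weights to a uniform grid of 2**num_bits levels and map them back."""
    qmax = 2 ** num_bits - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    return [lo + round((w - lo) / scale) * scale for w in weights]


def predict(weights, sample):
    """Toy linear classifier: class 1 if the weighted sum is positive."""
    return 1 if sum(w * x for w, x in zip(weights, sample)) > 0 else 0


def accuracy(weights, samples, labels):
    hits = sum(predict(weights, s) == y for s, y in zip(samples, labels))
    return hits / len(samples)


def choose_bit_width(weights, samples, labels, max_drop=0.0, candidates=(2, 4, 8, 16)):
    """Return the smallest candidate bit width whose accuracy loss is acceptable,
    or None if no candidate satisfies the criterion."""
    baseline = accuracy(weights, samples, labels)
    for bits in candidates:
        quantized = quantize_dequantize(weights, bits)
        if baseline - accuracy(quantized, samples, labels) <= max_drop:
            return bits
    return None


weights = [0.9, -0.4, 0.05]
samples = [[1, 0, 0], [0, 1, 0], [1, 1, 1], [0, 1, 1]]
labels = [1, 0, 1, 0]

bits = choose_bit_width(weights, samples, labels)
```

Trying narrow widths first reflects the goal of compressing precision as far as the target device's recognition accuracy allows; a real flow would evaluate on the device or on a bit-accurate model of it.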
As shown in FIG. 2, in one embodiment, the system of the present invention for optimizing a design model of an artificial intelligence processing device comprises a freezing module 21, a quantization module 22, and a generation module 23.
The freezing module 21 is configured to freeze the deep learning network model data based on the recognition accuracy of the artificial intelligence processing device.
In one embodiment of the present invention, the deep learning network model data is 32-bit fixed-point data or 32-bit floating-point data. The data width may of course be extended to suit different requirements.
In one embodiment of the present invention, the artificial intelligence processing device comprises a programmable logic device, such as an FPGA.
Specifically, in order to match the deep learning network model data to the recognition accuracy of the artificial intelligence processing device, the precision of the deep learning network model data needs to be compressed. To that end, a freezing operation is first performed on the deep learning network model data.
Specifically, freezing means binding the graph structure of the deep learning network model and the weights of that model together.
The quantization module 22 is connected to the freezing module 21 and is configured to quantize the frozen deep learning network model data based on the recognition accuracy of the artificial intelligence processing device.
Specifically, the frozen deep learning network model data requires further quantization processing.
In the field of digital signal processing, quantization is the process of approximating the continuous values of a signal (or a large number of possible discrete values) by a finite number of (or fewer) discrete values. Quantization is mainly applied in the conversion of continuous signals to digital signals. A continuous signal becomes a discrete signal after sampling, and the discrete signal becomes a digital signal after quantization. Note that a discrete signal does not ordinarily need to be quantized; however, a signal may still be non-discrete in its value range, in which case quantization is still required.
Specifically, the present invention quantizes the frozen deep learning network model data using a quantization algorithm. Quantization is mature prior art for those skilled in the art and is therefore not described further here.
The generation module 23 is connected to the freezing module 21 and the quantization module 22 and is configured to generate a deep learning data graph from the frozen deep learning network model data and the quantized deep learning network model data.
Specifically, in order to generate data that matches the artificial intelligence processing device, a deep learning data graph must also be generated from the frozen deep learning network model data and the quantized deep learning network model data. For example, when the deep learning network model is a model trained with TensorFlow, a TensorFlow graph is generated. TensorFlow is a second-generation artificial intelligence learning system developed by Google on the basis of DistBelief, and its name derives from its operating principle. Tensor means an N-dimensional array, and Flow means computation based on a data flow graph; TensorFlow thus denotes the computation in which tensors flow from one end of the flow graph to the other. TensorFlow is a system that feeds complex data structures into artificial neural networks for analysis and processing.
In one embodiment of the present invention, the system for optimizing a design model of an artificial intelligence processing device further comprises an evaluation module configured to evaluate the deep learning network model based on the frozen deep learning network model data and the quantized deep learning network model data. The deep learning network is evaluated to determine whether it matches the artificial intelligence processing device; when the two do not match, the mismatch can be corrected by adjusting the freezing and/or quantization algorithm.
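The module arrangement of FIG. 2 can be sketched in software as three cooperating classes. This is an illustrative assumption rather than the patented implementation: the class names mirror reference numerals 21 to 23, while the dictionary-based data layout, the uniform affine quantizer, and the `OptimizationSystem` wrapper are hypothetical.

```python
class FreezingModule:
    """Module 21: binds the graph structure and the trained weights together."""

    def run(self, structure, weights):
        return {
            "structure": list(structure),
            "weights": {name: list(values) for name, values in weights.items()},
        }


class QuantizationModule:
    """Module 22: quantizes the frozen weights (uniform affine scheme assumed)."""

    def __init__(self, num_bits=8):
        self.qmax = 2 ** num_bits - 1

    def run(self, frozen):
        quantized = {}
        for name, values in frozen["weights"].items():
            lo, hi = min(values), max(values)
            scale = (hi - lo) / self.qmax if hi > lo else 1.0
            codes = [min(self.qmax, max(0, round((v - lo) / scale))) for v in values]
            quantized[name] = {"codes": codes, "scale": scale, "offset": lo}
        return quantized


class GenerationModule:
    """Module 23: combines the frozen structure and the quantized weights."""

    def run(self, frozen, quantized):
        return {"nodes": frozen["structure"], "tensors": quantized}


class OptimizationSystem:
    """Wires the modules as in FIG. 2: 21 feeds 22, and both feed 23."""

    def __init__(self, num_bits=8):
        self.freezing = FreezingModule()
        self.quantization = QuantizationModule(num_bits)
        self.generation = GenerationModule()

    def optimize(self, structure, weights):
        frozen = self.freezing.run(structure, weights)
        quantized = self.quantization.run(frozen)
        return self.generation.run(frozen, quantized)


system = OptimizationSystem(num_bits=8)
data_graph = system.optimize(["input", "matmul:w", "output"], {"w": [-1.0, 0.0, 1.0]})
```

Keeping the generation step dependent on both earlier outputs matches the connection described for module 23, and an evaluation module could be added as a fourth class that inspects `data_graph` and re-runs `optimize` with a different `num_bits`.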
It should be understood that the division of the above system into modules is merely a division of logical functions; in an actual implementation, the modules may be wholly or partly integrated into one physical entity, or they may be physically separate. The modules may all be implemented as software invoked by a processing element, or all be implemented in hardware; alternatively, some modules may be implemented as software invoked by a processing element and others in hardware. For example, a module x may be a separately provided processing element, or may be integrated into a chip of the above device; it may also be stored in the memory of the above device in the form of program code, with a processing element of the above device invoking and executing the function of module x. The other modules are implemented similarly. In addition, all or some of these modules may be integrated together or implemented independently. The processing element described here may be an integrated circuit with signal processing capability. During implementation, each step of the above method, or each of the above modules, may be carried out by an integrated logic circuit of hardware in the processor element or by instructions in the form of software.
For example, the above modules may be one or more integrated circuits configured to implement the above method, such as one or more application-specific integrated circuits (ASICs), one or more microprocessors such as digital signal processors (DSPs), or one or more field-programmable gate arrays (FPGAs). As another example, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a central processing unit (CPU) or another processor capable of invoking program code. As a further example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SoC).
The storage medium of the present invention stores a computer program which, when executed by a processor, implements the above method for optimizing a design model of an artificial intelligence processing device. Preferably, the storage medium comprises any of various media capable of storing program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
As shown in FIG. 3, in one embodiment, the terminal of the present invention comprises a processor 31 and a memory 32.
The memory 32 is configured to store a computer program.
Preferably, the memory 32 comprises any of various media capable of storing program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
The processor 31 is connected to the memory 32 and is configured to execute the computer program stored in the memory 32, so that the terminal performs the above method for optimizing a design model of an artificial intelligence processing device.
Preferably, the processor 31 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In summary, the method, system, storage medium, and terminal of the present invention for optimizing a design model of an artificial intelligence processing device optimize a deep learning algorithm so that it can run on the artificial intelligence processing device; the optimization is efficient and highly practical. The present invention therefore effectively overcomes various shortcomings of the prior art and has high industrial utility.
The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Anyone skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by those of ordinary skill in the art without departing from the spirit and technical ideas disclosed by the present invention shall still be covered by the claims of the present invention.

Claims (10)

  1. A method for optimizing a design model of an artificial intelligence processing device, characterized by comprising the following steps:
    freezing deep learning network model data based on the recognition accuracy of the artificial intelligence processing device;
    quantizing the frozen deep learning network model data based on the recognition accuracy of the artificial intelligence processing device;
    generating a deep learning data graph from the frozen deep learning network model data and the quantized deep learning network model data.
  2. The method for optimizing a design model of an artificial intelligence processing device according to claim 1, characterized by further comprising evaluating the deep learning network model based on the frozen deep learning network model data and the quantized deep learning network model data.
  3. The method for optimizing a design model of an artificial intelligence processing device according to claim 1, characterized in that the deep learning network model data is 32-bit fixed-point data or 32-bit floating-point data.
  4. The method for optimizing a design model of an artificial intelligence processing device according to claim 1, characterized in that the deep learning network model is a model trained with TensorFlow.
  5. A system for optimizing a design model of an artificial intelligence processing device, characterized by comprising a freezing module, a quantization module, and a generation module;
    the freezing module is configured to freeze deep learning network model data based on the recognition accuracy of the artificial intelligence processing device;
    the quantization module is configured to quantize the frozen deep learning network model data based on the recognition accuracy of the artificial intelligence processing device;
    the generation module is configured to generate a deep learning data graph from the frozen deep learning network model data and the quantized deep learning network model data.
  6. The system for optimizing a design model of an artificial intelligence processing device according to claim 5, characterized by further comprising an evaluation module configured to evaluate the deep learning network model based on the frozen deep learning network model data and the quantized deep learning network model data.
  7. The system for optimizing a design model of an artificial intelligence processing device according to claim 5, characterized in that the deep learning network model data is 32-bit fixed-point data or 32-bit floating-point data.
  8. The system for optimizing a design model of an artificial intelligence processing device according to claim 5, characterized in that the deep learning network model is a model trained with TensorFlow.
  9. A storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method for optimizing a design model of an artificial intelligence processing device according to any one of claims 1 to 4.
  10. A terminal, characterized by comprising a processor and a memory;
    the memory is configured to store a computer program;
    the processor is configured to execute the computer program stored in the memory, so that the terminal performs the method for optimizing a design model of an artificial intelligence processing device according to any one of claims 1 to 4.
PCT/CN2018/072668 2018-01-15 2018-01-15 Method and system for optimizing design model of artificial intelligence processing device, storage medium, and terminal WO2019136755A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/072668 WO2019136755A1 (en) 2018-01-15 2018-01-15 Method and system for optimizing design model of artificial intelligence processing device, storage medium, and terminal

Publications (1)

Publication Number Publication Date
WO2019136755A1 true WO2019136755A1 (en) 2019-07-18

Family

ID=67218210

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/072668 WO2019136755A1 (en) 2018-01-15 2018-01-15 Method and system for optimizing design model of artificial intelligence processing device, storage medium, and terminal

Country Status (1)

Country Link
WO (1) WO2019136755A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106228238A (en) * 2016-07-27 2016-12-14 中国科学技术大学苏州研究院 Method and system for accelerating a deep learning algorithm on a field-programmable gate array platform
CN106485316A (en) * 2016-10-31 2017-03-08 北京百度网讯科技有限公司 Neural network model compression method and device
CN107239829A (en) * 2016-08-12 2017-10-10 北京深鉴科技有限公司 Method for optimizing an artificial neural network
CN107480770A (en) * 2017-07-27 2017-12-15 中国科学院自动化研究所 Neural network quantization and compression method and device with adjustable quantization bit width

Similar Documents

Publication Publication Date Title
WO2019136754A1 (en) Compiling method and system of artificial intelligence processing apparatus, storage medium and terminal
JP7290256B2 (en) Methods for Neural Networks
CN109949255B (en) Image reconstruction method and device
Liu et al. Real-time marine animal images classification by embedded system based on mobilenet and transfer learning
Ding et al. Extreme learning machine with kernel model based on deep learning
WO2019136758A1 (en) Hardware optimization method and system of artificial intelligence processing apparatus, storage medium and terminal
KR102399548B1 (en) Method for neural network and apparatus perform same method
CN110059733A (en) The optimization and fast target detection method, device of convolutional neural networks
US20190354865A1 (en) Variance propagation for quantization
CN105488563A (en) Deep learning oriented sparse self-adaptive neural network, algorithm and implementation device
WO2019136756A1 (en) Design model establishing method and system for artificial intelligent processing device, storage medium, and terminal
CN116188941A (en) Manifold regularized width learning method and system based on relaxation annotation
Qi et al. Research on deep learning expression recognition algorithm based on multi-model fusion
Duggal et al. Shallow SqueezeNext: An Efficient & Shallow DNN
US11429771B2 (en) Hardware-implemented argmax layer
CN113297964A (en) Video target recognition model and method based on deep migration learning
Chan et al. Sinp [n]: A fast convergence activation function for convolutional neural networks
WO2019136755A1 (en) Method and system for optimizing design model of artificial intelligence processing device, storage medium, and terminal
WO2023059723A1 (en) Model compression via quantized sparse principal component analysis
Yuan et al. Research on Image Classification of Lightweight Convolutional Neural Network
Zhao et al. Design and Development of Image Recognition Toolkit Based on Deep Learning
US12039740B2 (en) Vectorized bilinear shift for replacing grid sampling in optical flow estimation
Fang et al. Embedded image recognition system for lightweight convolutional Neural Networks
Wang et al. Image inpainting algorithm based on convolutional neural network structure and improved Deep Image Prior
Wang et al. T–S fuzzy model based multi-branch deep network architecture

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18900448

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 13.11.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18900448

Country of ref document: EP

Kind code of ref document: A1