CN109685201A - Operation method, device and Related product - Google Patents

Operation method, device and related product

Info

Publication number
CN109685201A
Authority
CN
China
Prior art keywords
operator
artificial intelligence
input data
splicing
duplication
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811534068.2A
Other languages
Chinese (zh)
Other versions
CN109685201B (en)
Inventor
Inventor not disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Cambricon Information Technology Co Ltd
Original Assignee
Beijing Zhongke Cambrian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhongke Cambrian Technology Co Ltd
Priority to CN201811534068.2A
Publication of CN109685201A
Application granted
Publication of CN109685201B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 - Physical realisation using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)
  • Advance Control (AREA)

Abstract

This disclosure relates to an operation method, a device, and related products. The product includes a control module, and the control module includes an instruction cache submodule, an instruction processing submodule, and a storage queue submodule. The instruction cache submodule stores computation instructions associated with artificial neural network operations; the instruction processing submodule parses the computation instructions to obtain multiple operation instructions; and the storage queue submodule stores an instruction queue containing the multiple operation instructions or computation instructions to be executed in the order of the queue. Through the above approach, the disclosure can improve the operation efficiency of the related product when performing neural network model operations.

Description

Operation method, device and Related product
Technical field
This disclosure relates to the field of machine learning technology, and in particular to an operation method, a device, and related products.
Background
Neural network algorithms are a class of machine learning algorithms that have recently become popular and have achieved good results in many fields, such as image recognition, speech recognition, and natural language processing. As neural network algorithms have developed, the complexity of the algorithms has also grown, and model sizes have gradually increased in order to improve accuracy. Processing these large models with GPUs and CPUs takes a large amount of computing time and consumes a great deal of power. In this context, new artificial intelligence processors have been proposed to increase the operation speed of neural network models, save operation time, and reduce power consumption. However, algorithm support for these new artificial intelligence processors is still far from sufficient.
Summary of the invention
In view of this, the present disclosure proposes an operation method. The method includes:
obtaining a duplication operator and a multiplication operator from an artificial intelligence operator library, where the duplication operator is used to duplicate input data to obtain duplicated input data, and the multiplication operator is used to perform a multiplication operation on input data;
splicing the duplication operator and the multiplication operator to form a splicing operator,
where the splicing operator is used to perform a corresponding splicing operation on the input data in an artificial intelligence processor, so as to perform an artificial intelligence operation.
In a possible embodiment, splicing the duplication operator and the multiplication operator to form the splicing operator includes:
using the duplication operator as the preceding operator of the multiplication operator.
In a possible embodiment, the splicing operation includes:
when input data is obtained, duplicating the input data using the duplication operator to obtain duplicated input data;
performing a multiplication operation on the input data and the duplicated input data using the multiplication operator to obtain a multiplication result.
In a possible embodiment, the splicing operator is applied at the application layer of the software call hierarchy, the artificial intelligence operator library is located at the operator library layer of the software call hierarchy, and the artificial intelligence processor is located at the chip layer of the software call hierarchy.
According to another aspect of the present disclosure, an operation device is proposed. The device includes:
an obtaining module, configured to obtain a duplication operator and a multiplication operator from an artificial intelligence operator library, where the duplication operator is used to duplicate input data to obtain duplicated input data, and the multiplication operator is used to perform a multiplication operation on input data;
an operation module, connected to the obtaining module and configured to splice the duplication operator and the multiplication operator to form a splicing operator,
where the splicing operator is used to perform a corresponding splicing operation on the input data in an artificial intelligence processor, so as to support artificial intelligence operations.
In a possible embodiment, the operation module includes:
a first operation submodule, configured to use the duplication operator as the preceding operator of the multiplication operator.
In a possible embodiment, the splicing operation includes:
when input data is obtained, duplicating the input data using the duplication operator to obtain duplicated input data;
performing a multiplication operation on the input data and the duplicated input data using the multiplication operator to obtain a multiplication result.
According to another aspect of the present disclosure, an artificial intelligence processing device is proposed. The device includes:
a main processor, configured to execute the above method to obtain a splicing operator, where the splicing operator is used to perform the corresponding operation on the input data;
an artificial intelligence processor, electrically connected to the main processor;
the main processor is further configured to send input data and the splicing operator to the artificial intelligence processor, and the artificial intelligence processor is configured to:
receive the input data and the splicing operator sent by the main processor;
perform an artificial intelligence operation on the input data using the splicing operator to obtain an operation result; and
send the operation result to the main processor.
In a possible embodiment, the main processor further includes a main processor storage space for storing the splicing operator, where
the main processor is further configured to send the input data and the splicing operator stored in the main processor storage space.
In a possible embodiment, the artificial intelligence processor transfers the operation result to the main processor through an I/O interface;
when the device includes multiple artificial intelligence processors, the multiple artificial intelligence processors can be connected through a specific structure and transfer data;
where the multiple artificial intelligence processors are interconnected and transfer data through a PCIe (Peripheral Component Interconnect Express) bus to support larger-scale artificial intelligence operations; the multiple artificial intelligence processors may share the same control system or have their own control systems; they may share memory or have their own memories; and they may be interconnected in any interconnection topology.
In a possible embodiment, the device further includes a storage device, which is connected to the artificial intelligence processor and the main processor respectively and is used to store data of the artificial intelligence processor and the main processor.
According to another aspect of the present disclosure, an artificial intelligence chip is proposed, and the artificial intelligence chip includes the above artificial intelligence processing device.
According to another aspect of the present disclosure, an electronic device is proposed, and the electronic device includes the above artificial intelligence chip.
According to another aspect of the present disclosure, a board card is proposed. The board card includes a memory device, an interface device, a control device, and the above artificial intelligence chip;
where the artificial intelligence chip is connected to the memory device, the control device, and the interface device respectively;
the memory device is used to store data;
the interface device is used to implement data transfer between the chip and an external device; and
the control device is used to monitor the state of the chip.
In a possible embodiment, the memory device includes multiple groups of storage units, each group of storage units is connected to the chip through a bus, and each storage unit is a DDR SDRAM;
the chip includes a DDR controller for controlling the data transfer and data storage of each storage unit; and
the interface device is a standard PCIe interface.
According to another aspect of the present disclosure, a non-volatile computer-readable storage medium is provided, on which computer program instructions are stored, where the computer program instructions, when executed by a processor, implement the above method.
Other features and aspects of the present disclosure will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features, and aspects of the disclosure together with the specification, and serve to explain the principles of the disclosure.
Fig. 1 shows a flow chart of an operation method according to an embodiment of the disclosure.
Fig. 2 shows a schematic diagram of the software call hierarchy according to an embodiment of the disclosure.
Fig. 3 shows a schematic diagram of a splicing operator according to an embodiment of the disclosure.
Fig. 4 shows a block diagram of an operation device according to an embodiment of the disclosure.
Fig. 5 shows a block diagram of an operation device according to an embodiment of the disclosure.
Fig. 6 shows a block diagram of an artificial intelligence processing device according to an embodiment of the disclosure.
Fig. 7 shows a block diagram of an artificial intelligence processing device according to an embodiment of the disclosure.
Fig. 8 shows a block diagram of an artificial intelligence processor according to an embodiment of the disclosure.
Fig. 9 shows a block diagram of the main processing circuit 331 according to an embodiment of the disclosure.
Fig. 10 shows a schematic diagram of an artificial intelligence processor according to an embodiment of the disclosure.
Fig. 11 shows a schematic diagram of an artificial intelligence processor according to an embodiment of the disclosure.
Fig. 12 shows a board card according to an embodiment of the disclosure.
Detailed description of embodiments
Various exemplary embodiments, features, and aspects of the present disclosure are described in detail below with reference to the accompanying drawings. The same reference numerals in the drawings denote elements with the same or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless specifically noted.
The word "exemplary" as used herein means "serving as an example, embodiment, or illustration". Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
In addition, numerous specific details are given in the following detailed description to better illustrate the present disclosure. Those skilled in the art will understand that the present disclosure can also be implemented without certain specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, in order to highlight the gist of the present disclosure.
Referring to Fig. 1, Fig. 1 shows a flow chart of an operation method according to an embodiment of the disclosure.
As shown in Fig. 1, the method includes:
Step S110: obtaining a duplication operator and a multiplication operator from an artificial intelligence operator library, where the duplication operator is used to duplicate input data to obtain duplicated input data, and the multiplication operator is used to perform a multiplication operation on input data;
Step S120: splicing the duplication operator and the multiplication operator to form a splicing operator,
where the splicing operator is used to perform a corresponding splicing operation on the input data in an artificial intelligence processor, so as to perform an artificial intelligence operation.
Through the above method, the disclosure can obtain a duplication operator and a multiplication operator from the artificial intelligence operator library and splice them to form a splicing operator. The splicing operator formed in this way can be used to support new artificial intelligence processors, thereby improving their operation efficiency when performing neural network model operations.
The splicing operator formed by the above method can serve as part of an artificial intelligence operation. When the splicing operator is used in an artificial intelligence processor to perform artificial intelligence operations, applications including but not limited to speech recognition and image recognition can be implemented. Forming a splicing operator by combining a deformation operator (here, the duplication operator) with a basic operator (here, the multiplication operator) allows the artificial intelligence processor to better perform artificial intelligence operations.
In a possible embodiment, an operator is a commonly used algorithm in artificial intelligence and may also be referred to as a layer, an operation, or a node: each neural network corresponds to a network structure, and the nodes in that network structure are operators. The artificial intelligence operator library can be preset and may include multiple basic operators (such as a convolution operator, a fully connected operator, a pooling operator, an activation operator, etc.), and each basic operator can be called by processors including but not limited to a central processing unit (CPU) or a graphics processing unit (GPU) to implement the corresponding basic function. A minimal sketch of such an operator library appears below.
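The patent does not specify how the operator library is organized beyond the list of basic operators above; the following Python sketch is only an illustration of the idea, with hypothetical names, of a library of callable basic operators.

```python
import numpy as np

# Hypothetical artificial intelligence operator library: each basic operator is a
# callable that could be backed by a CPU, GPU, or artificial intelligence processor.
def duplication_op(x):
    """Duplication operator: copy the input data into another memory space."""
    return np.copy(x)

def multiplication_op(a, b):
    """Multiplication operator: element-wise multiplication of two inputs."""
    return a * b

def relu_op(x):
    """Activation operator (ReLU), as one example of a basic operator."""
    return np.maximum(x, 0)

OPERATOR_LIBRARY = {
    "duplication": duplication_op,
    "multiplication": multiplication_op,
    "activation_relu": relu_op,
    # convolution, fully connected, pooling, ... would be registered the same way
}
```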
In a possible embodiment, the input data can have 4 dimensions. When the first input data is image data, the dimensions of the first input data can represent the number of pictures, the number of picture channels, the picture height, and the picture width. In other embodiments, when the first input data is image data but has fewer than 4 dimensions (for example, 3), the dimensions of the first input data can represent any 3 of the number of pictures, the number of picture channels, the picture height, and the picture width.
In a possible embodiment, when the duplication operator is executed, the content of the input data can be copied into another memory space; although the input data and the duplicated input data reside in different memory spaces, their contents are identical.
In a possible embodiment, when the multiplication operator is executed, a multiplication operation can be performed on the two input data to obtain the multiplication result.
In a possible embodiment, splicing the duplication operator and the multiplication operator in step S120 to form the splicing operator may include:
using the duplication operator as the preceding operator of the multiplication operator.
In this way, the disclosure can duplicate the input data so as to satisfy the operating condition of the multiplication operator: when the multiplication operator obtains two input data (the input data plus the duplicated input data), it can perform a multiplication operation on them, thereby implementing the square operation of the input data.
In a possible embodiment, the splicing operation includes:
when input data is obtained, duplicating the input data using the duplication operator to obtain duplicated input data;
performing a multiplication operation on the input data and the duplicated input data using the multiplication operator to obtain a multiplication result. A minimal sketch of this splicing is given below.
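The disclosure does not prescribe an implementation of the splicing itself; the sketch below, reusing the hypothetical operator library above, simply chains the duplication operator as the preceding operator of the multiplication operator to obtain a square splicing operator.

```python
import numpy as np

def make_square_splicing_operator(library):
    """Splice the two operators: the duplication operator is used as the
    preceding operator of the multiplication operator."""
    duplicate = library["duplication"]
    multiply = library["multiplication"]

    def square_splicing_operator(input_data):
        duplicated = duplicate(input_data)        # step 1: duplicate the input data
        return multiply(input_data, duplicated)   # step 2: multiply the input by its copy
    return square_splicing_operator

# Reusing the operator library sketched above (np.copy / np.multiply stand in here):
library = {"duplication": np.copy, "multiplication": np.multiply}
square_op = make_square_splicing_operator(library)
print(square_op(np.array([1.0, 2.0, 3.0])))       # [1. 4. 9.] -- the square of the input data
```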
In a possible embodiment, the splicing operator is applied at the application layer of the software call hierarchy, the artificial intelligence operator library is located at the operator library layer of the software call hierarchy, and the artificial intelligence processor is located at the chip layer of the software call hierarchy.
Referring to Fig. 2, Fig. 2 shows a schematic diagram of the software call hierarchy according to an embodiment of the disclosure.
As shown in Fig. 2, the software call hierarchy, from top to bottom, includes the application layer, the framework layer, the operator library layer, the driver layer, and the chip layer. The splicing operator obtained by the foregoing operation method can be applied at the application layer, the artificial intelligence operator library is at the operator library layer, the artificial intelligence processor is located in the chip layer, and the driver layer may include the driver that drives the chip layer.
From the above description, after the deformation operator and the basic operator in the operator library layer are spliced to form the splicing operator, the splicing operator can be called directly by the application layer, so as to implement the corresponding function in artificial intelligence operations. This avoids the situation where the deformation operator and the basic operator must be fetched from the operator library layer each time the application layer performs an artificial intelligence operation, thereby improving the execution of artificial intelligence operations.
In an application example, when artificial intelligence operations are needed for speech recognition or image processing, the square splicing operator (duplication operator + multiplication operator) of an embodiment of the disclosure can be used to perform the square operation: when a square operation needs to be performed on input data, the square splicing operator duplicates the input data to obtain duplicated input data and then performs a multiplication operation on the input data and the duplicated input data, thereby implementing the square operation of the input data.
Using the square splicing operator described in the disclosure, applications including but not limited to image processing and speech recognition can be implemented more advantageously, improving the efficiency of artificial intelligence operations. A sketch of this application-layer usage follows.
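As a hedged illustration of the "splice once, call from the application layer" idea (the layer names follow Fig. 2, but the registry and function names are hypothetical):

```python
import numpy as np

# Hypothetical application-layer registry: the splicing operator is assembled once
# from the operator library layer and then reused directly by the application layer.
def square_splicing_operator(input_data):
    duplicated = np.copy(input_data)       # duplication operator
    return input_data * duplicated         # multiplication operator

APPLICATION_LAYER_OPERATORS = {"square": square_splicing_operator}

def application_layer_task(features):
    """E.g. a speech-recognition or image-processing step that needs x^2: it calls
    the registered splicing operator instead of re-assembling it from the
    operator library layer on every call."""
    return APPLICATION_LAYER_OPERATORS["square"](features)

print(application_layer_task(np.array([0.5, 2.0])))   # [0.25 4.  ]
```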
Referring to Fig. 3, Fig. 3 shows a schematic diagram of a splicing operator according to an embodiment of the disclosure.
As shown in Fig. 3, the splicing operator includes:
a duplication operator 10, where the duplication operator 10 is used to duplicate the input data to obtain duplicated input data; and
a multiplication operator 20, connected to the duplication operator 10, where the multiplication operator 20 is used to receive the input data and the duplicated input data, perform a multiplication operation on them, and output the operation result.
Through the above splicing operator, the disclosure can use the duplication operator to duplicate the input data to obtain duplicated input data, use the multiplication operator to perform a multiplication operation on the input data and the duplicated input data after receiving them, and output the operation result, thereby implementing the square operation of the input data.
Referring to Fig. 4, Fig. 4 shows a block diagram of an operation device according to an embodiment of the disclosure.
As shown in Fig. 4, the device includes:
an obtaining module 80, configured to obtain a duplication operator and a multiplication operator from an artificial intelligence operator library, where the duplication operator is used to duplicate input data to obtain duplicated input data, and the multiplication operator is used to perform a multiplication operation on input data; and
an operation module 90, connected to the obtaining module 80 and configured to splice the duplication operator and the multiplication operator to form a splicing operator,
where the splicing operator is used to perform a corresponding splicing operation on the input data in an artificial intelligence processor, so as to support artificial intelligence operations.
Referring to Fig. 5, Fig. 5 shows a block diagram of an operation device according to an embodiment of the disclosure.
As shown in Fig. 5, the operation module 90 includes:
a first operation submodule 910, configured to use the duplication operator as the preceding operator of the multiplication operator.
In a possible embodiment, the splicing operation includes:
when input data is obtained, duplicating the input data using the duplication operator to obtain duplicated input data; and
performing a multiplication operation on the input data and the duplicated input data using the multiplication operator to obtain a multiplication result.
Referring to Fig. 6, Fig. 6 shows a block diagram of an artificial intelligence processing device according to an embodiment of the disclosure.
In a possible embodiment, as shown in Fig. 6, the device includes:
a main processor 50, configured to execute the above method to obtain a splicing operator, where the splicing operator is used to perform the corresponding operation on the input data; and
an artificial intelligence processor 60, electrically connected to the main processor 50;
the main processor 50 is further configured to send input data and the splicing operator to the artificial intelligence processor 60, and the artificial intelligence processor 60 is configured to:
receive the input data and the splicing operator sent by the main processor 50;
perform an artificial intelligence operation on the input data using the splicing operator to obtain an operation result; and
send the operation result to the main processor 50.
In a possible embodiment, the main processor 50 may include a main processor storage space for storing the splicing operator obtained when the main processor 50 executes the operation method, where
the main processor 50 is further configured to send the input data and the splicing operator stored in the main processor storage space.
It should be understood that the main processor 50 can execute the operation method after obtaining data, obtain the splicing operator, and send the obtained splicing operator to the artificial intelligence processor 60 for processing. The main processor 50 can also send a previously stored splicing operator to the artificial intelligence processor 60, so that a pre-stored splicing operator is delivered to the artificial intelligence processor 60, and the artificial intelligence processor 60 performs the artificial intelligence operation according to the received splicing operator and input data. Of the two ways above, the former can be regarded as an online mode in which processing happens immediately, and the latter as an offline processing mode. A sketch of this host/accelerator interaction is given below.
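The disclosure does not define a host-side API; the following sketch only illustrates, under assumed class and method names, the two modes described above — building the splicing operator on the fly (online) versus dispatching a pre-stored one (offline) together with the input data to the artificial intelligence processor.

```python
import numpy as np

def build_square_splicing_operator():
    """Splice the duplication and multiplication operators (square operator)."""
    return lambda x: x * np.copy(x)

class AIProcessor:
    """Hypothetical model of the artificial intelligence processor 60: it receives
    the splicing operator and the input data, runs the operation, returns the result."""
    def run(self, splicing_op, input_data):
        return splicing_op(input_data)

class MainProcessor:
    """Hypothetical model of the main processor 50."""
    def __init__(self, ai_processor):
        self.ai_processor = ai_processor
        self.storage_space = {}                            # main processor storage space

    def run_online(self, input_data):
        splicing_op = build_square_splicing_operator()     # build now, dispatch now
        return self.ai_processor.run(splicing_op, input_data)

    def run_offline(self, name, input_data):
        splicing_op = self.storage_space[name]             # dispatch a pre-stored operator
        return self.ai_processor.run(splicing_op, input_data)

host = MainProcessor(AIProcessor())
host.storage_space["square"] = build_square_splicing_operator()
print(host.run_online(np.array([2.0, 3.0])))       # [4. 9.]
print(host.run_offline("square", np.array([4.0]))) # [16.]
```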
In a possible embodiment, the devices shown in Fig. 4 and Fig. 5 can be implemented in the main processor 50.
In a possible embodiment, the main processor 50 can be a central processing unit (CPU) or another type of processor, such as a graphics processing unit (GPU). It should be understood that the splicing operator here is the splicing operator obtained by the foregoing operation method; for details, refer to the earlier description of the splicing operator, which is not repeated here.
In a possible embodiment, the artificial intelligence processor can be formed by multiple identical processors, for example multiple processors (XPUs) forming an architecture similar to "main processor 50 + artificial intelligence processor 60". It can also be formed by a single processor; in this case, the processor can both execute the foregoing operation method to obtain the splicing operator and use the splicing operator to perform artificial intelligence operations on the input data to obtain the output result. In this embodiment, the processor type can be an existing one or a newly proposed type of processor, and the disclosure does not limit this.
In a possible embodiment, the main processor 50 can serve as the interface between the artificial intelligence processor and external data and control, performing basic control such as carrying data and starting and stopping the artificial intelligence processing device; other processing devices can also cooperate with the artificial intelligence processor to complete computing tasks together.
In a possible embodiment, the artificial intelligence processor may include more than one artificial intelligence processor; the artificial intelligence processors can be linked and transfer data through a specific structure, for example interconnected and transferring data through a PCIe bus, to support larger-scale machine learning operations. In this case, they may share the same control system or have independent control systems; they may share memory, or each accelerator may have its own memory. In addition, their interconnection can follow any interconnection topology.
The artificial intelligence processing device has high compatibility and can be connected to various types of servers through a PCIe interface.
Referring to Fig. 7, Fig. 7 shows a block diagram of an artificial intelligence processing device according to an embodiment of the disclosure.
In a possible embodiment, as shown in Fig. 7, the main processor 50 and the artificial intelligence processor 60 can be connected through a general interconnection interface (such as an I/O interface) for transferring data and control instructions between the main processor 50 and the artificial intelligence processor 60. The artificial intelligence processor 60 obtains the required input data (including the splicing operator) from the main processor 50 and writes it to the on-chip storage device of the artificial intelligence processor; it can obtain control instructions from the main processor 50 and write them to the on-chip control cache of the artificial intelligence processor; it can also read the data in the storage module of the artificial intelligence processor 60 and transfer it to other processing units.
In a possible embodiment, the artificial intelligence processing device can further include a storage device, which is connected to the artificial intelligence processor and the other processing units respectively. The storage device is used to store the data of the artificial intelligence processing device and of the other processing units, and is especially suitable for data that cannot be entirely kept in the internal storage of the artificial intelligence processing device or of the other processing units.
This combined processing device can be used as an SoC (system on chip) for devices such as mobile phones, robots, drones, and video surveillance equipment, effectively reducing the die area of the control portion, increasing processing speed, and reducing overall power consumption. In this case, the general interconnection interface of the combined processing device is connected to certain components of the equipment, such as a camera, a display, a mouse, a keyboard, a network card, or a Wi-Fi interface. Through the above artificial intelligence processing device, the disclosure can have the main processor transfer the input data and the splicing operator to the artificial intelligence processor, which performs the artificial intelligence operation on the input data using the splicing operator to obtain the operation result and sends the operation result back to the main processor.
It should be understood that the artificial intelligence processor 60 can be a single processor used for artificial intelligence operations or a combination of multiple different processors. The artificial intelligence processor is applied to artificial intelligence operations, which include machine learning operations, brain-like operations, and so on. Machine learning operations include neural network operations, k-means operations, support vector machine operations, etc. The artificial intelligence processor 60 can specifically include one of, or a combination of, a GPU (Graphics Processing Unit), an NPU (Neural-Network Processing Unit), a DSP (Digital Signal Processing) chip, and an FPGA (Field-Programmable Gate Array) chip.
In a possible embodiment, the artificial intelligence processor 60 is as shown in Fig. 8. Referring to Fig. 8, Fig. 8 shows a block diagram of an artificial intelligence processor according to an embodiment of the disclosure.
As shown in Fig. 8, the artificial intelligence processor 30 includes a control module 32, a computing module 33, and a storage module 31. The computing module 33 includes a main processing circuit 331 and multiple slave processing circuits 332 (the number of slave processing circuits in the figure is only illustrative).
The control module 32 is configured to obtain input data and a computation instruction;
the control module 32 is further configured to parse the computation instruction to obtain multiple operation instructions, and send the multiple operation instructions and the input data to the main processing circuit 331;
the main processing circuit 331 is configured to perform preamble processing on the input data and to transfer data and operation instructions between itself and the multiple slave processing circuits;
the multiple slave processing circuits 332 are configured to perform intermediate operations in parallel according to the data and operation instructions transferred from the main processing circuit 331 to obtain multiple intermediate results, and transfer the multiple intermediate results to the main processing circuit 331; and
the main processing circuit 331 is configured to perform subsequent processing on the multiple intermediate results to obtain the computation result of the computation instruction.
The artificial intelligence processor 30 described in the disclosure performs the corresponding operation on the input data after receiving the input data and the computation instruction, so as to obtain the computation result.
The artificial intelligence processor described in the disclosure can support machine learning as well as some non-machine-learning artificial intelligence algorithms.
The above computation instructions include but are not limited to forward operation instructions or reverse training instructions; the specific embodiments of this application do not limit the specific form of the above computation instructions.
In a possible embodiment, after the artificial intelligence processor 30 obtains the computation result, it can send the computation result to other processors such as a central processing unit (CPU) or a graphics processing unit (GPU).
The operation instructions are run code obtained by the artificial intelligence processor 30 according to the splicing operator. The run code includes but is not limited to forward operation instructions, reverse training instructions, or other neural network operation instructions; the specific embodiments of this application do not limit the specific form of the above computation instructions.
In a possible embodiment, the input of the artificial intelligence processor 30 can be obtained through a data transfer module 360, which can specifically be one or more data I/O interfaces or I/O pins.
The main processing circuit 331 is configured to perform preamble processing on the operation data to obtain processed operation data, and to transfer at least one of the operation data, intermediate results, and operation instructions with the multiple slave processing circuits.
Referring also to Fig. 9, Fig. 9 shows a block diagram of the main processing circuit 331 according to an embodiment of the disclosure.
As shown in Fig. 9, the main processing circuit 331 may include one of, or any combination of, a conversion processing circuit 113, an activation processing circuit 111, and an addition processing circuit 112.
The conversion processing circuit 113 is used to perform the preamble processing on the data. The preamble processing can be: performing, on the data or intermediate results received by the main processing circuit 331, an exchange between a first data structure and a second data structure (for example, a conversion between continuous data and discrete data); or performing, on the data or intermediate results received by the main processing circuit 331, an exchange between a first data type and a second data type (for example, a conversion between a fixed-point type and a floating-point type).
The activation processing circuit 111 is used to perform the subsequent processing, specifically an activation operation on the data in the main processing circuit 331;
the addition processing circuit 112 is used to perform the subsequent processing, specifically an addition operation or an accumulation operation.
Each slave processing circuit 332 performs intermediate operations according to the operation data and operation instructions transferred by the main processing circuit 331 to obtain an intermediate result, and transfers the intermediate result to the main processing circuit 331;
the main processing circuit 331 performs subsequent processing on the multiple intermediate results to obtain the final computation result of the operation instructions.
The control module 32 is further configured to generate a debugging result according to the state information and to output the debugging result to the state information acquisition device 40.
The storage module 31 is used to store the state information in the computation process performed according to the operation instructions, where the state information includes at least one of the state information in the preamble processing of the main processing circuit 331, the state information in the intermediate computation of the multiple slave processing circuits 332, and the state information in the subsequent processing of the main processing circuit 331. The storage module may include an on-chip storage submodule 310, and the on-chip storage submodule 310 may include a scratchpad memory.
The storage module 31 can also include one of, or any combination of, a register and a cache. Specifically, the cache is used to store the computation instruction; the register is used to store the neural network model, the data, and scalars; and the cache is a scratchpad cache.
In a possible embodiment, the control module 32 may include an instruction cache submodule 320, an instruction processing submodule 321, and a storage queue submodule 323;
the instruction cache submodule 320 is used to store the computation instructions associated with the neural network model;
the instruction processing submodule 321 is used to parse the computation instructions to obtain multiple operation instructions; and
the storage queue submodule 323 is used to store an instruction queue, where the instruction queue includes multiple operation instructions or computation instructions to be executed in the order of the queue.
For example, in a possible embodiment, the main processing circuit 331 may also include a control module 32, which may include a master instruction processing submodule specifically used to decode instructions into micro-instructions. Of course, in another possible embodiment, the slave processing circuit 332 may also include another control module 32, which includes a slave instruction processing submodule specifically used to receive and process micro-instructions. The above micro-instruction can be the next-level instruction of an instruction; it can be obtained by splitting or decoding an instruction and can be further decoded into control signals for each component, each module, or each processing circuit.
In an optional scheme, the structure of the computation instruction can be as shown in Table 1.
Table 1
Operation code | Register or immediate | Register/immediate | ...
The ellipsis in the table above indicates that the instruction may include multiple registers or immediates.
In another optional scheme, the computation instruction may include one or more operation domains and one operation code. The computation instruction may include a neural network operation instruction. Taking a neural network operation instruction as an example, as shown in Table 1, register number 0, register number 1, register number 2, register number 3, and register number 4 can be operation domains, and each of register number 0 through register number 4 can be the number of one or more registers, for example as shown in Table 2.
Table 2
The above registers can be off-chip memory, or, in practical applications, on-chip memory, used to store data. The data can specifically be t-dimensional data, where t is an integer greater than or equal to 1; for example, when t = 1 it is 1-dimensional data, i.e. a vector, when t = 2 it is 2-dimensional data, i.e. a matrix, and when t = 3 or more it is a multidimensional tensor. A sketch of such an instruction layout is given below.
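The exact field widths and register meanings are not given in this excerpt; the following sketch is only a hypothetical illustration of a computation instruction carrying one operation code and several register-number operation domains, in the spirit of Tables 1 and 2.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ComputationInstruction:
    opcode: str                    # operation code, e.g. a neural network operation
    operand_registers: List[int]   # operation domains: register numbers (or immediates)

# A hypothetical fully-connected forward instruction referring to five registers
# (for example input address, weight address, bias address, output address, length).
fc_forward = ComputationInstruction(
    opcode="NN_FC_FORWARD",
    operand_registers=[0, 1, 2, 3, 4],
)
print(fc_forward)
```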
Optionally, the control module 32 may also include:
a dependency processing submodule 322, configured to, when there are multiple operation instructions, determine whether a first operation instruction has a dependency on a zeroth operation instruction preceding the first operation instruction. If the first operation instruction has a dependency on the zeroth operation instruction, the first operation instruction is cached in the instruction cache submodule, and after the zeroth operation instruction has finished executing, the first operation instruction is extracted from the instruction cache submodule and transferred to the operation module.
Determining whether the first operation instruction has a dependency on the zeroth operation instruction preceding the first operation instruction includes:
extracting, according to the first operation instruction, a first storage address interval of the data (for example, a matrix) required by the first operation instruction, and extracting, according to the zeroth operation instruction, a zeroth storage address interval of the matrix required by the zeroth operation instruction. If the first storage address interval overlaps the zeroth storage address interval, it is determined that the first operation instruction and the zeroth operation instruction have a dependency; if the first storage address interval does not overlap the zeroth storage address interval, it is determined that the first operation instruction and the zeroth operation instruction do not have a dependency. A short sketch of this overlap check follows.
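As a minimal sketch (assuming address intervals are half-open [start, end) ranges, which the text does not specify), the dependency check reduces to an interval-overlap test:

```python
def has_dependency(first_interval, zeroth_interval):
    """Return True if the first operation instruction's storage address interval
    overlaps the zeroth operation instruction's storage address interval."""
    first_start, first_end = first_interval
    zeroth_start, zeroth_end = zeroth_interval
    return first_start < zeroth_end and zeroth_start < first_end

# Example: overlapping addresses, so the first instruction must wait in the
# instruction cache submodule until the zeroth instruction finishes.
print(has_dependency((0x1000, 0x1400), (0x1200, 0x1800)))  # True  -> dependency
print(has_dependency((0x1000, 0x1200), (0x1200, 0x1800)))  # False -> no dependency
```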
Referring to Fig. 10, Fig. 10 shows a schematic diagram of an artificial intelligence processor according to an embodiment of the disclosure.
In a possible embodiment, the computing module 33 may include a branch processing circuit 333 as shown in Fig. 10; its specific connection structure is shown in Fig. 10, where
the main processing circuit 331 is connected to the branch processing circuit 333, and the branch processing circuit 333 is connected to the multiple slave processing circuits 332;
the branch processing circuit 333 is used to forward data or instructions between the main processing circuit 331 and the slave processing circuits 332.
In a possible embodiment, taking the fully connected operation in neural network operations as an example, the process can be y = f(wx + b), where x is the input neuron matrix, w is the weight matrix, b is the bias scalar, and f is the activation function, which can specifically be any one of the sigmoid function, tanh, relu, and the softmax function. Assuming a binary tree structure with 8 slave processing circuits, the method can be implemented as follows:
the control module obtains the input neuron matrix x, the weight matrix w, and the fully connected operation instruction from the storage module 31, and transfers the input neuron matrix x, the weight matrix w, and the fully connected operation instruction to the main processing circuit;
the main processing circuit splits the input neuron matrix x into 8 submatrices, distributes the 8 submatrices to the 8 slave processing circuits through the tree module, and broadcasts the weight matrix w to the 8 slave processing circuits;
the slave processing circuits perform the multiplication and accumulation operations of the 8 submatrices with the weight matrix w in parallel to obtain 8 intermediate results, and send the 8 intermediate results to the main processing circuit; and
the main processing circuit sorts the 8 intermediate results to obtain the operation result of wx, performs the bias b operation on this operation result, performs the activation operation to obtain the final result y, and sends the final result y to the control module, which outputs the final result y or stores it in the storage module 31. A sketch of this split-and-broadcast computation is given below.
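The following NumPy sketch mimics the data flow described above under simplifying assumptions (the input neuron matrix is split by rows, each "slave processing circuit" is just a function call, and relu is chosen as the activation f); it is an illustration, not the hardware implementation.

```python
import numpy as np

def fully_connected_forward(x, w, b, num_slaves=8):
    """y = f(w x + b): split x into num_slaves submatrices, broadcast w to every
    slave, combine the partial results, then add the bias and apply the
    activation in the main processing circuit."""
    submatrices = np.array_split(x, num_slaves, axis=0)       # main circuit splits x

    # Each slave processing circuit multiplies its submatrix by the broadcast weight w.
    intermediate_results = [sub @ w for sub in submatrices]

    wx = np.concatenate(intermediate_results, axis=0)         # main circuit reorders/combines
    return np.maximum(wx + b, 0)                              # bias, then activation (relu)

x = np.random.randn(16, 32)   # input neuron matrix
w = np.random.randn(32, 8)    # weight matrix
b = 0.1                       # bias scalar
y = fully_connected_forward(x, w, b)
print(y.shape)                # (16, 8)
```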
The method by which the neural network computing device shown in Fig. 10 executes a neural network forward operation instruction can specifically be as follows:
the control module 32 extracts the operation data (for example, the neural network forward operation instruction or neural network operation instruction), its corresponding operation domain, and at least one operation code from the storage module 31; the control module 32 transfers the operation domain to the data access module and sends the at least one operation code to the computing module.
The control module 32 extracts the weight w and the bias b corresponding to the operation domain from the storage module 31 (when b is 0, the bias b does not need to be extracted), transfers the weight w and the bias b to the main processing circuit of the computing module, extracts the input data Xi from the storage module 31, and sends the input data Xi to the main processing circuit.
The main processing circuit splits the input data Xi into n data blocks;
the instruction processing submodule 321 of the control module 32 determines, according to the at least one operation code, a multiplication instruction, a bias instruction, and an accumulation instruction, and sends the multiplication instruction, the bias instruction, and the accumulation instruction to the main processing circuit. The main processing circuit broadcasts the multiplication instruction and the weight w to the multiple slave processing circuits, and distributes the n data blocks to the multiple slave processing circuits (for example, with n slave processing circuits, each slave processing circuit is sent one data block). The multiple slave processing circuits perform a multiplication operation on the weight w and the received data block according to the multiplication instruction to obtain an intermediate result, and send the intermediate result to the main processing circuit. The main processing circuit performs an accumulation operation on the multiple intermediate results sent by the slave processing circuits according to the accumulation instruction to obtain an accumulation result, performs the bias b operation on the accumulation result according to the bias instruction to obtain the final result, and sends the final result to the control module.
In addition, the order of the addition operation and the multiplication operation can be exchanged.
The technical solution provided by this application realizes the multiplication operation and the bias operation of a neural network through a single instruction, i.e. the neural network operation instruction, without storing or extracting intermediate results of the neural network computation. This reduces the storage and extraction of intermediate data, and therefore has the advantages of reducing the corresponding operation steps and improving the computational efficiency of the neural network.
Referring to Fig. 11, Fig. 11 shows a schematic diagram of an artificial intelligence processor according to an embodiment of the disclosure.
In a possible embodiment, the computing module 33 may include a main processing circuit 331 and multiple slave processing circuits 332 as shown in Fig. 11.
In a possible embodiment, as shown in Fig. 11, the multiple slave processing circuits are distributed in an array; each slave processing circuit is connected to the other adjacent slave processing circuits, and the main processing circuit is connected to k slave processing circuits of the multiple slave processing circuits, where the k slave processing circuits are: the n slave processing circuits in the first row, the n slave processing circuits in the m-th row, and the m slave processing circuits in the first column. It should be noted that, as shown in Fig. 11, the k slave processing circuits include only the n slave processing circuits in the first row, the n slave processing circuits in the m-th row, and the m slave processing circuits in the first column; that is, the k slave processing circuits are the slave processing circuits among the multiple slave processing circuits that are directly connected to the main processing circuit.
The k slave processing circuits are used for forwarding data and instructions between the main processing circuit and the multiple slave processing circuits.
In some embodiments, a chip is also claimed, which includes the above artificial intelligence processing device.
In some embodiments, a chip package structure is claimed, which includes the above chip.
In some embodiments, a board card is claimed, which includes the above chip package structure.
Referring to Fig. 12, Fig. 12 shows a board card according to an embodiment of the disclosure. In addition to the above chip 389, the board card may also include other supporting components, which include but are not limited to a memory device 390, an interface device 391, and a control device 392;
the memory device 390 is connected to the chip in the chip package structure through a bus and is used to store data. The memory device may include multiple groups of storage units 393. Each group of storage units is connected to the chip through a bus. It can be understood that each group of storage units can be a DDR SDRAM (Double Data Rate Synchronous Dynamic Random Access Memory).
DDR can double the speed of SDRAM without raising the clock frequency: DDR allows data to be read on both the rising edge and the falling edge of the clock pulse, so its speed is twice that of standard SDRAM. In one embodiment, the storage device can include 4 groups of storage units, and each group of storage units can include multiple DDR4 particles (chips). In one embodiment, the chip can internally include four 72-bit DDR4 controllers, where 64 bits of each 72-bit controller are used for data transfer and 8 bits are used for ECC checking. It can be understood that when DDR4-3200 particles are used in each group of storage units, the theoretical bandwidth of data transfer can reach 25600 MB/s. A short check of this figure appears below.
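As a quick sanity check of the quoted figure (assuming a 64-bit data path per controller, which matches the 64 data bits mentioned above):

```python
transfer_rate_mt_per_s = 3200        # DDR4-3200: 3200 mega-transfers per second
data_width_bytes = 64 // 8           # 64 data bits per transfer = 8 bytes
theoretical_bandwidth = transfer_rate_mt_per_s * data_width_bytes
print(theoretical_bandwidth, "MB/s") # 25600 MB/s, matching the value in the text
```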
In one embodiment, each group of storage units includes multiple Double Data Rate synchronous dynamic random-access memories arranged in parallel. DDR can transfer data twice within one clock cycle. A controller for controlling the DDR is provided in the chip and is used to control the data transfer and data storage of each storage unit.
The interface device is electrically connected to the chip in the chip package structure. The interface device is used to implement data transfer between the chip and an external device (such as a server or a computer). For example, in one embodiment, the interface device can be a standard PCIe interface: the data to be processed is transferred from the server to the chip through the standard PCIe interface to realize data transfer. Preferably, when a PCIe 3.0 x16 interface is used for transmission, the theoretical bandwidth can reach 16000 MB/s. In another embodiment, the interface device can also be another interface, and this application does not limit the specific form of such other interfaces, as long as the interface unit can realize the transfer function. In addition, the computation result of the chip is still sent back to the external device (such as a server) by the interface device.
The control device is electrically connected to the chip. The control device is used to monitor the state of the chip. Specifically, the chip can be electrically connected to the control device through an SPI interface. The control device may include a microcontroller unit (MCU). The chip may include multiple processing chips, multiple processing cores, or multiple processing circuits and can drive multiple loads; therefore, the chip can be in different working states such as heavy load and light load. Through the control device, the working states of the multiple processing chips, multiple processing cores, and/or multiple processing circuits in the chip can be regulated.
In some embodiments, an electronic device is also claimed, which includes the above board card.
The electronic device includes a data processing device, a robot, a computer, a printer, a scanner, a tablet computer, an intelligent terminal, a mobile phone, a dashboard camera, a navigator, a sensor, a webcam, a server, a cloud server, a camera, a video camera, a projector, a watch, an earphone, a mobile storage device, a wearable device, a vehicle, a household appliance, and/or a medical device.
The vehicle includes an airplane, a ship, and/or a car; the household appliance includes a television, an air conditioner, a microwave oven, a refrigerator, an electric rice cooker, a humidifier, a washing machine, an electric light, a gas stove, and a range hood; the medical device includes a nuclear magnetic resonance instrument, a B-mode ultrasound scanner, and/or an electrocardiograph.
It should be noted that, for the sake of simple description, the foregoing method embodiments are all expressed as a series of action combinations, but those skilled in the art should understand that this application is not limited by the described sequence of actions, because according to this application some steps may be performed in other orders or simultaneously. Secondly, those skilled in the art should also understand that the embodiments described in this specification are optional embodiments, and the actions and modules involved are not necessarily required by this application.
In the above embodiments, the description of each embodiment has its own emphasis. For parts that are not described in detail in one embodiment, reference can be made to the related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed device can be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division of the modules is only a logical function division, and there may be other division manners in actual implementation; for example, multiple modules or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or modules, and may be electrical or in other forms.
The modules described as separate components may or may not be physically separate, and the components displayed as modules may or may not be physical modules; they may be located in one place or distributed over multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional modules in the embodiments of this application may be integrated into one processing module, or each module may exist physically alone, or two or more modules may be integrated into one module. The above integrated module can be implemented in the form of hardware or in the form of a software program module.
If the integrated module is implemented in the form of a software program module and sold or used as an independent product, it can be stored in a computer-readable memory. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods of the embodiments of this application. The aforementioned memory includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be completed by a program instructing related hardware. The program may be stored in a computer-readable memory, which may include a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
The embodiments of the present disclosure have been described above. The foregoing description is exemplary rather than exhaustive and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the illustrated embodiments. The terminology used herein is chosen to best explain the principles of the embodiments, their practical application, or improvements over technologies in the market, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (15)

1. An operation method, characterized in that the method comprises:
obtaining a duplication operator and a multiplication operator from an artificial intelligence operator library, wherein the duplication operator is used to duplicate input data to obtain duplicated input data, and the multiplication operator is used to perform a multiplication operation on input data;
splicing the duplication operator and the multiplication operator to form a spliced operator,
wherein the spliced operator is used to perform, in an artificial intelligence processor, a corresponding spliced arithmetic operation on the input data, so as to perform an artificial intelligence operation.
2. The method according to claim 1, wherein splicing the duplication operator and the multiplication operator to form the spliced operator comprises:
using the duplication operator as the preceding operator of the multiplication operator.
3. The method according to claim 2, characterized in that the spliced arithmetic operation comprises:
when input data is obtained, duplicating the input data using the duplication operator to obtain duplicated input data;
performing a multiplication operation on the input data and the duplicated input data using the multiplication operator to obtain a multiplication result.
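For illustration only, the following is a minimal Python sketch of the composition described in claims 1-3: a duplication operator spliced as the preceding stage of a multiplication operator, so that the input is first replicated and then multiplied with its replica. The operator classes are hypothetical placeholders, not taken from any real artificial intelligence operator library, and NumPy is used only to make the data concrete.

import numpy as np


class DuplicationOperator:
    """Replicates the input data to obtain duplicated input data."""

    def __call__(self, x: np.ndarray) -> np.ndarray:
        return np.copy(x)


class MultiplicationOperator:
    """Performs an element-wise multiplication on two operands."""

    def __call__(self, a: np.ndarray, b: np.ndarray) -> np.ndarray:
        return a * b


class SplicedOperator:
    """Duplication first, multiplication second: the input is replicated and
    then multiplied with its replica to give the multiplication result."""

    def __init__(self):
        self.duplicate = DuplicationOperator()
        self.multiply = MultiplicationOperator()

    def __call__(self, x: np.ndarray) -> np.ndarray:
        duplicated = self.duplicate(x)       # step of claim 3: replicate the input
        return self.multiply(x, duplicated)  # step of claim 3: multiply input by its replica


# Usage: the spliced operator squares the input element-wise.
print(SplicedOperator()(np.array([1.0, 2.0, 3.0])))  # [1. 4. 9.]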
4. The method according to claim 1, wherein the spliced operator is applied at the application program layer of a software call hierarchy, the artificial intelligence operator library is located at the operator library layer of the software call hierarchy, and the artificial intelligence processor is located at the chip layer of the software call hierarchy.
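As a further illustration of claim 4, the sketch below imitates the three levels of the software call hierarchy: an application program layer that only invokes the spliced operator, an operator library layer that resolves it into primitive operators, and a chip layer standing in for the artificial intelligence processor. All class and function names here are illustrative assumptions, not part of the patented system.

class ChipLayer:
    """Stand-in for the artificial intelligence processor at the chip level."""

    def execute(self, op_name, *operands):
        if op_name == "duplicate":
            return list(operands[0])                              # simulated on-chip copy
        if op_name == "multiply":
            return [a * b for a, b in zip(operands[0], operands[1])]
        raise ValueError(f"unsupported operator: {op_name}")


class OperatorLibraryLayer:
    """Operator library level: maps the spliced operator onto chip primitives."""

    def __init__(self, chip):
        self.chip = chip

    def spliced_operator(self, x):
        replica = self.chip.execute("duplicate", x)
        return self.chip.execute("multiply", x, replica)


def application_layer(x):
    """Application program level: sees only the spliced operator, not its parts."""
    return OperatorLibraryLayer(ChipLayer()).spliced_operator(x)


print(application_layer([1.0, 2.0, 3.0]))  # [1.0, 4.0, 9.0]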
5. An operation device, characterized in that the device comprises:
an acquisition module, configured to obtain a duplication operator and a multiplication operator from an artificial intelligence operator library, wherein the duplication operator is used to duplicate input data to obtain duplicated input data, and the multiplication operator is used to perform a multiplication operation on input data; and
an operation module, connected to the acquisition module and configured to splice the duplication operator and the multiplication operator to form a spliced operator,
wherein the spliced operator is used to perform, in an artificial intelligence processor, a corresponding spliced arithmetic operation on the input data, so as to support an artificial intelligence operation.
6. The device according to claim 5, characterized in that the operation module comprises:
a first operation submodule, configured to use the duplication operator as the preceding operator of the multiplication operator.
7. The device according to claim 6, characterized in that the spliced arithmetic operation comprises:
when input data is obtained, duplicating the input data using the duplication operator to obtain duplicated input data;
performing a multiplication operation on the input data and the duplicated input data using the multiplication operator to obtain a multiplication result.
8. An artificial intelligence processing device, characterized in that the device comprises:
a primary processor, configured to execute the method according to any one of claims 1-4 to obtain a spliced operator, the spliced operator being used to perform a corresponding arithmetic operation on input data; and
an artificial intelligence processor, electrically connected to the primary processor;
wherein the primary processor is further configured to send input data and the spliced operator to the artificial intelligence processor, and the artificial intelligence processor is configured to:
receive the input data and the spliced operator sent by the primary processor;
perform an artificial intelligence operation on the input data using the spliced operator to obtain an operation result; and
send the operation result to the primary processor.
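To make the message flow of claim 8 concrete, the following hedged sketch models the primary processor sending input data together with the spliced operator to the artificial intelligence processor and receiving the operation result back. The queue-based transport, the simulator class names, and the lambda standing in for the spliced operator are all assumptions; a real device would be driven through a vendor runtime rather than Python queues.

import threading
from queue import Queue


class AIProcessorSim:
    """Receives input data and the spliced operator, runs it, returns the result."""

    def __init__(self, to_device: Queue, to_host: Queue):
        self.to_device = to_device
        self.to_host = to_host

    def serve_once(self):
        spliced_operator, input_data = self.to_device.get()  # receive from the primary processor
        result = spliced_operator(input_data)                # run the artificial intelligence operation
        self.to_host.put(result)                             # return the operation result


class PrimaryProcessorSim:
    """Holds the spliced operator, sends it with the input data, collects the result."""

    def __init__(self, to_device: Queue, to_host: Queue):
        self.to_device = to_device
        self.to_host = to_host
        # storage space on the primary processor holding the spliced operator (cf. claim 9)
        self.stored_operator = lambda xs: [x * x for x in xs]

    def run(self, input_data):
        self.to_device.put((self.stored_operator, input_data))  # send operator and data
        return self.to_host.get()                                # wait for the result


to_device, to_host = Queue(), Queue()
device = AIProcessorSim(to_device, to_host)
host = PrimaryProcessorSim(to_device, to_host)

worker = threading.Thread(target=device.serve_once)
worker.start()
print(host.run([1.0, 2.0, 3.0]))  # [1.0, 4.0, 9.0]
worker.join()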
9. The device according to claim 8, characterized in that the primary processor further comprises a primary processor storage space for storing the spliced operator, wherein
the primary processor is further configured to provide the input data and the spliced operator stored in the primary processor storage space.
10. The device according to claim 8, characterized in that the artificial intelligence processor transfers the operation result to the primary processor through an I/O interface;
when the device comprises a plurality of artificial intelligence processors, the plurality of artificial intelligence processors can be connected to and transmit data to one another through a specific structure;
wherein the plurality of artificial intelligence processors are interconnected and transmit data through a Peripheral Component Interconnect Express (PCIe) bus to support larger-scale artificial intelligence operations; the plurality of artificial intelligence processors share the same control system or have their own control systems; the plurality of artificial intelligence processors share memory or have their own memories; and the interconnection mode of the plurality of artificial intelligence processors is any interconnection topology.
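The sketch below loosely illustrates the multi-processor case of claim 10: the input data is split across several simulated artificial intelligence processors, each runs the same duplicate-then-multiply operation, and the partial results are gathered back. The list-based splitting merely imitates data transfer over a PCIe-style interconnect and does not model any real bus or interconnection topology.

def spliced_operator(chunk):
    """The same duplicate-then-multiply operation, run on each processor's share."""
    replica = list(chunk)
    return [a * b for a, b in zip(chunk, replica)]


def run_on_multiple_processors(input_data, num_processors=2):
    # split the workload across the simulated processors
    chunks = [input_data[i::num_processors] for i in range(num_processors)]
    # each processor computes its share; results travel back over the "interconnect"
    partial_results = [spliced_operator(chunk) for chunk in chunks]
    # gather the partial results back into the original element order
    result = [0.0] * len(input_data)
    for p, chunk_result in enumerate(partial_results):
        result[p::num_processors] = chunk_result
    return result


print(run_on_multiple_processors([1.0, 2.0, 3.0, 4.0]))  # [1.0, 4.0, 9.0, 16.0]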
11. The device according to claim 8, characterized in that it further comprises a storage device, wherein the storage device is connected to the artificial intelligence processor and the primary processor respectively, and is used to store data of the artificial intelligence processor and the primary processor.
12. An artificial intelligence chip, characterized in that the artificial intelligence chip comprises the artificial intelligence processing device according to any one of claims 8-11.
13. An electronic device, characterized in that the electronic device comprises the artificial intelligence chip according to claim 12.
14. A board card, characterized in that the board card comprises: a storage device, an interface device, a control device, and the artificial intelligence chip according to claim 12;
wherein the artificial intelligence chip is connected to the storage device, the control device, and the interface device respectively;
the storage device is used to store data;
the interface device is used to implement data transmission between the artificial intelligence chip and an external device; and
the control device is used to monitor the state of the artificial intelligence chip.
15. The board card according to claim 14, characterized in that:
the storage device comprises multiple groups of storage units, each group of storage units is connected to the artificial intelligence chip by a bus, and each storage unit is a DDR SDRAM;
the artificial intelligence chip comprises a DDR controller for controlling data transmission to and data storage in each storage unit; and
the interface device is a standard PCIe interface.
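As a reading aid for claims 14-15, the configuration sketch below restates the claimed board card structure in code: groups of DDR SDRAM storage units connected to the chip by a bus, a DDR controller on the artificial intelligence chip, a standard PCIe interface device, and a control device that monitors the chip state. The dataclass names and default values are assumptions made only for illustration.

from dataclasses import dataclass, field
from typing import List


@dataclass
class StorageUnitGroup:
    kind: str = "DDR SDRAM"          # claim 15: each storage unit is a DDR SDRAM
    connected_by: str = "bus"        # each group is connected to the chip by a bus


@dataclass
class ArtificialIntelligenceChip:
    ddr_controller: str = "controls data transmission and storage per storage unit"
    state: str = "idle"              # monitored by the control device (claim 14)


@dataclass
class BoardCard:
    chip: ArtificialIntelligenceChip = field(default_factory=ArtificialIntelligenceChip)
    storage_units: List[StorageUnitGroup] = field(
        default_factory=lambda: [StorageUnitGroup() for _ in range(4)])
    interface_device: str = "standard PCIe interface"       # data path to an external device
    control_device: str = "monitors the state of the chip"


print(BoardCard().interface_device)  # standard PCIe interface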
CN201811534068.2A 2018-12-14 2018-12-14 Operation method, device and related product Active CN109685201B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811534068.2A CN109685201B (en) 2018-12-14 2018-12-14 Operation method, device and related product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811534068.2A CN109685201B (en) 2018-12-14 2018-12-14 Operation method, device and related product

Publications (2)

Publication Number Publication Date
CN109685201A true CN109685201A (en) 2019-04-26
CN109685201B CN109685201B (en) 2020-10-30

Family

ID=66187738

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811534068.2A Active CN109685201B (en) 2018-12-14 2018-12-14 Operation method, device and related product

Country Status (1)

Country Link
CN (1) CN109685201B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090125814A1 (en) * 2006-03-31 2009-05-14 Alex Willcock Method and system for computerized searching and matching using emotional preference
CN103235974A (en) * 2013-04-25 2013-08-07 中国科学院地理科学与资源研究所 Method for improving processing efficiency of massive spatial data
US20140324747A1 (en) * 2013-04-30 2014-10-30 Raytheon Company Artificial continuously recombinant neural fiber network
US20160350656A1 (en) * 2015-05-25 2016-12-01 Omprakash VISVANATHAN Cognitive agent for executing cognitive rules
CN107357206A (en) * 2017-07-20 2017-11-17 郑州云海信息技术有限公司 A kind of method, apparatus and system of the computing optimization based on FPGA boards
CN107621932A (en) * 2017-09-25 2018-01-23 威创集团股份有限公司 The local amplification method and device of display image
CN107832804A (en) * 2017-10-30 2018-03-23 上海寒武纪信息科技有限公司 A kind of information processing method and Related product
CN107967135A (en) * 2017-10-31 2018-04-27 平安科技(深圳)有限公司 Computing engines implementation method, electronic device and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PETER BODIK et al.: "Combining Visualization and Statistical Analysis to Improve Operator Confidence and Efficiency for Failure Detection and Localization", Proceedings of the Second International Conference on Autonomic Computing (ICAC'05) *
HOU Zhihua (侯志华): "Model-Driven Effectiveness Evaluation Software Construction Platform", Computer and Modernization (计算机与现代化) *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111949317A (en) * 2019-05-17 2020-11-17 上海寒武纪信息科技有限公司 Instruction processing method and device and related product
CN111966400A (en) * 2019-05-20 2020-11-20 上海寒武纪信息科技有限公司 Instruction processing method and device and related product
CN111966399A (en) * 2019-05-20 2020-11-20 上海寒武纪信息科技有限公司 Instruction processing method and device and related product
CN111966399B (en) * 2019-05-20 2024-06-07 上海寒武纪信息科技有限公司 Instruction processing method and device and related products
CN112306945A (en) * 2019-07-30 2021-02-02 安徽寒武纪信息科技有限公司 Data synchronization method and device and related product
CN112306945B (en) * 2019-07-30 2023-05-12 安徽寒武纪信息科技有限公司 Data synchronization method and device and related products
CN112347026A (en) * 2019-08-09 2021-02-09 安徽寒武纪信息科技有限公司 Data synchronization method and device and related product
CN112347186A (en) * 2019-08-09 2021-02-09 安徽寒武纪信息科技有限公司 Data synchronization method and device and related product
CN112445525A (en) * 2019-09-02 2021-03-05 中科寒武纪科技股份有限公司 Data processing method, related device and computer readable medium
CN112784207A (en) * 2019-11-01 2021-05-11 中科寒武纪科技股份有限公司 Operation method and related product
CN112784207B (en) * 2019-11-01 2024-02-02 中科寒武纪科技股份有限公司 Operation method and related product
CN111125617A (en) * 2019-12-23 2020-05-08 中科寒武纪科技股份有限公司 Data processing method, data processing device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN109685201B (en) 2020-10-30

Similar Documents

Publication Publication Date Title
CN109657782A Operation method, device and related product
CN109685201A Operation method, device and related product
CN109522052A Computing device and board card
CN109543832A Computing device and board card
TWI803663B Computing device and computing method
CN109726822A Operation method, device and related product
CN109284815A Neural network model algorithm compilation method, device and related product
CN110059797A Computing device and related product
CN109543825A Neural network model algorithm compilation method, device and related product
CN109753319A Device for releasing a dynamic link library and related product
CN109670581A Computing device and board card
CN109993301A Neural network training device and related product
CN109739703A Debugging method and related product
CN110147249A Calculation method and device of a network model
CN109740729A Operation method, device and related product
CN110059809A Computing device and related product
CN110163349A Calculation method and device of a network model
CN109726800A Operation method, device and related product
CN109711538A Operation method, device and related product
CN109711540A Computing device and board card
CN109740730A Operation method, device and related product
CN109583579A Computing device and related product
CN110472734A Computing device and related product
CN112084023A Data parallel processing method, electronic equipment and computer readable storage medium
CN111382852B Data processing device, method, chip and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 644, Comprehensive Research Building, No. 6 South Road of the Academy of Sciences, Haidian District, Beijing 100190

Applicant after: Zhongke Cambrian Technology Co., Ltd

Address before: Room 644, Comprehensive Research Building, No. 6 South Road of the Academy of Sciences, Haidian District, Beijing 100190

Applicant before: Beijing Zhongke Cambrian Technology Co., Ltd.

CB02 Change of applicant information
TA01 Transfer of patent application right

Effective date of registration: 20200918

Address after: Room 611-194, R&D Center Building, China (Hefei) International Intelligent Voice Industrial Park, No. 3333 Xiyou Road, High-tech Zone, Hefei City, Anhui Province

Applicant after: Anhui Cambrian Information Technology Co., Ltd

Address before: Room 644, Comprehensive Research Building, No. 6 South Road of the Academy of Sciences, Haidian District, Beijing 100190

Applicant before: Zhongke Cambrian Technology Co.,Ltd.

TA01 Transfer of patent application right
GR01 Patent grant
GR01 Patent grant