CN111857723B - Parameter compiling method and device and computer readable storage medium - Google Patents
- Publication number
- CN111857723B (application CN202010604992.4A)
- Authority
- CN
- China
- Prior art keywords
- parameter
- parameters
- subunit
- intermediate parameter
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
Abstract
The embodiment of the invention discloses a parameter compiling method, apparatus, and computer-readable storage medium. Network parameters contained in each model file are extracted; each network parameter is converted into a corresponding intermediate parameter according to a preset parameter specification; a corresponding memory address is allocated to each intermediate parameter according to the size information, weight information, and context operation serial number corresponding to that intermediate parameter; and each intermediate parameter and its memory address are stored to a preset storage space in a set manner. By converting the network parameters of each model file, model files from different frameworks can be converted into uniform, hardware-friendly intermediate parameters, decoupling the various operations on the network parameters from the hardware and thereby resolving problems such as software code redundancy and dependency-library conflicts caused by supporting multiple frameworks. Because data is written to the hardware during the FPGA preprocessing stage, no communication between the host and the FPGA is required during execution, and no host-FPGA communication pressure exists.
Description
Technical Field
The present invention relates to the field of artificial intelligence technology, and in particular, to a parameter compiling method, apparatus, and computer-readable storage medium.
Background
With the spread of artificial intelligence into fields such as agriculture, finance, security, health care, and manufacturing, there is a growing demand for algorithms that run faster and more accurately while consuming less power.
Across the wide variety of deep learning frameworks, an algorithm developer may use several frameworks for development. The workload in each framework is represented and executed in its own unique manner, so even a simple Convolution operation may need to be defined in a different way for each framework. Supporting multiple frameworks therefore requires the hardware-facing software to be adapted to each different operation, which can lead to a bloated software design and reduce the efficiency of algorithm execution.
It can be seen that how to reduce the software code redundancy brought about by the various frameworks is a problem to be solved by those skilled in the art.
Disclosure of Invention
Embodiments of the present invention provide a parameter compiling method, apparatus, and computer-readable storage medium, which can reduce software code redundancy caused by various frameworks.
To solve the above technical problem, an embodiment of the present invention provides a parameter compiling method, including:
extracting network parameters contained in each model file;
converting each network parameter into a corresponding intermediate parameter according to a preset parameter specification; the parameter specification comprises size information, weight information and a context operation serial number;
allocating a corresponding memory address to each intermediate parameter according to the size information, the weight information, and the context operation serial number corresponding to that intermediate parameter;
and storing each intermediate parameter and the corresponding memory address thereof to a preset storage space according to a set mode.
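The four claimed steps can be read as a single pipeline. The sketch below is illustrative only: the stage callables (`extract`, `convert`, `allocate`, `store`) are hypothetical placeholders for the four steps, not names taken from the patent.

```python
def compile_parameters(model_files, spec, extract, convert, allocate, store):
    """Run the four claimed steps, with each stage injected as a callable."""
    # Step 1: extract network parameters from each model file.
    params = [p for f in model_files for p in extract(f)]
    # Step 2: convert each network parameter into an intermediate parameter.
    inters = [convert(p, spec) for p in params]
    # Step 3: allocate a memory address to each intermediate parameter.
    addrs = allocate(inters)
    # Step 4: store the intermediate parameters and their addresses.
    return store(inters, addrs)
```

The stages are injected so each can be swapped per framework without touching the pipeline itself.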
Optionally, the extracting of the network parameters included in each model file comprises:
identifying the framework type corresponding to each model file according to the loaded model parameters;
and parsing the corresponding network parameters from each model file according to the network structure corresponding to each framework type.
Optionally, the allocating a corresponding memory address to each intermediate parameter according to the size information, the weight information, and the context operation serial number corresponding to each intermediate parameter includes:
sorting each intermediate parameter according to the context operation serial number corresponding to each intermediate parameter;
calculating the input/output size and the weight size of each intermediate parameter according to the size information and the weight information corresponding to each intermediate parameter;
and setting a corresponding memory address for each sorted intermediate parameter according to the input/output size and the weight size of each intermediate parameter.
Optionally, the storing each intermediate parameter and the memory address corresponding to the intermediate parameter in a preset storage space according to a set manner includes:
querying a pre-established binary instruction set to obtain binary instruction streams corresponding to the intermediate parameters and the memory addresses; wherein each intermediate parameter has a hierarchy to which it belongs;
and sequentially writing the binary instruction streams corresponding to each layer into the bin file.
Optionally, after sequentially writing the binary instruction streams corresponding to the respective levels into the bin file, the method further includes:
and when a parameter calling instruction is acquired, calling a binary instruction stream matched with the hierarchy identifier carried by the parameter calling instruction from the bin file.
The embodiment of the invention also provides a parameter compiling device which comprises an extracting unit, a converting unit, a distributing unit and a storing unit;
the extraction unit is used for extracting the network parameters contained in each model file;
the conversion unit is used for converting each network parameter into a corresponding intermediate parameter according to a preset parameter specification; the parameter specification comprises size information, weight information and a context operation serial number;
the distribution unit is used for distributing corresponding memory addresses for the intermediate parameters according to the size information, the weight information and the context operation serial number corresponding to the intermediate parameters;
and the storage unit is used for storing each intermediate parameter and the corresponding memory address thereof to a preset storage space according to a set mode.
Optionally, the extraction unit comprises an identification subunit and a parsing subunit;
the identification subunit is used for identifying the framework type corresponding to each model file according to the loaded model parameters;
and the parsing subunit is used for parsing the corresponding network parameters from each model file according to the network structure corresponding to each framework type.
Optionally, the allocation unit includes a sorting subunit, a calculating subunit, and a setting subunit;
the sorting subunit is configured to sort each intermediate parameter according to the context operation serial number corresponding to each intermediate parameter;
the calculation subunit is configured to calculate an input/output size and a weight size of each intermediate parameter according to the size information and the weight information corresponding to each intermediate parameter;
and the setting subunit is used for setting corresponding memory addresses for each sorted intermediate parameter according to the input and output size and the weight size of each intermediate parameter.
Optionally, the storage unit includes an inquiry subunit and a write subunit;
the query subunit is configured to query a pre-established binary instruction set to obtain binary instruction streams corresponding to the intermediate parameters and the memory addresses; wherein each intermediate parameter has a hierarchy to which it belongs;
and the writing subunit is used for sequentially writing the binary instruction streams corresponding to each level into the bin file.
Optionally, the system further comprises a calling unit;
and the calling unit is used for calling a binary instruction stream matched with the hierarchy identifier carried by the parameter calling instruction from the bin file when the parameter calling instruction is acquired.
An embodiment of the present invention further provides a parameter compiling apparatus, including:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of the parameter compilation method as described in any one of the above.
An embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements the steps of the parameter compiling method according to any one of the above items.
According to the above technical scheme, the network parameters contained in each model file are extracted, and each network parameter is converted into a corresponding intermediate parameter according to a preset parameter specification, the parameter specification comprising size information, weight information, and a context operation serial number. By converting the network parameters of each model file, model files from different frameworks can be converted into uniform, hardware-friendly intermediate parameters, decoupling the various operations on the network parameters from the hardware; the converted intermediate parameters allow multiple deep learning frameworks to run their various operations on a self-designed FPGA. To facilitate subsequent calling, a corresponding memory address is allocated to each intermediate parameter according to its size information, weight information, and context operation serial number, and each intermediate parameter and its memory address are stored to a preset storage space in a set manner. This scheme resolves problems such as software code redundancy and dependency-library conflicts caused by supporting multiple frameworks. Moreover, in contrast to the pipelined operation of a traditional deep learning compiler, data is written to the hardware during the FPGA preprocessing stage, so no host-FPGA communication is required during execution and no host-FPGA communication pressure exists.
Drawings
In order to illustrate the embodiments of the present invention more clearly, the drawings required by the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention; other drawings may be obtained from them by those skilled in the art without inventive effort.
Fig. 1 is a flowchart of a parameter compiling method according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a parameter compiling apparatus according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a hardware structure of a parameter compiling apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without any creative work belong to the protection scope of the present invention.
In order that those skilled in the art will better understand the disclosure, the invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Next, a parameter compiling method provided by the embodiment of the invention is described in detail. Fig. 1 is a flowchart of a parameter compiling method according to an embodiment of the present invention, where the method includes:
s101: and extracting the network parameters contained in each model file.
In the embodiment of the invention, the framework type corresponding to each model file can be identified according to the loaded model parameters.
The model parameters may include the name of the model, address information of the model, and the like.
Common framework types include TensorFlow, Caffe, and MXNet. Because the model file under each framework type has a distinctive name, in the embodiment of the invention the model file can be obtained according to the address information of the loaded model, and its framework type can be determined from the name of the loaded model.
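As a rough illustration of this identification step, a loader might map model-file extensions to framework types. The extension-to-framework table below is an assumption for illustration; the patent does not specify the exact naming convention used.

```python
import os

# Typical model-file extensions per framework (illustrative mapping only).
FRAMEWORK_EXTENSIONS = {
    ".pb": "tensorflow",
    ".caffemodel": "caffe",
    ".params": "mxnet",
}

def identify_framework(model_path: str) -> str:
    """Infer the framework type from the loaded model file's name."""
    _, ext = os.path.splitext(model_path)
    try:
        return FRAMEWORK_EXTENSIONS[ext]
    except KeyError:
        raise ValueError(f"unrecognized model file extension: {ext!r}")
```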
In a concrete implementation, in order to extract the network parameters more accurately and effectively, the corresponding network parameters can be parsed from each model file according to the network structure corresponding to each framework type.
S102: and converting each network parameter into a corresponding intermediate parameter according to a preset parameter specification.
The parameter specification may include size information, weight information, and a context operation number.
Network parameters are the parameters required to implement the various operations under a given network framework. In the embodiment of the present invention, the network parameters under different frameworks can be converted into intermediate parameters of a unified form according to the preset parameter specification, so that parameters from different framework types become framework-agnostic parameters better suited to the hardware end, i.e., a Field-Programmable Gate Array (FPGA).
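A minimal sketch of what such a unified intermediate parameter might look like, assuming the three items named by the parameter specification (size information, weight information, context operation serial number) plus a hierarchy tag. The concrete field names and layout are illustrative, not taken from the patent.

```python
from dataclasses import dataclass

@dataclass
class IntermediateParam:
    op_seq: int           # context operation serial number
    in_shape: tuple       # input feature-map size, e.g. (H, W, C)
    out_shape: tuple      # output feature-map size
    weight_shape: tuple   # weight data size and I/O channel counts
    layer_type: str       # hierarchy: conv / activation / pool / fc

def convert(network_param: dict) -> IntermediateParam:
    """Map a framework-specific parameter dict onto the unified form."""
    return IntermediateParam(
        op_seq=network_param["seq"],
        in_shape=tuple(network_param["input"]),
        out_shape=tuple(network_param["output"]),
        weight_shape=tuple(network_param["weights"]),
        layer_type=network_param["type"],
    )
```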
S103: and distributing corresponding memory addresses for the intermediate parameters according to the size information, the weight information and the context operation serial number corresponding to the intermediate parameters.
The context operation serial number reflects the execution order of the network parameters. The size information may include the input feature-map size and the output feature-map size. The weight information may include the size of the weight data and the numbers of input and output channels.
In the embodiment of the invention, the intermediate parameters can be sorted according to their corresponding context operation serial numbers; the input/output size and weight size of each intermediate parameter can be calculated from its size information and weight information; and corresponding memory addresses can then be set for the sorted intermediate parameters according to their input/output sizes and weight sizes.
Each intermediate parameter has its own hierarchy. Wherein, the hierarchy can include a convolutional layer, an active layer, a pooling layer, a fully connected layer, and the like.
The hierarchy to which each intermediate parameter belongs can be distinguished according to the context operation serial number. When setting memory addresses for the intermediate parameters, the hierarchy to which they belong can be taken as the processing unit, and a memory address set for each hierarchy. The memory space occupied by a hierarchy can be determined from the input/output sizes and weight sizes of the intermediate parameters it contains, and memory addresses can then be set for the hierarchies in sequence, according to their arrangement order and the memory space each needs to occupy.
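The address-setting logic described above can be sketched as follows, under two illustrative assumptions not stated in the patent: each layer's footprint is the sum of its input, output, and weight element counts, and each element occupies 4 bytes.

```python
def allocate_addresses(params, base=0x0, elem_bytes=4):
    """Sort layers by context operation serial number and assign contiguous
    base addresses; returns {op_seq: (address, size_in_bytes)}."""
    addresses = {}
    cursor = base
    for p in sorted(params, key=lambda p: p["seq"]):
        # Footprint = input + output + weight element counts.
        elems = p["in_elems"] + p["out_elems"] + p["weight_elems"]
        size = elems * elem_bytes
        addresses[p["seq"]] = (cursor, size)
        cursor += size  # next layer starts right after this one
    return addresses
```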
S104: and storing each intermediate parameter and the corresponding memory address thereof to a preset storage space according to a set mode.
In the embodiment of the invention, in order to facilitate the hardware end to rapidly identify the intermediate parameters, each intermediate parameter and the corresponding memory address thereof can be converted into a binary code form for storage.
In particular implementations, a binary instruction set may be pre-established. The binary instruction set contains binary codes corresponding to different parameters.
After the intermediate parameters and the memory addresses corresponding to the intermediate parameters are obtained, a pre-established binary instruction set can be queried to obtain a binary instruction stream corresponding to each intermediate parameter and memory address.
Considering that each intermediate parameter has its corresponding hierarchy, in practical application, the binary instruction streams corresponding to the respective hierarchies may be written into the bin file in sequence. In practical application, when a parameter calling instruction is acquired, a binary instruction stream matched with the hierarchy identifier carried by the parameter calling instruction can be called from the bin file.
Converting each operation parameter into a binary instruction stream laid out according to the binary instruction set allows the hardware end to identify the parameters quickly and conveniently, reduces the data conversion that the hardware end must perform, and effectively improves the efficiency of algorithm execution.
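As an illustration of writing the binary instruction streams to a bin file, the sketch below packs each (hierarchy, sequence number, address, size) record into a fixed-width little-endian layout. The 11-byte record format and the per-hierarchy opcodes are assumptions for illustration, not the patent's actual binary instruction set.

```python
import struct

# Hypothetical opcodes per hierarchy level (illustrative values).
LEVEL_OPCODE = {"conv": 0x01, "activation": 0x02, "pool": 0x03, "fc": 0x04}

def write_bin(records, path):
    """Append one fixed-width record per (level, op_seq, address, size)
    tuple to the bin file, in the given (already sorted) order."""
    with open(path, "wb") as f:
        for level, seq, addr, size in records:
            # <BHII: 1-byte opcode, 2-byte sequence number,
            # 4-byte address, 4-byte size -> 11 bytes, little-endian.
            f.write(struct.pack("<BHII", LEVEL_OPCODE[level], seq, addr, size))
```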
According to the above technical scheme, the network parameters contained in each model file are extracted, and each network parameter is converted into a corresponding intermediate parameter according to a preset parameter specification, the parameter specification comprising size information, weight information, and a context operation serial number. By converting the network parameters of each model file, model files from different frameworks can be converted into uniform, hardware-friendly intermediate parameters, decoupling the various operations on the network parameters from the hardware; the converted intermediate parameters allow multiple deep learning frameworks to run their various operations on a self-designed FPGA. To facilitate subsequent calling, a corresponding memory address is allocated to each intermediate parameter according to its size information, weight information, and context operation serial number, and each intermediate parameter and its memory address are stored to a preset storage space in a set manner. This scheme resolves problems such as software code redundancy and dependency-library conflicts caused by supporting multiple frameworks. Moreover, in contrast to the pipelined operation of a traditional deep learning compiler, data is written to the hardware during the FPGA preprocessing stage, so no host-FPGA communication is required during execution and no host-FPGA communication pressure exists.
Fig. 2 is a schematic structural diagram of a parameter compiling apparatus according to an embodiment of the present invention, including an extracting unit 21, a converting unit 22, an allocating unit 23, and a storing unit 24;
an extracting unit 21, configured to extract network parameters included in each model file;
a conversion unit 22, configured to convert each network parameter into a corresponding intermediate parameter according to a preset parameter specification; the parameter specification comprises size information, weight information and a context operation serial number;
the allocating unit 23 is configured to allocate a corresponding memory address to each intermediate parameter according to the size information, the weight information, and the context operation serial number corresponding to each intermediate parameter;
the storage unit 24 is configured to store each intermediate parameter and the corresponding memory address thereof in a preset storage space according to a set manner.
Optionally, the extraction unit comprises an identification subunit and a parsing subunit;
the identification subunit is used for identifying the framework type corresponding to each model file according to the loaded model parameters;
and the parsing subunit is used for parsing the corresponding network parameters from each model file according to the network structure corresponding to each framework type.
Optionally, the allocation unit includes a sorting subunit, a calculating subunit, and a setting subunit;
the sorting subunit is used for sorting each intermediate parameter according to the context operation serial number corresponding to each intermediate parameter;
the calculation subunit is used for calculating the input/output size and the weight size of each intermediate parameter according to the size information and the weight information corresponding to each intermediate parameter;
and the setting subunit is used for setting corresponding memory addresses for the sorted intermediate parameters according to the input/output sizes and the weight sizes of the intermediate parameters.
Optionally, the storage unit comprises an inquiry subunit and a write subunit;
the query subunit is used for querying a pre-established binary instruction set to acquire the binary instruction streams corresponding to the intermediate parameters and the memory addresses; wherein each intermediate parameter has the hierarchy to which it belongs;
and the writing subunit is used for sequentially writing the binary instruction streams corresponding to each level into the bin file.
Optionally, the system further comprises a calling unit;
and the calling unit is used for calling the binary instruction stream matched with the hierarchy identifier carried by the parameter calling instruction from the bin file when the parameter calling instruction is acquired.
The description of the features in the embodiment corresponding to fig. 2 may refer to the related description of the embodiment corresponding to fig. 1, and is not repeated here.
According to the above technical scheme, the network parameters contained in each model file are extracted, and each network parameter is converted into a corresponding intermediate parameter according to a preset parameter specification, the parameter specification comprising size information, weight information, and a context operation serial number. By converting the network parameters of each model file, model files from different frameworks can be converted into uniform, hardware-friendly intermediate parameters, decoupling the various operations on the network parameters from the hardware; the converted intermediate parameters allow multiple deep learning frameworks to run their various operations on a self-designed FPGA. To facilitate subsequent calling, a corresponding memory address is allocated to each intermediate parameter according to its size information, weight information, and context operation serial number, and each intermediate parameter and its memory address are stored to a preset storage space in a set manner. This scheme resolves problems such as software code redundancy and dependency-library conflicts caused by supporting multiple frameworks. Moreover, in contrast to the pipelined operation of a traditional deep learning compiler, data is written to the hardware during the FPGA preprocessing stage, so no host-FPGA communication is required during execution and no host-FPGA communication pressure exists.
Fig. 3 is a schematic diagram of a hardware structure of a parameter compiling apparatus 30 according to an embodiment of the present invention, including:
a memory 31 for storing a computer program;
a processor 32 for executing a computer program to implement the steps of any of the parameter compilation methods described above.
The embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, and when being executed by a processor, the computer program implements the steps of any one of the parameter compiling methods.
The present invention provides a method, an apparatus and a computer-readable storage medium for parameter compilation. The embodiments are described in a progressive manner in the specification, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Claims (8)
1. A method of parameter compilation, comprising:
extracting network parameters contained in each model file;
converting each network parameter into a corresponding intermediate parameter according to a preset parameter specification; the parameter specification comprises size information, weight information and a context operation serial number;
allocating a corresponding memory address to each intermediate parameter according to the size information, the weight information, and the context operation serial number corresponding to that intermediate parameter;
storing each intermediate parameter and the corresponding memory address thereof to a preset storage space according to a set mode;
the allocating a corresponding memory address to each intermediate parameter according to the size information, the weight information, and the context operation serial number corresponding to each intermediate parameter includes:
sorting each intermediate parameter according to the context operation serial number corresponding to each intermediate parameter;
calculating the input/output size and the weight size of each intermediate parameter according to the size information and the weight information corresponding to each intermediate parameter;
and setting a corresponding memory address for each sorted intermediate parameter according to the input/output size and the weight size of each intermediate parameter.
2. The parameter compiling method according to claim 1, wherein the extracting the network parameters included in each model file comprises:
identifying the framework type corresponding to each model file according to the loaded model parameters;
and parsing the corresponding network parameters from each model file according to the network structure corresponding to each framework type.
3. The parameter compiling method according to claim 1, wherein the storing each intermediate parameter and the memory address corresponding thereto in a preset storage space according to a set manner comprises:
querying a pre-established binary instruction set to obtain binary instruction streams corresponding to the intermediate parameters and the memory addresses; wherein each intermediate parameter has a hierarchy to which it belongs;
and sequentially writing the binary instruction streams corresponding to the levels into the bin file.
4. The parameter compiling method according to claim 3, further comprising, after the sequentially writing the binary instruction streams corresponding to the respective levels into the bin file:
and when a parameter calling instruction is acquired, calling a binary instruction stream matched with the hierarchy identifier carried by the parameter calling instruction from the bin file.
5. A parameter compiling device, characterized by comprising an extraction unit, a conversion unit, an allocation unit and a storage unit;
the extraction unit is used for extracting the network parameters contained in each model file;
the conversion unit is used for converting each network parameter into a corresponding intermediate parameter according to a preset parameter specification; the parameter specification comprises size information, weight information and a context operation serial number;
the allocation unit is used for allocating a corresponding memory address to each intermediate parameter according to the size information, the weight information and the context operation serial number corresponding to each intermediate parameter;
the storage unit is used for storing each intermediate parameter and its corresponding memory address to a preset storage space in a set manner;
the allocation unit comprises a sorting subunit, a calculation subunit and a setting subunit;
the sorting subunit is configured to sort the intermediate parameters according to the context operation serial number corresponding to each intermediate parameter;
the calculation subunit is configured to calculate the input/output size and the weight size of each intermediate parameter according to the size information and the weight information corresponding to each intermediate parameter;
and the setting subunit is configured to set a corresponding memory address for each sorted intermediate parameter according to the input/output size and the weight size of each intermediate parameter.
6. The device according to claim 5, wherein the extraction unit comprises an identification subunit and a parsing subunit;
the identification subunit is used for identifying the framework type corresponding to each model file according to the loaded model parameters;
and the parsing subunit is used for parsing the corresponding network parameters from each model file according to the network structure corresponding to each framework type.
7. A parameter compiling apparatus, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of the parameter compilation method as claimed in any one of claims 1 to 4.
8. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the parameter compilation method according to any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010604992.4A CN111857723B (en) | 2020-06-29 | 2020-06-29 | Parameter compiling method and device and computer readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111857723A CN111857723A (en) | 2020-10-30 |
CN111857723B true CN111857723B (en) | 2022-06-17 |
Family
ID=72989194
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010604992.4A Active CN111857723B (en) | 2020-06-29 | 2020-06-29 | Parameter compiling method and device and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111857723B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114116051B (en) * | 2021-11-17 | 2024-03-22 | 招联消费金融股份有限公司 | Processing method, device, equipment and storage medium based on neural network model |
CN115981666B (en) * | 2023-03-21 | 2023-07-21 | 北京探境科技有限公司 | Neural network information integration method, device, system and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106650922A (en) * | 2016-09-29 | 2017-05-10 | 清华大学 | Hardware neural network conversion method, computing device, compiling method and neural network software and hardware collaboration system |
CN110795165A (en) * | 2019-10-12 | 2020-02-14 | 苏州浪潮智能科技有限公司 | Neural network model data loading method and related device |
CN111240640A (en) * | 2020-01-21 | 2020-06-05 | 苏州浪潮智能科技有限公司 | Data quantization method and device based on hardware environment and readable storage medium |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10592213B2 (en) * | 2016-10-19 | 2020-03-17 | Intel Corporation | Preprocessing tensor operations for optimal compilation |
KR102564456B1 (en) * | 2017-10-19 | 2023-08-07 | 삼성전자주식회사 | Method and apparatus for quantizing parameter of neural network |
US11468338B2 (en) * | 2018-09-11 | 2022-10-11 | Apple Inc. | Compiling models for dedicated hardware |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111813963B (en) | Knowledge graph construction method and device, electronic equipment and storage medium | |
US9619430B2 (en) | Active non-volatile memory post-processing | |
CN111857723B (en) | Parameter compiling method and device and computer readable storage medium | |
CN105677812A (en) | Method and device for querying data | |
JP2010524060A (en) | Data merging in distributed computing | |
Gómez et al. | Map-based transparent persistence for very large models | |
CN104067282A (en) | Counter operation in a state machine lattice | |
CN106557307B (en) | Service data processing method and system | |
CN110008192A (en) | A kind of data file compression method, apparatus, equipment and readable storage medium storing program for executing | |
CN112667860A (en) | Sub-graph matching method, device, equipment and storage medium | |
CN108153587A (en) | A kind of slow task reason detection method for big data platform | |
Shi et al. | A case study of tuning MapReduce for efficient Bioinformatics in the cloud | |
Risco-Martin et al. | A methodology to automatically optimize dynamic memory managers applying grammatical evolution | |
Bei et al. | MEST: A model-driven efficient searching approach for MapReduce self-tuning | |
Löhnertz et al. | Steinmetz: Toward Automatic Decomposition of Monolithic Software Into Microservices. | |
CN110908870A (en) | Resource monitoring method and device for mainframe, storage medium and equipment | |
CN112069052A (en) | Abnormal object detection method, device, equipment and storage medium | |
CN113344023A (en) | Code recommendation method, device and system | |
CN113778961A (en) | Production management method, device and system for CIM model data | |
CN105573763A (en) | Embedded system modeling method supporting RTOS | |
CN113268485A (en) | Data table association analysis method, device, equipment and storage medium | |
CN112347101A (en) | Tag data storage method, computer device, and storage medium | |
CN110287241B (en) | Method and device for generating alarm data report | |
CN108694041A (en) | Data transfer device, device and service terminal | |
Ediger et al. | Computational graph analytics for massive streaming data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||