CN111857723A - Parameter compiling method and device and computer readable storage medium

Parameter compiling method and device and computer readable storage medium

Info

Publication number
CN111857723A
Authority
CN
China
Prior art keywords
parameter
parameters
subunit
network
size
Prior art date
Legal status
Granted
Application number
CN202010604992.4A
Other languages
Chinese (zh)
Other versions
CN111857723B (en)
Inventor
曹其春
董刚
尹文枫
梁玲燕
Current Assignee
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd filed Critical Inspur Electronic Information Industry Co Ltd
Priority to CN202010604992.4A priority Critical patent/CN111857723B/en
Publication of CN111857723A publication Critical patent/CN111857723A/en
Application granted granted Critical
Publication of CN111857723B publication Critical patent/CN111857723B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00: Arrangements for software engineering
    • G06F8/40: Transformation of program code
    • G06F8/41: Compilation

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

Embodiments of the invention disclose a parameter compiling method, apparatus, and medium. Network parameters contained in each model file are extracted; each network parameter is converted into a corresponding intermediate parameter according to a preset parameter specification; corresponding memory addresses are allocated to the intermediate parameters according to the size information, weight information, and context operation serial number of each intermediate parameter; and each intermediate parameter and its memory address are stored in a preset storage space in a set manner. By converting the network parameters of each model file, model files from different frameworks can be converted into unified, hardware-friendly intermediate parameters, decoupling the various operations on network parameters from the hardware and thereby resolving the software code redundancy, dependency-library conflicts, and similar problems caused by supporting multiple frameworks. Because data is written to the hardware during the FPGA preprocessing stage, no communication between the host and the FPGA is required during execution, so there is no host-FPGA communication pressure.

Description

Parameter compiling method and device and computer readable storage medium
Technical Field
The present invention relates to the field of artificial intelligence technology, and in particular, to a parameter compiling method, apparatus, and computer-readable storage medium.
Background
As artificial intelligence spreads into fields such as agriculture, finance, security, health care, and manufacturing, algorithms are expected to compute faster and more accurately while consuming less power.
Deep learning frameworks are numerous, and an algorithm developer may develop with several of them. Each framework represents and executes its workload in its own way, so even a simple operation such as convolution may need to be defined differently in each framework. Supporting multiple frameworks therefore requires the hardware-side software to be adapted to each distinct operation, which can bloat the software design and reduce the efficiency of algorithm execution.
It can be seen that reducing the software code redundancy caused by supporting multiple frameworks is a problem to be solved by those skilled in the art.
Disclosure of Invention
Embodiments of the present invention provide a parameter compiling method, apparatus, and computer-readable storage medium, which can reduce software code redundancy caused by various frameworks.
To solve the above technical problem, an embodiment of the present invention provides a parameter compiling method, including:
extracting network parameters contained in each model file;
converting each network parameter into a corresponding intermediate parameter according to a preset parameter specification; the parameter specification comprises size information, weight information and a context operation serial number;
allocating corresponding memory addresses to the intermediate parameters according to the size information, the weight information, and the context operation serial number corresponding to each intermediate parameter;
and storing each intermediate parameter and its corresponding memory address in a preset storage space in a set manner.
Optionally, the extracting network parameters included in each model file includes:
identifying the framework type corresponding to each model file according to the loaded model parameters;
and parsing the corresponding network parameters from each model file according to the network structure corresponding to each framework type.
Optionally, the allocating a corresponding memory address to each intermediate parameter according to the size information, the weight information, and the context operation serial number corresponding to each intermediate parameter includes:
sorting each intermediate parameter according to the context operation serial number corresponding to each intermediate parameter;
calculating the input/output size and the weight size of each intermediate parameter according to the size information and the weight information corresponding to each intermediate parameter;
and setting corresponding memory addresses for the sorted intermediate parameters according to the input/output size and the weight size of each intermediate parameter.
Optionally, the storing each intermediate parameter and the memory address corresponding to the intermediate parameter in a preset storage space according to a set manner includes:
querying a pre-established binary instruction set to obtain binary instruction streams corresponding to the intermediate parameters and the memory addresses; wherein each intermediate parameter has a layer to which it belongs;
and sequentially writing the binary instruction streams corresponding to each layer into the bin file.
Optionally, after sequentially writing the binary instruction streams corresponding to the respective layers into the bin file, the method further includes:
when a parameter calling instruction is acquired, calling from the bin file the binary instruction stream matching the layer identifier carried by the parameter calling instruction.
The embodiment of the invention also provides a parameter compiling device, which comprises an extraction unit, a conversion unit, an allocation unit, and a storage unit;
the extraction unit is used for extracting the network parameters contained in each model file;
the conversion unit is used for converting each network parameter into a corresponding intermediate parameter according to a preset parameter specification; the parameter specification comprises size information, weight information and a context operation serial number;
the allocation unit is used for allocating corresponding memory addresses to the intermediate parameters according to the size information, the weight information, and the context operation serial number corresponding to each intermediate parameter;
and the storage unit is used for storing each intermediate parameter and its corresponding memory address in a preset storage space in a set manner.
Optionally, the extraction unit comprises an identification subunit and a parsing subunit;
the identification subunit is used for identifying the framework type corresponding to each model file according to the loaded model parameters;
and the parsing subunit is used for parsing the corresponding network parameters from each model file according to the network structure corresponding to each framework type.
Optionally, the allocation unit includes a sorting subunit, a calculating subunit, and a setting subunit;
the sorting subunit is configured to sort each intermediate parameter according to the context operation serial number corresponding to each intermediate parameter;
the calculating subunit is configured to calculate the input/output size and the weight size of each intermediate parameter according to the size information and the weight information corresponding to each intermediate parameter;
and the setting subunit is configured to set corresponding memory addresses for the sorted intermediate parameters according to the input/output size and the weight size of each intermediate parameter.
Optionally, the storage unit includes a query subunit and a writing subunit;
the query subunit is configured to query a pre-established binary instruction set to obtain binary instruction streams corresponding to the intermediate parameters and the memory addresses; wherein each intermediate parameter has a layer to which it belongs;
and the writing subunit is configured to sequentially write the binary instruction streams corresponding to each layer into the bin file.
Optionally, the device further comprises a calling unit;
the calling unit is used for calling, when a parameter calling instruction is acquired, the binary instruction stream matching the layer identifier carried by the parameter calling instruction from the bin file.
An embodiment of the present invention further provides a parameter compiling apparatus, including:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of the parameter compilation method as described in any one of the above.
An embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements the steps of the parameter compiling method according to any one of the above items.
According to the above technical solution, the network parameters contained in each model file are extracted, and each network parameter is converted into a corresponding intermediate parameter according to a preset parameter specification comprising size information, weight information, and a context operation serial number. Through this conversion, model files from different frameworks are turned into unified, hardware-friendly intermediate parameters, the various operations on network parameters are decoupled from the hardware, and the converted intermediate parameters allow multiple deep learning frameworks to run their operations on a custom-designed FPGA. To facilitate subsequent calls, corresponding memory addresses are allocated to the intermediate parameters according to each parameter's size information, weight information, and context operation serial number, and each intermediate parameter and its memory address are stored in a preset storage space in a set manner. This solution resolves the software code redundancy, dependency-library conflicts, and similar problems caused by supporting multiple frameworks. Moreover, unlike the pipelined operation of a conventional deep learning compiler, data is written to the hardware during the FPGA preprocessing stage, so no host-FPGA communication is required during execution and there is no host-FPGA communication pressure.
Drawings
In order to illustrate the embodiments of the present invention more clearly, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained by those skilled in the art without inventive effort.
Fig. 1 is a flowchart of a parameter compiling method according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a parameter compiling apparatus according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a hardware structure of a parameter compiling apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without any creative work belong to the protection scope of the present invention.
In order that those skilled in the art will better understand the disclosure, the invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Next, a parameter compiling method provided by the embodiment of the invention is described in detail. Fig. 1 is a flowchart of a parameter compiling method according to an embodiment of the present invention, where the method includes:
s101: and extracting the network parameters contained in each model file.
In the embodiment of the invention, the framework type corresponding to each model file can be identified according to the loaded model parameters.
The model parameters may include the name of the model, the address information of the model, and the like.
Common framework types include TensorFlow, Caffe, MXNet, and the like. The model file under each framework type has a distinctive name, so in the embodiment of the invention the model file can be obtained according to the address information of the loaded model, and its framework type can be determined according to the name of the loaded model.
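As a rough sketch of this identification step, the framework type could be inferred from the loaded model's name and address as follows; the suffix-to-framework map and the function name are illustrative assumptions, not the patent's implementation.

```python
# Illustrative sketch only: infer the framework type from the model file's
# name (the mapping below is an assumption, not the patent's actual rule).
import os

FRAMEWORK_BY_SUFFIX = {
    ".pb": "tensorflow",      # TensorFlow frozen graph
    ".caffemodel": "caffe",   # Caffe weights
    ".params": "mxnet",       # MXNet weights
}

def identify_framework(model_path: str) -> str:
    """Infer the framework type of a loaded model file from its name."""
    suffix = os.path.splitext(model_path)[1].lower()
    if suffix not in FRAMEWORK_BY_SUFFIX:
        raise ValueError(f"unrecognized model file: {model_path}")
    return FRAMEWORK_BY_SUFFIX[suffix]
```

The parsing step that follows can then dispatch on the returned framework type to walk the corresponding network structure.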
In a concrete implementation, to extract the network parameters more accurately and effectively, the corresponding network parameters can be parsed from each model file according to the network structure associated with each framework type.
S102: converting each network parameter into a corresponding intermediate parameter according to a preset parameter specification.
The parameter specification may include size information, weight information, and a context operation number.
Network parameters are the parameters required to implement the various operations under a network framework. In the embodiment of the present invention, the network parameters from different frameworks can be converted into intermediate parameters of a unified form according to the preset parameter specification, so that framework-specific parameters become framework-agnostic parameters better suited to the hardware side, i.e., a Field-Programmable Gate Array (FPGA).
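For illustration, the unified intermediate form might be modeled as a small record carrying exactly the fields the parameter specification names; the class and field names below are assumptions rather than the patent's notation.

```python
# Illustrative sketch of the unified intermediate-parameter record (S102);
# the fields mirror the parameter specification: context operation serial
# number, size information, and weight information.
from dataclasses import dataclass
from typing import Tuple

@dataclass
class IntermediateParam:
    op_serial: int                     # context operation serial number
    layer_type: str                    # e.g. "conv", "relu", "pool", "fc"
    input_size: Tuple[int, int, int]   # input feature-map size (C, H, W)
    output_size: Tuple[int, int, int]  # output feature-map size (C, H, W)
    weight_bytes: int                  # size of the weight data in bytes
    in_channels: int                   # weight information: input channels
    out_channels: int                  # weight information: output channels

# Example: a 3x3 convolution parsed from any framework maps to the same record.
conv0 = IntermediateParam(
    op_serial=0, layer_type="conv",
    input_size=(3, 224, 224), output_size=(64, 112, 112),
    weight_bytes=64 * 3 * 3 * 3 * 2,   # out_ch * in_ch * k * k * 2-byte words
    in_channels=3, out_channels=64,
)
```

Because every framework's parameters collapse into this one record type, the later allocation and serialization stages never need to know which framework a model came from.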
S103: allocating corresponding memory addresses to the intermediate parameters according to the size information, the weight information, and the context operation serial number corresponding to each intermediate parameter.
The context operation serial number reflects the ordering of the network parameters. The size information may include the input feature-map size and the output feature-map size. The weight information may include the size of the weight data and the numbers of input and output channels.
In the embodiment of the invention, the intermediate parameters can be sorted according to their context operation serial numbers; the input/output size and the weight size of each intermediate parameter can be calculated from its size information and weight information; and corresponding memory addresses can then be set for the sorted intermediate parameters according to their input/output sizes and weight sizes.
Each intermediate parameter belongs to a layer, which may be a convolutional layer, an activation layer, a pooling layer, a fully connected layer, or the like.
The layer to which each intermediate parameter belongs can be distinguished according to the context operation serial number. When setting memory addresses for the intermediate parameters, the layer can be taken as the processing unit and a memory address set per layer. The memory space occupied by a layer can be determined from the input/output sizes and weight sizes of the intermediate parameters it contains, and memory addresses can then be set for the layers in sequence according to their order and the memory space each requires.
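To make the allocation concrete, the following sketch (reusing the assumed IntermediateParam record above) sorts the parameters by context operation serial number, computes each parameter's footprint from its feature-map and weight sizes, and assigns base addresses from a running offset; the 2-byte element width and the helper names are assumptions.

```python
# Illustrative sketch of S103: sort by context operation serial number and
# lay the parameters out back to back, so each layer's memory region directly
# follows the previous layer's.
from math import prod

def param_bytes(p, elem_bytes=2):
    """Footprint of one parameter (an IntermediateParam from the sketch
    above): input feature map + output feature map + weight data."""
    return (prod(p.input_size) + prod(p.output_size)) * elem_bytes + p.weight_bytes

def allocate_addresses(params):
    """Return {op_serial: base_address} in execution order."""
    addresses, offset = {}, 0
    for p in sorted(params, key=lambda q: q.op_serial):
        addresses[p.op_serial] = offset
        offset += param_bytes(p)
    return addresses
```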
S104: storing each intermediate parameter and its corresponding memory address in a preset storage space in a set manner.
In the embodiment of the invention, to allow the hardware side to identify the intermediate parameters quickly, each intermediate parameter and its corresponding memory address can be converted into binary code for storage.
In particular implementations, a binary instruction set may be pre-established. The binary instruction set contains binary codes corresponding to different parameters.
After the intermediate parameters and the memory addresses corresponding to the intermediate parameters are obtained, a pre-established binary instruction set can be queried to obtain a binary instruction stream corresponding to each intermediate parameter and memory address.
Since each intermediate parameter has a corresponding layer, in practical applications the binary instruction streams corresponding to the respective layers may be written into the bin file in sequence. When a parameter calling instruction is later acquired, the binary instruction stream matching the layer identifier carried by the instruction can be called from the bin file.
Converting each operation parameter into a binary instruction stream, using the binary codes designed in the binary instruction set, lets the hardware side identify the parameters quickly, reduces the data conversion the hardware side must perform during software development, and effectively improves algorithm execution efficiency.
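As a concrete illustration of this step, the sketch below (continuing the assumed IntermediateParam record and param_bytes helper from the earlier sketches) packs each intermediate parameter and its memory address into a fixed-width binary record, writes the records in execution order into a bin file, and fetches one layer's records back by a layer identifier; the record layout and the layer-id encoding are assumptions, not the patent's binary instruction set.

```python
# Illustrative sketch of S104: serialize (parameter, address) pairs into a
# bin file and retrieve one layer's instruction stream by layer identifier.
import struct

# layer_id, op_serial, address, total_bytes, weight_bytes (all assumed fields)
RECORD = struct.Struct("<IIIII")
LAYER_IDS = {"conv": 0, "relu": 1, "pool": 2, "fc": 3}  # assumed encoding

def write_bin(path, params, addresses):
    """Write one fixed-width record per intermediate parameter, in order."""
    with open(path, "wb") as f:
        for p in sorted(params, key=lambda q: q.op_serial):
            f.write(RECORD.pack(LAYER_IDS[p.layer_type], p.op_serial,
                                addresses[p.op_serial], param_bytes(p),
                                p.weight_bytes))

def fetch_layer(path, layer_id):
    """Return the records whose layer identifier matches (cf. claim 5)."""
    out = []
    with open(path, "rb") as f:
        while chunk := f.read(RECORD.size):
            rec = RECORD.unpack(chunk)
            if rec[0] == layer_id:
                out.append(rec)
    return out
```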
According to the above technical solution, the network parameters contained in each model file are extracted, and each network parameter is converted into a corresponding intermediate parameter according to a preset parameter specification comprising size information, weight information, and a context operation serial number. Through this conversion, model files from different frameworks are turned into unified, hardware-friendly intermediate parameters, the various operations on network parameters are decoupled from the hardware, and the converted intermediate parameters allow multiple deep learning frameworks to run their operations on a custom-designed FPGA. To facilitate subsequent calls, corresponding memory addresses are allocated to the intermediate parameters according to each parameter's size information, weight information, and context operation serial number, and each intermediate parameter and its memory address are stored in a preset storage space in a set manner. This solution resolves the software code redundancy, dependency-library conflicts, and similar problems caused by supporting multiple frameworks. Moreover, unlike the pipelined operation of a conventional deep learning compiler, data is written to the hardware during the FPGA preprocessing stage, so no host-FPGA communication is required during execution and there is no host-FPGA communication pressure.
Fig. 2 is a schematic structural diagram of a parameter compiling apparatus according to an embodiment of the present invention, comprising an extraction unit 21, a conversion unit 22, an allocation unit 23, and a storage unit 24;
the extraction unit 21 is configured to extract the network parameters contained in each model file;
the conversion unit 22 is configured to convert each network parameter into a corresponding intermediate parameter according to a preset parameter specification; the parameter specification comprises size information, weight information and a context operation serial number;
the allocation unit 23 is configured to allocate corresponding memory addresses to the intermediate parameters according to the size information, the weight information, and the context operation serial number corresponding to each intermediate parameter;
and the storage unit 24 is configured to store each intermediate parameter and its corresponding memory address in a preset storage space in a set manner.
Optionally, the extraction unit comprises an identification subunit and a parsing subunit;
the identification subunit is used for identifying the framework type corresponding to each model file according to the loaded model parameters;
and the parsing subunit is used for parsing the corresponding network parameters from each model file according to the network structure corresponding to each framework type.
Optionally, the allocation unit includes a sorting subunit, a calculating subunit, and a setting subunit;
the sorting subunit is configured to sort each intermediate parameter according to the context operation serial number corresponding to each intermediate parameter;
the calculating subunit is configured to calculate the input/output size and the weight size of each intermediate parameter according to the size information and the weight information corresponding to each intermediate parameter;
and the setting subunit is configured to set corresponding memory addresses for the sorted intermediate parameters according to the input/output size and the weight size of each intermediate parameter.
Optionally, the storage unit comprises a query subunit and a writing subunit;
the query subunit is configured to query a pre-established binary instruction set to obtain binary instruction streams corresponding to the intermediate parameters and the memory addresses; wherein each intermediate parameter has a layer to which it belongs;
and the writing subunit is configured to sequentially write the binary instruction streams corresponding to each layer into the bin file.
Optionally, the device further comprises a calling unit;
the calling unit is configured to call, when a parameter calling instruction is acquired, the binary instruction stream matching the layer identifier carried by the parameter calling instruction from the bin file.
The description of the features in the embodiment corresponding to fig. 2 may refer to the related description of the embodiment corresponding to fig. 1, and is not repeated here.
According to the above technical solution, the network parameters contained in each model file are extracted, and each network parameter is converted into a corresponding intermediate parameter according to a preset parameter specification comprising size information, weight information, and a context operation serial number. Through this conversion, model files from different frameworks are turned into unified, hardware-friendly intermediate parameters, the various operations on network parameters are decoupled from the hardware, and the converted intermediate parameters allow multiple deep learning frameworks to run their operations on a custom-designed FPGA. To facilitate subsequent calls, corresponding memory addresses are allocated to the intermediate parameters according to each parameter's size information, weight information, and context operation serial number, and each intermediate parameter and its memory address are stored in a preset storage space in a set manner. This solution resolves the software code redundancy, dependency-library conflicts, and similar problems caused by supporting multiple frameworks. Moreover, unlike the pipelined operation of a conventional deep learning compiler, data is written to the hardware during the FPGA preprocessing stage, so no host-FPGA communication is required during execution and there is no host-FPGA communication pressure.
Fig. 3 is a schematic diagram of a hardware structure of a parameter compiling apparatus 30 according to an embodiment of the present invention, including:
a memory 31 for storing a computer program;
a processor 32 for executing a computer program to implement the steps of any of the parameter compilation methods described above.
The embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, and when being executed by a processor, the computer program implements the steps of any one of the parameter compiling methods.
The present invention provides a method, an apparatus and a computer-readable storage medium for parameter compilation. The embodiments are described in a progressive manner in the specification, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

Claims (10)

1. A method of parameter compilation, comprising:
extracting network parameters contained in each model file;
converting each network parameter into a corresponding intermediate parameter according to a preset parameter specification; the parameter specification comprises size information, weight information and a context operation serial number;
allocating corresponding memory addresses to the intermediate parameters according to the size information, the weight information, and the context operation serial number corresponding to each intermediate parameter;
and storing each intermediate parameter and its corresponding memory address in a preset storage space in a set manner.
2. The parameter compiling method according to claim 1, wherein the extracting the network parameters included in each model file comprises:
identifying the framework type corresponding to each model file according to the loaded model parameters;
and parsing the corresponding network parameters from each model file according to the network structure corresponding to each framework type.
3. The parameter compiling method according to claim 1, wherein the allocating a corresponding memory address to each of the intermediate parameters according to the size information, the weight information, and the context operation serial number corresponding to each of the intermediate parameters comprises:
sorting each intermediate parameter according to the context operation serial number corresponding to each intermediate parameter;
calculating the input/output size and the weight size of each intermediate parameter according to the size information and the weight information corresponding to each intermediate parameter;
and setting corresponding memory addresses for the sorted intermediate parameters according to the input/output size and the weight size of each intermediate parameter.
4. The parameter compiling method according to claim 1, wherein the storing each intermediate parameter and the memory address corresponding thereto in a preset storage space according to a set manner comprises:
querying a pre-established binary instruction set to obtain binary instruction streams corresponding to the intermediate parameters and the memory addresses; wherein each intermediate parameter has a layer to which it belongs;
and sequentially writing the binary instruction streams corresponding to each layer into the bin file.
5. The parameter compiling method according to claim 4, further comprising, after the sequentially writing the binary instruction streams corresponding to the respective layers into the bin file:
when a parameter calling instruction is acquired, calling from the bin file the binary instruction stream matching the layer identifier carried by the parameter calling instruction.
6. A parameter compiling device, characterized by comprising an extraction unit, a conversion unit, an allocation unit, and a storage unit;
the extraction unit is used for extracting the network parameters contained in each model file;
the conversion unit is used for converting each network parameter into a corresponding intermediate parameter according to a preset parameter specification; the parameter specification comprises size information, weight information and a context operation serial number;
the allocation unit is used for allocating corresponding memory addresses to the intermediate parameters according to the size information, the weight information, and the context operation serial number corresponding to each intermediate parameter;
and the storage unit is used for storing each intermediate parameter and its corresponding memory address in a preset storage space in a set manner.
7. The apparatus according to claim 6, wherein the extraction unit comprises an identification subunit and a parsing subunit;
the identification subunit is used for identifying the framework type corresponding to each model file according to the loaded model parameters;
and the parsing subunit is used for parsing the corresponding network parameters from each model file according to the network structure corresponding to each framework type.
8. The parameter compiling apparatus according to claim 6, wherein the allocation unit includes a sorting subunit, a calculating subunit, and a setting subunit;
the sorting subunit is configured to sort each intermediate parameter according to the context operation serial number corresponding to each intermediate parameter;
the calculating subunit is configured to calculate the input/output size and the weight size of each intermediate parameter according to the size information and the weight information corresponding to each intermediate parameter;
and the setting subunit is configured to set corresponding memory addresses for the sorted intermediate parameters according to the input/output size and the weight size of each intermediate parameter.
9. A parameter compiling apparatus, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of the parameter compilation method as claimed in any one of claims 1 to 5.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the parameter compilation method according to any one of claims 1 to 5.
CN202010604992.4A 2020-06-29 2020-06-29 Parameter compiling method and device and computer readable storage medium Active CN111857723B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010604992.4A CN111857723B (en) 2020-06-29 2020-06-29 Parameter compiling method and device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010604992.4A CN111857723B (en) 2020-06-29 2020-06-29 Parameter compiling method and device and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN111857723A (en) 2020-10-30
CN111857723B CN111857723B (en) 2022-06-17

Family

ID=72989194

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010604992.4A Active CN111857723B (en) 2020-06-29 2020-06-29 Parameter compiling method and device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN111857723B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106650922A (en) * 2016-09-29 2017-05-10 清华大学 Hardware neural network conversion method, computing device, compiling method and neural network software and hardware collaboration system
US20180107456A1 (en) * 2016-10-19 2018-04-19 1026 Labs, Inc. Preprocessing tensor operations for optimal compilation
US20190122100A1 (en) * 2017-10-19 2019-04-25 Samsung Electronics Co., Ltd. Method and apparatus with neural network parameter quantization
US20200082274A1 (en) * 2018-09-11 2020-03-12 Apple Inc. Compiling models for dedicated hardware
CN110795165A (en) * 2019-10-12 2020-02-14 苏州浪潮智能科技有限公司 Neural network model data loading method and related device
CN111240640A (en) * 2020-01-21 2020-06-05 苏州浪潮智能科技有限公司 Data quantization method and device based on hardware environment and readable storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114116051A (en) * 2021-11-17 2022-03-01 招联消费金融有限公司 Processing method, device, equipment and storage medium based on neural network model
CN114116051B (en) * 2021-11-17 2024-03-22 招联消费金融股份有限公司 Processing method, device, equipment and storage medium based on neural network model
CN115981666A (en) * 2023-03-21 2023-04-18 北京探境科技有限公司 Neural network information integration method, device, system and storage medium
CN115981666B (en) * 2023-03-21 2023-07-21 北京探境科技有限公司 Neural network information integration method, device, system and storage medium

Also Published As

Publication number Publication date
CN111857723B (en) 2022-06-17


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant