CN116991431A - GPU-based coding and decoding model static deployment method, electronic equipment and medium - Google Patents

GPU-based coding and decoding model static deployment method, electronic equipment and medium

Info

Publication number
CN116991431A
CN116991431A (application CN202310983908.8A)
Authority
CN
China
Prior art keywords
matrix
information
relative position
coding
decoding model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310983908.8A
Other languages
Chinese (zh)
Other versions
CN116991431B (en)
Inventor
谢佳形
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Muxi Lingzhi Technology Hangzhou Co ltd
Original Assignee
Muxi Integrated Circuit Hangzhou Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Muxi Integrated Circuit Hangzhou Co ltd
Priority to CN202310983908.8A
Publication of CN116991431A
Application granted
Publication of CN116991431B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/60 Software deployment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The application relates to the technical field of computers, and in particular to a static deployment method of a coding and decoding model based on a Graphics Processing Unit (GPU), an electronic device and a medium. The method comprises the following steps: S1, acquiring an original feature matrix, and padding the columns of the original feature matrix to generate a padded feature matrix; S2, acquiring the relative position coding matrix corresponding to the padded feature matrix; S3, inputting the relative position coding matrix into an encoder of the coding and decoding model to generate relative position coding information; S4, padding the historical prediction information into R bits of input information, inputting the padded R bits of input information and the current number of valid information bits X among the R bits into a decoder of the coding and decoding model, reading the relative position coding information and generating the (X+1)-th piece of prediction information, wherein X ranges from 0 to R-1, and generating the target information based on the R-1 pieces of prediction information. The application achieves static deployment of the coding and decoding model on the GPU and meets the coding and decoding model's requirement on inference speed.

Description

GPU-based coding and decoding model static deployment method, electronic equipment and medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a static deployment method for a codec model based on a GPU, an electronic device, and a medium.
Background
The input length of some encoder-decoder models (hereinafter codec models) changes dynamically, so the output length and the intermediate feature length also change dynamically; automatic speech recognition (ASR) models are one example. ASR, an important application in the AI field, is the technology that converts human speech into text. A codec model has high requirements on inference speed, and to meet them the trained codec model generally has to be deployed on a graphics processing unit (GPU). Some GPUs support dynamic deployment of the codec model, but others do not, so how to statically deploy the codec model on a GPU while still meeting the codec model's requirement on inference speed has become an urgent technical problem.
Disclosure of Invention
The application aims to provide a GPU-based codec model static deployment method, an electronic device and a medium that achieve static deployment of the codec model on the GPU while meeting the codec model's requirement on inference speed.
According to a first aspect of the present application, there is provided a GPU-based codec model static deployment method, including:
S1, acquiring an original feature matrix, and padding the columns of the original feature matrix to generate a padded feature matrix, wherein the number of rows and the number of columns of the original feature matrix are both M, the number of rows of the padded feature matrix is M, the number of columns of the padded feature matrix is N, and N > M;
S2, acquiring the relative position coding matrix corresponding to the padded feature matrix, wherein the number of rows of the relative position coding matrix is N-1 and the number of columns of the relative position coding matrix is M;
S3, inputting the relative position coding matrix into the encoder of the codec model to generate relative position coding information, and storing the relative position coding information in a preset memory;
S4, padding the historical prediction information into R bits of input information, inputting the padded R bits of input information and the current number of valid information bits X among the R bits into the decoder of the codec model, reading the relative position coding information from the preset memory and generating the (X+1)-th piece of prediction information, wherein X ranges from 0 to R-1, and generating the target information based on the R-1 pieces of prediction information.
According to a second aspect of the present application, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being arranged to perform the method according to the first aspect of the application.
According to a third aspect of the present application there is provided a computer readable storage medium storing computer executable instructions for performing the method of the first aspect of the present application.
Compared with the prior art, the application has obvious advantages and beneficial effects. Through the above technical scheme, the GPU-based codec model static deployment method, electronic device and medium achieve considerable technical progress and practicality, have broad industrial value, and provide at least the following benefits:
according to the embodiment of the application, the original feature matrix is supplemented to generate the supplemented feature matrix, the corresponding relative position coding matrix is obtained based on the supplemented feature matrix, the relative position coding information is generated based on the relative position coding matrix, the supplemented R-bit input information and the current effective information bit number X in the R-bit input information are input into a decoder of a coding and decoding model in the decoding process, and the X+1th prediction information is predicted, so that static deployment of the coding and decoding model on the GPU is realized, and the requirement of the coding and decoding model on the reasoning speed is met.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a static deployment method for a GPU-based codec model according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to fall within the scope of the application.
The embodiment of the application provides a GPU-based codec model static deployment method which, as shown in Fig. 1, comprises the following steps:
S1, acquiring an original feature matrix, and padding the columns of the original feature matrix to generate a padded feature matrix, wherein the number of rows and the number of columns of the original feature matrix are both M, the number of rows of the padded feature matrix is M, the number of columns of the padded feature matrix is N, and N > M.
The number of columns of the original feature matrix may change dynamically; padding fixes the number of columns of the matrix fed to the codec model, which makes static deployment of the codec model's encoder possible. The codec model may specifically be an ASR model.
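For illustration, a minimal padding sketch follows. This is a sketch only: NumPy, zero padding, and keeping the valid features on the left are my assumptions, since the application does not specify the padding value.

```python
import numpy as np

def pad_feature_matrix(original: np.ndarray, n_cols: int) -> np.ndarray:
    """Pad a feature matrix with zero columns up to a fixed number of columns N."""
    m_rows, cur_cols = original.shape
    if cur_cols > n_cols:
        raise ValueError("N must not be smaller than the current number of columns")
    padded = np.zeros((m_rows, n_cols), dtype=original.dtype)
    padded[:, :cur_cols] = original  # valid features on the left, padding on the right
    return padded

# Example: a 4 x 4 original feature matrix padded to 4 x 6 columns (N > M)
features = np.arange(16, dtype=np.float32).reshape(4, 4)
print(pad_feature_matrix(features, n_cols=6).shape)  # (4, 6)
```

Because every input now has exactly N columns, the encoder sees a fixed input shape, which is what static deployment requires.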
S2, acquiring the relative position coding matrix corresponding to the padded feature matrix, wherein the number of rows of the relative position coding matrix is N-1 and the number of columns of the relative position coding matrix is M.
S3, inputting the relative position coding matrix into the encoder of the codec model to generate relative position coding information, and storing the relative position coding information in a preset memory.
S4, padding the historical prediction information into R bits of input information, inputting the padded R bits of input information and the current number of valid information bits X among the R bits into the decoder of the codec model, reading the relative position coding information from the preset memory and generating the (X+1)-th piece of prediction information, wherein X ranges from 0 to R-1, and generating the target information based on the R-1 pieces of prediction information.
It should be noted that in the prior art each decoding step can dynamically predict the last bit of the current input; because the present application is statically deployed, padded R-bit input information is fed at every step, and the bit to be predicted at a given step is not necessarily the last bit of that padded R-bit input.
It should be noted that, in the prior art, the conversion from absolute position coding to relative position coding can be implemented directly by rel_shift on the original feature matrix; in the present application, however, the original feature matrix may change dynamically and padding features are involved, so the existing rel_shift cannot be applied directly. The embodiment of the application therefore provides a scheme suited to this setting that accurately implements the conversion from absolute position coding to relative position coding. As one embodiment, the element of the relative position coding matrix is a_ij, where i is the row index of a_ij and j is the column index of a_ij, i ranges from 1 to M, and j ranges from 1 to N-1; the relative position coding matrix is divided into a first region, a second region and a third region. An element a_ij in the first region satisfies: i ≤ M-1, j < M, and a_ij lies to the left of a_i(i+1). An element a_ij in the second region satisfies: i < M-1, 2 < j ≤ M, and a_ij lies to the right of a_i(i+1). The elements in the third region satisfy j = i+1 or i > M-1; that is, the third region is everything in the relative position coding matrix outside the first and second regions. The step S2 includes:
Step S21, setting a first matrix and a second matrix, wherein the number of rows and the number of columns of both the first matrix and the second matrix are M, each row of the first matrix is (1, 2, 3, …, m, …, M), each column of the second matrix is (1, 2, 3, …, m, …, M), and m ranges from 1 to M.
Step S22, obtaining a first intermediate matrix and a second intermediate matrix based on the first matrix and the second matrix, wherein the number of rows and the number of columns of both intermediate matrices are M.
Step S23, determining the elements of the first region of the relative position coding matrix based on the first intermediate matrix, determining the elements of the second region based on the second intermediate matrix, setting all elements of the third region to 0 based on a mask operation, and generating the relative position coding matrix.
It should be noted that the third region corresponds to positions that do not need to be attended to, so all of its elements are set directly to 0 by the mask operation.
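As a small illustration of the mask operation, a NumPy sketch follows. The boolean-mask representation and the example sizes are my assumptions; the matrix is laid out with i as the row index (1 to M) and j as the column index (1 to N-1), following the index ranges given for a_ij above.

```python
import numpy as np

def third_region_mask(m: int, n: int) -> np.ndarray:
    """Boolean mask over an M x (N-1) relative position coding matrix.

    True marks the third region, characterised in the text by j == i + 1 or
    i > M - 1 (1-based row index i, column index j): the positions that are
    not attended to and are therefore zeroed out.
    """
    i = np.arange(1, m + 1)[:, None]   # row indices 1..M
    j = np.arange(1, n)[None, :]       # column indices 1..N-1
    return (j == i + 1) | (i > m - 1)

rel_pos = np.random.rand(4, 5).astype(np.float32)   # M = 4, N = 6
rel_pos[third_region_mask(4, 6)] = 0.0              # mask operation of step S23
```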
As an embodiment, the first matrix includes elements b_xy, the second matrix includes elements c_xy, the first intermediate matrix includes elements d_xy, and the second intermediate matrix includes elements e_xy, where x denotes the row index and y denotes the column index of b_xy, c_xy, d_xy and e_xy, x ranges from 1 to M, and y ranges from 1 to M. The step S22 includes:
Step S221, determining d_xy based on b_xy, c_xy and M: d_xy = b_xy - c_xy + M - 1.
Each d_xy can be determined from d_xy = b_xy - c_xy + M - 1, yielding the first intermediate matrix, which guides the generation of the elements in the first region of the relative position coding matrix.
Step S222, determining e_xy based on b_xy and c_xy: e_xy = b_xy - c_xy - 1.
Each e_xy can be determined from e_xy = b_xy - c_xy - 1, yielding the second intermediate matrix, which guides the generation of the elements in the second region of the relative position coding matrix.
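To make steps S21, S221 and S222 concrete, a small NumPy sketch with M = 4 follows; the variable names are mine.

```python
import numpy as np

M = 4
first_matrix = np.tile(np.arange(1, M + 1), (M, 1))            # every row is (1, 2, ..., M), so b_xy = y
second_matrix = np.tile(np.arange(1, M + 1)[:, None], (1, M))  # every column is (1, 2, ..., M), so c_xy = x

d = first_matrix - second_matrix + M - 1   # step S221: d_xy = b_xy - c_xy + M - 1
e = first_matrix - second_matrix - 1       # step S222: e_xy = b_xy - c_xy - 1
print(d)
# [[3 4 5 6]
#  [2 3 4 5]
#  [1 2 3 4]
#  [0 1 2 3]]
```

Only the lower-triangular part of the first intermediate matrix and the upper-triangular part of the second intermediate matrix are consumed by steps S231 and S232 below.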
As an embodiment, the step S23 includes:
Step S231, determining the element located in the i-th row and the d_ij-th column of the original feature matrix as the element a_ij of the first region, wherein d_ij is the element in the i-th row and j-th column of the first intermediate matrix.
The first region is the lower-triangular region of the relative position coding matrix, so step S231 only needs the elements of the lower-triangular region of the first intermediate matrix: the row of each such element gives the row in the original feature matrix, its value d_ij gives the column in the original feature matrix, and the element fetched from that position of the original feature matrix is filled into the corresponding position of the first region of the relative position coding matrix.
Step S232, determining the element located in the i-th row and the e_ij-th column of the original feature matrix as the element a_ij of the second region, wherein e_ij is the element in the i-th row and j-th column of the second intermediate matrix.
The second region is the upper-triangular region of the relative position coding matrix, so step S232 only needs the elements of the upper-triangular region of the second intermediate matrix: the row of each such element gives the row in the original feature matrix, its value e_ij gives the column in the original feature matrix, and the element fetched from that position of the original feature matrix is filled into the corresponding position of the second region of the relative position coding matrix.
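Putting steps S21 through S232 together, a hedged end-to-end sketch follows. It assumes NumPy, an M x M original feature matrix as in step S1, and the layout with i as the row index and j as the column index; the 1-based indices of the text are converted to 0-based array indices, and the region conditions are taken literally from the description.

```python
import numpy as np

def build_relative_position_matrix(original: np.ndarray, n: int) -> np.ndarray:
    """Assemble an M x (N-1) relative position coding matrix from an M x M
    original feature matrix, following steps S21-S232 described above."""
    m = original.shape[0]
    i = np.arange(1, m + 1)[:, None]   # 1-based row index, shape (M, 1)
    j = np.arange(1, n)[None, :]       # 1-based column index, shape (1, N-1)

    # Intermediate values d_ij and e_ij, here broadcast over all N-1 columns;
    # only columns j <= M are actually consumed by the first/second regions.
    d = j - i + m - 1                  # step S221
    e = j - i - 1                      # step S222

    first = (i <= m - 1) & (j < m) & (j < i + 1)     # left of a_i(i+1)
    second = (i < m - 1) & (j > i + 1) & (j <= m)    # right of a_i(i+1)

    rel = np.zeros((m, n - 1), dtype=original.dtype)  # third region stays 0 (mask operation)
    rows = np.broadcast_to(i - 1, rel.shape)          # 0-based row into the original matrix
    rel[first] = original[rows[first], (d - 1)[first]]     # step S231: column d_ij
    rel[second] = original[rows[second], (e - 1)[second]]  # step S232: column e_ij
    return rel

rel = build_relative_position_matrix(np.random.rand(4, 4).astype(np.float32), n=6)
```

Any columns with j > M (possible when N - 1 > M) fall entirely in the third region and remain zero.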
As an embodiment, the step S3 includes:
Step S31, inputting the relative position coding matrix into the encoder of the codec model to execute depthwise convolution operations and generate the relative position coding information, wherein before each depthwise convolution operation the elements of the third region of the relative position coding matrix are reset to 0 based on the mask operation.
It should be noted that the encoder performs several depthwise convolution operations. Although the elements of the third region of the relative position coding matrix are already set to 0 by the mask operation in the initial state, computations such as correlations during the convolutions may leave some elements of the third region non-zero; if these were not reset, the encoding result would be affected. Therefore, before every depthwise convolution operation, the elements of the third region are reset to 0 based on the mask operation to ensure the accuracy of the encoding.
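The mask-then-convolve pattern of step S31 can be sketched as follows. This is PyTorch-flavoured and heavily simplified: the kernel size, channel layout, and the use of torch.nn.functional.conv1d with groups for the depthwise convolution are my assumptions, since the encoder's exact structure is not given here; only the "re-apply the mask before every depthwise convolution" pattern comes from the text.

```python
import torch
import torch.nn.functional as F

def masked_depthwise_conv(x: torch.Tensor, weight: torch.Tensor, third_mask: torch.Tensor) -> torch.Tensor:
    """x: (batch, channels, length); weight: (channels, 1, kernel);
    third_mask: boolean, True where the third region must be reset to 0."""
    x = x.masked_fill(third_mask, 0.0)  # reset the third region before every depthwise convolution
    return F.conv1d(x, weight, padding=weight.shape[-1] // 2, groups=x.shape[1])

# The mask is re-applied before each of the encoder's depthwise convolutions.
x = torch.randn(1, 8, 5)                 # toy tensor standing in for the encoder state
mask = torch.zeros(1, 1, 5, dtype=torch.bool)
mask[..., -1] = True                     # e.g. the last position belongs to the third region
for weight in [torch.randn(8, 1, 3) for _ in range(3)]:
    x = masked_depthwise_conv(x, weight, mask)
```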
As an embodiment, the step S4 includes:
Step S41, initially setting X = 0 and initializing the historical prediction information to U_0.
U_0 is used to predict U_1; it should be understood that U_0 is not set randomly but according to the specific encoded content.
Step S42, padding the historical prediction information into R bits of input information, filling the vacant bits with 0, inputting the currently padded R bits of input information and X into the decoder of the codec model, reading the relative position coding information from the preset memory, and generating the (X+1)-th piece of prediction information U_{X+1}.
Inputting X into the decoder of the codec model is equivalent to specifying in advance which bit is to be predicted; every input is thus guaranteed to be R bits, and accuracy is ensured on top of static deployment.
Step S43, if X < R-1, setting the historical prediction information to (U_0, U_1, …, U_{X+1}), setting X = X+1, and returning to step S42; if X = R-1, generating the target information based on the R-1 pieces of prediction information.
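A minimal sketch of the S41-S43 loop is given below. The callable decoder(padded_input, valid_len) standing in for the deployed decoder, the padding token 0, and the return convention are all assumptions; how the target information is finally assembled from the predictions (the text speaks of R-1 pieces) is left as described above.

```python
def static_decode(decoder, u0, r):
    """Fixed-length decoding loop of steps S41-S43 (a sketch, not the deployed model)."""
    history = [u0]                                   # step S41: X = 0, history = (U_0)
    predictions = []
    for x in range(r):                               # X runs over 0 .. R-1
        padded = history + [0] * (r - len(history))  # the decoder always receives exactly R inputs
        u_next = decoder(padded, x)                  # step S42: predict U_{X+1} given the valid length X
        predictions.append(u_next)
        history.append(u_next)                       # history becomes (U_0, U_1, ..., U_{X+1})
    # Step S43: the target information is assembled from the collected predictions.
    return predictions

# Toy decoder standing in for the statically deployed codec model.
predictions = static_decode(lambda inp, x: inp[x] + 1, u0=1, r=4)
```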
It should be noted that some exemplary embodiments are described as a process or a method depicted as a flowchart. Although a flowchart depicts steps as a sequential process, many of the steps may be implemented in parallel, concurrently, or with other steps. Furthermore, the order of the steps may be rearranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figures. The processes may correspond to methods, functions, procedures, subroutines, and the like.
The embodiment of the application also provides electronic equipment, which comprises: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being configured to perform the methods of embodiments of the present application.
The embodiment of the application also provides a computer readable storage medium, which stores computer executable instructions for executing the method according to the embodiment of the application.
According to the embodiments of the application, the original feature matrix is padded to generate a padded feature matrix, the corresponding relative position coding matrix is obtained from the padded feature matrix, and the relative position coding information is generated from the relative position coding matrix; during decoding, the padded R bits of input information and the current number of valid information bits X among the R bits are input into the decoder of the codec model to predict the (X+1)-th piece of prediction information. Static deployment of the codec model on the GPU is thereby achieved, and the codec model's requirement on inference speed is met.
The present application is not limited to the above embodiments; any modification, equivalent replacement or improvement made without departing from the spirit and scope of the present application shall fall within the protection scope of the present application.

Claims (9)

1. A GPU-based coding and decoding model static deployment method, characterized by comprising the following steps:
S1, acquiring an original feature matrix, and padding the columns of the original feature matrix to generate a padded feature matrix, wherein the number of rows and the number of columns of the original feature matrix are both M, the number of rows of the padded feature matrix is M, the number of columns of the padded feature matrix is N, and N > M;
S2, acquiring the relative position coding matrix corresponding to the padded feature matrix, wherein the number of rows of the relative position coding matrix is N-1 and the number of columns of the relative position coding matrix is M;
S3, inputting the relative position coding matrix into an encoder of the coding and decoding model to generate relative position coding information, and storing the relative position coding information in a preset memory;
S4, padding historical prediction information into R bits of input information, inputting the padded R bits of input information and the current number of valid information bits X among the R bits into a decoder of the coding and decoding model, reading the relative position coding information from the preset memory and generating the (X+1)-th piece of prediction information, wherein X ranges from 0 to R-1, and generating target information based on the R-1 pieces of prediction information.
2. The method according to claim 1, wherein
the element of the relative position coding matrix is a_ij, where i is the row index of a_ij and j is the column index of a_ij, i ranges from 1 to M, and j ranges from 1 to N-1; the relative position coding matrix is divided into a first region, a second region and a third region; an element a_ij in the first region satisfies: i ≤ M-1, j < M, and a_ij lies to the left of a_i(i+1); an element a_ij in the second region satisfies: i < M-1, 2 < j ≤ M, and a_ij lies to the right of a_i(i+1); the elements in the third region satisfy j = i+1 or i > M-1; the step S2 includes:
step S21, setting a first matrix and a second matrix, wherein the number of rows and the number of columns of both the first matrix and the second matrix are M, each row of the first matrix is (1, 2, 3, …, m, …, M), each column of the second matrix is (1, 2, 3, …, m, …, M), and m ranges from 1 to M;
step S22, obtaining a first intermediate matrix and a second intermediate matrix based on the first matrix and the second matrix, wherein the number of rows and the number of columns of both intermediate matrices are M;
step S23, determining the elements of the first region of the relative position coding matrix based on the first intermediate matrix, determining the elements of the second region based on the second intermediate matrix, setting all elements of the third region to 0 based on a mask operation, and generating the relative position coding matrix.
3. The method according to claim 2, wherein
the first matrix includes elements b_xy, the second matrix includes elements c_xy, the first intermediate matrix includes elements d_xy, and the second intermediate matrix includes elements e_xy, where x denotes the row index and y denotes the column index of b_xy, c_xy, d_xy and e_xy, x ranges from 1 to M, and y ranges from 1 to M; the step S22 includes:
step S221, determining d_xy based on b_xy, c_xy and M: d_xy = b_xy - c_xy + M - 1;
step S222, determining e_xy based on b_xy and c_xy: e_xy = b_xy - c_xy - 1.
4. The method according to claim 3, wherein
the step S23 includes:
step S231, determining the element located in the i-th row and the d_ij-th column of the original feature matrix as the element a_ij of the first region, wherein d_ij is the element in the i-th row and j-th column of the first intermediate matrix;
step S232, determining the element located in the i-th row and the e_ij-th column of the original feature matrix as the element a_ij of the second region, wherein e_ij is the element in the i-th row and j-th column of the second intermediate matrix.
5. The method according to claim 1, wherein
the step S3 includes:
step S31, inputting the relative position coding matrix into the encoder of the coding and decoding model to execute depthwise convolution operations and generate the relative position coding information, wherein before each depthwise convolution operation the elements of the third region of the relative position coding matrix are reset to 0 based on the mask operation.
6. The method according to claim 1, wherein
the step S4 includes:
step S41, initially setting X = 0 and initializing the historical prediction information to U_0;
step S42, padding the historical prediction information into R bits of input information, filling the vacant bits with 0, inputting the currently padded R bits of input information and X into the decoder of the coding and decoding model, reading the relative position coding information from the preset memory, and generating the (X+1)-th piece of prediction information U_{X+1};
step S43, if X < R-1, setting the historical prediction information to (U_0, U_1, …, U_{X+1}), setting X = X+1, and returning to step S42; if X = R-1, generating the target information based on the R-1 pieces of prediction information.
7. The method according to claim 1, wherein
the coding and decoding model is an ASR model.
8. An electronic device, comprising:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor, the instructions being arranged to perform the method of any of the preceding claims 1-7.
9. A computer readable storage medium, characterized in that computer executable instructions are stored for performing the method of any of the preceding claims 1-7.
CN202310983908.8A 2023-08-04 2023-08-04 GPU-based coding and decoding model static deployment method, electronic equipment and medium Active CN116991431B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310983908.8A CN116991431B (en) 2023-08-04 2023-08-04 GPU-based coding and decoding model static deployment method, electronic equipment and medium

Publications (2)

Publication Number Publication Date
CN116991431A true CN116991431A (en) 2023-11-03
CN116991431B CN116991431B (en) 2024-03-01

Family

ID=88523026

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310983908.8A Active CN116991431B (en) 2023-08-04 2023-08-04 GPU-based coding and decoding model static deployment method, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN116991431B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040221223A1 (en) * 2003-04-29 2004-11-04 Nam-Yul Yu Apparatus and method for encoding a low density parity check code
CN114781744A (en) * 2022-05-07 2022-07-22 东南大学 Deep learning multi-step long radiance prediction method based on codec
CN116227629A (en) * 2023-05-10 2023-06-06 荣耀终端有限公司 Information analysis method, model training method, device and electronic equipment

Also Published As

Publication number Publication date
CN116991431B (en) 2024-03-01

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20240415

Address after: 311100, Room 206-063, Building 8, Xixi Bafangcheng, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province

Patentee after: Muxi Lingzhi Technology (Hangzhou) Co.,Ltd.

Country or region after: China

Address before: Room 1113, 11th Floor, Building F, Information Port, No. 198 Qidi Road, Economic and Technological Development Zone, Xiaoshan District, Hangzhou City, Zhejiang Province, 311200

Patentee before: Muxi Integrated Circuit (Hangzhou) Co.,Ltd.

Country or region before: China