CN116186526B - Feature detection method, device and medium based on sparse matrix vector multiplication - Google Patents

Info

Publication number
CN116186526B
Authority
CN
China
Prior art keywords
sparse matrix
matrix
channel
zero elements
vector multiplication
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310484071.2A
Other languages
Chinese (zh)
Other versions
CN116186526A (en
Inventor
刘杰
郭际虎
王庆林
石永振
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN202310484071.2A
Publication of CN116186526A
Application granted
Publication of CN116186526B
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/7715 Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/30 Assessment of water resources
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Algebra (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a feature detection method, device and medium based on sparse matrix vector multiplication, and relates to the field of scientific computing. In the method, a matrix image and matrix features of a target sparse matrix are input into a neural network model; the neural network model outputs a predicted storage format for the target sparse matrix; sparse matrix vector multiplication is performed on the target sparse matrix in the predicted storage format, and the feature information of a preset scene is determined from the result of the multiplication. The neural network model is trained with the matrix images and matrix features of sample sparse matrices as inputs and, as output, the format with the shortest time among the timing results obtained when every format of a sample sparse matrix performs sparse matrix vector multiplication the same number of times. The predicted storage format output by the model for the target sparse matrix is therefore also the format that takes the shortest time for sparse matrix vector multiplication, which improves the efficiency of sparse matrix vector multiplication and of feature detection based on it.

Description

Feature detection method, device and medium based on sparse matrix vector multiplication
Technical Field
The present application relates to the field of scientific computing, and in particular, to a method, an apparatus, and a medium for feature detection based on sparse matrix vector multiplication.
Background
In the field of scientific computing, the performance of sparse matrix vector multiplication (SpMV) affects the efficiency of many operations, such as web page ranking, solving systems of linear equations, and graph analysis. During computation, a matrix is converted into a sparse matrix format before it takes part in the operation. SpMV is memory-access intensive, and its performance is limited by the available memory bandwidth. To address this, many sparse matrix storage formats have been designed for different hardware architectures and non-zero element distributions, each improving efficiency in a different situation.
However, on a given computing platform, no single sparse matrix storage format gives optimal performance for all matrices, which lowers the efficiency of feature detection for a scene through sparse matrix vector multiplication. For example, in the marine acoustics field, if sparse matrix vector multiplication is performed on all sparse matrices in the same storage format on the same computing platform, not every sparse matrix may reach its optimal performance, and the sound intensity distribution is obtained less efficiently.
Therefore, how to determine an optimal format of a matrix for performing sparse matrix vector multiplication, and to perform matrix vector multiplication in the determined optimal format, to improve the efficiency of scientific operation is a technical problem that needs to be solved by those skilled in the art.
Disclosure of Invention
The invention aims to provide a method, a device and a medium for feature detection based on sparse matrix vector multiplication, which are used for determining an optimal format of a certain matrix for sparse matrix vector multiplication, and carrying out matrix vector multiplication according to the determined optimal format, so that the efficiency of scientific operation is improved, and the efficiency of feature detection based on sparse matrix vector multiplication is improved.
In order to solve the above technical problems, the present application provides a feature detection method based on sparse matrix vector multiplication, including:
acquiring data which are acquired by data acquisition equipment and are used for representing the characteristics of a preset scene, and determining a target sparse matrix according to the data; the preset scene at least comprises one of an acoustic scene and a circuit scene;
converting the target sparse matrix into a matrix image and extracting matrix characteristics of the target sparse matrix;
inputting the matrix image and the matrix characteristics into a preset neural network model; the neural network model is trained by taking a sample matrix image corresponding to a sample sparse matrix and sample matrix characteristics corresponding to the sample sparse matrix as inputs, and taking as output the format with the shortest time among the timing results obtained when every format of the sample sparse matrix performs sparse matrix vector multiplication the same number of times;
outputting a predicted storage format of the target sparse matrix through the neural network model;
and carrying out sparse matrix vector multiplication on the target sparse matrix according to the prediction storage format, and determining the characteristic information of the preset scene according to the result of the sparse matrix vector multiplication.
Preferably, establishing the neural network model includes:
acquiring a sample sparse matrix data set;
performing sparse matrix vector multiplication the same number of times in every format of each sample sparse matrix in the sample sparse matrix dataset;
selecting, from all formats of each sample sparse matrix, the format with the shortest timing result as the label value of the corresponding sample sparse matrix;
converting each sample sparse matrix into a corresponding sample matrix image and extracting the sample matrix characteristics of each sample sparse matrix;
training the neural network model according to the sample matrix image corresponding to each sample sparse matrix, the sample matrix characteristics and the label value corresponding to each sample sparse matrix.
Preferably, the structure of the neural network model includes an image channel and a feature channel; the image channel receives an image representation of a matrix and the feature channel receives a feature representation of the matrix; the matrix comprises the sample sparse matrix and the target sparse matrix;
the image channel comprises a convolution layer, a preset number of convolution layers with attention mechanisms, a full convolution layer, a pooling layer and a fully connected layer;
the feature channel comprises two fully connected layers, a residual fully connected layer and a fully connected layer; the residual fully connected layer comprises a residual block formed by two fully connected layers;
and the output of the image channel and the output of the feature channel pass through a fully connected layer to obtain the predicted storage format.
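The feature channel and the final fusion layer described above can be illustrated with a minimal pure-Python sketch. It is not the patented model: the ReLU activation, the layer widths, the random initial weights and the names `linear`, `residual_block`, `feature_channel` and `predict_format` are all assumptions made for illustration, and the image channel (convolution, attention and pooling layers) is represented only by a placeholder output vector.

```python
import random

def linear(x, w, b):
    """Fully connected layer: y = W x + b."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def relu(x):
    # Assumed activation; the text does not name one.
    return [max(0.0, v) for v in x]

def init_layer(rng, n_in, n_out):
    """Hypothetical random initialisation of one fully connected layer."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

def residual_block(x, layer_a, layer_b):
    """Two fully connected layers with a skip connection around them."""
    h = relu(linear(x, *layer_a))
    h = linear(h, *layer_b)
    return relu([hi + xi for hi, xi in zip(h, x)])

def feature_channel(features, layers):
    """Two FC layers, one residual FC block, then one FC layer."""
    fc1, fc2, res_a, res_b, fc_out = layers
    h = relu(linear(features, *fc1))
    h = relu(linear(h, *fc2))
    h = residual_block(h, res_a, res_b)
    return linear(h, *fc_out)

def predict_format(image_vec, feature_vec, fusion, formats):
    """Concatenate both channel outputs and pass them through a final FC layer."""
    logits = linear(image_vec + feature_vec, *fusion)
    best = max(range(len(logits)), key=logits.__getitem__)
    return formats[best]

rng = random.Random(0)
hidden = 8
layers = [init_layer(rng, 2, hidden), init_layer(rng, hidden, hidden),
          init_layer(rng, hidden, hidden), init_layer(rng, hidden, hidden),
          init_layer(rng, hidden, 4)]
formats = ["CSR", "COO", "ELL", "DIA"]
fusion = init_layer(rng, 8, len(formats))
feature_out = feature_channel([0.3125, 1.25], layers)  # density, mean nnz per row
image_out = [0.0] * 4  # placeholder for the image channel output
print(predict_format(image_out, feature_out, fusion, formats))
```

With untrained random weights the printed format carries no meaning; the sketch only shows how the data would flow through the two channels and the fusion layer.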
Preferably, extracting matrix features of the sparse matrix includes:
acquiring target data in the sparse matrix; the target data at least comprises the values of the non-zero elements in the sparse matrix, the column coordinates of the non-zero elements in the sparse matrix and the row coordinates of the non-zero elements in the sparse matrix;
determining matrix characteristics of the sparse matrix according to the values of the non-zero elements, the column coordinates of the non-zero elements and the row coordinates of the non-zero elements; the matrix characteristics of the sparse matrix at least comprise the density of the sparse matrix and the average number of non-zero elements per row of the sparse matrix.
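As an illustration of the two characteristics named above, the following minimal sketch computes the density and the average number of non-zero elements per row from coordinate-style data. The function name `matrix_features` and the argument layout are assumptions, not taken from the application.

```python
def matrix_features(n_rows, n_cols, rows, cols, vals):
    """Compute density and mean non-zeros per row from (row, col, value) triplets."""
    # Count only entries whose stored value is actually non-zero.
    nnz = sum(1 for v in vals if v != 0)
    return {
        "nnz": nnz,
        "density": nnz / (n_rows * n_cols),
        "mean_nnz_per_row": nnz / n_rows,
    }

# A 4 x 4 example matrix with five non-zero elements.
features = matrix_features(
    4, 4,
    rows=[0, 1, 2, 3, 3],
    cols=[0, 1, 2, 0, 3],
    vals=[1.0, 2.0, 3.0, 4.0, 5.0],
)
print(features)
```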
Preferably, converting the sparse matrix into a matrix image comprises:
partitioning the sparse matrix into blocks of a preset size;
acquiring a first distance between the column coordinate of the current non-zero element in the current block and the diagonal of the sparse matrix;
acquiring a first integer division result of the first distance by the width of the current block, and a second integer division result of the row coordinate of the current non-zero element by the length of the current block;
determining the coordinates of the element to be modified in a first channel of the three RGB channels according to the first integer division result and the second integer division result, and adding 1 to the element value at those coordinates in the first channel; returning to the step of acquiring the first distance until all the non-zero elements in the sparse matrix have been traversed;
acquiring a second distance between the row coordinate of the current non-zero element in the current block and the diagonal of the sparse matrix;
acquiring a third integer division result of the second distance by the length of the current block, and a fourth integer division result of the column coordinate of the current non-zero element by the width of the current block;
determining the coordinates of the element to be modified in a second channel of the three RGB channels according to the third integer division result and the fourth integer division result, and adding 1 to the element value at those coordinates in the second channel; returning to the step of acquiring the second distance until all the non-zero elements in the sparse matrix have been traversed;
acquiring the number of non-zero elements in the current block;
acquiring the ratio of the number of non-zero elements in the current block to the size of the current block;
and taking the ratio as the value of a third channel of the three RGB channels at the position corresponding to the current block.
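The three-channel conversion described above can be sketched as follows. This is one possible reading under stated assumptions: the matrix is square (n x n), the blocks are square with side `block`, the distance to the diagonal is taken as |column - row|, and the index order used in each channel is a guess where the text leaves it open. The function name `histogram_image` is hypothetical.

```python
def histogram_image(n, rows, cols, block):
    """Build three size x size channels from the non-zero coordinates of an n x n matrix."""
    size = -(-n // block)  # ceil(n / block): number of blocks per side
    ch1 = [[0.0] * size for _ in range(size)]
    ch2 = [[0.0] * size for _ in range(size)]
    counts = [[0] * size for _ in range(size)]
    for r, c in zip(rows, cols):
        dist = abs(c - r)  # assumed definition of the distance to the diagonal
        # Channel 1: row block index and binned distance.
        ch1[r // block][dist // block] += 1
        # Channel 2: binned distance and column block index.
        ch2[dist // block][c // block] += 1
        # Per-block non-zero count, used for channel 3.
        counts[r // block][c // block] += 1
    area = block * block
    # Channel 3: ratio of non-zeros in each block to the block size.
    ch3 = [[cnt / area for cnt in row] for row in counts]
    return ch1, ch2, ch3

ch1, ch2, ch3 = histogram_image(4, [0, 1, 2, 3, 3], [0, 1, 2, 0, 3], block=2)
print(ch1)  # [[2.0, 0.0], [2.0, 1.0]]
print(ch2)  # [[2.0, 2.0], [1.0, 0.0]]
print(ch3)  # [[0.5, 0.0], [0.25, 0.5]]
```

Edge blocks of a matrix whose dimension is not a multiple of `block` are divided by the full block area here, which is a simplification.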
Preferably, converting the sparse matrix into a matrix image comprises:
acquiring the size of a matrix image to be generated and the number of non-zero elements of the sparse matrix;
when the number of non-zero elements is equal to the size of the matrix image to be generated, filling the row coordinates of the non-zero elements into the corresponding positions of a first channel of the three RGB channels, and filling the column coordinates of the non-zero elements into the corresponding positions of a second channel;
when the number of non-zero elements is smaller than the size of the matrix image to be generated, filling the row coordinates of the non-zero elements into the corresponding positions of the first channel and the column coordinates of the non-zero elements into the corresponding positions of the second channel, and filling the remaining positions with 0;
when the number of non-zero elements is larger than the size of the matrix image to be generated, selecting, from the middle of all the non-zero elements of the sparse matrix, as many elements as the size of the matrix image to be generated, and filling the row coordinates of the selected elements into the corresponding positions of the first channel and their column coordinates into the corresponding positions of the second channel.
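The three cases above (equal, fewer, and more non-zero elements than image positions) can be sketched as below. The sketch assumes a square image of `side` x `side` positions filled in row-major order and covers only the first two channels, since the passage does not say what the third channel holds; `position_image` is a hypothetical name.

```python
def position_image(rows, cols, side):
    """Fill row coordinates into channel 1 and column coordinates into channel 2."""
    capacity = side * side
    nnz = len(rows)
    if nnz > capacity:
        # Too many non-zeros: keep the middle `capacity` elements.
        start = (nnz - capacity) // 2
        rows = rows[start:start + capacity]
        cols = cols[start:start + capacity]
    # Too few non-zeros: pad the remaining positions with 0.
    pad = capacity - len(rows)
    flat_r = list(rows) + [0] * pad
    flat_c = list(cols) + [0] * pad
    ch1 = [flat_r[i * side:(i + 1) * side] for i in range(side)]
    ch2 = [flat_c[i * side:(i + 1) * side] for i in range(side)]
    return ch1, ch2

rows, cols = [0, 1, 2, 3, 3], [0, 1, 2, 0, 3]
print(position_image(rows, cols, side=3))
# ([[0, 1, 2], [3, 3, 0], [0, 0, 0]], [[0, 1, 2], [0, 3, 0], [0, 0, 0]])
```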
Preferably, after determining the values of the elements of the RGB three-channel, the method further comprises:
and respectively carrying out normalization processing on the values of the elements of each channel so that the values of the elements of each channel are within a preset range.
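One simple way to bring each channel into a preset range is min-max scaling, sketched below. The choice of min-max scaling, the default range [0, 1] and the name `normalize_channel` are assumptions; the application only requires that the values fall within a preset range.

```python
def normalize_channel(channel, lo=0.0, hi=1.0):
    """Min-max scale a 2-D channel so that its values lie in [lo, hi]."""
    flat = [v for row in channel for v in row]
    vmin, vmax = min(flat), max(flat)
    if vmax == vmin:
        # A constant channel carries no contrast; map it to the lower bound.
        return [[lo for _ in row] for row in channel]
    scale = (hi - lo) / (vmax - vmin)
    return [[lo + (v - vmin) * scale for v in row] for row in channel]

print(normalize_channel([[2.0, 0.0], [2.0, 1.0]]))  # [[1.0, 0.0], [1.0, 0.5]]
```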
In order to solve the above technical problem, the present application further provides a device for feature detection based on sparse matrix vector multiplication, including:
the acquisition module is used for acquiring data which are acquired by the data acquisition equipment and are used for representing the characteristics of a preset scene, and determining a target sparse matrix according to the data; the preset scene at least comprises one of an acoustic scene and a circuit scene;
the conversion and extraction module is used for converting the target sparse matrix into a matrix image and extracting matrix characteristics of the target sparse matrix;
the input module is used for inputting the matrix image and the matrix characteristics into a preset neural network model; the neural network model is trained by taking a sample matrix image corresponding to a sample sparse matrix and sample matrix characteristics corresponding to the sample sparse matrix as inputs, and taking as output the format with the shortest time among the timing results obtained when every format of the sample sparse matrix performs sparse matrix vector multiplication the same number of times;
the output module is used for outputting a predicted storage format of the target sparse matrix through the neural network model;
the determining module is used for carrying out sparse matrix vector multiplication on the target sparse matrix according to the prediction storage format, and determining the characteristic information of the preset scene according to the result of the sparse matrix vector multiplication.
In order to solve the above technical problem, the present application further provides a device for feature detection based on sparse matrix vector multiplication, including:
a memory for storing a computer program;
a processor, configured to implement the steps of the method for feature detection based on sparse matrix vector multiplication when executing the computer program.
In order to solve the above technical problem, the present application further provides a computer readable storage medium, where a computer program is stored, where the computer program, when executed by a processor, implements the steps of the above method for feature detection based on sparse matrix vector multiplication.
The feature detection method based on sparse matrix vector multiplication provided by the application comprises the following steps: acquiring data that are collected by data acquisition equipment and represent the characteristics of a preset scene, and determining a target sparse matrix from the data, the preset scene at least comprising one of an acoustic scene and a circuit scene; converting the target sparse matrix into a matrix image and extracting matrix characteristics of the target sparse matrix; inputting the matrix image and the matrix characteristics into a preset neural network model; outputting a predicted storage format of the target sparse matrix through the neural network model; and performing sparse matrix vector multiplication on the target sparse matrix in the predicted storage format, and determining the characteristic information of the preset scene from the result of the multiplication. Compared with the previous approach of performing sparse matrix vector multiplication on all sparse matrices in the same storage format, the neural network model in this method is trained with the sample matrix image and the sample matrix characteristics of each sample sparse matrix as inputs and, as output, the format with the shortest time among the timing results obtained when every format of the sample sparse matrix performs sparse matrix vector multiplication the same number of times. The predicted storage format output by the model for the target sparse matrix is therefore also the format that takes the shortest time for sparse matrix vector multiplication, which improves the efficiency of sparse matrix vector multiplication and hence of feature detection based on it.
In addition, the application also provides a device for detecting the characteristics based on the sparse matrix vector multiplication and a computer readable storage medium, and the device and the method have the same or corresponding technical characteristics as the method for detecting the characteristics based on the sparse matrix vector multiplication, and have the same effects.
Drawings
For a clearer description of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described, it being apparent that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a method for feature detection based on sparse matrix vector multiplication according to an embodiment of the present application;
FIG. 2 is a schematic diagram of converting a matrix into an image by a multi-channel histogram representation and a non-zero element position representation, respectively;
FIG. 3 is a block diagram of an apparatus for feature detection based on sparse matrix vector multiplication according to an embodiment of the present application;
FIG. 4 is a block diagram of an apparatus for feature detection based on sparse matrix vector multiplication according to another embodiment of the present application.
Detailed Description
The following description of the technical solutions in the embodiments of the present application will be made clearly and completely with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, but not all embodiments. All other embodiments obtained by those skilled in the art based on the embodiments herein without making any inventive effort are intended to fall within the scope of the present application.
The core of the application is to provide a method, a device and a medium for feature detection based on sparse matrix vector multiplication, which are used for determining an optimal format of a certain matrix for sparse matrix vector multiplication, and carrying out matrix vector multiplication according to the determined optimal format, so that the efficiency of scientific operation is improved, and the efficiency of feature detection based on sparse matrix vector multiplication is improved.
In the field of scientific computing, the performance of sparse matrix vector multiplication affects the efficiency of many operations, such as web page ranking, solving systems of linear equations, and graph analysis. During computation, a matrix is converted into a sparse matrix format before it takes part in the operation. Sparse matrix vector multiplication is memory-access intensive, and its performance is limited by the available memory bandwidth. To address this, many sparse matrix storage formats have been designed for different hardware architectures and non-zero element distributions, each improving efficiency in a different situation. However, on a given computing platform, no single sparse matrix storage format gives optimal performance for all matrices. The present application therefore uses a trained neural network model to predict the optimal format for performing sparse matrix vector multiplication on each sparse matrix, and performs the multiplication in the predicted format to improve computational efficiency.
To help those skilled in the art better understand the solution of the present application, the application is described in further detail below with reference to the drawings and specific embodiments. FIG. 1 is a flowchart of a feature detection method based on sparse matrix vector multiplication according to an embodiment of the present application. As shown in FIG. 1, the method includes:
S10: acquiring data that are collected by the data acquisition equipment and represent the characteristics of a preset scene, and determining a target sparse matrix from the data.
The preset scene at least comprises one of an acoustic scene and a circuit scene.
S11: converting the target sparse matrix into a matrix image and extracting matrix characteristics of the target sparse matrix.
S12: inputting the matrix image and the matrix characteristics into a preset neural network model.
The neural network model is trained by taking a sample matrix image corresponding to a sample sparse matrix and sample matrix characteristics corresponding to the sample sparse matrix as inputs, and taking as output the format with the shortest time among the timing results obtained when every format of the sample sparse matrix performs sparse matrix vector multiplication the same number of times.
S13: outputting a predicted storage format of the target sparse matrix through the neural network model.
S14: performing sparse matrix vector multiplication on the target sparse matrix in the predicted storage format, and determining the characteristic information of the preset scene from the result of the sparse matrix vector multiplication.
The target sparse matrix is the matrix on which sparse matrix vector multiplication is to be performed. Sparse matrices arise in problems and areas such as acoustic problems, biochemical networks, graph theory problems, chemical process simulation, circuit simulation, combinatorial problems, computational fluid dynamics, computer graphics and vision, and directed graph problems. The preset scene is the scene in which the corresponding problem is solved. Neither the data acquisition equipment nor the preset scene is limited here; both are determined according to the actual situation. If the selected preset scene is a marine acoustic scene, the data acquisition equipment can be a sensor: a detector carrying the sensor is placed in the water to measure underwater sound field data such as sound intensity, water temperature, density and flow velocity, and a sparse matrix is then recovered from the underwater sound field data.
Sparse matrix vector multiplication is performed on the sparse matrix on a computing platform to obtain the characteristic information of the preset scene. According to the computing hardware, computing platforms are classified into pure central processing unit (CPU) platforms, pure graphics processing unit (GPU) platforms and CPU+GPU platforms. Table 1 lists some sparse matrix formats and the computing platforms that support them. If the current platform has only a CPU, formats such as CSR, COO, ELL, DIA, CSR, CVR, spV8 and CSR2 can be selected as the sparse matrix format; if the current platform has only a GPU, the available sparse matrix formats are ACSR, CSR, COO, ELL, DIA and CSR. Sparse matrices in different formats take different amounts of time to perform sparse matrix vector multiplication, so the neural network model provided by this embodiment is used to determine the optimal format of the sparse matrix (the format in which sparse matrix vector multiplication takes the shortest time), and the multiplication is then performed in that format.
Table 1 partial sparse matrix format and corresponding supported computing platform thereof
In this embodiment, the sample matrix image corresponding to a sample sparse matrix and the sample matrix characteristics corresponding to the sample sparse matrix are used as inputs, the format with the shortest time among the timing results obtained when every format of the sample sparse matrix performs sparse matrix vector multiplication the same number of times is used as output, and the neural network model is trained until its output reaches a certain accuracy. Neither the method for obtaining the matrix image of a sparse matrix (sample or target) nor the method for extracting its matrix characteristics is limited here; both are determined according to the actual situation.
Once the neural network model for predicting the optimal storage format of a sparse matrix has been determined, the target sparse matrix is converted into a matrix image and its matrix characteristics are extracted; the matrix image and the matrix characteristics are input into the neural network model, which outputs the predicted storage format of the target sparse matrix; sparse matrix vector multiplication is performed in the predicted storage format, and the characteristic information of the preset scene is determined from the result. Taking a marine acoustic scene as the preset scene, after the sparse matrix is recovered from the underwater sound field data, the predicted storage format of the sparse matrix is obtained (because the neural network model is trained with the format that takes the shortest time when all formats perform the same number of sparse matrix vector multiplications, the predicted storage format is also the optimal storage format), and sparse matrix vector multiplication is performed in the predicted storage format to obtain the underwater sound intensity distribution information.
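The last step, performing the multiplication in whichever format the model predicts, can be sketched as a simple dispatch. Only COO and CSR kernels are shown, written in plain Python for readability rather than speed; the function names and the string labels are assumptions, and the predicted format is passed in directly instead of coming from a trained model.

```python
def coo_to_csr(n_rows, rows, cols, vals):
    """Convert coordinate triplets into CSR arrays (indptr, indices, data)."""
    indptr = [0] * (n_rows + 1)
    for r in rows:
        indptr[r + 1] += 1
    for i in range(n_rows):
        indptr[i + 1] += indptr[i]
    next_slot = indptr[:-1].copy()  # next free position in each row
    indices = [0] * len(vals)
    data = [0.0] * len(vals)
    for r, c, v in zip(rows, cols, vals):
        k = next_slot[r]
        indices[k], data[k] = c, v
        next_slot[r] += 1
    return indptr, indices, data

def spmv_coo(n_rows, rows, cols, vals, x):
    """y = A x with A stored as coordinate triplets."""
    y = [0.0] * n_rows
    for r, c, v in zip(rows, cols, vals):
        y[r] += v * x[c]
    return y

def spmv_csr(n_rows, indptr, indices, data, x):
    """y = A x with A stored in compressed sparse row form."""
    y = [0.0] * n_rows
    for i in range(n_rows):
        for k in range(indptr[i], indptr[i + 1]):
            y[i] += data[k] * x[indices[k]]
    return y

def spmv_in_format(fmt, n_rows, rows, cols, vals, x):
    """Run SpMV in the storage format named by `fmt`."""
    if fmt == "COO":
        return spmv_coo(n_rows, rows, cols, vals, x)
    if fmt == "CSR":
        return spmv_csr(n_rows, *coo_to_csr(n_rows, rows, cols, vals), x)
    raise ValueError(f"unsupported format in this sketch: {fmt}")

rows, cols, vals = [0, 1, 2, 3, 3], [0, 1, 2, 0, 3], [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = "CSR"  # stands in for the model output
print(spmv_in_format(predicted, 4, rows, cols, vals, [1.0, 1.0, 1.0, 1.0]))
# [1.0, 2.0, 3.0, 9.0]
```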
The feature detection method based on sparse matrix vector multiplication provided by this embodiment comprises: acquiring data that are collected by a data acquisition device and represent the features of a preset scene, and determining a target sparse matrix from the data, where the preset scene comprises at least one of an acoustic scene and a circuit scene; converting the target sparse matrix into a matrix image and extracting matrix features of the target sparse matrix; inputting the matrix image and the matrix features into a preset neural network model; outputting a predicted storage format of the target sparse matrix through the neural network model; and performing sparse matrix vector multiplication on the target sparse matrix in the predicted storage format, and determining the feature information of the preset scene from the result of the sparse matrix vector multiplication. Previous methods perform sparse matrix vector multiplication on all sparse matrices in the same storage format. In the method of this embodiment, the neural network model is trained with the sample matrix image and the sample matrix features of each sample sparse matrix as inputs and, as output, the format that takes the shortest time when sparse matrix vector multiplication is performed the same number of times in every format of that sample sparse matrix. The predicted storage format output by the model for the target sparse matrix is therefore the format predicted to take the least time for sparse matrix vector multiplication, which improves the efficiency of sparse matrix vector multiplication and, in turn, the efficiency of feature detection based on it.
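For readers less familiar with the operation itself, the following is a minimal sketch of sparse matrix vector multiplication in one common storage format, CSR. It is an illustration only and not the implementation used in this application; the function names `coo_to_csr` and `spmv_csr` are hypothetical, and a production system would call an optimized library instead.

```python
def coo_to_csr(n_rows, row_idx, col_idx, val):
    """Convert coordinate (COO) triplets to CSR arrays (row_ptr, cols, vals)."""
    counts = [0] * n_rows
    for r in row_idx:
        counts[r] += 1
    # row_ptr[r] is the offset of the first stored element of row r.
    row_ptr = [0] * (n_rows + 1)
    for r in range(n_rows):
        row_ptr[r + 1] = row_ptr[r] + counts[r]
    cols = [0] * len(val)
    vals = [0.0] * len(val)
    next_slot = row_ptr[:-1]  # slicing copies, so row_ptr is not modified
    for r, c, v in zip(row_idx, col_idx, val):
        k = next_slot[r]
        cols[k] = c
        vals[k] = v
        next_slot[r] += 1
    return row_ptr, cols, vals


def spmv_csr(row_ptr, cols, vals, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    y = [0.0] * (len(row_ptr) - 1)
    for r in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += vals[k] * x[cols[k]]
        y[r] = acc
    return y
```

The same product can be computed from any storage format; only the memory layout, and therefore the running time, differs, which is why choosing the format per matrix can pay off.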
On the basis of the above embodiment, in a preferred implementation, building the neural network model includes:
acquiring a sample sparse matrix data set;
carrying out sparse matrix vector multiplication for the same times on all formats of each sample sparse matrix in the sample sparse matrix data set;
selecting, from all formats of each sample sparse matrix, the format with the shortest time in the sparse matrix vector multiplication timing results as the label value of the corresponding sample sparse matrix;
converting each sample sparse matrix into a corresponding sample matrix image and extracting sample matrix characteristics of each sample sparse matrix;
training the neural network model according to the sample matrix image corresponding to each sample sparse matrix, the sample matrix characteristics and the label value corresponding to each sample sparse matrix.
The training process for the neural network model is as follows:
Step 1: First, a sparse matrix dataset is acquired. Any data that conform to the mtx file specification (an mtx file stores matrix data in a sparse matrix format) can be added directly to the dataset.
Step 2: The same number of sparse matrix vector multiplications is performed on the computing platform for every format. To reduce measurement error, the multiplication is repeated many times, for example 1000 or 10000 times, and the average over the repetitions is taken; in the method of this embodiment, sparse matrix vector multiplication is performed at least 1000 times. Based on the sparse matrix formats and corresponding computing platforms partially listed in Table 1, each matrix performs sparse matrix vector multiplication in each selected format, and the time each matrix takes with the different sparse matrix formats is recorded. Table 2 gives the timing statistics on a pure CPU platform for the matrix apache2 (the name of the matrix file, with the suffix .mtx). The CPU used is an Intel Xeon Gold 6258R with 28 cores and 56 threads, a maximum frequency of 4.00 GHz, a base frequency of 2.7 GHz, a 38.5 MB cache, and DDR4-2933 memory.
TABLE 2 time consuming statistics of matrix apache2 on a pure CPU platform
Step 3: The matrix data are labeled. Based on the timing results of the sparse matrix vector multiplication, the format with the shortest time is used as the label value of the matrix. As shown in Table 2, apache2 performs SpMV 3000 times in each of 8 formats on a pure CPU platform, and the MKL-CSR format takes the least time, so the label value of apache2 is MKL-CSR.
Step 4: The matrix features of the matrix are extracted.
Step 5: The matrix is converted into a matrix image.
Step 6: The extracted matrix features and the matrix image are taken as input, and the neural network is trained together with the matrix label value obtained in Step 3.
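Steps 2 and 3 above amount to timing each format over a fixed number of repetitions and keeping the fastest one as the label. A minimal sketch follows, assuming each format is wrapped in a zero-argument callable; the helper names `average_spmv_time` and `pick_label` and the format names used in the usage note are hypothetical and do not come from Table 1 or Table 2.

```python
import time


def average_spmv_time(spmv, repeats=1000):
    """Return the mean wall-clock seconds of calling spmv() `repeats` times."""
    start = time.perf_counter()
    for _ in range(repeats):
        spmv()
    return (time.perf_counter() - start) / repeats


def pick_label(timings):
    """Return the format name whose measured time is the smallest."""
    return min(timings, key=timings.get)
```

For example, given a dictionary such as `{"format_a": 3.0, "format_b": 1.5}` (placeholder values, not measurements), `pick_label` returns `"format_b"`. Averaging over many repetitions follows the text; the choice of `time.perf_counter` as the clock is an assumption.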
For the neural network model used in Step 6, this embodiment provides a new neural network architecture that predicts the optimal sparse matrix vector multiplication storage format of a matrix from the dual input of matrix features and matrix image (because of this dual input, the new neural network is referred to as MixedNet in this application). In a preferred embodiment, the neural network model includes: an image channel and a feature channel; the image channel is used for receiving the image representation of the matrix, and the feature channel is used for receiving the feature representation of the matrix; the matrix comprises a sample sparse matrix and a target sparse matrix;
the image channel comprises a convolution layer, a preset number of convolution layers with attention mechanisms, a full convolution layer, a pooling layer and a full connection layer;
the feature channel comprises two full connection layers, a residual full connection layer and a full connection layer; the residual full connection layer comprises a residual block formed by two full connection layers;
the output of the image channel and the output of the feature channel pass through a full connection layer to obtain the predicted storage format. Table 3 lists the structural parameters of MixedNet.
TABLE 3 Structure parameters of MixedNet
As can be seen from Table 3, MixedNet consists of two channels: an image channel and a feature channel. The image channel receives an image representation of the matrix, such as a multi-channel histogram or a non-zero element position map. Assuming the image size is H×W, the input image first passes through one convolution layer (Conv) with a 3×3 convolution kernel and 32 channels, and then through 7 convolution layers with attention mechanisms (MBConv), whose kernel sizes and channel numbers are given in Table 3; finally, the image channel output is obtained through one full convolution layer, one pooling layer and one full connection layer. The feature channel receives the feature representation of the matrix: selected features of the matrix form a vector, and this vector is the feature representation of the matrix (commonly selected features are listed in Table 4). The feature channel contains two full connection layers (FC) and one residual full connection layer (Residual FC); the residual full connection layer contains a residual block (FCBlock) composed of two full connection layers. The feature channel output then passes through another full connection layer, is concatenated (Concat) with the output of the image channel, and passes through a final full connection layer to give the prediction result.
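The two less common pieces of this structure, the residual block built from two full connection layers and the final concatenation of the two channel outputs, can be sketched in plain Python. This is a toy forward pass under stated assumptions, not the MixedNet implementation: the ReLU activation, the position of the skip connection, the layer sizes and all function names are hypothetical, and the convolutional image channel is omitted entirely.

```python
def fc(x, weights, bias):
    """Full connection layer; `weights` holds one row of coefficients per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise ReLU (the activation is an assumption, not stated in the text)."""
    return [max(0.0, v) for v in x]


def residual_fc_block(x, w1, b1, w2, b2):
    """Two full connection layers with a skip connection: x + FC2(ReLU(FC1(x)))."""
    hidden = relu(fc(x, w1, b1))
    out = fc(hidden, w2, b2)
    return [xi + oi for xi, oi in zip(x, out)]


def fuse_and_predict(image_out, feature_out, w, b, formats):
    """Concatenate both channel outputs, apply a final FC layer, return the top format."""
    logits = fc(image_out + feature_out, w, b)
    best = max(range(len(logits)), key=logits.__getitem__)
    return formats[best]
```

The skip connection requires the block's input and output to have the same length, which is why both weight matrices in the block are square in this sketch.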
In a preferred embodiment, extracting the matrix features of the sparse matrix includes:
acquiring target data in a sparse matrix; the target data at least comprises values of non-zero elements in the sparse matrix, column values of the non-zero elements in the sparse matrix and row values of the non-zero elements in the sparse matrix;
determining matrix characteristics of the sparse matrix according to the values of the non-zero elements, the column values of the non-zero elements and the row values of the non-zero elements; the matrix characteristics of the sparse matrix at least comprise density of the sparse matrix and average value of non-zero element numbers of each row of the sparse matrix.
In practice, only the mtx file (the sparse matrix storage file) needs to be provided. The sparse matrix data are then read, which mainly comprise three arrays: the val array (recording the values of the non-zero elements of the matrix), the colIdx array (recording the column indices of the non-zero elements) and the rowIdx array (recording the row indices of the non-zero elements). After reading, the three arrays are traversed to compute the features of the matrix. Table 4 lists some of the matrix features.
TABLE 4 partial characterization of matrix
Therefore, in the method provided by the embodiment, extraction of matrix features is realized.
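The two features named explicitly above, the density and the average number of non-zero elements per row, can be computed from the three arrays as in the sketch below. The function name `matrix_features` and the dictionary keys are hypothetical, and the remaining features of Table 4 are not reproduced here.

```python
def matrix_features(n_rows, n_cols, row_idx, col_idx, val):
    """Compute density and average non-zeros per row from COO-style arrays."""
    if not (len(row_idx) == len(col_idx) == len(val)):
        raise ValueError("rowIdx, colIdx and val must have the same length")
    nnz = len(val)
    # Count the stored elements of every row in a single pass over rowIdx.
    per_row = [0] * n_rows
    for r in row_idx:
        per_row[r] += 1
    return {
        "density": nnz / (n_rows * n_cols),
        "avg_nnz_per_row": sum(per_row) / n_rows,
    }
```

The per-row counts are kept in a list because further row statistics could be derived from them in the same pass.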
In converting a sparse matrix into an image representation, embodiments of the present application provide two new matrix image generation methods. In this embodiment, the first method is referred to as a multi-channel histogram representation, and the second method is referred to as a non-zero element position representation.
The process of converting a matrix into an image using a multi-channel histogram representation is as follows:
partitioning the sparse matrix according to a preset size;
acquiring a first distance between a column coordinate of a current non-zero element in a current block and a diagonal line of a sparse matrix;
acquiring a first integer division result of a first distance and the width of the current block and a second integer division result of the row coordinates of the current non-zero element and the length of the current block;
determining coordinates of non-zero elements to be modified in a first channel of the RGB three channels according to the first integer division result and the second integer division result, and adding 1 to element values on the coordinates of the non-zero elements to be modified in the first channel; returning to the step of acquiring the distance between the column coordinates of the current non-zero elements in the current block and the diagonal line of the sparse matrix until all the non-zero elements in the sparse matrix are traversed;
acquiring a second distance between the row coordinates of the current non-zero element in the current block and a diagonal line of the sparse matrix;
obtaining a third integer division result of the second distance and the length of the current block and a fourth integer division result of the column coordinates of the current non-zero element and the width of the current block;
determining coordinates of non-zero elements to be modified in a second channel of the RGB three channels according to the third integer division result and the fourth integer division result, and adding 1 to element values on the coordinates of the non-zero elements to be modified in the second channel; returning to the step of acquiring the distance between the row coordinates of the current non-zero elements in the current block and the diagonal line of the sparse matrix until all the non-zero elements in the sparse matrix are traversed;
acquiring the number of non-zero elements in the current block;
acquiring the ratio of the number of non-zero elements in the current block to the size of the current block;
and taking the ratio as a value corresponding to the current block of the third channel in the RGB three channels.
Specifically, the matrix is first partitioned into blocks of a fixed size (for example, a block size of w×h, where w is the width of a matrix block and h is its length). Next, the distance of the column coordinate (or row coordinate) of each non-zero element from the diagonal of the matrix is calculated: if the non-zero element has coordinates (r, c) and the diagonal element in its row has coordinates (r, r), the distance is abs(c-r), the absolute value of c-r. The distance is then integer-divided by w to obtain col (col = abs(c-r)//w), and r is integer-divided by h to obtain row (row = r//h). Finally, the element value at position (row, col) in the multi-channel histogram is increased by 1. This step is repeated until all non-zero elements have been traversed. The multi-channel histogram representation is divided into a row histogram representation and a column histogram representation. The method used above to determine the coordinates of the non-zero element to be modified in the first channel is the row histogram representation, the method used to determine the coordinates of the non-zero element to be modified in the second channel is the column histogram representation, and the way the element values of the third channel are determined is called the density method. After the element values of the three RGB channels are determined, the method further includes normalizing the element values of each channel so that they fall within a preset range.
Fig. 2 is a schematic diagram of converting a matrix into an image using the multi-channel histogram representation and the non-zero element position representation according to an embodiment of the present application. As shown in Fig. 2, the area enclosed by the dashed box containing elements A and B is Block1; the block size is set to w=2, h=2, and Block1 has row index 0 and column index 0. With the row histogram method, the two non-zero elements in Block1 are A and B, with coordinates (0, 0) and (0, 1); their distances from the diagonal are abs(0-0) and abs(1-0). In RowMajorChannel (the row histogram channel, in practice a two-dimensional matrix), their column coordinates are abs(0-0)//2=0 and abs(1-0)//2=0 and their row coordinates are 0//2=0 and 0//2=0, so the value of RowMajorChannel[0][0] is increased by 2 in total. With the column histogram method, take the non-zero elements E and F as an example: E has coordinates (2, 4) and F has coordinates (2, 5), and their distances from the diagonal elements (4, 4) and (5, 5) are 2 and 3. In ColMajorChannel (the column histogram channel, in practice a two-dimensional matrix), the row coordinates to be modified are 2//2=1 and 3//2=1 and the column coordinates are 4//2=2 and 5//2=2, so the value of ColMajorChannel[1][2] is increased by 2. The row histogram method and the column histogram method assign the elements of the R (red) and G (green) channels of the three-channel image, and the B (blue) channel is a density map. With the 2×2 block size shown in Fig. 2, the first block has coordinates (0, 0) and contains 2 non-zero elements, so the value of the density map DensityChannel[0][0] is 2/4=0.5.
After the row histogram, the column histogram and the density map are obtained, each channel is normalized to the value range [0, 255].
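The multi-channel histogram construction described above can be sketched as follows. This is a minimal sketch under several assumptions that the text does not fix: the matrix is square (n×n), each channel is allocated as a ceil(n/h) × ceil(n/w) grid, edge blocks use the nominal block size w×h for the density, and normalization scales each channel by its maximum. The function names are hypothetical.

```python
import math


def multi_channel_histogram(n, row_idx, col_idx, w, h):
    """Return raw (un-normalized) row-histogram, column-histogram and density channels."""
    grid_rows = math.ceil(n / h)
    grid_cols = math.ceil(n / w)
    row_ch = [[0.0] * grid_cols for _ in range(grid_rows)]
    col_ch = [[0.0] * grid_cols for _ in range(grid_rows)]
    density = [[0.0] * grid_cols for _ in range(grid_rows)]
    for r, c in zip(row_idx, col_idx):
        dist = abs(c - r)  # distance of the element from the diagonal
        row_ch[r // h][dist // w] += 1  # first channel: (r // h, dist // w)
        col_ch[dist // h][c // w] += 1  # second channel: (dist // h, c // w)
        density[r // h][c // w] += 1 / (w * h)  # third channel: block fill ratio
    return row_ch, col_ch, density


def normalize_to_255(channel):
    """Scale a channel so that its largest value maps to 255 (assumed scheme)."""
    peak = max(max(row) for row in channel)
    if peak == 0:
        return [[0.0 for _ in row] for row in channel]
    return [[255.0 * v / peak for v in row] for row in channel]
```

Calling `multi_channel_histogram(6, [0, 0, 2, 2], [0, 1, 4, 5], 2, 2)` with only the elements A, B, E and F from the worked example reproduces the values stated there: the first channel holds 2 at [0][0], the second channel holds 2 at [1][2], and the density channel holds 0.5 at [0][0]. The 6×6 matrix size is an assumption made so that the coordinates of E and F fit.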
The process of converting a matrix into an image using non-zero element position representation is as follows:
acquiring the size of a matrix image to be generated and the number of non-zero elements of a sparse matrix;
filling row coordinates of non-zero elements in corresponding positions in a first channel and column coordinates of non-zero elements in corresponding positions in a second channel in the RGB three channels under the condition that the number of the non-zero elements is equal to the size of a matrix image to be generated;
filling row coordinates of non-zero elements in corresponding positions in a first channel and column coordinates of non-zero elements in corresponding positions in a second channel in the RGB three channels under the condition that the number of the non-zero elements is smaller than the size of a matrix image to be generated; filling 0 for the position of the spare part;
selecting, from the middle of all non-zero elements of the sparse matrix, as many elements as the size of the matrix image to be generated, under the condition that the number of the non-zero elements is larger than the size of the matrix image to be generated; the row coordinates of the selected elements are filled in the corresponding positions in the first channel and the column coordinates of the selected elements are filled in the corresponding positions in the second channel of the RGB three channels.
Specifically, the non-zero element position representation fills the row indices and column indices of the non-zero elements into the corresponding positions of the R (red) and G (green) channels. Suppose the size of the generated image is specified as 256×256. When the total number of non-zero elements of the matrix is less than or equal to 256×256, all elements are filled into the image and the remaining positions are set to 0. When the total number of non-zero elements is greater than 256×256, the 256×256 elements in the middle of all non-zero elements are selected; for example, if the total number of non-zero elements is 256×256×3, then, in the order in which the matrix data are read, the row indices and column indices from the (256×256)-th non-zero element up to the (256×256×2)-th non-zero element are filled in sequence into the R (red) and G (green) channels of the generated image. As shown in Fig. 2, the generated non-zero element position map has size 4×4 and can hold 16 non-zero elements, while the original matrix has 12 elements in total, ordered ABCDEFGHIJKL. A has order 0, so the position to be modified in the map is (0//4, 0%4) = (0, 0); A has coordinates (0, 0), so 0 is filled into the (0, 0) position of both the R and G channels. F has order 5, so the position to be modified is (5//4, 5%4) = (1, 1); F has coordinates (2, 5), so 2 and 5 are filled into the (1, 1) positions of the R and G channels respectively. Finally, after all non-zero elements have been filled into the map, 4 positions remain empty, and the values at these positions in the R and G channels are set to 0.
After determining the elements of the R channel and the G channel, respectively carrying out normalization processing on the values of the elements of each channel so that the values of the elements of each channel are in a preset range.
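The filling rule of the non-zero element position representation can be sketched as below. It is an illustration under assumptions: the image is square with side length `side`, elements are counted from 0, and the "middle" window starts at index (nnz - side*side)//2, which matches the 256×256×3 example but is otherwise a guess. The function name is hypothetical and normalization is left out.

```python
def nonzero_position_image(row_idx, col_idx, side):
    """Fill row indices into R and column indices into G; unused positions stay 0."""
    capacity = side * side
    nnz = len(row_idx)
    # With more elements than positions, keep a window from the middle of the list.
    start = (nnz - capacity) // 2 if nnz > capacity else 0
    rows = row_idx[start:start + capacity]
    cols = col_idx[start:start + capacity]
    r_ch = [[0] * side for _ in range(side)]
    g_ch = [[0] * side for _ in range(side)]
    for k, (r, c) in enumerate(zip(rows, cols)):
        r_ch[k // side][k % side] = r  # position of the k-th element: (k // side, k % side)
        g_ch[k // side][k % side] = c
    return r_ch, g_ch
```

With 12 elements and `side=4`, the element at order 5 with coordinates (2, 5) lands at position (1, 1), so R[1][1] is 2 and G[1][1] is 5, as in the worked example, and the last 4 positions remain 0.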
Generating the matrix image with the multi-channel histogram representation or the non-zero element position representation better preserves the feature information of the matrix; in addition, the size of the generated image can be specified, and matrices of any size can be processed.
In order to better understand the present application, a further detailed description of the present application will be provided below in connection with specific embodiments. In the feature detection process, the process of carrying out sparse matrix vector multiplication on the new matrix is as follows:
Step 1: A new matrix is obtained and stored in mtx file form.
Step 2: The mtx file of the matrix is read.
Step 3: The matrix features are extracted.
Step 4: The matrix is transformed into the specified image form (either the multi-channel histogram form or the non-zero element position form).
Step 5: The matrix features and the matrix image are input into the trained network MixedNet to obtain a prediction result (the predicted optimal sparse matrix storage format).
Step 6: the matrix storage format is converted to a predicted format.
Step 7: sparse matrix vector multiplication is performed.
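Step 2 of this procedure reads an mtx file. Assuming the mtx files follow the Matrix Market coordinate format (a header line, optional comment lines beginning with `%`, a size line, then one 1-based `row column value` entry per line), a minimal reader might look like the sketch below. The function name is hypothetical, and the sketch does not expand symmetric storage or handle pattern and complex matrices.

```python
def read_mtx_coordinate(text):
    """Parse Matrix Market coordinate text into 0-based rowIdx, colIdx and val lists."""
    lines = [ln.strip() for ln in text.splitlines()]
    if not lines or not lines[0].lower().startswith("%%matrixmarket matrix coordinate"):
        raise ValueError("not a Matrix Market coordinate file")
    # Drop comment lines and blank lines after the header.
    body = [ln for ln in lines[1:] if ln and not ln.startswith("%")]
    n_rows, n_cols, nnz = (int(t) for t in body[0].split())
    row_idx, col_idx, val = [], [], []
    for ln in body[1:1 + nnz]:
        i, j, v = ln.split()
        row_idx.append(int(i) - 1)  # indices in the file are 1-based
        col_idx.append(int(j) - 1)
        val.append(float(v))
    return n_rows, n_cols, row_idx, col_idx, val
```

The three returned lists correspond to the rowIdx, colIdx and val arrays used for feature extraction and image generation in the earlier steps.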
In the above embodiments, the method for feature detection based on sparse matrix vector multiplication is described in detail, and the present application further provides an embodiment corresponding to the device for feature detection based on sparse matrix vector multiplication. It should be noted that the present application describes an embodiment of the device portion from two angles, one based on the angle of the functional module and the other based on the angle of the hardware.
Fig. 3 is a block diagram of an apparatus for feature detection based on sparse matrix vector multiplication according to an embodiment of the present application. The embodiment is based on the angle of the functional module, and comprises:
the acquisition module 10 is used for acquiring data which is acquired by the data acquisition equipment and is used for representing the characteristics of a preset scene, and determining a target sparse matrix according to the data; the preset scene at least comprises one of an acoustic scene and a circuit scene;
the conversion and extraction module 11 is used for converting the target sparse matrix into a matrix image and extracting matrix characteristics of the target sparse matrix;
an input module 12, configured to input the matrix image and the matrix features into a preset neural network model; the neural network model is obtained by taking a sample matrix image corresponding to a sample sparse matrix and sample matrix characteristics corresponding to the sample sparse matrix as inputs, and taking a format corresponding to the shortest time in time-consuming results of carrying out sparse matrix vector multiplication on all formats of the sample sparse matrix for the same times as output training;
an output module 13, configured to output a predicted storage format of the target sparse matrix through a neural network model;
the determining module 14 is configured to perform sparse matrix vector multiplication on the target sparse matrix according to the prediction storage format, and determine feature information of the preset scene according to a result of the sparse matrix vector multiplication.
Since the embodiments of the apparatus portion correspond to the embodiments of the method portion, reference is made to the description of the method embodiments for the apparatus embodiments, and the details are not repeated here. The apparatus has the same advantageous effects as the above-mentioned method of feature detection based on sparse matrix vector multiplication.
Fig. 4 is a block diagram of an apparatus for feature detection based on sparse matrix vector multiplication according to another embodiment of the present application. The device for feature detection based on sparse matrix vector multiplication in this embodiment includes, based on hardware angle, as shown in fig. 4:
a memory 20 for storing a computer program;
a processor 21 for implementing the steps of the method of sparse matrix vector multiplication based feature detection as mentioned in the above embodiments when executing a computer program.
The device for feature detection based on sparse matrix vector multiplication provided in this embodiment may include, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, or the like.
Processor 21 may include one or more processing cores, such as a 4-core processor, an 8-core processor, etc. The processor 21 may be implemented in hardware in at least one of a digital signal processor (Digital Signal Processor, DSP), a Field programmable gate array (Field-Programmable Gate Array, FPGA), a programmable logic array (Programmable Logic Array, PLA). The processor 21 may also include a main processor, which is a processor for processing data in an awake state, also called CPU, and a coprocessor; a coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 21 may be integrated with a GPU for taking care of rendering and drawing of the content that the display screen is required to display. In some embodiments, the processor 21 may also include an artificial intelligence (Artificial Intelligence, AI) processor for processing computing operations related to machine learning.
Memory 20 may include one or more computer-readable storage media, which may be non-transitory. Memory 20 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In this embodiment, the memory 20 is at least used for storing a computer program 201, where the computer program, when loaded and executed by the processor 21, is capable of implementing the relevant steps of the sparse matrix vector multiplication based feature detection method disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory 20 may further include an operating system 202, data 203, and the like, where the storage manner may be transient storage or permanent storage. The operating system 202 may include Windows, Unix, Linux, among others. The data 203 may include, but is not limited to, the data involved in the above feature detection method based on sparse matrix vector multiplication.
In some embodiments, the device based on the feature detection of sparse matrix vector multiplication may further comprise a display screen 22, an input/output interface 23, a communication interface 24, a power supply 25 and a communication bus 26.
Those skilled in the art will appreciate that the structure shown in fig. 4 does not constitute a limitation of the apparatus for sparse matrix vector multiplication based feature detection, and may include more or fewer components than shown.
The device for feature detection based on sparse matrix vector multiplication provided by the embodiment of the application comprises a memory and a processor; when executing the program stored in the memory, the processor can implement the above feature detection method based on sparse matrix vector multiplication, with the same effects as described above.
Finally, the present application also provides a corresponding embodiment of the computer readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the steps as described in the method embodiments above.
It will be appreciated that the methods of the above embodiments, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored on a computer readable storage medium. With such understanding, the technical solution of the present application, or a part contributing to the prior art or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, performing all or part of the steps of the method described in the various embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
The computer readable storage medium provided by the application stores a program implementing the above-mentioned feature detection method based on sparse matrix vector multiplication, and the effects are the same as above.
The method, the device and the medium for feature detection based on sparse matrix vector multiplication have been described in detail above. In the description, the embodiments are described in a progressive manner, each embodiment focusing on its differences from the other embodiments, so that identical or similar parts among the embodiments may be referred to one another. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant points can be found in the description of the method section. It should be noted that those skilled in the art can make various improvements and modifications to the present application without departing from its principles, and such improvements and modifications fall within the scope of the claims of the present application.
It should also be noted that in this specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

Claims (9)

1. A method of feature detection based on sparse matrix vector multiplication, comprising:
acquiring data which are acquired by data acquisition equipment and are used for representing the characteristics of a preset scene, and determining a target sparse matrix according to the data; the method for determining the target sparse matrix comprises the following steps of: acquiring underwater sound field data acquired by the sensor, and recovering the target sparse matrix according to the underwater sound field data, wherein the underwater sound field data comprises sound intensity, water temperature density and flow velocity;
converting the target sparse matrix into a matrix image and extracting matrix characteristics of the target sparse matrix;
inputting the matrix image and the matrix characteristics into a preset neural network model; the neural network model is obtained by taking a sample matrix image corresponding to a sample sparse matrix and sample matrix characteristics corresponding to the sample sparse matrix as inputs, and taking a format corresponding to the shortest time in time-consuming results of carrying out sparse matrix vector multiplication for the same times on all formats of the sample sparse matrix as output training;
outputting a predicted storage format of the target sparse matrix through the neural network model;
performing sparse matrix vector multiplication on the target sparse matrix according to the prediction storage format, and determining characteristic information of the preset scene according to a sparse matrix vector multiplication result;
wherein converting the sparse matrix into a matrix image comprises:
partitioning the sparse matrix according to a preset size;
acquiring a first distance between a column coordinate of a current non-zero element in a current block and a diagonal line of the sparse matrix;
acquiring a first integer division result of the first distance and the width of the current block and a second integer division result of the row coordinates of the current non-zero element and the length of the current block;
determining coordinates of non-zero elements to be modified in a first channel of the RGB three channels according to the first integer division result and the second integer division result, and adding 1 to element values on the coordinates of the non-zero elements to be modified in the first channel; returning to the step of acquiring the distance between the column coordinates of the current non-zero element in the current block and the diagonal line of the sparse matrix until all the non-zero elements in the sparse matrix are traversed;
acquiring a second distance between the row coordinates of the current non-zero element in the current block and the diagonal of the sparse matrix;
obtaining a third integer division result of the second distance and the length of the current block and a fourth integer division result of the column coordinates of the current non-zero element and the width of the current block;
determining coordinates of non-zero elements to be modified in a second channel of the RGB three channels according to the third integer division result and the fourth integer division result, and adding 1 to element values on the coordinates of the non-zero elements to be modified in the second channel; returning to the step of acquiring the distance between the row coordinates of the current non-zero elements in the current partition and the diagonal line of the sparse matrix until all the non-zero elements in the sparse matrix are traversed;
acquiring the number of non-zero elements in the current block;
acquiring the ratio of the number of the non-zero elements in the current block to the size of the current block;
and taking the ratio as a value corresponding to the current block of a third channel in the RGB three channels.
2. The method of sparse matrix vector multiplication based feature detection of claim 1, wherein building the neural network model comprises:
acquiring a sample sparse matrix dataset;
performing sparse matrix vector multiplication the same number of times on all formats of each sample sparse matrix in the sample sparse matrix dataset;
selecting, from all formats of the sample sparse matrix, the format with the shortest time-consuming result as the label value of the corresponding sample sparse matrix;
converting each sample sparse matrix into a corresponding sample matrix image and extracting the sample matrix characteristics of each sample sparse matrix;
training the neural network model according to the sample matrix image corresponding to each sample sparse matrix, the sample matrix characteristics and the label value corresponding to each sample sparse matrix.
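The labelling step of claim 2 can be illustrated with a small timing harness. This is a sketch under stated assumptions: the claims do not name the candidate formats, so COO and CSR are used here purely as examples, and the helper names (`spmv_coo`, `spmv_csr`, `time_kernel`, `label_from_timings`) are hypothetical.

```python
import time


def spmv_coo(rows, cols, vals, x, n_rows):
    """y = A @ x with A stored as coordinate (COO) triples."""
    y = [0.0] * n_rows
    for r, c, v in zip(rows, cols, vals):
        y[r] += v * x[c]
    return y


def spmv_csr(indptr, indices, data, x):
    """y = A @ x with A stored in compressed sparse row (CSR) form."""
    y = []
    for i in range(len(indptr) - 1):
        acc = 0.0
        for k in range(indptr[i], indptr[i + 1]):
            acc += data[k] * x[indices[k]]
        y.append(acc)
    return y


def time_kernel(kernel, repeats):
    """Run the kernel the same number of times and return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        kernel()
    return time.perf_counter() - start


def label_from_timings(timings):
    """Return the format name with the shortest measured time."""
    return min(timings, key=timings.get)


# Example matrix [[1, 0, 2], [0, 3, 0], [4, 0, 5]] in both formats.
rows, cols = [0, 0, 1, 2, 2], [0, 2, 1, 0, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0]
indptr = [0, 2, 3, 5]
x = [1.0, 1.0, 1.0]

timings = {
    "COO": time_kernel(lambda: spmv_coo(rows, cols, vals, x, 3), repeats=100),
    "CSR": time_kernel(lambda: spmv_csr(indptr, cols, vals, x), repeats=100),
}
label = label_from_timings(timings)  # label value for this sample matrix
```

The measured label depends on the machine and on the kernels used, which is why the claim fixes the repetition count across formats before comparing times.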
3. The method of feature detection based on sparse matrix vector multiplication of claim 2, wherein the structure of the neural network model comprises: an image channel and a feature channel; wherein the image channel is for receiving an image representation of a matrix and the feature channel is for receiving a feature representation of the matrix; the matrix comprises the sample sparse matrix and the target sparse matrix;
the image channel comprises a convolution layer, a preset number of convolution layers with attention mechanisms, a full convolution layer, a pooling layer and a fully connected layer;
the feature channel comprises two fully connected layers, a residual fully connected layer and a fully connected layer; wherein the residual fully connected layer comprises a residual block formed by two fully connected layers;
and the output of the image channel and the output of the feature channel pass through a fully connected layer to obtain the prediction storage format.
4. A method of feature detection based on sparse matrix vector multiplication according to any one of claims 1 to 3, wherein extracting matrix features of the sparse matrix comprises:
acquiring target data in the sparse matrix; the target data at least comprises values of non-zero elements in the sparse matrix, column values of the non-zero elements in the sparse matrix and row values of the non-zero elements in the sparse matrix;
determining matrix characteristics of the sparse matrix according to the values of the non-zero elements, the column values of the non-zero elements and the row values of the non-zero elements; the matrix characteristics of the sparse matrix at least comprise the density of the sparse matrix and the average value of the number of non-zero elements of each row of the sparse matrix.
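The two features named in claim 4 follow directly from the coordinate data. A minimal sketch, assuming the matrix is supplied as row, column and value lists together with its dimensions; the function name and the returned dictionary keys are hypothetical.

```python
def matrix_features(rows, cols, vals, n_rows, n_cols):
    """Hypothetical helper: density and mean non-zeros per row of a sparse matrix."""
    if not (len(rows) == len(cols) == len(vals)):
        raise ValueError("rows, cols and vals must have the same length")
    nnz = len(vals)  # number of stored non-zero elements
    return {
        "density": nnz / (n_rows * n_cols),  # fraction of entries that are non-zero
        "mean_nnz_per_row": nnz / n_rows,    # average non-zeros per row
    }


# 3 x 3 matrix with five non-zero elements.
features = matrix_features(
    [0, 0, 1, 2, 2], [0, 2, 1, 0, 2], [1.0, 2.0, 3.0, 4.0, 5.0], 3, 3
)
```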
5. The method of feature detection based on sparse matrix vector multiplication of claim 1, wherein converting the sparse matrix into a matrix image comprises:
acquiring the size of a matrix image to be generated and the number of non-zero elements of the sparse matrix;
filling row coordinates of the non-zero elements in corresponding positions in a first channel and column coordinates of the non-zero elements in corresponding positions in a second channel in three RGB channels, respectively, when the number of the non-zero elements is equal to the size of the matrix image to be generated;
filling row coordinates of the non-zero elements in corresponding positions in the first channel and filling column coordinates of the non-zero elements in corresponding positions in the second channel in the RGB three channels when the number of the non-zero elements is smaller than the size of the matrix image to be generated; and filling the remaining positions with 0;
selecting, from the middle of all non-zero elements of the sparse matrix, a number of elements equal to the size of the matrix image to be generated, under the condition that the number of the non-zero elements is larger than the size of the matrix image to be generated; and filling row coordinates of the selected elements in the corresponding positions in the first channel and filling column coordinates of the selected elements in the corresponding positions in the second channel in the RGB three channels.
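The three cases of claim 5 (equal, fewer, more non-zero elements than image positions) can be sketched as one pad-or-crop routine. Assumptions made here and not stated in the claims: the image size is height x width positions, positions are filled in row-major order, and "the middle" means a centered contiguous window over the non-zero elements in storage order. The name `coords_to_channels` is hypothetical.

```python
def coords_to_channels(rows, cols, height, width):
    """Hypothetical helper: fill row/column coordinates into two image channels."""
    rows, cols = list(rows), list(cols)
    size = height * width
    nnz = len(rows)
    if nnz > size:
        # More non-zeros than positions: keep a centered window of `size` elements.
        start = (nnz - size) // 2
        rows, cols = rows[start:start + size], cols[start:start + size]
    else:
        # Equal or fewer non-zeros: pad the remaining positions with 0.
        pad = size - nnz
        rows, cols = rows + [0] * pad, cols + [0] * pad
    # Reshape the flat lists into height x width channels (row-major).
    ch1 = [rows[i * width:(i + 1) * width] for i in range(height)]
    ch2 = [cols[i * width:(i + 1) * width] for i in range(height)]
    return ch1, ch2


padded = coords_to_channels([1, 2], [3, 4], 2, 2)
cropped = coords_to_channels([0, 1, 2, 3, 4, 5], [5, 4, 3, 2, 1, 0], 2, 2)
```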
6. The method of sparse matrix vector multiplication based feature detection of claim 1 or 5, wherein after determining the values of the RGB three-channel elements, the method further comprises:
normalizing the values of the elements of each channel separately, so that the values of the elements of each channel are within a preset range.
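Claim 6 does not specify the normalization scheme or the preset range. One common choice is per-channel min-max scaling, sketched below with an assumed default range of [0, 1]; the function name and the handling of constant channels are illustrative assumptions.

```python
def normalize_channel(channel, lo=0.0, hi=1.0):
    """Hypothetical helper: min-max scale one channel into the range [lo, hi]."""
    flat = [v for row in channel for v in row]
    v_min, v_max = min(flat), max(flat)
    if v_max == v_min:
        # Constant channel: map every element to the lower bound.
        return [[lo for _ in row] for row in channel]
    scale = (hi - lo) / (v_max - v_min)
    return [[lo + (v - v_min) * scale for v in row] for row in channel]


normalized = normalize_channel([[0, 2], [4, 8]])
```

Each of the three channels would be passed through such a function independently, since their raw value ranges (counts versus ratios) differ.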
7. An apparatus for feature detection based on sparse matrix vector multiplication, comprising:
the acquisition module is used for acquiring data which are acquired by the data acquisition equipment and are used for representing the characteristics of a preset scene, and determining a target sparse matrix according to the data; wherein determining the target sparse matrix comprises: acquiring underwater sound field data acquired by the sensor, and recovering the target sparse matrix according to the underwater sound field data, wherein the underwater sound field data comprises sound intensity, water temperature density and flow velocity;
the conversion and extraction module is used for converting the target sparse matrix into a matrix image and extracting matrix characteristics of the target sparse matrix;
the input module is used for inputting the matrix image and the matrix characteristics into a preset neural network model; the neural network model is obtained by training with a sample matrix image corresponding to a sample sparse matrix and sample matrix characteristics corresponding to the sample sparse matrix as inputs, and with, as output, the format having the shortest time among the time-consuming results of performing sparse matrix vector multiplication the same number of times on all formats of the sample sparse matrix;
the output module is used for outputting a prediction storage format of the target sparse matrix through the neural network model;
the determining module is used for carrying out sparse matrix vector multiplication on the target sparse matrix according to the prediction storage format, and determining the characteristic information of the preset scene according to the result of the sparse matrix vector multiplication;
wherein converting the sparse matrix into a matrix image comprises:
partitioning the sparse matrix according to a preset size;
acquiring a first distance between a column coordinate of a current non-zero element in a current block and a diagonal line of the sparse matrix;
acquiring a first integer division result of the first distance and the width of the current block and a second integer division result of the row coordinates of the current non-zero element and the length of the current block;
determining coordinates of non-zero elements to be modified in a first channel of the RGB three channels according to the first integer division result and the second integer division result, and adding 1 to element values on the coordinates of the non-zero elements to be modified in the first channel; returning to the step of acquiring the distance between the column coordinates of the current non-zero element in the current block and the diagonal line of the sparse matrix until all the non-zero elements in the sparse matrix are traversed;
acquiring a second distance between the row coordinates of the current non-zero element in the current block and the diagonal of the sparse matrix;
obtaining a third integer division result of the second distance and the length of the current block and a fourth integer division result of the column coordinates of the current non-zero element and the width of the current block;
determining coordinates of non-zero elements to be modified in a second channel of the RGB three channels according to the third integer division result and the fourth integer division result, and adding 1 to element values on the coordinates of the non-zero elements to be modified in the second channel; returning to the step of acquiring the second distance between the row coordinate of the current non-zero element in the current block and the diagonal of the sparse matrix until all the non-zero elements in the sparse matrix are traversed;
acquiring the number of non-zero elements in the current block;
acquiring the ratio of the number of the non-zero elements in the current block to the size of the current block;
and taking the ratio as a value corresponding to the current block of a third channel in the RGB three channels.
8. An apparatus for feature detection based on sparse matrix vector multiplication, comprising:
a memory for storing a computer program;
a processor for implementing the steps of a method of sparse matrix vector multiplication based feature detection as claimed in any one of claims 1 to 6 when executing said computer program.
9. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of a method of feature detection based on sparse matrix vector multiplication according to any one of claims 1 to 6.
CN202310484071.2A 2023-05-04 2023-05-04 Feature detection method, device and medium based on sparse matrix vector multiplication Active CN116186526B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310484071.2A CN116186526B (en) 2023-05-04 2023-05-04 Feature detection method, device and medium based on sparse matrix vector multiplication

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310484071.2A CN116186526B (en) 2023-05-04 2023-05-04 Feature detection method, device and medium based on sparse matrix vector multiplication

Publications (2)

Publication Number Publication Date
CN116186526A (en) 2023-05-30
CN116186526B (en) 2023-07-18

Family

ID=86442643

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310484071.2A Active CN116186526B (en) 2023-05-04 2023-05-04 Feature detection method, device and medium based on sparse matrix vector multiplication

Country Status (1)

Country Link
CN (1) CN116186526B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113721982A (en) * 2021-08-03 2021-11-30 清华大学 Sparse matrix storage method, vector calculation method and electronic equipment
CN115048215A (en) * 2022-05-24 2022-09-13 哈尔滨工程大学 Method for realizing diagonal matrix SPMV (sparse matrix) on GPU (graphics processing Unit) based on mixed compression format
CN115390788A (en) * 2022-09-19 2022-11-25 复旦大学 Sparse matrix multiplication distribution system of graph convolution neural network based on FPGA

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140181171A1 (en) * 2012-12-24 2014-06-26 Pavel Dourbal Method and system for fast tensor-vector multiplication
US10346507B2 (en) * 2016-11-01 2019-07-09 Nvidia Corporation Symmetric block sparse matrix-vector multiplication
US11127167B2 (en) * 2019-04-29 2021-09-21 Nvidia Corporation Efficient matrix format suitable for neural networks

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Efficiently Executing Sparse Matrix-Matrix Multiplication on General Purpose Digital Single Processor; Haibo Xu et al.; 2022 IEEE 24th Int Conf on High Performance Computing & Communications; pp. 1-8 *
Sparse Matrix Classification on Imbalanced Datasets Using Convolutional Neural Networks; Juan C. Pichel et al.; IEEE Access; pp. 82377-82389 *
Research on GPU-Based Optimization of Sparse Matrix Storage Formats; Yang Shiwei; Computer Engineering; Vol. 45 (No. 9); pp. 23-39 *

Also Published As

Publication number Publication date
CN116186526A (en) 2023-05-30

Similar Documents

Publication Publication Date Title
US10096134B2 (en) Data compaction and memory bandwidth reduction for sparse neural networks
EP3627397B1 (en) Processing method and apparatus
KR102499396B1 (en) Neural network device and operating method of neural network device
CN108364061B (en) Arithmetic device, arithmetic execution apparatus, and arithmetic execution method
CN106855952B (en) Neural network-based computing method and device
EP3674986A1 (en) Neural network apparatus and method with bitwise operation
US20200380360A1 (en) Method and apparatus with neural network parameter quantization
CN111444807B (en) Target detection method, device, electronic equipment and computer readable medium
CN107832794A (en) A kind of convolutional neural networks generation method, the recognition methods of car system and computing device
CN111639230B (en) Similar video screening method, device, equipment and storage medium
CN113126953A (en) Method and apparatus for floating point processing
CN112529862A (en) Significance image detection method for interactive cycle characteristic remodeling
CN116186526B (en) Feature detection method, device and medium based on sparse matrix vector multiplication
KR20210124888A (en) Neural network device for neural network operation, operating method of neural network device and application processor comprising neural network device
CN112561050B (en) Neural network model training method and device
CN111898544A (en) Character and image matching method, device and equipment and computer storage medium
CN116342628A (en) Pathological image segmentation method, pathological image segmentation device and computer equipment
CN114820755B (en) Depth map estimation method and system
CN111291240A (en) Method for processing data and data processing device
CN111899161A (en) Super-resolution reconstruction method
CN111767204A (en) Overflow risk detection method, device and equipment
US20230325665A1 (en) Sparsity-based reduction of gate switching in deep neural network accelerators
US20240028895A1 (en) Switchable one-sided sparsity acceleration
US20240119269A1 (en) Dynamic sparsity-based acceleration of neural networks
US20240020517A1 (en) Real-time inference of temporal down-sampling convolutional networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant