WO2020119188A1 - Program detection method, apparatus and device, and readable storage medium

Info

Publication number: WO2020119188A1
Application number: PCT/CN2019/103639
Authority: WO
Inventors: 曹芳; 赵雅倩; 郭振华
Original assignee: 广东浪潮大数据研究有限公司
Priority date: 2018-12-10
Filing date: 2019-08-30
Publication date: 2020-06-18
Also published as: CN109558329A

Abstract

A program detection method, apparatus and device, and a readable storage medium. The method of the present application comprises: after a Winograd program detection instruction is received, obtaining test data; performing convolution calculation on the test data by using a target algorithm program of a convolutional neural network to obtain a convolution result; sending the test data to an FPGA, so that the FPGA performs fast convolution calculation on the test data by using a Winograd program; receiving a fast convolution result sent by the FPGA, and calculating a similarity between the fast convolution result and the convolution result; and if the similarity is greater than a threshold, determining that the Winograd program is correct. By means of the method, the Winograd program in the FPGA can be detected.

Description

Program detection method, device, equipment and readable storage medium

This application claims priority to Chinese patent application No. 201811514703.0, filed with the Chinese Patent Office on December 10, 2018 and titled "A program detection method, device, equipment and readable storage medium", the entire content of which is incorporated into this application by reference.

Technical field

The present invention relates to the field of computer application technology, and in particular to a program detection method, apparatus, device, and readable storage medium.

Background art

In recent years, convolutional neural networks (CNNs) have been used more and more widely in computer vision tasks. A CNN usually contains multiple layers, and the output feature map of each layer is the input feature map of the next layer. The computation of current state-of-the-art CNNs is dominated by the convolutional layers.

The FPGA (Field-Programmable Gate Array), with its high performance, low energy consumption and reconfigurability, has become an effective hardware accelerator for CNNs and has attracted much attention. If the target convolution algorithm (that is, traditional sliding-window convolution) is used, each element of the output feature map must be computed separately through multiple multiply-accumulate steps, which requires a large number of DSP (multiplier) resources in the FPGA to perform the multiplications. However, the DSP resources on an FPGA board are limited and precious, and cannot supply the number of multiplications that the target convolution algorithm requires.

The Winograd algorithm is a fast algorithm for convolutional neural networks. It exploits the structural similarity between elements to generate a group of output feature-map elements together, which reduces the number of multiplications, greatly reduces the complexity of the algorithm, and can improve CNN performance on the FPGA.

However, implementing the Winograd algorithm in code on the FPGA increases the complexity of the program and is extremely error-prone during development. If the Winograd algorithm part is wrong, the accuracy of the entire CNN algorithm is affected. To make program verification more reliable, the Winograd program usually has to be verified with many different test data. However, because of the complexity of the Winograd algorithm, it is difficult during development to obtain the expected convolution results for different test data. Even if a table mapping test data to expected results is stored in advance for verification, the amount of test data is small, the outcome is subject to chance, and the test process is complicated and hard to carry out.

In summary, how to effectively check whether the Winograd program is correct is a technical problem that those skilled in the art urgently need to solve.

Summary of the invention

The purpose of the present invention is to provide a program detection method, device, equipment and readable storage medium to detect the Winograd program on the FPGA to ensure the accuracy of the entire CNN algorithm.

To solve the above technical problems, the present invention provides the following technical solutions:

A program detection method, including:

Obtaining test data when a Winograd program detection instruction is received;

Performing convolution calculation on the test data using the target algorithm program of the convolutional neural network to obtain a convolution result, where the target algorithm program is an algorithm program that implements the convolutional neural network in a sliding window manner;

Sending the test data to the FPGA, so that the FPGA uses the Winograd program to perform fast convolution calculation on the test data;

Receiving the fast convolution result sent by the FPGA, and calculating the similarity between the fast convolution result and the convolution result;

Determining that the Winograd program is correct when the similarity is greater than a threshold.

Preferably, the target algorithm program of the convolutional neural network is used to perform convolution calculation on the test data to obtain a convolution result, including:

Use the target algorithm program to perform convolution calculation on the test data, and use the first layer result of the convolutional neural network as the convolution result;

Correspondingly, the FPGA uses the Winograd program to perform fast convolution calculation on the test data, including:

The FPGA uses the Winograd program to perform fast convolution calculation on the test data, and takes the first-layer result of the convolutional neural network obtained by the fast calculation as the fast convolution result.

Preferably, it also includes:

Obtaining filter parameters of the convolutional neural network;

The filter parameters are set in the target convolution algorithm program and the Winograd program, respectively.

Preferably, sending the test data to the FPGA includes:

Creating the FPGA board operating environment and initializing the board parameters;

Sending the test data to the FPGA.

Preferably, the FPGA uses the Winograd program to perform fast convolution calculation on the test data, including:

The FPGA starts the kernel and uses the Winograd program to perform fast convolution calculation on the test data.

Preferably, calculating the similarity between the fast convolution result and the convolution result includes:

Calculating the ratio of the fast convolution result to the convolution result, and using the ratio to determine the similarity;

Or, calculate the difference between the fast convolution result and the convolution result, and use the difference to determine the similarity.

Preferably, when the similarity is less than or equal to the threshold, it further includes:

It is determined that the Winograd program is wrong.

A program detection device, including:

Test data acquisition module, used to acquire test data when receiving the Winograd program detection instruction;

A convolution calculation module, configured to perform convolution calculation on the test data using the target algorithm program of the convolutional neural network to obtain a convolution result, where the target algorithm program is an algorithm program that implements the convolutional neural network in a sliding window manner;

A test data sending module, configured to send the test data to the FPGA, so that the FPGA uses the Winograd program to perform fast convolution calculation on the test data;

A similarity calculation module, configured to receive the fast convolution result sent by the FPGA, and calculate the similarity between the fast convolution result and the convolution result;

The detection result determination module is used to determine that the Winograd program is correct when the similarity is greater than a threshold.

A program detection device, including:

Memory, used to store computer programs;

The processor is configured to implement the steps of the above program detection method when executing the computer program.

A readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the above program detection method are implemented.

With the method provided by the embodiments of the present invention, test data is obtained when a Winograd program detection instruction is received; convolution calculation is performed on the test data using the target algorithm program of the convolutional neural network to obtain a convolution result, where the target algorithm program is an algorithm program that implements the convolutional neural network in a sliding window manner; the test data is sent to the FPGA so that the FPGA performs fast convolution calculation on the test data using the Winograd program; the fast convolution result sent by the FPGA is received, and the similarity between the fast convolution result and the convolution result is calculated; when the similarity is greater than the threshold, the Winograd program is determined to be correct.

The target algorithm program of the convolutional neural network, that is, the sliding window implementation, can express the convolution algorithm accurately with nothing more than nested loops, so its code is simple and its probability of error is low. The Winograd program is a fast algorithm program implementing the same convolutional neural network. Therefore, when a correctly written Winograd program and the target algorithm program compute the convolution of the same input data, the two results should be identical or differ only within a certain range, that is, they should be similar. Based on this, after the Winograd program has been written to the FPGA and a Winograd program detection instruction is received, test data for verification is first obtained. The CPU then performs convolution calculation on the test data using the target algorithm program of the convolutional neural network to obtain the convolution result. At the same time, the test data can be sent to the FPGA. After receiving the test data, the FPGA performs fast convolution calculation on it using the Winograd program and sends the fast convolution result to the CPU. After receiving the fast convolution result, the CPU calculates the similarity between the fast convolution result and the convolution result; when the similarity is greater than the threshold, the Winograd program is determined to be correct. In this way, the target algorithm program running on the CPU can be used to check the Winograd program on the FPGA, which ensures the correctness of the Winograd algorithm part, improves the accuracy of the CNN algorithm on the FPGA, and in turn improves the accuracy of computer vision tasks implemented on the FPGA.

Correspondingly, the embodiments of the present invention also provide a program detection device, device and readable storage medium corresponding to the above-mentioned program detection method, which have the above-mentioned technical effects and will not be repeated here.

Brief description of the drawings

To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is an implementation flowchart of a program detection method in an embodiment of the present invention;

FIG. 2 is a schematic diagram of the function for creating a board operating environment;

FIG. 3 is a schematic diagram of the function for initializing board parameters;

FIG. 4 is a specific flowchart of a program detection method in an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a program detection device according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a program detection device according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of a specific structure of a program detection device in an embodiment of the present invention.

Detailed description

The core of the present invention is to provide a program detection method that combines the advantages of the Winograd algorithm and the target algorithm: the running result of the Winograd algorithm program is checked against the running result of the target algorithm program, in order to determine whether the Winograd algorithm program expresses the Winograd algorithm correctly.

The fast Winograd algorithm is introduced first. Let F(m, r) denote a one-dimensional convolution with output size m and filter size r (the input therefore has m + r - 1 elements), and let F(m×m, r×r) denote a two-dimensional convolution with output size m×m and filter size r×r. The Winograd fast filtering algorithm for the one-dimensional convolution F(m, r) can be written in matrix form as $Y = A^T[(Gg) \odot (B^T d)]$. Nesting F(m, r) with itself gives the minimal two-dimensional algorithm F(m×m, r×r), which can be expressed as $Y = A^T[(G g G^T) \odot (B^T d B)]A$, where g is the filter data, d is the input data, $\odot$ denotes element-wise multiplication, and G, A and B are three transformation matrices. Taking F(4, 3) as an example, the values of G, B, and A are as follows:
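For reference, the standard F(4, 3) Winograd transform matrices are given below. That these are the values shown in the original figure is an assumption; their coefficients are consistent with the code expressions that follow.

```latex
B^T = \begin{bmatrix}
4 & 0 & -5 & 0 & 1 & 0 \\
0 & -4 & -4 & 1 & 1 & 0 \\
0 & 4 & -4 & -1 & 1 & 0 \\
0 & -2 & -1 & 2 & 1 & 0 \\
0 & 2 & -1 & -2 & 1 & 0 \\
0 & 4 & 0 & -5 & 0 & 1
\end{bmatrix},
\quad
G = \begin{bmatrix}
\tfrac{1}{4} & 0 & 0 \\
-\tfrac{1}{6} & -\tfrac{1}{6} & -\tfrac{1}{6} \\
-\tfrac{1}{6} & \tfrac{1}{6} & -\tfrac{1}{6} \\
\tfrac{1}{24} & \tfrac{1}{12} & \tfrac{1}{6} \\
\tfrac{1}{24} & -\tfrac{1}{12} & \tfrac{1}{6} \\
0 & 0 & 1
\end{bmatrix},
\quad
A^T = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 0 \\
0 & 1 & -1 & 2 & -2 & 0 \\
0 & 1 & 1 & 4 & 4 & 0 \\
0 & 1 & -1 & 8 & -8 & 1
\end{bmatrix}
```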

When implementing the Winograd algorithm in FPGA code, the calculation follows the above formula. Taking one-dimensional F(4, 3) as an example, with d = [d1, d2, d3, d4, d5, d6] and g = [g0, g1, g2], the calculation of $B^T d$ requires 6 expressions in code:

float trans_input0 = 4.0f*d1 - 5.0f*d3 + d5;

float trans_input1 = -4.0f*d2 - 4.0f*d3 + d4 + d5;

float trans_input2 = 4.0f*d2 - 4.0f*d3 - d4 + d5;

float trans_input3 = -2.0f*d2 - d3 + 2.0f*d4 + d5;

float trans_input4 = 2.0f*d2 - d3 - 2.0f*d4 + d5;

float trans_input5 = 4.0f*d2 - 5.0f*d4 + d6;

Calculating $Gg$ likewise requires 6 expressions in code, where one_over_4, minus_one_over_6 and the similar names denote the constants 1/4, -1/6 and so on:

float trans_filter0 = one_over_4*g0;

float trans_filter1 = minus_one_over_6*g0 + minus_one_over_6*g1 + minus_one_over_6*g2;

float trans_filter2 = minus_one_over_6*g0 + one_over_6*g1 + minus_one_over_6*g2;

float trans_filter3 = one_over_24*g0 + one_over_12*g1 + one_over_6*g2;

float trans_filter4 = one_over_24*g0 - one_over_12*g1 + one_over_6*g2;

float trans_filter5 = g2;

Assuming that the element-wise product of $B^T d$ and $Gg$ is [mul0, mul1, mul2, mul3, mul4, mul5], calculating the final result Y requires 4 expressions in code:

float result0 = mul0 + mul1 + mul2 + mul3 + mul4;

float result1 = mul1 - mul2 + 2.0f*mul3 - 2.0f*mul4;

float result2 = mul1 + mul2 + 4.0f*mul3 + 4.0f*mul4;

float result3 = mul1 - mul2 + 8.0f*mul3 - 8.0f*mul4 + mul5;

As shown above, the one-dimensional F(4, 3) Winograd code requires a total of 6 + 6 + 4 = 16 arithmetic expressions to be written by hand. Similarly, two-dimensional F(4x4, 3x3) contains at least (6x6+6x6)+(6x3+6x6)+(4x6+4x4)=166 expressions. The number of expressions that must be written by hand grows sharply, and each expression combines addition, subtraction, multiplication and division with different constants and different variables. This complexity greatly increases the chance of coding errors.
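The three stages above can be assembled into one function and checked against direct computation. The following is a minimal sketch, not the kernel from this application: the function names `winograd_f43_1d` and `direct_f43_1d` are hypothetical, indices are 0-based (`d[0..5]` holds the six input samples), and the constants are written as divisions instead of the `one_over_*` names.

```c
/* 1-D Winograd F(4,3): 4 outputs from 6 inputs and a 3-tap filter.
 * Hypothetical helper written for illustration only. */
void winograd_f43_1d(const float d[6], const float g[3], float y[4])
{
    float t[6], f[6], m[6];

    /* Input transform B^T d */
    t[0] =  4.0f * d[0] - 5.0f * d[2] + d[4];
    t[1] = -4.0f * d[1] - 4.0f * d[2] + d[3] + d[4];
    t[2] =  4.0f * d[1] - 4.0f * d[2] - d[3] + d[4];
    t[3] = -2.0f * d[1] - d[2] + 2.0f * d[3] + d[4];
    t[4] =  2.0f * d[1] - d[2] - 2.0f * d[3] + d[4];
    t[5] =  4.0f * d[1] - 5.0f * d[3] + d[5];

    /* Filter transform G g */
    f[0] = g[0] / 4.0f;
    f[1] = -(g[0] + g[1] + g[2]) / 6.0f;
    f[2] = (-g[0] + g[1] - g[2]) / 6.0f;
    f[3] = g[0] / 24.0f + g[1] / 12.0f + g[2] / 6.0f;
    f[4] = g[0] / 24.0f - g[1] / 12.0f + g[2] / 6.0f;
    f[5] = g[2];

    /* Element-wise product of the two transformed vectors */
    for (int i = 0; i < 6; ++i)
        m[i] = t[i] * f[i];

    /* Output transform A^T m */
    y[0] = m[0] + m[1] + m[2] + m[3] + m[4];
    y[1] = m[1] - m[2] + 2.0f * m[3] - 2.0f * m[4];
    y[2] = m[1] + m[2] + 4.0f * m[3] + 4.0f * m[4];
    y[3] = m[1] - m[2] + 8.0f * m[3] - 8.0f * m[4] + m[5];
}

/* Reference: direct sliding-window form, y[i] = sum over k of d[i+k] * g[k]. */
void direct_f43_1d(const float d[6], const float g[3], float y[4])
{
    for (int i = 0; i < 4; ++i) {
        float acc = 0.0f;
        for (int k = 0; k < 3; ++k)
            acc += d[i + k] * g[k];
        y[i] = acc;
    }
}
```

Calling both functions on the same `d` and `g` and comparing the two outputs element by element is a small-scale version of the check this application performs between host and FPGA.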

The traditional convolution algorithm is introduced next. If F(4x4, 3x3) is calculated according to the traditional convolution algorithm using the sliding window method, only four nested for loops are needed in the program; the code is simple and the probability of error is small. The traditional algorithm program is as follows:
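A minimal sketch of such a sliding-window program is shown below. It is an illustration under stated assumptions rather than the program from this application: a single channel, stride 1, no padding, row-major storage, and the hypothetical function name `conv2d_sliding_window`. For F(4x4, 3x3) the input is 6x6 and the output is 4x4.

```c
/* Direct (sliding-window) 2-D convolution in the CNN sense, i.e. without
 * flipping the filter. Assumptions: one channel, stride 1, no padding,
 * row-major arrays. `out` must hold (in_h-k_h+1)*(in_w-k_w+1) floats. */
void conv2d_sliding_window(const float *in, int in_h, int in_w,
                           const float *filt, int k_h, int k_w,
                           float *out)
{
    int out_h = in_h - k_h + 1;
    int out_w = in_w - k_w + 1;

    for (int oy = 0; oy < out_h; ++oy) {            /* output row      */
        for (int ox = 0; ox < out_w; ++ox) {        /* output column   */
            float acc = 0.0f;
            for (int ky = 0; ky < k_h; ++ky) {      /* filter row      */
                for (int kx = 0; kx < k_w; ++kx) {  /* filter column   */
                    acc += in[(oy + ky) * in_w + (ox + kx)]
                         * filt[ky * k_w + kx];
                }
            }
            out[oy * out_w + ox] = acc;
        }
    }
}
```

The whole algorithm is one multiply-accumulate statement inside four loops, which is why it serves well as a reference implementation.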

The above analysis shows that the Winograd (matrix multiplication) algorithm code implemented on the FPGA kernel side is complex and error-prone, whereas the traditional convolution algorithm is simple to write on the host side and has a very low error rate. The present method combines the advantages of the two algorithms to improve the Winograd implementation. The specific approach is as follows:

The expression of the Winograd algorithm on the FPGA kernel side remains unchanged, but the first-layer CNN convolution result that it computes is returned to the host side. At the same time, the traditional convolution calculation is implemented on the host side, where a separate thread computes the first-layer CNN convolution result. When both calculations are complete, the result of the traditional convolution algorithm on the host side is compared with the result of the Winograd algorithm returned by the kernel side. If the difference between the two results is small and within the expected permitted range, the Winograd calculation result is correct and the CNN program on the kernel side continues to run; if the difference exceeds the expected range, the Winograd algorithm is expressed incorrectly, and the program must be interrupted for checking and modification.

In order to enable those skilled in the art to better understand the solution of the present invention, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments. Obviously, the described embodiments are only a part of the embodiments of the present invention, but not all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without making creative efforts fall within the protection scope of the present invention.

Embodiment 1:

Please refer to FIG. 1. FIG. 1 is a flowchart of a program detection method according to an embodiment of the present invention. The method can be applied to a CPU. The method includes the following steps:

S101. Obtain test data when receiving the Winograd program detection instruction.

Here, the Winograd program is a fast algorithm program implementing a convolutional neural network.

After the developer has finished developing the Winograd program and written it to the FPGA, a Winograd program detection instruction can be sent to the CPU through a visual interface or the command line. When the CPU receives the instruction, it obtains data for testing the Winograd program. The test data may be image data or a matrix. The test data can be received from outside through an interface, or read directly from a storage device.

S102. Use the target algorithm program of the convolutional neural network to perform convolution calculation on the test data to obtain a convolution result.

Here, the target algorithm program is an algorithm program that implements a convolutional neural network in a sliding window manner.

In the embodiment of the present invention, the target algorithm program of the convolutional neural network may be written in advance. After the test data is obtained, the target algorithm program is used to perform convolution calculation on the test data to obtain the convolution result. The target algorithm program may alternatively be an algorithm program that implements the convolutional neural network using the Fourier transform or im2col.

The sliding window algorithm is the most intuitive and simplest method. The im2col algorithm is implemented by almost all mainstream computing frameworks, including Caffe and MXNet; it converts the entire convolution into a GEMM (general matrix multiplication) operation, and GEMM is heavily optimized in the various BLAS libraries. The FFT algorithm relies on the Fourier transform and the fast Fourier transform, which are common calculation methods in classic image processing. Since the sliding window, Fourier and im2col algorithms are all well known, their specific processing logic is not repeated here.
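To illustrate how im2col turns convolution into a matrix product, here is a minimal single-channel sketch. The function names `im2col` and `conv_as_gemm` and the memory layout are assumptions made for this example; a real framework would pass the unfolded matrix to a BLAS GEMM routine instead of the plain loop used here.

```c
/* Unfold every k_h x k_w window of a single-channel, row-major input into one
 * column of a (k_h*k_w) x (out_h*out_w) row-major matrix `col`.
 * Assumptions: stride 1, no padding. */
void im2col(const float *in, int in_h, int in_w, int k_h, int k_w, float *col)
{
    int out_h = in_h - k_h + 1;
    int out_w = in_w - k_w + 1;
    int n_cols = out_h * out_w;

    for (int ky = 0; ky < k_h; ++ky) {
        for (int kx = 0; kx < k_w; ++kx) {
            int row = ky * k_w + kx;  /* one matrix row per filter tap */
            for (int oy = 0; oy < out_h; ++oy)
                for (int ox = 0; ox < out_w; ++ox)
                    col[row * n_cols + oy * out_w + ox] =
                        in[(oy + ky) * in_w + (ox + kx)];
        }
    }
}

/* With the filter flattened to a 1 x k_size row vector, the convolution is the
 * product filt (1 x k_size) * col (k_size x n_cols), written as a plain loop. */
void conv_as_gemm(const float *filt, const float *col,
                  int k_size, int n_cols, float *out)
{
    for (int c = 0; c < n_cols; ++c) {
        float acc = 0.0f;
        for (int r = 0; r < k_size; ++r)
            acc += filt[r] * col[r * n_cols + c];
        out[c] = acc;
    }
}
```

The unfolded matrix duplicates overlapping input elements, so im2col trades extra memory for the ability to reuse an optimized matrix multiplication.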

S103. Send the test data to the FPGA so that the FPGA can use the Winograd program to perform fast convolution calculation on the test data.

After the test data is obtained, the test data needs to be sent to the FPGA. Specifically, the FPGA in the embodiment of the present invention may be a chip or device with editable logic gates. After the FPGA receives the test data, it can use the Winograd program to perform a fast convolution calculation on the test data to obtain a fast convolution result. After the FPGA calculates the fast convolution result, the fast convolution result can be returned to the CPU.

Sending the test data to the FPGA includes:

Step 1: Create the FPGA board operating environment and initialize the board parameters;

Step 2: Send the test data to the FPGA.

For ease of description, steps 1 and 2 are described together below.

The FPGA board operating environment is created first and the board parameters are then initialized, after which the test data can be sent to the FPGA. The board operating environment can be created and the board parameters initialized by calling functions packaged by Intel to operate on the board. For example, the function for creating the board operating environment shown in FIG. 2 and the function for initializing board parameters shown in FIG. 3 can be called.

After the FPGA receives the test data, it can start the kernel and use the Winograd program to perform fast convolution calculation on the test data. Here, the kernel is a real-time operating system in the FPGA that provides event scheduling and synchronization, inter-process communication (messaging), memory management, and process management. After the fast convolution result is obtained, it can be returned to the CPU.

S104. Receive the fast convolution result sent by the FPGA, and calculate the similarity between the fast convolution result and the convolution result.

After the CPU receives the fast convolution result returned by the FPGA, it can calculate the similarity between the fast convolution result and the convolution result.

Specifically, since the programs for the target algorithm of the convolutional neural network and for the Winograd algorithm perform convolution calculation on the same input data, their results should be identical or highly similar. Because the code of the target algorithm program is relatively simple and unlikely to contain mistakes, the convolution result it produces from the test data can be used as the reference value. Once the fast convolution result of the Winograd program is obtained, whether the Winograd program is correct can be determined by judging the similarity between the convolution result and the fast convolution result.

Specifically, the similarity can be calculated in ways that include, but are not limited to, the following two. In practical applications, either may be selected as the similarity calculation method:

Method 1: Calculate the ratio between the fast convolution result and the convolution result, and use the ratio to determine the similarity. How close the ratio of two values is to 1 indicates how similar the two values are: the closer the ratio is to 1, the higher the similarity. Based on this, after the fast convolution result and the convolution result are obtained, their ratio is calculated and the similarity is determined from it. Specifically, the ratio is kept within (0, 1]: of the two quotients, the fast convolution result divided by the convolution result and the convolution result divided by the fast convolution result, the one that is less than or equal to 1 is taken as the ratio. A ratio of 1 is defined as a similarity of 100%; if the ratio lies in (0, 1), the ratio expressed as a percentage is taken as the similarity.
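Method 1 can be sketched for a single pair of values as follows. The function name `ratio_similarity` is hypothetical, and the handling of zero and of values with opposite signs (both mapped to 0%) is an assumption of this sketch, since the text does not specify those cases.

```c
/* Ratio-based similarity in percent for one pair of values.
 * Assumption (not from the text): if exactly one value is zero, or the two
 * values have opposite signs, the similarity is reported as 0%. */
float ratio_similarity(float fast, float ref)
{
    float ratio;

    if (fast == ref)
        return 100.0f;              /* identical, including both zero */
    if (fast == 0.0f || ref == 0.0f)
        return 0.0f;                /* ratio undefined or zero */

    ratio = fast / ref;
    if (ratio < 0.0f)
        return 0.0f;                /* opposite signs */
    if (ratio > 1.0f)
        ratio = 1.0f / ratio;       /* keep the quotient that is <= 1 */

    return ratio * 100.0f;
}
```

For whole feature maps, one option is to apply the function element by element and take the minimum, so that a single wrong output is enough to fail the check.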

Method 2: Calculate the difference between the fast convolution result and the convolution result, and use the difference to determine the similarity. Specifically, a difference of 0 corresponds to a similarity of 100%, and different differences are mapped to different similarities. For example, a difference of 1 corresponds to a similarity of 99% and a difference of 2 to a similarity of 98%. Following a fixed proportion, the larger the difference, the lower the similarity.
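A sketch of Method 2 with a linear mapping is given below. The name `difference_similarity`, the parameter `percent_per_unit` and the clamping at 0% are assumptions of this example; with `percent_per_unit = 1.0f` it reproduces the 99% and 98% figures above.

```c
/* Difference-based similarity in percent for one pair of values.
 * Each unit of absolute difference costs `percent_per_unit` percent;
 * the result is clamped at 0 (an assumption, not stated in the text). */
float difference_similarity(float fast, float ref, float percent_per_unit)
{
    float diff = fast - ref;
    float similarity;

    if (diff < 0.0f)
        diff = -diff;               /* absolute difference */

    similarity = 100.0f - diff * percent_per_unit;
    return similarity < 0.0f ? 0.0f : similarity;
}
```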

S105. When the similarity is greater than the threshold, determine that the Winograd program is correct.

In the embodiment of the present invention, a threshold may be set and compared with the similarity to determine whether the Winograd program is correct. Specifically, when the similarity is greater than the threshold, the Winograd program is determined to be correct. When the similarity is less than or equal to the threshold, the Winograd program is determined to be wrong. The threshold may be set to, for example, 99%, 99.9%, or 99.999%.
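Steps S102 to S105 can be tied together in a small host-side routine. The sketch below is illustrative only: every name in it is hypothetical, the two programs are passed in as callbacks, and the `demo_*` functions are stand-ins (not FPGA code) that let the flow be exercised without a board. The similarity measure used by `demo_similarity`, the percentage of elements that agree within 1e-3, is likewise an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback types; none of these names come from the application. */
typedef void  (*conv_fn)(const float *test_data, float *result, size_t n);
typedef float (*similarity_fn)(const float *fast, const float *ref, size_t n);

/* Run both programs on the same test data and compare the results. */
bool winograd_program_is_correct(const float *test_data, size_t n,
                                 float *conv_result, float *fast_result,
                                 conv_fn target_program,
                                 conv_fn fpga_winograd_program,
                                 similarity_fn similarity,
                                 float threshold_percent)
{
    target_program(test_data, conv_result, n);                 /* S102 */
    fpga_winograd_program(test_data, fast_result, n);          /* S103 */
    float s = similarity(fast_result, conv_result, n);         /* S104 */
    return s > threshold_percent;                              /* S105 */
}

/* Stand-in for the target algorithm program: doubles each input. */
void demo_target(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = 2.0f * in[i];
}

/* Stand-in for a correct Winograd program: same results as the target. */
void demo_fpga_good(const float *in, float *out, size_t n)
{
    demo_target(in, out, n);
}

/* Stand-in for a faulty Winograd program: first output is off by 1. */
void demo_fpga_buggy(const float *in, float *out, size_t n)
{
    demo_target(in, out, n);
    if (n > 0)
        out[0] += 1.0f;
}

/* Assumed measure: percentage of elements that agree within 1e-3. */
float demo_similarity(const float *fast, const float *ref, size_t n)
{
    size_t ok = 0;
    for (size_t i = 0; i < n; ++i) {
        float d = fast[i] - ref[i];
        if (d < 0.0f)
            d = -d;
        if (d < 1e-3f)
            ++ok;
    }
    return n ? 100.0f * (float)ok / (float)n : 0.0f;
}
```

Passing the programs as callbacks keeps the comparison logic independent of how the FPGA is actually driven, which in a real system would go through the board vendor's host API.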

The target algorithm program of the convolutional neural network, that is, the sliding window implementation, can express the convolution algorithm accurately with nothing more than nested loops, so its code is simple and its probability of error is low. The Winograd program is a fast algorithm program implementing the same convolutional neural network. Therefore, when a correctly written Winograd program and the target algorithm program compute the convolution of the same input data, the two results should be identical or differ only within a certain range, that is, they should be similar. Based on this, after the Winograd program has been written to the FPGA and a Winograd program detection instruction is received, test data for verification is first obtained. The CPU then performs convolution calculation on the test data using the target algorithm program of the convolutional neural network to obtain the convolution result. At the same time, the test data can be sent to the FPGA. After receiving the test data, the FPGA performs fast convolution calculation on it using the Winograd program and sends the fast convolution result to the CPU. After receiving the fast convolution result, the CPU calculates the similarity between the fast convolution result and the convolution result; when the similarity is greater than the threshold, the Winograd program is determined to be correct. In this way, the target algorithm program running on the CPU can be used to check the Winograd program on the FPGA, which ensures the correctness of the Winograd algorithm part, improves the accuracy of the CNN algorithm on the FPGA, and in turn improves the accuracy of computer vision tasks implemented on the FPGA.

It should be noted that, based on the foregoing embodiments, the embodiments of the present invention also provide corresponding improvements. In the preferred/improved embodiments, the same steps as in the above-mentioned embodiments or the corresponding steps can be referred to each other, and the corresponding beneficial effects can also be cross-referenced, which will not be repeated in the preferred/improved embodiments herein.

Preferably, when determining whether the Winograd program is correct, the difference or the ratio can, following the similarity calculation principle, be compared directly with a preset judgment threshold. Specifically, after the difference between the convolution result and the fast convolution result is calculated, the Winograd program is determined to be correct if the difference is less than 10^-3; alternatively, the Winograd program is determined to be correct if the ratio between the convolution result and the fast convolution result is greater than 0.999. Of course, the judgment thresholds 10^-3 and 0.999 can be adjusted according to the actual accuracy requirements.
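The direct difference check can be sketched as below. The name `results_match` is hypothetical, and applying the tolerance to every element of the result (rather than to a single aggregate value) is an assumption of this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when every element of `fast` is within `tol` of `ref`.
 * Written as !(diff < tol) so that a NaN in either result also fails. */
bool results_match(const float *fast, const float *ref, size_t n, float tol)
{
    for (size_t i = 0; i < n; ++i) {
        float diff = fast[i] - ref[i];
        if (diff < 0.0f)
            diff = -diff;
        if (!(diff < tol))
            return false;
    }
    return true;
}
```

A call such as `results_match(fast, ref, n, 1e-3f)` corresponds to the 10^-3 threshold mentioned above; the tolerance can be loosened or tightened to match the numeric precision used on the FPGA.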

Preferably, because the convolutional neural network includes several convolutional layers and the convolution algorithm can reduce the amount of code by calling the same routine in a loop, it is sufficient to compare only the first-layer convolution results when testing the Winograd program. Specifically, in step S102 the target algorithm program performs convolution calculation on the test data and the first-layer result of the convolutional neural network is taken as the convolution result; correspondingly, in step S103 the FPGA uses the Winograd program to perform fast convolution calculation on the test data and the first-layer result of the convolutional neural network obtained by the fast calculation is taken as the fast convolution result. In this way, the verification time of the Winograd program can be shortened.

Preferably, before performing the Winograd program test, the filter parameters in the convolutional neural network can also be set. Specifically, the filter parameters of the convolutional neural network are obtained, and the filter parameters are respectively set in the target convolution algorithm program and the Winograd program. In this way, the filter parameters in the target convolution algorithm program and the Winograd program can be guaranteed to be consistent, and the accuracy of the Winograd program under different filter parameters can also be tested separately.

Embodiment 2:

In order to facilitate those skilled in the art to better understand the program detection method provided by the embodiment of the present invention, the following uses a specific application scenario as an example to describe in detail the program detection method provided by the embodiment of the present invention.

Please refer to FIG. 4, which is a specific flowchart of a program detection method according to an embodiment of the present invention.

The expression of the Winograd algorithm expressed by the Winograd program on the FPGA-kernel side remains unchanged, but the first layer of CNN convolution results of its calculation are returned to the host (host side, same as the CPU or processor above), while achieving the target on the host side Convolution calculation, starting a new thread to calculate the convolution result of the first layer CNN. After the calculation is completed, compare the calculation result of the target convolution algorithm on the host side with the calculation result of the Winograd algorithm returned by the kernel side. If the calculation result is very small, it is within the expected permission range, indicating that the Winograd calculation result is correct, and the kernel side cnn program continues Run; if the difference in the calculation results exceeds the expected result, it means that the Winograd algorithm expression is wrong, and the program needs to be interrupted to check and modify.

Specifically, first input the test data and filter data (same as the filter parameters above) into the CPU cache. Then, two threads are started on the host side, where thread 1 is used to calculate the convolution according to the target convolution algorithm; thread 2 is used to start the kernel to use the FPGA board to accelerate the calculation of CNN.

After thread 2 starts, it first creates the FPGA board operating environment and initializes the board parameters, then writes the test data and filter data into the FPGA board cache, and then starts the FPGA kernel program to perform the calculation. That is, the kernel program calculates the convolution according to the Winograd algorithm, obtains the first-layer CNN convolution result, and returns the convolution result to the host side.

After thread 1 starts, it first obtains the input data and filter data, and then calculates the first-layer CNN convolution result according to the target convolution algorithm. It then receives the Winograd convolution result data returned from the kernel and compares the convolution results obtained by the two methods. If the difference is less than 10^-3, the Winograd algorithm program is correct. If the difference is beyond the expected range, there is a problem with the expression of the kernel Winograd algorithm program, and the program needs to be interrupted for checking and modification. Adding the host-side verification program makes it possible to determine whether the Winograd algorithm program running on the FPGA is correct and to fix it if there is an error, which in turn helps ensure that the calculation results of the entire CNN are correct.
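The two-thread check described in this example can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: the FPGA kernel is replaced by a plain Python function (`winograd_f23`, a one-dimensional Winograd F(2,3) with a 3-tap filter), the reference is a one-dimensional sliding-window convolution, the comparison is done after both threads are joined rather than inside thread 1, and all function names are hypothetical. The input length is assumed to be even and at least 4.

```python
import threading

TOLERANCE = 1e-3  # the 10^-3 bound mentioned in the text


def direct_conv1d(data, g):
    """Sliding-window reference: y[i] = sum over k of data[i + k] * g[k]."""
    return [sum(data[i + k] * g[k] for k in range(3))
            for i in range(len(data) - 2)]


def winograd_f23(data, g):
    """Winograd F(2,3): two outputs per 4-sample tile using 4 multiplications.

    Stand-in for the FPGA kernel; assumes len(data) is even and >= 4.
    """
    # Filter transform, computed once per filter.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    out = []
    for i in range(0, len(data) - 2, 2):
        d0, d1, d2, d3 = data[i:i + 4]
        m1 = (d0 - d2) * u0
        m2 = (d1 + d2) * u1
        m3 = (d2 - d1) * u2
        m4 = (d1 - d3) * u3
        out += [m1 + m2 + m3, m2 - m3 - m4]
    return out


def verify(data, g, fast_impl=winograd_f23):
    """Run both convolutions in separate threads and compare the results."""
    results = {}

    def run(name, fn):
        results[name] = fn(data, g)

    t1 = threading.Thread(target=run, args=("reference", direct_conv1d))
    t2 = threading.Thread(target=run, args=("fast", fast_impl))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    diff = max(abs(a - b)
               for a, b in zip(results["reference"], results["fast"]))
    return diff < TOLERANCE, diff
```

A call such as `verify([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [0.5, -1.0, 0.25])` returns a pass/fail flag together with the largest absolute difference, which can be logged before deciding whether to interrupt the kernel-side program.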

Example 3:

Corresponding to the above method embodiments, an embodiment of the present invention also provides a program detection device. The program detection device described below and the program detection method described above can be referred to each other.

Referring to FIG. 5, the device includes the following modules:

The test data obtaining module 101 is used to obtain test data when receiving the Winograd program detection instruction; wherein, the Winograd program is a fast algorithm program for realizing a convolutional neural network;

The convolution calculation module 102 is used to use the target algorithm program of the convolutional neural network to perform convolution calculation on the test data to obtain the convolution result; the target algorithm program is an algorithm program that implements the convolutional neural network in a sliding window manner;

The test data sending module 103 is used to send the test data to the FPGA so that the FPGA can use the Winograd program to perform fast convolution calculation on the test data;

The similarity calculation module 104 is used to receive the fast convolution result sent by the FPGA and calculate the similarity between the fast convolution result and the convolution result;

The detection result determination module 105 is used to determine that the Winograd program is correct when the similarity is greater than the threshold.

With the device provided by the embodiment of the present invention, test data is obtained when the Winograd program detection instruction is received; the target algorithm program of the convolutional neural network is used to perform convolution calculation on the test data to obtain the convolution result, the target algorithm program being an algorithm program that implements the convolutional neural network in a sliding window manner; the test data is sent to the FPGA so that the FPGA uses the Winograd program to perform fast convolution calculation on the test data; the fast convolution result sent by the FPGA is received, and the similarity between the fast convolution result and the convolution result is calculated; when the similarity is greater than the threshold, the Winograd program is determined to be correct.

In a specific embodiment of the present invention, the convolution calculation module 102 is specifically used to, when the FPGA uses the Winograd program to perform fast convolution calculation on the test data and takes the first-layer result of the convolutional neural network obtained by the fast calculation as the fast convolution result, use the target algorithm program to perform convolution calculation on the test data and take the first-layer result of the convolutional neural network as the convolution result.

In a specific embodiment of the present invention, the device further includes:

The filter setting module is used to obtain the filter parameters of the convolutional neural network; the filter parameters are set in the target convolution algorithm program and the Winograd program, respectively.

In a specific embodiment of the present invention, the test data sending module 103 is specifically used to create an FPGA board operating environment and initialize board parameters, and to send the test data to the FPGA so that the FPGA starts the kernel and uses the Winograd program to perform fast convolution calculation on the test data.

In a specific embodiment of the present invention, the similarity calculation module 104 is specifically used to calculate the ratio between the fast convolution result and the convolution result, and use the ratio to determine the similarity; or to calculate the difference between the fast convolution result and the convolution result, and use the difference to determine the similarity.
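The text does not specify how a ratio or a difference is mapped to a similarity value, so the following is only one possible sketch. It assumes both results are flat lists of floats of equal length, takes the worst element-wise ratio as the ratio-based similarity, and uses 1 / (1 + maximum absolute difference) as the difference-based similarity, so that in both cases identical results give a similarity of 1.0. All function names are hypothetical.

```python
def similarity_by_ratio(fast, ref):
    """Worst element-wise ratio in [0, 1]; 1.0 means the results are identical."""
    worst = 1.0
    for f, r in zip(fast, ref):
        if f == r:
            continue  # also covers the case where both values are zero
        if f == 0 or r == 0 or (f > 0) != (r > 0):
            return 0.0  # exactly one zero, or opposite signs
        ratio = min(abs(f), abs(r)) / max(abs(f), abs(r))
        worst = min(worst, ratio)
    return worst


def similarity_by_difference(fast, ref):
    """Map the largest absolute difference to (0, 1]; 1.0 means identical."""
    max_diff = max((abs(f - r) for f, r in zip(fast, ref)), default=0.0)
    return 1.0 / (1.0 + max_diff)


def winograd_program_is_correct(similarity, threshold):
    """The program is judged correct only when the similarity exceeds the threshold."""
    return similarity > threshold
```

Under these assumptions, a difference bound of 10^-3 corresponds to a difference-based threshold of 1 / (1 + 10^-3); other mappings are equally compatible with the description.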

In a specific embodiment of the present invention, the detection result determination module 105 is specifically configured to determine that the Winograd program is wrong when the similarity is less than or equal to the threshold.

Example 4:

Corresponding to the above method embodiment, an embodiment of the present invention further provides program detection equipment. The program detection equipment described below and the program detection method described above can be referred to each other.

As shown in Figure 6, the program detection equipment includes:

Memory D1, used to store computer programs;

The processor D2 is configured to implement the steps of the program detection method in the foregoing method embodiments when the computer program is executed.

Specifically, please refer to FIG. 7, which is a schematic diagram of a specific structure of the program detection equipment provided in this embodiment. The program detection equipment may vary considerably depending on configuration or performance, and may include one or more central processing units (CPU) 322 (for example, one or more processors), a memory 332, and one or more storage media 330 (for example, one or more mass storage devices) that store application programs 342 or data 344. The memory 332 and the storage medium 330 may be short-term storage or persistent storage. The program stored in the storage medium 330 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the data processing device. Furthermore, the central processor 322 may be configured to communicate with the storage medium 330 and to execute, on the program detection equipment 301, the series of instruction operations in the storage medium 330.

The program detection equipment 301 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input/output interfaces 358, and/or one or more operating systems 341, for example Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.

The steps in the program detection method described above can be implemented by the structure of the program detection equipment.

Example 5:

Corresponding to the above method embodiment, an embodiment of the present invention further provides a readable storage medium. The readable storage medium described below and the program detection method described above can be referred to each other.

A computer program is stored on the readable storage medium, and when the computer program is executed by a processor, the steps of the program detection method of the foregoing method embodiments are implemented.

The readable storage medium may specifically be a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or any other readable storage medium that can store program code.

Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are executed in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the present invention.

Claims

1. A program detection method, comprising:

Obtaining test data when a Winograd program detection instruction is received;

Using the target algorithm program of the convolutional neural network to perform convolution calculation on the test data to obtain a convolution result; the target algorithm program is an algorithm program that implements the convolutional neural network in a sliding window manner;

Sending the test data to the FPGA, so that the FPGA uses the Winograd program to perform fast convolution calculation on the test data;

Receiving the fast convolution result sent by the FPGA, and calculating the similarity between the fast convolution result and the convolution result;

When the similarity is greater than the threshold, it is determined that the Winograd program is correct.

2. The program detection method according to claim 1, wherein using the target algorithm program of the convolutional neural network to perform convolution calculation on the test data to obtain a convolution result includes:

Use the target algorithm program to perform convolution calculation on the test data, and use the first layer result of the convolutional neural network as the convolution result;

Correspondingly, the FPGA uses the Winograd program to perform fast convolution calculation on the test data, including:

The FPGA uses the Winograd program to perform fast convolution calculation on the test data, and uses the fast calculation to obtain the first layer result of the convolutional neural network as the fast convolution result.

3. The program detection method according to claim 1, further comprising:

Obtaining filter parameters of the convolutional neural network;

The filter parameters are set in the target convolution algorithm program and the Winograd program, respectively.

4. The program detection method according to claim 1, wherein sending the test data to the FPGA includes:

Creating the FPGA board operating environment and initializing board parameters;

Sending the test data to the FPGA.

5. The program detection method according to claim 1, wherein the FPGA uses the Winograd program to perform fast convolution calculation on the test data, including:

The FPGA starts the kernel and uses the Winograd program to perform fast convolution calculation on the test data.

6. The program detection method according to any one of claims 1 to 5, wherein calculating the similarity between the fast convolution result and the convolution result includes:

Calculating the ratio of the fast convolution result to the convolution result, and using the ratio to determine the similarity;

Or, calculating the difference between the fast convolution result and the convolution result, and using the difference to determine the similarity.

7. The program detection method according to claim 6, wherein when the similarity is less than or equal to the threshold, the method further comprises:

It is determined that the Winograd program is wrong.

8. A program detection device, including:

A test data acquisition module, configured to acquire test data when receiving the Winograd program detection instruction;

A convolution calculation module, configured to use the target algorithm program of the convolutional neural network to perform convolution calculation on the test data to obtain a convolution result, the target algorithm program being an algorithm program that implements the convolutional neural network in a sliding window manner;

A test data sending module, configured to send the test data to the FPGA, so that the FPGA uses the Winograd program to perform fast convolution calculation on the test data;

A similarity calculation module, configured to receive the fast convolution result sent by the FPGA, and calculate the similarity between the fast convolution result and the convolution result;

A detection result determination module, configured to determine that the Winograd program is correct when the similarity is greater than a threshold.

9. A program detection device, characterized in that it includes:

Memory, used to store computer programs;

The processor is configured to implement the steps of the program detection method according to any one of claims 1 to 7 when executing the computer program.

10. A readable storage medium, characterized in that a computer program is stored on the readable storage medium, and when the computer program is executed by a processor, the steps of the program detection method according to any one of claims 1 to 7 are implemented.