CN111881028A - Neural network automatic generation method based on model code syntactic analysis - Google Patents

Neural network automatic generation method based on model code syntactic analysis Download PDF

Info

Publication number
CN111881028A
CN111881028A
Authority
CN
China
Prior art keywords: neural network, code, program, automatic generation, current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010712775.7A
Other languages
Chinese (zh)
Inventor
陈振宇
刘佳玮
曹可凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Muzhi Technology Co ltd
Original Assignee
Shenzhen Muzhi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Muzhi Technology Co ltd filed Critical Shenzhen Muzhi Technology Co ltd
Priority to CN202010712775.7A priority Critical patent/CN111881028A/en
Publication of CN111881028A publication Critical patent/CN111881028A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3684Test management for test design, e.g. generating new test cases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation
    • G06F8/42Syntactic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology

Abstract

A neural network automatic generation method based on model code syntax analysis abstracts a syntax library from the syntactic structures used to build neural networks under different frameworks, and establishes a safety check mechanism to guarantee the correctness of the code generation program. On this basis, the corresponding code characteristics are fused in and the typical usage scenarios of neural networks are reproduced, improving the usability of the generated programs. To realize automatic generation across different technical frameworks and network structures, the automatic generation tool is iterated continuously, evolving from coarse granularity, single structures, and individual frameworks toward finer granularity, support for a variety of models, and independence from any specific technical framework.

Description

Neural network automatic generation method based on model code syntactic analysis
Technical Field
The invention belongs to the field of software testing, and particularly relates to deep learning compiler testing. According to the construction rule of the neural network model code, the automatic generation of the deep neural network is realized by analyzing the rule of the existing network code in the open source community.
Background
The growth of hardware computing power and industry adoption have pushed artificial intelligence and deep learning into ever more fields; they have achieved notable technical results in medical diagnosis, autonomous driving, face recognition, and similar areas, where products are beginning to reach the market. By contrast, the underlying frameworks that support deep learning have become a bottleneck limiting its further development. On the one hand, the reliability of these underlying frameworks remains in question, and testing efforts for them are still actively underway; on the other hand, because the frameworks provide rich and easy-to-use APIs, developers no longer need to reinvent the wheel, and the barrier to entry has been lowered. At the same time, the finer the division of deep learning applications, the less code can be shared across scenarios, and the repeated construction of model code has itself become a constraint on the development of deep learning applications.
From the viewpoint of program development, the importance of compilers, among the most commonly used pieces of system software, is self-evident, and compiler testing has long been an important branch of software testing research. Current compiler testing has covered several mainstream programming languages. Existing compiler test and validation tools (such as SuperTest, CVSAC, etc.) mainly focus on a compiler's conformance to the language standard. Csmith is a randomized differential testing tool for C compilers that focuses on finding errors in the intermediate stages of compilation. Csmith randomly generates test cases from a custom grammar subset: it considers a basic subset of the syntax and implements it in a hard-coded manner. To date it has found hundreds of bugs in GCC and LLVM and has helped improve the quality of the most widely used C compilers.
The traditional approach to designing a neural network architecture is to iterate over network architectures and the corresponding training parameters until task performance reaches a point of diminishing returns. This approach faces several problems. First, the architecture is fixed: most methods use back-propagation to train the network weights, not the architecture. They exploit gradient information only in the weight space of the neural network, and the architecture stays fixed throughout training, so such methods cannot produce a better network architecture. Second, these methods all require a lengthy search process, and finding a suitable neural network architecture by trial and error is highly inefficient. The problem worsens as networks deepen and come to contain millions of parameters; each trial of a deep neural network can take tens of hours even on the fastest GPUs, which are currently the mainstay of neural network training. Even with ample computing power and researchers, finding an excellent architecture for a given application can take years, as in the image domain's progression from AlexNet to VGG, GoogLeNet, and ResNet. Third, there is considerable redundancy: most of the parameters of a neural network are excessive, and even the best-known networks for image classification tasks (e.g., LeNet, AlexNet, VGG) carry a large amount of storage and computational redundancy.
Learning and using language or grammar is a typical human mental activity, and various neural network models of these processes have been proposed.
Currently, the academic community has carried out a series of works on testing deep learning frameworks. Csmith automatically generates programs from a grammar library based on the C language syntax; it guarantees program usability by constructing grammar subsets together with safety checks, and currently targets C compiler testing. CRADLE aims to capture bugs in deep learning frameworks through differential testing and to locate the layers where bugs occur via change rates; however, the models used in that approach are limited. AUDEE is a newer work that performs random testing by summarizing the parameter value ranges and metamorphic relations of common APIs, but its automatically generated models only randomize at the parameter level. Building on these existing results, the present invention realizes automatic generation of neural networks by analyzing neural networks and constructing a code grammar, and the result can further be used to test deep learning compilers.
Disclosure of Invention
The invention aims to solve the following problem: constructing C programs from the grammatical structure of the programming language has achieved great success in testing C compilers, but no comparable progress has been made in the deep learning field. The invention realizes random generation of model code at the code level, can be used for random testing of deep learning frameworks and underlying compilers, and addresses the lack of model data for random testing in the deep learning field.
The technical scheme of the invention is as follows: a neural network automatic generation method based on model code syntactic analysis, characterized in that a runnable network model can be generated according to the grammatical rules of neural network code. The generation method comprises the following 6 steps:
1. The present invention randomly selects a production allowed at the current program point from its grammar. For the selection it consults a probability table and a filter function specific to the current point: there is one table/filter pair for statements, another for expressions, and so on. The table assigns a probability to each alternative, with the probabilities summing to 1. After a code block is selected from the table and generated, the invention runs the filter, which determines whether the selection is acceptable in the current situation.
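The table-plus-filter selection in step 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation; the table entries, weights, and the `statement_filter` rule are hypothetical:

```python
import random

# Hypothetical probability table: each production gets a weight; weights sum to 1.
PROB_TABLE = {"assignment": 0.4, "api_call": 0.3, "loop": 0.2, "branch": 0.1}

def statement_filter(choice, context):
    """Reject selections that are invalid at the current program point."""
    # e.g. do not open another loop beyond the maximum nesting depth
    if choice == "loop" and context.get("depth", 0) >= context.get("max_depth", 3):
        return False
    return True

def select_production(table, filt, context, rng=random):
    """Draw from the probability table until the filter accepts the choice."""
    names, weights = zip(*table.items())
    while True:
        choice = rng.choices(names, weights=weights, k=1)[0]
        if filt(choice, context):
            return choice
```

Because rejected draws simply loop back to the table, the filter never has to repair a choice, only veto it.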
2. If the selected code block requires an object (e.g., a variable or function), the generator randomly selects a suitable defined object or defines a new one. Essentially, the present invention dynamically builds a probability table over the potential targets that also includes a "create" option. Function and variable definitions are thus created on demand, at the moment the invention decides to refer to them.
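The pick-or-create decision of step 2 might look like the following sketch; the `create_weight` parameter and naming scheme are illustrative assumptions, not values from the patent:

```python
import random

def pick_or_create(defined_objects, create_weight=0.3, rng=random):
    """Choose an existing object, or decide to define a new one on demand.

    Returns ("use", name) or ("create", fresh_name). The create option always
    keeps some probability mass, so new definitions appear exactly when the
    generator first decides to refer to them.
    """
    if not defined_objects or rng.random() < create_weight:
        fresh = f"var_{len(defined_objects)}"
        defined_objects.append(fresh)
        return ("create", fresh)
    return ("use", rng.choice(defined_objects))
```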
3. If the selected code block allows the generator to choose a type, the invention selects one at random. Depending on the current context, the selection may be constrained (e.g., when generating operands for an integer-typed expression) or unconstrained (e.g., when generating random variables or the parameter types of a new function). Random selection is guided by the grammar, the probability tables, and the filters described above.
4. If the selected code block is a non-terminal, the generator recurses: it calls a function that generates the program fragment corresponding to that non-terminal. More generally, the present invention recursively expands each non-terminal element in the current code block (e.g., each subcomponent, or each argument in a function call).
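The recursive expansion of step 4 can be sketched with a toy grammar. The grammar, the depth cutoff, and the terminal strings are all hypothetical stand-ins, not the patent's syntax library:

```python
import random

# Hypothetical toy grammar: non-terminals expand to lists of symbols;
# anything not in the grammar is a terminal and is emitted verbatim.
GRAMMAR = {
    "block":   [["stmt"], ["stmt", "stmt"]],
    "stmt":    [["assign"], ["if_stmt"]],
    "if_stmt": [["if cond:", "block"]],
    "assign":  [["x = 1"]],
}

def generate(symbol, depth=0, max_depth=4, rng=random):
    """Recursively expand non-terminals; force a terminal past the depth bound."""
    if symbol not in GRAMMAR:      # terminal: emit as-is
        return [symbol]
    if depth >= max_depth:         # guarantee termination of the recursion
        return ["x = 1"]
    production = rng.choice(GRAMMAR[symbol])
    lines = []
    for part in production:
        lines.extend(generate(part, depth + 1, max_depth, rng))
    return lines
```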
5. The present invention executes a set of dataflow transfer functions. It passes the points-to facts from the local environment to the transfer functions, producing a new set of points-to facts, and updates the local environment with these facts.
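A single transfer function from step 5 might be sketched as below, for a pointer assignment; the representation of points-to facts as name-to-set mappings is an assumption for illustration:

```python
def transfer_assign(points_to, lhs, rhs):
    """Transfer function for a pointer assignment `lhs = rhs`.

    points_to maps each pointer name to the set of objects it may reference;
    the assignment replaces the facts for lhs with those of rhs. A new
    mapping is returned so the caller can decide whether to commit it.
    """
    new_facts = {p: set(targets) for p, targets in points_to.items()}
    new_facts[lhs] = set(points_to.get(rhs, set()))
    return new_facts
```

Returning a fresh mapping instead of mutating in place matches the commit-or-rollback discipline of step 6.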
6. The present invention executes a set of safety checks. If the checks succeed, the new code fragment is committed to the generated program; otherwise, the fragment is discarded and any changes to the local environment are rolled back.
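The check-then-commit-or-rollback step can be sketched like this; the check signature and environment shape are illustrative assumptions:

```python
import copy

def try_commit(program, local_env, fragment, env_updates, checks):
    """Run all safety checks; commit the fragment or roll everything back.

    Returns True if the fragment was accepted. The real environment is only
    mutated on success, so rejection leaves it exactly as it was.
    """
    candidate = copy.deepcopy(local_env)
    candidate.update(env_updates)
    if all(check(fragment, candidate) for check in checks):
        program.append(fragment)
        local_env.update(env_updates)
        return True
    return False
```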
When the present invention creates a call to a new function (whose body does not yet exist), generation of the current function is suspended until the new function is completed. The invention therefore finishes when the top-level function has been fully generated. At that point it prints all randomly generated definitions in the proper order: types, global variables, prototypes, and functions. Finally, the invention outputs a main function, which calls the top-level randomly generated function, computes a checksum over the non-pointer global variables, prints the checksum, and exits.
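The final emission step, printing definitions in order and appending a checksumming main, might be sketched as follows; the section names and the modeling of non-pointer globals as integers are assumptions for illustration:

```python
def emit_program(types, globals_, prototypes, functions, global_values):
    """Assemble the generated program in dependency order and append a
    main-like footer that checksums the non-pointer globals (modeled as ints)."""
    lines = []
    for section in (types, globals_, prototypes, functions):
        lines.extend(section)
    checksum = sum(v for v in global_values.values() if isinstance(v, int))
    lines.append(f"print({checksum})  # checksum of non-pointer globals")
    return "\n".join(lines)
```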
The invention is characterized in that:
1. A method of generating model code by an automatic program is proposed for the first time in the field of deep learning testing.
2. Syntactic analysis techniques are applied to the testing of deep learning frameworks for the first time.
3. Testing of deep learning frameworks is extended, for the first time, to cover all API interfaces.
Drawings
Fig. 1 is a general flow chart of the implementation of the present invention.
Fig. 2 is a flowchart of API call statement generation.
Fig. 3 is a flow chart of the flow control statement generation.
Fig. 4 is a security check flow chart.
Detailed Description
The embodiments of the present invention are described below with reference to specific examples, and other advantages and effects of the present invention will be readily apparent to those skilled in the art from the disclosure of the present specification.
The method implements automatic generation of neural network model code through a grammar, mainly using grammar-rule checking, and involves key technologies such as deep neural networks (DNN), deep convolutional neural networks (CNN), recurrent neural networks (RNN), and abstract syntax tree techniques.
1. Generating environmental controls
The invention maintains a global environment holding the top-level definitions, i.e., types, global variables, and functions. The global environment is extended whenever a new entity type is defined during program generation. To record information about the current program generation point, the invention also maintains a local environment built from three main pieces of information. First, the local environment describes the current call chain, supporting context-sensitive pointer analysis. Second, it contains effect information for objects that may have been read or written since (1) the start of the current function, (2) the start of the current statement, and (3) the previous sequence point. In addition, the local environment contains the points-to facts for all in-scope pointers.
2. Parameter generation mechanism
The invention first randomly creates a set of structure type declarations. For each structure, it randomly decides the number of members and the type of each member. A member's type may be a (possibly qualified) integer type, a bit field, or a previously generated structure type.
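The random structure generation described above can be sketched as below; the type names, member-count bound, and bit-field notation are illustrative assumptions, not the patent's choices:

```python
import random

INT_TYPES = ["int8", "int32", "uint64"]  # illustrative integer types

def random_struct(existing_structs, rng=random):
    """Generate one structure type: a random member count, each member being
    an integer type, a bit field, or a previously generated structure type."""
    members = []
    for i in range(rng.randint(1, 5)):
        kinds = ["int", "bitfield"] + (["struct"] if existing_structs else [])
        kind = rng.choice(kinds)
        if kind == "int":
            members.append((f"m{i}", rng.choice(INT_TYPES)))
        elif kind == "bitfield":
            members.append((f"m{i}", f"int32:{rng.randint(1, 31)}"))
        else:                      # reuse a previously generated struct type
            members.append((f"m{i}", rng.choice(existing_structs)))
    return members
```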
After this preliminary step of generating the type definitions, the invention begins generating python program code, starting from the program entry point and proceeding top-down.
3. Filter mechanism
The filter enforces basic semantic restrictions (e.g., continue may only appear inside a loop), user-controllable restrictions (e.g., maximum statement depth and number of functions), and other user-controllable options. If the filter rejects the selected code block, the invention simply loops back to select from the table again until the filter accepts a choice.
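The filter mechanism might be sketched as a closure over the user-controllable limits; the limit values and choice names here are illustrative assumptions:

```python
def make_filter(max_stmt_depth=5, max_functions=10):
    """Build a filter that closes over user-controllable limits."""
    def accept(choice, ctx):
        if choice == "continue" and not ctx.get("in_loop", False):
            return False          # semantic: continue requires an enclosing loop
        if ctx.get("stmt_depth", 0) > max_stmt_depth:
            return False          # user limit: maximum statement nesting depth
        if choice == "new_function" and ctx.get("n_functions", 0) >= max_functions:
            return False          # user limit: maximum number of functions
        return True
    return accept
```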
4. Program syntax specification
The programs generated by the invention conform to the syntax of the python3 language. A program is a collection of type, variable, and function definitions. A method body is modeled as a block; a block contains a declaration list and a statement list; a statement is an expression, a control-flow construct, an assignment, or a block. The internal implementation of the different blocks introduces code properties that ensure the generator respects the characteristics of the programming language and the programming framework. The invention also uses the grammar to generate other programs that exhibit particular programming characteristics. The grammar is implemented by a set of hand-coded python classes.
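Hand-coded grammar classes in the style the patent describes might look like this sketch, where each node renders itself to python source; the class and method names are hypothetical:

```python
class Stmt:
    """Base class for all statement nodes in the hand-coded grammar."""
    def render(self, indent=0):
        raise NotImplementedError

class Assign(Stmt):
    def __init__(self, name, value):
        self.name, self.value = name, value
    def render(self, indent=0):
        return " " * indent + f"{self.name} = {self.value}"

class Block(Stmt):
    """A block holds a list of statements, mirroring a method body."""
    def __init__(self, statements):
        self.statements = statements
    def render(self, indent=0):
        return "\n".join(s.render(indent) for s in self.statements)

class FunctionDef:
    def __init__(self, name, body):
        self.name, self.body = name, body
    def render(self):
        return f"def {self.name}():\n" + self.body.render(indent=4)
```

Because every node knows how to render itself, the generator can assemble a tree of these objects and print the whole program in one pass.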
5. Program characteristic control
The constructed code should exercise as many language characteristics as possible in order to maximize its quality. Any "language" is a combination of various "language characteristics". Beyond the generic characteristics of the programming language itself, the "special" characteristics that accompany a particular framework or model structure must also be considered. Rich language characteristics improve the expressive power of the generated programs and further enhance the practical value of the constructed code. By analyzing how neural networks are built, this research abstracts both the code characteristics of the programming language and the "special" characteristics of the accompanying framework or model structure, realizing the usability of the automatically generated programs.
In this embodiment, syntactic structure analysis is performed on the neural network construction code, the implementations under different frameworks are surveyed, and automatic selection and safety checking of network construction code segments are realized. Throughout this process the language characteristics are preserved as far as possible, so that the generated models stay close to existing programming habits, and the automatic construction of the neural network is finally completed.

Claims (5)

1. A neural network automatic generation method based on model code grammar analysis, characterized in that: a global environment holding the top-level definitions is maintained; a probability table and a filter function specific to the current point are consulted; rich language characteristics are used to improve the expressive power of the program and further enhance the practical value of the constructed code; and the grammar library required by the automatic generation process is implemented by a set of hand-coded python classes.
2. Maintaining a global environment holding the top-level definitions as described in claim 1, wherein: the global environment is extended whenever a new entity type is defined during program generation. To record information about the current program generation point, a local environment is also maintained from three main pieces of information. First, the local environment describes the current call chain, supporting context-sensitive pointer analysis. Second, it contains effect information for objects that may have been read or written since (1) the start of the current function, (2) the start of the current statement, and (3) the previous sequence point. In addition, the local environment contains the points-to facts for all in-scope pointers.
3. Consulting a probability table and a filter function specific to the current point as described in claim 1, wherein: the filter enforces basic semantic restrictions (e.g., continue may only appear inside a loop), user-controllable restrictions (e.g., maximum statement depth and number of functions), and other user-controllable options. If the filter rejects the selected code block, the method simply loops back to select from the table again until the filter accepts a choice.
4. The use of rich language characteristics to improve the expressive power of a program as described in claim 1, wherein: by analyzing how neural networks are built, both the code characteristics of the programming language and the "special" characteristics accompanying a framework or model structure are abstracted, realizing the usability of the automatically generated programs.
5. The grammar library required by the automatic generation process, implemented by a set of hand-coded python classes, as described in claim 1, wherein: a method body is modeled as a block; a block contains a declaration list and a statement list; a statement is an expression, a control-flow construct, an assignment, or a block. The internal implementation of the different blocks introduces code properties that ensure the generator respects the characteristics of the programming language and the programming framework. The grammar is also used to generate other programs that exhibit particular programming characteristics.
CN202010712775.7A 2020-07-23 2020-07-23 Neural network automatic generation method based on model code syntactic analysis Pending CN111881028A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010712775.7A CN111881028A (en) 2020-07-23 2020-07-23 Neural network automatic generation method based on model code syntactic analysis


Publications (1)

Publication Number Publication Date
CN111881028A true CN111881028A (en) 2020-11-03

Family

ID=73155810

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010712775.7A Pending CN111881028A (en) 2020-07-23 2020-07-23 Neural network automatic generation method based on model code syntactic analysis

Country Status (1)

Country Link
CN (1) CN111881028A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107704382A (en) * 2017-09-07 2018-02-16 北京信息科技大学 Towards Python function call path generating method and system
CN108345457A (en) * 2018-01-24 2018-07-31 上海交通大学 A method of to program source code automatic generation function descriptive notes
CN110232280A (en) * 2019-06-20 2019-09-13 北京理工大学 A kind of software security flaw detection method based on tree construction convolutional neural networks
CN110362597A (en) * 2019-06-28 2019-10-22 华为技术有限公司 A kind of structured query language SQL injection detection method and device
CN110502361A (en) * 2019-08-29 2019-11-26 扬州大学 Fine granularity defect positioning method towards bug report


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HANBINGTAO: "Introduction to Deep Learning from Scratch (7) – Recurrent Neural Networks" (零基础入门深度学习(7) - 递归神经网络), https://zybuluo.com/hanbingtao/note/626300 *

Similar Documents

Publication Publication Date Title
CN108388425B (en) Method for automatically completing codes based on LSTM
Balzer Transformational implementation: An example
KR101224813B1 (en) Relationship modeling
CN115659281B (en) Method and device for fusing adaptive acceleration operators
Chandlee et al. Input strictly local opaque maps
CN114860893B (en) Intelligent decision-making method and device based on multi-mode data fusion and reinforcement learning
CN111666071B (en) Method and system for resisting network transplantation and optimization based on target many-core
US11513774B2 (en) Multi-lingual code generation with zero-shot inference
CN113031966A (en) Deep learning compilation optimization method for intelligently selecting compilation acceleration library
Lu et al. Program classification using gated graph attention neural network for online programming service
CN113885845B (en) Calculation map generation method, system, equipment and medium of deep learning compiler
CN111125996B (en) Method for realizing instruction set based on bidirectional constraint tree of pseudo-random excitation generator
CN111881028A (en) Neural network automatic generation method based on model code syntactic analysis
Cummins et al. Deep data flow analysis
Jiang et al. Constraint reasoning embedded structured prediction
CN112633516B (en) Performance prediction and machine learning compiling optimization method and device
CN115270795A (en) Small sample learning-based named entity recognition technology in environmental assessment field
Lai et al. Defining and verifying behaviour of domain specific language with fUML
Williams et al. Search-based model driven engineering
CN112015426A (en) Code management method, device and equipment
Simov et al. Word embeddings improvement via echo state networks
CN117421414B (en) Design method of RPA intelligent interactive system based on AIGC
CN116225452A (en) Multi-level intermediate code-based graph neural network compiling optimization method
CN112925564B (en) Method and device for cleaning redundant import class of source code
CN114595340A (en) Knowledge graph path reasoning method and system based on distributed database system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201103