CN116933067A - Pattern recognition method and device, electronic equipment and computer readable medium - Google Patents


Info

Publication number
CN116933067A
CN116933067A (application CN202210343505.2A)
Authority
CN
China
Prior art keywords
sequence
operator
fusion
target
operators
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210343505.2A
Other languages
Chinese (zh)
Inventor
徐茂轩
王凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Lynxi Technology Co Ltd
Original Assignee
Beijing Lynxi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Lynxi Technology Co Ltd filed Critical Beijing Lynxi Technology Co Ltd
Priority application: CN202210343505.2A
Publication: CN116933067A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00: Arrangements for software engineering
    • G06F 8/40: Transformation of program code
    • G06F 8/41: Compilation

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides a pattern recognition method and apparatus, an electronic device, and a computer readable medium. The method comprises: generating a first sequence according to the execution order of a plurality of operators in a specified computational graph and the feature information of those operators; labeling the fusion attributes of the operators according to the first sequence to obtain a second sequence; determining, according to the second sequence, a target subgraph to undergo operator fusion from among the plurality of subgraphs of the specified computational graph; and, in the case that the fusion patterns of the target subgraph include a target fusion pattern that does not belong to the fusion pattern set, updating the fusion pattern set based on the target fusion pattern. According to the embodiments of the disclosure, fusible subgraphs can be identified efficiently.

Description

Pattern recognition method and device, electronic equipment and computer readable medium
Technical Field
The disclosure relates to the field of computer technology, and in particular, to a pattern recognition method and device, an electronic device, and a computer readable medium.
Background
A computational graph (Computational Graph) is a directed graph that describes a function and is widely used in deep learning frameworks (e.g., TensorFlow and ONNX). Typically, the computational graph must be compiled to generate an instruction stream that can run on hardware. After the computational graph enters the compiler, it is divided during the compilation phase into several subgraphs, each comprising one or more operators.
In the related art, after dividing the subgraphs, the compiler can also fuse the operators in some of the subgraphs manually, and each subgraph after operator fusion executes its corresponding instructions as a single execution unit, thereby reducing on-chip storage and accelerating execution.
However, because operators come in many varieties, their combinations are complex and diverse, and operator parameters admit many choices, the subgraphs eligible for operator fusion cannot be determined comprehensively by hand, and manual determination is inefficient.
Disclosure of Invention
The disclosure provides a pattern recognition method and device, electronic equipment and a computer readable medium.
In a first aspect, the present disclosure provides a pattern recognition method, comprising: generating a first sequence according to the execution order of a plurality of operators in a specified computational graph and the feature information of those operators, wherein each item of the first sequence corresponds to the feature vector of one operator; labeling the fusion attributes of the operators according to the first sequence to obtain a second sequence, wherein a fusion attribute characterizes whether an operator can be fused with an adjacent operator; determining, according to the second sequence, a target subgraph to undergo operator fusion from among a plurality of subgraphs of the specified computational graph; and, in the case that the fusion patterns of the target subgraph include a target fusion pattern that does not belong to the fusion pattern set, updating the fusion pattern set based on the target fusion pattern, wherein the fusion pattern set comprises at least one fusion pattern.
In a second aspect, the present disclosure provides a pattern recognition apparatus, comprising: a generation module configured to generate a first sequence according to the execution order of a plurality of operators in a specified computational graph and the feature information of those operators, wherein each item of the first sequence corresponds to the feature vector of one operator; a labeling module configured to label the fusion attributes of the operators according to the first sequence to obtain a second sequence, wherein a fusion attribute characterizes whether an operator can be fused with an adjacent operator; a determining module configured to determine, according to the second sequence, a target subgraph to undergo operator fusion from among a plurality of subgraphs of the specified computational graph; and an updating module configured to update the fusion pattern set based on a target fusion pattern when the fusion patterns of the target subgraph include a target fusion pattern that does not belong to the fusion pattern set, wherein the fusion pattern set comprises at least one fusion pattern.
In a third aspect, the present disclosure provides a compiler comprising: at least one pattern recognition device for implementing the pattern recognition method described above.
In a fourth aspect, the present disclosure provides an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores one or more computer programs executable by the at least one processor, one or more of the computer programs being executable by the at least one processor to enable the at least one processor to perform the pattern recognition method described above.
In a fifth aspect, the present disclosure provides a computer readable medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the pattern recognition method described above.
According to the embodiments of the disclosure, a first sequence composed of operator feature vectors is obtained from the execution order and feature information of a plurality of operators in a computational graph, and the fusion attributes of the operators are determined from the first sequence, so that the target subgraphs to be fused can be determined efficiently and conveniently from those fusion attributes. When the fusion patterns of a target subgraph include a target fusion pattern that does not belong to the fusion pattern set, the set is updated with the target fusion pattern, yielding a more comprehensive and complete fusion pattern set. The fusible subgraphs of the computational graph are thereby identified comprehensively, so that executing computation tasks based on the fused subgraphs occupies fewer on-chip storage resources and runs faster.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The accompanying drawings are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure, without limitation to the disclosure. The above and other features and advantages will become more readily apparent to those skilled in the art by describing in detail exemplary embodiments with reference to the attached drawings, in which:
FIG. 1 is a flow chart of a pattern recognition method provided in an embodiment of the present disclosure;
FIG. 2 is a flow chart of a pattern recognition method provided in an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a training process of a vector acquisition model according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a training process of a labeling model according to an embodiment of the disclosure;
FIG. 5 is a schematic diagram of a processing procedure of a pattern recognition method according to an embodiment of the disclosure;
FIG. 6 is a schematic diagram of a pattern recognition provided by an embodiment of the present disclosure;
FIG. 7 is a block diagram of a pattern recognition device provided by an embodiment of the present disclosure;
FIG. 8 is a block diagram of a compiler provided by an embodiment of the present disclosure;
fig. 9 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
For a better understanding of the technical solutions of the present disclosure, exemplary embodiments of the present disclosure will be described below with reference to the accompanying drawings, in which various details of the embodiments of the present disclosure are included to facilitate understanding, and they should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Embodiments of the disclosure and features of embodiments may be combined with each other without conflict.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The terms "connected," "coupled," and the like are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
Machine learning (ML) is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It studies how computers simulate or implement human learning behavior in order to acquire new knowledge or skills, and how they reorganize existing knowledge structures to continuously improve their own performance. Deep learning is a branch of machine learning based on neural network algorithms; its motivation is to build neural networks that simulate the human brain for analysis and learning.
In practical applications, a computational graph may be used to represent the neural network to which a deep learning model corresponds. In the computational graph, the computation is represented as a directed graph that defines how data flows, how data is computed, the relationships between computations, and so on. Typically, the computational graph includes two basic elements: nodes (Node) and directed edges (Edge). The nodes include variable nodes, operator nodes, and the like; each variable node corresponds to a tensor, which may be a scalar, vector, or matrix, and operator nodes refer to mathematical operations (e.g., convolution/deconvolution operators, pooling operators, activation operators, classifier operators, fully connected operators, and the like). Directed edges represent the dependency between two nodes (e.g., for two operators connected in series, the output of the former operator is the input of the latter). Through the joint action of its internal operators, the computational graph as a whole can implement relatively complete, function-specific algorithms, including but not limited to various deep learning algorithms.
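The node-and-edge structure just described can be sketched in a few lines of code. All class and operator names below are illustrative assumptions, not part of the disclosure.

```python
# Minimal illustrative sketch of a computational graph: operator nodes and
# directed edges. Every name here is hypothetical.
class OperatorNode:
    def __init__(self, name, op_type):
        self.name = name          # e.g. "conv1"
        self.op_type = op_type    # e.g. "Conv2d", "Relu"
        self.inputs = []          # upstream operator nodes (directed edges)

class ComputationalGraph:
    def __init__(self):
        self.nodes = {}

    def add_op(self, name, op_type, inputs=()):
        node = OperatorNode(name, op_type)
        node.inputs = [self.nodes[i] for i in inputs]  # dependency edges
        self.nodes[name] = node
        return node

# Two operators connected in series: the convolution's output feeds the
# activation operator's input.
g = ComputationalGraph()
g.add_op("conv1", "Conv2d")
g.add_op("relu1", "Relu", inputs=("conv1",))
assert g.nodes["relu1"].inputs[0].op_type == "Conv2d"
```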
In the related art, a computational graph may be divided into a plurality of subgraphs, and each subgraph is allocated to a computation unit, so that the operation result of the computational graph is obtained by distributed computation. A many-core system comprises a plurality of processing cores (Cores) connected in a preset manner, each of which is the smallest unit that can be independently scheduled, possesses independent computing capability, and can execute computing tasks on its own; such a system can therefore process the subgraphs of the computational graph in a distributed fashion, improving processing efficiency.
It should be appreciated that when a many-core system processes a computational graph, different subgraphs must be allocated different resources; that is, each subgraph is "mapped" onto different resources of the many-core network. In other words, at runtime the many-core system allocates to each subgraph of the computational graph a determined portion of its resources that is not repeatedly used (occupied) by other subgraphs.
The resources of the many-core system may include running resources (i.e., hardware resources) such as processing cores, threads, and on-chip memory space. After target running resources are allocated to a subgraph of the computational graph, the processing cores, threads, and on-chip memory space in those target resources can be occupied only by that subgraph while it runs in the many-core system.
The resources of the many-core system may also include time resources; the target time resource allocated to a subgraph specifies when that subgraph should be processed by the many-core system. That is, multiple subgraphs may be processed in a "time-division multiplexed" manner. In other words, the same resource cannot be allocated to several different subgraphs at the same time: under the same time resource (at the same moment), any running resource can be the target resource of at most one subgraph, while the same running resource may be allocated to different subgraphs at different times.
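The time-division constraint just described can be sketched as a scheduling check: under one time resource, a running resource may be the target resource of at most one subgraph. The scheduling API below is purely hypothetical.

```python
# Toy sketch of the allocation rule: the same running resource may serve
# different subgraphs at different times, but never two subgraphs at once.
def allocate(schedule, time_slot, resource, subgraph):
    """schedule maps (time_slot, resource) -> subgraph id."""
    key = (time_slot, resource)
    if key in schedule:
        raise ValueError(f"{resource} already occupied at t={time_slot}")
    schedule[key] = subgraph

schedule = {}
allocate(schedule, 0, "core0", "subgraph_A")
allocate(schedule, 1, "core0", "subgraph_B")   # same core, later slot: allowed
conflict = False
try:
    allocate(schedule, 0, "core0", "subgraph_C")  # same core, same slot
except ValueError:
    conflict = True
assert conflict
```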
It should be noted that before a many-core system processes a computational graph, a compiler is typically required to compile the graph into an instruction stream that can run on hardware. In the related art, after dividing the subgraphs, the compiler can also fuse the operators in some of the subgraphs manually, and each subgraph after operator fusion executes its corresponding instructions as a single execution unit, thereby reducing on-chip storage and accelerating execution. However, because operators come in many varieties, their combinations are complex and diverse, and operator parameters admit many choices, the subgraphs eligible for operator fusion cannot be determined comprehensively by hand, and determination is inefficient.
In view of this, the embodiments of the present disclosure provide a pattern recognition method and apparatus, an electronic device, and a medium. A first sequence composed of operator feature vectors is obtained from the execution order and feature information of a plurality of operators in a computational graph, and the fusion attributes of the operators are determined from the first sequence, so that the target subgraphs to be fused can be determined efficiently and conveniently. When the fusion patterns of a target subgraph include a target fusion pattern that does not belong to the fusion pattern set, the set is updated with that pattern, yielding a more comprehensive and complete fusion pattern set. The subgraphs of the computational graph eligible for operator fusion are thereby identified more comprehensively, so that executing computation tasks based on the fused subgraphs occupies fewer on-chip storage resources and runs faster.
Fig. 1 is a flowchart of a pattern recognition method according to an embodiment of the present disclosure. Referring to fig. 1, the method includes:
step S101, generating a first sequence according to the execution sequence of a plurality of operators and the characteristic information of the operators in the designated calculation graph.
Wherein, the designated computation graph refers to the computation graph in the current compiling process, wherein the connection relation of the nodes possibly comprises one or more subgraphs. The execution order is used to characterize the order of operations of the individual operators when executing a given computational graph.
The characteristic information of the operator includes information that may reflect certain characteristics of the operator. The terms of the first sequence respectively correspond to the feature vectors of the operators, and the feature vectors are vectors obtained according to the feature information of the operators.
Step S102, labeling the fusion attributes of the operators according to the first sequence to obtain a second sequence.
The fusion attribute characterizes whether an operator can be fused with an adjacent operator; labeling the fusion attribute means determining a labeling value for it, so that the operator's fusion attribute can be read off from that value. The second sequence is the sequence composed of the operators' fusion attribute labeling values.
In some embodiments, fusion attributes of each operator in the first sequence are marked by using a preset marking value, so as to obtain a second sequence.
Step S103, determining, according to the second sequence, a target subgraph to undergo operator fusion from among the plurality of subgraphs of the specified computational graph.
The target subgraph is a subgraph of the specified computational graph on which operator fusion can be performed.
Step S104, updating the fusion pattern set based on the target fusion pattern in the case that the fusion patterns of the target subgraph include a target fusion pattern that does not belong to the fusion pattern set.
The fusion pattern set comprises at least one fusion pattern. A fusion pattern indicates a pattern of operators that can be fused, including but not limited to operator types and operator connection relationships. The target fusion pattern is a pattern that does not belong to the fusion pattern set; in other words, it differs from every existing fusion pattern in the set.
In addition, when the fusion patterns of the target subgraph all belong to the fusion pattern set, no new fusion pattern has been found, and the fusion pattern set need not be updated.
It should also be noted that updating the fusion pattern set is an iterative process. In general, multiple rounds of pattern recognition are needed to identify a comprehensive range of fusion patterns; a fusion pattern set built from the recognized patterns is then relatively complete.
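The update rule of step S104 can be sketched as follows. Encoding a fusion pattern as a tuple of operator types is an illustrative assumption; the disclosure states only that a pattern covers operator types and connection relationships.

```python
# Sketch of step S104: a pattern found in a target subgraph that is not yet in
# the fusion pattern set (a "target fusion pattern") is added; known patterns
# leave the set unchanged. The tuple encoding is hypothetical.
def update_pattern_set(pattern_set, subgraph_patterns):
    added = []
    for pattern in subgraph_patterns:
        if pattern not in pattern_set:   # target fusion pattern: new to the set
            pattern_set.add(pattern)
            added.append(pattern)
    return added

patterns = {("Conv2d", "Relu")}                      # existing fusion pattern set
new = update_pattern_set(patterns,
                         [("Conv2d", "Relu"),                 # already known
                          ("Conv2d", "BatchNorm", "Relu")])   # newly discovered
assert new == [("Conv2d", "BatchNorm", "Relu")]
assert len(patterns) == 2
```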
In the embodiments of the present disclosure, a first sequence composed of operator feature vectors is obtained from the execution order and feature information of a plurality of operators in a computational graph, and the fusion attributes of the operators are determined from the first sequence, so that the target subgraphs to undergo operator fusion can be determined efficiently and conveniently. When the fusion patterns of a target subgraph include a target fusion pattern that does not belong to the fusion pattern set, the set is updated with that pattern, yielding a more comprehensive and complete fusion pattern set. The subgraphs of the computational graph eligible for operator fusion are thereby identified more comprehensively, so that executing computation tasks based on the operator-fused subgraphs occupies fewer on-chip storage resources and runs faster.
It should be noted that, in some embodiments, before step S101, the pattern recognition method according to the embodiments of the present disclosure may further include: an order of execution of the plurality of operators in the specified computational graph is determined.
In some embodiments, the execution order of the operators may be determined from the connection relationships of the individual operators in the specified computational graph.
It should be noted that the above method for determining the execution sequence is merely illustrative, and the embodiments of the present disclosure do not limit the specific determination manner of the execution sequence.
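As one concrete possibility (the disclosure deliberately leaves the method open), the execution order could be derived from the operators' connection relationships with a topological sort:

```python
# Kahn's topological sort over operator dependencies: an operator runs only
# after all of its producers have run. Operator names are placeholders.
from collections import deque

def execution_order(ops, edges):
    """ops: operator names; edges: (producer, consumer) pairs."""
    indegree = {op: 0 for op in ops}
    consumers = {op: [] for op in ops}
    for src, dst in edges:
        consumers[src].append(dst)
        indegree[dst] += 1
    ready = deque(op for op in ops if indegree[op] == 0)
    order = []
    while ready:
        op = ready.popleft()
        order.append(op)
        for nxt in consumers[op]:      # releasing op unblocks its consumers
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return order

order = execution_order(["relu", "conv", "pool"],
                        [("conv", "relu"), ("relu", "pool")])
assert order == ["conv", "relu", "pool"]
```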
Fig. 2 is a flowchart of a pattern recognition method according to an embodiment of the present disclosure. Referring to fig. 2, the method includes:
step S201, a first sequence is generated according to the execution sequence of a plurality of operators and the characteristic information of the operators in the designated calculation graph.
In some embodiments, the feature information of the operator includes at least one of an operator structure, an operator execution mechanism, an operator access feature, and an operator calculation feature, and the corresponding feature vector may be obtained by performing feature extraction on one or more feature information of the operator. The operator structure is used for representing information such as association relations among components in the operator (for example, dimension information of a two-dimensional convolution operator (Conv 2 d); the operator execution mechanism comprises a principle followed by an operator in the execution process, and can comprise contents such as the quantity relation between input data and output data of the operator (for example, an activating operator corresponds to one input data and one output data, and a convolution operator corresponds to a plurality of input data and a plurality of output data); the operator memory access feature is used for characterizing the feature of the operator about memory access behavior, and relates to data reuse (locality) and decoupling of data memory access (for example, decoupling of input data and weight in a convolution operator can be performed separately for memory access); the operator computation characteristics are used to characterize the operational characteristics of the operator (e.g., the operator may perform matrix operations, vector operations, scalar operations, etc.).
In some embodiments, in compiling a specified computation graph, first arranging a plurality of operators in the specified computation graph according to the execution sequence thereof to obtain a third sequence (i.e. an execution sequence); secondly, generating a characteristic vector of the operator according to the characteristic information of the operator; finally, the feature vectors of the operators are arranged according to the arrangement sequence of the operators in the third sequence, so as to obtain a first sequence (namely a feature vector sequence).
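The three-stage construction just described might look like the following sketch, where a hand-written lookup table stands in for real feature extraction; every encoding is an assumption made for illustration.

```python
# Illustrative stand-in for feature extraction: map each operator type to a
# fixed feature vector, then arrange vectors in execution order to obtain the
# first sequence. The components and values below are invented.
FEATURE_TABLE = {
    # (structure code, number of inputs, reuses data?, computation class)
    "Conv2d": (2, 2, 1, 0),   # multiple inputs, input/weight access decoupled
    "Relu":   (1, 1, 0, 1),   # one input, one output, elementwise
}

def first_sequence(third_sequence):
    """third_sequence: operators arranged in execution order."""
    return [FEATURE_TABLE[op] for op in third_sequence]

third_sequence = ["Conv2d", "Relu"]        # the execution sequence
seq1 = first_sequence(third_sequence)      # the feature vector sequence
assert seq1 == [(2, 2, 1, 0), (1, 1, 0, 1)]
```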
In some embodiments, the feature vector of the operator may be generated by a preset vector acquisition model, that is, the feature vector of the operator is generated by the vector acquisition model according to the feature information of the operator. The vector acquisition model is a model obtained through training, and the training method can be seen in the training process of the vector acquisition model shown in fig. 3.
Step S202, labeling the fusion attributes of the operators with preset labeling values according to the first sequence, obtaining each operator's labeling value.
The preset labeling values are used to label the operators' fusion attributes.
In some embodiments, the preset labeling values of the fusion attribute include a first value and a second value: the first value characterizes that the corresponding operator can be fused with an adjacent operator, and the second value characterizes that it cannot. For example, if the labeling value of an operator's fusion attribute is the first value, the operator can be fused with an adjacent operator and its fusion attribute is fusible; conversely, if the labeling value is the second value, the operator cannot be fused with an adjacent operator and its fusion attribute is non-fusible. The above labeling values are merely examples; in practical applications, the types and number of labeling values may be set as required, which the embodiments of the present disclosure do not limit.
Step S203, obtaining the second sequence from the operators' labeling values.
In some embodiments, suppose the specified computational graph includes three operators whose execution order is third operator, first operator, second operator, and the preset labeling values are "0" and "1". If the labeling value of the third operator is 1 and the labeling values of the first and second operators are 0, the corresponding second sequence is determined to be "100".
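The worked example above can be checked in a few lines; the operator names are placeholders.

```python
# Concatenate per-operator labeling values (in execution order) into the
# second sequence, reproducing the "100" example from the text.
def second_sequence(labels):
    return "".join(str(v) for v in labels)

# Execution order: third operator, first operator, second operator.
labels = {"op3": 1, "op1": 0, "op2": 0}
order = ["op3", "op1", "op2"]
assert second_sequence(labels[op] for op in order) == "100"
```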
It should be noted that in some embodiments, steps S202-S203 may be implemented by a preset labeling model: the first sequence is input into the labeling model, the model determines each operator's labeling value, assembles the second sequence from those values, and outputs it, so that the operators' fusion attributes can be determined from the second sequence. The labeling model is built based on a conditional random field (CRF) and obtained through training.
In some possible implementations, the training process of the labeling model includes: constructing an initial labeling model based on a conditional random field; inputting a first training sequence into the initial labeling model to obtain a predicted labeling sequence, i.e., the sequence of predicted operator fusion attributes that the initial model produces from the first training sequence; and iteratively updating the labeling model according to the predicted labeling sequence and a second training sequence to obtain a trained labeling model, which can determine from a first sequence the corresponding second sequence. The first training sequence is a sequence of operator features arranged in operator execution order, and the second training sequence is a sequence of operator fusion attribute labeling values (with the operators arranged in execution order).
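The disclosure builds its labeling model on a conditional random field. As a hedged illustration of only the decoding step of a linear-chain CRF, the toy below runs Viterbi over hand-set scores; a real labeling model would learn these scores from the first and second training sequences, and every number here is an invented example.

```python
# Toy Viterbi decoding, the inference core of a linear-chain CRF.
# Labels: 1 = fusible, 0 = not fusible. All scores are illustrative.
def viterbi(emissions, transition):
    """emissions: per-operator {label: score}; transition: {(prev, cur): score}."""
    labels = [0, 1]
    score = dict(emissions[0])            # best score ending in each label
    back = []                             # back-pointers per position
    for em in emissions[1:]:
        new_score, ptr = {}, {}
        for cur in labels:
            prev = max(labels, key=lambda p: score[p] + transition[(p, cur)])
            new_score[cur] = score[prev] + transition[(prev, cur)] + em[cur]
            ptr[cur] = prev
        back.append(ptr)
        score = new_score
    best = max(labels, key=score.get)     # best final label
    path = [best]
    for ptr in reversed(back):            # follow back-pointers to the start
        path.append(ptr[path[-1]])
    return path[::-1]

# Three operators: the first two score strongly fusible, the last does not.
emissions = [{0: 0.1, 1: 2.0}, {0: 0.2, 1: 1.5}, {0: 3.0, 1: 0.1}]
transition = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5}
assert viterbi(emissions, transition) == [1, 1, 0]
```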
After obtaining the trained annotation model, the annotation model can be used to obtain the second sequence in step S102 or steps S202-S203.
In some possible implementations, after obtaining the predicted annotation sequence, before iteratively updating the annotation model according to the predicted annotation sequence and the second training sequence, the method further includes:
auditing the prediction labeling sequence to obtain an auditing result; and updating the predicted labeling sequence according to the auditing result to obtain an updated predicted labeling sequence, and carrying out iterative updating on the labeling model by using the updated predicted labeling sequence and the second training sequence to obtain a trained labeling model.
Auditing the predicted labeling sequence means checking its accuracy or reasonableness against any one or more of experience, statistical data, or other preset conditions, and obtaining a corresponding audit result (i.e., identifying predictions that fuse operators wrongly or that mark unfusable operators as fusible). Predictions of low accuracy or reasonableness are rejected or modified according to the audit result, thereby updating the predicted labeling sequence. Iteratively updating the labeling model with the updated predicted labeling sequence and the second training sequence then yields a better labeling model, improving training efficiency and training effect.
It should be noted that, the auditing of the prediction labeling sequence may be performed by a corresponding intelligent hardware device, or may be implemented manually, which is not limited by the embodiments of the present disclosure.
It should be further noted that, in practical applications, there are various modifications and improvements to the conditional random field, and the labeling model in the embodiments of the present disclosure may be established based on any one of the conditional random fields and used to output the second sequence. The specific type of conditional random field is not limited by the disclosed embodiments.
Step S204, determining a target subgraph to be subjected to operator fusion from a plurality of subgraphs of the designated calculation graph according to the second sequence.
In some embodiments, an operator in the second sequence corresponding to the first value is first determined to be a target operator; and secondly, determining a sub-graph with a target operator in the plurality of sub-graphs as a target sub-graph.
It should be noted that when every subgraph containing a target operator is determined to be a target subgraph, two types of target subgraph arise. In the first type, every operator is a target operator to be fused; in the second type, the subgraph contains, besides the target operators to be fused, at least one operator that cannot be fused or does not need to be fused. In practical applications, the type of target subgraph to be identified can be chosen according to the specific recognition requirement. For example, when both types need to be identified, it suffices to directly determine every subgraph containing a target operator as a target subgraph. For another example, when only the first type needs to be identified, the subgraphs containing target operators are first screened out as candidate target subgraphs, and the first-type target subgraphs are then further selected from the candidates.
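As a hedged sketch of the screening just described (the data structures and names are illustrative assumptions, not the disclosed implementation), the two types of target subgraph can be filtered as follows:

```python
# Subgraphs are given as lists of operator ids, and `labels` maps each
# operator id to its fusion-attribute value (1 = has the characteristic
# of fusing with a neighbor, 0 = does not).

def find_target_subgraphs(subgraphs, labels, first_type_only=False):
    """Return subgraphs containing at least one target operator.

    When first_type_only is True, keep only subgraphs in which *every*
    operator is a target operator (the first type described above).
    """
    targets = []
    for sub in subgraphs:
        flags = [labels[op] == 1 for op in sub]
        if not any(flags):
            continue  # no target operator in this subgraph
        if first_type_only and not all(flags):
            continue  # second type: contains non-fusible operators
        targets.append(sub)
    return targets
```

With `first_type_only=False` the function implements the "both types" case; with `True` it performs the two-stage screening (candidates, then first-type only) in a single pass.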
Step S205, in the case that the fusion mode of the target subgraph comprises a target fusion mode which does not belong to the fusion mode set, updating the fusion mode set based on the target fusion mode.
In some embodiments, when the fusion mode of the target subgraph includes a target fusion mode that does not belong to the fusion mode set, it can be determined that the target fusion mode differs from every existing fusion mode in the set, that is, the target fusion mode is a new fusion mode. To make the fusion mode set more complete, the target fusion mode is therefore incorporated into the set, thereby updating the fusion mode set.
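The set-update step above can be sketched minimally as follows; representing a fusion mode as a hashable signature (for example, a tuple of operator types) is an assumption made for illustration, not part of the original disclosure:

```python
def update_pattern_set(pattern_set, target_pattern):
    """Add the target fusion mode to the set only if it is genuinely new."""
    if target_pattern not in pattern_set:
        pattern_set.add(target_pattern)
        return True   # set was updated with a new fusion mode
    return False      # mode already known; no update needed
```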
In the embodiment of the disclosure, the fusion attribute of the operator is accurately marked through the preset marking value, so that the target subgraph can be identified more conveniently and accurately, and the existing fusion mode set can be updated according to the fusion mode of the target subgraph, thereby obtaining a more comprehensive and complete fusion mode set. In practical application, the labeling process can be realized through a labeling model established based on a conditional random field, so that the labeling operation of fusion attributes is more convenient.
It should be noted that, in some embodiments, after step S205, the pattern recognition method according to the embodiments of the present disclosure may further include: determining a target subgraph in a calculation graph to be identified according to a target fusion mode set, wherein the target fusion mode set is obtained by updating the fusion mode set for a plurality of times; fusing at least two operators in the target subgraph to obtain a fused operator; acquiring an execution strategy of the target sub-graph, wherein the execution strategy at least comprises an execution method of a fusion operator (when an operator which cannot be fused or is not fused exists in the target sub-graph, the execution strategy also comprises the execution method of the operator which cannot be fused or is not fused); generating an instruction corresponding to the target subgraph according to the execution strategy of the target subgraph; and sending the instruction corresponding to the target subgraph to the designated processing core so that the designated processing core can execute the calculation task corresponding to the target subgraph according to the instruction.
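The post-recognition flow in the preceding note can be sketched under illustrative assumptions (the `fuse` and `make_instruction` callables and the dictionary-based operators are placeholders, not the disclosed implementation):

```python
def compile_subgraph(subgraph, pattern_set, fuse, make_instruction):
    """Return an instruction for `subgraph`: operators whose combined
    signature appears in `pattern_set` are fused into one operator;
    otherwise each operator keeps its own execution method."""
    signature = tuple(op["type"] for op in subgraph)
    if signature in pattern_set:
        ops = [fuse(subgraph)]          # one fused operator
    else:
        ops = list(subgraph)            # execute operators individually
    return make_instruction(ops)        # instruction sent to a processing core
```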
When an execution strategy is prepared, the target subgraph subjected to operator fusion can be used as an execution unit, and based on the target subgraph, part of intermediate data does not need to be stored, so that the number of times of 'storing-fetching' operation can be effectively reduced, the execution speed of the target subgraph is increased, and the processing capacity of a many-core system or a many-core chip is improved.
Fig. 3 is a schematic diagram of a training process of a vector acquisition model according to an embodiment of the disclosure. Referring to fig. 3, the training process of the vector acquisition model includes the steps of:
in step S301, an initial vector acquisition model is constructed.
The vector acquisition model is used for acquiring corresponding feature vectors according to the feature information of the operators.
In some embodiments, an initial vector acquisition model is built based on neural network techniques, including, but not limited to, one or more of a recurrent neural network (RNN) model, a convolutional neural network (CNN) model, and a deep belief network (DBN) model (for example, an RNN model may be stacked on a CNN model to build the vector acquisition model).
It should be noted that the above types of vector acquisition models are merely examples, and the embodiments of the present disclosure do not limit the construction manner of the vector acquisition models.
Because the initial vector acquisition model is an untrained original model, the feature vectors it produces are not very accurate. Good model parameters must therefore be obtained through training: only after the model has been iteratively updated with these parameters can a vector acquisition model with a good effect be obtained.
Step S302, first training set data is acquired.
In some embodiments, the first training set data includes, but is not limited to, feature information of an operator and feature vectors corresponding thereto.
In step S303, the first training set data is input into a vector acquisition model to obtain a predicted feature vector.
The prediction feature vector is a prediction result obtained by processing input data by a vector acquisition model.
In some embodiments, the predicted feature vector includes a first vector for characterizing independent features of the operator and a second vector for characterizing associated features of the operator with neighboring operators.
Step S304, updating the vector acquisition model according to the predicted feature vector and the actual feature vector.
Step S305, stopping training to obtain a final vector acquisition model when a preset first iteration stop condition is satisfied.
Wherein the first iteration stop condition is used to instruct to stop the training process of the vector acquisition model.
In some embodiments, the first iteration stop condition includes the number of iterations reaching a preset iteration threshold, and/or the feature vector accuracy reaching a preset accuracy requirement, and/or the feature vector accuracy no longer improving, and the like.
It should be noted that the above first iteration stop condition is merely an example, and the embodiment of the present disclosure is not limited thereto.
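As an illustrative sketch only — the thresholds and the "no longer improving" test below are assumptions, not values fixed by the disclosure — the first iteration stop condition might be checked as:

```python
def should_stop(iteration, acc_history,
                max_iters=1000, target_acc=0.95, patience=5):
    """Return True when any of the example stop conditions is met."""
    if iteration >= max_iters:               # iteration threshold reached
        return True
    if acc_history and acc_history[-1] >= target_acc:  # accuracy requirement met
        return True
    # "no longer improving": best recent accuracy does not beat earlier best
    if len(acc_history) > patience:
        best_recent = max(acc_history[-patience:])
        best_before = max(acc_history[:-patience])
        if best_recent <= best_before:
            return True
    return False
```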
Fig. 4 is a schematic diagram of a training process of a labeling model according to an embodiment of the disclosure. Referring to fig. 4, the training process of the annotation model comprises the following steps:
step S401, an initial labeling model is built based on the conditional random field.
The conditional random field (CRF) is a conditional probability distribution model of a set of output random variables given a set of input random variables, characterized by the assumption that the output random variables form a Markov random field. Conditional random fields are often applied to sequence labeling problems, such as part-of-speech tagging and named entity recognition.
In the embodiment of the disclosure, a labeling model constructed based on CRF is used for labeling or identifying fusion attributes of operators. For example, the input random variable of the labeling model is a feature vector sequence (i.e., a first sequence) formed by arranging feature vectors of respective operators of a designated computational graph according to an execution order thereof, and the output random variable is a sequence (i.e., a second sequence) formed by fused attribute labeling values of the respective operators.
It should be noted that because the initial labeling model uses initially configured model parameters and has not been trained, the labeling results it produces are not very accurate; a labeling model with high labeling accuracy is obtained only after the model parameters have been iteratively updated and good parameters obtained through training. The model parameters include, but are not limited to, the weight coefficients of the feature vectors, which represent the degree to which each feature vector influences the labeling result.
Step S402, second training set data is acquired.
In some embodiments, the second training set data includes, but is not limited to, a feature vector sequence (i.e., a first training sequence) and a corresponding actual annotation sequence (i.e., a second training sequence) of the training computational graph. The actual labeling sequence is a sequence obtained by labeling according to the actual fusion attribute of the operator.
The training computation graphs can be obtained from different neural networks. In general, the more types and the greater the number of neural networks used, the more plentiful and diverse the training set data, and thus the better the training effect.
Step S403, inputting the second training set data into the labeling model to obtain a prediction labeling sequence.
The prediction annotation sequence is an annotation sequence obtained by predicting an annotation model according to input data.
And step S404, updating the labeling model according to the actual labeling sequence and the predicted labeling sequence.
When the predicted labeling sequence differs from the actual labeling sequence, there is room for improvement in the accuracy of the labeling model. The model parameters are therefore adjusted according to the difference between the actual labeling sequence and the predicted labeling sequence, the labeling model is updated with these parameters, and a new round of training starts from the updated labeling model.
And step S405, stopping training to obtain a final labeling model under the condition that a preset second iteration stop condition is met.
The second iteration stop condition is used for indicating to stop the training process of the labeling model.
In some embodiments, the second iteration stop condition includes the number of iterations reaching a preset iteration threshold, and/or the labeling accuracy of the model reaching a preset accuracy requirement, and/or the labeling accuracy of the model no longer improving, and the like.
It should be noted that the second iteration stop condition is merely exemplary, and the embodiment of the present disclosure is not limited thereto.
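The loop of steps S401-S405 can be sketched as follows, assuming a hypothetical model object with `predict` and `update` methods; a real CRF implementation (for example, one built on an existing CRF library) would differ in detail:

```python
def train_labeling_model(model, dataset, max_iters=100, target_acc=0.95):
    """dataset: list of (feature_sequence, actual_label_sequence) pairs."""
    for iteration in range(max_iters):                 # training rounds
        correct = total = 0
        for features, actual in dataset:
            predicted = model.predict(features)        # S403: predict labels
            model.update(features, actual, predicted)  # S404: adjust parameters
            correct += sum(p == a for p, a in zip(predicted, actual))
            total += len(actual)
        if total and correct / total >= target_acc:    # S405: stop condition
            break
    return model
```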
It should also be noted that training of the labeling model may be carried out on electronic devices such as servers and virtual machines, to which the embodiments of the present disclosure are not limited. Moreover, in some embodiments the data processing capacity of the electronic device is limited, while the computation graph of a neural network is usually large (containing a great number of operators, and so on). Therefore, when the second training set data is acquired, the computation graph obtained from the neural network may be divided in advance into multiple smaller computation graphs, which are then used to train the labeling model according to the method described above.
It should also be noted that in some embodiments, the labeling model and the vector acquisition model may be trained in a training process. For example, first, the second training set data is input into an initial vector acquisition model to obtain a predicted feature vector, the predicted feature vector is input into an initial labeling model, and the initial labeling model labels the fusion attribute of each operator according to the predicted feature vector to obtain a predicted labeling sequence. And secondly, the vector acquisition model updates the model according to the predicted feature vector and the actual feature vector, and similarly, the labeling model updates the model according to the predicted labeling sequence and the actual labeling sequence. And finally, training for a new round based on the updated vector acquisition model and the labeling model until the preset first iteration stop condition and/or second iteration stop condition are met, stopping training, and obtaining the trained vector acquisition model and the trained labeling model.
It should be understood that training the labeling model and the vector acquisition model in one training process is equivalent to treating the two models as different layers of a single neural network. Accordingly, the actual feature vector used when updating the vector acquisition model in step S304 is not the actual feature vector of a single layer but the overall actual feature vector of the entire network, so as to obtain a good training result.
Fig. 5 is a schematic diagram of a processing procedure of a pattern recognition method according to an embodiment of the disclosure. Referring to fig. 5, the processing procedure of the pattern recognition method includes the steps of:
step S501, a specified computation graph corresponding to the neural network model is generated from the specified neural network model.
Step S502, according to the execution sequence of each operator in the designated computation graph, generating the execution sequence of the designated computation graph.
Step S503, inputting the characteristic information of each operator in the execution sequence into a vector acquisition model to obtain the characteristic vector of each operator.
Step S504, the feature vectors of the operators are arranged according to the execution sequence, and a first sequence is obtained.
Step S505, inputting the first sequence into the labeling model to obtain a second sequence.
The second sequence is the fusion attribute labeling sequence.
In step S506, the operator corresponding to the first value in the second sequence is determined as the target operator.
The first value is a fusion attribute labeling value, and the corresponding operator is characterized by fusion with an adjacent operator. Correspondingly, the fusion attribute labeling value further comprises a second value, and the second value is used for representing that the corresponding operator does not have the characteristic of fusion with the adjacent operator.
Step S507, determining the subgraph with the target operator as the target subgraph.
Step S508, determining whether the fusion mode of the target subgraph includes a target fusion mode that does not belong to the fusion mode set.
Step S509, in the case that the fusion mode of the target subgraph includes a target fusion mode that does not belong to the fusion mode set, updating the fusion mode set based on the target fusion mode.
It should be noted that when every fusion mode of the target subgraph already belongs to the fusion mode set, the fusion mode set does not need to be updated.
Fig. 6 is a schematic diagram of pattern recognition provided in an embodiment of the present disclosure. The processing procedure of the pattern recognition method is described below with reference to fig. 6.
Referring to fig. 6, the specified computation graph includes 9 operators, where operator 1-operator 3 are operators in sub-graph 1, operator 4 and operator 5 are operators in sub-graph 2, operator 6-operator 8 are operators in sub-graph 3, and operator 9 constitutes sub-graph 4. Assuming that the execution sequence of the 9 operators is from the operator 1 to the operator 9 in sequence, the execution sequence of the specified computation graph is operator 1-operator 2- … -operator 8-operator 9. For each operator, determining the feature vector of the operator according to the feature information of the operator, and arranging the feature vectors of the operators according to the execution sequence to obtain a first sequence which has the same length as the execution sequence and has a corresponding relation (namely, the execution sequence corresponds to the same operator at the same position in the first sequence). After the first sequence is input to the annotation model, the annotation model outputs a second sequence. The labeling value of the fusion attribute of the operator comprises two cases of 0 and 1, wherein 0 indicates that the operator does not have the characteristic of fusion with the adjacent operator, and 1 indicates that the operator has the characteristic of fusion with the adjacent operator.
As shown in fig. 6, the second sequence is 000110000, so the labeling values corresponding to operator 4 and operator 5 are "1", and operators 4 and 5 are therefore determined to be target operators. Further, operators 4 and 5 are operators in sub-graph 2, so sub-graph 2 is determined to be the target sub-graph. The fusion characteristics corresponding to sub-graph 2 are obtained from the operator characteristics and connection characteristics of operators 4 and 5. These fusion characteristics are compared with those of each fusion mode in the fusion mode set; if none match, the fusion mode corresponding to sub-graph 2 is added to the set. Conversely, if the fusion characteristics of sub-graph 2 match those of a fusion mode already in the set, the set is not updated.
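The figure 6 example can be reproduced as a small sketch (the data structures here are illustrative, chosen to mirror the description above):

```python
second_sequence = "000110000"                 # labeling model output
subgraphs = {1: [1, 2, 3], 2: [4, 5], 3: [6, 7, 8], 4: [9]}

# Operators whose labeling value is "1" (the first value) are target operators.
target_ops = {i + 1 for i, v in enumerate(second_sequence) if v == "1"}

# A subgraph containing any target operator is a target subgraph.
target_subgraphs = [sid for sid, ops in subgraphs.items()
                    if target_ops.intersection(ops)]

print(target_ops)         # {4, 5}
print(target_subgraphs)   # [2]
```

This reproduces the conclusion of the example: operators 4 and 5 are the target operators, and sub-graph 2 is the target sub-graph.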
Fig. 7 is a block diagram of a pattern recognition apparatus according to an embodiment of the present disclosure.
Referring to fig. 7, an embodiment of the present disclosure provides a pattern recognition apparatus including:
the generating module 701 is configured to generate a first sequence according to an execution order of the plurality of operators and feature information of the plurality of operators in the specified computation graph.
Wherein each item of the first sequence corresponds to a feature vector of each operator, respectively.
The labeling module 702 is configured to label the fusion attribute of the operator according to the first sequence, so as to obtain a second sequence.
The fusion attribute is used for representing whether the operator has the characteristic of fusion with the adjacent operator.
A determining module 703 configured to determine, from the plurality of subgraphs of the specified computation graph, a target subgraph to be operator fused according to the second sequence.
An updating module 704 configured to update the fusion pattern set based on the target fusion pattern in case the fusion pattern of the target sub-graph includes a target fusion pattern not belonging to the fusion pattern set.
The fusion mode set comprises at least one fusion mode.
In some embodiments, the generating module 701 includes: the device comprises a first ordering unit, a vector generating unit and a second ordering unit. The first sequencing unit is used for arranging the operators according to the execution sequence of the operators to obtain a third sequence; the vector generation unit is used for generating the characteristic vector of the operator according to the characteristic information of the operator; and the second sequencing unit is used for arranging the feature vectors of the operators according to the arrangement sequence of the operators in the third sequence to obtain the first sequence.
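A minimal sketch of the three units, assuming `get_vector` stands in for the trained vector acquisition model and operators are identified by name (both assumptions for illustration):

```python
def build_first_sequence(operators, execution_order, get_vector):
    """First ordering unit: arrange operators by execution order (third
    sequence); vector generation unit: map each operator to its feature
    vector; second ordering unit: keep the vectors in that same order
    (first sequence)."""
    third_sequence = sorted(operators, key=lambda op: execution_order[op])
    return [get_vector(op) for op in third_sequence]
```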
In some embodiments, the feature information includes at least one of an operator structure, an operator execution mechanism, an operator access feature, and an operator computation feature.
In some embodiments, the feature vector of the operator includes a first vector for characterizing the independent features of the operator and a second vector for characterizing the associated features of the operator with neighboring operators.
It should be noted that, in some embodiments, the feature vector of the operator may be generated by a preset vector acquisition model according to the feature information of the operator. The vector acquisition model is a model which is obtained through training and has the capability of generating feature vectors. The training process of the vector acquisition model can be seen in the embodiment of fig. 3 of the present disclosure, and the description thereof will not be repeated here.
In some embodiments, the labeling module 702 includes: the device comprises a labeling unit and a labeling sequence acquisition unit. The labeling unit is used for labeling the fusion attribute of the operator by using a preset labeling value according to the first sequence to obtain the labeling value of the operator; the labeling sequence obtaining unit is used for obtaining a second sequence according to labeling values of a plurality of operators.
It should be noted that, in some embodiments, the labeling module may obtain the second sequence by labeling the fusion attribute of the operator according to the first sequence using a preset labeling model. The labeling model is a model which is built based on a conditional random field and obtained through training. The training process of the labeling model can be seen in the embodiment of fig. 4 of the present disclosure, and will not be repeated here.
In some embodiments, the determining module 703 includes: a target operator determining unit and a target subgraph determining unit. The target operator determining unit is used for determining an operator corresponding to the first value in the second sequence as a target operator; and the target sub-graph determining unit is used for determining a sub-graph with a target operator in the sub-graphs as a target sub-graph. The first value is a preset labeling value of the fusion attribute, the first value is used for representing that the corresponding operator has the characteristic of fusion with the adjacent operator, and correspondingly, the preset labeling value of the fusion attribute also comprises a second value, and the second value is used for representing that the corresponding operator does not have the characteristic of fusion with the adjacent operator.
In some embodiments, a pattern recognition apparatus according to an embodiment of the present disclosure may further include: the system comprises a target sub-graph identification unit, an operator fusion unit, an execution strategy acquisition unit, an instruction generation unit and an instruction sending unit. Specifically, the target sub-graph recognition unit is used for determining a target sub-graph in a calculation graph to be recognized according to a target fusion mode set, wherein the target fusion mode set is obtained by updating the fusion mode set for a plurality of times; the operator fusion unit is used for fusing at least two operators in the target subgraph to obtain a fusion operator; an execution policy obtaining unit, configured to obtain an execution policy of the target sub-graph, where the execution policy at least includes an execution method of a fusion operator (when an unfused or unfused operator exists in the target sub-graph, the execution policy further includes an execution method of the unfused or unfused operator); the instruction generation unit is used for generating an instruction corresponding to the target subgraph according to the execution strategy of the target subgraph; and the instruction sending unit is used for sending the instruction corresponding to the target subgraph to the designated processing core so that the designated processing core can execute the calculation task corresponding to the target subgraph according to the instruction.
Fig. 8 is a block diagram of a compiler provided in an embodiment of the present disclosure.
Referring to fig. 8, an embodiment of the present disclosure provides a compiler 800 including: at least one pattern recognition means 801.
The pattern recognition device 801 adopts any one of the pattern recognition apparatuses in the embodiments of the present disclosure and is used to implement any one of the pattern recognition methods in the embodiments of the present disclosure.
In some embodiments, compiler 800 includes a pattern recognition device 801. The pattern recognition device 801 specifically includes: the device comprises a generating module, a labeling module, a determining module and an updating module. Specifically, the generating module is used for generating a first sequence according to the execution sequence of a plurality of operators in the designated calculation graph and the characteristic information of the plurality of operators; the labeling module is used for labeling the fusion attribute of the operator according to the first sequence to obtain a second sequence; the determining module is used for determining a target subgraph to be fused from a plurality of subgraphs of the designated calculation graph according to the second sequence; and the updating module is used for updating the fusion mode set based on the target fusion mode under the condition that the fusion mode of the target subgraph comprises the target fusion mode which does not belong to the fusion mode set.
It should be understood that the present disclosure is not limited to the particular arrangements and processes described in the foregoing embodiments and illustrated in the drawings. For convenience and brevity of description, detailed descriptions of known methods are omitted herein, and specific working processes of the systems, modules and units described above may refer to corresponding processes in the foregoing method embodiments, which are not repeated herein.
Fig. 9 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Referring to fig. 9, an embodiment of the present disclosure provides an electronic device including: at least one processor 901; and a memory 902 communicatively coupled to the at least one processor 901; wherein the memory 902 stores one or more computer programs executable by the at least one processor 901, the one or more computer programs being executable by the at least one processor 901 to enable the at least one processor 901 to perform the pattern recognition method described above.
In some embodiments, the electronic device may be a brain-like chip. Because such a chip may employ vectorized computation, parameters such as the weight information of a neural network model need to be loaded from an external memory, for example a double data rate (DDR) synchronous dynamic random access memory. Batch processing in the embodiments of the present disclosure therefore achieves high operation efficiency.
Further, embodiments of the present disclosure provide a computer readable medium having a computer program stored thereon, wherein the computer program, when executed by a processor/processing core, implements the pattern recognition method described above.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, functional modules/units in the apparatus, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. 
Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purpose of limitation. In some instances, it will be apparent to one skilled in the art that features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with other embodiments unless explicitly stated otherwise. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the disclosure as set forth in the appended claims.

Claims (13)

1. A pattern recognition method, comprising:
generating a first sequence according to the execution sequence of a plurality of operators in a designated calculation graph and the characteristic information of the operators, wherein each item of the first sequence corresponds to the characteristic vector of each operator;
labeling the fusion attribute of the operator according to the first sequence to obtain a second sequence, wherein the fusion attribute is used for representing whether the operator has the characteristic of fusion with an adjacent operator;
Determining a target subgraph to be subjected to operator fusion from a plurality of subgraphs of the designated calculation graph according to the second sequence;
and updating the fusion mode set based on the target fusion mode under the condition that the fusion mode of the target subgraph comprises a target fusion mode which does not belong to the fusion mode set, wherein the fusion mode set comprises at least one fusion mode.
2. The pattern recognition method according to claim 1, wherein the labeling the fusion attribute of the operator according to the first sequence to obtain a second sequence includes:
marking the fusion attribute of the operator by using a preset marking value according to the first sequence to obtain the marking value of the operator;
and obtaining the second sequence according to the labeling values of the operators.
3. The pattern recognition method according to claim 1 or 2, wherein the second sequence is obtained by labeling the fusion attribute of the operator according to the first sequence by a preset labeling model;
the labeling model is built based on a conditional random field and is obtained through training.
4. A pattern recognition method according to claim 3, wherein the preset training set data comprises a first training sequence for characterizing operator features and a second training sequence for characterizing operator fusion properties;
the training process of the labeling model comprises the following steps:
constructing an initial labeling model based on a conditional random field;
inputting the first training sequence into the initial labeling model to obtain a predicted labeling sequence, wherein the predicted labeling sequence is a sequence of predicted fusion attributes of the operators, obtained by the initial labeling model according to the first training sequence;
and iteratively updating the labeling model according to the predicted labeling sequence and the second training sequence to obtain a trained labeling model, wherein the trained labeling model has the function of determining, according to the first sequence, the second sequence corresponding to the first sequence.
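The training loop of claim 4 can be illustrated with a stand-in model. A real conditional random field would use forward-backward inference over label transitions; here a simple per-position perceptron plays the role of the labeling model so the predict-compare-update cycle stays visible. All names and the 0/1 label encoding are assumptions, not from the patent.

```python
def predict_labels(weights, first_training_sequence):
    """Predicted labeling sequence: one fusion label (0/1) per feature vector."""
    return [1 if sum(w * x for w, x in zip(weights, vec)) > 0 else 0
            for vec in first_training_sequence]

def train_labeling_model(first_training_sequence, second_training_sequence,
                         dim, epochs=20, lr=0.1):
    weights = [0.0] * dim                                   # initial labeling model
    for _ in range(epochs):
        # Predict a labeling sequence with the current model.
        predicted = predict_labels(weights, first_training_sequence)
        # Compare with the second training sequence and update on mismatches.
        for vec, y_hat, y in zip(first_training_sequence, predicted,
                                 second_training_sequence):
            if y_hat != y:                                  # iterative update
                sign = 1.0 if y == 1 else -1.0
                weights = [w + lr * sign * x for w, x in zip(weights, vec)]
    return weights
```

On a linearly separable toy sequence the loop converges to a model whose predicted labeling sequence matches the second training sequence.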
5. The pattern recognition method of claim 4, wherein after the obtaining of the predicted labeling sequence and before the iterative updating of the labeling model according to the predicted labeling sequence and the second training sequence, the method further comprises:
auditing the predicted labeling sequence to obtain an auditing result;
and updating the predicted labeling sequence according to the auditing result to obtain an updated predicted labeling sequence, and iteratively updating the labeling model by using the updated predicted labeling sequence and the second training sequence to obtain a trained labeling model.
6. The pattern recognition method according to claim 1, wherein the generating a first sequence according to the execution sequence of a plurality of operators in a designated calculation graph and the characteristic information of the operators comprises:
arranging a plurality of operators according to the execution sequence of the operators to obtain a third sequence;
generating a feature vector of the operator according to the feature information of the operator;
and arranging the feature vectors of the operators according to the arrangement sequence of the operators in the third sequence to obtain the first sequence.
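The two-stage construction of claim 6 (third sequence of ordered operators, then first sequence of aligned feature vectors) can be sketched as below, assuming each operator record carries an execution index and feature information; the field names and accessor callbacks are hypothetical.

```python
def build_first_sequence(operators, order_key, feature_fn):
    # Third sequence: the operators arranged by their execution order.
    third_sequence = sorted(operators, key=order_key)
    # First sequence: feature vectors in the same arrangement as the
    # third sequence, so item i describes operator i.
    return [feature_fn(op) for op in third_sequence]
```

For example, two operators given out of order are first sorted by their execution index, and the resulting feature vectors follow that order.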
7. The pattern recognition method according to claim 6, wherein the feature vector of the operator is generated by a preset vector acquisition model according to the feature information of the operator, and the feature vector of the operator includes a first vector for characterizing an independent feature of the operator and a second vector for characterizing an association feature between the operator and an adjacent operator;
wherein the vector acquisition model is obtained through training.
8. The pattern recognition method of claim 6 or 7, wherein the feature information includes at least one of an operator structure, an operator execution mechanism, an operator access feature, and an operator computation feature.
9. The pattern recognition method according to claim 1, wherein the preset labeling value of the fusion attribute includes a first value and a second value, and the first value is used for representing that the corresponding operator has the characteristic of fusion with the adjacent operator, and the second value is used for representing that the corresponding operator does not have the characteristic of fusion with the adjacent operator;
and determining a target subgraph to be subjected to operator fusion from a plurality of subgraphs of the designated calculation graph according to the second sequence, wherein the target subgraph comprises:
determining an operator corresponding to the first value in the second sequence as a target operator;
and determining a sub-graph with the target operator in the plurality of sub-graphs as the target sub-graph.
10. The pattern recognition method according to claim 1, wherein after updating the fusion pattern set based on the target fusion pattern, further comprising:
determining a target subgraph in a calculation graph to be identified according to a target fusion mode set, wherein the target fusion mode set is obtained by updating the fusion mode set a plurality of times;
fusing at least two operators in the target subgraph to obtain a fused operator;
acquiring an execution strategy of the target subgraph, wherein the execution strategy at least comprises an execution method of the fusion operator;
generating an instruction corresponding to the target subgraph according to the execution strategy of the target subgraph;
and sending the instruction corresponding to the target subgraph to a designated processing core so that the designated processing core can execute the calculation task corresponding to the target subgraph according to the instruction.
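The back-end flow of claim 10 (fuse, build an execution strategy, generate an instruction, dispatch to a processing core) can be sketched as follows. The instruction format, the `"single-kernel"` execution method, and the list-based processing-core interface are invented for illustration only.

```python
def fuse_operators(subgraph):
    """Collapse the (at least two) operators of a subgraph into one fused operator."""
    return "+".join(subgraph)

def compile_subgraph(subgraph):
    fused = fuse_operators(subgraph)
    # Execution strategy: at minimum, how the fused operator is executed.
    strategy = {"fused_op": fused, "method": "single-kernel"}
    # Instruction generated from the execution strategy of the subgraph.
    return {"op": strategy["fused_op"], "exec": strategy["method"]}

def dispatch(instruction, core_queue):
    """Stand-in for sending the instruction to a designated processing core."""
    core_queue.append(instruction)
    return core_queue
```

A fused instruction for a `conv`/`relu` pair would then be compiled once and appended to the designated core's queue for execution.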
11. A pattern recognition apparatus, comprising:
the generation module is configured to generate a first sequence according to the execution sequence of a plurality of operators in a designated calculation graph and the characteristic information of the operators, wherein each item of the first sequence corresponds to the characteristic vector of each operator;
the labeling module is configured to label fusion attributes of the operators according to the first sequence to obtain a second sequence, wherein the fusion attributes are used for representing whether the operators have the characteristic of fusion with adjacent operators;
the determining module is configured to determine a target subgraph to be subjected to operator fusion from a plurality of subgraphs of the designated calculation graph according to the second sequence;
the updating module is configured to, in a case where the fusion mode of the target subgraph comprises a target fusion mode which does not belong to a fusion mode set, update the fusion mode set based on the target fusion mode, wherein the fusion mode set comprises at least one fusion mode.
12. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores one or more computer programs executable by the at least one processor to enable the at least one processor to perform the pattern recognition method of any one of claims 1-10.
13. A computer readable medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the pattern recognition method according to any of claims 1-10.
CN202210343505.2A 2022-04-02 2022-04-02 Pattern recognition method and device, electronic equipment and computer readable medium Pending CN116933067A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210343505.2A CN116933067A (en) 2022-04-02 2022-04-02 Pattern recognition method and device, electronic equipment and computer readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210343505.2A CN116933067A (en) 2022-04-02 2022-04-02 Pattern recognition method and device, electronic equipment and computer readable medium

Publications (1)

Publication Number Publication Date
CN116933067A true CN116933067A (en) 2023-10-24

Family

ID=88379397

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210343505.2A Pending CN116933067A (en) 2022-04-02 2022-04-02 Pattern recognition method and device, electronic equipment and computer readable medium

Country Status (1)

Country Link
CN (1) CN116933067A (en)

Similar Documents

Publication Publication Date Title
US10754709B2 (en) Scalable task scheduling systems and methods for cyclic interdependent tasks using semantic analysis
US11500959B2 (en) Multiple output fusion for operations performed in a multi-dimensional array of processing units
Van Dongen et al. Aligning modeled and observed behavior: A compromise between computation complexity and quality
CN112101562A (en) Method and system for realizing machine learning modeling process
CN113168569A (en) Decentralized distributed deep learning
US20210081841A1 (en) Visually creating and monitoring machine learning models
US10509683B2 (en) Modeling resource usage for a job
CN109697500B (en) Data processing method and device, electronic equipment and storage medium
CN113449858A (en) Processing method of neural network model and related equipment
CN111126668A (en) Spark operation time prediction method and device based on graph convolution network
WO2014152800A1 (en) Project planning and debugging from functional decomposition
US20190138929A1 (en) System and method for automatic building of learning machines using learning machines
CN114327844A (en) Memory allocation method, related device and computer readable storage medium
CN114154641A (en) AI model training method and device, computing equipment and storage medium
CN112420125A (en) Molecular attribute prediction method and device, intelligent equipment and terminal
CN114840322A (en) Task scheduling method and device, electronic equipment and storage
CN116933841A (en) Operator fusion method and device, electronic equipment and computer readable medium
WO2021219211A1 (en) Memory allocation in a neural network
CN116382658A (en) Compiling method and device of AI model, computer equipment and storage medium
CN116933067A (en) Pattern recognition method and device, electronic equipment and computer readable medium
CN113420466B (en) Cross-platform automatic performance optimization oriented unit computing component and method
van Stralen et al. Fitness prediction techniques for scenario-based design space exploration
Vinkhuijzen et al. Symbolic model checking with sentential decision diagrams
CN113051080A (en) Computation graph execution method and device and heterogeneous platform
CN115796228B (en) Operator fusion method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination