CN108171134A - Operation action recognition method and device - Google Patents
Operation action recognition method and device
- Publication number
- CN108171134A CN108171134A CN201711387866.2A CN201711387866A CN108171134A CN 108171134 A CN108171134 A CN 108171134A CN 201711387866 A CN201711387866 A CN 201711387866A CN 108171134 A CN108171134 A CN 108171134A
- Authority
- CN
- China
- Prior art keywords
- action
- video
- identified
- operational motion
- deep learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V20/42—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
Abstract
The present invention provides an operation action recognition method and device. The method includes: obtaining a video clip to be recognized, where the clip contains one type of action; and recognizing the action type of the clip according to the clip and a pre-established action recognition model. The method and device provided by the invention can extract information layer by layer, from pixel-level raw data up to abstract semantic concepts; the extracted features have stronger expressive power than hand-engineered features, so operation actions can be recognized quickly and accurately.
Description
Technical field
The present invention relates to the field of machine vision and pattern recognition, and in particular to an operation action recognition method and device.
Background technology
Urban rail transit carries large-scale transport tasks within cities and between cities and their suburbs, and is an important component of modern urban public passenger transport systems, so ensuring its operational safety is particularly important. According to accident statistics for China's rail transit systems, human factors such as operating errors by train drivers account for a major portion of the causes of serious driving accidents. Therefore, monitoring train drivers in real time, detecting their operating errors early and issuing warnings and corrections is of great practical significance for reducing safety accidents and casualties.
Existing driver monitoring systems, however, are mostly used to monitor the driver's physical condition. For example, the dead-man system of high-speed trains can only crudely verify that the driver is alive; some wearable devices measure the driver's electrocardiogram and pulse signals to judge the driver's current working state, but such equipment seriously interferes with the driver's operation of the train. Because human motion is complex and uncertain, action recognition is a considerably harder problem, and at present no mature equipment can directly recognize the operation actions of train drivers.
In general action recognition, most methods are devoted to designing effective motion features and then classifying actions with those features. For example, the dense trajectories (DT) algorithm applies dynamic time warping (DTW) to the motion data, then extracts histograms of oriented gradients (HOG), histograms of optical flow (HOF) and motion boundary histograms (MBH), and finally encodes them to obtain a motion description feature for classification. The recognition accuracy of these methods depends on the quality of the motion features, and different scenes require different optimizations, so their generality is poor. In addition, recognition accuracy also depends on the dimensionality of the acquired data: three-dimensional motion data containing depth information, or data based on binocular vision, records more relative position information than ordinary monocular vision and is therefore easier to recognize, but the required sensors are also more complex and hard to install in a subway driver's cab.
Therefore, how to quickly identify the type of an operation action has become an urgent problem to be solved.
Summary of the invention
In view of the defects in the prior art, the present invention provides an operation action recognition method and device.
In a first aspect, the present invention provides an operation action recognition method, including:
obtaining a video clip to be recognized, where the video clip contains one type of action;
recognizing the action type of the video clip according to the clip and a pre-established action recognition model.
In a second aspect, the present invention provides an operation action recognition device, including:
an acquisition module for obtaining a video clip to be recognized, where the video clip contains one type of action;
a recognition module for recognizing the action type of the video clip according to the clip and a pre-established action recognition model.
The operation action recognition method and device provided by the invention are based on a deep learning network that fuses a 3D convolutional neural network with a long short-term memory (LSTM) network. Compared with traditional action recognition algorithms, deep learning extracts information layer by layer, from pixel-level raw data up to abstract semantic concepts, and the extracted features have stronger expressive power than hand-engineered ones, giving it a prominent advantage in image recognition. Moreover, a 3D convolutional neural network receives consecutive image frames and therefore captures more temporal information than a convolutional network that reads a single image. Finally, the LSTM network can cope with motions performed at different rates. On the basis of achieving motion detection, the network provided by the invention therefore has a clear structure, low complexity and end-to-end operation, greatly simplifying the recognition pipeline.
Description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of the operation action recognition method provided in an embodiment of the present invention;
Fig. 2 is a schematic diagram of the relative positions of the camera and the driver during action recognition provided in an embodiment of the present invention;
Fig. 3 is a schematic flowchart of the operation action recognition method provided in another embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the deep learning network provided in an embodiment of the present invention;
Fig. 5 is a schematic diagram of the 3D convolution process of the deep learning network provided in another embodiment of the present invention;
Fig. 6 is a schematic structural diagram of the operation action recognition device provided in an embodiment of the present invention.
Detailed description of the embodiments
To make the purposes, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
Fig. 1 is a schematic flowchart of the operation action recognition method provided in an embodiment of the present invention. As shown in Fig. 1, the method includes:
S101: obtaining a video clip to be recognized, where the video clip contains one type of action;
S102: recognizing the action type in the video clip according to the clip and a pre-established action recognition model.
Specifically, Fig. 2 is a schematic diagram of the relative positions of the camera and the driver during action recognition. As shown in Fig. 2, either a colour camera or an infrared vision sensor can be used in the embodiment of the present invention to capture the subway driver's working video; since the subway environment is dark, an infrared vision sensor is preferred in the present invention.
During acquisition, the person 2 is 0.8-1.2 m from the camera 1. To cope with illumination changes in the subway driver's cab, the camera 1 is a single infrared camera with a lens focal length of 55 mm and a shooting angle of 60°-90°; during shooting, the resolution of the video captured by the camera 1 is required to be at least 640*480.
A single infrared camera installed in the subway train shoots the driver's working video, and the captured video is processed to obtain a video clip to be recognized, where the clip contains one type of action.
In practice, to recognize the action type in a video, the clip to be recognized is input into the pre-established action recognition model, and the server computes and outputs the action type in the clip. The embodiment of the present invention builds a model that fuses a 3D convolutional neural network with an LSTM network: the 3D convolutional network receives the video input, while the LSTM extends the model's tolerance to actions performed at different rates. Finally, the action type of the video to be recognized is identified from the clip and the constructed action recognition model.
The operation action recognition method provided by the invention extracts information layer by layer, from pixel-level raw data up to abstract semantic concepts; the extracted features have stronger expressive power than hand-engineered features, so operation actions can be recognized quickly and accurately.
Optionally, the action recognition model is established as follows:
selecting videos of different types of operation actions from the acquired video to build an operation action database;
training the pre-established deep learning network model on the operation action database to determine the action recognition model.
On the basis of the above embodiment, Fig. 3 is a schematic flowchart of the operation action recognition method provided in another embodiment of the present invention. To recognize the action type in a video clip, the action recognition model must be established in advance, specifically as follows:
A single infrared camera continuously shoots the subway driver's working video, acquiring at least one week of footage. According to the train operation rules, correctly performed relevant operation actions are then screened and cut out, classified into N classes, and used to build the driver operation action database. When screening the video to build the database, each sample in an action class should be a video file, or set of video frames, containing only one action.
When training the network model, the model structure requires the input data, i.e. the sample videos, to be formatted.
For example, a sample i of the database belonging to class j should be a video containing one action; suppose it has a frames in total. It is first divided into ⌊a/16⌋ segments (⌊·⌋ denotes rounding down), each containing 16 frames; if the last segment has fewer than 16 frames, it is discarded. The resolution of each frame is adjusted to 128*128 by linear interpolation, building a 128*128*16 frame stream. Meanwhile, the label j of the sample is one-hot encoded (One-Hot Encoding) into an N*1 vector whose j-th element is 1 and whose other elements are all zero. Each frame stream of sample i is then bound to the label j, so one sample yields ⌊a/16⌋ inputs. During training, 80% of the samples serve as the training set, 10% as the validation set, and 10% as the test set.
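The formatting procedure above (16-frame clips with the short remainder discarded, one-hot labels, an 80/10/10 split) can be sketched in Python. The function names and the use of a shuffled split are illustrative assumptions, not details stated in the patent:

```python
import random

def split_into_clips(num_frames, clip_len=16):
    """Indices of complete 16-frame clips; a trailing clip shorter
    than clip_len is discarded, as the description specifies."""
    n_clips = num_frames // clip_len          # floor(a / 16)
    return [list(range(i * clip_len, (i + 1) * clip_len))
            for i in range(n_clips)]

def one_hot(label_j, num_classes):
    """N*1 label vector: element j is 1, all others are 0."""
    vec = [0] * num_classes
    vec[label_j] = 1
    return vec

def train_val_test_split(samples, seed=0):
    """80% training, 10% validation, 10% test (shuffle is an assumption)."""
    rng = random.Random(seed)
    s = samples[:]
    rng.shuffle(s)
    n = len(s)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    return s[:n_train], s[n_train:n_train + n_val], s[n_train + n_val:]

# A 100-frame sample yields floor(100/16) = 6 clips; 4 frames are dropped.
clips = split_into_clips(100)
```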
Training on the driver operation action database yields a model usable for action classification, i.e. the action recognition model.
The operation action recognition method provided by the invention is based on a deep learning network that fuses a 3D convolutional neural network with an LSTM network. Compared with traditional action recognition algorithms, deep learning extracts information layer by layer, from pixel-level raw data up to abstract semantic concepts, and the extracted features have stronger expressive power than hand-engineered ones, giving it a prominent advantage in image recognition. Moreover, a 3D convolutional neural network receives consecutive image frames and therefore captures more temporal information than a convolutional network that reads a single image. Finally, the LSTM network can cope with motions performed at different rates. On the basis of achieving motion detection, the network provided by the invention therefore has a clear structure, low complexity and end-to-end operation, greatly simplifying the recognition pipeline.
Optionally, the deep learning network model includes a 3D convolutional neural network and an LSTM network. Optionally, the concrete structure of the deep learning network model includes: multiple convolutional layers, multiple pooling layers, a fully connected layer, a long short-term memory layer and a Softmax output layer.
On the basis of the above embodiment, Fig. 4 is a schematic structural diagram of the deep learning network provided in an embodiment of the present invention, and Fig. 5 is a schematic diagram of the 3D convolution process of the deep learning network provided in another embodiment. With reference to Figs. 4 and 5, a specific example illustrates the training of the deep learning network model. The network model contains 8 convolutional layers (1-8), 5 pooling layers (9-13), 1 fully connected layer (14), 1 long short-term memory layer (15) and 1 Softmax output layer (16).
The specific configuration of each layer is as follows:
Conv1→Pool1→Conv2→Pool2→Conv3a→Conv3b→Pool3→Conv4a→Conv4b→
Pool4→Conv5a→Conv5b→Pool5→fc6→lstm7→Softmax
Convolutional layer 1 receives a 128*128*16*1 input, where 128*128 is the width and height of the input images, 16 is the number of consecutive frames, and 1 indicates a single-channel picture. The convolution kernel size is 3*3*3; the weights are initialized from a normal distribution with mean 0 and variance 1; the stride is 1; the input boundary is zero-padded; and the activation function is the ReLU function, whose formula is as follows:
f(x) = max(0, x)
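A minimal illustration of the ReLU activation defined above, which zeroes negative activations and passes positive ones through unchanged:

```python
def relu(x):
    """f(x) = max(0, x)."""
    return max(0.0, x)

# Applied elementwise to a small activation vector.
activations = [relu(v) for v in (-3.0, 0.0, 2.5)]
```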
For an ordinary convolutional layer the input is a two-dimensional array, so the output of a single convolution kernel is a single feature map, which cannot finely extract features along the time dimension. Unlike an ordinary convolutional neural network, the convolution kernels of this network are three-dimensional; in the convolution process shown in Fig. 5, a kernel receives and processes several consecutive frames at once, capturing both the temporal and the spatial information of the sample, and the output, a set of feature maps, is called a feature volume. Finally, convolutional layer 1 outputs 64 feature volumes of 128*128*16*1.
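The 3D convolution described above can be illustrated with a naive single-channel sketch (pure Python, no padding, stride 1; all names and sizes here are illustrative, not the network's actual configuration):

```python
def conv3d_valid(volume, kernel):
    """Naive single-channel 3D convolution (no padding, stride 1).
    Each output value mixes a spatio-temporal neighbourhood, which is
    what lets a 3D kernel see several frames at once."""
    D, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for d in range(D - kd + 1):
        plane = []
        for i in range(H - kh + 1):
            row = []
            for j in range(W - kw + 1):
                acc = 0.0
                for a in range(kd):
                    for b in range(kh):
                        for c in range(kw):
                            acc += volume[d + a][i + b][j + c] * kernel[a][b][c]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out

# A 4-frame 5x5 input of ones with a 3*3*3 averaging kernel yields
# a 2-frame 3x3 feature volume of ones.
vol = [[[1.0] * 5 for _ in range(5)] for _ in range(4)]
ker = [[[1.0 / 27] * 3 for _ in range(3)] for _ in range(3)]
feat = conv3d_valid(vol, ker)
```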
Pooling layer 9 receives the 64 feature volumes of 128*128*16*1. Similar to the convolution process, the pooling kernel is three-dimensional, of size 2*2*1; it receives one feature volume at a time and performs max pooling. Pooling layer 9 therefore outputs 64 feature volumes of 64*64*16*1.
Convolutional layer 2 receives the 64 feature volumes of 64*64*16*1. The kernel size is 3*3*3; the weights are initialized from a normal distribution with mean 0 and variance 1; the stride is 1; the input boundary is zero-padded; the activation function is ReLU; and the final output is 128 feature volumes of 64*64*16*1.
Pooling layer 10 receives the 128 feature volumes of 64*64*16*1. The pooling kernel size is 2*2*2 and max pooling is performed, so pooling layer 10 outputs 128 feature volumes of 32*32*8*1.
Convolutional layer 3 receives the 128 feature volumes of 32*32*8*1. With the same configuration as above (3*3*3 kernels, normally initialized weights with mean 0 and variance 1, stride 1, zero-padded boundaries, ReLU activation), it outputs 256 feature volumes of 32*32*8*1.
Convolutional layer 4 receives the 256 feature volumes of 32*32*8*1 and, with the same configuration, outputs 256 feature volumes of 32*32*8*1.
Pooling layer 11 receives the 256 feature volumes of 32*32*8*1; with a 2*2*2 max-pooling kernel it outputs 256 feature volumes of 16*16*4*1.
Convolutional layer 5 receives the 256 feature volumes of 16*16*4*1 and, with the same configuration, outputs 512 feature volumes of 16*16*4*1.
Convolutional layer 6 receives the 512 feature volumes of 16*16*4*1 and, with the same configuration, outputs 512 feature volumes of 16*16*4*1.
Pooling layer 12 receives the 512 feature volumes of 16*16*4*1; with a 2*2*2 max-pooling kernel it outputs 512 feature volumes of 8*8*2*1.
Convolutional layer 7 receives the 512 feature volumes of 8*8*2*1 and, with the same configuration, outputs 512 feature volumes of 8*8*2*1.
Convolutional layer 8 receives the 512 feature volumes of 8*8*2*1 and, with the same configuration, outputs 512 feature volumes of 8*8*2*1.
Pooling layer 13 receives the 512 feature volumes of 8*8*2*1; with a 2*2*2 max-pooling kernel it outputs 512 feature volumes of 4*4*1*1.
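The feature-volume sizes reported for layers 1-13 can be checked with a small shape trace. The stated output sizes imply that each pooling layer uses a stride equal to its kernel size; that stride, and the same-padding behaviour of the 3*3*3 convolutions, are assumptions inferred from the listed dimensions:

```python
def pool(shape, kernel):
    """Max pooling with stride equal to the kernel size divides each
    spatial/temporal dimension by the corresponding kernel dimension."""
    return tuple(s // k for s, k in zip(shape, kernel))

# (width, height, frames); the zero-padded 3*3*3 convolutions preserve
# the shape, so only the pooling layers change it.
shape = (128, 128, 16)            # input frame stream
shape = pool(shape, (2, 2, 1))    # Pool1 -> (64, 64, 16)
shape = pool(shape, (2, 2, 2))    # Pool2 -> (32, 32, 8)
shape = pool(shape, (2, 2, 2))    # Pool3 -> (16, 16, 4)
shape = pool(shape, (2, 2, 2))    # Pool4 -> (8, 8, 2)
shape = pool(shape, (2, 2, 2))    # Pool5 -> (4, 4, 1)
```

The final (4, 4, 1) shape matches the 4*4*1*1 feature volumes that pooling layer 13 feeds to the fully connected layer.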
Fully connected layer 14 receives the 512 feature volumes of 4*4*1*1 as input. It has 4096 nodes; the weights are initialized from a normal distribution with mean 0 and variance 1, and the ReLU activation function is used. Fully connected layer 14 outputs 4096 feature values.
Long short-term memory layer 15 receives the 4096 feature values as input. It contains 4096 units, each with an input gate, a forget gate and an output gate, and outputs 1000 feature values to the Softmax layer 16; the weights are initialized from a normal distribution with mean 0 and variance 1. Although a 3D convolutional layer can receive temporal input, the judgements it can make along the time axis are relatively fixed, so it handles actions with unstable rates poorly. Long short-term memory, by contrast, is a kind of recurrent neural network over time that can process and predict events in a time series with relatively long and variable intervals and delays. The long short-term memory layer therefore outputs 1000 feature values to Softmax for action classification.
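The gating mechanism described above can be sketched for a single scalar LSTM unit; the weight values and their names are illustrative assumptions, not the patent's parameters:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One step of a single-unit LSTM cell: the input gate i, forget
    gate f and output gate o control how much of the new candidate g
    enters the cell state c and how much of c is exposed as output h."""
    i = sigmoid(w['wi'] * x + w['ui'] * h_prev + w['bi'])
    f = sigmoid(w['wf'] * x + w['uf'] * h_prev + w['bf'])
    o = sigmoid(w['wo'] * x + w['uo'] * h_prev + w['bo'])
    g = math.tanh(w['wg'] * x + w['ug'] * h_prev + w['bg'])
    c = f * c_prev + i * g        # cell state keeps long-range context
    h = o * math.tanh(c)          # hidden state is the unit's output
    return h, c

# Run a short feature sequence through the unit (weights are arbitrary).
w = {k: 0.5 for k in ('wi', 'ui', 'bi', 'wf', 'uf', 'bf',
                      'wo', 'uo', 'bo', 'wg', 'ug', 'bg')}
h, c = 0.0, 0.0
for x in (1.0, 0.5, -0.5):
    h, c = lstm_step(x, h, c, w)
```

Because the cell state carries information across steps with a learnable forget gate, the unit can respond to the same action performed quickly or slowly, which is the rate tolerance claimed above.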
The Softmax layer 16 has N nodes, each corresponding to one action type, and outputs the probability that the target belongs to that class. For node n, the Softmax formula is:
Pn = exp(yn) / Σk exp(yk), k = 1, ..., N
where the Softmax output Pn is the probability that the sample belongs to the n-th class, and yn is the value that node n obtains from the previous layer of the network.
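The Softmax computation can be sketched as follows; the max-shift is a standard numerical-stability step, an assumption not spelled out in the text:

```python
import math

def softmax(y):
    """Probability of each class n: exp(y_n) / sum_k exp(y_k),
    shifted by max(y) so the exponentials cannot overflow."""
    m = max(y)
    exps = [math.exp(v - m) for v in y]
    total = sum(exps)
    return [e / total for e in exps]

# The largest input score receives the largest probability.
probs = softmax([2.0, 1.0, 0.1])
```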
During training, a cross-entropy loss function is used. Taking numerical stability into account, the softmax loss for a sample i whose correct class is j is:
Li = -log(pij)
where pij is the model's predicted probability that sample i belongs to class j. If the model outputs pij = 1, the classification is correct and the sample contributes nothing to the loss. But if the classification is wrong, pij is less than 1 and the loss increases; training therefore optimizes the weights so that pij tends to 1, reducing the loss. Before training, the weights are randomly generated, so every class has probability 1/N and, without regularization, the loss approaches log N.
After an L1 regularization penalty over all samples is introduced, the loss function becomes:
L = -(1/B) Σi log(pij) + λ Σ|w|
where the second sum runs over all network weights. Training uses stochastic gradient descent (SGD); B is the batch size, with 30 samples per batch; the learning rate starts at 0.003 and is then halved after every 100,000 iterations, and each iteration back-propagates to update the weights of every layer of the network. The final gradient direction obtained from the loss function, the network output minus the one-hot label, is:
∂L/∂y = PN - Pi,N
where Pi,N is the one-hot label vector of sample i, of dimension N*1, with the j-th element 1 and the other elements 0, and PN is the network model's output probability of sample i over the N classes. Training stops once the variation of the loss stabilizes.
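The training quantities described above (the per-sample cross-entropy loss, the L1 penalty and the halving learning-rate schedule) can be sketched as follows; the regularization strength `lam` is an illustrative value, since the text does not state it:

```python
import math

def cross_entropy(probs, j):
    """-log of the predicted probability of the correct class j:
    zero when that probability is 1, growing as it falls."""
    return -math.log(probs[j])

def l1_penalty(weights, lam=1e-4):
    """L1 regularization adds lam * sum(|w|) to the loss."""
    return lam * sum(abs(w) for w in weights)

def learning_rate(iteration, base=0.003, halve_every=100_000):
    """Initial rate 0.003, halved after every 100,000 iterations."""
    return base * (0.5 ** (iteration // halve_every))

# An untrained N-class model outputs uniform probabilities 1/N,
# so the unregularized loss starts near log(N).
N = 5
loss0 = cross_entropy([1.0 / N] * N, j=2)
```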
After training is complete, the model can be used for action recognition. First, an infrared camera captures a segment of action, which is then input into the action recognition model; the model outputs a judgement, either a particular action class in the action library or "not any action in the library", thereby accomplishing action recognition.
Optionally, the videos of the different types of operation actions are obtained as follows: the original video is divided into a set of segments, each consisting of 16 consecutive frames, which are then input into the deep learning network model in sequence. The video contains both the temporal information of the sequence and the spatial position of the subject performing the action in each picture.
On the basis of the above embodiment, information such as the driver's spatial position in the picture helps the model handle actions performed at different movement rates, yielding an accurate action recognition result.
Optionally, the action types include at least: pointing (point-and-call) operation, pushing operation, pulling operation, safety-check operation and gesture operation.
On the basis of the above embodiment, the action types include at least the driver's pointing operation, pushing operation, pulling operation, safety-check operation and gesture operation, and these action types are stored in the operation action database.
The operation action recognition method provided in the embodiment of the present invention is based on a deep learning network that fuses a 3D convolutional neural network with an LSTM network. Compared with traditional action recognition algorithms, deep learning extracts information layer by layer, from pixel-level raw data up to abstract semantic concepts, and the extracted features have stronger expressive power than hand-engineered ones, giving it a prominent advantage in image recognition. Moreover, a 3D convolutional neural network receives consecutive image frames and therefore captures more temporal information than a convolutional network that reads a single image. Finally, the LSTM network can cope with motions performed at different rates. On the basis of achieving motion detection, the network provided by the invention therefore has a clear structure, low complexity and end-to-end operation, greatly simplifying the recognition pipeline.
Fig. 6 is a schematic structural diagram of the operation action recognition device provided in an embodiment of the present invention. As shown in Fig. 6, the device includes an acquisition module 10 and a recognition module 20, wherein:
the acquisition module 10 is used to obtain a video clip to be recognized, where the video clip contains one type of action;
the recognition module 20 is used to recognize the action type of the video clip according to the clip and the pre-established action recognition model.
The operation action recognition device provided in the embodiment of the present invention includes the acquisition module 10 and the recognition module 20: the acquisition module 10 obtains a video clip to be recognized containing one type of action, and the recognition module 20 recognizes the action type of the clip according to the clip and the pre-established action recognition model.
The operation action recognition device provided by the invention extracts information layer by layer, from pixel-level raw data up to abstract semantic concepts; the extracted features have stronger expressive power than hand-engineered features, so operation actions can be recognized quickly and accurately.
Optionally, the action recognition model is established as follows:
selecting videos of different types of operation actions from the acquired video to build an operation action database;
training the pre-established deep learning network model on the operation action database to determine the action recognition model.
On the basis of above-described embodiment, the flow diagram of operational motion discrimination method shown in Figure 3, to regarding
When type of action in frequency segment is identified, needs to establish action identifying model in advance, it is as follows specifically to establish process:
Using single thermal camera shooting subway driver work video free of discontinuities, the video of at least one week is acquired, then
According to train operation rules, screen and intercept out correct relevant operation action, and be classified as N classes, structure driver operation moves
Make database.Then when screening video structure database, the single sample in each action classification should be only dynamic comprising one
The video file of work or video frame intersection.
In the training network model, due to model structure requirement, it is desirable that by input data, that is, Sample video into row format
Change.
For example, sample i of the sample database belongs to class j and is a video containing one action; assume it has a total of a frames. The video is first divided into ⌊a/16⌋ segments (where ⌊·⌋ denotes rounding down), each containing 16 frames; if the last segment contains fewer than 16 frames, it is discarded. The resolution of every frame is adjusted to 128*128 by linear interpolation, building a 128*128*16 frame stream. Meanwhile, the label j of the sample is one-hot encoded (One-Hot Encoding), generating an N*1 vector whose j-th element is 1 and all other elements are 0. Each frame stream of sample i is then bound to the label j, so one sample becomes ⌊a/16⌋ inputs. During training, 80% of the samples are used as the training set, 10% as the validation set, and 10% as the test set.
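A minimal sketch of this formatting step follows. The function names and the bilinear-resize helper are illustrative assumptions; the patent itself specifies only the ⌊a/16⌋ segmentation, the 128×128 linear-interpolation resize, and the one-hot label:

```python
import numpy as np

def resize_bilinear(img, out_h, out_w):
    """Linear-interpolation resize of a 2-D grayscale frame."""
    in_h, in_w = img.shape
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, in_h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]; wx = (xs - x0)[None, :]
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bot = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

def format_sample(frames, label_j, num_classes):
    """Format one sample video into (frame stream, one-hot label) inputs.

    frames: array of shape (a, H, W).  Yields floor(a/16) segments of
    16 frames each (a trailing short segment is discarded), each frame
    resized to 128x128, every segment bound to the same one-hot label.
    """
    n_segments = frames.shape[0] // 16   # floor(a/16)
    one_hot = np.zeros(num_classes, dtype=np.float32)
    one_hot[label_j] = 1.0               # N-dim vector, j-th element = 1
    inputs = []
    for s in range(n_segments):
        clip = frames[s * 16:(s + 1) * 16]
        clip = np.stack([resize_bilinear(f, 128, 128) for f in clip])
        inputs.append((clip, one_hot))
    return inputs
```

For a 40-frame sample this produces ⌊40/16⌋ = 2 inputs; the remaining 8 frames are discarded, matching the segmentation rule above.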
Training on the driver operational action database thus yields a model usable for action classification, i.e., the action recognition model.
The deep-learning-based operational action recognition device provided by the invention fuses a 3D convolutional neural network with a long short-term memory (LSTM) network. Compared with traditional action recognition algorithms, deep learning can extract information layer by layer, from pixel-level raw data up to abstract semantic concepts; the extracted features have stronger expressive power than hand-engineered features, which gives deep learning a prominent advantage in image recognition. In addition, a 3D convolutional neural network takes consecutive image frames as input, and therefore captures more temporal information than a convolutional neural network that reads only single images. Furthermore, the LSTM network can cope with motion occurring at different rates. The network provided by the invention therefore achieves action detection with a clear structure, low complexity, and end-to-end operation, greatly simplifying the recognition pipeline.
Optionally, the deep learning network model includes a 3D convolutional neural network and a long short-term memory network.
Optionally, the specific structure of the deep learning network model includes: multiple convolutional layers, multiple pooling layers, a fully connected layer, a long short-term memory layer, and a Softmax output layer.
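The claimed structure (convolutional and pooling layers, a fully connected layer, an LSTM layer, and a Softmax output) can be illustrated by tracing tensor shapes through one plausible configuration. The kernel size, padding, and number of conv+pool blocks below are assumptions — the patent does not specify them:

```python
def conv3d_out(shape, k=3, pad=1, stride=1):
    """Spatial output shape of a 3D convolution ('same' for k=3, pad=1)."""
    return tuple((x + 2 * pad - k) // stride + 1 for x in shape)

def pool3d_out(shape, k=2):
    """Spatial output shape of non-overlapping k*k*k pooling."""
    return tuple(x // k for x in shape)

def trace_clip(shape=(16, 128, 128), blocks=4):
    """Trace one 16-frame 128x128 clip through conv+pool blocks,
    returning the shape after each block."""
    history = [shape]
    for _ in range(blocks):
        shape = pool3d_out(conv3d_out(shape))
        history.append(shape)
    return history
```

Under these assumptions a 16×128×128 clip shrinks to 1×8×8 feature maps after four blocks; those would be flattened into the fully connected layer, the LSTM would then consume the per-segment features of the ⌊a/16⌋ segments in order, and the Softmax layer would output an N-way class distribution.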
The specific training process is introduced in the method embodiment and is not detailed again here.
The method and device provided by the invention work best when recognizing the actions of a person directly in front of the camera, and in-cab monitoring of a subway driver is exactly the kind of scene this algorithm handles well. In addition, the method and device are based on monocular infrared vision, so the equipment architecture is simple and easy to retrofit into a subway driver's cab. Since subway driver operation recognition systems are still immature at this stage, the method and device provided by the invention can supply a solution for subway driver violation recognition systems.
Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and of course also by hardware. Based on this understanding, the above technical solution, or the part of it that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform the method described in each embodiment or in certain parts of the embodiments.
The device and system embodiments described above are merely schematic. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of this embodiment. Those of ordinary skill in the art can understand and implement this without creative labor.
Claims (10)
1. An operational action recognition method, characterized by comprising:
obtaining a video clip to be recognized, wherein the video clip to be recognized contains one type of action;
recognizing the type of action in the video clip to be recognized according to the video clip to be recognized and a pre-established action recognition model.
2. The method according to claim 1, characterized in that the action recognition model is established using the following steps:
selecting videos of different types of operational actions from the captured video, and establishing an operational action database;
training a pre-established deep learning network model according to the operational action database, and determining the action recognition model.
3. The method according to claim 2, characterized in that the deep learning network model comprises a 3D convolutional neural network and a long short-term memory network.
4. The method according to claim 3, characterized in that the specific structure of the deep learning network model comprises: multiple convolutional layers, multiple pooling layers, a fully connected layer, a long short-term memory layer, and a Softmax output layer.
5. The method according to claim 2, characterized in that the videos of different types of operational actions are acquired as follows:
the original video is divided into a set of segments of 16 consecutive frames each, which are then sequentially input into the deep learning network model; the video contains temporal information and the spatial position information, within the frames, of the subject performing the action.
6. The method according to claim 2, characterized in that the action types include at least: a pointing operation, a pushing operation, a pulling operation, a safety-check operation, and a gesture operation.
7. An operational action recognition device, characterized by comprising:
an acquisition module, configured to obtain a video clip to be recognized, wherein the video clip to be recognized contains one type of action;
a recognition module, configured to recognize the type of action in the video clip to be recognized according to the video clip to be recognized and a pre-established action recognition model.
8. The device according to claim 7, characterized in that the action recognition model is established using the following steps:
selecting videos of different types of operational actions from the captured video, and establishing an operational action database;
training a pre-established deep learning network model according to the operational action database, and determining the action recognition model.
9. The device according to claim 8, characterized in that the deep learning network model comprises a 3D convolutional neural network and a long short-term memory network.
10. The device according to claim 9, characterized in that the specific structure of the deep learning network model comprises: multiple convolutional layers, multiple pooling layers, a fully connected layer, a long short-term memory layer, and a Softmax output layer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711387866.2A CN108171134A (en) | 2017-12-20 | 2017-12-20 | A kind of operational motion discrimination method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108171134A true CN108171134A (en) | 2018-06-15 |
Family
ID=62523168
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108171134A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109164910A (en) * | 2018-07-05 | 2019-01-08 | 北京航空航天大学合肥创新研究院 | For the multiple signals neural network architecture design method of electroencephalogram |
CN109782906A (en) * | 2018-12-28 | 2019-05-21 | 深圳云天励飞技术有限公司 | A kind of gesture identification method of advertisement machine, exchange method, device and electronic equipment |
CN110009640A (en) * | 2018-11-20 | 2019-07-12 | 腾讯科技(深圳)有限公司 | Handle method, equipment and the readable medium of heart video |
CN110866427A (en) * | 2018-08-28 | 2020-03-06 | 杭州海康威视数字技术股份有限公司 | Vehicle behavior detection method and device |
CN111460971A (en) * | 2020-03-27 | 2020-07-28 | 北京百度网讯科技有限公司 | Video concept detection method and device and electronic equipment |
WO2021008018A1 (en) * | 2019-07-18 | 2021-01-21 | 平安科技(深圳)有限公司 | Vehicle identification method and device employing artificial intelligence, and program and storage medium |
TWI760769B (en) * | 2020-06-12 | 2022-04-11 | 國立中央大學 | Computing device and method for generating a hand gesture recognition model, and hand gesture recognition device |
CN114943324A (en) * | 2022-05-26 | 2022-08-26 | 中国科学院深圳先进技术研究院 | Neural network training method, human motion recognition method and device, and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106203283A (en) * | 2016-06-30 | 2016-12-07 | 重庆理工大学 | Based on Three dimensional convolution deep neural network and the action identification method of deep video |
CN106909887A (en) * | 2017-01-19 | 2017-06-30 | 南京邮电大学盐城大数据研究院有限公司 | A kind of action identification method based on CNN and SVM |
CN107273800A (en) * | 2017-05-17 | 2017-10-20 | 大连理工大学 | A kind of action identification method of the convolution recurrent neural network based on attention mechanism |
US20190251337A1 (en) * | 2017-02-06 | 2019-08-15 | Tencent Technology (Shenzhen) Company Limited | Facial tracking method and apparatus, storage medium, and electronic device |
Non-Patent Citations (1)
Title |
---|
QIN, Yang et al.: "Combination of 3DCNNs and LSTMs in Action Recognition and Its Application" (3DCNNs与LSTMs在行为识别中的组合及其应用), Measurement & Control Technology (《测控技术》) * |
Legal Events
Date | Code | Title | Description
---|---|---|---
2018-06-15 | PB01 | Publication | Application publication date: 20180615
| SE01 | Entry into force of request for substantive examination |
| RJ01 | Rejection of invention patent application after publication |