CN110378880A - Vision-based cremation machine burning time calculation method - Google Patents
Vision-based cremation machine burning time calculation method
- Publication number
- CN110378880A CN110378880A CN201910585276.3A CN201910585276A CN110378880A CN 110378880 A CN110378880 A CN 110378880A CN 201910585276 A CN201910585276 A CN 201910585276A CN 110378880 A CN110378880 A CN 110378880A
- Authority
- CN
- China
- Prior art keywords
- burning time
- cremation machine
- vision
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Landscapes
- Engineering & Computer Science (AREA)
- Quality & Reliability (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a vision-based cremation machine burning time calculation method comprising the following steps: first, the position of the corpse is fixed and a photograph of the corpse is acquired by a camera; an enhanced Mask-RCNN deep learning network then segments the corpse from the image; finally, a four-layer fully connected neural network judges the relationship between the number of body pixels and the cremator burning time, yielding the optimal burning time. The Mask-RCNN method comprises three stages: feature extraction, region proposal, and prediction. The invention is a contactless burning-time prediction technique: the combustion control program adjusts the cremator burning time according to the judgment of the four-layer fully connected neural network, so that the burning time parameter is set more reasonably.
Description
Technical field
The invention belongs to the technical field of incinerators and relates to a method specially adapted for incinerating human or animal corpses, and in particular to a vision-based cremation machine burning time calculation method.
Background art
Cremator burning time is a special problem. At present, the burning time for corpses of different weights is largely judged manually, and inaccurate judgments often make the burning time too long or too short, with adverse consequences.
The key factor influencing cremator burning time is the weight of the corpse. Because of the particular nature of cremation, contact weighing equipment cannot conveniently be used to weigh the corpse, which makes contactless judgment of corpse weight highly necessary.
Summary of the invention
To solve the above problem of inaccurate cremator burning time judgment, the present invention proposes a vision-based contactless estimation of corpse burning time, a contactless cremator burning time estimation method based on deep learning. A photograph of the corpse is first acquired by a camera; an enhanced Mask-RCNN deep learning network then segments the corpse; finally, a four-layer fully connected neural network judges the relationship between the number of body pixels and the cremator burning time, yielding the optimal burning time.
To achieve the above object, the technical solution adopted by the present invention is a vision-based cremation machine burning time calculation method comprising the following steps:
Step 1: fix the position of the corpse;
Step 2: capture a photograph;
Step 3: determine the region where the corpse is located by the Mask-RCNN method;
Step 4: determine the burning time from the relationship between the corpse pixel count and the corpse combustion time.
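The four steps above can be sketched as a small pipeline. The component functions below are hypothetical stand-ins: the patent does not specify the segmentation threshold or the pixel-to-time mapping, so both are illustrative placeholders only.

```python
# Minimal sketch of the four-step pipeline; all component functions are
# hypothetical stand-ins, not the patent's actual models.

def segment_corpse(image):
    """Stand-in for the enhanced Mask-RCNN segmentation (Step 3):
    returns a binary mask the same shape as the input grayscale image."""
    return [[1 if px > 128 else 0 for px in row] for row in image]

def predict_burning_time(pixel_count):
    """Stand-in for the four-layer fully connected network (Step 4):
    a simple linear proxy from pixel count to minutes (coefficients made up)."""
    return 30.0 + 0.001 * pixel_count

def burning_time_from_photo(image):
    mask = segment_corpse(image)            # Step 3: locate the corpse region
    pixels = sum(sum(row) for row in mask)  # body pixel count
    return predict_burning_time(pixels)     # Step 4: map pixels to time
```

In practice the two stand-ins would be replaced by the trained Mask-RCNN and the four-layer network described later in the embodiment.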
The above Mask-RCNN method comprises three stages: feature extraction, region proposal, and prediction.
In the feature extraction stage, an FPN based on ResNet serves as the backbone network of Mask-RCNN, and a fine basic feature extraction layer and a simplified subsequent feature extraction layer are designed according to the pulmonary nodule images.
The FPN comprises three parts: a bottom-up path, a top-down path, and lateral connections.
In the region proposal stage, the region proposal network (RPN) extracts rectangular candidate regions by sliding a window over the multi-scale feature maps provided by the FPN.
In the prediction stage, RoIAlign converts the features inside each candidate box into a small feature map with fixed spatial dimensions H*W, solving the region mismatch problem caused by the two quantizations in the RoIPool operation.
Preferably, the small feature map is 7*7 for nodule recognition and 14*14 for nodule segmentation.
The above two quantizations are the RoI boundary quantization and the cell boundary quantization.
Compared with the prior art, the present invention has the following advantageous effects:
1. The present invention performs human body segmentation with an enhanced Mask-RCNN deep learning network and then judges the relationship between body pixel count and cremator burning time with a four-layer fully connected neural network, obtaining the optimal cremator burning time.
2. The invention is a contactless burning-time prediction technique: the combustion control program adjusts the cremator burning time according to the above judgment, so that the burning time parameter is set more reasonably.
Description of the drawings
Fig. 1 is the flowchart of the vision-based cremation machine burning time calculation method of the present invention;
Fig. 2 is the flowchart of automatic pulmonary nodule recognition based on Mask-RCNN;
Fig. 3 is the segmentation schematic based on the Mask-RCNN framework;
Fig. 4 is the schematic of the three stages Mask-RCNN establishes for segmentation;
Fig. 5 is the flowchart of the ResNet-based FPN;
Fig. 6 is the architecture diagram of the RPN;
Fig. 7 is the architecture diagram of ResNet-50;
Fig. 8 is the schematic of three ResNet structure optimizations;
Fig. 9 is the schematic of the four-layer fully connected network.
Specific embodiment
The present invention will now be described in further detail with reference to the drawings.
The flow of the vision-based cremation machine burning time calculation method is shown in Fig. 1. To make full use of spatial information and prior knowledge, the present invention uses an improved Mask-RCNN algorithm: a ResNet-based feature pyramid network (FPN), modified according to pulmonary nodule images, serves as the backbone network of Mask-RCNN. Mask-RCNN first uses pixel-level prior knowledge of the nodules to extract a pyramid of feature maps suited to the image, then obtains regions of interest (ROI) from the extended nodule bounding boxes provided by Mask-RCNN recognition, and performs nodule recognition and segmentation; the flowchart of this method is shown in Fig. 2. For simplicity, only the feature map after the backbone is shown in the RPN architecture, but all feature maps have each of the subsequent layers.
The segmentation schematic of the Mask-RCNN framework is shown in Fig. 3. The architecture of Mask-RCNN is based on the convolutional neural network (CNN), a deep neural network composed mainly of various types of layers. The input of a CNN is usually a raw image without complex preprocessing, and each layer outputs feature maps of various sizes and ratios. In general, as layers deepen and receptive fields expand, the extracted features change from low-level to high-level. A CNN uses the back-propagation algorithm to optimize the weights of all weighted layers by minimizing a loss, and the loss function can be designed for different image tasks. Generally, a CNN has three major types of layers: the convolutional layer, the pooling layer, and the fully connected layer.
A traditional neural network establishes connections between input and output by multiplying the input matrix by a weight matrix; this fully connected operation is very time-consuming, so only a limited number of features can be trained. To extract features more efficiently, the convolutional layer uses local connections: it slides a small, fixed-size convolution filter horizontally and vertically over the input matrix and produces linear activation responses in the form of feature maps.
Each neuron in a feature map is connected to a local receptive field of the input matrix and is computed in two steps. In the first step, the receptive field is multiplied element-wise with the corresponding positions of the filter, and the products are summed to obtain the result; all neurons share the weights of the filter. To extract features in more ways, multiple filters in a convolutional layer can operate in parallel to produce multiple feature maps, each extracting local features. To prevent vanishing or exploding gradients and to improve the efficiency of back-propagation in the CNN, the feature maps produced by the filters are fed to a rectified linear unit (ReLU). The feature maps processed by ReLU are then input to the next type of layer (the pooling layer), which subsamples by replacing each small neighborhood with an overall statistic, in this case the maximum (so-called max pooling). The pooling layer gives the CNN invariance to translation, rotation, and scaling; at the same time, the reduction of the feature vector size prevents network overfitting and further lowers computational complexity, though it also causes some loss of detail. Another important layer is the fully connected layer, which encodes the high-level features carrying strong semantic information suited to different image tasks; learning and adjusting the weights of the convolutional and fully connected layers during training allows the input data to be represented better.
As shown in Fig. 4, Mask-RCNN establishes three stages for segmentation: feature extraction, region proposal, and prediction. In the figure, 4(a) is the flowchart of the residual learning block, 4(b) the convolution block used by the present invention, and 4(c) the identity block used by the present invention.
In the first stage, an FPN based on ResNet serves as the backbone network of Mask-RCNN, and a fine basic feature extraction layer and a simplified subsequent feature extraction layer are designed according to the pulmonary nodule images. Recent studies have shown that network depth is of great significance to network learning, but it also has side effects, such as gradient vanishing/explosion and degraded training accuracy. The former problem is solved by normalized initialization and intermediate normalization layers, and ResNet solves the latter by applying residual blocks to the CNN, as shown in Fig. 4.
The residual block is defined as
y = F(x) + x    (1.1)
where the input feature map is denoted x and the output of the merge layer is denoted y; the output y is obtained by element-wise addition of x and F(x). The output F(x) of three convolutional layers with ReLU thus becomes the residual mapping from x to y. The identity mapping is the shortcut connection between the input feature map and the merge layer, which only performs the identity map from x to y. Because of this reference to the input, the weights of the convolutional layers are easier to learn than without the shortcut. In ResNet, nearly all weighted layers except the first convolutional layer consist of residual blocks. Because of the required receptive field size, blocks must extend the regional range, so there are two types of blocks, the convolution block (convolutional block) and the identity block (identity block). A convolution block enlarges the regional range by increasing the stride of its first convolutional layer from 1 to 2; to match dimensions, the shortcut connection also uses stride 2. The identity block is the standard block used in ResNet, with identical input and output dimensions. In Fig. 4 (b) and (c), the kernel sizes of the first, second, and third convolutional layers are 1 × 1, 3 × 3, and 1 × 1; the first and third convolutional layers reduce the feature dimension (here, the number of channels), so that the second convolutional layer has fewer weights to learn. This design is reliable and has been shown to improve accuracy effectively. Table 1 below gives the details of the ResNet structure used by the present invention.
Table 1
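The residual mapping of Eq. (1.1) and the bottleneck identity block can be sketched numerically. To keep the sketch short, all three layers are 1 × 1 convolutions (a 1 × 1 convolution over a C-channel map is a matrix multiply along the channel axis); the real block's middle layer is 3 × 3, and all shapes and weights here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, w):
    """x: (H, W, C_in), w: (C_in, C_out) -> (H, W, C_out)."""
    return x @ w

def relu(x):
    return np.maximum(x, 0.0)

def identity_block(x, w1, w2, w3):
    """Bottleneck identity block: reduce channels, transform, restore,
    then add the shortcut. Input and output dimensions are identical."""
    f = relu(conv1x1(x, w1))   # 1x1: reduce channel dimension
    f = relu(conv1x1(f, w2))   # middle layer (3x3 in the real block)
    f = conv1x1(f, w3)         # 1x1: restore channel dimension
    return relu(f + x)         # residual addition y = F(x) + x

x = rng.standard_normal((4, 4, 8))
w1 = rng.standard_normal((8, 2)) * 0.1   # 8 -> 2 channels (bottleneck)
w2 = rng.standard_normal((2, 2)) * 0.1
w3 = rng.standard_normal((2, 8)) * 0.1   # 2 -> 8 channels
y = identity_block(x, w1, w2, w3)
```

Setting the last weight matrix to zero makes F(x) vanish, and the block reduces to ReLU of the shortcut alone, which is the sense in which the identity mapping is "easy to learn".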
Although ResNet has strong expressive power, it has been shown that multi-scale image tasks obtain better performance with a pyramid representation. The method therefore uses multi-scale features extracted by an FPN based on ResNet.
The FPN comprises three parts: a bottom-up path, a top-down path, and lateral connections, as shown in Fig. 5. The bottom-up path is the feedforward computation of ResNet; each stage has the same scale, and the deeper the layers, the better the feature representation, so the activation output of the last residual block of each stage is used to form the corresponding feature map of each pyramid level.
The top-down path first upsamples the feature map with coarser spatial resolution by a factor of two; the semantic information extracted at a higher pyramid level is stronger than that extracted at a lower one. The upsampled features are then merged by element-wise addition with the corresponding bottom-up features, which pass through a 1 × 1 convolutional layer (reducing the channel number of all pyramid levels to 256). This operation, called the lateral connection, is performed at each stage of the top-down path, as shown in the figure. Then a 3 × 3 convolutional layer is added to each merged feature map to reduce the aliasing effect produced by upsampling. The outputs of all 3 × 3 convolutional layers constitute the final feature map set of our ResNet-based FPN backbone, denoted {P2, P3, P4, P5}; note that P6 is obtained from P5 by max pooling with stride 2. Bottom-level feature maps such as these are used to detect smaller targets, while higher-level feature maps detect larger targets.
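One FPN lateral connection, as described above, can be sketched as follows: upsample the coarser top-down map by two (nearest neighbor here, for brevity), project the bottom-up map to 256 channels with a 1 × 1 convolution, and add element-wise. Shapes and weights are illustrative, not from the patent.

```python
import numpy as np

rng = np.random.default_rng(1)

def upsample2x(x):
    """Nearest-neighbor 2x upsampling of an (H, W, C) feature map."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def lateral_merge(top_down, bottom_up, w_lat):
    """top_down: (H, W, 256); bottom_up: (2H, 2W, C); w_lat: (C, 256).
    The 1x1 conv is the matrix multiply along the channel axis."""
    return upsample2x(top_down) + bottom_up @ w_lat  # element-wise addition

p_top = rng.standard_normal((2, 2, 256))      # coarser pyramid level
c_bottom = rng.standard_normal((4, 4, 512))   # bottom-up stage output
w = rng.standard_normal((512, 256)) * 0.01    # 1x1 conv to 256 channels
p_merged = lateral_merge(p_top, c_bottom, w)
```

In the full FPN, a 3 × 3 convolution would then be applied to `p_merged` to reduce upsampling aliasing.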
In the second stage, the region proposal network (RPN) extracts rectangular candidate regions by sliding a window over the multi-scale feature maps provided by the FPN.
The RPN is a small network, as shown in Fig. 6; its rectangular window slides over the feature pyramid of the above backbone, and the pyramid level on which the sliding operation is performed is determined by the ROI size. Each sliding window is then mapped to a 512-dimensional vector by a convolutional layer with 3 × 3 filters, and this vector is fed into two parallel 1 × 1 convolutional layers, one branch for classification and the other for bounding box regression. K candidate boxes are extracted at each sliding window position, so the bounding box regression layer outputs 4K coordinates and the classification layer outputs 2K scores (the predicted foreground/background probability of each candidate region). The K candidate boxes are parameterized relative to reference boxes; these reference boxes are called anchors and are centered on the sliding window.
Anchor selection is based on the intersection over union (IOU) between anchors and ground truth (GT): for each GT box, the anchor with the highest IOU overlapping it, together with any anchor whose IOU with any GT box exceeds 0.7, is assigned as a positive sample, while anchors whose IOU is below 0.3 are assigned as negative samples. The red box in the figure is the window sliding over the feature pyramid of the above backbone. The loss function of the RPN is defined as
L({p_i}, {t_i}) = (1/N_cls) Σ_i L_cls(p_i, p_i*) + λ (1/N_reg) Σ_i p_i* L_reg(t_i, t_i*)    (1.2)
where p_i denotes the predicted probability that the i-th anchor is a target; the GT label p_i* is 1 if the anchor is a positive sample and 0 otherwise. t_i denotes the parameterized coordinates of the bounding box regression predicted by the RPN, and t_i* denotes the GT coordinates corresponding to a positive anchor. L_cls is the log loss of binary classification (target/non-target), and L_reg is the smooth L1 loss between the predicted box and the GT box. The loss of the RPN is normalized by N_cls and N_reg and weighted by a balance parameter λ. Finally, the RPN adjusts the network weights through training and produces a series of candidate boxes.
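The anchor labeling rule described above (IOU above 0.7 positive, below 0.3 negative, in between ignored) can be sketched directly. Boxes are (x1, y1, x2, y2); the thresholds come from the text, while the example boxes are made up.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def label_anchors(anchors, gt_boxes, hi=0.7, lo=0.3):
    """1 = positive sample, 0 = negative sample, -1 = ignored."""
    labels = []
    for a in anchors:
        best = max(iou(a, g) for g in gt_boxes)
        labels.append(1 if best > hi else 0 if best < lo else -1)
    return labels

gt = [(0, 0, 10, 10)]
anchors = [(0, 0, 10, 10),    # IOU 1.0 -> positive
           (0, 0, 10, 5),     # IOU 0.5 -> ignored
           (20, 20, 30, 30)]  # IOU 0.0 -> negative
print(label_anchors(anchors, gt))  # [1, -1, 0]
```

The full rule also promotes, for each GT box, the single best-overlapping anchor to positive even below the 0.7 threshold; that refinement is omitted here for brevity.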
In the third stage, a technique called RoIAlign converts the features inside each candidate box into a small feature map with fixed spatial dimensions H*W (7*7 for nodule recognition and, owing to the required segmentation precision, 14*14 for nodule segmentation). This technique solves the region mismatch problem caused by the two quantizations (the RoI boundary quantization and the cell boundary quantization) in the RoIPool operation. Two branches then complete the nodule bounding box recognition and mask segmentation, as shown in the figure.
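The idea behind RoIAlign can be sketched as follows: sample the feature map at exact, non-integer positions with bilinear interpolation instead of rounding ("quantizing") the RoI boundaries as RoIPool does. For brevity this sketch takes one sample point per output cell; real RoIAlign averages several sample points per cell, and the feature map and RoI here are made up.

```python
import numpy as np

def bilinear(fm, y, x):
    """Bilinearly interpolate feature map fm at continuous position (y, x)."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, fm.shape[0] - 1), min(x0 + 1, fm.shape[1] - 1)
    dy, dx = y - y0, x - x0
    return (fm[y0, x0] * (1 - dy) * (1 - dx) + fm[y0, x1] * (1 - dy) * dx
            + fm[y1, x0] * dy * (1 - dx) + fm[y1, x1] * dy * dx)

def roi_align(fm, roi, out_h, out_w):
    """roi = (y1, x1, y2, x2) in continuous feature-map coordinates;
    one bilinear sample at the center of each output cell."""
    y1, x1, y2, x2 = roi
    bin_h, bin_w = (y2 - y1) / out_h, (x2 - x1) / out_w
    return np.array([[bilinear(fm, y1 + (i + 0.5) * bin_h,
                                   x1 + (j + 0.5) * bin_w)
                      for j in range(out_w)] for i in range(out_h)])

fm = np.arange(36, dtype=float).reshape(6, 6)
crop = roi_align(fm, (0.5, 0.5, 4.5, 4.5), 2, 2)
```

Because no coordinate is ever rounded, fractional RoI boundaries are preserved exactly, which is what removes the two-quantization mismatch of RoIPool.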
In the first branch, each feature map is fed to two consecutive fully connected layers, followed by two parallel fully connected layers. One of these performs bounding box regression and outputs four values, the encoded position adjustment of the bounding box of each candidate region for the N object classes; the other classifies the box and outputs probability estimates for N+1 classes (the extra 1 being background). The classification loss L_cls is the log loss over the N+1 classes, and the regression loss L_reg is the smooth L1 loss between the bounding box generated by the RPN and the GT box. In this experiment the only target class is the pulmonary nodule, so N is 1.
The second branch, parallel to the first, is composed of RoIAlign and a small FCN consisting only of several stacked convolutional layers and one transposed convolution. Thanks to the pixel-to-pixel correspondence, the FCN can be trained for semantic segmentation. Note that since the input of the small FCN is composed of the low-resolution feature maps from RoIAlign, and the stride of its transposed convolution is set to 2 to control the number of weights, the pixel-to-pixel correspondence is actually a mapping from one feature pixel to four pixels of the fixed-size GT mask of the RoI (resized to 28*28).
The mask branch predicts a binary pulmonary nodule segmentation mask for each RoI of the RPN. For each mask, a sigmoid is applied at each pixel of the last feature map of the small FCN:
S(ω_i,j) = 1 / (1 + e^(-ω_i,j))    (1.3)
where ω_i,j denotes the value at position (i, j) of the feature map, and L_mask denotes the average binary log loss over all S(ω_i,j).
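The per-pixel sigmoid of Eq. (1.3) and the average binary log loss L_mask can be checked on a tiny example. The 2 × 2 "feature map" and ground-truth mask below are made up for illustration.

```python
import numpy as np

def sigmoid(w):
    """Eq. (1.3): per-pixel sigmoid of the last-layer feature map."""
    return 1.0 / (1.0 + np.exp(-w))

def mask_loss(logits, gt_mask):
    """L_mask: average binary cross-entropy over all pixels of one RoI mask."""
    p = sigmoid(logits)
    return -np.mean(gt_mask * np.log(p) + (1 - gt_mask) * np.log(1 - p))

logits = np.array([[4.0, -4.0], [4.0, -4.0]])  # last-layer feature map values
gt = np.array([[1.0, 0.0], [1.0, 0.0]])        # ground-truth binary mask
loss = mask_loss(logits, gt)                   # small: predictions match GT
```

Flipping the ground-truth mask makes the same predictions maximally wrong and the loss large, which is the gradient signal the mask branch trains on.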
The network is therefore trained with a multi-task loss defined for each sampled RoI:
L = L_cls + L_box + L_mask    (1.4)
The Mask-RCNN method is a region-based semantic segmentation technique that can predict object bounding boxes and the corresponding masks. Mask prediction is based on the region proposals provided by the RPN, and the mask loss is added to the multi-task loss, improving the accuracy of the region proposals. Mask prediction and region proposal promote each other and improve the localization accuracy of the pulmonary nodule boundary.
A model tweak is a minor adjustment to the network architecture, such as changing the stride of a particular convolutional layer. Such adjustments usually leave the computational complexity nearly unchanged but can have a non-negligible effect on model accuracy. In the present invention, three ResNet tweaks and DenseNet are used to improve the network model.
The ResNet architecture is now described, especially the modules relevant to the model tweaks. A ResNet network consists of an input stem, four subsequent stages, and a final output layer, as shown in Fig. 7, which also shows the kernel sizes, output channel numbers, and strides (default value 1); pooling layers are annotated similarly. The input stem has a 7 × 7 convolutional layer with 64 output channels and stride 2, followed by a max pooling layer with stride 2. The input stem thus reduces the input width and height and increases the channel number to 64.
Starting from stage 2, each stage begins with a downsampling block (the conv block), followed by several residual blocks (identity blocks). A downsampling block has a path A and a path B. Path A has three convolutional layers with kernel sizes 1 × 1, 3 × 3, and 1 × 1; the first convolutional layer has stride 2, halving the input width and height, and the last convolutional layer has 4 times as many output channels as the first two (the so-called bottleneck structure). Path B uses a 1 × 1 convolution with stride 2 to transform the input dimensions into the output dimensions of path A, so the outputs of the two paths can be summed to give the output of the downsampling block. The residual blocks use only stride-1 convolutions and are otherwise similar to the downsampling block.
The number of residual blocks in each stage can be varied to obtain different ResNet network models, such as ResNet-50 and ResNet-152, where the number denotes the number of convolutional layers in the network.
Two popular ResNet tweaks, referred to as ResNet-B and ResNet-C, are first reviewed, and a new model tweak, ResNet-D, is then introduced.
(1) ResNet-B. The improvement concerns the downsampling residual block of each stage: the downsampling operation is moved from the first 1 × 1 convolutional layer to the second, 3 × 3, convolutional layer. If the downsampling is performed by a 1 × 1 convolution with stride 2, much feature information is lost (by default the input is reduced to 1/4, which can be understood as 3/4 of the feature points taking no part in the computation). Placing the downsampling on the 3 × 3 convolutional layer reduces this loss: even with stride 2, the kernel is large enough to cover almost all positions of the feature map.
(2) ResNet-C. The improvement replaces the 7 × 7 convolutional layer of the input stem in Fig. 7 with three 3 × 3 convolutional layers. This part borrows the ideas of Inception v2 and is mainly motivated by computation: a large kernel costs far more than a small one, and careful calculation shows that the three 3 × 3 convolutional layers of ResNet-C cost little more than the original.
(3) ResNet-D. The improvement changes the 1 × 1 convolution with stride 2 in the downsampling branch of the downsampling residual block to stride 1, and adds a pooling layer in front of it to perform the downsampling. Although the pooling layer also loses information, it at least discards redundancy after a selection (here, for example, averaging), which is much better than a 1 × 1 convolution with stride 2.
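The ResNet-D point above can be illustrated numerically: a 1 × 1 convolution with stride 2 looks at only 1/4 of the feature positions, while a 2 × 2 average pool with stride 2 lets every position contribute before the stride-1 convolution. The 4 × 4 single-channel map below is made up for illustration.

```python
import numpy as np

x = np.arange(16, dtype=float).reshape(4, 4)

# 1x1 conv, stride 2 (with identity weight): keeps only positions
# (0,0), (0,2), (2,0), (2,2); the other 12 values are simply discarded.
strided = x[::2, ::2]

# 2x2 average pooling with stride 2: every input position contributes
# to exactly one output value.
pooled = x.reshape(2, 2, 2, 2).mean(axis=(1, 3))
```

Both outputs are 2 × 2, but the pooled version summarizes all 16 inputs while the strided version reflects only 4 of them.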
Fig. 8 shows the three ResNet structure optimizations: ResNet-B modifies the downsampling block of ResNet, ResNet-C further modifies the input stem, and, most importantly, ResNet-D modifies the downsampling block again.
Suppose the input is an image X_0 passed through an L-layer neural network, in which the non-linear transformation of the i-th layer is denoted H_i(*); H_i(*) can be the composition of several function operations such as BN, ReLU, pooling, or convolution. The feature output of the i-th layer is denoted X_i.
To further optimize the propagation of the information flow, DenseNet proposes the network structure shown in the figure.
As shown, the input of the l-th layer is related not only to the output of layer l-1 but also to the outputs of all preceding layers:
X_l = H_l([X_0, X_1, ..., X_(l-1)])    (1.5)
where [...] denotes concatenation, i.e. combining all output feature maps of layers X_0 to X_(l-1) along the channel dimension. The non-linear transformation H used here is the combination BN + ReLU + Conv(3 × 3).
Because DenseNet needs to concatenate the feature maps of different layers, the feature maps of those layers must keep the same spatial size, which limits downsampling within the network. To enable downsampling, the authors divide DenseNet into multiple dense blocks.
Within the same dense block the feature size must stay the same; between dense blocks, transition layers perform the downsampling. In the authors' experiments, a transition layer consists of BN + Conv(1 × 1) + 2 × 2 average pooling.
In a dense block, assuming each non-linear transformation H outputs K feature maps, the input of the i-th layer has K_0 + (i-1) × K channels. Here we can see a main difference between DenseNet and existing networks: DenseNet can accept a very small number of feature maps as the output of each network layer. The reason is that each layer in a dense block is connected to all preceding layers. If the features are viewed as the global state of the dense block, then the training objective of each layer is to decide, given the existing global state, what update to add to it. The number K of feature maps output by each network layer is therefore called the growth rate, and it likewise determines how much new information each layer contributes to the global state. As we will see later, in the authors' experiments a small K suffices to reach state-of-the-art performance.
Although each DenseNet layer outputs only a small number k of feature maps, the channel-wise concatenation (cat) across layers still makes the final channel count large, which burdens the network. The authors therefore use a 1×1 convolution (bottleneck) for feature dimensionality reduction, cutting the channel count and improving computational efficiency. The improved nonlinear transformation becomes BN-ReLU-Conv(1×1)-BN-ReLU-Conv(3×3), and a DenseNet with such bottleneck layers is called DenseNet-B. In the experiments, the 1×1 convolution produces feature maps with 4k channels.
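The saving from the bottleneck can be seen by counting multiplies per spatial position for the two variants. The input width c_in = 512 below is an assumed example value:

```python
# Multiplies per spatial position for one dense layer, with and without
# the bottleneck, for a concatenated input of c_in channels and growth
# rate k. The bottleneck first maps c_in -> 4k with a 1x1 conv, then
# 4k -> k with the 3x3 conv.
def cost_plain(c_in, k):
    return c_in * k * 3 * 3                      # direct 3x3: c_in -> k

def cost_bottleneck(c_in, k):
    return c_in * (4 * k) * 1 * 1 + (4 * k) * k * 3 * 3

k = 32
c_in = 512   # deep in a dense block the concatenated input grows wide
print(cost_plain(c_in, k), cost_bottleneck(c_in, k))
# 147456 vs 102400: the bottleneck wins once c_in is large
```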
After the corpse-region pixels have been extracted, a burning-time predictor based on a fully connected neural network performs forecast analysis of the cremation machine's burning time for corpses of different sizes. The combustion control program then adjusts the cremation machine's burning time according to this prediction, so that the burning-time parameter setting is more reasonable. Whether the evaluation is subjective or objective, an intuitive score is needed to characterize the burning time. In the present embodiment, the burning time is estimated by a four-layer neural network whose input is the pixel count of the identified human region; the network is a four-layer fully connected network whose structure is shown in Figure 9.
The input is a single node (the number of pixels occupied by the body), the two hidden layers use 32 units each, and the output is the burning-time class. The network uses the ReLU activation function, the output layer uses the sigmoid activation function, the learning rate is 0.001, and the loss function is cross entropy.
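A minimal forward-pass sketch of the described network follows. The random weights, the input-scaling constant, and the single sigmoid output unit are all assumptions for illustration; the patent's network would be trained with cross-entropy loss at learning rate 0.001, and Figure 9 defines the actual structure.

```python
import numpy as np

# Sketch of the four-layer fully connected network described above:
# 1 input node (pixel count of the body region), two hidden layers of
# 32 ReLU units, and a sigmoid output for the burning-time class.
rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

sizes = [1, 32, 32, 1]
weights = [rng.standard_normal((a, b)) * 0.1 for a, b in zip(sizes, sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]

def forward(pixel_count):
    x = np.array([[pixel_count / 1e5]])   # crude input scaling (assumption)
    for W, b in zip(weights[:-1], biases[:-1]):
        x = relu(x @ W + b)               # hidden layers: ReLU
    return sigmoid(x @ weights[-1] + biases[-1])[0, 0]   # output: sigmoid

score = forward(42_000)
assert 0.0 < score < 1.0   # sigmoid keeps the score in (0, 1)
```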
It should be noted that the above description of specific embodiments is not intended to limit the invention; any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its protection scope.
Claims (8)
1. A vision-based cremation machine burning time calculation method, characterized by comprising the steps of:
Step 1: fixing the position of the corpse;
Step 2: capturing a photograph;
Step 3: determining the region where the corpse is located by the Mask-RCNN method;
Step 4: determining the burning time according to the time relationship between the corpse pixels and the corpse combustion state.
2. The vision-based cremation machine burning time calculation method according to claim 1, characterized in that the Mask-RCNN method comprises three stages: feature extraction, region proposal, and prediction.
3. The vision-based cremation machine burning time calculation method according to claim 2, characterized in that in the feature extraction stage, an FPN based on ResNet is used as the backbone network of Mask-RCNN, and a refined basic feature extraction layer and a simplified subsequent feature extraction layer are designed according to the lung nodule images.
4. The vision-based cremation machine burning time calculation method according to claim 3, characterized in that the FPN comprises three parts: a bottom-up path, a top-down path, and lateral connections.
5. The vision-based cremation machine burning time calculation method according to claim 3, characterized in that in the region proposal stage, the region proposal network RPN extracts rectangular candidate regions by performing sliding-window operations on the multi-scale feature maps provided by the FPN.
6. The vision-based cremation machine burning time calculation method according to claim 2, characterized in that in the prediction stage, the RoIAlign method converts the features inside each candidate-region box into a small feature map with fixed spatial dimensions H*W, so as to solve the region misalignment caused by the two quantizations in the RoIPool operation.
7. The vision-based cremation machine burning time calculation method according to claim 6, characterized in that the small feature map is 7*7 for nodule recognition and 14*14 for nodule segmentation.
8. The vision-based cremation machine burning time calculation method according to claim 6, characterized in that the two quantizations are the RoI boundary quantization and the cell boundary quantization.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910585276.3A CN110378880A (en) | 2019-07-01 | 2019-07-01 | The Cremation Machine burning time calculation method of view-based access control model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110378880A true CN110378880A (en) | 2019-10-25 |
Family
ID=68251476
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910585276.3A Pending CN110378880A (en) | 2019-07-01 | 2019-07-01 | The Cremation Machine burning time calculation method of view-based access control model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110378880A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111666988A (en) * | 2020-05-22 | 2020-09-15 | 哈尔滨理工大学 | Target detection algorithm based on multi-layer information fusion |
CN111914947A (en) * | 2020-08-20 | 2020-11-10 | 华侨大学 | Image instance segmentation method, device and equipment based on feature fusion and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20191025 |