CN114782869A - Method and terminal for improving flame image recognition accuracy - Google Patents

Method and terminal for improving flame image recognition accuracy

Info

Publication number
CN114782869A
Authority
CN
China
Prior art keywords
preset
image
video
target
updated
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210453482.0A
Other languages
Chinese (zh)
Inventor
袁炳杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Hl Wymobi Technology Co ltd
Original Assignee
Xiamen Hl Wymobi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Hl Wymobi Technology Co ltd filed Critical Xiamen Hl Wymobi Technology Co ltd
Priority to CN202210453482.0A priority Critical patent/CN114782869A/en
Publication of CN114782869A publication Critical patent/CN114782869A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/243 Classification techniques relating to the number of classes
    • G06F 18/2431 Multiple classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method and a terminal for improving flame image recognition accuracy. A preset atlas is acquired and a recognition target is segmented from each preset image of the preset atlas; the image features of the recognition target are extracted and mapped to a color space to obtain target color blocks; the target color blocks are stitched with the preset image to obtain an updated image; and a preset neural network is trained on the updated image. By mapping the image features to the color space to obtain target color blocks, stitching the target color blocks with the original preset image to obtain the updated image, and training the preset neural network with the updated image, the method increases the interpretability of the features in the preset image without destroying the structure of the preset image and improves the classification accuracy of the finally trained neural network.

Description

Method and terminal for improving flame image recognition accuracy
Technical Field
The invention relates to the field of image recognition processing, in particular to a method and a terminal for improving flame image recognition accuracy.
Background
TSN (Temporal Segment Network), a classical deep behavior recognition model for video classification, learns from the temporal variation of optical flow and RGB Diff (frame differences) and has the following disadvantages: 1. no region of interest is designated, so temporal changes in the background affect the learning accuracy; 2. no preprocessing is added, so some feature changes that expert experience considers relevant may not be learned.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: to provide a method and a terminal for improving flame image recognition accuracy, thereby realizing image recognition with higher accuracy.
In order to solve the technical problems, the invention adopts a technical scheme that:
a method for improving flame image recognition accuracy comprises the following steps:
acquiring a preset atlas, and segmenting an identification target from each preset image of the preset atlas;
extracting the image characteristics of the recognition target, and mapping the image characteristics to a color space to obtain a target color block;
splicing the target color block and the preset image to obtain an updated image;
and training a preset neural network according to the updated image.
In order to solve the technical problem, the invention adopts another technical scheme as follows:
a terminal for improving the accuracy of flame image recognition, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the computer program when executing the computer program:
acquiring a preset atlas, and segmenting a recognition target from each preset image of the preset atlas;
extracting the image characteristics of the recognition target, and mapping the image characteristics to a color space to obtain a target color block;
splicing the target color block and the preset image to obtain an updated image;
and training a preset neural network according to the updated image.
The invention has the beneficial effects that: the recognition target of interest is first segmented from the image, the image features of the recognition target are obtained and mapped to a color space to obtain target color blocks, the target color blocks are stitched with the original preset image to obtain an updated image, and a preset neural network is trained with the updated image, so that the interpretability of the features in the preset image is increased without destroying the structure of the preset image, and the classification accuracy of the finally trained neural network is improved.
Drawings
FIG. 1 is a flow chart illustrating steps of a method for improving the accuracy of flame image recognition according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a terminal for improving the accuracy of flame image recognition according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating a calculation for stitching a target color block with a preset image according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an actual mosaic of a target color block and a preset image according to an embodiment of the present invention;
description of the reference symbols:
1. a terminal for improving the flame image recognition accuracy; 2.a processor; 3. a memory.
Detailed Description
In order to explain the technical contents, the objects and the effects of the present invention in detail, the following description is made with reference to the accompanying drawings in combination with the embodiments.
Referring to fig. 1, a method for improving the flame image recognition accuracy includes the steps of:
acquiring a preset atlas, and segmenting a recognition target from each preset image of the preset atlas;
extracting the image characteristics of the recognition target, and mapping the image characteristics to a color space to obtain a target color block;
splicing the target color block and the preset image to obtain an updated image;
and training a preset neural network according to the updated image.
As can be seen from the above description, the beneficial effects of the present invention are: the recognition target of interest is first segmented from the image, the image features of the recognition target are obtained and mapped to a color space to obtain target color blocks, the target color blocks are stitched with the original preset image to obtain an updated image, and a preset neural network is trained with the updated image, so that the interpretability of the features in the preset image is increased without destroying the structure of the preset image, and the classification accuracy of the finally trained neural network is improved.
Further, the method also comprises the following steps:
acquiring a label category corresponding to the preset image;
the extracting the image feature of the recognition target comprises:
acquiring candidate features, and determining image features according to the relevance of the candidate features and the label categories;
and extracting the image characteristics of the recognition target.
According to the above description, a plurality of candidate features is set, the image features are determined according to the strength of the relevance between the candidate features and the label categories, and the corresponding image features of the recognition target are then extracted; data that is weakly related to the label category does not need to be computed, which reduces computation time and improves image processing efficiency.
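As a minimal sketch of this selection step (the use of the absolute Pearson correlation as the relevance measure and the feature names in the usage example are assumptions of this illustration, not prescribed by the disclosure), the candidate features could be ranked against the label categories as follows:

    import numpy as np

    def select_image_features(candidates, labels, top_k=4):
        # candidates: dict mapping feature name -> 1-D array with one value per preset image
        # labels: 1-D array of numeric label categories for the same images
        labels = np.asarray(labels, dtype=float)
        scores = {}
        for name, values in candidates.items():
            values = np.asarray(values, dtype=float)
            # relevance of this candidate feature to the label category
            scores[name] = abs(np.corrcoef(values, labels)[0, 1])
        ranked = sorted(scores, key=scores.get, reverse=True)
        return ranked[:top_k]

    # hypothetical usage:
    # selected = select_image_features(
    #     {"area_ratio": a, "length_ratio": l, "width_ratio": w, "jitter_ratio": j},
    #     labels)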
Further, the mapping the image feature into a color space comprises:
calculating coordinates (r, g, b) of the image feature in the color space;
r=int(r1×256),g=int(g1×256),b=b1×256;
r1=float(r2/256),g1=float(g2/256),b1=float(b2/256);
r2=x%256,g2=(x-r2)/256%256,b2=((x-r2-g2×256)/(256×256)%256);
x is the value of the image feature multiplied by 256.
From the above description, the image features are converted into coordinates in the color space according to a certain calculation method, and the image features are converted into color features which are easily understood and learned by the model, so that the interpretability of the features in the image is enhanced.
Further, the obtaining of the preset atlas includes:
acquiring a preset video, and extracting video frames from the preset video to serve as preset images, wherein the preset images form the preset atlas.
As can be seen from the above description, video frames are obtained from a video and used as the preset images that form the preset atlas; since most of the data generated in actual production, such as surveillance footage, is video data, a preset atlas obtained from video is more convenient to use in the actual production process.
Further, the extracting the video frame from the preset video comprises:
determining a preset interval;
determining the number of video frames to be extracted according to the preset interval;
judging whether the number of the video frames is greater than or equal to a preset value, if not, returning to the step of determining the preset interval;
and extracting video frames from the preset video according to the preset interval.
It can be known from the above description that video frames are extracted at a fixed preset interval, so that the preset atlas is obtained from the video while preserving the original content of the video as much as possible; controlling the number of video frames also makes labeling convenient and reduces the labeling workload.
Further, the stitching the target color block and the preset image to obtain an updated image comprises:
splicing the target color block and the preset image to obtain an updated image;
replacing the updated image with a video frame corresponding to the preset image in the preset video to obtain an updated video;
the training of the preset neural network according to the updated image comprises:
and training a preset neural network according to the updated video.
According to the above description, the obtained target color blocks are stitched with the preset image to obtain an updated image, the extracted features are fused directly into the original preset image, and the result is finally restored into the original video; readable features are thus added without affecting the original features in the video, so that the finally trained neural network achieves a better classification effect.
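A minimal OpenCV sketch of rebuilding the updated video from the updated images (the output file name, codec and the choice of resizing every frame to the size of the updated images are assumptions of this illustration):

    import cv2

    def build_updated_video(video_path, updated, out_path="updated.mp4"):
        # updated: dict mapping frame index -> updated image that replaces that frame
        cap = cv2.VideoCapture(video_path)
        fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
        out_h, out_w = next(iter(updated.values())).shape[:2]   # uniform output size (assumed)
        writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"),
                                 fps, (out_w, out_h))
        index = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            frame = updated.get(index, frame)        # swap in the updated image where available
            writer.write(cv2.resize(frame, (out_w, out_h)))
            index += 1
        cap.release()
        writer.release()
        return out_path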
Further, after the training of the preset neural network according to the updated image, the method further includes:
acquiring a real-time image, and obtaining a classification result of the real-time image through the trained preset neural network;
and comparing the classification result with a preset result, and adjusting the rotating speed of the fan according to the difference value of the classification result and the preset result.
According to the description, after the classification result is obtained, the rotating speed of the fan is adjusted according to the difference between the classification result and the preset result, so that the purpose of effectively controlling the flame intensity is achieved.
Further, the comparing the classification result with a preset result, and the adjusting the rotation speed of the fan according to the difference between the classification result and the preset result comprises:
acquiring a preset correction value, and adjusting the classification result according to the preset correction value;
retrieving a preset rule base according to the adjusted difference value between the classification result and the preset result to obtain a retrieval result;
and outputting a fan rotating speed control value according to the retrieval result.
According to the above description, a correction value is introduced and the obtained classification result is adjusted according to it, so that human experience is incorporated into the output of the model and the accuracy of the classification result is further improved; a preset rule base is set in which different classification results are matched with different fan rotation speed control values, and the corresponding fan rotation speed control value is output according to the retrieval result, so that the fan, and thereby the flame intensity, is effectively controlled.
Further, the segmenting the recognition target from each preset image of the preset atlas comprises:
and segmenting the recognition target from each preset image of the preset atlas through a BiSeNetV2 network.
According to the above description, the recognition target is segmented from the preset image by a BiSeNetV2 network, so that macroscopic features and surrounding features can be taken into account at the same time, high-precision extraction of the recognition target is achieved, and a foundation is laid for the subsequent steps.
Referring to fig. 2, a terminal for improving the flame image recognition accuracy includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the steps of the method for improving the image recognition accuracy.
The method and the terminal for improving the flame image recognition accuracy of the invention can be applied to scenes in which flame characteristics need to be recognized, such as determining the flame intensity in the steelmaking process, and are explained below through specific embodiments.
Referring to fig. 1, a first embodiment of the present invention is:
a method for improving flame image recognition accuracy comprises the following steps:
s1, acquiring a preset atlas, and segmenting an identification target from each preset image of the preset atlas, where the identification target is a flame in this embodiment;
in an optional implementation manner, a label category corresponding to each preset image is obtained; for example, each preset image set corresponds to one label category, there are seven label categories in total (such as severe over-extraction, moderate over-extraction, light over-extraction and normal), and every label category corresponds to the same number of preset images, e.g. 500 preset images per label category; labeling can be performed using a labeling tool such as EISeg;
wherein over-extraction is a flame state used to evaluate the current rotating speed of the converter fan, and the fan is regulated accordingly; over-extraction manifests as the flame not overflowing the furnace mouth within a period of time; when the flame is excessive, the flame overflows the furnace mouth for longer than a first preset time period; in the normal state, the flame moves in and out of the furnace mouth, i.e. it alternates between overflowing and not overflowing the furnace mouth within a second preset time period;
in an alternative embodiment, the obtaining the preset atlas includes: acquiring a preset video, and extracting a video frame from the preset video to be used as a preset image, wherein the preset image forms the preset atlas; wherein extracting a video frame from the preset video comprises: s11, determining a preset interval; s12, determining the number of video frames to be extracted according to the preset interval; s13, judging whether the number of the video frames is larger than or equal to a preset value, if not, returning to S11; if so, extracting video frames from the preset video according to the preset interval;
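A minimal OpenCV sketch of steps S11-S13 (the halving rule used to revise the preset interval is an assumption of this illustration; the disclosure only states that the preset interval is re-determined when too few video frames would result):

    import cv2

    def extract_preset_images(video_path, interval, min_frames=500):
        cap = cv2.VideoCapture(video_path)
        total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        # S11-S13: re-determine the preset interval until enough frames result
        while interval > 1 and total // interval < min_frames:
            interval //= 2                       # assumed revision rule
        frames, index = [], 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if index % interval == 0:            # extract one frame every `interval` frames
                frames.append(frame)
            index += 1
        cap.release()
        return frames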
in an optional embodiment, all preset images are divided into a training set, a verification set and a test set according to a preset ratio, for example 6:2:2, and the BiSeNetV2 network is tuned with the divided training, verification and test sets until it reaches the following indexes: Images: 808; mIoU: 0.9590; Acc: 0.9891; Kappa: 0.9578; Dice: 0.9789; the recognition target is then segmented from each preset image of the preset atlas by the tuned BiSeNetV2 network;
wherein, Images represents the number of pictures in the verification set, and the number of pictures is 808 in this embodiment;
mIoU represents the mean value, over all classes, of the intersection-over-union between the predicted region and the ground-truth region;
Acc represents the proportion, over all classes, of the intersection of the predicted region and the ground-truth region within the predicted region;
Kappa is an index for consistency checking that can be used to measure the classification effect; its main function is to give a low score to models with a strong bias. The kappa coefficient is calculated from the confusion matrix, takes values between -1 and 1, and is usually greater than 0. The formula of kappa is:
kappa = (p0 - pe) / (1 - pe)
wherein p0 is the observed accuracy (the sum of the diagonal of the confusion matrix divided by the total number of samples n) and pe is the chance agreement, pe = Σi(ai × bi) / n², with ai the number of true samples of class i and bi the number of predicted samples of class i;
Dice represents the overlap between the set of predicted values and the set of true values, and its formula is
Dice = 2 × |pred ∩ true| / (|pred| + |true|)
wherein pred is the set of predicted values and true is the set of true values;
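For reference, a small sketch of how Kappa and Dice can be computed (a generic implementation of the formulas above, not code from this disclosure):

    import numpy as np

    def kappa_from_confusion(cm):
        # cm[i, j]: number of samples whose true class is i and predicted class is j
        cm = np.asarray(cm, dtype=float)
        n = cm.sum()
        p0 = np.trace(cm) / n                                        # observed accuracy
        pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / (n * n)       # chance agreement
        return (p0 - pe) / (1 - pe)

    def dice(pred_mask, true_mask):
        pred_mask = np.asarray(pred_mask, dtype=bool)
        true_mask = np.asarray(true_mask, dtype=bool)
        inter = np.logical_and(pred_mask, true_mask).sum()
        return 2.0 * inter / (pred_mask.sum() + true_mask.sum())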
s2, extracting the image characteristics of the recognition target, mapping the image characteristics to a color space to obtain a target color block, including:
s21, acquiring candidate characteristics, and determining image characteristics according to the relevance between the candidate characteristics and the label types;
in an optional implementation mode, the candidate features are ranked from strong to weak according to the strength of their relevance to the label categories, and a preset number of the strongest candidate features is taken as the image features; for example, the area ratio, the length ratio, the width ratio and the jitter ratio are taken as image features, and their values all lie in the range [0, 1];
the area ratio is the recognition target area divided by the single-frame area; to obtain it, the contours of the binary mask are extracted, contours with an area below 100 are filtered out, and the circumscribed rectangles of the remaining contours are computed; the length ratio is the width of the circumscribed rectangle of the recognition target divided by the single-frame width; the width ratio is the height of the circumscribed rectangle of the recognition target divided by the single-frame height; the jitter ratio is the recognition target area divided by the area of the circumscribed rectangle of the recognition target;
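A sketch of this feature computation from one binary flame mask (the specific OpenCV calls are an assumption of this illustration; any equivalent contour extraction would do):

    import cv2
    import numpy as np

    def frame_features(mask):
        # mask: binary single-frame segmentation of the recognition target (flame)
        h, w = mask.shape[:2]
        contours, _ = cv2.findContours(mask.astype(np.uint8),
                                       cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        contours = [c for c in contours if cv2.contourArea(c) >= 100]   # filter small contours
        if not contours:
            return 0.0, 0.0, 0.0, 0.0
        rects = [cv2.boundingRect(c) for c in contours]
        x0 = min(x for x, y, bw, bh in rects)
        y0 = min(y for x, y, bw, bh in rects)
        x1 = max(x + bw for x, y, bw, bh in rects)
        y1 = max(y + bh for x, y, bw, bh in rects)
        target_area = sum(cv2.contourArea(c) for c in contours)
        area_ratio = target_area / (w * h)                       # target area / single-frame area
        length_ratio = (x1 - x0) / w                             # rectangle width / frame width
        width_ratio = (y1 - y0) / h                              # rectangle height / frame height
        jitter_ratio = target_area / ((x1 - x0) * (y1 - y0))     # target area / rectangle area
        return area_ratio, length_ratio, width_ratio, jitter_ratio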
s22, extracting the image characteristics of the recognition target;
s23, calculating coordinates (r, g, b) of the image features in the color space:
r=int(r1×256),g=int(g1×256),b=b1×256;
r1=float(r2/256),g1=float(g2/256),b1=float(b2/256);
r2=x%256,g2=(x-r2)/256%256,b2=((x-r2-g2×256)/(256×256)%256);
x is the value of the image feature multiplied by 256;
S24, mapping each image feature to the color space as a block of the corresponding (r, g, b) color, thereby obtaining the target color blocks;
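The mapping of S23-S24 can be sketched as follows (a direct transcription of the formulas above; interpreting x as the feature value multiplied by 256, and the block size, are assumptions of this illustration):

    import numpy as np

    def feature_to_rgb(feature_value):
        # feature_value lies in [0, 1]; follow the formulas of S23
        x = feature_value * 256
        r2 = x % 256
        g2 = (x - r2) / 256 % 256
        b2 = ((x - r2 - g2 * 256) / (256 * 256)) % 256
        r1, g1, b1 = r2 / 256, g2 / 256, b2 / 256
        return int(r1 * 256), int(g1 * 256), int(b1 * 256)

    def target_color_block(feature_value, height=64, width=64):
        # S24: fill a block of the given size with the mapped color
        block = np.zeros((height, width, 3), dtype=np.uint8)
        block[:, :] = feature_to_rgb(feature_value)
        return block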
s3, splicing the target color block and the preset image to obtain an updated image, wherein the method comprises the following steps:
s31, determining the weight of the target color block and the preset image;
s32, determining the position of the target color block according to the weight and calculating the area of the target color block;
s33, splicing the target color blocks around the preset image according to the position and the area to obtain an updated image;
in an optional implementation manner, replacing a video frame corresponding to the preset image in the preset video with the updated image to obtain an updated video;
s4, training a preset neural network according to the updated image;
in an optional implementation manner, a preset neural network is trained according to an updated video, the updated video is divided into a training set, a verification set and a test set according to a preset proportion, the preset neural network is trained according to the training set, the verification set and the test set, and the trained neural network can obtain a classification result of one of the label categories according to an input video;
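A trivial sketch of the division of the updated videos into the three sets (the 6:2:2 ratio follows the embodiment above; the shuffling and the fixed seed are assumptions of this illustration):

    import random

    def split_dataset(updated_videos, ratios=(0.6, 0.2, 0.2), seed=0):
        videos = list(updated_videos)
        random.Random(seed).shuffle(videos)          # assumed: shuffle before splitting
        n_train = int(len(videos) * ratios[0])
        n_val = int(len(videos) * ratios[1])
        train = videos[:n_train]
        val = videos[n_train:n_train + n_val]
        test = videos[n_train + n_val:]
        return train, val, test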
in an optional embodiment, the preset neural network is TSN (Temporal Segment Networks, deep behavior recognition network);
s5, acquiring a real-time image, and obtaining a classification result from the real-time image through the trained preset neural network;
in an optional implementation mode, a real-time video is obtained, the steps S1-S3 applied to the preset images are executed on it, and the processed video is then fed into the preset neural network to obtain a classification result;
s6, comparing the classification result with a preset result, and adjusting the rotating speed of the fan according to the difference value of the classification result and the preset result;
in an optional embodiment, the method further includes obtaining a preset correction value and adjusting the classification result according to the preset correction value; retrieving a preset rule base according to the difference between the adjusted classification result and the preset result to obtain a retrieval result; and outputting a fan rotation speed control value according to the retrieval result. The correction of the converter-mouth flame characteristic judgment can be obtained from control parameters (such as oxygen lance height, oxygen supply flow, oxygen consumption, and the type and amount of slag-making material), smoke detection parameters (CO, CO2, O2), the rotation speed of the primary dust removal fan, and the like.
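A hypothetical illustration of S6 (the category scale, the correction value, the rule base contents and the speed values are invented for illustration only and are not part of the disclosure):

    def fan_speed_control(classification, preset_result=3, correction=0):
        # classification / preset_result: label category index on an assumed 0..6 scale
        adjusted = classification + correction          # apply the preset correction value
        difference = adjusted - preset_result
        # hypothetical preset rule base: difference -> fan rotation speed control value (rpm change)
        rule_base = {-3: 300, -2: 200, -1: 100, 0: 0, 1: -100, 2: -200, 3: -300}
        difference = max(min(difference, 3), -3)        # clamp to the range covered by the rule base
        return rule_base[difference]

    # example: classification 5 with correction -1 gives difference +1, i.e. lower the speed by 100 rpm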
Referring to fig. 3 and 4, a second embodiment of the present invention is:
a method for improving the flame image recognition accuracy is different from the first embodiment in that:
a specific scene is used to explain how to stitch the target color blocks and the preset image, i.e., step S3:
S31, acquiring the length X and the width Y of the preset image; acquiring the weight ratio R:1 of the target color blocks to the preset image, the weights of the target color blocks being equal to each other;
S32, obtaining target color blocks A1, A2, A3 and A4 corresponding to the area ratio, the length ratio, the width ratio and the jitter ratio, respectively;
S33, referring to fig. 3: when the four target color blocks are stitched around the preset image, the width added on each side is x1 and the height added on each side is y1; since the weights of the 4 target color blocks are equal, (X + x1) × y1 = (Y + y1) × x1, which simplifies to y1/x1 = Y/X;
combining this with the R:1 weight of the target color blocks relative to the preset image gives:
x1 = X × (√(1 + R) - 1) / 2;
y1 = Y × (√(1 + R) - 1) / 2;
splicing A1, A2, A3 and A4 with the preset image according to the calculated x1 and y1; referring to fig. 4, which is a schematic diagram of the target color blocks and the preset image after stitching, it can be seen that the recognition target, i.e. the flame, has been extracted, and that the four surrounding target color blocks correspond to different features and therefore have different colors;
in an optional embodiment, when the number of target color blocks and their weights are preset, calculating the areas includes:
A1, determining the area proportions according to the weights of the target color blocks, e.g. the ratio of the target color blocks B1, B2 and B3 is 2:1:1;
A2, calculating the area of one unit, and determining the area of each color block from the unit area and the area proportions:
total color block area: A = 2X·y1 + 2Y·x1 + 4·x1·y1;
the unit color block area is B = A/n, where n is the sum of the proportions; in this example n = 2 + 1 + 1 = 4;
the weight ratio of a unit color block to the original image is recorded as R:1, so in this example the weight ratio of all the target color blocks to the original image is 4R:1;
the aspect ratio of the stitched image is kept the same as that of the original image, so y1/x1 = Y/X;
simplifying according to the above four formulas gives the quadratic equation
x1² + X·x1 - (n·R/4)·X² = 0
obtaining:
x1 = X × (√(1 + n·R) - 1) / 2;
y1 = Y × (√(1 + n·R) - 1) / 2.
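A numpy sketch of the stitching in this embodiment, using the closed-form x1 and y1 above (the pinwheel-style placement of the four equal-weight blocks follows FIG. 3/FIG. 4 and is an assumption of this illustration, as is passing the total weight ratio as an argument):

    import numpy as np

    def stitch_color_blocks(image, block_colors, total_weight=0.2):
        # image: H x W x 3 preset image; block_colors: four (r, g, b) tuples, one per feature
        # total_weight: total weight ratio R of the four blocks relative to the image
        h, w = image.shape[:2]
        x1 = int(round(w * ((1 + total_weight) ** 0.5 - 1) / 2))   # one-side width increase
        y1 = int(round(h * ((1 + total_weight) ** 0.5 - 1) / 2))   # one-side height increase
        out = np.zeros((h + 2 * y1, w + 2 * x1, 3), dtype=image.dtype)
        out[y1:y1 + h, x1:x1 + w] = image                          # original image in the centre
        c1, c2, c3, c4 = block_colors
        out[:y1, :w + x1] = c1                                     # top strip
        out[:h + y1, w + x1:] = c2                                 # right strip
        out[h + y1:, x1:] = c3                                     # bottom strip
        out[y1:, :x1] = c4                                         # left strip
        return out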
the third embodiment of the invention is as follows:
configuration pseudo-code of the BiSeNetV2 network is provided:
Step 1, constructing the model:
batch_size: 4 [size of the mini-batch on each GPU or CPU, i.e. the number of training samples used for one gradient-descent step]
iters: 50000 [number of training iterations]
train_dataset: [training dataset]
type: Dataset [custom dataset]
dataset_root: custom_dataset [dataset storage directory]
train_path: custom_dataset/train.txt [path of the training dataset list]
num_classes: 2 [number of pixel classes]
Step 2, constructing the transformation methods for the images:
transforms: [transformation methods applied to the images]
type: Resize [resize]
target_size: [1280,720] [size of the target image; images with different resolutions are adjusted to the same resolution so that they can be fed into the network model for training]
type: RandomHorizontalFlip [horizontally flip images with a certain probability, which enlarges the preset image set and increases the amount of data available for training the model]
type: Normalize [standardize the images]
mode: train [which data set this block is used for]
val_dataset: [verification dataset]
type: Dataset [custom dataset]
dataset_root: custom_dataset [dataset storage directory]
val_path: custom_dataset/val.txt [path of the verification dataset list]
num_classes: 2 [number of pixel classes]
transforms: [transformation methods applied to the images]
type: Resize [resize]
target_size: [1280,720] [size of the target image]
type: Normalize [standardize the images]
mode: val [which data set this block is used for]
Step 3, constructing the optimizer:
optimizer: [training optimizer]
type: sgd [optimizer type, set to stochastic gradient descent]
momentum: 0.9 [momentum optimization method, momentum set to 0.9]
weight_decay: 4.0e-5 [L2 regularization value, set to 4.0e-5]
Step 4, establishing the learning rate:
lr_scheduler: [learning-rate schedule]
type: PolynomialDecay [learning-rate type set to polynomial decay, i.e. the learning rate is interpolated between the initial learning rate and the given final learning rate with a polynomially decaying weight]
learning_rate: 0.01 [initial learning rate]
end_lr: 0 [final learning rate]
power: 0.9 [decay exponent]
Step 5, constructing the loss function:
loss: [loss function]
types:
type: CrossEntropyLoss [cross-entropy loss]
coef: [1,1,1,1,1] [coefficient list corresponding to the loss function list]
Step 6, training the model according to the set parameters:
model:
type: BiSeNetV2 [model type]
pretrained: xxx [pre-trained model]
The fourth embodiment of the invention is as follows:
providing a TSN network implementation procedure and pseudo code:
step 1, inputting a video training set, a video verification set and hyper-parameters; in an optional implementation mode, the training set contains 1533 videos and the verification set contains 511 videos; reading in a hyper-parameter file and establishing a hyper-parameter dictionary;
step 2, reading in a basic model, and establishing the model on the basis of the basic model:
step 201, reading in layers and selecting the number of ResNet layers;
step 202, recombining video data;
step 203, establishing a two-dimensional convolutional layer, in this embodiment, the number of output channels of the convolutional layer is 64, the size of the convolutional kernel is 7, the step size is 2, and the filling size is 3;
step 204, establishing a batch regularization layer and an activation function Relu;
step 205, establishing a pooling layer, in this embodiment, the size of the pooling core is 3, the step size is 2, the filling size is 1, and the pooling type is max-pooling;
step 206, building each module Bottleneckblock (residual block) of ResNet;
step 207, establishing a pooling layer in ResNet, wherein the pooling type is avg-pooling;
step 208, reshaping the features according to seg_num; the segmented sampling strategy proposed by TSN divides a piece of video data into seg_num sections, samples seglen consecutive frames from each section, puts the frames back after data enhancement, and finally concatenates the features of the seg_num sections (a minimal sampling sketch is given after step 6 below);
step 209, establishing a fully connected layer whose number of output units equals the number of categories, 7, with a softmax activation function, and setting the variance of the randomly initialized parameters according to the feature shape;
step 3, reading in a loss function, an optimizer, a data reader and an evaluation index which are attached to the model;
step 4, reading the number of GPUs, the number of threads, the total number of training rounds, the printing interval and the saving interval;
step 5, reading data, enhancing the data, and inputting the enhanced data into a model;
step 6, training while recording the process, and saving the intermediate models and the best model on the verification set;
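A small sketch of the segmented sampling described in step 208 (seg_num sections of seglen consecutive frames each; the random in-section offset during training is the usual TSN behaviour and an assumption of this illustration):

    import random

    def sample_segments(num_frames, seg_num=3, seglen=1):
        # split the video into seg_num equal sections and sample seglen
        # consecutive frame indices from each section (step 208)
        indices = []
        section = num_frames // seg_num
        for s in range(seg_num):
            start = s * section
            offset = random.randint(0, max(section - seglen, 0))   # assumed random in-section offset
            indices.extend(range(start + offset, start + offset + seglen))
        return indices

    # example: sample_segments(90, seg_num=3, seglen=1) returns one frame index from each
    # third of the video; the sampled frames are concatenated along the segment dimension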
MODEL:
name: "TSN"
format: "video" # ["video", "frames"]; the input is video
#num_classes: 400
num_classes: 7 [7 label categories]
seglen: 1 [number of frames sampled per segment]
image_mean: [0.485, 0.456, 0.406] [data-enhancement normalization mean]
image_std: [0.229, 0.224, 0.225] [data-enhancement normalization standard deviation]
num_layers: 101 [number of ResNet layers]
topk: 1 [evaluation index, hit rate of the highest-scoring class per batch]
TRAIN:
seg_num: 3 # training with 3 segments [number of segments in the segmented sampling strategy proposed by TSN]
epoch: 45 [number of training epochs]
short_size: 256 [data enhancement, the short side of the frame is resized to this number of pixels and the long side to 4/3 of it]
target_size: 224 [data enhancement, cropping, adjust the frame size to the target size]
num_readers_threads: 12 [number of threads used to process the input frames]
buf_size: 1024 [queue size for reading data]
batch_size: 8 [batch size]
use_gpu: True [whether to use the GPU; the default is not to use it]
num_gpus: 1 [number of GPUs]
filelist: "/data/dataset/tsn/train [training data list]
learning_rate: 0.00125 [initial learning rate]
learning_rate_decay: 0.1 [learning-rate decay]
l2_weight_decay: 1e-4 [L2 regularization weight decay]
momentum: 0.9 [optimizer momentum, damping the oscillation of the gradient]
total_videos: 1533 [steps per epoch = total_videos / batch_size + 1]
VALID:
seg_num: 3 [number of segments in the segmented sampling strategy proposed by TSN]
short_size: 256 [data enhancement, the short side of the frame is resized to this number of pixels and the long side to 4/3 of it]
target_size: 224 [data enhancement, cropping, adjust the frame size to the target size]
num_readers_threads: 12 [number of threads used to process the input frames]
buf_size: 1024 [queue size for reading data]
batch_size: 32 [batch size]
filelist: "/data/dataset/tsn/val.list" [verification data list]
TEST:
seg_num: 7 [number of segments in the segmented sampling strategy proposed by TSN]
short_size: 256 [data enhancement, the short side of the frame is resized to this number of pixels and the long side to 4/3 of it]
target_size: 224 [data enhancement, cropping, adjust the frame size to the target size]
num_readers_threads: 12 [number of threads used to process the input frames]
buf_size: 1024 [queue size for reading data]
batch_size: 1 [batch size]
filelist: "/data/dataset/tsn/test [test data list]
INFER:
seg_num: 3 [number of segments in the segmented sampling strategy proposed by TSN]
short_size: 256 [data enhancement, the short side of the frame is resized to this number of pixels and the long side to 4/3 of it]
target_size: 224 [data enhancement, cropping, adjust the frame size to the target size]
num_readers_threads: 12 [number of threads used to process the input frames]
buf_size: 1024 [queue size for reading data]
batch_size: 32 [batch size]
filelist: ... [inference data list]
Referring to fig. 2, a fifth embodiment of the present invention is:
a terminal 1 for improving image recognition accuracy comprises a processor 2, a memory 3 and a computer program stored on the memory 3 and capable of running on the processor 2, wherein the processor 2 implements the steps of the first embodiment when executing the computer program.
In summary, the present invention provides a method and a terminal for improving image recognition accuracy. A feature fusion method adds traditional single-frame indexes as color blocks to the periphery of the original image, which increases the interpretability of the image features without destroying the structure of the original image, and an empirical correction value is introduced as a reference for the model result, further improving the accuracy of image detection. Considering that most actual scenes produce surveillance video, video frames are extracted from the video to serve as the original images from which the target color blocks are obtained, and the video is recognized directly by the TSN network, which better fits the actual usage scene. After the recognition result is obtained, the fan is controlled according to the difference between the recognition result and the preset result, so that the flame size is adjusted and automatic control is achieved.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all equivalent changes made by using the contents of the present specification and the drawings, or applied directly or indirectly to the related technical fields, are included in the scope of the present invention.

Claims (10)

1. A method for improving the flame image recognition accuracy is characterized by comprising the following steps:
acquiring a preset atlas, and segmenting an identification target from each preset image of the preset atlas;
extracting the image characteristics of the recognition target, and mapping the image characteristics to a color space to obtain a target color block;
splicing the target color block and the preset image to obtain an updated image;
and training a preset neural network according to the updated image.
2. The method for improving the flame image recognition accuracy of claim 1, further comprising:
acquiring a label category corresponding to the preset image;
the extracting of the image feature of the recognition target includes:
acquiring candidate features, and determining image features according to the relevance of the candidate features and the label categories;
and extracting the image characteristics of the recognition target.
3. The method of claim 1, wherein the mapping the image features into color space comprises:
calculating coordinates (r, g, b) of the image feature in the color space;
r=int(r1×256),g=int(g1×256),b=b1×256;
r1=float(r2/256),g1=float(g2/256),b1=float(b2/256);
r2=x%256,g2=(x-r2)/256%256,b2=((x-r2-g2×256)/(256×256)%256);
x is the value of the image feature multiplied by 256.
4. The method for improving the flame image recognition accuracy according to any one of claims 1 to 3, wherein the obtaining the preset atlas comprises:
acquiring a preset video, and extracting video frames from the preset video to serve as preset images, wherein the preset images form the preset image set.
5. The method of claim 4, wherein the extracting the video frame from the preset video comprises:
determining a preset interval;
determining the number of video frames to be extracted according to the preset interval;
judging whether the number of the video frames is greater than or equal to a preset value, if not, returning to the step of determining the preset interval;
and extracting video frames from the preset video according to the preset interval.
6. The method of claim 4, wherein the stitching the target color block and the preset image to obtain an updated image comprises:
splicing the target color block and the preset image to obtain an updated image;
replacing the updated image with a video frame corresponding to the preset image in the preset video to obtain an updated video;
the training of the preset neural network according to the updated image comprises:
and training a preset neural network according to the updated video.
7. The method for improving the flame image recognition accuracy according to claim 1, wherein the training of the preset neural network according to the updated image further comprises:
acquiring a real-time image, and obtaining a classification result of the real-time image through the trained preset neural network;
and comparing the classification result with a preset result, and adjusting the rotating speed of the fan according to the difference value of the classification result and the preset result.
8. The method of claim 7, wherein the comparing the classification result with the preset result, and the adjusting the fan rotation speed according to the difference between the classification result and the preset result comprises:
acquiring a preset correction value, and adjusting the classification result according to the preset correction value;
retrieving a preset rule base according to the adjusted difference value between the classification result and the preset result to obtain a retrieval result;
and outputting a fan rotating speed control value according to the retrieval result.
9. The method of claim 1, wherein the segmenting the recognition target from each preset image of the preset atlas comprises:
and segmenting the recognition target from each preset image of the preset atlas through a BiSeNetV2 network.
10. A terminal for improving the accuracy of flame image recognition, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method for improving the accuracy of flame image recognition according to any one of claims 1 to 9.
CN202210453482.0A 2022-04-27 2022-04-27 Method and terminal for improving flame image recognition accuracy Pending CN114782869A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210453482.0A CN114782869A (en) 2022-04-27 2022-04-27 Method and terminal for improving flame image recognition accuracy

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210453482.0A CN114782869A (en) 2022-04-27 2022-04-27 Method and terminal for improving flame image recognition accuracy

Publications (1)

Publication Number Publication Date
CN114782869A true CN114782869A (en) 2022-07-22

Family

ID=82433589

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210453482.0A Pending CN114782869A (en) 2022-04-27 2022-04-27 Method and terminal for improving flame image recognition accuracy

Country Status (1)

Country Link
CN (1) CN114782869A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117315551A (en) * 2023-11-29 2023-12-29 深圳鲲云信息科技有限公司 Method and computing device for flame alerting
CN117315551B (en) * 2023-11-29 2024-03-19 深圳鲲云信息科技有限公司 Method and computing device for flame alerting

Similar Documents

Publication Publication Date Title
CN111325203B (en) American license plate recognition method and system based on image correction
KR100886407B1 (en) Information processing apparatus and control method therefor
US20210082159A1 (en) Method and system for color representation generation
US20200364562A1 (en) Training system for training a generator neural network
CN113158862B (en) Multitasking-based lightweight real-time face detection method
US11887215B2 (en) Image processing apparatus and method for style transformation
CN107239727A (en) Gesture identification method and system
WO2021077947A1 (en) Image processing method, apparatus and device, and storage medium
US11823348B2 (en) Method and system for training generative adversarial networks with heterogeneous data
CN106650670A (en) Method and device for detection of living body face video
CN112101344B (en) Video text tracking method and device
CN114842343A (en) ViT-based aerial image identification method
CN111882620A (en) Road drivable area segmentation method based on multi-scale information
CN114612732A (en) Sample data enhancement method, system and device, medium and target classification method
CN107633196A (en) A kind of eyeball moving projection scheme based on convolutional neural networks
CN114782869A (en) Method and terminal for improving flame image recognition accuracy
CN116648733A (en) Method and system for extracting color from facial image
CN115082922A (en) Water meter digital picture processing method and system based on deep learning
CN111126243A (en) Image data detection method and device and computer readable storage medium
CN110188662A (en) A kind of AI intelligent identification Method of water meter number
KR20230069209A (en) Device and method for classifying images and accessing robustness of classification
CN111080754B (en) Character animation production method and device for connecting characteristic points of head and limbs
CN112396042A (en) Real-time updated target detection method and system, and computer-readable storage medium
CN107369138A (en) Image based on higher order statistical model optimizes display methods
CN113807173A (en) Construction and labeling method and application system of lane line data set

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination