CN117636385A - Method for drawing display position area based on neural network model

Info

Publication number
CN117636385A
Authority
CN
China
Prior art keywords
neural network
network model
exhibition
display
booth
Prior art date
Legal status
Pending
Application number
CN202311618941.7A
Other languages
Chinese (zh)
Inventor
李天鸿 (Li Tianhong)
洪峰 (Hong Feng)
冯建永 (Feng Jianyong)
连依萍 (Lian Yiping)
陈洁 (Chen Jie)
Current Assignee
Zhongdian Zhi'an Technology Co ltd
Original Assignee
Zhongdian Zhi'an Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Zhongdian Zhi'an Technology Co ltd
Priority to CN202311618941.7A
Publication of CN117636385A
Legal status: Pending

Landscapes

  • Processing Or Creating Images (AREA)

Abstract

The invention relates to the field of computer processing, and in particular to a method for drawing booth areas based on a neural network model, comprising the following steps: 1) collecting a plurality of scans of paper exhibition-hall plans as booth map samples to form a booth map sample set; 2) preprocessing all booth map samples in the sample set to improve the computer's recognition of them; 3) extracting features from all booth map samples in the sample set to obtain a booth map feature set; 4) selecting the best-performing neural network model as the model for drawing booth area coordinates; 5) preprocessing the paper exhibition-hall plan to be processed, then using the neural network model to obtain all booth information and booth area coordinates on it. The invention uses a neural network model to recognize and process the paper version of an exhibition-hall plan and rapidly generates the corresponding online booth plan, thereby reducing the workload of technicians.

Description

Method for drawing display position area based on neural network model
Technical Field
The invention relates to the field of computer processing, and in particular to a method for drawing booth areas based on a neural network model.
Background
An exhibition (also called an exhibition activity) is a collective activity with a given theme in which many people gather in a regional space; it may be regular or irregular, organized or unorganized, and is shorthand for activities such as conferences, exhibitions, exhibition sales and trade shows. With advances in three-dimensional technology and the continuous improvement of computing and network transmission speeds, online virtual exhibitions have developed rapidly in recent years, in basically two modes: one moves a synchronous physical exhibition onto the network, making the network an online counterpart of the physical exhibition; the other builds a virtual trade hall and provides virtual business functions.
As shown in FIG. 1, an online booth plan is a booth plan drawn on a computer from the actual paper drawing. It represents the placement areas of all booths in the exhibition hall, must be drawn to match exactly the position and size of the area where each actual booth is placed, and carries the corresponding booth numbers.
The functions involved in drawing an online booth plan include adding booths, dragging booths, adjusting a booth's position on the plan, modifying booth attributes, deleting booths, quickly copying booths, and saving booths. Every drawing therefore requires a great number of tedious mouse operations by designers familiar with web-page plan drawing, which leads to the following two problems:
(1) the process is cumbersome and inefficient: taking an exhibition hall with 40 booths as an example, a drawing takes 30 to 60 minutes on average to complete;
(2) the accuracy is low: when many booths must be drawn and booth areas lie close together, overlapping or repeated drawing errors occur.
Disclosure of Invention
To address these defects of the prior art, the invention provides a method for drawing booth areas based on a neural network model, which uses a trained neural network model to recognize and process the paper version of an exhibition-hall plan and rapidly generates the corresponding online booth plan, so as to reduce the workload of technicians.
The invention is realized by the following scheme. A method for drawing booth areas based on a neural network model comprises the following steps:
1) Collect a plurality of scans of paper exhibition-hall plans as booth map samples to form a booth map sample set;
2) Preprocess all booth map samples in the booth map sample set to improve the computer's recognition of them;
3) Extract features from all booth map samples in the booth map sample set to obtain a booth map feature set;
4) Divide the booth map feature set into a training set and a test set, train the neural network model multiple times with the training set, verify the performance of the model after each training round with the test set, and select the best-performing neural network model as the model for drawing booth area coordinates;
5) After preprocessing the paper exhibition-hall plan to be processed, use the neural network model to obtain all booth information and booth area coordinates on it.
Preferably, the preprocessing includes one or more of: clearing text information, unifying image size, randomly flipping images, image segmentation, and randomly adjusting the brightness, contrast, saturation and hue of images.
Preferably, the SURF algorithm is used to extract features from all booth map samples in the booth map sample set.
Preferably, in step 4), the neural network model for drawing booth area coordinates is obtained as follows:
4-1) determine the structure of the neural network model;
4-2) initialize the parameters of the neural network model, including the model parameters, the loss function and the optimizer;
4-3) train the neural network model with the training set, adjusting the parameters several times to obtain several neural network models;
4-4) verify the performance of the neural network models obtained in step 4-3) with the test set;
4-5) select the best-performing neural network model as the model for drawing booth area coordinates.
The invention has the following advantages:
(1) through data collection, preprocessing, feature extraction, model training, model testing and optimization, post-processing, model generation, booth map processing, booth area information acquisition, booth coordinate acquisition and web-page plan output, the invention can recognize a paper plan online and draw a dynamic booth map that the user can select, configure and monitor on the exhibition web page;
(2) by classifying and recognizing each pixel, the invention extracts booth coordinate information more accurately, reduces misjudgments and missed detections, and achieves high-precision recognition of dynamic booths;
(3) using machine learning and deep learning, the invention automatically classifies and recognizes images at the pixel level, greatly reducing manual intervention while improving efficiency and accuracy; as the data set grows and techniques improve, the performance and precision of the pixel-recognition algorithm keep improving, and the method can be extended to more scenes and fields;
(4) by fusing and analyzing multiple kinds of information in an image, the invention extracts booth information more comprehensively; for example, combining object detection and localization allows the position, size and shape of booths to be detected and located more accurately;
(5) based on a pixel-recognition algorithm built on computer vision and deep learning, the invention processes and recognizes images quickly and thus obtains booth information in real time, which matters for application scenarios such as real-time monitoring and early warning on booth information.
Explanation of terms:
SVG technology: SVG (Scalable Vector Graphics) is an XML-based format for vector drawing.
Drawings
FIG. 1 is a scan of a paper version of an exhibition-hall booth map;
FIG. 2 is a flow chart of the present invention;
FIG. 3 is a schematic comparison of manual and automatic recognition efficiency versus booth complexity;
FIG. 4 is a schematic comparison of manual and automatic recognition accuracy versus booth complexity;
FIG. 5 is a schematic diagram of the ResNet neural network;
FIG. 6 shows the accuracy and loss curves of the training and validation sets in an embodiment of the present invention;
FIG. 7 is an online booth plan in an embodiment of the present invention;
FIG. 8 is a schematic view of online operation of an online booth map in an embodiment of the present invention;
FIG. 9 is a schematic diagram of a sample in the booth map feature set in an embodiment of the present invention;
FIG. 10 is an exhibition-hall plan after text removal in an embodiment of the present invention;
FIG. 11 shows the coordinates of all booth areas on the paper exhibition-hall plan to be processed, obtained with the neural network model, in an embodiment of the present invention.
Detailed Description
As shown in FIGS. 1 to 11, a method for drawing booth areas based on a neural network model comprises the following steps:
1) Collect a plurality of scans of paper exhibition-hall plans as booth map samples to form a booth map sample set;
to improve training of the neural network model, the more samples in the booth map sample set the better, but training efficiency must also be considered, so in this embodiment the booth map sample set contains 128 booth map samples;
2) Preprocess all booth map samples in the booth map sample set to improve the computer's recognition of them;
the preprocessing may include one or more of: clearing text information, unifying image size, randomly flipping images, image segmentation, and randomly adjusting image brightness, contrast, saturation and hue; the more of these are applied, the better the computer recognizes the booth map samples.
In this embodiment, the preprocessing includes clearing text information, unifying image size, randomly flipping images, image segmentation, and randomly adjusting image brightness, contrast, saturation and hue, specifically as follows:
2-1) clearing text information:
(1) since keras-ocr provides a ready-made OCR text-recognition model and an end-to-end training pipeline, this embodiment directly uses the API provided by keras-ocr to obtain the text in a sample image together with the four vertex coordinates of each text region;
(2) the region whose text must be cleared is obtained from the four vertex coordinates and is processed directly to rgb(255, 255, 255) (white) using the inpainting (repair) function provided by OpenCV (a code sketch follows);
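A minimal sketch of this text-clearing step is given below, assuming the keras-ocr and OpenCV Python packages are installed; the file names are hypothetical:

    import cv2
    import numpy as np
    import keras_ocr

    # Build the ready-made detection + recognition pipeline.
    pipeline = keras_ocr.pipeline.Pipeline()

    image = keras_ocr.tools.read('sample.png')    # hypothetical sample file
    predictions = pipeline.recognize([image])[0]  # list of (word, 4x2 vertex array)

    # Mark every detected text region in a mask.
    mask = np.zeros(image.shape[:2], dtype=np.uint8)
    for word, box in predictions:
        cv2.fillPoly(mask, [box.astype(np.int32)], 255)

    # cv2.inpaint repairs the masked regions from their surroundings; on a
    # mostly white plan this effectively turns them into rgb(255, 255, 255).
    cleaned = cv2.inpaint(image, mask, 3, cv2.INPAINT_TELEA)
    cv2.imwrite('sample_clean.png', cleaned)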
2-2) unifying image size: the sizes of all booth map samples in the booth map sample set are uniformly set to 96×96 pixels, so that every sample matches the input nodes of the neural network;
2-3) randomly flipping images:
in many image-recognition problems, flipping an image generally does not change the recognition result; randomly flipping the training images during training lets the trained model better recognize entities at different angles.
In this embodiment, the cv2.flip() function in the OpenCV library is used to randomly flip all booth map samples in the booth map sample set;
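A minimal sketch of steps 2-2) and 2-3), assuming OpenCV; the 0.5 flip probability is an assumption, since the embodiment does not state one:

    import random
    import cv2

    def preprocess_geometry(img):
        # 2-2) Unify the size: 96x96 pixels to match the network input nodes.
        img = cv2.resize(img, (96, 96))
        # 2-3) Random flip: 1 = horizontal, 0 = vertical, -1 = both axes.
        if random.random() < 0.5:
            img = cv2.flip(img, random.choice([-1, 0, 1]))
        return img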
2-4) image segmentation:
this embodiment performs image segmentation using the Canny operator, as follows (a code sketch is given after the steps below):
(1) Convert the image to grayscale.
(2) Apply Gaussian smoothing:
First, a two-dimensional Gaussian distribution matrix is generated:
p(x, y) = p(x)·p(y) = (1/(2π))·exp(−(x² + y²)/2)
(the unit-variance case; in vectorized form, with v = [x y]ᵀ, this reads p(v) = (1/(2π))·exp(−‖v‖²/2))
Then, convolving it with the gray scale image to effect image filtering:
fs(x,y)=f(x,y)*p(x,y)
(3) calculating gradient amplitude and direction:
For a function of one variable, the rate of change is its derivative; for a function of two variables, partial derivatives are used. In digital image processing, the gradient magnitude (rate of change) of the gray values is approximated with first-order finite differences.
(4) Apply non-maximum suppression (Non-Maximum Suppression, NMS) to the gradient magnitude.
For each pixel, compare its gradient magnitude with those of its neighbors along the gradient direction; a pixel is kept only if its magnitude is the local maximum in its neighborhood along that direction, otherwise it is suppressed. This step mainly excludes non-edge pixels, leaving only thin lines (candidate edges).
(5) Edges are detected and connected using a double threshold method.
Select a high threshold TH and a low threshold TL on the gradient magnitude, with TH:TL typically 2:1 or 3:1.
If the gradient magnitude at a pixel exceeds TH, the pixel is retained as an edge pixel.
If the gradient magnitude at a pixel is below TL, the pixel is excluded.
If the gradient magnitude at a pixel lies between TL and TH, the pixel is retained only if it is connected to a pixel whose gradient magnitude exceeds TH.
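A minimal sketch of this Canny segmentation step, assuming OpenCV; the file names, the 5x5 kernel and the concrete thresholds are assumptions:

    import cv2

    img = cv2.imread('booth_plan.png')             # hypothetical file name
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   # (1) grayscale conversion
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)    # (2) Gaussian smoothing
    # (3)-(5): cv2.Canny computes finite-difference gradients, applies
    # non-maximum suppression, and links edges with the double thresholds;
    # TH:TL = 150:50 keeps the recommended 3:1 ratio.
    edges = cv2.Canny(blurred, 50, 150)
    cv2.imwrite('booth_edges.png', edges)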
2-5) randomly adjusting the brightness, contrast, saturation and hue of images:
in this embodiment, the brightness and contrast of an image are randomly adjusted with the cv2.convertScaleAbs() function in the OpenCV library, and the hue and saturation are randomly adjusted via cv2.cvtColor() (converting to HSV, perturbing the H and S channels, and converting back);
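A minimal sketch of this photometric augmentation, assuming OpenCV; all jitter ranges are assumptions:

    import random
    import cv2
    import numpy as np

    def photometric_jitter(img):
        # Brightness/contrast: out = alpha*img + beta, via cv2.convertScaleAbs.
        alpha = random.uniform(0.8, 1.2)   # contrast factor (assumed range)
        beta = random.uniform(-20, 20)     # brightness offset (assumed range)
        img = cv2.convertScaleAbs(img, alpha=alpha, beta=beta)
        # Hue/saturation: to HSV via cv2.cvtColor, perturb H and S, convert back.
        hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV).astype(np.int16)
        hsv[..., 0] = (hsv[..., 0] + random.randint(-10, 10)) % 180  # OpenCV hue wraps at 180
        hsv[..., 1] = np.clip(hsv[..., 1] + random.randint(-20, 20), 0, 255)
        return cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)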
3) Extract features from all booth map samples in the booth map sample set with the SURF (Speeded Up Robust Features) algorithm to obtain a booth map feature set;
in this embodiment, the purpose of feature extraction is to extract, from all booth map samples, information useful for tasks such as image recognition, classification and retrieval. These features are typically invariant to transformations, illumination, scale and similar factors, making subsequent tasks more stable and reliable. Feature extraction converts complex image data into a representation a computer can understand and process, enabling automatic or semi-automatic image recognition.
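A minimal sketch of SURF feature extraction, assuming the opencv-contrib package; SURF is patent-encumbered and needs a build with OPENCV_ENABLE_NONFREE, and the Hessian threshold and file name are assumptions:

    import cv2

    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)

    gray = cv2.imread('booth_sample.png', cv2.IMREAD_GRAYSCALE)  # hypothetical file
    keypoints, descriptors = surf.detectAndCompute(gray, None)
    # descriptors holds one 64-dimensional vector per keypoint, which can
    # serve as this sample's entry in the booth map feature set.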
4) Divide the booth map feature set into a training set and a test set, train the neural network model multiple times with the training set, verify the performance of the model after each training round with the test set, and select the best-performing model as the neural network model for drawing booth area coordinates, specifically as follows:
4-1) defining the model structure:
this embodiment trains a ResNet18 model; ResNet18 introduces residual blocks (Residual Blocks), which can greatly improve training accuracy;
4-2) initializing model parameters:
model structure: 7x7 conv (stride=2) > 3x3 maxpool (stride=2);
model parameters: learning rate=0.01, epoch=100, mini-batch=128;
loss function: the categorical cross-entropy loss, with the following expression:
L=-∑y_true*log(y_pred)
where y_true is the true label and y_pred is the probability of model prediction.
Optimizer: SGD, Momentum = 0.9
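The patent does not name a deep-learning framework; the sketch below uses PyTorch purely for illustration, and num_classes and train_loader are assumptions:

    import torch
    import torch.nn as nn
    from torchvision.models import resnet18

    model = resnet18(weights=None, num_classes=2)  # class count assumed
    criterion = nn.CrossEntropyLoss()              # the cross-entropy loss above
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

    for epoch in range(100):                 # epoch = 100
        for images, labels in train_loader:  # assumed DataLoader, mini-batch = 128
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()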
4-3) train the neural network model ResNet18 with the training set, adjust the parameters several times, and verify with the test set the processing performance (i.e., the accuracy) of the model under the different parameters:
4-3-1) first adjustment of the model structure:
7x7 conv (stride=2) > 3x3 maxpool (stride=1);
after training, verification with the test set gives an accuracy of 80.7% for the trained model;
4-3-2) second adjustment of the model structure:
7x7 conv (stride=1) > 3x3 maxpool (stride=1);
after training, verification with the test set gives an accuracy of 82.5%;
4-3-3) adjustment of the batch_size parameter:
batch_size = 128 → batch_size = 100, i.e., the batch size of the neural network model is reduced from 128 to 100, and the learning rate LR is adjusted to 0.008;
after training, verification with the test set gives an accuracy of 84.78%;
4-3-4) third adjustment of the model structure:
3x3 conv (stride=1) > 3x3 maxpool (stride=1);
after training, verification with the test set gives an accuracy of 86.23%;
4-3-5) fourth adjustment of the model structure:
3x3 conv (stride=1) > 2x2 maxpool (stride=1);
after training, verification with the test set gives an accuracy of 88.31%;
4-3-6) further adjustment of the learning rate:
the LR of the neural network model is reduced from 0.008 to 0.004 and then 0.002; after training, verification with the test set gives an accuracy of 95.3% for the model with LR = 0.002;
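The tuning of steps 4-3-1) to 4-3-6) can be organized as a small sweep; make_model, train, evaluate and the data loaders below are hypothetical helpers standing in for the training and testing code above:

    best_acc, best_model = 0.0, None
    for lr in [0.008, 0.004, 0.002]:
        model = make_model()  # builds 3x3 conv (stride=1) > 2x2 maxpool (stride=1)
        optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
        train(model, optimizer, train_loader)
        acc = evaluate(model, test_loader)
        if acc > best_acc:
            best_acc, best_model = acc, model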
4-4) From the above results, the optimal parameters of the neural network model in this embodiment are as follows:
model structure: 3x3 conv (stride=1) > 2x2 maxpool (stride=1);
model parameters: learning rate = 0.002, training period epoch = 100, training batch mini-batch = 100;
loss function: the categorical cross-entropy loss, with the following expression:
L = -∑ y_true * log(y_pred)
where y_true is the true label and y_pred is the probability predicted by the model;
optimizer: SGD, Momentum = 0.9;
In the above steps, conv is the convolution kernel, stride the step length, maxpool the pooling window, LR the learning rate, epoch the training period, mini-batch the minimum training batch, SGD the optimizer, Momentum the gradient-descent momentum, and batch_size the training batch size.
4-5) select the best-performing neural network model as the model for drawing booth area coordinates; in this embodiment, the trained ResNet18 with the optimal parameters is adopted as the neural network model used to obtain all booth information and booth area coordinates on the paper exhibition-hall plan to be processed;
5) After applying the preprocessing of step 2) to the paper exhibition-hall plan to be processed, use the optimally parameterized ResNet18 neural network model to obtain all booth information and booth area coordinates on it.
After all booth areas have been drawn on the online booth plan from the obtained booth area coordinates using SVG technology, all booth information is matched one-to-one, so that a client can click a booth on the online booth plan in a browser to view the information related to that booth.
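A minimal sketch of this SVG output step; the coordinate tuple format, booth numbers, canvas size and colors are all assumptions:

    booths = [('A01', 40, 60, 120, 80), ('A02', 180, 60, 120, 80)]  # (number, x, y, w, h)

    parts = ['<svg xmlns="http://www.w3.org/2000/svg" width="800" height="600">']
    for number, x, y, w, h in booths:
        # One rectangle per booth area; the id ties the shape to its booth info.
        parts.append(f'<rect id="{number}" x="{x}" y="{y}" width="{w}" height="{h}" '
                     f'fill="#cfe8ff" stroke="#333"/>')
        parts.append(f'<text x="{x + w / 2}" y="{y + h / 2}" text-anchor="middle">{number}</text>')
    parts.append('</svg>')

    with open('booth_plan.svg', 'w') as f:
        f.write('\n'.join(parts))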
Experiments show that dynamic booth recognition is completed within 2 s.
According to the invention, as booth information is continuously drawn, the sample library grows and model training becomes more accurate, so the accuracy of booth recognition and booth drawing improves greatly.
The above description covers only preferred embodiments of the present invention and is not intended to limit it; those skilled in the art will appreciate that modifications that do not depart from the spirit of the invention fall within its scope.

Claims (4)

1. A method for drawing booth areas based on a neural network model, characterized by comprising the following steps:
1) collecting a plurality of scans of paper exhibition-hall plans as booth map samples to form a booth map sample set;
2) preprocessing all booth map samples in the booth map sample set;
3) extracting features from all booth map samples in the booth map sample set to obtain a booth map feature set;
4) dividing the booth map feature set into a training set and a test set, training the neural network model with the training set, and selecting the best-performing neural network model as the model for drawing booth area coordinates;
5) after preprocessing the paper exhibition-hall plan to be processed, obtaining all booth information and booth area coordinates on it using the neural network model for drawing booth area coordinates.
2. The method for drawing booth areas based on a neural network model according to claim 1, wherein the preprocessing includes one or more of: clearing text information, unifying image size, randomly flipping images, image segmentation, and randomly adjusting the brightness, contrast, saturation and hue of images.
3. The method for drawing booth areas based on a neural network model according to claim 1, wherein the SURF algorithm is used to extract features from all booth map samples in the booth map sample set.
4. The method for drawing booth areas based on a neural network model according to claim 1, wherein in step 4) the neural network model for drawing booth area coordinates is obtained as follows:
4-1) determining the structure of the neural network model;
4-2) initializing the parameters of the neural network model, including the model parameters, the loss function and the optimizer;
4-3) training the neural network model with the training set, adjusting the parameters several times to obtain several neural network models;
4-4) verifying the performance of the neural network models obtained in step 4-3) with the test set;
4-5) selecting the best-performing neural network model as the model for drawing booth area coordinates.
CN202311618941.7A 2023-11-29 2023-11-29 Method for drawing display position area based on neural network model Pending CN117636385A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311618941.7A CN117636385A (en) 2023-11-29 2023-11-29 Method for drawing display position area based on neural network model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311618941.7A CN117636385A (en) 2023-11-29 2023-11-29 Method for drawing display position area based on neural network model

Publications (1)

Publication Number Publication Date
CN117636385A true CN117636385A (en) 2024-03-01

Family

ID=90017746

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311618941.7A Pending CN117636385A (en) 2023-11-29 2023-11-29 Method for drawing display position area based on neural network model

Country Status (1)

Country Link
CN (1) CN117636385A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination