CN114821610B - Method for generating webpage code from image based on tree-shaped neural network - Google Patents

Method for generating webpage code from image based on tree-shaped neural network

Info

Publication number
CN114821610B
CN114821610B CN202210527210.0A CN202210527210A
Authority
CN
China
Prior art keywords
image
neural network
style
node unit
tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210527210.0A
Other languages
Chinese (zh)
Other versions
CN114821610A (en)
Inventor
熊仁都
谭业贵
郭晓松
宋云飞
徐玉中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Three Gorges High Technology Information Technology Co ltd
Original Assignee
Three Gorges High Technology Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Three Gorges High Technology Information Technology Co ltd filed Critical Three Gorges High Technology Information Technology Co ltd
Priority to CN202210527210.0A priority Critical patent/CN114821610B/en
Publication of CN114821610A publication Critical patent/CN114821610A/en
Application granted granted Critical
Publication of CN114821610B publication Critical patent/CN114821610B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • G06V30/41Analysis of document content
    • G06V30/412Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/2431Multiple classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/14Tree-structured documents
    • G06F40/146Coding or compression of tree-structured data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/30Creation or generation of source code
    • G06F8/34Graphical or visual programming
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/70Software maintenance or management
    • G06F8/73Program documentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/1444Selective acquisition, locating or processing of specific regions, e.g. highlighted text, fiducial marks or predetermined fields
    • G06V30/1456Selective acquisition, locating or processing of specific regions, e.g. highlighted text, fiducial marks or predetermined fields based on user interactions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/18Extraction of features or characteristics of the image
    • G06V30/1801Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections
    • G06V30/18019Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections by matching or filtering
    • G06V30/18038Biologically-inspired filters, e.g. difference of Gaussians [DoG], Gabor filters
    • G06V30/18048Biologically-inspired filters, e.g. difference of Gaussians [DoG], Gabor filters with interaction between the responses of different filters, e.g. cortical complex cells
    • G06V30/18057Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19173Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • G06V30/41Analysis of document content
    • G06V30/413Classification of content, e.g. text, photographs or tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • G06V30/41Analysis of document content
    • G06V30/414Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text

Abstract

The invention provides a method for generating webpage code from an image based on a tree-shaped neural network, which comprises: processing the images in a target webpage user interface to obtain a UI screenshot image set; performing feature extraction on the UI screenshot images in the UI screenshot image set with a preset convolutional neural network to obtain a feature vector data set of the UI screenshot images; performing multi-label classification on the feature vector data set with an improved tree-shaped long short-term memory neural network model to obtain a classification result; and decoding the classification result to generate the HTML code of the target webpage user interface. Based on a deep learning technique that uses the improved tree-shaped neural network to generate HTML code from UI screenshot images, the method improves the generation quality of deeply nested HTML code, saves the time of writing code by hand, and improves the efficiency of HTML code development.

Description

Method for generating webpage code from image based on tree-shaped neural network
Technical Field
The invention relates to the technical field of computers, in particular to a method for generating a webpage code from an image based on a tree neural network.
Background
With the rapid development and growing popularity of the Internet, web pages have become an important source of information. Converting the graphical user interface screenshots created by designers into computer code, and using that code to build customized software, websites and mobile applications, is an important task for developers. Traditional webpage code generation techniques require a large amount of manual work, offer little flexibility, and serve only as aids to coding. Current deep-learning methods that generate webpage code from a UI can only produce HTML nested one or two levels deep: they do not consider the natural tree structure of HTML code and cannot generate more deeply nested code.
Therefore, a method for generating web page codes from images based on a tree-like neural network is needed.
Disclosure of Invention
The invention provides a method for generating webpage code from images based on a tree-shaped neural network. Built on a deep learning technique that uses an improved tree-shaped neural network, the method generates HTML code from UI screenshot images, thereby improving the generation quality of deeply nested HTML code, saving the time of manual code writing, and improving the efficiency of HTML code development.
The invention provides a method for generating a webpage code from an image based on a tree neural network, which comprises the following steps:
s1, processing images in a target webpage user interface, and acquiring a UI screenshot image set;
s2, based on a preset convolutional neural network, performing feature extraction on the UI screenshot images in the UI screenshot image set to obtain a feature vector data set of the UI screenshot images;
s3, based on the improved tree-shaped long short-term memory neural network model, performing multi-label classification on the feature vector data set to obtain a classification result;
and S4, decoding the classification result to generate an HTML code of the target webpage user interface.
Further, S2 includes:
s201, setting a convolutional neural network; the convolutional neural network comprises an input layer, convolutional layers and an output layer; each convolutional layer is followed by a normalization layer, uses the rectified linear unit (ReLU) function as its activation function, and has its weights initialized from a standard normal distribution; five convolutional layers are used; the convolution kernels of the first, second and third convolutional layers are all set to 5×5, and the kernels of the fourth and fifth layers are all set to 3×3;
and S202, inputting the UI screenshot images in the UI screenshot image set into the convolutional neural network to obtain a feature vector data set of the UI screenshot images.
Further, S3 includes:
s301, constructing a tree-shaped neural network classification model based on the improved long short-term memory neural network model;
s302, classifying the feature vector data set based on the tree neural network classification model, and adding CSS attribute tags to obtain a tagged classification result.
Further, S4 includes:
s401, performing inverse mapping on the labelled classification result to obtain styled HTML element code;
s402, inserting the HTML element codes with the styles into corresponding node positions in a Document Object Model (DOM) tree structure one by one to generate HTML codes of corresponding UI screenshot images;
and S403, after all the HTML element codes with styles are inserted into the corresponding node positions, generating HTML codes of the target webpage user interface according to the document object model tree structure.
Further, S301 includes:
s3011, obtaining a historical characteristic vector data set of a UI screenshot image, and dividing the historical characteristic vector data set into a training set and a verification set;
s3012, constructing an improved long short-term memory neural network model; the improved model comprises a node unit, descendant node units of the node unit, an input gate, a forget gate, a first output gate and a second output gate; the Sigmoid activation function is set as the first processing mode, and the tanh activation function as the second processing mode;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment in the first processing mode to obtain a forget gate;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment in the first processing mode to obtain an input gate;
processing the input of the current node unit at the current moment and the input hidden layer state of the current node by adopting a second processing mode to obtain the input state of the current node unit at the current moment;
multiplying the output cell state of the previous node unit at the previous moment element-wise by the forget gate to obtain a first product; multiplying the input state of the current node unit at the current moment element-wise by the input gate to obtain a second product; summing the first product and the second product to obtain the cell state of the current node unit at the current moment;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain a first output gate;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain a second output gate;
processing the cell state of the current node unit at the current moment by adopting a second processing mode, and performing product operation on a processing result and the first output gate to obtain the hidden layer state of the next node unit at the current moment;
processing the cell state of the current node unit at the current moment by adopting a second processing mode, and performing product operation on a processing result and the second output gate to obtain the initial cell state of a descendant node unit of the current node unit at the current moment;
processing the hidden layer state of the next node unit at the current moment by adopting a first processing mode to obtain an output result;
s3013, inputting the training set into the improved long short-term memory neural network model for training, and generating the tree-shaped neural network classification model.
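As a concrete illustration of the node unit defined in S3012, the gate computations can be sketched in plain Python. The class name `TreeLSTMCell`, the dimensions, and the reuse of the S201 weight initialization are assumptions for illustration; biases and the training procedure are omitted.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

class TreeLSTMCell:
    """One node unit of the modified tree-LSTM sketched in S3012: a forget
    gate, an input gate, and TWO output gates -- the first yields the hidden
    state handed to the next node unit, the second yields the initial cell
    state for the node's descendant units."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rnd = random.Random(seed)
        def weights():
            # standard-normal init with std 0.1, mirroring S201 (an assumption
            # here -- the text does not give the tree-LSTM's init scheme)
            return [[rnd.gauss(0.0, 0.1) for _ in range(input_dim + hidden_dim)]
                    for _ in range(hidden_dim)]
        self.Wf, self.Wi, self.Wg = weights(), weights(), weights()
        self.Wo1, self.Wo2 = weights(), weights()

    def step(self, x, h_prev, c_prev):
        z = list(x) + list(h_prev)                      # input + hidden state
        f = [sigmoid(a) for a in matvec(self.Wf, z)]    # forget gate
        i = [sigmoid(a) for a in matvec(self.Wi, z)]    # input gate
        g = [math.tanh(a) for a in matvec(self.Wg, z)]  # input state
        c = [fj * cj + ij * gj                          # element-wise blend
             for fj, cj, ij, gj in zip(f, c_prev, i, g)]
        o1 = [sigmoid(a) for a in matvec(self.Wo1, z)]  # first output gate
        o2 = [sigmoid(a) for a in matvec(self.Wo2, z)]  # second output gate
        h_next = [o * math.tanh(cj) for o, cj in zip(o1, c)]   # next node unit
        c_child = [o * math.tanh(cj) for o, cj in zip(o2, c)]  # descendants
        return h_next, c, c_child
```

The second output gate is what makes the cell tree-shaped: `c_child` seeds the descendant node units while `h_next` and `c` flow along the sibling chain.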
Further, S302 includes:
s3021, creating a first blank queue and a first root node Document Object Model (DOM) tree structure;
s3022, acquiring characteristics of the first UI screenshot image in the verification set;
s3023, inputting the characteristics of the first UI screenshot image into a tree-shaped neural network classification model initial node unit;
s3024, classifying the features to obtain classification results of the features, and obtaining initial cell states of descendant node units of the initial node units;
s3025, judging whether the classification is finished; if finished, judging whether the first queue is empty: if the queue is empty, performing inverse mapping on the first root node Document Object Model (DOM) tree structure to obtain the HTML code; if the queue is not empty, dequeuing a feature from the first queue and returning to step S3023 to process that feature; if the classification is not finished, going to S3026;
s3026, performing inverse mapping on the classification result to obtain an HTML tag code, and inserting the HTML tag code into the corresponding position of the first root node Document Object Model (DOM) tree structure; meanwhile, adding the features of the initial node unit to the first queue buffer, generating the initial cell state of the next node unit of the initial node unit, and going to S3024 to continue the classification operation.
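The queue-driven generation loop of S3021-S3026 can be sketched as follows. Here `classify` is a hypothetical stand-in for the tree-shaped classification model (returning a tag plus descendant features, or `None` when a branch is finished), and a flat indented list stands in for the root-node DOM tree.

```python
from collections import deque

def generate_html(root_feature, classify):
    """Breadth-first sketch of S3021-S3026. `classify(feature)` must
    return (tag, child_features); tag is None when classification of
    this branch is finished."""
    queue = deque()                      # S3021: the first (initially blank) queue
    dom = []                             # flat stand-in for the root DOM tree
    queue.append((root_feature, 0))      # feature plus its nesting depth
    while queue:                         # S3025: loop until the queue is empty
        feature, depth = queue.popleft()
        tag, children = classify(feature)
        if tag is None:
            continue                     # this branch is finished
        dom.append("  " * depth + tag)   # S3026: insert tag code into the tree
        for child in children:           # buffer descendant features
            queue.append((child, depth + 1))
    return "\n".join(dom)
```

A toy `classify` that maps a root feature to a `<div>` with two children yields the nested tag listing in breadth-first order.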
Further, the step S302 of adding CSS attribute tags includes:
s302-1, obtaining the text of the CSS (cascading style sheet) referenced by the target webpage, and obtaining a text style set;
s302-2, screening the text style set, selecting a character string style and a symbol style in a text, and generating a first style set;
s302-3, acquiring the styles in the text style set whose text names contain a hyphen, and segmenting each such style: discarding the content after the hyphen in the text name, keeping the content before the hyphen, taking the kept content as the new style of the text name, and renaming the text name to generate a new name text; summarizing the new name texts to generate a second style set;
s302-4, deleting the first style set and the second style set from the text style set to obtain a third style set;
s302-5, inputting the third style set into the tree-like neural network classification model to obtain a CSS style classification result, and setting the CSS style classification result as a classification label.
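The set arithmetic of S302-2 through S302-4 can be sketched as below. One point is an interpretation: S302-4 says to delete the second style set from the full set, but the second set holds renamed (truncated) names, so this sketch removes the original hyphenated names instead. All identifiers here are hypothetical.

```python
def screen_styles(style_names, string_styles):
    """Sketch of S302-2 .. S302-4. `style_names` is the full set of CSS
    style names pulled from the referenced stylesheet; `string_styles`
    is the subset judged to be character-string/symbol styles (S302-2)."""
    first = set(string_styles)                          # first style set
    second = {name.split("-", 1)[0]                     # keep text before hyphen
              for name in style_names if "-" in name}   # second style set (S302-3)
    # S302-4: remove both subsets from the full set (interpreted as removing
    # the original hyphenated names rather than their renamed forms)
    third = set(style_names) - first - {n for n in style_names if "-" in n}
    return first, second, third
```

The resulting third style set is what S302-5 feeds into the tree-shaped classification model.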
Further, S1 includes:
s101, converting the images in the target webpage user interface into grayscale images;
s102, binarizing the grayscale images based on Canny operator edge detection to obtain binarized images;
s103, sliding a preset window over the binarized images to intercept screenshot images, and summarizing the screenshot images to obtain the UI screenshot image set.
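A dependency-free sketch of S101-S103. A simple gradient threshold stands in for the Canny operator (a real implementation would use an actual Canny edge detector, e.g. from an image library), and images are plain nested lists; window size and stride are illustrative.

```python
def to_gray(img):
    """S101: img is a list of rows of (r, g, b) tuples -> luminance grayscale."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in img]

def binarize_by_edges(gray, thresh=32.0):
    """Simplified stand-in for Canny-based binarization (S102): mark a pixel 1
    where the horizontal or vertical intensity jump exceeds `thresh`."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = abs(gray[y][x] - gray[y][x - 1]) if x else 0.0
            gy = abs(gray[y][x] - gray[y - 1][x]) if y else 0.0
            out[y][x] = 1 if max(gx, gy) > thresh else 0
    return out

def sliding_windows(binary, win, stride):
    """S103: slide a win x win window over the binarized image and collect
    every crop into the UI screenshot image set."""
    h, w = len(binary), len(binary[0])
    return [[row[x:x + win] for row in binary[y:y + win]]
            for y in range(0, h - win + 1, stride)
            for x in range(0, w - win + 1, stride)]
```

On a tiny half-black/half-white image the edge map lights up exactly at the boundary column, and a 2x2 window with stride 2 yields four crops.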
Further, the method also comprises step S5: verifying the accuracy of the generated HTML code of the target webpage user interface, with the following specific steps:
s501, acquiring a webpage source code corresponding to a target webpage user interface;
s502, capturing content in the webpage source code to obtain a webpage source code content character string;
s503, according to the webpage source code content character strings, constructing a character string error category library, and creating a plurality of global character string pattern matching functions, wherein the error categories correspond to corresponding global character string pattern matching functions;
s504, matching the HTML code of the target webpage user interface against the global character string pattern matching functions, and verifying the accuracy of the HTML code.
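A minimal sketch of S503-S504 using regular expressions as the global string pattern matching functions. The error categories and patterns below are hypothetical examples; the method as described would build its category library from the captured source-code strings of the target page (S502).

```python
import re

# Hypothetical error-category library (S503): each category maps to a
# global (multi-match) string pattern.
ERROR_PATTERNS = {
    "unclosed_div": re.compile(r"<div\b[^>]*>(?![\s\S]*</div>)"),  # <div> with no </div> after it
    "bad_style_attr": re.compile(r'style="[^"]*;;'),               # doubled semicolon in style
    "empty_tag_name": re.compile(r"<\s*>"),                        # tag with no name
}

def verify_html(html):
    """S504: match the generated HTML against every pattern and return the
    error categories that fired; an empty list means verification passed."""
    return [name for name, pat in ERROR_PATTERNS.items() if pat.search(html)]
```

Well-formed output returns an empty list, while a truncated or malformed fragment names the matching error category.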
Further, the method also comprises step S6: taking screenshots of the UI images, with the following specific steps:
s601, obtaining size information of a UI image in a target webpage user interface to obtain minimum size information;
s602, initializing a display proportion of a target webpage, and setting the target webpage to be displayed in a 100% proportion;
s603, constructing a visual table mask on the target webpage, wherein the visual table mask covers the whole target webpage, and the size of each cell in the visual table mask is set to 1/16 of the minimum size information;
s604, adjusting the display proportion of the target webpage, and adjusting the target webpage to be displayed according to 400% proportion;
s605, using the visual table mask, acquiring the table area corresponding to the UI image in the target webpage user interface and the cell positions contained in that area; according to the cell positions, moving the mouse pointer to the starting cell position of the UI image and pressing the left mouse button to generate a sliding window that intercepts the UI image; when the mouse pointer reaches the ending cell position of the UI image, the interception of the UI image is finished.
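The geometry of S603-S605 can be sketched as a pure function. Treating the drag selection as inclusive of both cells, using (row, col) cell coordinates, and omitting the 100%-to-400% display-scale handling (S602/S604) are simplifying assumptions.

```python
def capture_rect(min_size, start_cell, end_cell):
    """Sketch of S603-S605: the mask's cell edge is 1/16 of the smallest
    UI-image dimension (S603); a drag from start_cell to end_cell
    (inclusive) selects the crop rectangle in page pixels."""
    cell = min_size / 16.0                       # S603: cell edge length
    (r0, c0), (r1, c1) = start_cell, end_cell
    left, top = c0 * cell, r0 * cell             # S605: press at the start cell
    width = (c1 - c0 + 1) * cell                 # release at the end cell
    height = (r1 - r0 + 1) * cell
    return left, top, width, height
```

For example, with a minimum dimension of 160 px the cell edge is 10 px, so dragging from cell (0, 0) to cell (3, 3) crops a 40 x 40 region at the origin.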
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a schematic diagram of a method for generating a web page code from an image based on a tree neural network according to the present invention;
FIG. 2 is a schematic diagram of an implementation process of generating a web page code from an image based on a tree neural network according to the present invention;
FIG. 3 is a schematic diagram of a method for classification based on an improved tree-like long-short term memory neural network model according to the present invention;
FIG. 4 is a diagram of the tree neural network classification model of the present invention;
FIG. 5 is a schematic diagram of the HTML code encoding method of the present invention;
FIG. 6 is a schematic diagram illustrating a method for verifying the accuracy of HTML codes of a generated target webpage page according to the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it should be understood that they are presented herein only to illustrate and explain the present invention and not to limit the present invention.
A method for generating web page code from an image based on a tree neural network, as shown in fig. 1, includes:
s1, processing images in a target webpage user interface, and acquiring a UI screenshot image set;
s2, based on a preset convolutional neural network, performing feature extraction on the UI screenshot images in the UI screenshot image set to obtain a feature vector data set of the UI screenshot images;
s3, based on the improved tree-shaped long short-term memory neural network model, performing multi-label classification on the feature vector data set to obtain a classification result;
and S4, decoding the classification result to generate an HTML code of the target webpage user interface.
The working principle of the technical scheme is as follows: the main content of a web page is built with the HTML language, and traditional webpage code generation techniques require developers to formulate detailed code generation rules or write templates. Deep-learning techniques that generate webpage code end to end from UI images produce the corresponding HTML code from the attribute features of the UI images, which improves working efficiency. The solution adopted in this embodiment is shown in fig. 2:
the method comprises the steps that firstly, images in a target webpage user interface are processed, and a UI screenshot image set is obtained;
secondly, based on a preset convolutional neural network, performing feature extraction on the UI screenshot images in the UI screenshot image set to obtain a feature vector data set of the UI screenshot images;
thirdly, based on an improved tree-shaped long short-term memory neural network model, performing multi-label classification on the feature vector data set to obtain a classification result;
the fourth step is decoding the classification result: inversely mapping it into HTML tag code, inserting the tag code into the corresponding position of the Document Object Model (DOM) tree structure, and checking whether all nodes have been generated; when not all nodes are generated, the third step continues; when all nodes are generated, the generation process ends.
The beneficial effects of the above technical scheme are: by adopting the scheme provided by this embodiment, HTML code is generated from UI screenshot images based on the improved tree-shaped long short-term memory neural network model, improving the generation quality of deeply nested HTML code, saving the time of manual code writing, and improving the efficiency of HTML code development.
In one embodiment, S2 comprises:
s201, setting a convolutional neural network; the convolutional neural network comprises an input layer, convolutional layers and an output layer; each convolutional layer is followed by a normalization layer, uses the rectified linear unit (ReLU) function as its activation function, and has its weights initialized from a standard normal distribution; five convolutional layers are used; the convolution kernels of the first, second and third convolutional layers are all set to 5×5, and the kernels of the fourth and fifth layers are all set to 3×3;
and S202, inputting the UI screenshot images in the UI screenshot image set into the convolutional neural network to obtain a feature vector data set of the UI screenshot images.
The working principle of the technical scheme is as follows: the convolutional neural network has the characteristics of parameter sharing, sparse connections and translation invariance, and is chosen to process the images on that basis. To prevent over-fitting, a normalization layer is placed after each convolutional layer. To make the network outputs approximately follow a standard normal distribution, the convolutional weights are initialized from a normal distribution with mean 0 and standard deviation 0.1, and all biases are initialized to 0. The rectified linear unit (ReLU), widely used in convolutional neural networks, is adopted as the activation function. Based on the learning capacity of the convolutional neural network and the volume of the UI screenshot feature vector data set, the number of convolutional layers is set to five in this embodiment; to capture large-scale structural information in the image, the kernels of the first three convolutional layers are set to 5×5, and to better capture image detail, the kernels of the last two layers are set to 3×3. After the network is configured, the UI screenshot images in the UI screenshot image set are input into the convolutional neural network to obtain the feature vector data set of the UI screenshot images.
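The layer configuration described above can be captured declaratively, together with the standard convolution output-size formula. Stride and padding are not stated in the text, so the defaults below are assumptions.

```python
# Layer spec mirroring S201: five conv layers, 5x5 kernels for the first
# three and 3x3 for the last two, each followed by batch-norm + ReLU.
CONV_LAYERS = [
    {"kernel": 5}, {"kernel": 5}, {"kernel": 5},
    {"kernel": 3}, {"kernel": 3},
]

def feature_map_size(input_size, layers, stride=1, padding=0):
    """Spatial size after the conv stack, applying the usual
    out = (in + 2*pad - kernel) // stride + 1 per layer."""
    size = input_size
    for layer in layers:
        size = (size + 2 * padding - layer["kernel"]) // stride + 1
    return size
```

With stride 1 and no padding, a 64-pixel input shrinks by 4 pixels per 5x5 layer and by 2 per 3x3 layer; in a deep-learning framework the same spec would translate directly into a stack of conv + batch-norm + ReLU layers.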
The beneficial effects of the above technical scheme are: by adopting the scheme provided by this embodiment, configuring the convolutional neural network in this way ensures good network performance and improves the quality and efficiency of UI screenshot image feature extraction.
In one embodiment, as shown in fig. 3, S3 includes:
s301, constructing a tree-shaped neural network classification model based on the improved long short-term memory neural network model;
s302, classifying the feature vector data set based on the tree neural network classification model, and adding CSS attribute tags to obtain a tagged classification result.
The working principle of the technical scheme is as follows: the long short-term memory neural network model is a recurrent artificial neural network model inspired by the human nervous system. It consists of many node units arranged in three layers: an input layer, a hidden layer and an output layer. The input layer receives the input and has the same size as the feature vector; the hidden layer maps the input to the output through a highly nonlinear transformation; the output layer produces the final result. The method builds a tree-shaped neural network classification model by improving the long short-term memory neural network model, classifies the feature vector data set with that model, and adds CSS attribute tags to obtain a labelled classification result.
The beneficial effects of the above technical scheme are: by adopting the scheme provided by the embodiment, the long-term and short-term memory neural network model is adopted to construct the tree-like neural network classification model, and the CSS attribute tags are added, so that the quality of the UI screenshot image feature classification can be improved.
In one embodiment, S4 comprises:
s401, performing inverse mapping on the labelled classification result to obtain styled HTML element code;
s402, inserting the HTML element codes with the styles into corresponding node positions in a Document Object Model (DOM) tree structure one by one to generate HTML codes of corresponding UI screenshot images;
and S403, after all the HTML element codes with styles are inserted into the corresponding node positions, generating HTML codes of the target webpage user interface according to the document object model tree structure.
The working principle of the technical scheme is as follows: after the classification result is obtained, inverse mapping is performed according to the classification structure to obtain the HTML element code. For example, the CSS label classification results 01, 10 and 11 are inversely mapped to the styles color:red, color:blue and color:green; the content classification results 01, 10 and 11 are inversely mapped to width:33%, width:66% and width:100%; the HTML code corresponding to the classification result 0110 is therefore <div style="color:red;width:66%">. After the HTML codes are obtained, they are inserted one by one into the corresponding node positions in the Document Object Model (DOM) tree structure to generate the HTML code of the corresponding UI screenshot image; after all styled HTML element codes are inserted into their corresponding node positions, the HTML code of the target webpage user interface is generated according to the document object model tree structure.
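The inverse mapping in this example can be sketched directly. The two-bit label tables mirror the mappings quoted above (01/10/11 for colour and for width); the function name and the assumption that a label is colour bits followed by width bits are illustrative.

```python
# Hypothetical label tables mirroring the example in the text: the first
# two bits select the CSS colour, the last two bits select the width.
CSS_MAP = {"01": "color:red", "10": "color:blue", "11": "color:green"}
WIDTH_MAP = {"01": "width:33%", "10": "width:66%", "11": "width:100%"}

def decode_label(label):
    """S401-style inverse mapping: a 4-bit classification result such as
    '0110' becomes a styled HTML element code."""
    style = CSS_MAP[label[:2]] + ";" + WIDTH_MAP[label[2:]]
    return '<div style="%s">' % style
```

Each decoded element code would then be inserted at its node position in the DOM tree as described in S402.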
The beneficial effects of the above technical scheme are: by adopting the scheme provided by the embodiment, the HTML element code with the style is obtained by inverse mapping; and inserted into a corresponding node position in a Document Object Model (DOM) tree structure, the accuracy of the generated HTML code can be improved.
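The inverse mapping in S401 can be pictured as a pair of lookup tables. The sketch below uses the example mapping from the description (01/10/11 to red/blue/green and to 33%/66%/100%); the table contents and the helper name are illustrative, not part of the patented method:

```python
# Hypothetical inverse-mapping tables for S401, following the example in the
# text: the first two label bits select a CSS color, the last two a width.
STYLE_MAP = {"01": "color: red", "10": "color: blue", "11": "color: green"}
WIDTH_MAP = {"01": "width: 33%", "10": "width: 66%", "11": "width: 100%"}

def label_to_html(label: str, tag: str = "div") -> str:
    """Inverse-map a 4-bit classification label to a styled HTML element."""
    style_bits, width_bits = label[:2], label[2:]
    style = "; ".join([STYLE_MAP[style_bits], WIDTH_MAP[width_bits]])
    return f'<{tag} style="{style}"></{tag}>'

print(label_to_html("0110"))  # <div style="color: red; width: 66%"></div>
```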
In one embodiment, S301 includes:
s3011, obtaining a historical characteristic vector data set of a UI screenshot image, and dividing the historical characteristic vector data set into a training set and a verification set;
s3012, constructing an improved long short-term memory neural network model; the improved long short-term memory neural network model comprises a node unit, descendant node units of the node unit, an input gate, a forgetting gate, a first output gate and a second output gate; setting a Sigmoid activation function as a first processing mode, and setting a tanh activation function as a second processing mode;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain a forgetting gate;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain an input gate;
processing the input of the current node unit at the current moment and the input hidden layer state of the current node by adopting a second processing mode to obtain the input state of the current node unit at the current moment;
multiplying the output cell state of the last node unit at the last moment by a forgetting gate according to elements to obtain a first product; multiplying the state input by the current node unit at the current moment by an input gate according to elements to obtain a second product; adding and summing the first product and the second product to obtain the cell state of the current node unit at the current moment;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain a first output gate;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain a second output gate;
processing the cell state of the current node unit at the current moment by adopting a second processing mode, and performing product operation on a processing result and the first output gate to obtain the hidden layer state of the next node unit at the current moment;
processing the cell state of the current node unit at the current moment by adopting a second processing mode, and performing product operation on a processing result and the second output gate to obtain the initial cell state of a descendant node unit of the current node unit at the current moment;
processing the hidden layer state of the next node unit at the current moment by adopting a first processing mode to obtain an output result;
and S3013, inputting the verification set into the improved long short-term memory neural network model for training to generate a tree-like neural network classification model.
The working principle of the technical scheme is as follows: the long short-term memory model protects and controls the cell state by introducing gate control units and memory cells, namely an input gate, a forgetting gate and an output gate, thereby alleviating the vanishing-gradient problem of recurrent neural networks. The model is trained through forward propagation and backward propagation; in forward propagation, the node units of each layer receive the activation values of the node units of the previous layer as input, the activation values of the layer are calculated through a weight matrix, a bias and the Sigmoid activation function, and the output value is calculated through the tanh activation function. The specific steps of the embodiment are as follows:
s3011, obtaining a historical characteristic vector data set of a UI screenshot image, and dividing the historical characteristic vector data set into a training set and a verification set;
s3012, constructing an improved long short-term memory neural network model, wherein the improved model comprises node units, descendant node units of the node units, an input gate, a forgetting gate, a first output gate and a second output gate, as shown in figure 4; setting the Sigmoid activation function as the first processing mode, where the function outputs a value between 0 and 1 to describe how much of each component is allowed to pass, 0 meaning 'let nothing through' and 1 meaning 'let everything through'; setting the tanh activation function as the second processing mode;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain a forgetting gate; the calculation formula is as follows:
f_t = σ(W_if · x_t + b_if + W_hf · h_(t−1) + b_hf)
in the above formula, f_t is the forget gate, used to determine how much of the node unit state at the previous time is retained at the current time; x_t is the input of the current node unit at the current time; W_if and b_if are the weight matrix and bias corresponding to x_t; h_(t−1) is the input hidden layer state of the current node; W_hf and b_hf are the weight matrix and bias corresponding to h_(t−1); σ denotes the Sigmoid function; t is the time value;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain an input gate; the calculation formula is as follows:
i_t = σ(W_ii · x_t + b_ii + W_hi · h_(t−1) + b_hi)
in the above formula, i_t is the input gate, used to determine how much of the input of the current node unit is saved to the cell state; x_t is the input of the current node unit at the current time; W_ii and b_ii are the weight matrix and bias corresponding to x_t; h_(t−1) is the input hidden layer state of the current node; W_hi and b_hi are the weight matrix and bias corresponding to h_(t−1); σ denotes the Sigmoid function; t is the time value;
processing the input of the current node unit at the current moment and the input hidden layer state of the current node by adopting a second processing mode to obtain the input state of the current node unit at the current moment; the calculation formula is as follows:
g_t = tanh(W_ig · x_t + b_ig + W_hg · h_(t−1) + b_hg)
in the above formula, g_t is the state input of the current node unit at the current time; x_t is the input of the current node unit at the current time; W_ig and b_ig are the weight matrix and bias corresponding to x_t; h_(t−1) is the input hidden layer state of the current node; W_hg and b_hg are the weight matrix and bias corresponding to h_(t−1); tanh refers to the tanh activation function; t is the time value;
multiplying the output cell state of the last node unit at the last moment by the forgetting gate according to the element to obtain a first product; multiplying the state input by the current node unit at the current moment by an input gate according to elements to obtain a second product; adding and summing the first product and the second product to obtain the cell state of the current node unit at the current moment; the calculation formula is as follows:
c_t = f_t * c_(t−1) + i_t * g_t
in the above formula, g_t is the state input of the current node unit at the current time; i_t is the input gate; f_t is the forget gate; c_(t−1) is the cell state of the node unit at the previous time; c_t is the cell state of the current node unit at the current time; t is the time value;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain a first output gate; the calculation formula is as follows:
o_t = σ(W_io · x_t + b_io + W_ho · h_(t−1) + b_ho)
in the above formula, o_t is the first output gate; x_t is the input of the current node unit at the current time; W_io and b_io are the weight matrix and bias corresponding to x_t; h_(t−1) is the input hidden layer state of the current node; W_ho and b_ho are the weight matrix and bias corresponding to h_(t−1); σ denotes the Sigmoid function; t is the time value;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain a second output gate; the calculation formula is as follows:
s_t = σ(W_is · x_t + b_is + W_hs · h_(t−1) + b_hs)
in the above formula, s_t is the second output gate; x_t is the input of the current node unit at the current time; W_is and b_is are the weight matrix and bias corresponding to x_t; h_(t−1) is the input hidden layer state of the current node; W_hs and b_hs are the weight matrix and bias corresponding to h_(t−1); σ denotes the Sigmoid function; t is the time value;
processing the cell state of the current node unit at the current moment by adopting a second processing mode, and performing product operation on a processing result and the first output gate to obtain the hidden layer state of the next node unit at the current moment; the calculation formula is as follows:
h_t = o_t * tanh(c_t)
in the above formula, h_t is the hidden layer state of the next node unit at the current time; o_t is the first output gate; c_t is the cell state of the current node unit at the current time; tanh refers to the tanh activation function; t is the time value;
processing the cell state of the current node unit at the current moment by adopting a second processing mode, and performing product operation on a processing result and the second output gate to obtain the initial cell state of a descendant node unit of the current node unit at the current moment; the calculation formula is as follows:
c_t0 = s_t * tanh(c_t)
in the above formula, c_t0 is the initial cell state of the descendant node unit of the current node unit at the current time; s_t is the second output gate; c_t is the cell state of the current node unit at the current time; tanh refers to the tanh activation function; t is the time value;
processing the hidden layer state of the next node unit at the current moment by adopting a first processing mode to obtain an output result; the calculation formula is as follows:
y_t = σ(W_classify · h_t + b_classify)
in the above formula, y_t is the classification result; h_t is the hidden layer state of the next node unit at the current time; W_classify and b_classify are the weight matrix and bias corresponding to h_t; σ denotes the Sigmoid function; t is the time value;
s3013, inputting the verification set into the improved long short-term memory neural network model for training, and generating a tree-like neural network classification model.
The beneficial effects of the above technical scheme are: by adopting the scheme provided by this embodiment, the improved long short-term memory neural network model with the added output gate facilitates the construction of the tree-like neural network classification model and improves classification efficiency.
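A minimal numerical sketch of the modified cell described in S3012, implementing the gate equations above; the parameter dictionary layout, random initialization and dimensions are assumptions for illustration, not the patent's implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tree_lstm_step(x_t, h_prev, c_prev, P):
    """One step of the modified LSTM node unit with two output gates.

    Follows the equations f_t, i_t, g_t, c_t of a standard LSTM, plus a
    second output gate s_t that produces the initial cell state c_t0 of a
    descendant node unit.
    """
    f_t = sigmoid(P["W_if"] @ x_t + P["b_if"] + P["W_hf"] @ h_prev + P["b_hf"])
    i_t = sigmoid(P["W_ii"] @ x_t + P["b_ii"] + P["W_hi"] @ h_prev + P["b_hi"])
    g_t = np.tanh(P["W_ig"] @ x_t + P["b_ig"] + P["W_hg"] @ h_prev + P["b_hg"])
    c_t = f_t * c_prev + i_t * g_t                   # cell state of current unit
    o_t = sigmoid(P["W_io"] @ x_t + P["b_io"] + P["W_ho"] @ h_prev + P["b_ho"])
    s_t = sigmoid(P["W_is"] @ x_t + P["b_is"] + P["W_hs"] @ h_prev + P["b_hs"])
    h_t = o_t * np.tanh(c_t)                         # hidden state for next unit
    c_t0 = s_t * np.tanh(c_t)                        # seed state for descendant unit
    y_t = sigmoid(P["W_classify"] @ h_t + P["b_classify"])  # classification output
    return h_t, c_t, c_t0, y_t

rng = np.random.default_rng(0)
d_in, d_h, d_out = 4, 3, 2                           # illustrative sizes
P = {}
for g in "figos":                                    # forget, input, g-state, output 1, output 2
    P[f"W_i{g}"] = rng.normal(size=(d_h, d_in)); P[f"b_i{g}"] = np.zeros(d_h)
    P[f"W_h{g}"] = rng.normal(size=(d_h, d_h));  P[f"b_h{g}"] = np.zeros(d_h)
P["W_classify"] = rng.normal(size=(d_out, d_h)); P["b_classify"] = np.zeros(d_out)

h_t, c_t, c_t0, y_t = tree_lstm_step(rng.normal(size=d_in),
                                     np.zeros(d_h), np.zeros(d_h), P)
```

Because every gate is a Sigmoid output, h_t and c_t0 are bounded by the tanh of the cell state, and y_t lies strictly in (0, 1).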
In one embodiment, as shown in fig. 5, S302 includes;
s3021, creating a first blank queue and a first root node Document Object Model (DOM) tree structure;
s3022, acquiring characteristics of the first UI screenshot image in the verification set;
s3023, inputting the characteristics of the first UI screenshot image into a tree-shaped neural network classification model initial node unit;
s3024, classifying the features to obtain classification results of the features, and obtaining initial cell states of descendant node units of the initial node units;
s3025, judging whether the classification is finished, if so, judging whether the first blank queue is empty, and if so, inversely mapping the Document Object Model (DOM) tree structure of the first root node to obtain an HTML code; if not, dequeuing a feature from the first blank queue, and going to the step S3023 for the feature to perform an operation; if the classification is not finished, the step goes to S3026;
s3026, inverse mapping is carried out on the classification result to obtain an HTML tag code, and the HTML tag code is inserted into a position corresponding to a Document Object Model (DOM) tree structure of the first root node; and simultaneously adding the characteristics of the initial node unit into the first blank queue buffer, generating the initial cell state of the next node unit of the initial node unit, and turning to S3024 to continue the classification operation.
The working principle of the technical scheme is as follows: the Document Object Model (DOM) provides a programming interface for HTML documents; any HTML document can be represented as a structure of nodes on multiple levels. This embodiment edits the HTML code based on the document object model and queue operations, and specifically includes:
s302 comprises the following steps;
s3021, creating a first blank queue and a first root node Document Object Model (DOM) tree structure;
s3022, acquiring characteristics of the first UI screenshot image in the verification set;
s3023, inputting the characteristics of the first UI screenshot image into a tree-shaped neural network classification model initial node unit;
s3024, classifying the features to obtain classification results of the features, and obtaining initial cell states of descendant node units of the initial node units;
s3025, judging whether the classification is finished, if so, judging whether the first blank queue is empty, and if so, inversely mapping the Document Object Model (DOM) tree structure of the first root node to obtain an HTML code; if not, dequeuing a feature from the first blank queue, and going to the step S3023 for the feature to perform an operation; if the classification is not finished, the step goes to S3026;
s3026, inverse mapping is carried out on the classification result to obtain an HTML tag code, and the HTML tag code is inserted into a position corresponding to a Document Object Model (DOM) tree structure of the first root node; and simultaneously adding the characteristics of the initial node unit into the first blank queue cache, generating the initial cell state of the next node unit of the initial node unit, and turning to S3024 to continue the classification operation.
The beneficial effects of the above technical scheme are: by adopting the scheme provided by the embodiment, the hierarchy of the tree neural network classification model can be increased and the operation speed can be improved by adopting a Document Object Model (DOM) tree structure.
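The queue-driven generation loop of S3021 to S3026 can be sketched as follows. The `classify` stub stands in for the tree neural network classifier, and a feature is represented as a toy `(tag, children)` pair; all names here are illustrative:

```python
from collections import deque

# Stand-in for the tree-LSTM classifier: returns an HTML tag for the feature
# plus the features of its child regions (empty when that branch is finished).
def classify(feature):
    tag, children = feature  # toy representation: (tag, [child features])
    return tag, children

def render(node):
    """Inverse-map a DOM node (S3025) into its HTML string."""
    inner = "".join(render(c) for c in node["children"])
    return f"<{node['tag']}>{inner}</{node['tag']}>"

def features_to_html(root_feature):
    root = {"tag": "html", "children": []}          # S3021: root DOM structure
    queue = deque([(root_feature, root)])           # S3021: first (blank) queue
    while queue:                                    # S3025: loop until queue empty
        feature, parent = queue.popleft()           # dequeue a feature
        tag, children = classify(feature)           # S3023/S3024: classify it
        node = {"tag": tag, "children": []}
        parent["children"].append(node)             # S3026: insert at DOM position
        for child in children:
            queue.append((child, node))             # S3026: buffer child features
    return render(root)

page = features_to_html(("body", [("div", [("span", [])]), ("p", [])]))
print(page)
```

The breadth-first order of the queue mirrors the level-by-level classification described in the steps above.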
In one embodiment, the S302 adding CSS attribute tags includes:
s302-1, acquiring a text of a Cascading Style Sheet (CSS) referred by a target webpage, and acquiring a text style set;
s302-2, screening the text style set, selecting a character string style and a symbol style in a text, and generating a first style set;
s302-3, acquiring the style of the text name with the connection number in the text style set, segmenting the style, discarding the content behind the connection number in the text name, reserving the content in front of the connection number in the text name, taking the reserved content as a new style of the text name, renaming the text name, and generating a new name text; summarizing the new name texts to generate a second style set;
s302-4, deleting the first style set and the second style set from the text style set to obtain a third style set;
s302-5, inputting the third style set into the tree-like neural network classification model to obtain a CSS style classification result, and setting the CSS style classification result as a classification label.
The working principle of the technical scheme is as follows: CSS is a declarative, domain-specific language that defines how structured documents are rendered and can also be used to build user interfaces for desktop and mobile applications. Since CSS style files are text-based and spaces and line feeds are unnecessary redundancy, all style texts need to be traversed and a deletion operation carried out on the rules containing the relevant style names, so that simplified styles with independent attribute values are obtained. The embodiment specifically includes:
s302-1, obtaining a text of a CSS (cascading style sheet) quoted by a target webpage, and obtaining a text style set;
s302-2, screening the text style set, selecting a character string style and a symbol style in a text, and generating a first style set;
s302-3, acquiring the style of the text name with the connection number in the text style set, segmenting the style, abandoning the content after the connection number in the text name, reserving the content before the connection number in the text name, taking the reserved content as a new style of the text name, and renaming the text name to generate a new name text; summarizing the new name texts to generate a second style set;
s302-4, deleting the first style set and the second style set from the text style set to obtain a third style set;
s302-5, inputting the third style set into the tree-like neural network classification model to obtain a CSS style classification result, and setting the CSS style classification result as a classification label.
The beneficial effects of the above technical scheme are: by adopting the scheme provided by the embodiment, the CSS style is processed, so that the complexity of the CSS style can be effectively reduced, the selectivity of the CSS attribute value is improved, and the identification degree of the classification label is enhanced.
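A minimal sketch of the set operations in S302-1 to S302-4, applied to a hypothetical three-rule style set; the symbol list and selector names are assumptions chosen for illustration:

```python
# Hypothetical text style set: selector name -> style text.
styles = {
    '"quoted"':     "content: 'x'",   # string/symbol style -> first style set
    ".btn-primary": "color: blue",    # hyphenated name     -> second style set
    ".nav":         "width: 33%",     # plain style         -> third style set
}

SYMBOLS = set("\"'*>:[]")             # illustrative string/symbol markers

# S302-2: first style set, the string and symbol styles.
first_set = {k: v for k, v in styles.items() if any(ch in SYMBOLS for ch in k)}

# S302-3: split hyphenated names, keeping the part before the hyphen as the
# new name of the renamed text; collect them as the second style set.
second_set = {name.split("-", 1)[0]: style
              for name, style in styles.items()
              if name not in first_set and "-" in name}

# S302-4: delete the first and second sets, leaving the third style set.
third_set = {k: v for k, v in styles.items()
             if k not in first_set and "-" not in k}

print(third_set)  # {'.nav': 'width: 33%'}
```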
In one embodiment, S1 comprises:
s101, converting an image in a target webpage user interface into an image with black and white gray values;
s102, binarizing the image represented by black and white gray values based on Canny operator edge detection to obtain a binarized image;
s103, performing sliding capture on the binarized image with a preset sliding window to obtain screenshot images, and summarizing the screenshot images to obtain a UI screenshot image set.
The working principle of the technical scheme is as follows: the gray value of the image is adopted to represent the image, so that the storage space and the calculation time can be saved; through the binarization processing of the image, the data volume of the image can be reduced; the Canny operator has obvious application effect in the aspects of image segmentation and small target movement tracking, can maximally inhibit false response edges and can inhibit the influence of noise;
the Canny operator is an operator embodying an optimization idea; it is an algorithm that effectively combines a Gaussian function with a first derivative, and it has a low error rate, high positioning precision and a single-edge response criterion. The edge detection steps of the operator are as follows:
the first step, a Gaussian function and an image function are used for convolution, noise in an image is smoothed, and a filtered image is obtained;
secondly, calculating the gradient amplitudes of the image in the horizontal and vertical directions by using a finite difference operator;
thirdly, traversing the gradient amplitude image, comparing the gradient value of each pixel with the gradient values of two adjacent pixels in the gradient direction of the pixel, judging whether the gradient value is the maximum value, if so, the pixel is possibly an edge point, reserving the gradient value, if not, the pixel is certainly not the edge point, and setting the gradient value of the pixel to be zero so as to ensure the single edge of the detection result;
and fourthly, detecting and connecting edges by adopting a double-threshold algorithm. Two thresholds are set, one high and one low; the result image processed in the third step is segmented, and edge points are further detected. In this step the dual thresholds must be set manually, which gives poor adaptive capability: the same pair of thresholds applied to different images yields very different edge detection results. If the threshold is set too high, some edge points of the image are filtered out, the obtained edges tend to break, and some important boundaries are lost; if the threshold is set too low, noise or non-target edge points are detected, producing false detections, reducing edge localization accuracy and degrading the quality of the edge detection result. Therefore, this embodiment uses the gray level histogram of the image and takes the inter-class variance between target and background as the measure criterion to select the optimal threshold. Specifically, the gray level information of the gradient image is used to divide the pixel points of the gradient image into an edge class and a background class; when gradient pixel points of the edge class and the background class are misclassified, the inter-class variance between the two classes decreases, so the classification of boundary and background is most correct, and the probability of a pixel point being misclassified is smallest, when the inter-class variance of the classification result is maximum. The principle is as follows: suppose a digital image has m different gray levels; set a threshold r (0 < r < m−1), threshold the gradient amplitude image, divide the gradient amplitude pixel points into an image background gray level set and an image edge gray level set, and calculate the inter-class variance of the gradient amplitude pixel points, wherein the calculation formula is as follows:
β²(r) = [P · T1(r) − N(r)]² / (T1(r) · (1 − T1(r)))
in the above formula, r is a threshold with 0 < r < m−1; β²(r) is the inter-class variance corresponding to the threshold r; T1(r) is the probability that an image pixel is assigned to the gray level set comprising the image background; P is the average gray value of the whole image; N(r) is the cumulative mean of the gray levels up to r; m is the number of gray levels, with gray levels ∈ {0, 1, 2, …, m−1}; the threshold that maximizes the inter-class variance, calculated by the above formula, is the optimal threshold; by adopting the optimal threshold, the self-adaptive capability of the algorithm can be improved.
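The optimal-threshold selection can be sketched with numpy using the between-class variance criterion just described (a standard Otsu-style computation; the toy histogram below is an assumption for illustration):

```python
import numpy as np

def optimal_threshold(hist):
    """Threshold r maximizing beta^2(r) = (P*T1(r) - N(r))^2 / (T1(r)*(1 - T1(r))),
    where T1(r) is the cumulative probability up to gray level r, N(r) the
    cumulative mean up to r, and P the average gray value of the whole image."""
    p = hist / hist.sum()                 # gray-level probabilities
    levels = np.arange(len(p))
    T1 = np.cumsum(p)                     # cumulative probability T1(r)
    N = np.cumsum(levels * p)             # cumulative mean N(r)
    P = N[-1]                             # global mean gray value
    with np.errstate(divide="ignore", invalid="ignore"):
        beta2 = (P * T1 - N) ** 2 / (T1 * (1.0 - T1))
    beta2[~np.isfinite(beta2)] = 0.0      # degenerate thresholds score zero
    return int(np.argmax(beta2))

# Toy bimodal gradient-magnitude histogram: background near level 2, edges near 12.
hist = np.zeros(16)
hist[[1, 2, 3]] = [30, 60, 30]
hist[[11, 12, 13]] = [20, 40, 20]
r = optimal_threshold(hist)
print(r)  # a threshold between the two modes
```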
The embodiment adopts the above processing method to process the image, and specifically includes:
s101, converting an image in the target webpage user interface into an image represented by black and white gray values;
s102, binarizing the image represented by black and white gray values based on Canny operator edge detection to obtain a binarized image;
s103, performing sliding capture on the binarized image with a preset sliding window to obtain screenshot images, and summarizing the screenshot images to obtain a UI screenshot image set.
The beneficial effects of the above technical scheme are: by adopting the scheme provided by the embodiment, the interception quality and the working efficiency of the UI screenshot image can be improved by processing the UI screenshot image; by improving the threshold value in Canny operator edge detection, the self-adaptive capacity of the algorithm can be improved, and the accuracy of the edge detection result can be improved.
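S101 to S103 can be sketched with numpy alone; a plain gray-level threshold stands in here for the Canny-based binarization, and the window size and stride are illustrative choices:

```python
import numpy as np

def to_gray(rgb):
    """S101: convert an RGB image to luminance gray values."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def binarize(gray, thr=128):
    """S102: simple threshold standing in for the Canny-based binarization."""
    return (gray > thr).astype(np.uint8)

def sliding_crops(img, win=2, stride=2):
    """S103: slide a win x win window over the binarized image."""
    h, w = img.shape
    return [img[y:y + win, x:x + win]
            for y in range(0, h - win + 1, stride)
            for x in range(0, w - win + 1, stride)]

# Toy 4x4 RGB image with a bright 2x2 block in the bottom-right corner.
rgb = np.zeros((4, 4, 3)); rgb[2:, 2:] = 255.0
crops = sliding_crops(binarize(to_gray(rgb)))
```

The four crops together form the toy counterpart of the UI screenshot image set.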
In one embodiment, as shown in fig. 6, further comprising S5: the accuracy verification is carried out on the generated HTML codes of the target webpage user interface, and the specific steps are as follows:
s501, acquiring a webpage source code corresponding to a target webpage user interface;
s502, capturing the content in the webpage source code to obtain a webpage source code content character string;
s503, according to the webpage source code content character strings, constructing a character string error category library, and creating a plurality of global character string pattern matching functions, wherein the error categories correspond to corresponding global character string pattern matching functions;
s504, matching HTML codes of the target webpage user interface based on the global character string mode matching functions, and verifying the accuracy of the HTML codes.
The working principle of the technical scheme is as follows: after the web page code is generated, it is necessary to verify the accuracy of the code in order to check for errors and correct them in time. The specific steps of the embodiment are as follows:
s501, acquiring a webpage source code corresponding to a target webpage user interface;
s502, capturing content in the webpage source code to obtain a webpage source code content character string;
s503, according to the webpage source code content character strings, constructing a character string error category library, and creating a plurality of global character string pattern matching functions, wherein the error categories correspond to corresponding global character string pattern matching functions;
s504, matching HTML codes of the target webpage user interface based on the global character string pattern matching functions, and verifying the accuracy of the HTML codes.
The beneficial effects of the above technical scheme are: by adopting the scheme provided by the embodiment, the error code can be found in time by verifying the accuracy of the generated code, and reference basis is provided for improving the network structure and correcting the generation method.
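The verification of S503 and S504 can be sketched as a hypothetical error-category library of global string patterns matched against the generated HTML; the category names and patterns are assumptions, not part of the patented method:

```python
import re

# Hypothetical error-category library: category name -> global string pattern.
ERROR_PATTERNS = {
    "empty_style":    re.compile(r'style="\s*"'),               # style attribute with no rules
    "width_over_100": re.compile(r"width:\s*(?:[2-9][0-9]{2,}|1[0-9]{3,})%"),
    "unclosed_quote": re.compile(r'style="[^"]*$'),             # attribute quote never closed
}

def verify_html(html):
    """Return the sorted error categories whose pattern matches anywhere in html."""
    return sorted(cat for cat, pat in ERROR_PATTERNS.items() if pat.search(html))

errors = verify_html('<div style="">ok</div><p style="width: 500%"></p>')
print(errors)
```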
In one embodiment, further comprising S6: the method for screenshot of the UI image comprises the following specific steps:
s601, obtaining size information of a UI image in a target webpage user interface to obtain minimum size information;
s602, initializing a display proportion of a target webpage, and setting the target webpage to be displayed in a 100% proportion;
s603, constructing a visual form mask on the target webpage, wherein the visual form mask covers the entire target webpage, and the size of the cells in the visual form mask is set to 1/16 of the minimum size information;
s604, adjusting the display proportion of the target webpage, and adjusting the target webpage to be displayed according to 400% proportion;
s605, acquiring a table area corresponding to the UI image in the target webpage user interface and a cell position contained in the table area by using the visual table template, operating a mouse pointer to be arranged at a starting cell position corresponding to the UI image according to the cell position, pressing a left mouse button to generate a sliding window, intercepting the UI image, and finishing the interception of the UI image when the mouse pointer moves to an ending cell position corresponding to the UI image.
The working principle of the technical scheme is as follows: when obtaining a UI screenshot image from the target webpage user interface, the accuracy and quality of the screenshot must be guaranteed so that the UI screenshot can be obtained reliably; the screenshot method comprises the following steps:
s601, obtaining size information of a UI image in a target webpage user interface to obtain minimum size information;
s602, initializing a target webpage display proportion, and setting the target webpage to be displayed in a 100% proportion;
s603, constructing a visual form mask on the target webpage, wherein the visual form mask covers all the target webpage, and the size of a cell in the visual form template is set to be 1/16 of the minimum size information;
s604, adjusting the display proportion of the target webpage, and adjusting the target webpage to be displayed according to 400% proportion;
s605, acquiring a table area corresponding to the UI image in the target webpage user interface and a cell position contained in the table area by using the visual table template, operating a mouse pointer to be arranged at a starting cell position corresponding to the UI image according to the cell position, pressing a left mouse button to generate a sliding window, intercepting the UI image, and finishing the interception of the UI image when the mouse pointer moves to an ending cell position corresponding to the UI image.
The beneficial effects of the above technical scheme are: by adopting the scheme provided by the embodiment, the visual mask is adopted, the display scale is adjusted, the area range of the UI image can be accurately selected, and the accuracy of capturing the UI screenshot image is improved.
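The geometry implied by S601 to S605 (cells of 1/16 of the minimum image size, page rendered at 400%) can be sketched as a mapping from a start/end cell range of the mask to an on-screen pixel rectangle; the zero-based cell-addressing convention is an assumption:

```python
# Map a start/end cell range of the visual form mask to an on-screen pixel
# rectangle. Cells are (row, col), zero-based; zoom=4.0 models 400% display.
def cell_rect(start_cell, end_cell, min_size, zoom=4.0):
    cell = min_size / 16.0 * zoom          # S603: cell = 1/16 of min size, scaled
    (r0, c0), (r1, c1) = start_cell, end_cell
    return (c0 * cell, r0 * cell, (c1 + 1) * cell, (r1 + 1) * cell)  # x0, y0, x1, y1

print(cell_rect((0, 0), (3, 3), min_size=64))  # (0.0, 0.0, 64.0, 64.0)
```

A 4x4 cell range on a 64-pixel minimum-size image at 400% zoom therefore spans a 64-pixel square on screen.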
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (6)

1. A method for generating webpage codes from images based on a tree neural network is characterized by comprising the following steps:
s1, processing images in a target webpage user interface, and acquiring a UI screenshot image set;
s2, based on a preset convolutional neural network, performing feature extraction on the UI screenshot images in the UI screenshot image set to obtain a feature vector data set of the UI screenshot images;
s3, performing multi-label classification on the feature vector data set based on a tree neural network classification model to obtain a classification result;
s4, decoding the classification result to generate an HTML code of a target webpage user interface;
the S3 comprises the following steps:
s301, constructing a tree-like neural network classification model based on the improved long-term and short-term memory neural network model;
s302, classifying the feature vector data set based on the tree neural network classification model, and adding CSS attribute tags to obtain a tagged classification result;
the S4 comprises the following steps:
s401, inverse mapping is carried out on the classification result with the label, and an HTML element code with a style is obtained;
s402, inserting the HTML element codes with styles into corresponding node positions in a Document Object Model (DOM) tree structure one by one to generate HTML codes of corresponding UI screenshot images;
s403, after all the HTML element codes with styles are inserted into corresponding node positions, generating HTML codes of a target webpage user interface according to the DOM tree structure;
wherein, S302 includes:
s3021, creating a first blank queue and a first root node Document Object Model (DOM) tree structure;
s3022, acquiring characteristics of a first UI screenshot image in the verification set; the verification set is obtained by dividing a vector data set according to the historical characteristics of the UI screenshot image;
s3023, inputting the characteristics of the first UI screenshot image into a tree-shaped neural network classification model initial node unit;
s3024, classifying the features to obtain classification results of the features, and obtaining initial cell states of descendant node units of the initial node units;
s3025, judging whether the classification is finished; if so, judging whether the first queue is empty, and if the queue is empty, inversely mapping the first root node document object model (DOM) tree structure to obtain the HTML code; if the queue is not empty, dequeuing a feature from it and going to step S3023 to process that feature; if the classification is not finished, going to step S3026;
s3026, inversely mapping the classification result to obtain an HTML tag code, and inserting the HTML tag code into the corresponding position of the first root node DOM tree structure; meanwhile, adding the features of the initial node unit into the first queue cache, generating the initial cell state of the next node unit of the initial node unit, and going to step S3024 to continue the classification operation;
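The queue-driven loop of steps S3021 to S3026 amounts to a breadth-first construction of the DOM tree. A minimal sketch, in which `classify` and `inverse_map` are hypothetical stand-ins for the tree neural network classifier and the label-to-HTML-tag inverse mapping:

```python
from collections import deque

def decode_to_dom(first_feature, classify, inverse_map):
    """Breadth-first decoding of image features into a DOM tree (sketch).

    `classify(feature)` is assumed to yield (label, child_feature) pairs,
    with child_feature None when a node has no descendants to classify.
    """
    queue = deque()                              # S3021: first (initially blank) queue
    root = {"tag": "html", "children": []}       # S3021: root-node DOM structure
    queue.append((first_feature, root))          # seed with the first image feature
    while queue:                                 # S3025: loop until the queue is empty
        feature, parent = queue.popleft()
        for label, child_feature in classify(feature):      # S3024: classify
            node = {"tag": inverse_map[label], "children": []}
            parent["children"].append(node)      # S3026: insert at the DOM position
            if child_feature is not None:        # descendants still to classify
                queue.append((child_feature, node))
    return root
```

The returned dict tree can then be serialized to HTML node by node, as in S401 to S403.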
wherein, the step S302 of adding CSS attribute tags includes:
s302-1, acquiring the Cascading Style Sheet (CSS) text referenced by the target webpage to obtain a text style set;
s302-2, screening the text style set and selecting the character string styles and symbol styles in the text to generate a first style set;
s302-3, acquiring, from the text style set, the styles whose text names contain a hyphen; splitting each such name at the hyphen, discarding the content after the hyphen and keeping the content before it; using the kept content as the new text name, and renaming the text accordingly to generate a renamed text; summarizing the renamed texts to generate a second style set;
s302-4, deleting the first style set and the second style set from the text style set to obtain a third style set;
s302-5, inputting the third style set into the tree-like neural network classification model to obtain a CSS style classification result, and setting the CSS style classification result as a classification label.
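Steps S302-1 to S302-4 partition the style set before classification. A minimal sketch, where `is_string_or_symbol` is a hypothetical predicate standing in for the S302-2 screening, and "deleting the second style set" is interpreted as removing the hyphenated originals that the second set was derived from:

```python
def partition_styles(style_names, is_string_or_symbol):
    """Partition CSS style names per S302-1..S302-4 (sketch)."""
    first = {s for s in style_names if is_string_or_symbol(s)}   # S302-2
    hyphenated = {s for s in style_names if "-" in s}
    # S302-3: keep only the part of each hyphenated name before the hyphen
    second = {s.split("-", 1)[0] for s in hyphenated}
    # S302-4: remove the first set and the hyphenated originals,
    # leaving the third style set for the classifier (S302-5)
    third = set(style_names) - first - hyphenated
    return first, second, third
```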
2. The method for generating webpage code from image based on the tree neural network as claimed in claim 1, wherein S2 comprises:
s201, setting a convolutional neural network; the convolutional neural network comprises an input layer, convolutional layers and an output layer; each convolutional layer is connected with a normalization layer, the activation function of the convolutional layers is the rectified linear unit (ReLU) function, and the convolutional layer weights are initialized from a standard normal distribution; five convolutional layers are set, the convolution kernels of the first, second and third layers are all 5×5, and the convolution kernels of the fourth and fifth layers are all 3×3;
and S202, inputting the UI screenshot images in the UI screenshot image set into the convolutional neural network to obtain a characteristic vector data set of the UI screenshot images.
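A sketch of the S201 feature extractor under stated assumptions: the five-layer stack (5×5 kernels on layers 1 to 3, 3×3 on layers 4 and 5) with per-channel normalization, ReLU and standard-normal weights is reproduced here with a naive NumPy convolution; the channel widths and the final pooling are assumptions, and a practical implementation would use a deep-learning framework:

```python
import numpy as np

def conv2d(x, w, pad):
    """Naive 'same' convolution: x is (C_in, H, W), w is (C_out, C_in, k, k)."""
    c_out, c_in, k, _ = w.shape
    h, wd = x.shape[1:]
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((c_out, h, wd))
    for i in range(h):
        for j in range(wd):
            # contract the (C_in, k, k) patch against every output filter
            out[:, i, j] = np.tensordot(w, xp[:, i:i + k, j:j + k], axes=3)
    return out

def extract_features(img, rng=None):
    """S201/S202 sketch: five conv layers, each followed by a per-channel
    normalization layer and ReLU; weights drawn from the standard normal
    distribution as the claim specifies."""
    rng = np.random.default_rng(0) if rng is None else rng
    x, in_ch = img, img.shape[0]
    for out_ch, k in [(8, 5), (8, 5), (8, 5), (8, 3), (8, 3)]:  # assumed widths
        w = rng.standard_normal((out_ch, in_ch, k, k))   # standard normal init
        x = conv2d(x, w, pad=k // 2)
        x = (x - x.mean(axis=(1, 2), keepdims=True)) / (
            x.std(axis=(1, 2), keepdims=True) + 1e-5)    # normalization layer
        x = np.maximum(x, 0.0)                           # ReLU activation
        in_ch = out_ch
    return x.mean(axis=(1, 2))   # pooled feature vector (assumed readout)
```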
3. The method for generating webpage code from image based on the tree neural network as claimed in claim 1, wherein S301 comprises:
s3011, obtaining the historical feature vector data set of the UI screenshot images, and dividing the historical feature vector data set into a training set and a verification set;
s3012, constructing an improved long short-term memory (LSTM) neural network model; the improved LSTM model comprises a node unit, descendant node units of the node unit, an input gate, a forget gate, a first output gate and a second output gate; the Sigmoid activation function is set as the first processing mode, and the tanh activation function is set as the second processing mode;
processing the input of the current node unit at the current moment and the input hidden layer state of the current node with the first processing mode to obtain the forget gate;
processing the input of the current node unit at the current moment and the input hidden layer state of the current node with the first processing mode to obtain the input gate;
processing the input of the current node unit at the current moment and the input hidden layer state of the current node with the second processing mode to obtain the input state of the current node unit at the current moment;
multiplying the output cell state of the previous node unit at the previous moment element-wise by the forget gate to obtain a first product; multiplying the input state of the current node unit at the current moment element-wise by the input gate to obtain a second product; summing the first product and the second product to obtain the cell state of the current node unit at the current moment;
processing the input of the current node unit at the current moment and the input hidden layer state of the current node with the first processing mode to obtain the first output gate;
processing the input of the current node unit at the current moment and the input hidden layer state of the current node with the first processing mode to obtain the second output gate;
processing the cell state of the current node unit at the current moment with the second processing mode, and multiplying the result by the first output gate to obtain the hidden layer state of the next node unit at the current moment;
processing the cell state of the current node unit at the current moment with the second processing mode, and multiplying the result by the second output gate to obtain the initial cell state of a descendant node unit of the current node unit at the current moment;
processing the hidden layer state of the next node unit at the current moment with the first processing mode to obtain the output result;
and S3013, inputting the training set into the improved long short-term memory neural network model for training to generate the tree neural network classification model.
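One step of the modified LSTM cell described in claim 3 can be written out directly: besides the usual forget and input gates, it carries two output gates, one producing the hidden state passed to the next node unit and one producing the initial cell state handed down to descendant node units. The weight layout below, a dict of (W, U, b) triples per gate, is an assumption:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tree_lstm_cell(x, h_prev, c_prev, W):
    """One step of the two-output-gate LSTM cell of S3012 (sketch)."""
    gate = lambda k: W[k][0] @ x + W[k][1] @ h_prev + W[k][2]
    f = sigmoid(gate("f"))        # forget gate (first processing mode)
    i = sigmoid(gate("i"))        # input gate (first processing mode)
    g = np.tanh(gate("g"))        # input state (second processing mode)
    c = f * c_prev + i * g        # cell state: element-wise products, summed
    o1 = sigmoid(gate("o1"))      # first output gate
    o2 = sigmoid(gate("o2"))      # second output gate
    h_next = o1 * np.tanh(c)      # hidden state for the next node unit
    c_child = o2 * np.tanh(c)     # initial cell state for descendant units
    y = sigmoid(h_next)           # output result (first processing mode)
    return y, h_next, c_child, c
```

Passing `c_child` to a child node instead of `c` is what distinguishes this tree-shaped cell from a standard chain LSTM.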
4. The method for generating webpage code from image based on the tree neural network as claimed in claim 1, wherein S1 comprises:
s101, converting the images in the target webpage user interface into grayscale images;
s102, binarizing the grayscale images based on Canny operator edge detection to obtain binarized images;
s103, sliding a preset window over the binarized images to intercept screenshot images, and summarizing the screenshot images to obtain the UI screenshot image set.
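A preprocessing sketch for S101 to S103. A plain gradient-magnitude threshold stands in for the Canny detector here to keep the example self-contained, and the window size, stride and threshold are assumptions:

```python
import numpy as np

def screenshot_patches(image_rgb, win=64, stride=64, thresh=30.0):
    """S101-S103 sketch: grayscale, edge-based binarization, sliding window."""
    # S101: weighted grayscale conversion
    gray = image_rgb @ np.array([0.299, 0.587, 0.114])
    # S102: binarize on gradient magnitude (stand-in for Canny edges)
    gy, gx = np.gradient(gray)
    binary = (np.hypot(gx, gy) > thresh).astype(np.uint8) * 255
    # S103: slide a fixed window over the binarized image, collecting crops
    h, w = binary.shape
    return [binary[y:y + win, x:x + win]
            for y in range(0, h - win + 1, stride)
            for x in range(0, w - win + 1, stride)]
```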
5. The method for generating webpage code from image based on the tree neural network as claimed in claim 1, further comprising S5: performing accuracy verification on the generated HTML code of the target webpage user interface, with the following specific steps:
s501, acquiring a webpage source code corresponding to a target webpage user interface;
s502, capturing the content in the webpage source code to obtain a webpage source code content character string;
s503, constructing a character string error category library according to the webpage source code content character strings, and creating a plurality of global character string pattern matching functions, wherein each error category corresponds to a global character string pattern matching function;
s504, matching HTML codes of the target webpage user interface based on the global character string pattern matching functions, and verifying the accuracy of the HTML codes.
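S503 and S504 pair error categories with global string pattern matchers. In this sketch the two categories and their regular expressions are hypothetical examples rather than the patent's actual library, which would be derived from the captured source-code content strings:

```python
import re

def build_checkers(source_html):
    """S503 sketch: an error-category library, each category paired with a
    global pattern-matching function (the categories shown are examples)."""
    return {
        "missing_doctype": lambda h: not re.match(r"(?is)\s*<!doctype html>", h),
        "empty_href": lambda h: bool(re.search(r'href=[\'"]\s*[\'"]', h)),
    }

def verify_html(generated_html, checkers):
    """S504: run every category's matcher; return the categories that flag."""
    return [name for name, check in checkers.items() if check(generated_html)]
```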
6. The method for generating webpage code from image based on the tree neural network as claimed in claim 1, further comprising S6: capturing screenshots of the UI images, with the following specific steps:
s601, obtaining size information of a UI image in a target webpage user interface to obtain minimum size information;
s602, initializing the display ratio of the target webpage, setting the target webpage to be displayed at 100%;
s603, constructing a visual table template on the target webpage, wherein the visual table template covers the entire target webpage, and the size of its cells is set to 1/16 of the minimum size information;
s604, adjusting the display ratio of the target webpage so that the target webpage is displayed at 400%;
s605, acquiring, by using the visual table template, the table area corresponding to the UI image in the target webpage user interface and the cell positions contained in the table area; moving the mouse pointer to the start cell position corresponding to the UI image according to the cell positions, pressing the left mouse button to generate a sliding window and start intercepting the UI image, and completing the interception when the mouse pointer moves to the end cell position corresponding to the UI image.
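The arithmetic behind the S601 to S605 capture grid can be made explicit; the (row, column) cell indexing below is a hypothetical convention:

```python
def cell_size(min_dim_px, divisor=16):
    """S603: the visual table's cell edge is 1/16 of the smallest
    UI-image dimension, measured at the initial 100% display ratio."""
    return min_dim_px / divisor

def on_screen(px, zoom_percent):
    """S602/S604: on-screen sizes scale with the display ratio
    (100% initially, 400% during capture)."""
    return px * zoom_percent / 100.0

def capture_rect(start_cell, end_cell, cell_px):
    """S605: the sliding window dragged from the start cell to the end
    cell covers the inclusive cell range; returns (x0, y0, x1, y1)."""
    (r0, c0), (r1, c1) = start_cell, end_cell
    return (c0 * cell_px, r0 * cell_px, (c1 + 1) * cell_px, (r1 + 1) * cell_px)
```

For a UI image whose smallest dimension is 160 px, the cell edge is 10 px at 100% and 40 px at 400%, so a drag over a 4x4 cell block captures a 160x160 px on-screen region.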
CN202210527210.0A 2022-05-16 2022-05-16 Method for generating webpage code from image based on tree-shaped neural network Active CN114821610B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210527210.0A CN114821610B (en) 2022-05-16 2022-05-16 Method for generating webpage code from image based on tree-shaped neural network

Publications (2)

Publication Number Publication Date
CN114821610A (en) 2022-07-29
CN114821610B (en) 2022-11-29

Family

ID=82515955

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210527210.0A Active CN114821610B (en) 2022-05-16 2022-05-16 Method for generating webpage code from image based on tree-shaped neural network

Country Status (1)

Country Link
CN (1) CN114821610B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116304457B (en) * 2023-02-27 2024-03-29 山东乾舜广告传媒有限公司 Marking method for webpage multiple information attribute
CN116820561B (en) * 2023-08-29 2023-10-31 成都丰硕智能数字科技有限公司 Method for automatically generating interface codes based on interface design diagram

Citations (5)

Publication number Priority date Publication date Assignee Title
CN109522017A (en) * 2018-11-07 2019-03-26 Sun Yat-sen University A webpage screenshot code generation method based on a neural network and a self-attention mechanism
CN111124380A (en) * 2019-11-26 2020-05-08 江苏艾佳家居用品有限公司 Front-end code generation method
CN113504906A (en) * 2021-05-31 2021-10-15 北京房江湖科技有限公司 Code generation method and device, electronic equipment and readable storage medium
CN113778403A (en) * 2021-01-15 2021-12-10 北京沃东天骏信息技术有限公司 Front-end code generation method and device
CN113986251A (en) * 2021-12-29 2022-01-28 中奥智能工业研究院(南京)有限公司 GUI prototype graph code conversion method based on convolution and cyclic neural network

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
CN110377282B * 2019-06-26 2021-08-17 Yangzhou University Method for generating Web code from a UI based on generative adversarial and convolutional neural networks
CN111190600B * 2019-12-31 2023-09-19 Bank of China Method and system for automatically generating front-end codes based on GRU attention model
US11416244B2 (en) * 2020-04-01 2022-08-16 Paypal, Inc. Systems and methods for detecting a relative position of a webpage element among related webpage elements
CN111562919A (en) * 2020-07-14 2020-08-21 成都市映潮科技股份有限公司 Method, system and storage medium for generating front-end webpage code based on PSD file
CN114185540A (en) * 2021-11-02 2022-03-15 Wuhan University Deep learning-based GUI code automatic generation method
CN114398138A (en) * 2022-01-19 2022-04-26 Ping An International Smart City Technology Co., Ltd. Interface generation method and device, computer equipment and storage medium

Non-Patent Citations (4)

Title
Improving pix2code based Bi-directional LSTM; Yanbin Liu et al.; 2018 IEEE International Conference on Automation, Electronics and Electrical Engineering (AUTEEE); 2019-05-23; pp. 220-223 *
pix2code: Generating Code from a Graphical User Interface Screenshot; Tony Beltramelli et al.; arXiv; 2017-09-19; pp. 1-9 *
Research on Web User Interface Code Generation Technology Based on Deep Learning; Zhang Wei; Science and Technology Innovation; May 2020 (No. 14); pp. 82-83 *
Design and Implementation of a Web Page Generation System Based on Deep Learning; Lei Hui; China Master's Theses Full-text Database, Information Science and Technology Series (Monthly); 2020-07-15 (No. 07); pp. 1-79 *

Similar Documents

Publication Publication Date Title
US11816165B2 (en) Identification of fields in documents with neural networks without templates
CN114821610B (en) Method for generating webpage code from image based on tree-shaped neural network
US11562588B2 (en) Enhanced supervised form understanding
US20230237040A1 Automated document processing for detecting, extracting, and analyzing tables and tabular data
US11954139B2 (en) Deep document processing with self-supervised learning
CN110363049B (en) Method and device for detecting, identifying and determining categories of graphic elements
CN111881262A (en) Text emotion analysis method based on multi-channel neural network
CN109408058B (en) Front-end auxiliary development method and device based on machine learning
CN109190630A (en) Character identifying method
US20180365594A1 (en) Systems and methods for generative learning
CN111428457A (en) Automatic formatting of data tables
RU2765884C2 (en) Identification of blocks of related words in documents of complex structure
CN113762269B (en) Chinese character OCR recognition method, system and medium based on neural network
CN113051914A (en) Enterprise hidden label extraction method and device based on multi-feature dynamic portrait
CN113159013A (en) Paragraph identification method and device based on machine learning, computer equipment and medium
CN114782974A (en) Table identification method, system, intelligent terminal and computer readable storage medium
KR102122561B1 (en) Method for recognizing characters on document images
Song Accuracy analysis of Japanese machine translation based on machine learning and image feature retrieval
CN117083605A (en) Iterative training for text-image-layout transformer models
CN112307749A (en) Text error detection method and device, computer equipment and storage medium
CN109670040B (en) Writing assistance method and device, storage medium and computer equipment
CN115546801A (en) Method for extracting paper image data features of test document
US20230138491A1 (en) Continuous learning for document processing and analysis
Xie et al. Enhancing multimodal deep representation learning by fixed model reuse
CN110688486A (en) Relation classification method and model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant