CN114821610B - Method for generating webpage code from image based on tree-shaped neural network - Google Patents

Method for generating webpage code from image based on tree-shaped neural network

Info

Publication number
CN114821610B
CN114821610B CN202210527210.0A CN202210527210A
Authority
CN
China
Prior art keywords
image
neural network
style
node unit
tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210527210.0A
Other languages
Chinese (zh)
Other versions
CN114821610A (en)
Inventor
熊仁都
谭业贵
郭晓松
宋云飞
徐玉中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Three Gorges High Technology Information Technology Co ltd
Original Assignee
Three Gorges High Technology Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Three Gorges High Technology Information Technology Co ltd filed Critical Three Gorges High Technology Information Technology Co ltd
Priority to CN202210527210.0A priority Critical patent/CN114821610B/en
Publication of CN114821610A publication Critical patent/CN114821610A/en
Application granted granted Critical
Publication of CN114821610B publication Critical patent/CN114821610B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • G06V30/41Analysis of document content
    • G06V30/412Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/2431Multiple classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/14Tree-structured documents
    • G06F40/146Coding or compression of tree-structured data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/30Creation or generation of source code
    • G06F8/34Graphical or visual programming
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/70Software maintenance or management
    • G06F8/73Program documentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/1444Selective acquisition, locating or processing of specific regions, e.g. highlighted text, fiducial marks or predetermined fields
    • G06V30/1456Selective acquisition, locating or processing of specific regions, e.g. highlighted text, fiducial marks or predetermined fields based on user interactions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/18Extraction of features or characteristics of the image
    • G06V30/1801Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections
    • G06V30/18019Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections by matching or filtering
    • G06V30/18038Biologically-inspired filters, e.g. difference of Gaussians [DoG], Gabor filters
    • G06V30/18048Biologically-inspired filters, e.g. difference of Gaussians [DoG], Gabor filters with interaction between the responses of different filters, e.g. cortical complex cells
    • G06V30/18057Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19173Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • G06V30/41Analysis of document content
    • G06V30/413Classification of content, e.g. text, photographs or tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • G06V30/41Analysis of document content
    • G06V30/414Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text

Abstract

The invention provides a method for generating webpage code from an image based on a tree-shaped neural network, which comprises: processing the images in a target webpage user interface to obtain a UI screenshot image set; performing feature extraction on the UI screenshot images in the UI screenshot image set with a preset convolutional neural network to obtain a feature vector data set of the UI screenshot images; performing multi-label classification on the feature vector data set with an improved tree-shaped long short-term memory neural network model to obtain a classification result; and decoding the classification result to generate the HTML code of the target webpage user interface. Based on a deep learning technique that uses the improved tree-shaped neural network to generate HTML code from UI screenshot images, the method improves the generation quality of deeply nested HTML code, saves the time of writing code by hand, and improves the efficiency of HTML code development.

Description

Method for generating webpage code from image based on tree-shaped neural network
Technical Field
The invention relates to the technical field of computers, in particular to a method for generating a webpage code from an image based on a tree neural network.
Background
With the rapid development and growing popularity of the Internet, web pages have become an important source of information. Converting the graphical user interface screenshots created by designers into computer code, and using that code to build customized software, websites and mobile applications, is an important task for developers. Traditional webpage code generation techniques require a large amount of manual work, offer little flexibility, and serve only as aids to coding. Current deep-learning methods that generate webpage code from a UI can only produce HTML nested one or two levels deep: they do not consider the natural tree structure of HTML code and cannot generate more deeply nested code.
Therefore, a method for generating web page codes from images based on a tree-like neural network is needed.
Disclosure of Invention
The invention provides a method for generating webpage code from images based on a tree-shaped neural network. Built on a deep learning technique that uses an improved tree-shaped neural network, the method generates HTML code from UI screenshot images, thereby improving the generation quality of deeply nested HTML code, saving the time of manual code writing, and improving the efficiency of HTML code development.
The invention provides a method for generating a webpage code from an image based on a tree neural network, which comprises the following steps:
s1, processing images in a target webpage user interface, and acquiring a UI screenshot image set;
s2, based on a preset convolutional neural network, performing feature extraction on the UI screenshot images in the UI screenshot image set to obtain a feature vector data set of the UI screenshot images;
s3, based on the improved tree-shaped long short-term memory neural network model, performing multi-label classification on the feature vector data set to obtain a classification result;
and S4, decoding the classification result to generate an HTML code of the target webpage user interface.
Further, S2 includes:
s201, setting a convolutional neural network; the convolutional neural network comprises an input layer, convolutional layers and an output layer; each convolutional layer is followed by a normalization layer, uses the rectified linear unit (ReLU) function as its activation function, and has its weights initialized from a standard normal distribution; five convolutional layers are used; the convolution kernels of the first, second and third convolutional layers are all set to 5×5, and the kernels of the fourth and fifth layers are all set to 3×3;
and S202, inputting the UI screenshot images in the UI screenshot image set into the convolutional neural network to obtain a feature vector data set of the UI screenshot images.
Further, S3 includes:
s301, constructing a tree-shaped neural network classification model based on the improved long short-term memory neural network model;
s302, classifying the feature vector data set based on the tree neural network classification model, and adding CSS attribute tags to obtain a tagged classification result.
Further, S4 includes:
s401, performing inverse mapping on the labelled classification result to obtain styled HTML element code;
s402, inserting the HTML element codes with the styles into corresponding node positions in a Document Object Model (DOM) tree structure one by one to generate HTML codes of corresponding UI screenshot images;
and S403, after all the HTML element codes with styles are inserted into the corresponding node positions, generating HTML codes of the target webpage user interface according to the document object model tree structure.
Further, S301 includes:
s3011, obtaining a historical characteristic vector data set of a UI screenshot image, and dividing the historical characteristic vector data set into a training set and a verification set;
s3012, constructing an improved long short-term memory neural network model; the improved model comprises a node unit, descendant node units of the node unit, an input gate, a forget gate, a first output gate and a second output gate; the Sigmoid activation function is set as the first processing mode, and the tanh activation function as the second processing mode;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment in the first processing mode to obtain a forget gate;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment in the first processing mode to obtain an input gate;
processing the input of the current node unit at the current moment and the input hidden layer state of the current node by adopting a second processing mode to obtain the input state of the current node unit at the current moment;
multiplying the output cell state of the previous node unit at the previous moment element-wise by the forget gate to obtain a first product; multiplying the input state of the current node unit at the current moment element-wise by the input gate to obtain a second product; summing the first product and the second product to obtain the cell state of the current node unit at the current moment;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain a first output gate;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain a second output gate;
processing the cell state of the current node unit at the current moment by adopting a second processing mode, and performing product operation on a processing result and the first output gate to obtain the hidden layer state of the next node unit at the current moment;
processing the cell state of the current node unit at the current moment by adopting a second processing mode, and performing product operation on a processing result and the second output gate to obtain the initial cell state of a descendant node unit of the current node unit at the current moment;
processing the hidden layer state of the next node unit at the current moment by adopting a first processing mode to obtain an output result;
s3013, inputting the training set into the improved long short-term memory neural network model for training, and generating the tree-shaped neural network classification model.
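As a concrete illustration of the node unit defined in S3012, the gate computations can be sketched in plain Python. The class name `TreeLSTMCell`, the dimensions, and the reuse of the S201 weight initialization are assumptions for illustration; biases and the training procedure are omitted.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

class TreeLSTMCell:
    """One node unit of the modified tree-LSTM sketched in S3012: a forget
    gate, an input gate, and TWO output gates -- the first yields the hidden
    state handed to the next node unit, the second yields the initial cell
    state for the node's descendant units."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rnd = random.Random(seed)
        def weights():
            # standard-normal init with std 0.1, mirroring S201 (an assumption
            # here -- the text does not give the tree-LSTM's init scheme)
            return [[rnd.gauss(0.0, 0.1) for _ in range(input_dim + hidden_dim)]
                    for _ in range(hidden_dim)]
        self.Wf, self.Wi, self.Wg = weights(), weights(), weights()
        self.Wo1, self.Wo2 = weights(), weights()

    def step(self, x, h_prev, c_prev):
        z = list(x) + list(h_prev)                      # input + hidden state
        f = [sigmoid(a) for a in matvec(self.Wf, z)]    # forget gate
        i = [sigmoid(a) for a in matvec(self.Wi, z)]    # input gate
        g = [math.tanh(a) for a in matvec(self.Wg, z)]  # input state
        c = [fj * cj + ij * gj                          # element-wise blend
             for fj, cj, ij, gj in zip(f, c_prev, i, g)]
        o1 = [sigmoid(a) for a in matvec(self.Wo1, z)]  # first output gate
        o2 = [sigmoid(a) for a in matvec(self.Wo2, z)]  # second output gate
        h_next = [o * math.tanh(cj) for o, cj in zip(o1, c)]   # next node unit
        c_child = [o * math.tanh(cj) for o, cj in zip(o2, c)]  # descendants
        return h_next, c, c_child
```

The second output gate is what makes the cell tree-shaped: `c_child` seeds the descendant node units while `h_next` and `c` flow along the sibling chain.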
Further, S302 includes:
s3021, creating a first blank queue and a first root node Document Object Model (DOM) tree structure;
s3022, acquiring characteristics of the first UI screenshot image in the verification set;
s3023, inputting the characteristics of the first UI screenshot image into a tree-shaped neural network classification model initial node unit;
s3024, classifying the features to obtain classification results of the features, and obtaining initial cell states of descendant node units of the initial node units;
s3025, judging whether the classification is finished; if finished, judging whether the first queue is empty: if the queue is empty, performing inverse mapping on the first root node Document Object Model (DOM) tree structure to obtain the HTML code; if the queue is not empty, dequeuing a feature from the first queue and returning to step S3023 to process that feature; if the classification is not finished, going to S3026;
s3026, performing inverse mapping on the classification result to obtain an HTML tag code, and inserting the HTML tag code into the corresponding position of the first root node Document Object Model (DOM) tree structure; meanwhile, adding the features of the initial node unit to the first queue buffer, generating the initial cell state of the next node unit of the initial node unit, and going to S3024 to continue the classification operation.
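The queue-driven generation loop of S3021-S3026 can be sketched as follows. Here `classify` is a hypothetical stand-in for the tree-shaped classification model (returning a tag plus descendant features, or `None` when a branch is finished), and a flat indented list stands in for the root-node DOM tree.

```python
from collections import deque

def generate_html(root_feature, classify):
    """Breadth-first sketch of S3021-S3026. `classify(feature)` must
    return (tag, child_features); tag is None when classification of
    this branch is finished."""
    queue = deque()                      # S3021: the first (initially blank) queue
    dom = []                             # flat stand-in for the root DOM tree
    queue.append((root_feature, 0))      # feature plus its nesting depth
    while queue:                         # S3025: loop until the queue is empty
        feature, depth = queue.popleft()
        tag, children = classify(feature)
        if tag is None:
            continue                     # this branch is finished
        dom.append("  " * depth + tag)   # S3026: insert tag code into the tree
        for child in children:           # buffer descendant features
            queue.append((child, depth + 1))
    return "\n".join(dom)
```

A toy `classify` that maps a root feature to a `<div>` with two children yields the nested tag listing in breadth-first order.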
Further, the step S302 of adding CSS attribute tags includes:
s302-1, obtaining the text of the CSS (cascading style sheet) referenced by the target webpage, and obtaining a text style set;
s302-2, screening the text style set, selecting a character string style and a symbol style in a text, and generating a first style set;
s302-3, acquiring the styles in the text style set whose text names contain a hyphen, and segmenting each such style: discarding the content after the hyphen in the text name, keeping the content before the hyphen, taking the kept content as the new style of the text name, and renaming the text name to generate a new name text; summarizing the new name texts to generate a second style set;
s302-4, deleting the first style set and the second style set from the text style set to obtain a third style set;
s302-5, inputting the third style set into the tree-like neural network classification model to obtain a CSS style classification result, and setting the CSS style classification result as a classification label.
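The set arithmetic of S302-2 through S302-4 can be sketched as below. One point is an interpretation: S302-4 says to delete the second style set from the full set, but the second set holds renamed (truncated) names, so this sketch removes the original hyphenated names instead. All identifiers here are hypothetical.

```python
def screen_styles(style_names, string_styles):
    """Sketch of S302-2 .. S302-4. `style_names` is the full set of CSS
    style names pulled from the referenced stylesheet; `string_styles`
    is the subset judged to be character-string/symbol styles (S302-2)."""
    first = set(string_styles)                          # first style set
    second = {name.split("-", 1)[0]                     # keep text before hyphen
              for name in style_names if "-" in name}   # second style set (S302-3)
    # S302-4: remove both subsets from the full set (interpreted as removing
    # the original hyphenated names rather than their renamed forms)
    third = set(style_names) - first - {n for n in style_names if "-" in n}
    return first, second, third
```

The resulting third style set is what S302-5 feeds into the tree-shaped classification model.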
Further, S1 includes:
s101, converting the images in the target webpage user interface into grayscale images;
s102, binarizing the grayscale images based on Canny operator edge detection to obtain binarized images;
s103, sliding a preset window over the binarized images to intercept screenshot images, and summarizing the screenshot images to obtain the UI screenshot image set.
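A dependency-free sketch of S101-S103. A simple gradient threshold stands in for the Canny operator (a real implementation would use an actual Canny edge detector, e.g. from an image library), and images are plain nested lists; window size and stride are illustrative.

```python
def to_gray(img):
    """S101: img is a list of rows of (r, g, b) tuples -> luminance grayscale."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in img]

def binarize_by_edges(gray, thresh=32.0):
    """Simplified stand-in for Canny-based binarization (S102): mark a pixel 1
    where the horizontal or vertical intensity jump exceeds `thresh`."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = abs(gray[y][x] - gray[y][x - 1]) if x else 0.0
            gy = abs(gray[y][x] - gray[y - 1][x]) if y else 0.0
            out[y][x] = 1 if max(gx, gy) > thresh else 0
    return out

def sliding_windows(binary, win, stride):
    """S103: slide a win x win window over the binarized image and collect
    every crop into the UI screenshot image set."""
    h, w = len(binary), len(binary[0])
    return [[row[x:x + win] for row in binary[y:y + win]]
            for y in range(0, h - win + 1, stride)
            for x in range(0, w - win + 1, stride)]
```

On a tiny half-black/half-white image the edge map lights up exactly at the boundary column, and a 2x2 window with stride 2 yields four crops.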
Further, the method also comprises step S5: verifying the accuracy of the generated HTML code of the target webpage user interface, with the following specific steps:
s501, acquiring a webpage source code corresponding to a target webpage user interface;
s502, capturing content in the webpage source code to obtain a webpage source code content character string;
s503, according to the webpage source code content character strings, constructing a character string error category library, and creating a plurality of global character string pattern matching functions, wherein the error categories correspond to corresponding global character string pattern matching functions;
s504, matching the HTML code of the target webpage user interface against the global character string pattern matching functions, and verifying the accuracy of the HTML code.
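A minimal sketch of S503-S504 using regular expressions as the global string pattern matching functions. The error categories and patterns below are hypothetical examples; the method as described would build its category library from the captured source-code strings of the target page (S502).

```python
import re

# Hypothetical error-category library (S503): each category maps to a
# global (multi-match) string pattern.
ERROR_PATTERNS = {
    "unclosed_div": re.compile(r"<div\b[^>]*>(?![\s\S]*</div>)"),  # <div> with no </div> after it
    "bad_style_attr": re.compile(r'style="[^"]*;;'),               # doubled semicolon in style
    "empty_tag_name": re.compile(r"<\s*>"),                        # tag with no name
}

def verify_html(html):
    """S504: match the generated HTML against every pattern and return the
    error categories that fired; an empty list means verification passed."""
    return [name for name, pat in ERROR_PATTERNS.items() if pat.search(html)]
```

Well-formed output returns an empty list, while a truncated or malformed fragment names the matching error category.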
Further, the method also comprises step S6: taking screenshots of the UI images, with the following specific steps:
s601, obtaining size information of a UI image in a target webpage user interface to obtain minimum size information;
s602, initializing a display proportion of a target webpage, and setting the target webpage to be displayed in a 100% proportion;
s603, constructing a visual table mask on the target webpage, wherein the visual table mask covers the whole target webpage, and the size of each cell in the visual table mask is set to 1/16 of the minimum size information;
s604, adjusting the display proportion of the target webpage, and adjusting the target webpage to be displayed according to 400% proportion;
s605, using the visual table mask, acquiring the table area corresponding to the UI image in the target webpage user interface and the cell positions contained in that area; according to the cell positions, moving the mouse pointer to the starting cell position of the UI image and pressing the left mouse button to generate a sliding window that intercepts the UI image; when the mouse pointer reaches the ending cell position of the UI image, the interception of the UI image is finished.
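The geometry of S603-S605 can be sketched as a pure function. Treating the drag selection as inclusive of both cells, using (row, col) cell coordinates, and omitting the 100%-to-400% display-scale handling (S602/S604) are simplifying assumptions.

```python
def capture_rect(min_size, start_cell, end_cell):
    """Sketch of S603-S605: the mask's cell edge is 1/16 of the smallest
    UI-image dimension (S603); a drag from start_cell to end_cell
    (inclusive) selects the crop rectangle in page pixels."""
    cell = min_size / 16.0                       # S603: cell edge length
    (r0, c0), (r1, c1) = start_cell, end_cell
    left, top = c0 * cell, r0 * cell             # S605: press at the start cell
    width = (c1 - c0 + 1) * cell                 # release at the end cell
    height = (r1 - r0 + 1) * cell
    return left, top, width, height
```

For example, with a minimum dimension of 160 px the cell edge is 10 px, so dragging from cell (0, 0) to cell (3, 3) crops a 40 x 40 region at the origin.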
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a schematic diagram of a method for generating a web page code from an image based on a tree neural network according to the present invention;
FIG. 2 is a schematic diagram of an implementation process of generating a web page code from an image based on a tree neural network according to the present invention;
FIG. 3 is a schematic diagram of a method for classification based on an improved tree-like long-short term memory neural network model according to the present invention;
FIG. 4 is a diagram of the tree neural network classification model of the present invention;
FIG. 5 is a schematic diagram of the HTML code encoding method of the present invention;
FIG. 6 is a schematic diagram illustrating a method for verifying the accuracy of HTML codes of a generated target webpage page according to the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it should be understood that they are presented herein only to illustrate and explain the present invention and not to limit the present invention.
A method for generating web page code from an image based on a tree neural network, as shown in fig. 1, includes:
s1, processing images in a target webpage user interface, and acquiring a UI screenshot image set;
s2, based on a preset convolutional neural network, performing feature extraction on the UI screenshot images in the UI screenshot image set to obtain a feature vector data set of the UI screenshot images;
s3, based on the improved tree-shaped long short-term memory neural network model, performing multi-label classification on the feature vector data set to obtain a classification result;
and S4, decoding the classification result to generate an HTML code of the target webpage user interface.
The working principle of the technical scheme is as follows: the main content of a web page is built with the HTML language, and traditional webpage code generation techniques require developers to formulate detailed code generation rules or write templates. Deep-learning techniques that generate webpage code end to end from UI images produce the corresponding HTML code from the attribute features of the UI images, which improves working efficiency. The solution adopted in this embodiment is shown in fig. 2:
the method comprises the steps that firstly, images in a target webpage user interface are processed, and a UI screenshot image set is obtained;
secondly, based on a preset convolutional neural network, performing feature extraction on the UI screenshot images in the UI screenshot image set to obtain a feature vector data set of the UI screenshot images;
thirdly, based on an improved tree-shaped long short-term memory neural network model, performing multi-label classification on the feature vector data set to obtain a classification result;
the fourth step is decoding the classification result: inversely mapping it into HTML tag code, inserting the tag code into the corresponding position of the Document Object Model (DOM) tree structure, and checking whether all nodes have been generated; when not all nodes are generated, the third step continues; when all nodes are generated, the generation process ends.
The beneficial effects of the above technical scheme are: by adopting the scheme provided by this embodiment, HTML code is generated from UI screenshot images based on the improved tree-shaped long short-term memory neural network model, improving the generation quality of deeply nested HTML code, saving the time of manual code writing, and improving the efficiency of HTML code development.
In one embodiment, S2 comprises:
s201, setting a convolutional neural network; the convolutional neural network comprises an input layer, convolutional layers and an output layer; each convolutional layer is followed by a normalization layer, uses the rectified linear unit (ReLU) function as its activation function, and has its weights initialized from a standard normal distribution; five convolutional layers are used; the convolution kernels of the first, second and third convolutional layers are all set to 5×5, and the kernels of the fourth and fifth layers are all set to 3×3;
and S202, inputting the UI screenshot images in the UI screenshot image set into the convolutional neural network to obtain a feature vector data set of the UI screenshot images.
The working principle of the technical scheme is as follows: the convolutional neural network has the characteristics of parameter sharing, sparse connections and translation invariance, and is chosen to process the images on that basis. To prevent over-fitting, a normalization layer is placed after each convolutional layer. To make the network outputs approximately follow a standard normal distribution, the convolutional weights are initialized from a normal distribution with mean 0 and standard deviation 0.1, and all biases are initialized to 0. The rectified linear unit (ReLU), widely used in convolutional neural networks, is adopted as the activation function. Based on the learning capacity of the convolutional neural network and the volume of the UI screenshot feature vector data set, the number of convolutional layers is set to five in this embodiment; to capture large-scale structural information in the image, the kernels of the first three convolutional layers are set to 5×5, and to better capture image detail, the kernels of the last two layers are set to 3×3. After the network is configured, the UI screenshot images in the UI screenshot image set are input into the convolutional neural network to obtain the feature vector data set of the UI screenshot images.
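The layer configuration described above can be captured declaratively, together with the standard convolution output-size formula. Stride and padding are not stated in the text, so the defaults below are assumptions.

```python
# Layer spec mirroring S201: five conv layers, 5x5 kernels for the first
# three and 3x3 for the last two, each followed by batch-norm + ReLU.
CONV_LAYERS = [
    {"kernel": 5}, {"kernel": 5}, {"kernel": 5},
    {"kernel": 3}, {"kernel": 3},
]

def feature_map_size(input_size, layers, stride=1, padding=0):
    """Spatial size after the conv stack, applying the usual
    out = (in + 2*pad - kernel) // stride + 1 per layer."""
    size = input_size
    for layer in layers:
        size = (size + 2 * padding - layer["kernel"]) // stride + 1
    return size
```

With stride 1 and no padding, a 64-pixel input shrinks by 4 pixels per 5x5 layer and by 2 per 3x3 layer; in a deep-learning framework the same spec would translate directly into a stack of conv + batch-norm + ReLU layers.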
The beneficial effects of the above technical scheme are: by adopting the scheme provided by this embodiment, configuring the convolutional neural network in this way ensures good network performance and improves the quality and efficiency of UI screenshot image feature extraction.
In one embodiment, as shown in fig. 3, S3 includes:
s301, constructing a tree-shaped neural network classification model based on the improved long short-term memory neural network model;
s302, classifying the feature vector data set based on the tree neural network classification model, and adding CSS attribute tags to obtain a tagged classification result.
The working principle of the technical scheme is as follows: the long short-term memory neural network model is a recurrent artificial neural network model inspired by the human nervous system. It consists of many node units arranged in three layers: an input layer, a hidden layer and an output layer. The input layer receives the input and has the same size as the feature vector; the hidden layer maps the input to the output through a highly nonlinear transformation; the output layer produces the final result. The method builds a tree-shaped neural network classification model by improving the long short-term memory neural network model, classifies the feature vector data set with that model, and adds CSS attribute tags to obtain a labelled classification result.
The beneficial effects of the above technical scheme are: by adopting the scheme provided by the embodiment, the long-term and short-term memory neural network model is adopted to construct the tree-like neural network classification model, and the CSS attribute tags are added, so that the quality of the UI screenshot image feature classification can be improved.
In one embodiment, S4 comprises:
s401, performing inverse mapping on the labelled classification result to obtain styled HTML element code;
s402, inserting the HTML element codes with the styles into corresponding node positions in a Document Object Model (DOM) tree structure one by one to generate HTML codes of corresponding UI screenshot images;
and S403, after all the HTML element codes with styles are inserted into the corresponding node positions, generating HTML codes of the target webpage user interface according to the document object model tree structure.
The working principle of the technical scheme is as follows: after the classification result is obtained, inverse mapping is performed according to the classification structure to obtain the HTML element code. For example, the CSS label classification results 01, 10 and 11 are inversely mapped to the styles color:red, color:blue and color:green; the content classification results 01, 10 and 11 are inversely mapped to width:33%, width:66% and width:100%; the HTML code corresponding to the classification result 0110 is therefore <div style="color:red;width:66%">. After the HTML codes are obtained, they are inserted one by one into the corresponding node positions in the Document Object Model (DOM) tree structure to generate the HTML code of the corresponding UI screenshot image; after all styled HTML element codes are inserted into their corresponding node positions, the HTML code of the target webpage user interface is generated according to the document object model tree structure.
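The inverse mapping in this example can be sketched directly. The two-bit label tables mirror the mappings quoted above (01/10/11 for colour and for width); the function name and the assumption that a label is colour bits followed by width bits are illustrative.

```python
# Hypothetical label tables mirroring the example in the text: the first
# two bits select the CSS colour, the last two bits select the width.
CSS_MAP = {"01": "color:red", "10": "color:blue", "11": "color:green"}
WIDTH_MAP = {"01": "width:33%", "10": "width:66%", "11": "width:100%"}

def decode_label(label):
    """S401-style inverse mapping: a 4-bit classification result such as
    '0110' becomes a styled HTML element code."""
    style = CSS_MAP[label[:2]] + ";" + WIDTH_MAP[label[2:]]
    return '<div style="%s">' % style
```

Each decoded element code would then be inserted at its node position in the DOM tree as described in S402.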
The beneficial effects of the above technical scheme are: by adopting the scheme provided by the embodiment, the HTML element code with the style is obtained by inverse mapping; and inserted into a corresponding node position in a Document Object Model (DOM) tree structure, the accuracy of the generated HTML code can be improved.
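The inverse mapping in S401 can be pictured as a pair of lookup tables. The sketch below uses the example mapping from the description (01/10/11 to red/blue/green and to 33%/66%/100%); the table contents and the helper name are illustrative, not part of the patented method:

```python
# Hypothetical inverse-mapping tables for S401, following the example in the
# text: the first two label bits select a CSS color, the last two a width.
STYLE_MAP = {"01": "color: red", "10": "color: blue", "11": "color: green"}
WIDTH_MAP = {"01": "width: 33%", "10": "width: 66%", "11": "width: 100%"}

def label_to_html(label: str, tag: str = "div") -> str:
    """Inverse-map a 4-bit classification label to a styled HTML element."""
    style_bits, width_bits = label[:2], label[2:]
    style = "; ".join([STYLE_MAP[style_bits], WIDTH_MAP[width_bits]])
    return f'<{tag} style="{style}"></{tag}>'

print(label_to_html("0110"))  # <div style="color: red; width: 66%"></div>
```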
In one embodiment, S301 includes:
s3011, obtaining a historical characteristic vector data set of a UI screenshot image, and dividing the historical characteristic vector data set into a training set and a verification set;
s3012, constructing an improved long short-term memory neural network model; the improved long short-term memory neural network model comprises a node unit, descendant node units of the node unit, an input gate, a forgetting gate, a first output gate and a second output gate; setting a Sigmoid activation function as a first processing mode, and setting a tanh activation function as a second processing mode;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain a forgetting gate;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain an input gate;
processing the input of the current node unit at the current moment and the input hidden layer state of the current node by adopting a second processing mode to obtain the input state of the current node unit at the current moment;
multiplying the output cell state of the last node unit at the last moment by a forgetting gate according to elements to obtain a first product; multiplying the state input by the current node unit at the current moment by an input gate according to elements to obtain a second product; adding and summing the first product and the second product to obtain the cell state of the current node unit at the current moment;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain a first output gate;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain a second output gate;
processing the cell state of the current node unit at the current moment by adopting a second processing mode, and performing product operation on a processing result and the first output gate to obtain the hidden layer state of the next node unit at the current moment;
processing the cell state of the current node unit at the current moment by adopting a second processing mode, and performing product operation on a processing result and the second output gate to obtain the initial cell state of a descendant node unit of the current node unit at the current moment;
processing the hidden layer state of the next node unit at the current moment by adopting a first processing mode to obtain an output result;
and S3013, inputting the verification set into the improved long short-term memory neural network model for training to generate a tree-like neural network classification model.
The working principle of the technical scheme is as follows: the long short-term memory model protects and controls the cell state by introducing gate control units and memory cells, namely an input gate, a forgetting gate and an output gate, thereby alleviating the vanishing-gradient problem of recurrent neural networks. The model is trained through forward propagation and backward propagation; in forward propagation, the node units of each layer receive the activation values of the node units of the previous layer as input, the activation values of the layer are calculated through a weight matrix, a bias and the Sigmoid activation function, and the output value is calculated through the tanh activation function. The specific steps of the embodiment are as follows:
s3011, obtaining a historical characteristic vector data set of a UI screenshot image, and dividing the historical characteristic vector data set into a training set and a verification set;
s3012, constructing an improved long short-term memory neural network model, wherein the improved model comprises node units, descendant node units of the node units, an input gate, a forgetting gate, a first output gate and a second output gate, as shown in figure 4; setting the Sigmoid activation function as the first processing mode, where the function outputs a value between 0 and 1 to describe how much of each component is allowed to pass, 0 meaning 'let nothing through' and 1 meaning 'let everything through'; setting the tanh activation function as the second processing mode;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain a forgetting gate; the calculation formula is as follows:
f_t = σ(W_if · x_t + b_if + W_hf · h_(t−1) + b_hf)
in the above formula, f_t is the forget gate, used to determine how much of the node unit state at the previous time is retained at the current time; x_t is the input of the current node unit at the current time; W_if and b_if are the weight matrix and bias corresponding to x_t; h_(t−1) is the input hidden layer state of the current node; W_hf and b_hf are the weight matrix and bias corresponding to h_(t−1); σ denotes the Sigmoid function; t is the time value;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain an input gate; the calculation formula is as follows:
i_t = σ(W_ii · x_t + b_ii + W_hi · h_(t−1) + b_hi)
in the above formula, i_t is the input gate, used to determine how much of the input of the current node unit is saved to the cell state; x_t is the input of the current node unit at the current time; W_ii and b_ii are the weight matrix and bias corresponding to x_t; h_(t−1) is the input hidden layer state of the current node; W_hi and b_hi are the weight matrix and bias corresponding to h_(t−1); σ denotes the Sigmoid function; t is the time value;
processing the input of the current node unit at the current moment and the input hidden layer state of the current node by adopting a second processing mode to obtain the input state of the current node unit at the current moment; the calculation formula is as follows:
g_t = tanh(W_ig · x_t + b_ig + W_hg · h_(t−1) + b_hg)
in the above formula, g_t is the state input of the current node unit at the current time; x_t is the input of the current node unit at the current time; W_ig and b_ig are the weight matrix and bias corresponding to x_t; h_(t−1) is the input hidden layer state of the current node; W_hg and b_hg are the weight matrix and bias corresponding to h_(t−1); tanh refers to the tanh activation function; t is the time value;
multiplying the output cell state of the last node unit at the last moment by the forgetting gate according to the element to obtain a first product; multiplying the state input by the current node unit at the current moment by an input gate according to elements to obtain a second product; adding and summing the first product and the second product to obtain the cell state of the current node unit at the current moment; the calculation formula is as follows:
c_t = f_t * c_(t−1) + i_t * g_t
in the above formula, g_t is the state input of the current node unit at the current time; i_t is the input gate; f_t is the forget gate; c_(t−1) is the cell state of the node unit at the previous time; c_t is the cell state of the current node unit at the current time; t is the time value;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain a first output gate; the calculation formula is as follows:
o_t = σ(W_io · x_t + b_io + W_ho · h_(t−1) + b_ho)
in the above formula, o_t is the first output gate; x_t is the input of the current node unit at the current time; W_io and b_io are the weight matrix and bias corresponding to x_t; h_(t−1) is the input hidden layer state of the current node; W_ho and b_ho are the weight matrix and bias corresponding to h_(t−1); σ denotes the Sigmoid function; t is the time value;
processing the input of the current node unit and the input hidden layer state of the current node at the current moment by adopting a first processing mode to obtain a second output gate; the calculation formula is as follows:
s_t = σ(W_is · x_t + b_is + W_hs · h_(t−1) + b_hs)
in the above formula, s_t is the second output gate; x_t is the input of the current node unit at the current time; W_is and b_is are the weight matrix and bias corresponding to x_t; h_(t−1) is the input hidden layer state of the current node; W_hs and b_hs are the weight matrix and bias corresponding to h_(t−1); σ denotes the Sigmoid function; t is the time value;
processing the cell state of the current node unit at the current moment by adopting a second processing mode, and performing product operation on a processing result and the first output gate to obtain the hidden layer state of the next node unit at the current moment; the calculation formula is as follows:
h_t = o_t * tanh(c_t)
in the above formula, h_t is the hidden layer state of the next node unit at the current time; o_t is the first output gate; c_t is the cell state of the current node unit at the current time; tanh refers to the tanh activation function; t is the time value;
processing the cell state of the current node unit at the current moment by adopting a second processing mode, and performing product operation on a processing result and the second output gate to obtain the initial cell state of a descendant node unit of the current node unit at the current moment; the calculation formula is as follows:
c_t0 = s_t * tanh(c_t)
in the above formula, c_t0 is the initial cell state of the descendant node unit of the current node unit at the current time; s_t is the second output gate; c_t is the cell state of the current node unit at the current time; tanh refers to the tanh activation function; t is the time value;
processing the hidden layer state of the next node unit at the current moment by adopting a first processing mode to obtain an output result; the calculation formula is as follows:
y_t = σ(W_classify · h_t + b_classify)
in the above formula, y_t is the classification result; h_t is the hidden layer state of the next node unit at the current time; W_classify and b_classify are the weight matrix and bias corresponding to h_t; σ denotes the Sigmoid function; t is the time value;
s3013, inputting the verification set into the improved long short-term memory neural network model for training, and generating a tree-like neural network classification model.
The beneficial effects of the above technical scheme are: by adopting the scheme provided by this embodiment, the improved long short-term memory neural network model with the added output gate facilitates the construction of the tree-like neural network classification model and improves classification efficiency.
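A minimal numerical sketch of the modified cell described in S3012, implementing the gate equations above; the parameter dictionary layout, random initialization and dimensions are assumptions for illustration, not the patent's implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tree_lstm_step(x_t, h_prev, c_prev, P):
    """One step of the modified LSTM node unit with two output gates.

    Follows the equations f_t, i_t, g_t, c_t of a standard LSTM, plus a
    second output gate s_t that produces the initial cell state c_t0 of a
    descendant node unit.
    """
    f_t = sigmoid(P["W_if"] @ x_t + P["b_if"] + P["W_hf"] @ h_prev + P["b_hf"])
    i_t = sigmoid(P["W_ii"] @ x_t + P["b_ii"] + P["W_hi"] @ h_prev + P["b_hi"])
    g_t = np.tanh(P["W_ig"] @ x_t + P["b_ig"] + P["W_hg"] @ h_prev + P["b_hg"])
    c_t = f_t * c_prev + i_t * g_t                   # cell state of current unit
    o_t = sigmoid(P["W_io"] @ x_t + P["b_io"] + P["W_ho"] @ h_prev + P["b_ho"])
    s_t = sigmoid(P["W_is"] @ x_t + P["b_is"] + P["W_hs"] @ h_prev + P["b_hs"])
    h_t = o_t * np.tanh(c_t)                         # hidden state for next unit
    c_t0 = s_t * np.tanh(c_t)                        # seed state for descendant unit
    y_t = sigmoid(P["W_classify"] @ h_t + P["b_classify"])  # classification output
    return h_t, c_t, c_t0, y_t

rng = np.random.default_rng(0)
d_in, d_h, d_out = 4, 3, 2                           # illustrative sizes
P = {}
for g in "figos":                                    # forget, input, g-state, output 1, output 2
    P[f"W_i{g}"] = rng.normal(size=(d_h, d_in)); P[f"b_i{g}"] = np.zeros(d_h)
    P[f"W_h{g}"] = rng.normal(size=(d_h, d_h));  P[f"b_h{g}"] = np.zeros(d_h)
P["W_classify"] = rng.normal(size=(d_out, d_h)); P["b_classify"] = np.zeros(d_out)

h_t, c_t, c_t0, y_t = tree_lstm_step(rng.normal(size=d_in),
                                     np.zeros(d_h), np.zeros(d_h), P)
```

Because every gate is a Sigmoid output, h_t and c_t0 are bounded by the tanh of the cell state, and y_t lies strictly in (0, 1).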
In one embodiment, as shown in fig. 5, S302 includes;
s3021, creating a first blank queue and a first root node Document Object Model (DOM) tree structure;
s3022, acquiring characteristics of the first UI screenshot image in the verification set;
s3023, inputting the characteristics of the first UI screenshot image into a tree-shaped neural network classification model initial node unit;
s3024, classifying the features to obtain classification results of the features, and obtaining initial cell states of descendant node units of the initial node units;
s3025, judging whether the classification is finished, if so, judging whether the first blank queue is empty, and if so, inversely mapping the Document Object Model (DOM) tree structure of the first root node to obtain an HTML code; if not, dequeuing a feature from the first blank queue, and going to the step S3023 for the feature to perform an operation; if the classification is not finished, the step goes to S3026;
s3026, inverse mapping is carried out on the classification result to obtain an HTML tag code, and the HTML tag code is inserted into a position corresponding to a Document Object Model (DOM) tree structure of the first root node; and simultaneously adding the characteristics of the initial node unit into the first blank queue buffer, generating the initial cell state of the next node unit of the initial node unit, and turning to S3024 to continue the classification operation.
The working principle of the technical scheme is as follows: the Document Object Model (DOM) provides a programming interface for HTML documents; any HTML document can be represented as a structure of nodes on multiple levels. This embodiment edits the HTML code based on the document object model and queue operations, and specifically includes:
s302 comprises the following steps;
s3021, creating a first blank queue and a first root node Document Object Model (DOM) tree structure;
s3022, acquiring characteristics of the first UI screenshot image in the verification set;
s3023, inputting the characteristics of the first UI screenshot image into a tree-shaped neural network classification model initial node unit;
s3024, classifying the features to obtain classification results of the features, and obtaining initial cell states of descendant node units of the initial node units;
s3025, judging whether the classification is finished, if so, judging whether the first blank queue is empty, and if so, inversely mapping the Document Object Model (DOM) tree structure of the first root node to obtain an HTML code; if not, dequeuing a feature from the first blank queue, and going to the step S3023 for the feature to perform an operation; if the classification is not finished, the step goes to S3026;
s3026, inverse mapping is carried out on the classification result to obtain an HTML tag code, and the HTML tag code is inserted into a position corresponding to a Document Object Model (DOM) tree structure of the first root node; and simultaneously adding the characteristics of the initial node unit into the first blank queue cache, generating the initial cell state of the next node unit of the initial node unit, and turning to S3024 to continue the classification operation.
The beneficial effects of the above technical scheme are: by adopting the scheme provided by the embodiment, the hierarchy of the tree neural network classification model can be increased and the operation speed can be improved by adopting a Document Object Model (DOM) tree structure.
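The queue-driven generation loop of S3021 to S3026 can be sketched as follows. The `classify` stub stands in for the tree neural network classifier, and a feature is represented as a toy `(tag, children)` pair; all names here are illustrative:

```python
from collections import deque

# Stand-in for the tree-LSTM classifier: returns an HTML tag for the feature
# plus the features of its child regions (empty when that branch is finished).
def classify(feature):
    tag, children = feature  # toy representation: (tag, [child features])
    return tag, children

def render(node):
    """Inverse-map a DOM node (S3025) into its HTML string."""
    inner = "".join(render(c) for c in node["children"])
    return f"<{node['tag']}>{inner}</{node['tag']}>"

def features_to_html(root_feature):
    root = {"tag": "html", "children": []}          # S3021: root DOM structure
    queue = deque([(root_feature, root)])           # S3021: first (blank) queue
    while queue:                                    # S3025: loop until queue empty
        feature, parent = queue.popleft()           # dequeue a feature
        tag, children = classify(feature)           # S3023/S3024: classify it
        node = {"tag": tag, "children": []}
        parent["children"].append(node)             # S3026: insert at DOM position
        for child in children:
            queue.append((child, node))             # S3026: buffer child features
    return render(root)

page = features_to_html(("body", [("div", [("span", [])]), ("p", [])]))
print(page)
```

The breadth-first order of the queue mirrors the level-by-level classification described in the steps above.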
In one embodiment, the S302 adding CSS attribute tags includes:
s302-1, acquiring a text of a Cascading Style Sheet (CSS) referred by a target webpage, and acquiring a text style set;
s302-2, screening the text style set, selecting a character string style and a symbol style in a text, and generating a first style set;
s302-3, acquiring the style of the text name with the connection number in the text style set, segmenting the style, discarding the content behind the connection number in the text name, reserving the content in front of the connection number in the text name, taking the reserved content as a new style of the text name, renaming the text name, and generating a new name text; summarizing the new name texts to generate a second style set;
s302-4, deleting the first style set and the second style set from the text style set to obtain a third style set;
s302-5, inputting the third style set into the tree-like neural network classification model to obtain a CSS style classification result, and setting the CSS style classification result as a classification label.
The working principle of the technical scheme is as follows: CSS is a declarative, domain-specific language that defines how structured documents are rendered and can also be used to build user interfaces for desktop and mobile applications. Since CSS style files are text-based and spaces and line feeds are unnecessary redundancy, all style texts need to be traversed and a deletion operation carried out on the rules containing the relevant style names, so that simplified styles with independent attribute values are obtained. The embodiment specifically includes:
s302-1, obtaining a text of a CSS (cascading style sheet) quoted by a target webpage, and obtaining a text style set;
s302-2, screening the text style set, selecting a character string style and a symbol style in a text, and generating a first style set;
s302-3, acquiring the style of the text name with the connection number in the text style set, segmenting the style, abandoning the content after the connection number in the text name, reserving the content before the connection number in the text name, taking the reserved content as a new style of the text name, and renaming the text name to generate a new name text; summarizing the new name texts to generate a second style set;
s302-4, deleting the first style set and the second style set from the text style set to obtain a third style set;
s302-5, inputting the third style set into the tree-like neural network classification model to obtain a CSS style classification result, and setting the CSS style classification result as a classification label.
The beneficial effects of the above technical scheme are: by adopting the scheme provided by the embodiment, the CSS style is processed, so that the complexity of the CSS style can be effectively reduced, the selectivity of the CSS attribute value is improved, and the identification degree of the classification label is enhanced.
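A minimal sketch of the set operations in S302-1 to S302-4, applied to a hypothetical three-rule style set; the symbol list and selector names are assumptions chosen for illustration:

```python
# Hypothetical text style set: selector name -> style text.
styles = {
    '"quoted"':     "content: 'x'",   # string/symbol style -> first style set
    ".btn-primary": "color: blue",    # hyphenated name     -> second style set
    ".nav":         "width: 33%",     # plain style         -> third style set
}

SYMBOLS = set("\"'*>:[]")             # illustrative string/symbol markers

# S302-2: first style set, the string and symbol styles.
first_set = {k: v for k, v in styles.items() if any(ch in SYMBOLS for ch in k)}

# S302-3: split hyphenated names, keeping the part before the hyphen as the
# new name of the renamed text; collect them as the second style set.
second_set = {name.split("-", 1)[0]: style
              for name, style in styles.items()
              if name not in first_set and "-" in name}

# S302-4: delete the first and second sets, leaving the third style set.
third_set = {k: v for k, v in styles.items()
             if k not in first_set and "-" not in k}

print(third_set)  # {'.nav': 'width: 33%'}
```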
In one embodiment, S1 comprises:
s101, converting an image in a target webpage user interface into an image with black and white gray values;
s102, binarizing the image represented by black and white gray values based on Canny operator edge detection to obtain a binarized image;
s103, performing sliding capture on the binarized image with a preset sliding window to obtain screenshot images, and summarizing the screenshot images to obtain a UI screenshot image set.
The working principle of the technical scheme is as follows: the gray value of the image is adopted to represent the image, so that the storage space and the calculation time can be saved; through the binarization processing of the image, the data volume of the image can be reduced; the Canny operator has obvious application effect in the aspects of image segmentation and small target movement tracking, can maximally inhibit false response edges and can inhibit the influence of noise;
the Canny operator is an operator embodying an optimization idea; it is an algorithm that effectively combines a Gaussian function with a first derivative, and it has a low error rate, high positioning precision and a single-edge response criterion. The edge detection steps of the operator are as follows:
the first step, a Gaussian function and an image function are used for convolution, noise in an image is smoothed, and a filtered image is obtained;
secondly, calculating the gradient amplitudes of the image in the horizontal and vertical directions by using a finite difference operator;
thirdly, traversing the gradient amplitude image, comparing the gradient value of each pixel with the gradient values of two adjacent pixels in the gradient direction of the pixel, judging whether the gradient value is the maximum value, if so, the pixel is possibly an edge point, reserving the gradient value, if not, the pixel is certainly not the edge point, and setting the gradient value of the pixel to be zero so as to ensure the single edge of the detection result;
and fourthly, detecting and connecting edges by adopting a double-threshold algorithm. Two thresholds are set, one high and one low; the result image processed in the third step is segmented, and edge points are further detected. In this step the dual thresholds must be set manually, which gives poor adaptive capability: the same pair of thresholds applied to different images yields very different edge detection results. If the threshold is set too high, some edge points of the image are filtered out, the obtained edges tend to break, and some important boundaries are lost; if the threshold is set too low, noise or non-target edge points are detected, producing false detections, reducing edge localization accuracy and degrading the quality of the edge detection result. Therefore, this embodiment uses the gray level histogram of the image and takes the inter-class variance between target and background as the measure criterion to select the optimal threshold. Specifically, the gray level information of the gradient image is used to divide the pixel points of the gradient image into an edge class and a background class; when gradient pixel points of the edge class and the background class are misclassified, the inter-class variance between the two classes decreases, so the classification of boundary and background is most correct, and the probability of a pixel point being misclassified is smallest, when the inter-class variance of the classification result is maximum. The principle is as follows: suppose a digital image has m different gray levels; set a threshold r (0 < r < m−1), threshold the gradient amplitude image, divide the gradient amplitude pixel points into an image background gray level set and an image edge gray level set, and calculate the inter-class variance of the gradient amplitude pixel points, wherein the calculation formula is as follows:
β²(r) = [P · T1(r) − N(r)]² / (T1(r) · (1 − T1(r)))
in the above formula, r is a threshold with 0 < r < m−1; β²(r) is the inter-class variance corresponding to the threshold r; T1(r) is the probability that an image pixel is assigned to the gray level set comprising the image background; P is the average gray value of the whole image; N(r) is the cumulative mean of the gray levels up to r; m is the number of gray levels, with gray levels ∈ {0, 1, 2, …, m−1}; the threshold that maximizes the inter-class variance, calculated by the above formula, is the optimal threshold; by adopting the optimal threshold, the self-adaptive capability of the algorithm can be improved.
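The optimal-threshold selection can be sketched with numpy using the between-class variance criterion just described (a standard Otsu-style computation; the toy histogram below is an assumption for illustration):

```python
import numpy as np

def optimal_threshold(hist):
    """Threshold r maximizing beta^2(r) = (P*T1(r) - N(r))^2 / (T1(r)*(1 - T1(r))),
    where T1(r) is the cumulative probability up to gray level r, N(r) the
    cumulative mean up to r, and P the average gray value of the whole image."""
    p = hist / hist.sum()                 # gray-level probabilities
    levels = np.arange(len(p))
    T1 = np.cumsum(p)                     # cumulative probability T1(r)
    N = np.cumsum(levels * p)             # cumulative mean N(r)
    P = N[-1]                             # global mean gray value
    with np.errstate(divide="ignore", invalid="ignore"):
        beta2 = (P * T1 - N) ** 2 / (T1 * (1.0 - T1))
    beta2[~np.isfinite(beta2)] = 0.0      # degenerate thresholds score zero
    return int(np.argmax(beta2))

# Toy bimodal gradient-magnitude histogram: background near level 2, edges near 12.
hist = np.zeros(16)
hist[[1, 2, 3]] = [30, 60, 30]
hist[[11, 12, 13]] = [20, 40, 20]
r = optimal_threshold(hist)
print(r)  # a threshold between the two modes
```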
The embodiment adopts the above processing method to process the image, and specifically includes:
s101, converting an image in the target webpage user interface into an image represented by black and white gray values;
s102, binarizing the image represented by black and white gray values based on Canny operator edge detection to obtain a binarized image;
s103, performing sliding capture on the binarized image with a preset sliding window to obtain screenshot images, and summarizing the screenshot images to obtain a UI screenshot image set.
The beneficial effects of the above technical scheme are: by adopting the scheme provided by the embodiment, the interception quality and the working efficiency of the UI screenshot image can be improved by processing the UI screenshot image; by improving the threshold value in Canny operator edge detection, the self-adaptive capacity of the algorithm can be improved, and the accuracy of the edge detection result can be improved.
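S101 to S103 can be sketched with numpy alone; a plain gray-level threshold stands in here for the Canny-based binarization, and the window size and stride are illustrative choices:

```python
import numpy as np

def to_gray(rgb):
    """S101: convert an RGB image to luminance gray values."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def binarize(gray, thr=128):
    """S102: simple threshold standing in for the Canny-based binarization."""
    return (gray > thr).astype(np.uint8)

def sliding_crops(img, win=2, stride=2):
    """S103: slide a win x win window over the binarized image."""
    h, w = img.shape
    return [img[y:y + win, x:x + win]
            for y in range(0, h - win + 1, stride)
            for x in range(0, w - win + 1, stride)]

# Toy 4x4 RGB image with a bright 2x2 block in the bottom-right corner.
rgb = np.zeros((4, 4, 3)); rgb[2:, 2:] = 255.0
crops = sliding_crops(binarize(to_gray(rgb)))
```

The four crops together form the toy counterpart of the UI screenshot image set.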
In one embodiment, as shown in fig. 6, further comprising S5: the accuracy verification is carried out on the generated HTML codes of the target webpage user interface, and the specific steps are as follows:
s501, acquiring a webpage source code corresponding to a target webpage user interface;
s502, capturing the content in the webpage source code to obtain a webpage source code content character string;
s503, according to the webpage source code content character strings, constructing a character string error category library, and creating a plurality of global character string pattern matching functions, wherein the error categories correspond to corresponding global character string pattern matching functions;
s504, matching HTML codes of the target webpage user interface based on the global character string mode matching functions, and verifying the accuracy of the HTML codes.
The working principle of the technical scheme is as follows: after the web page code is generated, it is necessary to verify the accuracy of the code in order to check for errors and correct them in time. The specific steps of the embodiment are as follows:
s501, acquiring a webpage source code corresponding to a target webpage user interface;
s502, capturing content in the webpage source code to obtain a webpage source code content character string;
s503, according to the webpage source code content character strings, constructing a character string error category library, and creating a plurality of global character string pattern matching functions, wherein the error categories correspond to corresponding global character string pattern matching functions;
s504, matching HTML codes of the target webpage user interface based on the global character string pattern matching functions, and verifying the accuracy of the HTML codes.
The beneficial effects of the above technical scheme are: by adopting the scheme provided by the embodiment, the error code can be found in time by verifying the accuracy of the generated code, and reference basis is provided for improving the network structure and correcting the generation method.
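The verification of S503 and S504 can be sketched as a hypothetical error-category library of global string patterns matched against the generated HTML; the category names and patterns are assumptions, not part of the patented method:

```python
import re

# Hypothetical error-category library: category name -> global string pattern.
ERROR_PATTERNS = {
    "empty_style":    re.compile(r'style="\s*"'),               # style attribute with no rules
    "width_over_100": re.compile(r"width:\s*(?:[2-9][0-9]{2,}|1[0-9]{3,})%"),
    "unclosed_quote": re.compile(r'style="[^"]*$'),             # attribute quote never closed
}

def verify_html(html):
    """Return the sorted error categories whose pattern matches anywhere in html."""
    return sorted(cat for cat, pat in ERROR_PATTERNS.items() if pat.search(html))

errors = verify_html('<div style="">ok</div><p style="width: 500%"></p>')
print(errors)
```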
In one embodiment, further comprising S6: the method for screenshot of the UI image comprises the following specific steps:
s601, obtaining size information of a UI image in a target webpage user interface to obtain minimum size information;
s602, initializing a display proportion of a target webpage, and setting the target webpage to be displayed in a 100% proportion;
s603, constructing a visual form mask on the target webpage, wherein the visual form mask covers the entire target webpage, and the size of the cells in the visual form mask is set to 1/16 of the minimum size information;
s604, adjusting the display proportion of the target webpage, and adjusting the target webpage to be displayed according to 400% proportion;
s605, acquiring a table area corresponding to the UI image in the target webpage user interface and a cell position contained in the table area by using the visual table template, operating a mouse pointer to be arranged at a starting cell position corresponding to the UI image according to the cell position, pressing a left mouse button to generate a sliding window, intercepting the UI image, and finishing the interception of the UI image when the mouse pointer moves to an ending cell position corresponding to the UI image.
The working principle of the technical scheme is as follows: when obtaining a UI screenshot image from the target webpage user interface, the accuracy and quality of the screenshot must be guaranteed so that the UI screenshot can be obtained reliably; the screenshot method comprises the following steps:
s601, obtaining size information of a UI image in a target webpage user interface to obtain minimum size information;
s602, initializing a target webpage display proportion, and setting the target webpage to be displayed in a 100% proportion;
s603, constructing a visual form mask on the target webpage, wherein the visual form mask covers all the target webpage, and the size of a cell in the visual form template is set to be 1/16 of the minimum size information;
s604, adjusting the display proportion of the target webpage, and adjusting the target webpage to be displayed according to 400% proportion;
s605, acquiring a table area corresponding to the UI image in the target webpage user interface and a cell position contained in the table area by using the visual table template, operating a mouse pointer to be arranged at a starting cell position corresponding to the UI image according to the cell position, pressing a left mouse button to generate a sliding window, intercepting the UI image, and finishing the interception of the UI image when the mouse pointer moves to an ending cell position corresponding to the UI image.
The beneficial effects of the above technical scheme are: by adopting the scheme provided by the embodiment, the visual mask is adopted, the display scale is adjusted, the area range of the UI image can be accurately selected, and the accuracy of capturing the UI screenshot image is improved.
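The geometry implied by S601 to S605 (cells of 1/16 of the minimum image size, page rendered at 400%) can be sketched as a mapping from a start/end cell range of the mask to an on-screen pixel rectangle; the zero-based cell-addressing convention is an assumption:

```python
# Map a start/end cell range of the visual form mask to an on-screen pixel
# rectangle. Cells are (row, col), zero-based; zoom=4.0 models 400% display.
def cell_rect(start_cell, end_cell, min_size, zoom=4.0):
    cell = min_size / 16.0 * zoom          # S603: cell = 1/16 of min size, scaled
    (r0, c0), (r1, c1) = start_cell, end_cell
    return (c0 * cell, r0 * cell, (c1 + 1) * cell, (r1 + 1) * cell)  # x0, y0, x1, y1

print(cell_rect((0, 0), (3, 3), min_size=64))  # (0.0, 0.0, 64.0, 64.0)
```

A 4x4 cell range on a 64-pixel minimum-size image at 400% zoom therefore spans a 64-pixel square on screen.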
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (6)

1. A method for generating webpage codes from images based on a tree neural network is characterized by comprising the following steps:
s1, processing images in a target webpage user interface, and acquiring a UI screenshot image set;
s2, based on a preset convolutional neural network, performing feature extraction on the UI screenshot images in the UI screenshot image set to obtain a feature vector data set of the UI screenshot images;
s3, performing multi-label classification on the feature vector data set based on a tree neural network classification model to obtain a classification result;
s4, decoding the classification result to generate an HTML code of a target webpage user interface;
the S3 comprises the following steps:
s301, constructing a tree-like neural network classification model based on the improved long-term and short-term memory neural network model;
s302, classifying the feature vector data set based on the tree neural network classification model, and adding CSS attribute tags to obtain a tagged classification result;
the S4 comprises the following steps:
s401, inverse mapping is carried out on the classification result with the label, and an HTML element code with a style is obtained;
s402, inserting the HTML element codes with styles into corresponding node positions in a Document Object Model (DOM) tree structure one by one to generate HTML codes of corresponding UI screenshot images;
s403, after all the HTML element codes with styles are inserted into corresponding node positions, generating HTML codes of a target webpage user interface according to the DOM tree structure;
wherein, S302 includes:
s3021, creating a first blank queue and a first root node Document Object Model (DOM) tree structure;
s3022, acquiring characteristics of a first UI screenshot image in the verification set; the verification set is obtained by dividing a vector data set according to the historical characteristics of the UI screenshot image;
s3023, inputting the characteristics of the first UI screenshot image into a tree-shaped neural network classification model initial node unit;
s3024, classifying the features to obtain classification results of the features, and obtaining initial cell states of descendant node units of the initial node units;
s3025, judging whether the classification is finished; if so, judging whether the first queue is empty, and if the queue is empty, inversely mapping the first root node document object model (DOM) tree structure to obtain the HTML code; if the queue is not empty, dequeuing a feature from it and going to step S3023 to process that feature; if the classification is not finished, going to step S3026;
s3026, inversely mapping the classification result to obtain an HTML tag code, and inserting the HTML tag code into the corresponding position of the first root node DOM tree structure; meanwhile, adding the features of the initial node unit into the first queue cache, generating the initial cell state of the next node unit of the initial node unit, and going to step S3024 to continue the classification operation;
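The queue-driven loop of steps S3021 to S3026 amounts to a breadth-first construction of the DOM tree. A minimal sketch, in which `classify` and `inverse_map` are hypothetical stand-ins for the tree neural network classifier and the label-to-HTML-tag inverse mapping:

```python
from collections import deque

def decode_to_dom(first_feature, classify, inverse_map):
    """Breadth-first decoding of image features into a DOM tree (sketch).

    `classify(feature)` is assumed to yield (label, child_feature) pairs,
    with child_feature None when a node has no descendants to classify.
    """
    queue = deque()                              # S3021: first (initially blank) queue
    root = {"tag": "html", "children": []}       # S3021: root-node DOM structure
    queue.append((first_feature, root))          # seed with the first image feature
    while queue:                                 # S3025: loop until the queue is empty
        feature, parent = queue.popleft()
        for label, child_feature in classify(feature):      # S3024: classify
            node = {"tag": inverse_map[label], "children": []}
            parent["children"].append(node)      # S3026: insert at the DOM position
            if child_feature is not None:        # descendants still to classify
                queue.append((child_feature, node))
    return root
```

The returned dict tree can then be serialized to HTML node by node, as in S401 to S403.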
wherein, the step S302 of adding CSS attribute tags includes:
s302-1, acquiring the Cascading Style Sheet (CSS) text referenced by the target webpage to obtain a text style set;
s302-2, screening the text style set and selecting the character string styles and symbol styles in the text to generate a first style set;
s302-3, acquiring, from the text style set, the styles whose text names contain a hyphen; splitting each such name at the hyphen, discarding the content after the hyphen and keeping the content before it; using the kept content as the new text name, and renaming the text accordingly to generate a renamed text; summarizing the renamed texts to generate a second style set;
s302-4, deleting the first style set and the second style set from the text style set to obtain a third style set;
s302-5, inputting the third style set into the tree-like neural network classification model to obtain a CSS style classification result, and setting the CSS style classification result as a classification label.
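Steps S302-1 to S302-4 partition the style set before classification. A minimal sketch, where `is_string_or_symbol` is a hypothetical predicate standing in for the S302-2 screening, and "deleting the second style set" is interpreted as removing the hyphenated originals that the second set was derived from:

```python
def partition_styles(style_names, is_string_or_symbol):
    """Partition CSS style names per S302-1..S302-4 (sketch)."""
    first = {s for s in style_names if is_string_or_symbol(s)}   # S302-2
    hyphenated = {s for s in style_names if "-" in s}
    # S302-3: keep only the part of each hyphenated name before the hyphen
    second = {s.split("-", 1)[0] for s in hyphenated}
    # S302-4: remove the first set and the hyphenated originals,
    # leaving the third style set for the classifier (S302-5)
    third = set(style_names) - first - hyphenated
    return first, second, third
```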
2. The method for generating webpage code from image based on the tree neural network as claimed in claim 1, wherein S2 comprises:
s201, setting a convolutional neural network; the convolutional neural network comprises an input layer, convolutional layers and an output layer; each convolutional layer is connected with a normalization layer, the activation function of the convolutional layers is the rectified linear unit (ReLU) function, and the convolutional layer weights are initialized from a standard normal distribution; five convolutional layers are set, the convolution kernels of the first, second and third layers are all 5×5, and the convolution kernels of the fourth and fifth layers are all 3×3;
and S202, inputting the UI screenshot images in the UI screenshot image set into the convolutional neural network to obtain a characteristic vector data set of the UI screenshot images.
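A sketch of the S201 feature extractor under stated assumptions: the five-layer stack (5×5 kernels on layers 1 to 3, 3×3 on layers 4 and 5) with per-channel normalization, ReLU and standard-normal weights is reproduced here with a naive NumPy convolution; the channel widths and the final pooling are assumptions, and a practical implementation would use a deep-learning framework:

```python
import numpy as np

def conv2d(x, w, pad):
    """Naive 'same' convolution: x is (C_in, H, W), w is (C_out, C_in, k, k)."""
    c_out, c_in, k, _ = w.shape
    h, wd = x.shape[1:]
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((c_out, h, wd))
    for i in range(h):
        for j in range(wd):
            # contract the (C_in, k, k) patch against every output filter
            out[:, i, j] = np.tensordot(w, xp[:, i:i + k, j:j + k], axes=3)
    return out

def extract_features(img, rng=None):
    """S201/S202 sketch: five conv layers, each followed by a per-channel
    normalization layer and ReLU; weights drawn from the standard normal
    distribution as the claim specifies."""
    rng = np.random.default_rng(0) if rng is None else rng
    x, in_ch = img, img.shape[0]
    for out_ch, k in [(8, 5), (8, 5), (8, 5), (8, 3), (8, 3)]:  # assumed widths
        w = rng.standard_normal((out_ch, in_ch, k, k))   # standard normal init
        x = conv2d(x, w, pad=k // 2)
        x = (x - x.mean(axis=(1, 2), keepdims=True)) / (
            x.std(axis=(1, 2), keepdims=True) + 1e-5)    # normalization layer
        x = np.maximum(x, 0.0)                           # ReLU activation
        in_ch = out_ch
    return x.mean(axis=(1, 2))   # pooled feature vector (assumed readout)
```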
3. The method for generating webpage code from image based on the tree neural network as claimed in claim 1, wherein S301 comprises:
s3011, obtaining the historical feature vector data set of the UI screenshot images, and dividing the historical feature vector data set into a training set and a verification set;
s3012, constructing an improved long short-term memory (LSTM) neural network model; the improved LSTM model comprises a node unit, descendant node units of the node unit, an input gate, a forget gate, a first output gate and a second output gate; the Sigmoid activation function is set as the first processing mode, and the tanh activation function is set as the second processing mode;
processing the input of the current node unit at the current moment and the input hidden layer state of the current node with the first processing mode to obtain the forget gate;
processing the input of the current node unit at the current moment and the input hidden layer state of the current node with the first processing mode to obtain the input gate;
processing the input of the current node unit at the current moment and the input hidden layer state of the current node with the second processing mode to obtain the input state of the current node unit at the current moment;
multiplying the output cell state of the previous node unit at the previous moment element-wise by the forget gate to obtain a first product; multiplying the input state of the current node unit at the current moment element-wise by the input gate to obtain a second product; summing the first product and the second product to obtain the cell state of the current node unit at the current moment;
processing the input of the current node unit at the current moment and the input hidden layer state of the current node with the first processing mode to obtain the first output gate;
processing the input of the current node unit at the current moment and the input hidden layer state of the current node with the first processing mode to obtain the second output gate;
processing the cell state of the current node unit at the current moment with the second processing mode, and multiplying the result by the first output gate to obtain the hidden layer state of the next node unit at the current moment;
processing the cell state of the current node unit at the current moment with the second processing mode, and multiplying the result by the second output gate to obtain the initial cell state of a descendant node unit of the current node unit at the current moment;
processing the hidden layer state of the next node unit at the current moment with the first processing mode to obtain the output result;
and S3013, inputting the training set into the improved long short-term memory neural network model for training to generate the tree neural network classification model.
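One step of the modified LSTM cell described in claim 3 can be written out directly: besides the usual forget and input gates, it carries two output gates, one producing the hidden state passed to the next node unit and one producing the initial cell state handed down to descendant node units. The weight layout below, a dict of (W, U, b) triples per gate, is an assumption:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tree_lstm_cell(x, h_prev, c_prev, W):
    """One step of the two-output-gate LSTM cell of S3012 (sketch)."""
    gate = lambda k: W[k][0] @ x + W[k][1] @ h_prev + W[k][2]
    f = sigmoid(gate("f"))        # forget gate (first processing mode)
    i = sigmoid(gate("i"))        # input gate (first processing mode)
    g = np.tanh(gate("g"))        # input state (second processing mode)
    c = f * c_prev + i * g        # cell state: element-wise products, summed
    o1 = sigmoid(gate("o1"))      # first output gate
    o2 = sigmoid(gate("o2"))      # second output gate
    h_next = o1 * np.tanh(c)      # hidden state for the next node unit
    c_child = o2 * np.tanh(c)     # initial cell state for descendant units
    y = sigmoid(h_next)           # output result (first processing mode)
    return y, h_next, c_child, c
```

Passing `c_child` to a child node instead of `c` is what distinguishes this tree-shaped cell from a standard chain LSTM.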
4. The method for generating webpage code from image based on the tree neural network as claimed in claim 1, wherein S1 comprises:
s101, converting the images in the target webpage user interface into grayscale images;
s102, binarizing the grayscale images based on Canny operator edge detection to obtain binarized images;
s103, sliding a preset window over the binarized images to intercept screenshot images, and summarizing the screenshot images to obtain the UI screenshot image set.
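A preprocessing sketch for S101 to S103. A plain gradient-magnitude threshold stands in for the Canny detector here to keep the example self-contained, and the window size, stride and threshold are assumptions:

```python
import numpy as np

def screenshot_patches(image_rgb, win=64, stride=64, thresh=30.0):
    """S101-S103 sketch: grayscale, edge-based binarization, sliding window."""
    # S101: weighted grayscale conversion
    gray = image_rgb @ np.array([0.299, 0.587, 0.114])
    # S102: binarize on gradient magnitude (stand-in for Canny edges)
    gy, gx = np.gradient(gray)
    binary = (np.hypot(gx, gy) > thresh).astype(np.uint8) * 255
    # S103: slide a fixed window over the binarized image, collecting crops
    h, w = binary.shape
    return [binary[y:y + win, x:x + win]
            for y in range(0, h - win + 1, stride)
            for x in range(0, w - win + 1, stride)]
```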
5. The method for generating webpage code from image based on the tree neural network as claimed in claim 1, further comprising S5: performing accuracy verification on the generated HTML code of the target webpage user interface, with the following specific steps:
s501, acquiring a webpage source code corresponding to a target webpage user interface;
s502, capturing the content in the webpage source code to obtain a webpage source code content character string;
s503, constructing a character string error category library according to the webpage source code content character strings, and creating a plurality of global character string pattern matching functions, wherein each error category corresponds to a global character string pattern matching function;
s504, matching HTML codes of the target webpage user interface based on the global character string pattern matching functions, and verifying the accuracy of the HTML codes.
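S503 and S504 pair error categories with global string pattern matchers. In this sketch the two categories and their regular expressions are hypothetical examples rather than the patent's actual library, which would be derived from the captured source-code content strings:

```python
import re

def build_checkers(source_html):
    """S503 sketch: an error-category library, each category paired with a
    global pattern-matching function (the categories shown are examples)."""
    return {
        "missing_doctype": lambda h: not re.match(r"(?is)\s*<!doctype html>", h),
        "empty_href": lambda h: bool(re.search(r'href=[\'"]\s*[\'"]', h)),
    }

def verify_html(generated_html, checkers):
    """S504: run every category's matcher; return the categories that flag."""
    return [name for name, check in checkers.items() if check(generated_html)]
```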
6. The method for generating webpage code from image based on the tree neural network as claimed in claim 1, further comprising S6: capturing screenshots of the UI images, with the following specific steps:
s601, obtaining size information of a UI image in a target webpage user interface to obtain minimum size information;
s602, initializing the display ratio of the target webpage, setting the target webpage to be displayed at 100%;
s603, constructing a visual table template on the target webpage, wherein the visual table template covers the entire target webpage, and the size of its cells is set to 1/16 of the minimum size information;
s604, adjusting the display ratio of the target webpage so that the target webpage is displayed at 400%;
s605, acquiring, by using the visual table template, the table area corresponding to the UI image in the target webpage user interface and the cell positions contained in the table area; moving the mouse pointer to the start cell position corresponding to the UI image according to the cell positions, pressing the left mouse button to generate a sliding window and start intercepting the UI image, and completing the interception when the mouse pointer moves to the end cell position corresponding to the UI image.
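The arithmetic behind the S601 to S605 capture grid can be made explicit; the (row, column) cell indexing below is a hypothetical convention:

```python
def cell_size(min_dim_px, divisor=16):
    """S603: the visual table's cell edge is 1/16 of the smallest
    UI-image dimension, measured at the initial 100% display ratio."""
    return min_dim_px / divisor

def on_screen(px, zoom_percent):
    """S602/S604: on-screen sizes scale with the display ratio
    (100% initially, 400% during capture)."""
    return px * zoom_percent / 100.0

def capture_rect(start_cell, end_cell, cell_px):
    """S605: the sliding window dragged from the start cell to the end
    cell covers the inclusive cell range; returns (x0, y0, x1, y1)."""
    (r0, c0), (r1, c1) = start_cell, end_cell
    return (c0 * cell_px, r0 * cell_px, (c1 + 1) * cell_px, (r1 + 1) * cell_px)
```

For a UI image whose smallest dimension is 160 px, the cell edge is 10 px at 100% and 40 px at 400%, so a drag over a 4x4 cell block captures a 160x160 px on-screen region.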
CN202210527210.0A 2022-05-16 2022-05-16 Method for generating webpage code from image based on tree-shaped neural network Active CN114821610B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210527210.0A CN114821610B (en) 2022-05-16 2022-05-16 Method for generating webpage code from image based on tree-shaped neural network

Publications (2)

Publication Number Publication Date
CN114821610A (en) 2022-07-29
CN114821610B (en) 2022-11-29

Family

ID=82515955

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210527210.0A Active CN114821610B (en) 2022-05-16 2022-05-16 Method for generating webpage code from image based on tree-shaped neural network

Country Status (1)

Country Link
CN (1) CN114821610B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116304457B (en) * 2023-02-27 2024-03-29 山东乾舜广告传媒有限公司 Marking method for webpage multiple information attribute
CN116820561B (en) * 2023-08-29 2023-10-31 成都丰硕智能数字科技有限公司 Method for automatically generating interface codes based on interface design diagram

Citations (5)

Publication number Priority date Publication date Assignee Title
CN109522017A (en) * 2018-11-07 2019-03-26 Sun Yat-sen University A webpage screenshot code generation method based on a neural network and a self-attention mechanism
CN111124380A (en) * 2019-11-26 2020-05-08 江苏艾佳家居用品有限公司 Front-end code generation method
CN113504906A (en) * 2021-05-31 2021-10-15 北京房江湖科技有限公司 Code generation method and device, electronic equipment and readable storage medium
CN113778403A (en) * 2021-01-15 2021-12-10 北京沃东天骏信息技术有限公司 Front-end code generation method and device
CN113986251A (en) * 2021-12-29 2022-01-28 中奥智能工业研究院(南京)有限公司 GUI prototype graph code conversion method based on convolution and cyclic neural network

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
CN110377282B * 2019-06-26 2021-08-17 Yangzhou University Method for generating Web code from a UI based on generative adversarial and convolutional neural networks
CN111190600B * 2019-12-31 2023-09-19 Bank of China Method and system for automatically generating front-end codes based on GRU attention model
US11416244B2 (en) * 2020-04-01 2022-08-16 Paypal, Inc. Systems and methods for detecting a relative position of a webpage element among related webpage elements
CN111562919A (en) * 2020-07-14 2020-08-21 成都市映潮科技股份有限公司 Method, system and storage medium for generating front-end webpage code based on PSD file
CN114185540A (en) * 2021-11-02 2022-03-15 Wuhan University Deep learning-based GUI code automatic generation method
CN114398138A (en) * 2022-01-19 2022-04-26 Ping An International Smart City Technology Co., Ltd. Interface generation method and device, computer equipment and storage medium

Non-Patent Citations (4)

Title
Improving pix2code based Bi-directional LSTM; Yanbin Liu et al.; 2018 IEEE International Conference on Automation, Electronics and Electrical Engineering (AUTEEE); 2019-05-23; pp. 220-223 *
pix2code: Generating Code from a Graphical User Interface Screenshot; Tony Beltramelli et al.; arXiv; 2017-09-19; pp. 1-9 *
Research on Web User Interface Code Generation Technology Based on Deep Learning; Zhang Wei; Science and Technology Innovation; May 2020 (No. 14); pp. 82-83 *
Design and Implementation of a Web Page Generation System Based on Deep Learning; Lei Hui; China Master's Theses Full-text Database, Information Science and Technology Series (Monthly); 2020-07-15 (No. 07); pp. 1-79 *

Similar Documents

Publication Publication Date Title
US11816165B2 (en) Identification of fields in documents with neural networks without templates
CN114821610B (en) Method for generating webpage code from image based on tree-shaped neural network
US11562588B2 (en) Enhanced supervised form understanding
US20230237040A1 Automated document processing for detecting, extracting, and analyzing tables and tabular data
US11954139B2 (en) Deep document processing with self-supervised learning
CN110363049B (en) Method and device for detecting, identifying and determining categories of graphic elements
CN111881262A (en) Text emotion analysis method based on multi-channel neural network
CN109408058B (en) Front-end auxiliary development method and device based on machine learning
CN109190630A (en) Character identifying method
US20180365594A1 (en) Systems and methods for generative learning
CN111428457A (en) Automatic formatting of data tables
RU2765884C2 (en) Identification of blocks of related words in documents of complex structure
CN113762269B (en) Chinese character OCR recognition method, system and medium based on neural network
CN113051914A (en) Enterprise hidden label extraction method and device based on multi-feature dynamic portrait
CN113159013A (en) Paragraph identification method and device based on machine learning, computer equipment and medium
CN114782974A (en) Table identification method, system, intelligent terminal and computer readable storage medium
KR102122561B1 (en) Method for recognizing characters on document images
Song Accuracy analysis of Japanese machine translation based on machine learning and image feature retrieval
CN117083605A (en) Iterative training for text-image-layout transformer models
CN112307749A (en) Text error detection method and device, computer equipment and storage medium
CN109670040B (en) Writing assistance method and device, storage medium and computer equipment
CN115546801A (en) Method for extracting paper image data features of test document
US20230138491A1 (en) Continuous learning for document processing and analysis
Xie et al. Enhancing multimodal deep representation learning by fixed model reuse
CN110688486A (en) Relation classification method and model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant