CN112732259A - Front-end interactive page conversion method, device and medium based on artificial intelligence - Google Patents

Front-end interactive page conversion method, device and medium based on artificial intelligence

Info

Publication number
CN112732259A
Authority
CN
China
Prior art keywords
model
file
training
component
html5
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110034032.3A
Other languages
Chinese (zh)
Other versions
CN112732259B (en)
Inventor
余雄伟
蔡羽
黄德运
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Agree Technology Co ltd
Original Assignee
Agree Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agree Technology Co ltd filed Critical Agree Technology Co ltd
Priority to CN202110034032.3A priority Critical patent/CN112732259B/en
Publication of CN112732259A publication Critical patent/CN112732259A/en
Application granted granted Critical
Publication of CN112732259B publication Critical patent/CN112732259B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/30Creation or generation of source code
    • G06F8/38Creation or generation of source code for implementing user interfaces
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Biomedical Technology (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Character Discrimination (AREA)

Abstract

The invention relates to a front-end interactive page conversion method, device and medium based on artificial intelligence, comprising the following steps: performing machine learning on a plurality of component elements of HTML5 to obtain a classification model of HTML5 components; performing machine learning on component positions and printed characters to obtain an OCR character recognition model; acquiring parameter information of a vector graphic to be converted, converting the vector graphic into an HTML5 definition file according to the parameter information, and further generating a plurality of basic components of HTML5 from the HTML5 definition file; and selecting the basic component with the highest similarity for filling, recognizing the filled image area with the OCR character recognition model, and adding the recognized character information to the basic component. The invention has the beneficial effects that: introducing artificial intelligence image classification technology into front-end interface development reduces the coding work of developers, improves the efficiency of converting a front-end interface from a UI design drawing into the final finished HTML5 file, and makes front-end development efficient.

Description

Front-end interactive page conversion method, device and medium based on artificial intelligence
Technical Field
The invention relates to the field of computers, in particular to a front-end interactive page conversion method, a front-end interactive page conversion device and a front-end interactive page conversion medium based on artificial intelligence.
Background
At present, all kinds of banking services adopt HTML5 as the main human-computer interaction interface, including internally used counter systems, monitoring systems and various management applications, as well as externally facing mobile banking and self-service equipment, so the demand for HTML5 development is currently increasing sharply. Meanwhile, requirements on user experience are getting higher and higher; a UI designer is often required to perform a large amount of design work first, which is then implemented by technicians, thereby prolonging the development cycle.
In traditional HTML5 front-end development schemes, UI designers first produce prototype designs with tools such as Sketch, and technicians then either code manually against the design drawings or reproduce the design with a visualization tool. These existing development modes require considerable development cost to convert a prototype design into a practically usable HTML5 file, and the work is often mechanical, repetitive labor of low technical difficulty, which increases the development cost of the whole project.
Disclosure of Invention
The invention aims to solve at least one of the technical problems in the prior art, and provides a front-end interactive page conversion method, device and medium based on artificial intelligence, which aim to reduce development cost and improve development efficiency through the image classification technology of deep neural network learning.
The technical scheme of the invention comprises a front-end interactive page conversion method based on artificial intelligence, characterized by comprising the following steps: a classification model training step, in which machine learning is performed on a plurality of HTML5 components based on image classification with a MobileNet model to obtain a classification model of HTML5 components; a character positioning model training step, in which component positions are learned based on a CTPN model to obtain an OCR character positioning model; a character recognition model training step, in which printed characters are learned based on a DenseNet model to obtain an OCR character recognition model; a vector graphic file first conversion step, in which parameter information of a vector graphic to be converted is acquired, the vector graphic is converted into an HTML5 definition file according to the parameter information, and a plurality of first components of HTML5 are further generated from the HTML5 definition file, the first components comprising texts, pictures and vector diagrams; and a vector graphic file second conversion step, in which the classification model is called for recognition according to the front-end interactive page image information to be displayed, a corresponding first component is matched according to the recognition result, the first component with the highest similarity is selected for filling, the filled image area is recognized with the OCR character recognition model, and the character information recognized by the OCR character recognition model is added to the first component to obtain a second component; wherein the vector graphic file is a Sketch file.
According to the artificial intelligence based front-end interactive page conversion method, the classification model training comprises: data preprocessing, pre-training model training, data labeling, model fine-tuning and model publishing; the data preprocessing comprises classifying a ReDraw data set according to component types and uniformly scaling picture sizes to a set number of pixels while keeping the aspect ratio; the pre-training model training comprises training the model using a plurality of ReDraw data sets based on a MobileNet model; the data labeling comprises correspondingly labeling a plurality of component data of HTML page data; the model fine-tuning comprises using the MobileNet model and loading the pre-trained model for fine-tuning; the model publishing comprises publishing the model, providing an HTTP service and providing a calling interface.
According to the front-end interactive page conversion method based on artificial intelligence, OCR model training comprises character positioning model training and character recognition model training, wherein the character positioning model training is realized through a CTPN algorithm, and the character recognition model training is realized through a DenseNet algorithm.
According to the artificial intelligence-based front-end interactive page conversion method, the training of the character positioning model comprises the following steps: the method comprises the steps of character positioning picture marking, image data preprocessing and data enhancing, CTPN model building, CTPN model training, CTPN model evaluation and CTPN model storage.
According to the artificial intelligence-based front-end interactive page conversion method, the training of the character recognition model comprises the following steps: the method comprises the steps of character recognition picture labeling, image data preprocessing and data enhancement, construction of a DenseNet model, training of the DenseNet model, evaluation of the DenseNet model and storage of the DenseNet model.
According to the artificial intelligence based front-end interactive page conversion method, the vector graphic file first conversion step comprises: acquiring the information of each layer from the Sketch design file, specifically, reading the Sketch file and converting it into a JSON file through a sketch-file plug-in, wherein the picture information, document information and page information fields of the JSON file are used; converting the picture information, the document information and the page information of the JSON file into a Def file, the Def file being a definition file; and converting the Def file into a VUE page file in HTML5 standard format, wherein the VUE page comprises text labels, picture labels and vector diagram labels, and the module structure of the VUE page is kept consistent with that of the Sketch file.
According to the artificial intelligence based front-end interactive page conversion method, the vector graphic file second conversion step comprises: S410, taking a screenshot of the front-end interactive page, selecting the interface area of the first component to be obtained, and obtaining an interface area image in combination with the screenshot; S420, performing Base64 coding on the interface area image, calling the classification model to recognize the image, arranging the recognition results from high to low similarity, and returning a plurality of candidate components; S430, replacing the corresponding basic component with the best-matching component selected by the user; S440, calling the OCR recognition model to perform character recognition on the Base64 code of the interface area image to obtain a character recognition result; and S450, adding the character recognition result to the component adjusted in S430 to obtain the second component.
According to the artificial intelligence based front-end interactive page conversion method, the HTML5 definition file is used by a WebIDE.
The technical scheme of the invention also comprises a front-end interactive page conversion device based on artificial intelligence, which comprises a memory, a processor and a computer program which is stored in the memory and can run on the processor, and is characterized in that any one of the method steps is realized when the processor executes the computer program.
The present invention also includes a computer-readable storage medium, in which a computer program is stored, wherein the computer program, when executed by a processor, implements any of the method steps.
The invention has the beneficial effects that: the artificial intelligence image classification technology is introduced into front-end interface development; compared with the traditional front-end interface development process, the coding work of developers can be greatly reduced, and with artificial-intelligence-assisted development the efficiency of converting a front-end interface from a UI design drawing into the final finished HTML5 file is greatly improved, making front-end development very efficient.
Drawings
The invention is further described below with reference to the accompanying drawings and examples;
FIG. 1 illustrates an overall flow diagram according to an embodiment of the invention;
FIG. 2 is a flow chart of a classification model training process according to an embodiment of the present invention;
FIG. 3 is a text positioning training process according to an embodiment of the present invention;
FIG. 4 is a flow diagram of OCR text recognition reasoning according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an interface definition file in JSON format according to an embodiment of the present invention;
FIG. 6 is a flow diagram of a transformation of a base component according to an embodiment of the present invention;
fig. 7 shows a schematic view of an apparatus according to an embodiment of the invention.
Detailed Description
Reference will now be made in detail to the present preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout.
In the description of the present invention, "several" means one or more and "a plurality" means two or more; "greater than", "less than", "exceeding" and the like are understood as excluding the stated number, while "above", "below", "within" and the like are understood as including the stated number.
In the description of the present invention, the consecutive numbering of the method steps is for convenience of description and understanding; the order of implementation of the steps may be adjusted in combination with the overall technical solution of the invention and the logical relationship between the steps, provided the technical effect achieved by the technical solution is not affected.
In the description of the present invention, unless otherwise explicitly defined, terms such as "set" should be construed broadly, and those skilled in the art can reasonably determine the specific meanings of such terms in the present invention in combination with the details of the technical solution.
FIG. 1 shows a general flow diagram according to an embodiment of the invention, comprising the following steps: a classification model training step, in which machine learning is performed on a plurality of HTML5 components based on image classification with a MobileNet model to obtain a classification model of HTML5 components; a character positioning model training step, in which component positions are learned based on a CTPN model to obtain an OCR character positioning model; a character recognition model training step, in which printed characters are learned based on a DenseNet model to obtain an OCR character recognition model; a vector graphic file first conversion step, in which parameter information of a vector graphic to be converted is acquired, the vector graphic is converted into an HTML5 definition file according to the parameter information, and a plurality of first components of HTML5 are further generated from the HTML5 definition file, the first components comprising texts, pictures and vector diagrams; and a vector graphic file second conversion step, in which the classification model is called for recognition according to the image information of the front-end interactive page to be displayed, a corresponding first component is matched according to the recognition result, the first component with the highest similarity is selected for filling, the filled image area is recognized with the OCR character recognition model, and the character information recognized by the OCR character recognition model is added to the first component to obtain a second component, the second component generally being a functional component such as a button; wherein the vector graphic file is a Sketch file.
Specifically, the imported Sketch design file is structurally preprocessed and preliminarily converted into an HTML5 file, with Sketch primitives preliminarily converted into HTML5 primitive labels. Through image classification of these HTML5 primitive labels, the user selects and replaces them with the corresponding HTML5 component labels according to requirements, and label text is then added automatically through OCR recognition, thus forming a basically usable HTML5 page file.
FIG. 2 shows a classification model training flow according to an embodiment of the present invention, in which machine learning is performed on various HTML5 component elements based on image classification with a MobileNet model to form a classification model for HTML5 components. MobileNet is a lightweight convolutional neural network based on depthwise separable convolutions; compared with a standard convolutional network, it greatly reduces the number of convolution parameters and the amount of computation, which improves inference speed at a cost of only about 1% in accuracy.
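To make the preceding description concrete, the sketch below contrasts a standard convolution with the depthwise separable convolution that MobileNet is built from; it is an illustrative aid only, and the channel sizes are assumptions rather than the actual network configuration.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: a depthwise separable convolution block of the kind
# MobileNet stacks, compared with a standard convolution. Channel sizes are
# arbitrary assumptions for demonstration.
standard = nn.Conv2d(64, 128, kernel_size=3, padding=1)          # 64*128*3*3 weights

depthwise_separable = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1, groups=64),      # depthwise: 64*3*3 weights
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 128, kernel_size=1),                           # pointwise: 64*128 weights
    nn.BatchNorm2d(128),
    nn.ReLU(inplace=True),
)

x = torch.randn(1, 64, 224, 224)
assert standard(x).shape == depthwise_separable(x).shape          # same output shape, far fewer parameters
```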
The training process for the various HTML5 components comprises data preprocessing, pre-training model training, data labeling, model fine-tuning, model publishing and the like; the specific flow is as follows:
ReDraw data preprocessing: the ReDraw data set is categorized by component category and the picture size is uniformly scaled to 224 pixels.
Training a pre-training model: because the data volume of HTML page scenes is small, transfer learning is used to improve the generalization capability of the model; based on the MobileNet model, a model is first trained with the large-scale ReDraw data set, and the internal parameters of the model are then transferred to the downstream task.
And (3) HTML page data labeling: the HTML page data is labeled, the current version is labeled with 29 types of components, and the data volume of each type is not less than 300.
HTML page data preprocessing: the picture size is uniformly scaled to 224 pixels.
Model fine-tuning: the MobileNet model is used and the pre-trained model is loaded for fine-tuning; after fine-tuning, the model accuracy reaches 99%.
Model release: the model is published and provided as an HTTP service for other modules to call.
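The following is a minimal sketch of the pre-training/fine-tuning flow described above, assuming an ImageNet-pretrained MobileNetV2 from torchvision as the backbone and a folder of labelled component screenshots; the dataset path, the training hyperparameters and the choice of MobileNetV2 specifically are illustrative assumptions, not the exact configuration of the embodiment.

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Sketch only: load an ImageNet-pretrained MobileNetV2, replace its classifier head
# with a 29-way layer (the embodiment labels 29 component types), and fine-tune on
# component screenshots scaled to 224 pixels. The dataset path is hypothetical.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
train_set = datasets.ImageFolder("data/html_components/train", transform=preprocess)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

model = models.mobilenet_v2(weights="IMAGENET1K_V1")     # transfer the pre-trained weights
model.classifier[1] = nn.Linear(model.last_channel, 29)  # new head for 29 HTML5 component classes

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):                                   # illustrative number of epochs
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```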
Fig. 3 shows a text-positioning training flow according to an embodiment of the present invention: a text-positioning model is formed by learning component positions based on the CTPN model, and the training flow of the DenseNet text-recognition model is identical to the CTPN training flow. The OCR character recognition model includes a text-positioning function and a text-recognition function. The text-positioning function is realized with the CTPN algorithm, which improves on image object-detection algorithms. Text positioning differs from object positioning: a text line is composed of a series of characters, the detection target is not a closed area, intervals exist between characters, and text positioning has higher precision requirements because the whole text area must be covered. CTPN improves the region-proposal network part of the object-detection algorithm and adds a recurrent neural network structure to extract the context information of the text.
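The exact CTPN network of the embodiment is not reproduced here; the sketch below only illustrates the idea stated above, namely running a recurrent layer along each row of a convolutional feature map so that neighbouring fine-scale text proposals share horizontal context. The layer sizes and anchor count are assumptions.

```python
import torch
import torch.nn as nn

class TextProposalHead(nn.Module):
    """Illustrative sketch of the CTPN idea: a BiLSTM scans each row of a backbone
    feature map and predicts, per spatial position and per anchor, a text/non-text
    score plus vertical coordinates of a fine-scale text proposal."""
    def __init__(self, in_channels=512, hidden=128, anchors=10):
        super().__init__()
        self.rnn = nn.LSTM(in_channels, hidden, bidirectional=True, batch_first=True)
        self.cls = nn.Linear(2 * hidden, anchors * 2)    # text / non-text per anchor
        self.reg = nn.Linear(2 * hidden, anchors * 2)    # vertical centre and height per anchor

    def forward(self, feature_map):                      # (N, C, H, W) from a conv backbone
        n, c, h, w = feature_map.shape
        rows = feature_map.permute(0, 2, 3, 1).reshape(n * h, w, c)  # one sequence per row
        context, _ = self.rnn(rows)                      # horizontal context along the row
        scores = self.cls(context).reshape(n, h, w, -1)
        coords = self.reg(context).reshape(n, h, w, -1)
        return scores, coords

scores, coords = TextProposalHead()(torch.randn(1, 512, 14, 14))
```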
The text-recognition function is realized with the DenseNet algorithm. The basic idea of DenseNet is consistent with that of deep residual networks, but it establishes dense connections between each layer and all preceding layers; its other characteristic is feature reuse through concatenation of features along the channel dimension. The specific formula is as follows:
x_l = H_l([x_0, x_1, ..., x_{l-1}])
wherein [x_0, x_1, ..., x_{l-1}] denotes the concatenated outputs of convolutional layers 0 to l-1, H_l denotes the feature combination function, and x_l denotes the newly generated convolution feature map. These features allow DenseNet to achieve better performance than a deep residual network at a lower parameter and computation cost.
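A minimal sketch of the dense connectivity expressed by the formula above, in which each layer receives the channel-wise concatenation of all preceding feature maps; the growth rate and layer count are illustrative assumptions rather than the DenseNet configuration used in the embodiment.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Sketch of x_l = H_l([x_0, x_1, ..., x_{l-1}]): every layer consumes the
    concatenation of all earlier feature maps and contributes `growth` new channels."""
    def __init__(self, in_channels=64, growth=32, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Sequential(
                nn.BatchNorm2d(in_channels + i * growth),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_channels + i * growth, growth, kernel_size=3, padding=1),
            )
            for i in range(num_layers)
        ])

    def forward(self, x):
        features = [x]                                            # x_0
        for layer in self.layers:                                 # H_l
            features.append(layer(torch.cat(features, dim=1)))    # x_l from [x_0, ..., x_{l-1}]
        return torch.cat(features, dim=1)

out = DenseBlock()(torch.randn(1, 64, 32, 32))                    # 64 + 4 * 32 = 192 output channels
```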
FIG. 4 is a flow chart of the OCR character recognition inference process according to an embodiment of the invention; the inference process comprises image data preprocessing, loading of the CTPN and DenseNet models, component position recognition and component character recognition.
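A high-level sketch of the inference flow of FIG. 4, assuming the trained CTPN and DenseNet models have been saved to disk; `decode_text_boxes` and `decode_characters` are hypothetical post-processing helper names standing in for project-specific code, not functions defined by the patent.

```python
import torch
from PIL import Image
from torchvision import transforms

to_tensor = transforms.Compose([transforms.ToTensor()])

def ocr_infer(image_path: str, ctpn_path: str, densenet_path: str):
    """Sketch of the OCR inference flow of FIG. 4: preprocess the screenshot, locate
    component text regions with the CTPN model, then recognise the characters in each
    region with the DenseNet model. decode_text_boxes and decode_characters are
    hypothetical post-processing helpers, not part of the patented scheme."""
    image = Image.open(image_path).convert("RGB")

    ctpn = torch.load(ctpn_path, map_location="cpu").eval()          # text-positioning model
    densenet = torch.load(densenet_path, map_location="cpu").eval()  # text-recognition model

    results = []
    with torch.no_grad():
        boxes = decode_text_boxes(ctpn(to_tensor(image).unsqueeze(0)))    # component positions
        for box in boxes:
            crop = to_tensor(image.crop(box)).unsqueeze(0)
            text = decode_characters(densenet(crop))                      # component characters
            results.append((box, text))
    return results
```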
Fig. 5 is a schematic diagram of an interface definition file in a JSON format according to an embodiment of the present invention, and each item of information of the JSON file is defined in fig. 5.
Fig. 6 is a flowchart of the base-component conversion according to an embodiment of the present invention, in which the Sketch file is preliminarily converted into an HTML5 file. The steps are as follows:
Useful information of each layer (i.e. an element such as text or a picture in a Layer of the design drawing) is obtained from the Sketch design file, such as the artboard size (i.e. the page size), the position, width and height of the layer, and the layer type. The prototype conversion tool therefore first reads the Sketch file into JSON-format content through the sketch-file plug-in; the JSON fields used mainly include (1) images: the picture ids and picture file data used in the Sketch file; (2) document: the overall document information, including the public layers, text styles and layer styles of the current file and of imported external resource libraries; (3) pages: the JSON of all pages in the design drawing, containing the detailed information of each layer of a page, such as layer width and height, layer size, text style and layer style. Combining these three parts, the useful information of each layer can be obtained and converted into component attributes in the Def file. For example, to convert a picture layer in Sketch into a picture component in the editor, the picture data is taken from images and written into a png picture file, the picture is referenced in the src of the picture component, the picture layer is found in pages to obtain its width, height, top and left, and these values are filled into the width, height, top and left of the picture component.
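A small sketch of the layer-extraction step just described, assuming the sketch-file plug-in has already written its JSON output to disk; the key names used here ("pages", "layers", "frame", "image") are simplified assumptions about that JSON rather than its exact schema.

```python
import base64
import json
from pathlib import Path

def picture_layers_to_def(json_path: str, out_dir: str) -> list:
    """Sketch: walk the pages of the sketch-file JSON, write each bitmap layer's data
    to a .png file, and emit a picture-component definition with width/height/top/left.
    The key names ("pages", "layers", "frame", "image") are simplified assumptions."""
    doc = json.loads(Path(json_path).read_text(encoding="utf-8"))
    Path(out_dir).mkdir(parents=True, exist_ok=True)

    components = []
    for page in doc.get("pages", []):
        for layer in page.get("layers", []):
            if layer.get("type") != "bitmap":
                continue
            png_path = Path(out_dir) / f"{layer['id']}.png"
            png_path.write_bytes(base64.b64decode(layer["image"]))   # picture data from `images`
            frame = layer["frame"]
            components.append({
                "type": "picture",
                "src": str(png_path),
                "width": frame["width"], "height": frame["height"],
                "top": frame["y"], "left": frame["x"],
            })
    return components
```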
Each page in the Sketch design file is preliminarily converted into an interface definition file in JSON format; the JSON text format is illustrated in FIG. 5. The Def file is then converted into a VUE page file in HTML5 standard format, where the VUE page is composed of basic components such as basic text labels, picture labels and vector diagram labels, and the page module structure is consistent with the Sketch design module structure. Through the above steps, the HTML file preliminarily converted from the Sketch file is obtained; at this point all HTML elements are basic text, pictures, icons and the like, carrying no component tags or attributes.
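To illustrate the Def-to-VUE step, the sketch below renders a list of Def-style components (like those produced above) into a minimal VUE template containing only text and picture tags with absolute positioning; the output format is an illustrative assumption, not the actual file produced by the embodiment.

```python
def def_to_vue(components: list) -> str:
    """Sketch: emit a minimal .vue file whose template places each Def component with
    absolute positioning. Only text and picture components are handled here."""
    tags = []
    for comp in components:
        style = (f"position:absolute;left:{comp['left']}px;top:{comp['top']}px;"
                 f"width:{comp['width']}px;height:{comp['height']}px")
        if comp["type"] == "text":
            tags.append(f'    <span style="{style}">{comp["text"]}</span>')
        elif comp["type"] == "picture":
            tags.append(f'    <img style="{style}" src="{comp["src"]}" />')
    body = "\n".join(tags)
    return f"<template>\n  <div class=\"page\">\n{body}\n  </div>\n</template>\n"

print(def_to_vue([{"type": "text", "text": "Submit", "left": 10, "top": 20,
                   "width": 80, "height": 24}]))
```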
The base components in the page file produced by the preliminary conversion are then converted; the specific steps are as follows (a code sketch of these steps follows step S450):
S410, a screenshot of the interface displayed by the HTML is taken, the selected base component is clicked to obtain the interface area (Xmin, Ymin, Xmax, Ymax) of the component, and the image of the area is obtained in combination with the screenshot.
S420, the area image is Base64-encoded, the component classification model is called to recognize the image, the recognition results are arranged from high to low similarity, and a number of candidate components are returned.
S430, the user selects the best-matching component, and the corresponding base component is replaced with the selected component.
S440, the OCR recognition model is called to perform character recognition on the Base64 code of the area image, and the result is returned.
S450, the character recognition result returned in S440 is added as a text attribute to the component adjusted in S430.
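As referenced above, a minimal sketch of steps S410-S440, assuming the classification model and the OCR model are exposed as HTTP services (consistent with the model-release step); the URLs and the request/response field names are hypothetical examples.

```python
import base64
import io

import requests
from PIL import Image

# Hypothetical service endpoints; the real deployment URLs and payload fields are
# assumptions for this sketch only.
CLASSIFY_URL = "http://localhost:8500/classify"
OCR_URL = "http://localhost:8501/ocr"

def convert_component(screenshot_png: bytes, box: tuple) -> dict:
    """Sketch of S410-S440: crop the selected interface area (Xmin, Ymin, Xmax, Ymax)
    from the page screenshot, Base64-encode it, ask the classification service for
    candidate components ranked by similarity, and ask the OCR service for the text
    in the same area. S430/S450 happen in the editor, where the user picks the best
    match and the recognised text is attached to that component."""
    region = Image.open(io.BytesIO(screenshot_png)).crop(box)      # S410: interface area image
    buffer = io.BytesIO()
    region.save(buffer, format="PNG")
    encoded = base64.b64encode(buffer.getvalue()).decode("ascii")  # S420: Base64 encoding

    candidates = requests.post(CLASSIFY_URL, json={"image": encoded}).json()["candidates"]
    text = requests.post(OCR_URL, json={"image": encoded}).json()["text"]   # S440

    return {"candidates": candidates, "text": text}
```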
Fig. 7 shows a schematic view of an apparatus according to an embodiment of the invention. The apparatus comprises a memory 100 and a processor 200, wherein the memory 100 stores a computer program that, when executed by the processor 200, performs: a classification model training step, in which machine learning is performed on a plurality of HTML5 components based on image classification with a MobileNet model to obtain a classification model of HTML5 components; a character positioning model training step, in which component positions are learned based on a CTPN model to obtain an OCR character positioning model; a character recognition model training step, in which printed characters are learned based on a DenseNet model to obtain an OCR character recognition model; a vector graphic file first conversion step, in which parameter information of a vector graphic to be converted is acquired, the vector graphic is converted into an HTML5 definition file according to the parameter information, and a plurality of first components of HTML5 are further generated from the HTML5 definition file, the first components comprising texts, pictures and vector diagrams; and a vector graphic file second conversion step, in which the classification model is called for recognition according to the image information of the front-end interactive page to be displayed, a corresponding first component is matched according to the recognition result, the first component with the highest similarity is selected for filling, the filled image area is recognized with the OCR character recognition model, and the character information recognized by the OCR character recognition model is added to the first component to obtain a second component, the second component generally being a functional component such as a button; wherein the vector graphic file is a Sketch file.
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of those skilled in the art without departing from the gist of the present invention.

Claims (10)

1. A front-end interactive page conversion method based on artificial intelligence is characterized by comprising the following steps:
training a classification model, namely performing machine learning on a plurality of components of HTML5 based on image classification of a MobileNet model to obtain a classification model of an HTML5 component;
training a character positioning model, namely learning component positions based on a CTPN model to obtain an OCR character positioning model;
training a character recognition model, learning printed characters based on a DenseNet model, and obtaining an OCR character recognition model;
a vector graphic file first conversion step, namely acquiring parameter information of a vector graphic to be converted, converting the vector graphic into an HTML5 definition file according to the parameter information, and further generating a plurality of first components of HTML5 according to the HTML5 definition file, wherein the first components comprise texts, pictures and vector diagrams;
a second conversion step of the vector graphic file, namely calling the classification model for recognition according to the front-end interactive page image information to be displayed, matching a corresponding first component according to the recognition result, selecting the first component with the highest similarity for filling, recognizing the filled image area with the OCR character recognition model, and adding the character information recognized by the OCR character recognition model to the first component to obtain a second component;
wherein the vector graphics file is a Sketch file.
2. The artificial intelligence based front-end interactive page conversion method of claim 1, wherein the classification model training comprises: data preprocessing, pre-training model training, data labeling, model fine-tuning and model publishing; the data preprocessing comprises classifying a ReDraw data set according to component types and uniformly scaling picture sizes to a set number of pixels while keeping the aspect ratio; the pre-training model training comprises training the model using a plurality of ReDraw data sets based on a MobileNet model; the data labeling comprises correspondingly labeling a plurality of component data of HTML page data; the model fine-tuning comprises using the MobileNet model and loading the pre-trained model for fine-tuning; the model publishing comprises publishing the model, providing an HTTP service and providing a calling interface.
3. The artificial intelligence based front-end interactive page conversion method of claim 1, wherein the OCR model training comprises character positioning model training and character recognition model training, wherein the character positioning model training is implemented by a CTPN algorithm and the character recognition model training is implemented by a DenseNet algorithm.
4. The artificial intelligence based front-end interactive page conversion method of claim 3, wherein the character positioning model training comprises: character positioning picture labeling, image data preprocessing and data enhancement, CTPN model construction, CTPN model training, CTPN model evaluation and CTPN model storage.
5. The artificial intelligence based front-end interactive page conversion method of claim 3, wherein the character recognition model training comprises: character recognition picture labeling, image data preprocessing and data enhancement, DenseNet model construction, DenseNet model training, DenseNet model evaluation and DenseNet model storage.
6. The artificial intelligence based front-end interactive page conversion method according to claim 1, wherein the vector graphic file first conversion step comprises:
acquiring the information of each layer from the Sketch design file, specifically, reading the Sketch file and converting it into a JSON file through a sketch-file plug-in, wherein the picture information, document information and page information fields of the JSON file are used;
converting the picture information, the document information and the page information of the JSON file into a Def file, wherein the Def file is a definition file;
and converting the Def file into a VUE page file in an HTML5 standard format, wherein the VUE page comprises a text label, a picture label and a vector diagram label, and the module structure of the VUE page is kept consistent with that of the Sketch file.
7. The artificial intelligence based front-end interactive page conversion method according to claim 1, wherein the vector graphic file second conversion step comprises:
S410, taking a screenshot of the front-end interactive page, selecting the interface area of the first component to be obtained, and obtaining an interface area image in combination with the screenshot;
S420, performing Base64 coding on the interface area image, calling the classification model to recognize the image, arranging the recognition results from high to low similarity, and returning a plurality of candidate components;
S430, replacing the corresponding basic component with the best-matching component selected by the user;
S440, calling the OCR recognition model to perform character recognition on the Base64 code of the interface area image to obtain a character recognition result;
and S450, adding the character recognition result to the component adjusted in S430 to obtain the second component.
8. The artificial intelligence based front-end interactive page conversion method according to claim 1, wherein the HTML5 definition file is used by a WebIDE.
9. An artificial intelligence based front-end interactive page transformation apparatus comprising a memory, a processor and a computer program stored in said memory and executable on said processor, wherein said processor implements the method steps of any of claims 1 to 8 when executing said computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method steps of any one of claims 1 to 8.
CN202110034032.3A 2021-01-11 2021-01-11 Front-end interactive page conversion method, device and medium based on artificial intelligence Active CN112732259B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110034032.3A CN112732259B (en) 2021-01-11 2021-01-11 Front-end interactive page conversion method, device and medium based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110034032.3A CN112732259B (en) 2021-01-11 2021-01-11 Front-end interactive page conversion method, device and medium based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN112732259A true CN112732259A (en) 2021-04-30
CN112732259B CN112732259B (en) 2024-05-24

Family

ID=75591445

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110034032.3A Active CN112732259B (en) 2021-01-11 2021-01-11 Front-end interactive page conversion method, device and medium based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN112732259B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113674387A (en) * 2021-08-26 2021-11-19 广东中星电子有限公司 Video processing method and device for non-natural scene video

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102385580A (en) * 2010-08-30 2012-03-21 北大方正集团有限公司 Method and device for customizing webpage contents
CN105930159A (en) * 2016-04-20 2016-09-07 中山大学 Image-based interface code generation method and system
CN107025676A (en) * 2016-01-25 2017-08-08 阿里巴巴集团控股有限公司 The generation method and relevant apparatus of a kind of graphic template and picture
CN107391153A (en) * 2017-07-31 2017-11-24 深圳乐信软件技术有限公司 A kind of code generating method and device based on Spring Yu MyBatis framework integrations
CN108228183A (en) * 2018-01-12 2018-06-29 北京三快在线科技有限公司 Front-end interface code generating method, device, electronic equipment and storage medium
CN109783094A (en) * 2018-12-15 2019-05-21 深圳壹账通智能科技有限公司 Front end page generation method, device, computer equipment and storage medium
CN110244940A (en) * 2019-06-12 2019-09-17 四川长虹电器股份有限公司 Optimize the method and web front-end project structure of web application system development
US20200133645A1 (en) * 2018-10-30 2020-04-30 Jpmorgan Chase Bank, N.A. User interface and front end application automatic generation
CN111460355A (en) * 2020-04-17 2020-07-28 支付宝(杭州)信息技术有限公司 Page parsing method and device
CN112114803A (en) * 2020-08-25 2020-12-22 济南浪潮高新科技投资发展有限公司 Deep learning-based front-end code generation method, equipment and medium for UI (user interface)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102385580A (en) * 2010-08-30 2012-03-21 北大方正集团有限公司 Method and device for customizing webpage contents
CN107025676A (en) * 2016-01-25 2017-08-08 阿里巴巴集团控股有限公司 The generation method and relevant apparatus of a kind of graphic template and picture
CN105930159A (en) * 2016-04-20 2016-09-07 中山大学 Image-based interface code generation method and system
CN107391153A (en) * 2017-07-31 2017-11-24 深圳乐信软件技术有限公司 A kind of code generating method and device based on Spring Yu MyBatis framework integrations
CN108228183A (en) * 2018-01-12 2018-06-29 北京三快在线科技有限公司 Front-end interface code generating method, device, electronic equipment and storage medium
US20200133645A1 (en) * 2018-10-30 2020-04-30 Jpmorgan Chase Bank, N.A. User interface and front end application automatic generation
CN109783094A (en) * 2018-12-15 2019-05-21 深圳壹账通智能科技有限公司 Front end page generation method, device, computer equipment and storage medium
CN110244940A (en) * 2019-06-12 2019-09-17 四川长虹电器股份有限公司 Optimize the method and web front-end project structure of web application system development
CN111460355A (en) * 2020-04-17 2020-07-28 支付宝(杭州)信息技术有限公司 Page parsing method and device
CN112114803A (en) * 2020-08-25 2020-12-22 济南浪潮高新科技投资发展有限公司 Deep learning-based front-end code generation method, equipment and medium for UI (user interface)

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BATUHAN AŞIROĞLU等: "Automatic HTML Code Generation from Mock-Up Images Using Machine Learning Techniques", 《2019 SCIENTIFIC MEETING ON ELECTRICAL-ELECTRONICS & BIOMEDICAL ENGINEERING AND COMPUTER SCIENCE (EBBT)》, pages 1 - 4 *
VINCENT: "Is there a reliable way to generate HTML from Sketch?", Retrieved from the Internet <URL:https://www.zhihu.com/question/380687009> *
朱勇; 伏海旭: "Design and Implementation of an Online Leave Management System Based on RESTful", Modern Computer (Professional Edition), vol. 36, pages 96 - 100 *
邱怡琳: "Design and Implementation of a Front-End Application for Combination Analysis and Recommendation Based on Visual Modeling Technology", China Master's Theses Full-text Database, Information Science and Technology, pages 138 - 187 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113674387A (en) * 2021-08-26 2021-11-19 广东中星电子有限公司 Video processing method and device for non-natural scene video
CN113674387B (en) * 2021-08-26 2024-04-16 广东中星电子有限公司 Video processing method and device for unnatural scene video

Also Published As

Publication number Publication date
CN112732259B (en) 2024-05-24

Similar Documents

Publication Publication Date Title
Chen et al. Towards automated infographic design: Deep learning-based auto-extraction of extensible timeline
Clausner et al. Aletheia-an advanced document layout and text ground-truthing system for production environments
CN112232149A (en) Document multi-mode information and relation extraction method and system
CN114596566B (en) Text recognition method and related device
WO2022028313A1 (en) Method and device for image generation and colorization
Zhou et al. Reverse-engineering bar charts using neural networks
CN114419642A (en) Method, device and system for extracting key value pair information in document image
CN112069900A (en) Bill character recognition method and system based on convolutional neural network
CN112037239B (en) Text guidance image segmentation method based on multi-level explicit relation selection
Biswas et al. Docsynth: a layout guided approach for controllable document image synthesis
CN111460782A (en) Information processing method, device and equipment
de Souza Baulé et al. Recent Progress in Automated Code Generation from GUI Images Using Machine Learning Techniques.
Zhang et al. SSNet: Structure-Semantic Net for Chinese typography generation based on image translation
Ge et al. Exploring local detail perception for scene sketch semantic segmentation
CN112732259B (en) Front-end interactive page conversion method, device and medium based on artificial intelligence
Chen et al. Code generation from a graphical user interface via attention-based encoder–decoder model
Zeng et al. An unsupervised font style transfer model based on generative adversarial networks
Feng et al. Guis2code: A computer vision tool to generate code automatically from graphical user interface sketches
CN116610304B (en) Page code generation method, device, equipment and storage medium
CN116433468A (en) Data processing method and device for image generation
Hu et al. Mathematical formula detection in document images: A new dataset and a new approach
CN115034177A (en) Presentation file conversion method, device, equipment and storage medium
CN112836467B (en) Image processing method and device
CN112329389B (en) Chinese character stroke automatic extraction method based on semantic segmentation and tabu search
Lian et al. Cvfont: Synthesizing chinese vector fonts via deep layout inferring

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant