CN117113085A - Sample set generation and page element recognition model training method and device - Google Patents

Info

Publication number
CN117113085A
Authority
CN
China
Prior art keywords
initial
sample
coordinate
replacement
page
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311056429.8A
Other languages
Chinese (zh)
Inventor
李宇航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides a sample set generation method, a page element recognition model training method, and corresponding devices. The scheme is as follows: acquiring a page area, and determining initial elements and initial marking points in the page area; acquiring a scaling parameter set preset for the initial element, and deforming the initial element and the initial marking points according to each scaling parameter in the scaling parameter set to obtain an initial sample set and candidate marking points; for any initial sample, acquiring a replacement element set of the initial sample, and updating the initial sample based on each replacement element in the replacement element set to obtain updated candidate samples; and generating a target training sample set for the page area based on each candidate sample and the candidate marking points of each candidate sample. By applying replacement and scaling to a single page area, a target training sample set containing a large number of training samples is generated, which increases the diversity and richness of the samples and addresses the problem that samples are scarce or difficult to acquire.

Description

Sample set generation and page element recognition model training method and device
Technical Field
The disclosure relates to the technical field of computers, in particular to the technical fields of image processing, natural language processing, deep learning and the like, and particularly relates to a sample set generation and page element recognition model training method and device.
Background
Front-end intelligence is an important development direction in the industry: a user uploads a web page design drawing, and web page code is generated directly, without manual development. Identifying the elements in the picture uploaded by the user is a key step. The usual approach is to train a dedicated deep learning model on a large number of sample pictures whose element positions have been marked in advance, and then use that model to predict the elements contained in a picture input by the user. In general, the more marked sample pictures are available, the better the prediction.
Disclosure of Invention
The disclosure provides a method, a device, electronic equipment and a storage medium for generating a sample set.
According to a first aspect of the present disclosure, there is provided a sample set generating method, including: acquiring a page area to be processed, and determining initial elements and initial marking points in the page area, wherein the page area is used for sample generation; obtaining a scaling parameter set preset by the initial element, and respectively carrying out deformation treatment on the initial element and the initial marking point according to each scaling parameter in the scaling parameter set to obtain an initial sample set after deformation treatment and candidate marking points of each initial sample in the initial sample set; for any initial sample, acquiring a replacement element set of the initial sample, and updating the initial sample based on each replacement element in the replacement element set to obtain updated candidate samples; and generating a target training sample set in the page area based on each candidate sample and the candidate mark point of each candidate sample.
According to a second aspect of the present disclosure, there is provided a page element recognition model training method, including: acquiring an initial page element identification model to be trained, and processing a target page by the sample set generation method according to the embodiment of the first aspect to acquire a training sample set of the target page; training the initial page element recognition model based on the training sample set until training is completed, and generating a target page element recognition model.
According to a third aspect of the present disclosure, there is provided a sample set generating apparatus including: an acquisition module, configured to acquire a page area to be processed and determine initial elements and initial marking points in the page area, wherein the page area is used for sample generation; a deformation module, configured to acquire a scaling parameter set preset by the initial element, and respectively perform deformation processing on the initial element and the initial marking point according to each scaling parameter in the scaling parameter set to obtain a deformed initial sample set and candidate marking points of each initial sample in the initial sample set; an updating module, configured to acquire, for any initial sample, a replacement element set of the initial sample, and update the initial sample based on each replacement element in the replacement element set respectively to obtain updated candidate samples; and a generation module, configured to generate a target training sample set in the page area based on each candidate sample and the candidate marking point of each candidate sample.
According to a fourth aspect of the present disclosure, there is provided a page element recognition model training apparatus, including: the invoking module is used for acquiring an initial page element identification model to be trained, and processing a target page through the sample set generating method according to the embodiment of the first aspect so as to acquire a training sample set of the target page; and the training module is used for training the initial page element identification model based on the training sample set until training is completed, and generating a target page element identification model.
According to a fifth aspect of the present disclosure, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the sample set generating method described in the embodiment of the above aspect.
According to a sixth aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium having stored thereon computer instructions for causing the computer to execute the sample set generating method according to the embodiment of the above aspect.
According to a seventh aspect of the present disclosure, there is provided a computer program product comprising a computer program/instruction which, when executed by a processor, implements the sample set generating method according to an embodiment of the above aspect.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a flowchart of a sample set generating method according to an embodiment of the disclosure;
FIG. 2 is a schematic diagram of initial elements of another sample set generation method provided by an embodiment of the present disclosure;
FIG. 3 is a flowchart of another sample set generating method according to an embodiment of the present disclosure;
FIG. 4 is a flowchart of another sample set generating method according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of an initial sample of another sample set generation method provided by an embodiment of the present disclosure;
FIG. 6 is a flowchart of another sample set generating method according to an embodiment of the present disclosure;
FIG. 7 is a flowchart of a training method for a page element recognition model according to an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of a sample set generating device according to an embodiment of the disclosure;
FIG. 9 is a schematic structural diagram of a training device for a page element recognition model according to an embodiment of the present disclosure;
FIG. 10 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Sample set generation methods, apparatuses, and electronic devices according to embodiments of the present disclosure are described below with reference to the accompanying drawings.
Natural language processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies theories and methods that enable effective communication between people and computers in natural language. Natural language processing is a discipline that integrates linguistics, computer science, and mathematics. It is mainly applied to machine translation, public opinion monitoring, automatic summarization, opinion extraction, text classification, question answering, text semantic comparison, speech recognition, and the like.
Deep learning (DL) is a research direction in the field of machine learning (ML), introduced to bring machine learning closer to its original goal, namely artificial intelligence. Deep learning learns the inherent regularities and representation hierarchy of sample data, and the information obtained during such learning helps in interpreting data such as text, images, and sounds. Its ultimate goal is to give machines human-like analytical learning capabilities, so that they can recognize text, image, and sound data. Deep learning is a complex class of machine learning algorithms whose results in speech and image recognition far exceed those of earlier techniques.
Image processing refers to techniques that analyze images with a computer to achieve a desired result. It generally refers to digital image processing. A digital image is a large two-dimensional array obtained by capturing with equipment such as an industrial camera, a video camera, or a scanner; the elements of the array are called pixels, and their values are called gray values. Image processing techniques generally comprise three parts: image compression; enhancement and restoration; and matching, description, and recognition.
Fig. 1 is a flow chart of a sample set generating method according to an embodiment of the disclosure.
As shown in fig. 1, the sample set generating method may include:
S101, acquiring a page area to be processed, and determining initial elements and initial marking points in the page area, wherein the page area is used for sample generation.
In the embodiment of the present disclosure, the page area to be processed may be selected in advance, or may be selected randomly within the page; this is not limited here and may be set according to actual design requirements.
It should be noted that the initial elements in the page area may be of various types, for example, a text type, a picture type, an audio type, and the like. An initial element may also be a combination of multiple types, without limitation.
It should be noted that the shape of the initial element may be various, and is not limited in this respect, and for example, the initial element may be rectangular, circular, or other irregular shape.
In the embodiment of the present disclosure, the initial marker point is a marker point that identifies the initial element position. It should be noted that the number of the initial mark points is usually plural, so that the position of the initial element can be accurately calibrated.
In the embodiment of the present disclosure, the positions of the initial marking points may be set in advance, and the number and positions of the initial marking points may be determined according to the shape of the initial element. For example, as shown in fig. 2, when the initial element is rectangular, the initial marking points A and B may be two diagonally opposite corners of the rectangle.
S102, acquiring a scaling parameter set preset by the initial element, and respectively carrying out deformation processing on the initial element and the initial marking point according to each scaling parameter in the scaling parameter set to obtain an initial sample set after deformation processing and candidate marking points of each initial sample in the initial sample set.
The scaling parameters are coefficients for scaling up and down the initial element, and the scaling parameters may be manually set, or may be randomly generated, and are not limited in this regard.
In the embodiment of the present disclosure, the scaling parameter set is a set of all scaling parameters, which are set in advance and may be changed according to actual design requirements, and is not limited in any way.
In the embodiment of the present disclosure, the initial element may be scaled by each scaling parameter in the scaling parameter set. It may be understood that when the initial element is scaled, the positions of the corresponding initial marking points change accordingly. For example, when the scaling parameter set includes N scaling parameters, the initial element and the initial marking points are deformed once per scaling parameter, so that an initial sample set containing N initial samples, together with their corresponding candidate marking points, can be obtained.
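The one-sample-per-parameter structure of S102 can be sketched as follows. This is a minimal illustration, not the disclosure's implementation: the function names `build_initial_samples` and `scale_about_origin`, the dictionary layout, and the toy deformation (plain multiplication about the origin) are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def build_initial_samples(
    markers: Sequence[Point],
    scales: Sequence[float],
    deform: Callable[[Point, float], Point],
) -> List[Dict]:
    """Produce one initial sample per scaling parameter (hypothetical layout)."""
    samples = []
    for scale in scales:
        samples.append(
            {
                "scale": scale,
                # Marking points move together with the scaled element.
                "candidate_markers": [deform(p, scale) for p in markers],
            }
        )
    return samples


def scale_about_origin(point: Point, scale: float) -> Point:
    """Toy deformation for illustration only: multiply both coordinates."""
    return (point[0] * scale, point[1] * scale)


# Two diagonal marking points of a rectangle, and N = 3 scaling parameters.
samples = build_initial_samples(
    [(0.0, 0.0), (100.0, 40.0)], [0.5, 1.0, 2.0], scale_about_origin
)
```

With N = 3 scaling parameters the sketch returns three initial samples, each carrying its own candidate marking points.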
S103, for any initial sample, acquiring a replacement element set of the initial sample, and updating the initial sample based on each replacement element in the replacement element set to obtain updated candidate samples.
The replacement element set is the collection of replacement elements, where a replacement element is text or a picture that can replace an element in the initial sample.
In the embodiment of the disclosure, the replacement element set is set in advance and is a database containing a large number of pictures and texts.
In the embodiment of the disclosure, after an initial sample in the initial sample set is acquired, the initial sample can be updated by replacing the corresponding element in the initial sample with a replacement element from the replacement element set.
S104, generating a target training sample set in the page area based on each candidate sample and the candidate mark point of each candidate sample.
It should be noted that, in the embodiment of the present disclosure, the target training sample set includes a plurality of candidate samples and the candidate marking points corresponding to each candidate sample. In subsequent model training, the candidate samples may be input into the model, and the model may be trained based on the marking points predicted by the model and the candidate marking points. Alternatively, the candidate marking points may be input into the model, and the model may be trained based on the samples it predicts and the candidate samples.
In the embodiment of the disclosure, a page area to be processed is first obtained, and the initial elements and initial marking points in the page area are determined, the page area being used for sample generation. A scaling parameter set preset by the initial element is then obtained, and the initial element and the initial marking points are deformed according to each scaling parameter in the set, yielding a deformed initial sample set and the candidate marking points of each initial sample. Next, for any initial sample, a replacement element set of the initial sample is obtained, and the initial sample is updated based on each replacement element in the set to obtain updated candidate samples. Finally, a target training sample set in the page area is generated based on each candidate sample and its candidate marking points. Therefore, by applying replacement and scaling to a single page area, a target training sample set containing a large number of training samples is generated, which increases the diversity and richness of the samples and addresses the problem that samples are scarce or difficult to acquire.
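To illustrate why the flow above multiplies the number of samples, the following sketch pairs every scaled initial sample with every replacement element. The function name `generate_training_set`, the tuple and dictionary layouts, and the file names are hypothetical; the sketch assumes each replacement is applied to each initial sample independently.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
InitialSample = Tuple[float, Sequence[Point]]  # (scaling parameter, candidate markers)


def generate_training_set(
    initial_samples: Sequence[InitialSample],
    replacements: Sequence[str],
) -> List[Dict]:
    """Combine each initial sample with each replacement element."""
    training_set = []
    for (scale, markers), content in product(initial_samples, replacements):
        training_set.append(
            {
                "scale": scale,
                "content": content,
                # Replacement changes content only; markers come from scaling.
                "candidate_markers": list(markers),
            }
        )
    return training_set


initial_samples = [
    (0.5, [(0.0, 0.0), (50.0, 20.0)]),
    (1.0, [(0.0, 0.0), (100.0, 40.0)]),
]
replacements = ["a.png", "b.png", "c.png"]
training_set = generate_training_set(initial_samples, replacements)
```

Under these assumptions, N initial samples and M replacement elements give N × M candidate samples from one page area (here 2 × 3 = 6).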
In the above embodiment, to determine the initial element of the page area, the page data of the page area may be acquired first, and component data may then be determined from the page data and used as the initial element. It should be noted that the component data may take various forms and is not limited here; for example, it may include a button component, a text component, an image component, and the like, or a combination of several components. The component data is modifiable. Thus, by acquiring the page data, the component data that can be changed is identified, which provides a basis for subsequent element replacement.
In the above embodiment, the initial element and the initial marking point are respectively deformed according to each scaling parameter in the scaling parameter set, so as to obtain the deformed initial sample set and the candidate marking point of each initial sample in the initial sample set, which can be further explained through fig. 3, and the method includes:
S301, obtaining a scaling center of the initial element.
In the embodiment of the disclosure, the scaling center is the base point for scaling: the initial element and the initial marking points are deformed with the scaling center as the origin.
The scaling center is set in advance and may be changed according to actual design requirements; it is not limited here.
S302, respectively carrying out deformation processing on the initial element and the initial marking point based on the scaling center and each scaling parameter in the scaling parameter set, so as to obtain an initial sample set and candidate marking points of each initial sample in the initial sample set.
In the embodiment of the present disclosure, after the scaling center is acquired, the initial element and the initial marking points may be deformed by different methods.
In one possible implementation, the scaling operation may be performed on the initial element with the scaling center as a base point, so as to obtain a scaled initial sample. The center of scaling may also be determined proportionally based on the scaled initial sample.
In the embodiment of the disclosure, a scaling center of an initial element is first obtained, and then deformation processing is respectively performed on the initial element and the initial marking point based on the scaling center and each scaling parameter in a scaling parameter set, so as to obtain an initial sample set and candidate marking points of each initial sample in the initial sample set. Therefore, unified scaling processing is realized by setting the scaling center, the scaling flow is simplified, and the efficiency and consistency of deformation operation are improved.
In the embodiment of the disclosure, deforming the initial element based on the scaling center and each scaling parameter in the scaling parameter set means scaling the initial element according to the scaling parameter with the scaling center point as the base point. Setting a scaling center in this way allows the initial element to be scaled uniformly under every scaling parameter in the set.
In the embodiment of the present disclosure, the initial sample is subjected to replacement processing based on each replacement element in the replacement element set, so as to obtain a candidate sample, and the method may further be explained by fig. 4, including:
S401, obtaining an element type of a replaceable element of the initial sample.
It should be noted that the replaceable element is an element that may be replaced in the initial sample. The replaceable elements of the initial sample may be various and are not limited in any way herein.
In the embodiment of the present disclosure, the element types may include a plurality of types, and are not limited herein, and for example, as shown in fig. 5, the element types may include a picture type as shown in the left element of fig. 5, and may also include a text type as shown in the right element of fig. 5.
S402, determining a target replacement element set corresponding to each replaceable element from the replacement element sets based on the element types.
In the embodiment of the disclosure, after the element type is acquired, a replacement element matched with the element type of the replaceable element can be selected from the replacement element set, and a target replacement element set is generated based on the replacement element.
It should be noted that, in order to match the size of the replaceable element, the replacement elements may be screened by their size, the size of the sample, and the like, so that only suitable replacement elements form the target replacement element set. This prevents the replacement from being distorted by replacement elements that are too large or too small, and prevents the replaced content from extending beyond the sample range.
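The type matching and size screening described for S402 might look like the following sketch. The `Element` fields, the function name `select_target_replacements`, and the relative size tolerance of 20% are assumptions chosen for illustration; the disclosure does not specify a screening rule.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Element:
    kind: str  # e.g. "text" or "picture" (hypothetical labels)
    width: float
    height: float


def select_target_replacements(
    replaceable: Element,
    candidates: Sequence[Element],
    tolerance: float = 0.2,
) -> List[Element]:
    """Keep candidates of the same type whose size is close to the slot's."""

    def fits(candidate: Element) -> bool:
        return (
            abs(candidate.width - replaceable.width) <= tolerance * replaceable.width
            and abs(candidate.height - replaceable.height)
            <= tolerance * replaceable.height
        )

    return [c for c in candidates if c.kind == replaceable.kind and fits(c)]


slot = Element("picture", 100.0, 50.0)
pool = [
    Element("picture", 110.0, 55.0),  # same type, close in size: kept
    Element("picture", 300.0, 50.0),  # same type, far too wide: dropped
    Element("text", 100.0, 50.0),     # different type: dropped
]
target_set = select_target_replacements(slot, pool)
```

A relative tolerance is only one possible choice; an absolute pixel bound or an aspect-ratio check would fit the same description.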
S403, replacing the corresponding replaceable elements with the elements in the target replacement element set.
In the embodiment of the present disclosure, the replacement processing methods corresponding to different element types may differ, which is not limited herein.
In one possible implementation, in response to the element type being a text type, the corresponding replaceable element is replaced with an element in the target replacement element set. In response to the element type being a picture type, the src attribute value of the replaceable element is modified to the src attribute value of an element in the target replacement element set.
The src attribute is an attribute of the image tag (for example, the HTML <img> tag), and its value indicates the absolute or relative path of the image file.
In the embodiment of the disclosure, first, the replaceable elements of the initial sample and the element types of the replaceable elements are acquired, then, a target replacement element set corresponding to each replaceable element is determined from the replacement element sets based on the element types, and finally, replacement processing is performed on the replaceable elements corresponding to the elements in the target replacement element set. Therefore, different replacement elements are selected to replace the replaceable elements based on different element types, so that training samples with similar structures and different contents can be obtained, and effective training samples are provided for subsequent model training.
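The type-dependent replacement in S403 can be sketched with Python's standard `xml.etree.ElementTree`. The sketch assumes the sample is available as well-formed markup, that picture elements are `img` tags, and that every other element is treated as text; the function name `apply_replacement` and the file names are hypothetical.

```python
import xml.etree.ElementTree as ET


def apply_replacement(element: ET.Element, replacement: str) -> None:
    """Replace a picture by rewriting src, or text by rewriting the content."""
    if element.tag == "img":
        # Picture type: only the src attribute value changes.
        element.set("src", replacement)
    else:
        # Text type (assumed for every non-img tag): swap the text content.
        element.text = replacement


sample = ET.fromstring('<div><img src="old.png"/><p>Old title</p></div>')
apply_replacement(sample.find("img"), "images/new.png")
apply_replacement(sample.find("p"), "New title")
```

Because only the content or the `src` value changes, the structure of the sample is preserved, which matches the stated goal of samples with similar structure and different contents.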
In the above embodiment, the deformation processing is performed on the initial marker point based on the zoom center and each zoom parameter in the zoom parameter set, which may be further explained by fig. 6, and the method includes:
S601, acquiring a first coordinate of the scaling center point and a second coordinate of the initial marking point.
In the disclosed embodiments, the first coordinate of the scaling center point and the second coordinate of the initial marking point may be obtained in the same coordinate system.
The coordinate system is preset, and the position of the origin of the coordinate system can be adjusted according to the actual design requirement.
S602, calculating a coordinate migration value based on the scaling parameter, the first coordinate and the second coordinate.
In the embodiment of the present disclosure, the coordinate difference between the first coordinate and the second coordinate may be calculated first, and the coordinate difference may then be multiplied by the scaling parameter to obtain the coordinate migration value. The coordinate migration value may be calculated by the following formula:
$$Q = \left[(X_1, Y_1) - (X_2, Y_2)\right] \times \mathrm{scale}$$
where $(X_1, Y_1)$ is the first coordinate, $(X_2, Y_2)$ is the second coordinate, $\mathrm{scale}$ is the scaling parameter, and $Q$ is the coordinate migration value.
S603, determining the deformed initial mark point as a target mark point based on the coordinate migration value and the second coordinate, and acquiring a third coordinate of the target mark point.
In the embodiment of the present disclosure, after the coordinate migration value is obtained, the third coordinate of the target marking point may be obtained by subtracting the coordinate migration value from the second coordinate, as in the following formula:
$$(X_3, Y_3) = (X_2, Y_2) - Q$$
where $(X_2, Y_2)$ is the second coordinate, $(X_3, Y_3)$ is the third coordinate, and $Q$ is the coordinate migration value.
In the embodiment of the disclosure, a first coordinate of the scaling center point and a second coordinate of the initial marking point are first acquired; a coordinate migration value is then calculated based on the scaling parameter, the first coordinate, and the second coordinate; finally, the deformed initial marking point is determined to be the target marking point based on the coordinate migration value and the second coordinate, and a third coordinate of the target marking point is acquired. Therefore, by placing the scaling center point and the initial marking point in the same coordinate system and calculating the third coordinate of the deformed target marking point, the accuracy of the acquired data can be improved.
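The two formulas of S602 and S603 can be checked with a short sketch. The function name `migrate_marker` is hypothetical. Note one consequence of the formulas as written, which follows from the algebra and is not stated explicitly in the disclosure: a scaling parameter of 0 leaves the point unchanged, and the distance from the scaling center is multiplied by 1 + scale.

```python
from typing import Tuple

Point = Tuple[float, float]


def migrate_marker(center: Point, marker: Point, scale: float) -> Point:
    """Apply Q = (center - marker) * scale, then third = marker - Q."""
    # Coordinate migration value Q (S602).
    qx = (center[0] - marker[0]) * scale
    qy = (center[1] - marker[1]) * scale
    # Third coordinate of the target marking point (S603).
    return (marker[0] - qx, marker[1] - qy)


# Rectangle with diagonal marking points A and B, scaled about its middle.
center = (50.0, 50.0)
a_new = migrate_marker(center, (0.0, 0.0), 0.5)
b_new = migrate_marker(center, (100.0, 100.0), 0.5)
```

For this input, A moves from (0, 0) to (-25, -25) and B from (100, 100) to (125, 125), so a 100 × 100 rectangle becomes 150 × 150 while staying centered on (50, 50).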
Fig. 7 is a flowchart of a page element recognition model training method according to an embodiment of the present disclosure.
As shown in fig. 7, the page element recognition model training method may include:
S701, acquiring an initial page element identification model to be trained, and processing a target page through a sample set generation method to acquire a training sample set of the target page.
It should be noted that, the sample set generating method in the embodiment of the present disclosure is a sample set generating method as shown in fig. 1 to 6, and the specific sample set generating process may refer to the content in the above embodiment, which is not described herein again.
S702, training the initial page element recognition model based on the training sample set until training is completed, and generating a target page element recognition model.
In embodiments of the present disclosure, candidate samples may be predicted by an initial page element recognition model to obtain predicted mark points, and then the model is trained based on the predicted mark points and the candidate mark points.
It can be understood that model training is an iterative process: the network parameters of the model are adjusted repeatedly until the overall loss function value of the model falls below a preset value, or until the overall loss function value no longer changes or changes only slowly, at which point the model has converged and a trained model is obtained.
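The stopping conditions described above can be expressed as a small check on the loss history. This is a sketch under assumptions: the function name `has_converged`, the `min_delta` threshold that defines "changes only slowly", and the `patience` window are illustrative parameters that the disclosure does not specify.

```python
from typing import Sequence


def has_converged(
    loss_history: Sequence[float],
    loss_threshold: float,
    min_delta: float = 1e-4,
    patience: int = 3,
) -> bool:
    """Stop when the loss is below a preset value or has stopped changing."""
    if not loss_history:
        return False
    # Condition 1: overall loss is smaller than the preset value.
    if loss_history[-1] < loss_threshold:
        return True
    # Condition 2: loss changed by less than min_delta for `patience` steps.
    if len(loss_history) <= patience:
        return False
    recent = loss_history[-(patience + 1):]
    return all(abs(recent[i] - recent[i + 1]) < min_delta for i in range(patience))
```

In a training loop, the loss of each iteration would be appended to `loss_history`, and parameter updates would continue until `has_converged` returns `True`.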
In the embodiment of the disclosure, an initial page element identification model to be trained is first obtained, and a target page is processed through the sample set generation method to obtain a training sample set of the target page; the initial page element identification model is then trained based on the training sample set until training is completed, generating the target page element identification model. Because the target page is processed through the sample set generation method, a training sample set containing a large number of samples can be generated even when few original training samples are available, which improves the model training effect.
Corresponding to the sample set generating methods provided in the above embodiments, an embodiment of the present disclosure further provides a sample set generating device. Because the device corresponds to those methods, the implementations of the sample set generating method described above also apply to the sample set generating device and are not described in detail in the following embodiments.
Fig. 8 is a schematic structural diagram of a sample set generating device according to an embodiment of the present disclosure. As shown in fig. 8, the sample set generating device 800 includes: an acquisition module 810, a deformation module 820, an updating module 830, and a generation module 840.
The acquiring module 810 is configured to acquire a page area to be processed, and determine an initial element and an initial mark point in the page area, where the page area is used for sample generation.
The deformation module 820 is configured to obtain a scaling parameter set preset by the initial element, and perform deformation processing on the initial element and the initial marking point according to each scaling parameter in the scaling parameter set, so as to obtain a deformed initial sample set and candidate marking points of each initial sample in the initial sample set.
The updating module 830 is configured to obtain, for any initial sample, a set of replacement elements of the initial sample, and update the initial sample based on each replacement element in the set of replacement elements, to obtain an updated candidate sample.
The generating module 840 is configured to generate a target training sample set in the page area based on each candidate sample and the candidate mark point of each candidate sample.
In one embodiment of the present disclosure, the deformation module 820 is further configured to: acquire a scaling center of the initial element; and respectively perform deformation processing on the initial element and the initial marking point based on the scaling center and each scaling parameter in the scaling parameter set, so as to obtain an initial sample set and candidate marking points of each initial sample in the initial sample set.
In one embodiment of the present disclosure, the deformation module 820 is further configured to: scale the initial element according to the scaling parameter, taking the scaling center point as the base point.
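As an illustrative sketch of scaling an element about a scaling center point, the example below represents an element by an axis-aligned box. The `Box` type and the function names are hypothetical, and the scaling parameter `k` is assumed here to express a relative size change (so the element's dimensions are multiplied by `1 + k`); the disclosure does not fix this convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of a page element (hypothetical type)."""
    x: float
    y: float
    width: float
    height: float


def scale_about_center(box: Box, center: tuple[float, float], k: float) -> Box:
    """Scale `box` about `center`; k is the assumed relative size change."""
    factor = 1.0 + k
    cx, cy = center
    return Box(
        # The top-left corner moves along the line through the center.
        x=cx + (box.x - cx) * factor,
        y=cy + (box.y - cy) * factor,
        width=box.width * factor,
        height=box.height * factor,
    )


def deform(box: Box, center: tuple[float, float], scaling_params: list[float]) -> list[Box]:
    """Produce one deformed box per scaling parameter in the preset set."""
    return [scale_about_center(box, center, k) for k in scaling_params]
```

Each scaling parameter in the set yields one deformed element, so a set of three parameters yields three initial samples from a single element.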
In one embodiment of the present disclosure, the updating module 830 is further configured to: perform replacement processing on the initial sample based on each replacement element in the replacement element set, respectively, to obtain the candidate samples.
In one embodiment of the present disclosure, the updating module 830 is further configured to: acquire an element type of each replaceable element of the initial sample; determine, from the replacement element set and based on the element type, a target replacement element set corresponding to each replaceable element; and replace each replaceable element with an element in its corresponding target replacement element set.
In one embodiment of the present disclosure, the updating module 830 is further configured to: in response to the element type being a text type, replace the corresponding replaceable element with an element in the target replacement element set.
In one embodiment of the present disclosure, the updating module 830 is further configured to: in response to the element type being a picture type, modify the src attribute value of the replaceable element to the src attribute value of an element in the target replacement element set.
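As an illustrative sketch of the type-dependent replacement described above, the example below models page elements as plain dictionaries. The dictionary keys (`type`, `text`, `src`), the type labels, and the function names are assumptions made only for this example; a real page would typically be manipulated through its document object model.

```python
from copy import deepcopy

TEXT, PICTURE = "text", "picture"


def target_replacements(element: dict, replacement_set: list[dict]) -> list[dict]:
    """Pick the replacement elements whose type matches the replaceable element."""
    return [r for r in replacement_set if r["type"] == element["type"]]


def apply_replacement(element: dict, replacement: dict) -> dict:
    """Return an updated copy of `element`; the original is left untouched."""
    if element["type"] == TEXT:
        # Text type: the replacement element takes the place of the original.
        return deepcopy(replacement)
    if element["type"] == PICTURE:
        # Picture type: keep the element and change only its src attribute.
        updated = deepcopy(element)
        updated["src"] = replacement["src"]
        return updated
    raise ValueError(f"unsupported element type: {element['type']}")
```

Changing only the `src` attribute of a picture keeps its other attributes, such as size and position, so the mark points computed for the element remain valid.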
In one embodiment of the present disclosure, the deformation module 820 is further configured to: acquire a first coordinate of the scaling center point and a second coordinate of the initial mark point; calculate a coordinate migration value based on the scaling parameter, the first coordinate and the second coordinate; and determine, based on the coordinate migration value and the second coordinate, a third coordinate of the target mark point that corresponds to the initial mark point after deformation.
In one embodiment of the present disclosure, the deformation module 820 is further configured to: calculate a coordinate difference between the first coordinate and the second coordinate; and multiply the coordinate difference by the scaling parameter to obtain the coordinate migration value.
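As an illustrative sketch of the coordinate migration calculation, the example below makes two assumptions that the text does not state: the coordinate difference is taken as the second coordinate minus the first coordinate, and the scaling parameter `k` expresses a relative size change (for example, 0.5 for a 50% enlargement). The name `migrate_mark_point` is hypothetical.

```python
Point = tuple[float, float]


def migrate_mark_point(center: Point, mark: Point, k: float) -> Point:
    """Return the third coordinate of a mark point after scaling.

    center: first coordinate (scaling center point)
    mark:   second coordinate (initial mark point)
    k:      scaling parameter, assumed to be a relative size change
    """
    # Coordinate difference, taken here as mark minus center (an assumption).
    dx, dy = mark[0] - center[0], mark[1] - center[1]
    # Coordinate migration value: the difference multiplied by the scaling parameter.
    shift_x, shift_y = dx * k, dy * k
    # Third coordinate: the second coordinate moved by the migration value.
    return (mark[0] + shift_x, mark[1] + shift_y)
```

Under these assumptions the result equals `center + (1 + k) * (mark - center)`, so the mark point keeps the same relative position on the scaled element. For a center of (100, 100), a mark point of (120, 90), and `k = 0.5`, the migration value is (10, -5) and the third coordinate is (130, 85).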
In one embodiment of the present disclosure, the acquisition module 810 is configured to: acquire page data of the page area; and determine component data as the initial element based on the page data.
Thus, by applying replacement and scaling processing to a single page area, a target training sample set containing a large number of training samples is generated, which increases the diversity and richness of the samples and addresses the problem that samples are scarce or difficult to acquire.
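As an illustrative sketch of why a single page area can yield many samples, the example below pairs every scaling parameter with every replacement element. The function name and the dictionary layout are assumptions made only for this example.

```python
from itertools import product


def build_sample_set(scaling_params: list[float], replacements: list[str]) -> list[dict]:
    """Pair every scaling parameter with every replacement element."""
    return [
        {"scale": k, "replacement": r}
        for k, r in product(scaling_params, replacements)
    ]
```

With three scaling parameters and four replacement elements, this sketch produces 3 x 4 = 12 candidate samples from one page area; the count grows multiplicatively as either set grows.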
Fig. 9 is a schematic structural diagram of a training device for a page element recognition model according to an embodiment of the present disclosure. As shown in fig. 9, the page element recognition model training device 900 includes: a calling module 910 and a training module 920.
The calling module 910 is configured to obtain an initial page element recognition model to be trained, and process the target page by the sample set generation method, so as to obtain a training sample set of the target page.
The training module 920 is configured to train the initial page element recognition model based on the training sample set until training is completed, and generate a target page element recognition model.
Because the training sample set is generated by processing the target page with the sample set generation method, a training sample set containing a large number of samples can be generated even when few original training samples are available, which improves the model training effect.
In the technical solution of the disclosure, the acquisition, storage, and application of any user personal information involved comply with relevant laws and regulations and do not violate public order and good customs.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Fig. 10 shows a schematic block diagram of an example electronic device 1000 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 10, the device 1000 includes a computing unit 1001 that can perform various appropriate actions and processes according to computer programs/instructions stored in a Read Only Memory (ROM) 1002 or loaded from a storage unit 1008 into a Random Access Memory (RAM) 1003. In the RAM 1003, various programs and data required for the operation of the device 1000 can also be stored. The computing unit 1001, the ROM 1002, and the RAM 1003 are connected to each other by a bus 1004. An input/output (I/O) interface 1005 is also connected to the bus 1004.
Various components in device 1000 are connected to I/O interface 1005, including: an input unit 1006 such as a keyboard, a mouse, and the like; an output unit 1007 such as various types of displays, speakers, and the like; a storage unit 1008 such as a magnetic disk, an optical disk, or the like; and communication unit 1009 such as a network card, modem, wireless communication transceiver, etc. Communication unit 1009 allows device 1000 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks.
The computing unit 1001 may be any of a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 1001 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 1001 performs the respective methods and processes described above, for example, the sample set generation method. For example, in some embodiments, the sample set generation method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 1008. In some embodiments, some or all of the computer program/instructions may be loaded onto and/or installed onto the device 1000 via the ROM 1002 and/or the communication unit 1009. When the computer program/instructions are loaded into the RAM 1003 and executed by the computing unit 1001, one or more steps of the sample set generation method described above may be performed. Alternatively, in other embodiments, the computing unit 1001 may be configured to perform the sample set generation method in any other suitable way (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include: being implemented in one or more computer programs/instructions that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special or general purpose programmable processor, operable to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), the Internet, and blockchain networks.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs/instructions running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps recited in the disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the disclosure are achieved; no limitation is imposed herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (25)

1. A sample set generation method, comprising:
acquiring a page area to be processed, and determining initial elements and initial marking points in the page area, wherein the page area is used for sample generation;
obtaining a scaling parameter set preset for the initial element, and respectively carrying out deformation processing on the initial element and the initial marking point according to each scaling parameter in the scaling parameter set to obtain a deformed initial sample set and candidate marking points of each initial sample in the initial sample set;
for any initial sample, acquiring a replacement element set of the initial sample, and updating the initial sample based on each replacement element in the replacement element set, respectively, to obtain updated candidate samples;
and generating a target training sample set in the page area based on each candidate sample and the candidate mark point of each candidate sample.
2. The method of claim 1, wherein respectively carrying out deformation processing on the initial element and the initial marking point according to each scaling parameter in the scaling parameter set to obtain a deformed initial sample set and candidate marking points of each initial sample in the initial sample set comprises:
acquiring a scaling center of the initial element;
and respectively carrying out deformation processing on the initial element and the initial marking point based on the scaling center and each scaling parameter in the scaling parameter set so as to obtain the initial sample set and candidate marking points of each initial sample in the initial sample set.
3. The method of claim 2, wherein carrying out deformation processing on the initial element based on the scaling center and each scaling parameter in the scaling parameter set comprises:
scaling the initial element according to the scaling parameter, taking the scaling center point as a base point.
4. The method of claim 3, wherein updating the initial sample based on each replacement element in the replacement element set, respectively, to obtain updated candidate samples comprises:
and carrying out replacement processing on the initial sample based on each replacement element in the replacement element set respectively so as to obtain the candidate sample.
5. The method of claim 4, wherein carrying out replacement processing on the initial sample based on each replacement element in the replacement element set, respectively, comprises:
acquiring an element type of each replaceable element of the initial sample;
determining, from the replacement element set and based on the element type, a target replacement element set corresponding to each replaceable element;
and replacing each replaceable element with an element in its corresponding target replacement element set.
6. The method of claim 5, wherein replacing each replaceable element with an element in its corresponding target replacement element set comprises:
in response to the element type being a text type, replacing the corresponding replaceable element with an element in the target replacement element set.
7. The method of claim 5, wherein replacing each replaceable element with an element in its corresponding target replacement element set comprises:
in response to the element type being a picture type, modifying the src attribute value of the replaceable element to the src attribute value of an element in the target replacement element set.
8. The method of claim 2, wherein carrying out deformation processing on the initial mark point based on the scaling center and each scaling parameter in the scaling parameter set comprises:
acquiring a first coordinate of the scaling center point and a second coordinate of the initial mark point;
calculating a coordinate migration value based on the scaling parameter, the first coordinate and the second coordinate;
and determining the deformed initial mark point as a target mark point based on the coordinate migration value and the second coordinate, and acquiring a third coordinate of the target mark point.
9. The method of claim 8, wherein the calculating a coordinate migration value based on the scaling parameter, the first coordinate, and the second coordinate comprises:
calculating a coordinate difference between the first coordinate and the second coordinate;
and multiplying the coordinate difference by the scaling parameter to obtain the coordinate migration value.
10. The method of claim 1, wherein the determining the initial element within the page region comprises:
acquiring page data of the page area;
component data is determined as the initial element based on the page data.
11. A page element recognition model training method, comprising:
acquiring an initial page element recognition model to be trained, and processing a target page by the sample set generation method according to any one of claims 1-10 to acquire a training sample set of the target page;
and training the initial page element recognition model based on the training sample set until training is completed, to generate a target page element recognition model.
12. A sample set generating device comprising:
an acquisition module, configured to acquire a page area to be processed, and determine initial elements and initial marking points in the page area, wherein the page area is used for sample generation;
a deformation module, configured to acquire a scaling parameter set preset for the initial element, and respectively carry out deformation processing on the initial element and the initial mark point according to each scaling parameter in the scaling parameter set to obtain a deformed initial sample set and candidate mark points of each initial sample in the initial sample set;
an updating module, configured to acquire, for any initial sample, a replacement element set of the initial sample, and update the initial sample based on each replacement element in the replacement element set, respectively, to obtain updated candidate samples;
and the generation module is used for generating a target training sample set in the page area based on each candidate sample and the candidate mark point of each candidate sample.
13. The apparatus of claim 12, wherein the deformation module is further configured to:
acquiring a scaling center of the initial element;
and respectively carrying out deformation processing on the initial element and the initial marking point based on the scaling center and each scaling parameter in the scaling parameter set so as to obtain the initial sample set and candidate marking points of each initial sample in the initial sample set.
14. The apparatus of claim 13, wherein the deformation module is further configured to:
and scaling the initial element according to the scaling parameters by taking the scaling center point as a base point.
15. The apparatus of claim 14, wherein the update module is further configured to:
and carrying out replacement processing on the initial sample based on each replacement element in the replacement element set respectively so as to obtain the candidate sample.
16. The apparatus of claim 15, wherein the update module is further configured to:
acquiring an element type of each replaceable element of the initial sample;
determining, from the replacement element set and based on the element type, a target replacement element set corresponding to each replaceable element;
and replacing each replaceable element with an element in its corresponding target replacement element set.
17. The apparatus of claim 16, wherein the update module is further configured to:
in response to the element type being a text type, replacing the corresponding replaceable element with an element in the target replacement element set.
18. The apparatus of claim 16, wherein the update module is further configured to:
in response to the element type being a picture type, modifying the src attribute value of the replaceable element to the src attribute value of an element in the target replacement element set.
19. The apparatus of claim 13, wherein the deformation module is further configured to:
acquiring a first coordinate of the scaling center point and a second coordinate of the initial mark point;
calculating a coordinate migration value based on the scaling parameter, the first coordinate and the second coordinate;
and determining a third coordinate of the target mark point corresponding to the initial mark point after deformation based on the coordinate migration value and the second coordinate.
20. The apparatus of claim 19, wherein the deformation module is further configured to:
calculating a coordinate difference between the first coordinate and the second coordinate;
and multiplying the coordinate difference by the scaling parameter to obtain the coordinate migration value.
21. The apparatus of claim 12, wherein the acquisition module is configured to:
acquiring page data of the page area;
component data is determined as the initial element based on the page data.
22. A page element recognition model training device, comprising:
a calling module, configured to obtain an initial page element recognition model to be trained, and process a target page by using the sample set generation method according to any one of claims 1 to 10, so as to obtain a training sample set of the target page;
and a training module, configured to train the initial page element recognition model based on the training sample set until training is completed, to generate a target page element recognition model.
23. An electronic device, comprising a memory and a processor;
wherein the processor reads executable program code stored in the memory and runs a program corresponding to the executable program code, so as to implement the sample set generation method according to any one of claims 1 to 10 or the page element recognition model training method according to claim 11.
24. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the sample set generation method of any one of claims 1-10 or the page element recognition model training method of claim 11.
25. A computer program product comprising computer program/instructions which, when executed by a processor, implement the sample set generation method of any one of claims 1-10, or implement the page element recognition model training method of claim 11.
CN202311056429.8A 2023-08-21 2023-08-21 Sample set generation and page element recognition model training method and device Pending CN117113085A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311056429.8A CN117113085A (en) 2023-08-21 2023-08-21 Sample set generation and page element recognition model training method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311056429.8A CN117113085A (en) 2023-08-21 2023-08-21 Sample set generation and page element recognition model training method and device

Publications (1)

Publication Number Publication Date
CN117113085A true CN117113085A (en) 2023-11-24

Family

ID=88804913

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311056429.8A Pending CN117113085A (en) 2023-08-21 2023-08-21 Sample set generation and page element recognition model training method and device

Country Status (1)

Country Link
CN (1) CN117113085A (en)

Similar Documents

Publication Publication Date Title
EP3859605A2 (en) Image recognition method, apparatus, device, and computer storage medium
CN113177472B (en) Dynamic gesture recognition method, device, equipment and storage medium
CN115063875B (en) Model training method, image processing method and device and electronic equipment
CN112861885B (en) Image recognition method, device, electronic equipment and storage medium
EP3933752A1 (en) Method and apparatus for processing video frame
CN114792355B (en) Virtual image generation method and device, electronic equipment and storage medium
JP2023541527A (en) Deep learning model training method and text detection method used for text detection
CN113344862A (en) Defect detection method, defect detection device, electronic equipment and storage medium
CN113762109B (en) Training method of character positioning model and character positioning method
CN113033346B (en) Text detection method and device and electronic equipment
CN114495101A (en) Text detection method, and training method and device of text detection network
CN112085103B (en) Data enhancement method, device, equipment and storage medium based on historical behaviors
CN114881227B (en) Model compression method, image processing device and electronic equipment
CN114863450B (en) Image processing method, device, electronic equipment and storage medium
CN114972910B (en) Training method and device for image-text recognition model, electronic equipment and storage medium
US20220319141A1 (en) Method for processing image, device and storage medium
EP4047474A1 (en) Method for annotating data, related apparatus and computer program product
CN117113085A (en) Sample set generation and page element recognition model training method and device
CN114494686A (en) Text image correction method, text image correction device, electronic equipment and storage medium
CN114707638A (en) Model training method, model training device, object recognition method, object recognition device, object recognition medium and product
CN114445682A (en) Method, device, electronic equipment, storage medium and product for training model
CN113642612B (en) Sample image generation method and device, electronic equipment and storage medium
CN113239943B (en) Three-dimensional component extraction and combination method and device based on component semantic graph
CN116168442B (en) Sample image generation method, model training method and target detection method
CN113705594B (en) Image identification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination