CN117292395B - Training method and device for a map-review model, and map-review method and device - Google Patents

Training method and device for a map-review model, and map-review method and device

Info

Publication number
CN117292395B
CN117292395B (application CN202311268276.3A)
Authority
CN
China
Prior art keywords
map
line element
feature
examined
truth value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311268276.3A
Other languages
Chinese (zh)
Other versions
CN117292395A
Inventor
左栋
邹辉东
张文晖
张雨心
何建辉
柏松林
杨建忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Map Technology Examination Center Of Ministry Of Natural Resources
Original Assignee
Map Technology Examination Center Of Ministry Of Natural Resources
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Map Technology Examination Center of Ministry of Natural Resources
Priority to CN202311268276.3A
Publication of CN117292395A
Application granted
Publication of CN117292395B
Legal status: Active

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V10/94 Hardware or software architectures specially adapted for image or video understanding
    • G06V10/95 Hardware or software architectures specially adapted for image or video understanding structured as a network, e.g. client-server architectures
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/19007 Matching; Proximity measures
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/42 Document-oriented image-based pattern recognition based on the type of document
    • G06V30/422 Technical drawings; Geographical maps

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure provides a training method and device for a map-review model, and a map-review method and device, relating to the field of artificial intelligence and in particular to image processing. The scheme is implemented as follows: obtain a sample map to be reviewed, a ground-truth map, and the labeled similarity between the two; extract line elements from the sample map and the ground-truth map respectively, obtaining a line-element map to be reviewed and a ground-truth line-element map; feed both line-element maps into a feature extractor based on a self-attention mechanism to obtain an intermediate feature map of each; feed the two intermediate feature maps into a similarity comparison network in the map-review model to obtain a predicted similarity; and adjust the network parameters of the map-review model based on the difference between the predicted similarity and the labeled similarity. The resulting model can audit maps automatically, improving the accuracy and efficiency of map review while reducing labor cost.

Description

Training method and device for a map-review model, and map-review method and device
Technical Field
The disclosure relates to the field of artificial intelligence, in particular to image processing, and specifically to a training method and device for a map-review model and a map-review method and device.
Background
Map review is the task of discovering and correcting problems on a map to ensure its accuracy, reliability, compliance and applicability. Traditional map review relies primarily on manual inspection, which can find some problems but has clear drawbacks. First, manual inspection is slow and consumes substantial human resources. Second, because human vision and attention are limited, missed detections and false detections are common. Finally, manual inspection is subjective: different reviewers may apply different standards and methods, which lowers the reliability of review results.
Disclosure of Invention
The present disclosure provides a training method and apparatus for a map-review model, and a map-review method, apparatus, device, storage medium and computer program product.
According to a first aspect of the present disclosure, there is provided a training method for a map-review model, including: obtaining a sample map to be reviewed, a ground-truth map, and the labeled similarity between the two; extracting line elements from the sample map and the ground-truth map respectively, and generating a line-element map to be reviewed and a ground-truth line-element map from the extracted line elements; inputting the two line-element maps into a self-attention-based feature extractor in an initial map-review model to obtain an intermediate feature map of the sample map and an intermediate feature map of the ground-truth map; inputting the two intermediate feature maps into a similarity comparison network in the initial map-review model to obtain a predicted similarity; and adjusting network parameters of the feature extractor and/or the similarity comparison network in the initial map-review model based on the difference between the predicted similarity and the labeled similarity.
According to a second aspect of the present disclosure, there is provided a map-review method, comprising: acquiring a map to be reviewed and a standard map; extracting line elements from each of them, and generating the corresponding line-element map to be reviewed and standard line-element map; and inputting both line-element maps into a map-review model trained by the method of any one of the first aspect, and outputting the review result for the map to be reviewed against the standard map.
According to a third aspect of the present disclosure, there is provided a training apparatus for a map-review model, comprising: an acquisition unit configured to acquire the sample map to be reviewed, the ground-truth map, and their labeled similarity; a preprocessing unit configured to extract line elements from the two maps and generate the corresponding line-element map to be reviewed and ground-truth line-element map; an extraction unit configured to input the two line-element maps into a self-attention-based feature extractor in an initial map-review model, obtaining an intermediate feature map of the sample map and an intermediate feature map of the ground-truth map; a computing unit configured to input the two intermediate feature maps into a similarity comparison network in the initial map-review model to obtain a predicted similarity; and an adjustment unit configured to adjust network parameters of the feature extractor and/or the similarity comparison network based on the difference between the predicted and labeled similarities.
According to a fourth aspect of the present disclosure, there is provided a map-review apparatus, comprising: an acquisition unit configured to acquire a map to be reviewed and a standard map; a preprocessing unit configured to extract line elements from the two maps and generate the corresponding line-element map to be reviewed and standard line-element map; and a review unit configured to input the two line-element maps into a map-review model trained by the apparatus of any one of the third aspect, and output the review result for the map to be reviewed against the standard map.
According to a fifth aspect of the present disclosure, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of the first or second aspects.
According to a sixth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of the first or second aspects.
According to a seventh aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method of any one of the first or second aspects.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is an exemplary system architecture diagram to which the present disclosure may be applied;
FIG. 2 is a flow chart of one embodiment of the training method for the map-review model according to the present disclosure;
FIG. 3 is a schematic illustration of one application scenario of the training method for the map-review model according to the present disclosure;
FIG. 4 is a flow chart of one embodiment of the map-review method according to the present disclosure;
FIG. 5 is a schematic structural view of one embodiment of the training apparatus for the map-review model according to the present disclosure;
FIG. 6 is a schematic diagram of one embodiment of the map-review apparatus according to the present disclosure;
Fig. 7 is a schematic diagram of a computer system suitable for use in implementing embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 illustrates an exemplary system architecture 100 to which the training method, training apparatus, map-review method, or map-review apparatus of embodiments of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include terminals 101, 102, a network 103, a database server 104, and a server 105. The network 103 serves as a medium for providing a communication link between the terminals 101, 102, the database server 104 and the server 105. The network 103 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user 110 may interact with the server 105 via the network 103 using the terminals 101, 102 to receive or send messages. Various client applications may be installed on the terminals 101, 102, such as model-training applications, map-review applications, map applications, shopping applications, payment applications, web browsers and instant messaging tools.
The terminals 101 and 102 may be hardware or software. When the terminals 101, 102 are hardware, they may be various electronic devices with display screens, including but not limited to smartphones, tablet computers, electronic book readers, MP3 players (Moving Picture Experts Group Audio Layer III, dynamic video experts compression standard audio plane 3), laptop and desktop computers, and the like. When the terminals 101, 102 are software, they can be installed in the above-listed electronic devices. Which may be implemented as multiple software or software modules (e.g., to provide distributed services), or as a single software or software module. The present invention is not particularly limited herein.
When the terminals 101 and 102 are software, map software may be installed on them to display maps. Screenshots of the maps rendered by the map software on different terminals can then be reviewed, i.e., the maps provided by the map software can be checked against the publication standard.
Database server 104 may be a database server that provides various services. For example, it may store a sample set containing a large number of samples, each comprising a sample map to be reviewed, a ground-truth map, and the labeled similarity between the two. The user 110 may select samples from this sample set via the terminals 101, 102.
The server 105 may also be a server providing various services, such as a background server supporting the applications displayed on the terminals 101, 102. The background server can train an initial model using samples from the sample set stored in the database server 104, and the user can then apply the trained map-review model to map auditing.
The database server 104 and the server 105 may be hardware or software. When they are hardware, they may be implemented as a distributed server cluster composed of a plurality of servers, or as a single server. When they are software, they may be implemented as a plurality of software or software modules (e.g., to provide distributed services), or as a single software or software module. The present invention is not particularly limited herein. Database server 104 and server 105 may also be servers of a distributed system or servers that incorporate blockchains. Database server 104 and server 105 may also be cloud servers, or intelligent cloud computing servers or intelligent cloud hosts with artificial intelligence technology.
It should be noted that the training method for the map-review model and the map-review method provided by embodiments of the present disclosure are generally performed by the server 105. Accordingly, the training apparatus and the map-review apparatus are also typically provided in the server 105.
It should be noted that the database server 104 may not be provided in the system architecture 100 in cases where the server 105 may implement the relevant functions of the database server 104.
It should be understood that the number of terminals, networks, database servers, and servers in fig. 1 is merely illustrative. There may be any number of each, as the implementation requires.
With continued reference to FIG. 2, a flow 200 of one embodiment of the training method for the map-review model according to the present disclosure is shown. The method comprises the following steps:
Step 201, obtain a sample map to be reviewed, a ground-truth map, and the labeled similarity between the two.
In this embodiment, the execution subject of the training method (e.g., the server shown in fig. 1) may acquire a sample set from the database server. Each sample comprises a sample map to be reviewed, a ground-truth map, and their labeled similarity. The ground-truth map is compiled according to the national standards for drawing the boundary lines of China and of the countries of the world; it can be used in news photos, book and periodical illustrations, advertising backdrops, artwork base maps and the like, and can also serve as the reference base map for compiling public-release maps. During training, samples are drawn at random from the sample set. The sample map and the ground-truth map have the same scale and size and are registered to the same coordinates; during sample collection they can be brought into a comparable state by zooming, cropping and similar adjustments.
The similarity between the sample map and the ground-truth map can be labeled automatically by a machine: for example, a map-review model trained by this method or by another existing method can serve as an automatic labeling tool. Labels can also be produced by manual review, with a human scoring the similarity: 1 if the two maps are fully consistent, 0 if they are entirely different, and intermediate values distributed between 0 and 1 according to the degree of similarity. For instance, a manual review might score a pair at 0.7, meaning the similarity between the ground-truth map and the sample map is 0.7. Similarity here refers to the similarity of line elements; to distinguish it from the predicted similarity, the similarity recorded in the sample is called the labeled similarity.
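For concreteness, a training sample of this kind could be represented as follows. This is a minimal Python sketch; the class and field names are illustrative rather than taken from the patent.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class MapReviewSample:
    """One training sample: a map pair plus its labeled line-element similarity."""
    pending_map: np.ndarray    # H x W x 3; same scale, size and coordinates as truth_map
    truth_map: np.ndarray      # H x W x 3; the ground-truth reference map
    labeled_similarity: float  # in [0, 1]; 1 = line elements fully consistent, 0 = entirely different
```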
Step 202, extract line elements from the sample map and the ground-truth map respectively, and generate a line-element map to be reviewed and a ground-truth line-element map from the extracted elements.
In this embodiment, image segmentation is applied to the sample map and the ground-truth map, using the color information of the target line elements, to separate out candidate line elements. Optionally, image processing techniques such as morphological operations, filtering and connected-component analysis may then be applied to reduce or remove noise and interference in the resulting line-element maps.
The color information of the line elements can be extracted from the two maps by a pre-trained preprocessing model, separating the line elements in each image from the background. This yields a binary mask or segmentation map of candidate line elements for each input, named the line-element map to be reviewed and the ground-truth line-element map respectively.
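A minimal sketch of such color-based line-element extraction is shown below, assuming OpenCV and a map style whose boundary lines fall in a known HSV color range; the threshold values and the function name are illustrative, since the patent does not fix them.

```python
import cv2
import numpy as np

def extract_line_elements(map_bgr: np.ndarray,
                          hsv_lo=(0, 60, 60), hsv_hi=(10, 255, 255)) -> np.ndarray:
    """Return a binary mask of candidate line elements selected by color.

    The HSV range is a placeholder for whatever color the boundary lines
    use in a given map style; it must be tuned per map product.
    """
    hsv = cv2.cvtColor(map_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(hsv_lo), np.array(hsv_hi))
    # Morphological opening removes isolated noise pixels while keeping thin lines.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    return mask
```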
Step 203, input the line-element map to be reviewed and the ground-truth line-element map into the self-attention-based feature extractor in the initial map-review model, obtaining an intermediate feature map of the sample map and an intermediate feature map of the ground-truth map.
In this embodiment, the map-review model consists of a self-attention-based feature extractor and a similarity comparison network.
The processed sample and ground-truth maps serve as inputs to the self-attention feature extractor, which outputs an intermediate feature map for each. These feature maps contain rich local and global structural features (in particular, line texture and color features). Specifically:
Input data: the preprocessed sample map and ground-truth map are fed to the self-attention feature extractor. Preprocessing has already segmented the line elements by color and filtered the result to reduce noise.
Self-attention feature extractor: a deep learning model dedicated to processing the sample and ground-truth maps. It extracts rich feature information from both inputs; the features capture not only local map details but also the overall structure of the map.
Intermediate feature map generation: after processing both inputs, the extractor outputs an intermediate feature map of the sample map and one of the ground-truth map. These feature maps encode local and global information at every pixel location and can be regarded as abstract representations of the maps.
Step 204, input the two intermediate feature maps into the similarity comparison network in the initial map-review model to obtain the predicted similarity.
In this embodiment, the similarity comparison network may take one of two forms. The first directly compares the two intermediate feature maps by their average per-pixel distance to obtain the predicted similarity. The second first converts the two feature maps into feature vectors and then computes a vector similarity, e.g., cosine similarity.
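Both comparison variants can be sketched as follows (PyTorch, with (B, C, H, W) feature maps). Mapping the mean per-pixel distance to a score via exp(-d) is our own assumption; the patent only specifies comparing average per-pixel distances.

```python
import torch
import torch.nn.functional as F

def pixelwise_similarity(feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
    """Mean per-pixel squared distance between two feature maps, mapped into (0, 1]."""
    dist = (feat_a - feat_b).pow(2).mean(dim=(1, 2, 3))  # smaller distance -> higher score
    return torch.exp(-dist)

def vector_cosine_similarity(feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
    """Flatten each feature map to a vector and compare with cosine similarity."""
    va = feat_a.flatten(start_dim=1)
    vb = feat_b.flatten(start_dim=1)
    return F.cosine_similarity(va, vb, dim=1)
```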
Step 205, adjust the network parameters of the self-attention-based feature extractor and/or the similarity comparison network in the initial map-review model based on the difference between the predicted similarity and the labeled similarity.
In this embodiment, training is supervised: the labeled similarity serves as the supervision signal, and a loss value is computed from the difference between the predicted and labeled similarities. The network parameters of either the feature extractor or the similarity comparison network, or of both simultaneously, are adjusted so that the loss converges. New samples are then selected and steps 201-205 are repeated until the loss converges to a predetermined value.
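A single supervised update could look like the following sketch. The use of mean-squared error is our own choice of loss, since the patent only requires a loss computed from the difference between predicted and labeled similarity.

```python
import torch
import torch.nn.functional as F

def train_step(extractor, comparator, batch, optimizer) -> float:
    """One supervised update: loss is the gap between predicted and labeled similarity."""
    pending_lines, truth_lines, labeled_sim = batch   # line-element maps + (B,) float labels
    feat_pending = extractor(pending_lines)
    feat_truth = extractor(truth_lines)
    pred_sim = comparator(feat_pending, feat_truth)   # predicted similarity in (0, 1)
    loss = F.mse_loss(pred_sim, labeled_sim)          # MSE is an assumed loss choice
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```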
The method of this embodiment uses an image self-attention mechanism together with computer vision techniques to improve the accuracy with which the map-review model extracts local and global features, and thereby the accuracy of the model itself. Reviewing the line elements of a map with the trained model makes it possible to check more accurately for errors, omissions, overlaps and similar problems. Automating line-element inspection improves the efficiency and accuracy of map review and reduces human error.
In some optional implementations of this embodiment, extracting line elements from the sample map and the ground-truth map and generating the corresponding line-element maps includes: extracting the color information of the line elements from both maps; and, based on that color information, separating the line elements from the background with image segmentation, yielding the line-element map to be reviewed and the ground-truth line-element map. First, the color information of the line elements is extracted from both maps; this may involve color-space conversion, filtering and enhancement to ensure the line colors stand out in the image. The line elements can then be separated from the background with segmentation methods based on thresholds, edge detection, region growing or deep learning, producing a binary mask or segmentation map of candidate line elements for each input. Comparing line-element maps directly reduces the computation required and improves prediction accuracy.
In some optional implementations of this embodiment, the method further includes applying image processing techniques, such as morphological operations, filtering and connected-component analysis, to reduce or remove noise and interference in the line-element map to be reviewed and the ground-truth line-element map. This helps preserve the accuracy of the line elements under review.
In some optional implementations of this embodiment, feeding the two line-element maps into the self-attention-based feature extractor in the initial map-review model to obtain the two intermediate feature maps includes: dividing each line-element map into fixed-size patches, obtaining a set of sub-images for the map to be reviewed and a set for the ground-truth map; and performing global feature extraction over each set to obtain the intermediate feature map of the sample map and that of the ground-truth map. This is a ViT-style (Vision Transformer) self-attention feature extractor. The Transformer mechanism was first successful in natural language processing; ViT adapts it to images by splitting each image into patches and then extracting global features over the patches with a Transformer, which captures both local and global structure of the map.
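The patch-based variant can be sketched as below, assuming PyTorch; the dimensions are illustrative, and positional embeddings are omitted for brevity although a full ViT would include them.

```python
import torch
import torch.nn as nn

class PatchSelfAttentionExtractor(nn.Module):
    """ViT-style extractor: split the input into fixed-size patches, run a
    Transformer encoder over the patch tokens, and reshape the tokens back
    into an intermediate feature map. Sizes are illustrative; in_ch=1 suits
    a binary line-element map."""
    def __init__(self, patch=16, dim=256, depth=4, heads=8, in_ch=1):
        super().__init__()
        # Patch embedding via a strided convolution (one token per patch).
        self.embed = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens = self.embed(x)                    # (B, dim, H/p, W/p)
        b, c, h, w = tokens.shape
        seq = tokens.flatten(2).transpose(1, 2)   # (B, h*w, dim)
        seq = self.encoder(seq)                   # global self-attention over patches
        return seq.transpose(1, 2).reshape(b, c, h, w)  # intermediate feature map
```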
In some optional implementations of this embodiment, feeding the two line-element maps into the self-attention-based feature extractor may instead include: extracting a local feature map for each input with a convolutional neural network; and then performing global feature extraction on each local feature map to obtain the two intermediate feature maps. This variant, sketched below, combines the strengths of CNNs and Transformers: the CNN first extracts a local feature map, and the Transformer then extracts global features, so that features are obtained at a higher level of abstraction and the map content is better understood. Either way, the self-attention feature extractor effectively extracts features from the sample and ground-truth maps. These features are key to the subsequent comparison of predicted and labeled similarity, where they are used to judge how well the line elements of the sample map match those of the ground-truth map, enabling an automated review process.
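A sketch of this hybrid variant, under the same assumptions as the previous example (illustrative sizes, no positional embeddings):

```python
import torch.nn as nn

class HybridExtractor(nn.Module):
    """CNN front end for local features, Transformer back end for global context."""
    def __init__(self, dim=256, depth=2, heads=8, in_ch=1):
        super().__init__()
        self.cnn = nn.Sequential(                        # local feature map
            nn.Conv2d(in_ch, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, x):
        f = self.cnn(x)                                  # (B, dim, H/4, W/4)
        b, c, h, w = f.shape
        seq = self.encoder(f.flatten(2).transpose(1, 2)) # global attention over positions
        return seq.transpose(1, 2).reshape(b, c, h, w)
```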
In some optional implementations of this embodiment, feeding the two intermediate feature maps into the similarity comparison network in the initial map-review model to obtain the predicted similarity includes: concatenating the intermediate feature map of the sample map and that of the ground-truth map into a complete feature map; flattening the complete feature map into a feature vector and feeding it through a multi-layer perceptron in the comparison network to obtain a perception vector; and passing the perception vector through a Sigmoid function in the comparison network to obtain the predicted similarity. In this process the two intermediate feature maps are first concatenated so that their information is integrated, then flattened into a vector for the subsequent fully-connected operations. A multi-layer perceptron (MLP) processes the vector to learn higher-level feature representations, and the Sigmoid maps the MLP output into [0, 1], producing a similarity score between the sample map and the ground-truth map. A score near 1 indicates high similarity; a score near 0 indicates low similarity. This combination of techniques lets the automated review result be evaluated more accurately.
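A minimal sketch of such a similarity head follows; the channel counts are illustrative, and the conv+pool compression stage corresponds to the optional implementation described in the next paragraph.

```python
import torch
import torch.nn as nn

class SimilarityHead(nn.Module):
    """Concatenate two intermediate feature maps, compress with conv + pooling,
    flatten, run an MLP, and squash to a (0, 1) score with a Sigmoid.
    in_ch=512 assumes each input feature map has 256 channels."""
    def __init__(self, in_ch=512, hidden=256):
        super().__init__()
        self.compress = nn.Sequential(          # optional down-sampling stage
            nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.mlp = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(hidden), nn.ReLU(),   # LazyLinear infers the flattened size
            nn.Linear(hidden, 1),
        )

    def forward(self, feat_pending: torch.Tensor, feat_truth: torch.Tensor) -> torch.Tensor:
        full = torch.cat([feat_pending, feat_truth], dim=1)   # channel-wise concatenation
        return torch.sigmoid(self.mlp(self.compress(full))).squeeze(1)
```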
In some optional implementations of this embodiment, flattening the complete feature map into a feature vector includes: applying convolution and pooling to the complete feature map to obtain a compressed feature map; and flattening the compressed feature map into the feature vector. Convolution and pooling down-sample the concatenated features and reduce the channel count, compressing the features and capturing a higher-level representation while reducing the number of parameters in subsequent processing. Features of the compressed feature map may further be extracted by convolutional and recurrent neural networks and fused through a fusion layer into the final feature vector.
In some optional implementations of this embodiment, the line-element extraction may be performed by a deep learning segmentation model that preprocesses the sample map and the ground-truth map into the line-element map to be reviewed and the ground-truth line-element map.
In some optional implementations of this embodiment, the method further includes adjusting the network parameters of the segmentation model based on the difference between the predicted and labeled similarities. If a segmentation model is used for preprocessing, its parameters can be adjusted while training the map-review model, yielding a segmentation model tailored to map review.
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the training method of this embodiment. In the scenario of fig. 3, a sample is drawn at random from the sample set, and line elements are extracted from its sample map and ground-truth map: color information is extracted, map elements are segmented, and image noise and interference are removed, producing the line-element map to be reviewed and the ground-truth line-element map. The processed maps may still contain non-line elements, so the structural information of the map must be extracted from both local and global perspectives in order to filter them out. To this end an image self-attention mechanism is introduced, which captures map structure efficiently. The feature extractor extracts intermediate features from the two line-element maps, yielding the intermediate feature map of the sample map and that of the ground-truth map. The similarity comparison network then computes the similarity between the two intermediate feature maps as the predicted value, which is compared with the labeled similarity (the true value) in the sample to compute a loss. The network parameters of the map-review model, i.e., of the feature extractor and the similarity comparison network, are adjusted according to the loss so that it converges. The above steps are repeated until the loss converges to a predetermined value; training is then complete, and the resulting map-review model can be used directly for automatic map review. By combining color and structural information and completing the review automatically, the process improves both the efficiency and the accuracy of map review.
With further reference to fig. 4, a flow 400 of one embodiment of the map-review method is shown. The flow 400 comprises the following steps:
Step 401, acquire the map to be reviewed and a standard map.
In this embodiment, the electronic device on which the map-review method runs (e.g., the server shown in fig. 1) may acquire the map to be reviewed from a terminal device, with the complete standard map stored on the server in advance. The coordinates, scale and size of the standard map can be adjusted to match the map to be reviewed according to the elements under review (at least one of national boundaries, important South China Sea islands, Hong Kong and Macao, and province/city names). The standard map is compiled according to the national standards for drawing the boundary lines of China and of the countries of the world; it can be used in news photos, book and periodical illustrations, advertising backdrops, artwork base maps and the like, and can also serve as the reference base map for compiling public-release maps.
Step 402, extract line elements from the map to be reviewed and the standard map respectively, and generate the corresponding line-element map to be reviewed and standard line-element map.
In this embodiment, line elements are extracted with the method described in step 202: color information is extracted, map elements are segmented, and image noise and interference are removed, producing the line-element map to be reviewed and the standard line-element map.
Step 403, input the line-element map to be reviewed and the standard line-element map into the map-review model, and output the review result for the map to be reviewed against the standard map.
In this embodiment, the map-review model is the model trained via steps 201-205. The processed maps may contain a variety of non-line elements, so structural information must be extracted from local and global perspectives to filter them out; the image self-attention mechanism captures this structure efficiently. The information extracted by the self-attention mechanism is then used to evaluate the similarity of the intermediate features of the map to be reviewed and the standard map, realizing an automatic review judgment. The review result may be the similarity score itself, or simply pass/fail. By combining color and structural information and completing the review automatically, the process improves the efficiency and accuracy of map review.
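At inference time, the trained extractor and comparison network could be applied as in the following sketch; the 0.9 pass/fail threshold is an assumed operating point, not a value given in the patent, and the sketch expects a single map pair (batch size 1).

```python
import torch

@torch.no_grad()
def review_map(extractor, comparator, pending_lines, standard_lines, threshold=0.9):
    """Score one map pair and turn the similarity into a pass/fail review result."""
    sim = comparator(extractor(pending_lines), extractor(standard_lines)).item()
    return {"similarity": sim, "passed": sim >= threshold}
```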
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the flow 400 of the map-review method in this embodiment represents the step of reviewing maps with the model trained by flow 200. The scheme first extracts line elements using color information, then captures global and local structural information of the map with the image self-attention mechanism, and finally judges the similarity of the intermediate features of the map to be reviewed and the standard map, producing an automatic review result. By combining color-based preprocessing with the self-attention mechanism, it can automatically judge the similarity between the two maps, completing the line-element review task efficiently and accurately, a notable advance for line-element map review.
With further reference to fig. 5, as an implementation of the method shown in the preceding figures, the present disclosure provides an embodiment of a training apparatus for the map-review model. The apparatus embodiment corresponds to the method embodiment shown in fig. 2, and the apparatus may be applied to various electronic devices.
As shown in fig. 5, the training apparatus 500 of this embodiment includes: an acquisition unit 501, a preprocessing unit 502, an extraction unit 503, a computing unit 504, and an adjustment unit 505. The acquisition unit 501 is configured to acquire the sample map to be reviewed, the ground-truth map, and their labeled similarity; the preprocessing unit 502 is configured to extract line elements from the two maps and generate the corresponding line-element map to be reviewed and ground-truth line-element map; the extraction unit 503 is configured to input the two line-element maps into the self-attention-based feature extractor in the initial map-review model, obtaining the intermediate feature map of the sample map and that of the ground-truth map; the computing unit 504 is configured to input the two intermediate feature maps into the similarity comparison network in the initial map-review model to obtain the predicted similarity; and the adjustment unit 505 is configured to adjust the network parameters of the feature extractor and/or the similarity comparison network based on the difference between the predicted and labeled similarities.
In this embodiment, for the specific processing of the units 501 through 505 of the training apparatus 500, reference may be made to steps 201 through 205 in the embodiment corresponding to fig. 2.
In some optional implementations of this embodiment, the preprocessing unit 502 is further configured to: extract the color information of the line elements from the sample map and the ground-truth map; and separate the line elements in the two maps from the background with image segmentation based on that color information, obtaining the line-element map to be reviewed and the ground-truth line-element map.
In some optional implementations of this embodiment, the apparatus 500 further includes a denoising unit (not shown in the drawings) configured to apply image processing techniques to reduce or remove noise and interference in the two line-element maps.
In some optional implementations of this embodiment, the extraction unit 503 is further configured to: divide each line-element map into fixed-size patches to obtain a sub-image set for the map to be reviewed and one for the ground-truth map; and perform global feature extraction over each set to obtain the two intermediate feature maps.
In some optional implementations of this embodiment, the extraction unit 503 is further configured to: extract a local feature map for each line-element map with a convolutional neural network; and perform global feature extraction on each local feature map to obtain the two intermediate feature maps.
In some optional implementations of this embodiment, the computing unit 504 is further configured to: concatenate the two intermediate feature maps into a complete feature map; flatten the complete feature map into a feature vector and feed it through the multi-layer perceptron in the similarity comparison network to obtain a perception vector; and pass the perception vector through the Sigmoid function in the comparison network to obtain the predicted similarity.
In some optional implementations of this embodiment, the computing unit 504 is further configured to: apply convolution and pooling to the complete feature map to obtain a compressed feature map, and flatten the compressed feature map into the feature vector.
In some optional implementations of this embodiment, the preprocessing unit 502 is further configured to extract the line elements from the sample map and the ground-truth map via a deep learning segmentation model, obtaining the two line-element maps.
In some optional implementations of this embodiment, the adjustment unit 505 is further configured to adjust the network parameters of the segmentation model based on the difference between the predicted and labeled similarities.
With further reference to fig. 6, as an implementation of the method shown in the preceding figures, the present disclosure provides an embodiment of a map-review apparatus. The apparatus embodiment corresponds to the method embodiment shown in fig. 4, and the apparatus may be applied to various electronic devices.
As shown in fig. 6, the map-review apparatus 600 of this embodiment includes: an acquisition unit 601 configured to acquire the map to be reviewed and the standard map; a preprocessing unit 602 configured to preprocess the two maps into the line-element map to be reviewed and the standard line-element map; and a review unit 603 configured to input the two line-element maps into a map-review model trained by the apparatus 500 and output the similarity between the map to be reviewed and the standard map.
In this embodiment, for the specific processing of the units 601 through 603 of the map-review apparatus 600, reference may be made to steps 401 through 403 in the embodiment corresponding to fig. 4.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision and disclosure of users' personal information comply with the relevant laws and regulations and do not violate public order and good morals.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
An electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of flow 200 or 400.
A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of flow 200 or 400.
A computer program product comprising a computer program that when executed by a processor implements the method of flow 200 or 400.
Fig. 7 illustrates a schematic block diagram of an example electronic device 700 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the apparatus 700 includes a computing unit 701 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 may also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
Various components in device 700 are connected to I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, etc.; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, an optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The computing unit 701 may be any of a variety of general and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various specialized artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 701 performs the various methods and processes described above, such as the training method for the map-review model. For example, in some embodiments, the training method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 708. In some embodiments, part or all of the computer program may be loaded and/or installed onto device 700 via ROM 702 and/or communication unit 709. When the computer program is loaded into RAM 703 and executed by the computing unit 701, one or more steps of the training method described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured to perform the training method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, and which can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or a middleware component (e.g., an application server), or a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel or sequentially or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved, and are not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (18)

1. A training method for a drawing-examining model, comprising:
obtaining a to-be-examined sample map, a truth value map, and a labeled similarity between the to-be-examined sample map and the truth value map, wherein the truth value map is compiled according to the national boundary line drawing standards of China and other countries of the world;
extracting line elements from the to-be-examined sample map and the truth value map respectively, and generating a to-be-examined line element map and a truth value line element map from the extracted line elements, comprising: extracting the line elements from the to-be-examined sample map and the truth value map through a deep-learning segmentation model to obtain the to-be-examined line element map and the truth value line element map;
inputting the to-be-examined line element map and the truth value line element map into a self-attention-based feature extractor in an initial drawing-examining model to obtain an intermediate feature map of the to-be-examined sample map and an intermediate feature map of the truth value map;
inputting the intermediate feature map of the to-be-examined sample map and the intermediate feature map of the truth value map into a similarity comparison network in the initial drawing-examining model to obtain a predicted similarity, comprising: concatenating the intermediate feature map of the to-be-examined sample map and the intermediate feature map of the truth value map to obtain a complete feature map; flattening the complete feature map into a feature vector, and inputting the feature vector into a multi-layer perceptron in the similarity comparison network to obtain a perception vector; and inputting the perception vector into a Sigmoid function of the similarity comparison network to obtain the predicted similarity; and
adjusting network parameters of the self-attention-based feature extractor and/or the similarity comparison network in the initial drawing-examining model based on the difference between the predicted similarity and the labeled similarity.
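For orientation only, the following is a minimal PyTorch sketch of the similarity comparison network and one training update described in claim 1. All module names, tensor shapes, the hidden width of 256, and the MSE loss are illustrative assumptions, not details taken from the patent.

    # Hedged sketch of claim 1's similarity comparison network (assumed shapes).
    import torch
    import torch.nn as nn

    class SimilarityHead(nn.Module):
        """Concatenate two intermediate feature maps, flatten, score with an MLP."""
        def __init__(self, channels: int, height: int, width: int):
            super().__init__()
            flat_dim = 2 * channels * height * width  # two maps joined on the channel axis
            self.mlp = nn.Sequential(nn.Linear(flat_dim, 256), nn.ReLU(), nn.Linear(256, 1))

        def forward(self, feat_pending, feat_truth):
            complete = torch.cat([feat_pending, feat_truth], dim=1)  # complete feature map
            vector = complete.flatten(start_dim=1)                   # feature vector
            perception = self.mlp(vector)                            # perception vector
            return torch.sigmoid(perception).squeeze(-1)             # predicted similarity

    # One training step: push the prediction toward the labeled similarity.
    head = SimilarityHead(channels=64, height=16, width=16)
    optimizer = torch.optim.Adam(head.parameters(), lr=1e-4)
    feat_pending = torch.randn(4, 64, 16, 16)   # stand-ins for the extractor outputs
    feat_truth = torch.randn(4, 64, 16, 16)
    labeled_similarity = torch.rand(4)          # annotated similarity in [0, 1]
    loss = nn.functional.mse_loss(head(feat_pending, feat_truth), labeled_similarity)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

In a full training loop the same loss gradient could also reach the feature extractor (and, per claim 4, the segmentation model), since claim 1 allows adjusting either or both networks.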
2. The method of claim 1, wherein the extracting line elements from the to-be-examined sample map and the truth value map respectively, and generating a to-be-examined line element map and a truth value line element map from the extracted line elements, comprises:
extracting color information of the line elements from the to-be-examined sample map and the truth value map; and
separating the line elements in the to-be-examined sample map and the truth value map from the background by using an image segmentation technique based on the color information of the line elements, to obtain the to-be-examined line element map and the truth value line element map.
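As one hedged illustration of the color-based separation in claims 2 and 10, the sketch below thresholds a map image in HSV space with OpenCV. The red hue ranges and the file names are invented for the example; the patent does not fix particular colors or thresholds.

    # Illustrative color-threshold segmentation of line elements (assumed values).
    import cv2
    import numpy as np

    def extract_line_elements(map_bgr: np.ndarray) -> np.ndarray:
        """Keep only pixels whose color matches the (assumed) line-element hue."""
        hsv = cv2.cvtColor(map_bgr, cv2.COLOR_BGR2HSV)
        # Red wraps around the hue axis, so two ranges are combined.
        mask = cv2.bitwise_or(cv2.inRange(hsv, (0, 80, 80), (10, 255, 255)),
                              cv2.inRange(hsv, (170, 80, 80), (180, 255, 255)))
        return cv2.bitwise_and(map_bgr, map_bgr, mask=mask)

    pending_lines = extract_line_elements(cv2.imread("pending_map.png"))  # hypothetical files
    truth_lines = extract_line_elements(cv2.imread("truth_map.png"))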
3. The method of claim 2, wherein the method further comprises:
applying an image processing technique to reduce or remove noise and interference in the to-be-examined line element map and the truth value line element map.
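One plausible instance of such an image processing technique for claims 3 and 11, offered purely as a sketch: a median blur followed by a morphological opening. The kernel sizes are assumptions, not values from the patent.

    # Hedged denoising pass for a line element map (assumed kernel sizes).
    import cv2

    def denoise_line_element_map(line_map):
        smoothed = cv2.medianBlur(line_map, 3)  # suppress salt-and-pepper noise
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
        return cv2.morphologyEx(smoothed, cv2.MORPH_OPEN, kernel)  # remove small specks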
4. The method of claim 1, wherein the method further comprises:
adjusting network parameters of the deep-learning segmentation model based on the difference between the predicted similarity and the labeled similarity.
5. The method of claim 1, wherein the inputting the to-be-examined line element map and the truth value line element map into the self-attention-based feature extractor in the initial drawing-examining model to obtain the intermediate feature map of the to-be-examined sample map and the intermediate feature map of the truth value map comprises:
dividing the to-be-examined line element map and the truth value line element map into blocks of a fixed size, respectively, to obtain a to-be-examined sub-image set and a truth value sub-image set; and
performing global feature extraction on the to-be-examined sub-image set and the truth value sub-image set to obtain the intermediate feature map of the to-be-examined sample map and the intermediate feature map of the truth value map.
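A minimal sketch of the block-then-attend scheme of claims 5 and 13, in the style of a Vision Transformer; the patch size of 16, the embedding width, and the encoder depth are all assumptions for illustration.

    # Hedged patch-based self-attention extractor (assumed sizes throughout).
    import torch
    import torch.nn as nn

    class PatchSelfAttentionExtractor(nn.Module):
        def __init__(self, in_channels=3, patch=16, dim=128, depth=2, heads=4):
            super().__init__()
            # A non-overlapping convolution cuts the image into fixed-size blocks.
            self.patchify = nn.Conv2d(in_channels, dim, kernel_size=patch, stride=patch)
            layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

        def forward(self, x):
            tokens = self.patchify(x)                       # (B, dim, H/p, W/p)
            b, d, h, w = tokens.shape
            seq = tokens.flatten(2).transpose(1, 2)         # sub-image (patch) tokens
            seq = self.encoder(seq)                         # global self-attention
            return seq.transpose(1, 2).reshape(b, d, h, w)  # intermediate feature map

    extractor = PatchSelfAttentionExtractor()
    intermediate = extractor(torch.randn(1, 3, 224, 224))   # (1, 128, 14, 14)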
6. The method of claim 1, wherein the inputting the to-be-examined line element map and the truth value line element map into the self-attention-based feature extractor in the initial drawing-examining model to obtain the intermediate feature map of the to-be-examined sample map and the intermediate feature map of the truth value map comprises:
extracting a local feature map of the to-be-examined sample map and a local feature map of the truth value map from the to-be-examined line element map and the truth value line element map, respectively, based on a convolutional neural network; and
performing global feature extraction on the local feature map of the to-be-examined sample map and the local feature map of the truth value map, respectively, to obtain the intermediate feature map of the to-be-examined sample map and the intermediate feature map of the truth value map.
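A hedged sketch of the two-stage extractor of claims 6 and 14: a small CNN supplies local features, and a single self-attention layer then mixes them globally. Every layer size below is an assumption.

    # Hedged local-then-global extractor (assumed channel counts and strides).
    import torch
    import torch.nn as nn

    class LocalThenGlobalExtractor(nn.Module):
        def __init__(self, in_channels=3, dim=64):
            super().__init__()
            self.local = nn.Sequential(  # convolutional local feature map
                nn.Conv2d(in_channels, dim, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU())
            self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

        def forward(self, x):
            local = self.local(x)                   # (B, dim, H/4, W/4) local features
            b, d, h, w = local.shape
            seq = local.flatten(2).transpose(1, 2)  # (B, h*w, dim)
            out, _ = self.attn(seq, seq, seq)       # global self-attention over locations
            return out.transpose(1, 2).reshape(b, d, h, w)

    intermediate = LocalThenGlobalExtractor()(torch.randn(1, 3, 128, 128))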
7. The method of claim 1, wherein the flattening the complete feature map into a feature vector comprises:
convolving and pooling the complete feature map to obtain a compressed feature map; and
flattening the compressed feature map into the feature vector.
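Sketched below, with invented channel counts, is what the compress-then-flatten step of claims 7 and 15 could look like.

    # Hedged convolution + pooling before flattening (assumed sizes).
    import torch
    import torch.nn as nn

    compress = nn.Sequential(
        nn.Conv2d(128, 32, kernel_size=3, padding=1),  # shrink the channel count
        nn.ReLU(),
        nn.MaxPool2d(2))                               # halve the spatial resolution
    complete_map = torch.randn(4, 128, 16, 16)         # concatenated feature maps
    compressed = compress(complete_map)                # (4, 32, 8, 8)
    feature_vector = compressed.flatten(start_dim=1)   # (4, 2048), fed to the MLP

Compressing first keeps the flattened vector, and hence the perceptron's first weight matrix, far smaller than flattening the complete feature map directly.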
8. A drawing-examining method, comprising:
acquiring a to-be-examined map and a standard map;
extracting line elements from the to-be-examined map and the standard map respectively, and generating a to-be-examined line element map and a standard line element map from the extracted line elements; and
inputting the to-be-examined line element map and the standard line element map into a drawing-examining model trained according to the method of any one of claims 1-7, and outputting a drawing-examining result for the to-be-examined map and the standard map.
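An end-to-end inference sketch for claim 8, in the spirit of the earlier sketches; extract_lines and examine_model are placeholders supplied by the caller, not names from the patent.

    # Hedged inference pipeline for the drawing-examining method of claim 8.
    import torch

    def examine_map(pending_img, standard_img, extract_lines, examine_model):
        """Score how closely a to-be-examined map matches the standard map."""
        pending_lines = extract_lines(pending_img)    # line element map of the submission
        standard_lines = extract_lines(standard_img)  # line element map of the standard
        with torch.no_grad():                         # inference only, no gradients
            similarity = examine_model(pending_lines, standard_lines)
        return float(similarity)                      # e.g. compare against a pass threshold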
9. A training device for a drawing-examining model, comprising:
an acquisition unit configured to acquire a to-be-examined sample map, a truth value map, and a labeled similarity between the to-be-examined sample map and the truth value map, wherein the truth value map is compiled according to the national boundary line drawing standards of China and other countries of the world;
a preprocessing unit configured to extract line elements from the to-be-examined sample map and the truth value map respectively, and to generate a to-be-examined line element map and a truth value line element map from the extracted line elements, comprising: extracting the line elements from the to-be-examined sample map and the truth value map through a deep-learning segmentation model to obtain the to-be-examined line element map and the truth value line element map;
an extraction unit configured to input the to-be-examined line element map and the truth value line element map into a self-attention-based feature extractor in an initial drawing-examining model to obtain an intermediate feature map of the to-be-examined sample map and an intermediate feature map of the truth value map;
a computing unit configured to input the intermediate feature map of the to-be-examined sample map and the intermediate feature map of the truth value map into a similarity comparison network in the initial drawing-examining model to obtain a predicted similarity, comprising: concatenating the intermediate feature map of the to-be-examined sample map and the intermediate feature map of the truth value map to obtain a complete feature map; flattening the complete feature map into a feature vector, and inputting the feature vector into a multi-layer perceptron in the similarity comparison network to obtain a perception vector; and inputting the perception vector into a Sigmoid function of the similarity comparison network to obtain the predicted similarity; and
an adjustment unit configured to adjust network parameters of the self-attention-based feature extractor and/or the similarity comparison network in the initial drawing-examining model based on the difference between the predicted similarity and the labeled similarity.
10. The apparatus of claim 9, wherein the preprocessing unit is further configured to:
extract color information of the line elements from the to-be-examined sample map and the truth value map; and
separate the line elements in the to-be-examined sample map and the truth value map from the background by using an image segmentation technique based on the color information of the line elements, to obtain the to-be-examined line element map and the truth value line element map.
11. The apparatus of claim 10, wherein the apparatus further comprises a denoising unit configured to:
apply an image processing technique to reduce or remove noise and interference in the to-be-examined line element map and the truth value line element map.
12. The apparatus of claim 9, wherein the adjustment unit is further configured to:
adjust network parameters of the deep-learning segmentation model based on the difference between the predicted similarity and the labeled similarity.
13. The apparatus of claim 9, wherein the extraction unit is further configured to:
divide the to-be-examined line element map and the truth value line element map into blocks of a fixed size, respectively, to obtain a to-be-examined sub-image set and a truth value sub-image set; and
perform global feature extraction on the to-be-examined sub-image set and the truth value sub-image set to obtain the intermediate feature map of the to-be-examined sample map and the intermediate feature map of the truth value map.
14. The apparatus of claim 9, wherein the extraction unit is further configured to:
extract a local feature map of the to-be-examined sample map and a local feature map of the truth value map from the to-be-examined line element map and the truth value line element map, respectively, based on a convolutional neural network; and
perform global feature extraction on the local feature map of the to-be-examined sample map and the local feature map of the truth value map, respectively, to obtain the intermediate feature map of the to-be-examined sample map and the intermediate feature map of the truth value map.
15. The apparatus of claim 9, wherein the computing unit is further configured to:
convolve and pool the complete feature map to obtain a compressed feature map; and
flatten the compressed feature map into the feature vector.
16. A drawing-examining device, comprising:
an acquisition unit configured to acquire a to-be-examined map and a standard map;
a preprocessing unit configured to extract line elements from the to-be-examined map and the standard map respectively, and to generate a to-be-examined line element map and a standard line element map from the extracted line elements; and
an examining unit configured to input the to-be-examined line element map and the standard line element map into a drawing-examining model trained by the apparatus according to any one of claims 9-15, and to output a drawing-examining result for the to-be-examined map and the standard map.
17. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.
18. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are configured to cause a computer to perform the method of any one of claims 1-8.
CN202311268276.3A 2023-09-27 2023-09-27 Training method and training device for drawing-examining model and drawing-examining method and device Active CN117292395B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311268276.3A CN117292395B (en) 2023-09-27 2023-09-27 Training method and training device for drawing-examining model and drawing-examining method and device

Publications (2)

Publication Number Publication Date
CN117292395A (en) 2023-12-26
CN117292395B (en) 2024-05-24 (grant)

Family

ID=89258276

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311268276.3A Active CN117292395B (en) 2023-09-27 2023-09-27 Training method and training device for drawing-examining model and drawing-examining method and device

Country Status (1)

Country Link
CN (1) CN117292395B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105390062A (en) * 2015-10-28 2016-03-09 中国人民解放军信息工程大学 Vector national boundary position accuracy comparison verification method
CN113094459A (en) * 2021-04-21 2021-07-09 自然资源部地图技术审查中心 Map checking method and device
CN113971751A (en) * 2021-10-28 2022-01-25 北京百度网讯科技有限公司 Training feature extraction model, and method and device for detecting similar images
CN114168788A (en) * 2020-08-20 2022-03-11 京东科技控股股份有限公司 Audio audit processing method, device, equipment and storage medium
CN114743048A (en) * 2022-04-08 2022-07-12 黑龙江惠达科技发展有限公司 Method and device for detecting abnormal straw picture
CN114969332A (en) * 2022-05-18 2022-08-30 北京百度网讯科技有限公司 Method and device for training text audit model
CN115292538A (en) * 2021-11-11 2022-11-04 云南师范大学 Map line element extraction method based on deep learning
CN115443490A (en) * 2020-05-28 2022-12-06 深圳市欢太科技有限公司 Image auditing method and device, equipment and storage medium
CN115565038A (en) * 2022-09-19 2023-01-03 广州市网星信息技术有限公司 Content audit, content audit model training method and related device
CN116030036A (en) * 2023-02-16 2023-04-28 华润数字科技有限公司 Image difference detection method, model training method, system, equipment and medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10739951B2 (en) * 2013-09-06 2020-08-11 Knowledge Initiatives LLC Interactive user interfaces for electronic textbook implementations
US11587345B2 (en) * 2020-07-22 2023-02-21 Honda Motor Co., Ltd. Image identification device, method for performing semantic segmentation, and storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Effective person re-identification by self-attention model guided feature learning; Yang Li et al.; Knowledge-Based Systems; 2020-01-31; Vol. 187, pp. 1-11 *
Research on a map similarity matching method based on convolutional neural networks; Wang Zheng et al.; Science of Surveying and Mapping; 2022-12-31; Vol. 47, No. 7, pp. 169-175 *
Adaptive intelligent recognition of "problem maps" via multi-scale feature fusion based on convolutional neural networks; Ren Jiaxin; China Master's Theses Full-text Database, Basic Sciences; 2020-03-15; No. 3, p. A008-88 *
Construction and application services of a map review information platform based on submitted map categories; Bai Jinghui et al.; Land and Resources Informatization; 2020-12-31; No. 2, pp. 31-36 *

Also Published As

Publication number Publication date
CN117292395A (en) 2023-12-26

Similar Documents

Publication Publication Date Title
CN112949767B (en) Sample image increment, image detection model training and image detection method
CN113139543B (en) Training method of target object detection model, target object detection method and equipment
CN112929695B (en) Video duplicate removal method and device, electronic equipment and storage medium
CN113436100B (en) Method, apparatus, device, medium, and article for repairing video
CN112989995B (en) Text detection method and device and electronic equipment
JP7393472B2 (en) Display scene recognition method, device, electronic device, storage medium and computer program
CN112861575A (en) Pedestrian structuring method, device, equipment and storage medium
CN114022900A (en) Training method, detection method, device, equipment and medium for detection model
CN113378712B (en) Training method of object detection model, image detection method and device thereof
CN113239807B (en) Method and device for training bill identification model and bill identification
CN106980826A An action recognition method based on a neural network
CN114494815B (en) Neural network training method, target detection method, device, equipment and medium
CN113705361A (en) Method and device for detecting model in living body and electronic equipment
CN113191235B (en) Sundry detection method, sundry detection device, sundry detection equipment and storage medium
CN113011345B (en) Image quality detection method, image quality detection device, electronic equipment and readable storage medium
CN117292395B (en) Training method and training device for drawing-examining model and drawing-examining method and device
CN114724144B (en) Text recognition method, training device, training equipment and training medium for model
CN114972910B (en) Training method and device for image-text recognition model, electronic equipment and storage medium
CN110728316A (en) Classroom behavior detection method, system, device and storage medium
CN117275030B (en) Method and device for auditing map
CN114842541A (en) Model training and face recognition method, device, equipment and storage medium
CN115019321A (en) Text recognition method, text model training method, text recognition device, text model training equipment and storage medium
CN114842482A (en) Image classification method, device, equipment and storage medium
CN115019057A (en) Image feature extraction model determining method and device and image identification method and device
CN113378774A (en) Gesture recognition method, device, equipment, storage medium and program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant