CN116563795A - Doll production management method and doll production management system

Info

Publication number: CN116563795A
Application number: CN202310622946.0A
Authority: CN
Inventors: 梁琪琛
Original assignee: Beijing Tianyi Culture Media Co ltd
Current assignee: Beijing Tianyi Culture Media Co ltd
Priority date: 2023-05-30
Filing date: 2023-05-30
Publication date: 2023-08-08

Abstract

A doll production management method and system are disclosed. The method acquires a monitoring image of a doll to be detected, then uses deep learning and artificial intelligence technology to extract features from and classify the monitoring image, so as to automatically detect surface defects of the doll, improve the production efficiency and quality level of the doll, and protect the rights, interests and safety of consumers.

Description

Doll production management method and doll production management system

Technical Field

The present application relates to the field of intelligent production technology, and more particularly, to a method for managing production of a doll and a system thereof.

Background

With the growing market for toys, dolls have become a popular product whose production quality concerns both manufacturers and consumers. To ensure the safety, appearance and functionality of a doll, its production process needs to be monitored and controlled, in particular through quality inspection.

However, in doll production management, conventional quality inspection relies on manual checks, which are inefficient and have a high false detection rate. Thus, a solution is desired.

Disclosure of Invention

The present application has been made to solve the above technical problems. The embodiments of the application provide a doll production management method and system that acquire a monitoring image of a doll to be detected and use deep learning and artificial intelligence technology to extract features from and classify the monitoring image, so as to automatically detect surface defects of the doll, improve the production efficiency and quality level of the doll, and protect the rights, interests and safety of consumers.

In a first aspect, a doll production management method is provided, comprising: acquiring a monitoring image of a doll to be detected; performing image preprocessing on the monitoring image to obtain a preprocessed monitoring image, wherein the image preprocessing comprises gray level conversion, image standardization, CLAHE and gamma correction; performing image blocking processing on the preprocessed monitoring image to obtain a sequence of monitoring image blocks; passing the sequence of monitoring image blocks through a ViT model comprising an embedding layer to obtain a plurality of doll local image block feature vectors; arranging the plurality of doll local image block feature vectors into a doll global feature matrix and passing the doll global feature matrix through a bidirectional attention mechanism module to obtain a classification feature matrix; performing feature distribution optimization on the classification feature matrix to obtain an optimized classification feature matrix; and passing the optimized classification feature matrix through a classifier to obtain a classification result, wherein the classification result indicates whether the doll to be detected has surface defects.

In the above doll production management method, performing image blocking processing on the preprocessed monitoring image to obtain a sequence of monitoring image blocks includes: uniformly partitioning the preprocessed monitoring image to obtain the sequence of monitoring image blocks, wherein each monitoring image block in the sequence has the same size.

In the above doll production management method, passing the sequence of monitoring image blocks through a ViT model comprising an embedding layer to obtain a plurality of doll local image block feature vectors includes: performing vector embedding on each monitoring image block in the sequence of monitoring image blocks by using the embedding layer of the ViT model to obtain a sequence of image block embedding vectors; and inputting the sequence of image block embedding vectors into the Transformer of the ViT model to obtain the plurality of doll local image block feature vectors.

In the above doll production management method, inputting the sequence of image block embedding vectors into the Transformer of the ViT model to obtain the plurality of doll local image block feature vectors includes: arranging the sequence of image block embedding vectors one-dimensionally to obtain an image block global feature vector; calculating the product between the image block global feature vector and the transpose of each image block embedding vector in the sequence to obtain a plurality of self-attention association matrices; standardizing each of the plurality of self-attention association matrices to obtain a plurality of standardized self-attention association matrices; passing each of the standardized self-attention association matrices through a Softmax classification function to obtain a plurality of probability values; and weighting each image block embedding vector in the sequence with the corresponding probability value as its weight to obtain the plurality of doll local image block feature vectors.

In the above doll production management method, arranging the plurality of doll local image block feature vectors into a doll global feature matrix and passing it through the bidirectional attention mechanism module to obtain the classification feature matrix includes: pooling the doll global feature matrix along the horizontal direction and the vertical direction respectively to obtain a first pooling vector and a second pooling vector; performing association coding on the first pooling vector and the second pooling vector to obtain a bidirectional association matrix; inputting the bidirectional association matrix into a Sigmoid activation function to obtain a bidirectional association weight matrix; and calculating the position-wise multiplication between the bidirectional association weight matrix and the doll global feature matrix to obtain the classification feature matrix.

In the above doll production management method, performing feature distribution optimization on the classification feature matrix to obtain an optimized classification feature matrix includes: converting the classification characteristic matrix into a square matrix through linear transformation, wherein the number of rows and the number of columns of the square matrix are the same; and carrying out vector spectral clustering agent learning fusion optimization on the square matrix to obtain the optimized classification feature matrix.

In the above doll production management method, performing vector spectral clustering agent learning fusion optimization on the square matrix to obtain the optimized classification feature matrix includes: performing the vector spectral clustering agent learning fusion optimization on the square matrix according to the following optimization formula to obtain the optimized classification feature matrix; wherein the optimization formula is: [formula missing from the source text], wherein $M'$ is the optimized classification feature matrix, $M$ is the square matrix, $V_i$ denotes the individual row feature vectors of the square matrix, $D$ is the distance matrix consisting of the distances between every two corresponding row feature vectors of the square matrix, $M^{T}$ is the transpose of the square matrix, $D^{T}$ is the transpose of the distance matrix, $d(V_i, V_j)$ denotes the Euclidean distance between the individual row feature vectors of the square matrix, $\exp(\cdot)$ denotes the exponential operation of a matrix, that is, the natural exponential function raised to the power of the feature value at each position of the matrix, and $\odot$ and $\oplus$ denote position-wise multiplication and matrix addition, respectively.

In the above doll production management method, passing the optimized classification feature matrix through a classifier to obtain a classification result, wherein the classification result indicates whether the doll to be detected has surface defects, includes: unfolding the optimized classification feature matrix into a classification feature vector by row vectors or column vectors; performing fully connected encoding on the classification feature vector by using a plurality of fully connected layers of the classifier to obtain an encoded classification feature vector; and passing the encoded classification feature vector through the Softmax classification function of the classifier to obtain the classification result.

In a second aspect, a doll production management system is provided, comprising: an image acquisition module for acquiring a monitoring image of a doll to be detected; a preprocessing module for performing image preprocessing on the monitoring image to obtain a preprocessed monitoring image, wherein the image preprocessing comprises gray level conversion, image standardization, CLAHE and gamma correction; an image blocking processing module for performing image blocking processing on the preprocessed monitoring image to obtain a sequence of monitoring image blocks; an embedding coding module for passing the sequence of monitoring image blocks through a ViT model comprising an embedding layer to obtain a plurality of doll local image block feature vectors; a bidirectional attention module for arranging the plurality of doll local image block feature vectors into a doll global feature matrix and passing it through a bidirectional attention mechanism module to obtain a classification feature matrix; an optimizing module for performing feature distribution optimization on the classification feature matrix to obtain an optimized classification feature matrix; and a detection result generation module for passing the optimized classification feature matrix through a classifier to obtain a classification result, wherein the classification result indicates whether the doll to be detected has surface defects.

In the above doll production management system, the image blocking processing module is used for: uniformly partitioning the preprocessed monitoring image to obtain the sequence of monitoring image blocks, wherein each monitoring image block in the sequence has the same size.

Compared with the prior art, the doll production management method and system provided by the application acquire a monitoring image of the doll to be detected and use deep learning and artificial intelligence technology to extract features from and classify the monitoring image, so as to automatically detect surface defects of the doll, improve the production efficiency and quality level of the doll, and protect the rights, interests and safety of consumers.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments or the description of the prior art will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 is a schematic view of an application scenario of a doll production management method according to an embodiment of the present application.

Fig. 2 is a flowchart of a doll production management method according to an embodiment of the present application.

Fig. 3 is a schematic diagram of the architecture of a doll production management method according to an embodiment of the present application.

Fig. 4 is a flowchart of the sub-steps of step 140 in a doll production management method according to an embodiment of the present application.

Fig. 5 is a flowchart of the sub-steps of step 142 in a doll production management method according to an embodiment of the present application.

Fig. 6 is a flowchart of the sub-steps of step 150 in a doll production management method according to an embodiment of the present application.

Fig. 7 is a flowchart of the sub-steps of step 160 in a doll production management method according to an embodiment of the present application.

Fig. 8 is a flowchart of the sub-steps of step 170 in a doll production management method according to an embodiment of the present application.

Fig. 9 is a block diagram of a doll production management system according to an embodiment of the present application.

Detailed Description

The following description of the technical solutions in the embodiments of the present application will be made with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.

Unless defined otherwise, all technical and scientific terms used in the examples of this application have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in the present application is for the purpose of describing particular embodiments only and is not intended to limit the scope of the present application.

In the description of the embodiments of the present application, unless otherwise indicated and defined, the term "connected" should be construed broadly, and for example, may be an electrical connection, may be a communication between two elements, may be a direct connection, or may be an indirect connection via an intermediary, and it will be understood by those skilled in the art that the specific meaning of the term may be understood according to the specific circumstances.

It should be noted that the terms "first", "second" and "third" in the embodiments of the present application merely distinguish similar objects and do not represent a specific order of those objects. It is to be understood that the objects so distinguished may be interchanged in order or sequence where appropriate, such that the embodiments of the present application described herein may be implemented in sequences other than those illustrated or described herein.

To address the above technical problems, the technical concept of the present application is to use deep learning and artificial intelligence technology to extract features from and classify the monitoring image of the doll, so as to automatically detect surface defects of the doll, improve the production efficiency and quality level of the doll, and protect the rights, interests and safety of consumers.

Specifically, in the technical scheme of the application, firstly, a monitoring image of a doll to be detected is obtained. Here, the monitoring image may be used as basic data for doll quality inspection, thereby effectively analyzing and recognizing the doll surface.

In order to improve the quality and analyzability of the image, in the technical scheme of the application, the monitoring image is subjected to image preprocessing to obtain a preprocessed monitoring image, wherein the image preprocessing comprises gray level conversion, image standardization, CLAHE and gamma correction. Specifically, gray level conversion converts a color image into a gray image, reducing the dimension and complexity of the image while retaining its main information. Image standardization scales the pixel values of the gray image to the [0,1] interval, so as to eliminate brightness differences between different images and enhance the contrast of the image. CLAHE (contrast-limited adaptive histogram equalization) enhances the contrast of the image while avoiding the noise amplification and detail loss caused by excessive enhancement. Gamma correction adjusts the brightness of the image according to the perception characteristics of the human eye: it strengthens the details of dark areas and suppresses the details of bright areas, so that the image is clearer and more natural.
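The preprocessing chain described above can be sketched as follows. This is a minimal NumPy-only illustration, not the implementation of the application: the function names, the BT.601 luma weights, min-max scaling as the form of standardization, and the gamma value of 0.8 are all assumptions. CLAHE is left out of the sketch to keep it dependency-free; in practice it would typically come from an image library, for example OpenCV's `cv2.createCLAHE`.

```python
import numpy as np

def to_gray(rgb):
    """Convert an H x W x 3 RGB image to a single-channel gray image."""
    # ITU-R BT.601 luma weights: a common choice, assumed here.
    weights = np.array([0.299, 0.587, 0.114])
    return rgb.astype(np.float64) @ weights

def normalize01(gray):
    """Min-max scale pixel values into the [0, 1] interval."""
    lo, hi = gray.min(), gray.max()
    if hi == lo:  # constant image: avoid division by zero
        return np.zeros_like(gray, dtype=np.float64)
    return (gray - lo) / (hi - lo)

def gamma_correct(img01, gamma=0.8):
    """Power-law gamma correction for an image already in [0, 1]."""
    # gamma < 1 lifts dark regions; gamma > 1 darkens them.
    return np.power(img01, gamma)

def preprocess(rgb, gamma=0.8):
    """Gray conversion -> [0, 1] scaling -> gamma correction."""
    gray = to_gray(rgb)
    norm = normalize01(gray)
    # CLAHE would be applied at this point; omitted in this sketch.
    return gamma_correct(norm, gamma)

# Illustrative input: a random 8 x 8 RGB image.
sample = np.random.default_rng(0).integers(0, 256, size=(8, 8, 3))
result = preprocess(sample)
```

The output keeps the spatial size of the input, drops the channel axis, and stays within [0, 1].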

Then, image blocking processing is performed on the preprocessed monitoring image to obtain a sequence of monitoring image blocks. That is, the preprocessed monitoring image is divided into a plurality of small image blocks. This improves the efficiency of image processing and deep learning analysis, reduces computation and memory usage, and lowers the complexity and cost of the system.

The sequence of monitoring image blocks is then passed through a ViT model containing an embedding layer to obtain a plurality of doll local image block feature vectors. The ViT (Vision Transformer) model applies the Transformer to image processing. Its advantage is that long-range dependencies between image blocks can be captured using the Transformer's self-attention mechanism. Specifically, the ViT model contains an embedding layer that maps each image block in the sequence into a fixed-length vector, which is then fed into the Transformer's encoder. The function of the embedding layer is to convert the original image block data into a form suitable for Transformer processing while preserving the information of the image block.
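The role of the embedding layer can be illustrated with a short sketch: each image block is flattened and projected linearly to a fixed-length vector. The function name `embed_blocks`, the block size, the embedding dimension and the random (untrained) weights are assumptions made only for illustration; the application does not specify them.

```python
import numpy as np

def embed_blocks(blocks, weight, bias):
    """Map each image block to a fixed-length embedding vector.

    blocks: (N, block_h, block_w) array of image blocks
    weight: (block_h * block_w, d) projection matrix
    bias:   (d,) offset vector
    """
    n = blocks.shape[0]
    flat = blocks.reshape(n, -1)   # flatten every block to a 1-D vector
    return flat @ weight + bias    # linear projection to dimension d

rng = np.random.default_rng(0)
blocks = rng.random((16, 8, 8))    # 16 blocks of 8 x 8 pixels (illustrative)
d_model = 32                       # assumed embedding dimension
weight = rng.normal(scale=0.02, size=(8 * 8, d_model))
bias = np.zeros(d_model)

tokens = embed_blocks(blocks, weight, bias)
```

Every block yields a vector of the same length regardless of its pixel content, which is what allows the sequence to be fed into the Transformer encoder.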

Considering that surface defects may be distributed at different parts of the doll, in the technical scheme of the application, the plurality of doll local image block feature vectors are arranged into a doll global feature matrix, which is then passed through a bidirectional attention mechanism module to obtain a classification feature matrix. Arranging the doll local image block feature vectors into the doll global feature matrix represents the overall information of the doll, and the bidirectional attention mechanism module uses the idea of self-attention to strengthen the correlation between local regions. Specifically, the bidirectional attention module calibrates the attention weights of the whole feature matrix from the horizontal direction and the vertical direction respectively and captures complex feature relations, so that local feature information can be fully obtained from the global spatial features.
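A minimal sketch of such a bidirectional attention step is given below. It follows the four operations named in the disclosure (pooling in two directions, association coding, Sigmoid activation, position-wise multiplication), but two details are assumptions because the text does not fix them: mean pooling is used, and the association coding is taken to be the outer product of the two pooled vectors.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bidirectional_attention(feature_matrix):
    """Reweight a (rows, cols) feature matrix using row and column statistics."""
    # Pool along the horizontal direction: one value per row (mean pooling assumed).
    row_pooled = feature_matrix.mean(axis=1)          # shape (rows,)
    # Pool along the vertical direction: one value per column.
    col_pooled = feature_matrix.mean(axis=0)          # shape (cols,)
    # Association coding, assumed here to be an outer product.
    association = np.outer(row_pooled, col_pooled)    # shape (rows, cols)
    # Sigmoid turns the association matrix into weights in (0, 1).
    weights = sigmoid(association)
    # Position-wise multiplication with the original feature matrix.
    return weights * feature_matrix

features = np.ones((3, 4))
attended = bidirectional_attention(features)
```

Because the weight matrix has the same shape as the input, each position of the global feature matrix is scaled by a factor that depends on both its row and its column.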

Further, the classification feature matrix is passed through a classifier to obtain a classification result, wherein the classification result indicates whether the doll to be detected has surface defects. The classifier learns the feature distributions of normal and abnormal doll samples from a training data set, so that a new doll to be detected (that is, an input classification feature matrix) can be classified. In the technical scheme of the application, the classifier has two classification labels, namely "the doll to be detected has surface defects" and "the doll to be detected does not have surface defects", and the classifier determines which label the classification feature matrix belongs to by means of a Softmax function. In this way, normal dolls are distinguished from dolls with surface defects, automatic detection of doll surface defects is achieved, the production efficiency and quality level of the doll are improved, and the rights, interests and safety of consumers are protected.
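The classifier stage can be sketched as a small fully connected network followed by Softmax. The layer sizes, the ReLU activation and the random (untrained) weights below are assumptions for illustration only; a real classifier would be trained on labeled doll images before its output means anything.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                 # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def classify(feature_matrix, layers):
    """layers: list of (weight, bias) pairs; the last layer outputs 2 logits."""
    x = feature_matrix.reshape(-1)          # unfold the matrix row by row
    for w, b in layers[:-1]:
        x = np.maximum(x @ w + b, 0.0)      # fully connected layer + ReLU (assumed)
    w, b = layers[-1]
    probs = softmax(x @ w + b)              # probabilities over the two labels
    labels = ("has surface defects", "no surface defects")
    return labels[int(np.argmax(probs))], probs

rng = np.random.default_rng(0)
layers = [
    (rng.normal(scale=0.1, size=(16, 8)), np.zeros(8)),
    (rng.normal(scale=0.1, size=(8, 2)), np.zeros(2)),
]
label, probs = classify(rng.random((4, 4)), layers)
```

The two probabilities sum to one, and the label with the larger probability is reported as the classification result.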

In the technical scheme of the application, the classification feature matrix is obtained by arranging the plurality of doll local image block feature vectors into the doll global feature matrix and passing it through the bidirectional attention mechanism module. A problem arises here: when the convolutional neural network model containing the bidirectional attention mechanism extracts features based on attention weights in the row and column directions, the internal image feature semantics of each doll local image block feature vector become mixed with synthetic noise features introduced by the arrangement of the feature vectors during splicing, which degrades the overall feature distribution expression of the classification feature matrix.

Therefore, the classification feature matrix is first converted by linear transformation into a square matrix, that is, a matrix with the same number of rows and columns. Vector spectral clustering agent learning fusion optimization is then performed on the square matrix, denoted for example as $M$, to obtain an optimized classification feature matrix, denoted for example as $M'$. Here, the optimized classification feature matrix $M'$ is given by: [formula missing from the source text], wherein $V_i$ denotes the row feature vectors of the square matrix $M$, and $D$ is the distance matrix formed from the distances between the respective row feature vectors.

Here, when features are extracted after the doll local image block feature vectors are spliced into the doll global feature matrix, the internal image feature semantics of each doll local image block feature vector become mixed with the synthetic noise features, which blurs the boundary between meaningful quasi-regression image semantic features and noise features. The vector spectral clustering agent learning fusion optimization addresses this by introducing spectral clustering agent learning to represent the spatial layout and semantic similarity among the doll local image block feature vectors. It uses the conceptual information on the association between the quasi-regression semantic features and the quasi-regression scene to perform implicitly supervised propagation of the latent association attributes among the doll local image block feature vectors. This improves the overall distribution dependency of the doll global feature matrix as a synthesized feature, improves its overall feature distribution expression, and thereby improves the accuracy of the classification result obtained by the classifier.
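The individual operations that the optimization relies on are well defined even though the optimization formula itself is not legible in this text, so they can be sketched on their own: conversion to a square matrix, the pairwise Euclidean distance matrix between row feature vectors, and the position-wise matrix exponential. The sketch deliberately does not combine them into the optimized matrix. Right-multiplication by a projection matrix as the linear transformation, and all function names, are assumptions.

```python
import numpy as np

def to_square(feature_matrix, projection):
    """Linear transformation (rows, cols) @ (cols, rows) -> (rows, rows)."""
    return feature_matrix @ projection

def pairwise_row_distances(square):
    """D[i, j] = Euclidean distance between row vector i and row vector j."""
    diff = square[:, None, :] - square[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def positionwise_exp(matrix):
    """Natural exponential of the value at each position of the matrix."""
    return np.exp(matrix)

rng = np.random.default_rng(0)
square = to_square(rng.random((4, 6)), rng.random((6, 4)))
distance = pairwise_row_distances(square)

# Small hand-checkable case: rows (0, 0) and (3, 4) are 5 apart.
small = pairwise_row_distances(np.array([[0.0, 0.0], [3.0, 4.0]]))
```

The distance matrix is symmetric with a zero diagonal, so it is equal to its own transpose.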

The application has the following technical effects: 1. a doll production management scheme, more particularly an automated doll surface defect detection scheme, is provided. 2. The scheme can realize automatic detection of the surface defects of the doll, improve the efficiency and accuracy of doll production management, reduce labor cost and resource consumption, and improve the quality and market competitiveness of the doll.

Fig. 1 is a schematic view of an application scenario of a doll production management method according to an embodiment of the present application. As shown in fig. 1, in this application scenario, first, a monitoring image (e.g., C as illustrated in fig. 1) of a doll to be detected (e.g., M as illustrated in fig. 1) is acquired; the acquired monitoring image is then input into a server (e.g., S as illustrated in fig. 1) on which a doll production management algorithm is deployed, wherein the server processes the monitoring image based on the doll production management algorithm to generate a classification result indicating whether the doll to be detected has surface defects.

Having described the basic principles of the present application, various non-limiting embodiments of the present application will now be described in detail with reference to the accompanying drawings.

In one embodiment of the present application, fig. 2 is a flowchart of a doll production management method according to an embodiment of the present application. As shown in fig. 2, a doll production management method 100 according to an embodiment of the present application includes: 110, acquiring a monitoring image of the doll to be detected; 120, performing image preprocessing on the monitoring image to obtain a preprocessed monitoring image, wherein the image preprocessing comprises gray level conversion, image standardization, CLAHE and gamma correction; 130, performing image blocking processing on the preprocessed monitoring image to obtain a sequence of monitoring image blocks; 140, passing the sequence of monitoring image blocks through a ViT model comprising an embedding layer to obtain a plurality of doll local image block feature vectors; 150, arranging the plurality of doll local image block feature vectors into a doll global feature matrix and passing it through a bidirectional attention mechanism module to obtain a classification feature matrix; 160, performing feature distribution optimization on the classification feature matrix to obtain an optimized classification feature matrix; and 170, passing the optimized classification feature matrix through a classifier to obtain a classification result, wherein the classification result indicates whether the doll to be detected has surface defects.

Fig. 3 is a schematic diagram of the architecture of a doll production management method according to an embodiment of the present application. As shown in fig. 3, in the network architecture, first, a monitoring image of a doll to be detected is acquired; then, image preprocessing is performed on the monitoring image to obtain a preprocessed monitoring image, wherein the image preprocessing comprises gray level conversion, image standardization, CLAHE and gamma correction; next, image blocking processing is performed on the preprocessed monitoring image to obtain a sequence of monitoring image blocks; the sequence of monitoring image blocks is then passed through a ViT model comprising an embedding layer to obtain a plurality of doll local image block feature vectors; the plurality of doll local image block feature vectors are arranged into a doll global feature matrix, which is passed through a bidirectional attention mechanism module to obtain a classification feature matrix; feature distribution optimization is performed on the classification feature matrix to obtain an optimized classification feature matrix; and finally, the optimized classification feature matrix is passed through a classifier to obtain a classification result, wherein the classification result indicates whether the doll to be detected has surface defects.

Specifically, in step 110, a monitoring image of the doll to be detected is acquired. The monitoring image serves as the basic data for doll quality inspection, on which the deep-learning-based feature extraction and classification are performed to automatically detect surface defects of the doll.

Specifically, in step 120, image preprocessing is performed on the monitoring image to obtain a preprocessed monitoring image, wherein the image preprocessing includes gray level conversion, image standardization, CLAHE and gamma correction. The preprocessing improves the quality and analyzability of the image; the role of each of the four operations is as described above.

Specifically, in step 130, image blocking processing is performed on the preprocessed monitoring image to obtain a sequence of monitoring image blocks. That is, the preprocessed monitoring image is divided into a plurality of small image blocks, which improves the efficiency of image processing and deep learning analysis, reduces computation and memory usage, and lowers the complexity and cost of the system.

Performing image blocking processing on the preprocessed monitoring image to obtain a sequence of monitoring image blocks includes: uniformly partitioning the preprocessed monitoring image to obtain the sequence of monitoring image blocks, wherein each monitoring image block in the sequence has the same size.
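Uniform partitioning into equally sized blocks can be sketched with a reshape, as below. The function name `split_into_blocks` and the requirement that the image size be an exact multiple of the block size are assumptions of this sketch; the application only requires that all blocks have the same size.

```python
import numpy as np

def split_into_blocks(image, block_h, block_w):
    """Split an (H, W) image into a row-major sequence of equal blocks."""
    h, w = image.shape
    if h % block_h or w % block_w:
        raise ValueError("image size must be a multiple of the block size")
    rows, cols = h // block_h, w // block_w
    # (rows, block_h, cols, block_w) -> (rows, cols, block_h, block_w)
    blocks = image.reshape(rows, block_h, cols, block_w).swapaxes(1, 2)
    # Flatten the grid of blocks into a sequence of length rows * cols.
    return blocks.reshape(rows * cols, block_h, block_w)

image = np.arange(16).reshape(4, 4)
blocks = split_into_blocks(image, 2, 2)
```

For the 4 x 4 example, the sequence contains four 2 x 2 blocks ordered left to right, top to bottom, so the first block holds the values 0, 1, 4 and 5.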

Specifically, in step 140, the sequence of monitoring image blocks is passed through a ViT model containing an embedding layer to obtain a plurality of doll local image block feature vectors. As described above, the ViT model applies the Transformer to image processing and uses its self-attention mechanism to capture long-range dependencies between image blocks; the embedding layer maps each image block into a fixed-length vector that is fed into the Transformer encoder while preserving the information of the image block.

Fig. 4 is a flowchart of the sub-steps of step 140 in a doll production management method according to an embodiment of the present application. As shown in fig. 4, passing the sequence of monitoring image blocks through a ViT model comprising an embedding layer to obtain a plurality of doll local image block feature vectors includes: 141, performing vector embedding on each monitoring image block in the sequence of monitoring image blocks by using the embedding layer of the ViT model to obtain a sequence of image block embedding vectors; and 142, inputting the sequence of image block embedding vectors into the Transformer of the ViT model to obtain the plurality of doll local image block feature vectors.

Fig. 5 is a flowchart of the sub-steps of step 142 in the doll production management method according to an embodiment of the present application. As shown in Fig. 5, inputting the sequence of image block embedding vectors into the Transformer of the ViT model to obtain the plurality of doll local image block feature vectors comprises: 1421, arranging the sequence of image block embedding vectors one-dimensionally to obtain an image block global feature vector; 1422, calculating the product between the image block global feature vector and the transpose of each image block embedding vector in the sequence of image block embedding vectors to obtain a plurality of self-attention correlation matrices; 1423, normalizing each self-attention correlation matrix in the plurality of self-attention correlation matrices to obtain a plurality of normalized self-attention correlation matrices; 1424, passing each normalized self-attention correlation matrix in the plurality of normalized self-attention correlation matrices through a Softmax classification function to obtain a plurality of probability values; and 1425, weighting each image block embedding vector in the sequence of image block embedding vectors with the corresponding probability value in the plurality of probability values as a weight to obtain the plurality of doll local image block feature vectors.
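Sub-steps 1421 to 1425 resemble the standard scaled dot-product self-attention of the Transformer: products with transposed embeddings, a normalization, a Softmax, and a weighted sum. The sketch below implements that standard mechanism, not necessarily the exact computation of the application. As simplifying assumptions, the query, key and value projections are the identity and the normalization is the usual division by the square root of the embedding dimension.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Single-head scaled dot-product self-attention.

    Assumption: identity query/key/value projections, so each output is a
    softmax-weighted average of the input embeddings.
    """
    dim = len(embeddings[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in embeddings:
        # Similarity of this block to every block, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in embeddings]
        weights = softmax(scores)
        # Weight every embedding by its probability and sum them.
        outputs.append([sum(w * e[j] for w, e in zip(weights, embeddings))
                        for j in range(dim)])
    return outputs


block_features = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Because the weights for each block sum to one, every output feature vector is a convex combination of the embeddings, so a block can draw information from any other block in the image.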

It should be understood that since Google proposed the Transformer architecture in 2017, it has rapidly gained widespread adoption. In the NLP field, the self-attention mechanism replaced the recurrent neural network structures conventionally used to process sequence data, which not only enables parallel training and improves training efficiency but also yields good results in application. In NLP, a sequence is input into the Transformer; in the vision field, however, one must consider how to convert a 2D picture into a 1D sequence. The most intuitive idea is to feed the pixels of the picture into the Transformer directly, but the complexity is too high.

The ViT model reduces the input complexity: the picture is cut into image blocks, each image block is projected into a fixed-length vector and fed into the Transformer, and the subsequent encoder operations are identical to those of the original Transformer. Because the task is image classification, a special token is added to the input sequence, and the output corresponding to that token is the final class prediction. ViT performs very well on many vision tasks, but its lack of inductive bias makes it, compared with a CNN (Convolutional Neural Network), heavily dependent on model regularization and data augmentation when applied to small data sets.
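The difference in complexity between pixel-level and block-level input can be made concrete with a short calculation. The 224-pixel image side and 16-pixel block side below are assumed example values, not figures from the application; self-attention compares every token with every other token, so its cost grows with the square of the sequence length.

```python
def attention_pairs(sequence_length):
    """Number of token pairs compared by self-attention (quadratic)."""
    return sequence_length ** 2


image_side = 224  # assumed example image side, in pixels
block_side = 16   # assumed example block side, in pixels

# One token per pixel versus one token per image block.
pixel_tokens = image_side * image_side
block_tokens = (image_side // block_side) ** 2

# How many times fewer pairwise comparisons the block sequence needs.
reduction = attention_pairs(pixel_tokens) // attention_pairs(block_tokens)
```

Under these assumed sizes the sequence shrinks from 50176 tokens to 196, and the number of pairwise comparisons drops by a factor of 65536.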

Specifically, in step 150, the plurality of doll local image block feature vectors are arranged into a doll global feature matrix, which is then passed through a bidirectional attention mechanism module to obtain a classification feature matrix. This design takes into account that the surface defects of the doll may be distributed over different parts of the doll. Arranging the doll local image block feature vectors into a doll global feature matrix represents the overall information of the doll, and the bidirectional attention mechanism module uses the concept of self-attention to enhance the correlation between local regions. Specifically, the bidirectional attention module calibrates the attention weights of the whole feature matrix from the horizontal and vertical directions respectively and captures complex feature relationships, so that local feature information can be obtained completely from the global spatial features.

Fig. 6 is a flowchart of the sub-steps of step 150 in the doll production management method according to an embodiment of the present application. As shown in Fig. 6, arranging the plurality of doll local image block feature vectors into a doll global feature matrix and then passing it through the bidirectional attention mechanism module to obtain the classification feature matrix includes: 151, pooling the doll global feature matrix along the horizontal direction and the vertical direction respectively to obtain a first pooling vector and a second pooling vector; 152, performing association coding on the first pooling vector and the second pooling vector to obtain a bidirectional association matrix; 153, inputting the bidirectional association matrix into a Sigmoid activation function to obtain a bidirectional association weight matrix; and 154, calculating the position-wise multiplication between the bidirectional association weight matrix and the doll global feature matrix to obtain the classification feature matrix.
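Sub-steps 151 to 154 can be sketched as follows. Two choices here are assumptions, since the text does not fix them: the pooling is taken to be mean pooling, and the association coding is taken to be the outer product of the two pooling vectors, which yields a matrix of the same shape as the feature matrix. The function names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def bidirectional_attention(feature_matrix):
    """Reweight a feature matrix using row-wise and column-wise pooling."""
    rows, cols = len(feature_matrix), len(feature_matrix[0])
    # 151: pool along the horizontal direction (one mean per row) and
    # along the vertical direction (one mean per column). Mean is assumed.
    row_pool = [sum(row) / cols for row in feature_matrix]
    col_pool = [sum(feature_matrix[r][c] for r in range(rows)) / rows
                for c in range(cols)]
    # 152: association coding, assumed here to be an outer product.
    association = [[rp * cp for cp in col_pool] for rp in row_pool]
    # 153: Sigmoid turns the association matrix into weights in (0, 1).
    weights = [[sigmoid(v) for v in row] for row in association]
    # 154: position-wise multiplication with the original feature matrix.
    return [[weights[r][c] * feature_matrix[r][c] for c in range(cols)]
            for r in range(rows)]


classification_features = bidirectional_attention([[1.0, 1.0], [1.0, 1.0]])
```

The output keeps the shape of the input, and each entry is scaled by a weight that depends on both its row and its column, which is how the two directions are combined.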

The attention mechanism is a data processing method in machine learning and is widely applied in machine learning tasks such as natural language processing, image recognition and speech recognition. On the one hand, the attention mechanism lets the network automatically learn which places in a picture or text sequence need attention; on the other hand, the attention mechanism generates a mask through neural network operations, and the values on the mask are the weights. In general, the spatial attention mechanism calculates the average value over the different channels of the same pixel point and then obtains spatial features through convolution and up-sampling operations, so that the pixels of each layer of the spatial features are given different weights.

Specifically, in step 160, the feature distribution of the classification feature matrix is optimized to obtain an optimized classification feature matrix. Fig. 7 is a flowchart of a sub-step of step 160 in the doll production management method according to an embodiment of the present application, as shown in fig. 7, performing feature distribution optimization on the classification feature matrix to obtain an optimized classification feature matrix, including: 161, converting the classification feature matrix into a square matrix through linear transformation, wherein the number of rows and the number of columns of the square matrix are the same; and 162, performing vector spectral clustering agent learning fusion optimization on the square matrix to obtain the optimized classification feature matrix.
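Sub-step 161 can be illustrated with a plain matrix product. In this sketch, an n x d classification feature matrix is multiplied by a d x n projection matrix to give an n x n square matrix; the projection values are arbitrary placeholders for learned parameters, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def to_square(feature_matrix, projection):
    """Linearly transform an n x d matrix into an n x n square matrix.

    Assumption: projection has shape d x n; in practice it would be learned.
    """
    rows, cols = len(feature_matrix), len(feature_matrix[0])
    if len(projection) != cols or len(projection[0]) != rows:
        raise ValueError("projection must have shape (columns x rows)")
    return matmul(feature_matrix, projection)


square = to_square([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
                   [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Here a 2 x 3 matrix becomes the 2 x 2 matrix `[[4.0, 5.0], [10.0, 11.0]]`, so the number of rows and the number of columns agree as sub-step 161 requires.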

Therefore, the classification feature matrix is first converted by a linear transformation into a square matrix, i.e. a matrix whose number of rows equals its number of columns. Vector spectral clustering agent learning fusion optimization is then performed on the square matrix according to an optimization formula to obtain the optimized classification feature matrix. [The optimization formula and its symbols are not reproduced in the source text.] The quantities appearing in the formula are: the optimized classification feature matrix; the square matrix; the individual row feature vectors of the square matrix; a distance matrix consisting of the distances between every two corresponding row feature vectors of the square matrix; the transpose of the square matrix; the transpose of the distance matrix; the Euclidean distance between the individual row feature vectors of the square matrix; the exponential operation of a matrix, which computes the natural exponential function raised to the power of the feature value at each position of the matrix; and position-wise multiplication and matrix addition.
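The operations named in these definitions can be sketched individually. The code below shows only the building blocks (pairwise Euclidean distances between row vectors, the position-wise exponential, position-wise multiplication and matrix addition); it does not attempt to reproduce the optimization formula itself, and all function names are hypothetical.

```python
import math


def row_distance_matrix(square):
    """Euclidean distance between every pair of row feature vectors."""
    return [[math.dist(u, v) for v in square] for u in square]


def elementwise_exp(matrix):
    """Natural exponential of the feature value at each position."""
    return [[math.exp(v) for v in row] for row in matrix]


def transpose(matrix):
    """Swap rows and columns."""
    return [list(col) for col in zip(*matrix)]


def hadamard(a, b):
    """Position-wise multiplication of two equally shaped matrices."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matrix_add(a, b):
    """Position-wise addition of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


distances = row_distance_matrix([[0.0, 0.0], [3.0, 4.0]])
```

For the two rows (0, 0) and (3, 4) the distance matrix is `[[0.0, 5.0], [5.0, 0.0]]`; it is symmetric with a zero diagonal, so it equals its own transpose.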

Specifically, in step 170, the optimized classification feature matrix is passed through a classifier to obtain a classification result, where the classification result is used to indicate whether the doll to be detected has a surface defect. The classifier learns the feature distributions of normal and abnormal doll samples from a training data set, so that a new doll to be detected (namely, an input classification feature matrix) can be classified. In the technical scheme of the application, the classifier has two classification labels, namely "the doll to be detected has surface defects" and "the doll to be detected does not have surface defects", and the classifier determines which classification label the classification feature matrix belongs to through a Softmax function. In this way, normal dolls are distinguished from dolls with surface defects, automatic detection of the surface defects of the dolls is achieved, the production efficiency and quality level of the dolls are improved, and the rights, interests and safety of consumers are protected.

Fig. 8 is a flowchart of the sub-steps of step 170 in the doll production management method according to an embodiment of the present application. As shown in Fig. 8, passing the optimized classification feature matrix through a classifier to obtain a classification result, where the classification result is used to indicate whether the doll to be detected has a surface defect, includes: 171, expanding the optimized classification feature matrix into a classification feature vector according to row vectors or column vectors; 172, performing full-connection coding on the classification feature vector by using a plurality of fully connected layers of the classifier to obtain an encoded classification feature vector; and 173, passing the encoded classification feature vector through a Softmax classification function of the classifier to obtain the classification result.
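Sub-steps 171 to 173 can be sketched as a small feed-forward classifier. The sketch makes several assumptions not stated in the text: the matrix is expanded row by row, a ReLU activation is applied between fully connected layers, the weights are hand-picked placeholders rather than trained values, and label index 0 is taken to mean that a surface defect is present.

```python
import math

LABELS = ("surface defect", "no surface defect")  # assumed label order


def flatten_rows(matrix):
    """171: expand the matrix into a vector, row by row."""
    return [v for row in matrix for v in row]


def dense(vector, weights, bias):
    """One fully connected layer: weights has shape (out x in)."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def softmax(values):
    """Numerically stable softmax over a list of scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def classify(matrix, layers):
    """Return (label, probabilities) for a classification feature matrix."""
    x = flatten_rows(matrix)
    for index, (weights, bias) in enumerate(layers):
        # 172: full-connection coding, with an assumed ReLU between layers.
        x = dense(x, weights, bias)
        if index < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    # 173: Softmax over the two class scores.
    probabilities = softmax(x)
    best = probabilities.index(max(probabilities))
    return LABELS[best], probabilities


toy_layers = [
    ([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
]
toy_label, toy_probabilities = classify([[1.0, 2.0], [3.0, 4.0]], toy_layers)
```

With these placeholder weights the two class scores are 1 and 2, so the second label receives the higher probability; a trained classifier would of course produce its scores from learned weights.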

In summary, the doll production management method 100 according to an embodiment of the present application has been described. It acquires a monitoring image of the doll to be detected and uses deep learning and artificial intelligence technology to perform feature extraction and classification on the monitoring image, so as to realize automatic detection of the surface defects of the doll, improve the production efficiency and quality level of the doll, and protect the rights, interests and safety of consumers.

In one embodiment of the present application, Fig. 9 is a block diagram of a doll production management system according to an embodiment of the present application. As shown in Fig. 9, a doll production management system 200 according to an embodiment of the present application includes: an image acquisition module 210, configured to acquire a monitoring image of the doll to be detected; a preprocessing module 220, configured to perform image preprocessing on the monitoring image to obtain a preprocessed monitoring image, where the image preprocessing includes gray level conversion, image normalization, CLAHE and gamma correction; an image blocking processing module 230, configured to perform image blocking processing on the preprocessed monitoring image to obtain a sequence of monitoring image blocks; an embedded encoding module 240, configured to pass the sequence of monitoring image blocks through a ViT model containing an embedding layer to obtain a plurality of doll local image block feature vectors; a bidirectional attention module 250, configured to arrange the plurality of doll local image block feature vectors into a doll global feature matrix and then pass it through a bidirectional attention mechanism module to obtain a classification feature matrix; an optimizing module 260, configured to optimize the feature distribution of the classification feature matrix to obtain an optimized classification feature matrix; and a detection result generating module 270, configured to pass the optimized classification feature matrix through a classifier to obtain a classification result, where the classification result is used to indicate whether the doll to be detected has a surface defect.
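As an illustration of what the preprocessing module 220 might compute, the sketch below covers gray level conversion, gamma correction and image normalization in plain Python. It rests on several assumptions: pixel values lie in the range 0 to 255, the gray conversion uses the common ITU-R BT.601 luma weights, the operations are applied in the order shown, and CLAHE is omitted because it is normally delegated to an image library (for example OpenCV's `cv2.createCLAHE`). The function names are hypothetical.

```python
import math


def to_gray(rgb_image):
    """Convert rows of (R, G, B) pixels to gray with BT.601 luma weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def gamma_correct(gray_image, gamma):
    """Scale 0..255 values to 0..1 and raise them to the power gamma."""
    return [[(v / 255.0) ** gamma for v in row] for row in gray_image]


def standardize(image, eps=1e-8):
    """Shift to zero mean and scale to (approximately) unit variance."""
    values = [v for row in image for v in row]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    # eps avoids division by zero for constant images.
    return [[(v - mean) / (std + eps) for v in row] for row in image]


def preprocess(rgb_image, gamma=1.0):
    """Gray conversion, gamma correction, then normalization (assumed order)."""
    return standardize(gamma_correct(to_gray(rgb_image), gamma))


preprocessed = preprocess([[(0, 0, 0), (255, 255, 255)]])
```

For a one-row image with a black and a white pixel and `gamma=1.0`, the result is approximately `[[-1.0, 1.0]]`.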

In a specific example, in the above doll production management system, the image blocking processing module is configured to: uniformly partition the preprocessed monitoring image to obtain the sequence of monitoring image blocks, wherein each monitoring image block in the sequence of monitoring image blocks has the same size.

In a specific example, in the above doll production management system, the embedded encoding module includes: an embedding unit, configured to perform vector embedding on each monitoring image block in the sequence of monitoring image blocks by using the embedding layer of the ViT model to obtain a sequence of image block embedding vectors; and a transform coding unit, configured to input the sequence of image block embedding vectors into the Transformer of the ViT model to obtain the plurality of doll local image block feature vectors.

In a specific example, in the above doll production management system, the transform coding unit includes: a one-dimensional arrangement subunit, configured to arrange the sequence of image block embedding vectors one-dimensionally to obtain an image block global feature vector; a self-attention subunit, configured to calculate the product between the image block global feature vector and the transpose of each image block embedding vector in the sequence of image block embedding vectors to obtain a plurality of self-attention correlation matrices; a normalization subunit, configured to normalize each self-attention correlation matrix in the plurality of self-attention correlation matrices to obtain a plurality of normalized self-attention correlation matrices; an activation subunit, configured to pass each normalized self-attention correlation matrix in the plurality of normalized self-attention correlation matrices through a Softmax classification function to obtain a plurality of probability values; and a weighting subunit, configured to weight each image block embedding vector in the sequence of image block embedding vectors with the corresponding probability value in the plurality of probability values as a weight, so as to obtain the plurality of doll local image block feature vectors.

In one specific example, in the doll production management system described above, the bidirectional attention module includes: the pooling unit is used for pooling the doll global feature matrix along the horizontal direction and the vertical direction respectively to obtain a first pooling vector and a second pooling vector; the association coding unit is used for carrying out association coding on the first pooling vector and the second pooling vector to obtain a bidirectional association matrix; the matrix activation unit is used for inputting the bidirectional association matrix into a Sigmoid activation function to obtain a bidirectional association weight matrix; and the matrix calculation unit is used for calculating the position-based point multiplication between the bidirectional association weight matrix and the doll global feature matrix to obtain the classification feature matrix.

In a specific example, in the doll production management system described above, the optimization module includes: the linear transformation unit is used for transforming the classification characteristic matrix into a square matrix through linear transformation, wherein the number of rows and the number of columns of the square matrix are the same; and the optimization unit is used for carrying out vector spectral clustering agent learning fusion optimization on the square matrix to obtain the optimized classification feature matrix.

In a specific example, in the above doll production management system, the optimizing unit is configured to: perform vector spectral clustering agent learning fusion optimization on the square matrix according to an optimization formula to obtain the optimized classification feature matrix. [The optimization formula and its symbols are not reproduced in the source text.] The quantities appearing in the formula are: the optimized classification feature matrix; the square matrix; the individual row feature vectors of the square matrix; a distance matrix consisting of the distances between every two corresponding row feature vectors of the square matrix; the transpose of the square matrix; the transpose of the distance matrix; the Euclidean distance between the individual row feature vectors of the square matrix; the exponential operation of a matrix, which computes the natural exponential function raised to the power of the feature value at each position of the matrix; and position-wise multiplication and matrix addition.

In a specific example, in the above doll production management system, the detection result generating module includes: an unfolding unit, configured to expand the optimized classification feature matrix into a classification feature vector according to row vectors or column vectors; a full-connection coding unit, configured to perform full-connection coding on the classification feature vector by using a plurality of fully connected layers of the classifier to obtain an encoded classification feature vector; and a classification unit, configured to pass the encoded classification feature vector through a Softmax classification function of the classifier to obtain the classification result.

Here, it will be understood by those skilled in the art that the specific functions and operations of the respective units and modules in the above-described doll's production management system have been described in detail in the above description of the doll's production management method with reference to fig. 1 to 8, and thus, repetitive descriptions thereof will be omitted.

As described above, the doll production management system 200 according to the embodiment of the present application may be implemented in various terminal devices, such as a server for doll production management, and the like. In one example, the doll production management system 200 according to an embodiment of the present application may be integrated into the terminal device as one software module and/or hardware module. For example, the doll's production management system 200 may be a software module in the operating system of the terminal device, or may be an application developed for the terminal device; of course, the doll production management system 200 may likewise be one of a number of hardware modules of the terminal device.

Alternatively, in another example, the doll production management system 200 and the terminal device may be separate devices, and the doll production management system 200 may be connected to the terminal device through a wired and/or wireless network and exchange interaction information in an agreed data format.

The present application also provides a computer program product comprising instructions which, when executed, cause an apparatus to perform operations corresponding to the above-described methods.

In one embodiment of the present application, there is also provided a computer readable storage medium storing a computer program for executing the above-described method.

It should be appreciated that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the forms of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects may be utilized. Furthermore, the computer program product may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.

Methods, systems, and computer program products of embodiments of the present application are described in terms of flow diagrams and/or block diagrams. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The basic principles of the present application have been described above in connection with specific embodiments, however, it should be noted that the advantages, benefits, effects, etc. mentioned in the present application are merely examples and not limiting, and these advantages, benefits, effects, etc. are not to be considered as necessarily possessed by the various embodiments of the present application. Furthermore, the specific details disclosed herein are for purposes of illustration and understanding only, and are not intended to be limiting, as the application is not intended to be limited to the details disclosed herein as such.

The block diagrams of the devices, apparatuses, equipment and systems referred to in this application are only illustrative examples and are not intended to require or imply that the connections, arrangements and configurations must be made in the manner shown in the block diagrams. As will be appreciated by one of skill in the art, these devices, apparatuses, equipment and systems may be connected, arranged and configured in any manner. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including but not limited to" and are used interchangeably therewith. The terms "or" and "and" as used herein refer to, and are used interchangeably with, the term "and/or" unless the context clearly indicates otherwise. The term "such as" as used herein refers to, and is used interchangeably with, the phrase "such as, but not limited to."

It is also noted that in the apparatus, devices and methods of the present application, the components or steps may be disassembled and/or assembled. Such decomposition and/or recombination should be considered as equivalent to the present application.

The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article or terminal device comprising the element.

The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit the embodiments of the application to the form disclosed herein. Although a number of example aspects and embodiments have been discussed above, a person of ordinary skill in the art will recognize certain variations, modifications, alterations, additions, and subcombinations thereof.

Claims

1. A doll production management method, comprising: acquiring a monitoring image of a doll to be detected; performing image preprocessing on the monitoring image to obtain a preprocessed monitoring image, wherein the image preprocessing comprises gray level conversion, image normalization, CLAHE and gamma correction; performing image blocking processing on the preprocessed monitoring image to obtain a sequence of monitoring image blocks; passing the sequence of monitoring image blocks through a ViT model comprising an embedding layer to obtain a plurality of doll local image block feature vectors; arranging the plurality of doll local image block feature vectors into a doll global feature matrix and then passing the doll global feature matrix through a bidirectional attention mechanism module to obtain a classification feature matrix; performing feature distribution optimization on the classification feature matrix to obtain an optimized classification feature matrix; and passing the optimized classification feature matrix through a classifier to obtain a classification result, wherein the classification result is used for indicating whether the doll to be detected has surface defects.

2. The method of claim 1, wherein performing image blocking processing on the preprocessed monitoring image to obtain a sequence of monitoring image blocks comprises: uniformly partitioning the preprocessed monitoring image to obtain the sequence of monitoring image blocks, wherein each monitoring image block in the sequence of monitoring image blocks has the same size.

3. The method of claim 2, wherein passing the sequence of monitoring image blocks through a ViT model comprising an embedding layer to obtain a plurality of doll local image block feature vectors comprises: performing vector embedding on each monitoring image block in the sequence of monitoring image blocks by using the embedding layer of the ViT model to obtain a sequence of image block embedding vectors; and inputting the sequence of image block embedding vectors into the Transformer of the ViT model to obtain the plurality of doll local image block feature vectors.

4. The method of claim 3, wherein inputting the sequence of image block embedding vectors into the Transformer of the ViT model to obtain the plurality of doll local image block feature vectors comprises: arranging the sequence of image block embedding vectors one-dimensionally to obtain an image block global feature vector; calculating the product between the image block global feature vector and the transpose of each image block embedding vector in the sequence of image block embedding vectors to obtain a plurality of self-attention correlation matrices; normalizing each self-attention correlation matrix in the plurality of self-attention correlation matrices to obtain a plurality of normalized self-attention correlation matrices; passing each normalized self-attention correlation matrix in the plurality of normalized self-attention correlation matrices through a Softmax classification function to obtain a plurality of probability values; and weighting each image block embedding vector in the sequence of image block embedding vectors with the corresponding probability value in the plurality of probability values as a weight to obtain the plurality of doll local image block feature vectors.

5. The method of claim 4, wherein the arranging the plurality of doll local image block feature vectors into a doll global feature matrix and then passing through a bi-directional attention mechanism module to obtain a classification feature matrix comprises: pooling the doll global feature matrix along the horizontal direction and the vertical direction respectively to obtain a first pooling vector and a second pooling vector; performing association coding on the first pooling vector and the second pooling vector to obtain a bidirectional association matrix; inputting the bidirectional association matrix into a Sigmoid activation function to obtain a bidirectional association weight matrix; and calculating the position-based point multiplication between the bidirectional association weight matrix and the doll global feature matrix to obtain the classification feature matrix.

6. The method of claim 5, wherein optimizing the feature distribution of the classification feature matrix to obtain an optimized classification feature matrix comprises: converting the classification characteristic matrix into a square matrix through linear transformation, wherein the number of rows and the number of columns of the square matrix are the same; and carrying out vector spectral clustering agent learning fusion optimization on the square matrix to obtain the optimized classification feature matrix.

7. The method of claim 6, wherein performing vector spectral clustering agent learning fusion optimization on the square matrix to obtain the optimized classification feature matrix comprises: performing vector spectral clustering agent learning fusion optimization on the square matrix according to an optimization formula to obtain the optimized classification feature matrix. [The optimization formula and its symbols are not reproduced in the source text.] The quantities appearing in the formula are: the optimized classification feature matrix; the square matrix; the individual row feature vectors of the square matrix; a distance matrix consisting of the distances between every two corresponding row feature vectors of the square matrix; the transpose of the square matrix; the transpose of the distance matrix; the Euclidean distance between the individual row feature vectors of the square matrix; the exponential operation of a matrix, which computes the natural exponential function raised to the power of the feature value at each position of the matrix; and position-wise multiplication and matrix addition.

8. The method of claim 7, wherein passing the optimized classification feature matrix through a classifier to obtain a classification result, the classification result being used to indicate whether the doll to be inspected has a surface defect, comprises: expanding the optimized classification feature matrix into a classification feature vector according to row vectors or column vectors; performing fully connected encoding on the classification feature vector by using a plurality of fully connected layers of the classifier to obtain an encoded classification feature vector; and passing the encoded classification feature vector through a Softmax classification function of the classifier to obtain the classification result.
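The classifier of claim 8 can be sketched as row-wise flattening, a stack of fully connected layers, and a Softmax. This is a minimal illustration under stated assumptions: the claim does not name an activation between the fully connected layers (ReLU is assumed here), and the layer weights passed in as `layers` are hypothetical placeholders rather than trained parameters.

```python
import math


def flatten_rows(matrix):
    """Expand a matrix into a vector by concatenating its rows."""
    return [v for row in matrix for v in row]


def dense(x, weights, bias):
    """One fully connected layer: weights is out_dim x in_dim."""
    return [sum(w * xi for w, xi in zip(w_row, x)) + b
            for w_row, b in zip(weights, bias)]


def softmax(logits):
    """Softmax with the maximum subtracted for numerical stability."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(matrix, layers):
    """Return class probabilities, e.g. [P(no defect), P(defect)].

    `layers` is a non-empty list of (weights, bias) pairs. ReLU between
    hidden layers is an assumption, not something the claim specifies.
    """
    x = flatten_rows(matrix)
    for weights, bias in layers[:-1]:
        x = [max(0.0, v) for v in dense(x, weights, bias)]
    weights, bias = layers[-1]
    return softmax(dense(x, weights, bias))
```

The index of the largest probability would then be read as the classification result, for example whether a surface defect is present.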

9. A doll production management system, comprising: an image acquisition module for acquiring a monitoring image of a doll to be inspected; a preprocessing module for performing image preprocessing on the monitoring image to obtain a preprocessed monitoring image, wherein the image preprocessing comprises grayscale conversion, image standardization, CLAHE, and gamma correction; an image blocking processing module for performing image blocking processing on the preprocessed monitoring image to obtain a sequence of monitoring image blocks; an embedded coding module for passing the sequence of monitoring image blocks through a ViT model containing an embedding layer to obtain a plurality of doll local image block feature vectors; a bidirectional attention module for arranging the plurality of doll local image block feature vectors into a doll global feature matrix and then passing the doll global feature matrix through a bidirectional attention mechanism module to obtain a classification feature matrix; an optimizing module for optimizing the feature distribution of the classification feature matrix to obtain an optimized classification feature matrix; and a detection result generation module for passing the optimized classification feature matrix through a classifier to obtain a classification result, wherein the classification result is used to indicate whether the doll to be inspected has a surface defect.
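Part of the preprocessing named in claim 9 can be sketched in a few lines. The sketch below covers grayscale conversion, standardization, and gamma correction only; CLAHE is omitted because it needs a tile-based histogram implementation that is usually taken from an image processing library. The luminance weights (ITU-R BT.601), the reading of "image standardization" as min-max scaling to [0, 1], and the default gamma value are assumptions, and the function names are hypothetical.

```python
def to_gray(rgb):
    """Rows of (r, g, b) tuples -> rows of luminance values.

    Uses the ITU-R BT.601 weights, an assumption of this sketch.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row]
            for row in rgb]


def standardize(img):
    """Min-max scale pixel values to [0, 1] (assumed interpretation)."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    span = (hi - lo) or 1.0  # avoid division by zero on flat images
    return [[(v - lo) / span for v in row] for row in img]


def gamma_correct(img, gamma=0.8):
    """Apply out = in ** gamma to values in [0, 1]; gamma is hypothetical."""
    return [[v ** gamma for v in row] for row in img]


def preprocess(rgb, gamma=0.8):
    """Grayscale -> standardize -> gamma correction (CLAHE omitted)."""
    return gamma_correct(standardize(to_gray(rgb)), gamma)
```

With this ordering the gamma step always receives values in [0, 1], so a gamma below 1 brightens dark regions without producing out-of-range values.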

10. The doll production management system of claim 9, wherein the image blocking processing module is configured to: uniformly partition the preprocessed monitoring image to obtain the sequence of monitoring image blocks, wherein each monitoring image block in the sequence of monitoring image blocks has the same size.
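The uniform partitioning of claim 10 can be sketched as cutting the image into non-overlapping square blocks in row-major order. The function name `split_into_blocks`, the square block shape, and the choice to reject images whose sides are not multiples of the block size are assumptions made for this illustration.

```python
def split_into_blocks(img, block):
    """Cut a 2-D image (list of rows) into block x block patches.

    Patches are returned in row-major order and all have the same size.
    Rejecting non-divisible image sizes is an assumption of this sketch;
    padding or cropping would be alternative choices.
    """
    height, width = len(img), len(img[0])
    if height % block or width % block:
        raise ValueError("image sides must be multiples of the block size")
    blocks = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            blocks.append([row[left:left + block]
                           for row in img[top:top + block]])
    return blocks
```

For a 4 x 4 image and a block size of 2, this yields a sequence of four 2 x 2 blocks, which is the kind of equally sized sequence a ViT embedding layer expects as input.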