CN116124787A - Autonomous multi-angle joint detection method for appearance of non-lambertian body - Google Patents


Info

Publication number
CN116124787A
CN116124787A
Authority
CN
China
Prior art keywords
product
appearance
learning
detection method
lambertian
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211604958.2A
Other languages
Chinese (zh)
Inventor
王碧
聂宇松
黄伟东
杨书新
吴剑青
曾博文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangxi University of Science and Technology
Original Assignee
Jiangxi University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangxi University of Science and Technology filed Critical Jiangxi University of Science and Technology
Priority to CN202211604958.2A priority Critical patent/CN116124787A/en
Publication of CN116124787A publication Critical patent/CN116124787A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/84Systems specially adapted for particular applications
    • G01N21/88Investigating the presence of flaws or contamination
    • G01N21/8851Scan or image signal processing specially adapted therefor, e.g. for scan signal adjustment, for detecting different kinds of defects, for compensating for structures, markings, edges
    • G01N21/95Investigating the presence of flaws or contamination characterised by the material or shape of the object to be examined
    • G01N2021/8887Scan or image signal processing specially adapted therefor based on image processing techniques
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Landscapes

  • Biochemistry (AREA)
  • General Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Manipulator (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an autonomous multi-angle joint detection method for non-lambertian appearance, in the technical field of automatic inspection. The appearance of a single product is detected in a fixed detection scene, and a multi-view planning strategy is obtained by a data-driven artificial-intelligence method. The Top-k approximate-neighbor samples of a new product image are queried against the learned view planning strategy and the sample set used for learning, yielding the corresponding view planning strategies. The weights of the k retrieved view planning strategies are adjusted using samples of the new product, and the multi-view planning strategy for any new product image is estimated as a weighted sum. A deep reinforcement learning framework obtains the view planning strategy by continual trial and error. Knowledge migration is achieved with a new few-sample learning method: LSH retrieves similar samples, and the planning strategy is revised by re-weighting. A gradient compensation trace addresses sparse rewards: combining the temporal features of states with the gradient compensation trace accelerates the learning process.

Description

Autonomous multi-angle joint detection method for appearance of non-lambertian body
Technical Field
The invention relates to the technical field of automatic detection, in particular to an autonomous multi-angle joint detection method for the appearance of a non-lambertian body.
Background
The quality of a product's appearance is an important component of overall product quality: it directly affects consumers' perception and, when poor, reduces commercial value and brand reputation. Product appearance inspection covers stains, scratches, pits, shallow bumps, edge defects, pattern defects, and the like. Automated visual inspection (AVI) systems offer high detection accuracy, high speed, and low cost, and have been widely adopted. However, the products to be inspected are diverse and variable, making a general-purpose visual inspection system difficult to design. How to determine the image-acquisition view angles and achieve knowledge migration across different scenes is therefore key to building a general AVI system.
Disclosure of Invention
The invention aims to provide an autonomous multi-angle joint detection method for the appearance of a non-lambertian body, which plans the sequence of acquisition view angles for a product through deep reinforcement learning to realize product quality detection, and achieves knowledge migration across products and scenes through transfer learning and noise-robust learning, advancing the automation of product appearance inspection.
The technical aim of the invention is realized by the following technical scheme: an autonomous multi-angle joint detection method for non-lambertian appearance, comprising the steps of:
s1: detecting the appearance of a single product in a fixed detection scene, and obtaining a multi-view planning strategy by a data-driven artificial intelligence method;
s2: querying the Top-k approximate-neighbor samples of a new product image against the learned view planning strategy and the sample set used for learning, and obtaining the corresponding view planning strategies;
s3: adjusting the weights of the k retrieved view planning strategies using samples of the new product, and estimating the multi-view planning strategy for any new product image as a weighted sum.
The invention is further provided with: in step S1, a deep-reinforcement-learning-based method for detecting slight flaws on non-lambertian surfaces performs the single-product appearance detection. Automatic visual inspection is modeled as follows: a camera captures an image of the product to be detected under an arbitrary angle; the captured image is preprocessed and fed into a deep neural network model with convolutional layers; the model outputs the offset angle by which the manipulator should rotate the product; the manipulator then rotates the product and a new image is captured, repeating until the model judges whether the product has an appearance flaw.
The invention is further provided with: automatic visual inspection is modeled as a Markov decision process:
M = ⟨δ, A, R, P, γ⟩
where δ is the state space, i.e. the sequence of images captured of the product; A is the decision space, i.e. the rotation angles of the manipulator; R is the reward function, the numerical quality of the current action, R: δ × A → ℝ; P is the environment dynamics, i.e. the mapping from the current state and action to the new state, P: δ × A → δ; and γ ∈ [0, 1] is the decay factor, i.e. the influence of the current action on future rewards decreases with the time interval.
The invention is further provided with: the deep neural network model combines multiple convolution-pooling layers with dense layers; the output layer has |A| nodes, outputting a value for each action. Let the network model be f, with state and action as inputs and the action value as output; learning the value function is treated as a regression problem whose objective is the mean squared error
L(θ_t) = ( r_t + γ max_a f(s_{t+1}, a; θ_t) − f(s_t, a_t; θ_t) )²
where s_t denotes the state at time t, s_{t+1} the state at the next step, a_t the action at time t, r_t the reward, and θ_t the current network parameters.
The invention is further provided with: steps S2 and S3 adopt a transfer-learning-based cross-scene strategy correction method for deep reinforcement learning. When the product's appearance changes, an image of the new product is captured and similar products are retrieved via hash buckets; the k retrieved product images are input into the existing model to obtain the corresponding planning strategies, and weights for estimating the corrected strategy are learned and adjusted, completing the knowledge transfer.
The invention is further provided with: locality-sensitive hashing (LSH) assists the retrieval. Unlike ordinary hashing, LSH encourages collisions: under the mapping, two points that are close in the original space receive similar hash values, and vice versa. The input to the LSH is the dense vector obtained by embedding the new product image X' through the convolution-pooling layers of the original model; the output is a discrete hash value, and k similar original product images x are retrieved according to the hash value of the new product image. With the convolutional embedding CNN: ℝ^m → ℝ^n and the hash LSH: ℝ^n → Π^d, retrieving the top-k most similar existing samples through locality-sensitive hashing is formally described as
I(X', k) = TopK_{x∈D}( Sim( LSH(CNN(x)), LSH(CNN(X')) ) ),
where Sim: Π^d × Π^d → [0, 1] is a similarity function.
The invention is further provided with: the weights are learned and adjusted. They consist of k nodes normalized by a softmax neural-network layer, computed by a three-layer neural network without activation functions. The input is the k similar view planning strategies, denoted I(X', k) ∈ ℝ^{|A|×k}; the output is the weight vector y ∈ ℝ^k. Let the weight-learning network be g; its objective minimizes the squared error between the weighted-sum estimate, obtained by combining the k retrieved strategies with the weights y, and the action values observed on samples of the new product.
the invention is further provided with: in order to avoid sampling products for multiple times at the same angle in the rotating process of the manipulator, the image sequence acquired by the current product to be detected is used as a state code, and the coding mode combines historical images of the product according to the maximum rotating times to serve as deep neural network input or realizes the coding of the historical image sequence through a neural network model which can be used for time sequence problems such as a long-period memory network and the like;
the motion coding adopts discrete coding, namely, uniformly dividing [0 degree, 360 degrees ] by taking small offset angles as intervals, and adding two decision items of qualification and disability; the reward function adopts sparse rewards, and when the product is correctly detected as being qualified or defective, the reward of +1 is given; when the quality of the product is judged to be wrong, giving-1 rewards; other actions are rewarded with zero.
The invention is further provided with: the state encoding uses a recurrent neural network (RNN) to encode the temporal features of the state, computed as
h_t = sigmoid(m_t w_h + b_h)
m_t = tanh(x_t w_{m1} + m_{t-1} w_{m2} + b_m)
where x_t and h_t are the input and output of the recurrent architecture, m_0 is initialized by random assignment, w_{m1}, w_{m2} and w_h are three matrices (dense layers) updated by backpropagation, and sigmoid and tanh are activation functions normalizing the output space.
The invention is further provided with: the sparse-reward problem is addressed with a gradient compensation trace:
e_t = e_{t-1} + ∇_{θ_t} f(s_t, a_t)
θ_{t+1} = θ_t + α_t δ_t e_t
where θ_t is the weight of each network layer at time t, e_t is the gradient accumulation in the trace, α_t is the learning rate, and γ is the discount factor; the temporal-difference error is
δ_t = r_t + γ max_a f(s_{t+1}, a) − f(s_t, a_t)
In summary, the invention has the following beneficial effects: 1. In existing multi-view work, the view planning and the capture angles are fixed for a specific product appearance and detection scene; the method instead adopts dynamic multi-view planning, automatically planning detection view angles through a data-driven artificial-intelligence method. 2. When key factors affecting visual-information acquisition change (such as the appearance of the product to be detected or the light-source position), detection performance drops markedly; the transfer-learning-based strategy correction migrates knowledge quickly across such changes. 3. A gradient compensation trace addresses the sparse-reward problem, and combining the temporal features of states with the trace accelerates the learning process, improving learning efficiency.
Drawings
FIG. 1 shows shallow pits and scratches on a non-lambertian surface;
FIG. 2 is a block diagram of the operation of an exemplary automated visual inspection system;
FIG. 3 is a schematic representation of a non-lambertian surface mild flaw detection technique based on deep reinforcement learning in accordance with an embodiment of the present invention;
FIG. 4 is a schematic diagram of a deep reinforcement learning cross-scene strategy correction technique route based on transfer learning according to an embodiment of the invention;
FIG. 5 is a technical route of a reward sparse environment learning acceleration method based on gradient compensation traces according to an embodiment of the invention.
Detailed Description
The invention is described in further detail below with reference to fig. 1-5.
Examples: an autonomous multi-angle joint detection method for non-lambertian appearance, as shown in fig. 1-5, comprises the following steps:
s1: detecting the appearance of a single product in a fixed detection scene, and obtaining a multi-view planning strategy by a data-driven artificial intelligence method;
s2: querying the Top-k approximate-neighbor samples of a new product image against the learned view planning strategy and the sample set used for learning, and obtaining the corresponding view planning strategies;
s3: adjusting the weights of the k retrieved view planning strategies using samples of the new product, and estimating the multi-view planning strategy for any new product image as a weighted sum.
To handle non-lambertian appearance detection on smooth coatings, the angles among the product to be detected, the light source, and the camera are adjusted by a manipulator within a fixed-light-source detection scene. On one hand, this exploits the difference in reflection between flaws such as shallow pits and the surrounding smooth surface; on the other hand, it avoids situations where a high-intensity reflection region covers minute scratches, and adjusting the relative angle reduces false detections caused by light-source reflections. Reinforcement learning is a data-driven artificial-intelligence method that effectively addresses sequential decision problems; considering the difficulty of image feature extraction, deep reinforcement learning is chosen to control the sequence of manipulator rotation angles. The technical route is shown in figure 3.
Single-product appearance detection is realized by a deep-reinforcement-learning-based method for detecting slight flaws on non-lambertian surfaces. Automatic visual inspection proceeds as follows: a camera captures an image of the product to be detected under an arbitrary angle; the captured image is preprocessed and fed into a deep neural network model with convolutional layers; the model outputs the offset angle by which the manipulator should rotate the product; the manipulator then rotates the product and a new image is captured, repeating until the model judges whether the product has an appearance flaw.
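The capture-rotate-decide loop just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `env`, `q_network`, the 10° discretization, and the step cap are all stand-ins introduced for the sketch.

```python
import numpy as np

N_ANGLES = 36  # hypothetical discretization of [0, 360) into 10-degree steps
ACTIONS = list(range(N_ANGLES)) + ["pass", "defective"]  # rotations + 2 decisions

def detect_one_product(env, q_network, max_steps=20):
    """Sketch of the S1 loop: capture -> encode state -> value network -> rotate or decide.

    `env` and `q_network` are hypothetical stand-ins: env.capture() returns an image,
    env.rotate(angle) turns the product, q_network(state) returns one value per action.
    """
    images = [env.capture()]                 # state = sequence of captured images
    for _ in range(max_steps):
        state = np.stack(images)             # naive state encoding: stacked history
        action = ACTIONS[int(np.argmax(q_network(state)))]
        if action in ("pass", "defective"):  # terminal quality decision
            return action
        env.rotate(action * (360 // N_ANGLES))
        images.append(env.capture())
    return "defective"                       # conservative fallback (an assumption)
```

The loop terminates either when the value network prefers one of the two decision actions or when the rotation budget is exhausted.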
Before a surface-detection learning system is built, automatic visual inspection must be modeled as a Markov decision process that reinforcement learning can handle:
M = ⟨δ, A, R, P, γ⟩
where δ is the state space, i.e. the sequence of images captured of the product; A is the decision space, i.e. the rotation angles of the manipulator; R is the reward function, the numerical quality of the current action, R: δ × A → ℝ; P is the environment dynamics, i.e. the mapping from the current state and action to the new state, P: δ × A → δ; and γ ∈ [0, 1] is the decay factor, i.e. the influence of the current action on future rewards decreases with the time interval.
To avoid sampling the product repeatedly at the same angle while the manipulator rotates, the sequence of images already acquired for the current product serves as the state encoding. Either the product's historical images are concatenated, up to the maximum number of rotations, as the deep-network input, or the historical image sequence is encoded by a neural network model suited to sequential problems, such as a long short-term memory (LSTM) network;
the action encoding is discrete: [0°, 360°) is divided uniformly with a small offset angle as the step, plus two terminal decisions, 'pass' and 'defective'. The reward function is sparse: +1 when the product is correctly judged pass or defective; −1 when the quality judgment is wrong; zero for all other actions. To keep the automatic visual inspection system efficient and reduce the number of manipulator rotations per product, the decay coefficient is set to γ = 0.99.
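The discrete action set and sparse reward just described admit a direct sketch; the 10-degree step and the 'pass'/'defective' labels are illustrative assumptions, not the patent's exact encoding.

```python
# Sketch of the discrete action space and sparse reward described above.
ANGLE_STEP = 10                                  # assumed "small offset angle"
ROTATIONS = list(range(0, 360, ANGLE_STEP))      # uniform division of [0, 360)
ACTIONS = ROTATIONS + ["pass", "defective"]      # rotations plus two decisions

def reward(action, true_label):
    """+1 for a correct pass/defective judgment, -1 for a wrong one,
    and zero for every rotation action, exactly as the sparse scheme states."""
    if action in ("pass", "defective"):
        return 1.0 if action == true_label else -1.0
    return 0.0

GAMMA = 0.99  # decay factor from the text, discouraging long rotation sequences
```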
The deep neural network model combines multiple convolution-pooling layers with dense layers; the output layer has |A| nodes, outputting a value for each action. Let the network model be f, with state and action as inputs and the action value as output; learning the value function is treated as a regression problem whose objective is the mean squared error
L(θ_t) = ( r_t + γ max_a f(s_{t+1}, a; θ_t) − f(s_t, a_t; θ_t) )²
where s_t denotes the state at time t, s_{t+1} the state at the next step, a_t the action at time t, r_t the reward, and θ_t the current network parameters.
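As a worked instance of the mean-squared objective above, the following sketch computes the one-step TD target and squared error for a single transition. The action-value vectors are toy numbers, not values from the patent.

```python
import numpy as np

def td_loss(q, q_next, action, reward_t, gamma=0.99):
    """Squared one-step temporal-difference error for a single transition:
    (r_t + gamma * max_a' q_next[a'] - q[action]) ** 2, where q and q_next are
    the network's action-value vectors for s_t and s_{t+1}."""
    target = reward_t + gamma * np.max(q_next)
    return float((target - q[action]) ** 2)

# Toy transition: three actions, action 1 taken, zero immediate reward.
q = np.array([0.2, 0.5, 0.1])
q_next = np.array([0.0, 1.0, 0.3])
loss = td_loss(q, q_next, action=1, reward_t=0.0)
```

Minimizing this quantity over sampled transitions is what the regression formulation of value learning amounts to.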
To handle changes in the product detection scene, such as light-source changes, camera changes, and changes in the appearance of the product to be detected, the transferability of the automatic visual inspection system is improved by quickly migrating knowledge through transfer learning. The technical route is shown in fig. 4. When the product's appearance changes, an image of the new product is captured and similar products are retrieved via hash buckets; the k retrieved product images are input into the existing model to obtain the corresponding planning strategies, and weights for estimating the corrected strategy are learned and adjusted, completing the knowledge migration.
To obtain historical images close to the new product image quickly and accurately, locality-sensitive hashing (LSH) assists the retrieval. Unlike ordinary hashing, LSH encourages collisions: under the mapping, two points that are close in the original space receive similar hash values, and vice versa, which makes it suitable for large-scale information storage and retrieval. The input to the LSH is the dense vector obtained by embedding the new product image X' through the convolution-pooling layers of the original model; the output is a discrete hash value, and k similar original product images x are retrieved according to the hash value of the new product image. With the convolutional embedding CNN: ℝ^m → ℝ^n and the hash LSH: ℝ^n → Π^d, retrieving the top-k most similar existing samples through locality-sensitive hashing is formally described as
I(X', k) = TopK_{x∈D}( Sim( LSH(CNN(x)), LSH(CNN(X')) ) ),
where Sim: Π^d × Π^d → [0, 1] is a similarity function.
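A minimal sign-of-projection (SimHash-style) sketch of this retrieval step is given below. The embedding dimension, hash length, and bit-agreement similarity are assumptions standing in for the patent's CNN, LSH, and Sim; any LSH family with the collision property would do.

```python
import numpy as np

rng = np.random.default_rng(42)
D_EMB, D_HASH = 16, 8                          # embedding dim and hash bits (assumed)
planes = rng.standard_normal((D_HASH, D_EMB))  # random hyperplanes shared by all hashes

def lsh(v):
    """Sign-of-projection hash: embeddings on the same side of each hyperplane
    share a bit, so nearby vectors collide on most bits."""
    return (planes @ v > 0).astype(np.int8)

def sim(h1, h2):
    """Similarity in [0, 1]: fraction of matching hash bits."""
    return float(np.mean(h1 == h2))

def top_k(query_emb, db_embs, k=3):
    """Indices of the k stored embeddings whose hashes best match the query's."""
    qh = lsh(query_emb)
    scores = [sim(qh, lsh(e)) for e in db_embs]
    return list(np.argsort(scores)[::-1][:k])
```

In a real system the database hashes would be precomputed and bucketed, so only colliding candidates are compared.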
The weights are learned and adjusted: they consist of k nodes normalized by a softmax neural-network layer, computed by a three-layer neural network without activation functions. The input is the k similar view planning strategies, denoted I(X', k) ∈ ℝ^{|A|×k}; the output is the weight vector y ∈ ℝ^k. Let the weight-learning network be g; its objective minimizes the squared error between the weighted-sum estimate, obtained by combining the k retrieved strategies with the weights y, and the action values observed on samples of the new product.
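The weighted-sum estimation can be illustrated as follows; the toy strategy matrix and raw logits are assumptions standing in for the outputs of the retrieved models and the weight-learning network g.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax, producing weights that sum to 1."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def blend_strategies(retrieved, logits):
    """Estimate a new product's action values as a weighted sum of k retrieved
    strategies. `retrieved` has shape (|A|, k), one column per retrieved strategy;
    `logits` are the raw scores a weight network such as g might output."""
    y = softmax(logits)   # y in R^k
    return retrieved @ y  # weighted sum over the k strategies

# Two toy strategies over a two-action space (illustrative numbers).
strategies = np.array([[1.0, 0.0],
                       [0.0, 1.0]])
```

With equal logits the estimate is the plain average of the retrieved strategies; sharply peaked logits collapse it onto the single best-matching neighbor.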
in order to solve the problems of slow learning caused by overlong detection sequences and rare rewards in the initial learning stage due to continuous rotation angles, the feedback speed of rewards in the learning process is improved through gradient compensation marks. The technical route is shown in fig. 5. Firstly, coding an image sequence acquired after the product rotates through a cyclic neural network. The state with the time sequence characteristic is then used as the input of the regression task, and the neural network is connected as a function of the value. And finally, when the network weight is updated, replacing the update gradient of the original optimizer by using a gradient compensation trace, and accelerating the updating of the value function.
To reduce the likelihood of repeatedly visiting the same view angles when only the current visual image is available, a recurrent neural network (RNN) encodes the temporal features of the state: historically observed image information persists in some form in the encoded state after the current image is mapped, reducing duplicate detection at the same angle. Compared with gated recurrent units and long short-term memory networks, the RNN has lower computational complexity, and with small data volumes its output performance is similar to that of its variants. The computation is
h_t = sigmoid(m_t w_h + b_h)
m_t = tanh(x_t w_{m1} + m_{t-1} w_{m2} + b_m)
where x_t and h_t are the input and output of the recurrent architecture, m_0 is initialized by random assignment, w_{m1}, w_{m2} and w_h are three matrices (dense layers) updated by backpropagation, and sigmoid and tanh are activation functions normalizing the output space. The value-function network is trained in the same way as the deep network model constructed above.
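The two recurrence equations above translate directly into code. The dimensions, weight scales, and random m_0 initialization below are illustrative assumptions; only the update equations themselves come from the text.

```python
import numpy as np

rng = np.random.default_rng(7)
D_IN, D_HID = 8, 4   # illustrative input-feature and hidden sizes

w_m1 = rng.standard_normal((D_IN, D_HID)) * 0.1   # x_t -> m_t
w_m2 = rng.standard_normal((D_HID, D_HID)) * 0.1  # m_{t-1} -> m_t
w_h = rng.standard_normal((D_HID, D_HID)) * 0.1   # m_t -> h_t
b_m = np.zeros(D_HID)
b_h = np.zeros(D_HID)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rnn_step(x_t, m_prev):
    """m_t = tanh(x_t w_m1 + m_{t-1} w_m2 + b_m); h_t = sigmoid(m_t w_h + b_h).
    m_t carries the history of observed images forward."""
    m_t = np.tanh(x_t @ w_m1 + m_prev @ w_m2 + b_m)
    h_t = sigmoid(m_t @ w_h + b_h)
    return h_t, m_t

def encode_sequence(xs):
    """Fold an image-feature sequence into one state code; m_0 is random, per the text."""
    m = rng.standard_normal(D_HID) * 0.01
    h = np.zeros(D_HID)
    for x in xs:
        h, m = rnn_step(x, m)
    return h
```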
An eligibility trace can be described simply as an accumulation of historical gradients: by changing the update rule, it lets the current reward update the values of historical states and actions. The gradient compensation trace proposed in this application is such a trace; unlike existing eligibility traces, it does not let the effect of a reward on a state decay, or even vanish, over time, which suits this setting and the sparse-reward problem. The update is
e_t = e_{t-1} + ∇_{θ_t} f(s_t, a_t)
θ_{t+1} = θ_t + α_t δ_t e_t
where θ_t is the weight of each network layer at time t, e_t is the gradient accumulation in the trace, α_t is the learning rate, and γ is the discount factor; the temporal-difference error is
δ_t = r_t + γ max_a f(s_{t+1}, a) − f(s_t, a_t)
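Under a linear value function (an assumption made here so the gradient is explicit), one step of a trace-based update sketched from the description above looks like this. Note the accumulation term is not decayed, following the stated property of the gradient compensation trace; this is an illustrative reading, not the patent's exact formula.

```python
import numpy as np

def q_values(theta, s):
    """Linear value function: one row of theta per action, dotted with the state features."""
    return theta @ s

def trace_update(theta, e, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One update step: compute the TD error, accumulate the value gradient into the
    trace WITHOUT decay (the 'compensation'), then push the error through the whole
    trace so earlier states still receive credit when a sparse reward finally arrives."""
    delta = r + gamma * np.max(q_values(theta, s_next)) - q_values(theta, s)[a]
    grad = np.zeros_like(theta)
    grad[a] = s                       # d q(s, a) / d theta for the linear model
    e = e + grad                      # non-decaying accumulation
    theta = theta + alpha * delta * e
    return theta, e
```

A conventional eligibility trace would instead shrink `e` by a factor γλ each step, so distant rewards would barely reach early states.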
the present application uses a new visual inspection method to adapt view angle planning: and adopting a deep reinforcement learning framework to obtain a visual angle planning strategy by continuous trial and error. Knowledge migration is achieved using a new few sample learning method: the LSH is used to retrieve similar samples and the planning strategy is revised by re-weighting. The gradient compensation trace is used for rewarding sparse: the time sequence characteristics of the states and the gradient compensation traces are combined to accelerate the learning process.
The present embodiment is merely illustrative of the invention and is not to be construed as limiting it; modifications to this embodiment that a person skilled in the art may make after reading this specification, where they involve no creative contribution, remain protected by patent law within the scope of the claims of the invention.

Claims (10)

1. An autonomous multi-angle joint detection method for the appearance of a non-lambertian body, characterized by comprising the following steps:
s1: detecting the appearance of a single product in a fixed detection scene, and obtaining a multi-view planning strategy by a data-driven artificial intelligence method;
s2: querying the Top-k approximate-neighbor samples of a new product image against the learned view planning strategy and the sample set used for learning, and obtaining the corresponding view planning strategies;
s3: adjusting the weights of the k retrieved view planning strategies using samples of the new product, and estimating the multi-view planning strategy for any new product image as a weighted sum.
2. The autonomous multi-angle joint detection method for non-lambertian appearance of claim 1, characterized in that: in step S1, a deep-reinforcement-learning-based method for detecting slight flaws on non-lambertian surfaces performs the single-product appearance detection; automatic visual inspection is modeled as follows: a camera captures an image of the product to be detected under an arbitrary angle; the captured image is preprocessed and fed into a deep neural network model with convolutional layers; the model outputs the offset angle by which the manipulator should rotate the product; the manipulator then rotates the product and a new image is captured, repeating until the model judges whether the product has an appearance flaw.
3. The autonomous multi-angle joint detection method for non-lambertian appearance of claim 2, characterized in that: automatic visual inspection is modeled as a Markov decision process:
M = ⟨δ, A, R, P, γ⟩
where δ is the state space, i.e. the sequence of images captured of the product; A is the decision space, i.e. the rotation angles of the manipulator; R is the reward function, the numerical quality of the current action, R: δ × A → ℝ; P is the environment dynamics, i.e. the mapping from the current state and action to the new state, P: δ × A → δ; and γ ∈ [0, 1] is the decay factor, i.e. the influence of the current action on future rewards decreases with the time interval.
4. The autonomous multi-angle joint detection method for non-lambertian appearance of claim 2, characterized in that: the deep neural network model combines multiple convolution-pooling layers with dense layers; the output layer has |A| nodes, outputting a value for each action; let the network model be f, with state and action as inputs and the action value as output; learning the value function is treated as a regression problem whose objective is the mean squared error
L(θ_t) = ( r_t + γ max_a f(s_{t+1}, a; θ_t) − f(s_t, a_t; θ_t) )²
where s_t denotes the state at time t, s_{t+1} the state at the next step, a_t the action at time t, r_t the reward, and θ_t the current network parameters.
5. The autonomous multi-angle joint detection method for non-lambertian appearance of claim 1, characterized in that: steps S2 and S3 adopt a transfer-learning-based cross-scene strategy correction method for deep reinforcement learning; when the product's appearance changes, an image of the new product is captured and similar products are retrieved via hash buckets; the k retrieved product images are input into the existing model to obtain the corresponding planning strategies, and weights for estimating the corrected strategy are learned, completing the knowledge transfer.
6. The autonomous multi-angle joint detection method for non-lambertian appearance of claim 5, characterized by: the local sensitive hash is adopted to help search, the local sensitive hash encourages conflict, and two similar points in the original space have similar hash values through mapping, and vice versa; the input of the local sensitive hash is that a new product image X' is input into a dense vector of an original model after being embedded by a convolution-pooling layer; outputting discrete hash values, and searching k similar original product images x according to the hash values of the new product images; the convolution layer is CNN: R m →R n The local sensitive hash is LSH: R n →П d The method comprises the steps of carrying out a first treatment on the surface of the Top-k high similarity searches through local sensitive hashThe existing sample formalized description is:
I(X', k) = TopK_{x ∈ D}(Sim(LSH(CNN(x)), LSH(CNN(X')))),
where Sim: Π^d × Π^d → [0, 1] is a similarity function.
7. The autonomous multi-angle joint detection method for non-lambertian appearance of claim 5, characterized by: learning and adjusting the weights, wherein the weights consist of k nodes normalized by a softmax neural-network layer; a three-layer neural network without activation functions is adopted to compute the weights; the input is the k similar view-planning strategies, represented as I(X', k) ∈ R^{|A|×k}, and the output is the weighting y ∈ R^k; with the weight-learning network denoted g, the objective function is:
Figure FDA0003998277660000031
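A minimal sketch of the weighted strategy combination in claim 7: the k retrieved view-planning strategies are blended by softmax-normalized weights. The sizes |A| = 5 and k = 3, and the fixed logits standing in for the output of network g, are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax; output sums to 1."""
    e = np.exp(z - z.max())
    return e / e.sum()

A, k = 5, 3                                        # |A| actions, k similar products
I = np.arange(A * k, dtype=float).reshape(A, k)    # stand-in strategies I(X', k)
logits = np.array([2.0, 0.5, 0.5])                 # would come from network g
y = softmax(logits)                                # weights y in R^k
corrected = I @ y                                  # weighted strategy estimate, in R^{|A|}
```

The softmax layer guarantees the k weights form a convex combination, so the corrected strategy stays inside the span of the retrieved strategies.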
8. The autonomous multi-angle joint detection method for non-lambertian appearance of claim 2, characterized by: in order to avoid sampling the product multiple times at the same angle while the manipulator rotates, the image sequence acquired for the current product under test is used as the state code; the coding either concatenates the product's historical images up to the maximum number of rotations as the deep-neural-network input, or encodes the historical image sequence through a neural network model suited to sequential problems, such as a long short-term memory (LSTM) network;
the action coding is discrete: [0°, 360°) is uniformly divided at small offset-angle intervals, and two decision actions, qualified and defective, are added; the reward function is sparse: when the product is correctly judged qualified or defective, a reward of +1 is given; when the quality judgment is wrong, a reward of -1 is given; all other actions receive zero reward.
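The action space and sparse reward of claim 8 can be sketched directly; the 10° offset interval below is an assumption (the patent only specifies a "small" interval).

```python
# Discretize [0°, 360°) at an assumed 10° offset interval, plus the two
# terminal decision actions "qualified" and "defective".
STEP = 10
ANGLES = list(range(0, 360, STEP))               # 36 rotation actions
ACTIONS = ANGLES + ["qualified", "defective"]    # |A| = 38

def reward(action, true_label):
    """Sparse reward: +1 for a correct verdict, -1 for a wrong one,
    0 for every rotation action."""
    if action in ("qualified", "defective"):
        return 1 if action == true_label else -1
    return 0
```

Only the two decision actions ever carry a nonzero reward, which is what makes the signal sparse and motivates the trace-based compensation in claim 10.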
9. The autonomous multi-angle joint detection method for non-lambertian appearance of claim 8, characterized by: the state coding uses a recurrent neural network (RNN) to encode the temporal characteristics of the states, computed as:
h_t = sigmoid(m_t w_h + b_h)
m_t = tanh(x_t w_{m1} + m_{t-1} w_{m2} + b_m)
where x_t and h_t are respectively the input and output of the recurrent neural network architecture; m_0 is randomly initialized; w_{m1}, w_{m2}, and w_h are three matrices (dense layers) updated by back-propagation; sigmoid and tanh are activation functions that normalize the output space.
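The recurrence of claim 9 translates line-for-line into code. The dimensions below are illustrative assumptions, and random matrices stand in for the dense layers that the patent trains by back-propagation.

```python
import numpy as np

np.random.seed(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x_dim, m_dim, h_dim = 4, 3, 2          # assumed feature/hidden/output sizes
w_m1 = np.random.randn(x_dim, m_dim)
w_m2 = np.random.randn(m_dim, m_dim)
w_h = np.random.randn(m_dim, h_dim)
b_m, b_h = np.zeros(m_dim), np.zeros(h_dim)

def encode(images):
    """Fold an image-feature sequence into a single state code h_t via
    m_t = tanh(x_t w_m1 + m_{t-1} w_m2 + b_m), h_t = sigmoid(m_t w_h + b_h)."""
    m = np.random.randn(m_dim)         # m_0: random initialization, as claimed
    h = None
    for x in images:
        m = np.tanh(x @ w_m1 + m @ w_m2 + b_m)
        h = sigmoid(m @ w_h + b_h)
    return h

state = encode([np.random.randn(x_dim) for _ in range(5)])
```

Because the outer activation is a sigmoid, every component of the state code lies strictly in (0, 1), giving the deep network a normalized input range.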
10. The autonomous multi-angle joint detection method for non-lambertian appearance of claim 9, characterized by: the reward-sparsity problem is solved with gradient-compensated eligibility traces, given by:
Figure FDA0003998277660000042
e_{t+1} = γ_{t-1} e_t + α_t ∇_θ f(h_t, θ_t)
where θ_t is the weight of each network layer at time t, e_t is the gradient accumulation in the eligibility trace, α_t is the learning rate, and γ_t is the discount factor; the temporal-difference error is written as:
δ_t = r_t + γ max_a f(s_{t+1}, a) - f(s_t, a_t)
Figure FDA0003998277660000041
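A minimal sketch of the trace-based update in claim 10, for a linear value model f(h, θ) = h·θ whose gradient is simply h. The max over next-state actions is collapsed to a single value, and the constant learning rate, decay, and the final parameter step θ ← θ + δ_t e_{t+1} are all illustrative assumptions, since the patent's closing formula is not reproduced in the text.

```python
import numpy as np

def trace_update(theta, e, h_t, h_next, r_t, alpha=0.1, gamma=0.9):
    """One step: TD error delta_t = r_t + gamma*f(h_{t+1}) - f(h_t),
    trace e_{t+1} = gamma*e_t + alpha*grad_theta f(h_t, theta),
    then move theta along delta_t * e_{t+1}."""
    delta = r_t + gamma * float(h_next @ theta) - float(h_t @ theta)
    e = gamma * e + alpha * h_t        # gradient of the linear f w.r.t. theta is h_t
    return theta + delta * e, e

theta, e = np.zeros(3), np.zeros(3)
h_t, h_next = np.array([1.0, 0.0, 0.5]), np.array([0.0, 1.0, 0.5])
theta, e = trace_update(theta, e, h_t, h_next, r_t=1.0)
```

The trace lets a single sparse terminal reward propagate credit back over the whole rotation sequence instead of updating only the final step.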
CN202211604958.2A 2022-12-14 2022-12-14 Autonomous multi-angle joint detection method for appearance of non-lambertian body Pending CN116124787A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211604958.2A CN116124787A (en) 2022-12-14 2022-12-14 Autonomous multi-angle joint detection method for appearance of non-lambertian body

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211604958.2A CN116124787A (en) 2022-12-14 2022-12-14 Autonomous multi-angle joint detection method for appearance of non-lambertian body

Publications (1)

Publication Number Publication Date
CN116124787A true CN116124787A (en) 2023-05-16

Family

ID=86296451

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211604958.2A Pending CN116124787A (en) 2022-12-14 2022-12-14 Autonomous multi-angle joint detection method for appearance of non-lambertian body

Country Status (1)

Country Link
CN (1) CN116124787A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117908495A (en) * 2024-03-12 2024-04-19 苏州昊信精密科技有限公司 Flexible high-precision processing system and processing method based on multiple sensors
CN117908495B (en) * 2024-03-12 2024-07-05 苏州昊信精密科技有限公司 Flexible high-precision processing system and processing method based on multiple sensors

Similar Documents

Publication Publication Date Title
CN108334936B (en) Fault prediction method based on migration convolutional neural network
CN109305534B (en) Self-adaptive control method of belt conveyor of coal wharf based on computer vision
CN109523013B (en) Air particulate matter pollution degree estimation method based on shallow convolutional neural network
CN108021947B (en) A kind of layering extreme learning machine target identification method of view-based access control model
CN110598767A (en) SSD convolutional neural network-based underground drainage pipeline defect identification method
CN109934810B (en) Defect classification method based on improved particle swarm wavelet neural network
CN108956614A (en) A kind of pit rope dynamic method for detection fault detection and device based on machine vision
CN111539132B (en) Dynamic load time domain identification method based on convolutional neural network
CN109919934A (en) A kind of liquid crystal display panel defect inspection method based on the study of multi-source domain depth migration
CN115070780B (en) Industrial robot grabbing method and device based on digital twinning and storage medium
CN116124787A (en) Autonomous multi-angle joint detection method for appearance of non-lambertian body
CN113033555B (en) Visual SLAM closed loop detection method based on metric learning
CN116245826A (en) DCGAN-based part surface defect detection method
CN116049937A (en) Cross-domain bridge damage identification method based on deep learning
CN117289652A (en) Numerical control machine tool spindle thermal error modeling method based on multi-universe optimization
CN113689383B (en) Image processing method, device, equipment and storage medium
CN116630751B (en) Trusted target detection method integrating information bottleneck and uncertainty perception
CN117291898A (en) Surface defect detection method, system and equipment
CN113420590A (en) Robot positioning method, device, equipment and medium in weak texture environment
CN116958074A (en) Object surface defect detection method based on yolov7
CN115239034B (en) Method and system for predicting early defects of wind driven generator blade
CN106951888B (en) Relative coordinate constraint method and positioning method of human face characteristic point
Lee et al. Accelerating multi-class defect detection of building façades using knowledge distillation of DCNN-based model
CN117011219A (en) Method, apparatus, device, storage medium and program product for detecting quality of article
CN112288078B (en) Self-learning, small sample learning and migration learning method and system based on impulse neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination