CN112257622A

CN112257622A - Road crack segmentation method based on genetic algorithm and U-shaped neural network

Info

Publication number: CN112257622A
Application number: CN202011172312.2A
Authority: CN
Inventors: 朱贵杰; 韦家弘; 范衠; 马培立; 黄文宁; 李晓明; 林培涵; 叶志豪
Original assignee: Shantou University
Current assignee: Shantou University
Priority date: 2020-10-28
Filing date: 2020-10-28
Publication date: 2021-01-22
Anticipated expiration: 2040-10-28
Also published as: CN112257622B

Abstract

The invention provides a road crack segmentation method based on a genetic algorithm and a U-shaped neural network. The genetic algorithm searches the architecture of a fully convolutional neural network with a U-shaped encoder-decoder structure, so that the network is designed automatically. This addresses the drawbacks of manually designed road crack segmentation networks, namely tedious and extensive design work, overly complex models, and inaccurate crack segmentation under complex conditions, and enables road cracks to be segmented automatically and accurately. The method finds, for the modules of the network, structures and operations that are better than manual designs and have lower computational complexity. The resulting model is robust to interference in complex road surface images, to the non-uniformity of pavement defects, and to uneven illumination, and it extracts crack features more accurately, thereby improving the segmentation accuracy of the whole image.

Description

Road crack segmentation method based on genetic algorithm and U-shaped neural network

Technical Field

The invention belongs to the technical field of structural health monitoring and image processing, and particularly relates to a road crack segmentation method based on a genetic algorithm and a U-shaped neural network.

Background

With the development of the transportation industry, road maintenance has become very important. Cracks are the most common form of road damage, and detecting pavement defects is a prerequisite for subsequent maintenance and repair, so road crack detection is indispensable. In practice, cracks are distributed in a disordered and irregular way and are easily obscured by surrounding obstacles, which leads to missed and false detections and poses serious safety hazards for the road.

Traditionally, road cracks are identified on site by road maintenance personnel. Even when camera equipment is used for image acquisition, the cracks still have to be identified and labeled by hand. Because observers differ in experience and subjective judgment, different observers give different results even for the same road crack image. Manual road crack identification therefore consumes a great deal of labor and cannot guarantee either accuracy or efficiency. Traditional image processing methods place high demands on image quality, are complex and time-consuming to operate, and yield unsatisfactory results with low accuracy. Unsupervised methods usually require additional conditions to be satisfied, also demand high image quality, and identify cracks with low precision. In supervised methods, the neural network model extracts features from complex road surface images more effectively and has clear advantages over traditional image processing and unsupervised methods under complex conditions. However, existing manually designed neural network models for road crack detection still have limitations when faced with complex road surface images, and their computational complexity is high, so crack identification is slow and inefficient. Manually designing a high-performance neural network model also tends to require a large amount of repetitive work and places high demands on the designer's experience and knowledge.

These prior art techniques currently suffer from the following problems:

1. For the crack segmentation task, existing manually designed neural networks are time-consuming to train, have many parameters, and generally consume considerable memory and computation, so they are difficult to run on devices without sufficient computing power.

Moreover, with complex road surface images it is still difficult to recover the crack structure accurately, and the segmentation of intricate intersecting cracks and fine cracks is unsatisfactory.

2. Traditional image processing methods segment cracks with low accuracy in road images with complex backgrounds. Manually designed neural network methods place high demands on the road crack images used for training, validation, and testing; if the sample images are of poor quality, the resulting network identifies and segments cracks poorly.

3. A manually designed neural network model depends heavily on human knowledge and experience. Designing an efficient and accurate road crack segmentation model by hand takes a long time, the model cannot be guaranteed to be optimal, and a satisfactory model may not be obtained at all.

The causes are as follows:

1. Complexity of road surface images: motion blur during acquisition reduces image quality, particularly for narrow crack branches, which makes accurate segmentation of small crack branches difficult. Traditional digital image processing places high demands on crack images, and problems such as uneven illumination and cracks that are hard to distinguish from the background may also be present. Interference from shadows, water, and oil stains is also common.

2. Interference from the image background: besides interference introduced by people or equipment during image acquisition, roads commonly carry debris such as leaves, sand, and stones. More severe interference occurs on wet road surfaces in overcast or rainy weather, where traditional digital image segmentation methods struggle to achieve high precision.

3. Limitations of empirical knowledge: manual neural network design relies heavily on human experience and lacks solid theoretical support. A design is usually hypothesized first and then verified experimentally, and it is adopted if it works. To improve performance, networks tend to be made larger and more complex, so the models tend to have higher computational complexity. Moreover, because human effort and knowledge are limited, a manually designed architecture is generally not an optimal architecture.

In summary, because of image background interference, the complexity of road surface images, and the limitations of human experience and knowledge, automatically designing a lightweight neural network model for road crack segmentation is both challenging and necessary.

Disclosure of Invention

The invention aims to provide a road crack segmentation method based on a genetic algorithm and a U-shaped neural network, so as to solve one or more technical problems in the prior art and provide at least one beneficial selection or creation condition.

The invention provides a road crack segmentation method based on a genetic algorithm and a U-shaped neural network. The genetic algorithm searches the architecture of a fully convolutional neural network with a U-shaped encoder-decoder structure so that the network is designed automatically, addressing the drawbacks of manually designed road crack segmentation models: tedious and extensive design work, overly complex models, and inaccurate crack segmentation under complex conditions.

The invention combines the advantages of an evolutionary algorithm and a neural network to segment cracks on the road surface efficiently and accurately. First, a road crack data set is prepared and divided into a training set, a validation set, and a test set. Second, the search space of the network model is defined with the U-Net network as the basic framework, a genetic algorithm searches the internal structures of the different modules of the U-Net network, and a lightweight U-shaped convolutional neural network model is obtained using the crack training set and validation set. Finally, the test set is used to test and verify the performance of the automatically designed model, road cracks are segmented, and the segmentation result maps are displayed. With the genetic algorithm, a lightweight neural network model that segments road cracks efficiently can be designed automatically, and the resulting model has low computational complexity.

In order to achieve the above object, there is provided a road crack segmentation method based on a genetic algorithm and a U-shaped neural network, the method comprising the steps of:

S100: reading a road crack data set;

S200: constructing a neural network model;

S300: optimizing the neural network model to obtain an optimized neural network model;

S400: training the optimized neural network model using a training set divided from the road crack data set to obtain a road crack segmentation model;

S500: performing road crack segmentation on an input road surface image using the road crack segmentation model.

Further, in S100, the road crack data set may be a public data set such as CFD or AigleRN, may be acquired with a handheld smart terminal such as a mobile phone or a camera, may be captured by a mobile robot or an unmanned aerial vehicle carrying a smartphone or camera, or may be downloaded from the network. To increase the number of samples, data obtained in several of these ways may be combined. In the present invention, the public road crack data set CFD is used to design the neural network model. The CFD data set is divided into a training set, a validation set, and a test set containing 50, 22, and 36 photos respectively. Apart from public data sets, images obtained in the other ways need to be labeled and are then divided into a training set, a validation set, and a test set according to a chosen proportion or number.

Further, in S100, the road crack data set is divided into a training set, a validation set, and a test set.
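As an illustration of the 50/22/36 division of the 108 CFD photos, the following minimal sketch shuffles a list of items and cuts it into three parts. The text does not say how photos are assigned to each set, so the seeded random shuffle, the function name `split_dataset`, and the file names are assumptions made only for this example.

```python
import random


def split_dataset(items, sizes=(50, 22, 36), seed=0):
    """Shuffle `items` and cut them into train/validation/test lists."""
    if sum(sizes) != len(items):
        raise ValueError("split sizes must add up to the number of items")
    shuffled = list(items)
    # Assumption: a seeded random shuffle; the source does not specify the assignment.
    random.Random(seed).shuffle(shuffled)
    n_train, n_valid, _ = sizes
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


# Hypothetical file names standing in for the 108 CFD photos.
photos = [f"cfd_{i:03d}.jpg" for i in range(108)]
train, valid, test = split_dataset(photos)
```

Keeping the test list untouched until the search has finished matches the requirement that test photos are not used during the architecture search.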

Further, in S200, the neural network model is constructed as follows. The search uses a U-shaped neural network (such as the U-Net network) as the basic skeleton, consisting of an encoder E and a decoder D, where the encoder E comprises encoding modules e_i (i = 0, 1, 2, 3) and the decoder D comprises decoding modules d_j (j = 0, 1, 2). From top to bottom, the U-shaped network is divided into stages S_k (k = 0, 1, 2, 3), and the feature map dimensions within a stage are kept unchanged. The U-shaped structure adopts 4 stages, comprising 4 encoding modules and 3 decoding modules, although a number of stages other than 4 may also be used. Except in the last stage, each encoding module and its corresponding decoding module (i = j) are linked by a skip connection that passes the semantic information extracted by the encoder E to the decoder D, which strengthens the link between decoder and encoder and alleviates the vanishing gradient problem during training. The decoder D needs to fuse the information from the skip connection with the information from upsampling; the two can be fused either by concatenation or by element-wise addition. To reduce the computational complexity of the model, element-wise addition (adding the values at corresponding positions) is selected for feature fusion.
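The difference between the two fusion options can be seen on tiny feature maps. The sketch below represents a feature map as nested lists shaped [channel][row][column]; the function names and the toy values are illustrative assumptions. Addition keeps the channel count unchanged, while concatenation doubles it, so any convolution that follows a concatenation has to process twice as many input channels.

```python
def fuse_add(skip, up):
    """Element-wise addition of two feature maps shaped [channel][row][col]."""
    return [[[a + b for a, b in zip(row_s, row_u)]
             for row_s, row_u in zip(ch_s, ch_u)]
            for ch_s, ch_u in zip(skip, up)]


def fuse_concat(skip, up):
    """Concatenation along the channel axis."""
    return skip + up


# Toy inputs: 2 channels of 2x2 values each.
skip = [[[1, 1], [1, 1]] for _ in range(2)]   # from the skip connection
up = [[[2, 2], [2, 2]] for _ in range(2)]     # from upsampling
added = fuse_add(skip, up)       # still 2 channels
stacked = fuse_concat(skip, up)  # 4 channels
```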

Further, the encoding modules and decoding modules in the neural network model are collectively referred to as modules. The internal structure of each module consists of nodes and edges connecting the nodes; each node represents an operation unit or an operation sequence, and each edge indicates that two nodes are connected. Binary coding is used to represent the connections between nodes inside a module. The nodes in a module are first divided into two types, default nodes and intermediate nodes. The default nodes comprise a default input node and a default output node, whose function is to ensure that every binary code is valid. The default input node receives the data output by the previous module and passes it to every node that has no predecessor. The default output node receives the output data of all nodes that have no successor, adds the data together and processes it, and then passes the result to the pooling layer (max pooling). For the K intermediate nodes v_k (k = 0, 1, 2, ..., K-1) in the module, K(K-1)/2 bits are used to encode the connections between the nodes: the first bit represents (v₀, v₁), the next two bits represent (v₀, v₂) and (v₁, v₂), and so on, until the last K-1 bits represent the connections between v₀, v₁, ..., v_K-2 and v_K-1. To keep the module from becoming too complex, the number of intermediate nodes is limited to a small range, and the method sets K = 5. If the bit corresponding to two nodes is 1, the two nodes are connected and the later node takes the output of the earlier node as part of its input; if the bit is 0, the two nodes are not connected. A node adds all of its inputs together before processing them.
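The connection encoding can be illustrated with a small decoder. With K = 5 intermediate nodes there are 1 + 2 + 3 + 4 = 10 node pairs, which agrees with the 10-bit connection gene. The sketch below is only a reading of the bit order described above; the function names are hypothetical, and treating a node with no edges at all as both fed by the default input and feeding the default output is a literal interpretation, not something the text states explicitly.

```python
K = 5  # number of intermediate nodes per module, as set in the text


def decode_connections(bits, k=K):
    """Map a connection gene to directed edges (i, j) with i < j.

    Bit order follows the text: (v0,v1), then (v0,v2),(v1,v2), and so on.
    """
    expected = k * (k - 1) // 2
    if len(bits) != expected:
        raise ValueError(f"expected {expected} bits, got {len(bits)}")
    # Pairs grouped by the later node j, with earlier nodes i in ascending order.
    pairs = [(i, j) for j in range(1, k) for i in range(j)]
    return [pair for pair, bit in zip(pairs, bits) if bit == 1]


def boundary_nodes(edges, k=K):
    """Return (nodes without predecessor, nodes without successor)."""
    has_pred = {j for _, j in edges}
    has_succ = {i for i, _ in edges}
    sources = [n for n in range(k) if n not in has_pred]  # fed by the default input
    sinks = [n for n in range(k) if n not in has_succ]    # feed the default output
    return sources, sinks


gene = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
edges = decode_connections(gene)          # chain v0 -> v1 -> v2 -> v3
sources, sinks = boundary_nodes(edges)
```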

Further, 16 operation sequences are constructed in the neural network model as the operation options for the nodes in a module. Each operation sequence is composed of several basic operation units, including 3 × 3 convolution (Conv), 5 × 5 convolution (Conv), the ReLU activation function, the Mish activation function, and instance normalization. To search for a lightweight architecture, the number C of convolution kernels in each operation sequence is limited to a small range, and C is set to 20. The operation sequences differ mainly in convolution kernel size, activation function, activation mode (pre-activation or post-activation), and normalization type (whether instance normalization is used), so a 4-bit binary code is used to represent them. Assuming that all nodes within a module use the same operation sequence, the 4 encoding modules and 3 decoding modules of the U-Net network each correspond to one module gene. Each module gene consists of a 4-bit operation gene and a 10-bit connection gene: the operation gene encodes the operation sequence, that is, the convolution operation (3 × 3 or 5 × 5), the activation function (ReLU or Mish), the activation mode (pre-activation or post-activation), and the normalization type (whether instance normalization is used); the connection gene encodes the connections among the nodes. The 7 module genes together constitute the genotype of a neural architecture.
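Four binary choices give 2⁴ = 16 operation sequences, and 7 module genes of 4 + 10 bits give a genotype of 98 bits. The sketch below splits a flat genotype into module genes and decodes an operation gene. The order of the four bits, the meaning assigned to 0 and 1, and the placement of the operation gene before the connection gene are assumptions for illustration; the text does not fix them.

```python
MODULES = 7       # 4 encoding + 3 decoding modules
OP_BITS = 4       # operation gene
CONN_BITS = 10    # connection gene for K = 5 intermediate nodes
MODULE_BITS = OP_BITS + CONN_BITS


def decode_operation(op_bits):
    """Decode a 4-bit operation gene (bit order and values are assumed)."""
    kernel_bit, act_bit, order_bit, norm_bit = op_bits
    return {
        "kernel_size": 5 if kernel_bit else 3,
        "activation": "Mish" if act_bit else "ReLU",
        "pre_activation": bool(order_bit),
        "instance_norm": bool(norm_bit),
    }


def split_genotype(genotype):
    """Split a flat genotype into (operation gene, connection gene) per module."""
    if len(genotype) != MODULES * MODULE_BITS:
        raise ValueError("unexpected genotype length")
    modules = []
    for m in range(MODULES):
        chunk = genotype[m * MODULE_BITS:(m + 1) * MODULE_BITS]
        modules.append((chunk[:OP_BITS], chunk[OP_BITS:]))
    return modules


genotype = [0] * (MODULES * MODULE_BITS)   # 98 bits in total
module_genes = split_genotype(genotype)
example_op = decode_operation([1, 0, 1, 1])
```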

Further, the method for optimizing the neural network model to obtain the optimized neural network model comprises the following steps:

Each individual represents a neural architecture, and the fitness value of an individual depends on the performance of the corresponding architecture. After new individuals are generated, the architectures they represent are trained from scratch on the provided data to obtain their fitness values;

S301: a population of N individuals is randomly initialized, where N is the population size (N = 20), and is then evolved for T generations, where T is the maximum number of generations (T = 50); each generation comprises crossover, mutation, and selection operations, where p_c is the crossover probability, p_m is the mutation probability, and p_b is the per-bit mutation probability;

S302: randomly initialize N individuals with binary codes as the initial population P₀;

S303: evaluate the fitness values of the individuals in population P₀;

S304: set the initial value of the variable t to 0, where t is a natural number in the range [0, T], and let Q_t be the offspring population;

S305: set Q_t to the empty set;

S306: select two parent individuals p₁ and p₂ from P_t;

S307: apply crossover and mutation to the parent individuals p₁ and p₂ with probabilities p_c, p_m, and p_b to generate two offspring individuals q₁ and q₂;

S308: if |Q_t| < N, assign Q_t ∪ {q₁, q₂} to Q_t and go to step S306; otherwise go to step S309;

S309: evaluate the fitness values of the individuals in population Q_t;

S310: use the environmental selection method to select N individuals from P_t ∪ Q_t into the population P_t+1;

S311: if t < T, increment the variable t by 1 and go to step S305; otherwise go to step S312;

S312: output the individual with the largest fitness value in population P_t.

The two individuals p₁ and p₂ are different individuals in each selection.
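The overall loop of steps S301 to S312 can be sketched as a generic driver that takes the evaluation, offspring generation, and environmental selection as caller-supplied functions. Everything below is a minimal illustration: the function `evolve`, the toy operators, and the use of `sum` (the number of 1 bits) as a stand-in fitness are assumptions. In the method itself the fitness comes from training a network, which is far too slow to reproduce here.

```python
import random


def evolve(evaluate, make_offspring, select_next, n=20, gene_len=98,
           generations=50, seed=0):
    """Generational loop in the spirit of S301-S312.

    evaluate(individual) -> float
    make_offspring(pop, fits, rng) -> list of two children
    select_next(candidates, fits, n, rng) -> list of n (individual, fitness) pairs
    """
    rng = random.Random(seed)
    # S302-S303: random binary population and its fitness values.
    pop = [[rng.randint(0, 1) for _ in range(gene_len)] for _ in range(n)]
    fits = [evaluate(ind) for ind in pop]
    for _ in range(generations):                      # S304, S311
        offspring = []                                # S305
        while len(offspring) < n:                     # S306-S308
            offspring.extend(make_offspring(pop, fits, rng))
        offspring = offspring[:n]
        off_fits = [evaluate(ind) for ind in offspring]          # S309
        chosen = select_next(pop + offspring, fits + off_fits, n, rng)  # S310
        pop = [ind for ind, _ in chosen]
        fits = [fit for _, fit in chosen]
    best = max(range(n), key=fits.__getitem__)        # S312
    return pop[best], fits[best]


def toy_offspring(pop, fits, rng):
    """Placeholder operator: copy two random parents and flip one bit in each."""
    kids = []
    for parent in rng.sample(pop, 2):
        kid = parent[:]
        kid[rng.randrange(len(kid))] ^= 1
        kids.append(kid)
    return kids


def toy_select(candidates, fits, n, rng):
    """Placeholder selection: keep the n fittest candidates."""
    ranked = sorted(zip(candidates, fits), key=lambda p: p[1], reverse=True)
    return ranked[:n]


best, best_fit = evolve(sum, toy_offspring, toy_select)
```

Passing the operators as arguments keeps the loop independent of how crossover, mutation, and selection are actually implemented.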

Further, regarding the selection of the two parent individuals p₁ and p₂ from P_t in S306 and the environmental selection in S310, the environmental selection method is as follows: the best S individuals (S = 5) are selected from the current population P_t and the generated offspring Q_t to enter the next-generation population P_t+1, and these selected individuals are removed from P_t ∪ Q_t. The binary tournament selection method is then used to select further individuals from P_t ∪ Q_t into the next-generation population until P_t+1 has the same size as the current population P_t.
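A minimal sketch of this environmental selection is given below: the top S individuals pass directly, and the remaining places are filled by binary tournaments over what is left. The function name is hypothetical, and removing each tournament winner from the pool (so that no individual is selected twice) is an assumption, since the text only states that the S elite individuals are removed.

```python
import random


def environmental_selection(candidates, fitnesses, n=20, elite=5, rng=None):
    """Pick n survivors from the merged parents and offspring."""
    rng = rng or random.Random()
    pool = sorted(zip(candidates, fitnesses), key=lambda p: p[1], reverse=True)
    survivors = pool[:elite]      # the best S individuals go straight through
    pool = pool[elite:]           # and are removed from the pool
    while len(survivors) < n and pool:
        if len(pool) == 1:
            winner = 0
        else:
            a, b = rng.sample(range(len(pool)), 2)   # binary tournament
            winner = a if pool[a][1] >= pool[b][1] else b
        # Assumption: winners are removed so they cannot be chosen again.
        survivors.append(pool.pop(winner))
    return survivors


cands = [[i] for i in range(40)]      # 20 parents + 20 offspring (toy data)
fits = list(range(40))
chosen = environmental_selection(cands, fits, rng=random.Random(1))
```

Because the tournaments are random, weaker individuals can also survive, which preserves some diversity alongside the elite.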

Further, the two parent individuals p₁ and p₂ in S306 are selected and crossed as follows. The crossover operation of a genetic algorithm serves to exchange information among the individuals of the population, and effective information exchange helps the algorithm converge. Traditional genetic algorithms usually generate offspring by single-point or two-point crossover, but the search step of these two operations is small, and when the gene code is long they improve the performance of the algorithm only marginally. Multipoint crossover has a larger search step than single-point and two-point crossover and performs better on long gene codes; since the individuals in this method have long gene codes, multipoint crossover is selected for generating offspring. In addition, if two similar individuals are crossed, the offspring are likely to be very similar to their parents, in which case the crossover is unlikely to be meaningful. To preserve the search capability of the algorithm, two individuals p₁ and p₂ are selected by binary tournament selection and their difference value diff is calculated by formula (1). If diff is larger than the preset threshold μ = 0.2, the two individuals are taken as the parents to be crossed; otherwise p₁ and p₂ are reselected in the same way. If p₁ and p₂ still do not meet the requirement after 10 reselections, the individuals selected last are taken as the parents to be crossed.
The selected parent individuals then undergo the difference-guided crossover operation with probability p_c = 0.9.

The difference value between two individuals is calculated as follows:

$$\mathrm{diff} = \frac{\mathrm{sum}(\mathrm{XOR}(p_1, p_2))}{L_{gene}} \qquad (1)$$

where p₁ and p₂ denote two individuals, L_gene is the length of an individual's gene, sum() is the summation function, and XOR() is the exclusive-or function.
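Read from the symbol definitions, the difference value is the number of positions where the two binary codes disagree, divided by the gene length, which is a normalized Hamming distance between 0 and 1. The short sketch below assumes that reading; the function name is hypothetical.

```python
def difference(p1, p2):
    """Fraction of gene positions at which p1 and p2 differ."""
    if len(p1) != len(p2):
        raise ValueError("individuals must have the same gene length")
    return sum(a ^ b for a, b in zip(p1, p2)) / len(p1)


p1 = [1, 0, 1, 1, 0]
p2 = [1, 1, 0, 1, 0]
d = difference(p1, p2)   # two of five bits differ
```

With a threshold of 0.2, this toy pair would qualify as parents, since 2 / 5 exceeds the threshold.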

Further, the method of differential guided crossover operation comprises the steps of:

A301: let P_t be the current population, p_c the crossover probability, and μ the difference threshold; the symbol ← denotes assignment; set the initial value of the variable j to 0;

A302: randomly select two different individuals from population P_t, and take the one with the larger fitness value as p₁;

A303: again randomly select two different individuals from population P_t, and take the one with the larger fitness value as p₂, where p₁ is not equal to p₂;

A304: calculate the difference value diff of individuals p₁ and p₂;

A305: if diff is greater than μ, go to step A306; otherwise increase the variable j by 1 and check whether j is greater than or equal to 10: if j ≥ 10, go to step A306, and if j < 10, go to step A302;

A306: randomly generate a number r from (0, 1);

A307: if r < p_c, calculate the gene length len of individuals p₁ and p₂; otherwise go to step A312;

A308: randomly select 10 different integers from [0, len) and sort them in ascending order to obtain the sequence ints;

A309: divide ints in order into 5 pairs (i₀, i₁), (i₂, i₃), (i₄, i₅), (i₆, i₇), (i₈, i₉), and set the initial value of the variable k to 0;

A310: exchange p₁[i_2k : i_2k+1] and p₂[i_2k : i_2k+1], where p₁[i_2k : i_2k+1] denotes the bits of individual p₁ from position i_2k to position i_2k+1;

A311: increase the variable k by 1 and check whether k is greater than or equal to 5: if k ≥ 5, go to step A312, and if k < 5, go to step A310;

A312: assign the individuals p₁ and p₂ to o₁ and o₂;

A313: output the crossed individuals o₁ and o₂.
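Steps A301 to A313 can be sketched as two small functions, one for choosing sufficiently different parents and one for the multipoint exchange. This is an illustrative reading, not the patented implementation: the function names are hypothetical, the difference value is assumed to be the normalized Hamming distance, p₂ is drawn from the individuals other than p₁ to guarantee that the two differ, and the exchanged segment is taken as the half-open slice from the first cut point up to (not including) the second.

```python
import random


def difference(p1, p2):
    """Assumed difference value: fraction of differing bits."""
    return sum(a ^ b for a, b in zip(p1, p2)) / len(p1)


def tournament(indices, fits, rng):
    """Binary tournament: pick two distinct indices, return the fitter one."""
    a, b = rng.sample(indices, 2)
    return a if fits[a] >= fits[b] else b


def select_parents(pop, fits, rng, mu=0.2, max_tries=10):
    """A301-A305: retry up to max_tries times until diff exceeds mu."""
    everyone = list(range(len(pop)))
    for _ in range(max_tries):
        i = tournament(everyone, fits, rng)
        others = [x for x in everyone if x != i]   # assumption: ensures p1 != p2
        j = tournament(others, fits, rng)
        if difference(pop[i], pop[j]) > mu:
            break
    return pop[i], pop[j]                          # last pair is used if all tries fail


def multipoint_crossover(p1, p2, rng, pc=0.9, points=10):
    """A306-A313: swap the segments between consecutive pairs of cut points."""
    o1, o2 = p1[:], p2[:]
    if rng.random() < pc:
        ints = sorted(rng.sample(range(len(o1)), points))
        for k in range(points // 2):
            lo, hi = ints[2 * k], ints[2 * k + 1]
            # Assumption: half-open slice [lo, hi).
            o1[lo:hi], o2[lo:hi] = o2[lo:hi], o1[lo:hi]
    return o1, o2


rng = random.Random(0)
pop = [[rng.randint(0, 1) for _ in range(98)] for _ in range(20)]
fits = [sum(ind) for ind in pop]                   # toy fitness for the demo
parent1, parent2 = select_parents(pop, fits, rng)
child1, child2 = multipoint_crossover(parent1, parent2, rng)
```

At every position the two children together hold exactly the bits the two parents held, so crossover only redistributes existing genetic material.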

Further, the method of evaluating the fitness value comprises the steps of:

B301: let the population whose fitness values are to be evaluated be P_t, the training data be D_train, the validation data be D_valid, and individual denote an individual in P_t;

B302: convert individual into its corresponding neural network architecture arch;

B303: initialize the weight parameters of architecture arch, and initialize the training epoch counter epoch ← 0 and the best fitness value F1-score_best ← 0, where the symbol ← denotes assignment;

B304: train architecture arch for one epoch on the training data D_train using a gradient descent algorithm, and increase epoch by 1;

B305: if epoch > 80, validate the architecture arch under training on the validation data D_valid to obtain the fitness value F1-score;

B306: if F1-score > F1-score_best, set F1-score_best ← F1-score;

B307: if epoch < 130, go to step B304; otherwise set F1-score_best as the fitness value of individual and go to step B308;

B308: check whether the fitness values of all individuals in P_t have been evaluated; if so, go to step B309; if not, go to step B302 and evaluate an individual that has not yet been evaluated;

B309: output the population P_t with evaluated fitness values.
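The control flow of steps B301 to B309 can be sketched independently of any deep learning framework by passing the decoding, training, and validation routines in as functions. All three callables and the function names below are placeholders, and weight initialization (B303) is assumed to happen inside `decode`. The stub functions in the usage part only count calls so that the loop bounds can be checked.

```python
def evaluate_individual(individual, decode, train_one_epoch, validate,
                        start_validation=80, max_epochs=130):
    """B302-B307 for one individual; returns the best validation F1-score."""
    arch = decode(individual)              # B302 (and B303, assumed inside decode)
    best_f1 = 0.0
    epoch = 0
    while epoch < max_epochs:              # B307
        train_one_epoch(arch)              # B304
        epoch += 1
        if epoch > start_validation:       # B305
            best_f1 = max(best_f1, validate(arch))   # B306
    return best_f1


def evaluate_population(population, **kwargs):
    """B308-B309: evaluate every individual in turn."""
    return [evaluate_individual(ind, **kwargs) for ind in population]


# Stub routines that only count how often they are called.
calls = {"train": 0, "validate": 0}


def stub_train(arch):
    calls["train"] += 1


def stub_validate(arch):
    calls["validate"] += 1
    return calls["validate"] / 100          # made-up, steadily rising score


fitness = evaluate_individual([0, 1], decode=lambda g: g,
                              train_one_epoch=stub_train, validate=stub_validate)
```

With the stated limits, each individual is trained for 130 epochs and validated after epochs 81 through 130, that is, 50 times.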

Further, the selection and mutation operations are as follows. In the selection process, both well-performing individuals and relatively poorly performing individuals may be selected into the next-generation population. The best S individuals (S = 5 in this embodiment) are selected from the current population P_t and the generated offspring Q_t to enter the next-generation population P_t+1, and these selected individuals are removed from P_t ∪ Q_t. The binary tournament selection method is then used to select further individuals from P_t ∪ Q_t into the next-generation population until P_t+1 has the same size as the current population P_t. The offspring individuals generated by the crossover operation are mutated with probability p_m = 0.7; once an individual has been chosen for mutation, each bit of its gene is inverted with probability p_b = 0.05.
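The two-level mutation can be sketched in a few lines: a first random draw decides whether the individual mutates at all, and if it does, every bit is flipped independently. The function name is hypothetical. For a 98-bit genotype and p_b = 0.05, a mutated individual changes in about 98 × 0.05 = 4.9 bits on average.

```python
import random


def mutate(individual, rng, pm=0.7, pb=0.05):
    """Return a possibly mutated copy of a binary individual."""
    child = individual[:]
    if rng.random() < pm:              # does this individual mutate at all?
        for i in range(len(child)):
            if rng.random() < pb:      # independent flip for each bit
                child[i] ^= 1
    return child


rng = random.Random(0)
original = [0] * 98
mutated = mutate(original, rng)
```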

Further, the fitness value of an individual is the F1-score that its corresponding architecture achieves for road crack segmentation on the provided data. The method of evaluating fitness values summarizes the steps for evaluating the individuals in a population, and every individual is evaluated in this way. Before evaluation starts, each individual is decoded into its corresponding neural architecture. Before training starts, He initialization is used to initialize the weight parameters of the architecture; the architecture is then trained on the provided training data using the Lookahead method with Adam as the base optimizer. From epoch 80 until the end of epoch 130, the validation data is used after each training epoch to measure the F1-score of the architecture under training. After training stops, the best F1-score obtained on the validation data during training is set as the fitness value of the corresponding individual. The evaluation does not train the architecture to convergence but uses an early-stopping strategy.
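For reference, the F1-score is the harmonic mean of precision and recall, F1 = 2·TP / (2·TP + FP + FN). The sketch below computes it pixel by pixel on flattened binary masks. Whether the method scores pixels exactly or with some spatial tolerance is not stated in the text, so the strict pixel-wise comparison, the function name, and the value returned for two empty masks are assumptions.

```python
def f1_score(pred, target):
    """Pixel-wise F1 for two flat binary masks (1 = crack, 0 = background)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    # Convention chosen here: 0.0 when neither mask contains a crack pixel.
    return 2 * tp / denom if denom else 0.0


pred = [1, 1, 0, 0, 1]
target = [1, 0, 0, 1, 1]
score = f1_score(pred, target)   # TP = 2, FP = 1, FN = 1
```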

Further, in the present embodiment, the data for the architecture search comes from the CFD data set, which has a total of 108 road surface photos with their corresponding labels and is divided into a training set, a validation set, and a test set containing 50, 22, and 36 photos respectively. During the architecture search, the photos of the test set are not used: the training set is used to train each model to be evaluated, and the validation set is used to evaluate the architecture corresponding to each individual in order to obtain the fitness value of that individual.

The beneficial effects of the invention are as follows. The invention provides a method for automatically designing a lightweight neural network model for road crack segmentation. It searches and optimizes the internal structure of the modules in a U-shaped encoder-decoder structure serving as the backbone network, thereby finding structures and operations for the modules that are better than manual designs and have lower computational complexity. The neural network model designed by the method handles complex road surface images more effectively, is more robust to interference in complex road surface images, to the non-uniformity of pavement defects, and to uneven illumination, and extracts crack features more accurately, thereby improving the segmentation accuracy of the whole image. The invention designs a more reasonable and compact search space, which preserves the flexibility of the architecture while improving the efficiency of the architecture search, and it improves the crossover operation of the genetic algorithm, which strengthens the search capability of the genetic algorithm during the architecture search. Because the method designs the lightweight model automatically, it reduces the manual design workload and the dependence on expert knowledge. Compared with other models, the designed model has lower computational complexity and better crack segmentation performance, and it has greater potential for application to automated structural health diagnosis in engineering.

Drawings

The above and other features of the present invention will become more apparent by describing in detail embodiments thereof with reference to the attached drawings in which like reference numerals designate the same or similar elements, it being apparent that the drawings in the following description are merely exemplary of the present invention and other drawings can be obtained by those skilled in the art without inventive effort, wherein:

FIG. 1 illustrates a skeletal diagram of a search space;

FIG. 2 is a schematic diagram of a connection relationship between two encoding nodes;

FIG. 3 is a genotype diagram showing the modular genes and architecture;

FIG. 4 shows a diagram of Top one architecture;

FIG. 5 shows a diagram of the Top two architecture;

FIG. 6 is a graph comparing the effect of the optimized neural network model and the U-Net road crack segmentation.

Detailed Description

The conception, the specific structure and the technical effects of the present invention will be clearly and completely described in conjunction with the embodiments and the accompanying drawings to fully understand the objects, the schemes and the effects of the present invention. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.

The invention provides a road crack segmentation method based on a genetic algorithm and a U-shaped neural network, which specifically comprises the following steps:

1. constructing a road crack data set, and dividing the road crack data set into a training set, a verification set and a test set;

The road crack data set may be a public data set such as CFD or AigleRN, may be acquired with a handheld smart terminal such as a mobile phone or a camera, may be captured by a mobile robot or an unmanned aerial vehicle carrying a smartphone or camera, or may be downloaded from the network. To increase the number of samples, data obtained in several of these ways may be combined. In an embodiment of the invention, we use the public road crack data set CFD to design the neural network model. The CFD data set is divided into a training set, a validation set, and a test set with 50, 22, and 36 photographs, respectively.

2. Designing an optimal neural network model for road crack segmentation;

firstly, designing a neural architecture search space for road crack segmentation, then coding the search space by using a binary coding mode, and then searching and optimizing a neural network architecture by using a genetic algorithm.

(1) Designing and encoding a search space

As shown in FIG. 1, the invention uses U-Net as the basic neural network skeleton for the search. It is composed of an encoder E and a decoder D, where the encoder E comprises encoding modules e_i (i = 0, 1, 2, 3) and the decoder D comprises decoding modules d_j (j = 0, 1, 2). From top to bottom, U-Net is divided into stages S_k (k = 0, 1, 2, 3), and the feature map dimensions within a stage are kept constant. In this embodiment, the U-shaped structure employs 4 stages, comprising 4 encoding modules and 3 decoding modules. It will be appreciated that in a practical embodiment the U-shaped structure may comprise a number of stages other than 4. Except in the last stage, each encoding module and its corresponding decoding module (i = j) are linked by a skip connection that passes the semantic information extracted by the encoder E to the decoder D, which strengthens the link between decoder and encoder and alleviates the vanishing gradient problem during training. The decoder D needs to fuse the information from the skip connection with the information from upsampling; the two can be fused either by concatenation or by element-wise addition. To reduce the computational complexity of the model, we have chosen element-wise addition (adding the values at corresponding positions) for feature fusion in this embodiment.

(b) Module and coding thereof

In our method, the internal structure of each module consists of nodes and the edges connecting them; each node represents an operation unit or an operation sequence, and each edge represents a connection between two nodes. FIG. 2 shows an example of node connections and their encoding (the numbers in the nodes only indicate the order of the nodes). We use binary encoding to represent the connections between the nodes inside a module. The nodes in a module are first divided into two types: default nodes and intermediate nodes. The default nodes comprise a default input node (Input in FIG. 2) and a default output node (Output in FIG. 2); their function is to ensure that every binary code is valid. The default input node receives the data output by the previous module and passes it to every node that has no predecessor; the default output node receives the output data of every node that has no successor, sums and processes it, and then passes the result to the pooling layer (max pooling). For the K intermediate nodes v_k (k = 0, 1, 2, …, K-1) in the module, K(K-1)/2 bits are needed to encode the connections between the nodes: the first bit represents (v₀, v₁), the next two bits represent (v₀, v₂) and (v₁, v₂), and so on, until the last K-1 bits represent the connections between v₀, v₁, …, v_{K-2} and v_{K-1}. To keep the modules from becoming too complex, K is set to 5 in this embodiment. A bit of 1 means that the two corresponding nodes are connected and that the later node takes the output of the earlier node as part of its input; a bit of 0 means that the two nodes are not connected. A node sums all of its inputs before processing them.
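The bit ordering described above can be illustrated with a short decoder. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the handling of an isolated node (wired to both default nodes) simply applies the predecessor/successor rule literally, since the text does not address that case.

```python
def decode_connections(bits, k=5):
    """Return the edges (i, j), i < j, encoded by a connection gene.

    Bit order follows the text: (v0,v1), then (v0,v2), (v1,v2), and so on.
    """
    expected = k * (k - 1) // 2
    if len(bits) != expected:
        raise ValueError(f"expected {expected} bits, got {len(bits)}")
    edges = []
    pos = 0
    for j in range(1, k):      # later node
        for i in range(j):     # earlier node
            if bits[pos] == 1:
                edges.append((i, j))
            pos += 1
    return edges

def default_node_links(edges, k=5):
    """Nodes fed by the default input node and nodes feeding the default output node."""
    has_predecessor = {j for _, j in edges}
    has_successor = {i for i, _ in edges}
    from_input = [v for v in range(k) if v not in has_predecessor]
    to_output = [v for v in range(k) if v not in has_successor]
    return from_input, to_output

# Example 10-bit connection gene for K = 5 (a simple chain v0 -> v1 -> v2 -> v3).
gene = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
edges = decode_connections(gene)
from_input, to_output = default_node_links(edges)
```

For K = 5 the gene has 5 × 4 / 2 = 10 bits, which matches the 10-bit connection gene used in this embodiment.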

Table 1 lists the operation sequences available to a node (IN denotes instance normalization). We provide 16 operation sequences as the operation options for the nodes in a module. Each operation sequence is composed of basic operation units: 3 × 3 convolution (Conv), 5 × 5 convolution (Conv), the ReLU activation function, the Mish activation function, and instance normalization. To search for a lightweight neural network architecture, the number C of convolution kernels of the convolution operation in each operation sequence must be limited to a small value; in this embodiment, C is set to 20. The operation sequences differ mainly in the size of the convolution kernel, the activation function, the activation mode (pre-activation or post-activation), and the normalization type (whether instance normalization is used), so a 4-bit binary code is used to represent them. We assume that all nodes within a module use the same operation sequence, so each module gene consists of a 4-bit operation gene (Operation gene in FIG. 3(a)) and a 10-bit connection gene (Connection gene in FIG. 3(a)). The 7 module genes together constitute the genotype of one neural architecture (as shown in FIG. 3(b)).

Table 1. Operation sequences selectable for a node (IN denotes instance normalization)
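The genotype layout described above (7 module genes of 4 + 10 bits, 98 bits in total) can be sketched as follows. The placement of the operation gene before the connection gene, the most-significant-bit-first reading of the operation gene, and the name `split_genotype` are assumptions; the mapping from the resulting index to a row of Table 1 is not reproduced here.

```python
OP_BITS = 4       # operation gene length (from the text)
CONN_BITS = 10    # connection gene length for K = 5 (from the text)
N_MODULES = 7     # 4 encoding + 3 decoding modules (from the text)
MODULE_BITS = OP_BITS + CONN_BITS

def split_genotype(genotype):
    """Split a flat bit list into (operation index, connection bits) per module."""
    if len(genotype) != N_MODULES * MODULE_BITS:
        raise ValueError(f"expected {N_MODULES * MODULE_BITS} bits")
    modules = []
    for m in range(N_MODULES):
        gene = genotype[m * MODULE_BITS:(m + 1) * MODULE_BITS]
        op_bits, conn_bits = gene[:OP_BITS], gene[OP_BITS:]
        # Assumption: the 4 bits are read as an unsigned integer, MSB first.
        op_index = int("".join(str(b) for b in op_bits), 2)
        modules.append((op_index, conn_bits))
    return modules

# Example: an all-zero genotype whose first module selects operation index 11.
genotype = [0] * (N_MODULES * MODULE_BITS)
genotype[0:4] = [1, 0, 1, 1]
modules = split_genotype(genotype)
```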

(2) Optimization algorithm

Each individual represents a neural architecture, and the fitness value of an individual depends on the performance of the corresponding architecture. The flow of the algorithm (Algorithm 1: the flow framework of Genetic U-Net, given as pseudo-code) is as follows. First, a population of N individuals is randomly initialized (N = 20 in this embodiment). The population then evolves for T generations (T = 50 in this embodiment), each generation including crossover, mutation, and selection operations. After new individuals are generated, the neural architectures they represent are trained from scratch on the provided data to obtain their fitness values.
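The overall flow can be sketched as a loop with pluggable operators. This is a hedged outline rather than the patent's Algorithm 1: the function name, the operator signatures, the genotype-keyed fitness cache, and the toy operators at the bottom (which replace network training so the loop can run) are all assumptions.

```python
import random

def genetic_search(evaluate, select_parents, crossover, mutate,
                   environment_select, n=20, t_max=50, gene_len=98, seed=0):
    """Evolve bit-string genotypes and return the fittest one in the final population."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(gene_len)] for _ in range(n)]
    # Assumption: fitness is cached per genotype so identical individuals
    # are not evaluated twice.
    fitness = {tuple(ind): evaluate(ind) for ind in population}
    for _ in range(t_max):
        offspring = []
        while len(offspring) < n:
            p1, p2 = select_parents(population, fitness, rng)
            q1, q2 = crossover(p1, p2, rng)
            offspring.extend([mutate(q1, rng), mutate(q2, rng)])
        for ind in offspring:
            if tuple(ind) not in fitness:
                fitness[tuple(ind)] = evaluate(ind)
        population = environment_select(population + offspring, fitness, n, rng)
    return max(population, key=lambda ind: fitness[tuple(ind)])

# Toy stand-ins (not the patent's operators) so the loop runs without training.
def toy_eval(ind):
    return sum(ind) / len(ind)

def toy_parents(pop, fit, rng):
    return rng.sample(pop, 2)

def toy_crossover(p1, p2, rng):
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def toy_mutate(ind, rng):
    return [b ^ 1 if rng.random() < 0.05 else b for b in ind]

def toy_select(pool, fit, n, rng):
    return sorted(pool, key=lambda i: fit[tuple(i)], reverse=True)[:n]

best = genetic_search(toy_eval, toy_parents, toy_crossover, toy_mutate,
                      toy_select, n=10, t_max=5, gene_len=16)
```

Passing the operators as arguments keeps the loop independent of how parents are chosen or how fitness is measured; in the embodiment, `evaluate` would decode the genotype and train the resulting network.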

The invention adopts a difference-guided method for selecting the parent individuals to be crossed. Two individuals p1 and p2 are first selected by binary tournament selection, and their difference value diff is calculated by formula (1). If diff is larger than the threshold μ, which we set to 0.2, the two individuals are taken as the parent individuals to be crossed; otherwise p1 and p2 are reselected in the same way. If p1 and p2 still do not meet the requirement after 10 reselections, the last selected individuals are taken as the parent individuals to be crossed. The selected parent individuals are then crossed with probability pc = 0.9. The detailed flow of the crossover is given in Algorithm 2 (difference-guided crossover operation), which is pseudo-code; in it, "j ← 0 < 10" means that the variable j has the initial value 0 and the loop runs while j < 10.

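$$\mathrm{diff} = \frac{\mathrm{sum}(\mathrm{XOR}(p_1, p_2))}{L_{gene}} \quad (1)$$

where p1 and p2 represent two individuals, L_gene is the length of an individual's gene, sum() is the summation function, and XOR() is the exclusive-or function.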
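The parent selection and crossover can be sketched as follows, assuming formula (1) is the normalized Hamming distance sum(XOR(p1, p2)) / L_gene, as the definitions of sum() and XOR() in the claims suggest. The crossover follows the steps listed in claim 7 (10 sorted cut points forming 5 segments to exchange); treating each segment as a half-open Python slice, and all function names, are assumptions.

```python
import random

def diff(p1, p2):
    """Fraction of gene positions at which p1 and p2 differ (assumed formula (1))."""
    return sum(a ^ b for a, b in zip(p1, p2)) / len(p1)

def binary_tournament(population, fitness, rng):
    """Pick two different individuals at random and keep the fitter one."""
    a, b = rng.sample(population, 2)
    return a if fitness[tuple(a)] >= fitness[tuple(b)] else b

def select_parents(population, fitness, rng, mu=0.2, max_retries=10):
    """Reselect up to max_retries times until the pair differs by more than mu."""
    p1 = binary_tournament(population, fitness, rng)
    p2 = binary_tournament(population, fitness, rng)
    for _ in range(max_retries):
        if diff(p1, p2) > mu:
            break
        p1 = binary_tournament(population, fitness, rng)
        p2 = binary_tournament(population, fitness, rng)
    return p1, p2  # after the last retry the pair is used as it is

def multipoint_crossover(p1, p2, rng, pc=0.9, n_points=10):
    """With probability pc, exchange the segments between sorted cut-point pairs."""
    o1, o2 = list(p1), list(p2)
    if rng.random() < pc:
        points = sorted(rng.sample(range(len(o1)), n_points))
        for k in range(n_points // 2):
            lo, hi = points[2 * k], points[2 * k + 1]
            # Assumption: the segment is the half-open range [lo, hi).
            o1[lo:hi], o2[lo:hi] = o2[lo:hi], o1[lo:hi]
    return o1, o2

rng = random.Random(0)
population = [[0] * 20, [1] * 20, [0, 1] * 10, [1, 0] * 10]
fitness = {tuple(ind): 0.5 for ind in population}
parent1, parent2 = select_parents(population, fitness, rng)
child1, child2 = multipoint_crossover(parent1, parent2, rng)
```

Requiring a minimum difference between parents avoids crossing near-identical genotypes, where exchanging segments would change almost nothing.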

Preferably, the selection and mutation operations are as follows. In the selection process, both well-performing individuals and relatively poorly performing individuals are selected to enter the next-generation population. The best S individuals (S = 5 in this embodiment) are first selected from the current population P_t and the generated offspring Q_t to enter the next-generation population P_{t+1}, and these individuals are removed from P_t ∪ Q_t. Binary tournament selection is then applied repeatedly to P_t ∪ Q_t to select further individuals for the next generation until P_{t+1} has the same population size as the current population P_t. Offspring individuals generated by the crossover operation mutate with probability pm = 0.7; once an individual is chosen for mutation, each bit of its gene is flipped with probability pb = 0.05.
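The mutation and environment selection steps can be sketched as follows. The function names are hypothetical, and removing each tournament winner from the pool (so that no individual enters the next generation twice) is an assumption the text does not state.

```python
import random

def mutate(individual, rng, pm=0.7, pb=0.05):
    """With probability pm, flip each bit independently with probability pb."""
    if rng.random() >= pm:
        return list(individual)
    return [bit ^ 1 if rng.random() < pb else bit for bit in individual]

def environment_select(pool, fitness, n, rng, s=5):
    """Keep the best s individuals, then fill up to n by binary tournaments."""
    remaining = sorted(pool, key=lambda ind: fitness[tuple(ind)], reverse=True)
    next_pop, remaining = remaining[:s], remaining[s:]
    while len(next_pop) < n and len(remaining) >= 2:
        a, b = rng.sample(range(len(remaining)), 2)
        fit_a = fitness[tuple(remaining[a])]
        fit_b = fitness[tuple(remaining[b])]
        winner = a if fit_a >= fit_b else b
        # Assumption: a tournament winner leaves the pool once selected.
        next_pop.append(remaining.pop(winner))
    return next_pop

# Example pool of 40 distinct 6-bit individuals, scored by their number of ones.
pool = [[(i >> b) & 1 for b in range(6)] for i in range(40)]
fitness = {tuple(ind): sum(ind) for ind in pool}
next_population = environment_select(pool, fitness, 20, random.Random(0))
```

Because the tournaments draw from everything outside the top S, weaker individuals still have a chance of surviving, which is how the selection admits both good and relatively poor performers.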

Preferably, the fitness value of an individual is the F1-score that its corresponding architecture achieves for road crack segmentation on the provided data. Algorithm 3 (evaluating fitness values, given as pseudo-code) summarizes the steps for evaluating the individuals in a population; every individual is evaluated in this way. Before evaluation starts, each individual is decoded into its corresponding neural architecture. Before training starts, the weight parameters of the architecture are initialized with He initialization, and the architecture is then trained on the provided training data using the Lookahead method with Adam as the base optimizer. From round 80 until the end of round 130, the architecture being trained is validated on the validation data after each round of training on the training set to obtain its F1-score. After training stops, the best F1-score obtained on the validation data during training is set as the fitness value of the corresponding individual. Our approach does not train the architecture to convergence but uses an early-stopping strategy.
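The evaluation schedule can be sketched without any deep learning framework by passing the training and validation steps in as callables. Here `train_one_epoch` and `validate` are hypothetical placeholders, the F1 formula is the standard definition from true positives, false positives and false negatives, and He initialization and the Lookahead/Adam optimizer are left out. The comparison `epoch >= validate_from` follows the description ("from round 80"); the claimed steps use the strict test epoch > 80.

```python
def f1_score(tp, fp, fn):
    """Standard F1 = 2TP / (2TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def evaluate_fitness(train_one_epoch, validate, validate_from=80, max_epochs=130):
    """Train for max_epochs rounds and return the best validation F1-score
    seen from round validate_from onwards."""
    best_f1 = 0.0
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        if epoch >= validate_from:
            best_f1 = max(best_f1, validate())
    return best_f1

# Toy stand-ins that only count calls, so the schedule can be checked.
calls = {"train": 0, "validate": 0}

def toy_train():
    calls["train"] += 1

def toy_validate():
    calls["validate"] += 1
    return 0.5 + calls["validate"] / 1000

fitness_value = evaluate_fitness(toy_train, toy_validate)
```

Skipping validation during the early rounds saves time across the many architectures that must be trained during the search.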

The data for the architecture search in this embodiment comes from the CFD data set, which contains 108 road surface photographs and their corresponding labels. We split it into a training set, a validation set, and a test set containing 50, 22, and 36 photographs, respectively. During the architecture search, the test-set photographs are not used: the training-set data is used to train each model to be evaluated, and the validation-set data is used to evaluate the architecture corresponding to each individual so as to obtain that individual's fitness value.

(3) Results

After the architecture search is finished, we select the best 2 individuals from the population of the last generation and decode them into the corresponding neural network architectures, which are shown in FIG. 4 (the top-one architecture) and FIG. 5 (the top-two architecture). In FIG. 4 and FIG. 5, the Input (default input node) of the first module is a 3 × 3 convolution and the Output (default output node) of the last module is a 1 × 1 convolution.

We verified the performance of the best searched architecture (shown in FIG. 4) on road crack segmentation data sets such as CFD and AigleRN, and compared its computational complexity and its performance on the road crack segmentation task with the original U-Net model. The comparison between our algorithm (Genetic U-Net) and U-Net is shown in FIG. 6 and Table 2: Table 2 gives the computational complexity of Genetic U-Net and U-Net, and FIG. 6 compares the road crack segmentation results of the optimized neural network model of the invention with those of U-Net. In FIG. 6, Images are the raw road surface images, Label is the image label, and Genetic U-Net and U-Net are the road crack images segmented by Genetic U-Net and by U-Net, respectively. The searched architecture clearly outperforms the original U-Net on the road crack segmentation task: it segments the crack structure accurately under complex road surface conditions and shows a certain robustness. In terms of computational complexity, the searched architecture is far lighter than the original U-Net: its parameter count is 100 times smaller than that of U-Net, and its MACs are about 3.2 times smaller.

Table 2. Computational complexity of the method of this patent (Genetic U-Net) compared with U-Net

Existing genetic algorithms randomly select two parent individuals and cross them directly, without judging how similar the two individuals are, so the crossover operator plays only a limited and inefficient role in the evolutionary process.

Although the present invention has been described in considerable detail and with reference to certain illustrated embodiments, it is not intended to be limited to any such details or embodiments or any particular embodiment, so as to effectively encompass the intended scope of the invention. Furthermore, the foregoing describes the invention in terms of embodiments foreseen by the inventor for which an enabling description was available, notwithstanding that insubstantial modifications of the invention, not presently foreseen, may nonetheless represent equivalent modifications thereto.

Claims

1. The road crack segmentation method based on the genetic algorithm and the U-shaped neural network is characterized by comprising the following steps of:

S100: reading a road crack data set;

S200: constructing a neural network model;

2. The method for road crack segmentation based on genetic algorithm and U-shaped neural network as claimed in claim 1, wherein in S100, the road crack data set is public data set CFD, AigleRN, or collected by a handheld smart terminal, or shot by a mobile robot or an unmanned aerial vehicle carrying a smart phone or a camera, or downloaded through network; except for the public data set, images obtained in other modes need to be labeled, and are divided into a training set, a verification set and a test set according to a certain proportion or quantity.

3. The method for road crack segmentation based on genetic algorithm and U-shaped neural network as claimed in claim 1, wherein in S200, the method for constructing the neural network model comprises: searching with a U-shaped neural network as the basic neural network framework, the neural network model consisting of an encoder and a decoder, wherein the encoder comprises encoding modules and the decoder comprises decoding modules; from top to bottom, the U-shaped encoding-decoding structure of the U-shaped neural network can be divided into different stages, and the feature map dimensions within the same stage are unchanged; except for the last stage, the corresponding encoder module and decoder module transmit the semantic information extracted by the encoder E to the decoder D through a skip connection; the decoder D needs to fuse information from the skip connection and from upsampling, and the information upsampled in the decoding process and the information from the skip connection are fused by concatenation or element-wise addition.

4. The method for road crack segmentation based on genetic algorithm and U-shaped neural network as claimed in claim 1, wherein the encoding modules and decoding modules in the neural network model are collectively referred to as modules; the internal structure of each module consists of nodes and edges connecting the nodes, each node represents an operation unit or an operation sequence, and each edge represents a connection between two nodes; binary encoding is used to represent the connections between the nodes in a module, the nodes in the module first being divided into two types, namely default nodes and intermediate nodes; the default nodes comprise a default input node and a default output node; the default input node receives the data output by the previous module and transmits it to every node that has no predecessor; the default output node receives the output data of every node that has no successor, sums and processes the data, and then transmits the result to the pooling layer; for the K intermediate nodes v_k (k = 0, 1, 2, …, K-1) in the module, K(K-1)/2 bits are used to encode the connections between the nodes: the first bit represents (v₀, v₁), the next two bits represent (v₀, v₂) and (v₁, v₂), and so on, until the last K-1 bits represent the connections between v₀, v₁, …, v_{K-2} and v_{K-1}; if the bit corresponding to two nodes is 1, the two nodes are connected and the output of the earlier node is used as part of the input of the later node; if the bit is 0, the two corresponding nodes are not connected; a node sums all of its inputs before processing them.

5. The method for road crack segmentation based on genetic algorithm and U-shaped neural network as claimed in claim 1, wherein the method for optimizing the neural network model to obtain the optimized neural network model comprises the following steps:

S301: first, randomly initializing a population of N individuals, N being the population size; then evolving for T generations, T being the maximum number of generations, each generation including crossover, mutation and selection operations, where p_c is the crossover probability, p_m is the mutation probability, and p_b is the per-bit mutation probability;

S303: evaluating the fitness values of the individuals in population P₀;

S305: setting Q_t to the empty set;

S306: selecting two parent individuals p₁ and p₂ from P_t;

S308: when |Q_t| < N, assigning Q_t ∪ q₁ ∪ q₂ to Q_t and going to step S306; otherwise going to step S309;

S309: evaluating the fitness values of the individuals in population Q_t;

S312: outputting the individual with the largest fitness value in population P_t.

6. The method for road crack segmentation based on genetic algorithm and U-shaped neural network as claimed in claim 5, wherein, in S306 and S310, the method of selecting two parent individuals p₁ and p₂ from P_t in S306 and the environment selection method in S310 are as follows: two individuals p1 and p2 are selected by a binary tournament selection method, and a difference value diff of the two individuals is calculated by formula (1); if diff is larger than the threshold μ, which is set to 0.2, the two individuals are set as the parent individuals to be crossed; otherwise p1 and p2 are reselected in the same way; if p1 and p2 still do not meet the requirement after multiple reselections, the individuals selected last are set as the parent individuals to be crossed; the selected parent individuals then perform the difference-guided crossover operation with probability pc = 0.9;

$$\mathrm{diff} = \frac{\mathrm{sum}(\mathrm{XOR}(p_1, p_2))}{L_{gene}} \quad (1)$$

where p1 and p2 represent two individuals, L_gene is the length of an individual's gene, sum() is the summation function, and XOR() is the exclusive-or function.

7. The method for road crack segmentation based on genetic algorithm and U-shaped neural network as claimed in claim 6, wherein the method of the difference-guided crossover operation comprises the following steps:

A303: again randomly selecting two different individuals from the population P_t and choosing from them the individual with the larger fitness value as p₂, wherein p₁ is not equal to p₂;

A306: randomly generating a number r from (0, 1);

A307: if r < p_c, calculating the gene length len of the individuals p₁ and p₂; if r > p_c, jumping to A312;

A308: randomly selecting 10 different integers from [0, len) and sorting them in ascending order to obtain a sequence ints;

A309: dividing ints in order into 5 pairs (i₀,i₁),(i₂,i₃),(i₄,i₅),(i₆,i₇),(i₈,i₉), and setting the initial value of the variable k to 0;

A310: exchanging p₁[i_2k : i_2k+1] and p₂[i_2k : i_2k+1], where p₁[i_2k : i_2k+1] denotes the bits of individual p₁ from position i_2k to position i_2k+1;

A312: assigning the individuals p₁ and p₂ to the individuals o₁, o₂;

A313: outputting the crossed individuals o₁, o₂.

8. The method for road crack segmentation based on genetic algorithm and U-shaped neural network as claimed in claim 6, wherein the method for evaluating the fitness value comprises the following steps:

B303: initializing the weight parameters of the architecture arch, and initializing the training round counter epoch ← 0 and the best fitness value F1-score_best ← 0, where the symbol ← denotes assignment;

B304: training the architecture arch for one round on the training data D_train by a gradient descent algorithm, and increasing epoch by 1;

B305: if epoch > 80, validating the architecture arch being trained on the validation data D_valid to obtain a fitness value F1-score;

B306: if F1-score > F1-score_best, then F1-score_best ← F1-score;

B307: if epoch < 130, going to step B304; otherwise setting F1-score_best as the fitness value of the individual and going to step B308;

B308: judging whether every individual in P_t has had its fitness value evaluated; if so, going to step B309; if not, going to step B302 to evaluate an individual that has not yet been evaluated;

B309: outputting the population P_t with evaluated fitness values.

9. The road crack segmentation method based on the genetic algorithm and the U-shaped neural network as claimed in claim 5, wherein the selection and mutation operations are as follows: in the selection process, both well-performing individuals and relatively poorly performing individuals are selected into the next-generation population; the best S individuals are selected from the current population P_t and the generated offspring Q_t to enter the next-generation population P_{t+1}, and these selected individuals are removed from P_t ∪ Q_t; binary tournament selection is then applied repeatedly to P_t ∪ Q_t to select further individuals into the next-generation population until the next-generation population P_{t+1} has the same population size as the current population P_t; the offspring individuals generated by the crossover operation mutate with probability pm = 0.7, and once an individual is chosen for mutation, each bit of its gene is flipped with probability pb = 0.05.