CN106845546B - BFBA and ELM-based mammary X-ray image feature selection method - Google Patents

Publication number
CN106845546B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201710048258.2A
Other languages
Chinese (zh)
Other versions
CN106845546A (en)
Inventor
韩晓红
相洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taiyuan University of Technology
Original Assignee
Taiyuan University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Taiyuan University of Technology
Priority to CN201710048258.2A
Publication of CN106845546A
Application granted
Publication of CN106845546B
Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a breast X-ray image feature selection method based on BFBA and ELM. It relates to the technical field of image processing and addresses two technical problems: the "exponential explosion" encountered by dynamic programming, and the tendency of analytic methods and the bat algorithm to fall into a locally optimal solution. The technical scheme comprises the following steps: first, collecting the MIAS data set; second, setting the BFBA parameters; third, initializing the bat population; fourth, generating the feature subset corresponding to each bat position code; fifth, updating the search pulse frequency, velocity and position of each bat; sixth, generating a uniformly distributed random number rand; eighth, ranking the fitness values of all bats to find the current optimal solution and optimal value; ninth, judging whether the optimal solution has changed; tenth, judging whether stagnant_count equals stagnant_max; eleventh, repeating the fourth to tenth steps; and twelfth, outputting the global optimal value and the optimal solution.

Description

BFBA and ELM-based mammary X-ray image feature selection method
Technical Field
The invention relates to the technical field of image processing, and in particular to a breast X-ray image feature selection method based on the bird flock response Bat Algorithm (BFBA) and the Extreme Learning Machine (ELM).
Background
Breast disease is among the most common diseases in women, and the high incidence and danger of breast cancer seriously threaten women's health and even their lives, so early diagnosis of breast disease bears directly on women's health. The pathogenesis of breast cancer in particular has not yet been fully determined. Current clinical methods for diagnosing breast cancer mainly comprise palpation, histological diagnosis, cytological diagnosis and imaging diagnosis. Imaging diagnosis is widely adopted because it is convenient, scientifically grounded and relatively easy to carry out. Mammography is the most common technique for the early diagnosis of breast cancer, and it requires the condition of the breast to be analyzed from a breast X-ray image. With the rapid development of computer technology, the analysis of breast X-ray images has shifted from traditional manual analysis to computer-aided analysis, which can make the diagnosis of breast cancer faster and more accurate.
Features are the key to judging similarity and to classification. When breast cancer is identified by computer, a large number of original features are extracted from the breast X-ray image to improve the recognition rate. Because of the way they are extracted, however, many of these features are not independent, and the redundancy among them reduces both the speed and the accuracy of pattern recognition. The original features therefore need to be selected, discarding those that are ambiguous, hard to discriminate or strongly correlated. Feature selection is essentially a combinatorial optimization problem. Conventional optimization algorithms such as analytic methods obtain only a locally optimal solution rather than the global one, and they require the objective function to be continuous and differentiable; enumeration overcomes these drawbacks but is computationally inefficient. Even the well-known dynamic programming method suffers from "exponential explosion" and is often ineffective on problems of moderate scale and moderate complexity. Breast X-ray image feature selection methods based on the genetic algorithm and on the particle swarm algorithm have obtained good classification results. The Bat Algorithm (BA) is an emerging heuristic swarm intelligence algorithm; compared with the particle swarm and genetic algorithms, BA can dynamically control the switch between local search and global search and has the potential to perform better. During a BA search, however, premature convergence occurs: individual diversity declines rapidly, so many inefficient iterations are needed to refine the estimate of a local optimum. In this situation it is difficult to strike a good balance between exploration and exploitation, which makes BA prone to stalling at a local optimum.
Disclosure of Invention
The invention overcomes the defects of the prior art and aims to provide a breast X-ray image feature selection method that solves the "exponential explosion" problem encountered by dynamic programming and the tendency of analytic methods and the bat algorithm to fall into a locally optimal solution.
To solve these technical problems, the invention adopts the following technical scheme: a breast X-ray image feature selection method based on BFBA and ELM, comprising the following steps:
first, collecting the MIAS (Mammographic Image Analysis Society) data set, extracting breast X-ray image features, and dividing the data set into a training set and a test set, wherein the training set is used to train an Extreme Learning Machine (ELM) so as to design an ELM classifier, and the test set is used to check the effectiveness of the ELM classifier;
the breast X-ray image features are extracted with the gray-level co-occurrence matrix, from which four statistical parameters are computed: angular second moment, entropy, moment of inertia and correlation coefficient; the directions of the gray-level co-occurrence matrix are 0°, 45°, 90° and 135°; first, the gray-level co-occurrence matrices in the four directions are calculated with a pixel distance of 1, and second, the four statistical parameters are calculated from each gray-level co-occurrence matrix, giving 16 features; each image is divided into four blocks and the 16 features are extracted from each sub-image block as original sample data, giving 64 statistical features in total; the sample database obtained by feature extraction is divided into 10 parts, of which 90% is used for training and the remaining 10% for testing;
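By way of non-limiting illustration, the feature extraction of the first step can be sketched in Python with NumPy. The sketch uses the standard textbook definitions of the four statistics; the quantisation to 16 gray levels, the symmetric co-occurrence counts and the names OFFSETS, glcm, glcm_stats and image_features are assumptions made for the example and are not taken from the invention.

```python
import numpy as np

# (row, column) offsets for a pixel distance of 1 in the four directions.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def glcm(block, offset, levels):
    """Normalised co-occurrence matrix of an integer-valued block for one offset."""
    dr, dc = offset
    rows, cols = block.shape
    r0, r1 = max(0, -dr), min(rows, rows - dr)
    c0, c1 = max(0, -dc), min(cols, cols - dc)
    a = block[r0:r1, c0:c1]                      # reference pixels
    b = block[r0 + dr:r1 + dr, c0 + dc:c1 + dc]  # their neighbours at the offset
    p = np.zeros((levels, levels))
    np.add.at(p, (a.ravel(), b.ravel()), 1)
    p = p + p.T  # assumption: count each pixel pair in both orders
    return p / p.sum()

def glcm_stats(p):
    """Angular second moment, entropy, moment of inertia, correlation."""
    i, j = np.indices(p.shape)
    asm = np.sum(p ** 2)
    nz = p[p > 0]
    entropy = -np.sum(nz * np.log(nz))
    inertia = np.sum((i - j) ** 2 * p)
    mu1, mu2 = np.sum(i * p), np.sum(j * p)
    s1 = np.sqrt(np.sum((i - mu1) ** 2 * p))
    s2 = np.sqrt(np.sum((j - mu2) ** 2 * p))
    corr = (np.sum(i * j * p) - mu1 * mu2) / (s1 * s2) if s1 * s2 > 0 else 0.0
    return [asm, entropy, inertia, corr]

def image_features(image, levels=16):
    """64 features: 4 blocks x 4 directions x 4 statistics."""
    q = (image.astype(np.int64) * levels) // 256  # assumes 8-bit gray levels
    h, w = q.shape
    blocks = (q[:h // 2, :w // 2], q[:h // 2, w // 2:],
              q[h // 2:, :w // 2], q[h // 2:, w // 2:])
    feats = []
    for block in blocks:
        for angle in (0, 45, 90, 135):
            feats.extend(glcm_stats(glcm(block, OFFSETS[angle], levels)))
    return np.array(feats)
```

Calling image_features on one gray-scale image returns the 64-element row that would form one sample of the feature library.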
secondly, setting BFBA parameters;
the initial parameters comprise: a bat population size of 20 to 100; dimension D = 64 for each individual bat; pulse volume A = 0.5; pulse rate R = 0.5; search pulse frequency range [Q_min, Q_max], where Q_min = 0 and Q_max = 2; maximum number of iterations iter_max = 1000; stall counter stagnant_count = 0; and maximum stall count stagnant_max = 4;
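For illustration only, the parameters of the second step could be gathered in a small Python configuration object. The class name BFBAParams and its field names are hypothetical; the default values are those stated above, and the stall counter itself is left out because it is run-time state rather than a setting.

```python
from dataclasses import dataclass

@dataclass
class BFBAParams:
    n_bats: int = 20          # population size, 20 to 100
    dim: int = 64             # D, one gene per extracted feature
    loudness: float = 0.5     # A, pulse volume
    pulse_rate: float = 0.5   # R
    q_min: float = 0.0        # lower bound of the search pulse frequency
    q_max: float = 2.0        # upper bound of the search pulse frequency
    iter_max: int = 1000      # maximum number of iterations
    stagnant_max: int = 4     # stalls tolerated before the group response

    def __post_init__(self):
        # Enforce the population range given in the text.
        if not 20 <= self.n_bats <= 100:
            raise ValueError("n_bats must be between 20 and 100")
```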
third, initializing the bat population, wherein a bat position vector is made up of the 64 statistical features given in the first step, feature combinations are marked with a binary code, and the positions and velocities of the individual bats are randomly initialized to form a sample set;
the bat position uses binary coding; there are 64 original features, so the length L of an individual is 64, and each gene of an individual corresponds to the feature at the same position in the sequence: when a gene is "1", the corresponding feature item is selected; conversely, "0" indicates that the feature item is not selected. X = {x_1, x_2, x_3, ..., x_i, ..., x_N} denotes the positions of the bat population and V = {v_1, v_2, v_3, ..., v_i, ..., v_N} denotes its velocities, where x_i is the position of the i-th individual bat and v_i is the velocity of the i-th individual bat; x_i and v_i are 64-dimensional row vectors, and there are N bats in total, with 20 ≤ N ≤ 100.
The formula for random generation of gene values is:
Figure BDA0001215501450000021
in the formula, rand () is random numbers which are independently and equally distributed in the interval of [0,1 ];
fourth, generating the feature subset corresponding to each bat position code and using that feature subset to generate a training set and a test set, wherein the training set is used to design the ELM classifier and the test set is used to test it, and calculating the fitness value fit_i of the corresponding bat from the test result.
The fitness value is calculated as follows:
Figure BDA0001215501450000031
where
Figure BDA0001215501450000032
in the formula, ω_A is the weight of the classification accuracy, ω_F is the weight of the number of selected features, f_j is the value of gene j (0 or 1), acc_i is the classification accuracy, cc is the number of correctly classified samples, and uc is the number of incorrectly classified samples;
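Because the fitness equations are not reproduced in this text, the Python sketch below is only an assumed form consistent with the symbol definitions: accuracy is taken as cc / (cc + uc), and fitness as a weighted sum of accuracy and the reciprocal of the number of selected features. The weight values 0.8 and 0.2 and the function names are hypothetical.

```python
def accuracy(cc, uc):
    """Fraction of test samples classified correctly."""
    return cc / (cc + uc)

def fitness(mask, cc, uc, w_a=0.8, w_f=0.2):
    """Assumed fitness: w_a * acc + w_f / (number of selected features)."""
    n_selected = sum(mask)  # sum of the gene values f_j
    if n_selected == 0:
        return 0.0  # an empty feature subset cannot be classified
    return w_a * accuracy(cc, uc) + w_f / n_selected
```

Under this assumed form, a subset that keeps accuracy while using fewer features receives a higher value.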
fifthly, updating the search pulse frequency, speed and position of each bat;
the search pulse frequency, velocity and position are updated by the following equations:
Q_i = Q_min + (Q_max - Q_min) × β    (11)
v_i(t) = v_i(t-1) + (x_i(t) - x_best) × Q_i    (12)
x_i(t) = x_i(t-1) + v_i(t)    (13)
where β ∈ [0, 1] is a uniformly distributed random number; Q_i is the search pulse frequency of bat i, with Q_i ∈ [Q_min, Q_max]; v_i(t) and v_i(t-1) are the velocities of bat i at times t and t-1, respectively; x_i(t) and x_i(t-1) are the positions of bat i at times t and t-1, respectively; and x_best is the current optimal solution among all bats;
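Equations (11) to (13) can be sketched in Python as a single update for one bat. Since x_i(t) is not yet known when equation (12) is evaluated, the sketch uses the bat's current position there, which is an assumption; the function name update_bat is likewise illustrative, and no re-binarisation of the position is attempted because the text does not describe one.

```python
import numpy as np

def update_bat(x_prev, v_prev, x_best, q_min=0.0, q_max=2.0, rng=None):
    """One frequency, velocity and position update for a single bat."""
    if rng is None:
        rng = np.random.default_rng()
    beta = rng.random()                 # uniform random number
    q = q_min + (q_max - q_min) * beta  # equation (11)
    v = v_prev + (x_prev - x_best) * q  # equation (12), current position assumed
    x = x_prev + v                      # equation (13)
    return q, v, x
```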
sixth, generating a uniformly distributed random number rand; if rand > R, where R is the pulse rate from the second step, the current optimal solution is randomly perturbed to generate a new solution:
x_new = x_best + 0.001 × randn(1, 64)    (14)
where x_new is the newly generated solution and x_best is the current optimal solution.
Seventh, generating a uniformly distributed random number rand; if rand < A and fit(x_i(t)) < fit(x_i(t-1)), where A is the pulse volume from the second step, the new solution generated in the sixth step is accepted:
x_i(t) = x_new    (15)
where fit(x_i(t)) denotes the fitness value of individual x_i and x_new is the new solution generated in the sixth step;
if rand does not satisfy this condition, this step is skipped and the method proceeds to the next step;
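The sixth and seventh steps can be sketched together in Python as follows. The sketch follows the two conditions literally as stated; treating the seventh step as a no-op when the sixth step produced no new solution is an assumption, as is the name local_search_step.

```python
import numpy as np

def local_search_step(x_i, fit_new, fit_old, x_best, A=0.5, R=0.5, rng=None):
    """Perturb the best solution (14) and conditionally accept it (15).

    fit_new stands for fit(x_i(t)) and fit_old for fit(x_i(t-1)).
    """
    if rng is None:
        rng = np.random.default_rng()
    x_new = None
    if rng.random() > R:  # sixth step
        x_new = x_best + 0.001 * rng.standard_normal(x_best.shape)  # equation (14)
    # Seventh step; assumed to be skipped when no new solution exists.
    if x_new is not None and rng.random() < A and fit_new < fit_old:
        return x_new      # equation (15)
    return x_i
```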
eighth, ranking the fitness values of all bats to find the current optimal solution and optimal value;
ninth, judging whether the optimal solution has changed; if it has not, then
stagnant_count = stagnant_count + 1; otherwise, stagnant_count = 0,
where stagnant_count is the stall counter from the second step;
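A minimal Python sketch of the eighth and ninth steps is given below. Whether a larger or a smaller fitness value counts as better is not stated explicitly in this text, so the sketch exposes the choice as a maximise flag; that flag and the name rank_and_count are assumptions of the example.

```python
import numpy as np

def rank_and_count(fitness_values, best_prev, stagnant_count, maximise=True):
    """Find the current best bat and update the stall counter."""
    order = np.argsort(fitness_values)  # ascending order of fitness
    best_idx = int(order[-1] if maximise else order[0])
    best_val = float(fitness_values[best_idx])
    # Unchanged best value: one more stalled iteration; otherwise reset.
    stagnant_count = stagnant_count + 1 if best_val == best_prev else 0
    return best_idx, best_val, stagnant_count
```

The returned counter is what the tenth step compares against stagnant_max.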
tenth, judging whether stagnant_count equals stagnant_max; if so, a group response is triggered for all bats;
that is, if the global best fitness value has not changed for the set number of iterations, a bird-flock-style group response relocation takes place and the velocity and position of every bat are updated again with equations (16) and (17). In equation (16), the new position of bat x_i after the group response relocation is obtained by averaging the positions of its seven nearest neighbours, the nearest neighbours being the bats with the smallest Euclidean distance to x_i, and rand_x is a random real number in the interval [-1, 1]. In equation (17), v_i is the original velocity of the bat and its new velocity after adjustment is obtained by averaging the velocities of its seven nearest neighbours; rand_v is a random real number in the interval [0, 1], and N_i contains the index values of the seven nearest neighbours of x_i;
Figure BDA0001215501450000048
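Equations (16) and (17) are not reproduced in this text, so the following Python sketch only illustrates the described ingredients: seven nearest neighbours by Euclidean distance, neighbour means, rand_x in [-1, 1] and rand_v in [0, 1]. Adding rand_x to the mean position and multiplying the mean velocity by rand_v are assumed forms, and the name group_response is hypothetical.

```python
import numpy as np

def group_response(X, V, k=7, rng=None):
    """Relocate every bat using the mean of its k nearest neighbours."""
    if rng is None:
        rng = np.random.default_rng()
    n = X.shape[0]
    # Pairwise Euclidean distances between bat positions.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)         # a bat is not its own neighbour
    N = np.argsort(d, axis=1)[:, :k]    # N_i: indices of the k nearest bats
    rand_x = rng.uniform(-1.0, 1.0, size=(n, 1))
    rand_v = rng.uniform(0.0, 1.0, size=(n, 1))
    X_new = X[N].mean(axis=1) + rand_x  # assumed form of equation (16)
    V_new = V[N].mean(axis=1) * rand_v  # assumed form of equation (17)
    return X_new, V_new, N
```

In a full loop this function would be called only when the stall counter reaches its maximum.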
eleventh, repeating the fourth to tenth steps until the set optimal-solution condition is met or the maximum number of iterations is reached;
twelfth, outputting the global optimal value and the optimal solution.
In the first step, the MIAS data set is a standard data set for research on mammographic images, and the images are 1024 × 1024 gray-scale images.
Compared with the prior art, the invention has the following beneficial effects.
The method performs optimized selection over the high-dimensional features of breast X-ray images, effectively reduces the feature dimension, improves the accuracy of classification and recognition, meets the requirements of the extreme learning machine classifier, and thereby improves classification performance. The method selects features well and efficiently, and can effectively improve the classification accuracy of breast X-ray images.
The BFBA of the invention introduces a new diversity exploration mechanism, derived from the group response behavior of birds, to strengthen its ability to explore diverse solutions. With the group response mechanism, BFBA can explore a wider search space and thus avoid being confined to locally optimal solutions. ELM obtains the output weights of the learning network analytically in a single computation step; compared with a conventional neural network and a support vector machine, the extreme learning machine greatly improves generalization ability and learning speed, and the ELM is used as the classifier to evaluate the importance of the features.
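The one-step solution of the output weights can be illustrated with a minimal ELM classifier in Python. This is a generic sketch of the technique: the sigmoid activation, the 50 hidden nodes and the class name ELM are assumptions of the example and do not reflect settings disclosed by the invention.

```python
import numpy as np

class ELM:
    """Single-hidden-layer network with random, untrained hidden weights."""

    def __init__(self, n_hidden=50, seed=None):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activations of the random hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        T = (y[:, None] == self.classes_[None, :]).astype(float)  # one-hot targets
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        # Output weights in one step: Moore-Penrose pseudo-inverse of H times T.
        self.beta = np.linalg.pinv(self._hidden(X)) @ T
        return self

    def predict(self, X):
        return self.classes_[np.argmax(self._hidden(X) @ self.beta, axis=1)]
```

Training on the columns selected by one bat and scoring the held-out part would supply the counts cc and uc used for that bat's fitness.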
Drawings
The present invention will be described in further detail with reference to the accompanying drawings.
FIG. 1 shows the gene codes of a certain individual of the present invention.
FIG. 2 is a flow chart of the present invention.
Detailed Description
As shown in FIG. 1 and FIG. 2, the breast X-ray image feature selection method based on BFBA and ELM of the invention specifically comprises the following steps: first, collecting the MIAS (Mammographic Image Analysis Society) data set, extracting breast X-ray image features, and dividing the data set into a training set and a test set, wherein the training set is used to train an Extreme Learning Machine (ELM) so as to design an ELM classifier, and the test set is used to check the effectiveness of the ELM classifier;
the breast X-ray image features are extracted with the gray-level co-occurrence matrix, from which four statistical parameters are computed: angular second moment, entropy, moment of inertia and correlation coefficient; the directions of the gray-level co-occurrence matrix are 0°, 45°, 90° and 135°; first, the gray-level co-occurrence matrices in the four directions are calculated with a pixel distance of 1, and second, the four statistical parameters are calculated from each gray-level co-occurrence matrix, giving 16 features; each image is divided into four blocks and the 16 features are extracted from each sub-image block as original sample data, giving 64 statistical features in total; the sample database obtained by feature extraction is divided into 10 parts, of which 90% is used for training and the remaining 10% for testing;
the angular second moment is denoted f_1, the entropy f_2, the moment of inertia f_3, and the correlation coefficient f_4:
Figure BDA0001215501450000051
Figure BDA0001215501450000052
Figure BDA0001215501450000053
Figure BDA0001215501450000054
in the formula, μ_1, μ_2 and the two quantities shown in Figure BDA0001215501450000055 and Figure BDA0001215501450000056 are respectively defined as:
Figure BDA0001215501450000057
Figure BDA0001215501450000058
Figure BDA0001215501450000059
Figure BDA00012155014500000510
the feature library is randomly divided into 10 parts, of which 90% is used for training and the remaining 10% for testing;
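The random ten-part division can be sketched in Python as follows. Rotating the held-out part so that each part serves once as the test set is an assumption of the example (the text states only the 90%/10% division), and the name ten_part_splits is hypothetical.

```python
import numpy as np

def ten_part_splits(n_samples, seed=None):
    """Yield (train_idx, test_idx) pairs: nine parts train, one part tests."""
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(n_samples), 10)
    for k in range(10):
        test_idx = parts[k]
        train_idx = np.concatenate([parts[j] for j in range(10) if j != k])
        yield train_idx, test_idx
```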
secondly, setting BFBA parameters;
the initial parameters comprise: a bat population size of 20 to 100; dimension D = 64 for each individual bat; pulse volume A = 0.5; pulse rate R = 0.5; search pulse frequency range [Q_min, Q_max], where Q_min = 0 and Q_max = 2; maximum number of iterations iter_max = 1000; stall counter stagnant_count = 0; and maximum stall count stagnant_max = 4;
third, initializing the bat population, wherein a bat position vector is made up of the 64 statistical features given in the first step, feature combinations are marked with a binary code, and the positions and velocities of the individual bats are randomly initialized to form a sample set;
the bat position uses binary coding; there are 64 original features, so the length L of an individual is 64, and each gene of an individual corresponds to the feature at the same position in the sequence: when a gene is "1", the corresponding feature item is selected; conversely, "0" indicates that the feature item is not selected. X = {x_1, x_2, x_3, ..., x_i, ..., x_N} denotes the positions of the bat population and V = {v_1, v_2, v_3, ..., v_i, ..., v_N} denotes its velocities, where x_i is the position of the i-th individual bat and v_i is the velocity of the i-th individual bat; x_i and v_i are 64-dimensional row vectors, and there are N bats in total, with 20 ≤ N ≤ 100.
The formula for random generation of gene values is:
Figure BDA0001215501450000061
in the formula, rand () is random numbers which are independently and equally distributed in the interval of [0,1 ];
fourthly, generating a corresponding characteristic subset according to each bat position code, generating a training set and a test set by using the characteristic subset, wherein the training set is used for designing an ELM classifier, the test set is used for testing the classifier, and calculating the fitness value fit of the corresponding bat according to the test resulti
The fitness value is calculated as follows:
Figure BDA0001215501450000062
where
Figure BDA0001215501450000063
in the formula, ω_A is the weight of the classification accuracy, ω_F is the weight of the number of selected features, f_j is the value of gene j (0 or 1), acc_i is the classification accuracy, cc is the number of correctly classified samples, and uc is the number of incorrectly classified samples;
fifthly, updating the search pulse frequency, speed and position of each bat;
the search pulse frequency, velocity and position are updated by the following equations:
Q_i = Q_min + (Q_max - Q_min) × β    (11)
v_i(t) = v_i(t-1) + (x_i(t) - x_best) × Q_i    (12)
x_i(t) = x_i(t-1) + v_i(t)    (13)
where β ∈ [0, 1] is a uniformly distributed random number; Q_i is the search pulse frequency of bat i, with Q_i ∈ [Q_min, Q_max]; v_i(t) and v_i(t-1) are the velocities of bat i at times t and t-1, respectively; x_i(t) and x_i(t-1) are the positions of bat i at times t and t-1, respectively; and x_best is the current optimal solution among all bats;
sixth, generating a uniformly distributed random number rand; if rand > R, where R is the pulse rate from the second step, the current optimal solution is randomly perturbed to generate a new solution:
x_new = x_best + 0.001 × randn(1, 64)    (14)
where x_new is the newly generated solution and x_best is the current optimal solution.
Seventh, generating a uniformly distributed random number rand; if rand < A and fit(x_i(t)) < fit(x_i(t-1)), where A is the pulse volume from the second step, the new solution generated in the sixth step is accepted:
x_i(t) = x_new    (15)
where fit(x_i(t)) denotes the fitness value of individual x_i and x_new is the new solution generated in the sixth step;
if rand does not satisfy this condition, this step is skipped and the method proceeds to the next step;
eighth, ranking the fitness values of all bats to find the current optimal solution and optimal value;
ninth, judging whether the optimal solution has changed; if it has not, then
stagnant_count = stagnant_count + 1; otherwise, stagnant_count = 0,
where stagnant_count is the stall counter from the second step;
tenth, judging whether stagnant_count equals stagnant_max; if so, a group response is triggered for all bats;
that is, if the global best fitness value has not changed for the set number of iterations, a bird-flock-style group response relocation takes place and the velocity and position of every bat are updated again with equations (16) and (17). In equation (16), the new position of bat x_i after the group response relocation is obtained by averaging the positions of its seven nearest neighbours, the nearest neighbours being the bats with the smallest Euclidean distance to x_i, and rand_x is a random real number in the interval [-1, 1]. In equation (17), v_i is the original velocity of the bat and its new velocity after adjustment is obtained by averaging the velocities of its seven nearest neighbours; rand_v is a random real number in the interval [0, 1], and N_i contains the index values of the seven nearest neighbours of x_i;
Figure BDA0001215501450000078
Figure BDA0001215501450000079
eleventh, repeating the fourth to tenth steps until the set optimal-solution condition is met or the maximum number of iterations is reached;
twelfth, outputting the global optimal value and the optimal solution.
In the first step, the MIAS data set is a standard data set for research on mammographic images, and the images are 1024 × 1024 gray-scale images.
The invention enables better and more accurate analysis of breast X-ray images and of the condition of the breast, helping doctors to diagnose and to make treatment plans more accurately and quickly.

Claims (2)

1. A breast X-ray image feature selection method based on BFBA and ELM, characterized by comprising the following steps:
first, collecting the MIAS (Mammographic Image Analysis Society) data set, extracting breast X-ray image features, and dividing the data set into a training set and a test set, wherein the training set is used to train an Extreme Learning Machine (ELM) so as to design an ELM classifier, and the test set is used to check the effectiveness of the ELM classifier;
the breast X-ray image features are extracted with the gray-level co-occurrence matrix, from which four statistical parameters are computed: angular second moment, entropy, moment of inertia and correlation coefficient; the directions of the gray-level co-occurrence matrix are 0°, 45°, 90° and 135°; first, the gray-level co-occurrence matrices in the four directions are calculated with a pixel distance of 1, and second, the four statistical parameters are calculated from each gray-level co-occurrence matrix to obtain 16 features; each image is divided into four blocks and the 16 features are extracted from each sub-image block as original sample data, giving 64 statistical features in total; the sample database obtained by feature extraction is divided into 10 parts, of which 90% is used for training and the remaining 10% for testing;
secondly, setting BFBA parameters;
the initial parameters comprise: a bat population size of 20 to 100; dimension L = 64 for each individual bat; pulse volume A = 0.5; pulse rate R = 0.5; search pulse frequency range [Q_min, Q_max], where Q_min = 0 and Q_max = 2; maximum number of iterations iter_max = 1000; stall counter stagnant_count = 0; and maximum stall count stagnant_max = 4;
third, initializing the bat population, wherein a bat position vector is made up of the 64 statistical features given in the first step, feature combinations are marked with a binary code, and the positions and velocities of the individual bats are randomly initialized to form a sample set;
the bat individual position uses binary coding; there are 64 original features, so the length L of a bat individual position is 64, and each gene of a bat individual position corresponds to the feature at the same position in the sequence: when a gene in a bat individual position is "1", the corresponding feature item is selected; conversely, "0" indicates that the feature item is not selected; X = {x_1, x_2, x_3, ..., x_i, ..., x_N} denotes the positions of the bat population and V = {v_1, v_2, v_3, ..., v_i, ..., v_N} denotes the velocities of the bat population, where x_i is the position of the i-th individual bat and v_i is the velocity of the i-th individual bat; x_i and v_i are 64-dimensional row vectors, and there are N bats in total, with 20 ≤ N ≤ 100;
the formula for random generation of gene values is:
Figure FDA0002539242220000011
in the formula, rand () is random numbers which are independently and equally distributed in the interval of [0,1 ];
fourthly, generating a corresponding characteristic subset according to each bat position code, generating a training set and a test set by using the characteristic subset, wherein the training set is used for designing an ELM classifier, the test set is used for testing the classifier, and calculating the fitness value fit of the corresponding bat individual position according to the test resulti
The fitness value is calculated as follows:
Figure FDA0002539242220000021
where
Figure FDA0002539242220000022
in the formula, ω_A is the weight of the classification accuracy, ω_F is the weight of the number of selected features, f_i is the value of a gene (0 or 1), acc_i is the classification accuracy obtained with the i-th bat individual, cc is the number of correctly classified samples, and uc is the number of incorrectly classified samples;
fifthly, updating the search pulse frequency, speed and position of each bat;
the search pulse frequency, velocity and position are updated by the following equations:
Q_i = Q_min + (Q_max - Q_min) × β;    (11)
v_i(t) = v_i(t-1) + (x_i(t) - x_best) × Q_i;    (12)
x_i(t) = x_i(t-1) + v_i(t);    (13)
where β ∈ [0, 1] is a uniformly distributed random number; Q_i is the search pulse frequency of bat i, with Q_i ∈ [Q_min, Q_max]; v_i(t) and v_i(t-1) are the velocities of bat i at times t and t-1, respectively; x_i(t) and x_i(t-1) are the positions of bat i at times t and t-1, respectively; and x_best is the optimal solution among all current bat positions;
and sixthly, generating uniformly distributed random numbers rand, and if rand is greater than R, and R is the pulse rate in the second step, randomly disturbing the current optimal solution to generate a new solution:
xnew= xbest+0.001×randn(1,64); (14)
wherein x isnewRepresenting a newly generated new solution, xbestRepresenting the then best solution;
seventhly, generating a uniformly distributed random number rand, and if rand < A and fit(x_i(t)) < fit(x_i(t-1)), where A is the pulse volume set in the second step, accepting the new solution generated in the sixth step:
x_i(t) = x_new; (15)
wherein fit(x_i(t)) denotes the fitness value of the position x_i of the bat individual, and x_new denotes the new solution generated in the sixth step;
if the condition rand < A and fit(x_i(t)) < fit(x_i(t-1)) is not satisfied, skipping this step and proceeding to the next step;
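The acceptance rule of the seventh step can be written as a small function. This is a sketch under the assumption that the two fitness values have already been computed; the name `accept_new_solution` and its parameters are hypothetical.

```python
import random


def accept_new_solution(x_current, x_new, fit_current, fit_previous,
                        loudness, rng=random):
    """Apply equation (15) when rand < A and fit(x_i(t)) < fit(x_i(t-1))."""
    if rng.random() < loudness and fit_current < fit_previous:
        return list(x_new)      # accept the solution from the sixth step
    return list(x_current)      # otherwise skip and keep the current position
```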
eighthly, sorting the fitness values of all bats to find the current optimal solution and optimal value;
the ninth step, judging whether the optimal solution has changed; if not, static_count = static_count + 1, otherwise static_count = 0;
wherein static_count is the stall counter set in the second step;
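The stall counter of the ninth step behaves as in the following sketch. For brevity it compares successive best fitness values rather than the solutions themselves, which is an assumption; `update_stall_counter` and `run_counter` are hypothetical names.

```python
def update_stall_counter(static_count, best_changed):
    """Reset the counter when the best solution changed, else increment it."""
    return 0 if best_changed else static_count + 1


def run_counter(best_history):
    """Trace the counter over a sequence of per-iteration best values."""
    static_count = 0
    counts = []
    for previous, current in zip(best_history, best_history[1:]):
        static_count = update_stall_counter(static_count, current != previous)
        counts.append(static_count)
    return counts
```

For the history 0.8, 0.8, 0.8, 0.9, 0.9 the counter takes the values 1, 2, 0, 1: it grows while the best value is unchanged and resets as soon as it improves.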
tenthly, judging whether static_count is equal to static_max, and if so, generating a flock response for all bats:
Figure FDA0002539242220000031
Figure FDA0002539242220000032
that is, if the global optimal fitness value no longer changes over a certain number of iterations, a bird-flock response relocation occurs, and the velocity and position of each bat are updated again using equations (16) and (17); in equation (16), [Figure FDA0002539242220000039] is the new position of bat [Figure FDA0002539242220000033] after the flock-response relocation, the new position being obtained by calculating the average of the positions of its seven nearest neighbors; the nearest neighbors of [Figure FDA0002539242220000034] are the bats with the minimum Euclidean distance between [Figure FDA0002539242220000035] and the other bats; rand_x is a random real number in the interval [-1, 1]; in equation (17), [Figure FDA0002539242220000036] is the new velocity of the bat after the velocity adjustment and [Figure FDA0002539242220000037] is the original velocity of the bat, the new velocity being obtained by calculating the average of the velocities of its seven nearest neighboring bats; rand_v is a random real number in the interval [0, 1]; and N_i contains the index values of the seven neighboring bats of [Figure FDA0002539242220000038];
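The neighbor search behind the flock response can be sketched as below. Equations (16) and (17) are images that are not reproduced in this text, so the sketch stops at the parts the prose states: the index set N_i of the seven nearest bats by Euclidean distance and the mean of their positions or velocities. How rand_x and rand_v enter the final update is deliberately left out. The function names are hypothetical.

```python
import math


def nearest_neighbors(positions, i, k=7):
    """Indices of the k bats closest to bat i by Euclidean distance (N_i)."""
    distances = [(math.dist(positions[i], p), j)
                 for j, p in enumerate(positions) if j != i]
    distances.sort()
    return [j for _, j in distances[:k]]


def neighbor_mean(vectors, indices):
    """Component-wise mean of the vectors (positions or velocities) in indices."""
    n = len(indices)
    dim = len(vectors[0])
    return [sum(vectors[j][d] for j in indices) / n for d in range(dim)]
```

The same `neighbor_mean` call serves both updates: pass the list of positions for the relocation of equation (16) and the list of velocities for the adjustment of equation (17).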
the eleventh step, repeating the fourth step to the tenth step until the set optimal-solution condition is met or the maximum number of iterations is reached;
the twelfth step, outputting the global optimal value and the optimal solution.
2. The BFBA and ELM based breast X-ray image feature selection method as recited in claim 1, wherein: in the first step, the data set MIAS is a standard data set for studying mammographic images, which are 1024 × 1024 gray-scale images.
CN201710048258.2A 2017-01-20 2017-01-20 BFBA and ELM-based mammary X-ray image feature selection method Expired - Fee Related CN106845546B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710048258.2A CN106845546B (en) 2017-01-20 2017-01-20 BFBA and ELM-based mammary X-ray image feature selection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710048258.2A CN106845546B (en) 2017-01-20 2017-01-20 BFBA and ELM-based mammary X-ray image feature selection method

Publications (2)

Publication Number Publication Date
CN106845546A CN106845546A (en) 2017-06-13
CN106845546B true CN106845546B (en) 2020-09-04

Family

ID=59119468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710048258.2A Expired - Fee Related CN106845546B (en) 2017-01-20 2017-01-20 BFBA and ELM-based mammary X-ray image feature selection method

Country Status (1)

Country Link
CN (1) CN106845546B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112107332A (en) * 2020-10-10 2020-12-22 高慧强 Method, equipment and system for processing medical ultrasonic image

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9646209B2 (en) * 2010-08-26 2017-05-09 Blast Motion Inc. Sensor and media event detection and tagging system
CN103593674B (en) * 2013-11-19 2016-09-21 太原理工大学 A kind of cervical lymph node ultrasonoscopy feature selection method
CN106203614B (en) * 2016-07-22 2018-07-03 吉林大学 KP model densities Function identification methods based on adaptive bat searching algorithm

Also Published As

Publication number Publication date
CN106845546A (en) 2017-06-13

Similar Documents

Publication Publication Date Title
CN111798921B (en) RNA binding protein prediction method and device based on multi-scale attention convolution neural network
Yang Machine learning approaches to bioinformatics
CN108595916B (en) Gene expression full-spectrum inference method based on generation of confrontation network
CN106874489B (en) Lung nodule image block retrieval method and device based on convolutional neural network
CN106529165A (en) Method for identifying cancer molecular subtype based on spectral clustering algorithm of sparse similar matrix
US20070294067A1 (en) Prediction of estrogen receptor status of breast tumors using binary prediction tree modeling
WO2015173435A1 (en) Method for predicting a phenotype from a genotype
CN112560914A (en) Rolling bearing fault diagnosis method based on improved LSSVM
CN110866134B (en) Image retrieval-oriented distribution consistency keeping metric learning method
CN110738662B (en) Pituitary tumor texture image grading method based on fine-grained medical image segmentation and truth value discovery data amplification
CN106682454A (en) Method and device for data classification of metagenome
CN103593674A (en) Cervical lymph node ultrasonoscopy feature selection method
CN111110192A (en) Skin abnormal symptom auxiliary diagnosis system
Rojas-Thomas et al. Neural networks ensemble for automatic DNA microarray spot classification
CN113643758B (en) Prediction method for obtaining beta-lactam drug resistance resistant gene facing enterobacter
Dong et al. An improved YOLOv5 network for lung nodule detection
CN106845546B (en) BFBA and ELM-based mammary X-ray image feature selection method
Maashi et al. Anas Platyrhynchos Optimizer with Deep Transfer Learning based Gastric Cancer Classification on Endoscopic Images
CN114241267A (en) Structural entropy sampling-based multi-target architecture search osteoporosis image identification method
KR102572437B1 (en) Apparatus and method for determining optimized learning model based on genetic algorithm
CN115985503B (en) Cancer prediction system based on ensemble learning
CN111583194B (en) High-dimensional feature selection algorithm based on Bayesian rough set and cuckoo algorithm
CN116503147A (en) Financial risk prediction method based on deep learning neural network
CN113177608B (en) Neighbor model feature selection method and device for incomplete data
CN115908909A (en) Evolutionary neural architecture searching method and system based on Bayes convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20200904