CN115481681A - Artificial intelligence-based breast sampling data processing method - Google Patents

Publication number
CN115481681A
Authority
CN
China
Prior art keywords
output tensor
data
branch
feature extraction
tensor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211101853.5A
Other languages
Chinese (zh)
Other versions
CN115481681B (en
Inventor
胡钦勇
燕自保
袁静萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Zhongshu Medical Technology Co ltd
Original Assignee
Wuhan Zhongshu Medical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Zhongshu Medical Technology Co ltd filed Critical Wuhan Zhongshu Medical Technology Co ltd
Priority to CN202211101853.5A priority Critical patent/CN115481681B/en
Publication of CN115481681A publication Critical patent/CN115481681A/en
Application granted granted Critical
Publication of CN115481681B publication Critical patent/CN115481681B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/30 Assessment of water resources

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)

Abstract

The application discloses an artificial intelligence-based breast sampling data processing method comprising the following steps: collecting resistivity data of the breast of a patient under examination at external excitation frequencies; processing the resistivity data to obtain corresponding four-dimensional input data; inputting the input data into a text classification model, which comprises a feature extraction part and a category classification part, so that the input data undergoes feature extraction and then category classification; and, after the input data has passed through the text classification model, judging whether the output data is abnormal. The text classification model learns from the input data to obtain more associated information between the data, classifies the processed data, and judges whether the data is abnormal. This shortens the patient examination procedure, improves the accuracy of the judgment, provides data support for the doctor's diagnosis, and is convenient for both doctors and patients.

Description

Artificial intelligence-based breast sampling data processing method
Technical Field
The invention relates to the field of text classification, in particular to a method for processing breast sampling data based on artificial intelligence.
Background
Imaging-based clinical examination currently covers a wide range of uses: it applies to many organs, can detect many diseases, and supports early discovery and early treatment so that patients recover sooner. The main imaging techniques are magnetic resonance imaging, Doppler ultrasound imaging and computed tomography. These techniques are mature and stable after years of development, but they suffer from low temporal resolution, bulky equipment, ionizing radiation and similar drawbacks, and cannot provide real-time imaging monitoring of lung components.
Electrical impedance tomography (EIT) causes little harm to the human body, is low in cost, and supports real-time imaging. Over the past forty years, EIT excitation has developed from single-frequency to multi-frequency excitation, with a current signal as the excitation signal. The front-end acquisition platform has largely moved from analog to digital technology; in particular, in the signal detection stage, signals are acquired through a high-speed analog-to-digital conversion circuit and then processed digitally. For image reconstruction, artificial intelligence methods have been added in recent years with the aim of improving imaging resolution. However, the imaging problem in current EIT systems is generally nonlinear, ill-conditioned and underdetermined, so imaging quality is unsatisfactory, resolution is low, and clinical adoption remains limited.
In the prior art, patent CN202210153549 discloses a breast cancer feature information identification method that improves the accuracy of breast cancer classification and identification by acquiring patient feature data from an electronic medical record system and applying a machine learning algorithm, a neurodynamics method and the like. However, the feature data it selects are analysis results of medical image data, so the data acquisition process is complex.
Therefore, a method that shortens the patient examination procedure, improves the accuracy of breast tumor classification and identification, and provides effective data support for the doctor's diagnosis is a technical problem that those skilled in the art urgently need to solve.
Disclosure of Invention
In view of the above disadvantages of the prior art, the present invention provides a method for processing breast sampling data based on artificial intelligence, comprising the following steps:
collecting resistivity data of the breast of a patient under examination at external excitation frequencies;
processing the resistivity data to obtain corresponding four-dimensional input data;
inputting the input data into a text classification model, wherein the text classification model comprises a feature extraction part and a category classification part, and the input data undergoes feature extraction and then category classification;
and, after the input data has passed through the text classification model and been output, judging whether the output data is abnormal.
In an embodiment of the present invention, the feature extraction part is a pseudo-twin (pseudo-Siamese) feature extraction network comprising a first branch, a second branch, a first fully connected layer and a second fully connected layer. The specific steps are as follows:
inputting the input data into the first branch and the second branch separately for feature extraction, to obtain the output tensor of the first branch and the output tensor of the second branch;
combining the first branch output tensor and the second branch output tensor to obtain a third output tensor, and reducing the dimensionality of the third output tensor to obtain a fourth output tensor;
and applying the first fully connected layer and then the second fully connected layer to the fourth output tensor to obtain the output tensor of the feature extraction.
In an embodiment of the present invention, the first branch performs feature extraction based on a Transformer structure: the input data passes through 5 Transformer layers in sequence and then through one 5 × 5 depthwise separable convolution to obtain the output tensor of the first branch.
In an embodiment of the present invention, the Transformer layer operates as follows:
first, passing the input data through a multi-head attention sublayer to obtain a first output tensor;
applying a linear transformation to the first output tensor and activating with a sigmoid activation function to obtain the output tensor of the multi-head attention mechanism;
applying random neuron deactivation (dropout) to the output tensor of the multi-head attention mechanism to obtain a second output tensor;
adding the second output tensor and the input data tensor to obtain the first-part output tensor of the Transformer layer;
passing the first-part output tensor of the Transformer layer through a feedforward fully connected layer, and applying dropout to the output tensor of the feedforward fully connected layer to obtain the second-part output tensor of the Transformer layer, wherein the feedforward fully connected layer comprises a first fully connected sublayer and a second fully connected sublayer;
and adding the first-part output tensor and the second-part output tensor of the Transformer layer to obtain the output tensor of the Transformer layer.
In an embodiment of the present invention, passing the first-part output tensor of the Transformer layer through the feedforward fully connected layer comprises:
feeding the first-part output tensor of the Transformer layer into the first fully connected sublayer for a fully connected layer operation, and activating with a ReLU function;
applying random neuron deactivation (dropout) to the activated output tensor of the first fully connected sublayer;
and feeding the output tensor of the first fully connected sublayer after dropout into the second fully connected sublayer for a fully connected layer operation, and activating with a sigmoid function to obtain the output tensor of the feedforward fully connected layer.
In an embodiment of the present invention, the second branch performs feature extraction based on a fully convolutional network. The second branch comprises a plurality of convolution blocks, and the input tensor passes through the convolution blocks in sequence to obtain the output tensor of the second branch.
In an embodiment of the present invention, a convolution block processes its input as follows:
applying a 1 × 1 convolution to the input tensor, then one 3 × 3 depthwise separable convolution, then another 1 × 1 convolution, and finally random neuron deactivation (dropout), to obtain the output tensor of the second branch.
In an embodiment of the present invention, the category classification comprises classifying the output tensor of the feature extraction with a random forest classification algorithm to generate a plurality of class predictions, and taking the class with the most votes as the final class.
In an embodiment of the present invention, the random forest algorithm comprises the following steps:
sampling the output tensor of the feature extraction N times with replacement to form N samples, where N is the number of resistivity data of the breast;
each sample contains M attributes, from which m attributes are randomly selected for classification, where m < M;
sorting the values of the m attributes in ascending order, calculating the information entropy and the final information gain, and selecting the attribute with the largest final information gain among the m attributes as the final split point;
splitting at the final split point, assigning a predicted label to the splitting result, selecting the label with the most votes by voting, and outputting the labelling result;
wherein, the calculation formula of the information entropy is as follows:
Figure BDA0003839824780000031
Figure BDA0003839824780000032
where m denotes the attribute value shared by all samples at the split attribute node; avg(m_i, m_j) denotes the average of two adjacent attribute values after sorting in ascending order; N_y denotes the number of instances whose sample label y equals the corresponding value; and p_y denotes the probability that the sample label y equals the corresponding value;
the final information gain is calculated as follows:
Figure BDA0003839824780000041
in the formula, D is the training sample space, | D | represents the number of training samples,
Figure BDA0003839824780000042
representing the number of samples in the training sample whose corresponding attribute value is greater than the split attribute value,
Figure BDA0003839824780000043
representing the number of samples in the training sample whose corresponding attribute value is less than the split attribute value.
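The formula images are not reproduced in this text. For orientation only, the textbook forms of information entropy and of information gain for a binary threshold split are given below, written with the symbols defined above. They are an assumption about what the figures contain, not a transcription of them; the notation $H(\cdot)$, $\mathrm{Gain}$, and the superscripts $>$ and $<$ on $D$ are introduced here for illustration.

```latex
% Candidate split value: midpoint of two adjacent sorted attribute values
m = \operatorname{avg}(m_i, m_j) = \frac{m_i + m_j}{2}

% Class probability and information entropy of a sample set D
p_y = \frac{N_y}{|D|}, \qquad H(D) = -\sum_{y} p_y \log_2 p_y

% Information gain of splitting D at m into D^{>} (values above m) and D^{<} (values below m)
\mathrm{Gain}(D, m) = H(D) - \frac{|D^{>}|}{|D|}\, H\!\left(D^{>}\right) - \frac{|D^{<}|}{|D|}\, H\!\left(D^{<}\right)
```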
An electronic device comprising a memory and a processor, wherein the processor is configured to execute a text classification model stored in the memory to implement a method for processing artificial intelligence-based breast sampling data as described in any one of the above.
Compared with the prior art, the invention can obtain the following beneficial effects:
1. Resistivity data of the breast is collected with a bioelectrical impedance information acquisition platform and processed; a pseudo-twin feature extraction network extracts features from the data, the feature extraction results of its two branches are fused, and fully connected layer operations are applied to the fused information. This yields more associations among the data, and machine learning enlarges the differences between data, so that the data features are distinct and the data is easy to classify.
2. A classifier is built with a random forest algorithm to classify the feature extraction results. The final split point is chosen by calculating the information entropy and the final information gain; the splitting result is given a predicted label, the label with the most votes is selected by voting, and whether the data is abnormal is judged from the output label. This improves the accuracy of the judgment and shortens the examination procedure for the patient; no image acquisition or image generation is needed, which is convenient for doctors and patients and improves working efficiency.
Drawings
Fig. 1 is a flowchart of a method for processing artificial intelligence-based breast sampling data according to an embodiment of the present invention;
FIG. 2 is a flow diagram of a feature extraction section of an embodiment of the present invention;
FIG. 3 is a flow chart illustrating the operation of the Transformer structure according to an embodiment of the present invention;
FIG. 4 is a flowchart of a category classification section according to an embodiment of the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
It will be understood that when an element is referred to as being "secured to" or "disposed on" another element, it can be directly on the other element or be indirectly disposed on the other element; when an element is referred to as being "connected to" another element, it can be directly connected to the other element or be indirectly connected to the other element.
It will be understood that the terms "length," "width," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like, refer to an orientation or positional relationship illustrated in the drawings for convenience in describing the present application and to simplify description, and do not indicate or imply that the referenced device or component must have a particular orientation, be constructed and operated in a particular orientation, and thus should not be construed as limiting the present application.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present application, "plurality" or "a plurality" means two or more unless specifically limited otherwise.
It should be understood that the structures, ratios, sizes, and the like shown in the drawings are only used for matching the disclosure of the specification, so as to be understood and read by those skilled in the art, and are not used to limit the practical limit conditions of the present application, so that the modifications of the structures, the changes of the ratio relationships, or the adjustment of the sizes, do not have the technical essence, and the modifications, the changes of the ratio relationships, or the adjustment of the sizes, are all within the scope of the technical contents disclosed in the present application without affecting the efficacy and the achievable purpose of the present application.
As shown in fig. 1, a preferred embodiment of the present invention provides an artificial intelligence-based breast sampling data processing method comprising the following steps:
S1: Collect resistivity data of the breast of the patient under examination at the external excitation frequencies.
In this embodiment, a bioelectrical impedance information acquisition platform collects resistivity data of the breast of the patient under examination at different external excitation frequencies. Preferably, the measurement region is a circle 5 cm in diameter centered on the breast, and the external excitation frequency is increased stepwise from 10000 Hz to 300000 Hz. The excitation source frequencies, in hertz (Hz), are: 10000, 11074.14, 13623.34, 17327.13, 21293.6, 26168.07, 32158.38, 39519.98, 40901.3, 50264.29, 59684.52, 73347.3, 90137.74, 131531.95, 161641.83, 198644.36, 244117.41, 281641.83, 300000. The resistivity of the patient's breast measured at each excitation frequency is expressed in ohm-centimeters (Ω·cm), and each patient is measured multiple times.
S2: Process the resistivity data to obtain corresponding four-dimensional input data.
In this embodiment, the collected resistivity data undergoes type conversion: it is converted into two-dimensional data by word embedding; a dimension of size 1 is then added in front of the first dimension, giving three-dimensional data; and the three-dimensional data is converted into four-dimensional data by dimension mapping.
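The conversion in S2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the text does not say how resistivity values become tokens or what the dimension mapping is, so the linear binning, the sizes `VOCAB`, `EMBED_DIM` and `SPLIT`, the random embedding table, and the use of `reshape` as the dimension mapping are all hypothetical.

```python
import numpy as np

VOCAB = 64       # assumed number of resistivity bins (hypothetical)
EMBED_DIM = 8    # assumed embedding width (hypothetical)
SPLIT = 2        # assumed factor used to split the embedding axis (hypothetical)

def to_four_dim(resistivity, table, lo, hi):
    """Map a 1-D resistivity sweep to a tensor of shape (1, n, SPLIT, EMBED_DIM // SPLIT)."""
    r = np.asarray(resistivity, dtype=float)
    # Assumed tokenisation: scale [lo, hi] linearly onto bin ids 0 .. VOCAB - 1.
    ids = np.clip(((r - lo) / (hi - lo) * (VOCAB - 1)).astype(int), 0, VOCAB - 1)
    two_d = table[ids]                  # (n, EMBED_DIM): embedding lookup
    three_d = two_d[np.newaxis, ...]    # (1, n, EMBED_DIM): prepend a size-1 axis
    # Assumed "dimension mapping": split the last axis into two axes.
    return three_d.reshape(1, r.size, SPLIT, EMBED_DIM // SPLIT)

rng = np.random.default_rng(0)
table = rng.normal(size=(VOCAB, EMBED_DIM))     # stand-in for a learned embedding table
sweep = rng.uniform(100.0, 900.0, size=19)      # made-up values, one per excitation frequency
x = to_four_dim(sweep, table, lo=0.0, hi=1000.0)
print(x.shape)
```

In a trained model the embedding table would be learned rather than random; the sketch only tracks how the number of dimensions grows from two to four.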
S3: Input the input data into the text classification model. The text classification model is divided into a feature extraction part and a category classification part, and the input data undergoes feature extraction and then category classification. Feature extraction yields more associated information among the texts and strengthens the association among the data; after classification, whether the data is abnormal can be judged directly from the classification result, which shortens the patient examination procedure.
S4: After the input data has passed through the text classification model and been output, judge whether the output data is abnormal. The impedance of diseased biological tissue differs markedly from that of normal tissue. If the output data is abnormal, the breast under examination is diseased, that is, the patient has a breast tumor; if the output data is normal, the physiological condition of the breast is normal and the patient does not have a breast tumor.
In a further embodiment of the present application, the feature extraction part is a pseudo-twin (pseudo-Siamese) feature extraction network comprising a first branch, a second branch, a first fully connected layer and a second fully connected layer. As shown in fig. 2, the specific steps are as follows:
S31: Input the input data into the first branch and the second branch separately for feature extraction, to obtain the output tensor of the first branch and the output tensor of the second branch.
The first branch performs feature extraction based on a Transformer structure: the input data passes through 5 Transformer layers in sequence and then through one 5 × 5 depthwise separable convolution to give the output tensor of the first branch. Feature extraction in the first branch captures longer-range correlations among the input data while reducing the amount of computation. The second branch performs feature extraction based on a fully convolutional network and comprises a plurality of convolution blocks. The number of convolution blocks can be set according to the amount of data actually measured and is not limited here; this embodiment uses 3 convolution blocks. Each convolution block processes its input as follows: a 1 × 1 convolution, then one 3 × 3 depthwise separable convolution, then another 1 × 1 convolution, and finally random neuron deactivation (dropout), giving the output tensor of the second branch.
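A convolution block of the second branch can be sketched in NumPy as below, reading the block as a 1 × 1 convolution, one 3 × 3 depthwise separable convolution (a per-channel 3 × 3 filter followed by a 1 × 1 channel mix), another 1 × 1 convolution, and dropout. The channel-first layout, zero padding, absence of bias terms and activations, dropout rate, and all function and parameter names are assumptions made for illustration; the text does not specify them.

```python
import numpy as np

def pointwise_conv(x, w):
    """1x1 convolution: mixes channels only. x: (B, C_in, H, W), w: (C_out, C_in)."""
    return np.einsum("oc,bchw->bohw", w, x)

def depthwise_conv3x3(x, k):
    """3x3 per-channel convolution with zero padding. x: (B, C, H, W), k: (C, 3, 3)."""
    _, _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (0, 0), (1, 1), (1, 1)))
    out = np.zeros(x.shape, dtype=float)
    for di in range(3):
        for dj in range(3):
            tap = k[:, di, dj][None, :, None, None]          # (1, C, 1, 1)
            out += padded[:, :, di:di + h, dj:dj + w] * tap
    return out

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero units with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return x
    return x * (rng.random(x.shape) >= rate) / (1.0 - rate)

def conv_block(x, p, rng, rate=0.1, training=True):
    y = pointwise_conv(x, p["w_in"])                                  # 1x1 convolution
    y = pointwise_conv(depthwise_conv3x3(y, p["k_dw"]), p["w_pw"])    # depthwise separable 3x3
    y = pointwise_conv(y, p["w_out"])                                 # 1x1 convolution
    return dropout(y, rate, rng, training)                            # random deactivation

def second_branch(x, blocks, rng, training=True):
    """Pass the input through the convolution blocks in sequence (3 in the embodiment)."""
    for p in blocks:
        x = conv_block(x, p, rng, training=training)
    return x
```

Keeping the channel count and spatial size unchanged through each block, as this sketch does, lets the second-branch output be added element by element to the first-branch output later on.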
In this embodiment, the Transformer structure has no normalization sublayer; instead, a sigmoid activation function is added after the multi-head attention sublayer and after the feedforward fully connected layer to normalize the tensor data and prevent overfitting. As shown in fig. 3, the Transformer structure operates as follows:
S310: The input data first passes through the multi-head attention sublayer to obtain the first output tensor. The attention is computed as:
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$
where Q, K and V are all the same input data tensor and d_k is the scaling coefficient. Q is multiplied by the transpose of K and the product is scaled; a softmax operation over the last dimension of the scaled tensor yields the attention tensor; and the softmax output is multiplied by V to give the attention representation. In the multi-head attention mechanism, the input data tensor is first transformed linearly; the transformation matrices are square, so the sizes of Q, K and V are unchanged. The word-embedding dimension is then divided according to the number of heads, each part is fed into the attention mechanism, and the tensor output by the attention mechanism is reshaped to the size of the input data tensor. In this way the input data passes through the multi-head attention sublayer to give the first output tensor;
S311: Apply a linear transformation to the first output tensor without changing its size, and activate with a sigmoid activation function to obtain the output tensor of the multi-head attention mechanism. The sigmoid activation narrows the value range and prevents overfitting to the input data from degrading the feature extraction result;
S312: Apply random neuron deactivation (dropout) to the output tensor of the multi-head attention mechanism to obtain the second output tensor;
S313: Add the second output tensor and the input data tensor to obtain the first-part output tensor of the Transformer layer;
S314: Pass the first-part output tensor of the Transformer layer through the feedforward fully connected layer, and apply dropout to the output tensor of the feedforward fully connected layer to obtain the second-part output tensor of the Transformer layer. The feedforward fully connected layer comprises a first fully connected sublayer and a second fully connected sublayer and operates as follows:
feed the first-part output tensor of the Transformer layer into the first fully connected sublayer for a fully connected layer operation and activate with a ReLU function, which sets all negative values to 0;
apply dropout to the activated output tensor of the first fully connected sublayer, so that overfitting of this tensor does not degrade the feature extraction;
feed the output tensor of the first fully connected sublayer after dropout into the second fully connected sublayer for a fully connected layer operation, and activate with a sigmoid function to obtain the output tensor of the feedforward fully connected layer. The sigmoid activation constrains the resulting values to the range (0, 1), which makes the tensor data smoother and easier to classify.
S315: Add the first-part output tensor and the second-part output tensor of the Transformer layer to obtain the output tensor of the Transformer layer.
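Steps S310 to S315 can be sketched in NumPy as follows. The sketch assumes the standard scaled dot-product attention with a scaling factor of the square root of the per-head width, applies attention over the last two axes of the input, and uses small random parameters for illustration only. The parameter names, hidden width, and dropout rate are hypothetical and not taken from the patent.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))    # stabilised, over the last axis
    return e / e.sum(axis=-1, keepdims=True)

def dropout(x, rate, rng, training):
    """Inverted dropout; a no-op at inference time."""
    if not training or rate == 0.0:
        return x
    return x * (rng.random(x.shape) >= rate) / (1.0 - rate)

def multi_head_attention(x, p, heads):
    """Self-attention with Q = K = V = x, over the last two axes (L, D) of x."""
    *lead, L, D = x.shape
    d_k = D // heads                                 # D must be divisible by heads

    def split(t):                                    # (..., L, D) -> (..., heads, L, d_k)
        return np.swapaxes(t.reshape(*lead, L, heads, d_k), -3, -2)

    q, k, v = (split(x @ p[name]) for name in ("wq", "wk", "wv"))   # square D x D maps
    weights = softmax(q @ np.swapaxes(k, -1, -2) / np.sqrt(d_k))    # attention tensor
    merged = np.swapaxes(weights @ v, -3, -2)        # (..., L, heads, d_k)
    return merged.reshape(*lead, L, D)               # back to the input size

def transformer_layer(x, p, heads, rng, rate=0.1, training=True):
    a = multi_head_attention(x, p, heads)            # S310: first output tensor
    a = sigmoid(a @ p["wo"] + p["bo"])               # S311: linear map + sigmoid
    a = dropout(a, rate, rng, training)              # S312: second output tensor
    part1 = a + x                                    # S313: residual add, no normalization
    h = np.maximum(part1 @ p["w1"] + p["b1"], 0.0)   # S314: first FC sublayer + ReLU
    h = dropout(h, rate, rng, training)
    f = sigmoid(h @ p["w2"] + p["b2"])               #       second FC sublayer + sigmoid
    part2 = dropout(f, rate, rng, training)
    return part1 + part2                             # S315

def init_params(d_model, d_hidden, rng, scale=0.1):
    """Random parameters for illustration only (a real model would learn them)."""
    shapes = {"wq": (d_model, d_model), "wk": (d_model, d_model),
              "wv": (d_model, d_model), "wo": (d_model, d_model),
              "w1": (d_model, d_hidden), "w2": (d_hidden, d_model)}
    p = {name: rng.normal(scale=scale, size=s) for name, s in shapes.items()}
    p.update(bo=np.zeros(d_model), b1=np.zeros(d_hidden), b2=np.zeros(d_model))
    return p
```

Because both sigmoid outputs lie in (0, 1), each layer at inference time shifts every element of its input by an amount between 0 and 2; stacking five such layers, as in the first branch, repeats the same function five times with separate parameters.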
S32: Combine the first branch output tensor and the second branch output tensor to obtain a third output tensor, and reduce the dimensionality of the third output tensor to obtain a fourth output tensor.
In this embodiment, the output tensor of the first branch and the output tensor of the second branch are added element by element to obtain the third output tensor, which is four-dimensional. Its dimensionality is then reduced: the first two dimensions are kept unchanged and the last two dimensions are merged, giving the fourth output tensor, which is three-dimensional.
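The fusion in S32 amounts to an element-wise sum followed by a reshape. A minimal NumPy sketch, with the function name and example shapes chosen only for illustration:

```python
import numpy as np

def fuse_branches(first, second):
    """Add two 4-D branch outputs element by element, then merge the last two axes."""
    if first.shape != second.shape:
        raise ValueError("branch outputs must have identical shapes")
    third = first + second                    # third output tensor, 4-D
    b, c, h, w = third.shape
    return third.reshape(b, c, h * w)         # fourth output tensor, 3-D

first = np.ones((1, 19, 2, 4))                # made-up branch outputs
second = 2 * np.ones((1, 19, 2, 4))
fourth = fuse_branches(first, second)
print(fourth.shape)                           # (1, 19, 8)
```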
S33: Apply the first fully connected layer and then the second fully connected layer to the fourth output tensor to obtain the output tensor of the feature extraction.
In this embodiment, the fourth output tensor is fed into the first fully connected layer, which reduces its last dimension to 1/4 of the original size and gives a fifth output tensor. The fifth output tensor is fed into the second fully connected layer, which restores the last dimension to the number of resistivity data of the breast and gives a sixth output tensor. The dimensionality of the sixth output tensor is then reduced to obtain the output tensor of the feature extraction, which is two-dimensional. The dimension reduction may be done by assignment, that is, by directly extracting the sub-tensor along the first dimension, whose size is 1, so that the output tensor of the feature extraction becomes two-dimensional; alternatively, it may be done by a reshape operation.
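The two fully connected layers and the final dimension reduction in S33 can be sketched as below. The weights are random stand-ins, no activation is applied because the text names none, and the names `feature_head`, `w1`, `b1`, `w2`, `b2` are hypothetical.

```python
import numpy as np

def feature_head(fourth, p):
    """fourth: (1, C, D) -> 2-D feature tensor of shape (C, n_out)."""
    fifth = fourth @ p["w1"] + p["b1"]     # last axis D -> D // 4
    sixth = fifth @ p["w2"] + p["b2"]      # last axis D // 4 -> n_out
    # Dimension reduction by indexing the size-1 leading axis;
    # sixth.reshape(sixth.shape[1], sixth.shape[2]) gives the same result.
    return sixth[0]

rng = np.random.default_rng(0)
D, n_out = 8, 19                           # assumed sizes; n_out = number of resistivity data
params = {"w1": rng.normal(size=(D, D // 4)), "b1": np.zeros(D // 4),
          "w2": rng.normal(size=(D // 4, n_out)), "b2": np.zeros(n_out)}
features = feature_head(rng.normal(size=(1, 19, D)), params)
print(features.shape)                      # (19, 19)
```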
Information fusion and dimension transformation are applied to the output tensors of the first and second branches. By learning from the input data, the pseudo-twin feature extraction network enlarges the differences between data and makes the data features distinct, so that the category classification part can classify the extracted features more easily, which improves the accuracy of the judgment.
The output tensor of the feature extraction is classified by a random forest algorithm: a plurality of class predictions are generated, and the class with the most votes is taken as the final class. As shown in fig. 4, the random forest algorithm proceeds as follows:
S341: Sample the output tensor of the feature extraction N times with replacement, drawing 1 sample each time, to form N samples. The N selected samples are used to train a decision tree and serve as the samples at the root node of that tree. Here N is the number of resistivity data collected from the breast;
S342: Each sample has M attributes. Randomly select m of the M attributes for classification, classify on each of the m attributes, and split nodes on an attribute using information gain, where m < M and preferably m = log2 M;
S343: Sort the values of the m attributes in ascending order, take the midpoint of each two adjacent values as a candidate split point for binary classification, and use the information entropy formula and the final information gain formula to select, among the m attributes, the attribute with the largest final information gain as the final split point; then split at the final split point. The information entropy is calculated as follows:
Figure BDA0003839824780000081
Figure BDA0003839824780000091
in the formula, m represents the attribute value shared by all samples that serves as the split attribute node, avg(m_i, m_j) denotes the average of two adjacent attribute values after sorting in ascending order, N_y represents the number of instances whose sample label y equals the corresponding value, and p_y represents the probability that the sample label y equals the corresponding value;
the final information gain calculation formula is as follows:
Figure BDA0003839824780000092
in the formula, D is the training sample space and |D| represents the number of training samples; the symbol shown in Figure BDA0003839824780000093 represents the number of training samples whose corresponding attribute value is greater than the split attribute value, and the symbol shown in Figure BDA0003839824780000094 represents the number of training samples whose corresponding attribute value is less than the split attribute value.
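The formula images are not reproduced in this text, so the following is only a sketch of the split selection in step S343 under stated assumptions: it uses the standard Shannon entropy (base 2) and the standard information gain for a threshold split, which may differ in detail from the application's formulas. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of labels, with p_y = N_y / N."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n_y / total) * math.log2(n_y / total)
                for n_y in Counter(labels).values())


def candidate_splits(values):
    """Midpoints avg(m_i, m_j) of adjacent distinct values after ascending sort."""
    ordered = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]


def information_gain(values, labels, split):
    """Entropy reduction obtained by splitting the samples at `split`."""
    greater = [y for v, y in zip(values, labels) if v > split]
    # Values equal to the split (only possible through rounding) go to the lower side.
    lesser = [y for v, y in zip(values, labels) if v <= split]
    total = len(labels)
    return (entropy(labels)
            - len(greater) / total * entropy(greater)
            - len(lesser) / total * entropy(lesser))


def best_split(samples, labels, attributes):
    """Return (gain, attribute index, split value) with the largest gain."""
    best = None
    for a in attributes:
        column = [s[a] for s in samples]
        for split in candidate_splits(column):
            gain = information_gain(column, labels, split)
            if best is None or gain > best[0]:
                best = (gain, a, split)
    return best


# Toy data: four samples with two attributes each, labels 0 (normal) or 1 (abnormal).
samples = [[1.0, 7.0], [2.0, 3.0], [3.0, 8.0], [4.0, 2.0]]
labels = [0, 0, 1, 1]
gain, attribute, split = best_split(samples, labels, attributes=[0, 1])
```

In this toy data, attribute 0 separates the two labels perfectly at the midpoint 2.5, so it is the one selected.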
During the formation of the decision tree, each node is split according to the above method. If the attribute selected for a node at its next split has already been used for splitting at its parent node, the node has reached a leaf node and is not split further. The random forest classifier is formed by building such decision trees.
S344: a predicted label is assigned to the splitting result of the final split point, the label with the highest number of votes is selected by voting, and the labeling result is output; if several labels tie for the highest number of votes, one of them is selected at random. The predicted label is 0 or 1: 0 indicates that the data of the data set are not abnormal, that is, the physiological condition of the breast of the person under test is normal and the person does not suffer from a breast tumor; 1 indicates that the data of the data set are abnormal, that is, the breast of the person under test has a lesion and the person suffers from a breast tumor.
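The sampling, attribute selection, and voting in steps S341, S342, and S344 can be sketched as follows. This is an illustration only: the function names, the fixed seed, and the example sizes are assumptions, and the training of the individual decision trees is omitted.

```python
import math
import random
from collections import Counter


def bootstrap_indices(n, rng):
    """Draw n sample indices with replacement, one at a time (step S341)."""
    return [rng.randrange(n) for _ in range(n)]


def random_attributes(total, rng):
    """Pick m distinct attribute indices out of M, with m = log2(M) (step S342).

    At least one attribute is always returned, so M = 1 is a degenerate case.
    """
    m = max(1, int(math.log2(total)))
    return rng.sample(range(total), m)


def majority_vote(predictions, rng):
    """Return the label with the most votes, breaking ties at random (step S344)."""
    counts = Counter(predictions)
    top = max(counts.values())
    winners = sorted(label for label, c in counts.items() if c == top)
    return rng.choice(winners)


rng = random.Random(0)                       # fixed seed only for repeatability
rows = bootstrap_indices(6, rng)             # indices into a 6-sample training set
cols = random_attributes(16, rng)            # 4 of 16 attributes, since log2(16) = 4
label = majority_vote([1, 0, 1, 1, 0], rng)  # three trees vote 1, two vote 0
```

Because the indices are drawn with replacement, the same sample can appear several times in `rows`, while the attribute indices in `cols` are distinct.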
The present application provides an artificial intelligence-based breast sampling data processing method, in which changes in the electrical impedance of a patient's breast are identified from resistivity data acquired by a bioelectrical impedance acquisition platform, data features are extracted and fused by the pseudo-twin feature extraction network, the feature extraction result is classified with a random forest algorithm, and whether the data are abnormal is judged. A data set was built with the method of this embodiment and used for deep learning training; verification showed that the accuracy of data judgment with the text classification model exceeds 90 percent. The method improves the judgment of data abnormality of the breast under an excitation source, and the result can serve as effective support for a doctor's diagnosis while shortening the patient examination procedure, which brings convenience to doctors and patients and improves efficiency. Besides processing breast sampling data, the text classification model can also be extended to process data for other tissue pathologies, such as those of the thyroid, skin, rectum, and cervix.
An embodiment of the present application further provides an electronic device, which includes a memory and a processor, where the processor is configured to execute the text classification model stored in the memory, so as to implement any one of the above artificial intelligence-based breast sampling data processing methods.
Any reference to memory, storage, database, or other medium used herein may include non-volatile and/or volatile memory. Suitable non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM), which acts as external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A breast sampling data processing method based on artificial intelligence is characterized by comprising the following steps:
collecting resistivity data of a human mammary gland part according to the external excitation frequency of a patient to be detected;
inputting the input data into a text classification model, wherein the text classification model comprises a feature extraction part and a category classification part, and the input data is subjected to feature extraction and category classification in sequence;
and passing the input data through the text classification model to obtain output data, and judging whether the output data are abnormal.
2. The artificial intelligence based breast sampling data processing method according to claim 1, wherein the feature extraction part is a pseudo-twin feature extraction network, the pseudo-twin feature extraction network includes a first branch, a second branch, a first fully-connected layer and a second fully-connected layer, and the specific steps are as follows:
respectively inputting the input data into a first branch and a second branch for feature extraction to obtain an output tensor of the first branch and an output tensor of the second branch;
combining the first branch output tensor and the second branch output tensor to obtain a third output tensor, and performing dimensionality reduction processing on the third output tensor to obtain a fourth output tensor;
and sequentially carrying out first full-connection layer and second full-connection layer operation on the fourth output tensor to obtain an output tensor of feature extraction.
3. The method for processing artificial intelligence-based breast sampling data according to claim 2, wherein the first branch performs feature extraction based on a Transformer structure, and the feature extraction of the first branch comprises: passing the input data sequentially through 5 Transformer layers and then through one 5 × 5 depthwise separable convolution to obtain the output tensor of the first branch.
4. The method for processing artificial intelligence-based breast sampling data according to claim 3, wherein the Transformer layer operates as follows:
first, passing the input data through a multi-head attention sublayer to obtain a first output tensor;
applying a linear transformation to the first output tensor and activating the result with a sigmoid activation function to obtain an output tensor of the multi-head attention mechanism;
applying random neuron inactivation (dropout) to the output tensor of the multi-head attention mechanism to obtain a second output tensor;
adding the second output tensor and the input data tensor to obtain a first-part output tensor of the Transformer layer;
feeding the first-part output tensor of the Transformer layer into a feed-forward fully connected layer for processing, and applying random neuron inactivation to the output tensor of the feed-forward fully connected layer to obtain a second-part output tensor of the Transformer layer; wherein the feed-forward fully connected layer comprises a first fully connected sublayer and a second fully connected sublayer;
and adding the first-part output tensor of the Transformer layer and the second-part output tensor of the Transformer layer to obtain the output tensor of the Transformer layer.
5. The method of claim 4, wherein feeding the first-part output tensor of the Transformer layer into the feed-forward fully connected layer for processing comprises:
feeding the first-part output tensor of the Transformer layer into the first fully connected sublayer for a fully connected operation, and activating the result with a ReLU function;
applying random neuron inactivation (dropout) to the activated output tensor of the first fully connected sublayer;
and feeding the output tensor of the first fully connected sublayer after random inactivation into the second fully connected sublayer for a fully connected operation, and activating the result with a sigmoid function to obtain the output tensor of the feed-forward fully connected layer.
6. The method as claimed in claim 2, wherein the second branch performs feature extraction based on a fully convolutional network, the second branch includes a plurality of convolution blocks, and the input tensor is passed sequentially through the convolution blocks for processing, so as to obtain an output tensor of the second branch.
7. The method for processing artificial intelligence-based breast sampling data according to claim 6, wherein the processing steps of each convolution block are as follows:
performing a 1 × 1 convolution on the input data, then one 3 × 3 depthwise separable convolution, then a 1 × 1 convolution, and then random neuron inactivation (dropout), to obtain the output tensor of the second branch.
8. The method as claimed in claim 1, wherein the classification of classes includes classifying the output tensors of the feature extraction by using a random forest classification algorithm to generate a plurality of classes, and voting the class with the highest number of votes as the final class.
9. The artificial intelligence based breast sampling data processing method according to claim 8, wherein the random forest algorithm comprises the following specific steps:
sampling the output tensor of the feature extraction with replacement N times to form N samples, wherein N is the number of collected resistivity data of the mammary gland part of the human body;
each sample contains M attributes, and m attributes are randomly extracted for classification, wherein m < M;
the values of the m attributes are arranged in ascending order, and the attribute with the largest final information gain among the m attributes is selected as the final split point by calculating the information entropy and the final information gain;
splitting the final splitting point, carrying out predictive marking on the splitting result of the final splitting point, selecting a mark with the highest vote number according to a voting method, and outputting the marking result;
the calculation formula of the information entropy is as follows:
Figure FDA0003839824770000031
Figure FDA0003839824770000032
in the formula, m represents the attribute value shared by all samples that serves as the split attribute node, avg(m_i, m_j) denotes the average of two adjacent attribute values after sorting in ascending order, N_y represents the number of instances whose sample label y equals the corresponding value, and p_y represents the probability that the sample label y equals the corresponding value;
the final information gain is calculated as follows:
Figure FDA0003839824770000033
in the formula, D is the training sample space and |D| represents the number of training samples; the symbol shown in Figure FDA0003839824770000034 represents the number of training samples whose corresponding attribute value is greater than the split attribute value, and the symbol shown in Figure FDA0003839824770000035 represents the number of training samples whose corresponding attribute value is less than the split attribute value.
10. An electronic device comprising a memory and a processor, wherein the processor is configured to execute the text classification model stored in the memory to implement the artificial intelligence based breast sampling data processing method according to any one of claims 1 to 9.
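The data flow of the Transformer layer recited in claims 4 and 5 can be pictured with the following sketch. It is not the applicant's implementation: the multi-head attention sublayer is passed in as a callable and replaced by an identity placeholder in the example, the tensor is a two-dimensional nested list, the inverted scaling inside `dropout` is an assumption, and all names and weights are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    return max(0.0, x)


def dense(rows, weight, bias, activation):
    """Fully connected layer on each row, followed by an element-wise activation."""
    return [[activation(sum(v * w for v, w in zip(row, unit)) + b)
             for unit, b in zip(weight, bias)]
            for row in rows]


def dropout(rows, rate, rng):
    """Randomly zero elements with probability `rate` (assumed inverted scaling)."""
    if rate == 0.0:
        return rows
    keep = 1.0 - rate
    return [[v / keep if rng.random() < keep else 0.0 for v in row] for row in rows]


def add(a, b):
    """Element-wise sum of two equally shaped 2-D tensors."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transformer_layer(x, attention, params, rate, rng):
    first = attention(x)                                            # first output tensor
    attn_out = dense(first, params["w_o"], params["b_o"], sigmoid)  # linear + sigmoid
    second = dropout(attn_out, rate, rng)                           # second output tensor
    part1 = add(second, x)                                          # first-part output
    hidden = dense(part1, params["w_1"], params["b_1"], relu)       # first sublayer, ReLU
    hidden = dropout(hidden, rate, rng)
    ffn_out = dense(hidden, params["w_2"], params["b_2"], sigmoid)  # second sublayer
    part2 = dropout(ffn_out, rate, rng)                             # second-part output
    return add(part1, part2)                                        # layer output


identity = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical weights for a 2-feature input
zeros = [0.0, 0.0]
params = {"w_o": identity, "b_o": zeros, "w_1": identity, "b_1": zeros,
          "w_2": identity, "b_2": zeros}
x = [[0.0, 0.0]]                     # one position with two features
out = transformer_layer(x, attention=lambda t: t, params=params,
                        rate=0.0, rng=random.Random(0))
```

With the dropout rate set to 0 the result is deterministic; a nonzero rate would be used only during training.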
CN202211101853.5A 2022-09-09 2022-09-09 Mammary gland sampling data processing method based on artificial intelligence Active CN115481681B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211101853.5A CN115481681B (en) 2022-09-09 2022-09-09 Mammary gland sampling data processing method based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211101853.5A CN115481681B (en) 2022-09-09 2022-09-09 Mammary gland sampling data processing method based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN115481681A true CN115481681A (en) 2022-12-16
CN115481681B CN115481681B (en) 2024-02-06

Family

ID=84424147

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211101853.5A Active CN115481681B (en) 2022-09-09 2022-09-09 Mammary gland sampling data processing method based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN115481681B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105534524A (en) * 2016-02-05 2016-05-04 思澜科技(成都)有限公司 Device and method for quickly recognizing parathyroid gland in thyroid surgery
CN110633667A (en) * 2019-09-11 2019-12-31 沈阳航空航天大学 Action prediction method based on multitask random forest
WO2021104099A1 (en) * 2019-11-29 2021-06-03 中国科学院深圳先进技术研究院 Multimodal depression detection method and system employing context awareness
CN113012753A (en) * 2021-03-09 2021-06-22 桂林电子科技大学 Low-density lipoprotein data processing method based on ensemble learning
WO2021143402A1 (en) * 2020-01-17 2021-07-22 上海优加利健康管理有限公司 Heartbeat classification method for multi-tag ecg signal labeling, and device
CN114041773A (en) * 2021-11-01 2022-02-15 河南师范大学 Apoplexy position classification method based on electrical impedance tomography measurement framework
CN114862844A (en) * 2022-06-13 2022-08-05 合肥工业大学 Infrared small target detection method based on feature fusion

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
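GUOCHEN YU et al.: "Dual-Branch Attention-In-Attention Transformer for Single-Channel Speech Enhancement", ICASSP 2022 - 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) *
李昕 (Li Xin) et al.: "Research on an optimization algorithm for automatic diagnosis of breast tissue disease by electrical impedance spectroscopy" (一种电阻抗频谱法自动诊断乳腺组织疾病优化算法研究), Chinese Journal of Biomedical Engineering (《中国生物医学工程学报》), no. 02 *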

Also Published As

Publication number Publication date
CN115481681B (en) 2024-02-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant