CN111048207B - Plasma donor evaluation method and system - Google Patents

Publication number
CN111048207B
Authority
CN
China
Prior art keywords
data
pulp
group
input
consistency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911374472.2A
Other languages
Chinese (zh)
Other versions
CN111048207A (en)
Inventor
杨智钧
杨佑禄
白永明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Jiuba Village Information Technology Co ltd
Original Assignee
Sichuan Jiuba Village Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Jiuba Village Information Technology Co ltd filed Critical Sichuan Jiuba Village Information Technology Co ltd
Priority to CN201911374472.2A priority Critical patent/CN111048207B/en
Publication of CN111048207A publication Critical patent/CN111048207A/en
Application granted granted Critical
Publication of CN111048207B publication Critical patent/CN111048207B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Public Health (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Pathology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a plasma donor evaluation method in the technical field of data processing and analysis. The method comprises: extracting a plurality of groups of data sources; preliminarily preprocessing each group of data sources to obtain the input features of each group of data; performing correlation analysis to calculate correlation coefficients between different input features; calculating, from the obtained correlation coefficients, the weight of each input feature in each group of data by the analytic hierarchy process; normalizing each input feature and computing a weighted sum of the input features in each group of data; and automatically clustering the set of weighted sums with a machine learning algorithm to grade the plasma donor corresponding to each group of data. The market potential of each plasma donor is thereby inferred from the donor information entered in the business system of the plasma station, so that plasma stations and biological companies can take different, targeted promotion measures for different plasma donors.

Description

Plasma donor evaluation method and system
Technical Field
The invention belongs to the technical field of data processing and analysis, relates to correlation analysis, the analytic hierarchy process, cluster analysis, the Python programming language, feature engineering, machine learning and the like, and particularly relates to a plasma donor evaluation method and system.
Background
A plasma station deals every day with a large number of new, regular, young and older plasma donors. The donors come from different places, differ in sex, and live at different distances from the station. Most plasma stations have now fully automated the plasma collection workflow, which has greatly improved the efficiency of their business processes. However, the volume of plasma collected by plasma stations in China is currently trending downward, and plasma collection in China falls short of the self-sufficiency standard given by the World Health Organization. The population of the United States is only about one quarter of that of China, yet the volume of plasma collected in the United States in 2017 was 2.5 times that of China, so China has considerable room to increase its plasma collection.
At present, in order to expand donor recruitment and to decide which groups to target with publicity and advertising, plasma stations and biological companies judge whether a donor is a high-quality donor or an ordinary donor from past experience or from one or two criteria. This often produces a large number of misjudgments: donor grading involves more than two dimensions, a decision maker can hardly analyze many dimensions comprehensively, and the judgment lacks the support of a large amount of data.
It is therefore necessary to grade plasma donors along multiple dimensions on the basis of a large amount of data. No method currently on the market grades plasma donors according to their characteristics, which makes the publicity of plasma stations untargeted and increases its cost.
Disclosure of Invention
In view of this, and to solve the above problems in the prior art, an object of the present invention is to provide a plasma donor evaluation method and system that infer the market potential of each plasma donor from the donor information entered in the business system of the plasma station, so that plasma stations and biological companies can take different, targeted promotion measures for different plasma donors.
The technical scheme adopted by the invention is as follows: a plasma donor evaluation method, the method comprising:
extracting a plurality of groups of data sources;
preliminarily preprocessing each group of data sources to obtain the input features of each group of data;
performing correlation analysis to calculate correlation coefficients between different input features;
calculating, from the obtained correlation coefficients, the weight of each input feature in each group of data by the analytic hierarchy process;
normalizing each input feature, and computing a weighted sum of the input features in each group of data;
automatically clustering the set of weighted sums with a machine learning algorithm;
and grading the plasma donor corresponding to each group of data, taking the output of the automatic clustering as the grading basis.
Further, the input features of each group of data include plasma donation frequency, potential plasma donation time, donor type, and gender.
Further, donor types are divided into fixed donors and non-fixed donors; fixed donors are labeled 1 and non-fixed donors are labeled 0.
Further, the gender labels for male and female are defined by the respective proportions of men and women among historical plasma donors.
Further, in the correlation analysis, plasma donation frequency is taken as the main factor, and the correlation coefficients between plasma donation frequency and each of potential plasma donation time, donor type, and gender are calculated.
Further, the analytic hierarchy process includes:
(a) Establishing a judgment matrix according to each correlation coefficient and each input characteristic;
(b) Analyzing the consistency of the judgment matrix; if consistency is not satisfied, rebuilding the judgment matrix; if consistency is satisfied, calculating the weight of each input feature.
The analytic hierarchy process is a systematic method that treats a complex multi-objective decision problem as a system: it decomposes the overall objective into several objectives or criteria, decomposes these further into several levels of indicators (or criteria and constraints), computes the single-level ranking (weights) and the total ranking by fuzzy quantification of qualitative indicators, and uses these rankings as the basis for optimizing multi-indicator, multi-alternative decisions.
Further, the automatic clustering includes:
(1) Taking a result set obtained by respectively carrying out weighted summation on each input characteristic as a sample of a machine learning algorithm;
(2) Randomly generating N clustering centers;
(3) Dividing the samples into N clusters according to the distances between the samples and the centroids of the clustering centers;
(4) Judging whether each cluster has a change, if so, readjusting each cluster center; if not, outputting each cluster center, and taking each cluster center as a division basis.
Further, the machine learning algorithm employs a K-Means clustering algorithm.
The invention also discloses a plasma donor evaluation system, which comprises a data source acquisition module, a data preprocessing module, a data analysis module, an analytic hierarchy process module and a main operation program module that are communicatively connected in sequence, wherein the data source acquisition module is used for acquiring a plurality of groups of data sources;
the data preprocessing module is used for preliminarily preprocessing each group of data sources and obtaining the input features of each group of data;
the data analysis module is used for calculating correlation coefficients between different input features;
the analytic hierarchy process module is used for calculating the weight of each input feature in each group of data and obtaining the set of weighted sums;
the main operation program module is used for calling a machine learning algorithm, automatically clustering the set of weighted sums, and grading the plasma donor corresponding to each group of data.
Further, the system also comprises a storage module for storing the donor grading results in real time.
The beneficial effects of the invention are as follows:
1. With the plasma donor evaluation method and system provided by the invention, the correlation coefficients of the donors' input features are analyzed, which gives the analytic hierarchy process a basis for constructing the judgment matrix and avoids subjective guesswork. Weighting each input feature through the analytic hierarchy process avoids one-sided subjective assumptions. Cluster analysis by a machine learning algorithm provides the cluster centers as the standard for donor grading; the grading is updated once a day, so donor grades are refreshed dynamically and different promotion can be carried out for different donors. The whole process is fully automatic and based on data analysis and processing. With the method and system, plasma donors can be graded reasonably, plasma stations or biological companies can expand donor recruitment in a targeted way, and the daily change in each donor's grade can be seen.
Drawings
FIG. 1 is a block diagram of the workflow of the plasma donor evaluation method provided by the present invention;
FIG. 2 is a schematic diagram of the workflow of the K-Means clustering algorithm in the plasma donor evaluation method provided by the present invention;
FIG. 3 is a system architecture diagram of the plasma donor evaluation system provided by the present invention.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar modules or modules having like or similar functions throughout. The embodiments described below by referring to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application. On the contrary, the embodiments of the present application include all alternatives, modifications, and equivalents as may be included within the spirit and scope of the appended claims.
Example 1
This embodiment discloses a plasma donor evaluation method with which plasma donors can be graded reasonably, plasma stations or biological companies can expand donor recruitment in a targeted way, and the daily change in each donor's grade can be seen. As shown in FIG. 1, the method comprises the following steps in a specific application:
1. Data source acquisition: raw data are extracted from the database in a targeted manner to obtain a plurality of groups of data sources; part of the data is shown in Table 1 below:
name,sex,age,doctime,regdate,donatetimes,donortype
siberian cocklebur good,1,43,29/8/2019,23/9/2019,2,1
To snow,1,31,8/10/2019,8/10/2019,1,3
Wang Dadong,1,27,26/8/2019,26/8/2019,1,3
Lin Xiao,2,27,26/8/2019,26/8/2019,1,3
Wang Aiguo,1,55,26/8/2019,26/8/2019,1,3
Li Liangui,1,31,26/8/2019,26/8/2019,1,3
Li Jianguo,1,55,26/8/2019,27/8/2019,1,3
Wang Xiaoxiao,2,27,26/8/2019,26/8/2019,1,3
Lin Qian,2,23,26/8/2019,26/8/2019,1,3
Page Liang Hui,1,41,26/8/2019,26/8/2019,1,3
Cheng Qian diving,2,39,26/8/2019,26/8/2019,1,3
Li Liangui,1,31,26/8/2019,27/8/2019,2,1
Page Liang Hui,1,41,26/8/2019,27/8/2019,2,1
Is of the type of being able to stop the flow of yang,2,36,29/8/2019,23/9/2019,1,3
Wang Dadong,1,27,26/8/2019,27/8/2019,2,1
In this step, an SQL query is used to select the raw data from the database in a targeted manner; the result is stored in a result table and saved as a CSV file.
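The extraction step can be sketched as follows. This is only an illustration under stated assumptions: the patent does not name the database engine, the table, or the query, so SQLite and the table name `donor_records` are hypothetical, and only the column names are taken from Table 1.

```python
import csv
import io
import sqlite3

# Column names follow Table 1; the table name "donor_records" is hypothetical.
COLUMNS = ["name", "sex", "age", "doctime", "regdate", "donatetimes", "donortype"]


def export_donor_rows(conn, out_file):
    """Run the selection query and write the result set as CSV. Returns the row count."""
    query = "SELECT " + ", ".join(COLUMNS) + " FROM donor_records ORDER BY rowid"
    writer = csv.writer(out_file)
    writer.writerow(COLUMNS)  # header row, as in Table 1
    count = 0
    for row in conn.execute(query):
        writer.writerow(row)
        count += 1
    return count


def build_demo_database():
    """Create an in-memory database holding two rows copied from Table 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE donor_records (name TEXT, sex INTEGER, age INTEGER, "
        "doctime TEXT, regdate TEXT, donatetimes INTEGER, donortype INTEGER)"
    )
    conn.executemany(
        "INSERT INTO donor_records VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            ("Wang Dadong", 1, 27, "26/8/2019", "26/8/2019", 1, 3),
            ("Lin Xiao", 2, 27, "26/8/2019", "26/8/2019", 1, 3),
        ],
    )
    return conn


# Write to an in-memory buffer here; a real run would pass an open file instead.
buffer = io.StringIO()
row_count = export_donor_rows(build_demo_database(), buffer)
csv_text = buffer.getvalue()
```

Passing a file opened with `open(path, "w", newline="")` instead of the buffer would produce the CSV file that the later steps read.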
2. Preliminary data preprocessing: each group of data sources is preliminarily preprocessed to obtain the input features of each group of data. In this embodiment, the input features of each group of data are plasma donation frequency, potential plasma donation time, donor type, and gender, where the plasma donation frequency is determined from the number of donations recorded in the data source.
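The text does not give the frequency formula. The sample values in the `frq` column of Table 2 are, however, consistent with dividing `donatetimes` by the number of days between `doctime` and a reference date of 13 December 2019 (for example, 2/106 ≈ 0.01886792 for a `doctime` of 29/8/2019). The sketch below encodes that inferred rule; both the rule and the reference date are assumptions reverse-engineered from the sample rows, not statements from the patent.

```python
from datetime import date, datetime

# Assumption: reference date inferred from the sample rows of Tables 1 and 2.
REFERENCE_DATE = date(2019, 12, 13)


def donation_frequency(donate_times, doctime, reference_date=REFERENCE_DATE):
    """Donations per elapsed day between doctime (d/m/yyyy) and the reference date."""
    start = datetime.strptime(doctime, "%d/%m/%Y").date()
    elapsed_days = (reference_date - start).days
    if elapsed_days <= 0:
        raise ValueError("doctime must be earlier than the reference date")
    return donate_times / elapsed_days


# Rows taken from Table 1: (donatetimes, doctime).
sample_rows = [(2, "29/8/2019"), (1, "8/10/2019"), (1, "26/8/2019")]
frequencies = [donation_frequency(n, d) for n, d in sample_rows]
```

In a daily run the reference date would presumably be the current date, which is one way the grades could change from day to day.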
The potential plasma donation time is calculated as follows. The donor ages considered lie in the range [18, 55], and what matters is each donor's potential future donation time. If x_1, x_2, ..., x_n denote the ages of n donors, then 55 - x_1, 55 - x_2, ..., 55 - x_n represent their potential donation times.
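A minimal sketch of this calculation is shown below. The function and parameter names are illustrative. Note that the `future_time` values in Table 2 are one larger than 55 minus the age in Table 1 (age 43 maps to 13, age 55 maps to 1), so an `offset` parameter is exposed here as an assumption to reproduce either convention.

```python
MIN_DONOR_AGE = 18  # lower bound of the age range given in the text
MAX_DONOR_AGE = 55  # upper bound of the age range given in the text


def potential_donation_time(ages, offset=0):
    """Return MAX_DONOR_AGE - age (+ offset) for each age in the eligible range."""
    result = []
    for age in ages:
        if not MIN_DONOR_AGE <= age <= MAX_DONOR_AGE:
            raise ValueError(f"age {age} is outside [{MIN_DONOR_AGE}, {MAX_DONOR_AGE}]")
        result.append(MAX_DONOR_AGE - age + offset)
    return result


ages = [43, 31, 27, 55]                                   # ages from Table 1
by_formula = potential_donation_time(ages)                # formula as written: 55 - x
as_in_table = potential_donation_time(ages, offset=1)     # matches Table 2 sample values
```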
Donor types are divided into fixed donors and non-fixed donors, retaining the meaning of the original field (fixed donor or not). The number 1 denotes a fixed donor, whose contribution to the donor grade is higher; the number 0 denotes a non-fixed donor. Fixed donors are thus labeled 1 and non-fixed donors are labeled 0.
Gender is treated as follows: from the historical data, a pie chart of the numbers of male and female donors is compiled, giving a male-to-female ratio of 0.348:0.652. The male label is therefore defined as 0.348 and the female label as 0.652. These values are dynamic and change as the data are updated.
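The two encodings can be sketched as below. The raw codes are assumptions inferred by comparing Table 1 with Table 2: a raw `donortype` of 1 appears as 1 after processing and a raw value of 3 appears as 0, while raw `sex` code 1 maps to 0.348 and code 2 to 0.652. The history list is synthetic and exists only to reproduce the ratio quoted in the text.

```python
from collections import Counter

# Assumption: raw donortype 1 marks a fixed donor (inferred from Tables 1 and 2).
FIXED_DONOR_CODE = 1


def encode_donor_type(raw_type):
    """Return 1 for a fixed donor and 0 for any other raw donortype code."""
    return 1 if raw_type == FIXED_DONOR_CODE else 0


def gender_labels(historical_sex_codes):
    """Map each sex code to its share of the historical donor records."""
    counts = Counter(historical_sex_codes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no historical records")
    return {code: count / total for code, count in counts.items()}


# Synthetic history chosen to reproduce the 0.348:0.652 ratio quoted in the text.
history = [1] * 348 + [2] * 652
labels = gender_labels(history)
encoded_types = [encode_donor_type(t) for t in [1, 3, 3, 1]]
```

Because the labels are recomputed from the history, they shift whenever new donor records arrive, which matches the remark that the values are dynamic.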
The data after preliminary preprocessing are shown in Table 2 below:
frq,future_time,after_deal_donortype,after_deal_sex
0.01886792,13,1,0.348
0.01515152,25,0,0.348
0.00917431,29,0,0.348
0.00917431,29,0,0.652
0.00917431,1,0,0.348
0.00917431,25,0,0.348
0.00917431,1,0,0.348
0.00917431,29,0,0.652
0.00917431,33,0,0.652
0.00917431,15,0,0.348
0.00917431,17,0,0.652
0.01834862,25,1,0.348
0.01834862,15,1,0.348
0.00943396,20,0,0.652
0.01834862,29,1,0.348
3. Correlation analysis: correlation coefficients between different input features are calculated. In this embodiment, plasma donation frequency is the most direct criterion for grading donors: the higher the donation frequency, the better the donor and the higher the grade. The correlation between each of the other input features and donation frequency is therefore calculated:
1) The ratio of men to women among donors is 0.348:0.652, and the correlation coefficient between gender and donation frequency is 0.0023566340412670616;
2) The correlation coefficient between fixed-donor status and donation frequency is 0.6550817781382937;
3) The correlation coefficient between potential donation time and donation frequency is 0.06690809987447563.
In this step, the correlation coefficients can be calculated with the CORREL function in Excel, which takes any two data series as input.
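Excel's CORREL returns the Pearson correlation coefficient, so the same step can be sketched in Python as below. The function name is illustrative, and the input uses only the 15 sample rows of Table 2, so the result is not expected to match the coefficients above, which were presumably computed on the full data set.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(spread_x * spread_y)


# frq and after_deal_donortype columns from the sample rows of Table 2.
frq = [0.01886792, 0.01515152, 0.00917431, 0.00917431, 0.00917431,
       0.00917431, 0.00917431, 0.00917431, 0.00917431, 0.00917431,
       0.00917431, 0.01834862, 0.01834862, 0.00943396, 0.01834862]
donor_type = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1]

sample_r = pearson_correlation(donor_type, frq)
```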
4. Analytic hierarchy process: from the obtained correlation coefficients, the weight of each input feature in each group of data is calculated by the analytic hierarchy process. The analytic hierarchy process is a systematic method that treats a complex multi-objective decision problem as a system: it decomposes the overall objective into several objectives or criteria, decomposes these further into several levels of indicators (or criteria and constraints), computes the single-level ranking (weights) and the total ranking by fuzzy quantification of qualitative indicators, and uses these rankings as the basis for optimizing multi-indicator, multi-alternative decisions.
The analytic hierarchy process decomposes the decision problem into a hierarchy in the order of overall objective, sub-objectives at each level, evaluation criteria, and finally the specific alternative schemes. The priority weight of each element at one level with respect to an element of the level above is obtained by solving for the eigenvector of the judgment matrix, and the final weight of each alternative with respect to the overall objective is then obtained by hierarchical weighted summation; the alternative with the largest final weight is the optimal one.
The analytic hierarchy process is well suited to target systems whose evaluation indicators are layered and interleaved and whose target values are difficult to describe quantitatively. The specific steps of applying it in this embodiment are as follows:
(a) Constructing the judgment matrix: the influence factors are the correlation coefficients obtained in step 3, fixed-donor status, the proportion of the donor population, gender, and potential donation time. Potential donation time is an objective factor and is therefore placed first. A judgment matrix is then established. The judgment-matrix method is an improvement on the relative comparison method and is likewise an experience-based scoring method: all indicators are listed to form an N × N square matrix, the indicators are compared and scored pairwise, and finally the scores of each indicator are summed and normalized. The method of establishing the judgment matrix is described in the existing literature and is not repeated here. The judgment matrix in this embodiment is as follows:
[Figure: judgment matrix of this embodiment; image not reproduced]
(b) Analyzing the consistency of the judgment matrix; if consistency is not satisfied, rebuilding the judgment matrix; if consistency is satisfied, calculating the weight of each input feature as follows:
(b1) Eigenvalues (taking the real part): [4.16540439, -0.1115912, -0.02690659, -0.02690659]
(b2) Eigenvectors (taking the real part):
[[0.79200584, -0.85345804, 0.68048194, 0.68048194]
 [0.28327861, -0.13188751, -0.36522767, -0.36522767]
 [0.53691597, 0.50279034, 0.12210932, 0.12210932]
 [0.06481678, 0.03764203, 0.01696366, 0.01696366]]
(b3) The maximum eigenvalue is 4.165404389255812.
The corresponding eigenvector is:
[0.79200584, 0.28327861, 0.53691597, 0.06481678]
(b4) Computing the consistency index CI:
CI = (λ_max - n)/(n - 1)
where λ_max denotes the maximum eigenvalue and n denotes the number of features (the order of the judgment matrix). With λ_max = 4.165404389255812 and n = 4, CI ≈ 0.0551348.
To measure the consistency of judgment matrices of different orders, the average random consistency index RI is introduced; the RI value used for the consistency check of this judgment matrix is 0.9.
Calculating the random consistency ratio CR:
CR = CI/RI
If the calculated random consistency ratio satisfies CR < 0.1, the judgment matrix is considered to have satisfactory consistency.
The CR value of the judgment matrix in this embodiment is 0.06126088490956005.
The judgment matrix therefore passes the consistency test.
The weight vector of the indicators is as follows:
[0.4722705521131915, 0.16891813164079006, 0.32016128036588176, 0.03865003588013668].
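The reported numbers can be checked with a short sketch. It uses the standard analytic hierarchy process definition CI = (λ_max - n)/(n - 1), which reproduces the CR value reported above, and it assumes that the weights are the principal eigenvector divided by the sum of its components, which reproduces the reported weight vector to the precision of the printed eigenvector. The function names are illustrative.

```python
def consistency_ratio(lambda_max, n, random_index):
    """Return (CI, CR) for a judgment matrix of order n."""
    ci = (lambda_max - n) / (n - 1)
    return ci, ci / random_index


def normalize_to_weights(eigenvector):
    """Scale a principal eigenvector so that its components sum to 1."""
    total = sum(eigenvector)
    if total == 0:
        raise ValueError("eigenvector components sum to zero")
    return [component / total for component in eigenvector]


# Values reported in step (b) of this embodiment.
LAMBDA_MAX = 4.165404389255812
PRINCIPAL_EIGENVECTOR = [0.79200584, 0.28327861, 0.53691597, 0.06481678]

ci, cr = consistency_ratio(LAMBDA_MAX, n=4, random_index=0.9)
weights = normalize_to_weights(PRINCIPAL_EIGENVECTOR)
is_consistent = cr < 0.1  # acceptance threshold given in the text
```

Given the judgment matrix itself, the eigenvalues and eigenvectors could be obtained with a routine such as NumPy's `numpy.linalg.eig`; the matrix is only available as a figure here, so the sketch starts from the reported eigenpair.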
5. Normalization and weighted summation: each input feature is normalized, and the input features in each group of data are weighted and summed, as follows:
Normalization brings data of different orders of magnitude to the same order of magnitude, eliminating the effect of scale, as shown in Table 3 below:
[Table 3: normalized data; image not reproduced]
Based on the normalized table, each feature in each row is multiplied by the weight of that feature, since each input feature contributes differently to the donor grade. Each normalized column (the columns of Table 2 after normalization) is multiplied by the corresponding weight obtained in step 4, and the rows are then summed; part of the resulting data is shown in Table 4:
[Table 4: weighted sums; image not reproduced]
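A minimal sketch of this step follows. Two points are assumptions: the text does not say which normalization is used, so min-max scaling to [0, 1] is chosen here for illustration, and the pairing of each weight with a column is hypothetical because the judgment matrix that fixes the order is only available as a figure.

```python
def min_max_normalize(column):
    """Scale a column linearly to [0, 1]; a constant column maps to all zeros."""
    low, high = min(column), max(column)
    if high == low:
        return [0.0 for _ in column]
    return [(value - low) / (high - low) for value in column]


def weighted_scores(rows, weights):
    """Normalize each column, then return the weighted sum of each row."""
    if any(len(row) != len(weights) for row in rows):
        raise ValueError("every row needs one value per weight")
    columns = [list(column) for column in zip(*rows)]
    normalized = [min_max_normalize(column) for column in columns]
    return [
        sum(weight * column[i] for weight, column in zip(weights, normalized))
        for i in range(len(rows))
    ]


# Rows from Table 2: (frq, future_time, after_deal_donortype, after_deal_sex).
rows = [
    (0.01886792, 13, 1, 0.348),
    (0.01515152, 25, 0, 0.348),
    (0.00917431, 29, 0, 0.652),
    (0.00917431, 1, 0, 0.348),
]
# Weights from step 4; their assignment to these columns is an assumption.
weights = [0.4722705521131915, 0.16891813164079006,
           0.32016128036588176, 0.03865003588013668]
scores = weighted_scores(rows, weights)
```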
6. Cluster analysis: the set of weighted sums is automatically clustered by a machine learning algorithm. In this embodiment, as shown in FIG. 2, this comprises the following steps:
(1) Taking the result set obtained by the weighted summation of the input features as the samples of the machine learning algorithm;
(2) Randomly generating four cluster centers, and extracting the centroid of each cluster center;
(3) Dividing the samples into four clusters according to the distance between each sample and the centroid of each cluster center. K-Means is an unsupervised clustering algorithm: following a similarity principle, data objects that are highly similar are placed in the same cluster, and data objects that are highly dissimilar are placed in different clusters;
(4) Judging whether any cluster has changed; if so, readjusting the cluster centers; if not, outputting the cluster centers. Through this step, donor grades can be updated in real time, with the cluster centers as the grading basis.
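Steps (1) to (4) can be sketched as a plain one-dimensional K-Means loop. This is an illustrative implementation, not the patent's code: the initial centers are drawn from the distinct sample values with a fixed seed (an assumption made for repeatability), and the sample scores are made-up numbers.

```python
import random


def k_means_1d(samples, k=4, seed=0, max_iterations=100):
    """Cluster scalar samples into k clusters; returns (centers, assignment)."""
    distinct = sorted(set(samples))
    if len(distinct) < k:
        raise ValueError("need at least k distinct sample values")
    rng = random.Random(seed)
    centers = rng.sample(distinct, k)  # step (2): random initial centers
    assignment = None
    for _ in range(max_iterations):
        # Step (3): assign every sample to its nearest center.
        new_assignment = [
            min(range(k), key=lambda j: abs(x - centers[j])) for x in samples
        ]
        # Step (4): stop when no sample changed cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Otherwise readjust each center to the mean of its members.
        for j in range(k):
            members = [x for x, a in zip(samples, assignment) if a == j]
            if members:  # keep the old center if a cluster became empty
                centers[j] = sum(members) / len(members)
    return centers, assignment


# Hypothetical weighted-sum scores, used only to exercise the function.
scores = [0.10, 0.11, 0.12, 0.40, 0.41, 0.70, 0.71, 0.95, 0.96]
centers, assignment = k_means_1d(scores, k=4)
```

A library implementation such as scikit-learn's `KMeans` follows the same loop with a more careful initialization; the hand-written version is shown only to make the four steps explicit.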
7. Grading: four classes are defined by the K-Means clustering algorithm (unsupervised machine learning). The cluster centers from step 6 are extracted as the division basis, and the donor corresponding to each group of data is assigned to one of four grades A, B, C and D, giving the grade of each donor. The division result for part of the data is shown in Table 5 below:
[Table 5: grading results; image not reproduced]
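One way to turn cluster centers into grades is sketched below. The text does not state how clusters map to the letters, so ranking the centers from highest to lowest and assigning A to the highest is an assumption, motivated by the statement that a higher donation frequency means a higher grade. The centers and assignments are hypothetical values.

```python
def grades_from_clusters(centers, assignment, labels="ABCD"):
    """Rank clusters by center value (highest first) and label each sample."""
    if len(centers) != len(labels):
        raise ValueError("need exactly one label per cluster")
    ranked = sorted(range(len(centers)), key=lambda j: centers[j], reverse=True)
    grade_of_cluster = {cluster: labels[rank] for rank, cluster in enumerate(ranked)}
    return [grade_of_cluster[cluster] for cluster in assignment]


# Hypothetical clustering output: four centers and the cluster index of five donors.
example_centers = [0.2, 0.9, 0.5, 0.7]
example_assignment = [0, 1, 2, 3, 1]
grades = grades_from_clusters(example_centers, example_assignment)
```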
Example 2
On the basis of the plasma donor evaluation method provided in Example 1, and as shown in FIG. 3, this embodiment also discloses a plasma donor evaluation system. The system comprises a data source acquisition module, a data preprocessing module, a data analysis module, an analytic hierarchy process module and a main operation program module that are communicatively connected in sequence, and the main operation program module is connected to a storage module. The data source acquisition module is used for acquiring a plurality of groups of data sources from the database in a targeted manner;
the data preprocessing module is used for preliminarily preprocessing each group of data sources and obtaining the input features of each group of data, the input features comprising plasma donation frequency, potential plasma donation time, donor type and gender;
the data analysis module is used for calculating correlation coefficients between different input features; in this embodiment, plasma donation frequency is taken as the main factor, and the correlation coefficients between plasma donation frequency and each of potential plasma donation time, donor type and gender are calculated;
the analytic hierarchy process module is used for calculating the weight of each input feature in each group of data, computing a weighted sum of the input features in each group of data, and finally obtaining the set of weighted sums;
the main operation program module is used for calling a machine learning algorithm, namely the K-Means algorithm of unsupervised learning; the set of weighted sums is automatically clustered by the K-Means algorithm into four classes, the cluster centers output by the algorithm are extracted, and the donor corresponding to each group of data is graded; in this embodiment, the donor grades are A, B, C and D.
The storage module is used for storing the donor grading results in real time, so that the grading can be checked in real time and the corresponding promotion and publicity can be carried out.
Example 3
On the basis of the plasma donor evaluation method provided in Example 1, this embodiment provides a readable storage medium storing one or more programs that can be executed by one or more processors to implement the plasma donor evaluation method of Example 1, thereby grading plasma donors and allowing promotion to be adapted to the grading results.
It should be noted that in the description of the present application, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Furthermore, in the description of the present application, unless otherwise indicated, the meaning of "plurality" means at least two.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and further implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present application.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques well known in the art may be used: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, or the like.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and are not to be construed as limiting the application, and that changes, modifications, substitutions, and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.

Claims (6)

1. A plasma donor evaluation method, the method comprising:
extracting a plurality of groups of data sources;
performing preliminary data preprocessing on each group of data sources to obtain the input features of each group of data;
performing correlation analysis to calculate correlation coefficients between different input features;
calculating, according to the obtained correlation coefficients, the weight corresponding to each input feature in each group of data by an analytic hierarchy process;
normalizing each input feature, and performing a weighted summation of the input features in each group of data;
automatically clustering the result set of the weighted summation by a machine learning algorithm;
classifying the plasma donors corresponding to each group of data, taking the output of the automatic clustering as the classification basis;
the analytic hierarchy process comprises:
(a) establishing a judgment matrix according to the correlation coefficients and the input features;
(b) analyzing the consistency of the judgment matrix; if the consistency is not satisfied, re-establishing the judgment matrix; if the consistency is satisfied, calculating the weight of each input feature;
the consistency of the judgment matrix is analyzed as follows:
the random consistency ratio CR is calculated as:
CR = CI/RI;
CI = (λ_max - 1)/(n - 1);
wherein CR denotes the random consistency ratio, CI denotes the consistency index, RI denotes the average random consistency index, λ_max denotes the maximum eigenvalue of the judgment matrix, and n denotes the number of features;
if CR < 0.1, the consistency is considered satisfied; otherwise, the consistency is considered not satisfied;
the automatic clustering comprises:
(1) taking the result set obtained by the weighted summation of the input features as the samples of the machine learning algorithm;
(2) randomly generating N cluster centers;
(3) dividing the samples into N clusters according to the distance between each sample and each cluster center;
(4) determining whether any cluster has changed; if so, readjusting each cluster center; if not, outputting the cluster centers and taking them as the basis for division;
the input features of each group of data comprise plasma donation frequency, potential plasma donation time, plasma donor type, and sex;
the plasma donation frequency is taken as the main factor, and the correlation coefficients between the plasma donation frequency and each of the potential plasma donation time, the plasma donor type, and the sex are calculated.
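The consistency check and weight calculation of steps (a) and (b) can be illustrated with a minimal Python sketch. Several points are assumptions rather than part of the claim: the judgment matrix values are hypothetical (the claim does not say how the correlation coefficients are mapped to matrix entries), the weights are taken as the normalized principal eigenvector found by power iteration, and RI is read from Saaty's commonly tabulated values. The claim as printed gives the numerator of CI as λ_max - 1, whereas Saaty's standard definition is (λ_max - n)/(n - 1); the sketch uses the standard definition, which is an assumption about the intended formula.

```python
# Saaty's commonly tabulated average random consistency index RI (orders 1..9).
RI_TABLE = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
            6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def principal_eigen(matrix, iterations=200):
    """Power iteration on a positive matrix.

    Returns (lambda_max, w) where w is the principal eigenvector scaled to sum 1.
    """
    n = len(matrix)
    w = [1.0 / n] * n
    lam = float(n)
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        lam = sum(v)              # because sum(w) == 1, sum(A w) estimates lambda_max
        w = [x / lam for x in v]  # renormalize so the weights sum to 1
    return lam, w


def ahp_weights(matrix, threshold=0.1):
    """Return (weights, CR, is_consistent) for a pairwise judgment matrix."""
    n = len(matrix)
    lam_max, weights = principal_eigen(matrix)
    if n < 3:
        cr = 0.0                  # 1x1 and 2x2 reciprocal matrices are always consistent
    else:
        ci = (lam_max - n) / (n - 1)   # Saaty's definition (assumption, see lead-in)
        cr = ci / RI_TABLE[n]
    return weights, cr, cr < threshold


# Hypothetical judgment matrix for the four input features, in the order:
# plasma donation frequency, potential plasma donation time, donor type, sex.
judgment = [
    [1.0, 2.0, 3.0, 4.0],
    [1 / 2, 1.0, 2.0, 3.0],
    [1 / 3, 1 / 2, 1.0, 2.0],
    [1 / 4, 1 / 3, 1 / 2, 1.0],
]
weights, cr, consistent = ahp_weights(judgment)
print("weights:", weights)
print("CR:", cr, "consistent:", consistent)
```

If `consistent` is false, step (b) calls for the judgment matrix to be re-established before the weights are used.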
2. The method according to claim 1, wherein the plasma donor types are classified into fixed plasma donors and non-fixed plasma donors, the definition label for fixed plasma donors is 1, and the definition label for non-fixed plasma donors is 0.
3. The method according to claim 1, wherein a label is defined for each sex according to the ratio of the numbers of plasma donations made by males and by females.
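As a rough illustration of how the labels of claims 2 and 3 make the correlation analysis of claim 1 computable, the following Python sketch encodes the two categorical features and then computes a Pearson correlation coefficient between each feature and the plasma donation frequency. The records, the field names, the choice of Pearson correlation, and the reading of claim 3 as "each sex is labeled with its share of all donations" are all assumptions made for the example; the claims do not fix any of them.

```python
from math import sqrt

# Hypothetical donor records; field names and values are illustrative only.
records = [
    {"freq": 20, "potential_time": 10, "donor_type": "fixed", "sex": "F"},
    {"freq": 18, "potential_time": 14, "donor_type": "fixed", "sex": "F"},
    {"freq": 4, "potential_time": 60, "donor_type": "non-fixed", "sex": "M"},
    {"freq": 2, "potential_time": 90, "donor_type": "non-fixed", "sex": "M"},
    {"freq": 12, "potential_time": 30, "donor_type": "fixed", "sex": "M"},
]

# Claim 2: fixed plasma donors are labeled 1, non-fixed plasma donors 0.
TYPE_LABEL = {"fixed": 1, "non-fixed": 0}

# Claim 3 (assumed reading): label each sex by its share of all donations.
total = sum(r["freq"] for r in records)
sex_label = {
    s: sum(r["freq"] for r in records if r["sex"] == s) / total
    for s in {r["sex"] for r in records}
}


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


# Plasma donation frequency is the main factor; correlate every other feature with it.
freq = [r["freq"] for r in records]
features = {
    "potential_time": [r["potential_time"] for r in records],
    "donor_type": [TYPE_LABEL[r["donor_type"]] for r in records],
    "sex": [sex_label[r["sex"]] for r in records],
}
correlations = {name: pearson(values, freq) for name, values in features.items()}
print(correlations)
```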
4. The plasma donor evaluation method of claim 1, wherein the machine learning algorithm employs a K-Means clustering algorithm.
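The normalization, weighted summation, and clustering steps (1) to (4) of claim 1 can be sketched in plain Python as a one-dimensional K-Means. This is an illustration under stated assumptions, not the patented implementation: min-max normalization is assumed, the initial cluster centers are drawn at random from the samples, the weights and donor rows are hypothetical, and the grade of a donor is taken to be the rank of its cluster center.

```python
import random


def min_max(column):
    """Min-max normalize one feature column to [0, 1] (assumed normalization)."""
    lo, hi = min(column), max(column)
    return [0.0 if hi == lo else (x - lo) / (hi - lo) for x in column]


def weighted_scores(rows, weights):
    """Normalize each feature column, then form one weighted sum per row."""
    columns = [min_max(col) for col in zip(*rows)]
    return [sum(w * v for w, v in zip(weights, vals)) for vals in zip(*columns)]


def k_means_1d(samples, n_clusters, seed=0, max_iter=100):
    """Steps (2)-(4): random centers, assign by distance, repeat until stable."""
    rng = random.Random(seed)
    centers = rng.sample(samples, n_clusters)          # step (2)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [                              # step (3)
            min(range(n_clusters), key=lambda k: abs(s - centers[k]))
            for s in samples
        ]
        if new_assignment == assignment:                # step (4): no change, stop
            break
        assignment = new_assignment
        for k in range(n_clusters):                     # step (4): readjust centers
            members = [s for s, a in zip(samples, assignment) if a == k]
            if members:
                centers[k] = sum(members) / len(members)
    return centers, assignment


# Hypothetical rows: (donation frequency, potential time, donor type, sex label).
rows = [
    (20, 10, 1, 0.68), (18, 14, 1, 0.68), (19, 12, 1, 0.68),
    (4, 60, 0, 0.32), (2, 90, 0, 0.32), (3, 75, 0, 0.32),
]
weights = [0.47, 0.28, 0.16, 0.09]                      # hypothetical AHP weights

scores = weighted_scores(rows, weights)                 # step (1): samples
centers, assignment = k_means_1d(scores, n_clusters=2)

# Grade = rank of the assigned cluster center (0 is the lowest-scoring group).
rank = {k: r for r, k in enumerate(sorted(range(len(centers)), key=lambda k: centers[k]))}
grades = [rank[a] for a in assignment]
print(grades)
```

Because the potential plasma donation time is normalized in its raw direction here, a feature where a smaller value is better would need to be inverted before weighting; the claims do not address this, so the sketch leaves it as is.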
5. A plasma donor evaluation system, characterized by comprising a data source acquisition module, a data preprocessing module, a data analysis module, an analytic hierarchy process module, and a main operation program module which are communicatively connected in sequence, wherein the data source acquisition module is used for acquiring a plurality of groups of data sources;
the data preprocessing module is used for performing preliminary data preprocessing on each group of data sources and obtaining the input features of each group of data;
the data analysis module is used for calculating correlation coefficients between different input features;
the analytic hierarchy process module is used for calculating the weight corresponding to each input feature in each group of data and obtaining a result set of the weighted summation;
the main operation program module is used for calling a machine learning algorithm, automatically clustering the result set of the weighted summation, and grading the plasma donors corresponding to each group of data;
the analytic hierarchy process module performs the following steps:
(a) establishing a judgment matrix according to the correlation coefficients and the input features;
(b) analyzing the consistency of the judgment matrix; if the consistency is not satisfied, re-establishing the judgment matrix; if the consistency is satisfied, calculating the weight of each input feature;
the consistency of the judgment matrix is analyzed as follows:
the random consistency ratio CR is calculated as:
CR = CI/RI;
CI = (λ_max - 1)/(n - 1);
wherein CR denotes the random consistency ratio, CI denotes the consistency index, RI denotes the average random consistency index, λ_max denotes the maximum eigenvalue of the judgment matrix, and n denotes the number of features;
if CR < 0.1, the consistency is considered satisfied; otherwise, the consistency is considered not satisfied;
the main operation program module performs the following steps when carrying out the automatic clustering:
(1) taking the result set obtained by the weighted summation of the input features as the samples of the machine learning algorithm;
(2) randomly generating N cluster centers;
(3) dividing the samples into N clusters according to the distance between each sample and each cluster center;
(4) determining whether any cluster has changed; if so, readjusting each cluster center; if not, outputting the cluster centers and taking them as the basis for division;
the input features of each group of data comprise plasma donation frequency, potential plasma donation time, plasma donor type, and sex;
the plasma donation frequency is taken as the main factor, and the correlation coefficients between the plasma donation frequency and each of the potential plasma donation time, the plasma donor type, and the sex are calculated.
6. The plasma donor evaluation system according to claim 5, further comprising a storage module for storing the grading results of the plasma donors in real time.
CN201911374472.2A 2019-12-27 2019-12-27 Plasma donor evaluation method and system Active CN111048207B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911374472.2A CN111048207B (en) 2019-12-27 2019-12-27 Plasma donor evaluation method and system

Publications (2)

Publication Number Publication Date
CN111048207A CN111048207A (en) 2020-04-21
CN111048207B true CN111048207B (en) 2023-06-16

Family

ID=70240564

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911374472.2A Active CN111048207B (en) 2019-12-27 2019-12-27 Plasma donor evaluation method and system

Country Status (1)

Country Link
CN (1) CN111048207B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111625719B (en) * 2020-05-21 2023-06-13 四川九八村信息科技有限公司 Propaganda channel expanding system and method for single plasma collecting station
CN113420722B (en) * 2021-07-21 2023-02-17 上海塞嘉电子科技有限公司 Emergency linkage method and system for airport security management platform
CN116895355B (en) * 2023-09-11 2023-12-08 山东优杰生物科技有限公司 Blood collection electronic information management system and method for blood collection vehicle

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101955998A (en) * 2010-07-20 2011-01-26 赵英仁 Method for detecting necrosis degree of hepatic cells through level of plasma free DNA
CN105243255A (en) * 2015-08-11 2016-01-13 北华航天工业学院 Evaluation method for soft foundation treatment scheme
CN110491493A (en) * 2019-08-15 2019-11-22 成都市佳颖医用制品有限公司 A kind of intelligence sampled plasma management system and management method

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105626009B (en) * 2015-12-17 2018-04-06 中国石油天然气股份有限公司 Quantitative evaluation method for single-well water injection oil replacement effect of fracture-cave carbonate reservoir
US10982286B2 (en) * 2016-01-22 2021-04-20 Mayo Foundation For Medical Education And Research Algorithmic approach for determining the plasma genome abnormality PGA and the urine genome abnormality UGA scores based on cell free cfDNA copy number variations in plasma and urine
CN106203867A (en) * 2016-07-19 2016-12-07 国家电网公司 Grid division methods based on power distribution network assessment indicator system and cluster analysis
CN107093005A (en) * 2017-03-24 2017-08-25 北明软件有限公司 The method that tax handling service hall's automatic classification is realized based on big data mining algorithm
US20190295727A1 (en) * 2018-03-23 2019-09-26 American Heart Association, Inc. System and Method for Assessing Heart Health and Communicating the Assessment to a Patient
CN112292697A (en) * 2018-04-13 2021-01-29 弗里诺姆控股股份有限公司 Machine learning embodiments for multi-analyte determination of biological samples
US11385237B2 (en) * 2018-06-05 2022-07-12 The Board Of Trustees Of The Leland Stanford Junior University Methods for evaluating glycemic regulation and applications thereof
CN109389282A (en) * 2018-08-17 2019-02-26 浙江华云信息科技有限公司 A kind of electric energy meter production firm evaluation method based on gauss hybrid models
CN109685318A (en) * 2018-11-26 2019-04-26 大连海洋大学 River Ecology health assessment method and its application based on ecosystem integrity
CN110309863B (en) * 2019-06-13 2023-08-08 上海交通大学 Identity credibility evaluation method based on analytic hierarchy process and gray correlation analysis
CN110569820A (en) * 2019-09-16 2019-12-13 四川九八村信息科技有限公司 Identification system and method for plasma supplier of plasma collecting station

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant