CN107491992B - Intelligent service recommendation method based on cloud computing - Google Patents

Intelligent service recommendation method based on cloud computing

Publication number
CN107491992B
Authority
CN
China
Prior art keywords
service
demander
characteristic data
data
selection decision
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710742043.0A
Other languages
Chinese (zh)
Other versions
CN107491992A (en
Inventor
初佃辉
张小东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Zhiyuxin Information Technology Co ltd
Original Assignee
Harbin Institute of Technology Weihai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology Weihai
Priority to CN201710742043.0A
Publication of CN107491992A
Application granted
Publication of CN107491992B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0282 Rating or review of business operators or products
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/22 Social work or social welfare, e.g. community support activities or counselling services

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Educational Administration (AREA)
  • Finance (AREA)
  • Game Theory and Decision Science (AREA)
  • Accounting & Taxation (AREA)
  • Health & Medical Sciences (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Child & Adolescent Psychology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides an intelligent service recommendation method based on cloud computing, which addresses the large computational load, low efficiency, low accuracy and instability of existing service recommendation algorithms. The method comprises training on the characteristic data of multiple precision services to obtain model parameters and storing those parameters in a model database. A logistic regression algorithm is used to make the precision service selection decision: after the service characteristic data of a demander is received, it is sent to each computing node; when the computation is finished, the results are sent to the reduce node for comprehensive analysis, and the required service is finally determined.

Description

Intelligent service recommendation method based on cloud computing
Technical Field
The invention relates to a recommendation system, in particular to an intelligent service recommendation method based on cloud computing.
Background
Information technology has brought great convenience to everyday life. As the volume of information on networks has grown massively, information overload has become a problem, and recommendation systems were developed so that users can accurately obtain the information they want. Because of their considerable commercial value, recommendation systems attract great interest in both academia and industry: academia has proposed many recommendation methods, and industry applies recommendation systems in many fields. Service recommendation is one application of recommendation systems. A service recommender is an intelligent program that solves selection decision problems among similar items and has a preliminary level of expertise. Machine learning supplies such systems with a number of classification methods that can be used to make actual predictions. Making fast, efficient and accurate service recommendations is the central challenge in this area.
Most existing recommendation methods suffer from low algorithmic efficiency, low accuracy, a large computational load and instability, so research into a new intelligent service recommendation algorithm is needed.
Disclosure of Invention
To solve the technical problems of large computational load, low efficiency, low accuracy and instability in existing service recommendation algorithms, the invention provides an intelligent service recommendation algorithm based on cloud computing that has a small computational load, high efficiency, high accuracy and good stability.
The invention comprises the following two steps:
(1) Model training
First, the characteristic data of multiple precision services is used for training. A distributed training method disperses the training tasks for the different precision service selection decision data across different computing nodes, yielding service selection model parameters for each type of service, which are stored in a model database. The model database is built from historical precision service selection decision data; for each service, the relevant service characteristic data is extracted, organized and updated regularly. The model database comprises a service characteristic training library and a service selection decision model library.
(2) Service selection decision
A logistic regression algorithm is used to make the precision service selection decision. After the service characteristic data of a demander is received, abnormal index information in the data is first filtered out according to a detection standard data model and the range of services that may be required is determined. The service characteristic data is then sent to each computing node; when the computation is finished, the results are sent to the reduce node for comprehensive analysis, and the required service is finally determined.
Preferably, the computation performed by each computing node in step (2) is the same. The model that each computing node uses for training and for making the precision service selection decision is as follows:
(1) fitting function
The fitting function is used to confine the computed probability to the interval [0, 1] and to push the result as close to 0 or 1 as possible:
$$h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}$$
From the fitting function, the decision result is:
$$y = \begin{cases} 1, & h_\theta(x) \ge f_t \\ 0, & h_\theta(x) < f_t \end{cases}$$
where x is the input service characteristic data, y is the service selection decision result, f_t is the service selection decision threshold, h_θ(x) is the fitting function, θ is the fitting parameter (that is, the service selection decision model parameter), and T denotes transposition;
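As a minimal sketch of the fitting function and threshold decision described above, the following Python assumes the usual logistic (sigmoid) form. The parameter values, the feature vector and the default threshold of 0.5 are hypothetical and chosen only for illustration.

```python
import math

def fitting_function(theta, x):
    """Sigmoid of the inner product theta^T x; the result lies in (0, 1)."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

def decide(theta, x, f_t=0.5):
    """Return 1 if the predicted probability reaches the threshold f_t, else 0."""
    return 1 if fitting_function(theta, x) >= f_t else 0

# Hypothetical values: theta[0] acts as a bias term paired with x[0] = 1.
theta = [0.0, 2.0, -1.0]
x = [1.0, 0.8, 0.3]
p = fitting_function(theta, x)
print(f"probability={p:.4f} decision={decide(theta, x)}")
```

Raising f_t makes the decision stricter: fewer services are judged to be required, at the cost of missing some that are.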
(2) loss function
Figure GDA0002789777330000023
When y = 1:
$$\mathrm{cost}(h_\theta(x), y) = -\log(h_\theta(x))$$
When y = 0:
$$\mathrm{cost}(h_\theta(x), y) = -\log(1 - h_\theta(x))$$
Merging the two cases gives the loss function:
$$\mathrm{cost}(h_\theta(x), y) = -y \log(h_\theta(x)) - (1 - y) \log(1 - h_\theta(x))$$
This loss function is adopted so that the loss approaches 0 when the prediction is close to the actual value and approaches infinity when the two differ greatly;
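The behaviour of the merged loss can be checked numerically. The short Python sketch below evaluates it for a few hypothetical predicted probabilities; the function name `cost` mirrors the formula, and the sample values are illustrative only.

```python
import math

def cost(h, y):
    """Merged loss -y*log(h) - (1-y)*log(1-h), for h strictly inside (0, 1)."""
    return -y * math.log(h) - (1 - y) * math.log(1 - h)

# Hypothetical predictions: confident and right, undecided, confident and wrong.
for h in (0.99, 0.5, 0.01):
    print(f"h={h:<4} cost(y=1)={cost(h, 1):.4f} cost(y=0)={cost(h, 0):.4f}")
```

When the true label is 1, a prediction of 0.99 costs almost nothing, while a prediction of 0.01 is penalized heavily, which is the property the text describes.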
(3) Fitting parameter θ
All parameters θ are updated simultaneously:
$$\theta_g := \theta_g - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_g^{(i)}$$
where α is a preset value; m is the number of service characteristic data x; i is the index of the service characteristic data, denoting the i-th service characteristic data; and g is the index of the service selection decision model parameter θ, denoting the g-th θ.
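The simultaneous update can be sketched in Python as follows. This assumes the standard batch gradient-descent step for logistic regression with the cross-entropy loss given earlier; the toy data set, the value of alpha and the number of iterations are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(theta, x):
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

def mean_cost(theta, X, Y):
    """Average cross-entropy loss over all samples."""
    total = 0.0
    for x, y in zip(X, Y):
        h = predict(theta, x)
        total += -y * math.log(h) - (1 - y) * math.log(1 - h)
    return total / len(X)

def gradient_step(theta, X, Y, alpha):
    """Compute every theta_g from the old theta, then replace them all at once."""
    m = len(X)
    preds = [predict(theta, x) for x in X]
    return [
        theta[g] - alpha * sum((preds[i] - Y[i]) * X[i][g] for i in range(m)) / m
        for g in range(len(theta))
    ]

# Hypothetical toy data: the first component of each sample is a constant 1 (bias).
X = [[1.0, 0.1], [1.0, 0.4], [1.0, 0.6], [1.0, 0.9]]
Y = [0, 0, 1, 1]
theta = [0.0, 0.0]
start = mean_cost(theta, X, Y)
for _ in range(200):
    theta = gradient_step(theta, X, Y, alpha=0.5)
print(f"cost before={start:.4f} after={mean_cost(theta, X, Y):.4f}")
```

Building the new parameter list from the predictions of the old theta is what makes the update simultaneous; updating theta in place inside the loop over g would mix old and new values.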
(4) Feature scaling
Feature scaling unifies the ranges of the independent variables (features) so that different features contribute comparably to the result;
$$x := \frac{x - \mu}{x_{\max} - x_{\min}}$$
where x is the input service characteristic data, x_max is its maximum value and x_min is its minimum value; since there are multiple x, μ denotes their average.
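A minimal Python sketch of this scaling, assuming the mean-and-range form implied by the symbols μ, x_max and x_min. The sample values and the handling of a constant feature (returning zeros) are illustrative assumptions, not taken from the patent.

```python
def scale_feature(values):
    """Subtract the mean and divide by the range (max - min) of one feature."""
    mu = sum(values) / len(values)
    spread = max(values) - min(values)
    if spread == 0:
        # Assumption: a constant feature carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - mu) / spread for v in values]

ages = [60, 70, 80, 90]  # hypothetical values of one service characteristic
scaled = scale_feature(ages)
print(scaled)
```

After scaling, every feature is centred on 0 and spans a range of exactly 1, so no single feature dominates the gradient step because of its units.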
(5) Description of algorithm implementation
MRLRFDD precision service selection decision algorithm
Input: the service characteristic data provided by all service demanders, X = (x_1, x_2, …, x_n), and whether each service is the required one, Z = (z_1, z_2, …, z_n)
Output: the prediction result H = (h_1, h_2, …, h_n)
①:for i=1:n
②:for j=1:m
③:
Figure GDA0002789777330000041
④:for i=1:Iteration
⑤:
Figure GDA0002789777330000042
⑥:for g=1:m
⑦:
Figure GDA0002789777330000043
⑧: Substitute the service characteristic data provided by the current service demander into h_θ(x) to obtain the prediction result H;
where x_1, x_2, …, x_n are the service characteristic data provided by the 1st, 2nd, …, n-th service demanders, and z_1, z_2, …, z_n indicate whether the service is the one required by the 1st, 2nd, …, n-th service demanders; n in X denotes the n service characteristic data, in one-to-one correspondence with n in Z, n in H and n in the for loop; m denotes the number of components in each submitted service characteristic data x_i; i is the index of the service characteristic data, denoting the i-th service characteristic data, i = 1, 2, …, n; g is the index of the service selection decision model parameter θ, denoting the g-th θ; and x_ij is the j-th component of the i-th submitted service characteristic data.
Preferably, building on the cloud computing based precision service selection decision algorithm (MRLRFDD), a collaborative filtering recommendation algorithm for multidimensional vectors associated with the precision service selection decision (UDMDCFUB) is established through the following steps:
(1) Let C be the set of all service demanders in the system and S be the set of all service schemes that can be recommended to them. In practice, both C and S are usually large. A utility function u computes the degree to which a service scheme s is recommended to a service demander c, that is, u: C × S → R, where R is a totally ordered set of non-negative real numbers within a certain range. The recommendation problem is to find, for each demander, the service scheme s* with the maximum recommendation degree, as shown in formula (1):
$$\forall c \in C, \quad s_c^{*} = \arg\max_{s \in S} u(c, s) \tag{1}$$
(2) Each demander has a score for each service scheme (0 means not scored). The scores of one demander for the service schemes can be represented as a one-dimensional matrix, S_i' = (s_(i,1), s_(i,2), …, s_(i,m)), and the scores of all demanders can be represented as a multi-dimensional matrix, S = (S_1', S_2', …, S_i', …, S_n'), where m is the number of scores one demander gives to the service schemes and n is the number of demanders, matching the m and n used in the matrices above. A flag matrix F is set, in which F(i, j) indicates whether demander i has scored service scheme j: F(i, j) = 1 if demander i has scored service scheme j, and F(i, j) = 0 otherwise;
(3) Mean normalization limits the processed data to a certain range, which simplifies subsequent data processing and speeds up convergence when the program runs;
$$P_j'' = P_j' - a_j \tag{2}$$
where j denotes the j-th service scheme, a_j is the average score of the j-th service scheme, P_j' is the set of all demanders' scores for the j-th service scheme, and P_j'' is the set of all demanders' scores for the j-th service scheme after normalization;
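The flag matrix and the mean normalization of formula (2) can be sketched together in Python. The score matrix below is hypothetical, and averaging only over scored entries (those with F(i, j) = 1) is an assumption, since the text does not say whether unscored zeros enter the average.

```python
def build_flags(S):
    """F[i][j] = 1 if demander i scored scheme j (a score of 0 means not scored)."""
    return [[1 if s > 0 else 0 for s in row] for row in S]

def mean_normalize(S, F):
    """Subtract each scheme's average score a_j from its scored entries."""
    n, m = len(S), len(S[0])
    averages = []
    for j in range(m):
        rated = [S[i][j] for i in range(n) if F[i][j] == 1]
        averages.append(sum(rated) / len(rated) if rated else 0.0)
    normalized = [
        [S[i][j] - averages[j] if F[i][j] == 1 else 0.0 for j in range(m)]
        for i in range(n)
    ]
    return normalized, averages

# Hypothetical scores: rows are demanders, columns are service schemes.
S = [[5, 0, 3],
     [4, 2, 0],
     [0, 1, 4]]
F = build_flags(S)
P, a = mean_normalize(S, F)
print(a)
print(P)
```

Keeping the averages a_j allows them to be added back when a predicted score is reported on the original scale.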
(4) Learning the parameters X and θ. Let the data set of service characteristic data describing the demander's preferences be X = (x_1, x_2, …, x_m), where m is the number of service characteristic data, and let the data set of service selection decision model parameters be θ = (θ_1, θ_2, …, θ_n), where n is the number of parameters of the service selection decision model. Both data sets are initialized, and the parameters X and θ are learned by gradient descent;
Figure GDA0002789777330000052
Figure GDA0002789777330000053
where s(i, j) is the score given by demander i to service scheme j. After training, the parameters X and θ are obtained, and X × θ gives the demander's degree of preference for the service scheme; β is a coefficient, λ is a given parameter, and k denotes the k-th term;
(5) Description of the algorithm
UDMDCFUB collaborative filtering recommendation algorithm for multidimensional vectors
Input: the scores of all service demanders for the service schemes, S = (S_1', S_2', …, S_n'), and the flag matrix F, where n is the number of demanders;
Output: the predicted scores of a demander for all service schemes
1:for i=1:n
2:for j=1:m
3:s(i,j)=s(i,j)-aj
4:for i=1:Iteration
5:
Figure GDA0002789777330000061
6:
Figure GDA0002789777330000062
7: predicted score = θ^T X + μ;
where s(i, j) is the score given by demander i to service scheme j, and i and j have the same meaning as in formulas (3) and (4).
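The overall flow of the listing (normalize, iterate gradient updates on X and θ, then predict) can be sketched in Python. Because the update formulas themselves are not reproduced in this text, the sketch substitutes a standard regularized low-rank collaborative filtering update. The function names, the vector length k, the step size beta, the regularization weight lam, the score matrix, and the reading of μ as the per-scheme average a_j are all assumptions.

```python
import random

def train_cf(S, F, k=2, beta=0.02, lam=0.01, iterations=3000, seed=0):
    """Learn demander vectors X and scheme vectors Theta by batch gradient descent."""
    rng = random.Random(seed)
    n, m = len(S), len(S[0])
    # Subtract each scheme's average score a_j, using scored entries only.
    a = []
    for j in range(m):
        rated = [S[i][j] for i in range(n) if F[i][j]]
        a.append(sum(rated) / len(rated) if rated else 0.0)
    X = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n)]
    Theta = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(m)]
    for _ in range(iterations):
        # Each gradient starts from its regularization term lam * parameter.
        grad_X = [[lam * v for v in row] for row in X]
        grad_T = [[lam * v for v in row] for row in Theta]
        for i in range(n):
            for j in range(m):
                if not F[i][j]:
                    continue  # unscored entries do not contribute to the error
                pred = sum(t * x for t, x in zip(Theta[j], X[i]))
                err = pred - (S[i][j] - a[j])
                for q in range(k):
                    grad_X[i][q] += err * Theta[j][q]
                    grad_T[j][q] += err * X[i][q]
        X = [[v - beta * g for v, g in zip(r, gr)] for r, gr in zip(X, grad_X)]
        Theta = [[v - beta * g for v, g in zip(r, gr)] for r, gr in zip(Theta, grad_T)]
    return X, Theta, a

def predict_score(X, Theta, a, i, j):
    """Predicted score of demander i for scheme j, back on the original scale."""
    return sum(t * x for t, x in zip(Theta[j], X[i])) + a[j]

def observed_error(S, F, X, Theta, a):
    """Sum of squared errors over the entries that were actually scored."""
    return sum(
        (predict_score(X, Theta, a, i, j) - S[i][j]) ** 2
        for i in range(len(S)) for j in range(len(S[0])) if F[i][j]
    )

# Hypothetical scores: rows are demanders, columns are schemes, 0 = not scored.
S = [[5, 3, 0, 1], [4, 0, 0, 1], [1, 1, 0, 5], [1, 0, 0, 4], [0, 1, 5, 4]]
F = [[1 if s > 0 else 0 for s in row] for row in S]
X, Theta, a = train_cf(S, F)
print(f"training error: {observed_error(S, F, X, Theta, a):.4f}")
print(f"predicted score of demander 1 for scheme 1: {predict_score(X, Theta, a, 1, 1):.2f}")
```

Only scored entries drive the updates, so the learned vectors can then fill in the unscored positions, which is what makes a recommendation possible.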
From the perspective of precision service recommendation over the network, the invention provides a new cloud computing based service decision and service recommendation algorithm that enables service demanders to obtain the best service supply. The algorithm is simple, highly accurate and stable.
Detailed Description
The method is divided into two stages, summarized below:
1. Map-reduce based precision service decision algorithm (MRLRFDD). Many kinds of services are in common use today, and each must deliver precisely targeted content. A logistic regression algorithm is therefore used to find the characteristic equation of each service: every service is trained to obtain its parameters, which are stored in a service selection model parameter library for use in decisions. The algorithm has two distinguishing characteristics. First, there are many service types and a large amount of training data, and to keep recommendations precise the models must be retrained regularly with newly obtained data. Second, a trained service selection model cannot be chosen directly from the service characteristics: because a demander may need any of the services, the demander's service requirement characteristic data must be substituted into every model, and the results combined to obtain the required service. To address both characteristics, the invention designs a map-reduce based logistic regression algorithm. Each computing node first trains on the characteristic data of multiple precision services to obtain model parameters, which are stored in the model database. After the service characteristic data of a demander is received, it is sent to each computing node; when the computation is finished, the results are sent to the reduce node for comprehensive analysis, and the required service is finally determined.
2. Collaborative filtering recommendation algorithm for multidimensional vectors associated with the precision service decision (UDMDCFUB). The service schemes offered by all service providers are filtered according to the result of the precision service decision algorithm. The historical selections and evaluations of many similar service demanders are then combined to form the training data for the multidimensional-matrix collaborative filtering recommendation algorithm. After a series of calculations, the algorithm yields the demander's personal preference attribute values for service schemes and the attribute values of each aspect of the service schemes. Once the system has inferred the demander's precise need, it computes the probability that the demander will select a given service scheme from the demander's personal preferences and the scheme's attribute values, and recommends service schemes to the demander according to a preset service recommendation probability threshold.
To achieve this aim, the invention adopts the following technical scheme:
1. Establishing the service characteristic training library and the service selection decision model library. Making precision service selection decisions with a logistic regression algorithm requires a large amount of training data. A sample library is therefore built from historical precision service decision data and updated regularly, so that the latest training data is available and decisions become more accurate. The sample library is drawn from common, reliable precision service selection decision data, but this information usually cannot be used directly: the data items and related indices examined differ from service to service, so different data must be extracted and organized for each service. The invention stores the data items to be extracted for each service, the extraction rules, the most recently trained parameters and other such information in the model database.
2. Accurate service decision algorithm based on map-reduce
The precision service selection decision has two steps: model training and service selection decision. Model training produces the model parameters; substituting these parameters and the requirement data provided by the service demander into the precision service selection decision equation gives the demander's precise service need. The more data is used for training, the more accurate the result, and the number of common service requirements reaches hundreds of thousands, so training is in effect a large-scale data mining process, and training on a single node is inefficient. In addition, the exact range of services cannot be determined in advance when making the decision, and online precision service selection over the internet is time-sensitive. The invention therefore adopts a map-reduce based logistic regression precision service selection decision algorithm. Model training uses a distributed training method: the training tasks for the different precision service selection decision data are dispersed across different computing nodes, and the training results are stored in the model database. Because of the uncertainty in the decision, the precision service selection decision is made in two phases. First, abnormal index information in the detection data is filtered out according to the detection standard data model, and the range of services that may be needed is determined. The data is then sent to the different computing nodes for the precision service selection decision. Because the computation is the same on every node, the model that each node uses for training and for making precision service decisions is listed below.
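The decision phase described above can be sketched as a map step followed by a reduce step. The Python below runs both in a single process purely to show the data flow; the service names, model parameters, feature vector and threshold are hypothetical, and a real deployment would dispatch the map calls to separate computing nodes.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def map_node(service, theta, features):
    """Map step: one computing node scores the demander against one service model."""
    z = sum(t * x for t, x in zip(theta, features))
    return service, sigmoid(z)

def reduce_node(scored, f_t=0.5):
    """Reduce step: keep services whose probability reaches f_t, most probable first."""
    kept = [(service, p) for service, p in scored if p >= f_t]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Hypothetical model database: one trained parameter vector per service type.
model_db = {
    "home_care": [-1.0, 2.0, 0.5],
    "meal_delivery": [0.5, -2.0, 0.1],
    "medical_visit": [-0.5, 1.0, 1.0],
}
features = [1.0, 0.9, 0.4]  # hypothetical demander data, first entry is the bias input

scored = [map_node(name, theta, features) for name, theta in model_db.items()]
required = reduce_node(scored)
print(required)
```

Because each map call depends only on its own model and the shared feature vector, the calls are independent and can run in parallel, which is the property the distributed design relies on.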
(1) Fitting function
The method adopts the fitting function to concentrate the probability obtained by calculation in the interval of [0,1] and make the calculation result approach to 0 or 1 as much as possible.
$$h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}$$
From the fitting function, the decision result is:
$$y = \begin{cases} 1, & h_\theta(x) \ge f_t \\ 0, & h_\theta(x) < f_t \end{cases}$$
where x is the input service characteristic data, y is the service selection decision result, and f_t is the service selection decision threshold.
(2) Loss function
Such a loss function is employed in order that the loss approaches 0 when the prediction result is close to the actual value and approaches infinity when the prediction result is very different from the actual value.
Figure GDA0002789777330000083
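When y = 1:
$$\mathrm{cost}(h_\theta(x), y) = -\log(h_\theta(x))$$
When y = 0:
$$\mathrm{cost}(h_\theta(x), y) = -\log(1 - h_\theta(x))$$
Merging the two cases gives the loss function:
$$\mathrm{cost}(h_\theta(x), y) = -y \log(h_\theta(x)) - (1 - y) \log(1 - h_\theta(x))$$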
(3) Fitting parameter θ
All parameters θ are updated simultaneously:
$$\theta_g := \theta_g - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_g^{(i)}$$
(4) Feature scaling
Feature scaling unifies the ranges of the independent variables (features) so that different features contribute comparably to the result.
$$x := \frac{x - \mu}{x_{\max} - x_{\min}}$$
(μ is the average of the multiple x values)
(5) Description of algorithm implementation
MRLRFDD precision service selection decision algorithm
Input: the service requirement items provided by all service demanders, X = (x_1, x_2, …, x_n), and whether each service is the required one, Z = (z_1, z_2, …, z_n)
Output: the prediction result H = (h_1, h_2, …, h_n)
①:for i=1:n
②:for j=1:m
③:
Figure GDA0002789777330000095
④:for i=1:Iteration
⑤:
Figure GDA0002789777330000101
⑥:for j=1:m
⑦:
Figure GDA0002789777330000102
⑧: Substitute the service characteristic data provided by the current service demander into h_θ(x) to obtain the prediction result H;
where x_1, x_2, …, x_n are the service characteristic data provided by the 1st, 2nd, …, n-th service demanders, and z_1, z_2, …, z_n indicate whether the service is the one required by the 1st, 2nd, …, n-th service demanders; n in X denotes the n service characteristic data, in one-to-one correspondence with n in Z, n in H and n in the for loop; m denotes the number of components in each submitted service characteristic data x_i; i is the index of the service characteristic data, denoting the i-th service characteristic data, i = 1, 2, …, n; and j is the index of the service selection decision model parameter θ, denoting the j-th θ.
3. Collaborative filtering recommendation algorithm (UDMDCFUB) of multidimensional vectors associated with precise service decisions.
(1) Let C be the set of all service demanders in the system and S be the set of all service schemes that can be recommended to them. In practice, both C and S are usually large. A utility function u computes the degree to which a service scheme s is recommended to a service demander c, that is, u: C × S → R, where R is a totally ordered set of non-negative real numbers within a certain range. The recommendation problem is to find, for each demander, the service scheme s* with the maximum recommendation degree, as shown in formula (1):
$$\forall c \in C, \quad s_c^{*} = \arg\max_{s \in S} u(c, s) \tag{1}$$
(2) Each demander has a score for each service scheme (0 means not scored). The scores of one demander for the service schemes can be represented as a one-dimensional matrix, S_i' = (s_(i,1), s_(i,2), …, s_(i,m)), and the scores of all demanders can be represented as a multi-dimensional matrix, S = (S_1', S_2', …, S_i', …, S_n'), where m is the number of scores one demander gives to the service schemes and n is the number of demanders, matching the m and n used in the matrices above. A flag matrix F is set, in which F(i, j) indicates whether demander i has scored service scheme j: F(i, j) = 1 if demander i has scored service scheme j, and F(i, j) = 0 otherwise.
(3) Mean normalization limits the processed data to a certain range, which simplifies subsequent data processing and speeds up convergence when the program runs.
$$P_j'' = P_j' - a_j \tag{2}$$
where j denotes the j-th service scheme, a_j is the average score of the j-th service scheme, P_j' is the set of all demanders' scores for the j-th service scheme, and P_j'' is the set of all demanders' scores for the j-th service scheme after normalization;
(4) Learning the parameters X and θ.
Let the data set of service characteristic data describing the demander's preferences be X = (x_1, x_2, …, x_m), where m is the number of service characteristic data, and let the data set of service scheme attribute parameters be θ = (θ_1, θ_2, …, θ_n), where n is the number of parameters of the service selection decision model. Both data sets are initialized, and the parameters X and θ are learned by gradient descent.
Figure GDA0002789777330000111
Figure GDA0002789777330000112
Note: s(i, j) is the score given by demander i to service scheme j. After training, the parameters X and θ are obtained, and X × θ gives the demander's degree of preference for the service scheme; β is a coefficient, λ is a given parameter, and k denotes the k-th term;
(5) Description of the algorithm
UDMDCFUB algorithm
Input: the scores of all service demanders for the service schemes, S = (S_1, S_2, …, S_n), and the flag matrix F, where n is the number of scores given by all service demanders to the service schemes;
Output: the predicted scores of a demander for all service schemes
1:for i=1:n
2:for j=1:m
3:s(i,j)=s(i,j)-aj
4:for i=1:Iteration
5:
Figure GDA0002789777330000121
6:
Figure GDA0002789777330000122
7: predicted score = θ^T X + μ.
where s(i, j) is the score given by demander i to service scheme j, and i and j have the same meaning as in formulas (3) and (4).
The technical solution of the invention is described in detail below through a specific embodiment.
The core of the invention is to derive service characteristics and build a service characteristic model by studying a large number of service cases, so that when a service demander raises a service demand with the corresponding characteristics, the corresponding service can be recommended. To ensure that the designed algorithm is usable in practice, an accurate information source must be found and the resulting service taken as a case. The implementation is therefore explained using elderly care services as an example. By mining the historical information in diagnosis and treatment records, the associations between detection indices (such as symptoms and biochemical pathology results) and the related diseases are obtained and a mathematical model is built. The current patient's detection indices are then input, and the probability that the patient has one or more diseases is computed. Once the diagnosis is confirmed, it enters the medical service recommendation system as input, and the system selects, from many medical service schemes, a scheme suited to the patient's current disease and recommends it, using the collaborative filtering recommendation algorithm for multidimensional vectors associated with disease diagnosis and a large number of patient evaluations. The following steps are required.
1. Building the training sample database.
The training sample database contains two kinds of data. The first is sample data prepared for disease diagnosis, which is extracted, cleaned and organized from the patient physical examination database and includes routine urine tests, routine blood tests and the like. The second is a patient evaluation sample library used for recommending medical service schemes, which includes medical service unit information (unit name, level, location and so on), medical service schemes (medical service team information, treatment scheme, price, patient cure rate and so on), the users' evaluation matrix for the medical service schemes, and so on.
2. Establishing the model database.
During model training, the variables and related parameters differ from disease to disease and may be adjusted at each stage, so they cannot be hard-coded in the program. The invention stores the variables, the parameters, the number of variables, the number of parameters and the mapping between variables and training samples in a model database. The model database serves two purposes: (1) at each training stage, the variables, the number of variables and the mapping between variables and the training data set are extracted for training, and the training results are stored back in the corresponding parameter table of the model database; (2) at each diagnosis, the variables, parameters, number of variables and number of parameters are extracted from the model library to build a dynamic model and compute the disease probability.
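One way to picture such a model database is as a record per disease that names its variables and holds its current parameters, from which a model is assembled at diagnosis time. The Python sketch below is a hypothetical illustration: the record layout, variable names and parameter values are invented for the example and carry no clinical meaning.

```python
import math

def build_model(record):
    """Assemble a probability function from the variables and parameters on record."""
    variables = record["variables"]
    parameters = record["parameters"]  # assumed layout: bias first, then one per variable
    if len(parameters) != len(variables) + 1:
        raise ValueError("parameter count must be number of variables plus one")

    def probability(patient):
        x = [1.0] + [patient[name] for name in variables]
        z = sum(t * xi for t, xi in zip(parameters, x))
        return 1.0 / (1.0 + math.exp(-z))

    return probability

def store_training_result(model_db, disease, parameters):
    """Write newly trained parameters back into the record for this disease."""
    model_db[disease]["parameters"] = list(parameters)

# Hypothetical model database with a single record; values are illustrative only.
model_db = {
    "hypertension": {
        "variables": ["systolic_bp", "age"],
        "parameters": [-8.0, 0.05, 0.02],
    },
}

model = build_model(model_db["hypertension"])
high = model({"systolic_bp": 160, "age": 65})
low = model({"systolic_bp": 110, "age": 30})
print(f"high-risk example={high:.3f} low-risk example={low:.3f}")
```

Because the variables and parameters live in data rather than in code, retraining or adding a disease only changes records in the database, which matches the stated reason for not fixing them in the program.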
3. Different diseases are associated with different symptoms and use different sample data, so the number and values of the trained parameters θ differ. The invention therefore adopts a map-reduce based disease diagnosis algorithm in which each computing node uses a logistic regression algorithm and computes in stages to obtain a trained model; that is, multiple nodes train on multiple diseases. At each stage, each node takes X_i = (x_{i,j+1}, x_{i,j+2}, …, x_{i,j+k}) from the model database, maps it to the sample database, extracts the latest samples and trains, and stores the resulting parameters θ_i = (θ_{i,j+1}, θ_{i,j+2}, …, θ_{i,j+k}) back into the model library. To diagnose a patient, the patient's condition data is first input and sent to each map computing node. Each computing node extracts the data relevant to the diseases it can handle, judges whether the disease is present, and sends the result to the reduce node for screening, which determines the patient's disease according to the probabilities.
4. The system sends the disease obtained in step 3, together with the patient information, to a service recommendation node. The service recommendation node extracts the associated evaluation matrix according to the patient's previous diagnosis and treatment information and the evaluations that past patients with the same disease gave to its diagnosis and treatment, then performs the calculation with the multi-dimensional vector collaborative filtering recommendation algorithm and recommends the most suitable diagnosis and treatment service scheme to the patient.
The algorithms involved in the present invention are illustrated below with an example. In this example, three common diseases are used for testing: hypertension (coded I10.x02), coronary heart disease (coded I25.101), and nephropathy (coded N28.901). These diseases are related: for example, long-term hypertension may affect the heart and kidneys, and in severe cases may lead to disease of those organs. To enable accurate diagnosis, training and the corresponding tests use the routine blood and routine urine sample libraries provided by the relevant medical department. The sample data mapping is shown in Table 1. Three cloud computing nodes are used, and data (physical examination, routine blood, and routine urine) extracted from the three sample libraries according to the mapping in Table 1 are used to train and diagnose the three diseases respectively.
TABLE 1 sample data mapping Table
Figure GDA0002789777330000141
Once a disease is diagnosed, the patient must be offered the clinical care schemes available from hospitals. Table 2 abstracts six typical hospital-provided medical care schemes, each code representing one scheme; for example, Jh_s1 represents the medical care scheme for hypertension provided by a department of a hospital, and the code corresponds to the details of the services that department provides for hypertension. Because diseases are numerous and medical institutions are many, the medical care schemes form a database with a huge data volume, and the evaluation matrix generated from them is also very large. Therefore, when a service scheme is recommended, the service scheme matrix is first extracted according to the disease, and then the multi-dimensional evaluation vector matrix is extracted.
TABLE 2 medical care plan table
Figure GDA0002789777330000142
Figure GDA0002789777330000151
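The extraction step described before Table 2 (first select the schemes for the diagnosed disease, then pull out only the matching evaluations) might look like the following Python sketch. Apart from the code Jh_s1 and the three disease codes, every name and value here is hypothetical: the catalogue, the patient ids, the scores, and the function name are illustrative assumptions.

```python
# Hypothetical catalogue: scheme code -> disease code it addresses.
# Only Jh_s1 (hypertension) is named in the text; the others are made up.
SCHEME_DISEASE = {
    "Jh_s1": "I10.x02",
    "Xx_s1": "I10.x02",
    "Xx_s2": "I25.101",
    "Xx_s3": "N28.901",
}

# Hypothetical evaluation store: patient id -> {scheme code: score from 1 to 5}.
RATINGS = {
    "p1": {"Jh_s1": 5, "Xx_s2": 3},
    "p2": {"Xx_s1": 4, "Jh_s1": 2},
    "p3": {"Xx_s3": 5},
}

def extract_evaluation_matrix(disease_code, scheme_disease, ratings):
    """Return (schemes, patients, matrix) restricted to one disease.

    A 0 in the matrix means the patient has not scored that scheme.
    """
    schemes = sorted(c for c, d in scheme_disease.items() if d == disease_code)
    patients = sorted(p for p, r in ratings.items() if any(s in r for s in schemes))
    matrix = [[ratings[p].get(s, 0) for s in schemes] for p in patients]
    return schemes, patients, matrix
```

Filtering by disease first keeps the matrix handed to the recommendation step small, which is the motivation the text gives for not working on the full evaluation database.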
After the preparation is complete, data are extracted from the sample library to illustrate the two algorithms.
1) MRLRFDD algorithm: 130 records are randomly extracted from the sample library to train hypertension diagnosis, 200 records to train coronary heart disease diagnosis, and 50 records to train kidney disease diagnosis. The following θ values are obtained from formula (2) in the MRLRFDD algorithm: hypertension (0.29042339526016764, 2.320070076449664, 3.1247681396613323, 1.9912852617768821, -2.939785693846358, 2.546702905413339), coronary heart disease (-0.5732159592392719, 3.963633095493384, 3.3987536729350865, 3.8920547540621193, 3.398593162956388), and kidney disease (0.2561313490409353, 1.6548079722273288, 4.911460779107366, 4.2401342923473715, 3.906841801494906, 2.8548633117398188). Disease prediction is then performed on 10 suspected patients using formula (2) and formula (1) in the MRLRFDD algorithm; the results are shown in Table 3.
TABLE 3 comparison of disease diagnosis results
Figure GDA0002789777330000152
Figure GDA0002789777330000161
2) UDMDCFUB algorithm: after the diagnosis is confirmed, a multi-dimensional evaluation matrix is filtered from the evaluation sample library according to Table 2, and the recommendation prediction matrices of the medical care service schemes for patients with hypertension, coronary heart disease, and nephropathy are obtained according to formulas (3) and (4) in the UDMDCFUB algorithm.
Hypertension healthcare service plan prediction matrix:
Figure GDA0002789777330000162
coronary heart disease medical service scheme prediction matrix:
Figure GDA0002789777330000163
nephropathy healthcare service plan prediction matrix:
Figure GDA0002789777330000164
It can be seen from the above prediction matrices that the medical care schemes offered by the military hospitals and the third hospitals are popular, and that some specialized hospitals have their own advantages in treating certain diseases; for example, the people's hospitals are generally favored by patients for treating kidney disease, with the recommendation score reaching the full score of 5.
However, the above description is only exemplary of the present invention, and the scope of the present invention should not be limited thereby, and the replacement of the equivalent components or the equivalent changes and modifications made according to the protection scope of the present invention should be covered by the claims of the present invention.

Claims (3)

1. An intelligent service recommendation method based on cloud computing is characterized by comprising the following two steps:
(1) model training
Firstly, training a plurality of accurate service characteristics, dispersing training tasks of different accurate service selection decision data to different computing nodes by adopting a distributed training method to obtain different types of service selection model parameters, and storing the different types of service selection model parameters into a model database; the model database is established according to historical accurate service selection decision data, and different service characteristic data are extracted, arranged and updated regularly aiming at different services; the model database comprises a service characteristic training library and a service selection decision model library;
(2) service selection decision
Performing the accurate service selection decision by using a logistic regression algorithm: after receiving the service characteristic data of the demander, first filtering the abnormal index information in the detected service characteristic data according to a detection standard data model and determining the required service range, then sending the service characteristic data to each computing node, sending the results to the reduce node for comprehensive analysis after the computation is finished, and finally determining the required service.
2. The intelligent service recommendation method based on cloud computing as claimed in claim 1, wherein the computing process of each computing node in step (2) is the same, and the model for each computing node to train and make accurate service selection decision is listed as follows:
(1) fitting function
The method adopts a fitting function for concentrating the probability obtained by calculation in a [0,1] interval and leading the calculation result to approach 0 or 1 as much as possible;
Figure FDA0002789777320000011
from the fitting function:
Figure FDA0002789777320000012
wherein x is the input service characteristic data, y is the service selection decision result, and $f_t$ is the service selection decision threshold; $h_\theta(x)$ is the fitting function, θ is the fitting parameter, namely the service selection decision model parameter, and T denotes transposition;
(2) loss function
Figure FDA0002789777320000021
When y is 1:
Figure FDA0002789777320000022
when y is 0:
Figure FDA0002789777320000023
merging the loss functions:
$\mathrm{cost}(h_\theta(x), y) = -y \log(h_\theta(x)) - (1 - y) \log(1 - h_\theta(x))$
the loss function is adopted, so that when the prediction result is close to the actual value, the loss approaches to 0, and when the difference between the prediction result and the actual value is very large, the loss approaches to infinity;
(3) fitting parameter theta
Updating all parameters theta simultaneously
Figure FDA0002789777320000024
Wherein alpha is a set threshold value; m is the number of service characteristic data x; i is a subscript of the service characteristic data and represents the ith service characteristic data; g is a subscript of the service selection decision model parameter θ, representing the g-th θ;
(4) feature scaling
Feature scaling is a method used to unify the independent variables or feature ranges, making the different features have the same effect on the differences;
Figure FDA0002789777320000025
where x is the input service characteristic data, $x_{max}$ is its maximum value, and $x_{min}$ is its minimum value; since there are multiple x, μ denotes the average of the multiple x;
(5) description of algorithm implementation
MRLRFDD accurate service selection decision algorithm
Input: service characteristic data provided by all service demanders, namely $X = (x_1, x_2, \ldots, x_n)$, and whether it is the desired service, $Z = (z_1, z_2, \ldots, z_n)$
Output: the predicted result $H = (h_1, h_2, \ldots, h_n)$
①:for i=1:n
②:for j=1:m
③:
Figure FDA0002789777320000031
④:for i=1:Iteration
⑤:
Figure FDA0002789777320000032
⑥:for g=1:m
⑦:
Figure FDA0002789777320000033
⑧: substituting the service characteristic data provided by the current service demander into $h_\theta(x)$ to obtain the prediction result H;
wherein $x_1, x_2, \ldots, x_n$ refer to the service characteristic data provided by the 1st, 2nd, …, nth service demanders, and $z_1, z_2, \ldots, z_n$ indicate whether the service is the one needed by the 1st, 2nd, …, nth service demanders; n in X represents the number of service characteristic data and corresponds one-to-one with n in Z, n in H, and n in the for loop; m represents the m components of each submitted service characteristic data $x_i$; i is a subscript of the service characteristic data, denoting the ith service characteristic data, i = 1, 2, …, n; g is a subscript of the service selection decision model parameter θ, denoting the gth θ; $x_{ij}$ is the jth service characteristic data submitted for the ith time.
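For readers who want to see the training loop of claim 2 in executable form, here is a minimal pure-Python sketch. The formula images are not reproduced in this text, so the sketch assumes the standard logistic regression forms: a sigmoid fitting function, the log loss stated in the claim, batch gradient descent with simultaneous updates, and mean normalisation $(x - \mu)/(x_{max} - x_{min})$ for feature scaling. The function names, the default step size `alpha`, the iteration count, and the added bias term are illustrative assumptions, not values from the patent.

```python
import math

def scale_features(rows):
    """Mean-normalise each column: (x - mean) / (max - min)."""
    stats = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        spread = (max(col) - min(col)) or 1.0  # guard against a constant column
        stats.append((mean, spread))
    scaled = [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]
    return scaled, stats

def h(theta, x):
    """Fitting function: sigmoid of theta^T x, where x[0] == 1 is the bias input."""
    z = sum(t * v for t, v in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

def train(rows, labels, alpha=0.5, iterations=2000):
    """Batch gradient descent on the log loss; every theta_g is updated together."""
    xs = [[1.0] + list(row) for row in rows]
    theta = [0.0] * len(xs[0])
    count = len(xs)
    for _ in range(iterations):
        errors = [h(theta, x) - y for x, y in zip(xs, labels)]
        grad = [sum(e * x[g] for e, x in zip(errors, xs)) / count
                for g in range(len(theta))]
        theta = [t - alpha * d for t, d in zip(theta, grad)]
    return theta

def predict(theta, row, threshold=0.5):
    """Decision: 1 if the fitted probability reaches the threshold, else 0."""
    return 1 if h(theta, [1.0] + list(row)) >= threshold else 0
```

In a distributed setting, one such `train` call would run per computing node and per service type, with the resulting `theta` written back to the model database.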
3. The intelligent service recommendation method based on cloud computing as claimed in claim 2, wherein based on the implementation of the precise service selection decision algorithm based on cloud computing (MRLRFDD), a collaborative filtering recommendation algorithm (UDMDCFUB) of multidimensional vector associated with the precise service selection decision is established, and the following steps are required:
(1) setting C as the set of all service demanders in the system, and S as the set of service schemes that can be recommended to the service demanders; the utility function u can be used to calculate the recommendation degree of a service scheme in S to a service demander in C, namely u: C × S → R, where R is a totally ordered set of non-negative real numbers within a certain range; the problem to be studied in recommendation is to find the objects $S^*$ with the maximum recommendation degree R, as shown in formula (1):
Figure FDA0002789777320000041
(2) each demander has a score for each service scheme, with 0 indicating that no score has been given; the scores of a demander i for the service schemes can be given in the form of a one-dimensional matrix, i.e. $S_i' = (s_{(i,1)}, s_{(i,2)}, \ldots, s_{(i,m)})$, and the scores of all demanders for the service schemes can be expressed in the form of a multi-dimensional matrix, i.e. $S = (S_1', S_2', \ldots, S_i', \ldots, S_n')$, wherein m is the number of scores of a demander for the service schemes and n is the number of demanders, in correspondence with m and n; setting a flag matrix F, where F(i, j) indicates whether demander i has scored service scheme j: F(i, j) = 1 when demander i has scored service scheme j, and F(i, j) = 0 when demander i has not scored service scheme j;
(3) mean value normalization processing, namely limiting the processed data within a certain range, wherein the normalization is to facilitate the subsequent data processing and ensure the accelerated convergence when the program runs;
$P_j'' = P_j' - a_j$ (2)
wherein j denotes the jth service scheme, $a_j$ is the average of the scores for the jth service scheme, $P_j'$ is all scores given by all demanders to the jth service scheme, and $P_j''$ is all scores given by all demanders to the jth service scheme after normalization;
(4) learning parameters X and θ; let the service characteristic data set of the demander's preference be $X = (x_1, x_2, \ldots, x_m)$, where m is the number of service characteristic data, and let the data set of service selection decision model parameters be $\theta = (\theta_1, \theta_2, \ldots, \theta_n)$, where n is the number of service selection decision model parameters; the two data sets are initialized, and the parameters X and θ are learned by a gradient descent method;
Figure FDA0002789777320000042
Figure FDA0002789777320000051
wherein s (i, j) is the score of the demander i on the service scheme j; obtaining the trained parameters X and theta, wherein X multiplied by theta is the preference degree of the demander to the service scheme; beta is a coefficient, lambda is a given parameter, and k is the kth term;
(5) description of algorithms
Collaborative filtering recommendation algorithm for UDMDCFUB multi-dimensional vector
Input: the scores of all service demanders for the service schemes, i.e. $S = (S_1', S_2', \ldots, S_n')$, and the flag matrix F, where n is the number of demanders;
Output: the predicted scores of a demander for all service schemes
1:for i=1:n
2:for j=1:m
3:s(i,j)=s(i,j)-aj
4:for i=1:Iteration
5:
Figure FDA0002789777320000052
6:
Figure FDA0002789777320000053
7: prediction score $\theta^T X + \mu$;
Wherein s (i, j) represents the score of the demander i on the service scheme j, and the meaning of i and j is the same as that of the formulas (3) and (4).
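The steps of claim 3 resemble regularised matrix-factorisation collaborative filtering, and the following pure-Python sketch is written on that assumption. Formulas (3) and (4) are not reproduced in this text, so the update rule below is the standard gradient of a squared-error objective with an L2 penalty and may differ from the patented formulas. The function names, the latent dimension `k`, the defaults for `beta` and `lam`, the iteration count, and the random initialisation are illustrative assumptions.

```python
import random

def mean_normalise(S, F):
    """Subtract each scheme's mean score, using only the scored entries (F == 1)."""
    n, m = len(S), len(S[0])
    means = []
    for j in range(m):
        rated = [S[i][j] for i in range(n) if F[i][j]]
        means.append(sum(rated) / len(rated) if rated else 0.0)
    Y = [[S[i][j] - means[j] if F[i][j] else 0.0 for j in range(m)] for i in range(n)]
    return Y, means

def factorise(Y, F, k=2, beta=0.05, lam=0.01, iterations=2000, seed=0):
    """Learn scheme features X (m x k) and demander parameters Theta (n x k)."""
    rng = random.Random(seed)
    n, m = len(Y), len(Y[0])
    X = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(m)]
    Theta = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n)]
    for _ in range(iterations):
        # Start each gradient with the regularisation term, then add data terms.
        gX = [[lam * v for v in row] for row in X]
        gT = [[lam * v for v in row] for row in Theta]
        for i in range(n):
            for j in range(m):
                if not F[i][j]:
                    continue  # unscored entries do not contribute
                err = sum(t * x for t, x in zip(Theta[i], X[j])) - Y[i][j]
                for c in range(k):
                    gX[j][c] += err * Theta[i][c]
                    gT[i][c] += err * X[j][c]
        # Simultaneous update of both parameter sets.
        X = [[v - beta * g for v, g in zip(r, gr)] for r, gr in zip(X, gX)]
        Theta = [[v - beta * g for v, g in zip(r, gr)] for r, gr in zip(Theta, gT)]
    return X, Theta

def predict_scores(X, Theta, means):
    """Predicted score = theta_i . x_j + mean of scheme j, for every pair (i, j)."""
    return [[sum(t * x for t, x in zip(th, xj)) + mu for xj, mu in zip(X, means)]
            for th in Theta]
```

The schemes with the highest predicted scores in a demander's row are the candidates to recommend; entries where F is 0 are the ones the model fills in.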
CN201710742043.0A 2017-08-25 2017-08-25 Intelligent service recommendation method based on cloud computing Active CN107491992B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710742043.0A CN107491992B (en) 2017-08-25 2017-08-25 Intelligent service recommendation method based on cloud computing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710742043.0A CN107491992B (en) 2017-08-25 2017-08-25 Intelligent service recommendation method based on cloud computing

Publications (2)

Publication Number Publication Date
CN107491992A CN107491992A (en) 2017-12-19
CN107491992B true CN107491992B (en) 2020-12-25

Family

ID=60645876

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710742043.0A Active CN107491992B (en) 2017-08-25 2017-08-25 Intelligent service recommendation method based on cloud computing

Country Status (1)

Country Link
CN (1) CN107491992B (en)

Also Published As

Publication number Publication date
CN107491992A (en) 2017-12-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240927

Address after: 5th Floor, Building 4, Sino German Cooperation Innovation Park, South of Jinxiu Avenue and West of Qingtan Road, Economic and Technological Development Zone, Hefei City, Anhui Province, China 230092

Patentee after: Anhui Zhiyuxin Information Technology Co.,Ltd.

Country or region after: China

Address before: 264209 No. 2, Wenhua West Road, Shandong, Weihai

Patentee before: HARBIN INSTITUTE OF TECHNOLOGY (WEIHAI)

Country or region before: China