CN104239504A - Data processing method for establishing of doctor competency model - Google Patents

Info

Publication number
CN104239504A
Authority
CN
China
Prior art keywords
data
dimension
value
invalid
valid data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410465407.1A
Other languages
Chinese (zh)
Inventor
金阿宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201410465407.1A
Publication of CN104239504A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24554 Unary operations; Data partitioning operations
    • G06F16/24556 Aggregation; Duplicate elimination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses a data processing method for building a doctor competency model. The method comprises five steps: data screening, data decomposition, secondary screening of invalid data, secondary merging of valid data units, and processing of invalid data units. Compared with the prior art, the method reuses the invalid data that the prior art discards, which greatly reduces the loss of information from the original data sample caused by data removal, and it preserves the reliability of the original valid data while the invalid data is put to secondary use.

Description

A data processing method for building a doctor competency model
Technical field
The present invention relates to a data processing method, and in particular to a data processing method for building a doctor competency model.
Background art
There is a large body of literature on medical training models. Most of it aims to cultivate outstanding doctors through measures such as formulating specialized training programs, designing training models, building teaching teams, and establishing and improving support systems. In a doctor competency evaluation system, a large volume of complex statistical data must be screened so that relatively invalid data is rejected, which reduces the data processing workload and improves its efficiency. Prior-art research focuses on how to select the preferred data. However the selection is made, rejecting data biases the information that the sample itself carries, so the sample information cannot be fully used.
Summary of the invention
The object of the present invention is to provide a data processing method for building a doctor competency model that solves the above problem.
The present invention achieves this object through the following technical solution.
The present invention comprises the following steps:
(1) Data screening: the source data is screened using first the KMO (Kaiser-Meyer-Olkin) value and then the Bartlett sphericity test value. Data whose KMO value is greater than 0.5 and whose Bartlett sphericity test value is less than 0.05 is valid data; the remaining data is invalid data;
(2) Data decomposition: all data is divided into several dimensions according to differences in data content;
(3) Secondary screening of invalid data: the maximum and minimum of each dimension are found in the valid data, and each data unit of the invalid data is compared, dimension by dimension, with the maximum and minimum of the valid data in the same dimension. If the data unit lies between the minimum and the maximum of the valid data, the data unit of that dimension in that invalid data is marked as a valid data unit; otherwise it remains marked as an invalid data unit;
(4) Secondary merging of valid data units: the mean of each dimension of the valid data is calculated. Each data unit in the invalid data that was marked as a valid data unit then replaces the mean of its dimension, while the other dimensions keep their own means, forming a new valid data unit. All newly obtained valid data units are weighted with the first weighting factor and then merged into the original valid data units;
(5) Processing of invalid data units: the data units still marked invalid in step (3) are reordered according to the correlation of the dimensions, and the dimensions are then ranked by weight. In order of dimension weight from low to high, the invalid data units of each dimension are selected in turn as target values and linearly fitted with their adjacent invalid data units. The surface formed by the fitted data of all dimensions serves as the secondary inspection standard for building the doctor competency model.
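The secondary screening in step (3) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record layout (a dictionary from dimension name to value), the dimension names, the function names, and the use of inclusive bounds are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Record = Dict[str, float]  # hypothetical layout: dimension name -> value


def dimension_ranges(valid: List[Record]) -> Dict[str, Tuple[float, float]]:
    """Return (minimum, maximum) of each dimension over the valid records."""
    dims = valid[0].keys()
    return {d: (min(r[d] for r in valid), max(r[d] for r in valid)) for d in dims}


def mark_units(invalid: List[Record],
               ranges: Dict[str, Tuple[float, float]]) -> List[Dict[str, bool]]:
    """For each invalid record, flag a dimension True when its value lies
    inside the valid-data range of that dimension (inclusive bounds assumed)."""
    return [
        {d: ranges[d][0] <= rec[d] <= ranges[d][1] for d in ranges}
        for rec in invalid
    ]


# Illustrative data only; the values are made up.
valid = [{"ethics": 4.0, "communication": 3.0},
         {"ethics": 5.0, "communication": 4.5}]
invalid = [{"ethics": 4.5, "communication": 1.0}]

flags = mark_units(invalid, dimension_ranges(valid))
print(flags)  # [{'ethics': True, 'communication': False}]
```

A dimension flagged True corresponds to a data unit re-marked as valid; a dimension flagged False stays an invalid data unit and is left for step (5).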
Further, in step (4), the first weighting factor is calculated as:
$$a = \frac{1}{e^{\left| \left(x_1^2 + x_2^2 + \dots + x_n^2\right) - \left(y_1^2 + y_2^2 + \dots + y_n^2\right) \right|}}$$
In step (4), the new valid data formed after merging is screened again as in step (1). If it meets the criteria of step (1), no further processing is needed; if it does not, the first weighting factor is adjusted until the new valid data meets the criteria of step (1). The first weighting factor is adjusted as follows: 1% of the first weighting factor is taken as the iteration step, and the forward or the reverse direction is chosen arbitrarily. The next iteration increment is determined from the rate of change of the result before and after the iteration and the absolute value of the difference between the iteration result and the target result, according to the relation
q = D / k.
The beneficial effects of the present invention are as follows:
The present invention is a data processing method for building a doctor competency model. Compared with the prior art, it reuses the invalid data that the prior art rejects, which greatly reduces the loss of information from the original data sample caused by data rejection, and it preserves the reliability of the original valid data while the invalid data is put to secondary use.
Detailed description of the embodiments
The invention is further described below.
The present invention comprises the following steps:
(1) Data screening: the source data is screened using first the KMO (Kaiser-Meyer-Olkin) value and then the Bartlett sphericity test value. Data whose KMO value is greater than 0.5 and whose Bartlett sphericity test value is less than 0.05 is valid data; the remaining data is invalid data;
(2) Data decomposition: all data is divided into several dimensions according to differences in data content;
(3) Secondary screening of invalid data: the maximum and minimum of each dimension are found in the valid data, and each data unit of the invalid data is compared, dimension by dimension, with the maximum and minimum of the valid data in the same dimension. If the data unit lies between the minimum and the maximum of the valid data, the data unit of that dimension in that invalid data is marked as a valid data unit; otherwise it remains marked as an invalid data unit;
(4) Secondary merging of valid data units: the mean of each dimension of the valid data is calculated. Each data unit in the invalid data that was marked as a valid data unit then replaces the mean of its dimension, while the other dimensions keep their own means, forming a new valid data unit. All newly obtained valid data units are weighted with the first weighting factor and then merged into the original valid data units;
(5) Processing of invalid data units: the data units still marked invalid in step (3) are reordered according to the correlation of the dimensions, and the dimensions are then ranked by weight. In order of dimension weight from low to high, the invalid data units of each dimension are selected in turn as target values and linearly fitted with their adjacent invalid data units. The surface formed by the fitted data of all dimensions serves as the secondary inspection standard for building the doctor competency model. The surface equation is preferably the sphere equation $x^2 + y^2 + z^2 = R^2$ or the ellipsoid equation $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$.
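The mean substitution in step (4) can be illustrated with a short sketch. It assumes each record is a dictionary from dimension name to value and that a per-dimension flag (True for a data unit re-marked as valid in step (3)) is already available; the names `dimension_means` and `build_new_unit` are hypothetical. How the first weighting factor is then applied to the new unit is not spelled out in the text, so the sketch stops before the weighting.

```python
from statistics import mean
from typing import Dict, List

Record = Dict[str, float]  # hypothetical layout: dimension name -> value


def dimension_means(valid: List[Record]) -> Dict[str, float]:
    """Mean of each dimension over the valid records."""
    dims = valid[0].keys()
    return {d: mean(r[d] for r in valid) for d in dims}


def build_new_unit(record: Record, flags: Dict[str, bool],
                   means: Dict[str, float]) -> Record:
    """Keep the record's own value where the dimension was marked valid;
    fall back to the valid-data mean in every other dimension."""
    return {d: (record[d] if flags[d] else means[d]) for d in means}


# Illustrative data only; the values are made up.
valid = [{"ethics": 4.0, "communication": 3.0},
         {"ethics": 5.0, "communication": 4.5}]
record = {"ethics": 4.8, "communication": 1.0}
flags = {"ethics": True, "communication": False}

means = dimension_means(valid)
new_unit = build_new_unit(record, flags, means)
print(new_unit)  # {'ethics': 4.8, 'communication': 3.75}
```

The out-of-range communication value is replaced by the valid-data mean, so the new unit carries only the information from the dimension that passed the range check.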
Specifically, in step (4), the first weighting factor is calculated as:
$$a = \frac{1}{e^{\left| \left(x_1^2 + x_2^2 + \dots + x_n^2\right) - \left(y_1^2 + y_2^2 + \dots + y_n^2\right) \right|}}$$
where $x_n$ is the n-th dimension data unit of the new valid data, and $y_n$ is the n-th dimension data unit of the valid data in the source data that is nearest to this new valid data in Euclidean distance.
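A small sketch of the weighting factor follows. It assumes the formula reads a = 1 / e^|Σx_i² − Σy_i²|, with plain sums of squares; if the original formula applies square roots to the two sums, only the `diff` line would change. Records are represented as equal-length lists of floats, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def nearest_valid(x: Sequence[float], valid: List[Sequence[float]]) -> Sequence[float]:
    """Valid record from the source data closest to x in Euclidean distance."""
    return min(valid, key=lambda y: math.dist(x, y))


def first_weighting_factor(x: Sequence[float], valid: List[Sequence[float]]) -> float:
    """a = 1 / e^|sum(x_i^2) - sum(y_i^2)| under the assumed reading of the formula."""
    y = nearest_valid(x, valid)
    diff = abs(sum(v * v for v in x) - sum(v * v for v in y))
    return 1.0 / math.exp(diff)


# Illustrative values only.
valid = [[1.0, 2.0], [3.0, 3.0]]
a_same = first_weighting_factor([1.0, 2.0], valid)  # identical to a valid record
a_far = first_weighting_factor([1.0, 1.0], valid)   # nearest is [1.0, 2.0], |2 - 5| = 3
print(a_same, a_far)
```

Under this reading the factor equals 1 when the new unit coincides with its nearest valid record and decays towards 0 as the two drift apart, so new units that resemble existing valid data receive more weight.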
In step (4), the new valid data formed after merging is screened again as in step (1). If it meets the criteria of step (1), no further processing is needed; if it does not, the first weighting factor is adjusted until the new valid data meets the criteria of step (1). The first weighting factor is adjusted as follows: 1% of the first weighting factor is taken as the iteration step, and the forward or the reverse direction is chosen arbitrarily. The next iteration increment is determined from the rate of change of the result before and after the iteration and the absolute value of the difference between the iteration result and the target result, according to the relation
q = D / k.
where k is the rate of change of the result before and after the iteration, D is the absolute value of the difference between the iteration result and the target result, and q is the next iteration increment.
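The adjustment loop can be read as a secant-style update. The sketch below is one possible interpretation, not the patent's procedure: `result_of` is a hypothetical callable that returns the screening result (for example a KMO value) obtained with a given weighting factor, `target` is the value to reach, and the direction of each later step is taken from the signs of k and of the remaining gap, which the text leaves open. A nonzero starting factor is assumed.

```python
from typing import Callable


def adjust_weight(a0: float, result_of: Callable[[float], float], target: float,
                  forward: bool = True, tol: float = 1e-6, max_iter: int = 50) -> float:
    """Iterate the weighting factor until result_of(a) is within tol of target."""
    a_prev, r_prev = a0, result_of(a0)
    # First step: 1% of the weighting factor, in the chosen direction.
    a = a0 + (0.01 * a0 if forward else -0.01 * a0)
    for _ in range(max_iter):
        r = result_of(a)
        D = abs(r - target)                 # distance of the result from the target
        if D <= tol:
            return a
        k = (r - r_prev) / (a - a_prev)     # rate of change of the result
        if k == 0:
            raise ValueError("result does not change with the weighting factor")
        q = D / abs(k)                      # size of the next increment, q = D / k
        direction = 1.0 if (target - r) * k > 0 else -1.0  # assumed sign rule
        a_prev, r_prev = a, r
        a += direction * q
    return a


# Illustrative stand-in for the screening result: a linear response.
a_final = adjust_weight(0.1, lambda a: 2.0 * a + 0.1, target=0.5)
print(round(a_final, 6))
```

For a linear response the update lands on the target after the first secant step; a real screening statistic would generally need several iterations.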
The valid and invalid data described above consist of valid and invalid data units in multiple dimensions; the valid and invalid data are processed by processing their data units.
Before the above data processing method is used, data samples must be collected and organized.
A competency element dictionary is built by combining the interview method with a review of normative documents. Text processing and frequency analysis are applied to the interview recordings; qualities and events that occur with a frequency greater than 30% in the interviews are selected and organized, and personalized features characteristic of traditional Chinese medicine (TCM) are extracted, unified, and described in wording consistent with TCM usage. The qualitative features are then given tentative definitions. Some definitions can be found in corresponding requirements such as GMER and the Ministry of Education standards, but most have to be described and confirmed by the researchers themselves. This yields 29 qualitative features. The 29 features obtained from the interviews are adapted in combination with the international GMER standard and with reference to the standard "Basic Requirements for Graduates of Undergraduate Traditional Chinese Medicine (pre-CMD) Education"; the literature on TCM doctor competency is analyzed and the standard is given operational definitions. Scale items are then written on the basis of the 29 features, with questions designed for each feature, finally forming the competency feature dictionary.
On the basis of this TCM outstanding-doctor competency dictionary, the 71 qualities and questions are studied and screened in depth, and a scale is formed using 5-point Likert scoring. An independent-samples t-test between the expert group and the control group is run on the score of each item. A significant result shows that the item differs between the expert group and the control group, can distinguish experts from ordinary doctors, and therefore discriminates well. A non-significant result shows that the item does not differ significantly between the two groups, cannot distinguish experts from ordinary doctors well, and therefore discriminates poorly. With the independent-samples t-test result between the expert group and the control group as the discrimination criterion for item analysis, 22 items are deleted because they do not reach the significance level and cannot discriminate. The remaining 49 items are retained because their t-test results are significant and they discriminate well. The 49 retained items are arranged in random order into a preliminary questionnaire.
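The item-discrimination check can be sketched with a pooled-variance independent-samples t statistic. The scores below are invented for illustration, and the critical value 2.306 (two-sided, 5% level, 8 degrees of freedom) is taken from a standard t table; the text does not state which variant of the t-test or which software was used, so the equal-variance form here is an assumption.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def pooled_t(expert: Sequence[float], control: Sequence[float]) -> Tuple[float, int]:
    """Independent-samples t statistic with pooled variance, and its degrees of freedom."""
    n1, n2 = len(expert), len(control)
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * variance(expert) + (n2 - 1) * variance(control)) / df
    t = (mean(expert) - mean(control)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df


def discriminates(expert: Sequence[float], control: Sequence[float],
                  t_critical: float) -> bool:
    """Retain the item when |t| exceeds the caller-supplied critical value."""
    t, _ = pooled_t(expert, control)
    return abs(t) > t_critical


# Hypothetical 5-point Likert scores for one item.
expert_scores = [5.0, 4.0, 5.0, 4.0, 5.0]
control_scores = [3.0, 2.0, 3.0, 2.0, 3.0]

t, df = pooled_t(expert_scores, control_scores)
keep = discriminates(expert_scores, control_scores, t_critical=2.306)
print(round(t, 4), df, keep)
```

An item for which `discriminates` returns False would be one of the deleted items; an item for which it returns True would be retained for the preliminary questionnaire.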
Exploratory factor analysis is performed on 220 data records covering the 49 items. The KMO value is 0.849 and the significance level of the Bartlett sphericity test is P < 0.001. Common factors are extracted using the criterion of an eigenvalue greater than 1. Nine common factors have eigenvalues greater than 1, but the analysis shows that only the first 5 contribute substantially; from the 6th factor onward the contribution rate grows slowly.
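For readers who want to see how the two screening statistics are obtained, the sketch below computes the overall KMO measure and the Bartlett sphericity chi-square from a correlation matrix using only the standard library. It uses the textbook definitions (KMO from squared correlations and squared partial correlations; Bartlett chi-square of −(n − 1 − (2p + 5)/6)·ln|R| with p(p − 1)/2 degrees of freedom). The function names and the tiny data set are assumptions; turning the chi-square into a P value additionally needs a chi-square distribution function from a statistics library.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def correlation_matrix(rows: Matrix) -> Matrix:
    """Pearson correlation matrix of the columns of rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    cen = [[r[j] - means[j] for j in range(p)] for r in rows]
    norms = [math.sqrt(sum(c[j] ** 2 for c in cen)) for j in range(p)]
    return [[sum(c[i] * c[j] for c in cen) / (norms[i] * norms[j])
             for j in range(p)] for i in range(p)]


def inverse_and_det(m: Matrix) -> Tuple[Matrix, float]:
    """Gauss-Jordan elimination with partial pivoting."""
    p = len(m)
    a = [row[:] + [float(i == j) for j in range(p)] for i, row in enumerate(m)]
    det = 1.0
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        piv = a[col][col]
        det *= piv
        a[col] = [v / piv for v in a[col]]
        for r in range(p):
            if r != col:
                f = a[r][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [row[p:] for row in a], det


def kmo(corr: Matrix) -> float:
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inv, _ = inverse_and_det(corr)
    p = len(corr)
    r2 = q2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                q2 += partial ** 2
    return r2 / (r2 + q2)


def bartlett_chi2(corr: Matrix, n: int) -> Tuple[float, int]:
    """Bartlett sphericity chi-square statistic and its degrees of freedom."""
    p = len(corr)
    _, det = inverse_and_det(corr)
    return -(n - 1 - (2 * p + 5) / 6) * math.log(det), p * (p - 1) // 2


# Tiny made-up sample with two items; real use would pass the 220 x 49 score table.
rows = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
corr = correlation_matrix(rows)
print(round(corr[0][1], 3), round(kmo(corr), 3), bartlett_chi2(corr, n=len(rows)))
```

With only two variables the KMO measure is 0.5 by construction, so the example is useful for checking the arithmetic rather than for judging sampling adequacy.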
After quartimax (maximum fourth-power) rotation, items with a factor loading below 0.5 and items with ambiguous factor attribution are deleted, 8 in total (items 2, 20, 26, 34, 44, 46, 48, and 49). The 41 retained items belong to 5 factors: 9 to factor I, 8 to factor II, 8 to factor III, 8 to factor IV, and 8 to factor V.
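The item filter can be illustrated with a simple rule. The 0.5 loading threshold comes from the text, but the text does not say how ambiguous attribution was judged; the sketch assumes an item is ambiguous when it reaches the threshold on more than one factor. The loadings and item names are invented.

```python
from typing import Dict, List, Optional


def assign_factor(loadings: List[float], threshold: float = 0.5) -> Optional[int]:
    """Return the index of the single factor the item loads on, or None to drop it.

    Dropped cases: no loading reaches the threshold, or (assumed rule)
    two or more loadings reach it, which makes the attribution ambiguous.
    """
    strong = [i for i, v in enumerate(loadings) if abs(v) >= threshold]
    return strong[0] if len(strong) == 1 else None


def retain_items(rotated: Dict[str, List[float]],
                 threshold: float = 0.5) -> Dict[str, int]:
    """Map each retained item to the index of its factor."""
    kept = {}
    for item, loadings in rotated.items():
        factor = assign_factor(loadings, threshold)
        if factor is not None:
            kept[item] = factor
    return kept


# Hypothetical rotated loadings on three factors.
rotated = {
    "item_a": [0.72, 0.10, 0.05],  # clear loading on the first factor
    "item_b": [0.41, 0.38, 0.12],  # below 0.5 everywhere
    "item_c": [0.55, 0.58, 0.02],  # loads on two factors
}
print(retain_items(rotated))  # {'item_a': 0}
```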
The valid and invalid data screened out are processed again with the data method described in this embodiment, finally yielding element groups in 5 dimensions: moral ability, thinking ability, communication ability, learning ability, and practice ability. These 5 element groups are broken down into 15 features, such as "ideals and beliefs" and "professionalism", and into 41 items, thereby building the "5A" model of competency for "outstanding doctors" of traditional Chinese medicine.
The basic principles, main features, and advantages of the present invention have been shown and described above. Those skilled in the art should understand that the present invention is not limited to the above embodiment. The above embodiment and the description only illustrate the principles of the invention; various changes and improvements may be made without departing from the spirit and scope of the invention, and all such changes and improvements fall within the claimed scope of the invention. The scope of protection claimed is defined by the appended claims and their equivalents.

Claims (4)

1. A data processing method for building a doctor competency model, characterized in that it comprises the following steps:
(1) Data screening: the source data is screened using first the KMO (Kaiser-Meyer-Olkin) value and then the Bartlett sphericity test value. Data whose KMO value is greater than 0.5 and whose Bartlett sphericity test value is less than 0.05 is valid data; the remaining data is invalid data;
(2) Data decomposition: all data is divided into several dimensions according to differences in data content;
(3) Secondary screening of invalid data: the maximum and minimum of each dimension are found in the valid data, and each data unit of the invalid data is compared, dimension by dimension, with the maximum and minimum of the valid data in the same dimension. If the data unit lies between the minimum and the maximum of the valid data, the data unit of that dimension in that invalid data is marked as a valid data unit; otherwise it remains marked as an invalid data unit;
(4) Secondary merging of valid data units: the mean of each dimension of the valid data is calculated. Each data unit in the invalid data that was marked as a valid data unit then replaces the mean of its dimension, while the other dimensions keep their own means, forming a new valid data unit. All newly obtained valid data units are weighted with the first weighting factor and then merged into the original valid data units;
(5) Processing of invalid data units: the data units still marked invalid in step (3) are reordered according to the correlation of the dimensions, and the dimensions are then ranked by weight. In order of dimension weight from low to high, the invalid data units of each dimension are selected in turn as target values and linearly fitted with their adjacent invalid data units. The surface formed by the fitted data of all dimensions serves as the secondary inspection standard for building the doctor competency model.
2. The data processing method for building a doctor competency model according to claim 1, characterized in that in step (4) the first weighting factor is calculated as:
$$a = \frac{1}{e^{\left| \left(x_1^2 + x_2^2 + \dots + x_n^2\right) - \left(y_1^2 + y_2^2 + \dots + y_n^2\right) \right|}}$$
3. The data processing method for building a doctor competency model according to claim 1, characterized in that in step (4) the new valid data formed after merging is screened again as in step (1); if it meets the criteria of step (1), no further processing is needed; if it does not meet the criteria of step (1), the first weighting factor is adjusted until the new valid data meets the criteria of step (1).
4. The data processing method for building a doctor competency model according to claim 3, characterized in that the first weighting factor is adjusted as follows: 1% of the first weighting factor is taken as the iteration step, and the forward or the reverse direction is chosen arbitrarily; the next iteration increment is determined from the rate of change of the result before and after the iteration and the absolute value of the difference between the iteration result and the target result, according to the relation
q = D / k.
CN201410465407.1A 2014-09-15 2014-09-15 Data processing method for establishing of doctor competency model Pending CN104239504A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410465407.1A CN104239504A (en) 2014-09-15 2014-09-15 Data processing method for establishing of doctor competency model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410465407.1A CN104239504A (en) 2014-09-15 2014-09-15 Data processing method for establishing of doctor competency model

Publications (1)

Publication Number Publication Date
CN104239504A true CN104239504A (en) 2014-12-24

Family

ID=52227563

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410465407.1A Pending CN104239504A (en) 2014-09-15 2014-09-15 Data processing method for establishing of doctor competency model

Country Status (1)

Country Link
CN (1) CN104239504A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107908744A (en) * 2017-11-16 2018-04-13 河南中医药大学 A kind of method of abnormality detection and elimination for big data cleaning
CN107908744B (en) * 2017-11-16 2021-05-18 河南中医药大学 Anomaly detection and elimination method for big data cleaning
CN110491490A (en) * 2019-07-11 2019-11-22 深圳市翩翩科技有限公司 A kind of doctor's appraisal procedure and device
CN114821476A (en) * 2022-05-05 2022-07-29 北京容联易通信息技术有限公司 Bright kitchen range intelligent monitoring method and system based on deep learning detection

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
DD01 Delivery of document by public notice
Addressee: Jin Aning
Document name: Notification of before Expiration of Request of Examination as to Substance
SE01 Entry into force of request for substantive examination
DD01 Delivery of document by public notice
Addressee: Jin Aning
Document name: Notification of Passing Examination on Formalities
RJ01 Rejection of invention patent application after publication
Application publication date: 20141224