CN112257418A - Questionnaire data processing method and device and storage medium - Google Patents

Questionnaire data processing method and device and storage medium

Info

Publication number
CN112257418A
Authority
CN
China
Prior art keywords
statistical
questionnaire
data
processed
questionnaire data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011196272.5A
Other languages
Chinese (zh)
Inventor
周俊
方博
常春
李章民
陈忆馨
王祖坤
陈梦轩
霍妍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qingsi Technology Co ltd
Original Assignee
Beijing Qingsi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qingsi Technology Co ltd
Priority to CN202011196272.5A
Publication of CN112257418A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis

Abstract

The present disclosure relates to a questionnaire data processing method, device and storage medium, wherein the method comprises: determining one or more statistical methods applicable to the questionnaire data to be processed; according to the data type of the questionnaire data to be processed, selecting a target statistical method from the one or more statistical methods to process the questionnaire data to be processed to obtain a statistical result; and automatically interpreting the statistical result according to a statistical index to obtain questionnaire interpretation information, wherein the numerical value of the statistical index reflects the characteristics of the statistical result. By this method, questionnaire data is processed automatically and its value is mined conveniently and quickly, giving non-professional analysts a data-driven, scientific basis for decision-making and meeting their need to deeply mine the value of the questionnaire data; meanwhile, the method provides analysis assistance for professional analysts and effectively improves their analysis efficiency.

Description

Questionnaire data processing method and device and storage medium
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a method and an apparatus for processing questionnaire data, and a storage medium.
Background
Currently, questionnaires are widely used, and there are two main approaches to analyzing questionnaire data: first, using the data summary results provided by the questionnaire platform itself; second, performing manual analysis with professional statistical software.
However, a questionnaire platform can only provide simple summary results, while professional statistical software is aimed at professional analysts and presents too high a barrier for non-professional analysts. Therefore, neither approach meets the need of non-professional analysts to deeply mine the value of questionnaire data.
Disclosure of Invention
In view of the above, the present disclosure provides a method and an apparatus for processing questionnaire data, and a storage medium.
According to an aspect of the present disclosure, there is provided a questionnaire data processing method including: determining one or more statistical methods applicable to the questionnaire data to be processed; according to the data type of the questionnaire data to be processed, selecting a target statistical method from the one or more statistical methods to process the questionnaire data to be processed to obtain a statistical result; and automatically interpreting the statistical result according to a statistical index to obtain questionnaire interpretation information; wherein the value of the statistical index reflects the characteristics of the statistical result.
In one possible implementation, the determining one or more statistical methods applicable to the questionnaire data to be processed includes: inputting the questionnaire data to be processed into a trained neural network to obtain the one or more statistical methods.
In a possible implementation manner, the selecting a target statistical method from the one or more statistical methods to process the questionnaire data to be processed according to the data type of the questionnaire data to be processed to obtain a statistical result, including: selecting a target statistical method corresponding to the data type of the questionnaire data to be processed from the one or more statistical methods according to the corresponding relation between the data type and the statistical method; and processing the questionnaire data to be processed according to the target statistical method to obtain a statistical result.
In a possible implementation manner, the automatically interpreting the statistical result according to the statistical indicator to obtain questionnaire interpretation information includes: screening the statistical result to obtain a screened statistical result; and automatically interpreting the screened statistical result according to the one or more statistical indexes to obtain questionnaire interpretation information.
In a possible implementation manner, the screening the statistical result to obtain a screened statistical result includes: screening the statistical result according to the evaluation index to obtain a screened statistical result; wherein the value of the evaluation index reflects the value of the statistical result.
In one possible implementation manner, the evaluation index includes at least one of: an Akaike Information Criterion (AIC) value, a Bayesian Information Criterion (BIC) value, a coefficient of determination (R-squared) value, a root mean square error value, and a significance probability (P) value; and the statistical index includes at least one of: R-squared, P value, mean, percentage, and standard deviation.
In one possible implementation, the method further includes: generating an analysis report of the questionnaire data to be processed according to the questionnaire interpretation information and the statistical result; and sending the analysis report to a user terminal so as to enable the user terminal to display the analysis report.
According to another aspect of the present disclosure, there is provided a questionnaire data processing apparatus including: a statistical method determination module for determining one or more statistical methods applicable to the questionnaire data to be processed; the data processing module is used for selecting a target statistical method from the one or more statistical methods to process the questionnaire data to be processed according to the data type of the questionnaire data to be processed to obtain a statistical result; the automatic interpretation module is used for automatically interpreting the statistical result according to the statistical index to obtain questionnaire interpretation information; wherein the value of the statistical indicator reflects the characteristics of the statistical result.
According to another aspect of the present disclosure, there is provided a questionnaire data processing apparatus including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to perform the above method.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the above-described method.
In the embodiments of the present disclosure, one or more statistical methods applicable to the questionnaire data to be processed are determined; according to the data type of the questionnaire data to be processed, a target statistical method is selected from the one or more statistical methods to process the questionnaire data to be processed to obtain a statistical result; and the statistical result is automatically interpreted according to a statistical index to obtain questionnaire interpretation information, wherein the value of the statistical index reflects the characteristics of the statistical result. In this way, questionnaire data is processed automatically and its value is mined conveniently and quickly, giving non-professional analysts a data-driven, scientific basis for decision-making and meeting their need to deeply mine the value of the questionnaire data. Meanwhile, analysis assistance can be provided for professional analysts, greatly reducing their mental labor and effectively improving their analysis efficiency.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features, and aspects of the disclosure and, together with the description, serve to explain the principles of the disclosure.
FIG. 1 shows a flow diagram of a method of questionnaire data processing according to an embodiment of the present disclosure;
FIG. 2 shows a block diagram of a questionnaire data processing apparatus according to an embodiment of the present disclosure;
FIG. 3 shows a block diagram of a questionnaire data processing apparatus according to an embodiment of the present disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
Currently, questionnaires are widely used, and there are two schemes for analyzing questionnaire data. The first is to use the data summary results provided by the questionnaire platform itself, such as frequency results and cross-tabulation results. There are many online platforms that collect data through questionnaires, such as SurveyMonkey and the online survey software Qualtrics abroad, and domestic questionnaire platforms such as Questionnaire Star (Wenjuanxing), Questionnaire Net (Wenjuan), and Tencent Questionnaire. Conventional questionnaire platforms mainly provide a data collection function and a simple data summary function; they provide no data analysis or data mining functions and cannot effectively uncover the intrinsic value of the data. The second scheme is for professional analysts to use professional statistical software, manually running analyses and interpreting indexes with the algorithms in the software so as to mine the value of the data. If a questionnaire user wants to mine the intrinsic value of the data more deeply, the analysis must be performed manually with traditional client software such as Statistical Product and Service Solutions (SPSS), the statistical software R, Statistical Analysis System (SAS), SPSSAU, or MATLAB (matrix laboratory). However, SPSS, R, SAS, SPSSAU, and MATLAB are all professional statistical software: only professional analysts know which analysis algorithm should be used or how to interpret the practical meaning of an algorithm's indexes, which creates a high barrier for ordinary non-professional analysts.
As described above, the data summary results provided by a questionnaire platform offer only a basic summarization function without in-depth data analysis or mining, so the analytical value that a questionnaire user can extract from the summarized data is very limited. If the intrinsic value of the data is to be mined in depth, a professional analyst must use professional statistical software. Such software assumes that the user is a professional analyst who can interpret and analyze indexes independently; it cannot automatically select an analysis algorithm or automatically interpret an index result, so its use and analysis depend entirely on the analyst's professional, subjective choices.
Current questionnaire platforms can only provide simple summary results, and professional statistical software can only be operated by professional analysts; non-professionals cannot quickly get started with such software, yet they also need to mine the intrinsic value of questionnaire data. In view of this, the present disclosure provides a technical solution for processing questionnaire data that automatically mines the intrinsic value of the questionnaire data and meets the need of non-professional analysts to deeply mine that value.
Fig. 1 shows a flowchart of a questionnaire data processing method according to an embodiment of the present disclosure. As shown in fig. 1, the method may include:
Step 101, determining one or more statistical methods applicable to the questionnaire data to be processed.
The questionnaire data to be processed may be questionnaire data stored in a related database, and may include data corresponding to one or more questionnaire questions. The stored questionnaire data is uploaded by a questionnaire user (who may be a non-professional analyst or a professional analyst) and stored in the related database, which may be, for example, a MySQL database (a relational database management system). The questionnaire data to be processed may also be imported directly from an existing questionnaire platform: a data input interface connects to the questionnaire platform, and the questionnaire data to be processed is imported directly from the platform. Illustratively, the data input interface may be implemented in Java.
The statistical methods may include: frequency analysis, chi-square analysis, cluster analysis, correlation analysis, linear regression analysis, analysis of variance, t-test, logistic regression analysis (a generalized linear regression analysis model), correspondence analysis, non-parametric tests, multiple response analysis, descriptive analysis, and the like.
In the related art, the statistical method for questionnaire data is usually integrated into various statistical software, and a professional analyst is required to autonomously select the statistical method for analysis, so that the statistical method cannot be automatically selected. Meanwhile, the questionnaire platform can only provide simple frequency analysis and cross table function, and cannot meet the requirements. In this step, one or more statistical methods applicable to the questionnaire data to be processed can be automatically selected, so that non-professionals can determine statistical methods applicable to the questionnaire data to be processed without using professional statistical software.
For example, a study questionnaire about work satisfaction, which is shown in table 1 below, is taken as an example of the questionnaire data to be processed, wherein the questionnaire data includes data corresponding to 17 questionnaire questions (i.e., numbers Q1-Q17 in table 1), and for each questionnaire question, there are corresponding options, and the options may include two forms of selection (options such as A, B, C, D) and scoring (minimum score of 1, maximum score of 10) for different questionnaire questions.
TABLE 1 study questionnaire on job satisfaction
[Table 1 appears as an image in the original publication and is not reproduced here.]
In a possible implementation manner, in this step, the determining one or more statistical methods applicable to the questionnaire data to be processed may include: inputting the questionnaire data to be processed into a trained neural network to obtain the one or more statistical methods.
In the embodiments of the present disclosure, a neural network can be trained in advance with questionnaire data samples to obtain the trained neural network. The questionnaire data to be processed is input into the trained neural network, and the trained neural network outputs one or more statistical methods, so that one or more statistical methods applicable to the questionnaire data to be processed are selected automatically. Illustratively, a piece of questionnaire data to be processed is input into the trained neural network, and the trained neural network outputs five statistical methods: frequency analysis, correlation analysis, regression analysis, chi-square analysis, and analysis of variance.
For example, taking the research questionnaire about work satisfaction in Table 1 above as an example, the questionnaire data is input into the trained neural network, and the trained neural network outputs seven statistical methods: frequency analysis, descriptive analysis, analysis of variance, correlation analysis, linear regression analysis (enter method), linear regression analysis (stepwise method), and chi-square analysis.
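The disclosure does not describe the architecture, input features, or training of the neural network. As a minimal sketch only, one way to picture a network that returns several methods at once is a multi-label output layer: each candidate method receives an independent sigmoid score, and every method whose score passes a threshold is returned. The two input features, the hand-set weights and biases (standing in for trained parameters), the threshold, and the function names below are all hypothetical.

```python
import math

# Candidate statistical methods (a subset of those named in the text).
METHODS = [
    "frequency analysis",
    "descriptive analysis",
    "analysis of variance",
    "correlation analysis",
    "chi-square analysis",
]


def sigmoid(z):
    """Map a raw score to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def recommend_methods(features, weights, biases, threshold=0.5):
    """Return every method whose sigmoid score reaches the threshold.

    features: list of floats describing the questionnaire data.
    weights:  one row of floats per method (same length as features).
    biases:   one float per method.
    """
    recommended = []
    for method, row, bias in zip(METHODS, weights, biases):
        score = sigmoid(sum(w * x for w, x in zip(row, features)) + bias)
        if score >= threshold:
            recommended.append(method)
    return recommended


# Hypothetical features:
# [contains classification questions, contains quantitative questions]
# Hypothetical hand-set parameters standing in for a trained output layer.
WEIGHTS = [[4.0, 0.0], [0.0, 4.0], [2.0, 2.0], [0.0, 4.0], [4.0, 0.0]]
BIASES = [-2.0, -2.0, -3.0, -2.0, -2.0]

print(recommend_methods([1.0, 1.0], WEIGHTS, BIASES))
print(recommend_methods([1.0, 0.0], WEIGHTS, BIASES))
```

With these illustrative parameters, a questionnaire containing both kinds of questions receives all five methods, while one containing only classification questions receives only frequency analysis and chi-square analysis.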
Step 102, according to the data type of the questionnaire data to be processed, selecting a target statistical method from the one or more statistical methods to process the questionnaire data to be processed to obtain a statistical result.
The data type of the questionnaire data to be processed can be labeled in advance by the questionnaire user or provided by the questionnaire platform through the data input interface. The data type of the questionnaire data to be processed can be stored in the related database in advance so that it can be retrieved during subsequent data processing.
Exemplary data types may include classification data (i.e., categorical data) and quantitative data. The data type of the questionnaire data to be processed may include the data type of the data corresponding to each of one or more questionnaire questions.
For example, still taking the research questionnaire about work satisfaction in Table 1 above as an example, the data types of the questionnaire data are shown in Table 2 below: the data corresponding to questionnaire questions Q1-Q10 are classification data (abbreviated as "classification" in Table 2), and the data corresponding to questionnaire questions Q11-Q17 are quantitative data (abbreviated as "quantitative" in Table 2).
TABLE 2 data types of questionnaire data
[Table 2 appears as an image in the original publication and is not reproduced here.]
In this step, according to the data type of the questionnaire data to be processed, selecting a target statistical method from the one or more statistical methods determined in the above step 101 and applicable to the questionnaire data to be processed, and processing the questionnaire data to be processed by using the target statistical method, thereby obtaining a statistical result; wherein the target statistical method comprises at least one of the one or more statistical methods described above.
For example, for the data corresponding to each single questionnaire question in the questionnaire data to be processed, a target statistical method may be selected from the obtained one or more statistical methods according to the data type of that data, and used to process the data to obtain a statistical result for that question. Likewise, for the data corresponding to multiple questionnaire questions, a target statistical method may be selected from the obtained one or more statistical methods according to the data types of the data corresponding to those questions, and used to process the data to obtain a statistical result for those questions together. In this way, a specific statistical method is actually applied to the data corresponding to specific questionnaire questions.
For example, suppose the one or more statistical methods obtained in the above step are five statistical methods: frequency analysis, correlation analysis, regression analysis, chi-square analysis, and analysis of variance. For the data corresponding to a single questionnaire question: when the data type is classification data, frequency analysis can be selected to process the data and obtain a statistical result for that question; when the data type is quantitative data, correlation analysis and/or regression analysis can be selected to process the data and obtain a statistical result for that question. For the data corresponding to two questionnaire questions: when the data for one question is classification data and the data for the other is quantitative data, analysis of variance can be selected to process the data of the two questions; when the data for both questions is quantitative data, regression analysis can be selected; and when the data for both questions is classification data, chi-square analysis can be selected. In each case, a statistical result for the data corresponding to the two questionnaire questions is obtained.
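The selection rule in the example above can be sketched as a lookup keyed by the data types involved, restricted to the candidate methods from step 101. This is a minimal sketch: the dictionary layout and the names `RULES` and `pick_methods` are assumptions for illustration, not taken from the disclosure.

```python
# Correspondence between data types and statistical methods, as listed in
# the five-method example. Keys are sorted tuples of data types so that the
# order of the two questions does not matter.
RULES = {
    ("classification",): ["frequency analysis"],
    ("quantitative",): ["correlation analysis", "regression analysis"],
    ("classification", "quantitative"): ["analysis of variance"],
    ("quantitative", "quantitative"): ["regression analysis"],
    ("classification", "classification"): ["chi-square analysis"],
}


def pick_methods(data_types, candidates):
    """Select target methods for one question or a pair of questions.

    data_types: list of one or two data-type labels.
    candidates: methods determined in step 101; only these may be selected.
    """
    key = tuple(sorted(data_types))
    return [m for m in RULES.get(key, []) if m in candidates]


CANDIDATES = [
    "frequency analysis",
    "correlation analysis",
    "regression analysis",
    "chi-square analysis",
    "analysis of variance",
]

print(pick_methods(["quantitative", "classification"], CANDIDATES))
print(pick_methods(["classification", "classification"], CANDIDATES))
```

Keeping the rule as data rather than as branching code means a different correspondence can be swapped in without changing the selection function.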
In a possible implementation manner, in this step, the selecting a target statistical method from the one or more statistical methods to process the questionnaire data to be processed according to the data type of the questionnaire data to be processed to obtain a statistical result, which may include: selecting a target statistical method corresponding to the data type of the questionnaire data to be processed from the one or more statistical methods according to the corresponding relation between the data type and the statistical method; and processing the questionnaire data to be processed according to the target statistical method to obtain a statistical result.
In this step, one or more corresponding statistical methods can be automatically selected as a target statistical method according to the corresponding relationship between different data types and different statistical methods and according to the data type of the questionnaire data to be processed, and the questionnaire data to be processed is automatically processed by using the target statistical method to obtain a statistical result. For example, the correspondence between the data type and the statistical method may include a correspondence between classification data and each statistical method, a correspondence between quantitative data and each statistical method, and a correspondence between quantitative data and classification data and each statistical method.
For example, still taking the research questionnaire about work satisfaction in Table 1 above as an example, the one or more statistical methods determined by the trained neural network include: frequency analysis, descriptive analysis, analysis of variance, correlation analysis, linear regression analysis (enter method), linear regression analysis (stepwise method), and chi-square analysis. Table 3 shows the correspondence between the data types and these seven statistical methods. As shown in Table 3, the target statistical method for classification data (abbreviated as "classification" in Table 3) is frequency analysis; the target statistical method for quantitative data (abbreviated as "quantitative" in Table 3) is descriptive analysis; the target statistical method between classification data and quantitative data is analysis of variance; the target statistical method between two sets of quantitative data is correlation analysis, linear regression analysis (enter method), or linear regression analysis (stepwise method); and the target statistical method between two sets of classification data is chi-square analysis.
TABLE 3 data type and statistical method correspondence
[Table 3 appears as an image in the original publication and is not reproduced here.]
With reference to Table 3, the questionnaire data to be processed of each data type is processed using the corresponding target statistical method to obtain statistical results; Table 4 shows how the data corresponding to each questionnaire question is analyzed and processed. As shown in Table 4, the data corresponding to questionnaire questions Q1-Q10 are classification data (abbreviated as "classification" in Table 4) and are processed by frequency analysis; the data corresponding to questionnaire questions Q11-Q17 are quantitative data (abbreviated as "quantitative" in Table 4) and are processed by descriptive analysis; the data corresponding to each of questionnaire questions Q1 to Q10 may be subjected to analysis of variance with the data corresponding to each of questionnaire questions Q11 to Q17; and so on.
TABLE 4 analysis and processing table of data corresponding to each questionnaire question
[Table 4 appears as an image in the original publication and is not reproduced here.]
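To make the scale of such an analysis plan concrete, the following sketch enumerates analyses for the 17 questions using the data types stated in the text (Q1-Q10 classification, Q11-Q17 quantitative) and the correspondence described for Table 3. Two points are assumptions: that every eligible pair of questions is analyzed, and that correlation analysis is the method chosen for pairs of quantitative questions (the text also allows linear regression). The function name `build_plan` is hypothetical.

```python
from collections import Counter
from itertools import combinations, product

# Data types as described in the text: Q1-Q10 classification, Q11-Q17 quantitative.
DATA_TYPES = {f"Q{i}": "classification" for i in range(1, 11)}
DATA_TYPES.update({f"Q{i}": "quantitative" for i in range(11, 18)})


def build_plan(data_types):
    """Return a list of (method, questions) tuples covering all questions."""
    cls = [q for q, t in data_types.items() if t == "classification"]
    qty = [q for q, t in data_types.items() if t == "quantitative"]
    plan = []
    # Single-question analyses.
    plan += [("frequency analysis", (q,)) for q in cls]
    plan += [("descriptive analysis", (q,)) for q in qty]
    # Two-question analyses (assumed here to cover every eligible pair).
    plan += [("analysis of variance", pair) for pair in product(cls, qty)]
    plan += [("correlation analysis", pair) for pair in combinations(qty, 2)]
    plan += [("chi-square analysis", pair) for pair in combinations(cls, 2)]
    return plan


plan = build_plan(DATA_TYPES)
counts = Counter(method for method, _ in plan)
for method, n in counts.items():
    print(f"{method}: {n}")
print("total analyses:", len(plan))
```

Under these assumptions the plan contains 153 analyses, most of them pairwise, which illustrates why later screening of the statistical results is useful.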
Step 103, automatically interpreting the statistical result according to a statistical index to obtain questionnaire interpretation information, wherein the numerical value of the statistical index reflects the characteristics of the statistical result.
In this step, each statistical result obtained in step 102 is automatically interpreted through the statistical index corresponding to its target statistical method; different questionnaire interpretation information is automatically obtained according to the different values of the statistical index, thereby meeting the needs of non-professional analysts. The statistical indexes may include common statistical indexes such as mean, percentage, standard deviation, R-squared, and P value, which are not limited by the embodiments of the present disclosure.
Illustratively, the statistical results obtained with the same target statistical method are summarized, and the summarized statistical results are then automatically interpreted using the statistical indexes corresponding to that target statistical method. For example, if the target statistical method is chi-square analysis, the statistical results obtained by chi-square analysis for the data corresponding to the questionnaire questions may be automatically interpreted through the P value, where a P value less than 0.05 may indicate a difference or influence relationship between the data; if the target statistical method is linear regression analysis, the statistical results obtained by linear regression analysis may be automatically interpreted through the P value and the R-squared value, where the R-squared value can represent how strongly the analysis item X explains the analysis item Y.
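A minimal sketch of such index-driven interpretation follows, using the 0.05 threshold on the P value stated above and reading R-squared as the share of variation in Y explained by X. The sentence templates, function names, and the example values passed to the regression function are hypothetical; the disclosure does not specify the wording of the generated text.

```python
def interpret_chi_square(item, factor, p, alpha=0.05):
    """Turn a chi-square P value into a one-sentence interpretation."""
    if p < alpha:
        return (f"'{item}' differs significantly by {factor} "
                f"(P={p:.3f}<{alpha}).")
    return (f"'{item}' shows no significant difference by {factor} "
            f"(P={p:.3f}>={alpha}).")


def interpret_regression(x, y, r_squared, p, alpha=0.05):
    """Interpret a simple linear regression through P value and R-squared."""
    if p < alpha:
        return (f"{x} has a significant influence on {y} (P={p:.3f}<{alpha}); "
                f"{x} explains {r_squared:.1%} of the variation in {y}.")
    return (f"{x} shows no significant influence on {y} "
            f"(P={p:.3f}>={alpha}).")


# The P value 0.005 is the one reported for gender in the worked example;
# the regression inputs are made-up illustration values.
print(interpret_chi_square("preferred activity after work", "gender", 0.005))
print(interpret_regression("salary satisfaction", "overall satisfaction", 0.352, 0.001))
```

Because each method has its own template and its own indexes, adding support for another statistical method amounts to adding one more function of this shape.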
For example, still taking the research questionnaire about work satisfaction in Table 1 above as an example, the significance probability P value and the chi-square value (χ² value) corresponding to chi-square analysis are used to automatically interpret the statistical results obtained by chi-square analysis in Table 4 above, so as to obtain questionnaire interpretation information. Table 5 shows the statistical results of the chi-square analysis.
TABLE 5 statistics of chi-square analysis
[Table 5 appears as an image in the original publication and is not reproduced here.]
As can be seen from the statistics of chi-square analysis in table 5, there are differences between gender, school calendar and leisure activity type after work, and the following questionnaire interpretation information is automatically generated while table 5 is automatically generated:
using chi-square test to study which kind of activities are most preferred after work at ordinary times? For gender, the study has 2 different relationships, as can be seen from table 5 above: for "which kind of activities are most preferred after work at ordinary times? "samples were significant for gender, 2 items of school calendar (P <0.05), meaning" which activities were most preferred at work at ordinary times? "the sample shows difference for 2 items of gender and school calendar, and the specific comparison percentage is compared with difference.
For "which kind of activities are most preferred after work at ordinary times?", gender shows significance at the 0.01 level (χ² = 14.909, P = 0.005 < 0.01). Comparing the percentages, the proportion of males who chose watching TV, 15.11%, is significantly higher than the 5.49% of females, while the proportion of females who chose leisure entertainment such as KTV singing, 57.93%, is significantly higher than the 45.78% of males.
For "which kind of activities are most preferred after work at ordinary times?", education level shows significance at the 0.05 level (χ² = 18.263, P = 0.019 < 0.05). Comparing the percentages, the proportion of respondents with a high school education or below who chose "other", 43.75%, is significantly higher than the average level of 16.45%.
In summary: for "which kind of activities are most preferred after work at ordinary times?", samples of different gender and education level both show significant differences.
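For readers who want to see where a chi-square value and the compared percentages come from, the following sketch computes the Pearson chi-square statistic, its degrees of freedom, and column percentages for a cross-tabulation. The counts are hypothetical and are not the data behind Table 5; the function names are assumptions, and the P value itself would normally be obtained from a statistics library.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    `table` is a list of rows of observed counts (rows: answer options,
    columns: groups such as male/female). All row and column totals must be > 0.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def column_percentages(table):
    """Share of each answer option within each group, in percent."""
    col_totals = [sum(col) for col in zip(*table)]
    return [[100.0 * v / col_totals[j] for j, v in enumerate(row)] for row in table]


# Hypothetical counts: two answer options (rows) by two groups (columns).
observed = [[10, 20], [20, 10]]
stat, dof = chi_square(observed)
print(round(stat, 3), dof)
print(column_percentages(observed))
```

The percentages are computed within each group (column), which matches the way the interpretation text compares the proportion of one group choosing an option with that of another group.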
In a possible implementation manner, the automatically interpreting the statistical result according to the statistical indicator to obtain questionnaire interpretation information in this step may include: screening the statistical result to obtain a screened statistical result; and automatically interpreting the screened statistical result according to the one or more statistical indexes to obtain questionnaire interpretation information.
In the embodiment of the disclosure, in consideration of the fact that the data volume of the statistical results obtained in the above steps is usually large, but not every statistical result is valuable, the statistical results are automatically screened, and then the screened statistical results are automatically interpreted by using one or more statistical indexes to obtain questionnaire interpretation information, so that the requirements of non-professional analysts are met, and meanwhile, the computing resources are effectively saved.
In a possible implementation manner, the filtering the statistical result to obtain a filtered statistical result may include: screening the statistical result according to the evaluation index to obtain a screened statistical result; wherein the value of the evaluation index reflects the value of the statistical result.
In the embodiment of the present disclosure, the statistical result obtained above is automatically screened by using the evaluation index, so as to obtain the screened statistical result. The evaluation index may include common statistical indexes, such as: AIC values, BIC values, root mean square error, R-square values, P values, etc., which are not limited in this disclosure. In this way, statistical results that are valuable for statistical analysis can be automatically screened out.
In a possible implementation manner, the screening the statistical result according to the evaluation index to obtain a screened statistical result may include: screening out the optimal statistical result from the statistical results according to the evaluation indexes related to the research target to obtain the screened statistical result; wherein the evaluation indexes related to the research target comprise: at least one of an AIC value, a BIC value, an R-squared value, and a root mean square error; and/or screening out the optimal statistical result from the statistical results according to the evaluation indexes related to value evaluation to obtain the screened statistical result; wherein the evaluation indexes related to value evaluation comprise: a significance probability P value.
In the embodiment of the present disclosure, the automatic screening of the statistical results involves two aspects, namely "selection among statistical results for the same research target" and "filtering out valuable content"; accordingly, the evaluation indexes may include evaluation indexes related to the research target and evaluation indexes related to value evaluation.
For "selection among statistical results for the same research target", the optimal statistical result is screened out from the statistical results according to the evaluation indexes related to the research target. Illustratively, if the research target is the influence of X on Y, both linear regression analysis and logistic regression analysis may be used to achieve the research target, which raises the question of which statistical method yields the better statistical result. In this case, indexes such as the AIC value, BIC value, R-squared value, and root mean square error may be selected as evaluation indexes for screening, and the statistical result corresponding to either linear regression analysis or logistic regression analysis is screened out as the optimal statistical result.
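Selecting among statistical results for the same research target can be sketched as a comparison on one evaluation index. The function `pick_best`, the dictionary keys, and all numbers below are hypothetical; the sketch also assumes that the chosen index is comparable across the candidate methods, which in practice depends on how each model was fitted.

```python
# Indexes for which a smaller value indicates a better result.
LOWER_IS_BETTER = {"aic", "bic", "rmse"}


def pick_best(results, index="aic"):
    """Return the result that is best on the given evaluation index."""
    candidates = [r for r in results if index in r]
    if not candidates:
        raise ValueError(f"no result reports the index {index!r}")
    chooser = min if index in LOWER_IS_BETTER else max
    return chooser(candidates, key=lambda r: r[index])


# Hypothetical results for the same research target (influence of X on Y).
results = [
    {"method": "linear regression analysis", "aic": 120.5, "r2": 0.41},
    {"method": "logistic regression analysis", "aic": 98.2, "r2": 0.37},
]
print(pick_best(results, "aic")["method"])
print(pick_best(results, "r2")["method"])
```

As the two calls show, different evaluation indexes can favor different methods, so the index used for screening needs to be fixed per research target.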
For "filtering out valuable content", the statistical results are screened according to the evaluation indexes related to value evaluation. Many statistical results are correct but not meaningful; such results do not need further processing and can be filtered out by using a standard that is common in statistics as the evaluation index. Illustratively, the significance probability P value may be used as the evaluation index: a P value greater than 0.05 indicates no difference or influence relationship between the data of the statistical result, so that statistical result may be filtered out.
For example, taking a questionnaire about work satisfaction in table 1 as an example, the evaluation index is used to automatically screen each statistical result obtained in table 4 to obtain the screened statistical result.
For "selection among statistical results for the same research target", in Table 4 above the linear regression analysis (enter method) and the linear regression analysis (stepwise method) process the same content and have the same research target; only the processing modes differ. In this case, the F test value, root mean square error, AIC value, BIC value, R-squared value, and the like may be used as evaluation indexes to judge which statistical method yields the better statistical result, thereby obtaining the screened statistical result.
For "filtering out valuable content", the P value is used as the evaluation index. The analysis of variance in Table 4 above is actually performed 70 times, but many of the statistical results are not meaningful (for example, the P value is greater than 0.05); in this case, the statistical results with a P value greater than 0.05 can be filtered out. For the linear regression analysis in Table 4 above, a statistical result is filtered out if all of its P values are greater than 0.05 or if the model F test is not significant. For the chi-square analysis in Table 4 above, a statistical result is retained if its P value is less than 0.05, which indicates a difference.
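The per-method retention rules in the preceding paragraph can be sketched as a single predicate. The dictionary layout (`method`, `p`, `coef_p`, `f_test_p`) is a hypothetical representation of a statistical result, and treating a P value of exactly 0.05 as not significant is an assumption, since the text only addresses values above and below 0.05.

```python
def keep_result(result, alpha=0.05):
    """Decide whether a statistical result is worth interpreting."""
    method = result["method"]
    if method in ("analysis of variance", "chi-square analysis"):
        # Retained only when the P value indicates a difference.
        return result["p"] < alpha
    if method == "linear regression analysis":
        # Dropped when the model F test is not significant or when
        # no coefficient has a significant P value.
        model_ok = result["f_test_p"] < alpha
        any_coefficient = any(p < alpha for p in result["coef_p"])
        return model_ok and any_coefficient
    return True  # assumption: results of other methods are kept unchanged


# Hypothetical results.
results = [
    {"method": "chi-square analysis", "item": "gender", "p": 0.005},
    {"method": "analysis of variance", "item": "age", "p": 0.31},
    {"method": "linear regression analysis", "f_test_p": 0.20, "coef_p": [0.03]},
    {"method": "linear regression analysis", "f_test_p": 0.01, "coef_p": [0.6, 0.02]},
]
screened = [r for r in results if keep_result(r)]
print(len(screened))
```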
In one possible implementation, the method may further include: generating an analysis report of the questionnaire data to be processed according to the questionnaire interpretation information and the statistical result; and sending the analysis report to a user terminal so as to enable the user terminal to display the analysis report.
In the embodiment of the present disclosure, on the basis of the obtained questionnaire interpretation information and statistical results, an analysis report including the questionnaire interpretation information and the statistical results is automatically generated. Illustratively, the screened statistical results and the questionnaire interpretation information may be grouped by target statistical method, so that each statistical result is accompanied by an automatically generated text interpretation, which is convenient for a user to read and use.
On current questionnaire platforms, non-professionals can only use common summary functions, while professionals rely on professional software to mine the value of questionnaire data. In contrast, the embodiment of the disclosure automatically selects a target statistical method suitable for the questionnaire data, automatically screens the statistical results by using statistical indexes, automatically generates questionnaire interpretation information, and automatically generates an analysis report, so that an analysis report reflecting the intrinsic value of the questionnaire data is output and presented in plain, easily understood text. This solves the technical problem that non-professional analysts cannot independently mine the intrinsic value of questionnaire data, allows the data value to be mined conveniently and quickly, and provides non-professional analysts with a basis for data-driven scientific decision-making. Meanwhile, the analysis report can greatly reduce the mental labor of professional analysts and effectively improve their analysis efficiency.
The data of the analysis report may be stored in JSON format and fed back by the server to the user terminal for presentation, and the user terminal may present the analysis report according to a preset table (or chart).
It should be noted that, in the questionnaire data processing method in the embodiment of the present disclosure, mining may also be performed by using data mining algorithms, for example, by processing the questionnaire data to be processed with association rules, decision trees, random forests, and other algorithms and automatically generating an analysis report; the questionnaire data to be processed may also be automatically processed by means of a fixed template, and an analysis report may be automatically generated. For the relevant content of processing the questionnaire data to be processed, reference may be made to steps 101 to 103 above, which is not described herein again.
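A report grouped by target statistical method and serialized as JSON could look like the following sketch. The field names (`sections`, `results`, `interpretation`) and the function `build_report` are hypothetical; the disclosure does not specify the JSON layout.

```python
import json


def build_report(results, interpretations):
    """Group results and interpretation texts by method and serialize to JSON.

    `results` is a list of dicts that each carry a "method" key;
    `interpretations` is a list of (method, text) pairs.
    """
    sections = {}
    for result in results:
        section = sections.setdefault(
            result["method"], {"results": [], "interpretation": []}
        )
        section["results"].append(result)
    for method, text in interpretations:
        section = sections.setdefault(method, {"results": [], "interpretation": []})
        section["interpretation"].append(text)
    # ensure_ascii=False keeps symbols such as the chi character readable.
    return json.dumps({"sections": sections}, ensure_ascii=False, indent=2)


report_json = build_report(
    [{"method": "chi-square analysis", "item": "gender", "chi2": 14.909, "p": 0.005}],
    [("chi-square analysis", "Gender shows significance at the 0.01 level.")],
)
print(report_json)
```

A user terminal receiving this JSON can render each section's `results` as a table and place the `interpretation` text next to it.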
It should be noted that, although the above embodiments are taken as examples to describe a questionnaire data processing method, those skilled in the art can understand that the present disclosure should not be limited thereto. In fact, the user can flexibly set each implementation mode according to personal preference and/or actual application scene, as long as the technical scheme of the disclosure is met.
Thus, in the disclosed embodiments, one or more statistical methods applicable to the questionnaire data to be processed are determined; according to the data type of the questionnaire data to be processed, a target statistical method is selected from the one or more statistical methods to process the questionnaire data to be processed to obtain a statistical result; and the statistical result is automatically interpreted according to the statistical index to obtain questionnaire interpretation information, wherein the value of the statistical index reflects the characteristics of the statistical result. In this way, automatic processing of questionnaire data is realized, the value of the questionnaire data is mined conveniently and quickly, a basis for data-driven scientific decision-making is provided for non-professional analysts, and the need of non-professional analysts to deeply mine the value of questionnaire data is met; meanwhile, analysis assistance is provided for professional analysts, which can greatly reduce their mental labor and effectively improve their analysis efficiency.
Fig. 2 shows a block diagram of a questionnaire data processing apparatus according to an embodiment of the present disclosure. As shown in fig. 2, the apparatus may include: a statistical method determination module 41, configured to determine one or more statistical methods applicable to the questionnaire data to be processed; the data processing module 42 is configured to select a target statistical method from the one or more statistical methods to process the questionnaire data to be processed according to the data type of the questionnaire data to be processed, so as to obtain a statistical result; an automatic interpretation module 43, configured to automatically interpret the statistical result according to the statistical indicator to obtain questionnaire interpretation information; wherein the value of the statistical indicator reflects the characteristics of the statistical result.
In a possible implementation manner, the statistical method determining module 41 is further configured to: and inputting the questionnaire data to be processed into a trained neural network to obtain the one or more statistical methods.
In a possible implementation manner, the data processing module 42 is further configured to: selecting a target statistical method corresponding to the data type of the questionnaire data to be processed from the one or more statistical methods according to the corresponding relation between the data type and the statistical method; and processing the questionnaire data to be processed according to the target statistical method to obtain a statistical result.
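Selecting a target statistical method through a correspondence between data types and statistical methods can be sketched as a table lookup. The correspondence below reflects common statistical practice and is an assumption made for illustration; the correspondence actually used by the disclosure is not reproduced in this passage, and the names `CORRESPONDENCE` and `select_target_method` are hypothetical.

```python
# Assumed correspondence: (type of X, type of Y) -> statistical method.
CORRESPONDENCE = {
    ("categorical", "categorical"): "chi-square analysis",
    ("categorical", "quantitative"): "analysis of variance",
    ("quantitative", "quantitative"): "linear regression analysis",
}


def select_target_method(x_type, y_type, candidates):
    """Pick the candidate method that matches the data types, if any."""
    method = CORRESPONDENCE.get((x_type, y_type))
    return method if method in candidates else None


candidates = ["chi-square analysis", "linear regression analysis"]
print(select_target_method("categorical", "categorical", candidates))
print(select_target_method("categorical", "quantitative", candidates))
```

The lookup is restricted to the candidate list so that only methods already determined to be applicable to the questionnaire data can become the target statistical method.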
In a possible implementation manner, the automatic interpretation module 43 is further configured to: screening the statistical result to obtain a screened statistical result; and automatically interpreting the screened statistical result according to the one or more statistical indexes to obtain questionnaire interpretation information.
In a possible implementation manner, the automatic interpretation module 43 is further configured to: screening the statistical result according to the evaluation index to obtain a screened statistical result; wherein the value of the evaluation index reflects the value of the statistical result.
In one possible implementation manner, the evaluation index includes: at least one of an Akaike information criterion (AIC) value, a Bayesian information criterion (BIC) value, a coefficient of determination (R-squared) value, a root mean square error, and a significance probability P value; the statistical index includes: at least one of an R-squared value, a P value, a mean, a percentage, and a standard deviation.
In one possible implementation, the apparatus further includes: the report module is used for generating an analysis report of the questionnaire data to be processed according to the questionnaire interpretation information and the statistical result; and sending the analysis report to a user terminal so as to enable the user terminal to display the analysis report.
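The way the modules of the apparatus hand data to one another can be sketched as follows. The class `QuestionnaireApparatus` and its field names are hypothetical, and each module is represented by a plain callable so that the sketch stays independent of any particular statistical library.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QuestionnaireApparatus:
    determine_methods: Callable[[dict], List[str]]        # statistical method determination module
    process: Callable[[dict, List[str]], List[dict]]      # data processing module
    interpret: Callable[[List[dict]], List[str]]          # automatic interpretation module
    send_report: Callable[[dict], None]                   # report module (delivery to the terminal)

    def run(self, data: dict) -> dict:
        methods = self.determine_methods(data)
        results = self.process(data, methods)
        texts = self.interpret(results)
        report = {"results": results, "interpretation": texts}
        self.send_report(report)
        return report


# Stub modules standing in for real implementations.
sent = []
apparatus = QuestionnaireApparatus(
    determine_methods=lambda data: ["chi-square analysis"],
    process=lambda data, methods: [{"method": methods[0], "p": 0.005}],
    interpret=lambda results: [f"{r['method']}: significant" for r in results],
    send_report=sent.append,
)
report = apparatus.run({"answers": []})
print(report["interpretation"])
```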
It should be noted that, although the above embodiments are described as examples of the questionnaire data processing apparatus, those skilled in the art will understand that the present disclosure should not be limited thereto. In fact, the user can flexibly set each implementation mode according to personal preference and/or actual application scene, as long as the technical scheme of the disclosure is met.
Thus, in the disclosed embodiments, one or more statistical methods applicable to the questionnaire data to be processed are determined; according to the data type of the questionnaire data to be processed, a target statistical method is selected from the one or more statistical methods to process the questionnaire data to be processed to obtain a statistical result; and the statistical result is automatically interpreted according to the statistical index to obtain questionnaire interpretation information, wherein the value of the statistical index reflects the characteristics of the statistical result. In this way, automatic processing of questionnaire data is realized, the value of the questionnaire data is mined conveniently and quickly, a basis for data-driven scientific decision-making is provided for non-professional analysts, and the need of non-professional analysts to deeply mine the value of questionnaire data is met; meanwhile, analysis assistance is provided for professional analysts, which can greatly reduce their mental labor and effectively improve their analysis efficiency.
The embodiment of the present disclosure further provides a questionnaire data processing apparatus, including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to perform the above method.
The disclosed embodiments also provide a non-transitory computer-readable storage medium having stored thereon computer program instructions, wherein the computer program instructions, when executed by a processor, implement the above-described method.
Fig. 3 shows a block diagram of a questionnaire data processing apparatus 1900 according to an embodiment of the present disclosure. For example, the apparatus 1900 may be provided as a server. Referring to Fig. 3, the apparatus 1900 includes a processing component 1922, which further includes one or more processors, and memory resources, represented by memory 1932, for storing instructions executable by the processing component 1922, such as application programs. The application programs stored in memory 1932 may include one or more modules, each of which corresponds to a set of instructions. Further, the processing component 1922 is configured to execute the instructions to perform the above-described method.
The apparatus 1900 may also include a power component 1926 configured to perform power management of the apparatus 1900, a wired or wireless network interface 1950 configured to connect the apparatus 1900 to a network, and an input/output (I/O) interface 1958. The apparatus 1900 may operate based on an operating system stored in memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium, such as the memory 1932, is also provided that includes computer program instructions executable by the processing component 1922 of the apparatus 1900 to perform the above-described methods.
The present disclosure may be systems, methods, and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, the electronic circuitry that can execute the computer-readable program instructions implements aspects of the present disclosure by utilizing the state information of the computer-readable program instructions to personalize the electronic circuitry, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA).
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. A questionnaire data processing method characterized by comprising:
determining one or more statistical methods applicable to the questionnaire data to be processed;
according to the data type of the questionnaire data to be processed, selecting a target statistical method from the one or more statistical methods to process the questionnaire data to be processed to obtain a statistical result;
automatically interpreting the statistical result according to a statistical index to obtain questionnaire interpretation information; wherein the value of the statistical index reflects the characteristics of the statistical result.
2. The method of claim 1, wherein determining one or more statistical methods applicable to the questionnaire data to be processed comprises:
and inputting the questionnaire data to be processed into a trained neural network to obtain the one or more statistical methods.
3. The method according to claim 1, wherein the selecting a target statistical method from the one or more statistical methods to process the questionnaire data to be processed according to the data type of the questionnaire data to be processed to obtain a statistical result comprises:
selecting a target statistical method corresponding to the data type of the questionnaire data to be processed from the one or more statistical methods according to the corresponding relation between the data type and the statistical method;
and processing the questionnaire data to be processed according to the target statistical method to obtain a statistical result.
4. The method according to claim 1, wherein automatically interpreting the statistical result according to a statistical index to obtain questionnaire interpretation information comprises:
screening the statistical result to obtain a screened statistical result;
and automatically interpreting the screened statistical result according to the one or more statistical indexes to obtain questionnaire interpretation information.
5. The method of claim 4, wherein the screening the statistical result to obtain the screened statistical result comprises:
screening the statistical result according to the evaluation index to obtain a screened statistical result; wherein the value of the evaluation index reflects the value of the statistical result.
6. The method according to claim 5, wherein the evaluation index includes: at least one of an Akaike information criterion (AIC) value, a Bayesian information criterion (BIC) value, a coefficient of determination (R-squared) value, a root mean square error, and a significance probability P value;
the statistical index includes: at least one of an R-squared value, a P value, a mean, a percentage, and a standard deviation.
7. The method of claim 1, further comprising:
generating an analysis report of the questionnaire data to be processed according to the questionnaire interpretation information and the statistical result;
and sending the analysis report to a user terminal so as to enable the user terminal to display the analysis report.
8. A questionnaire data processing apparatus characterized by comprising:
a statistical method determination module for determining one or more statistical methods applicable to the questionnaire data to be processed;
the data processing module is used for selecting a target statistical method from the one or more statistical methods to process the questionnaire data to be processed according to the data type of the questionnaire data to be processed to obtain a statistical result;
the automatic interpretation module is used for automatically interpreting the statistical result according to the statistical index to obtain questionnaire interpretation information; wherein the value of the statistical indicator reflects the characteristics of the statistical result.
9. A questionnaire data processing apparatus characterized by comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the method of any one of claims 1 to 7 when executing the memory-stored executable instructions.
10. A non-transitory computer readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the method of any of claims 1 to 7.
CN202011196272.5A 2020-10-30 2020-10-30 Questionnaire data processing method and device and storage medium Pending CN112257418A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011196272.5A CN112257418A (en) 2020-10-30 2020-10-30 Questionnaire data processing method and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011196272.5A CN112257418A (en) 2020-10-30 2020-10-30 Questionnaire data processing method and device and storage medium

Publications (1)

Publication Number Publication Date
CN112257418A true CN112257418A (en) 2021-01-22

Family

ID=74267573

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011196272.5A Pending CN112257418A (en) 2020-10-30 2020-10-30 Questionnaire data processing method and device and storage medium

Country Status (1)

Country Link
CN (1) CN112257418A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113010742A (en) * 2021-03-01 2021-06-22 歌尔微电子股份有限公司 Data processing method, device, equipment and medium
CN113781123A (en) * 2021-09-15 2021-12-10 北京有竹居网络技术有限公司 Questionnaire data processing method and device, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109145028A (en) * 2017-06-27 2019-01-04 中国石油化工股份有限公司 The statistical analysis system and method for a kind of refinery checking maintenance contractor security capabilities
CN110069550A (en) * 2019-04-23 2019-07-30 深圳市承儒科技有限公司 One kind is based on the associated statistical analysis technique of education and cloud platform system
CN110990461A (en) * 2019-12-12 2020-04-10 国家电网有限公司大数据中心 Big data analysis model algorithm model selection method and device, electronic equipment and medium
CN111144902A (en) * 2019-12-13 2020-05-12 深圳中兴飞贷金融科技有限公司 Questionnaire data processing method and device, storage medium and electronic equipment

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113010742A (en) * 2021-03-01 2021-06-22 歌尔微电子股份有限公司 Data processing method, device, equipment and medium
CN113010742B (en) * 2021-03-01 2023-03-21 歌尔微电子股份有限公司 Data processing method, device, equipment and medium
CN113781123A (en) * 2021-09-15 2021-12-10 北京有竹居网络技术有限公司 Questionnaire data processing method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
US20150256475A1 (en) Systems and methods for designing an optimized infrastructure for executing computing processes
US20140053069A1 (en) Identifying and mitigating risks in contract document using text analysis with custom high risk clause dictionary
EP2618296A1 (en) Social media data analysis system and method
EP2648152A1 (en) Data solutions system
US20200401580A1 (en) Interaction between visualizations and other data controls in an information system by matching attributes in different datasets
US9558245B1 (en) Automatic discovery of relevant data in massive datasets
CN111680165B (en) Information matching method and device, readable storage medium and electronic equipment
CN112257418A (en) Questionnaire data processing method and device and storage medium
US10885055B2 (en) Automated data enrichment and signal detection for exploring dataset values
CN108829716B (en) Conference agenda generation method and device for conference to be held
US10620783B2 (en) Using social data to assist editors in addressing reviewer feedback in a document review workflow
US10057358B2 (en) Identifying and mapping emojis
US20130259362A1 (en) Attribute cloud
AU2013202482A1 (en) Determining local calculation configurations in an accounting application through user contribution
CN109447694B (en) User characteristic analysis method and system
JP2021500639A (en) Prediction engine for multi-step pattern discovery and visual analysis recommendations
CN111435369A (en) Music recommendation method, device, terminal and storage medium
US9785404B2 (en) Method and system for analyzing data in artifacts and creating a modifiable data network
US11763070B2 (en) Method and system for labeling and organizing data for summarizing and referencing content via a communication network
US20160162801A1 (en) Quick Path to Train, Score, and Operationalize a Machine Learning Project
US20150170068A1 (en) Determining analysis recommendations based on data analysis context
CN115393034A (en) Method for carrying out risk identification on enterprise account based on natural language processing technology
Tran et al. Detecting Filter Bubbles in Ongoing News Stories.
CN114330720A (en) Knowledge graph construction method and device for cloud computing and storage medium
CN114090601A (en) Data screening method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination