Disclosure of Invention
The invention aims to provide a data analysis management system and method based on an artificial intelligence cloud platform, so as to solve the problems set out in the background art.
In order to solve the above technical problems, the invention provides the following technical solution: a data analysis management system based on an artificial intelligence cloud platform, comprising a data acquisition module, a data processing module, an operation management module and a data storage module.
The data acquisition module is used for acquiring the received application file information, the evaluation account information and the evaluation process information. The data processing module is used for processing the application file information: it classifies the application files by application type, calculates the text similarity of each application file and dispatches the application files to different evaluation accounts; it also comprehensively analyzes the evaluation account information and the evaluation process information to calculate the reliability of each application file. The operation management module adjusts the state of the corresponding application file according to the reliability and determines the file flow direction from the file state. The data storage module is used for backing up and storing all of the information.
The data acquisition module comprises a file information acquisition unit, an account information acquisition unit and an evaluation information acquisition unit.
The file information acquisition unit is used for acquiring the type information, content information and state information of all application files. The type information refers to the application type of the application file, the content information refers to the main text of the application file, and the state information refers to the current state of the application file, which is one of the to-be-evaluated state, the evaluated state and the re-evaluation state.
The system receives a large number of application files of different types in real time. Some have not yet been evaluated; others have been evaluated and must be re-evaluated. Each application file contains a large amount of text, and the file format differs between types.
The account information acquisition unit is used for acquiring the historical misjudgment rate of each evaluation account, the evaluation account state, and the type and similarity of the application file that the account evaluated last. The historical misjudgment rate refers to the ratio of the number of evaluation errors among the application files historically evaluated by the evaluation account to the total number of application files it has evaluated. The evaluation account state is either an idle state or a busy state.
The evaluation information acquisition unit is used for acquiring the evaluation time and the evaluation result of each evaluated application file. The evaluation time refers to the time the evaluation account takes from the beginning of the evaluation to the end of the evaluation, and the evaluation result is either pass or reject.
An evaluator uses an evaluation account to perform the evaluation work, and the evaluation result is either pass or reject. Because the evaluation link is one link in the whole business process, the evaluation result affects other links whether it is pass or reject. When the evaluation result is pass, the application file flows into the next link; when the evaluation result is reject, the application file flows back to the previous link. When the next link or the previous link reports that this link has misjudged a file, the system automatically classifies and dispatches the corresponding application file for re-evaluation. If the final evaluation result is inconsistent with the earlier evaluation result, the earlier evaluation is counted as an error. For each evaluation account, the ratio of the number of evaluation errors among its historically evaluated application files to the total number of application files it has evaluated is recorded as the historical misjudgment rate. The higher the historical misjudgment rate, the more likely the evaluation account is to misjudge and the less reliable its evaluation results are.
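The historical misjudgment rate described above can be illustrated with a minimal Python sketch. The record layout (`EvaluationRecord`, `initial_result`, `final_result`) and the choice to return 0 for an account with no history are assumptions made for illustration only; the text does not prescribe a data structure.

```python
from dataclasses import dataclass


@dataclass
class EvaluationRecord:
    account: str         # evaluation account identifier (hypothetical field)
    initial_result: str  # "pass" or "reject" given by this account
    final_result: str    # final result after any re-evaluation


def historical_misjudgment_rate(records, account):
    """Ratio of evaluation errors to total evaluations for one account."""
    own = [r for r in records if r.account == account]
    if not own:
        return 0.0  # assumption: an account with no history has rate 0
    # An evaluation is an error when the final result contradicts it.
    errors = sum(1 for r in own if r.initial_result != r.final_result)
    return errors / len(own)


records = [
    EvaluationRecord("A", "pass", "pass"),
    EvaluationRecord("A", "pass", "reject"),
    EvaluationRecord("A", "reject", "reject"),
    EvaluationRecord("A", "pass", "pass"),
    EvaluationRecord("B", "reject", "pass"),
]
rate_a = historical_misjudgment_rate(records, "A")  # 1 error out of 4
rate_b = historical_misjudgment_rate(records, "B")  # 1 error out of 1
```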
The data processing module comprises a file classification and dispatch unit and an evaluation behavior analysis unit.
The file classification and dispatch unit is used for classifying all application files that are in the to-be-evaluated state or the re-evaluation state and dispatching them to evaluation accounts in the idle state for evaluation. First, all application files are classified according to their type information, and application files of the same type form one class. Second, the text similarity between each application file in a class and the reference file corresponding to that type is calculated, and the resulting similarity is bound to the application file. Finally, the application file to be dispatched to each evaluation account is determined from the type and the similarity of the application file that the account evaluated last.
The type or the similarity of a dispatched application file should differ from that of the file last evaluated by the receiving evaluation account. When every idle evaluation account can be given an application file of a different type, the files are dispatched directly. When some idle evaluation accounts cannot be given an application file of a different type, each of those accounts is preferentially given the application file in the corresponding class whose similarity differs most from that of the file the account evaluated last, and the remaining evaluation accounts are given application files of different types as usual.
The evaluation behavior analysis unit is used for calculating the reliability of the evaluation process. The historical misjudgment rate of the evaluation account, the evaluation time and the evaluation result are substituted into a formula to obtain the reliability, which is bound to the corresponding application file.
The reliability indicates how trustworthy the evaluation behavior is: the higher the reliability, the more trustworthy the evaluation behavior; the lower the reliability, the less trustworthy it is.
The operation management module comprises a state management unit and a flow direction management unit.
The state management unit is used for adjusting the states of the evaluation accounts and the application files. The evaluation account states include an idle state and a busy state. The idle state means that the evaluation account is not performing evaluation work, and the busy state means that the evaluation account is performing evaluation work. The application file states include a to-be-evaluated state, an evaluated state and a re-evaluation state. The to-be-evaluated state means that the application file has not been evaluated; the evaluated state means that the application file has been evaluated and its reliability is not in the reliability abnormal interval; the re-evaluation state means that the application file has been evaluated and its reliability is in the reliability abnormal interval.
While an evaluation account is evaluating an application file, the system automatically sets the state of the evaluation account to the busy state, in which the system dispatches no further tasks to it. When the evaluation account finishes the evaluation, the system automatically sets its state to the idle state, in which the system can dispatch tasks to it.
When an application file has been dispatched for the first time and its evaluation is not yet complete, the system automatically sets the state of the application file to the to-be-evaluated state. When the evaluation is finished, the system judges whether the reliability of the application file is in the reliability abnormal interval: if it is, the application file state is set to the re-evaluation state; if it is not, the application file state is set to the evaluated state.
The flow direction management unit is used for determining the flow direction according to the state of the application file. When the application file is in the evaluated state, the evaluation ends and the evaluation result is sent to the next link. When the application file is in the re-evaluation state, the evaluation is not finished and re-evaluation is needed; the application file is sent back to the data processing module, where it is classified and dispatched again for evaluation.
The data storage module is used for storing the application file information, the evaluation account information, the evaluation process information and the state adjustment information in a database for subsequent tracing.
A data analysis management method based on an artificial intelligence cloud platform comprises the following steps:
S1, classifying all application files and then dispatching them to the evaluation accounts;
S2, evaluating the application files by the evaluation accounts, and collecting evaluation information;
S3, calculating the reliability from the evaluation information, and adjusting the state;
S4, determining a flow direction according to the state of the application file.
In S1, the classification and dispatch steps for the application file are as follows:
S101, collecting the type information and content information of all application files; the type information refers to the application type of the application file, and the content information refers to the main text of the application file.
S102, acquiring the type information of all application files, and dividing application files of the same type into the same class, wherein each class contains one or more application files.
S103, acquiring the main text of every application file from the content information, and at the same time acquiring the main text of the reference file corresponding to each class. The similarity between the main text of each application file and the main text of the reference file of its class is calculated, and the resulting similarity is bound to the application file. The steps are as follows:
S103-1, performing word segmentation on the text content, and converting each text into a word list.
S103-2, removing repeated words from each word list to obtain two different word sets.
S103-3, merging the two word sets to obtain a total word set.
S103-4, constructing a vector for each text over the total word set, wherein each dimension of the vector represents the number of times the corresponding word appears in that text. The similarity between the two vectors is calculated using the formula:
cos(θ) = (a · b) / (|a| × |b|)
where cos(θ) is the similarity of the application file, a and b are the word-count vectors of the application file and the reference file, |a| represents the modulus of vector a, |b| represents the modulus of vector b, and a · b represents the inner product of vector a and vector b.
S104, acquiring the type and the similarity of the file last evaluated by each evaluation account, finding application files of a different type among the application files, and judging whether every evaluation account can be given an application file of a different type. If so, the files are dispatched directly. If not, the types of the available application files cannot satisfy the type requirement of every evaluation account, and some evaluation accounts must be given an application file of the same type as the file they evaluated last. In this case, each of those evaluation accounts is preferentially given the application file in the corresponding class whose similarity differs most from that of the file the account evaluated last, and the remaining evaluation accounts are given application files of different types as usual. The formula is as follows:
x=argmax{|n-s|:s∈S}
where argmax selects the similarity that differs most from the similarity of the file last evaluated by the evaluation account, x represents the application file corresponding to that similarity, n represents the similarity of the file last evaluated by the evaluation account, s represents an element of S, and S represents the set of all similarities in the corresponding class.
In S2, the evaluation information refers to the information generated by all evaluation accounts during the evaluation process, including the historical misjudgment rate of each evaluation account, the evaluation time and the evaluation result. The historical misjudgment rate refers to the ratio of the number of evaluation errors among the application files historically evaluated by the evaluation account to the total number of application files it has evaluated. The evaluation time refers to the time the evaluation account takes from the beginning of the evaluation to the end of the evaluation. The evaluation result is either pass or reject.
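Steps S103-1 to S103-4 can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, and whitespace splitting stands in for real word segmentation, which the text does not specify.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace splitting stands in for real word segmentation.
    return text.lower().split()


def cosine_similarity(text_a, text_b):
    """Bag-of-words cosine similarity between two texts (S103-1 to S103-4)."""
    words_a, words_b = tokenize(text_a), tokenize(text_b)   # S103-1
    # S103-2 and S103-3: deduplicate each list, then merge into one word set.
    vocabulary = sorted(set(words_a) | set(words_b))
    counts_a, counts_b = Counter(words_a), Counter(words_b)
    # S103-4: one count vector per text, one dimension per vocabulary word.
    vec_a = [counts_a[w] for w in vocabulary]
    vec_b = [counts_b[w] for w in vocabulary]
    dot = sum(x * y for x, y in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(x * x for x in vec_a))
    norm_b = math.sqrt(sum(y * y for y in vec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an empty text has similarity 0
    return dot / (norm_a * norm_b)


similarity = cosine_similarity("loan loan form", "loan form")
```

Here the vocabulary is {form, loan}, the vectors are (1, 2) and (1, 1), and the result is 3 / √10.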
In S2 and S3, the reliability calculation and file status setting steps are as follows:
S301, acquiring the historical misjudgment rates of all evaluation accounts that are performing evaluation work, changing the states of these evaluation accounts to the busy state, and marking them.
S302, when a marked evaluation account is detected to have finished its evaluation work, changing the state of that evaluation account to the idle state, removing the mark, and collecting the evaluation time and the evaluation result.
S303, substituting the historical misjudgment rate of the evaluation account, the evaluation time and the evaluation result into a formula to obtain the reliability, which is bound to the corresponding application file. The reliability calculation formula is as follows:
where KKD is the reliability of the application file; p is the type of the evaluation result, taking the value -1 when the result is reject and 1 when the result is pass; e is the influence weight of the historical misjudgment rate; h is the historical misjudgment rate of the evaluation account; f is the evaluation time influence coefficient, whose value is greater than 1; t is the evaluation time; and k is the reliability of the last evaluation.
S304, judging whether the reliability of the application file is in the reliability abnormal interval; if so, changing the state of the application file from the to-be-evaluated state to the re-evaluation state; if not, changing it from the to-be-evaluated state to the evaluated state. The judgment formula is as follows:
result = re-evaluation state, if K_min ≤ KKD ≤ K_max; result = evaluated state, otherwise
where result is the application file state adjustment result, K_min is the minimum value of the reliability abnormal interval, K_max is the maximum value of the reliability abnormal interval, and KKD is the reliability of the application file.
In S4, the states of the application file include the evaluated state and the re-evaluation state. The evaluated state refers to an application file that has been evaluated and whose reliability meets the requirement, and the re-evaluation state refers to an application file that has been evaluated and whose reliability does not meet the requirement.
When the application file is in the evaluated state, the evaluation ends and the evaluation result is sent to the next link. When the application file is in the re-evaluation state, re-evaluation is needed; the process returns to step S1, and the file is classified and dispatched again for evaluation.
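The state adjustment in S304 amounts to an interval check, sketched below in Python. The function name and the state strings are hypothetical, and treating the abnormal interval as closed at both ends is an assumption; the reliability KKD is taken as an input because its calculation is defined in S303.

```python
def adjust_file_state(kkd, k_min, k_max):
    """Return the new application file state for a reliability value."""
    # Assumption: the abnormal interval [k_min, k_max] includes both ends.
    if k_min <= kkd <= k_max:
        return "re-evaluation"  # reliability is abnormal, evaluate again
    return "evaluated"          # reliability is acceptable, evaluation ends


state_inside = adjust_file_state(10.0, -30.0, 30.0)
state_outside = adjust_file_state(45.0, -30.0, 30.0)
```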
Compared with the prior art, the invention has the following beneficial effects:
1. In the dispatch stage, the invention classifies each application file and calculates its text similarity, and dispatches to each evaluation account an application file whose type or text similarity differs from that of the file the account evaluated last. This prevents an evaluation account from repeatedly evaluating application files of the same type or with highly similar text, and reduces the probability of misjudgment caused by repetitive work.
2. The invention judges how reliable an evaluation is by analyzing the historical misjudgment rate of the evaluation account and the time taken by the current evaluation, and determines whether the evaluation is accurate by checking whether the reliability falls in the reliability abnormal interval. When the reliability is abnormal, the application file is automatically dispatched to another evaluation account for re-evaluation; having several evaluation accounts evaluate the same application file effectively improves the accuracy of the evaluation.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the present invention.
Referring to Fig. 1, the present invention provides the following technical solution: a data analysis management system based on an artificial intelligence cloud platform, comprising a data acquisition module, a data processing module, an operation management module and a data storage module.
The data acquisition module is used for acquiring the received application file information, the evaluation account information and the evaluation process information. The data processing module is used for processing the application file information: it classifies the application files by application type, calculates the text similarity of each application file and dispatches the application files to different evaluation accounts; it also comprehensively analyzes the evaluation account information and the evaluation process information to calculate the reliability of each application file. The operation management module adjusts the state of the corresponding application file according to the reliability and determines the file flow direction from the file state. The data storage module is used for backing up and storing all of the information.
The data acquisition module comprises a file information acquisition unit, an account information acquisition unit and an evaluation information acquisition unit.
The file information acquisition unit is used for acquiring the type information, content information and state information of all application files. The type information refers to the application type of the application file, the content information refers to the main text of the application file, and the state information refers to the current state of the application file, which is one of the to-be-evaluated state, the evaluated state and the re-evaluation state.
The system receives a large number of application files of different types in real time. Some have not yet been evaluated; others have been evaluated and must be re-evaluated. Each application file contains a large amount of text, and the file format differs between types.
The account information acquisition unit is used for acquiring the historical misjudgment rate of each evaluation account, the evaluation account state, and the type and similarity of the application file that the account evaluated last. The historical misjudgment rate refers to the ratio of the number of evaluation errors among the application files historically evaluated by the evaluation account to the total number of application files it has evaluated. The evaluation account state is either an idle state or a busy state.
The evaluation information acquisition unit is used for acquiring the evaluation time and the evaluation result of each evaluated application file. The evaluation time refers to the time the evaluation account takes from the beginning of the evaluation to the end of the evaluation, and the evaluation result is either pass or reject.
An evaluator uses an evaluation account to perform the evaluation work, and the evaluation result is either pass or reject. Because the evaluation link is one link in the whole business process, the evaluation result affects other links whether it is pass or reject. When the evaluation result is pass, the application file flows into the next link; when the evaluation result is reject, the application file flows back to the previous link. When the next link or the previous link reports that this link has misjudged a file, the system automatically classifies and dispatches the corresponding application file for re-evaluation. If the final evaluation result is inconsistent with the earlier evaluation result, the earlier evaluation is counted as an error. For each evaluation account, the ratio of the number of evaluation errors among its historically evaluated application files to the total number of application files it has evaluated is recorded as the historical misjudgment rate. The higher the historical misjudgment rate, the more likely the evaluation account is to misjudge and the less reliable its evaluation results are.
The data processing module comprises a file classification and dispatch unit and an evaluation behavior analysis unit.
The file classification and dispatch unit is used for classifying all application files that are in the to-be-evaluated state or the re-evaluation state and dispatching them to evaluation accounts in the idle state for evaluation. First, all application files are classified according to their type information, and application files of the same type form one class. Second, the text similarity between each application file in a class and the reference file corresponding to that type is calculated, and the resulting similarity is bound to the application file. Finally, the application file to be dispatched to each evaluation account is determined from the type and the similarity of the application file that the account evaluated last.
The type or the similarity of a dispatched application file should differ from that of the file last evaluated by the receiving evaluation account. When every idle evaluation account can be given an application file of a different type, the files are dispatched directly. When some idle evaluation accounts cannot be given an application file of a different type, each of those accounts is preferentially given the application file in the corresponding class whose similarity differs most from that of the file the account evaluated last, and the remaining evaluation accounts are given application files of different types as usual.
The evaluation behavior analysis unit is used for calculating the reliability of the evaluation process. The historical misjudgment rate of the evaluation account, the evaluation time and the evaluation result are substituted into a formula to obtain the reliability, which is bound to the corresponding application file.
The reliability indicates how trustworthy the evaluation behavior is: the higher the reliability, the more trustworthy the evaluation behavior; the lower the reliability, the less trustworthy it is.
The operation management module comprises a state management unit and a flow direction management unit.
The state management unit is used for adjusting the states of the evaluation accounts and the application files. The evaluation account states include an idle state and a busy state. The idle state means that the evaluation account is not performing evaluation work, and the busy state means that the evaluation account is performing evaluation work. The application file states include a to-be-evaluated state, an evaluated state and a re-evaluation state. The to-be-evaluated state means that the application file has not been evaluated; the evaluated state means that the application file has been evaluated and its reliability is not in the reliability abnormal interval; the re-evaluation state means that the application file has been evaluated and its reliability is in the reliability abnormal interval.
While an evaluation account is evaluating an application file, the system automatically sets the state of the evaluation account to the busy state, in which the system dispatches no further tasks to it. When the evaluation account finishes the evaluation, the system automatically sets its state to the idle state, in which the system can dispatch tasks to it.
When an application file has been dispatched for the first time and its evaluation is not yet complete, the system automatically sets the state of the application file to the to-be-evaluated state. When the evaluation is finished, the system judges whether the reliability of the application file is in the reliability abnormal interval: if it is, the application file state is set to the re-evaluation state; if it is not, the application file state is set to the evaluated state.
The flow direction management unit is used for determining the flow direction according to the state of the application file. When the application file is in the evaluated state, the evaluation ends and the evaluation result is sent to the next link. When the application file is in the re-evaluation state, the evaluation is not finished and re-evaluation is needed; the application file is sent back to the data processing module, where it is classified and dispatched again for evaluation.
The data storage module is used for storing the application file information, the evaluation account information, the evaluation process information and the state adjustment information in a database for subsequent tracing.
Referring to Fig. 2, the present invention further provides a data analysis management method based on an artificial intelligence cloud platform, comprising the following steps:
S1, classifying all application files and then dispatching them to the evaluation accounts;
S2, evaluating the application files by the evaluation accounts, and collecting evaluation information;
S3, calculating the reliability from the evaluation information, and adjusting the state;
S4, determining a flow direction according to the state of the application file.
In S1, the classification and dispatch steps for the application file are as follows:
S101, collecting the type information and content information of all application files; the type information refers to the application type of the application file, and the content information refers to the main text of the application file.
S102, acquiring the type information of all application files, and dividing application files of the same type into the same class, wherein each class contains one or more application files.
S103, acquiring the main text of every application file from the content information, and at the same time acquiring the main text of the reference file corresponding to each class. The similarity between the main text of each application file and the main text of the reference file of its class is calculated, and the resulting similarity is bound to the application file. The steps are as follows:
S103-1, performing word segmentation on the text content, and converting each text into a word list.
S103-2, removing repeated words from each word list to obtain two different word sets.
S103-3, merging the two word sets to obtain a total word set.
S103-4, constructing a vector for each text over the total word set, wherein each dimension of the vector represents the number of times the corresponding word appears in that text. The similarity between the two vectors is calculated using the formula:
cos(θ) = (a · b) / (|a| × |b|)
where cos(θ) is the similarity of the application file, a and b are the word-count vectors of the application file and the reference file, |a| represents the modulus of vector a, |b| represents the modulus of vector b, and a · b represents the inner product of vector a and vector b.
S104, acquiring the type and the similarity of the file last evaluated by each evaluation account, finding application files of a different type among the application files, and judging whether every evaluation account can be given an application file of a different type. If so, the files are dispatched directly. If not, the types of the available application files cannot satisfy the type requirement of every evaluation account, and some evaluation accounts must be given an application file of the same type as the file they evaluated last. In this case, each of those evaluation accounts is preferentially given the application file in the corresponding class whose similarity differs most from that of the file the account evaluated last, and the remaining evaluation accounts are given application files of different types as usual. The formula is as follows:
x=argmax{|n-s|:s∈S}
where argmax selects the similarity that differs most from the similarity of the file last evaluated by the evaluation account, x represents the application file corresponding to that similarity, n represents the similarity of the file last evaluated by the evaluation account, s represents an element of S, and S represents the set of all similarities in the corresponding class.
In S2, the evaluation information refers to the information generated by all evaluation accounts during the evaluation process, including the historical misjudgment rate of each evaluation account, the evaluation time and the evaluation result. The historical misjudgment rate refers to the ratio of the number of evaluation errors among the application files historically evaluated by the evaluation account to the total number of application files it has evaluated. The evaluation time refers to the time the evaluation account takes from the beginning of the evaluation to the end of the evaluation. The evaluation result is either pass or reject.
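The dispatch rule of S104 can be sketched in Python as follows. The class and function names are hypothetical, and the sketch makes two simplifying assumptions that the text does not state: accounts are served greedily one after another rather than checked jointly, and when several files of a different type are available the first one in the list is chosen.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApplicationFile:
    name: str
    file_type: str
    similarity: float  # similarity to the reference file of its type


@dataclass
class Account:
    name: str
    last_type: Optional[str] = None  # type of the file evaluated last
    last_similarity: float = 0.0     # n in the formula


def pick_file(account, pending):
    """Choose one pending file for an idle account, or None if none remain."""
    if not pending:
        return None
    different = [f for f in pending if f.file_type != account.last_type]
    if different:
        return different[0]  # assumption: first come, first served
    # Fallback, all remaining files share the last type: x = argmax{|n-s|: s in S}
    return max(pending, key=lambda f: abs(account.last_similarity - f.similarity))


def dispatch(accounts, pending):
    """Greedily assign one pending file to each idle account."""
    pending = list(pending)  # work on a copy
    assignments = {}
    for account in accounts:
        chosen = pick_file(account, pending)
        if chosen is not None:
            pending.remove(chosen)
            assignments[account.name] = chosen.name
    return assignments


files = [
    ApplicationFile("f1", "loan", 0.9),
    ApplicationFile("f2", "loan", 0.2),
    ApplicationFile("f3", "loan", 0.6),
]
accounts = [Account("A", "grant", 0.5), Account("B", "loan", 0.8)]
assignments = dispatch(accounts, files)
```

In this example account A last evaluated a different type and takes f1, while account B last evaluated a loan file with similarity 0.8 and therefore receives f2, whose similarity of 0.2 differs most from 0.8.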
In S2 and S3, the reliability calculation and file status setting steps are as follows:
S301, acquiring the historical misjudgment rates of all evaluation accounts that are performing evaluation work, changing the states of these evaluation accounts to the busy state, and marking them.
S302, when a marked evaluation account is detected to have finished its evaluation work, changing the state of that evaluation account to the idle state, removing the mark, and collecting the evaluation time and the evaluation result.
S303, substituting the historical misjudgment rate of the evaluation account, the evaluation time and the evaluation result into a formula to obtain the reliability, which is bound to the corresponding application file. The reliability calculation formula is as follows:
where KKD is the reliability of the application file; p is the type of the evaluation result, taking the value -1 when the result is reject and 1 when the result is pass; e is the influence weight of the historical misjudgment rate; h is the historical misjudgment rate of the evaluation account; f is the evaluation time influence coefficient, whose value is greater than 1; t is the evaluation time; and k is the reliability of the last evaluation.
S304, judging whether the reliability of the application file is in a reliability abnormal section, if so, adjusting the state of the corresponding application file from the state to be evaluated to the re-evaluation state, and if not, adjusting the state of the corresponding application file from the state to be evaluated to the evaluated state. The judgment formula is as follows:
where result is the state adjustment result of the application file, K_min is the minimum value of the reliability abnormal interval, K_max is the maximum value of the reliability abnormal interval, and KKD is the reliability of the application file.
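The judgment in S304 can be sketched as a simple interval test. The enum and function names below are hypothetical, the treatment of the interval endpoints as inclusive is an assumption, and the reliability values passed in the usage lines are made-up inputs rather than results of the reliability formula.

```python
from enum import Enum


class FileState(Enum):
    TO_BE_EVALUATED = "to be evaluated"
    EVALUATED = "evaluated"
    RE_EVALUATION = "re-evaluation"


def adjust_state(kkd: float, k_min: float, k_max: float) -> FileState:
    """Map a reliability value KKD to the next state of the application file.

    A reliability inside the abnormal interval [k_min, k_max] sends the file to
    re-evaluation; otherwise the file becomes evaluated.
    Assumption: both interval endpoints count as abnormal.
    """
    if k_min > k_max:
        raise ValueError("k_min must not exceed k_max")
    if k_min <= kkd <= k_max:
        return FileState.RE_EVALUATION
    return FileState.EVALUATED


# Hypothetical reliability values checked against the interval [-30, +30].
print(adjust_state(12.5, -30, 30))   # inside the abnormal interval
print(adjust_state(-45.0, -30, 30))  # outside the abnormal interval
```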
In S4, the states of the application file include the evaluated state and the re-evaluation state. The evaluated state applies to an application file that has been evaluated and whose reliability meets the requirement; the re-evaluation state applies to an application file that has been evaluated but whose reliability does not meet the requirement.
When the application file is in the evaluated state, the evaluation ends and the evaluation result is sent to the next link. When the application file is in the re-evaluation state, it must be evaluated again: the flow jumps back to step S1, where the file is reclassified and dispatched for evaluation.
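The flow in S4 amounts to a loop that repeats the evaluation until the file reaches the evaluated state. The sketch below is illustrative only: evaluate_once is a hypothetical stand-in for steps S1 to S3, the state strings are placeholders, and the max_rounds cap is an added safeguard that the described method does not specify. The scripted outcomes follow application file Z in Embodiment one.

```python
from typing import Callable, Tuple


def run_until_evaluated(
    file_id: str,
    evaluate_once: Callable[[str, int], Tuple[str, str]],
    max_rounds: int = 10,
) -> Tuple[str, int]:
    """Repeat evaluation rounds until the file state is "evaluated".

    evaluate_once(file_id, round_no) stands in for classification, dispatch and
    reliability calculation; it returns (evaluation_result, file_state).
    Returns the final evaluation result and the number of rounds used.
    """
    for round_no in range(1, max_rounds + 1):
        result, state = evaluate_once(file_id, round_no)
        if state == "evaluated":
            return result, round_no  # evaluation ends; result goes to the next link
        # state == "re-evaluation": go back to S1 and dispatch the file again
    raise RuntimeError(f"{file_id} not settled after {max_rounds} rounds")


# Scripted outcomes per round, mirroring file Z in Embodiment one.
SCRIPT = {
    1: ("pass", "re-evaluation"),
    2: ("reject", "re-evaluation"),
    3: ("reject", "evaluated"),
}


def scripted_evaluation(file_id: str, round_no: int) -> Tuple[str, str]:
    return SCRIPT[round_no]


final = run_until_evaluated("Z", scripted_evaluation)
print(final)
```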
Embodiment one:
Assume that the historical misjudgment rates of three evaluation accounts A, B and C are 5%, 10% and 15% respectively, and that the three evaluation accounts are assigned application files X, Y and Z of different types respectively. After the evaluation is completed, the following results are obtained:
A: evaluation time: 120 seconds; evaluation result: pass;
B: evaluation time: 150 seconds; evaluation result: reject;
C: evaluation time: 80 seconds; evaluation result: pass;
Assuming that the influence weight of the historical misjudgment rate is 10, the evaluation time influence coefficient is 1.2, and the reliability abnormal interval is [-30, +30], the reliabilities of the application files corresponding to A, B and C are:
X:
Y:
Z:
The state of X is evaluated, the evaluation result is pass, and the evaluation ends; the states of Y and Z are re-evaluation, and their evaluation continues.
Y and Z are dispatched to evaluation accounts C and A respectively, and after the evaluation is completed the following results are obtained:
C: evaluation time: 120 seconds; evaluation result: reject;
A: evaluation time: 60 seconds; evaluation result: reject;
The reliabilities of the application files corresponding to C and A are:
Y:
Z:
The state of Y is evaluated, the evaluation result is reject, and the evaluation ends; the state of Z is re-evaluation, and its evaluation continues.
Z is dispatched to evaluation account B, and after the evaluation is completed the following result is obtained:
B: evaluation time: 120 seconds; evaluation result: reject;
The reliability of the application file corresponding to B is:
Z:
The state of Z is evaluated, the evaluation result is reject, and the evaluation ends.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Finally, it should be noted that the foregoing is only a preferred embodiment of the present invention and does not limit the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described therein or replace some of their technical features with equivalents. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.