Data screening method and data screening system for achievement transfer and transformation
Technical Field
The invention relates to the field of analysis systems, in particular to a data screening method and a data screening system for achievement transfer and transformation.
Background
The transformation of scientific and technological achievements can be understood in a broad sense and a narrow sense. In the broad sense, it covers the application of all kinds of achievements, the improvement of workers' quality, the enhancement of skills, the increase of efficiency, and the like. Science and technology are regarded as the primary productive force, and productive forces comprise people, production tools, and labor objects. The potential productivity of science and technology is therefore ultimately translated into direct productivity through improved worker quality, improved production tools, and improved labor objects. In this sense, transformation in the broad sense means that an achievement passes from its creator to its users, so that the quality, skills, or knowledge of workers in the field of use increase, labor tools improve, labor efficiency rises, and the economy develops. Transformation in the narrow sense refers only to the transformation of technical achievements, that is, innovative technical achievements are transferred from scientific research units to production departments, so that new products are added, processes are improved, benefits increase, and the economy ultimately improves. The term usually refers to this narrow sense, and the transformation rate of scientific and technological achievements is the ratio of the number of technical achievements put into application to the total number of technical achievements.
At present, the transformation of scientific and technological achievements relies mainly on matchmaking by intermediary organizations, on customers who want to buy an achievement finding the inventor themselves, or on inventors who want to sell an achievement actively seeking out enterprises for cooperation or sale. Constrained by personal networks and geography, the success rate of such matchmaking is very low. Searching for a match on the internet removes the geographic limitation, but the volume of technical data is very large, so for users without search experience the task is difficult and extremely inefficient.
The present application is made in view of the above.
Disclosure of Invention
A first object of the present invention is to provide a data screening method for achievement transfer and transformation that effectively improves the efficiency of technical document retrieval, ensures the accuracy of the target documents, is well suited to users without retrieval experience, greatly reduces the retrieval burden, helps improve the efficiency and accuracy of technical document matching, and is of positive significance for promoting achievement transfer and transformation.
A second object of the present invention is to provide a data screening system for achievement transfer and transformation that effectively improves the efficiency of technical document retrieval, ensures the accuracy of the target documents, is well suited to users without retrieval experience, greatly reduces the retrieval burden, helps improve the efficiency and accuracy of technical document matching, and is of positive significance for promoting achievement transfer and transformation.
Embodiments of the invention are realized as follows:
A data screening method for achievement transfer and transformation, comprising the following steps:
S1, retrieving technical files according to initial keywords, ranking the retrieval results by relevance, taking the N technical files with the highest relevance as initial results, and executing step S2;
S2, taking the technical files that were browsed but not clicked as a first classification, taking the technical files that were browsed, clicked, and collected as a second classification, and executing step S3;
S3, ranking the unviewed initial results by their relevance to the first classification, eliminating the technical files with the highest relevance, the elimination amount being half of the unviewed initial results, taking, for each remaining technical file, the difference between 100% and its relevance to the first classification as its relevance, and executing step S4;
S4, supplementing the remaining technical files with the highest-ranked technical files from the rest of the retrieval results of step S1 to form correction results, the supplement amount being set so that the correction results are replenished to N technical files, analyzing the relevance of the supplemented technical files according to their relevance to the initial keywords and to the second classification, and executing step S5; and
S5, re-ranking the correction results by relevance.
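By way of non-limiting illustration, step S1 can be sketched in Python as follows. The function name `split_top_n`, the `(document id, relevance)` tuple layout, and the sample scores are assumptions made for the example; the method does not prescribe how relevance to the initial keywords is computed.

```python
from typing import List, Tuple

Scored = Tuple[str, float]  # (document id, relevance to the initial keywords in [0, 1])


def split_top_n(results: List[Scored], n: int) -> Tuple[List[Scored], List[Scored]]:
    """Rank by relevance (highest first) and split into the initial results
    (top n) and the remaining results held in reserve for step S4."""
    ranked = sorted(results, key=lambda item: item[1], reverse=True)
    return ranked[:n], ranked[n:]


# Hypothetical scores; n is kept tiny here so the output is easy to read.
hits = [("d1", 0.42), ("d2", 0.91), ("d3", 0.77), ("d4", 0.15), ("d5", 0.63)]
initial, reserve = split_top_n(hits, n=3)
print(initial)  # [('d2', 0.91), ('d3', 0.77), ('d5', 0.63)]
print(reserve)  # [('d1', 0.42), ('d4', 0.15)]
```

Keeping the remainder of the ranked list, rather than discarding it, is what allows step S4 to draw supplements without repeating the search.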
Further, in step S4, analyzing the relevance of the supplemented technical files according to their relevance to the initial keywords and to the second classification includes: taking the sum of 50% of the relevance to the initial keywords and 50% of the relevance to the second classification as the new relevance.
Further, step S2 includes taking the technical files that were browsed and clicked but not collected as a third classification; in step S4, analyzing the relevance of the supplemented technical files further includes: subtracting 10% of the relevance to the third classification to obtain the new relevance.
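Taken together, the two weightings above can be written as a single expression. The symbols are introduced here for illustration only: r_kw denotes the relevance of a supplemented technical file to the initial keywords, r_2 its relevance to the second classification, and r_3 its relevance to the third classification. The expression assumes that the 10% deduction is applied to the 50%/50% sum.

```latex
r_{\text{new}} = 0.5\, r_{kw} + 0.5\, r_{2} - 0.1\, r_{3}
```

For example, with r_kw = 80%, r_2 = 60%, and r_3 = 50%, the new relevance is 40% + 30% - 5% = 65%.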
Further, step S5 also includes: re-executing step S2.
Further, the keywords that distinguish the second classification from the third classification are set as first reference keywords, and in step S4 the relevance to the second classification is the relevance to the first reference keywords.
Further, the keywords that distinguish the third classification from the second classification are set as second reference keywords, and in step S4 the relevance to the third classification is the relevance to the second reference keywords.
Further, the value range of N is 80-150.
Further, in step S3, when half the number of unviewed initial results is less than or equal to 10, all of the unviewed initial results are eliminated.
Further, in step S3, when half the number of unviewed initial results is greater than or equal to 30, the elimination amount is set to 40.
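By way of non-limiting illustration, the two thresholds above, combined with the default of eliminating half of the unviewed initial results, can be sketched as a single function. The name `cull_count` and the rounding down of odd counts are assumptions; the source does not state how an odd number of unviewed results is halved.

```python
def cull_count(unviewed: int) -> int:
    """Number of unviewed initial results to eliminate in step S3."""
    half = unviewed // 2  # assumption: odd counts are rounded down
    if half <= 10:
        return unviewed   # small sample: eliminate every unviewed result
    if half >= 30:
        return 40         # large sample: fixed elimination amount
    return half           # default: eliminate half


for count in (16, 40, 60, 90):
    print(count, cull_count(count))
# 16 16
# 40 20
# 60 40
# 90 40
```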
A data screening system for achievement transfer and transformation, comprising: a retrieval module, a classification module, a rejection module, a supplement module, and a sorting module.
The retrieval module is used for retrieving technical files according to the initial keywords, ranking the retrieval results by relevance, and taking the N technical files with the highest relevance as the initial results.
The classification module is used for classifying the technical files that were browsed but not clicked into a first classification, and classifying the technical files that were browsed, clicked, and collected into a second classification.
The rejection module is used for ranking the unviewed initial results by their relevance to the first classification, eliminating the technical files with the highest relevance, the elimination amount being half of the unviewed initial results, and taking, for each remaining technical file, the difference between 100% and its relevance to the first classification as its relevance.
The supplement module is used for supplementing the remaining technical files with the highest-ranked technical files from the rest of the retrieval results for the initial keywords to form correction results, the supplement amount being set so that the correction results are replenished to N technical files, and for analyzing the relevance of the supplemented technical files according to their relevance to the initial keywords and to the second classification.
The sorting module is used for re-ranking the correction results by relevance.
The beneficial effects of the embodiments of the invention are as follows:
According to the data screening method for achievement transfer and transformation provided by the embodiment of the invention, a search is performed with the initial keywords provided, the search results are ranked by relevance to the initial keywords, and the N technical files with the highest relevance are taken as the initial results and displayed to the user for reading.
The user need not read every technical file and can pick those the user considers most relevant. While the user reads the N technical files, the following situations may therefore arise: 1. a file is browsed, but its details are not opened; 2. a file is browsed, its details are opened, and the file is finally collected; 3. a file is browsed and its details are opened, but the file is not collected.
The technical files that were browsed but whose details were not opened are taken as the first classification, and the technical files that were browsed, opened, and finally collected are taken as the second classification.
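By way of non-limiting illustration, the three situations listed above can be mapped to classifications from simple interaction flags. The names `Category` and `classify` and the three boolean flags are assumptions made for the example; a file that was never browsed belongs to none of the classifications and is returned as `None`.

```python
from enum import Enum
from typing import Optional


class Category(Enum):
    FIRST = 1   # browsed, details not opened
    SECOND = 2  # browsed, details opened, file collected
    THIRD = 3   # browsed, details opened, file not collected


def classify(browsed: bool, clicked: bool, collected: bool) -> Optional[Category]:
    """Map one file's interaction flags to a classification."""
    if not browsed:
        return None  # unviewed files are handled separately
    if not clicked:
        return Category.FIRST
    return Category.SECOND if collected else Category.THIRD


print(classify(browsed=True, clicked=True, collected=False))  # Category.THIRD
```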
Among the N technical files, those that have not been browsed are ranked by their relevance to the first classification, and the technical files with the highest relevance are eliminated, the elimination amount being half the number of unbrowsed technical files. For each remaining technical file, the difference between 100% and its relevance to the first classification is taken as its relevance. For example, if a remaining technical file has a relevance of 60% to the first classification, the difference from 100% is 100% - 60% = 40%, and 40% is its relevance.
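By way of non-limiting illustration, the elimination and rescoring described above can be sketched as follows. The function name `eliminate_and_rescore`, the dictionary input, and the sample values are assumptions; relevance is expressed as a fraction between 0 and 1, and odd counts are assumed to round down.

```python
from typing import Dict, List, Tuple


def eliminate_and_rescore(rel_to_first: Dict[str, float]) -> List[Tuple[str, float]]:
    """rel_to_first maps each unbrowsed file to its relevance to the first
    classification. Drop the more relevant half, then score every remaining
    file as 1 - relevance."""
    ranked = sorted(rel_to_first.items(), key=lambda kv: kv[1], reverse=True)
    cut = len(ranked) // 2  # assumption: odd counts are rounded down
    kept = ranked[cut:]
    return [(doc, round(1.0 - rel, 4)) for doc, rel in kept]


unbrowsed = {"a": 0.90, "b": 0.60, "c": 0.75, "d": 0.30}
print(eliminate_and_rescore(unbrowsed))  # [('b', 0.4), ('d', 0.7)]
```

File "b" reproduces the worked example: a relevance of 60% to the first classification becomes a relevance of 40%.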
Because only the N technical files with the highest relevance are selected when searching with the initial keywords, the technical files ranked highest among the remaining search results are used as supplements. The supplement amount is the sum of the number of technical files eliminated above and the number of technical files already browsed, so that the total is brought back to N. The supplemented technical files are also analyzed for relevance to the initial keywords and to the second classification.
The technical files remaining from the original N and the newly supplemented technical files are then ranked by the relevance obtained above and displayed to the user for reading.
With this design, starting from the initial keywords, the N target files are continuously optimized according to the user's reading preferences and tendencies, so that they cover as many of the technical files the user actually needs as possible.
In general, the data screening method for achievement transfer transformation provided by the embodiment of the invention can effectively improve the technical document retrieval efficiency, ensure the accuracy of the target document, is very suitable for users without retrieval experience, greatly reduces the retrieval burden, is beneficial to improving the technical document matching efficiency and matching accuracy, and has positive significance for promoting achievement transfer transformation. The data screening system for achievement transfer and transformation provided by the embodiment of the invention can effectively improve the retrieval efficiency of the technical documents, ensures the accuracy of the target documents, is very suitable for users without retrieval experience, greatly reduces the retrieval burden, is beneficial to improving the matching efficiency and matching accuracy of the technical documents, and has positive significance for promoting achievement transfer and transformation.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and should therefore not be regarded as limiting its scope; those skilled in the art can obtain other related drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of a data screening method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a data screening system according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
Examples
Referring to FIG. 1, the present embodiment provides a data screening method for achievement transfer and transformation, which includes the following steps:
S1, retrieving technical files according to initial keywords, ranking the retrieval results by relevance, taking the N technical files with the highest relevance as initial results, and executing step S2;
S2, taking the technical files that were browsed but not clicked as a first classification, taking the technical files that were browsed, clicked, and collected as a second classification, and executing step S3;
S3, ranking the unviewed initial results by their relevance to the first classification, eliminating the technical files with the highest relevance, the elimination amount being half of the unviewed initial results, taking, for each remaining technical file, the difference between 100% and its relevance to the first classification as its relevance, and executing step S4;
S4, supplementing the remaining technical files with the highest-ranked technical files from the rest of the retrieval results of step S1 to form correction results, the supplement amount being set so that the correction results are replenished to N technical files, analyzing the relevance of the supplemented technical files according to their relevance to the initial keywords and to the second classification, and executing step S5; and
S5, re-ranking the correction results by relevance.
A search is performed with the initial keywords provided, the search results are ranked by relevance to the initial keywords, and the N technical files with the highest relevance are taken as the initial results and displayed to the user for reading.
The user need not read every technical file and can pick those the user considers most relevant. While the user reads the N technical files, the following situations may therefore arise: 1. a file is browsed, but its details are not opened; 2. a file is browsed, its details are opened, and the file is finally collected; 3. a file is browsed and its details are opened, but the file is not collected.
The technical files that were browsed but whose details were not opened are taken as the first classification, and the technical files that were browsed, opened, and finally collected are taken as the second classification.
Among the N technical files, those that have not been browsed are ranked by their relevance to the first classification, and the technical files with the highest relevance are eliminated, the elimination amount being half the number of unbrowsed technical files. For each remaining technical file, the difference between 100% and its relevance to the first classification is taken as its relevance. For example, if a remaining technical file has a relevance of 60% to the first classification, the difference from 100% is 100% - 60% = 40%, and 40% is its relevance.
Because only the N technical files with the highest relevance are selected when searching with the initial keywords, the technical files ranked highest among the remaining search results are used as supplements. The supplement amount is the sum of the number of technical files eliminated above and the number of technical files already browsed, so that the total is brought back to N. The supplemented technical files are also analyzed for relevance to the initial keywords and to the second classification.
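By way of non-limiting illustration, the supplement amount can be checked with a short sketch. The function name `supplement` and the sample identifiers are assumptions; the reserve list is assumed to be ordered by relevance to the initial keywords, highest first.

```python
from typing import List


def supplement(kept: List[str], reserve: List[str], n: int) -> List[str]:
    """Top up the kept files to n using the highest-ranked reserve files."""
    needed = max(n - len(kept), 0)
    return kept + reserve[:needed]


n = 10
browsed = ["d1", "d2", "d3", "d4"]   # 4 of the 10 initial results were browsed
kept = ["d6", "d8", "d9"]            # 3 of the 6 unbrowsed files survive elimination
eliminated = 3
reserve = [f"r{i}" for i in range(1, 21)]

corrected = supplement(kept, reserve, n)
added = len(corrected) - len(kept)
print(added, added == eliminated + len(browsed))  # 7 True
```

The check at the end confirms the relationship stated above: the number of supplemented files equals the number eliminated plus the number already browsed.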
The technical files remaining from the original N and the newly supplemented technical files are then ranked by the relevance obtained above and displayed to the user for reading.
With this design, starting from the initial keywords, the N target files are continuously optimized according to the user's reading preferences and tendencies, so that they cover as many of the technical files the user actually needs as possible. The user does not need to carry out professional retrieval; the efficiency of technical file retrieval is effectively improved, the accuracy of the target files is ensured, the method is well suited to users without retrieval experience, the retrieval burden is greatly reduced, the efficiency and accuracy of technical file matching are improved, and the method is of positive significance for promoting achievement transfer and transformation.
Further, step S5 also includes: re-executing step S2. By repeating steps S2 to S5, the accuracy of the target files is improved continuously, and the longer the method is used, the higher the accuracy becomes.
In this embodiment, in step S4, analyzing the relevance of the supplemented technical files according to their relevance to the initial keywords and to the second classification includes: taking the sum of 50% of the relevance to the initial keywords and 50% of the relevance to the second classification as the new relevance. That is, each supplemented technical file is analyzed separately for relevance to the initial keywords and for relevance to the second classification, and 50% of each of the two values is added to form the new relevance, which thus reflects relevance to both the initial keywords and the second classification.
Step S2 further includes taking the technical files that were browsed and clicked but not collected as a third classification; in step S4, analyzing the relevance of the supplemented technical files further includes: subtracting 10% of the relevance to the third classification to obtain the new relevance.
The relevance is thus continuously refined using the initial keywords and the first, second, and third classifications, which in turn continuously improves the accuracy of the target files.
Further, the keywords that distinguish the second classification from the third classification are set as first reference keywords, and in step S4 the relevance to the second classification is the relevance to the first reference keywords. The keywords that distinguish the third classification from the second classification are set as second reference keywords, and in step S4 the relevance to the third classification is the relevance to the second reference keywords. This design further improves the precision and efficiency of screening.
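By way of non-limiting illustration, one possible way to derive the two sets of reference keywords is a set difference over the keywords of each classification. The source does not specify how distinguishing keywords are extracted; the function name `reference_keywords`, the sample keyword sets, and the use of set difference are all assumptions.

```python
from typing import Set, Tuple


def reference_keywords(second: Set[str], third: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (first_reference, second_reference).

    first_reference: keywords found only in the second classification.
    second_reference: keywords found only in the third classification.
    Set difference is an assumed stand-in for "distinguishing keywords".
    """
    return second - third, third - second


collected = {"battery", "anode", "silicon", "coating"}   # hypothetical keywords
not_collected = {"battery", "coating", "lead-acid"}      # hypothetical keywords
first_ref, second_ref = reference_keywords(collected, not_collected)
print(sorted(first_ref))   # ['anode', 'silicon']
print(sorted(second_ref))  # ['lead-acid']
```

Keywords shared by both classifications ("battery" and "coating" here) carry no information about why a file was or was not collected, which is why they are excluded from both reference sets in this sketch.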
In this embodiment, N ranges from 80 to 150 and is set to 100. In step S3, when half the number of unviewed initial results is less than or equal to 10, all of the unviewed initial results are eliminated; when half the number of unviewed initial results is greater than or equal to 30, the elimination amount is set to 40. This effectively reduces interference from scattered samples.
Referring to FIG. 2, the present embodiment further provides a data screening system for achievement transfer and transformation, which is used for executing the data screening method. The data screening system comprises: a retrieval module, a classification module, a rejection module, a supplement module, and a sorting module.
The retrieval module is used for retrieving technical files according to the initial keywords, ranking the retrieval results by relevance, and taking the N technical files with the highest relevance as the initial results.
The classification module is used for classifying the technical files that were browsed but not clicked into a first classification, and classifying the technical files that were browsed, clicked, and collected into a second classification.
The rejection module is used for ranking the unviewed initial results by their relevance to the first classification, eliminating the technical files with the highest relevance, the elimination amount being half of the unviewed initial results, and taking, for each remaining technical file, the difference between 100% and its relevance to the first classification as its relevance.
The supplement module is used for supplementing the remaining technical files with the highest-ranked technical files from the rest of the retrieval results for the initial keywords to form correction results, the supplement amount being set so that the correction results are replenished to N technical files, and for analyzing the relevance of the supplemented technical files according to their relevance to the initial keywords and to the second classification.
The sorting module is used for re-ranking the correction results by relevance.
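By way of non-limiting illustration, the cooperation of the rejection, supplement, and sorting modules in one round can be sketched as a single function. The retrieval and classification modules are represented only by their outputs (the ranked lists and the set of browsed files). The function name `refine_once`, the relevance callables, and the sample values are assumptions; the sketch also assumes that browsed files leave the result list and that supplemented files are scored with the 50%/50% weighting.

```python
from typing import Callable, Dict, List, Set

Relevance = Callable[[str], float]  # hypothetical: document id -> relevance in [0, 1]


def refine_once(
    initial: List[str],       # retrieval module output: top-N ids, best first
    reserve: List[str],       # remaining retrieval results, best first
    browsed: Set[str],        # every file the user browsed in this round
    rel_first: Relevance,     # relevance to the first classification
    rel_keywords: Relevance,  # relevance to the initial keywords
    rel_second: Relevance,    # relevance to the second classification
) -> List[str]:
    n = len(initial)
    # Rejection module: rank unviewed files by relevance to the first
    # classification, drop the more relevant half, rescore the rest.
    unviewed = sorted((d for d in initial if d not in browsed),
                      key=rel_first, reverse=True)
    kept = unviewed[len(unviewed) // 2:]
    scores: Dict[str, float] = {d: 1.0 - rel_first(d) for d in kept}
    # Supplement module: top up to N from the reserve and score new files.
    for d in reserve[: n - len(kept)]:
        scores[d] = 0.5 * rel_keywords(d) + 0.5 * rel_second(d)
    # Sorting module: re-rank the correction results by score.
    return sorted(scores, key=lambda d: scores[d], reverse=True)


first = {"c": 0.9, "d": 0.1, "e": 0.7, "f": 0.3}
kw = {"g": 0.8, "h": 0.6, "i": 0.5, "j": 0.4}
second = {"g": 0.8, "h": 0.4, "i": 0.1, "j": 0.0}
result = refine_once(
    initial=["a", "b", "c", "d", "e", "f"],
    reserve=["g", "h", "i", "j", "k"],
    browsed={"a", "b"},
    rel_first=first.__getitem__,
    rel_keywords=kw.__getitem__,
    rel_second=second.__getitem__,
)
print(result)  # ['d', 'g', 'f', 'h', 'i', 'j']
```

Passing the relevance measures in as callables keeps the sketch independent of any particular similarity metric, which the source leaves open.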
In summary, the data screening method for achievement transfer transformation can effectively improve the technical document retrieval efficiency, ensure the accuracy of the target document, is very suitable for users without retrieval experience, greatly reduces the retrieval burden, is beneficial to improving the technical document matching efficiency and matching accuracy, and has positive significance for promoting achievement transfer transformation. The data screening system for achievement transfer and transformation can effectively improve the searching efficiency of the technical documents, ensures the accuracy of the target documents, is very suitable for users without searching experience, greatly reduces the searching burden, is beneficial to improving the matching efficiency and the matching accuracy of the technical documents, and has positive significance for promoting the achievement transfer and transformation.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.