CN115774860A - Domain engine technology identification method based on multi-source data fusion calculation - Google Patents


Info

Publication number
CN115774860A
Authority
CN
China
Prior art keywords
technology
index
technical
data
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202211651352.4A
Other languages
Chinese (zh)
Inventor
李海坤
杨璐绮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fiberhome Universe Technology Nanjing Co ltd
Original Assignee
Fiberhome Universe Technology Nanjing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fiberhome Universe Technology Nanjing Co ltd
Priority to CN202211651352.4A
Publication of CN115774860A
Legal status: Withdrawn

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of engine technology identification, and in particular to a domain engine technology identification method based on multi-source data fusion calculation, which comprises the following steps: acquiring index parameter information of a technology from the massive multi-source heterogeneous data in a database; constructing a computable quantitative identification index system for the technology and quantizing the index parameters; standardizing the calculation results of the quantized index parameters; and assigning weights to the standardized index parameters through an established engine technology identification model to obtain an output value, and judging from the output value whether the technology is an engine technology. By applying the idea of fusion calculation, the invention identifies engine technologies in a multi-source heterogeneous data environment, and it offers guidance and reference for identifying engine technologies with strong driving force and industrial influence in complex multi-source heterogeneous data environments.

Description

Domain engine technology identification method based on multi-source data fusion calculation
Technical Field
The invention belongs to the technical field of engine technology identification, and particularly relates to a domain engine technology identification method based on multi-source data fusion calculation.
Background
The development and innovation of science and technology are important drivers of social progress; developing science and technology is a necessary path to improving the nation's well-being, driving rapid social development, and strengthening national soft power. However, science and technology span a vast range of content and a complex variety of industries, and different technologies differ in the degree and breadth of their influence on national and social development. Some technologies have influence only within a single field, whereas others lead and drive development across multiple fields, exert influence over a wide range, and can transform key core industrial systems. This latter group constitutes engine technologies. The question is how to predict and identify, in advance, the future key engine technologies of an industrial field from among numerous scientific and technological approaches.
Current engine technology identification starts from multidimensional data, with emphasis on the following parameter information: technology news volume, average number of technology patents, patent market value coverage, patent cited rate, patent conversion rate, technology maturity, related-patent novelty, innovation degree, technology growth rate, number of horizontal and vertical projects, average citation count of papers, industry-university-research cooperation ratio, technical field coverage, and industry expansion degree.
The technology news volume mainly refers to the volume of news about the technology released through news media platforms and the like; the average number of technology patents refers to the average number of patents granted per year in the technical subject field within a certain period; the patent cited rate refers to the frequency with which patents related to the technical subject are cited; the technology maturity refers to the ratio of the invention patents in a certain technical field to the total number of invention patents in that technical field; the patent conversion rate refers to the extent to which related patents under the technical subject are transferred; the technology maturity refers to the future development potential of the technology; the related-patent novelty judges the growth of the technology from the distribution of technical categories under the technical subject; the patent market value coverage refers to the ratio of the GDP attributable to related patents under the technical subject in China to that in developed countries; the innovation degree technology growth rate refers to the ratio of the cumulative number of applications to the cumulative number of grants for the technology within a certain period; the number of horizontal and vertical projects refers to the number of nationally funded or enterprise-funded projects obtained by the technology; the average citation count of papers refers to the number of times papers written on the technology are cited; the industry-university-research cooperation ratio refers to the proportion of university-enterprise cooperation papers among all papers written using the technology; the technical field coverage refers to the number of technical fields across which the related patents under the technical subject are distributed; and the industry expansion degree refers to the growth rate of the number of patents for the technology, which reflects how fast the technology is expanding.
Predicting and identifying the future key engine technologies of an industrial field in advance from the multi-source heterogeneous data in a database makes it possible to shore up weak areas in time, keep a firm grasp of key core technologies, and avoid "chokepoint" technology dependence. Furthermore, identifying engine technologies in advance makes it possible to seize the commanding heights of science and technology and to create and cultivate emerging industries and markets, thereby improving China's competitive soft power in international science and technology.
Engine technology identification is therefore of particular significance to national and social development. However, existing research on engine technology identification falls short in timeliness and in multi-source data fusion. How to accurately identify, from many key, emerging, and frontier technical fields, the engine technologies that have a strong driving and transformative influence on social and industrial development, against a background of large data volume and dimensionality, is a problem that scientific and technical intelligence work urgently needs to solve.
Disclosure of Invention
The invention aims to provide a domain engine technology identification method based on multi-source data fusion calculation, which achieves effective fusion of multi-source heterogeneous data in engine technology identification and explores an application method against an information background of very large data volume and dimensionality, so as to solve the problems described in the background.
To achieve this purpose, the invention provides the following technical solution:
A domain engine technology identification method based on multi-source data fusion calculation comprises the following steps:
S1: acquiring index parameter information of a technology from the massive multi-source heterogeneous data in a database;
S2: constructing a computable quantitative identification index system for the technology, and quantizing the index parameters of step S1;
S3: standardizing the calculation results of the index parameters of step S2;
S4: assigning weights to the standardized index parameters through the established engine technology identification model to obtain an output value, and judging from the output value whether the technology is an engine technology.
Preferably, the index parameter information of the technology in step S1 includes:
technology news volume: the volume of news about the technology released through news media platforms and the like;
average number of technology patents: the average number of related patents per year in the technical field of the technology within a certain period;
patent cited rate: the frequency with which related patents in the technical field of the technology are cited;
patent conversion rate: the extent to which related patents in the technical field of the technology are transferred;
technology maturity: the future development potential of the technology;
related-patent novelty: the growth of the technology as judged from the distribution of technical categories in the technical field of the technology;
patent market value coverage: the ratio of the GDP attributable to related patents in the technical field of the technology in China to that in developed countries;
innovation degree technology growth rate: the ratio of the cumulative number of applications to the cumulative number of grants for the technology within a certain period;
number of horizontal and vertical projects: the number of nationally funded or enterprise-funded projects obtained by the technology;
average citation count of papers: the number of times papers written using the technology are cited;
industry-university-research cooperation ratio: the proportion of university-enterprise cooperation papers among all papers written using the technology;
technical field coverage: the number of technical fields across which the related patents under the technical subject are distributed;
industry expansion degree: the growth rate of the number of patents for the technology, which reflects how fast the technology is expanding.
Preferably, constructing the three-level quantitative identification index system in step S2 specifically comprises the following:
According to the technology news volume in the last three years, the technology news growth rate quantitative index N is calculated as follows:
Figure BDA0004010802490000031
where N_a is the volume of technology-related news released in the current year, N_{a-1} is the volume released in the previous year, and N_{a-2} is the volume released two years earlier; this index reflects the market attention the technology receives;
According to the number of patents related to the technology in the last three years, the average technology patent number quantitative index ZCS is calculated as follows:
Figure BDA0004010802490000032
where ZCS_t is the number of patents related to the technology in the last three years, and ZCS_p is the number of patents for the technology in a specified time period;
The patent market value coverage quantitative index ZGDP_m is calculated as follows:
Figure BDA0004010802490000033
where GDP_i is the total production value of the technology in the i-th country in a specified year within the last three years; the quantity shown in
Figure BDA0004010802490000034
is the total production value of the technology across the i countries in the specified year within the last three years; and GDP_a is the total production value of China in the set year. A higher index value indicates a higher market value of the technology;
The patent cited rate quantitative index PCS is calculated as follows:
Figure BDA0004010802490000041
where PCS_a is the number of citations received by the patents of the technology in the last three years, and PCS_n is the total number of patent applications for the technology in a specified time period;
The patent conversion ratio quantitative index PTS is calculated as follows:
Figure BDA0004010802490000042
where PTSN is the number of technology-related patents transferred in the last three years, and TSN is the number of technology-related patents granted in a specified time period;
The technology maturity quantitative index ZS is calculated as follows:
Figure BDA0004010802490000043
wherein M is a Number of related inventions, M, applied for said technology in the current year b The higher the index is, the lower the maturity is;
the specific calculation formula of the related patent novelty quantitative index ZN is as follows:
Figure BDA0004010802490000044
where C = {1, 2, …, i}, 1 ≤ i; C is the set of all years over which the technology has developed, i denotes the i-th year in the set C, Y_i is the number of related patents applied for under the technology in the i-th year, and n is the total number of related patents of the technology over all years; the larger the index, the higher the novelty;
The innovation degree quantitative index ZO is calculated as follows:
Figure BDA0004010802490000045
where B_j is the ratio of the number of patents cited in the last three years to the total number of citations, and S_B is the set of cited patent categories under the technology; the larger the index, the higher the innovation degree;
The technology growth rate quantitative index TG is calculated as follows:
Figure BDA0004010802490000046
where TN is the number of patent applications for the technology in the current year, and TA is the number of patent applications for the technology in the last three years;
The project-count growth rate quantitative index is calculated as follows:
Figure BDA0004010802490000051
where TC_a is the number of horizontal and vertical projects obtained by the technology in the current year, TC_{a-1} is the number obtained in the previous year, and TC_{a-2} is the number obtained two years earlier; the larger the index, the higher the industrial superiority;
The average citation count of papers quantitative index PCI_p is calculated as follows:
Figure BDA0004010802490000052
where PC is the number of times the scientific and technical papers on the technology have been cited in the last three years, and n is the number of times the related papers on the technology have been cited in a specified period; a larger index value indicates higher scientific research innovativeness of the technology;
The related-paper count growth rate quantitative index PGR is calculated as follows:
Figure BDA0004010802490000053
where PGR_a is the total number of related scientific and technical papers on the technology published in the current year, PGR_{a-1} is the total number published in the previous year, and PGR_{a-2} is the total number published two years earlier;
The industry-university-research cooperation ratio quantitative index TRCR is calculated as follows:
Figure BDA0004010802490000054
where TRCN is the number of enterprises and university research institutes appearing among the author affiliations of the related papers, and PR is the total number of related scientific and technical papers on the technology in a specified period;
For the industry expansion degree, the technical field growth rate quantitative index TFGR is calculated as follows:
Figure BDA0004010802490000055
where TFGR_a is the number of related patents of the technology in each technical field in the current year, TFGR_{a-1} is the number in the previous year, and TFGR_{a-2} is the number two years earlier.
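As an illustration of the ratio-type indexes above, the following Python sketch computes PTS and TRCR under the assumption that each is the plain quotient of the two quantities defined for it (PTSN/TSN and TRCN/PR). The formula images are not reproduced in this text, so the quotient form, the helper name ratio_index, and the sample counts are all assumptions and not the formulas of the invention.

```python
def ratio_index(numerator: float, denominator: float) -> float:
    """Return numerator / denominator for a ratio-type index.

    ASSUMPTION: the index is a plain quotient; the original formula
    image is not available in this text.
    """
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    if numerator < 0:
        raise ValueError("numerator must be non-negative")
    return numerator / denominator


# Hypothetical counts, for illustration only.
pts = ratio_index(12, 80)    # PTSN = 12 transferred patents, TSN = 80 granted patents
trcr = ratio_index(30, 120)  # TRCN = 30, PR = 120 related papers
```

Rejecting a non-positive denominator keeps a technology with no granted patents or no papers from silently producing a division error or a misleading value.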
The index parameter calculation results are standardized; the standardization methods for the different data types are as follows:
S31: when the index data is 0-1 type data: no further standardization is needed, because the result already falls within the interval [0, 1];
S32: when the index data is interval type (−∞, +∞) data: because such data does not fall within the interval [0, 1], it must be standardized as follows:
Figure BDA0004010802490000061
where X_i is the original value of the corresponding data type and P_i is the standardized output value;
S33: when the index data is proportion type data: proportion type data does not strictly fall within the standardization interval [0, 1] and must be standardized as follows:
Figure BDA0004010802490000062
where x_i is the original value of the corresponding data type, P_i is the standardized output value, and
Figure BDA0004010802490000063
is the average value of the quantized data index;
S34: when the index data is an absolute numerical type index: absolute numerical type indexes are processed in two parts. The average number of technology patents, the number of horizontal and vertical projects, the project-count growth rate, and the average citation count of papers are standardized against an average reference value, as follows:
Figure BDA0004010802490000064
where x_i is the original value of the corresponding data type, P_i is the standardized output value, and
Figure BDA0004010802490000065
is the average value of the quantized data index;
S35: the patent market value coverage and technical field coverage data indexes are not suited to standardization by the average value; a threshold must be set separately for each.
The patent market value coverage quantitative data index is standardized as follows:
Figure BDA0004010802490000066
where x_i is the original value of the corresponding data type and P_i is the standardized output value;
S36: the technical field coverage quantitative data index is standardized as follows:
Figure BDA0004010802490000067
where x_i is the original value of the corresponding data type and P_i is the standardized output value.
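The standardization step can be sketched in Python as a set of per-type functions that each map a raw index value into [0, 1]. Only the pass-through for 0-1 type data follows directly from S31. The formula images for the other types are not reproduced in this text, so the logistic function used for interval type data and the x / (x + mean) form used for mean-referenced data are stand-in assumptions chosen only because they map into [0, 1]; all function names are hypothetical.

```python
import math


def normalize_zero_one(x: float) -> float:
    """S31: 0-1 type data already lies in [0, 1], so it is returned unchanged."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("0-1 type data must lie in [0, 1]")
    return x


def normalize_interval(x: float) -> float:
    """S32 stand-in (ASSUMPTION): logistic squashing of (-inf, +inf) into (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # avoids overflow for large negative x
    return e / (1.0 + e)


def normalize_by_mean(x: float, mean: float) -> float:
    """S33/S34 stand-in (ASSUMPTION): x / (x + mean), giving 0.5 when x equals the mean."""
    if x < 0 or mean <= 0:
        raise ValueError("x must be non-negative and mean must be positive")
    return x / (x + mean)
```

Whatever formulas are actually used, keeping one function per data type makes it easy to check that every standardized value handed to the output model lies in [0, 1].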
Preferably, step S4 comprises the following steps:
S41: establishing an engine technology identification output model; the output value of the technology is determined by an output function, which is as follows:
Figure BDA0004010802490000071
where k denotes a specific technology, c_jk is the corresponding standardized index parameter of that technology, j is the label of a specific type of quantifiable index of the technology, and h_jk is the index weight of the standardized index parameter with that label;
S42: defining each index data type under the computable quantitative identification index system of step S2, calculating through the quantitative index system, reclassifying the indexes by data type, mapping them one by one to step S41, determining the classification of j, and importing the corresponding index data into c_jk;
S43: based on the idea of subjective weighting, determining the weights of the engine technology identification indexes for the technology in the predicted domain category by the analytic hierarchy process, standardizing the weight calculation results, and mapping them one-to-one to the values of h_jk in step S41;
S44: using the index data c_jk and the corresponding weights h_jk determined in steps S42 and S43, calculating the final output value with the output function, establishing recognition results for different intervals of the output value as reference standards, and performing matching identification according to the output value.
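Steps S41 to S44 can be sketched in Python as follows. The output function image is not reproduced in this text, so the sketch assumes the output value is the weighted sum of c_jk over j with weights h_jk that sum to 1. The weights are derived from a pairwise comparison matrix by the row geometric-mean method, a common approximation in the analytic hierarchy process; the source does not say which AHP variant is used. The function names and sample numbers are hypothetical.

```python
import math
from typing import List, Sequence


def ahp_weights(pairwise: Sequence[Sequence[float]]) -> List[float]:
    """Approximate AHP priority weights by the row geometric-mean method.

    pairwise[r][s] states how much more important index r is than index s.
    The returned weights are normalized to sum to 1.
    """
    n = len(pairwise)
    if n == 0 or any(len(row) != n for row in pairwise):
        raise ValueError("pairwise comparison matrix must be square and non-empty")
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def output_value(c: Sequence[float], h: Sequence[float]) -> float:
    """ASSUMPTION: output value = sum over j of h_jk * c_jk for one technology k."""
    if len(c) != len(h):
        raise ValueError("c and h must have the same length")
    return sum(hj * cj for hj, cj in zip(h, c))


# Hypothetical example: two indexes, the first judged 3 times as important.
h = ahp_weights([[1.0, 3.0], [1.0 / 3.0, 1.0]])
m = output_value([0.8, 0.4], h)
```

Because the weights sum to 1 and every standardized index lies in [0, 1], the output value under this assumption also lies in [0, 1], which matches the identification intervals defined for it.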
Preferably, the identification intervals established for the output value in step S44 are as shown in Table 1, specifically:
when the output value M ∈ (0.6, 1.0], the technology is an engine technology;
when the output value M ∈ (0.4, 0.6], it cannot be determined whether the technology is an engine technology;
when the output value M ∈ [0, 0.4], the technology is a non-engine technology.
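The interval matching above can be written as a short Python function. The interval boundaries are taken from the text; the function name and the result strings are illustrative choices.

```python
def classify_output(m: float) -> str:
    """Map an output value M in [0, 1] to its identification result."""
    if not 0.0 <= m <= 1.0:
        raise ValueError("output value M must lie in [0, 1]")
    if m > 0.6:
        return "engine technology"      # M in (0.6, 1.0]
    if m > 0.4:
        return "undetermined"           # M in (0.4, 0.6]
    return "non-engine technology"      # M in [0, 0.4]
```

Note that the boundary values 0.6 and 0.4 belong to the lower interval in each case, so M = 0.6 is undetermined and M = 0.4 is a non-engine technology.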
Beneficial effects: compared with the prior art, the invention has the following beneficial effects:
1. The invention achieves effective fusion of multi-source heterogeneous data in engine technology identification and explores an application method. It uses the idea of data fusion calculation to quantify engine technologies numerically in a multi-source heterogeneous data environment, and thereby identifies whether a technical point has high leadership, high driving capacity, high potential, and high value. Specifically, high leadership includes industrial superiority, scientific research leadership, and scientific research innovativeness; high driving capacity includes technical field coverage and industry expansion degree; high potential includes technical foresight, technical activity, and technical expectation; high value includes the degree of technical marketization, technical scale, and technical applicability;
2. The method extracts relevant data from the fused classification of mass data, captures future engine technologies at a fine-grained level, analyzes the technical points, identifies their index parameters along multiple dimensions, and quantizes and then standardizes the acquired data in sequence, which facilitates the later establishment of the engine technology identification output model and the prediction of engine technologies. The method offers guidance and reference for identifying engine technologies with strong driving force and industrial influence in complex multi-source heterogeneous data environments.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a technical point quantifiable index system diagram under multi-source heterogeneous data fusion.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it should be noted that the terms "vertical", "upper", "lower", "horizontal", and the like indicate orientations or positional relationships based on those shown in the drawings, and are only for convenience of describing the present invention and simplifying the description, but do not indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and thus, should not be construed as limiting the present invention.
In the description of the present invention, it should also be noted that, unless otherwise explicitly specified or limited, the terms "disposed," "mounted," "connected," and "connected" are to be construed broadly and may, for example, be fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
Referring to FIGS. 1-2, the invention provides a domain engine technology identification method based on multi-source data fusion calculation, which comprises the following steps:
S1: acquiring index parameter information of a technology from the massive multi-source heterogeneous data in a database;
S2: constructing a computable quantitative identification index system for the technology, and quantizing the index parameters of step S1;
S3: standardizing the calculation results of the index parameters of step S2;
S4: assigning weights to the standardized index parameters through the established engine technology identification model to obtain an output value, and judging from the output value whether the technology is an engine technology.
The multi-source data types include the absolute numerical type, the interval type (−∞, +∞), the 0-1 type, and the proportion type.
The index parameter information of the technology in step S1 includes:
technology news volume: the volume of news about the technology released through news media platforms and the like;
average number of technology patents: the average number of related patents granted per year in the technical field of the technology within a certain period;
patent cited rate: the frequency with which related patents in the technical field of the technology are cited;
patent conversion rate: the extent to which related patents in the technical field of the technology are transferred;
technology maturity: the future development potential of the technology;
related-patent novelty: the growth of the technology as judged from the distribution of technical categories in the technical field of the technology;
patent market value coverage: the ratio of the GDP attributable to related patents in the technical field of the technology in China to that in developed countries;
innovation degree technology growth rate: the ratio of the cumulative number of applications to the cumulative number of grants for the technology within a certain period;
number of horizontal and vertical projects: the number of nationally funded or enterprise-funded projects obtained by the technology;
average citation count of papers: the number of times papers written using the technology are cited;
industry-university-research cooperation ratio: the proportion of university-enterprise cooperation papers among all papers written using the technology;
technical field coverage: the number of technical fields across which the related patents under the technical subject are distributed;
industry expansion degree: the growth rate of the number of patents for the technology, which reflects how fast the technology is expanding.
Preferably, constructing the three-level quantitative identification index system in step S2 specifically comprises the following:
According to the technology news volume in the last three years, the technology news growth rate quantitative index N is calculated as follows:
Figure BDA0004010802490000091
where N_a is the volume of technology-related news released in the current year, N_{a-1} is the volume released in the previous year, and N_{a-2} is the volume released two years earlier; this index reflects the market attention the technology receives;
According to the number of patents related to the technology in the last three years, the average technology patent number quantitative index ZCS is calculated as follows:
Figure BDA0004010802490000101
where ZCS_t is the number of patents related to the technology in the last three years, and ZCS_p is the number of patents for the technology in a specified time period;
The patent market value coverage quantitative index ZGDP_m is calculated as follows:
Figure BDA0004010802490000102
where GDP_i is the total production value of the technology in the i-th country in a specified year within the last three years; the quantity shown in
Figure BDA0004010802490000103
is the total production value of the technology across the i countries in the specified year within the last three years; and GDP_a is the total production value of China in the set year. A higher index value indicates a higher market value of the technology;
The patent cited rate quantitative index PCS is calculated as follows:
Figure BDA0004010802490000104
where PCS_a is the number of citations received by the patents of the technology in the last three years, and PCS_n is the total number of patent applications for the technology in a specified time period;
The patent conversion ratio quantitative index PTS is calculated as follows:
Figure BDA0004010802490000105
where PTSN is the number of technology-related patents transferred in the last three years, and TSN is the number of technology-related patents granted in a specified time period;
The technology maturity quantitative index ZS is calculated as follows:
Figure BDA0004010802490000106
where M_a is the number of related invention patents applied for under the technology in the current year, and M_b is the number of related utility model patents applied for under the technology in the current year; the higher the index, the lower the maturity;
The related-patent novelty quantitative index ZN is calculated as follows:
Figure BDA0004010802490000107
where C = {1, 2, …, i}, 1 ≤ i; C is the set of all years over which the technology has developed, i denotes the i-th year in the set C, Y_i is the number of related patents applied for under the technology in the i-th year, and n is the total number of related patents of the technology over all years; the larger the index, the higher the novelty;
The innovation degree quantitative index ZO is calculated as follows:
Figure BDA0004010802490000111
where B_j is the ratio of the number of patents cited in the last three years to the total number of citations, and S_B is the set of cited patent categories under the technology; the larger the index, the higher the innovation degree;
The technology growth rate quantitative index TG is calculated as follows:
Figure BDA0004010802490000112
wherein TN is the number of patent applications for the technology in the current year, and TA is the number of patent applications for the technology in the last three years;
the specific calculation formula of the quantization index of the growth rate of the number of projects is as follows:
Figure BDA0004010802490000113
wherein TC_a is the number of horizontal and vertical projects obtained by the technology in the current year, TC_{a-1} is the number obtained in the previous year, and TC_{a-2} is the number obtained two years earlier; the larger the index, the higher the industrial superiority;
the calculation formula of the average quoted times of papers quantization index PCI_p is as follows:
Figure BDA0004010802490000114
wherein PC is the number of times the technology was cited in scientific and technological papers in the last three years, and n is the number of times related papers of the technology were cited in a specified time period; the larger the index value, the higher the scientific research innovation of the technology;
the specific calculation formula of the PGR quantitative index of the number increase rate of the relevant papers is as follows:
Figure BDA0004010802490000115
wherein PGR_a is the total number of related scientific and technological papers published on the technology in the current year, PGR_{a-1} is the total number published in the previous year, and PGR_{a-2} is the total number published two years earlier;
the calculation formula of the industry-university-research cooperation ratio quantization index TRCR is as follows:
Figure BDA0004010802490000116
wherein TRCN is the number of related papers whose author affiliations include both enterprises and university research institutes, and PR is the total number of related scientific and technological papers of the technology in a specified time period;
the specific calculation formula of the industry expansion degree, quantified as the technical field growth rate index TFGR, is as follows:
Figure BDA0004010802490000121
wherein TFGR_a is the number of related patents in each technical field of the technology in the current year, TFGR_{a-1} is the number of related patents in each technical field in the previous year, and TFGR_{a-2} is the number of related patents in each technical field two years earlier.
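The ratio-type indices above (PCS, PTS, TRCR) each pair a count with a reference total. The following Python is a minimal sketch, assuming each of these indices is the plain quotient of the two quantities defined for it; the formula images are not reproduced in this text, so that form is an assumption rather than the patent's stated formula. The function name `ratio_index` and all counts are hypothetical.

```python
def ratio_index(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero.

    Assumed form for the ratio-type indices (PCS, PTS, TRCR); the original
    formula images are not available in this text.
    """
    if denominator == 0:
        return 0.0  # no reference total, so the index carries no signal
    return numerator / denominator


# Hypothetical counts, for illustration only.
pcs = ratio_index(120, 400)   # PCS_a / PCS_n: citations vs. patent applications
pts = ratio_index(15, 60)     # PTSN / TSN: transferred vs. granted patents
trcr = ratio_index(30, 200)   # TRCN / PR: cooperation papers vs. all papers

print(pcs, pts, trcr)
```

Returning 0.0 for an empty denominator is a design choice of this sketch; a real pipeline might instead flag the index as missing.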
The index parameter calculation results are standardized; the standardization methods for the different data types are as follows:
S31: when the index data is 0-1 type data: no further standardization is needed, because the result already lies in the interval [0, 1];
S32: when the index data is interval-type (-∞, +∞) data: because such data does not lie within the interval [0, 1], it must be standardized:
Figure BDA0004010802490000122
where X_i is the original value of the corresponding data type and P_i is the normalized output value;
S33: when the index data is proportion-type data: proportion-type data does not strictly fall within the standardization interval [0, 1], so it must be standardized:
Figure BDA0004010802490000123
where X_i is the original value of the corresponding data type, P_i is the normalized output value, and
Figure BDA0004010802490000124
is the average value of the quantized data index;
S34: when the index data is an absolute numerical index: absolute numerical indices are processed in two parts. The average number of technical patents, the number of horizontal and vertical projects, the project number growth rate, and the average quoted times of papers are standardized against an average reference value, as follows:
Figure BDA0004010802490000125
where X_i is the original value of the corresponding data type, P_i is the normalized output value, and
Figure BDA0004010802490000126
is the average value of the quantized data index;
S35: the patent market value coverage and technical field coverage data indices are not suited to standardization against an average value, so a threshold is set separately for each:
the patent market value coverage quantitative data index is standardized as follows:
Figure BDA0004010802490000131
where X_i is the original value of the corresponding data type and P_i is the normalized output value;
S36: the technical field coverage quantitative data index is standardized as follows:
Figure BDA0004010802490000132
where X_i is the original value of the corresponding data type and P_i is the normalized output value.
The step S4 includes the following steps:
S41: establishing an engine technology identification output model; the output value of the technology is determined by an output function, which is specifically as follows:
Figure BDA0004010802490000133
in the formula, k denotes the specific technology in the category, c_jk is the corresponding standardized index parameter of that technology, j is the label of a quantifiable index of a specific type of the technology, and h_jk is the index weight of the standardized index parameter corresponding to that label;
S42: determining the data type of each index under the computable quantitative identification index system of step S2, calculating the indices through the quantitative index system, reclassifying them by data type, mapping them one-to-one to step S41, determining the classification of j, and importing the corresponding index data into c_jk;
S43: based on the idea of subjective weighting, determining the weights of the engine technology identification indices of the technology in the predicted field category by the analytic hierarchy process, standardizing the weight calculation results, and mapping them one-to-one to the values of h_jk in step S41;
S44: with the index data c_jk and the corresponding weights h_jk determined in steps S42 and S43, calculating the final output value with the output function, establishing identification results for different output value intervals as reference standards, and performing matching identification according to the output value.
The identification intervals established for the output value in step S44 are specifically:
when the output value M ∈ (0.6, 1.0], the technology is an engine technology;
when the output value M ∈ (0.4, 0.6], it cannot be determined whether the technology is an engine technology;
when the output value M ∈ [0, 0.4], the technology is a non-engine technology.
The details are shown in Table 1:
Table 1: Output value prediction reference table
Recognition result       Output value
Engine technology        (0.6, 1.0]
Uncertain                (0.4, 0.6]
Non-engine technology    [0, 0.4]
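The interval matching of Table 1 is simple enough to state as code. The Python below is a minimal sketch; the function name `classify_output` and the returned labels are hypothetical, while the interval boundaries are those of Table 1.

```python
def classify_output(m: float) -> str:
    """Map an output value M in [0, 1] to the recognition result of Table 1."""
    if not 0.0 <= m <= 1.0:
        raise ValueError("output value must lie in [0, 1]")
    if m > 0.6:                       # (0.6, 1.0]
        return "engine technology"
    if m > 0.4:                       # (0.4, 0.6]
        return "uncertain"
    return "non-engine technology"    # [0, 0.4]


for value in (0.75, 0.6, 0.4):
    print(value, classify_output(value))
```

Both boundary values fall into the lower interval: 0.6 is classified as uncertain and 0.4 as non-engine technology, matching the half-open intervals in the table.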
Example 1: the invention provides a field engine technology identification method based on multi-source data fusion calculation, taking the technical field of remote sensing as an example, and specifically comprising the following steps:
Step 1: the multi-source heterogeneous index parameter information in the technical field of remote sensing specifically comprises: the technical news amount, the average technical patent amount, the patent market value coverage, the patent quoted rate, the patent conversion rate, the technical maturity, the related patent novelty, the innovation degree, the technical growth rate, the number of horizontal and vertical projects, the average quoted times of papers, the industry-university-research cooperation ratio, the technical field coverage, and the industry expansion degree. The multi-source data required for the remote sensing field specifically comprises scientific paper data, news data, patent invention data, and scientific research project data; the remote sensing data sources specifically comprise WOS scientific paper data, the ProjectGate global scientific research project database, and the CSA news database. The data types include absolute numerical type, interval type, 0-1 integer type, and proportion type.
Step 2: a computable quantitative identification index system is constructed; taking the selected remote sensing technical field as an example, the calculation results of each quantitative index are shown in Table 2:
Step 3: the index calculation results are standardized into values between 0 and 1, with the processing mode selected according to the data type; the specific data are shown in Table 2:
Step 4: based on the idea of subjective weighting, the weights of the engine technology identification indices of the technology in the predicted field category are determined by the analytic hierarchy process, the weight calculation results are standardized, and they are mapped one-to-one to the values of h_jk in step S41; the resulting index weights in the engine technology identification index system based on multi-source data fusion are shown in Table 2:
Table 2: Statistics of the related index parameters for the remote sensing technology field
Figure BDA0004010802490000151
Step 5: the calculated data are passed through the engine technology identification output model; the output value calculated by the output function is as follows:
Figure BDA0004010802490000152
wherein M = 0.6443 > 0.6; therefore, the method provided by the invention identifies the remote sensing technology field as an engine technology, and the identification result is consistent with the actual situation.
The above embodiments are only for illustrating the specific embodiments of the present invention in detail, and the protection scope of the present invention should not be limited thereby, and any modifications made on the basis of the technical solutions according to the technical ideas presented in the present invention are within the protection scope of the present invention.

Claims (6)

1. A field engine technology identification method based on multi-source data fusion calculation is characterized by comprising the following steps:
S1: acquiring index parameter information of the corresponding technology from the massive multi-source heterogeneous data in a database;
S2: constructing a computable quantitative identification index system for the technology, and quantizing the index parameters of step S1;
S3: standardizing the calculation results of the index parameters of step S2;
S4: assigning weights to the standardized index parameters through the established engine technology identification model to obtain an output value, and judging from the output value whether the technology is an engine technology.
2. The method for recognizing the field engine technology based on the multi-source data fusion calculation according to claim 1, wherein the index parameter information according to the technology in the step S1 includes:
number of technical news: the volume of news about the technology released through news media platforms and the like;
average number of technical patents: the average number of related patents per year, within a certain period of time, in the technical field to which the technology belongs;
patent quoted rate: the frequency with which related patents in the technical field of the technology are cited;
patent conversion rate: the extent to which related patents in the technical field of the technology are transferred;
technology maturity: the future development potential of the technology;
related patent novelty: the growth of the technology as judged from the technical category distribution in its technical field;
patent market value coverage: the ratio of the GDP of related patents in China to the GDP of developed countries in the technical field of the technology;
innovation degree and technology growth rate: the ratio of the technology's cumulative patent applications to its cumulative granted patents within a certain time period;
number of horizontal and vertical projects: the number of national or enterprise funded projects obtained by the technology;
average quoted times of papers: the number of times papers written using the technology are cited;
industry-university-research cooperation ratio: the proportion of university-enterprise cooperation papers among all papers written using the technology;
technical field coverage: the number of technical fields across which related patents under the technical subject are distributed;
industry expansion degree: the expansion speed of the technology, as reflected by the growth rate of the number of technical patents.
3. The field engine technology identification method based on multi-source data fusion calculation of claim 1, wherein a three-level quantization identification index system is constructed in the step S2, and the method specifically comprises the following steps:
according to the quantity of technical news in the last three years, a quantization index N of the technical news growth rate is calculated, with the following formula:
Figure FDA0004010802480000021
wherein N_a is the volume of technology-related news released in the current year, N_{a-1} is the volume released in the previous year, and N_{a-2} is the volume released two years earlier; the index reflects the degree of market attention to the technology;
according to the number of patents related to the technology in the last three years, an average technical patent number quantization index ZCS is calculated, with the following formula:
Figure FDA0004010802480000022
wherein ZCS_t is the number of patents related to the technology in the last three years, and ZCS_p is the number of patents of the technology in a specified time period;
the calculation formula of the patent market value coverage quantization index ZGDP_m is as follows:
Figure FDA0004010802480000023
wherein GDP_i is the total production value, in the specified year, of the i-th country for the technology in the last three years,
Figure FDA0004010802480000024
is the total production value, in the specified year, of the i countries for the technology in the last three years, and GDP_a is the total production value of China in the specified year; a higher index value indicates a higher market value of the technology;
the calculation formula of the patent quoted rate quantization index PCS is as follows:
Figure FDA0004010802480000025
wherein PCS_a is the number of times patents of the technology were cited in the last three years, and PCS_n is the total number of patent applications for the technology in a specified time period;
the calculation formula of the patent conversion rate quantization index PTS is as follows:
Figure FDA0004010802480000026
wherein PTSN is the number of technology-related patents that underwent patent transfer in the last three years, and TSN is the number of technology-related patents granted in a specified time period;
the specific calculation formula of the technical maturity quantization index ZS is as follows:
Figure FDA0004010802480000031
wherein M_a is the number of related invention patents applied for under the technology in the current year, and M_b is the number of related utility model patents applied for under the technology in the current year; the higher the index, the lower the maturity;
the specific calculation formula of the related patent novelty quantization index ZN is as follows:
Figure FDA0004010802480000032
wherein C = {1, 2, …, i}, 1 ≤ i; specifically, C is the set of all years of the technology's development, i denotes the i-th year in the set C, Y_i is the number of related patents applied for under the technology in the i-th year, and n is the total number of related patents of the technology over all years; the larger the index, the higher the novelty;
the specific calculation formula of the innovation degree quantization index ZO is as follows:
Figure FDA0004010802480000033
wherein B_j is the ratio of the number of patents cited by this patent in the last three years to the total number cited, and S_B is the set of cited patent categories under the technology; the larger the index, the higher the innovation degree;
the specific calculation formula of the technical growth rate quantization index TG is as follows:
Figure FDA0004010802480000034
wherein TN is the number of patent applications for the technology in the current year, and TA is the number of patent applications for the technology in the last three years;
the specific calculation formula of the quantization index of the growth rate of the number of projects is as follows:
Figure FDA0004010802480000035
wherein TC_a is the number of horizontal and vertical projects obtained by the technology in the current year, TC_{a-1} is the number obtained in the previous year, and TC_{a-2} is the number obtained two years earlier; the larger the index, the higher the industrial superiority;
the calculation formula of the average quoted times of papers quantization index PCI_p is as follows:
Figure FDA0004010802480000036
wherein PC is the number of times the technology was cited in scientific and technological papers in the last three years, and n is the number of times related papers of the technology were cited in a specified time period; the larger the index value, the higher the scientific research innovation of the technology;
the specific calculation formula of the quantization index PGR of the growth rate of the number of related papers is as follows:
Figure FDA0004010802480000041
wherein PGR_a is the total number of related scientific and technological papers published on the technology in the current year, PGR_{a-1} is the total number published in the previous year, and PGR_{a-2} is the total number published two years earlier;
the calculation formula of the industry-university-research cooperation ratio quantization index TRCR is as follows:
Figure FDA0004010802480000042
wherein TRCN is the number of related papers whose author affiliations include both enterprises and university research institutes, and PR is the total number of related scientific and technological papers of the technology in a specified time period;
the specific calculation formula of the industry expansion degree, quantified as the technical field growth rate index TFGR, is as follows:
Figure FDA0004010802480000043
wherein TFGR_a is the number of related patents in each technical field of the technology in the current year, TFGR_{a-1} is the number of related patents in each technical field in the previous year, and TFGR_{a-2} is the number of related patents in each technical field two years earlier.
4. The method for recognizing the field engine technology based on the multi-source data fusion calculation according to claim 1, wherein the step S3 of standardizing the index data specifically includes:
S31: when the index data is 0-1 type data: no further standardization is needed, because the result already lies in the interval [0, 1];
S32: when the index data is interval-type (-∞, +∞) data: because such data does not lie within the interval [0, 1], it must be standardized:
Figure FDA0004010802480000044
where X_i is the original value of the corresponding data type and P_i is the normalized output value;
S33: when the index data is proportion-type data: proportion-type data does not strictly fall within the standardization interval [0, 1], so it must be standardized:
Figure FDA0004010802480000045
where X_i is the original value of the corresponding data type, P_i is the normalized output value, and
Figure FDA0004010802480000046
is the average value of the quantized data index;
S34: when the index data is an absolute numerical index: absolute numerical indices are processed in two parts. The average number of technical patents, the number of horizontal and vertical projects, the project number growth rate, and the average quoted times of papers are standardized against an average reference value, as follows:
Figure FDA0004010802480000051
where X_i is the original value of the corresponding data type, P_i is the normalized output value, and
Figure FDA0004010802480000052
is the average value of the quantized data index;
S35: the patent market value coverage and technical field coverage data indices are not suited to standardization against an average value, so a threshold is set separately for each:
the patent market value coverage quantitative data index is standardized as follows:
Figure FDA0004010802480000053
where X_i is the original value of the corresponding data type and P_i is the normalized output value;
S36: the technical field coverage quantitative data index is standardized as follows:
Figure FDA0004010802480000054
where X_i is the original value of the corresponding data type and P_i is the normalized output value.
5. The field engine technology identification method based on multi-source data fusion calculation of claim 1, wherein the step S4 comprises the following steps:
S41: establishing an engine technology identification output model; the output value of the technology is determined by an output function, which is specifically as follows:
Figure FDA0004010802480000055
in the formula, k denotes a specific technology, c_jk is the corresponding standardized index parameter of that technology, j is the label of a quantifiable index of a specific type of the technology, and h_jk is the index weight of the standardized index parameter corresponding to that label;
S42: determining the data type of each index under the computable quantitative identification index system of step S2, calculating the indices through the quantitative index system, reclassifying them by data type, mapping them one-to-one to step S41, determining the classification of j, and importing the corresponding index data into c_jk;
S43: based on the idea of subjective weighting, determining the weights of the engine technology identification indices of the technology in the predicted field category by the analytic hierarchy process, standardizing the weight calculation results, and mapping them one-to-one to the values of h_jk in step S41;
S44: with the index data c_jk and the corresponding weights h_jk determined in steps S42 and S43, calculating the final output value with the output function, establishing identification results for different output value intervals as reference standards, and performing matching identification according to the output value.
6. The method for recognizing the field engine technology based on the multi-source data fusion calculation according to claim 5, wherein the recognition interval established for the output value in the step S44 specifically includes:
when the output value M ∈ (0.6, 1.0], the technology is an engine technology;
when the output value M ∈ (0.4, 0.6], it cannot be determined whether the technology is an engine technology;
when the output value M ∈ [0, 0.4], the technology is a non-engine technology.
CN202211651352.4A 2022-12-21 2022-12-21 Domain engine technology identification method based on multi-source data fusion calculation Withdrawn CN115774860A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211651352.4A CN115774860A (en) 2022-12-21 2022-12-21 Domain engine technology identification method based on multi-source data fusion calculation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211651352.4A CN115774860A (en) 2022-12-21 2022-12-21 Domain engine technology identification method based on multi-source data fusion calculation

Publications (1)

Publication Number Publication Date
CN115774860A true CN115774860A (en) 2023-03-10

Family

ID=85393106

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211651352.4A Withdrawn CN115774860A (en) 2022-12-21 2022-12-21 Domain engine technology identification method based on multi-source data fusion calculation

Country Status (1)

Country Link
CN (1) CN115774860A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116502924A (en) * 2023-06-28 2023-07-28 深圳市建筑设计研究总院有限公司 Operation management system and method for building industry chain innovation system
CN116502924B (en) * 2023-06-28 2024-01-26 深圳市建筑设计研究总院有限公司 Operation management system and method for building industry chain innovation system
CN117056867A (en) * 2023-10-12 2023-11-14 中交第四航务工程勘察设计院有限公司 Multi-source heterogeneous data fusion method and system for digital twin
CN117056867B (en) * 2023-10-12 2024-01-23 中交第四航务工程勘察设计院有限公司 Multi-source heterogeneous data fusion method and system for digital twin

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20230310
