CN115525927A - Intelligent monitoring method and system for scientific and technological achievement transformation data based on artificial intelligence - Google Patents

Intelligent monitoring method and system for scientific and technological achievement transformation data based on artificial intelligence

Info

Publication number
CN115525927A
Authority
CN
China
Prior art keywords
conversion data
occlusion
word
historical
shielding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210351897.7A
Other languages
Chinese (zh)
Inventor
温杨馨
王淑芸
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nantong Aopu Technology Co ltd
Original Assignee
Nantong Aopu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nantong Aopu Technology Co ltd
Priority to CN202210351897.7A
Publication of CN115525927A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3335 Syntactic pre-processing, e.g. stopword elimination, stemming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3338 Query expansion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Artificial Intelligence (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Bioethics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to the technical field of artificial intelligence, and in particular to an artificial intelligence-based method and system for intelligently monitoring scientific and technological achievement transformation data. The method comprises the following steps: acquiring historical conversion data, randomly occluding a single term in the historical conversion data, calculating the semantic difference between the occluded conversion data and the historical conversion data, and selecting a first occlusion word; expanding the first occlusion word to both sides until the semantic difference is greater than a preset threshold to obtain a first occlusion area; adding terms to the first occlusion word until the minimum semantic difference is greater than the preset threshold to obtain a second occlusion area; constructing a first structure vector according to the intersection-over-union ratio (IoU) of the first occlusion area and the second occlusion area; grouping all the first structure vectors; obtaining the keywords of the historical conversion data; and acquiring a second structure vector of the current conversion data, selecting the group most similar to the second structure vector, and judging whether the current conversion data is abnormal. The embodiment of the invention enables content anomaly monitoring of the conversion data.

Description

Intelligent monitoring method and system for scientific and technological achievement transformation data based on artificial intelligence
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to an artificial intelligence-based method and system for intelligently monitoring scientific and technological achievement transformation data.
Background
Scientific and technological achievement transformation refers to the follow-up testing, development, application, and popularization of scientific and technological achievements of practical value produced by scientific research and technical development, carried out to raise the level of productivity, up to the formation of new products, new processes, and new materials and the development of new industries. Promoting this transformation and accelerating the industrialization of scientific and technological achievements has become a new trend in the science and technology policies of countries around the world.
With the continuing reform of the science and technology system, especially major reform measures in resource allocation, plan management, and the transformation of scientific and technological achievements, and with the rise of mass entrepreneurship and innovation, scientific and technological achievements are being converted into practical productivity at an increasing pace. At the same time, monitoring of scientific and technological achievement transformation data is gradually being promoted.
The prior art usually monitors scientific and technological achievement conversion data manually. This approach incurs high labor costs, and the strong subjectivity of manual monitoring biases the monitoring results. Some prior art therefore proposes process monitoring through an artificial intelligence network, but the existing approaches monitor only the conversion process and do not consider anomalies in the scientific and technological achievement conversion data itself. This leaves a potential risk of data falsification, which can obstruct the conversion process.
Disclosure of Invention
In order to solve the above technical problems, the invention aims to provide an artificial intelligence-based method and system for intelligently monitoring scientific and technological achievement transformation data. The adopted technical scheme is as follows:
In a first aspect, an embodiment of the present invention provides an artificial intelligence-based method for intelligently monitoring scientific and technological achievement transformation data, including the following steps:
acquiring historical conversion data of a historical achievement conversion file, randomly occluding a single term in the historical conversion data, calculating the semantic difference between the occluded conversion data and the historical conversion data, and taking the term corresponding to the minimum semantic difference as a first occlusion word;
expanding the first occlusion word to both sides and occluding the historical conversion data until the semantic difference is greater than a preset threshold, and taking the occluded terms at that point as a first occlusion area;
adding terms to the first occlusion word and occluding the historical conversion data, wherein each time a term is added the unoccluded terms are traversed to select a second occlusion word, and the second occlusion word continues to be expanded by traversing terms until the minimum semantic difference is greater than the preset threshold, the occluded terms at that point being taken as a second occlusion area;
constructing a first structure vector according to the intersection-over-union ratio (IoU) of the first occlusion area and the second occlusion area; dividing the first structure vectors of all the historical conversion data into a plurality of groups by clustering; and removing the terms contained in the first structure vector from each piece of historical conversion data to obtain the keywords of that historical conversion data;
acquiring a second structure vector of the current conversion data and selecting the group most similar to the second structure vector; and comparing the keywords of the current conversion data with the keywords corresponding to each first structure vector in the group, wherein the current conversion data is abnormal data when the maximum similarity obtained by the comparison is not greater than a similarity threshold.
Preferably, the step of collecting the historical conversion data comprises:
establishing a historical document library for scientific and technological achievement conversion, taking the subject text of each document as the identifier of the document, and taking the word vector corresponding to the identifier as the historical conversion data.
Preferably, the calculation of the semantic difference comprises:
reconstructing, through an auto-encoder, the historical conversion data in which a term has been occluded to obtain the occluded conversion data, and calculating the Euclidean distance between the word vectors corresponding to the historical conversion data and the occluded conversion data as the semantic difference.
Preferably, the step of obtaining the first occlusion area comprises:
taking the first occlusion word as the center and the term length of the first occlusion word as the expansion size, expanding to both sides to obtain a first expansion term; occluding the historical conversion data with the first expansion term to obtain first conversion data; calculating the semantic difference between the word vectors of the historical conversion data and the first conversion data and comparing it with a preset threshold; and, when the semantic difference is not greater than the preset threshold, continuing to expand to both sides by the expansion size, centered on the first expansion term, until the semantic difference is greater than the preset threshold, thereby obtaining the first occlusion area.
Preferably, the step of acquiring the second occlusion area comprises:
randomly adding a term to the first occlusion word to expand it into a second expansion term; occluding the historical conversion data with the second expansion term to obtain second conversion data; calculating the semantic difference between the word vectors of the historical conversion data and the second conversion data; traversing the unoccluded words and taking the occlusion word corresponding to the minimum semantic difference as a second occlusion word; and, when that semantic difference is not greater than the preset threshold, continuing to expand the second occlusion word by adding a term in the same way until the minimum semantic difference is greater than the preset threshold, thereby obtaining the second occlusion area.
Preferably, the step of constructing the first structure vector comprises:
calculating the IoU from the numbers of terms in the intersection and in the union of the first occlusion area and the second occlusion area;
acquiring the difference terms, that is, the terms in the union of the first occlusion area and the second occlusion area that are not in their intersection, together with their positions, and constructing the first structure vector from the difference positions and the IoU.
Preferably, dividing the first structure vectors of all the historical conversion data into a plurality of groups by clustering comprises:
presetting an initial number of cluster categories, clustering all the first structure vectors to obtain a plurality of initial categories, calculating the difference among the first structure vectors within each initial category, and then obtaining the category difference, that is, the difference among the within-category differences of all the initial categories;
gradually increasing the number of categories, calculating the category difference for each number in turn, selecting the number of categories corresponding to the minimum category difference as the ideal number, and taking the clustering result corresponding to the ideal number as the grouping result.
Preferably, selecting the group most similar to the second structure vector comprises:
acquiring the central structure vector located at the geometric center of each group, calculating the similarity between each central structure vector and the second structure vector, and taking the group corresponding to the maximum similarity as the group most similar to the second structure vector.
In a second aspect, another embodiment of the present invention provides an artificial intelligence-based system for intelligently monitoring scientific and technological achievement transformation data, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the method for intelligently monitoring the scientific and technological achievement transformation data when executing the computer program.
The embodiment of the invention at least has the following beneficial effects:
A first structure vector is obtained by analyzing each piece of historical conversion data, and the first structure vectors are grouped. A second structure vector of the current conversion data is then acquired, the group most similar to the second structure vector is selected, and the keywords of the current conversion data are compared with the keywords of each member of that group to judge whether the current conversion data is abnormal.
Drawings
In order to illustrate the embodiments of the present invention, or the technical solutions and advantages of the prior art, more clearly, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart illustrating steps of an intelligent monitoring method for scientific and technological achievement transformation data based on artificial intelligence according to an embodiment of the present invention.
Detailed Description
In order to further explain the technical means adopted by the present invention to achieve its intended objects, and their effects, the artificial intelligence-based method and system for intelligently monitoring scientific and technological achievement transformation data according to the present invention, including their specific implementation, structure, features, and effects, are described in detail below with reference to the accompanying drawings and preferred embodiments. In the following description, different references to "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
Intelligent monitoring of the conversion data acts on the preprocessing stage of the conversion process; that is, the conversion data undergoes a preliminary intelligent check so that the subsequent conversion process can proceed smoothly.
The following describes a specific scheme of the intelligent monitoring method and system for scientific and technological achievement transformation data based on artificial intelligence in detail with reference to the accompanying drawings.
Referring to Fig. 1, a flowchart of the steps of an artificial intelligence-based method for intelligently monitoring scientific and technological achievement transformation data according to an embodiment of the present invention is shown. The method includes the following steps:
Step S001, acquiring historical conversion data, randomly occluding a single term in the historical conversion data, calculating the semantic difference between the occluded conversion data and the historical conversion data, and taking the term corresponding to the minimum semantic difference as a first occlusion word.
The specific steps are as follows:
1. Collect the historical conversion data.
A historical document library for scientific and technological achievement conversion is established, the subject text of each document is taken as the identifier of the document, and the word vector corresponding to the identifier is taken as the historical conversion data.
In the embodiment of the present invention, the title of a document is used as its identifier; in other embodiments, the abstract or other text representing the subject of the document may also be used as the identifier.
As an example, for the title of the present invention, "Intelligent monitoring method and system for scientific and technological achievement transformation data based on artificial intelligence", natural language processing (NLP) yields the word vector [based on artificial intelligence, scientific and technological achievement, transformation data, intelligent, monitoring, method, and system], and this word vector is the conversion data of the present invention.
2. Acquire the first occlusion word.
The historical conversion data in which a term has been occluded is reconstructed by an auto-encoder to obtain the occluded conversion data, and the Euclidean distance between the word vectors corresponding to the historical conversion data and the occluded conversion data is calculated as the semantic difference.
An auto-encoder is constructed whose input is a single piece of historical conversion data and whose output is the reconstructed conversion data. In training, the historical conversion data is used as the training data set, and the Euclidean distance between the input and the output is used as the reconstruction loss of the auto-encoder.
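The reconstruction loss described above can be written out as follows. This is a sketch under stated assumptions: $x$ is taken to be a fixed-length numeric encoding of one piece of historical conversion data with components $x_i$, $x'$ is the auto-encoder output for $x$, and the symbol $\mathcal{L}_{\mathrm{rec}}$ is introduced here only for illustration.

```latex
\mathcal{L}_{\mathrm{rec}}(x) \;=\; \mathrm{dis}(x, x') \;=\; \lVert x - x' \rVert_2 \;=\; \sqrt{\sum_{i} \left( x_i - x'_i \right)^2}
```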
Each piece of historical conversion data is reconstructed by the trained auto-encoder to obtain the conversion data $x'$ reconstructed before occlusion. The historical conversion data is then occluded, where occlusion means covering a term with a blank: a single term in the historical conversion data is occluded and the result is input into the trained auto-encoder to obtain the conversion data $x'_m$ reconstructed after occlusion, where $m$ indicates that the $m$-th term is occluded. The Euclidean distance $\mathrm{dis}(x', x'_m)$ between $x'$ and $x'_m$ is taken as the semantic difference. Occlusion is applied to every term of the conversion data in turn, giving a set of distances $\mathrm{dis}(x', x'_m)$, and the occluded term corresponding to $\min_m \mathrm{dis}(x', x'_m)$ is taken as the first occlusion word.
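The selection of the first occlusion word can be sketched as below. This is a minimal illustration, not the patent's implementation: `reconstruct` stands in for the trained auto-encoder, and `toy_reconstruct` with its `TOY_EMBEDDINGS` table is a hypothetical placeholder invented here so that the example runs without a trained model.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def first_occlusion_word(
    terms: Sequence[str],
    reconstruct: Callable[[List[str]], Vector],
) -> Tuple[int, List[float]]:
    """Return the index m that minimises dis(x', x'_m), plus all distances."""
    baseline = reconstruct(list(terms))  # x': reconstruction before occlusion
    distances = []
    for m in range(len(terms)):
        occluded = list(terms)
        occluded[m] = ""  # occlusion covers the m-th term with a blank
        distances.append(math.dist(baseline, reconstruct(occluded)))
    best = min(range(len(terms)), key=distances.__getitem__)
    return best, distances


# Hypothetical stand-in for the auto-encoder: each term has a fixed 2-D
# embedding and the "reconstruction" is the sum of the visible embeddings.
TOY_EMBEDDINGS = {
    "intelligent": (3.0, 0.0),
    "monitoring": (0.0, 4.0),
    "method": (0.5, 0.0),
    "system": (0.0, 1.0),
}


def toy_reconstruct(terms: List[str]) -> List[float]:
    vectors = [TOY_EMBEDDINGS.get(t, (0.0, 0.0)) for t in terms]
    return [sum(v[i] for v in vectors) for i in range(2)]


title_terms = ["intelligent", "monitoring", "method", "system"]
index, distances = first_occlusion_word(title_terms, toy_reconstruct)
print(title_terms[index], distances)
```

With this toy stand-in, occluding "method" changes the reconstruction least, so it is selected as the first occlusion word.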
Step S002, expanding the first occlusion word to both sides and occluding the historical conversion data until the semantic difference is greater than a preset threshold, and taking the occluded terms at that point as a first occlusion area.
The specific steps are as follows:
Taking the first occlusion word as the center and the term length of the first occlusion word as the expansion size, the occlusion is expanded to both sides to obtain a first expansion term. The historical conversion data is occluded with the first expansion term to obtain first conversion data, the semantic difference between the word vectors of the historical conversion data and the first conversion data is calculated, and the semantic difference is compared with a preset threshold. When the semantic difference is not greater than the preset threshold, expansion continues to both sides by the expansion size, centered on the first expansion term, until the semantic difference is greater than the preset threshold, which gives the first occlusion area.
The occlusion is expanded on the basis of the first occlusion word, that is, the size of the occluded span changes. The occluded span of the first occlusion word has size 1 and the expansion size is 1, so each expansion changes the span size by $L = L + 2$, extending it by one term position to the left and one to the right. The first expansion yields the first expansion term; the historical conversion data is occluded with it and fed into the auto-encoder to obtain the reconstructed first conversion data, and the Euclidean distance $\mathrm{dis}_1(x', x''_m)$ between the reconstructed first conversion data and the reconstructed historical conversion data $x'$ is calculated, where $x''_m$ denotes the first conversion data reconstructed after occlusion by the first expansion term centered on the $m$-th term at the 1st expansion. A preset Euclidean distance threshold $m_{dis}$ is set and compared with $\mathrm{dis}_1(x', x''_m)$. When $\mathrm{dis}_1(x', x''_m) > m_{dis}$, the first expansion term is the first occlusion area; otherwise expansion continues to both sides, centered on the first expansion term, to obtain $\mathrm{dis}_2(x', x''_m)$. The relationship between $c$ and $\mathrm{dis}_c(x', x''_m)$ is constructed in this way, where $c$ denotes the $c$-th expansion of the conversion data; the first $c$ satisfying $\mathrm{dis}_c(x', x''_m) > m_{dis}$ is found, $c-1$ is selected as the final number of expansions, and the occluded terms at that point are taken as the first occlusion area.
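The two-sided expansion can be sketched as follows. This sketch makes several assumptions that are not in the source: the `difference` callback stands in for occluding a span, running the auto-encoder, and measuring the Euclidean distance; `toy_difference` and `TOY_WEIGHTS` are hypothetical; the span is clamped at the ends of the term sequence; and the function returns the span at which the threshold is first exceeded (the text above also mentions taking $c-1$ expansions, which would return the previous span instead).

```python
from typing import Callable, Tuple


def expand_first_occlusion_area(
    n_terms: int,
    center: int,
    difference: Callable[[range], float],
    threshold: float,
) -> Tuple[int, int]:
    """Grow the span [lo, hi] around `center` by one term per side per step.

    Stops at the first expansion whose semantic difference exceeds
    `threshold`, or when the whole sequence is occluded.
    """
    lo = hi = center  # the first occlusion word alone: span size 1
    while lo > 0 or hi < n_terms - 1:
        lo = max(lo - 1, 0)            # one position to the left (clamped)
        hi = min(hi + 1, n_terms - 1)  # one position to the right (clamped)
        if difference(range(lo, hi + 1)) > threshold:
            break
    return lo, hi


# Hypothetical semantic difference: the summed "weight" of the occluded terms.
TOY_WEIGHTS = [3.0, 4.0, 0.5, 1.0, 2.0]


def toy_difference(occluded_positions) -> float:
    return sum(TOY_WEIGHTS[i] for i in occluded_positions)


narrow = expand_first_occlusion_area(5, 2, toy_difference, threshold=5.0)
wide = expand_first_occlusion_area(5, 2, toy_difference, threshold=6.0)
print(narrow, wide)
```

With a threshold of 5.0 the first expansion (positions 1 to 3) already exceeds it; with 6.0 a second expansion to the full sequence is needed.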
Step S003, adding terms to the first occlusion word and occluding the historical conversion data, wherein each time a term is added the unoccluded terms are traversed to select a second occlusion word, and the second occlusion word continues to be expanded by traversing terms until the minimum semantic difference is greater than the preset threshold, the occluded terms at that point being taken as a second occlusion area.
The specific steps are as follows:
A term is randomly added to the first occlusion word to expand it into a second expansion term, the historical conversion data is occluded with the second expansion term to obtain second conversion data, and the semantic difference between the word vectors of the historical conversion data and the second conversion data is calculated. The unoccluded words are traversed and the occlusion word corresponding to the minimum semantic difference is taken as the second occlusion word. When that semantic difference is not greater than the preset threshold, the second occlusion word continues to be expanded by adding a term in the same way until the minimum semantic difference is greater than the preset threshold, which gives the second occlusion area.
In detail, a term is added to the first occlusion word to obtain a second expansion term, the historical conversion data is occluded with the second expansion term and input into the auto-encoder to obtain second conversion data, and the semantic difference between the word vectors of the historical conversion data and the second conversion data is calculated.
All the unoccluded words are traversed to obtain a plurality of second expansion terms and the corresponding second conversion data, and the semantic difference between the word vector of the historical conversion data and each second conversion data is calculated. The occlusion word corresponding to the minimum semantic difference is selected as the second occlusion word. If the semantic difference corresponding to the second occlusion word is not greater than the preset threshold $m_{dis}$, a further term is added to obtain a third occlusion word, and so on, until the minimum semantic difference is greater than the preset threshold.
Each time an occluded term is added, the unoccluded terms are traversed, the occluded terms corresponding to the minimum semantic difference are selected, and the conversion data is updated. This continues until the minimum semantic difference is greater than the preset threshold, and the occluded terms at that point are taken as the second occlusion area.
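The greedy growth of the second occlusion area can be sketched as follows. As an assumption for illustration, `difference` again stands in for "occlude these positions, reconstruct with the auto-encoder, and measure the Euclidean distance"; `toy_difference` and `TOY_WEIGHTS` are hypothetical, and stopping when every term is occluded is an added safeguard that the source does not discuss.

```python
from typing import Callable, FrozenSet, List


def grow_second_occlusion_area(
    n_terms: int,
    first_index: int,
    difference: Callable[[FrozenSet[int]], float],
    threshold: float,
) -> List[int]:
    """Greedily add the unoccluded term that keeps the difference smallest.

    Unlike the two-sided expansion, the added terms need not be adjacent.
    Stops once the minimum achievable difference exceeds `threshold`.
    """
    occluded = frozenset({first_index})
    while len(occluded) < n_terms:
        candidates = [i for i in range(n_terms) if i not in occluded]
        # Traverse every unoccluded term and keep the one with minimum difference.
        best = min(candidates, key=lambda i: difference(occluded | {i}))
        occluded = occluded | {best}
        if difference(occluded) > threshold:
            break
    return sorted(occluded)


# Hypothetical semantic difference: the summed "weight" of the occluded terms.
TOY_WEIGHTS = [3.0, 4.0, 0.5, 1.0, 2.0]


def toy_difference(occluded_positions) -> float:
    return sum(TOY_WEIGHTS[i] for i in occluded_positions)


second_area = grow_second_occlusion_area(5, 2, toy_difference, threshold=3.0)
print(second_area)
```

In this toy run the area starts at position 2, adds position 3 (difference 1.5, still below the threshold), then position 4 (difference 3.5), and stops.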
Step S004, constructing a first structure vector according to the intersection-over-union ratio (IoU) of the first occlusion area and the second occlusion area; dividing the first structure vectors of all the historical conversion data into a plurality of groups by clustering; and removing the terms contained in the first structure vector from each piece of historical conversion data to obtain the keywords of that historical conversion data.
The specific steps are as follows:
1. Construct the first structure vector.
The IoU is calculated from the numbers of terms in the intersection and in the union of the first occlusion area and the second occlusion area. The difference terms, that is, the terms in the union that are not in the intersection, are obtained together with their positions, and the first structure vector is constructed from the difference positions and the IoU.
In detail, the IoU of the first occlusion area and the second occlusion area is obtained by counting the terms they cover, and the position information of the difference terms in the region obtained by subtracting the intersection from the union is acquired; the position information is represented by labels.
The number of difference terms between the first occlusion area and the second occlusion area is obtained for every piece of historical conversion data, and the maximum number $Z$ is found. A difference position description is constructed for each piece of historical conversion data as a vector of $Z$ elements; if a piece of historical conversion data has fewer than $Z$ difference terms, the description is zero-padded. The IoU and the difference position description are concatenated (Concat) to obtain a first structure vector of $Z + 1$ elements.
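The construction of the structure vector can be sketched as follows. Two details are assumptions made for this illustration: occlusion areas are represented as lists of 0-based term positions, and position labels are stored 1-based so that a real position is never confused with zero padding.

```python
from typing import Iterable, List, Sequence, Tuple


def difference_positions(first_area: Iterable[int], second_area: Iterable[int]) -> List[int]:
    """Positions in the union of the two areas but not in their intersection."""
    return sorted(set(first_area) ^ set(second_area))


def max_difference_count(pairs: Sequence[Tuple[Iterable[int], Iterable[int]]]) -> int:
    """Z: the largest number of difference terms over all historical data."""
    return max(len(difference_positions(a, b)) for a, b in pairs)


def structure_vector(first_area: Iterable[int], second_area: Iterable[int], z: int) -> List[float]:
    """Concatenate the IoU with the zero-padded difference positions (Z + 1 elements)."""
    a, b = set(first_area), set(second_area)
    union = a | b
    iou = len(a & b) / len(union) if union else 0.0
    labels = [p + 1 for p in difference_positions(a, b)]  # 1-based labels (assumption)
    if len(labels) > z:
        raise ValueError("z must be at least the number of difference terms")
    return [iou] + labels + [0] * (z - len(labels))


history = [([1, 2, 3], [2, 3, 4]), ([0], [0, 1, 2, 3])]
Z = max_difference_count(history)
vectors = [structure_vector(a, b, Z) for a, b in history]
print(Z, vectors)
```

For the first pair the areas share positions 2 and 3 out of four covered positions, so the IoU is 0.5, and the difference positions 1 and 4 become the labels 2 and 5, followed by one padding zero.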
2. Group all the first structure vectors.
An initial number of cluster categories is preset, all the first structure vectors are clustered to obtain a plurality of initial categories, the difference among the first structure vectors within each initial category is calculated, and the category difference, that is, the difference among the within-category differences of all the initial categories, is then obtained. The number of categories is gradually increased, the category difference is calculated for each number in turn, the number of categories corresponding to the minimum category difference is selected as the ideal number, and the clustering result corresponding to the ideal number is taken as the grouping result.
In detail, all the first structure vectors are clustered with the K-means clustering algorithm according to the preset initial number of categories to obtain a plurality of initial categories.
As an example, in the embodiment of the present invention the initial number of categories is 3, that is, the initial $K$ is set to 3.
It should be noted that the distance measure is not the Euclidean distance of the conventional K-means clustering algorithm; cosine similarity is used instead.
The intra-class variance of each initial cluster is obtained as the difference among the first structure vectors within that initial category, and the variance of these intra-class variances is calculated as the category difference $\sigma^2$.
$K$ is then adjusted by $K = K + 1$ and the same processing is repeated to obtain a series of category differences. The curve of $\sigma^2$ against $K$ is fitted, the value $K'$ at its minimum point is selected as the optimized $K$, and the clusters obtained by clustering with $K'$ are taken as the final grouping result.
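The search for $K'$ can be sketched in plain Python as follows. Several choices here are assumptions rather than the patent's specification: the centers are plain arithmetic means, the intra-class variance is taken as the mean squared cosine distance to the group mean, the initial centers are drawn with a fixed random seed, the upper bound `k_max` is hypothetical, and the minimum is taken over the discrete values of $K$ instead of a fitted curve.

```python
import math
import random
from statistics import pvariance
from typing import Dict, List, Tuple

Vector = List[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return 1.0 - dot / norm if norm else 1.0


def mean_vector(vectors: List[Vector]) -> Vector:
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def kmeans_cosine(vectors: List[Vector], k: int, iterations: int = 20, seed: int = 0) -> List[List[Vector]]:
    """K-means that assigns each vector to the center with least cosine distance."""
    centers = random.Random(seed).sample(vectors, k)
    groups: List[List[Vector]] = []
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda j: cosine_distance(v, centers[j]))
            groups[nearest].append(v)
        # Keep the old center when a group ends up empty.
        centers = [mean_vector(g) if g else centers[j] for j, g in enumerate(groups)]
    return groups


def intra_class_variance(group: List[Vector]) -> float:
    if not group:
        return 0.0
    center = mean_vector(group)
    return sum(cosine_distance(v, center) ** 2 for v in group) / len(group)


def category_difference(groups: List[List[Vector]]) -> float:
    """Variance of the intra-class variances (sigma^2 in the text)."""
    return pvariance([intra_class_variance(g) for g in groups])


def choose_k(vectors: List[Vector], k_start: int = 3, k_max: int = 6) -> Tuple[int, Dict[int, float]]:
    scores = {
        k: category_difference(kmeans_cosine(vectors, k))
        for k in range(k_start, min(k_max, len(vectors)) + 1)
    }
    return min(scores, key=scores.get), scores


data = [
    [0.5, 2, 5, 0], [0.5, 2, 4, 0], [1.0, 0, 0, 0], [0.8, 1, 0, 0],
    [0.2, 1, 2, 3], [0.25, 1, 2, 4], [0.6, 3, 0, 0], [0.9, 0, 0, 0],
]
best_k, scores = choose_k(data)
print(best_k, scores)
```

The starting value `k_start=3` follows the embodiment's initial $K$ of 3; the toy `data` vectors are made up for the example.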
3. Remove the terms contained in the first structure vector from each piece of historical conversion data to obtain the keywords of that historical conversion data.
With the first structure vector acquired and the keywords screened out, clustering and grouping can subsequently be performed on the structural representation, after which the keyword information is compared with that of the current conversion data.
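Keyword screening can be sketched as follows. The source does not spell out exactly which terms count as "contained in the first structure vector"; this sketch assumes they are the terms covered by either occlusion area, so that the remaining terms are the keywords. The function name and the example terms are hypothetical.

```python
from typing import Iterable, List, Sequence


def extract_keywords(
    terms: Sequence[str],
    first_area: Iterable[int],
    second_area: Iterable[int],
) -> List[str]:
    """Drop the terms covered by either occlusion area; keep the rest as keywords."""
    structural = set(first_area) | set(second_area)  # assumption: union of both areas
    return [term for position, term in enumerate(terms) if position not in structural]


terms = ["intelligent", "monitoring", "method", "and", "system", "for", "conversion", "data"]
keywords = extract_keywords(terms, first_area=[2, 3, 4], second_area=[3, 5])
print(keywords)
```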
Step S005, a second structure vector of the current conversion data is obtained, and a group most similar to the second structure vector is selected; and comparing the keywords of the current conversion data with the keywords corresponding to each first structure vector in the group, and when the maximum similarity obtained by comparison is below a similarity threshold, the current conversion data is abnormal data.
The method comprises the following specific steps:
1. A second structure vector of the current conversion data is acquired, and the group most similar to the second structure vector is selected.
The current conversion data is acquired, and its second structure vector is obtained by the same method to serve as its structure representation.
Meanwhile, the central structure vector located at the geometric center of each group is obtained, the similarity between each central structure vector and the second structure vector is calculated, and the group corresponding to the maximum similarity is taken as the group most similar to the second structure vector.
As an example, in the embodiment of the present invention the maximum similarity corresponds to the minimum Euclidean distance; other embodiments may use other measures of vector similarity, such as cosine similarity.
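The group selection just described can be illustrated with a short sketch. The function name and the dictionary layout of `groups` are assumptions made for illustration; the geometric center is taken here as the arithmetic mean of the first structure vectors in a group, and the smallest Euclidean distance is treated as the largest similarity, as in the example above.

```python
import numpy as np

def most_similar_group(groups, second_vector):
    """groups: dict mapping a group id to a 2-D array-like of first structure vectors."""
    v = np.asarray(second_vector, dtype=float)
    best_id, best_dist = None, float("inf")
    for gid, vectors in groups.items():
        center = np.asarray(vectors, dtype=float).mean(axis=0)   # central structure vector
        dist = float(np.linalg.norm(center - v))                  # Euclidean distance
        if dist < best_dist:
            best_id, best_dist = gid, dist
    return best_id, best_dist
```

Replacing the distance line with a cosine similarity and keeping the largest value instead would correspond to the alternative measure mentioned for other embodiments.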
2. And judging whether the current conversion data is abnormal or not.
The cosine similarity $s_n$ between the keyword information of the current conversion data and each piece of historical data keyword information in the selected cluster set is calculated, where $n$ denotes the $n$-th piece of historical data keyword information in the cluster set. $s_n$ takes values in $[0, 1]$: the closer to 1, the more similar; the closer to 0, the less similar.
The maximum value $\max(s_n)$ of the cosine similarities is obtained and a similarity threshold $m_s$ is set. If $\max(s_n) \le m_s$, the current conversion data may be abnormal data; it is stored, an early warning is issued, and manual verification is carried out subsequently. Otherwise, the data is judged to be normal data and subsequent processing continues.
In addition, $m_s$ is an empirical threshold; as an example, $m_s = 0.6$ in the embodiment of the present invention.
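The abnormality judgment can be sketched as below. It is assumed, for illustration only, that each piece of keyword information is available as a numeric vector; the function names are hypothetical, and treating an empty group as having maximum similarity 0 is likewise an assumption. The default threshold 0.6 is the example value given above.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def is_abnormal(current_keywords, group_keywords, m_s=0.6):
    """Return (flag, max_similarity); flag is True when max(s_n) <= m_s."""
    sims = [cosine_similarity(current_keywords, h) for h in group_keywords]
    max_sim = max(sims, default=0.0)      # assumed behaviour for an empty group
    return max_sim <= m_s, max_sim
```

A result of `True` corresponds to the branch in which the current conversion data is stored, an early warning is issued, and manual verification follows.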
In summary, the embodiment of the present invention collects the historical conversion data of the historical result conversion file, randomly occludes a term in the historical conversion data, calculates the semantic difference between the occluded conversion data and the historical conversion data, and takes the term corresponding to the minimum semantic difference as the first occluded word; expanding the first shielding word to two sides, shielding the historical conversion data until the semantic difference degree is greater than a preset threshold value, and taking the shielding word at the moment as a first shielding area; increasing the lexical items of the first occlusion word, occluding the historical conversion data, traversing the unoccluded lexical items when the lexical items are increased, further selecting a second occlusion word, continuously expanding the traversal lexical items of the second occlusion word until the minimum semantic difference degree is greater than a preset threshold value, and taking the occlusion word at the moment as a second occlusion area; constructing a first structure vector according to the intersection ratio of the first occlusion area and the second occlusion area; dividing the first structure vectors of all historical conversion data into a plurality of groups through clustering; removing terms contained in the first structure vector from each historical conversion data to obtain keywords of the historical conversion data; acquiring a second structure vector of the current conversion data, and selecting a group most similar to the second structure vector; and comparing the keywords of the current conversion data with the keywords corresponding to each first structure vector in the group, and when the maximum similarity obtained by comparison is below a similarity threshold, the current conversion data is abnormal data. 
According to the embodiment of the invention, when an unknown result conversion file is obtained, the unknown result conversion file can be quickly matched into a corresponding group, and keyword comparison is carried out, so that abnormal monitoring of conversion data is realized.
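The intersection ratio of the two occlusion areas named in the summary above can be illustrated as follows. This sketch assumes that each occlusion area is represented as a collection of term positions within one piece of historical conversion data; the function name is hypothetical, and how the difference positions and the ratio are assembled into the first structure vector is not shown here.

```python
def occlusion_overlap(first_region, second_region):
    """Return (intersection ratio, sorted positions lying in only one of the regions)."""
    a, b = set(first_region), set(second_region)
    union = a | b
    inter = a & b
    iou = len(inter) / len(union) if union else 0.0   # ratio by number of terms
    diff_positions = sorted(union - inter)            # difference between union and intersection
    return iou, diff_positions
```

For example, regions covering term positions 3 to 6 and 5 to 8 share two of six distinct positions, so the ratio is 1/3 and the difference positions are 3, 4, 7 and 8.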
The embodiment of the invention also provides an artificial intelligence-based intelligent monitoring system for scientific and technological achievement transformation data, which comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the steps of the above method are implemented when the processor executes the computer program. Since the artificial intelligence-based intelligent monitoring method for scientific and technological achievement transformation data has been described in detail above, it is not repeated here.
It should be noted that the order of the above embodiments of the present invention is for description only and does not represent the relative merits of the embodiments. Specific embodiments have been described above, and other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and should not be taken as limiting the scope of the present invention, which is intended to cover any modifications, equivalents, improvements, etc. within the spirit and scope of the present invention.

Claims (9)

1. The scientific and technological achievement transformation data intelligent monitoring method based on artificial intelligence is characterized by comprising the following steps:
acquiring historical conversion data of a historical result conversion file, randomly shielding a certain lexical item in the historical conversion data, calculating semantic difference between the shielded conversion data and the historical conversion data, and taking the lexical item corresponding to the minimum semantic difference as a first shielding word;
expanding the first shielding word to two sides, shielding the historical conversion data until the semantic difference degree is greater than a preset threshold value, and taking the shielding word at the moment as a first shielding area;
increasing the lexical items of the first occlusion words, occluding the historical conversion data, traversing unoccluded lexical items when the lexical items are increased, further selecting second occlusion words, continuing to expand the traversal lexical items of the second occlusion words until the minimum semantic difference degree is greater than the preset threshold value, and taking the occlusion words at the moment as second occlusion areas;
constructing a first structure vector according to the intersection ratio of the first occlusion area and the second occlusion area; dividing the first structure vectors of all the historical conversion data into a plurality of groups through clustering; removing terms contained in the first structure vector from each historical conversion data to obtain keywords of the historical conversion data;
acquiring a second structure vector of the current conversion data, and selecting a group most similar to the second structure vector; and comparing the keywords of the current conversion data with the keywords corresponding to each first structure vector in the group, and when the maximum similarity obtained by comparison is below a similarity threshold, determining that the current conversion data is abnormal data.
2. The method of claim 1, wherein the step of collecting historical conversion data comprises:
and establishing a historical document library for scientific and technological achievement conversion, taking the subject characters of each document as the identification of the document, and taking the word vectors corresponding to the identification as the historical conversion data.
3. The method according to claim 1, wherein the calculating of the semantic difference degree comprises:
reconstructing the history conversion data after the lexical item is shielded through an auto-encoder to obtain the shielded conversion data, and calculating Euclidean distance between the history conversion data and word vectors corresponding to the shielded conversion data to be used as the semantic difference degree.
4. The method of claim 1, wherein the step of obtaining the first occlusion region comprises:
and expanding the term length of the first shielding word to two sides by taking the first shielding word as a center and taking the term length of the first shielding word as an expansion size to obtain a first expansion term, shielding the historical conversion data by using the first expansion term to obtain first conversion data, calculating the semantic difference between the word vectors of the historical conversion data and the first conversion data, comparing the semantic difference with a preset threshold, and when the semantic difference is not greater than the preset threshold, continuing expanding the term to two sides by taking the first expansion term as the center according to the expansion size until the semantic difference is greater than the preset threshold to obtain the first shielding area.
5. The method according to claim 1, wherein the step of obtaining the second occlusion region comprises:
and randomly adding a lexical item to the first occlusion word to expand the first occlusion word into a second expansion lexical item, utilizing the second expansion lexical item to occlude the historical conversion data to obtain second conversion data, calculating the semantic difference between word vectors of the historical conversion data and the second conversion data, traversing the unoccluded word, taking the occlusion word corresponding to the minimum value of the semantic difference as a second occlusion word, and when the semantic difference is not greater than a preset threshold, continuously performing expansion of the lexical item to the second occlusion word until the semantic difference is greater than the preset threshold, so as to obtain the second occlusion area.
6. The method of claim 1, wherein the constructing of the first structure vector comprises:
calculating the intersection ratio according to the number of terms corresponding to the first occlusion area and the second occlusion area;
and acquiring the difference position of the difference terms of the intersection and union of the first occlusion area and the second occlusion area, and constructing the first structure vector according to the difference position and the intersection ratio.
7. The method of claim 1, wherein the grouping the first structure vectors of all the historical conversion data into a plurality of groups by clustering comprises:
presetting the initial category number of clusters, clustering all the first structure vectors to obtain a plurality of initial categories, calculating the difference of the first structure vectors in each initial category, and further obtaining the category difference among all the initial category differences;
and gradually increasing the initial category number, sequentially calculating the category difference, selecting the category number corresponding to the minimum category difference as an ideal number, and taking the clustering result corresponding to the ideal number as a grouping result.
8. The method of claim 1, wherein selecting the group most similar to the second structure vector comprises:
and acquiring central structure vectors positioned at the geometric center in each group, respectively calculating the similarity between each central structure vector and the second structure vector, and taking the group corresponding to the maximum value of the similarity as the group most similar to the second structure vector.
9. Intelligent monitoring system for scientific and technological achievement transformation data based on artificial intelligence, comprising a memory, a processor and a computer program stored in the memory and capable of running on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 8 when executing the computer program.
CN202210351897.7A 2022-04-02 2022-04-02 Intelligent monitoring method and system for scientific and technological achievement transformation data based on artificial intelligence Pending CN115525927A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210351897.7A CN115525927A (en) 2022-04-02 2022-04-02 Intelligent monitoring method and system for scientific and technological achievement transformation data based on artificial intelligence

Publications (1)

Publication Number Publication Date
CN115525927A true CN115525927A (en) 2022-12-27

Family

ID=84695626

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210351897.7A Pending CN115525927A (en) 2022-04-02 2022-04-02 Intelligent monitoring method and system for scientific and technological achievement transformation data based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN115525927A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117808633A (en) * 2024-02-29 2024-04-02 北京大众益康科技有限公司 Early warning method and device for technical research transformation in sleep field, electronic equipment and storage medium
CN117808633B (en) * 2024-02-29 2024-05-28 北京大众益康科技有限公司 Early warning method and device for technical research transformation in sleep field, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination