CN116186359A - Integrated management method, system and storage medium for multi-source heterogeneous data of universities - Google Patents
- Publication number
- CN116186359A (CN202310488579.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- college
- heterogeneous data
- source heterogeneous
- universities
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/906—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses an integrated management method, system and storage medium for multi-source heterogeneous data of colleges and universities. The method comprises: acquiring college multi-source heterogeneous data, classifying it by service type and user identity category, and fusing and standardizing heterogeneous data from different sources; constructing a college data knowledge graph for relationship classification and vectorizing the knowledge graph by a graph representation method; generating demand features from the data management demands of the college, selecting a feature data set according to a feature selection strategy based on the demand features, generating the related graph structure, and performing data management; and constructing a college data sharing model from the data features and corresponding graph structure of the college multi-source heterogeneous data, and sharing the data through that model. By constructing the college data knowledge graph, the invention carries out comprehensive data integration and college data management, realizes the fusion and sharing of data information and business, and promotes the modernization of the college management system and management capability.
Description
Technical Field
The invention relates to the technical field of data processing, in particular to an integrated management method, an integrated management system and a storage medium for multi-source heterogeneous data in colleges and universities.
Background
With the rapid development of informatization in colleges and universities, data management has become a basic condition for smart campus construction and an important guarantee for its shift from business-oriented to service-oriented operation. Investment in smart campus construction increases year by year, and data management is an important foundation for smart campus construction in the new period, providing more efficient and accurate data support for teaching, scientific research, management and service. In local colleges, however, data management lags behind because it receives insufficient attention and investment, so data quality is poor, standards differ, and data cannot be shared. To meet the application requirements of smart campuses, data management work must be carried out as early as possible, and data standards, a data resource system and diversified data services must be established.
A data standard is a specification formulated to ensure the consistency, integrity and accuracy of data as it is used and exchanged during data management, and is an important guarantee for improving data quality. However, in the early stage of building their information systems, colleges had no unified planning, so many data standards are inconsistent; data therefore cannot be shared, and data islands form. There is thus a need to manage the data in a college's existing data center while establishing corresponding topic libraries and a unified open data platform to complete data aggregation, cleaning, conversion and related processes, to construct a multi-dimensional view of multi-source heterogeneous big data for college data, and to carry out comprehensive data integration and college data management.
Disclosure of Invention
In order to solve the technical problems, the invention provides an integrated management method, an integrated management system and a storage medium for multi-source heterogeneous data in colleges and universities.
The first aspect of the invention provides an integrated management method for multi-source heterogeneous data in a college, which comprises the following steps:
acquiring multi-source heterogeneous data of a college, classifying the multi-source heterogeneous data of the college according to service types and user identity categories corresponding to the data, and fusing and standardizing heterogeneous data with different sources;
constructing a college data knowledge graph from the standardized multi-source heterogeneous data of the colleges and universities, classifying the relationship of the multi-source heterogeneous data of the colleges and universities based on the college knowledge graph, and vectorizing the college data knowledge graph by a graph representation method;
generating demand characteristics according to data management demands of universities, selecting characteristic data sets according to characteristic selection strategies based on the demand characteristics, generating a related graph structure according to the characteristic data sets, acquiring potential data connection according to the graph structure, and carrying out data management in combination with time characteristics;
meanwhile, a college data sharing model is built according to the data characteristics and the corresponding graph structure of the college multi-source heterogeneous data, and the college multi-source heterogeneous data is subjected to data integration and data sharing through the college data sharing model.
In this scheme, the college multi-source heterogeneous data is classified according to the service type and user identity category corresponding to the data, and heterogeneous data from different sources is fused and standardized, specifically:
acquiring college heterogeneous data from different data sources, labeling the data according to the service label of each data source, and aggregating data under the same service label to realize service classification of the college multi-source heterogeneous data;
acquiring relevant knowledge of the service field for each service label, determining the different expressions of relevant terms from that knowledge, determining equivalent concepts from the different expressions of the same term, and unifying the term entities of the college heterogeneous data under the same service label based on the equivalent concepts;
performing data conflict detection after the term entities are unified, calculating the similarity between the unified entities of the college heterogeneous data from different data sources and a preset ontology, and performing ontology-entity matching according to the similarity;
associating and fusing the college multi-source heterogeneous data according to the matching result and the preset ontology, deleting meaningless entities, and setting user identity category labels using the user identity category information corresponding to the college multi-source heterogeneous data;
matching the labeled college multi-source heterogeneous data to the preset ontology by entity to construct a college heterogeneous data knowledge base.
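The term unification and ontology matching steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the equivalence table, the ontology concept list, the 0.8 threshold and the use of `difflib.SequenceMatcher` as the similarity measure are all hypothetical choices, since the text does not name a specific similarity function.

```python
from difflib import SequenceMatcher

# Hypothetical equivalence table: variant expressions -> canonical term.
EQUIVALENT_TERMS = {
    "stu_no": "student_id",
    "student number": "student_id",
    "xh": "student_id",
    "dept": "department",
    "faculty": "department",
}

# Hypothetical preset ontology concepts.
ONTOLOGY = ["student_id", "department", "course", "teacher_id"]


def unify_term(term: str) -> str:
    """Map a source-specific term onto its canonical form if one is known."""
    key = term.strip().lower()
    return EQUIVALENT_TERMS.get(key, key)


def match_ontology(term: str, threshold: float = 0.8):
    """Return the best-matching ontology concept, or None if below threshold."""
    unified = unify_term(term)
    best = max(ONTOLOGY, key=lambda c: SequenceMatcher(None, unified, c).ratio())
    score = SequenceMatcher(None, unified, best).ratio()
    return best if score >= threshold else None


def fuse(records):
    """Group (source, field, value) records by matched ontology concept."""
    fused = {}
    for source, field, value in records:
        concept = match_ontology(field)
        if concept is None:
            continue  # unmatched fields are treated as meaningless and dropped
        fused.setdefault(concept, []).append((source, value))
    return fused
```

Calling `fuse` on records from two sources that name the same field differently (for example `"xh"` and `"Student Number"`) places both values under the single concept `student_id`.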
In this scheme, relationship classification is performed on the college multi-source heterogeneous data based on the college knowledge graph, and the college data knowledge graph is vectorized by a graph representation method, specifically:
establishing a data mapping from the college heterogeneous data knowledge base, converting the data into triplet form, extracting the attribute information corresponding to entities in the knowledge base, and establishing correspondences between items of entity attribute information through the preset ontology to construct a relation table;
obtaining, by querying the relation table, the number of items of entity attribute information associated with each item, encoding according to that number and the corresponding data ID, and representing the encoded information graphically;
generating triplets from the entities and relations in the college heterogeneous data knowledge base, mapping them, constructing the college knowledge graph graphically, and learning and representing the college knowledge graph with a graph convolutional neural network;
the college data knowledge graph is denoted $G=(V,E)$, where $V$ represents the set of entity nodes in the college data knowledge graph and $E$ represents the set of relations between entity nodes;
vectorizing the entity nodes in the college knowledge graph, aggregating neighbor node feature vectors into each entity node's own vector using the graph convolutional neural network, and propagating the updated node features through convolution calculation.
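As a rough illustration of how triplets from the knowledge base might be turned into a graph with an entity node set and a relation set, consider the sketch below. The entity identifiers and relation names are invented for the example, and treating edges as undirected for neighbor lookup is an assumption.

```python
from collections import defaultdict


def build_graph(triples):
    """Build the node set, the relation set and an undirected neighbor map
    from (head, relation, tail) triples."""
    nodes = set()
    edges = set()
    neighbors = defaultdict(set)
    for head, relation, tail in triples:
        nodes.update((head, tail))
        edges.add((head, relation, tail))
        neighbors[head].add(tail)
        neighbors[tail].add(head)
    return nodes, edges, dict(neighbors)


# Hypothetical triples extracted from a college knowledge base.
triples = [
    ("student:1001", "enrolled_in", "course:DB101"),
    ("teacher:T07", "teaches", "course:DB101"),
    ("student:1001", "belongs_to", "dept:CS"),
]
nodes, edges, neighbors = build_graph(triples)
```

The neighbor map is the structure a graph convolution step would consult when aggregating neighbor features into a node.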
In the scheme, demand features are generated according to data management demands of universities, a feature data set is selected according to a feature selection strategy based on the demand features, a related graph structure is generated according to the feature data set, potential data connection is acquired according to the graph structure, and data management is performed by combining time features, specifically:
acquiring current college data management requirements, extracting service types corresponding to the requirements, extracting keyword information based on the current college data management requirements, and generating requirement characteristics according to the service types and the service semantic characteristics by taking word vectors corresponding to the keyword information as the service semantic characteristics;
establishing a retrieval task in a college data knowledge graph according to the demand characteristics, importing the demand characteristics into a low-dimensional vector space, and calculating similarity by calculating Euclidean distance between demand characteristic vectors and entity node vectors in the low-dimensional vector space;
obtaining and marking the entity node vectors whose similarity meets a preset standard, constructing a feature data set from the marked entity nodes, and assigning different weights to the marked entity nodes through a multi-channel attention mechanism;
setting weight factors according to the similarity between the demand feature vectors and the entity node vectors, acquiring feature data of corresponding channels through the weight factors, updating the features in the channels by using a self-attention mechanism in different channels, and distributing different weights according to the importance degree;
and learning and representing the feature data sets distributed with different weights by using a graph convolution neural network, acquiring a related graph structure according to the connection relation of the entity nodes, extracting data features by using a neighbor aggregation and message transmission mechanism, and acquiring potential data connection.
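The retrieval step, which compares a demand feature vector with entity node vectors by Euclidean distance and keeps the nodes that meet a preset standard, might look like the sketch below. The mapping from distance to similarity, `1 / (1 + d)`, and the 0.5 threshold are assumptions made for illustration; the text does not specify either.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity(a, b):
    """Map distance to a similarity in (0, 1]; this mapping is an assumption."""
    return 1.0 / (1.0 + euclidean(a, b))


def select_nodes(demand_vec, node_vecs, min_similarity=0.5):
    """Return {node: similarity} for nodes meeting the preset threshold."""
    scores = {n: similarity(demand_vec, v) for n, v in node_vecs.items()}
    return {n: s for n, s in scores.items() if s >= min_similarity}


# Hypothetical low-dimensional entity node vectors.
node_vecs = {
    "course:DB101": [1.0, 0.0],
    "student:1001": [1.0, 1.0],
    "dept:CS": [4.0, 4.0],
}
selected = select_nodes([1.0, 0.0], node_vecs)
```

The selected nodes form the marked set from which the feature data set is then built.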
In this scheme, the method further includes obtaining the time features of the college heterogeneous data corresponding to the marked entity nodes, specifically:
forming a sequence signal from the actual college heterogeneous data output after graph convolutional neural network feature aggregation, training a temporal convolutional neural network model, and inputting the sequence signal into the temporal convolutional neural network;
applying dilated convolution with residual connections to the input sequence signal in the temporal convolutional neural network until all convolution calculations are complete, and outputting the time features of the college heterogeneous data through a fully connected layer;
performing data management analysis according to the data features and time features of the college heterogeneous data to fulfil the current college data management demand.
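To make the dilated (hole) convolution with a residual connection more concrete, here is a small pure-Python sketch of the standard causal form used in temporal convolutional networks. It is an assumption that the patent uses this form; the kernel values and dilation schedule are hypothetical, and the trained weights and final fully connected layer are omitted.

```python
def dilated_causal_conv(signal, kernel, dilation):
    """1-D causal convolution: y[t] = sum_k kernel[k] * x[t - k * dilation],
    with positions before the start of the signal treated as zero."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * signal[idx]
        out.append(acc)
    return out


def residual_block(signal, kernel, dilation):
    """Dilated convolution followed by ReLU, added back to the input."""
    conv = dilated_causal_conv(signal, kernel, dilation)
    return [x + max(0.0, c) for x, c in zip(signal, conv)]


def tcn(signal, kernel, dilations=(1, 2, 4)):
    """Stack residual blocks with growing dilation (hypothetical schedule)."""
    out = list(signal)
    for d in dilations:
        out = residual_block(out, kernel, d)
    return out
```

Increasing the dilation from block to block widens the span of past time steps that each output can see without adding kernel weights.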
In the scheme, a college data sharing model is constructed according to the data characteristics and the corresponding graph structure of the college multi-source heterogeneous data, and the college multi-source heterogeneous data is subjected to data integration and data sharing by the college data sharing model, specifically:
acquiring a data characteristic and a graph structure corresponding to the current college data management requirement, and judging whether the data information corresponding to the current college data management requirement can be shared or partially shared according to the data characteristic;
extracting college heterogeneous data capable of sharing from the data characteristics, acquiring corresponding entity nodes according to the college heterogeneous data capable of sharing, and generating a shared entity node set;
constructing a college data sharing model based on deep learning, training by utilizing the data characteristics and the graph structure, and setting the sharing sequence according to the weight of each entity node in a shared entity node set, wherein the higher the weight is, the higher the sharing priority is;
calculating the Mahalanobis distance between the entity nodes in the shared entity node set, taking the Mahalanobis distance as the overhead information for data integration, taking the entity node with the highest priority as the starting point, and generating an integration path for the college heterogeneous data through node migration according to the minimum-overhead principle;
integrating the shareable college heterogeneous data along the integration path according to the preset data standard of the sharing target, and sharing the data with the sharing target by a preset method.
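The path generation step can be sketched as below: start from the highest-weight node and repeatedly move to the unvisited node with the smallest Mahalanobis distance. Interpreting the minimum-overhead principle as a greedy nearest-neighbor walk is an assumption, as is supplying the inverse covariance matrix directly rather than estimating it from data; the node names and vectors are hypothetical.

```python
import math


def mahalanobis(x, y, inv_cov):
    """Mahalanobis distance sqrt((x - y)^T S^-1 (x - y)) for a given
    inverse covariance matrix S^-1 (a list of rows)."""
    d = [a - b for a, b in zip(x, y)]
    tmp = [sum(row[j] * d[j] for j in range(len(d))) for row in inv_cov]
    return math.sqrt(sum(di * ti for di, ti in zip(d, tmp)))


def integration_path(vectors, weights, inv_cov):
    """Greedy walk: begin at the highest-weight node, then always step to
    the cheapest unvisited node (ties broken by node name)."""
    current = max(weights, key=weights.get)
    path = [current]
    remaining = set(vectors) - {current}
    while remaining:
        nxt = min(
            remaining,
            key=lambda n: (mahalanobis(vectors[current], vectors[n], inv_cov), n),
        )
        path.append(nxt)
        remaining.remove(nxt)
        current = nxt
    return path


# Hypothetical shared entity nodes, their vectors and attention weights.
vectors = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [3.0, 0.0], "d": [0.0, 5.0]}
weights = {"a": 0.5, "b": 0.2, "c": 0.2, "d": 0.1}
identity = [[1.0, 0.0], [0.0, 1.0]]
path = integration_path(vectors, weights, identity)
```

With an identity inverse covariance the distance reduces to the Euclidean distance, which makes the example easy to check by hand; a greedy walk does not guarantee a globally minimal total overhead.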
The second aspect of the present invention also provides an integrated management system for multi-source heterogeneous data in a college, the system comprising a memory and a processor, wherein the memory stores a program of the integrated management method for multi-source heterogeneous data in a college which, when executed by the processor, implements the following steps:
acquiring multi-source heterogeneous data of a college, classifying the multi-source heterogeneous data of the college according to service types and user identity categories corresponding to the data, and fusing and standardizing heterogeneous data with different sources;
constructing a college data knowledge graph from the standardized multi-source heterogeneous data of the colleges and universities, classifying the relationship of the multi-source heterogeneous data of the colleges and universities based on the college knowledge graph, and vectorizing the college data knowledge graph by a graph representation method;
generating demand characteristics according to data management demands of universities, selecting characteristic data sets according to characteristic selection strategies based on the demand characteristics, generating a related graph structure according to the characteristic data sets, acquiring potential data connection according to the graph structure, and carrying out data management in combination with time characteristics;
Meanwhile, a college data sharing model is built according to the data characteristics and the corresponding graph structure of the college multi-source heterogeneous data, and the college multi-source heterogeneous data is subjected to data integration and data sharing through the college data sharing model.
The third aspect of the present invention also provides a computer readable storage medium, where the computer readable storage medium includes a method program for integrated management of multi-source heterogeneous data in a college, where the method program for integrated management of multi-source heterogeneous data in a college implements the steps of the method for integrated management of multi-source heterogeneous data in a college according to any one of the above steps when executed by a processor.
The invention discloses an integrated management method, system and storage medium for multi-source heterogeneous data of colleges and universities. The method comprises: acquiring college multi-source heterogeneous data, classifying it by service type and user identity category, and fusing and standardizing heterogeneous data from different sources; constructing a college data knowledge graph for relationship classification and vectorizing the knowledge graph by a graph representation method; generating demand features from the data management demands of the college, selecting a feature data set according to a feature selection strategy based on the demand features, generating the related graph structure, and performing data management; and constructing a college data sharing model from the data features and corresponding graph structure of the college multi-source heterogeneous data, and sharing the data through that model. By constructing the college data knowledge graph, the invention carries out comprehensive data integration and college data management, realizes the fusion and sharing of data information and business, and promotes the modernization of the college management system and management capability.
Drawings
FIG. 1 shows a flow chart of an integrated management method for multi-source heterogeneous data in colleges and universities according to the present invention;
FIG. 2 shows a flow chart of the method of the present invention for acquiring potential data connections according to the graph structure and performing data management in combination with time features;
FIG. 3 shows a flow chart of the method of the present invention for data integration and data sharing of college multi-source heterogeneous data through the college data sharing model;
FIG. 4 shows a block diagram of an integrated management system for multi-source heterogeneous data in colleges and universities according to the present invention.
Detailed Description
In order that the above-recited objects, features and advantages of the present invention will be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description. It should be noted that, in the case of no conflict, the embodiments of the present application and the features in the embodiments may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those described herein, and therefore the scope of the present invention is not limited to the specific embodiments disclosed below.
Fig. 1 shows a flowchart of an integrated management method for multi-source heterogeneous data in a college according to the present invention.
As shown in fig. 1, a first aspect of the present invention provides an integrated management method for multi-source heterogeneous data in a college, including:
S102, acquiring multi-source heterogeneous data of a college, classifying the multi-source heterogeneous data of the college according to service types and user identity categories corresponding to the data, and fusing and standardizing heterogeneous data with different sources;
S104, constructing a college data knowledge graph from the standardized college multi-source heterogeneous data, classifying the relationships of the college multi-source heterogeneous data based on the college knowledge graph, and vectorizing the college data knowledge graph by a graph representation method;
S106, generating demand features according to the data management demands of the college, selecting a feature data set according to a feature selection strategy based on the demand features, generating a related graph structure according to the feature data set, acquiring potential data connections according to the graph structure, and carrying out data management in combination with time features;
S108, constructing a college data sharing model according to the data features and the corresponding graph structure of the college multi-source heterogeneous data, and integrating and sharing the college multi-source heterogeneous data through the college data sharing model.
It should be noted that college heterogeneous data is acquired from different data sources, labeled according to the service label of each data source, and aggregated under the same service label to realize service classification of the college multi-source heterogeneous data. Because the same entity may be named in different languages or expressed in different ways across data sets, relevant knowledge of the service field is acquired for each service label, the different expressions of relevant terms are determined from that knowledge, equivalent concepts are determined from the different expressions of the same term, and the term entities of the college heterogeneous data under the same service label are unified based on the equivalent concepts; the college heterogeneous data in each data source is matched in this way to achieve entity resolution. After the term entities are unified, data conflict detection is performed. The ontology information corresponding to each kind of data in the college heterogeneous data knowledge graph is preset based on the college's current data management state and habits; the similarity between the unified entities of the college heterogeneous data from different data sources and the preset ontology is calculated, and ontology-entity matching is performed according to the similarity. The college multi-source heterogeneous data is then associated and fused according to the matching result and the preset ontology, and meaningless entities are deleted. At the same time, user identity category labels are set using the user identity category information corresponding to the college multi-source heterogeneous data, where the user identity information includes teacher, student, employee and so on. The labeled college multi-source heterogeneous data is matched to the preset ontology by entity to construct the college heterogeneous data knowledge base.
It should be noted that a data mapping is established from the college heterogeneous data knowledge base to convert the data into triplet form; the attribute information corresponding to entities in the knowledge base is extracted, and correspondences between items of entity attribute information are established through the preset ontology to construct a relation table. By querying the relation table, the number of items of entity attribute information associated with each item is obtained; encoding is performed according to that number and the corresponding data ID, and the encoded information is represented graphically. Triplets are generated from the entities and relations in the knowledge base and mapped, the college knowledge graph is constructed graphically, and the college knowledge graph is learned and represented with a graph convolutional neural network. The college data knowledge graph is denoted $G=(V,E)$, where $V$ represents the set of entity nodes in the college data knowledge graph and $E$ represents the set of relations between entity nodes. The entity nodes in the college knowledge graph are vectorized, neighbor node feature vectors are aggregated into each entity node's own vector using the graph convolutional neural network, and the updated node features are propagated through convolution calculation. The representation of an entity node $v$ is obtained through the neighbor entity node aggregation mechanism, specifically:
$$h_v = \sigma\Big(\sum_{u \in N(v)} W_{vu}\, h_u\Big)$$
where $h_v$ represents the vectorized representation of entity node $v$ after aggregation over its neighbor entity nodes, $\sigma$ represents an activation function, $W_{vu}$ represents the weight matrix between entity node $v$ and its neighbor entity node $u$, and $h_u$ represents the vector representation of a neighbor entity node $u \in N(v)$ of the entity node.
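A single step of the neighbor aggregation described above can be sketched in pure Python: each node's new vector is an activation applied to the sum of its neighbors' vectors, each multiplied by a weight matrix. Using ReLU as the activation and one weight matrix shared by all edges are simplifying assumptions; the node names and feature values are hypothetical.

```python
def relu(x):
    """Element-wise ReLU activation (an assumed choice of activation)."""
    return [max(0.0, v) for v in x]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def aggregate(node, neighbors, features, W):
    """One aggregation step: sum W @ h_u over the neighbors u of `node`,
    then apply the activation."""
    total = [0.0] * len(W)
    for u in neighbors.get(node, ()):
        msg = matvec(W, features[u])
        total = [t + m for t, m in zip(total, msg)]
    return relu(total)


neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
features = {"a": [1.0, 0.0], "b": [0.0, 2.0], "c": [1.0, -1.0]}
W = [[2.0, 0.0], [0.0, -1.0]]  # hypothetical shared weight matrix
updated = {n: aggregate(n, neighbors, features, W) for n in neighbors}
```

Repeating the step on the updated vectors propagates information one hop further through the graph each time.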
FIG. 2 shows a flow chart of the method of the present invention for acquiring potential data connections according to the graph structure and performing data management in combination with time features.
According to the embodiment of the invention, demand characteristics are generated according to the data management demand of a college, a characteristic data set is selected according to a characteristic selection strategy based on the demand characteristics, a related graph structure is generated according to the characteristic data set, potential data connection is acquired according to the graph structure, and data management is performed by combining time characteristics, specifically:
s202, acquiring current college data management requirements, extracting service types corresponding to the requirements, extracting keyword information based on the current college data management requirements, and generating requirement characteristics according to the service types and the service semantic characteristics by taking word vectors corresponding to the keyword information as service semantic characteristics;
s204, establishing a retrieval task in a college data knowledge graph according to the demand features, importing the demand features into a low-dimensional vector space, and calculating the similarity by calculating Euclidean distances of the demand feature vectors and the entity node vectors in the low-dimensional vector space;
S206, obtaining entity node vectors with similarity meeting preset standards, marking, constructing a characteristic data set according to marked entity nodes, and distributing different weights to the marked entity nodes through a multichannel attention mechanism;
s208, setting weight factors according to the similarity between the demand feature vectors and the entity node vectors, acquiring feature data of corresponding channels through the weight factors, updating features in the channels by using a self-attention mechanism in different channels, and distributing different weights according to importance degrees;
and S210, learning and representing the feature data sets distributed with different weights by using a graph convolution neural network, acquiring a related graph structure according to the connection relation of the entity nodes, extracting data features by using a neighbor aggregation and message transmission mechanism, and acquiring data potential relations.
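The retrieval and weighting in steps S204 to S208 can be sketched as below. This is a simplified illustration under stated assumptions: the similarity is taken as 1/(1+distance), the preset standard is a fixed threshold, and a plain softmax over similarities stands in for the multi-channel attention mechanism, which the source does not specify in detail. All names and vectors are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_and_weight(demand_vec, node_vecs, threshold=0.5):
    """Mark nodes whose similarity to the demand vector passes the threshold,
    then turn their similarities into normalized weights with a softmax."""
    # Similarity in (0, 1]: identical vectors give 1, distant vectors approach 0.
    sims = {name: 1.0 / (1.0 + euclidean(demand_vec, vec))
            for name, vec in node_vecs.items()}
    marked = {name: s for name, s in sims.items() if s >= threshold}
    total = sum(math.exp(s) for s in marked.values())
    return {name: math.exp(s) / total for name, s in marked.items()}

# Hypothetical entity node vectors in a two-dimensional space.
node_vecs = {
    "course_A": [0.9, 0.1],
    "dept_X": [0.0, 1.0],
    "student_1": [1.0, 0.0],
}
weights = select_and_weight([1.0, 0.0], node_vecs)
```

Here `dept_X` lies too far from the demand vector and is not marked, while the two remaining nodes receive weights that sum to one, with the closer node weighted higher.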
It should be noted that the actual college heterogeneous data output after feature aggregation by the graph convolutional neural network forms a sequence signal; a temporal convolutional neural network model is trained, and the sequence signal is input into the temporal convolutional neural network. Dilated convolution with residual connections is applied to the input sequence signal in the temporal convolutional neural network until all convolution calculations are finished, where the dilated convolution is calculated as follows:
$$F(s) = \sum_{i=0}^{k-1} f(i)\, x_{s-d\cdot i}$$
where $F(s)$ represents the dilated convolution result when the input sequence signal is $s$, $f$ represents the convolution kernel, $k$ represents the one-dimensional convolution kernel size, $i$ is the index over the convolution kernel, $x$ is the one-dimensional sequence, $x_{s-d\cdot i}$ represents the lower-layer neuron from which the upper-layer neuron is calculated, and $d$ is the dilation factor;
the time features of the college heterogeneous data are output through the fully connected layer; data management analysis is carried out according to the data features and time features of the college heterogeneous data to fulfil the current college data management demand.
Feature aggregation and feature extraction are performed on the graph structure and the neighbor nodes of each entity node through the graph convolutional neural network, and the aggregated data features are input into the temporal convolutional neural network. Dilated convolution is introduced so that longer sequence signals can be convolved without increasing the amount of calculation, and the residual network effectively reduces the training error. The temporal correlation is analyzed jointly with the structural features extracted by the graph convolutional neural network to fulfil the current college data management demand.
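A minimal sketch of dilated convolution with a residual connection is given below, assuming the common causal form in which each output depends only on the current and earlier elements and positions before the start of the sequence count as zero. The kernel values, the ReLU activation, and the dilation schedule 1, 2, 4 are assumptions for illustration.

```python
def dilated_conv(x, kernel, dilation):
    """Causal dilated convolution: out[s] = sum_i kernel[i] * x[s - dilation*i].
    Positions before the start of the sequence are treated as zero."""
    out = []
    for s in range(len(x)):
        total = 0.0
        for i, w in enumerate(kernel):
            j = s - dilation * i
            if j >= 0:
                total += w * x[j]
        out.append(total)
    return out

def residual_block(x, kernel, dilation):
    """Dilated convolution followed by ReLU, added back to the input
    through a residual (skip) connection."""
    conv = dilated_conv(x, kernel, dilation)
    return [xi + max(ci, 0.0) for xi, ci in zip(x, conv)]

signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [0.5, 0.5]
# Stacking blocks with dilation 1, 2, 4 widens the receptive field
# while the kernel size stays fixed at 2.
out = signal
for d in (1, 2, 4):
    out = residual_block(out, kernel, d)
```

With a kernel of size 2, the three stacked blocks let the last output depend on inputs up to seven steps back, which is how dilation covers a longer sequence without a larger kernel.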
FIG. 3 shows a flow chart of the method for performing data integration and data sharing on college multi-source heterogeneous data through the college data sharing model.
According to the embodiment of the invention, a college data sharing model is constructed according to the data features and corresponding graph structure of the college multi-source heterogeneous data, and data integration and data sharing are performed on the college multi-source heterogeneous data through the college data sharing model, specifically:
S302, acquiring the data features and graph structure corresponding to the current college data management demand, and judging from the data features whether the data information corresponding to the current college data management demand can be shared in whole or in part;
S304, extracting the shareable college heterogeneous data according to the data features, acquiring the corresponding entity nodes, and generating a shared entity node set;
S306, constructing the college data sharing model based on deep learning, training it with the data features and the graph structure, and setting the sharing order of the entity nodes in the shared entity node set according to their weights, wherein a higher weight gives a higher sharing priority;
S308, calculating the Mahalanobis distance between the entity nodes in the shared entity node set, taking the Mahalanobis distance as the overhead information of data integration, taking the entity node with the highest priority as the starting point, and generating an integration path for the college heterogeneous data through node migration according to the minimum-overhead principle;
S310, performing data integration on the shareable college heterogeneous data along the integration path according to the preset data standard of the sharing object, and sharing the data with the sharing object by a preset method.
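One way to read the path generation in step S308 is a greedy walk: start at the highest-weight node and repeatedly move to the unvisited node with the smallest overhead. The sketch below assumes that reading; the source does not state the exact migration rule, and the node names, weights, and overhead values are hypothetical.

```python
def integration_path(weights, cost):
    """Start at the highest-weight node, then repeatedly move to the
    unvisited node with the smallest overhead from the current node."""
    current = max(weights, key=weights.get)
    path = [current]
    remaining = set(weights) - {current}
    while remaining:
        current = min(remaining, key=lambda n: cost[path[-1]][n])
        path.append(current)
        remaining.remove(current)
    return path

# Hypothetical node weights (sharing priority) and pairwise overheads.
weights = {"a": 0.5, "b": 0.3, "c": 0.2}
cost = {
    "a": {"b": 2.0, "c": 1.0},
    "b": {"a": 2.0, "c": 0.5},
    "c": {"a": 1.0, "b": 0.5},
}
path = integration_path(weights, cost)
```

In this example the walk starts at `a`, moves to `c` (overhead 1.0 rather than 2.0), and finishes at `b`. A greedy walk is cheap to compute but does not guarantee the globally cheapest path.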
It should be noted that the Mahalanobis distance between the entity nodes in the shared entity node set is calculated as follows:
$$D_M(x_i, x_j) = \sqrt{(x_i - x_j)^{T}\, \Sigma^{+}\, (x_i - x_j)}$$
where $x_i$ and $x_j$ represent the position information of entity nodes $i$ and $j$ in the low-dimensional space, $D_M(x_i, x_j)$ represents the Mahalanobis distance between entity nodes $i$ and $j$, $\Sigma$ represents the covariance matrix of the marked entity nodes, $+$ denotes taking the generalized inverse of the covariance matrix, and $T$ denotes the matrix transpose.
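The Mahalanobis distance with a generalized inverse can be sketched with NumPy as follows. The example assumes the covariance matrix is estimated from the node positions themselves with the sample covariance, and uses the Moore-Penrose pseudo-inverse as the generalized inverse; the point coordinates are hypothetical.

```python
import numpy as np

def mahalanobis(x_i, x_j, points):
    """Mahalanobis distance between x_i and x_j, using the pseudo-inverse of
    the covariance matrix estimated from `points` (one row per node)."""
    cov = np.cov(np.asarray(points, dtype=float), rowvar=False)
    cov_pinv = np.linalg.pinv(np.atleast_2d(cov))
    diff = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    # Clamp at zero to guard against tiny negative values from rounding.
    return float(np.sqrt(max(diff @ cov_pinv @ diff, 0.0)))

# Hypothetical positions of four entity nodes in a two-dimensional space.
points = [[0.0, 0.0], [2.0, 0.0], [0.0, 1.0], [2.0, 1.0]]
d = mahalanobis(points[0], points[3], points)
```

Using the pseudo-inverse keeps the distance defined even when the covariance matrix is singular, for example when some dimensions of the node vectors are linearly dependent.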
According to the embodiment of the invention, the existing regulation system of the college is evaluated through fusion analysis of the college multi-source heterogeneous data, specifically:
acquiring the college multi-source heterogeneous data generated by different data sources under a target regulation system, and generating the corresponding data features by analyzing the graph structure information and time features of the multi-source heterogeneous data;
matching the data features with the corresponding user identity category information, and performing principal component analysis on the data features corresponding to different user identity categories, taking the parameter information with the highest contribution as the principal component direction;
generating constraint information for different user identities based on the target regulation system, and projecting the data features corresponding to the different user identity categories onto the principal component direction to obtain feature scatter diagrams for the different user identity categories;
determining a constraint range in each feature scatter diagram according to the constraint information, counting the scatter points of each user identity category that fall within the constraint range, comparing the count with a preset scatter point threshold to obtain a deviation, and evaluating the target regulation system according to a preset deviation interval;
when the evaluation result of the target regulation system does not meet the preset standard, adaptively adjusting the target regulation system according to the deviation.
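The projection and counting steps above can be illustrated with a short NumPy sketch. It assumes that the constraint range is an interval on the projected score and that the deviation is the count minus the preset threshold; neither is spelled out in the source, and the feature values are hypothetical.

```python
import numpy as np

def principal_direction(features):
    """Unit vector of the first principal component of the feature matrix."""
    centered = features - features.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return eigvecs[:, -1]                    # largest-variance direction

def evaluate_category(features, low, high, threshold):
    """Project onto the principal direction, count points inside [low, high],
    and return (count, deviation of the count from the preset threshold)."""
    centered = features - features.mean(axis=0)
    scores = centered @ principal_direction(features)
    inside = int(np.sum((scores >= low) & (scores <= high)))
    return inside, inside - threshold

# Hypothetical data features for one user identity category.
features = np.array([[-2.0, 0.0], [-1.0, 0.1], [0.0, 0.0], [1.0, -0.1], [2.0, 0.0]])
count, deviation = evaluate_category(features, low=-1.5, high=1.5, threshold=4)
```

The sign of a principal direction is arbitrary, so the example uses a range that is symmetric about zero; an asymmetric constraint range would need a fixed sign convention for the direction.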
Fig. 4 shows a block diagram of an integrated management system for multi-source heterogeneous data in colleges and universities according to the present invention.
The second aspect of the present invention also provides an integrated management system 4 for college multi-source heterogeneous data, the system comprising a memory 41 and a processor 42, wherein the memory stores an integrated management method program for college multi-source heterogeneous data which, when executed by the processor, implements the following steps:
acquiring multi-source heterogeneous data of a college, classifying the multi-source heterogeneous data of the college according to service types and user identity categories corresponding to the data, and fusing and standardizing heterogeneous data with different sources;
constructing a college data knowledge graph from the standardized multi-source heterogeneous data of the colleges and universities, classifying the relationship of the multi-source heterogeneous data of the colleges and universities based on the college knowledge graph, and vectorizing the college data knowledge graph by a graph representation method;
Generating demand characteristics according to data management demands of universities, selecting characteristic data sets according to characteristic selection strategies based on the demand characteristics, generating a related graph structure according to the characteristic data sets, acquiring potential data connection according to the graph structure, and carrying out data management in combination with time characteristics;
meanwhile, a college data sharing model is built according to the data characteristics and the corresponding graph structure of the college multi-source heterogeneous data, and the college multi-source heterogeneous data is subjected to data integration and data sharing through the college data sharing model.
It should be noted that college heterogeneous data are acquired from different data sources and marked according to the service labels of the data sources, and the data under the same service label are aggregated to realize service classification of the college multi-source heterogeneous data. Because the same entity may be named in different languages or expressed in different ways in different data, relevant knowledge of the service field is acquired according to each service label, equivalent concepts are determined from the different expressions of the same term through the relevant knowledge, and the entities of related terms in the college heterogeneous data under the same service label are unified based on the equivalent concepts; the college heterogeneous data in each data source are matched so as to achieve entity resolution. After the related term entities are unified, data conflict detection is performed. Ontology information corresponding to each kind of data in the college heterogeneous data knowledge graph is preset based on the college's current data management state and habit information; the similarity between the unified entities of the college heterogeneous data from different data sources and the preset ontology is calculated, and ontology-entity matching is performed according to the similarity. The college multi-source heterogeneous data are associated and fused according to the matching result and the preset ontology, meaningless entities are deleted, and user identity category labels are set using the user identity category information corresponding to the college multi-source heterogeneous data, where the user identity information includes teacher, student, employee, and the like. The college multi-source heterogeneous data carrying data labels are matched with the preset ontology through entities to construct the college heterogeneous data knowledge base.
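The term unification and ontology-entity matching described above can be sketched with the Python standard library. The equivalent-concept table, the ontology concepts, the string-similarity measure (`difflib.SequenceMatcher`), and the 0.8 threshold are all hypothetical choices for illustration; the source does not name a specific similarity measure.

```python
from difflib import SequenceMatcher

# Hypothetical equivalent-concept table for one service label.
EQUIVALENTS = {"stu_no": "student id", "student number": "student id",
               "lecturer": "teacher"}
# Hypothetical preset ontology concepts.
ONTOLOGY = ["student id", "teacher", "course", "department"]

def unify(term):
    """Map a raw term to its canonical form via the equivalent-concept table."""
    key = term.strip().lower()
    return EQUIVALENTS.get(key, key)

def match_ontology(term, threshold=0.8):
    """Return the best-matching ontology concept, or None if the best
    similarity is below the threshold (the entity would then be dropped)."""
    canonical = unify(term)
    best = max(ONTOLOGY, key=lambda c: SequenceMatcher(None, canonical, c).ratio())
    score = SequenceMatcher(None, canonical, best).ratio()
    return best if score >= threshold else None
```

For instance, `match_ontology("Student Number")` resolves to `"student id"` through the equivalence table, while a term with no close ontology concept returns `None` and would be treated as a meaningless entity.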
It should be noted that a data mapping is established according to the college heterogeneous data knowledge base to convert the data into triplet form; attribute information corresponding to the entities in the college heterogeneous data knowledge base is extracted, and a relation table is constructed by associating the entity attribute information through the preset ontology. The number of entity attribute information items associated with each item of entity attribute information is acquired by querying the relation table, encoding is performed according to that number and the corresponding data ID, and the encoded information is represented graphically. Triplets of entities and relations in the college heterogeneous data knowledge base are generated and mapped, the college knowledge graph is constructed graphically, and the college knowledge graph is learned and represented through a graph convolutional neural network. The college data knowledge graph is represented as $G=(V,E)$, where $V$ represents the set of entity nodes in the college data knowledge graph and $E$ represents the set of relations between entity nodes. The entity nodes in the college knowledge graph are vectorized, the graph convolutional neural network aggregates the neighbor node feature vectors into each entity node's own vector, the updated node features are propagated through convolution calculation, and the representation of an entity node is obtained through a neighbor entity node aggregation mechanism, specifically:
where the terms of the aggregation formula denote, respectively: the vectorized representation of the entity node obtained by aggregating its neighbor entity nodes; the activation function; the weight matrix between the entity node and its neighbor entity nodes; and the vector representations of the entity node's neighbor entity nodes.
According to the embodiment of the invention, demand features are generated according to the data management demand of the college, a feature data set is selected by a feature selection strategy based on the demand features, a related graph structure is generated from the feature data set, potential data relations are acquired from the graph structure, and data management is performed in combination with time features, specifically:
acquiring the current college data management demand, extracting the service type corresponding to the demand, extracting keyword information from the current college data management demand, taking the word vectors corresponding to the keyword information as service semantic features, and generating demand features from the service type and the service semantic features;
establishing a retrieval task in the college data knowledge graph according to the demand features, importing the demand features into a low-dimensional vector space, and calculating similarity from the Euclidean distance between the demand feature vector and each entity node vector in the low-dimensional vector space;
acquiring and marking the entity node vectors whose similarity meets a preset standard, constructing a feature data set from the marked entity nodes, and assigning different weights to the marked entity nodes through a multi-channel attention mechanism;
setting weight factors according to the similarity between the demand feature vector and the entity node vectors, acquiring the feature data of the corresponding channels through the weight factors, updating the features within each channel by a self-attention mechanism, and assigning different weights according to importance;
learning and representing the weighted feature data set with the graph convolutional neural network, acquiring the related graph structure from the connection relations of the entity nodes, extracting data features through the neighbor aggregation and message passing mechanism, and acquiring the potential data relations.
It should be noted that the actual college heterogeneous data output after feature aggregation by the graph convolutional neural network forms a sequence signal; a temporal convolutional neural network model is trained, and the sequence signal is input into the temporal convolutional neural network. Dilated convolution with residual connections is applied to the input sequence signal in the temporal convolutional neural network until all convolution calculations are finished, where the dilated convolution is calculated as follows:
$$F(s) = \sum_{i=0}^{k-1} f(i)\, x_{s-d\cdot i}$$
where $F(s)$ represents the dilated convolution result when the input sequence signal is $s$, $f$ represents the convolution kernel, $k$ represents the one-dimensional convolution kernel size, $i$ is the index over the convolution kernel, $x$ is the one-dimensional sequence, $x_{s-d\cdot i}$ represents the lower-layer neuron from which the upper-layer neuron is calculated, and $d$ is the dilation factor;
the time features of the college heterogeneous data are output through the fully connected layer; data management analysis is carried out according to the data features and time features of the college heterogeneous data to fulfil the current college data management demand.
Feature aggregation and feature extraction are performed on the graph structure and the neighbor nodes of each entity node through the graph convolutional neural network, and the aggregated data features are input into the temporal convolutional neural network. Dilated convolution is introduced so that longer sequence signals can be convolved without increasing the amount of calculation, and the residual network effectively reduces the training error. The temporal correlation is analyzed jointly with the structural features extracted by the graph convolutional neural network to fulfil the current college data management demand.
According to the embodiment of the invention, a college data sharing model is constructed according to the data features and corresponding graph structure of the college multi-source heterogeneous data, and data integration and data sharing are performed on the college multi-source heterogeneous data through the college data sharing model, specifically:
acquiring the data features and graph structure corresponding to the current college data management demand, and judging from the data features whether the data information corresponding to the current college data management demand can be shared in whole or in part;
extracting the shareable college heterogeneous data according to the data features, acquiring the corresponding entity nodes, and generating a shared entity node set;
constructing the college data sharing model based on deep learning, training it with the data features and the graph structure, and setting the sharing order of the entity nodes in the shared entity node set according to their weights, wherein a higher weight gives a higher sharing priority;
calculating the Mahalanobis distance between the entity nodes in the shared entity node set, taking the Mahalanobis distance as the overhead information of data integration, taking the entity node with the highest priority as the starting point, and generating an integration path for the college heterogeneous data through node migration according to the minimum-overhead principle;
performing data integration on the shareable college heterogeneous data along the integration path according to the preset data standard of the sharing object, and sharing the data with the sharing object by a preset method.
It should be noted that the Mahalanobis distance between the entity nodes in the shared entity node set is calculated as follows:
$$D_M(x_i, x_j) = \sqrt{(x_i - x_j)^{T}\, \Sigma^{+}\, (x_i - x_j)}$$
where $x_i$ and $x_j$ represent the position information of entity nodes $i$ and $j$ in the low-dimensional space, $D_M(x_i, x_j)$ represents the Mahalanobis distance between entity nodes $i$ and $j$, $\Sigma$ represents the covariance matrix of the marked entity nodes, $+$ denotes taking the generalized inverse of the covariance matrix, and $T$ denotes the matrix transpose.
The third aspect of the present invention also provides a computer-readable storage medium storing an integrated management method program for college multi-source heterogeneous data which, when executed by a processor, implements the steps of the integrated management method for college multi-source heterogeneous data described in any of the above.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The device embodiments described above are only illustrative; for example, the division of the units is only a division by logical function, and other divisions are possible in practice: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; can be located in one place or distributed to a plurality of network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present invention may be integrated in one processing unit, or each unit may be separately used as one unit, or two or more units may be integrated in one unit; the integrated units may be implemented in hardware or in hardware plus software functional units.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be implemented by hardware under the control of program instructions; the program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as a removable storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Alternatively, the above-described integrated units of the present invention may be stored in a computer-readable storage medium if implemented in the form of software functional modules and sold or used as separate products. Based on such understanding, the technical solutions of the embodiments of the present invention may be embodied in essence or a part contributing to the prior art in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, ROM, RAM, magnetic or optical disk, or other medium capable of storing program code.
The foregoing is merely illustrative of the present invention, and the protection scope of the present invention is not limited thereto; any variation or substitution that a person skilled in the art can readily conceive shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. The integrated management method for the multi-source heterogeneous data of the universities is characterized by comprising the following steps of:
acquiring multi-source heterogeneous data of a college, classifying the multi-source heterogeneous data of the college according to service types and user identity categories corresponding to the data, and fusing and standardizing heterogeneous data with different sources;
constructing a college data knowledge graph from the standardized multi-source heterogeneous data of the colleges and universities, classifying the relationship of the multi-source heterogeneous data of the colleges and universities based on the college knowledge graph, and vectorizing the college data knowledge graph by a graph representation method;
generating demand characteristics according to data management demands of universities, selecting characteristic data sets according to characteristic selection strategies based on the demand characteristics, generating a related graph structure according to the characteristic data sets, acquiring potential data connection according to the graph structure, and carrying out data management in combination with time characteristics;
meanwhile, a college data sharing model is built according to the data characteristics and the corresponding graph structure of the college multi-source heterogeneous data, and the college multi-source heterogeneous data is subjected to data integration and data sharing through the college data sharing model.
2. The integrated management method of multi-source heterogeneous data in colleges and universities according to claim 1, wherein the multi-source heterogeneous data in colleges and universities are classified according to service types and user identity categories corresponding to the data, and fusion and standardization processing are performed on heterogeneous data with different sources, specifically:
acquiring college heterogeneous data from different data sources, marking the data according to the service labels of the data sources, and aggregating the data under the same service label to realize service classification of the college multi-source heterogeneous data;
acquiring relevant knowledge of the service field according to each service label, determining the different expressions of related terms through the relevant knowledge, determining equivalent concepts from the different expressions of the same term, and unifying the entities of related terms in the college heterogeneous data under the same service label based on the equivalent concepts;
performing data conflict detection after the related term entities are unified, calculating the similarity between the unified entities of the college heterogeneous data from different data sources and a preset ontology, and performing ontology-entity matching according to the similarity;
associating and fusing the college multi-source heterogeneous data according to the matching result and the preset ontology, deleting meaningless entities, and setting user identity category labels using the user identity category information corresponding to the college multi-source heterogeneous data;
matching the college multi-source heterogeneous data carrying data labels with the preset ontology through entities to construct a college heterogeneous data knowledge base.
3. The integrated management method for multi-source heterogeneous data of colleges and universities according to claim 1, wherein the relationship classification is performed on the multi-source heterogeneous data of the colleges and universities based on the knowledge graph of the colleges and universities, and the knowledge graph of the data of the colleges and universities is vectorized by a graph representation method, specifically comprising:
establishing data mapping according to a college heterogeneous data knowledge base, converting data into a triplet form, extracting attribute information corresponding to entities in the college heterogeneous data knowledge base, and establishing a corresponding entity attribute information through a preset ontology to construct a relation table;
acquiring the number of the entity attribute information associated with each entity attribute information by inquiring the relation table, encoding according to the number of the entity attribute information and the corresponding data ID, and representing the encoded information in a graphical mode;
generating a triplet form of entities and relations in a college heterogeneous data knowledge base, mapping, constructing a college knowledge graph in a graphical mode, and learning and representing the college knowledge graph through a graph convolution neural network;
representing the college data knowledge graph as $G=(V,E)$, where $V$ represents the set of entity nodes in the college data knowledge graph and $E$ represents the set of relations between entity nodes;
vectorizing the entity nodes in the college knowledge graph, aggregating the neighbor node feature vectors into each entity node's own vector by using the graph convolutional neural network, and propagating the updated node features through convolution calculation.
4. The integrated management method of multi-source heterogeneous data of colleges and universities according to claim 1, wherein demand features are generated according to the data management demand of the college, a feature data set is selected by a feature selection strategy based on the demand features, a related graph structure is generated from the feature data set, potential data relations are acquired from the graph structure, and data management is performed in combination with time features, specifically:
acquiring the current college data management demand, extracting the service type corresponding to the demand, extracting keyword information from the current college data management demand, taking the word vectors corresponding to the keyword information as service semantic features, and generating demand features from the service type and the service semantic features;
establishing a retrieval task in the college data knowledge graph according to the demand features, importing the demand features into a low-dimensional vector space, and calculating similarity from the Euclidean distance between the demand feature vector and each entity node vector in the low-dimensional vector space;
acquiring and marking the entity node vectors whose similarity meets a preset standard, constructing a feature data set from the marked entity nodes, and assigning different weights to the marked entity nodes through a multi-channel attention mechanism;
setting weight factors according to the similarity between the demand feature vector and the entity node vectors, acquiring the feature data of the corresponding channels through the weight factors, updating the features within each channel by a self-attention mechanism, and assigning different weights according to importance;
learning and representing the weighted feature data set with the graph convolutional neural network, acquiring the related graph structure from the connection relations of the entity nodes, extracting data features through the neighbor aggregation and message passing mechanism, and acquiring the potential data relations.
5. The integrated management method of multi-source heterogeneous data of colleges and universities according to claim 4, further comprising obtaining time characteristics of the college heterogeneous data corresponding to the marked entity nodes, specifically:
forming a sequence signal from the real college heterogeneous data output after feature aggregation by the graph convolutional neural network, training a temporal convolutional neural network model, and inputting the sequence signal into the temporal convolutional neural network;
applying dilated convolutions with residual connections to the input sequence signal in the temporal convolutional neural network until all convolution calculations are finished, and outputting the time characteristics of the college heterogeneous data through a fully connected layer;
and carrying out data management analysis according to the data characteristics and time characteristics of the college heterogeneous data, so as to fulfil the current college data management requirement.
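The dilated convolution with residual connections named in this claim can be sketched as follows. This is a minimal illustration with fixed, hand-picked kernel weights and a ReLU activation (both assumptions); a trained temporal convolutional network would learn its kernels, and the fully connected output layer is omitted here.

```python
def dilated_causal_conv(seq, kernel, dilation):
    """Causal dilated convolution: y[t] = sum_k kernel[k] * seq[t - k * dilation].

    Positions before the start of the sequence are treated as zero.
    """
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * seq[idx]
        out.append(acc)
    return out


def residual_block(seq, kernel, dilation):
    """Dilated convolution, ReLU, then add the block input back (residual connection)."""
    conv = dilated_causal_conv(seq, kernel, dilation)
    activated = [max(0.0, v) for v in conv]
    return [x + a for x, a in zip(seq, activated)]


def tcn_features(seq, kernel, dilations):
    """Stack residual blocks, one per dilation rate."""
    h = list(seq)
    for d in dilations:
        h = residual_block(h, kernel, d)
    return h


# Hypothetical sequence signal and kernel, chosen only for illustration.
signal = [1.0, 2.0, 3.0, 4.0]
features = tcn_features(signal, kernel=[0.5, 0.5], dilations=[1, 2])
```

Doubling the dilation at each block lets later layers see further back in the sequence without increasing the kernel size.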
6. The integrated management method of multi-source heterogeneous data of colleges and universities according to claim 1, wherein a college data sharing model is constructed according to the data characteristics and the corresponding graph structure of the college multi-source heterogeneous data, and data integration and data sharing of the college multi-source heterogeneous data are performed through the college data sharing model, specifically:
acquiring the data characteristics and graph structure corresponding to the current college data management requirement, and judging from the data characteristics whether the data information corresponding to the current college data management requirement can be shared in whole or in part;
extracting the shareable college heterogeneous data from the data characteristics, acquiring the corresponding entity nodes according to the shareable college heterogeneous data, and generating a shared entity node set;
constructing a college data sharing model based on deep learning, training it using the data characteristics and the graph structure, and setting the sharing order according to the weight of each entity node in the shared entity node set, wherein a higher weight gives a higher sharing priority;
calculating the Mahalanobis distance between the entity nodes in the shared entity node set, taking the Mahalanobis distance as the overhead information of data integration, and, starting from the entity node with the highest priority, generating an integration path for the college heterogeneous data through node migration according to the minimum-overhead principle;
and integrating the shareable college heterogeneous data along the integration path according to the preset data standard of the sharing object, and sharing the data with the sharing object by a preset method.
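The path-generation step above can be sketched under two loudly stated assumptions: the covariance matrix is taken to be diagonal, so the Mahalanobis distance reduces to a variance-scaled Euclidean distance, and "node migration according to the minimum-overhead principle" is read as a greedy walk to the nearest unvisited node. The node names, vectors, weights, and variances are hypothetical.

```python
import math


def diagonal_mahalanobis(a, b, variances):
    """Mahalanobis distance assuming a diagonal covariance matrix.

    With a full covariance matrix the inverse covariance would be needed;
    the diagonal case is used here only to keep the sketch dependency-free.
    """
    return math.sqrt(sum((x - y) ** 2 / v for x, y, v in zip(a, b, variances)))


def integration_path(node_vecs, node_weights, variances):
    """Start at the highest-weight node, then repeatedly move to the cheapest unvisited node."""
    remaining = set(node_vecs)
    current = max(sorted(remaining), key=lambda n: node_weights[n])
    path = [current]
    remaining.remove(current)
    while remaining:
        nxt = min(
            sorted(remaining),  # sorted for deterministic tie-breaking
            key=lambda n: diagonal_mahalanobis(node_vecs[current], node_vecs[n], variances),
        )
        path.append(nxt)
        remaining.remove(nxt)
        current = nxt
    return path


# Hypothetical shared entity nodes with embeddings and attention weights.
vecs = {
    "student": [0.0, 0.0],
    "course": [1.0, 0.0],
    "grade": [1.0, 2.0],
    "library": [4.0, 0.0],
}
node_weights = {"student": 0.4, "course": 0.3, "grade": 0.2, "library": 0.1}
path = integration_path(vecs, node_weights, variances=[1.0, 4.0])
```

A greedy walk does not guarantee a globally minimal total overhead; it is chosen here because it matches the step-by-step migration described in the claim.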
7. An integrated management system for multi-source heterogeneous data of colleges and universities, the system comprising a memory and a processor, wherein the memory contains an integrated management method program for multi-source heterogeneous data of colleges and universities, and the program, when executed by the processor, implements the following steps:
acquiring multi-source heterogeneous data of a college, classifying the college multi-source heterogeneous data according to the service types and user identity categories corresponding to the data, and fusing and standardizing the heterogeneous data from different sources;
constructing a college data knowledge graph from the standardized college multi-source heterogeneous data, classifying the relations of the college multi-source heterogeneous data based on the college data knowledge graph, and vectorizing the college data knowledge graph by a graph representation method;
generating demand characteristics according to the college data management requirement, selecting a characteristic data set according to a feature selection strategy based on the demand characteristics, generating a related graph structure according to the characteristic data set, acquiring potential data connections according to the graph structure, and carrying out data management in combination with time characteristics;
and meanwhile constructing a college data sharing model according to the data characteristics and the corresponding graph structure of the college multi-source heterogeneous data, and performing data integration and data sharing of the college multi-source heterogeneous data through the college data sharing model.
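The first step of this claim (classifying by service type and user identity category, then standardizing data from different sources) might look like the following sketch. The field names, the alias table, and the record layout are hypothetical; the claim does not specify any schema.

```python
from collections import defaultdict

# Hypothetical mapping from source-specific field names to one common schema.
FIELD_ALIASES = {
    "stu_id": "student_id",
    "sid": "student_id",
    "xm": "name",
    "full_name": "name",
}


def standardize(record):
    """Rename source-specific fields to the common schema; unknown fields are kept as-is."""
    return {FIELD_ALIASES.get(key, key): value for key, value in record.items()}


def classify_and_fuse(records):
    """Group standardized records by (service type, user identity category)."""
    groups = defaultdict(list)
    for record in records:
        clean = standardize(record)
        key = (clean.get("service_type", "unknown"), clean.get("identity", "unknown"))
        groups[key].append(clean)
    return dict(groups)


# Records from three imagined source systems with differing field names.
records = [
    {"stu_id": "S1", "xm": "Li", "service_type": "teaching", "identity": "student"},
    {"sid": "S2", "full_name": "Wang", "service_type": "teaching", "identity": "student"},
    {"staff_id": "T1", "full_name": "Zhao", "service_type": "research", "identity": "teacher"},
]
groups = classify_and_fuse(records)
```

After this step, records that originated in different systems share field names within each group, which is the precondition for building knowledge-graph entities from them.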
8. The integrated management system of multi-source heterogeneous data of colleges and universities according to claim 7, wherein demand characteristics are generated according to the college data management requirement, a characteristic data set is selected according to a feature selection strategy based on the demand characteristics, a related graph structure is generated according to the characteristic data set, potential data connections are acquired according to the graph structure, and data management is carried out in combination with time characteristics, specifically:
acquiring the current college data management requirement and extracting the service type corresponding to it, extracting keyword information from the current college data management requirement, taking the word vectors corresponding to the keyword information as service semantic characteristics, and generating demand characteristics from the service type and the service semantic characteristics;
establishing a retrieval task in the college data knowledge graph according to the demand characteristics, importing the demand characteristics into the low-dimensional vector space, and calculating similarity from the Euclidean distance between the demand characteristic vector and the entity node vectors in the low-dimensional vector space;
obtaining and marking the entity node vectors whose similarity meets a preset standard, constructing a characteristic data set from the marked entity nodes, and assigning different weights to the marked entity nodes through a multi-channel attention mechanism;
setting weight factors according to the similarity between the demand characteristic vector and the entity node vectors, acquiring the feature data of the corresponding channels through the weight factors, updating the features within each channel using a self-attention mechanism, and assigning different weights according to importance;
and learning a representation of the weighted characteristic data set using a graph convolutional neural network, acquiring the related graph structure according to the connection relations of the entity nodes, extracting data features through neighbor aggregation and message passing, and acquiring potential data connections.
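The neighbor aggregation in the final step can be illustrated with one round of mean aggregation over an undirected graph. This is a simplified sketch: a graph convolutional layer would additionally apply a learned weight matrix and a nonlinearity, and the node names, features, and edges below are assumptions.

```python
def aggregate_neighbors(features, edges):
    """One message-passing round: each node's new feature vector is the mean of
    its own vector and those of its direct neighbors (self-loop included)."""
    neighbors = {node: {node} for node in features}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    updated = {}
    for node, nbrs in neighbors.items():
        dim = len(features[node])
        updated[node] = [
            sum(features[m][i] for m in nbrs) / len(nbrs) for i in range(dim)
        ]
    return updated


# Hypothetical marked entity nodes and their connection relations.
node_features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
node_edges = [("a", "b"), ("b", "c")]
updated = aggregate_neighbors(node_features, node_edges)
```

Nodes `a` and `c` are not directly connected, yet after a second round each would carry information from the other via `b`, which is how repeated aggregation can surface indirect data connections.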
9. The integrated management system of multi-source heterogeneous data of colleges and universities according to claim 7, further comprising obtaining time characteristics of the college heterogeneous data corresponding to the marked entity nodes, specifically:
forming a sequence signal from the real college heterogeneous data output after feature aggregation by the graph convolutional neural network, training a temporal convolutional neural network model, and inputting the sequence signal into the temporal convolutional neural network;
applying dilated convolutions with residual connections to the input sequence signal in the temporal convolutional neural network until all convolution calculations are finished, and outputting the time characteristics of the college heterogeneous data through a fully connected layer;
and carrying out data management analysis according to the data characteristics and time characteristics of the college heterogeneous data, so as to fulfil the current college data management requirement.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium contains an integrated management method program for multi-source heterogeneous data of colleges and universities, and when the program is executed by a processor, the steps of the integrated management method for multi-source heterogeneous data of colleges and universities according to any one of claims 1 to 6 are implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310488579.XA CN116186359B (en) | 2023-05-04 | 2023-05-04 | Integrated management method, system and storage medium for multi-source heterogeneous data of universities |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310488579.XA CN116186359B (en) | 2023-05-04 | 2023-05-04 | Integrated management method, system and storage medium for multi-source heterogeneous data of universities |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116186359A (en) | 2023-05-30 |
CN116186359B (en) | 2023-09-01 |
Family
ID=86442660
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310488579.XA Active CN116186359B (en) | 2023-05-04 | 2023-05-04 | Integrated management method, system and storage medium for multi-source heterogeneous data of universities |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116186359B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116910824A (en) * | 2023-08-28 | 2023-10-20 | 广东中山网传媒信息科技有限公司 | Safety big data analysis method and system based on distributed multi-source measure |
CN116976808A (en) * | 2023-07-21 | 2023-10-31 | 中国矿业大学(北京) | Multisource heterogeneous coal mine geologic data management system, method, electronic equipment and storage medium |
CN117235281A (en) * | 2023-09-22 | 2023-12-15 | 武汉贝塔世纪科技有限公司 | Multi-element data management method and system based on knowledge graph technology |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107633075A (en) * | 2017-09-22 | 2018-01-26 | 吉林大学 | A kind of multi-source heterogeneous data fusion platform and fusion method |
US20210034651A1 (en) * | 2019-07-29 | 2021-02-04 | Pavel Atanasov | Systems and Methods for Multi-Source Reference Class Identification, Base Rate Calculation, and Prediction |
CN113111244A (en) * | 2020-12-31 | 2021-07-13 | 绍兴亿都信息技术股份有限公司 | Multisource heterogeneous big data fusion system based on traditional Chinese medicine knowledge large-scale popularization |
WO2021189971A1 (en) * | 2020-10-26 | 2021-09-30 | 平安科技(深圳)有限公司 | Medical plan recommendation system and method based on knowledge graph representation learning |
US20210319043A1 (en) * | 2020-04-13 | 2021-10-14 | Singapore University Of Technology And Design | Multi-source data management mechanism and platform |
US20210319329A1 (en) * | 2020-03-30 | 2021-10-14 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and apparatus for generating knowledge graph, method for relation mining |
WO2022041730A1 (en) * | 2020-08-28 | 2022-03-03 | 康键信息技术(深圳)有限公司 | Medical field intention recognition method, apparatus and device, and storage medium |
CN114443854A (en) * | 2021-12-30 | 2022-05-06 | 深圳晶泰科技有限公司 | Processing method and device of multi-source heterogeneous data, computer equipment and storage medium |
CN114491068A (en) * | 2022-01-21 | 2022-05-13 | 武汉东湖大数据交易中心股份有限公司 | Method and system for constructing knowledge graph of industrial park by fusing multi-source heterogeneous data |
CN115525768A (en) * | 2022-09-21 | 2022-12-27 | 中国电子科技集团公司第十四研究所 | Visual construction method and device for domain knowledge graph |
- 2023-05-04: CN application CN202310488579.XA, patent CN116186359B (en), status Active
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107633075A (en) * | 2017-09-22 | 2018-01-26 | 吉林大学 | A kind of multi-source heterogeneous data fusion platform and fusion method |
US20210034651A1 (en) * | 2019-07-29 | 2021-02-04 | Pavel Atanasov | Systems and Methods for Multi-Source Reference Class Identification, Base Rate Calculation, and Prediction |
US20210319329A1 (en) * | 2020-03-30 | 2021-10-14 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and apparatus for generating knowledge graph, method for relation mining |
US20210319043A1 (en) * | 2020-04-13 | 2021-10-14 | Singapore University Of Technology And Design | Multi-source data management mechanism and platform |
WO2022041730A1 (en) * | 2020-08-28 | 2022-03-03 | 康键信息技术(深圳)有限公司 | Medical field intention recognition method, apparatus and device, and storage medium |
WO2021189971A1 (en) * | 2020-10-26 | 2021-09-30 | 平安科技(深圳)有限公司 | Medical plan recommendation system and method based on knowledge graph representation learning |
CN113111244A (en) * | 2020-12-31 | 2021-07-13 | 绍兴亿都信息技术股份有限公司 | Multisource heterogeneous big data fusion system based on traditional Chinese medicine knowledge large-scale popularization |
CN114443854A (en) * | 2021-12-30 | 2022-05-06 | 深圳晶泰科技有限公司 | Processing method and device of multi-source heterogeneous data, computer equipment and storage medium |
CN114491068A (en) * | 2022-01-21 | 2022-05-13 | 武汉东湖大数据交易中心股份有限公司 | Method and system for constructing knowledge graph of industrial park by fusing multi-source heterogeneous data |
CN115525768A (en) * | 2022-09-21 | 2022-12-27 | 中国电子科技集团公司第十四研究所 | Visual construction method and device for domain knowledge graph |
Non-Patent Citations (2)
Title |
---|
冯勇;张丽颖;顾兆旭;马技;: "面向高校多源异构数据环境的元数据集成方法", 辽宁大学学报(自然科学版), no. 02 * |
文莉莉;邬满;: "基于大数据与知识图谱的知识共享服务平台", 电子元器件与信息技术, no. 03 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116976808A (en) * | 2023-07-21 | 2023-10-31 | 中国矿业大学(北京) | Multisource heterogeneous coal mine geologic data management system, method, electronic equipment and storage medium |
CN116910824A (en) * | 2023-08-28 | 2023-10-20 | 广东中山网传媒信息科技有限公司 | Safety big data analysis method and system based on distributed multi-source measure |
CN116910824B (en) * | 2023-08-28 | 2024-02-06 | 广东中山网传媒信息科技有限公司 | Safety big data analysis method and system based on distributed multi-source measure |
CN117235281A (en) * | 2023-09-22 | 2023-12-15 | 武汉贝塔世纪科技有限公司 | Multi-element data management method and system based on knowledge graph technology |
CN117235281B (en) * | 2023-09-22 | 2024-07-05 | 武汉贝塔世纪科技有限公司 | Multi-element data management method and system based on knowledge graph technology |
Also Published As
Publication number | Publication date |
---|---|
CN116186359B (en) | 2023-09-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN116186359B (en) | Integrated management method, system and storage medium for multi-source heterogeneous data of universities | |
EP3985578A1 (en) | Method and system for automatically training machine learning model | |
CN109165840B (en) | Risk prediction processing method, risk prediction processing device, computer equipment and medium | |
CN109272396B (en) | Customer risk early warning method, device, computer equipment and medium | |
US11461847B2 (en) | Applying a trained model to predict a future value using contextualized sentiment data | |
CN111026842B (en) | Natural language processing method, natural language processing device and intelligent question-answering system | |
CN110363449B (en) | Risk identification method, device and system | |
CN107025509B (en) | Decision making system and method based on business model | |
CN112215604B (en) | Method and device for identifying transaction mutual-party relationship information | |
Li et al. | Long-short term spatiotemporal tensor prediction for passenger flow profile | |
CN110458324B (en) | Method and device for calculating risk probability and computer equipment | |
CN113627566B (en) | Phishing early warning method and device and computer equipment | |
CN110674636B (en) | Power consumption behavior analysis method | |
CN112989761B (en) | Text classification method and device | |
Windiatmoko et al. | Developing FB chatbot based on deep learning using RASA framework for university enquiries | |
CN110637321A (en) | Dynamic claims submission system | |
Choi et al. | Hybrid information mixing module for stock movement prediction | |
Wang et al. | Metro traffic flow prediction via knowledge graph and spatiotemporal graph neural network | |
CN111353728A (en) | Risk analysis method and system | |
CN116910341A (en) | Label prediction method and device and electronic equipment | |
CN115660695A (en) | Customer service personnel label portrait construction method and device, electronic equipment and storage medium | |
US20170293863A1 (en) | Data analysis system, and control method, program, and recording medium therefor | |
CN115062126A (en) | Statement analysis method and device, electronic equipment and readable storage medium | |
CN110163761B (en) | Suspicious item member identification method and device based on image processing | |
CN114154564A (en) | Method and device for determining relevance based on heterogeneous graph, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||