CN116186359B - Integrated management method, system and storage medium for multi-source heterogeneous data of universities - Google Patents


Publication number
CN116186359B
Authority
CN
China
Prior art keywords
data
college
heterogeneous data
source heterogeneous
entity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310488579.XA
Other languages
Chinese (zh)
Other versions
CN116186359A (en)
Inventor
李广垒
王飞
陈祖涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Baoxin Information Technology Co ltd
Original Assignee
Anhui Baoxin Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Baoxin Information Technology Co ltd
Priority to CN202310488579.XA
Publication of CN116186359A
Application granted
Publication of CN116186359B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Tourism & Hospitality (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Computing Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an integrated management method, system and storage medium for multi-source heterogeneous data of universities, which comprises the following steps: acquiring multi-source heterogeneous data of a college, classifying according to service types and user identity categories, fusing and standardizing heterogeneous data with different sources, constructing a college data knowledge graph for relationship classification, and vectorizing the college data knowledge graph through a graph representation method; generating demand characteristics according to data management demands of universities, selecting characteristic data sets according to characteristic selection strategies based on the demand characteristics to generate a related graph structure, and performing data management; and constructing a college data sharing model according to the data characteristics and the corresponding graph structure of the college multi-source heterogeneous data, and carrying out data sharing on the college multi-source heterogeneous data. The invention carries out all-round data integration and college data management by constructing the college data knowledge graph, realizes the fusion and sharing of data information and business, and promotes the modernization of a college management system and management capability.

Description

Integrated management method, system and storage medium for multi-source heterogeneous data of universities
Technical Field
The invention relates to the technical field of data processing, in particular to an integrated management method, an integrated management system and a storage medium for multi-source heterogeneous data in colleges and universities.
Background
With the rapid development of informatization in colleges and universities, data management has become a basic condition for smart campus construction and an important guarantee for its shift from being business-oriented to being service-oriented. Investment in smart campus construction increases year by year, and data management is an important foundation for smart campus construction in the new period, providing more efficient and accurate data support for teaching, scientific research, management and services. At local colleges, however, insufficient attention and investment have delayed data management work, so data quality is poor, standards differ, and data cannot be shared. To meet the application requirements of smart campuses, data management work must be carried out as early as possible, and data standards, a data resource system and diversified data services must be established.
A data standard is a specification formulated to keep data consistent, complete and accurate when it is used and exchanged during data management, and it is an important guarantee of data quality. However, colleges did not plan uniformly in the early stage of building their information systems, so many data standards are inconsistent, data cannot be shared, and data islands result. It is therefore necessary to manage the data of a college's existing data center and, at the same time, to establish corresponding topic libraries and a unified open data platform that completes data aggregation, cleaning, conversion and similar processes, to construct a dimensional view of multi-source heterogeneous big data for college data, and to carry out all-round data integration and college data management.
Disclosure of Invention
In order to solve the technical problems, the invention provides an integrated management method, an integrated management system and a storage medium for multi-source heterogeneous data in colleges and universities.
The first aspect of the invention provides an integrated management method for multi-source heterogeneous data in a college, which comprises the following steps:
acquiring multi-source heterogeneous data of a college, classifying the multi-source heterogeneous data of the college according to service types and user identity categories corresponding to the data, and fusing and standardizing heterogeneous data with different sources;
constructing a college data knowledge graph from the standardized multi-source heterogeneous data of the colleges and universities, classifying the relationship of the multi-source heterogeneous data of the colleges and universities based on the college knowledge graph, and vectorizing the college data knowledge graph by a graph representation method;
generating demand characteristics according to data management demands of universities, selecting characteristic data sets according to characteristic selection strategies based on the demand characteristics, generating a related graph structure according to the characteristic data sets, acquiring potential data connection according to the graph structure, and carrying out data management in combination with time characteristics;
meanwhile, a college data sharing model is built according to the data characteristics and the corresponding graph structure of the college multi-source heterogeneous data, and the college multi-source heterogeneous data is subjected to data integration and data sharing through the college data sharing model.
In this scheme, classifying the multi-source heterogeneous data of the college according to the service types and user identity categories corresponding to the data, and fusing and standardizing heterogeneous data from different sources, is specifically:
acquiring college heterogeneous data from different data sources, marking the data according to the service label of each data source, and aggregating data under the same service label to classify the college multi-source heterogeneous data by service;
acquiring relevant knowledge of the service field for each service label, determining the different expressions of related terms from that knowledge, determining equivalent concepts from the different expressions of the same related term, and unifying the related-term entities of the college heterogeneous data under the same service label based on the equivalent concepts;
performing data conflict detection after the related-term entities are unified, calculating the similarity between the unified entities of the college heterogeneous data from different data sources and a preset ontology, and performing ontology-entity matching according to the similarity;
associating and fusing the college multi-source heterogeneous data according to the matching result and the preset ontology, deleting meaningless entities, and setting user identity category labels using the user identity category information corresponding to the college multi-source heterogeneous data;
and matching the labeled college multi-source heterogeneous data with the preset ontology through entities to construct a college heterogeneous data knowledge base.
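The unification and matching steps above can be illustrated with a minimal Python sketch. It is not the patented implementation: the equivalent-concept table `EQUIVALENTS`, the concept list `ONTOLOGY`, the 0.8 threshold and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for illustration, since the text does not specify how similarity is calculated.

```python
from difflib import SequenceMatcher

# Hypothetical equivalent-concept table for one service label:
# every known expression of a term maps to one canonical entity name.
EQUIVALENTS = {
    "student id": "student_number",
    "student no.": "student_number",
    "stu_num": "student_number",
    "dept": "department",
    "faculty": "department",
}

# Hypothetical preset ontology concepts.
ONTOLOGY = ["student_number", "department", "course", "teacher"]


def unify_term(term):
    """Map a raw term to its canonical entity name (normalized text as fallback)."""
    key = term.strip().lower()
    return EQUIVALENTS.get(key, key)


def match_to_ontology(entity, threshold=0.8):
    """Return the most similar ontology concept, or None if below the threshold."""
    best, best_score = None, 0.0
    for concept in ONTOLOGY:
        score = SequenceMatcher(None, entity, concept).ratio()
        if score > best_score:
            best, best_score = concept, score
    return best if best_score >= threshold else None


def fuse(records):
    """Unify terms, match them to the ontology, and drop unmatched entities.

    Each record is a (data_source, term, value) tuple.
    """
    fused = []
    for source, term, value in records:
        concept = match_to_ontology(unify_term(term))
        if concept is not None:
            fused.append((concept, value, source))
    return fused
```

Under these assumptions, records that name the same field differently in two source systems end up under one ontology concept, and terms that match nothing are discarded as meaningless entities.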
In the scheme, relationship classification is carried out on multi-source heterogeneous data of a college based on a college knowledge graph, and the college data knowledge graph is vectorized through a graph representation method, specifically:
establishing a data mapping according to the college heterogeneous data knowledge base to convert the data into triplet form, extracting the attribute information corresponding to the entities in the college heterogeneous data knowledge base, and establishing correspondences between entity attribute information through the preset ontology to construct a relation table;
acquiring, by querying the relation table, the number of entity attribute information items associated with each entity attribute information item, encoding according to that number and the corresponding data ID, and representing the encoded information graphically;
generating triples from the entities and relations in the college heterogeneous data knowledge base, mapping them, constructing the college knowledge graph graphically, and learning and representing the college knowledge graph through a graph convolutional neural network;
denoting the college data knowledge graph as $G=(V,E)$, where $V$ represents the set of entity nodes in the college data knowledge graph and $E$ represents the set of entity node relationships;
and vectorizing the entity nodes in the college knowledge graph, aggregating neighbor node feature vectors into each entity node's own vector using the graph convolutional neural network, and propagating the updated node features through convolution calculation.
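As a rough illustration of the triplet conversion and graph construction described above, the following Python sketch turns knowledge-base records into (head, relation, tail) triples and builds an adjacency map from them. The record layout, the `id` field name and the helper names are hypothetical; the text does not prescribe a storage format.

```python
from collections import defaultdict


def rows_to_triples(rows, id_field="id"):
    """Turn each record into (head, relation, tail) triples keyed by its data ID."""
    triples = []
    for row in rows:
        head = row[id_field]
        for field, value in row.items():
            if field != id_field and value is not None:
                triples.append((head, field, value))
    return triples


def build_graph(triples):
    """Build an undirected adjacency map; each edge keeps its relation label."""
    adjacency = defaultdict(set)
    for head, relation, tail in triples:
        adjacency[head].add((relation, tail))
        adjacency[tail].add((relation, head))
    return adjacency


def association_counts(adjacency):
    """Count how many nodes each node is associated with."""
    return {node: len(edges) for node, edges in adjacency.items()}
```

The association counts play the role of the per-entity numbers obtained by querying the relation table; how they are encoded together with the data ID is left open here, as it is in the text.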
In the scheme, demand features are generated according to data management demands of universities, a feature data set is selected according to a feature selection strategy based on the demand features, a related graph structure is generated according to the feature data set, potential data connection is acquired according to the graph structure, and data management is performed by combining time features, specifically:
acquiring current college data management requirements, extracting service types corresponding to the requirements, extracting keyword information based on the current college data management requirements, and generating requirement characteristics according to the service types and the service semantic characteristics by taking word vectors corresponding to the keyword information as the service semantic characteristics;
establishing a retrieval task in a college data knowledge graph according to the demand characteristics, importing the demand characteristics into a low-dimensional vector space, and calculating similarity by calculating Euclidean distance between demand characteristic vectors and entity node vectors in the low-dimensional vector space;
Obtaining entity node vectors with similarity meeting preset standards, marking, constructing a characteristic data set according to marked entity nodes, and distributing different weights to the marked entity nodes through a multichannel attention mechanism;
setting weight factors according to the similarity between the demand feature vectors and the entity node vectors, acquiring feature data of corresponding channels through the weight factors, updating the features in the channels by using a self-attention mechanism in different channels, and distributing different weights according to the importance degree;
and learning and representing the feature data sets distributed with different weights by using a graph convolution neural network, acquiring a related graph structure according to the connection relation of the entity nodes, extracting data features by using a neighbor aggregation and message transmission mechanism, and acquiring potential data connection.
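The retrieval and weighting steps above can be sketched in Python as follows. This is a simplified single-channel stand-in for the multichannel attention mechanism: the distance threshold `max_distance` and the softmax over negative distances are assumptions, not details given in the text.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_feature_set(demand_vec, node_vecs, max_distance):
    """Mark entity nodes whose distance to the demand vector is within max_distance."""
    marked = {}
    for node, vec in node_vecs.items():
        distance = euclidean(demand_vec, vec)
        if distance <= max_distance:
            marked[node] = distance
    return marked


def attention_weights(marked):
    """Softmax over negative distances: closer nodes receive larger weights."""
    if not marked:
        return {}
    scores = {node: math.exp(-d) for node, d in marked.items()}
    total = sum(scores.values())
    return {node: s / total for node, s in scores.items()}
```

A smaller Euclidean distance is treated as a higher similarity, so the node closest to the demand feature vector receives the largest weight in the resulting feature data set.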
In this scheme, the method further includes acquiring the time characteristics of the college heterogeneous data corresponding to the marked entity nodes, specifically:
outputting the real college heterogeneous data after feature aggregation by the graph convolutional neural network to form a sequence signal, training a temporal convolutional neural network model, and inputting the sequence signal into the temporal convolutional neural network;
performing residual-connected dilated (hole) convolution on the input sequence signal in the temporal convolutional neural network until all convolution calculations are finished, and outputting the time characteristics of the college heterogeneous data through a fully connected layer;
and performing data management analysis according to the data characteristics and time characteristics of the college heterogeneous data to fulfil the current college data management requirement.
In the scheme, a college data sharing model is constructed according to the data characteristics and the corresponding graph structure of the college multi-source heterogeneous data, and the college multi-source heterogeneous data is subjected to data integration and data sharing by the college data sharing model, specifically:
acquiring a data characteristic and a graph structure corresponding to the current college data management requirement, and judging whether the data information corresponding to the current college data management requirement can be shared or partially shared according to the data characteristic;
extracting college heterogeneous data capable of sharing from the data characteristics, acquiring corresponding entity nodes according to the college heterogeneous data capable of sharing, and generating a shared entity node set;
constructing a college data sharing model based on deep learning, training by utilizing the data characteristics and the graph structure, and setting the sharing sequence according to the weight of each entity node in a shared entity node set, wherein the higher the weight is, the higher the sharing priority is;
calculating the Mahalanobis distance between the entity nodes in the shared entity node set, taking the Mahalanobis distance as the overhead of data integration, taking the entity node with the highest priority as the starting point, and generating an integration path for the college heterogeneous data through node migration according to the minimum-overhead principle;
and integrating the shareable college heterogeneous data along the integration path according to the preset data standard of the shared object, and sharing the data with the shared object by a preset method.
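The path-generation step can be illustrated with the Python sketch below. It substitutes a greedy nearest-neighbor walk for the "node migration" that the text leaves unspecified, and it takes the inverse covariance matrix `inv_cov` as a given input rather than estimating it; both are assumptions made only to keep the example short.

```python
import math


def mahalanobis(a, b, inv_cov):
    """sqrt((a - b)^T * inv_cov * (a - b)) for plain Python lists."""
    diff = [x - y for x, y in zip(a, b)]
    projected = [
        sum(inv_cov[i][j] * diff[j] for j in range(len(diff)))
        for i in range(len(diff))
    ]
    return math.sqrt(sum(d * p for d, p in zip(diff, projected)))


def integration_path(vectors, weights, inv_cov):
    """Start at the highest-weight node, then repeatedly move to the unvisited
    node with the smallest Mahalanobis distance (greedy minimum overhead)."""
    current = max(weights, key=weights.get)
    path = [current]
    remaining = set(vectors) - {current}
    while remaining:
        # sorted() makes tie-breaking deterministic
        nxt = min(
            sorted(remaining),
            key=lambda n: mahalanobis(vectors[current], vectors[n], inv_cov),
        )
        path.append(nxt)
        remaining.remove(nxt)
        current = nxt
    return path
```

With an identity matrix as `inv_cov` the Mahalanobis distance reduces to the Euclidean distance, which is convenient for checking the walk by hand. A greedy walk does not guarantee a globally minimal total overhead.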
The second aspect of the present invention also provides an integrated management system for multi-source heterogeneous data in a college, the system comprising: the system comprises a memory and a processor, wherein the memory comprises an integrated management method program of multi-source heterogeneous data of a college, and the integrated management method program of multi-source heterogeneous data of the college realizes the following steps when being executed by the processor:
acquiring multi-source heterogeneous data of a college, classifying the multi-source heterogeneous data of the college according to service types and user identity categories corresponding to the data, and fusing and standardizing heterogeneous data with different sources;
constructing a college data knowledge graph from the standardized multi-source heterogeneous data of the colleges and universities, classifying the relationship of the multi-source heterogeneous data of the colleges and universities based on the college knowledge graph, and vectorizing the college data knowledge graph by a graph representation method;
generating demand characteristics according to data management demands of universities, selecting characteristic data sets according to characteristic selection strategies based on the demand characteristics, generating a related graph structure according to the characteristic data sets, acquiring potential data connection according to the graph structure, and carrying out data management in combination with time characteristics;
Meanwhile, a college data sharing model is built according to the data characteristics and the corresponding graph structure of the college multi-source heterogeneous data, and the college multi-source heterogeneous data is subjected to data integration and data sharing through the college data sharing model.
The third aspect of the present invention also provides a computer readable storage medium, where the computer readable storage medium includes a method program for integrated management of multi-source heterogeneous data in a college, where the method program for integrated management of multi-source heterogeneous data in a college implements the steps of the method for integrated management of multi-source heterogeneous data in a college according to any one of the above steps when executed by a processor.
The invention discloses an integrated management method, system and storage medium for multi-source heterogeneous data of universities, which comprises the following steps: acquiring multi-source heterogeneous data of a college, classifying according to service types and user identity categories, fusing and standardizing heterogeneous data with different sources, constructing a college data knowledge graph for relationship classification, and vectorizing the college data knowledge graph through a graph representation method; generating demand characteristics according to data management demands of universities, selecting characteristic data sets according to characteristic selection strategies based on the demand characteristics to generate a related graph structure, and performing data management; and constructing a college data sharing model according to the data characteristics and the corresponding graph structure of the college multi-source heterogeneous data, and carrying out data sharing on the college multi-source heterogeneous data. The invention carries out all-round data integration and college data management by constructing the college data knowledge graph, realizes the fusion and sharing of data information and business, and promotes the modernization of a college management system and management capability.
Drawings
FIG. 1 shows a flow chart of an integrated management method for multi-source heterogeneous data in colleges and universities according to the present application;
FIG. 2 shows a flow chart of the method of the present application for acquiring potential data connections according to the graph structure and performing data management in combination with time characteristics;
FIG. 3 is a flow chart of a method for data integration and data sharing of multi-source heterogeneous data in a college by a college data sharing model;
fig. 4 shows a block diagram of an integrated management system for multi-source heterogeneous data in colleges and universities according to the present application.
Detailed Description
In order that the above-recited objects, features and advantages of the present application will be more clearly understood, a more particular description of the application will be rendered by reference to the appended drawings and appended detailed description. It should be noted that, without conflict, the embodiments of the present application and features in the embodiments may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, however, the present application may be practiced in other ways than those described herein, and therefore the scope of the present application is not limited to the specific embodiments disclosed below.
Fig. 1 shows a flowchart of an integrated management method for multi-source heterogeneous data in a college according to the present application.
As shown in fig. 1, a first aspect of the present invention provides an integrated management method for multi-source heterogeneous data in a college, including:
S102, acquiring multi-source heterogeneous data of a college, classifying the multi-source heterogeneous data of the college according to service types and user identity categories corresponding to the data, and fusing and standardizing heterogeneous data with different sources;
S104, constructing a college data knowledge graph from the standardized multi-source heterogeneous data of the colleges and universities, classifying the relationship of the multi-source heterogeneous data of the colleges and universities based on the college knowledge graph, and vectorizing the college data knowledge graph by a graph representation method;
S106, generating demand characteristics according to data management demands of universities, selecting characteristic data sets according to characteristic selection strategies based on the demand characteristics, generating a related graph structure according to the characteristic data sets, acquiring potential data links according to the graph structure, and carrying out data management by combining time characteristics;
and S108, constructing a college data sharing model according to the data characteristics and the corresponding graph structure of the college multi-source heterogeneous data, and integrating and sharing the data of the college multi-source heterogeneous data through the college data sharing model.
It should be noted that college heterogeneous data is acquired from different data sources, marked according to the service label of each data source, and aggregated under the same service label to classify the college multi-source heterogeneous data by service. Because the same entity may be named in different languages or expressed in different ways in different data, relevant knowledge of the service field is acquired for each service label, the different expressions of related terms are determined from that knowledge, equivalent concepts are determined from the different expressions of the same related term, and the related-term entities of the college heterogeneous data under the same service label are unified based on the equivalent concepts; the college heterogeneous data in each data source is matched in this way to achieve entity resolution. Data conflict detection is performed after the related-term entities are unified. Ontology information corresponding to each kind of data in the college heterogeneous data knowledge graph is preset based on the college's current data management state and habit information; the similarity between the unified entities of the college heterogeneous data from different data sources and the preset ontology is calculated, and ontology-entity matching is performed according to the similarity. The college multi-source heterogeneous data is associated and fused according to the matching result and the preset ontology, meaningless entities are deleted, and user identity category labels are set using the user identity category information corresponding to the college multi-source heterogeneous data, where the user identity information includes teacher, student, employee, etc. The labeled college multi-source heterogeneous data is matched with the preset ontology through entities to construct a college heterogeneous data knowledge base.
It should be noted that a data mapping is established according to the college heterogeneous data knowledge base to convert the data into triplet form, the attribute information corresponding to the entities in the knowledge base is extracted, and correspondences between entity attribute information are established through the preset ontology to construct a relation table. By querying the relation table, the number of entity attribute information items associated with each entity attribute information item is acquired, encoding is performed according to that number and the corresponding data ID, and the encoded information is represented graphically. Triples are generated from the entities and relations in the knowledge base and mapped, the college knowledge graph is constructed graphically, and it is learned and represented through a graph convolutional neural network. The college data knowledge graph is denoted $G=(V,E)$, where $V$ represents the set of entity nodes in the college data knowledge graph and $E$ represents the set of entity node relationships. The entity nodes in the college knowledge graph are vectorized, neighbor node feature vectors are aggregated into each entity node's own vector using the graph convolutional neural network, and the updated node features are propagated through convolution calculation. An entity node $v$ is represented through the neighbor entity node aggregation mechanism, specifically:
$$h_v = \sigma\Big(\sum_{u \in N(v)} W_{vu}\, h_u\Big)$$
where $h_v$ represents the vectorized representation of entity node $v$ after aggregation over its neighbor entity nodes, $\sigma$ represents the activation function, $W_{vu}$ represents the weight matrix between entity node $v$ and its neighbor entity node $u$, and $h_u$ represents the vector representation of a neighbor entity node $u \in N(v)$ of the entity node.
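A minimal Python sketch of the neighbor aggregation mechanism described above is given here for orientation. It simplifies the description in two ways that are assumptions of this example: a single weight matrix `W` is shared by all neighbor pairs, and ReLU is used as the activation function, which the text does not name.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def relu(x):
    """Element-wise ReLU activation (an assumed choice of activation)."""
    return [max(0.0, v) for v in x]


def aggregate(node, neighbors, features, W):
    """New vector of `node`: activation of the sum of W @ h_u over its neighbors u."""
    total = [0.0] * len(W)
    for u in neighbors[node]:
        message = matvec(W, features[u])
        total = [t + m for t, m in zip(total, message)]
    return relu(total)
```

Applying `aggregate` to every node once corresponds to one round of information propagation; stacking several rounds lets features from more distant nodes reach each entity node.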
FIG. 2 shows a flow chart of the method of the present invention for acquiring potential data connections according to the graph structure and performing data management in combination with time characteristics.
According to the embodiment of the invention, demand characteristics are generated according to the data management demand of a college, a characteristic data set is selected according to a characteristic selection strategy based on the demand characteristics, a related graph structure is generated according to the characteristic data set, potential data connection is acquired according to the graph structure, and data management is performed by combining time characteristics, specifically:
S202, acquiring current college data management requirements, extracting service types corresponding to the requirements, extracting keyword information based on the current college data management requirements, and generating requirement characteristics according to the service types and the service semantic characteristics by taking word vectors corresponding to the keyword information as service semantic characteristics;
S204, establishing a retrieval task in a college data knowledge graph according to the demand features, importing the demand features into a low-dimensional vector space, and calculating the similarity by calculating Euclidean distances of the demand feature vectors and the entity node vectors in the low-dimensional vector space;
S206, obtaining entity node vectors with similarity meeting preset standards, marking, constructing a characteristic data set according to marked entity nodes, and distributing different weights to the marked entity nodes through a multichannel attention mechanism;
S208, setting weight factors according to the similarity between the demand feature vectors and the entity node vectors, acquiring feature data of corresponding channels through the weight factors, updating features in the channels by using a self-attention mechanism in different channels, and distributing different weights according to importance degrees;
and S210, learning and representing the feature data sets distributed with different weights by using a graph convolution neural network, acquiring a related graph structure according to the connection relation of the entity nodes, extracting data features by using a neighbor aggregation and message transmission mechanism, and acquiring potential data connections.
It should be noted that the real college heterogeneous data output after feature aggregation by the graph convolutional neural network forms a sequence signal; a temporal convolutional network model is trained, and the sequence signal is input into the temporal convolutional network. Within the temporal convolutional network, dilated convolution with residual connections is applied to the input sequence signal until all convolution calculations are finished. The dilated convolution is calculated as:
$$F(s) = \sum_{i=0}^{k-1} f(i)\, x_{s - d \cdot i}$$
where $F(s)$ is the dilated convolution result at position $s$ of the input sequence signal, $f$ is the convolution kernel, $k$ is the one-dimensional convolution kernel size, $i$ indexes the convolution kernel elements, $x$ is the one-dimensional sequence, $x_{s - d \cdot i}$ is the lower-layer value from which the upper-layer neuron is calculated, and $d$ is the dilation factor;
the time features of the college heterogeneous data are output through a fully connected layer; data management analysis is then carried out according to the data features and time features of the college heterogeneous data to fulfil the current college data management requirement.
The graph convolutional neural network performs feature aggregation and feature extraction over the graph structure and the neighbor nodes of each entity node, and the aggregated data features are input into the temporal convolutional network. Introducing dilated convolution allows longer sequence signals to be convolved without increasing the amount of calculation, and the residual connections effectively reduce the training error. The temporal correlation is analysed jointly with the structural features extracted by the graph convolutional neural network to fulfil the current college data management requirement.
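The dilated convolution with residual connection described above can be sketched in plain Python. The sketch assumes the usual causal form, in which the output at position s sums kernel weight i times the input at position s - d*i, with zero padding before the start of the sequence; the ReLU activation, the kernel values and the function names are illustrative assumptions.

```python
def dilated_causal_conv(x, kernel, dilation):
    """Output[s] = sum_i kernel[i] * x[s - dilation * i], zero-padded on the left."""
    out = []
    for s in range(len(x)):
        acc = 0.0
        for i, w in enumerate(kernel):
            idx = s - dilation * i
            if idx >= 0:  # positions before the sequence start contribute zero
                acc += w * x[idx]
        out.append(acc)
    return out

def residual_block(x, kernel, dilation):
    """Dilated convolution, ReLU, then add the block input (residual connection)."""
    conv = dilated_causal_conv(x, kernel, dilation)
    activated = [max(0.0, v) for v in conv]
    return [a + b for a, b in zip(activated, x)]

# Hypothetical sequence signal produced by the graph-level aggregation.
signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [0.5, 0.5]
layer1 = residual_block(signal, kernel, dilation=1)
layer2 = residual_block(layer1, kernel, dilation=2)
```

Stacking blocks with growing dilation widens the receptive field while each layer still performs only `len(kernel)` multiplications per output position, which is the property the text relies on.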
FIG. 3 shows a flow chart of a method for data integration and data sharing of multi-source heterogeneous data in a college by a college data sharing model.
According to the embodiment of the invention, a college data sharing model is constructed according to the data characteristics and the corresponding graph structure of the college multi-source heterogeneous data, and the college multi-source heterogeneous data is integrated and shared by the college data sharing model, specifically:
S302, acquiring data characteristics and a graph structure corresponding to the current college data management requirement, and judging whether data information corresponding to the current college data management requirement can be shared or partially shared according to the data characteristics;
S304, extracting the shareable college heterogeneous data according to the data features, acquiring the corresponding entity nodes, and generating a shared entity node set;
S306, constructing a deep-learning-based college data sharing model, training it with the data features and the graph structure, and setting the sharing order of the entity nodes in the shared entity node set according to their weights, wherein the higher the weight, the higher the sharing priority;
S308, calculating the Mahalanobis distance between the entity nodes in the shared entity node set, taking the Mahalanobis distance as the overhead information of data integration, taking the entity node with the highest priority as the starting point, and generating an integration path for the college heterogeneous data through a node walk according to the minimum-overhead principle;
S310, integrating the shareable college heterogeneous data along the integration path according to the preset data standard of the sharing object, and sharing the data with the sharing object by a preset method.
It should be noted that the Mahalanobis distance between the entity nodes in the shared entity node set is calculated as follows:
$$D(x_i, x_j) = \sqrt{(x_i - x_j)^{T}\, \Sigma^{+}\, (x_i - x_j)}$$
where $x_i$ and $x_j$ are the position information of entity nodes $i$ and $j$ in the low-dimensional space, $D(x_i, x_j)$ is the Mahalanobis distance between entity nodes $i$ and $j$, $\Sigma$ is the covariance matrix of the marked entity nodes, the superscript $+$ denotes the generalized inverse of the covariance matrix, and $T$ denotes the matrix transpose.
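As an illustration of the Mahalanobis-distance overhead and the minimum-overhead path, the following sketch uses NumPy. It is an assumption-laden example rather than the patented procedure: the node walk is approximated by a greedy nearest-unvisited-node rule, the covariance is estimated from the same node positions, and the positions, weights and function names are hypothetical.

```python
import numpy as np

def mahalanobis_matrix(positions):
    """Pairwise Mahalanobis distances using the pseudo-inverse of the covariance."""
    cov = np.atleast_2d(np.cov(positions, rowvar=False))
    cov_pinv = np.linalg.pinv(cov)  # generalized (Moore-Penrose) inverse
    diff = positions[:, None, :] - positions[None, :, :]
    sq = np.einsum("ijk,kl,ijl->ij", diff, cov_pinv, diff)
    return np.sqrt(np.maximum(sq, 0.0))  # clip tiny negative rounding errors

def greedy_integration_path(dist, weights):
    """Start at the highest-weight node, then always move to the cheapest unvisited node."""
    current = int(np.argmax(weights))
    path = [current]
    unvisited = set(range(len(weights))) - {current}
    while unvisited:
        current = min(unvisited, key=lambda j: dist[current, j])
        path.append(current)
        unvisited.remove(current)
    return path

# Hypothetical low-dimensional positions and sharing weights of four entity nodes.
positions = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [1.2, 0.1]])
weights = [0.1, 0.5, 0.2, 0.2]
dist = mahalanobis_matrix(positions)
path = greedy_integration_path(dist, weights)
```

The pseudo-inverse keeps the distance defined even when the covariance matrix is singular, which is presumably why the text specifies a generalized inverse.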
According to the embodiment of the invention, the existing regulation system of the university is evaluated according to the fusion analysis of the multi-source heterogeneous data of the university, and the method specifically comprises the following steps:
acquiring multi-source heterogeneous data of colleges and universities generated by different data sources under a target regulation system, and generating corresponding data characteristics by analyzing and acquiring graph structure information and time characteristics of the multi-source heterogeneous data;
matching the data features with corresponding user identity category information, and carrying out principal component analysis on the data features corresponding to different user identity categories to obtain parameter information with highest contribution degree as principal component direction;
generating constraint information of different user identities based on a target regulation system, and projecting characteristic signals in the main component direction according to data characteristics corresponding to different user identity categories to obtain characteristic scatter diagrams of the different user identity categories;
Determining a constraint range in a feature scatter diagram according to the constraint information, counting the scatter points of each user identity category falling into the constraint range, comparing and judging the scatter points with a preset scatter point threshold value, and acquiring deviation to evaluate a target regulation system according to a preset deviation interval;
and when the evaluation result of the target regulation system does not meet the preset standard, carrying out adaptive adjustment on the target regulation system according to the deviation.
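The evaluation steps above can be sketched with NumPy as follows. The sketch makes several assumptions that the text leaves open: the principal direction is fitted on the pooled data of all identity categories, the constraint range is an interval on the projected axis, the deviation is the shortfall of the in-range fraction against a preset threshold, and all names and sample values are hypothetical.

```python
import numpy as np

def principal_direction(data):
    """First principal component (largest-variance direction) of the data."""
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    direction = eigvecs[:, -1]
    # Fix the sign so the result does not depend on the solver's choice.
    if direction[np.argmax(np.abs(direction))] < 0:
        direction = -direction
    return direction

def evaluate_category(features, mean, direction, constraint, min_fraction):
    """Return (fraction of projected points inside the range, deviation from threshold)."""
    scores = (features - mean) @ direction
    lo, hi = constraint
    inside = float(np.mean((scores >= lo) & (scores <= hi)))
    return inside, float(min_fraction - inside)

# Hypothetical data features for two user identity categories.
students = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [10.0, 10.0]])
teachers = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, -1.0]])
pooled = np.vstack([students, teachers])
direction = principal_direction(pooled)
fraction, deviation = evaluate_category(
    students, pooled.mean(axis=0), direction, constraint=(-3.0, 3.0), min_fraction=0.9
)
```

A positive deviation means fewer points fall inside the constraint range than the threshold requires, which would trigger the adjustment described in the last step.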
Fig. 4 shows a block diagram of an integrated management system for multi-source heterogeneous data in colleges and universities according to the present invention.
The second aspect of the present invention also provides an integrated management system 4 for multi-source heterogeneous data in a college, the system comprising: the memory 41 and the processor 42, wherein the memory comprises an integrated management method program of multi-source heterogeneous data of a college, and the integrated management method program of multi-source heterogeneous data of the college realizes the following steps when being executed by the processor:
acquiring multi-source heterogeneous data of a college, classifying the multi-source heterogeneous data of the college according to service types and user identity categories corresponding to the data, and fusing and standardizing heterogeneous data with different sources;
constructing a college data knowledge graph from the standardized multi-source heterogeneous data of the colleges and universities, classifying the relationship of the multi-source heterogeneous data of the colleges and universities based on the college knowledge graph, and vectorizing the college data knowledge graph by a graph representation method;
Generating demand characteristics according to data management demands of universities, selecting characteristic data sets according to characteristic selection strategies based on the demand characteristics, generating a related graph structure according to the characteristic data sets, acquiring potential data connection according to the graph structure, and carrying out data management in combination with time characteristics;
meanwhile, a college data sharing model is built according to the data characteristics and the corresponding graph structure of the college multi-source heterogeneous data, and the college multi-source heterogeneous data is subjected to data integration and data sharing through the college data sharing model.
It should be noted that college heterogeneous data are acquired from different data sources, marked according to the service labels of those data sources, and aggregated under the same service label to realize service classification of the college multi-source heterogeneous data. Because the same entity may be named in different languages or expressed in different ways in different data, related knowledge of the service field is acquired for each service label, the different expressions of related terms are determined from that knowledge, equivalent concepts are determined from the different expressions of the same related term, and the related-term entities of the college heterogeneous data under the same service label are unified on the basis of the equivalent concepts, so that the college heterogeneous data in each data source are matched and entity resolution is achieved. Data conflict detection is performed after the related-term entities are unified. Ontology information corresponding to each kind of data in the college heterogeneous data knowledge graph is preset on the basis of the college's current data management state and habit information; the similarity between the unified entities of the college heterogeneous data from different data sources and the preset ontology is calculated, and ontology-entity matching is performed according to the similarity. The college multi-source heterogeneous data are associated and fused according to the matching result and the preset ontology, meaningless entities are deleted, and user identity category labels are set using the user identity category information corresponding to the college multi-source heterogeneous data, the user identity categories including teacher, student, employee and the like. The labelled college multi-source heterogeneous data are matched with the preset ontology through their entities to construct a college heterogeneous data knowledge base.
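The term unification and ontology-entity matching described above might look like the following sketch. Everything concrete here is an assumption for illustration: the equivalence table, the ontology concept names, the use of character-level similarity from the standard library's `difflib`, and the 0.8 threshold.

```python
from difflib import SequenceMatcher

# Hypothetical equivalent-concept table for one service label.
EQUIVALENT_TERMS = {
    "stu_no": "student_id",
    "学号": "student_id",
    "student number": "student_id",
}
# Hypothetical preset ontology concepts.
ONTOLOGY = ["student_id", "course_name", "teacher_id"]

def unify(term):
    """Map different expressions of the same term to one canonical entity name."""
    key = term.strip().lower()
    return EQUIVALENT_TERMS.get(key, key)

def match_ontology(term, ontology, min_ratio=0.8):
    """Return the most similar ontology concept, or None if below the threshold."""
    best = max(ontology, key=lambda c: SequenceMatcher(None, term, c).ratio())
    ratio = SequenceMatcher(None, term, best).ratio()
    return best if ratio >= min_ratio else None

unified = [unify(t) for t in ["Stu_No", "学号", "course_nme"]]
matches = [match_ontology(t, ONTOLOGY) for t in unified]
```

Entities that match no ontology concept (a `None` result) correspond to the meaningless entities that the text says are deleted during fusion.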
It should be noted that a data mapping is established according to the college heterogeneous data knowledge base to convert the data into triplet form, the attribute information corresponding to the entities in the knowledge base is extracted, and correspondences between items of entity attribute information are established through the preset ontology to construct a relation table. The number of items of entity attribute information associated with each item is obtained by querying the relation table, encoding is performed according to that number and the corresponding data ID, and the encoded information is represented graphically. The entities and relations in the college heterogeneous data knowledge base are generated in triplet form and mapped, the college knowledge graph is constructed graphically, and a representation of the college knowledge graph is learned through a graph convolutional neural network. $G = (V, E)$ denotes the college data knowledge graph, where $V$ is the set of entity nodes in the college data knowledge graph and $E$ is the set of entity node relations. The entity nodes in the college knowledge graph are vectorized, the graph convolutional neural network aggregates neighbor node feature vectors into each entity node's own vector, and the updated node features are propagated through convolution calculation. Entity node $v$ is represented through a neighbor entity node aggregation mechanism, specifically:
$$h_v = \sigma\Big(\sum_{u \in N(v)} W_{vu}\, h_u\Big)$$
where $h_v$ is the vectorized representation of entity node $v$ after aggregation over its neighbor entity nodes, $\sigma$ is an activation function, $W_{vu}$ is the weight matrix between entity node $v$ and its neighbor entity node, $h_u$ is the vector representation of a neighbor entity node of entity node $v$, and $N(v)$ is the set of neighbor entity nodes of $v$.
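To make the triplet mapping and the neighbor aggregation concrete, here is a small pure-Python sketch. It is an assumption-based illustration: a single weight matrix is shared across all node pairs for brevity (the text describes a weight matrix between a node and its neighbor), ReLU is chosen as the activation function, edges are treated as undirected, and the triples and embeddings are made up.

```python
from collections import defaultdict

def build_neighbors(triples):
    """Treat every (head, relation, tail) triple as an undirected edge."""
    neighbors = defaultdict(list)
    for head, _relation, tail in triples:
        neighbors[head].append(tail)
        neighbors[tail].append(head)
    return neighbors

def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def aggregate(node, neighbors, embeddings, W):
    """New vector for `node`: ReLU of the sum of W @ h_u over its neighbors u."""
    total = [0.0] * len(W)
    for u in neighbors[node]:
        total = [t + m for t, m in zip(total, matvec(W, embeddings[u]))]
    return [max(0.0, t) for t in total]

# Hypothetical triples and 2-D entity embeddings.
triples = [
    ("alice", "enrolled_in", "math101"),
    ("bob", "enrolled_in", "math101"),
    ("math101", "taught_by", "prof_li"),
]
embeddings = {
    "alice": [1.0, 0.0],
    "bob": [0.0, 1.0],
    "math101": [1.0, 1.0],
    "prof_li": [2.0, -1.0],
}
W = [[0.5, 0.0], [0.0, 0.5]]
neighbors = build_neighbors(triples)
updated = aggregate("math101", neighbors, embeddings, W)
```

Applying `aggregate` to every node yields one propagation step; repeating it lets information travel across multi-hop relations in the knowledge graph.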
According to the embodiment of the invention, demand characteristics are generated according to the data management demand of a college, a characteristic data set is selected according to a characteristic selection strategy based on the demand characteristics, a related graph structure is generated according to the characteristic data set, potential data connection is acquired according to the graph structure, and data management is performed by combining time characteristics, specifically:
acquiring current college data management requirements, extracting service types corresponding to the requirements, extracting keyword information based on the current college data management requirements, and generating requirement characteristics according to the service types and the service semantic characteristics by taking word vectors corresponding to the keyword information as the service semantic characteristics;
establishing a retrieval task in a college data knowledge graph according to the demand characteristics, importing the demand characteristics into a low-dimensional vector space, and calculating similarity by calculating Euclidean distance between demand characteristic vectors and entity node vectors in the low-dimensional vector space;
obtaining entity node vectors with similarity meeting preset standards, marking, constructing a characteristic data set according to marked entity nodes, and distributing different weights to the marked entity nodes through a multichannel attention mechanism;
Setting weight factors according to the similarity between the demand feature vectors and the entity node vectors, acquiring feature data of corresponding channels through the weight factors, updating the features in the channels by using a self-attention mechanism in different channels, and distributing different weights according to the importance degree;
and learning and representing the feature data sets distributed with different weights by using a graph convolution neural network, acquiring a related graph structure according to the connection relation of the entity nodes, extracting data features by using a neighbor aggregation and message transmission mechanism, and acquiring potential data connection.
It should be noted that the real college heterogeneous data output after feature aggregation by the graph convolutional neural network forms a sequence signal; a temporal convolutional network model is trained, and the sequence signal is input into the temporal convolutional network. Within the temporal convolutional network, dilated convolution with residual connections is applied to the input sequence signal until all convolution calculations are finished. The dilated convolution is calculated as:
$$F(s) = \sum_{i=0}^{k-1} f(i)\, x_{s - d \cdot i}$$
where $F(s)$ is the dilated convolution result at position $s$ of the input sequence signal, $f$ is the convolution kernel, $k$ is the one-dimensional convolution kernel size, $i$ indexes the convolution kernel elements, $x$ is the one-dimensional sequence, $x_{s - d \cdot i}$ is the lower-layer value from which the upper-layer neuron is calculated, and $d$ is the dilation factor;
the time features of the college heterogeneous data are output through a fully connected layer; data management analysis is then carried out according to the data features and time features of the college heterogeneous data to fulfil the current college data management requirement.
The graph convolutional neural network performs feature aggregation and feature extraction over the graph structure and the neighbor nodes of each entity node, and the aggregated data features are input into the temporal convolutional network. Introducing dilated convolution allows longer sequence signals to be convolved without increasing the amount of calculation, and the residual connections effectively reduce the training error. The temporal correlation is analysed jointly with the structural features extracted by the graph convolutional neural network to fulfil the current college data management requirement.
According to the embodiment of the invention, a college data sharing model is constructed according to the data characteristics and the corresponding graph structure of the college multi-source heterogeneous data, and the college multi-source heterogeneous data is integrated and shared by the college data sharing model, specifically:
acquiring a data characteristic and a graph structure corresponding to the current college data management requirement, and judging whether the data information corresponding to the current college data management requirement can be shared or partially shared according to the data characteristic;
Extracting college heterogeneous data capable of sharing from the data characteristics, acquiring corresponding entity nodes according to the college heterogeneous data capable of sharing, and generating a shared entity node set;
constructing a college data sharing model based on deep learning, training by utilizing the data characteristics and the graph structure, and setting the sharing sequence according to the weight of each entity node in a shared entity node set, wherein the higher the weight is, the higher the sharing priority is;
calculating the mahalanobis distance between each entity node in the shared entity node set, taking the mahalanobis distance as overhead information of data integration, taking the entity node with the highest priority as a starting point, and generating an integration path of college heterogeneous data through node migration according to the minimum overhead principle;
and integrating the data of the college heterogeneous data which can be shared according to the integration path according to the preset data standard of the shared object, and carrying out data sharing on the shared object by a preset method.
It should be noted that the Mahalanobis distance between the entity nodes in the shared entity node set is calculated as follows:
$$D(x_i, x_j) = \sqrt{(x_i - x_j)^{T}\, \Sigma^{+}\, (x_i - x_j)}$$
where $x_i$ and $x_j$ are the position information of entity nodes $i$ and $j$ in the low-dimensional space, $D(x_i, x_j)$ is the Mahalanobis distance between entity nodes $i$ and $j$, $\Sigma$ is the covariance matrix of the marked entity nodes, the superscript $+$ denotes the generalized inverse of the covariance matrix, and $T$ denotes the matrix transpose.
The third aspect of the present application also provides a computer readable storage medium, where the computer readable storage medium includes a method program for integrated management of multi-source heterogeneous data in a college, where the method program for integrated management of multi-source heterogeneous data in a college implements the steps of the method for integrated management of multi-source heterogeneous data in a college according to any one of the above steps when executed by a processor.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The device embodiments described above are merely illustrative; for example, the division of the units is only a logical functional division, and other divisions are possible in practice: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical or in other forms.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; can be located in one place or distributed to a plurality of network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present invention may be integrated in one processing unit, or each unit may be separately used as one unit, or two or more units may be integrated in one unit; the integrated units may be implemented in hardware or in hardware plus software functional units.
Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the above method embodiments may be implemented by hardware related to program instructions, and the foregoing program may be stored in a computer readable storage medium, where the program, when executed, performs steps including the above method embodiments; and the aforementioned storage medium includes: a mobile storage device, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk or an optical disk, or the like, which can store program codes.
Alternatively, the above-described integrated units of the present invention may be stored in a computer-readable storage medium if implemented in the form of software functional modules and sold or used as separate products. Based on such understanding, the technical solutions of the embodiments of the present invention may be embodied in essence or a part contributing to the prior art in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, ROM, RAM, magnetic or optical disk, or other medium capable of storing program code.
The foregoing is merely illustrative of the present invention, and the protection scope of the present invention is not limited thereto; any variation or substitution readily conceivable by a person skilled in the art falls within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (7)

1. The integrated management method for the multi-source heterogeneous data of the universities is characterized by comprising the following steps of:
acquiring multi-source heterogeneous data of a college, classifying the multi-source heterogeneous data of the college according to service types and user identity categories corresponding to the data, and fusing and standardizing heterogeneous data with different sources;
constructing a college data knowledge graph from the standardized multi-source heterogeneous data of the colleges and universities, classifying the relationship of the multi-source heterogeneous data of the colleges and universities based on the college knowledge graph, and vectorizing the college data knowledge graph by a graph representation method;
generating demand characteristics according to data management demands of universities, selecting characteristic data sets according to characteristic selection strategies based on the demand characteristics, generating a related graph structure according to the characteristic data sets, acquiring potential data connection according to the graph structure, and carrying out data management in combination with time characteristics;
meanwhile, constructing a college data sharing model according to the data characteristics and the corresponding graph structure of the college multi-source heterogeneous data, and carrying out data integration and data sharing on the college multi-source heterogeneous data through the college data sharing model;
generating demand characteristics according to data management demands of universities, selecting characteristic data sets according to characteristic selection strategies based on the demand characteristics, generating a related graph structure according to the characteristic data sets, acquiring potential data connection according to the graph structure, and carrying out data management by combining time characteristics, wherein the method specifically comprises the following steps:
Acquiring current college data management requirements, extracting service types corresponding to the requirements, extracting keyword information based on the current college data management requirements, and generating requirement characteristics according to the service types and the service semantic characteristics by taking word vectors corresponding to the keyword information as the service semantic characteristics;
establishing a retrieval task in a college data knowledge graph according to the demand characteristics, importing the demand characteristics into a low-dimensional vector space, and calculating similarity by calculating Euclidean distance between demand characteristic vectors and entity node vectors in the low-dimensional vector space;
obtaining entity node vectors with similarity meeting preset standards, marking, constructing a characteristic data set according to marked entity nodes, and distributing different weights to the marked entity nodes through a multichannel attention mechanism;
setting weight factors according to the similarity between the demand feature vectors and the entity node vectors, acquiring feature data of corresponding channels through the weight factors, updating the features in the channels by using a self-attention mechanism in different channels, and distributing different weights according to the importance degree;
learning and representing the feature data sets distributed with different weights by using a graph convolution neural network, acquiring a related graph structure according to the connection relation of entity nodes, extracting data features by using a neighbor aggregation and message transmission mechanism, and acquiring potential data connection;
Constructing a college data sharing model according to the data characteristics and the corresponding graph structure of the college multi-source heterogeneous data, and integrating and sharing the data of the college multi-source heterogeneous data through the college data sharing model, wherein the method specifically comprises the following steps:
acquiring a data characteristic and a graph structure corresponding to the current college data management requirement, and judging whether the data information corresponding to the current college data management requirement can be shared or partially shared according to the data characteristic;
extracting college heterogeneous data capable of sharing from the data characteristics, acquiring corresponding entity nodes according to the college heterogeneous data capable of sharing, and generating a shared entity node set;
constructing a college data sharing model based on deep learning, training by utilizing the data characteristics and the graph structure, and setting the sharing sequence according to the weight of each entity node in a shared entity node set, wherein the higher the weight is, the higher the sharing priority is;
calculating the mahalanobis distance between each entity node in the shared entity node set, taking the mahalanobis distance as overhead information of data integration, taking the entity node with the highest priority as a starting point, and generating an integration path of college heterogeneous data through node migration according to the minimum overhead principle;
And integrating the data of the college heterogeneous data which can be shared according to the integration path according to the preset data standard of the shared object, and carrying out data sharing on the shared object by a preset method.
2. The integrated management method of multi-source heterogeneous data in colleges and universities according to claim 1, wherein the multi-source heterogeneous data in colleges and universities are classified according to service types and user identity categories corresponding to the data, and fusion and standardization processing are performed on heterogeneous data with different sources, specifically:
college heterogeneous data of different data sources are obtained, the college heterogeneous data of the different data sources are subjected to data marking according to service labels of the data sources, and data aggregation is carried out under the same service label to realize college multi-source heterogeneous data service classification;
acquiring relevant knowledge in the service field according to each service label, determining different expression modes of relevant terms through the relevant knowledge, determining equivalent concepts according to different expression modes of the same relevant terms, and unifying entities of the relevant terms based on the equivalent concepts by college heterogeneous data under the same service label;
performing data conflict detection after the related-term entities are unified, calculating the similarity between the unified entities of the college heterogeneous data from different data sources and a preset ontology, and performing ontology-entity matching according to the similarity;
performing association and fusion on the college multi-source heterogeneous data according to the matching result and the preset ontology, deleting meaningless entities, and setting user identity category labels using the user identity category information corresponding to the college multi-source heterogeneous data;
and matching the labelled college multi-source heterogeneous data with the preset ontology through their entities to construct a college heterogeneous data knowledge base.
3. The integrated management method for multi-source heterogeneous data of colleges and universities according to claim 1, wherein the relationship classification is performed on the multi-source heterogeneous data of the colleges and universities based on the knowledge graph of the colleges and universities, and the knowledge graph of the data of the colleges and universities is vectorized by a graph representation method, specifically comprising:
establishing data mapping according to a college heterogeneous data knowledge base, converting data into a triplet form, extracting attribute information corresponding to entities in the college heterogeneous data knowledge base, and establishing a corresponding entity attribute information through a preset ontology to construct a relation table;
acquiring the number of the entity attribute information associated with each entity attribute information by inquiring the relation table, encoding according to the number of the entity attribute information and the corresponding data ID, and representing the encoded information in a graphical mode;
Generating a triplet form of entities and relations in a college heterogeneous data knowledge base, mapping, constructing a college knowledge graph in a graphical mode, and learning and representing the college knowledge graph through a graph convolution neural network;
$G = (V, E)$ denotes the college data knowledge graph, where $V$ represents the set of entity nodes in the college data knowledge graph and $E$ represents the set of entity node relations;
and vectorizing entity nodes in the college knowledge graph, aggregating neighbor node feature vectors into own entity node vectors by using a graph convolution neural network, and carrying out information propagation on updated node features through convolution calculation.
4. The integrated management method for multi-source heterogeneous data of colleges according to claim 1, further comprising obtaining time features of the college heterogeneous data corresponding to the marked entity nodes, specifically:
forming a sequence signal from the college heterogeneous data output by graph convolutional neural network feature aggregation, training a temporal convolutional network model, and inputting the sequence signal into the temporal convolutional network;
performing dilated convolution with residual connections on the input sequence signal in the temporal convolutional network until all convolution calculations are finished, and outputting the time features of the college heterogeneous data through a fully connected layer;
and performing data management analysis according to the data features and time features of the college heterogeneous data to fulfil the current college data management requirement.
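The temporal step of claim 4 (dilated convolution with residual connections, followed by a fully connected layer) can be sketched for a single-channel sequence as below. The kernel values, the doubling dilation schedule, the ReLU activation and the layer sizes are all assumptions of this sketch; the claim fixes none of them, and the weights here are untrained.

```python
import numpy as np

def causal_dilated_conv(x, kernel, dilation):
    """Causal 1-D convolution: out[t] uses only x[t], x[t-d], x[t-2d], ..."""
    k = len(kernel)
    pad = (k - 1) * dilation
    padded = np.concatenate([np.zeros(pad), x])   # left-pad so no future value leaks in
    out = np.zeros(len(x), dtype=float)
    for t in range(len(x)):
        for j in range(k):
            out[t] += kernel[j] * padded[pad + t - j * dilation]
    return out

def residual_block(x, kernel, dilation):
    """Residual connection: the block input is added to the activated convolution."""
    return x + np.maximum(causal_dilated_conv(x, kernel, dilation), 0.0)

def temporal_features(x, kernels, fc_weight, fc_bias):
    """Stack residual blocks with dilation 1, 2, 4, ... then apply a dense layer."""
    h = np.asarray(x, dtype=float)
    for level, kernel in enumerate(kernels):
        h = residual_block(h, kernel, dilation=2 ** level)
    return fc_weight @ h + fc_bias

# Hypothetical sequence signal and untrained parameters.
signal = np.sin(np.linspace(0.0, 3.0, 16))
kernels = [np.array([0.5, 0.5])] * 3
rng = np.random.default_rng(0)
fc_weight, fc_bias = rng.normal(size=(4, 16)), np.zeros(4)
time_features = temporal_features(signal, kernels, fc_weight, fc_bias)
```

Doubling the dilation at each level lets three two-tap kernels cover a receptive field of eight time steps while keeping every output causal.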
5. An integrated management system for multi-source heterogeneous data of colleges, characterized in that the system comprises a memory and a processor, wherein the memory stores an integrated management method program for multi-source heterogeneous data of colleges, and the program, when executed by the processor, implements the following steps:
acquiring college multi-source heterogeneous data, classifying it according to the service types and user identity categories corresponding to the data, and fusing and standardizing the heterogeneous data from different sources;
constructing a college data knowledge graph from the standardized college multi-source heterogeneous data, performing relationship classification on the college multi-source heterogeneous data based on the college knowledge graph, and vectorizing the college data knowledge graph by a graph representation method;
generating demand features according to the college data management requirement, selecting a feature data set according to a feature selection strategy based on the demand features, generating a related graph structure from the feature data set, obtaining potential data connections from the graph structure, and performing data management in combination with time features;
and constructing a college data sharing model according to the data features and the corresponding graph structure of the college multi-source heterogeneous data, and performing data integration and data sharing on the college multi-source heterogeneous data through the college data sharing model;
wherein generating demand features according to the college data management requirement, selecting a feature data set according to a feature selection strategy based on the demand features, generating a related graph structure from the feature data set, obtaining potential data connections from the graph structure, and performing data management in combination with time features specifically comprises:
acquiring the current college data management requirement, extracting the service type corresponding to the requirement, extracting keyword information from the requirement, taking the word vectors corresponding to the keyword information as service semantic features, and generating the demand features from the service type and the service semantic features;
establishing a retrieval task in the college data knowledge graph according to the demand features, importing the demand features into a low-dimensional vector space, and computing similarity from the Euclidean distance between the demand feature vector and the entity node vectors in the low-dimensional vector space;
obtaining and marking the entity node vectors whose similarity meets a preset standard, constructing the feature data set from the marked entity nodes, and assigning different weights to the marked entity nodes through a multichannel attention mechanism;
setting weight factors according to the similarity between the demand feature vector and the entity node vectors, obtaining the feature data of the corresponding channel through the weight factors, updating the features within each channel by using a self-attention mechanism, and assigning different weights according to importance;
learning and representing the weighted feature data set by using a graph convolutional neural network, obtaining the related graph structure from the connection relations of the entity nodes, extracting data features by using neighbor aggregation and message passing mechanisms, and obtaining potential data connections;
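The retrieval and weighting steps above can be illustrated with a minimal sketch. It assumes the "preset standard" is a maximum Euclidean distance, and it replaces the multichannel attention mechanism with a plain softmax over negative distances, which is a deliberate simplification and not the mechanism the claim recites. All vectors and the threshold are hypothetical.

```python
import numpy as np

def retrieve_nodes(demand_vec, node_vecs, max_distance):
    """Mark entity nodes whose Euclidean distance to the demand vector is small enough."""
    dists = np.linalg.norm(node_vecs - demand_vec, axis=1)
    marked = np.flatnonzero(dists <= max_distance)   # indices of marked nodes
    return marked, dists[marked]

def softmax_weights(dists):
    """Closer nodes get larger weights; stand-in for the attention-based weighting."""
    if dists.size == 0:
        return dists
    scores = -dists
    scores = scores - scores.max()                   # shift for numerical stability
    w = np.exp(scores)
    return w / w.sum()

# Hypothetical low-dimensional embeddings.
node_vecs = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 4.0], [0.5, 0.5]])
demand_vec = np.array([0.0, 0.0])
marked, dists = retrieve_nodes(demand_vec, node_vecs, max_distance=1.5)
weights = softmax_weights(dists)
```

Here the node at distance 5 is excluded, and the remaining three nodes receive weights that sum to one, with the nearest node weighted most heavily.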
wherein constructing a college data sharing model according to the data features and the corresponding graph structure of the college multi-source heterogeneous data, and performing data integration and data sharing on the college multi-source heterogeneous data through the college data sharing model specifically comprises:
acquiring the data features and graph structure corresponding to the current college data management requirement, and judging from the data features whether the data information corresponding to the requirement can be shared in whole or in part;
extracting the shareable college heterogeneous data from the data features, obtaining the corresponding entity nodes, and generating a shared entity node set;
constructing the college data sharing model based on deep learning, training it with the data features and the graph structure, and setting the sharing order according to the weight of each entity node in the shared entity node set, a higher weight giving a higher sharing priority;
calculating the Mahalanobis distance between the entity nodes in the shared entity node set, taking the Mahalanobis distance as the overhead of data integration, and, starting from the entity node with the highest priority, generating an integration path for the college heterogeneous data through node migration according to the minimum-overhead principle;
and integrating the shareable college heterogeneous data along the integration path according to the preset data standard of the sharing object, and sharing the data with the sharing object by a preset method.
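The integration-path step of claim 5 can be sketched as follows. The sketch reads the "minimum-overhead principle" as a greedy nearest-neighbor walk, which is one possible interpretation and not confirmed by the claim; the covariance is estimated from the node vectors themselves, and the vectors and weights are hypothetical.

```python
import numpy as np

def mahalanobis_matrix(vectors):
    """Pairwise Mahalanobis distances, using the sample covariance of the vectors."""
    cov = np.atleast_2d(np.cov(vectors, rowvar=False))
    inv_cov = np.linalg.pinv(cov)                    # pseudo-inverse tolerates singular cov
    diff = vectors[:, None, :] - vectors[None, :, :]
    sq = np.einsum("ijk,kl,ijl->ij", diff, inv_cov, diff)
    return np.sqrt(np.maximum(sq, 0.0))              # clip tiny negative rounding errors

def integration_path(vectors, weights):
    """Start at the highest-weight node, then repeatedly move to the cheapest unvisited node."""
    cost = mahalanobis_matrix(vectors)
    current = int(np.argmax(weights))
    path = [current]
    unvisited = set(range(len(weights))) - {current}
    while unvisited:
        current = min(unvisited, key=lambda j: cost[path[-1], j])
        path.append(current)
        unvisited.remove(current)
    return path

# Hypothetical shared entity node vectors and sharing weights.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(5, 3))
weights = np.array([0.10, 0.40, 0.20, 0.25, 0.05])
cost = mahalanobis_matrix(vectors)
path = integration_path(vectors, weights)
```

A greedy walk does not guarantee the globally cheapest ordering; if total overhead matters more than speed, an exact or heuristic shortest-path search over the same cost matrix could be substituted.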
6. The integrated management system for multi-source heterogeneous data of colleges according to claim 5, further comprising obtaining time features of the college heterogeneous data corresponding to the marked entity nodes, specifically:
forming a sequence signal from the college heterogeneous data output by graph convolutional neural network feature aggregation, training a temporal convolutional network model, and inputting the sequence signal into the temporal convolutional network;
performing dilated convolution with residual connections on the input sequence signal in the temporal convolutional network until all convolution calculations are finished, and outputting the time features of the college heterogeneous data through a fully connected layer;
and performing data management analysis according to the data features and time features of the college heterogeneous data to fulfil the current college data management requirement.
7. A computer-readable storage medium, characterized in that the computer-readable storage medium stores an integrated management method program for multi-source heterogeneous data of colleges which, when executed by a processor, implements the steps of the integrated management method for multi-source heterogeneous data of colleges according to any one of claims 1 to 4.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310488579.XA CN116186359B (en) 2023-05-04 2023-05-04 Integrated management method, system and storage medium for multi-source heterogeneous data of universities

Publications (2)

Publication Number Publication Date
CN116186359A (en) 2023-05-30
CN116186359B (en) 2023-09-01

Family

ID=86442660

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310488579.XA Active CN116186359B (en) 2023-05-04 2023-05-04 Integrated management method, system and storage medium for multi-source heterogeneous data of universities

Country Status (1)

Country Link
CN (1) CN116186359B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116976808A (en) * 2023-07-21 2023-10-31 中国矿业大学(北京) Multisource heterogeneous coal mine geologic data management system, method, electronic equipment and storage medium
CN116910824B (en) * 2023-08-28 2024-02-06 广东中山网传媒信息科技有限公司 Safety big data analysis method and system based on distributed multi-source measure
CN117235281A (en) * 2023-09-22 2023-12-15 武汉贝塔世纪科技有限公司 Multi-element data management method and system based on knowledge graph technology

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107633075A (en) * 2017-09-22 2018-01-26 吉林大学 A kind of multi-source heterogeneous data fusion platform and fusion method
CN113111244A (en) * 2020-12-31 2021-07-13 绍兴亿都信息技术股份有限公司 Multisource heterogeneous big data fusion system based on traditional Chinese medicine knowledge large-scale popularization
WO2021189971A1 (en) * 2020-10-26 2021-09-30 平安科技(深圳)有限公司 Medical plan recommendation system and method based on knowledge graph representation learning
WO2022041730A1 (en) * 2020-08-28 2022-03-03 康键信息技术(深圳)有限公司 Medical field intention recognition method, apparatus and device, and storage medium
CN114443854A (en) * 2021-12-30 2022-05-06 深圳晶泰科技有限公司 Processing method and device of multi-source heterogeneous data, computer equipment and storage medium
CN114491068A (en) * 2022-01-21 2022-05-13 武汉东湖大数据交易中心股份有限公司 Method and system for constructing knowledge graph of industrial park by fusing multi-source heterogeneous data
CN115525768A (en) * 2022-09-21 2022-12-27 中国电子科技集团公司第十四研究所 Visual construction method and device for domain knowledge graph

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4004832A4 (en) * 2019-07-29 2023-08-09 Pavel Atanasov Systems and methods for multi-source reference class identification, base rate calculation, and prediction
CN111324643B (en) * 2020-03-30 2023-08-29 北京百度网讯科技有限公司 Knowledge graph generation method, relationship mining method, device, equipment and medium
US11461367B2 (en) * 2020-04-13 2022-10-04 Singapore University Of Technology And Design Multi-source data management mechanism and platform

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
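Metadata integration method for university multi-source heterogeneous data environments; 冯勇; 张丽颖; 顾兆旭; 马技; Journal of Liaoning University (Natural Science Edition) (02); full text *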

Also Published As

Publication number Publication date
CN116186359A (en) 2023-05-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant