CN117971790A - Image cloud data sharing method and system - Google Patents



Publication number
CN117971790A
Authority
CN
China
Prior art keywords
data
lake
sharing
medical image
metadata
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311790307.1A
Other languages
Chinese (zh)
Inventor
郑军
徐辉
吴鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Clp Tongshang Digital Technology Shanghai Co ltd
Original Assignee
Clp Tongshang Digital Technology Shanghai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Clp Tongshang Digital Technology Shanghai Co ltd
Priority to CN202311790307.1A
Publication of CN117971790A


Landscapes

  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention relates to the technical field of data processing, and in particular to an image cloud data sharing method and system. The method comprises: collecting and updating first data according to a sharing request and connecting to a data lake table; preprocessing the collected first data to form second data; storing the preprocessed second data into a data lake through a data management method; and sharing the second data with the data demander via Kafka according to the sharing request. The data management method comprises: establishing a data catalog to organize and classify the second data in the data lake; establishing data tags to search and classify the second data; and simplifying metadata, that is, cleaning redundant information from the second data, wherein the metadata comprises basic information of the second data. The first data is basic data, and the second data is medical image main data. The method solves the problem of collecting original data from different data sources in real time, and dynamically updates the data and keeps it consistent when it is shared with a data demander.

Description

Image cloud data sharing method and system
Technical Field
The invention relates to the technical field of data processing, in particular to an image cloud data sharing method and system.
Background
With the rapid development of modern medical imaging technology, medical imaging has become an important auxiliary technology for diagnosis and treatment. The medical image cloud is based on the storage of medical image information, takes medical image cloud computing application services as its core, and is supported by virtualization and big data technology. By means of cloud transmission it provides medical institutions, medical insurance departments, quality control departments and individual examinees with various online cloud services based on medical images.
In a medical image cloud there is a large amount of basic data related to medical images, including patient data, medical technician data, medical equipment data, medical institution data, administrative supervision institution data, the image examination modality dictionary, the image examination department dictionary, and the like. This basic data spans departments, processes, subject areas and systems; after being cleaned, processed and integrated, it forms the main data (master data) of the medical image cloud.
To make full use of main data resources, management of the main data is particularly important. How to collect and store the main data in real time, and how to share and distribute it in a targeted manner, therefore remains a problem to be solved.
Disclosure of Invention
The present application aims to overcome the above drawbacks or problems in the prior art. In a first aspect, the present application provides an image cloud data sharing method, applied to a data processing side, comprising:
collecting and updating first data according to a sharing request and connecting to a data lake table;
preprocessing the collected first data to form second data;
storing the preprocessed second data into a data lake through a data management method; and
sharing the second data with the data demander via message queue middleware according to the sharing request.
Optionally, the data management method includes creating a data directory to organize and classify the second data in the data lake; establishing a data tag to search and classify the second data; simplifying metadata, namely cleaning redundant information of the second data, wherein the metadata comprises basic information of the second data;
the first data is basic data, and the second data is medical image main data.
Optionally, the sharing request is received by a data lake gateway, which receives it through a distributed message queue, an API interface or a JDBC interface; the data lake gateway adopts a decentralized distributed architecture in which the nodes of the data lake gateway cluster are peers, an SDK (software development kit) for accessing the data lake gateway is provided to ETL (extract, transform, load) programs, and ZooKeeper is used to maintain the node list of the data lake gateway cluster.
Optionally, the preprocessed second data is stored to the source-end original layer of the data lake through ETL; the preprocessing comprises:
converting, by means of an adapter, the format of the collected and updated first data into the data lake table format;
cleaning the data by deleting highly repetitive or highly similar data and erroneous data, and supplementing incomplete data; and
labeling the cleaned data and adding metadata information that describes the content and source of the medical image main data.
Optionally, the data organization format of the data lake table metadata is JSON format.
Optionally, the data management method further includes establishing resource searching, and selecting and searching and positioning main data of the medical image by inputting keywords or labels through a search engine; configuring access rights of medical image main data, setting life cycle of the medical image main data, counting use conditions of the medical image main data, and establishing backup and disaster recovery of the medical image main data.
Optionally, the message queue middleware receives the second data stored in the data lake and routes it to a data demander; when the data demander has received the second data matching the sharing request, the message queue middleware deletes the data; when the data demander fails to receive the second data, the message queue middleware retransmits the data or sends it to an error queue.
Optionally, configuring data routing authority, when receiving the sharing request, dynamically matching a corresponding authority policy, and controlling and determining a data range of the data requiring party access request; the authority policy comprises a role, an attribute tag and an authority set, wherein the attribute tag is used for describing or classifying the sensitivity degree and the data type of data, and the corresponding authority set is configured according to the role and the attribute tag.
In a second aspect, the present application provides an image cloud data sharing system, comprising:
The acquisition module is used for acquiring and updating the first data according to the sharing request and connecting the first data with the data lake table;
The integration module is used for preprocessing the acquired first data to form second data;
The storage module is used for storing the preprocessed second data into the data lake through a data management method; the data management method comprises the steps of establishing a data catalog to organize and classify the second data in the data lake; establishing a data tag to search and classify the second data; simplifying metadata, namely cleaning redundant information of the second data, wherein the metadata comprises basic information of the second data; the first data are basic data, and the second data are medical image main data;
And the sharing module is used for sharing the second data to the data demand party based on the message queue middleware according to the sharing request.
In a third aspect, the present application provides an electronic device comprising: a processor, and a memory communicatively coupled to the processor; the memory stores computer-executable instructions; the processor executes computer-executable instructions stored by the memory to implement the method of any one of the first aspects.
In a fourth aspect, the present application provides a computer-readable storage medium storing computer-executable instructions for implementing the method of any one of the first aspects when executed by a processor.
In summary, the application provides an image cloud data sharing method in which a data lake gateway receives the sharing request and serves as the unified working interface for data entering the lake. All fragmented lake-entry operations are converged into the data lake gateway for unified processing, so that the gateway ensures the stability, high performance and security of data entering the lake, solves the problem of collecting original data from different data sources in real time, and stores the data in a unified standard and format. The data lake gateway also acts as a security barrier that handles security issues centrally, improves the real-time performance of data sharing, and can implement request buffering and rate limiting strategies so that the data lake is not overloaded and does not crash. Meanwhile, when data is shared with a data demander, the data is dynamically updated and kept consistent.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the description of the embodiments below are briefly introduced, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a data store according to an embodiment of the present application;
FIG. 2 is a schematic diagram of data sharing according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a medical image primary data sharing system according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Specific embodiments of the present application have been shown by way of the above drawings and will be described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but rather to illustrate the inventive concepts to those skilled in the art by reference to the specific embodiments.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It is to be understood that the described embodiments are preferred embodiments of the invention and should not be taken as excluding other embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without creative efforts, are within the protection scope of the present invention.
In order to clearly describe the technical solution of the embodiments of the present application, in the embodiments of the present application, the words "first", "second", etc. are used to distinguish the same item or similar items having substantially the same function and effect. For example, the first device and the second device are merely for distinguishing between different devices, and are not limited in their order of precedence. It will be appreciated by those of skill in the art that the words "first," "second," and the like do not limit the amount and order of execution, and that the words "first," "second," and the like do not necessarily differ.
In the present application, the words "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
The medical image cloud platform comprises a query the Lakers subsystem, an image data quality control system, a hospital original retrieval subsystem and other subsystems, wherein the subsystems are mutually communicated through an IP network, and the subsystems can be regarded as data demand parties for providing sharing demands for medical image main data.
In order to make full use of main data resources, to collect and store the main data in real time, and to share and distribute it in a targeted manner, the application provides an image cloud data sharing method, which is applied to a data processing side and comprises the following steps:
Collecting and updating first data according to the sharing request and connecting the first data with a data lake table;
Preprocessing the acquired first data to form second data;
storing the preprocessed second data into a data lake through a data management method;
According to the sharing request, sharing the second data to a data demand party based on the message queue middleware;
The data management method comprises the steps of establishing a data catalog to organize and classify the second data in the data lake; establishing a data tag to search and classify the second data; simplifying metadata, namely cleaning redundant information of the second data, wherein the metadata comprises basic information of the second data;
the first data is basic data, and the second data is medical image main data.
Specifically, as shown in fig. 1, conventional main data management uses a relational database as the storage component for the main data. The main data is collected from the service databases by scheduled batch jobs, data integration is performed with the database's SQL, and data sharing is then provided through an enterprise service bus (ESB) and web services. As a result, the real-time performance of data sharing is relatively low, and the database may be overloaded or crash.
The application adopts the form of data lake to store the medical image main data acquired or updated from the data source, and stores the data in the form of lake table; in the traditional data lake architecture, the data entering the data lake in the data source is directly realized through an ETL program, so that the data entering operation is fragmented, the entering modules are repeatedly developed, and unified and effective flow control, fusing and monitoring means are lacked.
For this reason, as shown in fig. 1, in the present application, a sharing request sent by a data demander is received through a data lake gateway, the sharing request is sent to the data lake gateway through a distributed message queue, an API interface or a JDBC interface, and the data lake gateway collects or updates first data according to the sharing request, where the first data is basic data, and the basic data includes, for example, patient data, medical personnel data, medical equipment data, medical institution data, administrative supervision institution data, an image inspection modality dictionary, an image inspection department dictionary, and the like.
The data lake gateway is used as a unified working interface for storing medical image main data into a data lake, fragmented lake entering operation is fully converged to the unified operation of the data lake gateway, and unified data lake entering access control, flow control, fusing and real-time monitoring are then implemented on the basis; the data lake gateway acts as a security barrier and can centrally handle security issues such as performing authentication and authorization, verifying the identity of the user, ensuring that only authorized data-demanding parties share data.
The data lake gateway provides a unified data sharing request interface, and the data sharing party and the data demand party are fully decoupled, so that the real-time performance of data sharing is improved, and meanwhile, request buffering and current limiting strategies can be implemented to ensure that the data lake cannot be overloaded and crashed.
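The request buffering and current limiting strategy described above can be illustrated with a token bucket. The following is a minimal sketch and not the gateway's actual implementation: the class names `TokenBucket` and `BufferedGateway`, the parameters `rate`, `capacity` and `max_buffered`, and the three outcomes are all assumptions introduced for illustration.

```python
import time
from collections import deque


class TokenBucket:
    """Allow roughly `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.last = clock()

    def try_acquire(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class BufferedGateway:
    """Hypothetical gateway front end: admit, buffer, or reject sharing requests."""

    def __init__(self, bucket, max_buffered):
        self.bucket = bucket
        self.buffer = deque()
        self.max_buffered = max_buffered

    def submit(self, request):
        if self.bucket.try_acquire():
            return "admitted"
        if len(self.buffer) < self.max_buffered:
            self.buffer.append(request)  # hold the request until capacity returns
            return "buffered"
        return "rejected"

    def drain(self):
        """Release buffered requests, oldest first, while tokens are available."""
        released = []
        while self.buffer and self.bucket.try_acquire():
            released.append(self.buffer.popleft())
        return released
```

Under these assumptions, a burst beyond the bucket capacity is first buffered and only rejected once the buffer is full, so the data lake behind the gateway sees a bounded request rate.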
Specifically, the data lake gateway adopts a decentralized distributed architecture, which eliminates a central point of failure and performance bottlenecks. Each node in the data lake gateway cluster is a peer node, and ZooKeeper is responsible for maintaining the node list. Upstream ETL programs are provided with an SDK for accessing the data lake gateway, which completes the integration between the software components. In this way the data lake gateway ensures the stability, high performance and security of data entering the lake.
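Because every gateway node is a peer, an SDK on the ETL side can send a request to any node in the list and move on to the next one if a node is unreachable. The sketch below illustrates that idea only. `GatewayClient`, `get_nodes` and `send` are hypothetical names, and the node list is passed in as a plain callable rather than read from ZooKeeper.

```python
import itertools


class GatewayClient:
    """Hypothetical SDK client: rotate over peer gateway nodes, skip failed ones."""

    def __init__(self, get_nodes, send):
        self.get_nodes = get_nodes  # callable returning the current node list
        self.send = send            # callable(node, payload); raises ConnectionError on failure
        self._counter = itertools.count()

    def submit(self, payload):
        nodes = list(self.get_nodes())
        if not nodes:
            raise RuntimeError("no gateway nodes registered")
        start = next(self._counter) % len(nodes)
        errors = []
        # Every node is a peer, so any of them can take the request.
        for offset in range(len(nodes)):
            node = nodes[(start + offset) % len(nodes)]
            try:
                return self.send(node, payload)
            except ConnectionError as exc:
                errors.append((node, exc))
        raise RuntimeError(f"all gateway nodes failed: {errors}")
```

The rotating start index spreads requests across the cluster, and the inner loop gives failover without any node acting as a coordinator.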
After the data lake gateway collects or updates the first data according to the sharing request, the collected first data is preprocessed to form second data, the second data preferably being medical image main data. In the present application, the first data, that is, the basic data, has the characteristics of main data in that it spans departments, processes, subject areas, systems and technologies, and is in effect the medical image main data; after being cleaned, processed and integrated, the basic data forms the main data of the medical image cloud.
When the first data is collected or updated, the data lake gateway connects to the corresponding data lake table, and an adapter converts the format of the first data collected and updated according to the sharing request into the data lake table format. Next, the extracted data is cleaned: highly repetitive or highly similar data and erroneous data are deleted, and incomplete data is supplemented, for example by filling missing values. The cleaned data is then labeled, metadata information describing the content and source of the data is added, and the second data is written to the source-end original layer of the data lake through ETL, which completes the aggregation work.
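The three preprocessing stages (format conversion by an adapter, cleaning, and labeling) can be sketched as a small pipeline. This is an illustrative sketch under assumed inputs: the field names, the `_meta` key, the rule of de-duplicating by a single key column and the default values are hypothetical and are not specified in the application.

```python
def to_lake_row(record, field_map):
    """Adapter step: rename source fields to the lake-table column names."""
    return {lake: record.get(src) for src, lake in field_map.items()}


def clean(rows, key, defaults):
    """Drop rows with no key, drop duplicates by key, fill missing values."""
    seen = set()
    out = []
    for row in rows:
        k = row.get(key)
        if k is None or k in seen:
            continue  # erroneous (no key) or repeated record
        seen.add(k)
        out.append({c: (defaults.get(c) if v is None else v) for c, v in row.items()})
    return out


def label(rows, source, content):
    """Attach metadata describing the content and the source of each row."""
    return [{**row, "_meta": {"source": source, "content": content}} for row in rows]


def preprocess(records, field_map, key, defaults, source, content):
    rows = [to_lake_row(r, field_map) for r in records]
    return label(clean(rows, key, defaults), source, content)
```

For example, calling `preprocess` with patient records from a hypothetical source system named "HIS" yields one labeled row per distinct patient identifier, with missing fields replaced by the configured defaults.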
Preferably, the process of writing data into the data lake is logged for subsequent tracking and auditing. When the data is successfully written into the data lake, statistical information such as the amount of data entering the lake is returned; when the write fails, the reason for the failure is returned.
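The logging and return behaviour of the write step might look like the following sketch. The function name `write_to_lake`, the `writer` callable and the shape of the returned dictionary are assumptions made for illustration.

```python
import logging

logger = logging.getLogger("lake_ingest")


def write_to_lake(rows, writer):
    """Write rows with `writer` and report either row statistics or the failure reason."""
    logger.info("ingest started: %d rows", len(rows))
    try:
        written = writer(rows)  # hypothetical callable that returns the row count
    except Exception as exc:    # any failure is reported back to the caller
        logger.error("ingest failed: %s", exc)
        return {"ok": False, "reason": str(exc)}
    logger.info("ingest finished: %d rows written", written)
    return {"ok": True, "rows_written": written}
```

The log lines serve the tracking and auditing purpose, while the returned dictionary carries the statistics or the failure reason back to the caller.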
Preferably, in the present application, the data lake metadata is organized in JSON format. JSON (JavaScript Object Notation) is a lightweight data exchange format and a flexible, easily extensible way of organizing data; it is easy to read, write and understand, and supports complex data structures. Using JSON can simplify the storage and management of metadata.
Specifically, JSON format allows data to be organized in a hierarchical fashion, and complex metadata structures, such as tree structures, attribute lists, etc., can be represented using objects and arrays for data structuring; the text format of the JSON is very easy to understand and read, so that the storage and maintenance of metadata are more visual, the JSON data can be easily understood, and complex analysis is not needed; JSON has good flexibility, can easily add, delete or modify metadata fields as needed, and is very helpful for adapting to changing data requirements and architecture changes; meanwhile, JSON, as a general data format, supports almost all programming languages and platforms, meaning that metadata can be easily shared and transferred between different systems and applications; to this end, JSON structures may be defined to represent domain-specific metadata standards that help ensure consistency and interoperability.
For version iteration, a version control tool can be used to track and manage changes to the JSON data for auditing and history; for retrieval and filtering, JSON data can be queried with various query languages and tools, which makes metadata easier to search and analyze; for API integration, JSON data can easily be integrated into various applications and services, including Web services, RESTful APIs and mobile applications; and for data security, JSON data can be encrypted and digitally signed to ensure the confidentiality and integrity of the metadata.
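As an illustration of JSON-organized table metadata, the sketch below serializes a small metadata object, reads it back, and adds a field. The structure (`table`, `version`, `columns`, `tags`) is a hypothetical example and not a schema defined by the application.

```python
import json

table_meta = {
    "table": "patient",
    "version": 1,
    "columns": [
        {"name": "patient_id", "type": "string"},
        {"name": "name", "type": "string"},
    ],
    "tags": ["master-data", "patient"],
}

# Serialize with stable key order so that diffs under version control stay small.
text = json.dumps(table_meta, indent=2, sort_keys=True, ensure_ascii=False)

# Adding a field is a plain dictionary update followed by a version bump.
loaded = json.loads(text)
loaded["columns"].append({"name": "birth_date", "type": "date"})
loaded["version"] += 1

column_names = [c["name"] for c in loaded["columns"]]
```

Nested objects and arrays cover the hierarchical structures mentioned above, and the text form can be stored alongside the table and tracked by a version control tool.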
To organize and manage the main data in the data lake, the data stored in the data lake is processed by a data management method. Specifically, the data management method comprises establishing a data catalog, namely a concise and clear catalog of medical image main data, to organize and classify the second data in the data lake. For example, the data catalog includes patient data, medical facility data, and medical device and facility data, wherein the patient data may further include personal basic information, medical diagnoses, medical records, health history and the like; the medical facility data may include hospital information, department information, and doctor, nurse and caregiver information; and the medical device and facility data may include medical device information, diagnostic tools and the like.
The data management method further comprises establishing data tags: concise data tags are attached to the medical image main data, recording basic information such as patient ID, image type, acquisition date and imaging equipment, for use in subsequent searching and classification. The metadata is also simplified: it contains only the basic information of the medical image main data, such as file name, data format and acquisition date, which avoids redundant information and keeps the metadata concise.
The data management method runs through the whole data processing flow: when the basic data is collected and updated, the cleaned data is labeled and metadata information describing the content and source of the data is added; during storage, concise data tags are attached to the medical image main data; and during sharing, the data is shared on the basis of the data catalog.
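The catalog, the data tags and the simplified metadata can be pictured as three small indexes. The following sketch is illustrative only: the `Catalog` class, the category path strings and the choice of `BASIC_FIELDS` are assumptions, with the field names taken from the examples in the text.

```python
from collections import defaultdict

BASIC_FIELDS = ("file_name", "data_format", "acquisition_date")


def simplify_metadata(metadata):
    """Keep only the basic fields; everything else is treated as redundant."""
    return {k: metadata[k] for k in BASIC_FIELDS if k in metadata}


class Catalog:
    def __init__(self):
        self.by_category = defaultdict(list)  # catalog path -> item ids
        self.by_tag = defaultdict(set)        # (tag name, value) -> item ids
        self.metadata = {}                    # item id -> simplified metadata

    def register(self, item_id, category, tags, metadata):
        self.by_category[category].append(item_id)
        for pair in tags.items():
            self.by_tag[pair].add(item_id)
        self.metadata[item_id] = simplify_metadata(metadata)

    def find(self, **tags):
        """Return the ids that carry every requested tag value."""
        sets = [self.by_tag.get(pair, set()) for pair in tags.items()]
        return set.intersection(*sets) if sets else set()
```

A lookup such as `catalog.find(patient_id="P1", image_type="CT")` then returns only the items that carry both tags, while the catalog path groups items for browsing.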
The data management method also comprises establishing resource search, configuring access rights for the medical image main data, setting the life cycle of the medical image main data, collecting statistics on the use of the medical image main data, and establishing backup and disaster recovery for the medical image main data.
Specifically, medical image main data is searched for and located by entering keywords or selecting tags in a search engine. With an intelligent resource search function, the search engine offers efficient and fast retrieval, so that users can quickly find the data they need. By setting access rights on the data resources, it is ensured that medical image main data can be accessed only by authorized personnel; rights management can be configured simply by role, which protects the security and privacy of the data.
By setting the life cycle of the medical image main data, including data acquisition, storage, backup and cleaning, only necessary data is kept in the data lake and unnecessary storage cost is avoided. A simple resource usage statistics system is established to count and analyze the use of the medical image main data, and the management strategy for data resources is optimized according to that usage. A backup and disaster recovery mechanism for the medical image main data is also established to ensure the security and reliability of the data.
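Usage statistics and a life-cycle rule can be combined in a few lines. The sketch below assumes one particular rule, namely that an item becomes a clean-up candidate when it has not been accessed within a retention window; the application does not specify the rule, and `UsageStats` and `retention_days` are hypothetical names.

```python
from collections import Counter
from datetime import date, timedelta


class UsageStats:
    """Count accesses per item and remember the most recent access date."""

    def __init__(self):
        self.hits = Counter()
        self.last_access = {}

    def record(self, item_id, day):
        self.hits[item_id] += 1
        self.last_access[item_id] = max(day, self.last_access.get(item_id, day))

    def expired(self, retention_days, today):
        """Ids not accessed within the retention window, candidates for clean-up."""
        cutoff = today - timedelta(days=retention_days)
        return sorted(i for i, d in self.last_access.items() if d < cutoff)
```

The access counts support the usage analysis, and the list returned by `expired` is the input for whatever archiving or clean-up policy is chosen.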
As shown in fig. 1 and fig. 2, according to the sharing request, the second data is shared with the data demander via the message queue middleware on the basis of the data catalog. Specifically, the message queue middleware may be, for example, Kafka or RocketMQ; in this embodiment, Kafka is taken as an example.
The data demander subscribes to data topics or channels of interest, which may be categorized based on the type of data, topic, keyword, etc.; the medical image main data stored in the data lake is distributed to Kafka, and the data distributed to Kafka may contain various types of information such as real-time data, event data, sensor data, and the like.
After Kafka receives medical image main data stored in a data lake, data is routed to a corresponding data demand party according to a sharing request of the data demand party; the data demander receives the data routed from the Kafka and processes the data, and the data processing operation can be data storage, real-time analysis, service logic execution and the like, which depend on the application scenario.
When the data demander receives medical image main data matching the sharing request and processes it successfully, it sends an acknowledgement to Kafka indicating that the data has been consumed, and Kafka then removes the data from the message queue or data lake. Conversely, when an error or fault occurs while the data demander is receiving the medical image main data, Kafka retransmits the data; if several attempts still fail, Kafka may send the data to an error queue or error store for subsequent processing.
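The acknowledge, retransmit and error-queue flow described above can be shown with an in-memory stand-in. This sketch does not use a Kafka client and does not model Kafka's own storage; `deliver`, `handler` and `max_attempts` are hypothetical names chosen for illustration.

```python
from collections import deque


def deliver(messages, handler, max_attempts=3):
    """Deliver each message; drop it once acknowledged, park it after repeated failures."""
    queue = deque((m, 0) for m in messages)
    error_queue = []
    while queue:
        message, attempts = queue.popleft()
        try:
            handler(message)  # returning normally counts as the acknowledgement
        except Exception:
            attempts += 1
            if attempts < max_attempts:
                queue.append((message, attempts))  # retransmit later
            else:
                error_queue.append(message)        # give up and keep for follow-up
    return error_queue
```

A message that eventually succeeds leaves the queue, while a message that fails `max_attempts` times ends up in the returned error queue for later inspection.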
During actual sharing, data routing authorities are configured, the corresponding authority policy is matched dynamically, and the range of data that the data demander's request may access is determined and controlled. Specifically, attribute tags are introduced for the medical image main data to describe information such as the sensitivity of the data, the data type and the business department to which it belongs; roles and authority sets are then defined in terms of these attribute tags, so that authority is not tied directly to specific data. Roles may include departments, positions and the like, and authority sets cover common data operation permissions such as view, edit and delete.
The authority policy consists of roles, data attribute tags and authority sets, wherein the corresponding authority sets are configured for each role and data attribute tag combination according to specific requirements, when a sharing request of a data requiring party is received, a data authority routing module of Kafka dynamically matches the corresponding authority policy according to the roles and the data attribute tags of the data requiring party, and whether the data requiring party is allowed to access the requested data is automatically determined according to the configured routing authorities.
Preferably, when a role of a certain level does not explicitly configure the authority policy, the authority policy of the upper level role can be searched and inherited, so that the complexity of authority configuration is reduced; and a visual authority configuration interface is provided, so that the authority strategy can be intuitively checked and adjusted, and the authority strategy takes effect in real time. The interface may also present a relationship diagram of data attribute tags and roles to better understand rights configuration.
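The authority policy (role, attribute tag, authority set) and the inheritance from an upper-level role can be sketched as a lookup that walks up a role hierarchy. The class `PolicyStore`, the role names and the deny-by-default behaviour when nothing is configured are assumptions made for this illustration.

```python
class PolicyStore:
    def __init__(self, parents):
        self.parents = parents  # role -> upper-level role (or None)
        self.policies = {}      # (role, attribute tag) -> set of permissions

    def grant(self, role, tag, permissions):
        self.policies[(role, tag)] = set(permissions)

    def resolve(self, role, tag):
        """Return the authority set for a role and tag, inheriting from upper-level roles."""
        visited = set()
        while role is not None and role not in visited:
            visited.add(role)  # guards against a cyclic hierarchy
            if (role, tag) in self.policies:
                return self.policies[(role, tag)]
            role = self.parents.get(role)
        return set()  # nothing configured anywhere: deny

    def allowed(self, role, tag, action):
        return action in self.resolve(role, tag)
```

With a hypothetical hierarchy in which "radiologist" sits under "radiology_dept", a policy granted to the department for the tag "sensitive" applies to the radiologist until a more specific policy is configured for that role.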
Optionally, data authority audit is performed to record changes and usage of data access authorities, and audit information includes personnel modifying authority policies, time of modifying authority policies, and related data operation logs.
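An audit record of this kind needs at least the person, the time and the operation. The following is a minimal sketch with hypothetical field names (`actor`, `action`, `detail`, `at`); a real deployment would persist the entries rather than keep them in memory.

```python
from datetime import datetime, timezone


class AuditLog:
    """Append-only list of authority changes and data operations."""

    def __init__(self, clock=lambda: datetime.now(timezone.utc)):
        self.entries = []
        self.clock = clock

    def record(self, actor, action, detail):
        entry = {
            "actor": actor,                  # who modified the policy or touched the data
            "action": action,                # e.g. "modify_policy" or "read"
            "detail": detail,                # what was changed or accessed
            "at": self.clock().isoformat(),  # when it happened
        }
        self.entries.append(entry)
        return entry

    def by_actor(self, actor):
        return [e for e in self.entries if e["actor"] == actor]
```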
Optionally, artificial intelligence technology can be introduced to analyze the risk of the data, automatically identify the sensitivity and risk degree of the data, and provide intelligent recommendation of the data authority configuration.
The application provides an image cloud data sharing method, which is characterized in that a data lake gateway is adopted to receive a sharing request, and is used as a unified working interface for data lake entering, fragmented lake entering operations are all converged into the data lake gateway for unified processing, the stability, high performance and safety of data lake entering are ensured by the data lake gateway, the problem of real-time acquisition of original data from different data sources is solved, and the data lake is stored in a unified standard format; meanwhile, when sharing data to a data demand party, various data are dynamically updated, and consistency is maintained.
As shown in fig. 3, a medical image cloud main data sharing system provided by an embodiment of the present application comprises:
The acquisition module is used for acquiring and updating the first data according to the sharing request and connecting the first data with the data lake table;
The integration module is used for preprocessing the acquired first data to form second data;
The storage module is used for storing the preprocessed second data into the data lake through a data management method; the data management method comprises the steps of establishing a data catalog to organize and classify the second data in the data lake; establishing a data tag to search and classify the second data; simplifying metadata, namely cleaning redundant information of the second data, wherein the metadata comprises basic information of the second data; the first data are basic data, and the second data are medical image main data;
And the sharing module is used for sharing the second data to the data demand party based on the message queue middleware according to the sharing request.
The specific implementation principle and effect of the medical image cloud main data sharing system can be referred to the relevant description and effect corresponding to the above embodiment, and will not be repeated here.
Taking a hospital as an example, the technical solution provided by the embodiments of the application can be used to share the hospital's personnel, organization and equipment main data, such as medical technical personnel information, medical structure information, medical organization information and medical image equipment information.
The embodiment of the application also provides an electronic device. Fig. 4 is a schematic structural diagram of the electronic device provided by the embodiment of the application. As shown in fig. 4, the electronic device may include: a processor and a memory communicatively coupled to the processor; the memory stores computer-executable instructions; the processor executes the computer-executable instructions stored in the memory to perform the method of any of the embodiments described above. The memory and the processor may be connected by a bus.
Embodiments of the present application also provide a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the method described in any of the foregoing embodiments of the present application.
The embodiment of the application also provides a chip for running instructions; the chip is used for executing the method performed by the electronic device in any of the foregoing embodiments.
Embodiments of the present application also provide a computer program product comprising a computer program which, when executed by a processor, implements the method performed by the electronic device in any of the foregoing embodiments of the present application.
In the several embodiments provided by the present application, it should be understood that the disclosed systems and methods may be implemented in other ways. For example, the system embodiments described above are merely illustrative: the division of modules is merely a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces or modules, and may be electrical, mechanical, or in other forms.
The modules illustrated as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to implement the solution of this embodiment.
In addition, each functional module in the embodiments of the present application may be integrated in one processing unit, or each module may exist alone physically, or two or more modules may be integrated in one unit. The units formed by the modules can be realized in a form of hardware or a form of hardware and software functional units. The integrated modules, which are implemented in the form of software functional modules, may be stored in a computer readable storage medium. The software functional modules described above are stored in a storage medium and include instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or processor to perform some of the steps of the methods described in the various embodiments of the application.
It should be appreciated that the processor may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the present application may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in a processor for execution.
The memory may include high-speed random access memory (Random Access Memory, RAM), and may further include non-volatile memory (Non-Volatile Memory, NVM), such as at least one magnetic disk memory; it may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, or an optical disk.
The bus may be an industry standard architecture (Industry Standard Architecture, ISA) bus, a peripheral component interconnect (Peripheral Component Interconnect, PCI) bus, or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, among others. The buses may be divided into address buses, data buses, control buses, etc. For ease of illustration, the buses in the drawings of the present application are not limited to only one bus or to one type of bus.
The storage medium may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as static random-access memory (Static Random-Access Memory, SRAM), electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), programmable read-only memory (Programmable Read-Only Memory, PROM), read-only memory (Read-Only Memory, ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC). It is also possible that the processor and the storage medium reside as discrete components in an electronic device or a master device.
The foregoing embodiments and description are presented to illustrate the invention and are not to be construed as limiting its scope. Modifications, equivalent substitutions, and other improvements to the embodiments or to some of their technical features that a person skilled in the art can obtain through logical analysis, reasoning, or limited experimentation, in light of the teachings of the embodiments and in combination with common general knowledge and/or the prior art, are intended to fall within the scope of the invention.

Claims (11)

1. An image cloud data sharing method, characterized in that it is applied to a data processing party and comprises:
collecting and updating first data according to a sharing request and connecting the first data with a data lake table;
preprocessing the acquired first data to form second data;
storing the preprocessed second data into a data lake through a data management method;
and, according to the sharing request, sharing the second data to a data demander based on message queue middleware.
2. The image cloud data sharing method as claimed in claim 1, wherein: the data management method comprises the following steps:
Establishing a data catalog to organize and classify the second data in the data lake;
Establishing a data tag to search and classify the second data;
simplifying metadata, namely cleaning redundant information of the second data, wherein the metadata comprises basic information of the second data;
the first data is basic data, and the second data is medical image main data.
3. The image cloud data sharing method as claimed in claim 1, wherein: a data lake gateway is used to receive the sharing request, the data lake gateway receiving the sharing request through a distributed message queue, an API interface or a JDBC interface; the data lake gateway adopts a decentralised distributed architecture in which the nodes of the data lake gateway cluster are equivalent, an SDK (software development kit) for accessing the data lake gateway is provided for ETL (extract, transform, load), and the node list of the data lake gateway cluster is maintained using Zookeeper.
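As a hedged illustration of the gateway arrangement recited in claim 3, the sketch below shows how a client SDK might choose among equivalent gateway nodes taken from a shared node list. The node list is modelled by an in-memory `NodeRegistry` rather than a real Zookeeper client; `NodeRegistry`, `GatewayClient`, and the node addresses are hypothetical names introduced only for this example.

```python
import itertools


class NodeRegistry:
    """In-memory stand-in for the gateway node list (kept in Zookeeper per the claim)."""

    def __init__(self):
        self._nodes = []

    def register(self, address):
        if address not in self._nodes:
            self._nodes.append(address)

    def deregister(self, address):
        if address in self._nodes:
            self._nodes.remove(address)

    def live_nodes(self):
        return list(self._nodes)


class GatewayClient:
    """Hypothetical SDK: all nodes are equivalent, so any live node may serve a request."""

    def __init__(self, registry):
        self._registry = registry
        self._counter = itertools.count()

    def pick_node(self):
        nodes = self._registry.live_nodes()
        if not nodes:
            raise RuntimeError("no gateway node available")
        # Round-robin over the current node list; no node is a fixed master.
        return nodes[next(self._counter) % len(nodes)]
```

Because the node list is re-read on every call, a node that leaves the cluster simply stops being selected, which is one way the equivalence of nodes could be exploited.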
4. The image cloud data sharing method as claimed in claim 1, wherein: the preprocessed second data is stored to a source-end original layer of the data lake through ETL; the preprocessing comprises:
converting, by using an adapter, the format of the first data that has been acquired and updated into the data lake table format;
cleaning the data by deleting highly repetitive or highly similar data and erroneous data, and by supplementing and filling in incomplete data;
and labeling the cleaned data and adding metadata information to describe the content and the source of the medical image main data.
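The three preprocessing steps recited in claim 4 (format conversion by an adapter, cleaning, and labeling) can be sketched as follows. This is a minimal sketch under stated assumptions: the field mapping, the default values, and the `_meta` key are hypothetical, and the cleaning step only detects exact duplicates by key, not the similarity-based deduplication the claim also mentions.

```python
def adapt(raw, field_map):
    """Adapter step: rename source fields to the (assumed) data lake table columns."""
    return {lake_col: raw.get(src) for src, lake_col in field_map.items()}


def clean(rows, key, defaults):
    """Drop rows whose key is missing or already seen; fill missing values with defaults."""
    seen, out = set(), []
    for row in rows:
        k = row.get(key)
        if k is None or k in seen:  # erroneous or duplicate record
            continue
        seen.add(k)
        out.append({c: (v if v is not None else defaults.get(c)) for c, v in row.items()})
    return out


def label(row, source, content_type):
    """Labeling step: attach metadata describing the content and the source."""
    return {**row, "_meta": {"source": source, "content": content_type}}


def preprocess(raw_rows, field_map, key, defaults, source, content_type):
    adapted = [adapt(r, field_map) for r in raw_rows]
    return [label(r, source, content_type) for r in clean(adapted, key, defaults)]
```

Keeping the three steps as separate functions mirrors the order in the claim and lets each step be replaced independently, for example by a similarity-based cleaner.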
5. The method for sharing image cloud data as claimed in claim 4, wherein: and the data organization format of the data lake table metadata is JSON format.
6. The image cloud data sharing method as claimed in claim 1, wherein: the data management method further comprises: establishing resource search, in which the medical image main data is searched for and located through a search engine by entering keywords or selecting labels; configuring access rights of the medical image main data; setting the life cycle of the medical image main data; counting the usage of the medical image main data; and establishing backup and disaster recovery of the medical image main data.
7. The image cloud data sharing method as claimed in claim 1, wherein: the message queue middleware receives the second data stored in the data lake and routes the data to the data demander; when the data demander has received the second data matching the sharing request, the message queue middleware deletes the data; when the data demander fails to receive the second data, the message queue middleware retransmits the data or sends the data to an error queue.
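The delivery behaviour recited in claim 7 (delete on successful receipt, otherwise retransmit or move to an error queue) can be sketched with an in-memory queue. This models only the claimed semantics and does not use any particular broker's API; the class `SharingQueue` and the `max_attempts` limit are assumptions introduced for illustration.

```python
from collections import deque


class SharingQueue:
    """Hypothetical in-memory model of the claimed message queue behaviour."""

    def __init__(self, max_attempts=3):
        self.pending = deque()
        self.error_queue = []
        self.max_attempts = max_attempts  # assumed retry limit, not from the source

    def publish(self, message):
        self.pending.append({"body": message, "attempts": 0})

    def deliver(self, consumer):
        """Deliver pending messages; the consumer returns True to acknowledge receipt."""
        delivered = 0
        while self.pending:
            entry = self.pending.popleft()
            entry["attempts"] += 1
            try:
                ok = bool(consumer(entry["body"]))
            except Exception:
                ok = False  # a consumer error counts as a failed receipt
            if ok:
                delivered += 1                    # acknowledged: the message is deleted
            elif entry["attempts"] < self.max_attempts:
                self.pending.append(entry)        # retransmit later
            else:
                self.error_queue.append(entry["body"])  # give up: move to error queue
        return delivered
```

A bounded retry count is one possible way to decide between retransmission and the error queue; the claim itself leaves that choice open.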
8. The method for sharing image cloud data as claimed in claim 7, wherein: data routing permissions are configured, and when the sharing request is received, the corresponding permission policy is dynamically matched to determine and control the range of data that the data demander's request may access; the permission policy comprises a role, an attribute tag and a permission set, wherein the attribute tag is used for describing or classifying the sensitivity and the data type of the data, and the corresponding permission set is configured according to the role and the attribute tag.
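The permission policy of claim 8 (role, attribute tag, permission set) can be illustrated with a small lookup table. This is a sketch under stated assumptions: the names `AttributeTag` and `PolicyTable`, the sensitivity levels, and the `"read"` action are hypothetical and chosen only to show how a role and an attribute tag might jointly select a permission set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeTag:
    sensitivity: str  # assumed levels, e.g. "public", "internal", "restricted"
    data_type: str    # e.g. "equipment", "personnel"


class PolicyTable:
    """Maps (role, attribute tag) to a permission set."""

    def __init__(self):
        self._rules = {}

    def grant(self, role, tag, permissions):
        self._rules[(role, tag)] = frozenset(permissions)

    def permissions_for(self, role, tag):
        # No matching policy means no permissions (deny by default).
        return self._rules.get((role, tag), frozenset())

    def allowed_records(self, role, records, action="read"):
        """Restrict the data range of a request to records the role may act on."""
        return [r for r in records if action in self.permissions_for(role, r["tag"])]
```

Matching the policy at request time, rather than at storage time, is what allows the accessible data range to change when a role or tag is reconfigured.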
9. An image cloud data sharing system is characterized in that: comprising
The acquisition module is used for acquiring and updating the first data according to the sharing request and connecting the first data with the data lake table;
The integration module is used for preprocessing the acquired first data to form second data;
The storage module is used for storing the preprocessed second data into the data lake through a data management method; the data management method comprises the steps of establishing a data catalog to organize and classify the second data in the data lake; establishing a data tag to search and classify the second data; simplifying metadata, namely cleaning redundant information of the second data, wherein the metadata comprises basic information of the second data; the first data are basic data, and the second data are medical image main data;
And the sharing module is used for sharing the second data to the data demand party based on the message queue middleware according to the sharing request.
10. An electronic device, characterized by comprising a processor and a memory communicatively coupled to the processor; the memory stores computer-executable instructions; the processor executes the computer-executable instructions stored in the memory to implement the image cloud data sharing method according to any one of claims 1-8.
11. A computer readable storage medium, wherein a computer program is stored on the storage medium, and when the computer program is executed by a processor, the image cloud data sharing method according to any one of claims 1 to 8 is implemented.
CN202311790307.1A 2023-12-25 2023-12-25 Image cloud data sharing method and system Pending CN117971790A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311790307.1A CN117971790A (en) 2023-12-25 2023-12-25 Image cloud data sharing method and system

Publications (1)

Publication Number Publication Date
CN117971790A true CN117971790A (en) 2024-05-03

Family

ID=90846720

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311790307.1A Pending CN117971790A (en) 2023-12-25 2023-12-25 Image cloud data sharing method and system

Country Status (1)

Country Link
CN (1) CN117971790A (en)

Similar Documents

Publication Publication Date Title
US11755628B2 (en) Data relationships storage platform
US7177877B2 (en) Method and system for externalizing conditional logic for collecting multi-purpose objects
US20080244008A1 (en) Method and system for data exchange among data sources
CN108667725A (en) A kind of industrial AnyRouter and implementation method based on a variety of accesses and edge calculations
US20240037122A1 (en) COMPUTING SYSTEM PROVIDING BLOCKCHAIN-FACILITATED SEMANTIC INTEROPERABILITY BETWEEN MULTIPLE DISPARATE SYSTEMS OF RECORD (SORs) AND RELATED METHODS
Voit et al. Big data processing for full-text search and visualization with Elasticsearch
US11567735B1 (en) Systems and methods for integration of multiple programming languages within a pipelined search query
US11500876B2 (en) Method for duplicate determination in a graph
US11321366B2 (en) Systems and methods for machine learning models for entity resolution
CN112597218A (en) Data processing method and device and data lake framework
CN113094385A (en) Data sharing fusion platform and method based on software definition open toolset
US11748634B1 (en) Systems and methods for integration of machine learning components within a pipelined search query to generate a graphic visualization
Mishra et al. Challenges in big data application: a review
CN117971790A (en) Image cloud data sharing method and system
US11838171B2 (en) Proactive network application problem log analyzer
CN113380414B (en) Data acquisition method and system based on big data
US20240232236A1 (en) Systems and methods for machine learning models for entity resolution
Mayuri et al. A Study on Use of Big Data in Cloud Computing Environment
Soboliev et al. Distributed System of Intelligent Content Monitoring Agents.
Ch CLOUD COMPUTING ENVIRONMENT WITH BIGDATA
CN117743426A (en) One-stop fusion input platform and method
Settas et al. Detecting similarities in antipattern ontologies using semantic social networks: implications for software project management
CN114971189A (en) Government affair cloud data acquisition, processing and management system and method based on block chain
Costa et al. MAID–Multi Agent for the Integration of Data
Zheng et al. Agent-Based Data Loading Method in MedImGrid

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination