CN115718571B - Data management method and device based on multidimensional features - Google Patents


Info

Publication number
CN115718571B
Authority
CN
China
Prior art keywords
data
target
storage
execution
management
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211475101.5A
Other languages
Chinese (zh)
Other versions
CN115718571A (en)
Inventor
陈志标
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Computing Sciences
Original Assignee
Shenzhen Institute of Computing Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Computing Sciences filed Critical Shenzhen Institute of Computing Sciences
Priority to CN202211475101.5A priority Critical patent/CN115718571B/en
Publication of CN115718571A publication Critical patent/CN115718571A/en
Application granted granted Critical
Publication of CN115718571B publication Critical patent/CN115718571B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides a data management method and device based on multidimensional features. The method involves a storage end, a routing end, a management end and an execution end, and comprises the following steps: the management end receives a data access request sent by the routing end and, according to the data access request, generates initial data feature statistics and trigger conditions corresponding to the storage end; the management end determines a target storage end according to the trigger conditions, generates a task execution request according to the target storage end and the trigger conditions, sends the task execution request to the execution end, and receives a first processing instruction sent by the execution end. User business requirements are thereby met by a simple one-stop solution at relatively optimal cost and performance. The solution is transparent to upper-layer users, who do not need to deploy multiple systems or add external tools to complete lifecycle management of their data.

Description

Data management method and device based on multidimensional features
Technical Field
The application relates to the field of data management, in particular to a data management method and device based on multidimensional features.
Background
With the development of the mobile internet and enterprise digital transformation, enterprises need to manage ever larger volumes of data. The main demands are real-time awareness of changing data, agile business analysis of the data, and longer data retention periods. To meet these demands, an enterprise often has to build a very complex data management system at high labor cost and invest substantial resources to store the massive data. This burden is increasingly hard to bear even for large enterprises, and more so for small and medium-sized ones, so big data has become the preserve of a small number of enterprises with the necessary financial and technical resources, while ordinary enterprises cannot really enjoy the productivity gains that big data brings.
Data has a life cycle. It shows different characteristics at different stages, and the business workloads it must support also differ. To manage its data, an enterprise generally uses several systems to handle data of different ages and different business requirements, and moves data between these systems with data integration, data conversion, data migration and similar tools. The main problems of such a data management scheme are: 1. the same data is stored redundantly in several systems, which is costly and makes global data consistency hard to guarantee; 2. migrating data between systems requires users to rely on third-party tools and on specialist technicians, which sets a high threshold, introduces high latency in the data flow, and is often hard to handle efficiently under abnormal conditions.
As data has become one of the factors of production, enterprises have begun to recognize its value and hope to keep all kinds of data generated during their operations. Managing this mass of data at the lowest cost while providing real-time data access is a key capability for enterprises undergoing digital transformation, and is also key to whether an enterprise can hold its ground in a fiercely competitive environment. To reduce the cost and threshold of enterprise data management, some data management systems offering unified management have begun to appear; however, each existing scheme considers only a single dimension and cannot handle multiple dimensions.
Disclosure of Invention
In view of the foregoing, the present application is proposed to provide a multidimensional feature-based data management method and device that overcome, or at least partially solve, the foregoing problems, including:
a data management method based on multidimensional features, the method involving a storage end, a routing end, a management end and an execution end, and comprising the following steps:
the management end receives a data access request sent by the routing end and generates initial data feature statistics and trigger conditions corresponding to the storage end according to the data access request;
the management end determines a target storage end according to the trigger conditions, generates a task execution request according to the target storage end and the trigger conditions, sends the task execution request to the execution end, and receives a first processing instruction sent by the execution end;
and the management end updates or deletes the target data feature statistics corresponding to the target storage end according to the first processing instruction.
Further, the step of generating initial data feature statistics and triggering conditions corresponding to the storage end according to the data access request includes:
the management end determines access data in the storage end according to the data access request;
the management end determines data granularity information corresponding to the access data in the storage end according to the access data;
and the management end generates initial data characteristic statistical information and the triggering condition corresponding to the storage end according to the data granularity information.
Further, the step of generating the task execution request according to the target storage end and the trigger condition and sending the task execution request to the execution end includes:
the management end obtains all data in the storage end and determines whether target data meeting the trigger condition exists;
when target data meeting the trigger condition exists among the data in the storage end, the management end generates the task execution request according to the target data;
and the management end sends the task execution request to the execution end.
The embodiment of the invention also provides a data management method based on the multidimensional characteristic, the method relates to a storage end, a routing end, a management end and an execution end, and the method comprises the following steps:
when the execution end receives a task execution request sent by the management end, the execution end determines a task type corresponding to the task execution request according to the task execution request;
the execution end obtains target data from the storage end according to the trigger condition in the task execution request;
the execution end carries out data processing on the target data according to the task type and the target storage end, wherein the task type comprises migration or deletion;
the execution end generates a first processing instruction responding to the management end according to the target data, the target storage end and the task type, and sends the first processing instruction to the management end; the first processing instruction is used for updating or deleting the target data characteristic statistical information corresponding to the target storage end at the management end;
the execution end generates, according to the target data, the target storage end and the task type, a second processing instruction for the routing end, and sends the second processing instruction to the routing end; the second processing instruction is used for adding or deleting, at the routing end, a target access path from the target data to the target storage end.
Further, the step of performing data processing on the target data according to the task type and the target storage end, wherein the task type comprises migration or deletion, includes:
when the task type is migration, the execution end migrates the target data from the storage end to the target storage end;
or,
when the task type is deletion, the execution end deletes the target data from the target storage end, wherein the target storage end is the initial storage end corresponding to the target data.
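As an illustration only, the execution-end steps above can be sketched in Python. The `StorageEnd` class, the `execute_task` function and the dictionary layout of the two processing instructions are hypothetical names chosen for this sketch; the application does not specify any data format.

```python
from dataclasses import dataclass, field


@dataclass
class StorageEnd:
    """Hypothetical in-memory stand-in for one storage engine."""
    name: str
    rows: dict = field(default_factory=dict)


def execute_task(task_type, keys, source, target):
    """Process the target data and return (first_instruction, second_instruction).

    For "migrate", `source` holds the data and `target` receives it.
    For "delete", `target` is the storage end that initially holds the data.
    """
    keys = list(keys)
    if task_type == "migrate":
        for key in keys:
            target.rows[key] = source.rows.pop(key)
        stats_action, path_action = "update", "add_path"
    elif task_type == "delete":
        for key in keys:
            del target.rows[key]
        stats_action, path_action = "delete", "delete_path"
    else:
        raise ValueError(f"unsupported task type: {task_type}")
    # First instruction goes to the management end (feature statistics),
    # second instruction goes to the routing end (access paths).
    first = {"to": "management", "action": stats_action,
             "storage": target.name, "keys": keys}
    second = {"to": "routing", "action": path_action,
              "storage": target.name, "keys": keys}
    return first, second
```

In this sketch a migration moves rows between two engines and asks the routing end to add a path to the new location, while a deletion removes both the rows and the path.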
The embodiment of the invention also provides a data management method based on the multidimensional characteristic, the method relates to a storage end, a routing end, a management end and an execution end, and the method comprises the following steps:
the routing end determines an access path in response to data access information for upper-layer data;
the routing end generates the data access request according to the access path and sends the data access request to the management end;
and the routing end adds or deletes the target access path corresponding to the target storage end according to the second processing instruction sent by the execution end.
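The routing-end behavior described above can be pictured with a small Python sketch. The `RoutingEnd` class, its method names and the request and instruction dictionaries are assumptions made for illustration and are not taken from the application.

```python
class RoutingEnd:
    """Hypothetical routing end: maps each data key to the storage end holding it."""

    def __init__(self):
        self.paths = {}  # data key -> name of the storage end

    def build_access_request(self, key, operation):
        """Determine the access path and build the request sent to the management end."""
        storage = self.paths[key]
        return {"key": key, "storage": storage, "op": operation}

    def apply_second_instruction(self, instruction):
        """Add or delete access paths as requested by the execution end."""
        for key in instruction["keys"]:
            if instruction["action"] == "add_path":
                self.paths[key] = instruction["storage"]
            elif instruction["action"] == "delete_path":
                self.paths.pop(key, None)
            else:
                raise ValueError(f"unknown action: {instruction['action']}")
```

Because upper-layer users only talk to the routing end, a migration changes the stored path but not the way data is addressed, which is one way to read the transparency claim.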
The embodiment of the invention also provides a data management device based on the multidimensional characteristic, the device relates to a storage end, a routing end, a management end and an execution end, and the device comprises:
the receiving module is used for the management end to receive the data access request sent by the routing end and to generate initial data feature statistics and trigger conditions corresponding to the storage end according to the data access request;
the first generation module is used for the management end to determine a target storage end according to the trigger conditions, generate the task execution request according to the target storage end and the trigger conditions, send the task execution request to the execution end, and receive a first processing instruction sent by the execution end;
and the first data processing module is used for the management end to update or delete the target data feature statistics corresponding to the target storage end according to the first processing instruction.
The embodiment of the invention also provides a data management device based on the multidimensional characteristic, the device relates to a storage end, a routing end, a management end and an execution end, and the device comprises:
the first determining module is used for determining a task type corresponding to the task execution request according to the task execution request when the execution end receives the task execution request sent by the management end;
the acquisition module is used for the execution end to acquire target data from the storage end according to the trigger condition in the task execution request;
the second data processing module is used for the execution end to perform data processing on the target data according to the task type and the target storage end, wherein the task type comprises migration or deletion;
the third data processing module is used for generating a first processing instruction responding to the management end by the execution end according to the target data, the target storage end and the task type, and sending the first processing instruction to the management end; the first processing instruction is used for updating or deleting the target data characteristic statistical information corresponding to the target storage end at the management end;
the fourth data processing module is used for the execution end to generate, according to the target data, the target storage end and the task type, a second processing instruction for the routing end, and to send the second processing instruction to the routing end; the second processing instruction is used for adding or deleting, at the routing end, a target access path from the target data to the target storage end.
The embodiment of the invention also provides a data management device based on the multidimensional characteristic, the device relates to a storage end, a routing end, a management end and an execution end, and the device comprises:
the second determining module is used for the routing end to determine an access path in response to data access information for upper-layer data;
the second generation module is used for the routing end to generate the data access request according to the access path and to send the data access request to the management end;
and the fifth data processing module is used for the routing end to add or delete the target access path corresponding to the target storage end according to the second processing instruction sent by the execution end.
The embodiment of the invention also provides a computer device comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the data management method based on multidimensional features described above.
Embodiments of the present application also provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the data management method based on multidimensional features described above.
The application has the following advantages:
In the embodiments of the application, the prior art considers only one dimension and cannot handle multiple dimensions, and it stores data redundantly in several systems, which is costly and makes global data consistency hard to guarantee. Against these defects, the application provides a data management method based on multidimensional features, specifically: the management end receives a data access request sent by the routing end and generates initial data feature statistics and trigger conditions corresponding to the storage end according to the data access request; the management end determines a target storage end according to the trigger conditions, generates a task execution request according to the target storage end and the trigger conditions, sends the task execution request to the execution end, and receives a first processing instruction sent by the execution end; and the management end updates or deletes the target data feature statistics corresponding to the target storage end according to the first processing instruction.
Because the task execution request is generated according to the target storage end and the trigger conditions, the problems of single-dimension support, unguaranteed data consistency, high cost and inefficient data processing are addressed. Data is automatically migrated and managed along several different dimensions, such as data characteristics and business characteristics, so that user business requirements are met by a simple one-stop solution at relatively optimal cost. The conflict between write optimization and read optimization, and the conflict between data temperature and storage cost, are resolved, providing users with optimal performance and cost. The solution is transparent to upper-layer users, who do not need to deploy multiple systems or add external tools to complete lifecycle management of their data.
Drawings
In order to more clearly illustrate the technical solutions of the present application, the drawings that are needed in the description of the present application will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort to a person skilled in the art.
FIG. 1 is a flowchart illustrating a method for managing data based on multidimensional features according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating a method for managing data based on multidimensional features according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating a method for managing data based on multidimensional features according to an embodiment of the present application;
FIG. 4 is a block diagram of a data management device based on multidimensional features according to an embodiment of the present application;
FIG. 5 is a block diagram of a data management device based on multidimensional features according to an embodiment of the present application;
FIG. 6 is a block diagram of a data management device based on multidimensional features according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
In order that the manner in which the above recited objects, features and advantages of the present application are obtained will become more readily apparent, a more particular description of the application briefly described above will be rendered by reference to the appended drawings. It will be apparent that the described embodiments are some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
By analyzing the prior art, the inventors found the following. Data has a life cycle; it shows different characteristics at different stages, and the business workloads it must support also differ. To manage its data, an enterprise generally uses several systems to handle data of different ages and different business requirements, and moves data between these systems with data integration, data conversion, data migration and similar tools. For example, when data is first generated it usually has to support transactions in the production system, with low-latency point queries and point updates, and is typically held in a traditional OLTP (On-Line Transaction Processing) database. Then, to serve users' queries on historical details, the data is integrated in real time into a historical archive database, which generally stores data grouped by user so that it can be retrieved along the user dimension. Finally, to support reports, business intelligence and the like, the data is converted and stored in a data warehouse by tools such as ETL (Extract-Transform-Load, the process of extracting data from a source, transforming it, and loading it into a destination). The main problems of such a data management scheme are:
1. The same data is stored in a plurality of systems in a redundant way, the cost is high, and the global consistency of the data is difficult to ensure.
2. The migration of data between systems requires users to implement by using third party tools and the like, requires professional technicians to implement, has a high threshold, has high delay of data flow, and is often difficult to efficiently process under abnormal conditions.
The first technical scheme is as follows:
C-Store (Mike Stonebraker et al. C-Store: A Column-oriented DBMS. In VLDB, pages 553–564, 2005.)
Write-optimized storage and query-optimized storage are mutually exclusive in design and implementation: write-oriented storage is generally implemented as a row store, and query-oriented storage is generally implemented as a column store. The C-Store paper first proposed a hybrid architecture, which implements two modules. One module handles fast writes (the Writeable Store, WS) and the other provides efficient queries (the Read-optimized Store, RS), while a tuple mover runs between them to migrate data from WS to RS.
This approach has been widely used in many big data systems, including commercial systems such as SAP HANA and Vertica. The disadvantages of this solution are:
It supports only one fixed dimension, the distinction between reading and writing, and solves only that specific problem.
The tuple mover merges data from WS to RS periodically, which may cause data to be merged back and forth frequently if the data changes often.
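To make the WS/RS split concrete, the following is a deliberately simplified Python sketch of a hybrid store with a row-oriented write store, a column-oriented read store and a batch mover between them. It is not C-Store's actual implementation; the `HybridStore` class, its two-column schema and its method names are assumptions for illustration.

```python
class HybridStore:
    """Toy hybrid store: row-oriented writes, column-oriented reads."""

    def __init__(self):
        self.ws = []                         # write store: appended row tuples
        self.rs = {"id": [], "amount": []}   # read store: one list per column

    def insert(self, row_id, amount):
        """Fast path for writes: append a whole row to the write store."""
        self.ws.append((row_id, amount))

    def move(self):
        """Mover: merge every buffered row into the columnar read store."""
        for row_id, amount in self.ws:
            self.rs["id"].append(row_id)
            self.rs["amount"].append(amount)
        moved = len(self.ws)
        self.ws.clear()
        return moved

    def total_amount(self):
        """A query must combine both stores to see rows not yet moved."""
        return sum(self.rs["amount"]) + sum(amount for _, amount in self.ws)
```

The sketch also shows why the split is a single fixed dimension: every row follows the same write-then-read path regardless of how hot it is or which business workload uses it.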
The second technical scheme is as follows:
Tiered storage of hot and cold data: "A hierarchical storage method based on hot-cold classification of time-series data", publication number CN108268217A, publication date April 30, 2021. In this patent, data is divided by specific time rules into real-time data, Sunday data, historical data and cold data; a real-time data layer, a Sunday data layer, a historical data layer and a cold data layer are established and stored, respectively, in memory, on SSD, on HDD and in the distributed storage system HDFS. Real-time data is data within 24 hours, Sunday data is data within two weeks, historical data is data within two years, and cold data is data within 10 years. Data is periodically imported in batches from an upper storage layer to the next lower one through a batch transfer service.
The disadvantages of this solution are: data can only be divided and migrated by its time window, so the migration granularity is relatively coarse; and only appending and querying data are supported, not modifying it.
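The time-window rule of this second scheme can be written as a short Python sketch. The layer names and media follow the description above; treating a year as 365 days, making each boundary inclusive, and the function name `classify_tier` are assumptions of this sketch.

```python
from datetime import timedelta

# (upper age limit, layer name, storage medium), ordered from newest to oldest.
TIERS = [
    (timedelta(hours=24), "real-time", "memory"),
    (timedelta(weeks=2), "Sunday", "SSD"),
    (timedelta(days=2 * 365), "historical", "HDD"),
    (timedelta(days=10 * 365), "cold", "HDFS"),
]


def classify_tier(age):
    """Return (layer, medium) for data of the given age, or None if older than 10 years."""
    for limit, layer, medium in TIERS:
        if age <= limit:
            return layer, medium
    return None
```

Age is the only input, which illustrates the coarse granularity criticized above: two records of the same age always land in the same layer, however differently they are accessed.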
In summary, the prior art considers only one dimension and cannot handle multiple dimensions, and it stores data redundantly in several systems, which is costly and makes global data consistency hard to guarantee. The data management method based on multidimensional features is proposed to solve these problems. The specific steps are as follows:
Referring to FIG. 1, a flowchart of the steps of a data management method based on multidimensional features according to an embodiment of the present application is shown.
A data management method based on multidimensional features, the method involving a storage end, a routing end, a management end and an execution end, and comprising the following steps:
S110: the management end receives a data access request sent by the routing end and generates initial data feature statistics and trigger conditions corresponding to the storage end according to the data access request;
S120: the management end determines a target storage end according to the trigger conditions, generates a task execution request according to the target storage end and the trigger conditions, sends the task execution request to the execution end, and receives a first processing instruction sent by the execution end;
S130: the management end updates or deletes the target data feature statistics corresponding to the target storage end according to the first processing instruction.
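Steps S110 to S130 can be pictured with the following minimal Python sketch. Everything specific in it is an assumption made for illustration: the `ManagementEnd` class, the use of a last-access tick as the only statistic, an idle limit as the only trigger condition, and the dictionary formats of the requests and instructions.

```python
class ManagementEnd:
    """Hypothetical management end tracking one statistic per (storage end, key)."""

    def __init__(self, idle_limit):
        self.idle_limit = idle_limit  # assumed trigger condition: idle ticks allowed
        self.stats = {}               # (storage name, key) -> last access tick

    def on_access_request(self, request, now):
        """S110: record statistics for the data named in the access request."""
        self.stats[(request["storage"], request["key"])] = now

    def build_task_requests(self, now, target_storage):
        """S120: create a migration task for every entry meeting the trigger condition."""
        return [
            {"type": "migrate", "key": key, "source": storage, "target": target_storage}
            for (storage, key), last in self.stats.items()
            if storage != target_storage and now - last >= self.idle_limit
        ]

    def on_first_instruction(self, instruction):
        """S130: update or delete statistics once the execution end reports back."""
        last = self.stats.pop((instruction["source"], instruction["key"]), None)
        if instruction["action"] == "update":
            self.stats[(instruction["target"], instruction["key"])] = last
```

In this reading, an "update" instruction moves the statistics entry to the target storage end after a migration, and a "delete" instruction simply drops it.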
In the embodiments of the application, the prior art considers only one dimension and cannot handle multiple dimensions, and it stores data redundantly in several systems, which is costly and makes global data consistency hard to guarantee. Against these defects, the application provides a data management method based on multidimensional features, specifically: the management end receives a data access request sent by the routing end and generates initial data feature statistics and trigger conditions corresponding to the storage end according to the data access request; the management end determines a target storage end according to the trigger conditions, generates a task execution request according to the target storage end and the trigger conditions, sends the task execution request to the execution end, and receives a first processing instruction sent by the execution end; and the management end updates or deletes the target data feature statistics corresponding to the target storage end according to the first processing instruction.
Because the task execution request is generated according to the target storage end and the trigger conditions, the problems of single-dimension support, unguaranteed data consistency, high cost and inefficient data processing are addressed. Data is automatically migrated and managed along several different dimensions, such as data characteristics and business characteristics, so that user business requirements are met by a simple one-stop solution at relatively optimal cost. The conflict between write optimization and read optimization, and the conflict between data temperature and storage cost, are resolved, providing users with optimal performance and cost. The solution is transparent to upper-layer users, who do not need to deploy multiple systems or add external tools to complete lifecycle management of their data.
Next, a data management method based on a multi-dimensional feature in the present exemplary embodiment will be further described.
In one embodiment of the invention, the storage end comprises more than 2 storage engines, and each storage engine provides data management capability of data feature optimization oriented to different dimensions.
In an embodiment of the present invention, the management end is configured to define and manage the data management capability and life cycle management policy of each storage engine (that is, each storage end), to track the data optimization feature information (that is, the data feature statistics) of each storage engine, and to trigger data management tasks such as data migration when a defined rule (that is, a trigger condition) is satisfied.
In an embodiment of the present invention, the management end receives a data access request sent by the routing end and generates initial data feature statistics and trigger conditions corresponding to the storage end according to the data access request, where the initial data feature statistics include initial dimension information, initial optimization feature information and initial granularity information corresponding to the storage end. During execution, the data in the data access request sent by the routing end is mapped to the data granularity defined by the engine, and the statistics for that granularity (the granularity information) are updated. Each storage end has its own data feature statistics.
In one embodiment of the invention, the data access request includes, but is not limited to, data query, data access, data addition, data deletion.
In an embodiment of the present invention, the step of generating initial data feature statistics and triggering conditions corresponding to the storage terminal according to the data access request includes:
the management end determines access data in the storage end according to the data access request;
the management end determines data granularity information corresponding to the access data in the storage end according to the access data;
and the management end generates initial data characteristic statistical information and the triggering condition corresponding to the storage end according to the data granularity information.
In a specific embodiment, during execution, the data to be accessed (the access data) in the data access request sent by the routing end is mapped to the data granularity (the granularity information) defined by the storage engine (the storage end), and that granularity information is updated.
In a specific embodiment, when the data access request is newly added data, the initial data feature statistical information corresponding to the storage end for storing the newly added data can be generated through the newly added data.
In one embodiment, the characteristics of the data are defined by classification by dimension information, including but not limited to:
dimension one: defining the data as dynamic data and static data according to the change dimension of the data, namely the data change dimension;
dimension two: defining the data as hot data, warm data and cold data according to the access frequency of the data, namely the dimension of the data access frequency;
dimension three: according to the business characteristics of the data, the data is defined as search data and analysis data, namely, the dimension of the data query business.
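The three dimensions listed above can be modeled as independent labels attached to the same piece of data. In the Python sketch below, the enum names mirror the text, while the function `classify_features`, its inputs and all numeric thresholds are assumptions, since the application gives no concrete values.

```python
from enum import Enum


class Change(Enum):
    DYNAMIC = "dynamic"
    STATIC = "static"


class Temperature(Enum):
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"


class Workload(Enum):
    SEARCH = "search"
    ANALYSIS = "analysis"


def classify_features(updates_per_day, reads_per_day, scan_fraction):
    """Label data along the three dimensions; all thresholds are illustrative."""
    change = Change.DYNAMIC if updates_per_day > 0 else Change.STATIC
    if reads_per_day >= 100:
        temperature = Temperature.HOT
    elif reads_per_day >= 1:
        temperature = Temperature.WARM
    else:
        temperature = Temperature.COLD
    # scan_fraction: share of reads that are large scans rather than point lookups.
    workload = Workload.ANALYSIS if scan_fraction >= 0.5 else Workload.SEARCH
    return change, temperature, workload
```

Keeping the dimensions separate means a record can, for example, be static and hot at once, a combination a single read/write or time-window rule cannot express.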
In a specific embodiment, the optimization feature information refers to the optimization features of the engine, including but not limited to:
optimizing performance based on hot data characteristics; optimizing cost based on cold data characteristics; optimizing based on dynamic data characteristics; optimizing based on static data characteristics; optimizing based on retrieval access characteristics.
In a particular embodiment, granularity information refers to the statistical unit granularity of data, including but not limited to:
statistics per row of the data table; statistics per page/block of the data table; statistics per range interval of the data table, such as a time range; statistics per data sheet of the data table; statistics per partition of the data table.
In an embodiment of the present invention, when the data access request is newly added data, initial data feature statistics information corresponding to the storage end for storing the newly added data can be generated through the newly added data, where the initial data feature statistics information includes initial dimension information, initial optimization feature information, and initial granularity information, which are used to define a feature statistics management structure corresponding to the storage end for storing the newly added data; that is, each storage end has corresponding data feature statistical information.
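The feature statistics management structure described above can be pictured with a minimal sketch. The class and field names below (`Dimension`, `Granularity`, `FeatureStatistics`, `init_statistics`) are hypothetical and are not taken from the patent; they only mirror the three kinds of information (dimension, optimization feature, granularity) that the text says each storage end carries.

```python
from dataclasses import dataclass, field
from enum import Enum


class Dimension(Enum):
    DATA_CHANGE = "data change"            # dynamic vs. static data
    ACCESS_FREQUENCY = "access frequency"  # hot, warm and cold data
    QUERY_SERVICE = "query service"        # search vs. analysis data


class Granularity(Enum):
    ROW = "row"
    PAGE_OR_BLOCK = "page/block"
    RANGE = "range interval"
    PARTITION = "partition"


@dataclass
class FeatureStatistics:
    """Per-storage-end statistics; all field names are hypothetical."""
    storage_end: str
    dimension: Dimension
    optimization_feature: str
    granularity: Granularity
    # Maps a statistical unit (for example a row ID) to its tracked value.
    values: dict = field(default_factory=dict)


def init_statistics(storage_end, dimension, feature, granularity):
    """Create the initial, empty statistics when newly added data arrives."""
    return FeatureStatistics(storage_end, dimension, feature, granularity)


stats = init_statistics("engine-1", Dimension.DATA_CHANGE,
                        "dynamic data", Granularity.ROW)
stats.values[8] = "last update time of row 8"
```

Keeping one such structure per storage end matches the statement that each storage end has its own data feature statistical information.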
In an embodiment of the present invention, the step of generating the task execution request according to the target storage end and the trigger condition, and sending the task execution request to the execution end includes:
the management end obtains all data in the storage end and determines whether target data meeting the triggering condition exists or not;
when target data meeting the triggering condition exists among all the data in the storage end, the management end generates the task execution request according to the target data;
and the management end sends the task execution request to the execution end.
In an embodiment of the present invention, the trigger condition of the storage end is used to determine whether any of the data in the storage end satisfies it. The trigger condition is either periodic or an external event. A periodic trigger fires at a fixed interval or time, for example every minute, every hour, at a specific time every day, or every month. An external event generally means that accesses from the upper layer have changed the statistical information until it reaches a certain threshold; for example, if the statistics show that certain data has not been accessed for two consecutive weeks, migration of that data to a certain engine is triggered.
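The two kinds of trigger condition can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the function names `periodic_due` and `idle_units`, the batch labels and the dates are all hypothetical.

```python
from datetime import datetime, timedelta


def periodic_due(last_run, now, period):
    """Periodic trigger: fire once at least one full period has elapsed."""
    return now - last_run >= period


def idle_units(last_access, now, idle_limit):
    """External-event trigger: units not accessed for at least idle_limit."""
    return [unit for unit, seen in last_access.items()
            if now - seen >= idle_limit]


now = datetime(2012, 8, 31)
last_access = {
    "batch-A": datetime(2012, 8, 10),  # idle for 21 days
    "batch-B": datetime(2012, 8, 30),  # idle for 1 day
}
# Units idle for two consecutive weeks become candidates for migration.
to_migrate = idle_units(last_access, now, timedelta(weeks=2))
```

A periodic check such as `periodic_due` would typically decide when to scan, while a threshold check such as `idle_units` decides which data the scan selects.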
In an embodiment of the present application, the management end updates or deletes the data feature statistics corresponding to the target storage end according to the first processing instruction; the management end updates or deletes the data characteristic statistical information of the target storage end by receiving a first processing instruction returned by the execution end, wherein the updating or deleting is carried out according to the task type corresponding to the first processing instruction;
in one embodiment, when the task type is migration, the data feature statistics are updated in the target storage.
In another embodiment, when the task type is delete, the data feature statistics are deleted in the target storage.
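How the management end might apply a first processing instruction is sketched below. The instruction layout (a dict with `task_type`, `target_storage`, `units` and `value` keys) and the function name are assumptions made for illustration; the patent only states that statistics are updated for migration and deleted for deletion.

```python
def apply_first_instruction(statistics, instruction):
    """Update or delete per-storage-end statistics on the management end.

    `statistics` maps storage end -> {unit: value}. The instruction is a
    hypothetical dict with keys task_type, target_storage, units, value.
    """
    target = statistics.setdefault(instruction["target_storage"], {})
    if instruction["task_type"] == "migration":
        for unit in instruction["units"]:
            target[unit] = instruction["value"]   # record the migrated units
    elif instruction["task_type"] == "deletion":
        for unit in instruction["units"]:
            target.pop(unit, None)                # drop stale statistics
    else:
        raise ValueError(f"unknown task type: {instruction['task_type']}")
    return statistics


stats = {"engine-3": {"file-Z": "2012-09-04"}}
apply_first_instruction(stats, {"task_type": "migration",
                                "target_storage": "engine-2",
                                "units": ["batch-A"], "value": 100})
apply_first_instruction(stats, {"task_type": "deletion",
                                "target_storage": "engine-3",
                                "units": ["file-Z"], "value": None})
```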
Referring to fig. 2, a flowchart illustrating steps of a data management method based on a multidimensional feature according to an embodiment of the present application is shown;
a data management method based on multidimensional features, the method relates to a storage end, a routing end, a management end and an execution end, and the method comprises the following steps:
S210, when the execution end receives a task execution request sent by the management end, the execution end determines a task type corresponding to the task execution request according to the task execution request;
S220, the execution end acquires target data from the storage end according to the trigger condition in the task execution request;
S230, the execution end performs data processing on the target data according to the task type and the target storage end, wherein the task type comprises migration or deletion;
S240, the execution end generates a first processing instruction responding to the management end according to the target data, the target storage end and the task type, and sends the first processing instruction to the management end; the first processing instruction is used for updating or deleting the target data characteristic statistical information corresponding to the target storage end at the management end;
S250, the execution end generates a second processing instruction responding to the routing end according to the target data, the target storage end and the task type, and sends the second processing instruction to the routing end; the second processing instruction is used for adding or deleting, at the routing end, a target access path formed from the target data to the target storage end.
In an embodiment of the present invention, the step of performing data processing on the target data according to the task type and the target storage end, where the task type includes migration or deletion, includes:
When the task type is the migration, the execution end migrates the target data from the storage end to the target storage end according to the task type;
or,
and when the task type is the deletion, deleting the target data from the target storage end by the execution end according to the task type, wherein the target storage end is an initial storage end corresponding to the target data.
In an embodiment of the present invention, the executing end obtains the target data from the storage end according to the trigger condition; acquiring target data in a storage end through a triggering condition; in a specific embodiment, all data in the storage end are acquired, and target data is screened from all data through a triggering condition, wherein the triggering condition can be a preset threshold value, and data larger than the preset threshold value are determined as target data.
In an embodiment of the present invention, when the task type is the migration, the execution end migrates the target data from the storage end to the target storage end according to the task type; in a specific embodiment, when the task type is migration, all data in the storage end that reaches the threshold, i.e. the target data, is copied to the target storage end at the defined data granularity, i.e. the granularity information.
In an embodiment of the present invention, when the task type is the deletion, the execution end deletes the target data from the target storage end according to the task type, where the target storage end is the initial storage end corresponding to the target data; in a specific embodiment, when the task type is deletion, all data in the target storage end that reaches the threshold, i.e. the target data, is deleted from the target storage end at the defined data granularity, i.e. the granularity information.
In an embodiment of the present invention, the executing end generates a first processing instruction in response to the managing end according to the target data, the target storage end and the task type;
the execution end sends the first processing instruction to the management end; the first processing instruction is used for updating or deleting the corresponding data characteristic statistical information in the target storage end at the management end.
In a specific embodiment, when copying all data, namely target data, of which the storage end reaches a threshold value to a target storage end, generating a first processing instruction according to a task type and sending the first processing instruction to a management end;
In an embodiment of the present application, the executing end generates a second processing instruction in response to the routing end according to the target data, the target storage end and the task type;
the execution end sends the second processing instruction to the routing end; the second processing instruction is used for adding or deleting, at the routing end, a target access path formed from the target data to the target storage end;
in a specific embodiment, after all the data in the target storage end that reaches the threshold, i.e. the target data, has been deleted, a second processing instruction is generated according to the task type and sent to the routing end.
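Steps S210 to S250 can be sketched as a single function on the execution end. Everything named here is hypothetical: the request keys (`task_type`, `source`, `target`, `predicate`), the in-memory `stores` dict standing in for storage ends, and the shape of the two returned instructions. In line with Example 1, the sketch assumes that migration copies the data to the target and then removes it from the source.

```python
def execute_task(request, stores):
    """Run one task on the execution end and build both responses.

    `stores` maps storage end name -> {record_id: record}. The request
    keys (task_type, source, target, predicate) are hypothetical.
    """
    task_type = request["task_type"]                        # S210
    source = stores[request["source"]]
    selected = {rid: rec for rid, rec in source.items()
                if request["predicate"](rec)}               # S220
    if task_type == "migration":                            # S230
        stores[request["target"]].update(selected)          # copy to target
        for rid in selected:
            del source[rid]                                 # then drop from source
        destination = request["target"]
    elif task_type == "deletion":
        for rid in selected:
            del source[rid]
        destination = request["source"]  # deletion acts on where the data lives
    else:
        raise ValueError(f"unknown task type: {task_type}")
    ids = sorted(selected)
    first = {"task_type": task_type,
             "target_storage": destination, "ids": ids}     # S240, to management end
    second = {"action": "add" if task_type == "migration" else "delete",
              "target_storage": destination, "ids": ids}    # S250, to routing end
    return first, second


stores = {
    "engine-1": {1: {"salary": 30000},
                 8: {"salary": 10000, "updated": True}},
    "engine-2": {},
}
first, second = execute_task(
    {"task_type": "migration", "source": "engine-1", "target": "engine-2",
     "predicate": lambda rec: not rec.get("updated", False)},
    stores,
)
```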
Referring to fig. 3, a flowchart illustrating steps of a data management method based on a multidimensional feature according to an embodiment of the present application is shown;
a data management method based on multidimensional features, the method relates to a storage end, a routing end, a management end and an execution end, and the method comprises the following steps:
S310, the routing end determines an access path in response to data access information for upper layer data;
S320, the routing end generates the data access request according to the access path and sends the data access request to the management end;
S330, the routing end adds or deletes the target access path from the target storage end according to the second processing instruction sent by the execution end.
In an embodiment of the present invention, the routing end is configured to determine which storage end stores the data, and can transparently route an upper layer data access request to the corresponding storage end, where newly added data is routed to the storage end; in one embodiment, the storage end includes an initial storage end and a plurality of target storage ends, and newly added data is routed to the initial storage end.
In an embodiment of the present invention, the routing end determines an access path in response to data access information for upper layer data; the access path indicates which storage ends need to be accessed, and the routing end determines which storage end to access according to the data to be accessed named in the data access information. For example, newly added data is first placed in the initial storage end, the route information, i.e. the access path, is recorded in the initial storage end, and the route information is updated when the data is migrated between storage ends; for subsequent accesses, the routing end indicates which storage end holds the data to be accessed.
In one embodiment of the invention, a data access request refers to access to data, generally including inserting, deleting, modifying and querying data (generally expressed as DML in a database) and changing data definitions (generally expressed as DDL in a database); a corresponding target request is generated according to the insertion, deletion, modification, query or definition change.
In an embodiment of the present invention, the routing end generates the data access request according to the access path, and sends the data access request to the management end.
In an embodiment of the present invention, the routing end adds or deletes a target access path from the target storage end according to a second processing instruction sent by the execution end; in a specific embodiment, when the task type is migration, an access path from the target data to the target storage end is added; in another embodiment, when the task type is deletion, all access paths corresponding to the target data are deleted.
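The behaviour of the routing end can be sketched as a small route table. The class `RoutingEnd`, its method names and the instruction layout (`action`, `target_storage`, `ids`) are hypothetical; the sketch only assumes, as the text states, that newly added data is routed to the initial storage end and that paths are added on migration and removed on deletion.

```python
class RoutingEnd:
    """Keeps record_id -> storage end; new data goes to the initial end."""

    def __init__(self, initial_storage):
        self.initial_storage = initial_storage
        self.routes = {}

    def route_insert(self, record_id):
        """Route newly added data to the initial storage end."""
        self.routes[record_id] = self.initial_storage
        return self.initial_storage

    def lookup(self, record_id):
        """Return the storage end holding the record, or None if no path."""
        return self.routes.get(record_id)

    def apply_second_instruction(self, instruction):
        """Add or delete target access paths as the execution end directs."""
        for record_id in instruction["ids"]:
            if instruction["action"] == "add":
                self.routes[record_id] = instruction["target_storage"]
            elif instruction["action"] == "delete":
                self.routes.pop(record_id, None)


router = RoutingEnd("engine-1")
for rid in range(1, 11):
    router.route_insert(rid)
# Migration of every record except ID 8 to engine-2.
router.apply_second_instruction({"action": "add", "target_storage": "engine-2",
                                 "ids": [1, 2, 3, 4, 5, 6, 7, 9, 10]})
# Deletion of record 1 removes its access path entirely.
router.apply_second_instruction({"action": "delete",
                                 "target_storage": "engine-2", "ids": [1]})
```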
Example 1
The following uses storage engines (storage ends) optimized for 3 different dimensional features as an illustration; the approach can easily be extended to a larger number of storage engines, and although only 3 dimensions are used here, it can likewise be extended to more dimensions or adapted accordingly.
The following three storage engines (storage ends) are added, with these attributes:
storage engine 1: data change dimension, dynamic data characteristic, record-level granularity, trigger threshold of 30 days without change, data task is migration, target storage end is storage engine 2;
storage engine 2: access frequency dimension, hot data characteristic, data batch granularity, trigger threshold of 20 degrees (indicating low access frequency), decay of 1 degree per hour, increment of 10 degrees per access, maximum of 720 degrees (this is simplified; the decay could instead follow a more complex model such as Newton's law of cooling), data task is migration, target storage end is storage engine 3;
storage engine 3: access time dimension, heating data characteristic, data file granularity, trigger threshold of 10 years of age, data task is deletion, no target storage end.
Establishing a data structure of the latest update time of each updated record for the storage engine 1 as data characteristic statistical information of the storage engine 1;
establishing a data structure of access temperature of each batch of data for the storage engine 2 as data characteristic statistical information of the storage engine 2;
a data structure of creation time of each data file is established for the storage engine 3 as data characteristic statistical information of the storage engine 3;
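The access temperature kept for storage engine 2 can be sketched with the parameters listed for that engine: 10 degrees added per access, 1 degree lost per hour, a maximum of 720 degrees and a trigger threshold of 20 degrees. The function names are hypothetical, and flooring the temperature at 0 is an assumption that the text does not state.

```python
INCREMENT_PER_ACCESS = 10   # degrees added per access
DECAY_PER_HOUR = 1          # degrees lost per elapsed hour
MAX_TEMPERATURE = 720       # upper bound on the temperature
TRIGGER_THRESHOLD = 20      # at or below this, migration is triggered


def on_access(temperature, accesses=1):
    """Raise the temperature for each access, capped at the maximum."""
    return min(MAX_TEMPERATURE, temperature + INCREMENT_PER_ACCESS * accesses)


def on_decay(temperature, hours):
    """Lower the temperature linearly with elapsed hours (floor of 0 assumed)."""
    return max(0, temperature - DECAY_PER_HOUR * hours)


def should_migrate(temperature):
    """True once the batch has cooled down to the trigger threshold."""
    return temperature <= TRIGGER_THRESHOLD


queried = on_access(100, accesses=10)   # a batch at 100 queried 10 times
cooled = on_decay(100, hours=80)        # a batch at 100 left idle for 80 hours
```

A linear decay keeps the statistics cheap to update; a model such as Newton's law of cooling would replace only the body of `on_decay`.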
Suppose that 10 records are inserted on August 1, 2012:
ID NAME SALARY
1 LIU 30000
2 LI 9000
3 WANG 20000
4 ZHAO 10000
5 HAN 500000
6 FAN 4000
7 GUO 60000
8 MAO 7000
9 CAI 40000
10 GAO 12000
initially, all these 10 pieces of data are inserted into the storage engine 1, i.e. the storage side, and the corresponding routing information is updated.
The record with ID 8 is updated in August 2012, for example its salary changes from 7000 to 10000; at this time, an entry for ID 8 with its update time in August 2012 is added to the data feature statistical information of storage engine 1, and the other records are not updated through August 31, 2012.
At this time, the periodic task triggers the data task of storage engine 1: all records except the one with ID 8 are migrated to storage engine 2 and deleted from storage engine 1, and the routing information is updated so that the routes for IDs 1 to 7 and 9 to 10 point to storage engine 2. IDs 1 to 5 form data batch A and IDs 6, 7, 9 and 10 form data batch B, and the temperature of data batch A and data batch B is initialized to 100.
If the user now accesses the data, a request for the record with ID 8 is automatically routed to storage engine 1, and requests for the other records go to storage engine 2; assuming data batch B is queried 10 times on August 31, 2012, the temperature in the data feature statistics of data batch B becomes 200 degrees, while the temperature of data batch A remains 100 degrees.
From the 31st onward, the temperature decays by 10 degrees every day; when the temperature of a batch drops to 20 degrees, a data migration task is triggered: data batch A is migrated from storage engine 2 to storage engine 3 and the routes are updated, i.e. the routes for IDs 1 to 5 now point to storage engine 3. Storage engine 3 stores IDs 1 to 5 together as data file Z, and the feature statistics are updated to record September 4, 2012 as the creation time of data file Z.
At this time, if the user accesses the data, it is routed automatically: the record with ID 8 is in storage engine 1 (assuming that ID 8 continues to be updated during this period), the records with IDs 6, 7, 9 and 10 are in storage engine 2 (assuming that batch B continues to be queried during this period), and the records with IDs 1 to 5 are in storage engine 3.
Finally, on September 4, 2022, data file Z reaches the threshold condition of 10 years and triggers the data deletion task: it is deleted directly from storage engine 3, and the related routing information and feature statistics are deleted as well. If the user then accesses the records with IDs 1 to 5, they no longer exist.
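The age check that closes this example can be sketched as follows. The helper name `reached_age` is hypothetical, and the handling of a February 29 creation date is an assumption added so that the function is well defined for every date.

```python
from datetime import date


def reached_age(created, today, years):
    """True once `years` full calendar years have passed since `created`."""
    try:
        due = created.replace(year=created.year + years)
    except ValueError:
        # Created on Feb 29 and the due year is not a leap year (assumption:
        # treat Feb 28 of that year as the due date).
        due = created.replace(year=created.year + years, day=28)
    return today >= due


file_z_created = date(2012, 9, 4)
delete_now = reached_age(file_z_created, date(2022, 9, 4), years=10)
too_early = reached_age(file_z_created, date(2022, 9, 3), years=10)
```

With the creation time of data file Z recorded as September 4, 2012, the check first succeeds on September 4, 2022, which is the date on which the example says the deletion task is triggered.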
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.
Referring to fig. 4, a block diagram of a data management device based on multidimensional features according to an embodiment of the present application is shown;
a data management apparatus based on multidimensional features, the apparatus involving a storage side, a routing side, a management side, and an execution side, the apparatus comprising:
a receiving module 410, configured to receive, by the management end, a data access request sent by the routing end, and generate initial data feature statistics information and a triggering condition corresponding to the storage end according to the data access request;
the first generating module 420 is configured to determine, by the management end, a target storage end according to the trigger condition, generate the task execution request according to the target storage end and the trigger condition, send the task execution request to the execution end, and receive a first processing instruction sent by the execution end;
the first data processing module 430 is configured to update or delete the target data feature statistics corresponding to the target storage according to the first processing instruction by the management end.
In one embodiment of the present application, the receiving module 410 includes:
the first determining submodule is used for determining access data in the storage end by the management end according to the data access request;
The second determining submodule is used for determining data granularity information corresponding to the access data in the storage end according to the access data by the management end;
and the first updating sub-module is used for generating initial data characteristic statistical information corresponding to the storage end and the triggering condition by the management end according to the data granularity information.
In an embodiment of the present application, the first generating module 420 includes:
a third determining submodule, configured to obtain, by the management end, all the data in the storage end and determine whether there is target data that meets the triggering condition;
the first generation sub-module is used for generating the task execution request according to the target data by the management end when target data meeting the trigger condition exists among all the data in the storage end;
and the first sending submodule is used for sending the task execution request to the execution end by the management end.
Referring to fig. 5, a block diagram of a data management device based on multidimensional features according to an embodiment of the present application is shown;
a data management apparatus based on multidimensional features, the apparatus involving a storage side, a routing side, a management side, and an execution side, the apparatus comprising:
A first determining module 510, configured to, when the executing end receives a task execution request sent by the management end, determine, by the executing end according to the task execution request, a task type corresponding to the task execution request;
the acquiring module 520 is configured to acquire target data from the storage terminal according to a trigger condition in the task execution request by the execution terminal;
the second data processing module 530 is configured to perform data processing on the target data by the execution end according to the task type and the target storage end, where the task type includes migration or deletion;
a third data processing module 540, configured to generate a first processing instruction according to the target data, the target storage end, and the task type by the execution end, and send the first processing instruction to the management end; the first processing instruction is used for updating or deleting the target data characteristic statistical information corresponding to the target storage end at the management end;
a fourth data processing module 550, configured to generate a second processing instruction according to the target data, the target storage end, and the task type by using the execution end, and send the second processing instruction to the routing end; the second processing instruction is used for adding or deleting, at the routing end, a target access path formed from the target data to the target storage end.
In one embodiment of the present application, the second data processing module 530 includes:
the first execution sub-module is used for the execution end to migrate the target data from the storage end to the target storage end according to the task type when the task type is the migration;
or,
a second execution sub-module, configured to delete the target data from the target storage end according to the task type when the task type is the deletion, where the target storage end is the initial storage end corresponding to the target data.
Referring to fig. 6, a block diagram of a data management device based on multidimensional features according to an embodiment of the present application is shown;
a data management apparatus based on multidimensional features, the apparatus involving a storage side, a routing side, a management side, and an execution side, the apparatus comprising:
a second determining module 610, configured to determine an access path by the routing end in response to the data access information for the upper layer data;
a second generating module 620, configured to generate the data access request according to the access path by the routing end, and send the data access request to the management end;
And a fifth data processing module 630, configured to add or delete a target access path from the target storage according to the second processing instruction sent by the execution end by the routing end.
Referring to fig. 7, a computer device illustrating a data management method based on a multi-dimensional feature according to the present invention may specifically include the following:
the computer device 12 described above is embodied in the form of a general purpose computing device, and the components of the computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, a bus 18 that connects the various system components, including the system memory 28 and the processing units 16.
Bus 18 represents one or more of several types of bus 18 structures, including a memory bus 18 or memory controller, a peripheral bus 18, an accelerated graphics port, a processor, or a local bus 18 using any of a variety of bus 18 architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus 18, Micro Channel Architecture (MCA) bus 18, enhanced ISA bus 18, Video Electronics Standards Association (VESA) local bus 18, and Peripheral Component Interconnect (PCI) bus 18.
Computer device 12 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only, storage system 34 may be used to read from or write to non-removable, nonvolatile magnetic media (commonly referred to as a "hard disk drive"). Although not shown in fig. 7, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk such as a CD-ROM, DVD-ROM, or other optical media may be provided. In such cases, each drive may be coupled to bus 18 through one or more data medium interfaces. The memory may include at least one program product having a set (e.g., at least one) of program modules 42, the program modules 42 being configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored in, for example, a memory, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules 42, and program data, each or some combination of which may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methods of the embodiments described herein.
The computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, camera, etc.), one or more devices that enable an operator to interact with the computer device 12, and/or any devices (e.g., network card, modem, etc.) that enable the computer device 12 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 22. Moreover, computer device 12 may also communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet, through network adapter 20. As shown, network adapter 20 communicates with other modules of computer device 12 via bus 18. It should be appreciated that although not shown in fig. 7, other hardware and/or software modules may be used in connection with computer device 12, including, but not limited to: microcode, device drivers, redundant processing units 16, external disk drive arrays, RAID systems, tape drives, data backup storage systems 34, and the like.
The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example, implementing a multi-dimensional feature-based data management method provided by an embodiment of the present application.
That is, the processing unit 16 realizes when executing the program: the management terminal receives the data access request sent by the routing terminal and generates initial data characteristic statistical information and triggering conditions corresponding to the storage terminal according to the data access request; the management end determines a target storage end according to the trigger condition, generates the task execution request according to the target storage end and the trigger condition, sends the task execution request to the execution end, and receives a first processing instruction sent by the execution end; and the management end updates or deletes the target data characteristic statistical information corresponding to the target storage end according to the first processing instruction.
In an embodiment of the present application, the present application further provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a multi-dimensional feature-based data management method as provided in all embodiments of the present application:
That is, the program is implemented when executed by a processor: the management terminal receives the data access request sent by the routing terminal and generates initial data characteristic statistical information and triggering conditions corresponding to the storage terminal according to the data access request; the management end determines a target storage end according to the trigger condition, generates the task execution request according to the target storage end and the trigger condition, sends the task execution request to the execution end, and receives a first processing instruction sent by the execution end; and the management end updates or deletes the target data characteristic statistical information corresponding to the target storage end according to the first processing instruction.
Any combination of one or more computer readable media may be employed. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the operator's computer, partly on the operator's computer, as a stand-alone software package, partly on the operator's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the operator computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (e.g., connected through the internet using an internet service provider). In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the application.
Finally, it should also be noted that relational terms such as first and second are used herein solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or terminal device comprising the element.
The data management method and device based on multidimensional features provided by the present application have been described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the present application, and the description of the above embodiments is intended only to help in understanding the method and its core idea. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present application.

Claims (10)

1. A data management method based on multi-dimensional features, characterized in that the method involves a storage end, a routing end, a management end and an execution end, wherein the multi-dimensional features comprise at least any two of a data change dimension, a data access frequency dimension and a data query service dimension, the storage end comprises an initial storage end and a plurality of target storage ends, and each storage end has corresponding data characteristic statistical information, the method comprising the following steps:
the management end receives a data access request sent by the routing end, and generates initial data characteristic statistical information and a trigger condition corresponding to the storage end according to the data access request, wherein the initial data characteristic statistical information comprises initial dimension information, and the characteristics of the data are classified and defined through the initial dimension information;
the management end determines a target storage end according to the trigger condition, generates a task execution request according to the target storage end and the trigger condition, sends the task execution request to the execution end, and receives a first processing instruction sent by the execution end, wherein the execution end is used for determining a task type corresponding to the task execution request according to the task execution request, acquiring target data from the storage end according to the trigger condition in the task execution request, performing data processing on the target data according to the task type and the target storage end, the task type comprising migration or deletion, and generating, according to the target data, the target storage end and the task type, the first processing instruction responding to the management end;
and the management end updates or deletes the target data characteristic statistical information corresponding to the target storage end according to the first processing instruction.
2. The method of claim 1, wherein the step of generating initial data characteristic statistics and triggering conditions corresponding to the storage side in accordance with the data access request comprises:
The management end determines access data in the storage end according to the data access request;
the management end determines data granularity information corresponding to the access data in the storage end according to the access data;
and the management end generates initial data characteristic statistical information and the triggering condition corresponding to the storage end according to the data granularity information.
3. The method according to claim 1, wherein the step of generating the task execution request according to the target storage side and the trigger condition, and transmitting the task execution request to the execution side includes:
the management end acquires all data in the storage end and determines whether target data meeting the triggering condition exists or not;
when target data meeting the trigger condition exists among all the data in the storage end, the management end generates the task execution request according to the target data;
and the management end sends the task execution request to the execution end.
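The management-end steps of claims 1 to 3 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the classes `FeatureStats`, `TaskExecutionRequest` and `ManagementEnd`, the `cold_threshold` parameter, and the particular trigger condition ("rarely accessed and never changed"). The sketch tracks only two of the claimed dimensions (access frequency and data change) and always emits a migration task.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class FeatureStats:
    """Per-key statistics over two dimensions (hypothetical fields)."""
    access_count: int = 0   # data access frequency dimension
    change_count: int = 0   # data change dimension


@dataclass
class TaskExecutionRequest:
    """Request handed to the execution end (hypothetical shape)."""
    task_type: str          # "migrate" or "delete"
    target_store: str       # name of the target storage end
    keys: List[str]         # target data that met the trigger condition


@dataclass
class ManagementEnd:
    # Assumption: fewer accesses than this threshold counts as "cold" data.
    cold_threshold: int = 2
    stats: Dict[str, FeatureStats] = field(default_factory=dict)

    def on_access_request(self, key: str, is_write: bool = False) -> None:
        """Record one data access request forwarded by the routing end."""
        entry = self.stats.setdefault(key, FeatureStats())
        entry.access_count += 1
        if is_write:
            entry.change_count += 1

    def trigger_condition(self) -> Callable[[FeatureStats], bool]:
        """Return the trigger condition as a predicate over the statistics."""
        return lambda s: s.access_count < self.cold_threshold and s.change_count == 0

    def build_task(self, target_store: str) -> Optional[TaskExecutionRequest]:
        """Scan all tracked data; build a request only if some data matches."""
        condition = self.trigger_condition()
        keys = sorted(k for k, s in self.stats.items() if condition(s))
        if not keys:
            return None
        return TaskExecutionRequest("migrate", target_store, keys)
```

In this sketch a key that was read once and never written would be selected for migration, while a frequently read key or a modified key would stay in place. A real system would derive the threshold and the task type from the data granularity information rather than hard-coding them.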
4. A data management method based on multi-dimensional features, characterized in that the method involves a storage end, a routing end, a management end and an execution end, wherein the multi-dimensional features comprise at least any two of a data change dimension, a data access frequency dimension and a data query service dimension, the storage end comprises an initial storage end and a plurality of target storage ends, and each storage end has corresponding data characteristic statistical information, the method comprising the following steps:
When the execution end receives a task execution request sent by the management end, the execution end determines a task type corresponding to the task execution request according to the task execution request, wherein the management end is used for receiving a data access request sent by the routing end and generating initial data characteristic statistical information and triggering conditions corresponding to the storage end according to the data access request, the initial data characteristic statistical information comprises initial dimension information, and the characteristics of data are classified and defined through the initial dimension information; the management end determines a target storage end according to the triggering condition and generates a task execution request according to the target storage end and the triggering condition;
the execution end obtains target data from the storage end according to the trigger condition in the task execution request;
the execution end carries out data processing on the target data according to the task type and the target storage end, wherein the task type comprises migration or deletion;
the execution end generates a first processing instruction responding to the management end according to the target data, the target storage end and the task type, and sends the first processing instruction to the management end; the first processing instruction is used for updating or deleting the target data characteristic statistical information corresponding to the target storage end at the management end;
the execution end generates a second processing instruction responding to the routing end according to the target data, the target storage end and the task type, and sends the second processing instruction to the routing end; the second processing instruction is used for adding or deleting, at the routing end, a target access path from the target data to the target storage end.
5. The method of claim 4, wherein the step of performing data processing on the target data according to the task type and the target storage end, the task type comprising migration or deletion, comprises:
when the task type is the migration, the execution end migrates the target data from the storage end to the target storage end according to the task type;
or,
and when the task type is the deletion, deleting the target data from the target storage end by the execution end according to the task type, wherein the target storage end is an initial storage end corresponding to the target data.
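The execution-end behavior of claims 4 and 5 can be sketched as a single function. This is an illustrative sketch under stated assumptions, not the patented implementation: the function name `execute_task`, the in-memory `Stores` dictionary standing in for the storage ends, and the dictionary layout of the two processing instructions are all hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: each storage end is modeled as an in-memory key/value mapping.
Stores = Dict[str, Dict[str, bytes]]


def execute_task(stores: Stores, task_type: str, source: str, target: str,
                 keys: List[str]) -> Tuple[dict, dict]:
    """Apply a migrate or delete task and return (first, second) instructions.

    first  -> sent to the management end (update or delete statistics)
    second -> sent to the routing end (add or delete access paths)
    """
    if task_type not in ("migrate", "delete"):
        raise ValueError(f"unsupported task type: {task_type}")
    migrate = task_type == "migrate"

    done: List[str] = []
    for key in keys:
        if migrate:
            # Migration: move the data from the source store to the target store.
            if key not in stores[source]:
                continue
            stores[target][key] = stores[source].pop(key)
        else:
            # Deletion: per claim 5, the target store is the store that
            # currently holds the data, so the data is removed from it.
            if key not in stores[target]:
                continue
            del stores[target][key]
        done.append(key)

    first = {"to": "management", "op": "update" if migrate else "delete",
             "store": target, "keys": done}
    second = {"to": "routing", "op": "add" if migrate else "delete",
              "store": target, "keys": done}
    return first, second
```

Only keys that were actually processed appear in the returned instructions, so the management end and the routing end would not record statistics or paths for data that was missing at execution time. That choice is a design assumption of this sketch.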
6. A data management method based on multi-dimensional features, characterized in that the method involves a storage end, a routing end, a management end and an execution end, wherein the multi-dimensional features comprise at least any two of a data change dimension, a data access frequency dimension and a data query service dimension, the storage end comprises an initial storage end and a plurality of target storage ends, and each storage end has corresponding data characteristic statistical information, the method comprising the following steps:
the routing end determines an access path in response to data access information for upper layer data;
the routing end generates the data access request according to the access path and sends the data access request to the management end, wherein the management end is used for receiving the data access request sent by the routing end and generating initial data characteristic statistical information and triggering conditions corresponding to the storage end according to the data access request, the initial data characteristic statistical information comprises initial dimension information, and the characteristics of the data are classified and defined through the initial dimension information; the management end determines a target storage end according to the trigger condition, generates a task execution request according to the target storage end and the trigger condition, sends the task execution request to the execution end, and receives a first processing instruction sent by the execution end, wherein the execution end is used for determining a task type corresponding to the task execution request according to the task execution request and acquiring target data in the storage end according to the trigger condition in the task execution request, the execution end carries out data processing on the target data according to the task type and the target storage end, the task type comprises migration or deletion, and the execution end generates a first processing instruction responding to the management end according to the target data, the target storage end and the task type;
the routing end adds or deletes a target access path according to a second processing instruction sent by the execution end, wherein the execution end is used for generating the second processing instruction responding to the routing end according to the target data, the target storage end and the task type, and sending the second processing instruction to the routing end, and the second processing instruction is used for adding or deleting, at the routing end, the target access path from the target data to the target storage end.
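The routing-end steps of claim 6 can be sketched as a small path table. The class `RoutingEnd`, its `outbox` list (standing in for requests sent to the management end), and the instruction dictionary with `op`, `store` and `keys` fields are hypothetical names chosen for illustration; the claim does not specify any of them.

```python
from typing import Dict, List


class RoutingEnd:
    """Resolves access paths and applies path updates (illustrative only)."""

    def __init__(self, default_store: str) -> None:
        # Assumption: data without an explicit path lives in the initial store.
        self.default_store = default_store
        self.paths: Dict[str, str] = {}   # data key -> storage end name
        self.outbox: List[dict] = []      # data access requests for the management end

    def access(self, key: str) -> str:
        """Determine the access path and queue a data access request."""
        store = self.paths.get(key, self.default_store)
        self.outbox.append({"key": key, "path": store})
        return store

    def apply_second_instruction(self, instruction: dict) -> None:
        """Add or delete target access paths as the execution end instructs."""
        op = instruction["op"]
        if op not in ("add", "delete"):
            raise ValueError(f"unsupported path operation: {op}")
        for key in instruction["keys"]:
            if op == "add":
                self.paths[key] = instruction["store"]
            else:
                self.paths.pop(key, None)
```

After an "add" instruction, later accesses to the migrated key resolve to the target storage end; after a "delete" instruction the explicit path is dropped, and in this sketch the key falls back to the default store.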
7. A data management device based on a multi-dimensional feature, wherein the device relates to a storage end, a routing end, a management end and an execution end, the multi-dimensional feature at least comprises any two of a data change dimension, a data access frequency dimension and a data query service dimension, the storage end comprises an initial storage end and a plurality of target storage ends, each storage end has data feature statistical information corresponding to the storage end, and the device comprises:
the receiving module is used for the management end to receive the data access request sent by the routing end and to generate initial data characteristic statistical information and triggering conditions corresponding to the storage end according to the data access request, wherein the initial data characteristic statistical information comprises initial dimension information, and the characteristics of the data are classified and defined through the initial dimension information;
The first generation module is used for determining a target storage end by the management end according to the trigger condition, generating a task execution request according to the target storage end and the trigger condition, sending the task execution request to the execution end, and receiving a first processing instruction sent by the execution end, wherein the execution end is used for determining a task type corresponding to the task execution request according to the task execution request and acquiring target data in the storage end according to the trigger condition in the task execution request, the execution end carries out data processing on the target data according to the task type and the target storage end, the task type comprises migration or deletion, and generating a first processing instruction responding to the management end according to the target data, the target storage end and the task type;
and the first data processing module is used for updating or deleting the target data characteristic statistical information corresponding to the target storage end by the management end according to the first processing instruction.
8. A data management device based on a multi-dimensional feature, wherein the device relates to a storage end, a routing end, a management end and an execution end, the multi-dimensional feature at least comprises any two of a data change dimension, a data access frequency dimension and a data query service dimension, the storage end comprises an initial storage end and a plurality of target storage ends, each storage end has data feature statistical information corresponding to the storage end, and the device comprises:
The first determining module is used for determining a task type corresponding to the task execution request according to the task execution request when the execution end receives the task execution request sent by the management end, wherein the management end is used for receiving a data access request sent by the routing end and generating initial data characteristic statistical information and triggering conditions corresponding to the storage end according to the data access request, the initial data characteristic statistical information comprises initial dimension information, and the characteristics of data are classified and defined through the initial dimension information; the management end determines a target storage end according to the triggering condition and generates a task execution request according to the target storage end and the triggering condition;
the acquisition module is used for the execution end to acquire target data from the storage end according to the trigger condition in the task execution request;
the second data processing module is used for the execution end to perform data processing on the target data according to the task type and the target storage end, wherein the task type comprises migration or deletion;
the third data processing module is used for generating a first processing instruction responding to the management end by the execution end according to the target data, the target storage end and the task type, and sending the first processing instruction to the management end; the first processing instruction is used for updating or deleting the target data characteristic statistical information corresponding to the target storage end at the management end;
the fourth data processing module is used for the execution end to generate a second processing instruction responding to the routing end according to the target data, the target storage end and the task type, and to send the second processing instruction to the routing end; the second processing instruction is used for adding or deleting, at the routing end, a target access path from the target data to the target storage end.
9. A data management device based on a multi-dimensional feature, wherein the device relates to a storage end, a routing end, a management end and an execution end, the multi-dimensional feature at least comprises any two of a data change dimension, a data access frequency dimension and a data query service dimension, the storage end comprises an initial storage end and a plurality of target storage ends, each storage end has data feature statistical information corresponding to the storage end, and the device comprises:
the second determining module is used for determining an access path by the routing end in response to the data access information aiming at the upper layer data;
the second generation module is used for generating the data access request by the routing end according to the access path and sending the data access request to the management end, wherein the management end is used for receiving the data access request sent by the routing end and generating initial data characteristic statistical information and triggering conditions corresponding to the storage end according to the data access request, the initial data characteristic statistical information comprises initial dimension information, and the characteristics of the data are classified and defined through the initial dimension information; the management end determines a target storage end according to the trigger condition, generates a task execution request according to the target storage end and the trigger condition, sends the task execution request to the execution end, and receives a first processing instruction sent by the execution end, wherein the execution end is used for determining a task type corresponding to the task execution request according to the task execution request and acquiring target data in the storage end according to the trigger condition in the task execution request, the execution end carries out data processing on the target data according to the task type and the target storage end, the task type comprises migration or deletion, and generates a first processing instruction responding to the management end according to the target data, the target storage end and the task type;
and the fifth data processing module is used for the routing end to add or delete the target access path according to the second processing instruction sent by the execution end, wherein the execution end is used for generating the second processing instruction responding to the routing end according to the target data, the target storage end and the task type, and sending the second processing instruction to the routing end, and the second processing instruction is used for adding or deleting, at the routing end, the target access path from the target data to the target storage end.
10. A computer device comprising a processor, a memory and a computer program stored on the memory and capable of running on the processor, which computer program, when executed by the processor, implements the method of any one of claims 1 to 6.
CN202211475101.5A 2022-11-23 2022-11-23 Data management method and device based on multidimensional features Active CN115718571B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211475101.5A CN115718571B (en) 2022-11-23 2022-11-23 Data management method and device based on multidimensional features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211475101.5A CN115718571B (en) 2022-11-23 2022-11-23 Data management method and device based on multidimensional features

Publications (2)

Publication Number Publication Date
CN115718571A CN115718571A (en) 2023-02-28
CN115718571B true CN115718571B (en) 2023-08-22

Family

ID=85256090

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211475101.5A Active CN115718571B (en) 2022-11-23 2022-11-23 Data management method and device based on multidimensional features

Country Status (1)

Country Link
CN (1) CN115718571B (en)

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7949637B1 (en) * 2007-06-27 2011-05-24 Emc Corporation Storage management for fine grained tiered storage with thin provisioning
EP2811409A1 (en) * 2012-05-07 2014-12-10 Hitachi, Ltd. Computer system, storage management computer, and storage management method
WO2017024802A1 (en) * 2015-08-07 2017-02-16 中兴通讯股份有限公司 System having multiple coexisting storage media, file operation method and device, and computer storage medium
CN107193638A (en) * 2017-05-30 2017-09-22 南京邮电大学 A kind of quick self-adapted moving method of network function perceived based on multi-dimensional environment
WO2018059032A1 (en) * 2016-09-30 2018-04-05 华为技术有限公司 Data migration method for virtual node, and virtual node
CN108959525A (en) * 2018-06-28 2018-12-07 郑州云海信息技术有限公司 A kind of cold and hot data visualization method, system, equipment and computer storage medium
US10657154B1 (en) * 2017-08-01 2020-05-19 Amazon Technologies, Inc. Providing access to data within a migrating data partition
CN111930489A (en) * 2020-09-15 2020-11-13 南京领行科技股份有限公司 Task scheduling method, device, equipment and storage medium
CN112231299A (en) * 2020-09-27 2021-01-15 中国建设银行股份有限公司 Method and device for dynamically adjusting feature library
CN113296696A (en) * 2021-03-02 2021-08-24 阿里巴巴新加坡控股有限公司 Data access method, computing device and storage medium
CN114003555A (en) * 2021-10-29 2022-02-01 泰康保险集团股份有限公司 Data management method, system, computer equipment and storage medium
CN114222990A (en) * 2019-08-13 2022-03-22 国际商业机器公司 Automatic ransomware detection with on-demand file system locking and automatic repair
CN114415965A (en) * 2022-01-25 2022-04-29 中国农业银行股份有限公司 Data migration method, device, equipment and storage medium
CN114584618A (en) * 2020-12-01 2022-06-03 北京金山云网络技术有限公司 Information interaction method, device, equipment, storage medium and system
CN114912815A (en) * 2022-05-24 2022-08-16 湖南铁路联创技术发展有限公司 Index automatic definition method, system and storage medium based on big data wide table
CN115291806A (en) * 2022-08-11 2022-11-04 北京青云科技股份有限公司 Processing method, processing device, electronic equipment and storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8001327B2 (en) * 2007-01-19 2011-08-16 Hitachi, Ltd. Method and apparatus for managing placement of data in a tiered storage system
US8898124B2 (en) * 2010-12-16 2014-11-25 International Business Machines Corporation Controlling database trigger execution with trigger return data
US9715353B2 (en) * 2014-09-16 2017-07-25 International Business Machines Corporation Data set management
US11138160B2 (en) * 2016-07-13 2021-10-05 International Business Machines Corporation Application performance using multidimensional predictive algorithm for automated tiering mechanisms
US10168915B2 (en) * 2017-01-19 2019-01-01 International Business Machines Corporation Workload performance in a multi-tier storage environment
US11327988B2 (en) * 2019-06-20 2022-05-10 International Business Machines Corporation Dynamically self-indexing database-management system
CN110569233A (en) * 2019-06-30 2019-12-13 华为技术有限公司 Hot data management method, device and system
US11704279B2 (en) * 2019-10-23 2023-07-18 International Business Machines Corporation Event driven migration, recall, and data placement optimization based on custom metadata

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Fan Changjun; Hu Zhicheng; Yang Jiadong. Multipath storage system based on ALUA. Computer Systems & Applications. 2017, (No. 10), full text. *

Also Published As

Publication number Publication date
CN115718571A (en) 2023-02-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant