CN114595291B - Collection task adjusting method and device based on database annotation - Google Patents

Info

Publication number
CN114595291B
Authority
CN
China
Prior art keywords
collection
data
task
annotation
field
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210500164.5A
Other languages
Chinese (zh)
Other versions
CN114595291A (en
Inventor
沈瑶
任通
毛云青
叶海涛
齐韬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CCI China Co Ltd
Original Assignee
CCI China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CCI China Co Ltd
Priority to CN202210500164.5A
Publication of CN114595291A
Application granted
Publication of CN114595291B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24573 Query processing with adaptation to user needs using data annotations, e.g. user-defined metadata
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The application provides a method and a device for adjusting a collection task based on database annotation. The method comprises: establishing, on a data collection platform, at least one collection task for different collection data; modifying metadata of the collection data and/or adding collection annotation information in a database according to the collection task adjustment condition of the collection data, wherein the metadata records interpretation data that explains the collection data, and the collection annotation information records task adjustment information corresponding to the collection data; and synchronously acquiring, by the data collection platform, the collection data and the corresponding metadata and/or collection annotation information, and, if a change in the metadata or the presence of collection annotation information is detected, associating the collection task corresponding to the collection data and adjusting the collection task based on the metadata and/or the collection annotation information. The collection task adjustment condition of the collection data can thus be obtained quickly from the metadata and/or the collection annotation information, which improves the efficiency of adjusting data collection tasks.

Description

Collection task adjusting method and device based on database annotation
Technical Field
The application relates to the field of big data processing, in particular to a collection task adjusting method and device based on database annotation.
Background
Data collection refers to collecting and integrating data from different sources into the same database. In the early stage of data collection work, data implementation personnel establish collection tasks according to the data information to be collected. However, because the data information is incomplete and the data fields are unreasonably designed at that stage, the metadata and the collection mode of the data tables need to be readjusted in the database during later operation, and after the metadata is adjusted in the database, administrators must manually adjust the related collection tasks.
That is, once a collection task exists, modifying it is a separate step that can only be performed manually by administrators on the platform, which adds to their implementation workload.
Disclosure of Invention
The embodiment of the application provides a method and a device for adjusting a collection task based on database annotation, which are used for quickly adjusting configuration information of an established collection task by adding annotation when metadata is adjusted in a database, so that the operation flow of data implementers is reduced, and the collection efficiency is increased.
In a first aspect, an embodiment of the present application provides a collection task adjustment method based on database annotations, where the method includes: establishing at least one collection task aiming at different collection data on a data collection platform, wherein the configuration information of each collection task at least comprises a collection data source, a collection data target and task information;
modifying metadata of the collected data and/or adding collection annotation information based on a collection task adjustment condition of the collected data in a database, wherein the metadata records interpretation data for interpreting the collected data, the collection annotation information records task adjustment information corresponding to the collected data, and the task adjustment information is used for adjusting the task information of the collection task;
the data collection platform synchronously obtains the collection data and corresponding metadata and/or collection annotation information, if the metadata is detected to be changed or the collection annotation information is detected, the collection task corresponding to the collection data is associated, and the collection task is adjusted based on the metadata and/or the collection annotation information.
In a second aspect, an embodiment of the present application provides a collection task adjusting apparatus based on database annotations, including:
the task establishing unit is used for establishing at least one collection task aiming at different collection data on a data collection platform, wherein the configuration information of each collection task at least comprises a collection data source, a collection data target and task information;
the annotation filling unit is used for modifying metadata of the collection data and/or adding collection annotation information in a database based on the collection task adjustment condition of the collection data, wherein the metadata records interpretation data for interpreting the collection data, the collection annotation information records task adjustment information corresponding to the collection data, and the task adjustment information is used for adjusting the task information of the collection task;
and the collection adjusting unit is used for synchronously acquiring the collection data and the corresponding metadata and/or collection annotation information by the data collection platform, associating the collection task corresponding to the collection data if the metadata is detected to be changed or the collection annotation information is detected, and adjusting the collection task based on the metadata and/or the collection annotation information.
In a third aspect, an embodiment of the present application provides an electronic apparatus, including a memory and a processor, where the memory stores a computer program, and the processor is configured to execute the computer program to perform the method for adjusting a collection task based on database annotations.
In a fourth aspect, an embodiment of the present application provides a readable storage medium, in which a computer program is stored, where the computer program includes program code for controlling a process to execute a process, and the process includes the collection task adjustment method based on database annotation.
The main contributions and innovation points of the invention are as follows:
According to the embodiments of the application, metadata is modified, or collection annotation information in the form of a field annotation or table annotation is added, for the collection data in a data table. After the data collection platform synchronizes the collection data from the database, it can quickly obtain the collection task adjustment condition of the collection data from the metadata and/or the collection annotation information and automatically adjust the established collection task accordingly. Manual adjustment is no longer needed, and the efficiency of adjusting data collection tasks is improved.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below to provide a more thorough understanding of the application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a flow chart of a method for adjusting a collection task based on database annotations according to an embodiment of the present application;
FIG. 2 is a schematic diagram of collection connection relationship configuration information according to an embodiment of the present application;
FIG. 3 is a logical framework diagram of a method for adjusting a collection task based on database annotations according to an embodiment of the present application;
FIG. 4 is a block diagram of an apparatus for adjusting a collection task based on database annotations according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with one or more embodiments of the present specification. Rather, they are merely examples of apparatus and methods consistent with certain aspects of one or more embodiments of the specification, as detailed in the claims which follow.
It should be noted that: in other embodiments, the steps of the corresponding methods are not necessarily performed in the order shown and described herein. In some other embodiments, the methods may include more or fewer steps than those described herein. Moreover, a single step described in this specification may be broken down into multiple steps for description in other embodiments; multiple steps described in this specification may be combined into a single step in other embodiments.
Example one
An embodiment of the present application provides a collection task adjustment method based on database annotation, and specifically, with reference to fig. 1, the method includes:
establishing at least one collection task aiming at different collection data on a data collection platform, wherein the configuration information of each collection task at least comprises a collection data source, a collection data target and task information;
modifying metadata of the collection data and/or adding collection annotation information in a database based on the collection task adjustment condition of the collection data, wherein the metadata records interpretation data for interpreting the collection data, the collection annotation information records task adjustment information corresponding to the collection data, and the task adjustment information is used for adjusting the task information of the collection task;
the data collection platform synchronously obtains the collection data and corresponding metadata and/or collection annotation information, if the metadata is detected to be changed or the collection annotation information is detected, the collection task corresponding to the collection data is associated, and the collection task is adjusted based on the metadata and/or the collection annotation information.
This scheme introduces collection annotation information, as an original element, into the annotations of the collection data in the database. The metadata can be understood as the conventional annotation of the collection data. The metadata and the collection annotation information record different kinds of collection task adjustments, and together they allow collection tasks to be adjusted automatically, which solves the technical problem in the prior art that collection tasks must be adjusted manually. The collection annotation information differs from the metadata in that it annotates only the task adjustment condition of the specific data whose collection task is being adjusted.
In order to realize data collection, at least one collection task for collecting the collected data needs to be established on a data collection platform. The collection data in different databases or different data tables can be collected into the target database or the target data table based on the configured collection task, and different collection tasks need to configure different configuration information.
The following sets forth the configuration information of the collection task created by the present solution:
The collection data source configures the source location of the collection data required by the collection task, including but not limited to a data source, a source table, and source fields; the collection data target configures the target location to which the collection task collects the data, including but not limited to a data target, a target table, and target fields.
The data source is used for configuring a source of the collected data and can be a source database of the collected data; the data target is used for configuring a collection place of the collected data and can be a target database of the collected data; the source table is used for configuring a source data table of the collected data, and the source table is stored in a source database; the target table is used for configuring a target data table of the collected data, and the target table is stored in a target database; a source field for configuring fields of the aggregated data in the source table; and the target field is used for configuring the fields of the collected data in the target table.
In an embodiment of this scheme, the task information includes at least one of: collection connection relationship configuration information, scheduling configuration information, a collection mode, a collection basis field, a data time range, and a data deadline for the forward movement of task scheduling. If no timed task needs to be set, the scheduling configuration information and the data deadline for the forward movement of task scheduling need not be configured.
The collection connection relationship configuration information configures a connection relationship between a collection data source and a collection data target, and specifically may be: the connection between the source field and the target field. Illustratively, as shown in fig. 2, a connection is formed between the source field ID and the target field ID, a connection is formed between the source field Name and the target field Name, and a connection is formed between the source field Age and the target field Age.
The scheduling configuration information is used for configuring the timing task of the collection task. In some embodiments, the Cron expression is used as scheduling configuration information of the collection task to complete the scheduling of the timing task.
The collection mode is either full or incremental. Full means that the data in the target table is completely replaced and the original data is not retained; incremental means that the original data in the target table is retained and new data is added.
The collection basis field is the field according to which data in the source table is collected; it is empty by default.
The data time range configures the time range collected by the collection task: data in the source table that falls within the data time range is collected. The value is an integer greater than 0, with the unit s/m/h/d (second/minute/hour/day).
The data deadline for the forward movement of task scheduling configures the amount of time by which the scheduled time of the timed task is moved forward (earlier); the resulting time is the latest time of the data collected from the data table. The value is an integer greater than 0, with the unit s/m/h/d (second/minute/hour/day).
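The configuration items described above can be pictured as a small data structure. The following is a minimal sketch only: the class and attribute names (`Endpoint`, `TaskInfo`, `CollectionTask`, `forward_deadline`, and so on), the database names, and the choice of a 5-field Unix cron string are illustrative assumptions and do not come from the original publication.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class CollectionMode(Enum):
    FULL = "full"                # target table is completely replaced
    INCREMENTAL = "incremental"  # original rows are kept, new rows are added


@dataclass
class Endpoint:
    """Source or target location (hypothetical shape): database, table, fields."""
    database: str
    table: str
    fields: List[str]


@dataclass
class TaskInfo:
    mode: CollectionMode
    # source field name -> target field name (the connection relationship)
    connections: Dict[str, str] = field(default_factory=dict)
    cron: Optional[str] = None              # scheduling configuration; None means no timed task
    basis_field: Optional[str] = None       # empty by default, as in the description
    time_range: Optional[str] = None        # positive integer plus s/m/h/d, e.g. "1d"
    forward_deadline: Optional[str] = None  # e.g. "1h"; only meaningful for a timed task


@dataclass
class CollectionTask:
    source: Endpoint
    target: Endpoint
    info: TaskInfo


# Illustrative task: three connected fields, run daily at 1:00 (assumed cron form).
task = CollectionTask(
    source=Endpoint("src_db", "person", ["ID", "Name", "Age"]),
    target=Endpoint("dst_db", "person", ["ID", "Name", "Age"]),
    info=TaskInfo(
        mode=CollectionMode.FULL,
        connections={"ID": "ID", "Name": "Name", "Age": "Age"},
        cron="0 1 * * *",
        basis_field="time",
        time_range="1d",
        forward_deadline="1h",
    ),
)
```

Keeping the connection relationship as a plain mapping makes later adjustments (adding or cancelling a connection for one field) a single dictionary operation.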
For example, suppose the task information of a collection task is as follows: the scheduling configuration information is 1:00 every day, the data deadline for the forward movement of task scheduling is 1 hour, the data time range is 1 day, and the collection basis field is time. The collection task then runs at 1:00 every day and, based on the value of the time field, filters out and collects the data of the 1 day preceding 0:00 (1:00 - 1 hour = 0:00), that is, the data from 0:00 of the previous day to 0:00 of the current day.
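The arithmetic in this example can be checked with a short sketch. The helper names `parse_span` and `collection_window`, the compact "1h"/"1d" notation, and the sample date are assumptions made for illustration; the source only states that each value is a positive integer with a unit of s/m/h/d, and it does not say whether the window bounds are inclusive.

```python
from datetime import datetime, timedelta

# Map the unit letters from the description to timedelta keyword names.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_span(text: str) -> timedelta:
    """Parse '<positive integer><s|m|h|d>', for example '1h' or '30m'."""
    number, unit = int(text[:-1]), text[-1]
    if number <= 0 or unit not in _UNITS:
        raise ValueError(f"invalid span: {text!r}")
    return timedelta(**{_UNITS[unit]: number})


def collection_window(run_at: datetime, forward: str, time_range: str):
    """Return (start, end): end = run time moved earlier by the forward
    deadline, start = end minus the data time range."""
    end = run_at - parse_span(forward)
    start = end - parse_span(time_range)
    return start, end


# Task runs at 1:00, forward deadline 1 hour, data time range 1 day.
start, end = collection_window(datetime(2022, 5, 10, 1, 0), "1h", "1d")
print(start, end)
```

For a run at 1:00 on an arbitrary sample day, the window spans 0:00 of the previous day to 0:00 of the current day, matching the example.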
As described above, a collection task configured on the data collection platform collects the collection data in the database that satisfies these rules.
In the step of "modifying metadata of the collection data and/or adding collection annotation information in a database based on the collection task adjustment condition of the collection data", the metadata is interpretation data that explains the collection data; in this scheme it mainly refers to actual change information of tables and fields, such as deletion and modification. The collection annotation information can be divided into annotations for table data and annotations for field data, according to the type of collection data that is annotated.
It is worth mentioning that the "collection task adjustment condition" covers three cases. When the data of the collection data changes but the task information of the collection task does not, the content of the data change is filled into the corresponding metadata as interpretation data. When both the data of the collection data and the task information of the collection task change, the content of the data change is filled into the corresponding metadata as interpretation data, and the content of the task information change is filled into the corresponding collection annotation information as task adjustment information. When the data of the collection data does not change but the task information of the collection task does, the content of the task information change is filled into the corresponding collection annotation information as task adjustment information.
For example, if the collection task adjustment condition of the collection data is that the table name of the source table changes, the interpretation data is: the source table name changes and the new name is "xx". This interpretation data is recorded as metadata. Because the content of the collection task itself does not change, no collection annotation information needs to be added.
If the collection task adjustment condition of the collection data is that a field is added to the source table under the full collection mode, the interpretation data is: the source table adds a field named "xx". Because the content of the collection task corresponding to the field changes, the interpretation data is recorded as metadata and collection annotation information is also added.
If the collection task adjustment condition of the collection data is that the scheduling configuration time of a field in the source table changes, the metadata does not need to be modified; the changed scheduling configuration time is simply recorded as collection annotation information.
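The three cases and the three examples above reduce to a simple rule: a data change goes into the metadata, and a task information change goes into the collection annotation information. A minimal sketch of that rule follows; the function name `places_to_fill` and the string labels are hypothetical.

```python
def places_to_fill(data_changed: bool, task_info_changed: bool) -> set:
    """Return where the change must be recorded in the database."""
    places = set()
    if data_changed:
        places.add("metadata")               # interpretation data
    if task_info_changed:
        places.add("collection_annotation")  # task adjustment information
    return places


# The three examples from the description:
rename_table = places_to_fill(data_changed=True, task_info_changed=False)
add_field_full_mode = places_to_fill(data_changed=True, task_info_changed=True)
change_schedule = places_to_fill(data_changed=False, task_info_changed=True)
```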
In addition, the collection annotation information of this scheme is original; its specific content is shown in Table 1 below:
Table 1: Collection annotation information
[Table 1 appears as an image (DEST_PATH_IMAGE001) in the original publication and is not reproduced here.]
Specifically, if the type of the collection data is table data, the corresponding collection annotation information may be defined as a table annotation, and the table annotation includes: at least one of a collection state annotation, a collection mode annotation, a retention time annotation, a scheduling configuration annotation, a data time range annotation and a data deadline annotation for forward movement of task scheduling; if the type of the collection data is field data, the corresponding collection annotation information may be defined as a field annotation, and the field annotation includes: at least one of a collection status annotation and a retention time annotation.
For example, if the collection task adjustment condition of the collection data is that a field "name" is added, the collection annotation information is a field annotation, the recorded metadata is "new name", the collection annotation information is "# #1# #", and the annotation stored in the database is "name # #1# #".
Of course, when several collection task adjustments apply to the same collection data, there may be several pieces of collection annotation information, and they can be accumulated.
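A platform that reads such a database comment has to separate the ordinary comment text from the collection annotation marker. The sketch below shows one way this might be done, assuming the marker is a number wrapped in two hash signs on each side. The regular expression tolerates the spaces seen in the example; `split_comment` is a hypothetical helper, and the meaning of each code is defined in Table 1, which is not reproduced here.

```python
import re
from typing import List, Tuple

# A number wrapped in two '#' on each side; optional spaces are tolerated
# so that both "##1##" and "# #1# #" are recognised (an assumption).
_MARKER = re.compile(r"#\s*#\s*(\d+)\s*#\s*#")


def split_comment(comment: str) -> Tuple[str, List[str]]:
    """Split a database comment into (plain text, list of annotation codes)."""
    codes = _MARKER.findall(comment)
    plain = _MARKER.sub("", comment).strip()
    return plain, codes


plain, codes = split_comment("name # #1# #")
```

Returning a list of codes leaves room for several accumulated annotations on the same table or field.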
In the step of synchronously acquiring the metadata and/or the collection annotation information of the collection data by the data collection platform, the metadata and/or the collection annotation information of the collection data can be synchronized into the data collection platform in an off-line or real-time manner, and the data collection platform can not only acquire the related metadata but also analyze the collection annotation information corresponding to the metadata.
The collection annotation information of the scheme corresponds to a specific data collection platform, namely, different types of collection annotation information can be set corresponding to different data collection platforms.
To reduce data volume, the data collection platform deletes the collection annotation information from the database after it has obtained the collection annotation information corresponding to the metadata. The collection annotation information applies only to the data that is valid at the time the metadata changes and is used only for the current collection task adjustment. To avoid adding invalid connection fields on the data collection platform, and to avoid having to fill in the same collection annotation information again when a second modification follows shortly after the first, the data collection platform caches the collection annotation information for a retention time and then deletes it. The retention time is tied to the collection annotation information: as described above, the collection annotation information includes a retention time annotation, in which the retention time of the collection annotation information on the data collection platform is configured.
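The retention behaviour can be sketched as a small cache whose entries expire. This is an illustration only: the class name `AnnotationCache`, the key format, and the use of seconds are assumptions, and the clock is injectable purely so that the expiry can be demonstrated without waiting.

```python
import time


class AnnotationCache:
    """Keeps collection annotation information for a retention time."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (annotation, expiry timestamp)

    def put(self, key, annotation, retention_seconds):
        self._entries[key] = (annotation, self._clock() + retention_seconds)

    def get(self, key):
        """Return the cached annotation, or None if absent or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        annotation, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # retention time has passed: delete
            return None
        return annotation


# Demonstration with a fake clock that the caller advances by hand.
now = [0.0]
cache = AnnotationCache(clock=lambda: now[0])
cache.put("person.name", "1", retention_seconds=60)
within_retention = cache.get("person.name")
now[0] = 61.0
after_retention = cache.get("person.name")
```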
The step of "associating the collection task corresponding to the collection data if the metadata is detected to be changed or the collection annotation information is detected" covers three cases: the metadata has changed but there is no collection annotation information; the metadata has not changed but there is collection annotation information; and the metadata has changed and there is collection annotation information.
In this scheme, the metadata can be directly associated with the corresponding collection task. Specifically, the metadata can record unique identification information of the collection data, and this identification information is used to associate the collection task on the data collection platform. For example, the data collection platform can use a table name to find the collection tasks for that table among the collection tasks that have been created.
Referring to fig. 3, if the metadata is detected to have changed but no collection annotation information is present, the collection task corresponding to the collection data is associated, and the collection data source and/or collection data target in the collection task is replaced with the changed metadata.
For example, if the metadata records changes to the data source, source table, and source fields, or to the data target, target table, and target fields, but no related collection annotation information is involved, the corresponding information in the collection task is modified directly. Specifically, if the name of the source table corresponding to the collection data changes, the table name in the collection data source configured for that collection data is modified directly.
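The table-rename case can be sketched as follows: the platform finds the tasks associated with the old table name and replaces the name in their collection data source. The dictionary layout of a task and the function name `rename_source_table` are hypothetical; the source does not specify how tasks are stored.

```python
def rename_source_table(tasks, old_name, new_name):
    """Update every task whose source table is old_name.

    Returns the ids of the adjusted tasks.
    """
    adjusted = []
    for task in tasks:
        if task["source_table"] == old_name:  # association by table name
            task["source_table"] = new_name   # replace with changed metadata
            adjusted.append(task["id"])
    return adjusted


tasks = [
    {"id": 1, "source_table": "person", "target_table": "person_all"},
    {"id": 2, "source_table": "orders", "target_table": "orders_all"},
]
adjusted_ids = rename_source_table(tasks, "person", "person_v2")
```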
If the metadata has not changed but collection annotation information is present, the collection task corresponding to the collection data is associated, and the task information of the collection task is replaced with the collection annotation information. In this way the collection task is modified by means of the collection annotation information.
If the metadata has changed and collection annotation information is present, the collection task corresponding to the collection data is associated, and the collection task is adjusted accordingly based on the collection state annotation and the metadata.
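The three cases can be summarised as a dispatch on two facts: whether the metadata changed and whether collection annotation information is present. The sketch below only names the action for each case; the function name and the action labels are illustrative assumptions.

```python
from typing import Optional


def choose_adjustment(metadata_changed: bool, annotation: Optional[str]) -> str:
    """Pick the adjustment path for a synchronized piece of collection data."""
    has_annotation = annotation is not None
    if metadata_changed and not has_annotation:
        # Replace the collection data source and/or target with the new metadata.
        return "replace_source_or_target"
    if not metadata_changed and has_annotation:
        # Replace the task information with the collection annotation information.
        return "replace_task_info"
    if metadata_changed and has_annotation:
        # Adjust based on the collection state annotation and the metadata.
        return "adjust_by_state_annotation_and_metadata"
    return "no_adjustment"


case_one = choose_adjustment(True, None)
case_two = choose_adjustment(False, "1")
case_three = choose_adjustment(True, "1")
```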
Specifically, the collection state annotation corresponds to six situations: a field is modified in, added to, or deleted from the source table, or a field is modified in, added to, or deleted from the target table. Whenever a table or field changes, the corresponding metadata records the change. When a field is modified in the source or target table, the collection state annotation of the field data is filled in accordingly; when a field is added to the source or target table, the collection state annotation of the table data and the collection state annotation of the field data are filled in accordingly; and when a field is deleted from the source or target table, the collection state annotation of the field data is filled in as not collected.
If the task adjustment condition corresponding to the collection state annotation is that a field is modified in the source table, the metadata is filled with the following adjustment content: the source table modifies a field, together with the field to be modified; the collection state annotation is filled with the collection state of the field to be modified. If the task adjustment condition is that a field is modified in the target table, the metadata is filled with: the target table modifies a field, together with the field to be modified; the collection state annotation is again filled with the collection state of the field to be modified. That is, when a field in the database is modified, the metadata records the modification information and the field to be modified; the platform compares the newly obtained metadata with the existing metadata, finds the corresponding collection task according to the field to be modified, and changes the collection state of the modified field in the collection task based on the recorded collection state annotation.
If the field to be modified has no connection with other fields in the original collection task, whether a connection between the source table field and the target table field needs to be established is determined from the collection state annotation of the field to be modified. If the annotation indicates that a connection is needed, a corresponding field with the same name and consistent content is added to the target table or the source table, and the connection is established. If the field to be modified already has a connection with other fields in the collection task, that connection is modified according to the collection state annotation of the modified field.
The specific modification is as follows:
if the collection state annotation is full collection, with the collection state annotation in the corresponding field annotation written as "# #1# #", the connection relation between the field to be modified and other fields is maintained; and if the collection state annotation is non-collection, with the collection state annotation in the corresponding field annotation written as "# #2# #", the connection relation between the field to be modified and other fields is cancelled.
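A minimal sketch of this keep-or-cancel rule is given below. It assumes the markers are spelled `##1##` and `##2##` (the spaces shown in the text are treated as a rendering artifact, which is an assumption) and that connection relations are held as a mapping from source field name to target field name. The function name `adjust_connection` is hypothetical.

```python
from typing import Dict

# Assumed marker spellings; the text renders them as "# #1# #" and "# #2# #".
FULL_COLLECTION = "##1##"
NO_COLLECTION = "##2##"


def adjust_connection(
    connections: Dict[str, str], field: str, field_comment: str
) -> Dict[str, str]:
    """Return a copy of `connections` adjusted for the annotation on `field`."""
    updated = dict(connections)
    if NO_COLLECTION in field_comment:
        # Non-collection: cancel the connection relation of this field.
        updated.pop(field, None)
    # Full collection (or no marker): the existing connection is kept as is.
    return updated


connections = {"name": "name", "age": "age"}
kept = adjust_connection(connections, "name", "customer name ##1##")
cancelled = adjust_connection(connections, "name", "customer name ##2##")
```

Returning a copy rather than mutating the input keeps the original task configuration available for comparison or rollback.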
If the task adjustment condition corresponding to the collection state annotation is a source table field deletion, the adjustment content filled into the metadata is the field deletion in the source table and the field to be deleted. If the task adjustment condition is a target table field deletion, the adjustment content filled into the metadata is the field deletion in the target table and the field to be deleted. In both cases the collection state annotation is filled with the collection state of the field to be deleted, which is non-collection; the field to be deleted is deleted directly, and the connection relation corresponding to it is cancelled.
If the task adjustment condition corresponding to the collection state annotation is a source table field addition, the adjustment content filled into the metadata is the field addition in the source table and the field to be added. The collection state annotation is filled with the collection state annotation of the source table and the collection state annotation of the field to be added, and whether the field to be added needs to be connected with a target field is judged based on these two annotations.
When the collection state annotation of the source table is full collection and that of the field to be added is collection, the field to be added needs to be connected with the target field; when the collection state annotation of the source table is full collection and that of the field to be added is non-collection, the field to be added does not need to be connected with the target field; and when the collection state annotation of the source table is full collection and that of the field to be added is the initial state, the field to be added needs to be connected with the target field. It is worth mentioning that a collection state annotation in the initial state means that the field has no collection state data of its own, so the collection task is changed only according to the collection state annotation of the table.
When the collection state annotation of the source table is non-collection and that of the field to be added is collection, the field to be added needs to be connected with the target field; when the collection state annotation of the source table is non-collection and that of the field to be added is non-collection, the field to be added does not need to be connected with the target field; and when the collection state annotation of the source table is non-collection and that of the field to be added is the initial state, the field to be added does not need to be connected with the target field.
If the field to be added does not need to be connected with the target field, the field information in the collection task is updated using the metadata. If the field to be added needs to be connected with the target field, the target table is checked for a corresponding target field: if one exists, it is connected with the field to be added; if not, a corresponding field is added to the target table according to the field to be added, and the connection relation between the two is established.
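The rules for an added field reduce to a small decision function: an explicit field annotation decides the outcome, and a field in the initial state falls back to the annotation of its table. The sketch below encodes that reading; the `State` enum and the function name `needs_connection` are hypothetical names introduced for illustration.

```python
from enum import Enum


class State(Enum):
    COLLECT = "collect"        # "full collection" on a table, "collection" on a field
    NO_COLLECT = "no_collect"  # non-collection
    INITIAL = "initial"        # field carries no collection state of its own


def needs_connection(table_state: State, field_state: State) -> bool:
    """Decide whether an added field must be connected to the opposite table."""
    if field_state is State.INITIAL:
        # No field-level state: follow the table-level annotation.
        return table_state is State.COLLECT
    # An explicit field-level annotation takes precedence over the table.
    return field_state is State.COLLECT


for table_state in (State.COLLECT, State.NO_COLLECT):
    for field_state in (State.COLLECT, State.NO_COLLECT, State.INITIAL):
        print(table_state.name, field_state.name, needs_connection(table_state, field_state))
```

The same function applies whether the field is added to the source table or to the target table, since the two rule sets are symmetric.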
If the task adjustment condition corresponding to the collection state annotation is a target table field addition, the adjustment content filled into the metadata is the field addition in the target table and the field to be added. The collection state annotation is filled with the collection state annotation of the target table and the collection state annotation of the field to be added, and whether the field to be added needs to be connected with a source field is judged based on these two annotations.
When the collection state annotation of the target table is full collection and that of the field to be added is collection, the field to be added needs to be connected with the source field; when the collection state annotation of the target table is full collection and that of the field to be added is non-collection, the field to be added does not need to be connected with the source field; and when the collection state annotation of the target table is full collection and that of the field to be added is the initial state, the field to be added needs to be connected with the source field.
When the collection state annotation of the target table is non-collection and that of the field to be added is collection, the field to be added needs to be connected with the source field; when the collection state annotation of the target table is non-collection and that of the field to be added is non-collection, the field to be added does not need to be connected with the source field; and when the collection state annotation of the target table is non-collection and that of the field to be added is the initial state, the field to be added does not need to be connected with the source field.
If the field to be added does not need to be connected with the source field, the field information in the collection task is updated using the metadata. If the field to be added needs to be connected with the source field, the source table is checked for a corresponding source field: if one exists, it is connected with the field to be added; if not, a corresponding field is added to the source table according to the field to be added, and the connection relation between the two is established.
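The connect-or-create step can be sketched in a way that serves both directions. The sketch assumes that the "corresponding field" is matched by identical name, that a table is represented simply as a set of field names, and that connections are a name-to-name mapping; `connect_added_field` and its parameters are hypothetical.

```python
from typing import Dict, Set


def connect_added_field(
    new_field: str,
    opposite_fields: Set[str],
    connections: Dict[str, str],
) -> bool:
    """Connect `new_field` to its counterpart in the opposite table.

    Returns True if the counterpart had to be created first.
    """
    created = new_field not in opposite_fields
    if created:
        # No counterpart yet: add a same-named field to the opposite table.
        opposite_fields.add(new_field)
    # Record the connection relation between the two fields.
    connections[new_field] = new_field
    return created


target_fields = {"id", "name"}
connections = {"id": "id", "name": "name"}
created_age = connect_added_field("age", target_fields, connections)
created_name = connect_added_field("name", target_fields, connections)
```

For a field added to the source table, `opposite_fields` is the target table's field set; for a field added to the target table, it is the source table's field set.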
The above is the implementation of the scheme. Through the annotation information of the data tables provided by this scheme, both the metadata and the collection task can be modified without manual adjustment, which greatly reduces the workload of data implementation personnel.
Example two
Based on the same concept, and referring to Fig. 4, the present application further proposes a collection task adjusting apparatus based on database annotation, including:
a task establishing unit 301, configured to establish at least one collection task for different collection data on a data collection platform, where the configuration information of each collection task includes at least a collection data source, a collection data target, and task information;
an annotation filling unit 302, configured to modify metadata of the collection data and/or add collection annotation information in a database based on a collection task adjustment condition of the collection data, where the metadata records explanation data that explains the collection data, the collection annotation information records task adjustment information corresponding to the collection data, and the task adjustment information is used to adjust the task information of the collection task;
a collection adjusting unit 303, configured to cause the data collection platform to synchronously acquire the collection data and the corresponding metadata and/or collection annotation information, and, if it is detected that the metadata has changed or that collection annotation information is present, to associate the collection task corresponding to the collection data and adjust the collection task based on the metadata and/or the collection annotation information.
Example three
The present embodiment further provides an electronic apparatus. Referring to Fig. 5, it includes a memory 404 and a processor 402, where the memory 404 stores a computer program, and the processor 402 is configured to run the computer program to perform the steps of any one of the above embodiments of the collection task adjusting method based on database annotation.
Specifically, the processor 402 may include a Central Processing Unit (CPU) or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.
The memory 404 may include mass storage for data or instructions. By way of example, and not limitation, the memory 404 may include a hard disk drive (HDD), a floppy disk drive, a solid state drive (SSD), flash memory, an optical disk, a magneto-optical disk, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. The memory 404 may include removable or non-removable (or fixed) media, where appropriate. The memory 404 may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory 404 is a non-volatile memory. In particular embodiments, the memory 404 includes read-only memory (ROM) and random-access memory (RAM). The ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory (FLASH), or a combination of two or more of these, where appropriate. The RAM may be static random-access memory (SRAM) or dynamic random-access memory (DRAM), where the DRAM may be fast page mode dynamic random-access memory (FPMDRAM), extended data output dynamic random-access memory (EDODRAM), synchronous dynamic random-access memory (SDRAM), or the like.
Memory 404 may be used to store or cache various data files for processing and/or communication use, as well as possibly computer program instructions for execution by processor 402.
The processor 402 may read and execute the computer program instructions stored in the memory 404 to implement any of the above-described embodiments of the collection task adjusting method based on database annotation.
Optionally, the electronic apparatus may further include a transmission device 406 and an input/output device 408, where the transmission device 406 is connected to the processor 402, and the input/output device 408 is connected to the processor 402.
The transmission device 406 may be used to receive or transmit data via a network. Specific examples of the network may include wired or wireless networks provided by the communication provider of the electronic apparatus. In one example, the transmission device 406 includes a network interface controller (NIC), which can be connected to other network devices through a base station so as to communicate with the internet. In another example, the transmission device 406 may be a radio frequency (RF) module, which communicates with the internet wirelessly.
The input-output device 408 is used to input or output information. In this embodiment, the input information may be the collection data and metadata in various databases.
Optionally, in this embodiment, the processor 402 may be configured to execute the following steps by means of a computer program:
establishing at least one collection task for different collection data on a data collection platform, wherein the configuration information of each collection task includes at least a collection data source, a collection data target, and task information;
modifying metadata of the collection data and/or adding collection annotation information in a database based on a collection task adjustment condition of the collection data, wherein the metadata records explanation data that explains the collection data, the collection annotation information records task adjustment information corresponding to the collection data, and the task adjustment information is used for adjusting the task information of the collection task;
the data collection platform synchronously acquires the collection data and the corresponding metadata and/or collection annotation information, and, if it detects that the metadata has changed or that collection annotation information is present, associates the collection task corresponding to the collection data and adjusts the collection task based on the metadata and/or the collection annotation information.
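The detection-and-adjustment step can be illustrated with a rough sketch. It follows the cases set out in this document (changed metadata updates the collection data source and/or target, collection annotation information updates the task information), but simplifies heavily: only the source side is modeled, metadata and annotations are plain dictionaries, and annotation entries overwrite matching task-information entries. The `CollectionTask` class and `adjust_task` function are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CollectionTask:
    source: Dict[str, str]  # field name -> type, describing the data source
    target: Dict[str, str]  # field name -> type, describing the data target
    task_info: Dict[str, str] = field(default_factory=dict)


def adjust_task(
    task: CollectionTask,
    old_meta: Dict[str, str],
    new_meta: Dict[str, str],
    annotation: Optional[Dict[str, str]],
) -> str:
    """Adjust `task` in place and report which inputs drove the adjustment."""
    changed = new_meta != old_meta
    if changed:
        # Changed metadata replaces the source description in the task.
        task.source = dict(new_meta)
    if annotation:
        # Annotation entries overwrite the matching task information.
        task.task_info.update(annotation)
    if changed and annotation:
        return "metadata+annotation"
    if changed:
        return "metadata"
    if annotation:
        return "annotation"
    return "unchanged"


task = CollectionTask(source={"id": "int"}, target={"id": "int"})
r1 = adjust_task(task, {"id": "int"}, {"id": "int"}, None)
r2 = adjust_task(task, {"id": "int"}, {"id": "bigint"}, None)
r3 = adjust_task(task, {"id": "bigint"}, {"id": "bigint"}, {"mode": "incremental"})
```

A fuller version would also apply the collection state annotation to the field connection relations, as described for the six field-change situations.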
It should be noted that, for specific examples in this embodiment, reference may be made to examples described in the foregoing embodiments and optional implementations, and details of this embodiment are not described herein again.
In general, the various embodiments may be implemented in hardware or special purpose circuits, software, logic or any combination thereof. Some aspects of the invention may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device, although the invention is not limited thereto. While various aspects of the invention may be illustrated and described as block diagrams, flow charts, or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
Embodiments of the invention may be implemented by computer software executable by a data processor of the mobile device, such as in a processor entity, or by hardware, or by a combination of software and hardware. Computer software or programs (also referred to as program products) including software routines, applets and/or macros can be stored in any device-readable data storage medium and they include program instructions for performing particular tasks. The computer program product may comprise one or more computer-executable components configured to perform embodiments when the program is run. The one or more computer-executable components may be at least one software code or a portion thereof. Further in this regard it should be noted that any block of the logic flow as in the figures may represent a program step, or an interconnected logic circuit, block and function, or a combination of a program step and a logic circuit, block and function. The software may be stored on physical media such as memory chips or memory blocks implemented within the processor, magnetic media such as hard or floppy disks, and optical media such as, for example, DVDs and data variants thereof, CDs. The physical medium is a non-transitory medium.
It should be understood by those skilled in the art that various features of the above embodiments can be combined arbitrarily, and for the sake of brevity, all possible combinations of the features in the above embodiments are not described, but should be considered as within the scope of the present disclosure as long as there is no contradiction between the combinations of the features.
The above examples are merely illustrative of several embodiments of the present application, and the description is more specific and detailed, but not to be construed as limiting the scope of the present application. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present application should be subject to the appended claims.

Claims (14)

1. A collection task adjusting method based on database annotation is characterized by comprising the following steps:
establishing at least one collection task aiming at different collection data on a data collection platform, wherein the configuration information of each collection task at least comprises a collection data source, a collection data target and task information;
modifying metadata of the collected data and/or adding collection annotation information based on a collection task adjustment condition of the collected data in a database, wherein the metadata records explanation data for explaining the collected data correspondingly, the collection annotation information records task adjustment information for adjusting the task information of the collection task correspondingly;
and the data collection platform synchronously acquires the collection data and corresponding metadata and/or collection annotation information, if the metadata is changed and the collection annotation information is contained, the collection task corresponding to the collection data is associated, and corresponding adjustment is performed on the collection task based on collection state annotation in the collection annotation information and the metadata.
2. The database annotation-based collection task adjustment method according to claim 1, wherein if the task adjustment condition corresponding to the collection state annotation is a source table field modification or a target table field modification, the source table field modification or the target table field modification and the field to be modified are filled into the metadata, the collection state annotation is filled with the collection state of the field to be modified, and the connection relationship with other fields is modified according to the collection state annotation of the field to be modified.
3. The method according to claim 1, wherein if the task adjustment condition corresponding to the collection state annotation is a source table field deletion or a target table field deletion, the source table field deletion or the target table field deletion and the field to be deleted are filled into the metadata, the collection state annotation is filled with the non-collection state of the field to be deleted, the field to be deleted is directly deleted, and the connection relationship corresponding to the field to be deleted is cancelled.
4. The collection task adjustment method based on database annotations according to claim 1, wherein if the task adjustment condition corresponding to the collection state annotation is that a field is added to a source table, the metadata fill-in adjustment content is: the method comprises the steps that a field and a field to be added are added to a source table, the collection state annotation fills in a collection state annotation of the source table and a collection state annotation of the field to be added, and whether the field to be added needs to be connected with a target field or not is judged based on the collection state annotation of the source table and the collection state annotation of the field to be added.
5. The method according to claim 1, wherein if the task adjustment condition corresponding to the collection state annotation is that a field is added to a target table, the adjustment content filled into the metadata is: the field addition in the target table and the field to be added, the collection state annotation is filled with the collection state annotation of the target table and the collection state annotation of the field to be added, and whether the field to be added needs to be connected with a source field is judged based on the collection state annotation of the target table and the collection state annotation of the field to be added.
6. The method of claim 1, wherein the task information comprises at least one of collection connection configuration information, scheduling configuration information, collection mode, collection basis field, data time range, and data deadline for task scheduling to advance.
7. The method according to claim 1, wherein when the adjustment condition of the collection task is that the data of the collection data changes but the task information of the collection task does not change, the content of the data change is filled in corresponding metadata as interpretation data; when the adjustment condition of the collection task is data change of collection data and task information change of the collection task, filling the content of the data change into corresponding metadata as interpretation data, and filling the content of the task information change into corresponding collection annotation information as task adjustment information; and when the collection task adjustment condition is that the data of the collection data does not change but the task information of the collection task changes, filling the changed content of the task information into corresponding collection annotation information as task adjustment information.
8. The database annotation based collection task adjusting method according to claim 1, wherein if the type of the collection data is table data, the corresponding collection annotation information is defined as a table annotation, and the table annotation includes: at least one of a collection state annotation, a collection mode annotation, a retention time annotation, a scheduling configuration annotation, a data time range annotation and a data deadline annotation for forward movement of task scheduling; if the type of the collection data is field data, the corresponding collection annotation information is defined as a field annotation, and the field annotation comprises: at least one of a collection status annotation and a retention time annotation.
9. The method of claim 1, wherein the collection annotation information of the metadata is deleted from the database after the data collection platform acquires the collection annotation information, and the data collection platform deletes the collection annotation information after caching it for the retention time associated with the collection annotation information.
10. The method according to claim 1, wherein if the metadata change is detected but there is no collection annotation information, associating the collection task corresponding to the collection data, and replacing a collection data source and/or a collection data target in the collection task with the changed metadata.
11. The method according to claim 1, wherein if the metadata is not changed but there is collection annotation information, associating the collection task corresponding to the collection data, and replacing task information of the collection task with the collection annotation information.
12. A collection task adjusting apparatus based on database annotation, comprising:
the task establishing unit is used for establishing at least one collection task aiming at different collection data on a data collection platform, wherein the configuration information of each collection task at least comprises a collection data source, a collection data target and task information;
the annotation filling unit is used for modifying metadata of the collection data and/or adding collection annotation information in a database based on the collection task adjustment condition of the collection data, wherein the metadata records the explanation data for explaining the collection data correspondingly, the collection annotation information records the task adjustment information for adjusting the task information of the collection task, and the task adjustment information is used for adjusting the task information of the collection task;
and the collection adjusting unit is used for synchronously acquiring the collection data and corresponding metadata and/or collection annotation information by the data collection platform, associating the collection task corresponding to the collection data if the metadata is changed and contains the collection annotation information, and correspondingly adjusting the collection task based on the collection state annotation in the collection annotation information and the metadata.
13. An electronic device comprising a memory and a processor, wherein the memory stores a computer program, and the processor is configured to execute the computer program to perform the database annotation based collection task adjustment method according to any one of claims 1 to 11.
14. A readable storage medium, characterized in that a computer program is stored in the readable storage medium, the computer program comprising program code for controlling a process to execute a procedure, the procedure comprising the database annotation based collection task adjustment method according to any one of claims 1 to 11.
CN202210500164.5A 2022-05-10 2022-05-10 Collection task adjusting method and device based on database annotation Active CN114595291B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210500164.5A CN114595291B (en) 2022-05-10 2022-05-10 Collection task adjusting method and device based on database annotation

Publications (2)

Publication Number Publication Date
CN114595291A CN114595291A (en) 2022-06-07
CN114595291B true CN114595291B (en) 2022-08-02

Family

ID=81821744

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210500164.5A Active CN114595291B (en) 2022-05-10 2022-05-10 Collection task adjusting method and device based on database annotation

Country Status (1)

Country Link
CN (1) CN114595291B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114791915B (en) * 2022-06-22 2022-09-27 深圳高灯计算机科技有限公司 Data aggregation method and device, computer equipment and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105279261A (en) * 2015-10-23 2016-01-27 北京京东尚科信息技术有限公司 Dynamic extensible database filing method and system

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8938430B2 (en) * 2012-02-22 2015-01-20 International Business Machines Corporation Intelligent data archiving
CN106156165A (en) * 2015-04-16 2016-11-23 阿里巴巴集团控股有限公司 Method of data synchronization between heterogeneous data source and device
US20170269971A1 (en) * 2016-03-15 2017-09-21 International Business Machines Corporation Migrating enterprise workflows for processing on a crowdsourcing platform
CN106294009B (en) * 2016-08-05 2019-09-10 北京小米支付技术有限公司 Database filing method and system
CN106407404B (en) * 2016-09-22 2019-09-24 成都快乐家网络技术有限公司 Date storage method, data managing method and system, database, client
CN108052681B (en) * 2018-01-12 2020-05-26 毛彬 Method and system for synchronizing structured data between relational databases
WO2021055460A1 (en) * 2019-09-16 2021-03-25 Aveva Software, Llc Computerized systems and methods for bi-directional file sharing and synchronization on and over a network
CN111538754A (en) * 2020-06-22 2020-08-14 杭州城市大数据运营有限公司 Data collection management system, method, device, equipment and storage medium
CN113377758A (en) * 2021-06-30 2021-09-10 数字郑州科技有限公司 Data quality auditing engine and auditing method thereof
CN113742357A (en) * 2021-08-25 2021-12-03 国核电力规划设计研究院有限公司 Method and system for automatically collecting and associating cross-platform design data
CN114037304A (en) * 2021-11-16 2022-02-11 浪潮通用软件有限公司 Data collection method, equipment and medium for cost data
CN114416806A (en) * 2021-12-13 2022-04-29 深圳供电局有限公司 Method and device for acquiring power safety knowledge data and computer equipment

Also Published As

Publication number Publication date
CN114595291A (en) 2022-06-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant