CN112632132A - Method, device and equipment for processing abnormal import data - Google Patents

Method, device and equipment for processing abnormal import data

Info

Publication number
CN112632132A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011637145.4A
Other languages
Chinese (zh)
Other versions
CN112632132B (en)
Inventor
谢南翔
张岩
黄宇昕
王元文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Agricultural Bank of China
Original Assignee
Agricultural Bank of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G06F16/244 Grouping and aggregation
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/258 Data format conversion from or to a database
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application disclose a method, an apparatus and a device for processing abnormal import data. Data to be imported is first acquired, and it is judged whether the data meets an abnormal condition. If the data to be imported meets the abnormal condition, it is determined to be abnormal data to be checked and is written into an abnormal data table. Target abnormal data is then selected from the abnormal data table and imported into a target database, and the abnormal problem is determined from the import result: if the import succeeds, the abnormal problem is determined to be a data exception problem; if the import fails, it is determined to be a warehousing exception problem. Importing the target abnormal data into the target database allows the part of the target abnormal data that can be warehoused normally to be imported automatically, and identifies the cause of the abnormality so that the abnormal data can be handled according to that cause. This speeds up the processing of abnormal data and improves data import efficiency.

Description

Method, device and equipment for processing abnormal import data
Technical Field
The present application relates to the field of data processing, and in particular to a method, an apparatus, and a device for processing abnormal import data.
Background
The database is used for storing a large amount of data and providing services such as data storage, data processing and data analysis for users by using the stored data. When new data is generated or data stored in the database needs to be modified, the data needs to be imported into the database.
Data imported into a database may contain abnormal data. Currently, the abnormal data is extracted from the imported data, and then either its import process is controlled manually or the abnormal data itself is processed manually. Manual processing makes the handling of abnormal data inefficient and slows down data import.
Disclosure of Invention
In view of this, embodiments of the present application provide a method, an apparatus, and a device for processing abnormal import data, which can automatically warehouse part of the abnormal import data and determine the abnormal problem of the abnormal import data, so that the abnormal data can subsequently be processed according to that problem and the processing efficiency of abnormal data is improved.
To solve the above problem, the technical solutions provided by the embodiments of the present application are as follows:
In a first aspect, the present application provides a method for processing abnormal import data, where the method includes:
acquiring data to be imported, and judging whether the data to be imported meets an abnormal condition; the abnormal conditions comprise that the number of data fields of the data to be imported does not accord with the number of target fields and/or the field format of the data to be imported does not accord with a preset field format;
if the data to be imported meets the abnormal condition, determining the data to be imported as abnormal data to be checked, and writing the abnormal data to be checked into an abnormal data table;
selecting target abnormal data from the abnormal data table, and importing the target abnormal data into a target database;
if the import succeeds, determining the abnormal problem of the target abnormal data to be a data exception problem;
and if the import fails, determining the abnormal problem of the target abnormal data to be a warehousing exception problem.
In a possible implementation manner, the writing the exception data to be checked into an exception data table includes:
acquiring a target table name of a data table into which the abnormal data to be checked is imported and preset import time for importing the abnormal data to be checked into a target database;
acquiring column information, data information and abnormal information of the abnormal data to be checked, and forming a target value field by the column information, the data information and the abnormal information;
and writing the target table name, the target import time and the target value field into the abnormal data table.
In a possible implementation manner, the selecting target abnormal data from the abnormal data table includes:
selecting a data table in a target database as a data table to be written;
acquiring a table name of the data table to be written as a table name to be written; acquiring date information of the data table to be written as a date to be written;
inquiring whether corresponding abnormal data to be checked exist in the abnormal data table according to the name of the table to be written and the date to be written;
and if so, taking the abnormal data to be checked as target abnormal data.
In one possible implementation, the importing the target exception data into a target database includes:
acquiring key-value data of the target abnormal data, and splicing the key-value data to obtain target key-value data;
and writing the target key-value data into the target database.
In a possible implementation manner, before acquiring the data to be imported and judging whether the data to be imported meets the abnormal condition, the method further includes:
acquiring original data, and performing data aggregation and data conversion on the original data to obtain source data;
and selecting data to be imported from the source data.
In a possible implementation manner, before acquiring the original data and performing data aggregation and data conversion on the original data to obtain the source data, the method further includes:
acquiring first file information of a first data file from a configuration file, wherein the first file information comprises first position information and first data information; the first data file is a file for storing original data;
querying whether the first data file exists by using the first position information;
if so, inquiring whether the original data in the first data file is complete by using the first data information;
the acquiring of the raw data includes:
if the data in the first data file is complete, taking the data in the first data file as original data;
and acquiring original data from the first data file.
In a possible implementation manner, before acquiring the data to be imported and judging whether the data to be imported meets the abnormal condition, the method further includes:
acquiring second file information of a second data file from the configuration file, wherein the second file information comprises second position information and second data information; the second data file is a file for storing source data;
querying whether the second data file exists by using the second position information;
if so, inquiring whether the data in the second data file is complete by using the second data information;
the acquiring the data to be imported includes:
if the data in the second data file is complete, taking the data in the second data file as source data;
and acquiring data to be imported from the second data file.
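The existence and completeness checks described above can be sketched as follows. This is a minimal illustration only: the application does not say what the "data information" contains, so the sketch assumes it is an expected line count, and the names `file_is_ready`, `location` and `expected_line_count` are hypothetical.

```python
import tempfile
from pathlib import Path


def file_is_ready(file_info):
    """Return True if the data file exists and looks complete.

    `file_info` is a hypothetical dict standing in for the file information
    read from the configuration file:
      - "location": position information (a file path)
      - "expected_line_count": assumed form of the data information
    """
    path = Path(file_info["location"])
    # Step 1: use the position information to check that the file exists.
    if not path.is_file():
        return False
    # Step 2: use the data information to check that the file is complete.
    with path.open(encoding="utf-8") as handle:
        line_count = sum(1 for _ in handle)
    return line_count == file_info["expected_line_count"]


with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "source_data.txt"
    data_file.write_text(
        "1001|2020-12-31|35.50\n1002|2020-12-31|12.00\n", encoding="utf-8"
    )
    ready = file_is_ready(
        {"location": str(data_file), "expected_line_count": 2}
    )
    missing = file_is_ready(
        {"location": str(Path(tmp) / "absent.txt"), "expected_line_count": 2}
    )
```

Only when the check passes would the data in the file be taken as original data or source data; other completeness criteria (for example a checksum or byte size) would fit the same shape.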
In a second aspect, the present application provides an apparatus for processing exception import data, the apparatus comprising:
a first acquisition unit, configured to acquire data to be imported and judge whether the data to be imported meets an abnormal condition; the abnormal condition includes that the number of data fields of the data to be imported does not match the target number of fields and/or that the field format of the data to be imported does not match a preset field format;
the first determining unit is used for determining the data to be imported as abnormal data to be checked and writing the abnormal data to be checked into an abnormal data table if the data to be imported meets an abnormal condition;
the import unit is used for selecting target abnormal data from the abnormal data table and importing the target abnormal data into a target database;
the second determining unit is used for determining the abnormal problem of the target abnormal data to be a data exception problem if the import succeeds;
and the third determining unit is used for determining the abnormal problem of the target abnormal data to be a warehousing exception problem if the import fails.
In a possible implementation manner, the first determining unit is specifically configured to obtain a target table name of a data table into which the abnormal data to be checked is to be imported and a predetermined import time for importing the abnormal data to be checked into a target database;
acquiring column information, data information and abnormal information of the abnormal data to be checked, and forming a target value field by the column information, the data information and the abnormal information;
and writing the target table name, the target import time and the target value field into the abnormal data table.
In a possible implementation manner, the importing unit is specifically configured to select a data table in a target database as a data table to be written in;
acquiring a table name of the data table to be written as a table name to be written; acquiring date information of the data table to be written as a date to be written;
inquiring whether corresponding abnormal data to be checked exist in the abnormal data table according to the name of the table to be written and the date to be written;
and if so, taking the abnormal data to be checked as target abnormal data.
In a possible implementation manner, the import unit is specifically configured to obtain key value data of the target abnormal data, and splice the key value data to obtain target key value data;
and writing the target key value data into a target database.
In one possible implementation, the apparatus further includes:
the second acquisition unit is used for acquiring original data, and performing data aggregation and data conversion on the original data to obtain source data;
and the first selection unit is used for selecting the data to be imported from the source data.
In one possible implementation, the apparatus further includes:
a third obtaining unit, configured to obtain first file information of a first data file from a configuration file, where the first file information includes first location information and first data information; the first data file is a file for storing original data;
a first query unit, configured to query whether the first data file exists or not by using the first location information;
the second query unit is used for querying whether the original data in the first data file is complete or not by utilizing the first data information if the original data exists;
the second obtaining unit is specifically configured to, if the data in the first data file is complete, take the data in the first data file as original data;
and acquiring original data from the first data file.
In one possible implementation, the apparatus further includes:
a fourth obtaining unit, configured to obtain second file information of a second data file from the configuration file, where the second file information includes second location information and second data information; the second data file is a file for storing source data;
a third inquiring unit, configured to inquire whether the second data file exists by using the second location information;
a fourth query unit, configured to query whether data in the second data file is complete or not by using the second data information if the data exists;
the first obtaining unit is specifically configured to, if data in the second data file is complete, take the data in the second data file as source data;
and acquiring data to be imported from the second data file.
In a third aspect, the present application provides a device for processing abnormal import data, including: a processor, a memory, and a system bus;
the processor and the memory are connected through the system bus;
the memory is used for storing one or more programs, the one or more programs comprising instructions, which when executed by the processor, cause the processor to perform the method of the above embodiment.
In a fourth aspect, the present application provides a computer-readable storage medium, which stores instructions that, when executed on a terminal device, cause the terminal device to perform the method of the foregoing embodiment.
Therefore, the embodiment of the application has the following beneficial effects:
According to the method, apparatus and device for processing abnormal import data provided by the embodiments of the present application, data to be imported is first acquired, and it is judged whether the data to be imported meets an abnormal condition; the abnormal condition includes that the number of data fields of the data to be imported does not match the target number of fields and/or that the field format of the data to be imported does not match a preset field format. If the data to be imported meets the abnormal condition, it is determined to be abnormal data to be checked and is written into the abnormal data table. Finally, target abnormal data is selected from the abnormal data table and imported into the target database, and the abnormal problem is determined from the import result. If the import succeeds, the problem does not lie in the warehousing operation, and the abnormal problem of the target abnormal data is determined to be a data exception problem. If the import fails, the abnormal problem of the target abnormal data is determined to be a warehousing exception problem. Importing the abnormal data into the target database, on the one hand, allows the part of the abnormal data that can be warehoused normally to be imported automatically; on the other hand, it identifies the cause of the abnormality, so that the abnormal data can subsequently be processed according to that cause. This speeds up the processing of abnormal data and improves data import efficiency.
Drawings
Fig. 1 is a flowchart of a method for processing abnormal import data according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of an apparatus for processing abnormal import data according to an embodiment of the present application.
Detailed Description
To facilitate understanding and explanation of the technical solutions provided by the embodiments of the present application, the background of the present application is described first.
After studying the traditional process of importing data into a database, the inventors found that the imported data may contain some abnormal data, and that if the abnormal data is not processed before the import, the operation of the target database may subsequently be affected. In existing data import methods, abnormal data is extracted and then processed manually. Manual processing is inefficient, and when the type of abnormal problem has not been determined, several solutions may have to be tried in turn, so processing the abnormal data takes a long time.
Based on this, in the method, apparatus and device for processing abnormal import data provided by the embodiments of the present application, data to be imported is first acquired, and it is judged whether the data to be imported meets an abnormal condition; the abnormal condition includes that the number of data fields of the data to be imported does not match the target number of fields and/or that the field format of the data to be imported does not match a preset field format. If the data to be imported meets the abnormal condition, it is determined to be abnormal data to be checked and is written into the abnormal data table. Finally, target abnormal data is selected from the abnormal data table and imported into the target database, and the abnormal problem is determined from the import result. If the import succeeds, the problem does not lie in the warehousing operation, and the abnormal problem of the target abnormal data is determined to be a data exception problem. If the import fails, the abnormal problem of the target abnormal data is determined to be a warehousing exception problem. Importing the abnormal data into the target database, on the one hand, allows the part of the abnormal data that can be warehoused normally to be imported automatically; on the other hand, it identifies the cause of the abnormality, so that the abnormal data can subsequently be processed according to that cause. This speeds up the processing of abnormal data and improves data import efficiency.
To facilitate understanding of the technical solutions provided in the embodiments of the present application, a method for processing abnormal import data provided in the embodiments of the present application is described below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a method for processing abnormal import data according to an embodiment of the present application. The method includes steps S101 to S105.
S101: acquiring data to be imported, and judging whether the data to be imported meets an abnormal condition; the abnormal conditions comprise that the number of data fields of the data to be imported does not accord with the number of target fields and/or the field format of the data to be imported does not accord with a preset field format.
The data to be imported is data that needs to be written into the target database, and it may come from any of the service systems that generate data.
The data to be imported may be data obtained after basic data processing. It may contain abnormal data, and abnormal data may fail to be written into the target database normally. So that the normal import process is not affected, the data to be imported is analyzed, the abnormal data in it is extracted, and the abnormal data is processed separately.
Whether data to be imported is abnormal can be judged according to the abnormal condition. Specifically, the abnormal condition may be that the number of data fields of the data to be imported does not match the target number of fields and/or that the field format of the data to be imported does not match a preset field format. Whether the data to be imported is abnormal is thus judged from the number of data fields and the field formats in the data to be imported.
The abnormal condition may be set in a configuration file in advance. In a possible implementation manner, after the data to be imported is acquired, the abnormal condition may be read from the configuration file and used to judge the data to be imported.
The embodiments of the present application do not limit the specific target number of fields or the preset field format; both can be set according to the data form of the database and the data form of standard data to be imported.
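The judgment in S101 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the preset field format is assumed to be expressible as one regular expression per field, and `AbnormalCondition`, `find_abnormal_reasons` and the sample records are hypothetical names and data.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AbnormalCondition:
    """Hypothetical container for the abnormal condition read from a configuration file."""
    target_field_count: int
    field_patterns: tuple  # one regular expression per field, in column order


def find_abnormal_reasons(fields, condition):
    """Return the reasons a record meets the abnormal condition (empty list if none)."""
    reasons = []
    # Check 1: the number of data fields must match the target number of fields.
    if len(fields) != condition.target_field_count:
        reasons.append(
            f"field count {len(fields)} != target {condition.target_field_count}"
        )
    # Check 2: each field present must match its preset format.
    # zip() stops at the shorter sequence, so only existing positions are compared.
    for index, (value, pattern) in enumerate(zip(fields, condition.field_patterns)):
        if re.fullmatch(pattern, value) is None:
            reasons.append(f"field {index} does not match preset format {pattern!r}")
    return reasons


condition = AbnormalCondition(
    target_field_count=3,
    field_patterns=(r"\d+", r"\d{4}-\d{2}-\d{2}", r"-?\d+(\.\d+)?"),
)
ok_record = ["1001", "2020-12-31", "35.50"]
bad_record = ["1002", "31/12/2020"]  # too few fields and a wrongly formatted date
```

A record for which the function returns a non-empty list would be treated as abnormal data to be checked; the returned reasons could serve as the abnormal information recorded with it.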
S102: if the data to be imported meets the abnormal condition, determining the data to be imported as abnormal data to be checked, and writing the abnormal data to be checked into an abnormal data table.
If the data to be imported meets the abnormal condition, it is determined to be abnormal data to be checked. The abnormal data to be checked may include data that can still be read and written normally; although such data meets the abnormal condition, it needs no special abnormal-data processing.
The abnormal data to be checked is written into the abnormal data table. Specifically, it may be processed to fit the format of the abnormal data table before being written. A specific implementation manner for writing the abnormal data to be checked into the abnormal data table is described below.
The embodiments of the present application do not limit how data to be imported that does not meet the abnormal condition is processed. In one possible implementation, such data may be written into the target database. Specifically, when the target database is an HBase database, the BulkLoad warehousing mode may be adopted.
S103: selecting target abnormal data from the abnormal data table, and importing the target abnormal data into a target database.
It can be understood that some abnormal data differs from the expected data only in format, which affects neither its import into the target database nor its reading there; such abnormal data can be imported into the target database normally for subsequent read and write operations.
The abnormal data in the abnormal data table comes from different sources, and the data tables in the target database into which it is to be written also differ. Part of the abnormal data can therefore be selected from the abnormal data table as target abnormal data and imported into the target database, so that the abnormal data to be checked in the abnormal data table is processed in turn.
S104: if the import succeeds, determining the abnormal problem of the target abnormal data to be a data exception problem.
If the target abnormal data can be imported into the target database, it has no warehousing problem, and its abnormal problem is determined to be a data exception problem.
A data exception problem is a format problem in the data to be imported. It may arise because the data format of the service system from which the data to be imported originates differs from the data format of the target database.
S105: if the import fails, determining the abnormal problem of the target abnormal data to be a warehousing exception problem.
A warehousing exception problem means that the target abnormal data cannot be correctly written into the target database. It may arise because the target abnormal data fails to satisfy the warehousing condition. If the import of the abnormal data fails, the abnormal problem of the target abnormal data is determined to be a warehousing exception problem.
Based on S101 to S105 above, the abnormal data to be checked is identified in the data to be imported and written into the corresponding abnormal data table. Target abnormal data is then selected from the abnormal data table and imported into the target database. On the one hand, this allows the part of the abnormal data that can be warehoused normally to be imported automatically; on the other hand, it identifies the cause of the abnormality of the target abnormal data, so that the abnormal data can subsequently be processed according to that cause. This speeds up the processing of abnormal data and improves data import efficiency.
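The decision in S104 and S105 can be sketched as follows. This is an illustrative sketch only: `AbnormalProblem`, `classify_by_import` and `fake_import` are hypothetical names, and the stand-in import function replaces a real write to the target database.

```python
from enum import Enum


class AbnormalProblem(Enum):
    DATA = "data exception problem"
    WAREHOUSING = "warehousing exception problem"


def classify_by_import(record, import_fn):
    """Attempt the import and derive the abnormal problem from the result.

    Success means warehousing itself works, so the problem lies in the data
    (S104); failure means the record cannot be warehoused (S105).
    """
    try:
        import_fn(record)
    except Exception:
        # Simplification: any failure of the import counts as unsuccessful.
        return AbnormalProblem.WAREHOUSING
    return AbnormalProblem.DATA


def fake_import(record):
    """Hypothetical stand-in for writing to the target database."""
    if not record.get("row_key"):
        raise ValueError("missing row key")


importable = classify_by_import({"row_key": "r1", "value": "x"}, fake_import)
rejected = classify_by_import({"value": "x"}, fake_import)
```

In practice the import function would distinguish transient failures (for example a lost connection) from genuine rejections before labelling a record; the sketch does not attempt this.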
In one possible implementation, the primary key of the abnormal data table may consist of the import time and the table name, so that abnormal data to be checked can be recorded separately for different data tables to be written and different write times.
Specifically, writing the abnormal data to be checked into the abnormal data table includes:
acquiring a target table name of a data table into which abnormal data to be checked is imported and preset import time for importing the abnormal data to be checked into a target database;
acquiring column information, data information and abnormal information of abnormal data to be checked, and forming a target value field by the column information, the data information and the abnormal information;
and writing the target table name, the target import time and the target value field into the exception data table.
The target table name of the data table in the target database into which each piece of abnormal data to be checked is to be written is acquired, together with the preset import time for importing that data into the target database. The target table name and the preset import time identify the data table in the target database into which the abnormal data to be checked is to be imported.
The column information, data information and abnormal information of the abnormal data to be checked are acquired. The column information is the column information of the original data table in which the abnormal data to be checked is stored. The data information is information related to the abnormal data to be checked. The abnormal information corresponds to the abnormal problem of the abnormal data to be checked.
The column information, the data information and the abnormal information are combined into a target value field.
The target table name, the target import time and the target value field are written into the abnormal data table. In a possible implementation manner, the target table name and the target import time form the primary key value corresponding to the abnormal data to be checked, and the target value field is the corresponding key-value data in the abnormal data table.
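The record layout described above can be sketched as follows. The sketch assumes an in-memory dictionary as a stand-in for the abnormal data table and JSON as the encoding of the target value field; neither is specified by the application, and `write_abnormal_record` and the sample values are hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical in-memory stand-in for the abnormal data table.
# Primary key: (target table name, target import time).
# Value: list of target value fields recorded under that key.
abnormal_table = defaultdict(list)


def write_abnormal_record(table, target_table_name, import_time,
                          columns, data, abnormal_info):
    """Combine column, data and abnormal information into one value field and store it."""
    value_field = json.dumps(
        {"columns": columns, "data": data, "abnormal_info": abnormal_info},
        ensure_ascii=False,
    )
    table[(target_table_name, import_time)].append(value_field)
    return value_field


write_abnormal_record(
    abnormal_table,
    "customer_txn",          # hypothetical target table name
    "2020-12-31",            # hypothetical target import time
    columns=["id", "txn_date", "amount"],
    data=["1002", "31/12/2020"],
    abnormal_info=["field count 2 != target 3"],
)
```

Keying on the table name and import time means that all abnormal records destined for the same table on the same date can be found with a single lookup.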
Further, selecting target abnormal data from the abnormal data table includes:
selecting a data table in the target database as a data table to be written;
acquiring the table name of the data table to be written as a table name to be written; acquiring date information of the data table to be written as a date to be written;
querying, according to the table name to be written and the date to be written, whether corresponding abnormal data to be checked exists in the abnormal data table;
and if so, taking the abnormal data to be checked as target abnormal data.
In a possible implementation manner, after the normal data to be written has been written into the corresponding data tables in the target database, each data table in the target database may be queried for corresponding abnormal data to be checked that has not yet been written.
A data table in the target database is selected as the data table to be written. The data table to be written may be one of the data tables into which data to be imported has been written.
Because the abnormal data table stores the target table name and the target import time of the abnormal data to be checked, the table name of the data table to be written is acquired as the table name to be written, and the date on which data was written into the data table to be written is acquired as the date to be written. The abnormal data table is then queried, according to the table name to be written and the date to be written, for corresponding abnormal data to be checked.
If corresponding abnormal data to be checked exists for the data table to be written, that abnormal data still needs to be written into the data table to be written. The corresponding abnormal data to be checked is taken as target abnormal data so that it can subsequently be written into the data table to be written.
Based on the above, in the embodiment of the present application, the abnormal data table is queried for abnormal data to be checked that corresponds to a data table to be written in the target database, so that the abnormal data to be checked can be processed. When a data table to be written has corresponding abnormal data to be checked, that data is taken as target abnormal data so that it can subsequently be written into the corresponding data table to be written, which supplements the missing data and allows the abnormal type of the target abnormal data to be determined.
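The lookup described above can be sketched as follows. The sketch assumes that the abnormal data table is keyed by the pair of table name and import date; the dictionary layout and the function name `select_target_abnormal_data` are hypothetical.

```python
def select_target_abnormal_data(abnormal_data_table, data_tables_to_write):
    """Return the abnormal data to be checked for each data table to be written.

    abnormal_data_table: dict mapping (table name, import date) to a list of records.
    data_tables_to_write: iterable of (table name to be written, date to be written).
    """
    targets = {}
    for table_name, date_to_write in data_tables_to_write:
        records = abnormal_data_table.get((table_name, date_to_write))
        if records:
            # Corresponding abnormal data exists: take it as target abnormal data.
            targets[(table_name, date_to_write)] = list(records)
    return targets


# Assumed example content of the abnormal data table.
example_abnormal_table = {
    ("customer", "2020-12-31"): ["record-a", "record-b"],
    ("account", "2020-12-30"): ["record-c"],
}
example_targets = select_target_abnormal_data(
    example_abnormal_table,
    [("customer", "2020-12-31"), ("account", "2020-12-31")],
)
```

In this example only the `customer` table has target abnormal data, because the `account` entry was recorded for a different date.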
In a possible implementation manner, importing the target exception data into the target database specifically includes:
key value data of the target abnormal data are obtained, and the key value data are spliced to obtain the target key value data;
and writing the target key value data into the target database.
After the target abnormal data is selected from the abnormal data table, the target abnormal data is extracted from the abnormal data table.
The key value data of the target abnormal data is obtained and spliced to obtain the target key value data, which is the data to be written into the target database. Writing the target key value data into the target database imports the target abnormal data into the target database. The type of the abnormal problem of the target abnormal data can then be determined according to the import result.
In the embodiment of the application, the target key value data is obtained from the key value data of the target abnormal data and is written into the target database. In this way the target abnormal data is recorded, the part of the target abnormal data that can be imported into the target database is processed automatically, and the processing efficiency of abnormal data is improved. In addition, by writing the target key value data into the target database, the type of the abnormal problem of the target abnormal data can be determined according to the writing result.
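A minimal sketch of splicing and writing the key value data is given below. It assumes that the key value data is a mapping from column name to value and that the target database is reachable through Python's `sqlite3` module; the function names and the example table are hypothetical. The two result labels follow the classification used in this application: a successful import indicates a data abnormal problem, and a failed import indicates a warehousing abnormal problem.

```python
import sqlite3


def splice_key_values(key_value_data):
    """Splice key value data into parallel column and value lists."""
    columns = list(key_value_data)
    values = [key_value_data[column] for column in columns]
    return columns, values


def import_target_abnormal_data(conn, table_name, key_value_data):
    """Write the spliced data into the target table and classify the problem."""
    columns, values = splice_key_values(key_value_data)
    column_list = ", ".join('"{}"'.format(c) for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    sql = 'INSERT INTO "{}" ({}) VALUES ({})'.format(
        table_name, column_list, placeholders
    )
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, values)
    except sqlite3.Error:
        return "warehousing abnormal problem"
    return "data abnormal problem"


# Assumed example target database with one table.
example_conn = sqlite3.connect(":memory:")
example_conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
ok_result = import_target_abnormal_data(
    example_conn, "customer", {"id": 1, "name": "Alice"}
)
bad_result = import_target_abnormal_data(
    example_conn, "customer", {"id": 2, "name": None}
)
```

Table and column names are interpolated into the statement here only because they are assumed to come from trusted configuration; values are always passed as bound parameters.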
It can be understood that the data to be imported is derived from service systems. A service system may specifically be a business system, and the data it generates may be business data. Data from different service systems has different structures, and the association between data from the respective service systems is weak. If data from the service systems were used directly as the data to be imported, a separate import task would have to be executed for each type of data to be imported, which would affect the determination of abnormal data to be checked and reduce the efficiency of data import.
To address this problem, the data to be imported can be obtained after the data from the service systems has been processed. In a possible implementation manner, before acquiring data to be imported and determining whether the data to be imported is abnormal data to be checked, the method further includes:
acquiring original data, and performing data aggregation and data conversion on the original data to obtain source data;
and selecting data to be imported from the source data.
The original data is data from the service systems and carries the characteristics of the service system from which it originates. Data aggregation and data conversion are performed on the original data to obtain processed source data.
Data aggregation refers to associating and integrating original data that belongs to different types or originates from different service systems. Specifically, original data with the same or similar characteristics is extracted from the original data and grouped together. The characteristic may be the service type associated with the original data. For example, data for a public service may be extracted from the original data and then aggregated. Data aggregation also includes processing the original data: key data can be selected from the original data or calculated from one or more pieces of original data. For example, mean data can be calculated from a plurality of pieces of original data. When data import is carried out, the key data can be used as the data to be imported, which reduces the volume of data imported into the target database.
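As an illustration of the aggregation described above, the following sketch groups original records by service type and computes mean data as the key data for each group. The field names `service_type` and `amount` and the function name `aggregate_raw_data` are assumptions made for the example only.

```python
from collections import defaultdict
from statistics import mean


def aggregate_raw_data(raw_records, group_field="service_type", value_field="amount"):
    """Group original records by a shared characteristic and compute key data."""
    groups = defaultdict(list)
    for record in raw_records:
        groups[record[group_field]].append(record[value_field])
    # One aggregated record per group replaces the individual original records.
    return [
        {
            group_field: key,
            "record_count": len(values),
            "mean_" + value_field: mean(values),
        }
        for key, values in groups.items()
    ]


# Assumed example records originating from two different service systems.
example_raw = [
    {"system": "A", "service_type": "public", "amount": 10.0},
    {"system": "B", "service_type": "public", "amount": 20.0},
    {"system": "B", "service_type": "private", "amount": 7.0},
]
example_aggregated = aggregate_raw_data(example_raw)
```

Three original records are reduced to two aggregated records, which is the reduction in import volume that the paragraph above refers to.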
Data conversion preprocesses abnormal characters in the original data to prevent them from affecting data import. Specifically, null characters may be replaced with a default character, special characters associated with individual data fields may be removed, and the like.
And obtaining source data after data aggregation and data conversion, and extracting the data to be imported into the target database from the source data.
The embodiment of the application does not limit the specific way of selecting the data to be imported from the source data, and can set the corresponding selection condition according to the data import requirement.
In addition, so that the specific operations of data aggregation and data conversion can be looked up later, information about these operations can be saved in a configuration file. The specific operations of data aggregation and data conversion can then be determined from the configuration file, which makes it possible to modify the data and to optimize the data processing mode at a later stage.
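One way to picture a configuration file that records the conversion operations is sketched below. The JSON layout, the operation names `default_for_null` and `remove_chars`, and all function names are hypothetical; the application does not specify a configuration format.

```python
import json

# Hypothetical configuration content describing the conversion operations.
CONFIG_TEXT = """
{
  "conversions": [
    {"op": "default_for_null", "value": "N/A"},
    {"op": "remove_chars", "chars": "|#"}
  ]
}
"""


def apply_default_for_null(value, params):
    """Replace a null or empty value with the configured default."""
    return params["value"] if value is None or value == "" else value


def apply_remove_chars(value, params):
    """Remove the configured special characters from a string value."""
    if not isinstance(value, str):
        return value
    return "".join(ch for ch in value if ch not in params["chars"])


OPERATIONS = {
    "default_for_null": apply_default_for_null,
    "remove_chars": apply_remove_chars,
}


def convert_with_config(record, config):
    """Apply every configured conversion, in order, to each field of a record."""
    converted = {}
    for field, value in record.items():
        for step in config["conversions"]:
            value = OPERATIONS[step["op"]](value, step)
        converted[field] = value
    return converted


example_config = json.loads(CONFIG_TEXT)
example_converted = convert_with_config({"name": "a|b#", "branch": None}, example_config)
```

Because the operations are looked up by name, changing the processing mode only requires editing the configuration content, not the conversion code.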
Based on the above, data aggregation and data conversion reduce the complexity of the field information of the data to be imported, so that different interfaces do not need to be developed for different data types and the warehousing interface is unified. High-quality data that can be imported through this interface is obtained, which improves the success rate of batch warehousing. In addition, the volume of the data to be imported is reduced, the efficiency of data import is improved, and the later workload of processing abnormal data to be imported is reduced.
Further, before data aggregation and data conversion are performed on the original data, whether the original data exists or not and whether the original data is complete or not may be verified.
In a possible implementation manner, before acquiring the original data and performing data aggregation and data conversion on the original data to obtain the source data, the method further includes:
acquiring first file information of a first data file from a configuration file, wherein the first file information comprises first position information and first data information; the first data file is a file for storing original data;
inquiring whether a first data file exists or not by using the first position information;
and if so, inquiring whether the original data in the first data file is complete or not by utilizing the first data information.
The configuration file holds basic information about the data transmitted by the service systems, and the first file information of the first data file can be acquired from the configuration file. The first data file is a file for storing data sent by a service system. The first file information includes first location information and first data information. The first location information is used to determine the storage location of the first data file, which may be a designated directory. The first data information describes the data stored in the first data file and may specifically include one or more of file name information, file classification information, and file distribution information.
The first location information is used to determine whether the first data file exists. If the first data file does not exist, the file storing the data may have been lost in transmission or may have a storage problem, and the data needs to be acquired again or the problem needs to be investigated.
If the first data file exists, the data transmitted by the corresponding service system exists. However, the data stored in the first data file is not necessarily complete, so the first data information is used to check whether the data in the first data file is complete.
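The two-stage check can be sketched as follows. The application does not state how completeness is decided, so this sketch assumes that the location information names a directory and that the data information lists the expected file names; data is treated as complete when every listed file is present and non-empty. The keys `location` and `file_names` and the function name `check_data_file` are hypothetical.

```python
from pathlib import Path


def check_data_file(file_info):
    """Return "missing", "incomplete" or "complete" for a configured data file.

    file_info["location"]: directory expected to hold the data (location information).
    file_info["data"]["file_names"]: expected file names (data information).
    """
    location = Path(file_info["location"])
    # Stage 1: existence check using the location information.
    if not location.is_dir():
        return "missing"
    # Stage 2: completeness check using the data information.
    for name in file_info["data"]["file_names"]:
        candidate = location / name
        if not candidate.is_file() or candidate.stat().st_size == 0:
            return "incomplete"
    return "complete"
```

Only a result of `"complete"` would allow the data in the file to be taken as original data; the other two results correspond to the troubleshooting cases described above. The same check could be applied to the second data file with the second file information.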
It should be noted that the configuration file storing the first file information of the first data file and the configuration file storing the information about the specific operations of data aggregation and data conversion may be the same configuration file. Such a configuration file may be a first-level configuration file, used for storing information related to data and data files. In addition, there are second-level and third-level configuration files. The second-level configuration file can be used for storing task information related to warehousing the data to be imported. The third-level configuration file can be used for building specific subtasks and holds initialization information related to the warehousing operation. By setting the specific information in these three levels of configuration files, the import system for the data to be imported can be extended simply and conveniently, and the stability of data import is improved.
Correspondingly, acquiring raw data comprises:
if the data in the first data file is complete, taking the data in the first data file as original data;
and acquiring the original data from the first data file.
If the data in the first data file is determined to be complete, the data in the first data file is determined to be the original data. The original data is acquired from the first data file and then processed further.
In the embodiment of the application, checking the existence of the data file and the completeness of its data ensures the integrity of the original data, which facilitates subsequent processing of the original data and guards against problems that occur when the service system transmits data. As a result, the efficiency of importing the data to be imported is improved, the amount of abnormal data to be imported is reduced, and the efficiency of processing the abnormal data to be imported is improved.
In a possible implementation manner, before acquiring data to be imported and determining whether the data to be imported is abnormal data to be checked, the method further includes:
acquiring second file information of a second data file from the configuration file, wherein the second file information comprises second position information and second data information; the second data file is a file for storing source data;
inquiring whether a second data file exists or not by utilizing the second position information;
and if so, inquiring whether the data in the second data file is complete or not by utilizing the second data information.
The second data file is a data file for storing source data. The second data file may be generated after processing the original data to obtain source data.
The second data file may be lost during transmission, so the existence of the second data file needs to be verified. Specifically, second file information of the second data file is obtained from the configuration file, and the second file information includes second location information and second data information. The second location information is the storage location information of the second data file and may specifically be a storage location corresponding to the directory to be imported. The second data information describes the data stored in the second data file and may specifically include one or more of file name information, file classification information, and file distribution information.
The second location information is used to determine whether the second data file exists. If the second data file does not exist, the file storing the source data may have been lost during transmission or may have a storage problem, and the second data file needs to be acquired again or the problem needs to be investigated.
If the second data file exists, the source data exists. However, the source data stored in the second data file is not necessarily complete, so the second data information is used to check whether the source data in the second data file is complete.
Correspondingly, acquiring data to be imported, including:
if the data in the second data file is complete, taking the data in the second data file as source data;
and acquiring the data to be imported from the second data file.
If the data in the second data file is determined to be complete, the data in the second data file is determined to be source data. Note that not all source data is to-be-imported data, and a part of the source data may be selected as the to-be-imported data. And acquiring source data from the second data file, acquiring data to be imported from the source data, and performing subsequent judgment on abnormal conditions on the data to be imported.
In the embodiment of the application, the second data file is checked for file existence and data completeness, which ensures the integrity of the source data and makes it convenient to select the data to be imported from the source data. Problems of data file loss and data omission while generating or transmitting the source data are guarded against, the amount of abnormal data to be imported is reduced, and the efficiency of processing the abnormal data to be imported is improved.
Based on the method for processing the exception import data provided by the above method embodiment, an embodiment of the present application further provides a device for processing the exception import data, and the device for processing the exception import data will be described below with reference to the accompanying drawings.
Referring to fig. 2, this figure is a schematic structural diagram of a device for processing exception import data according to an embodiment of the present application. As shown in fig. 2, the apparatus for processing exception import data includes:
a first obtaining unit 201, configured to obtain data to be imported, and determine whether the data to be imported meets an abnormal condition; the abnormal conditions comprise that the number of data fields of the data to be imported does not accord with the number of target fields and/or the field format of the data to be imported does not accord with a preset field format;
a first determining unit 202, configured to determine, if the data to be imported satisfies an exception condition, that the data to be imported is to-be-checked exception data, and write the to-be-checked exception data into an exception data table;
an importing unit 203, configured to select target exception data from the exception data table, and import the target exception data into a target database;
a second determining unit 204, configured to determine, if the importing is successful, an abnormal problem of the target abnormal data as a data abnormal problem;
a third determining unit 205, configured to determine, if the importing is unsuccessful, an abnormal problem of the target abnormal data as a warehousing abnormal problem.
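The abnormal condition evaluated by the first obtaining unit 201 can be sketched as follows. The sketch assumes that each piece of data to be imported is one delimited text line; the target field count, the per-field patterns and the function name `find_abnormal_condition` are hypothetical examples, not values from the application.

```python
import re

# Assumed target layout: numeric id, non-empty name, ISO-style date.
TARGET_FIELD_COUNT = 3
FIELD_FORMATS = [
    re.compile(r"\d+"),
    re.compile(r".+"),
    re.compile(r"\d{4}-\d{2}-\d{2}"),
]


def find_abnormal_condition(line, delimiter=","):
    """Return a description of the abnormal condition met, or None if the line is normal."""
    fields = line.split(delimiter)
    # Condition 1: the number of data fields does not match the target field count.
    if len(fields) != TARGET_FIELD_COUNT:
        return "field count mismatch"
    # Condition 2: a field does not match its preset field format.
    for index, (field, pattern) in enumerate(zip(fields, FIELD_FORMATS)):
        if not pattern.fullmatch(field):
            return "field {} format mismatch".format(index)
    return None
```

A line for which the function returns a description would be treated as abnormal data to be checked and handed to the first determining unit 202; a return value of `None` means the line can be imported normally.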
In a possible implementation manner, the first determining unit 202 is specifically configured to obtain a target table name of the data table into which the abnormal data to be checked is to be imported and a target import time at which the abnormal data to be checked is to be imported into the target database;
acquiring column information, data information and abnormal information of the abnormal data to be checked, and forming a target value field by the column information, the data information and the abnormal information;
and writing the target table name, the target import time and the target value field into the abnormal data table.
In a possible implementation manner, the importing unit 203 is specifically configured to select a data table in a target database as a data table to be written in;
acquiring a table name of the data table to be written as a table name to be written; acquiring date information of the data table to be written as a date to be written;
inquiring whether corresponding abnormal data to be checked exist in the abnormal data table according to the name of the table to be written and the date to be written;
and if so, taking the abnormal data to be checked as target abnormal data.
In a possible implementation manner, the importing unit 203 is specifically configured to obtain key value data of the target abnormal data, and splice the key value data to obtain target key value data;
and writing the target key value data into a target database.
In one possible implementation, the apparatus further includes:
the second acquisition unit is used for acquiring original data, and performing data aggregation and data conversion on the original data to obtain source data;
and the first selection unit is used for selecting the data to be imported from the source data.
In one possible implementation, the apparatus further includes:
a third obtaining unit, configured to obtain first file information of a first data file from a configuration file, where the first file information includes first location information and first data information; the first data file is a file for storing original data;
a first query unit, configured to query whether the first data file exists or not by using the first location information;
the second query unit is used for querying whether the original data in the first data file is complete or not by utilizing the first data information if the original data exists;
the second obtaining unit is specifically configured to, if the data in the first data file is complete, take the data in the first data file as original data;
and acquiring original data from the first data file.
In one possible implementation, the apparatus further includes:
a fourth obtaining unit, configured to obtain second file information of a second data file from the configuration file, where the second file information includes second location information and second data information; the second data file is a file for storing source data;
a third inquiring unit, configured to inquire whether the second data file exists by using the second location information;
a fourth query unit, configured to query whether data in the second data file is complete or not by using the second data information if the data exists;
the first obtaining unit 201 is specifically configured to, if the data in the second data file is complete, take the data in the second data file as source data;
and acquiring data to be imported from the second data file.
Based on the method for processing the exception import data provided by the above method embodiment, the application provides a device for processing the exception import data, which comprises: a processor, a memory, and a system bus;
the processor and the memory are connected through the system bus;
the memory is used for storing one or more programs, the one or more programs comprising instructions, which when executed by the processor, cause the processor to perform the method of the above embodiment.
Based on the method for processing the exception import data provided by the foregoing method embodiment, the present application provides a computer-readable storage medium, where instructions are stored, and when the instructions are run on a terminal device, the terminal device is caused to execute the method described in the foregoing embodiment.
It should be noted that, in the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other. For the system or the device disclosed by the embodiment, the description is simple because the system or the device corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
It should be understood that in the present application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships are possible; for example, "A and/or B" may indicate: only A is present, only B is present, or both A and B are present, wherein A and B may be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of single items or plural items. For example, at least one of a, b, or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.
It is further noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for processing exception import data, the method comprising:
acquiring data to be imported, and judging whether the data to be imported meets an abnormal condition; the abnormal conditions comprise that the number of data fields of the data to be imported does not accord with the number of target fields and/or the field format of the data to be imported does not accord with a preset field format;
if the data to be imported meets the abnormal condition, determining the data to be imported as abnormal data to be checked, and writing the abnormal data to be checked into an abnormal data table;
selecting target abnormal data from the abnormal data table, and importing the target abnormal data into a target database;
if the import is successful, determining the abnormal problem of the target abnormal data as a data abnormal problem;
and if the import is unsuccessful, determining the abnormal problem of the target abnormal data as a warehousing abnormal problem.
2. The method according to claim 1, wherein the writing the exception data to be checked into an exception data table comprises:
acquiring a target table name of a data table into which the abnormal data to be checked is to be imported and a target import time at which the abnormal data to be checked is to be imported into a target database;
acquiring column information, data information and abnormal information of the abnormal data to be checked, and forming a target value field by the column information, the data information and the abnormal information;
and writing the target table name, the target import time and the target value field into the abnormal data table.
3. The method of claim 2, wherein said selecting target anomaly data from said anomaly data table comprises:
selecting a data table in a target database as a data table to be written;
acquiring a table name of the data table to be written as a table name to be written; acquiring date information of the data table to be written as a date to be written;
inquiring whether corresponding abnormal data to be checked exist in the abnormal data table according to the name of the table to be written and the date to be written;
and if so, taking the abnormal data to be checked as target abnormal data.
4. The method of claim 1, wherein importing the target anomaly data into a target database comprises:
key value data of the target abnormal data are obtained, and the key value data are spliced to obtain target key value data;
and writing the target key value data into a target database.
5. The method according to claim 1, wherein before the obtaining of the data to be imported and the determining of whether the data to be imported is abnormal data to be checked, the method further comprises:
acquiring original data, and performing data aggregation and data conversion on the original data to obtain source data;
and selecting data to be imported from the source data.
6. The method of claim 5, wherein before the acquiring of the original data and the performing of data aggregation and data conversion on the original data to obtain the source data, the method further comprises:
acquiring first file information of a first data file from a configuration file, wherein the first file information comprises first position information and first data information; the first data file is a file for storing original data;
querying whether the first data file exists by using the first position information;
if so, inquiring whether the original data in the first data file is complete by using the first data information;
the acquiring of the raw data includes:
if the data in the first data file is complete, taking the data in the first data file as original data;
and acquiring original data from the first data file.
7. The method according to claim 1, wherein before the obtaining of the data to be imported and the determining of whether the data to be imported is abnormal data to be checked, the method further comprises:
acquiring second file information of a second data file from the configuration file, wherein the second file information comprises second position information and second data information; the second data file is a file for storing source data;
querying whether the second data file exists by using the second position information;
if so, inquiring whether the data in the second data file is complete by using the second data information;
the acquiring the data to be imported includes:
if the data in the second data file is complete, taking the data in the second data file as source data;
and acquiring data to be imported from the second data file.
8. An apparatus for processing exception import data, the apparatus comprising:
the first acquisition unit is used for acquiring data to be imported and judging whether the data to be imported meets an abnormal condition; the abnormal conditions comprise that the number of data fields of the data to be imported does not accord with the number of target fields and/or the field format of the data to be imported does not accord with a preset field format;
the first determining unit is used for determining the data to be imported as abnormal data to be checked and writing the abnormal data to be checked into an abnormal data table if the data to be imported meets an abnormal condition;
the import unit is used for selecting target abnormal data from the abnormal data table and importing the target abnormal data into a target database;
the second determining unit is used for determining the abnormal problem of the target abnormal data as a data abnormal problem if the importing is successful;
and the third determining unit is used for determining the abnormal problem of the target abnormal data as a warehousing abnormal problem if the importing is unsuccessful.
9. An apparatus for processing exception import data, comprising: a processor, a memory, a system bus;
the processor and the memory are connected through the system bus;
the memory is to store one or more programs, the one or more programs comprising instructions, which when executed by the processor, cause the processor to perform the method of any of claims 1-7.
10. A computer-readable storage medium having stored therein instructions that, when executed on a terminal device, cause the terminal device to perform the method of any one of claims 1-7.
CN202011637145.4A 2020-12-31 2020-12-31 Processing method, device and equipment for abnormal imported data Active CN112632132B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011637145.4A CN112632132B (en) 2020-12-31 2020-12-31 Processing method, device and equipment for abnormal imported data


Publications (2)

Publication Number Publication Date
CN112632132A true CN112632132A (en) 2021-04-09
CN112632132B CN112632132B (en) 2024-04-12

Family

ID=75290115

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011637145.4A Active CN112632132B (en) 2020-12-31 2020-12-31 Processing method, device and equipment for abnormal imported data

Country Status (1)

Country Link
CN (1) CN112632132B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113220695A (en) * 2021-06-03 2021-08-06 中国农业银行股份有限公司 Data storage method, device, equipment, medium and product

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105760373A (en) * 2014-12-15 2016-07-13 金蝶软件(中国)有限公司 Abnormal data processing method and abnormal data processing device
CN107506451A (en) * 2017-08-28 2017-12-22 泰康保险集团股份有限公司 abnormal information monitoring method and device for data interaction
CN109656985A (en) * 2018-09-27 2019-04-19 深圳壹账通智能科技有限公司 Data lead-in method, system, terminal and storage medium
CN110162563A (en) * 2019-05-28 2019-08-23 深圳市网心科技有限公司 A kind of data storage method, system and electronic equipment and storage medium
CN110781231A (en) * 2019-09-19 2020-02-11 平安科技(深圳)有限公司 Batch import method, device, equipment and storage medium based on database
WO2020134213A1 (en) * 2018-12-25 2020-07-02 苏宁云计算有限公司 Method and system for querying abnormal financial data on basis of knowledge map

Also Published As

Publication number Publication date
CN112632132B (en) 2024-04-12

Similar Documents

Publication Publication Date Title
CN111506559B (en) Data storage method, device, electronic equipment and storage medium
CN110135590B (en) Information processing method, information processing apparatus, information processing medium, and electronic device
CN109871408B (en) Multi-type database adaptation method, device, electronic equipment and storage medium
CN112632132A (en) Method, device and equipment for processing abnormal import data
CN112214473B (en) Data migration method and system between databases
CN112114978A (en) Electronic scale data updating method, device, equipment and readable storage medium
CN111200654A (en) Client request error processing method and device
CN116166629A (en) File format conversion method, device, equipment and readable storage medium
CN113377604B (en) Data processing method, device, equipment and storage medium
CN111010676B (en) Short message caching method, device and system
CN114648410A (en) Stock monitoring method, apparatus, system, device and medium
CN106775854B (en) Method and device for generating configuration file
CN113468379A (en) Data source processing method and device and intelligent analysis platform
CN112181539B (en) File processing method, device, equipment and medium
CN116303627B (en) Query method and device for semiconductor test data, electronic equipment and storage medium
CN111046012B (en) Method and device for extracting inspection log, storage medium and electronic equipment
CN112632147B (en) Data differentiation comparison method, system and storage medium
CN116719866B (en) Multi-format data self-adaptive distribution method and system
EP1183596B1 (en) Generating optimized computer data field conversion routines
CN110457260B (en) File processing method, device, equipment and computer readable storage medium
EP3989078A1 (en) Method and apparatus for realizing global unique index
CN107729013B (en) Method for managing operation buttons on web page and computer-readable storage medium
AU2023285924A1 (en) Data processing method and apparatus, communication device, storage medium, and vehicle
CN112965992A (en) Multi-parameter constraint data retrieval man-machine interaction method and device
CN115408993A (en) Data conversion method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant