CN115794752A - Data processing method, device, equipment, medium and product - Google Patents

Publication number
CN115794752A
Authority
CN
China
Prior art keywords
file
data
new data
data file
format
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211578816.3A
Other languages
Chinese (zh)
Inventor
李玮
李德良
郭鹏翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
CCB Finetech Co Ltd
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp and CCB Finetech Co Ltd
Priority to CN202211578816.3A
Publication of CN115794752A

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application disclose a data processing method, apparatus, device, medium, and product. The data processing method includes: scanning a file directory on a file server through a file scanning thread; and, when it is determined based on the file directory that a new data file has been uploaded to the file server and the upload is complete, determining a target file parser corresponding to the format type of the new data file according to file parameters of the new data file, and parsing the new data file with the target file parser to obtain first data. Because each data file is identified and parsed by the file parser that corresponds to its format type, data files of various format types can be accurately identified.

Description

Data processing method, device, equipment, medium and product
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a data processing method, apparatus, device, medium, and product.
Background
In a data processing scenario, data is usually loaded into a data warehouse and processed using a job or algorithm. The data can comprise multi-source data such as a data dictionary and transaction data, and has the characteristics of multiple manufacturers, multiple service lines, multiple channels and the like, so that the data format is diversified.
The multi-source heterogeneous data formats cannot be accurately identified by the data platform, so that the data cannot be normally loaded to the data warehouse, and the processing result of the data is influenced.
Summary of the Application
The embodiment of the application provides a data processing method, a data processing device, data processing equipment, data processing media and a data processing product, and multi-source heterogeneous data formats can be accurately identified.
In a first aspect, an embodiment of the present application provides a data processing method, including:
scanning a file directory on a file server through a file scanning thread, wherein the file directory is generated based on a data file uploaded to the file server;
when it is determined, based on the file directory, that a new data file has been uploaded to the file server and that the upload of the new data file is complete, determining a target file parser corresponding to the format type of the new data file according to file parameters of the new data file;
and parsing the new data file with the target file parser to obtain first data.
In a second aspect, an embodiment of the present application provides a data processing apparatus, including:
the scanning module is used for scanning a file directory on the file server through a file scanning thread, and the file directory is generated based on the data file uploaded to the file server;
the determining module is used for determining a target file parser corresponding to the format type of the new data file according to the file parameters of the new data file when the new data file is determined to be uploaded to the file server based on the file directory and the uploading of the new data file is completed;
and the parsing module is used for parsing the new data file with the target file parser to obtain first data.
In a third aspect, an embodiment of the present application provides an electronic device, including:
a processor;
a memory for storing computer program instructions;
the computer program instructions, when executed by a processor, implement the method as described in the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the method according to the first aspect.
In a fifth aspect, the present application provides a computer program product, and when executed by a processor of an electronic device, the instructions of the computer program product cause the electronic device to perform the method according to the first aspect.
According to the data processing method, apparatus, device, medium, and product of the embodiments of the present application, the file directory on the file server is scanned through the file scanning thread; when it is determined based on the file directory that a new data file has been uploaded to the file server and the upload is complete, the target file parser corresponding to the format type of the new data file is determined according to the file parameters of the new data file, and the new data file is parsed with the target file parser to obtain the first data. Because each data file is identified and parsed by the file parser corresponding to its format type, data files of various format types can be accurately identified. In addition, because parsing starts only after the data file has been completely uploaded, the data content of the data file can be accurately identified.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the embodiments of the present application will be briefly described below, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a flowchart of a data processing method according to an embodiment of the present application;
Fig. 2 is a schematic process diagram of data processing according to an embodiment of the present application;
Fig. 3 is a structural diagram of a data processing apparatus according to an embodiment of the present application;
Fig. 4 is a structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Features and exemplary embodiments of various aspects of the present application will be described in detail below, and in order to make objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail below with reference to the accompanying drawings and the embodiments. It should be understood that the specific embodiments described herein are merely illustrative of and not restrictive on the broad application. It will be apparent to one skilled in the art that the present application may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present application by illustrating examples thereof.
It should be noted that, in this document, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising ..." does not exclude the presence of additional like elements in a process, method, article, or apparatus that comprises the element.
It should be noted that all the embodiments of the present application conform to the relevant regulations of national laws and regulations for data acquisition, storage, use, processing, and the like.
As described above, in a data processing scenario, multiple types of data files are usually encountered, and the data files may originate from different manufacturers, business lines, or channels. So that subsequent data processing results are not affected, the various types of data files uploaded by different manufacturers, business lines, or channels need to be accurately identified.
However, in general, a platform or a mechanism can only identify one type of data file, but cannot identify various types of data files, thereby affecting the subsequent data processing result.
Therefore, the embodiment of the application provides a data processing method, device, equipment, medium and product, which can accurately identify multi-source heterogeneous data formats and acquire data contents.
The following describes a data processing method provided in an embodiment of the present application with reference to a specific embodiment, and fig. 1 is a flowchart of a data processing method provided in an embodiment of the present application. The method can be applied to electronic equipment which can include but is not limited to a mobile phone, a tablet computer, a notebook computer, a palm computer and the like.
As shown in fig. 1, the data processing method may include the steps of:
S110, scanning a file directory on the file server through a file scanning thread.
The file directory is generated based on the data files uploaded to the file server.
S120, when it is determined, based on the file directory, that a new data file has been uploaded to the file server and that the upload of the new data file is complete, determining a target file parser corresponding to the format type of the new data file according to the file parameters of the new data file.
S130, parsing the new data file with the target file parser to obtain first data.
According to the method and the device, a file directory on a file server is scanned through a file scanning thread, a new data file is determined to be uploaded to the file server based on the file directory, and under the condition that the new data file is uploaded completely, a target file parser corresponding to the format type of the new data file is determined according to the file parameters of the new data file, and the new data file is parsed by the target file parser to obtain the first data. The embodiment of the application can identify and analyze the corresponding data file by using the file analyzer corresponding to the format type of the data file to obtain the data content, so that the data files of various format types can be accurately identified.
The above steps are described in detail below, specifically as follows:
In S110, the file scanning thread is configured to scan the file server to determine whether a new data file has been uploaded. For example, the file scanning thread may scan a file directory on the file server; because the file directory is generated based on the uploaded data files, an update to the file directory indicates that a new data file has been uploaded to the file server.
For example, when it is determined that a new data file is uploaded to the file server, the new data file may be added to the file execution queue to wait for a subsequent execution process.
For example, the file scanning thread may be registered in a thread pool of the scheduler, and the thread pool may include other threads besides the file scanning thread, for example, in this embodiment, a file parsing execution thread may also be included, and the following embodiments may be referred to for the relevant content of the file parsing execution thread. The file scanning thread may be a single thread, i.e., the thread pool contains one file scanning thread.
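The directory scan in S110 can be illustrated with a minimal Java sketch. It assumes the file directory is an ordinary local directory and that "updated" simply means a file appears that was not present in the previous scan; the class name `DirectoryScanner` and the method `scanForNewFiles` are hypothetical and do not come from the application.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical scanner: reports files that were not present in the previous scan. */
final class DirectoryScanner {
    private final Path directory;
    // Paths already reported by an earlier scan.
    private final Set<Path> seen = new HashSet<>();

    DirectoryScanner(Path directory) {
        this.directory = directory;
    }

    /** Lists the directory once and returns the regular files not seen before. */
    synchronized List<Path> scanForNewFiles() throws IOException {
        List<Path> fresh = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(directory)) {
            for (Path entry : entries) {
                // Set.add returns true only the first time a path is recorded.
                if (Files.isRegularFile(entry) && seen.add(entry)) {
                    fresh.add(entry);
                }
            }
        }
        return fresh;
    }
}
```

To run the scan on a single dedicated thread at a fixed interval, one option is to call `scanForNewFiles` from a task scheduled with `Executors.newSingleThreadScheduledExecutor().scheduleWithFixedDelay(...)`; the interval itself is a deployment choice that the text does not specify.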
The file server is used for receiving original data files uploaded by the external system, namely data files corresponding to various format types, and exemplarily, the original data files may include at least one format of an xlsx format, an mdb format, a text format and a csv format. Of course, other formats are possible, and the embodiments of the present application are not limited. The file server can simultaneously receive original data files uploaded by a plurality of external systems, so that when a plurality of new data files are accessed, the file server can quickly respond, and the efficiency is improved.
The external system may be a system capable of transferring data files, and may be, for example, a different system of the same organization or a system of different organizations.
In S120, when a new data file is being uploaded, the scanning thread may determine whether the new data file is still being written. For example, if no further data is written to the new data file within a preset time, it may be determined that writing has finished, that is, that the upload of the new data file is complete. The preset time may be set according to actual needs.
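One way to approximate the "no longer written within a preset time" check is to compare file attributes before and after a quiet period. The sketch below is an assumption about how such a check could look, not the application's implementation; `UploadCompletionCheck`, `isUploadComplete`, and the `quietMillis` parameter are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical helper: treats a file as fully uploaded once it stops changing. */
final class UploadCompletionCheck {
    /**
     * Returns true if the file's size and last-modified time are unchanged
     * across the quiet period (the "preset time" in the text).
     */
    static boolean isUploadComplete(Path file, long quietMillis)
            throws IOException, InterruptedException {
        BasicFileAttributes before = Files.readAttributes(file, BasicFileAttributes.class);
        Thread.sleep(quietMillis);
        BasicFileAttributes after = Files.readAttributes(file, BasicFileAttributes.class);
        return before.size() == after.size()
                && before.lastModifiedTime().equals(after.lastModifiedTime());
    }
}
```

This heuristic can misjudge a stalled upload as complete, so the quiet period should be chosen with the slowest expected uploader in mind.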
The file parameters may be parameters from which the format type of the data file can be determined. For example, the file parameters may include a source parameter, from which the source of the data file may be determined, and a file type parameter, from which the format type of the data file may be determined.
It should be understood that the data file may have different sources, and the manner used in parsing the format of the data file may be different, and according to the source and the file type of the data file, in the embodiment of the present application, a more appropriate file parser may be selected, so as to identify the format of the data file more accurately, and obtain the data content.
Illustratively, before S120, the data processing method may further include the steps of:
adding the new data file to the file execution queue to wait;
when a file parsing execution thread is idle, distributing the new data file to a file parsing executor through the file parsing execution thread;
and identifying the file name of the new data file by keyword identification using the file parsing executor, to obtain the file parameters of the new data file, wherein the file parameters indicate the format type of the new data file.
According to the method and the device, the file execution queue is utilized, so that the processing of the data files can be executed in sequence, and input and output blockage caused when a disk is read and written can be avoided.
The file parsing execution threads are threads registered in the thread pool, and in actual use the number of file parsing execution threads can be controlled by the thread pool. Because reading and writing files is an input/output-intensive operation, multithreading makes fuller use of Central Processing Unit (CPU) resources, while controlling the number of threads through the thread pool avoids input/output blocking when reading and writing the disk.
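The combination of a file execution queue and a bounded pool of file parsing execution threads can be sketched as follows. This is a simplified illustration under stated assumptions: files are represented by their names, the executor is any `Consumer<String>`, and a worker stops once the queue stays empty for `idleMillis`. The class `FileExecutionQueue` and its methods are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical queue plus fixed-size pool of parsing execution threads. */
final class FileExecutionQueue {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final ExecutorService workers;
    private final int workerCount;

    FileExecutionQueue(int workerCount) {
        this.workerCount = workerCount;
        // The pool size caps how many files are read from disk at once.
        this.workers = Executors.newFixedThreadPool(workerCount);
    }

    /** Adds a new data file (by name) to the end of the queue. */
    void enqueue(String fileName) {
        queue.add(fileName);
    }

    /** Runs the workers until the queue has stayed empty for idleMillis. */
    void drain(Consumer<String> executor, long idleMillis) throws InterruptedException {
        for (int i = 0; i < workerCount; i++) {
            workers.execute(() -> {
                try {
                    String file;
                    // An idle worker takes the next waiting file, in queue order.
                    while ((file = queue.poll(idleMillis, TimeUnit.MILLISECONDS)) != null) {
                        executor.accept(file);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        workers.shutdown();
        workers.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

A long-running service would keep the workers alive instead of draining and shutting down; the one-shot `drain` is used here only to keep the sketch short and testable.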
The time each file parsing execution thread needs to execute a task may differ. After finishing a task, a file parsing execution thread waits for a preset duration before executing the next task, and the preset durations of different threads may differ. For example, file parsing execution thread A waits 1 s after finishing a task before executing the next one, while file parsing execution thread B waits 2 s.
When a file parsing execution thread is idle, a new data file can be distributed to a file parsing executor through that idle thread. The file parsing executor is used for identifying the file parameters of the new data file.
Illustratively, the file parsing executor identifies the file name of the new data file by means of keyword identification, and obtains file parameters, so as to select an adapted file parser according to the file parameters.
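Keyword identification on the file name might look like the following sketch. The application does not list the actual keywords, so the keyword tables here (source keywords such as `vendora`, and file extensions as format keywords) are purely illustrative assumptions, as are the names `FileNameIdentifier` and `FileParameters`.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical executor step: derives file parameters from keywords in the file name. */
final class FileNameIdentifier {
    /** File parameters as described in the text: a source and a format type. */
    static final class FileParameters {
        final String source;
        final String formatType;

        FileParameters(String source, String formatType) {
            this.source = source;
            this.formatType = formatType;
        }
    }

    // Illustrative keyword tables; LinkedHashMap keeps a predictable match order.
    private static final Map<String, String> SOURCE_KEYWORDS = new LinkedHashMap<>();
    private static final Map<String, String> FORMAT_KEYWORDS = new LinkedHashMap<>();

    static {
        SOURCE_KEYWORDS.put("vendora", "VENDOR_A");
        SOURCE_KEYWORDS.put("vendorb", "VENDOR_B");
        FORMAT_KEYWORDS.put(".xlsx", "XLSX");
        FORMAT_KEYWORDS.put(".mdb", "MDB");
        FORMAT_KEYWORDS.put(".csv", "CSV");
        FORMAT_KEYWORDS.put(".txt", "TEXT");
    }

    /** Matches the lower-cased file name against both keyword tables. */
    static FileParameters identify(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        return new FileParameters(
                firstMatch(lower, SOURCE_KEYWORDS),
                firstMatch(lower, FORMAT_KEYWORDS));
    }

    // Returns the value of the first keyword contained in the name, or "UNKNOWN".
    private static String firstMatch(String name, Map<String, String> keywords) {
        for (Map.Entry<String, String> entry : keywords.entrySet()) {
            if (name.contains(entry.getKey())) {
                return entry.getValue();
            }
        }
        return "UNKNOWN";
    }
}
```

With these assumed tables, a file named `VendorA_trades_20221201.xlsx` would yield the source `VENDOR_A` and the format type `XLSX`.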
According to the embodiment of the application, the file parsing execution threads allow a suitable file parser to be selected for each uploaded data file, improve task concurrency, and let CPU resources be fully used while waiting on input and output.
The file parser is used for recognizing the format of the data file and parsing the data file according to a preset parsing method to obtain data content.
Illustratively, the parsing process of the file parser may include, but is not limited to, file preprocessing, acquiring the table information corresponding to the file, and reading the data content. The file preprocessing may detect whether the data file has been completely uploaded. The table information may be the information contained in the data table corresponding to the standard format, for example, the name of each column. For example, if the first column of the table information is the name and the second column is the identification number, then the first column of the adjusted data file should also be the name and the second column should be the identification number.
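The column adjustment described above can be read as reordering each parsed row so that it follows the column order given by the table information. The sketch below takes that reading as an assumption; the application does not state how the adjustment is performed, and `ColumnAligner` and `align` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reorders a row to follow the standard column order. */
final class ColumnAligner {
    /**
     * @param sourceHeader   column names in the order they appear in the source file
     * @param row            one data row, in the same order as sourceHeader
     * @param standardHeader column names in the order required by the table information
     */
    static List<String> align(List<String> sourceHeader, List<String> row,
                              List<String> standardHeader) {
        List<String> aligned = new ArrayList<>();
        for (String column : standardHeader) {
            int index = sourceHeader.indexOf(column);
            if (index < 0 || index >= row.size()) {
                throw new IllegalArgumentException("Missing column: " + column);
            }
            aligned.add(row.get(index));
        }
        return aligned;
    }
}
```

For instance, a source row ordered as identification number then name is returned as name then identification number when the standard header lists the name first.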
Exemplarily, the S120 may include the following steps:
and searching a file parser list according to the file parameters of the new data file to obtain a target file parser corresponding to the format type of the new data file, wherein the file parser list is used for storing registered file parsers adapting to data files with different format types.
Specifically, the file parser list is searched according to the file parameters, and a file parser adapted to the format type of the data file, that is, a target file parser, can be obtained.
The file parser list can contain a plurality of file parsers, so that the heterogeneous data formats of multiple sources can be identified, and the expandability of data file access is improved. Each file parser needs to be registered in advance, and the registration process is not limited in the embodiment of the present application.
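A registered file parser list can be modeled as a map from format type to parser. The following is a minimal sketch under that assumption: the `FileParser` interface, `FileParserRegistry`, and `NaiveCsvParser` are hypothetical, and the CSV parser deliberately ignores quoting so that the example stays short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical parser contract: turns raw file content into rows of column values. */
interface FileParser {
    List<List<String>> parse(String content);
}

/** Hypothetical parser list: stores registered parsers keyed by format type. */
final class FileParserRegistry {
    private final Map<String, FileParser> parsers = new ConcurrentHashMap<>();

    /** Registers (or replaces) the parser for one format type. */
    void register(String formatType, FileParser parser) {
        parsers.put(formatType, parser);
    }

    /** Looks up the target file parser; empty if no parser is registered. */
    Optional<FileParser> find(String formatType) {
        return Optional.ofNullable(parsers.get(formatType));
    }
}

/** Illustrative parser only: splits on commas and does not handle quoted fields. */
final class NaiveCsvParser implements FileParser {
    @Override
    public List<List<String>> parse(String content) {
        List<List<String>> rows = new ArrayList<>();
        for (String line : content.split("\\R")) {
            if (!line.isEmpty()) {
                // The -1 limit keeps trailing empty columns.
                rows.add(Arrays.asList(line.split(",", -1)));
            }
        }
        return rows;
    }
}
```

Supporting a further format then amounts to registering one more `FileParser` implementation, which is one way to realize the expandability the text mentions.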
Illustratively, file parsers may include, but are not limited to, an MDB file parser, an Excel file parser, and a text file parser, so that data files in MDB format, Excel format, and text format can be identified.
In S130, after the target file parser is determined, the corresponding data file may be parsed with the target file parser, so that multi-source heterogeneous data formats can be identified, providing a basis for subsequent data processing.
For the parsing process of the target file parser, reference may be made to the above embodiments; for brevity, details are not repeated here.
In some embodiments, after S130, the data processing method may further include the steps of:
arranging the first data according to a preset format to obtain a data file in a standard format;
and uploading the data file with the standard format to a data platform, and processing the data file with the standard format by the data platform.
The preset format may be a format that can be recognized by the data platform, and for example, the first data may be arranged in columns, and preset separators are arranged between different columns, so as to obtain a data file in a standard format.
For example, the preset delimiter may be | @ |, or any other delimiter that the data platform can recognize, which is not limited in the embodiment of the present application.
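Arranging the first data in columns with a preset delimiter can be sketched as a simple join. The sketch assumes the delimiter is written as `|@|` without surrounding spaces and that rows are separated by newlines; both are assumptions, and `StandardFormatWriter` is a hypothetical name.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical formatter: joins parsed columns with a preset delimiter. */
final class StandardFormatWriter {
    // Assumed spelling of the preset delimiter from the text.
    static final String DELIMITER = "|@|";

    /** Joins the columns of one row with the delimiter. */
    static String formatRow(List<String> columns) {
        return String.join(DELIMITER, columns);
    }

    /** Formats every row and separates rows with a newline. */
    static String formatAll(List<List<String>> rows) {
        return rows.stream()
                .map(StandardFormatWriter::formatRow)
                .collect(Collectors.joining("\n"));
    }
}
```

A multi-character delimiter such as this one is less likely than a bare comma to collide with characters inside the data, although the sketch does not escape values that happen to contain it.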
Therefore, the data platform can uniformly read and obtain the data content of the data files in the multi-source heterogeneous format, and the problem that the data files cannot be uniformly read due to the fact that the data format types are inconsistent at present is solved.
For example, after the data file in the standard format is obtained, it may be handed back to a file parsing execution thread, which sends the data file to the file server; the file server then uploads it to the data platform for use by the data platform.
Because there are multiple file parsers and multiple file parsing execution threads, data files in the standard format can be produced and uploaded concurrently, which saves upload time and improves upload efficiency.
The above process may be implemented in a language that is highly readable and easy to extend, for example Java. Other languages may of course also be used, and the embodiment of the present application is not limited in this respect.
The data processing procedure of the embodiment of the present application is explained below by an example.
As shown in fig. 2, the external system may upload a data file in an original format to the file server. The file scanning thread may scan the file directory on the file server at a certain frequency to determine whether a new data file has been uploaded, and when it determines that a new data file has been uploaded, it adds the new data file to the file execution queue after the upload is complete.
The thread scheduler reads the file execution queue, schedules the file parsing execution threads according to which file parsing execution threads in the thread pool are idle, and distributes the new data file to the file parsing executor through an idle file parsing execution thread. The thread scheduler may keep a constant number of threads running.
After receiving the data file, the file parsing executor identifies the file name of the data file by keyword identification to obtain the file parameters, selects a matching file parser from the registered file parser list, and sends the data file to that file parser.
After receiving the data file, the file parser parses the data file according to a preset parsing method to obtain the data content, and arranges the columns of the data separated by the preset delimiter to obtain a data file in the standard format. The data file is output to the file server, which uploads it to the data platform, and the data platform performs subsequent processing.
According to the embodiment of the application, multithreading improves task concurrency and lets CPU resources be fully used while waiting on input and output. At the same time, the file parsers unify data formats of different sources and structures into a data format that the data platform can recognize, which achieves data consistency and enables the data platform to accurately identify the data uploaded by external systems.
Based on the same inventive concept, the embodiment of the present application further provides a data processing apparatus, and the following describes in detail the data processing apparatus provided by the embodiment of the present application with reference to fig. 3.
Fig. 3 is a structural diagram of a data processing apparatus according to an embodiment of the present application.
As shown in fig. 3, the data processing apparatus may include:
a scanning module 310, configured to scan a file directory on the file server through a file scanning thread, where the file directory is generated based on a data file uploaded to the file server;
the determining module 320 is configured to determine, according to file parameters of a new data file, a target file parser corresponding to a format type of the new data file when it is determined that the new data file is uploaded to the file server based on the file directory and the new data file is completely uploaded;
and the parsing module 330 is configured to parse the new data file by using the target file parser to obtain the first data.
According to the method and the device, a file directory on a file server is scanned through a file scanning thread, a new data file is determined to be uploaded to the file server based on the file directory, and under the condition that the new data file is uploaded completely, a target file parser corresponding to the format type of the new data file is determined according to the file parameters of the new data file, and the new data file is parsed by the target file parser to obtain the first data. The embodiment of the application can identify and analyze the corresponding data file by using the file analyzer corresponding to the format type of the data file to obtain the data content, so that the data files of various format types can be accurately identified.
In some embodiments, the data processing apparatus may further include:
an adding module, configured to add the new data file into a file execution queue and wait before the determining module 320 determines, according to the file parameter of the new data file, a target file parser corresponding to the format type of the new data file;
the distribution module is used for distributing the new data file to the file parsing executor through a file parsing execution thread when the file parsing execution thread is idle;
and the identification module is used for identifying the file name of the new data file by keyword identification using the file parsing executor, to obtain the file parameters of the new data file, wherein the file parameters indicate the format type of the new data file.
In some embodiments, the determining module 320 is specifically configured to:
and searching a file parser list according to the file parameters of the new data file to obtain a target file parser corresponding to the format type of the new data file, wherein the file parser list is used for storing registered file parsers adapting to data files with different format types.
In some embodiments, the data processing apparatus may further include:
the arrangement module is used for arranging the first data according to a preset format, after the parsing module 330 parses the new data file with the target file parser to obtain the first data, so as to obtain a data file in a standard format;
and the uploading module is used for uploading the data file in the standard format to the data platform, and the data platform processes the data file in the standard format.
In some embodiments, the arrangement module is specifically configured to:
arranging the first data according to columns, and setting preset separators among different columns to obtain a data file in a standard format.
In some embodiments, the format type of the new data file includes at least one of: xlsx format, mdb format, text format, csv format.
Each module in the apparatus shown in fig. 3 has a function of implementing each step in fig. 1-2 and can achieve a corresponding technical effect, and for brevity, is not described again here.
Based on the same inventive concept, the embodiment of the present application further provides an electronic device, which may be, for example, a mobile phone, a tablet computer, a notebook computer, a palm computer, and the like. The electronic device provided by the embodiment of the present application is described in detail below with reference to fig. 4.
As shown in fig. 4, the electronic device may include a processor 410 and a memory 420 for storing computer program instructions.
Processor 410 may include a Central Processing Unit (CPU) or an Application-Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits that implement the embodiments of the present application.
Memory 420 may include mass storage for data or instructions. By way of example, and not limitation, memory 420 may include a Hard Disk Drive (HDD), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. In one example, memory 420 may include removable or non-removable (or fixed) media, or memory 420 may be non-volatile solid-state memory. In one example, memory 420 may be a Read-Only Memory (ROM). In one example, the ROM can be mask-programmed ROM, Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), Electrically Alterable ROM (EAROM), or flash memory, or a combination of two or more of these.
The processor 410 reads and executes the computer program instructions stored in the memory 420 to implement the method in the embodiment shown in fig. 1-2, and achieve the corresponding technical effect achieved by the embodiment shown in fig. 1-2 executing the method, which is not described herein again for brevity.
In one example, the electronic device can also include a communication interface 430 and a bus 440. As shown in fig. 4, the processor 410, the memory 420, and the communication interface 430 are connected via a bus 440 to complete communication therebetween.
The communication interface 430 is mainly used for implementing communication between modules, apparatuses, and/or devices in this embodiment.
The bus 440 includes hardware, software, or both to couple the various components of the electronic device to one another. By way of example, and not limitation, bus 440 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front-Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or other suitable bus, or a combination of two or more of these. Bus 440 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated by the application.
The electronic device may execute the data processing method in the embodiment of the present application after scanning the file directory on the file server by the file scanning thread, so as to implement the data processing method described in conjunction with fig. 1-2 and the data processing apparatus described in fig. 3.
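As a minimal sketch of the scanning step, the following Python code shows one way a file scanning thread might detect a new, fully uploaded data file. The completion test used here (the file size is unchanged between two consecutive scans) is an assumption for illustration only; the application does not specify how upload completion is determined. The names `scan_once`, `processed`, `last_sizes`, and the sample file name are hypothetical.

```python
import os
import tempfile


def scan_once(directory, processed, last_sizes):
    """Return names of new files whose size is unchanged since the previous scan.

    processed  -- set of file names already handed off (mutated in place)
    last_sizes -- dict mapping file name to the size seen on the previous scan
    """
    completed = []
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if entry in processed or not os.path.isfile(path):
            continue
        size = os.path.getsize(path)
        # Assumption: an unchanged size across two scans means the upload finished.
        if last_sizes.get(entry) == size:
            completed.append(entry)
            processed.add(entry)
            last_sizes.pop(entry, None)
        else:
            last_sizes[entry] = size
    return completed


def demo():
    """Run three scans over a temporary directory holding one uploaded file."""
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "trades_csv_20221206.dat"), "w") as f:
            f.write("a,b\n1,2\n")
        processed, last_sizes = set(), {}
        first = scan_once(d, processed, last_sizes)   # file seen for the first time
        second = scan_once(d, processed, last_sizes)  # size unchanged, treated as complete
        third = scan_once(d, processed, last_sizes)   # already handed off
        return first, second, third
```

In a running system, a scanning thread could call `scan_once` in a loop with a sleep interval and place each returned file name on a queue for the parsing threads.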
In addition, in combination with the data processing method in the foregoing embodiments, an embodiment of the present application may provide a computer storage medium for implementing the method. The computer storage medium has computer program instructions stored thereon; the computer program instructions, when executed by a processor, implement any of the data processing methods in the above embodiments.
It is to be understood that the present application is not limited to the particular arrangements and instrumentality described above and shown in the attached drawings. A detailed description of known methods is omitted herein for the sake of brevity. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present application are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications and additions, or change the order between the steps, after comprehending the spirit of the present application.
The functional blocks shown in the above-described structural block diagrams may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, they may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, a plug-in, a function card, or the like. When implemented in software, the elements of the present application are the programs or code segments used to perform the required tasks. The programs or code segments may be stored in a machine-readable medium or transmitted by a data signal carried in a carrier wave over a transmission medium or a communication link. A "machine-readable medium" may include any medium that can store or transfer information. Examples of a machine-readable medium include an electronic circuit, a semiconductor memory device, a ROM, a flash memory, an erasable ROM (EROM), a floppy disk, a CD-ROM, an optical disk, a hard disk, an optical fiber medium, a Radio Frequency (RF) link, and so forth. The code segments may be downloaded via computer networks such as the internet or an intranet.
It should also be noted that the exemplary embodiments mentioned in this application describe some methods or systems based on a series of steps or devices. However, the present application is not limited to the order of the above steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed at the same time.
Aspects of embodiments of the present application are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, implement the functions/acts specified in the flowchart and/or block diagram block or blocks. Such a processor may be, but is not limited to, a general purpose processor, a special purpose processor, an application specific processor, or a field programmable logic circuit. It will also be understood that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware for performing the specified functions or acts, or combinations of special purpose hardware and computer instructions.
As described above, only the specific embodiments of the present application are provided, and it can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, the module and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. It should be understood that the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present application, and these modifications or substitutions should be covered within the scope of the present application.

Claims (10)

1. A data processing method, comprising:
scanning a file directory on a file server through a file scanning thread, wherein the file directory is generated based on a data file uploaded to the file server;
in a case where it is determined, based on the file directory, that a new data file has been uploaded to the file server and that the uploading of the new data file is completed, determining, according to the file parameter of the new data file, a target file parser corresponding to the format type of the new data file;
and parsing the new data file by using the target file parser to obtain first data.
2. The method of claim 1, wherein before determining the target file parser corresponding to the format type of the new data file according to the file parameters of the new data file, the method further comprises:
adding the new data file to a file execution queue to wait;
under the condition that a file analysis execution thread is idle, distributing the new data file to a file analysis executor through the file analysis execution thread;
and identifying the file name of the new data file by keyword identification, using the file analysis executor, to obtain the file parameter of the new data file, wherein the file parameter is used for indicating the format type of the new data file.
3. The method of claim 1, wherein determining a target file parser corresponding to a format type of the new data file according to file parameters of the new data file comprises:
and searching a file parser list according to the file parameters of the new data file to obtain the target file parser corresponding to the format type of the new data file, wherein the file parser list is used for storing registered file parsers adapted to data files of different format types.
4. The method of claim 1, wherein after parsing the new data file with the target file parser to obtain first data, the method further comprises:
arranging the first data according to a preset format to obtain a data file with a standard format;
and uploading the data file in the standard format to a data platform, and processing the data file in the standard format by the data platform.
5. The method according to claim 4, wherein said arranging the first data according to a preset format to obtain a data file in a standard format comprises:
and arranging the first data according to columns, and setting preset separators among different columns to obtain a data file in a standard format.
6. The method according to any of claims 1-5, wherein the format type of the new data file comprises at least one of: xlsx format, mdb format, text format, csv format.
7. A data processing apparatus, comprising:
the scanning module is used for scanning a file directory on a file server through a file scanning thread, and the file directory is generated based on a data file uploaded to the file server;
the determining module is used for determining, according to the file parameters of a new data file, a target file parser corresponding to the format type of the new data file when it is determined, based on the file directory, that the new data file has been uploaded to the file server and that the uploading of the new data file is completed;
and the parsing module is used for parsing the new data file by using the target file parser to obtain first data.
8. An electronic device, comprising:
a processor;
a memory for storing computer program instructions;
the computer program instructions, when executed by the processor, implement the method of any of claims 1-6.
9. A computer-readable storage medium having computer program instructions stored thereon which, when executed by a processor, implement the method of any one of claims 1-6.
10. A computer program product, wherein instructions in the computer program product, when executed by a processor of an electronic device, cause the electronic device to perform the method of any of claims 1-6.
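The keyword identification of claim 2, the parser list lookup of claim 3, and the column arrangement of claim 5 can be sketched together in Python as follows. This is an illustrative sketch under stated assumptions, not the applicant's implementation: the registry `PARSERS`, the decorator `register`, the functions `format_from_name`, `to_standard`, and `process`, and the `|` separator are all hypothetical. Only csv and text parsers are shown, because xlsx and mdb files would need third-party libraries.

```python
import csv
import io

# Hypothetical "file parser list": format keyword -> registered parser function.
PARSERS = {}


def register(fmt):
    """Decorator that registers a parser under a format keyword."""
    def wrap(func):
        PARSERS[fmt] = func
        return func
    return wrap


@register("csv")
def parse_csv(text):
    # Parse comma-separated content into a list of rows.
    return [row for row in csv.reader(io.StringIO(text))]


@register("text")
def parse_text(text):
    # Parse whitespace-separated content, skipping blank lines.
    return [line.split() for line in text.splitlines() if line.strip()]


def format_from_name(file_name):
    """Keyword identification: return the first registered keyword found in the name."""
    lowered = file_name.lower()
    for keyword in PARSERS:
        if keyword in lowered:
            return keyword
    raise ValueError(f"no registered parser for {file_name!r}")


def to_standard(rows, separator="|"):
    """Arrange data by columns with a preset separator between columns."""
    return "\n".join(separator.join(row) for row in rows)


def process(file_name, text):
    """Look up the target parser from the file name, parse, and standardize."""
    parser = PARSERS[format_from_name(file_name)]
    return to_standard(parser(text))
```

For example, `process("daily_report.csv", "a,b\n1,2\n")` selects the csv parser from the file name and returns the two rows joined with the assumed `|` separator. Supporting a further format would only require registering one more parser function.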
CN202211578816.3A 2022-12-06 2022-12-06 Data processing method, device, equipment, medium and product Pending CN115794752A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211578816.3A CN115794752A (en) 2022-12-06 2022-12-06 Data processing method, device, equipment, medium and product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211578816.3A CN115794752A (en) 2022-12-06 2022-12-06 Data processing method, device, equipment, medium and product

Publications (1)

Publication Number Publication Date
CN115794752A true CN115794752A (en) 2023-03-14

Family

ID=85418159

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211578816.3A Pending CN115794752A (en) 2022-12-06 2022-12-06 Data processing method, device, equipment, medium and product

Country Status (1)

Country Link
CN (1) CN115794752A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116882712A (en) * 2023-09-07 2023-10-13 北京前景无忧电子科技股份有限公司 Universal power grid digital asset management method

Similar Documents

Publication Publication Date Title
CN108984389B (en) Application program testing method and terminal equipment
CN110851324B (en) Log-based routing inspection processing method and device, electronic equipment and storage medium
CN110267215B (en) Data detection method, equipment and storage medium
CN115794752A (en) Data processing method, device, equipment, medium and product
CN111367531B (en) Code processing method and device
CN112115105A (en) Service processing method, device and equipment
CN110716804A (en) Method and device for automatically deleting useless resources, storage medium and electronic equipment
CN113760242A (en) Data processing method, device, server and medium
CN116662302A (en) Data processing method, device, electronic equipment and storage medium
CN116303320A (en) Real-time task management method, device, equipment and medium based on log file
CN114531340B (en) Log acquisition method and device, electronic equipment, chip and storage medium
CN114598547A (en) Data analysis method applied to network attack recognition and electronic equipment
CN111931161B (en) RISC-V processor based chip verification method, apparatus and storage medium
CN115113871A (en) Front-end code version information checking method, device and equipment based on data middlebox
CN110442370B (en) Test case query method and device
CN110674839B (en) Abnormal user identification method and device, storage medium and electronic equipment
US20140195540A1 (en) Expeditious citation indexing
CN109918293B (en) System test method and device, electronic equipment and computer readable storage medium
CN116401113B (en) Environment verification method, device and medium for heterogeneous many-core architecture acceleration card
CN117993360A (en) File analysis method, device, equipment, medium and product
CN115795551A (en) Method, device and equipment for detecting text
CN115509586A (en) Upgrade package processing method, device, equipment and medium
CN117687899A (en) Automatic test method, device, terminal equipment and storage medium
CN114095484A (en) Access parameter processing method, device, equipment and storage medium
CN116909909A (en) Log analysis method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination