CN109284260B - Big data file reading method and device, computer equipment and storage medium

Info

Publication number: CN109284260B
Application number: CN201811203112.1A
Authority: CN
Inventors: 袁彪; 张要伟
Original assignee: Ping An Securities Co Ltd
Current assignee: Ping An Securities Co Ltd
Priority date: 2018-10-16
Filing date: 2018-10-16
Publication date: 2023-10-13
Anticipated expiration: 2038-10-16
Also published as: CN109284260A

Abstract

The application discloses a method, a device, computer equipment and a storage medium for reading a big data file, which are applied to the technical field of databases and address the low efficiency of existing ways of reading dbf files. The method comprises the following steps: reading header information of a target dbf file to obtain the field data amount of each row recorded in the header information; mapping the first block of field data of the target dbf file into a specified memory to serve as a current field block; acquiring the field data of each row in the current field block row by row, in sequence, according to the field data amount of each row; parsing the acquired field data to obtain parsed data values; and, before all field data of the target dbf file have been acquired, mapping the next block of field data of the target dbf file into the specified memory to serve as a new current field block, and returning to the step of acquiring the field data of each row in the current field block row by row, in sequence, according to the field data amount of each row.

Description

Big data file reading method and device, computer equipment and storage medium

Technical Field

The present application relates to the field of database technologies, and in particular, to a method and apparatus for reading a big data file, a computer device, and a storage medium.

Background

With the advent of the big data era, the storage, reading and migration of data have become increasingly important. Currently, most databases use dbf format files to store data. A dbf file (a database file with the .dbf suffix) is a database format file used by database systems such as dBase and FoxPro. The data volumes involved in the big data era are very large, and a server often needs to obtain the required data from a data center or elsewhere and store it in a database so that it can respond quickly when a client requests the data. dbf files are widely chosen for this purpose because of their well-defined data structure.

When a dbf file is read, the existing approach reads each record from the disk in sequence according to the dbf field format and parses the field data during reading to obtain the value of each field.

However, reading a dbf file with a large data volume in this way, for example a file larger than 10 GB, takes a long time and can hardly meet the demand for fast data reading in the big data era.

Disclosure of Invention

The embodiments of the application provide a method, a device, computer equipment and a storage medium for reading a big data file, which are used to solve the problem that existing ways of reading dbf file data are inefficient.

A method for reading a large data file, comprising:

reading header information of a target dbf file to obtain data volume of each row of field in the header information, wherein the data volume of each row of field refers to the size of each row of field in the target dbf file;

mapping first block field data of the target dbf file to a specified memory to serve as a current field block, wherein the first block field data refers to field data which is located behind the head information and is close to the head information and has a preset data size in the target dbf file;

acquiring field data of each row in the current field block row by row in sequence according to the field data quantity of each row;

analyzing the acquired field data to obtain an analyzed data value;

before all field data of the target dbf file have been acquired, mapping the next block of field data of the target dbf file into the specified memory to serve as a new current field block, and returning to the step of sequentially acquiring the field data of each row in the current field block row by row according to the field data amount of each row, wherein the next block of field data refers to the field data of the preset data size that is located after, and immediately adjacent to, the current field block in the target dbf file.

A large data file reading apparatus comprising:

the header information reading module is used for reading header information of the target dbf file to obtain data volume of each row of field in the header information, wherein the data volume of each row of field refers to the size of each row of field in the target dbf file;

the field data mapping module is used for mapping the first block field data of the target dbf file to a specified memory to serve as a current field block, wherein the first block field data refers to field data which is located behind the head information and is close to the head information and has a preset data size in the target dbf file;

a line-by-line acquisition module, configured to sequentially acquire field data of each line in the current field block line by line according to the field data amount of each line;

the field data analysis module is used for analyzing the acquired field data to obtain an analyzed data value;

and the circulation processing module is used for mapping next field data of the target dbf file to the designated memory as a new current field block until all field data of the target dbf file are acquired, and returning to trigger the progressive acquisition module and the field data analysis module in sequence, wherein the next field data refers to the field data which is positioned behind the current field block and is close to the current field block in the target dbf file and has the preset data size.

A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the big data file reading method described above when the computer program is executed.

A computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the large data file reading method described above.

In the method, the device, the computer equipment and the storage medium for reading a big data file, header information of a target dbf file is first read to obtain the field data amount of each row recorded in the header information, wherein the field data amount of each row refers to the size of each row of field data in the target dbf file; then, the first block of field data of the target dbf file is mapped to a specified memory to serve as a current field block, wherein the first block of field data refers to the field data of a preset data size located after, and immediately adjacent to, the header information in the target dbf file; the field data of each row in the current field block is acquired row by row, in sequence, according to the field data amount of each row; the acquired field data is parsed to obtain parsed data values; and, before all field data of the target dbf file have been acquired, the next block of field data of the target dbf file is mapped into the specified memory to serve as a new current field block, and the step of sequentially acquiring the field data of each row in the current field block row by row is executed again, wherein the next block of field data refers to the field data of the preset data size located after, and immediately adjacent to, the current field block in the target dbf file. Because processing data in memory is far faster than processing data on disk, and file mapping is more efficient than reading data from disk, the overall processing efficiency is higher than that of the existing approach: the efficiency of reading dbf file data is improved, the reading time is shortened, and the demand for fast data reading in the big data era can be met.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments of the present application will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic diagram of an application environment of a method for reading a big data file according to an embodiment of the application;

FIG. 2 is a flow chart of a method for reading a big data file according to an embodiment of the application;

FIG. 3 is a flowchart of the step 102 of the method for reading a big data file in an application scenario according to an embodiment of the present application;

FIG. 4 is a schematic flow chart of deleting header information in an application scenario in a big data file reading method according to an embodiment of the application;

FIG. 5 is a flow chart of presetting the preset data size in an application scenario according to a big data file reading method in an embodiment of the application;

FIG. 6 is a schematic diagram of a big data file reading device according to an embodiment of the present application;

FIG. 7 is a schematic diagram of a computer device in accordance with an embodiment of the application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

The method for reading the big data file provided by the application can be applied to an application environment as shown in fig. 1, wherein a client communicates with a server through a network. The client may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices. The server may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers.

In one embodiment, as shown in fig. 2, a method for reading a big data file is provided, and the method is applied to the server in fig. 1, and includes the following steps:

101. Reading header information of a target dbf file to obtain data volume of each row of field in the header information, wherein the data volume of each row of field refers to the size of each row of field in the target dbf file;

In this embodiment, the dbf file is a database format file with a fixed layout. For example, the beginning of the dbf file holds the header information, which records the size of one row of field data in the dbf file, that is, the field data amount of each row. Accordingly, the server can obtain the field data amount of each row by reading the header information of the target dbf file.
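As a minimal sketch of step 101, the following assumes the widely used dBASE III-style header layout, in which the record count sits in bytes 4-7, the header length in bytes 8-9 and the length of one row in bytes 10-11, all little-endian. The application itself does not spell out this layout, and the names `DbfHeader` and `read_dbf_header` are illustrative only.

```python
import struct
from typing import BinaryIO, NamedTuple


class DbfHeader(NamedTuple):
    record_count: int   # number of rows in the file
    header_size: int    # bytes occupied by the header information
    record_size: int    # bytes occupied by one row of field data


def read_dbf_header(f: BinaryIO) -> DbfHeader:
    """Parse the first 12 bytes of a dbf file (assumed dBASE III-style layout)."""
    f.seek(0)
    raw = f.read(12)
    if len(raw) < 12:
        raise ValueError("file too short to contain a dbf header")
    # '<' = little-endian; skip 4 bytes (version + date), then uint32, uint16, uint16
    record_count, header_size, record_size = struct.unpack("<4xIHH", raw)
    return DbfHeader(record_count, header_size, record_size)
```

Here `record_size` plays the role of the field data amount of each row, and `header_size` gives the offset at which the field data begins.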

102. Mapping first block field data of the target dbf file to a specified memory to serve as a current field block, wherein the first block field data refers to field data which is located behind the head information and is close to the head information and has a preset data size in the target dbf file;

It can be appreciated that accessing data through a memory mapping is much faster than reading it from disk, so this solution processes the target dbf file by mapping it into the specified memory. Because the target dbf file is usually large and the memory resources of the server are limited, it is often difficult to map the entire target dbf file into memory at one time. Therefore, in this embodiment, the server maps the first block of field data of the target dbf file to the specified memory as the current field block, where the first block of field data refers to the field data of a preset data size located after, and immediately adjacent to, the header information in the target dbf file. For example, assume that bytes 0-100 of the target dbf file are header information and bytes 101-10000 are field data. When the preset data size is 100 bytes, the first block of field data is bytes 101-200, and the server maps bytes 101-200 of the target dbf file to the specified memory as the current field block.

In practical use, when the target dbf file is mapped to the specified memory, it is difficult to map the first block of field data on its own, without the header information. To this end, as shown in fig. 3, step 102 may specifically include:

201. mapping data with the preset mapping data size from the head information in the target dbf file to a specified memory, wherein the preset mapping data size is equal to the sum of the preset data size and the data size of the head information;

202. and determining the data except the head information in the data mapped to the appointed memory as a current field block.

For step 201, it may be understood that, in order to map the first block of field data successfully, it may be mapped to the specified memory together with the header information; that is, data of a preset mapping data size is mapped starting from the beginning of the target dbf file, where the preset mapping data size is equal to the sum of the preset data size and the data size of the header information. Continuing the above example, bytes 0-200 of the target dbf file are mapped into the specified memory.

For step 202, since the data mapped in step 201 includes header information, and the header information does not include the values of the field data required by the scheme, the server may directly determine the data other than the header information in the data mapped to the specified memory as the current field block. The header information mapped to the specified memory may be erased or ignored, which is not limited in this embodiment.
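Steps 201 and 202 can be sketched with Python's standard `mmap` module. One plausible reason the first block is hard to map alone is that mapping offsets must be aligned (in Python, to a multiple of `mmap.ALLOCATIONGRANULARITY`), whereas the header usually ends at an unaligned position; the application does not state this reason, so treat it as an assumption. The function name `map_first_block` is hypothetical.

```python
import mmap


def map_first_block(path: str, header_size: int, block_size: int):
    """Map header + first block from offset 0; return (mapping, field-block view).

    The caller should release the view and then close the mapping when done.
    """
    with open(path, "rb") as f:
        f.seek(0, 2)                      # seek to the end to learn the file size
        file_size = f.tell()
        # Step 201: mapped size = header size + preset data size (capped at EOF).
        length = min(header_size + block_size, file_size)
        mapping = mmap.mmap(f.fileno(), length, access=mmap.ACCESS_READ)
    # Step 202: everything after the header is the current field block.
    # The header bytes stay mapped but are simply ignored.
    field_block = memoryview(mapping)[header_size:]
    return mapping, field_block
```

Slicing a `memoryview` does not copy the mapped bytes, which corresponds to the option of ignoring the header information rather than erasing it.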

To solve the above problem that the first block of field data is difficult to map on its own, this embodiment may alternatively proceed in another way. As shown in fig. 4, before step 102, the method further includes:

301. deleting the header information of the target dbf file to obtain a new target dbf file;

Step 102 then specifically comprises: mapping the field data of the preset data size at the beginning of the new target dbf file to a specified memory to serve as a current field block.

The idea of this way is that the header information in the target dbf file is deleted before the target dbf file is mapped, so that when the step 102 is executed to map the target dbf file, only field data remains in the new target dbf file, and no interference caused by the header information exists, so that the mapping process can be directly performed.

For step 301, the server may delete header information of the target dbf file to obtain a new target dbf file after deleting the header information.

Based on step 301, it may be understood that in step 102 the server may map the field data of the preset data size at the beginning of the new target dbf file to the specified memory as the current field block. This is because, after the processing in step 301, the header information is no longer in the target dbf file, so field data of the preset data size can be taken directly from the beginning of the file and mapped into the specified memory, and the field data so mapped is the current field block.
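The header-deletion variant could look like the sketch below. It assumes that deleting the header is done by streaming the field data into a new file, which is one possible reading of step 301; the application does not say how the deletion is carried out. The names `strip_header` and `map_head_block` are hypothetical.

```python
import mmap
import shutil


def strip_header(src_path: str, dst_path: str, header_size: int) -> None:
    """Step 301: write a new file containing only the field data."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(header_size)             # skip the header information
        shutil.copyfileobj(src, dst)      # stream the remaining bytes


def map_head_block(path: str, block_size: int) -> bytes:
    """Step 102 on the new file: map the first block_size bytes from offset 0."""
    with open(path, "rb") as f:
        f.seek(0, 2)
        length = min(block_size, f.tell())
        with mmap.mmap(f.fileno(), length, access=mmap.ACCESS_READ) as m:
            return m[:length]             # copied out only to keep the sketch simple
```

Note that rewriting a very large file costs one full pass over the data, so in this sketch the approach trades extra disk I/O for a simpler mapping step.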

In the dbf file, field data is stored row by row, and the data amount of each row is the field data amount of each row. However, the data size of a mapped current field block is the preset data size, so if the relationship between the preset data size and the field data amount of each row is not constrained, the last row in the current field block may contain less than one complete row of field data. For example, assume that the field data amount of each row is 100, the preset data size is 250, and bytes 0-100 are header information. The first block of field data of the target dbf file is then bytes 101-350, and its last row is bytes 301-350, which is less than one row of field data and may make the subsequent parsing of field data difficult. For this reason, this embodiment can avoid the above situation by constraining the relationship between the preset data size and the field data amount of each row, so that the last row of every current field block is a complete row of field data.

As shown in fig. 5, further, the preset data size may be preset by:

401. acquiring the current available space of the appointed memory;

402. determining a mapping space of the specified memory for mapping field data according to a preset memory usage proportion and the current available space;

403. dividing the mapping space by the data volume of each row of field to obtain a first numerical value;

404. and rounding the first value, and calculating the product of the rounded first value and the data volume of each row of field to obtain a second value serving as a preset data volume.

For step 401, the currently available space is the remaining space of the specified memory that can currently be used. For example, if the specified memory is 4 GB and 2 GB has been used, the currently available space is 2 GB.

For step 402, it is understood that the server may preset a memory usage ratio that indicates how much of the memory in the currently available space should be used to map data. The memory usage ratio may be set according to actual usage conditions, for example, may be set to 50%. After the server obtains the current available space of the specified memory, the server can calculate the product of the preset memory usage proportion and the current available space to obtain the mapping space of the specified memory for mapping field data.

For steps 403 and 404, this embodiment makes the preset data size an integer multiple of the field data amount of each row while keeping it as close as possible to the size of the mapping space. To do so, the mapping space is divided by the field data amount of each row to obtain a first value, and the first value is rounded to an integer. The rounded first value can be regarded as a multiple, so the second value, obtained as the product of the rounded first value and the field data amount of each row, is an integer multiple of the field data amount of each row. The second value is then taken as the preset data size.
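The arithmetic of steps 402 to 404 can be sketched as follows. The sketch assumes that the rounding is a round-down, so that the block never exceeds the mapping space (the text only says that the value is rounded), and it takes the currently available space of step 401 as an input rather than querying the operating system. The function name `preset_block_size` is illustrative.

```python
def preset_block_size(available_bytes: int, usage_ratio: float, record_size: int) -> int:
    """Steps 402-404: largest multiple of record_size that fits in the mapping space."""
    if record_size <= 0:
        raise ValueError("record_size must be positive")
    mapping_space = int(available_bytes * usage_ratio)   # step 402
    rows_per_block = mapping_space // record_size        # steps 403-404 (assumed round-down)
    return rows_per_block * record_size                  # second value = preset data size
```

With 2 GiB available, a 50% usage proportion and 100-byte rows, the mapping space is 1,073,741,824 bytes and the preset data size comes out as 1,073,741,800 bytes, an exact multiple of the row size.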

103. Acquiring field data of each row in the current field block row by row in sequence according to the field data quantity of each row;

It can be understood that the field data in the dbf file is stored row by row, and that within each row the order of the fields and the number of bytes occupied by each field are agreed in advance. Therefore, when the field data in the current field block is acquired, the field data of each row needs to be acquired row by row, in sequence, according to the field data amount of each row, so that each row can then be parsed in step 104.

Further, as noted above, the last row of the current field block may contain less than one row of field data. In this case, after step 103, the method may further include: if the data amount of the acquired last row of the current field block is less than the field data amount of each row, temporarily storing the field data of that last row as continuation data; then, after the next block of field data of the target dbf file is mapped to the specified memory as a new current field block in step 105, and before returning to the step of sequentially acquiring the field data of each row in the current field block row by row, merging the continuation data into the head of the new current field block. Continuing the above example, assume that the last row of the current field block is bytes 301-350. The server saves this last row in memory; after the new current field block, bytes 351-600, is obtained in step 105, the server merges the saved row into the head of the new current field block, so that the current field block becomes bytes 301-600. In this way, an incomplete last row of the current field block is handled properly.
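The continuation-data handling described above can be illustrated independently of the mapping itself. In the sketch below, `blocks` stands for the successive current field blocks and `carry` for the continuation data; both names, and the function `iter_rows`, are hypothetical.

```python
from typing import Iterable, Iterator


def iter_rows(blocks: Iterable[bytes], record_size: int) -> Iterator[bytes]:
    """Yield complete rows, carrying a partial last row into the next block."""
    carry = b""                                   # continuation data
    for block in blocks:
        data = carry + bytes(block)               # merge carry at the head of the new block
        full = len(data) - len(data) % record_size
        for start in range(0, full, record_size):
            yield data[start:start + record_size]
        carry = data[full:]                       # incomplete tail, if any
    # Trailing bytes shorter than one row (for example an end-of-file marker)
    # are ignored in this sketch.
```

With 100-byte rows and 250-byte blocks, the first block yields two rows and leaves 50 bytes of continuation data, which are then joined with the first 50 bytes of the second block to form the third row.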

104. Analyzing the acquired field data to obtain an analyzed data value;

In this embodiment, each time the field data of a row in the current field block is acquired, the server may parse the acquired field data to obtain the parsed data values. It can be understood that the fields in each row of the dbf file are arranged in a predetermined order and that the number of bytes occupied by each field is also predetermined, so that once the field data is acquired, the data values can be parsed directly according to these predetermined field definitions.
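Because field order and widths are fixed in advance, parsing a row reduces to slicing it at known offsets. The sketch below assumes the usual dbf convention that byte 0 of each row is a one-byte deletion flag and that values are space-padded text; the field layout passed in `fields` is a made-up example, not taken from the application.

```python
from typing import Dict, List, Tuple

# Hypothetical field layout: (name, width in bytes), in the agreed order.
FieldSpec = List[Tuple[str, int]]


def parse_row(row: bytes, fields: FieldSpec, encoding: str = "ascii") -> Dict[str, str]:
    """Slice one row into field values by position and width."""
    values: Dict[str, str] = {}
    offset = 1                      # byte 0 is assumed to be the dbf deletion flag
    for name, width in fields:
        raw = row[offset:offset + width]
        values[name] = raw.decode(encoding).strip()
        offset += width
    return values
```

Converting the resulting strings to numbers or dates would depend on the field types declared in the header, which this sketch leaves out.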

105. Before all field data of the target dbf file have been acquired, mapping the next block of field data of the target dbf file into the specified memory to serve as a new current field block, and returning to the step of sequentially acquiring the field data of each row in the current field block row by row according to the field data amount of each row, wherein the next block of field data refers to the field data of the preset data size that is located after, and immediately adjacent to, the current field block in the target dbf file.

In this embodiment, the field data of the entire target dbf file needs to be acquired and parsed. Therefore, after step 104 has been performed on the current field block, that is, after the current field block has been acquired and parsed, the server may map the next block of field data of the target dbf file to the specified memory as a new current field block and then return to step 103 and step 104 to process it. Once that block has been processed, the block following it is mapped as the new current field block and is acquired and parsed in turn, and so on until all field data of the target dbf file has been acquired and parsed. At that point the data values of all field data in the target dbf file have been obtained, and the reading of the target dbf file is complete.
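Putting steps 102 to 105 together, a block-by-block reader might be sketched as below. It assumes that the preset data size is at least one row, and it works around the alignment requirement of Python's `mmap` by starting each mapping at the nearest aligned offset at or before the block and skipping the leading bytes; that workaround is a choice made for this sketch, not something the application describes. The name `iter_rows_mapped` is hypothetical.

```python
import mmap
from typing import Iterator


def iter_rows_mapped(path: str, header_size: int, record_size: int,
                     block_size: int) -> Iterator[bytes]:
    """Steps 102-105: map the field data block by block and yield each row."""
    if record_size <= 0 or block_size < record_size:
        raise ValueError("block_size must hold at least one row")
    gran = mmap.ALLOCATIONGRANULARITY
    with open(path, "rb") as f:
        f.seek(0, 2)
        file_size = f.tell()
        pos = header_size                               # start of the current field block
        while pos + record_size <= file_size:
            end = min(pos + block_size, file_size)
            end -= (end - pos) % record_size            # keep only complete rows
            aligned = pos - pos % gran                  # mmap offsets must be aligned
            with mmap.mmap(f.fileno(), end - aligned,
                           access=mmap.ACCESS_READ, offset=aligned) as m:
                for start in range(pos - aligned, end - aligned, record_size):
                    yield m[start:start + record_size]  # step 103; parse in step 104
            pos = end                                   # next block follows immediately
```

Each yielded row can then be handed to whatever parsing routine implements step 104; trailing bytes shorter than one row are left unread.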

In the embodiment of the application, header information of a target dbf file is first read to obtain the field data amount of each row recorded in the header information, wherein the field data amount of each row refers to the size of each row of field data in the target dbf file; then, the first block of field data of the target dbf file is mapped to a specified memory to serve as a current field block, wherein the first block of field data refers to the field data of a preset data size located after, and immediately adjacent to, the header information in the target dbf file; the field data of each row in the current field block is acquired row by row, in sequence, according to the field data amount of each row; the acquired field data is parsed to obtain parsed data values; and, before all field data of the target dbf file have been acquired, the next block of field data of the target dbf file is mapped into the specified memory to serve as a new current field block, and the step of sequentially acquiring the field data of each row in the current field block row by row is executed again, wherein the next block of field data refers to the field data of the preset data size located after, and immediately adjacent to, the current field block in the target dbf file. Because processing data in memory is far faster than processing data on disk, and file mapping is more efficient than reading data from disk, the overall processing efficiency is higher than that of the existing approach: the efficiency of reading dbf file data is improved, the reading time is shortened, and the demand for fast data reading in the big data era can be met.

It should be understood that the sequence numbers of the steps in the foregoing embodiment do not imply an order of execution. The execution order of the processes should be determined by their functions and internal logic, and the sequence numbers should not limit the implementation of the embodiment of the present application in any way.

In an embodiment, a large data file reading device is provided, which corresponds one-to-one to the large data file reading method in the above embodiment. As shown in fig. 6, the large data file reading apparatus includes a header information reading module 501, a field data mapping module 502, a progressive acquisition module 503, a field data parsing module 504, and a loop processing module 505. The functional modules are described in detail as follows:

the header information reading module 501 is configured to read header information of a target dbf file, and obtain a field data amount of each row in the header information, where the field data amount of each row refers to a size of field data of each row in the target dbf file;

the field data mapping module 502 is configured to map, as a current field block, first block field data of the target dbf file to a specified memory, where the first block field data refers to field data, in the target dbf file, located after the header information and close to the header information, with a preset data size;

a progressive acquisition module 503, configured to sequentially acquire, row by row, the field data of each row in the current field block according to the field data amount of each row;

the field data parsing module 504 is configured to parse the acquired field data to obtain parsed data values;

and the loop processing module 505 is configured to map, until all field data of the target dbf file is acquired, next block field data of the target dbf file to the specified memory as a new current field block, and return to trigger the progressive acquisition module and the field data analysis module sequentially, where the next block field data refers to field data of the preset data size located after the current field block and next to the current field block in the target dbf file.

Further, the field data mapping module may include:

a first mapping unit, configured to map data with a preset mapping data size from the header information in the target dbf file to a specified memory, where the preset mapping data size is equal to a sum of the preset data size and the data size of the header information;

and the field block determining unit is used for determining data except the header information in the data mapped to the appointed memory as a current field block.

Further, the big data file reading device may further include:

the head information deleting module is used for deleting the head information of the target dbf file to obtain a new target dbf file;

the field data mapping module is specifically configured to: and mapping the field data with the preset data size at the head part in the target dbf file to a specified memory to serve as a current field block.

Further, the preset data size may be preset by:

the available space acquisition module is used for acquiring the current available space of the appointed memory;

the mapping space determining module is used for determining a mapping space of the appointed memory for mapping field data according to a preset memory use proportion and the current available space;

the first numerical value calculation module is used for dividing the mapping space by the data volume of each row of field to obtain a first numerical value;

the rounding module is used for rounding the first numerical value, calculating the product of the rounded first numerical value and the data volume of each row of field, and obtaining a second numerical value serving as the preset data volume.

Further, the big data file reading device may further include:

the temporary storage module is used for temporarily storing the field data of the last row as continuation data if the data amount of the acquired field data of the last row of the current field block is smaller than the field data amount of each row;

and the merging module is used for merging the continuation data into the head of the new current field block after the next block of field data of the target dbf file has been mapped to the specified memory as the new current field block and before the progressive acquisition module and the field data analysis module are triggered again in sequence.

For specific limitations on the large data file reading apparatus, reference may be made to the above limitations on the large data file reading method, which are not repeated here. Each of the above modules in the large data file reading apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in the form of hardware, or may be stored in a memory in the computer device in the form of software, so that the processor can call them and execute the operations corresponding to each module.

In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 7. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing data involved in the large data file reading method. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method of reading a large data file.

In one embodiment, a computer device is provided that includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the computer program to implement the steps of the method for reading a large data file in the above embodiment, such as steps 101 to 105 shown in fig. 2. Alternatively, the processor, when executing a computer program, implements the functions of the modules/units of the big data file reading apparatus in the above embodiments, such as the functions of the modules 501 to 505 shown in fig. 6. In order to avoid repetition, a description thereof is omitted.

In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the steps of the big data file reading method in the above embodiments, such as steps 101 to 105 shown in fig. 2. Alternatively, when executed by the processor, the computer program implements the functions of the modules/units of the big data file reading apparatus in the above embodiments, such as the functions of modules 501 to 505 shown in fig. 6. To avoid repetition, the details are not described again here.

Those skilled in the art will appreciate that all or part of the methods described above may be implemented by a computer program stored on a non-transitory computer-readable storage medium, and that the program, when executed, may include the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions.

The above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included in the scope of the present application.

Claims

1. A big data file reading method, comprising:

mapping data of a preset mapping data size, starting from the header information of a target dbf file, into a specified memory, wherein the preset mapping data size is equal to the sum of a preset data size and the data size of the header information;

determining the data mapped into the specified memory, other than the header information, as a current field block;

wherein the first block of field data refers to the field data of the preset data size that is located after, and immediately adjacent to, the header information in the target dbf file;

analyzing the acquired field data to obtain an analyzed data value;

before all field data of the target dbf file has been acquired, mapping the next block of field data of the target dbf file into the specified memory as a new current field block, and returning to the step of sequentially acquiring the field data of each row in the current field block, row by row, according to the field data quantity of each row, wherein the next block of field data refers to the field data of the preset data size that is located after, and immediately adjacent to, the current field block in the target dbf file;

wherein the preset data size is set in advance through the following steps:

acquiring the currently available space of the specified memory;

determining, according to a preset memory usage proportion and the currently available space, a mapping space of the specified memory for mapping field data;

dividing the mapping space by the field data quantity of each row to obtain a first value;

and rounding the first value to an integer, and calculating the product of the rounded first value and the field data quantity of each row to obtain a second value, which serves as the preset data size.
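As a worked illustration of the preset data size calculation above, the following minimal Python sketch computes a block size that is a whole multiple of the row size. The function name `preset_block_size`, its parameter names, and the choice of rounding down are assumptions made for illustration; the claim itself only states that the first value is rounded.

```python
def preset_block_size(available_bytes: int, usage_ratio: float, row_size: int) -> int:
    """Return a block size in bytes that holds a whole number of rows.

    available_bytes: currently available space of the specified memory
    usage_ratio:     preset memory usage proportion (assumed to be in 0..1)
    row_size:        field data quantity of each row, in bytes
    """
    if row_size <= 0:
        raise ValueError("row_size must be positive")
    if not 0 < usage_ratio <= 1:
        raise ValueError("usage_ratio must be in (0, 1]")

    # Mapping space = share of the available memory reserved for field data.
    mapping_space = int(available_bytes * usage_ratio)

    # "First value": how many rows fit. Rounding down is an assumption,
    # chosen so the block never exceeds the mapping space.
    rows_per_block = mapping_space // row_size

    # "Second value": the preset data size, a whole multiple of row_size.
    return rows_per_block * row_size


if __name__ == "__main__":
    # 1000 bytes available, half of it usable, 64-byte rows: 7 rows fit.
    print(preset_block_size(1000, 0.5, 64))  # prints 448
```

Because the result is a multiple of the row size, every mapped block of this size begins and ends on a row boundary.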

2. The big data file reading method according to claim 1, wherein before mapping the data of the preset mapping data size, starting from the header information of the target dbf file, into the specified memory, the preset mapping data size being equal to the sum of the preset data size and the data size of the header information, the method further comprises:

deleting the header information of the target dbf file to obtain a new target dbf file;

wherein mapping the first block of field data of the target dbf file into the specified memory as the current field block specifically comprises: mapping the field data of the preset data size at the head of the target dbf file into the specified memory as the current field block.
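Both the mapping size in claim 1 and the header removal described here depend on knowing the header length and the per-row data size. The following Python sketch reads them from the first bytes of the file. It assumes the commonly documented dBase-style header layout (record count at byte offset 4, header length at offset 8, record length at offset 10, all little-endian), which the claims do not spell out; the names `DbfHeader` and `read_dbf_header` are hypothetical.

```python
import struct
from typing import NamedTuple


class DbfHeader(NamedTuple):
    record_count: int  # number of rows stored in the file
    header_size: int   # bytes occupied by the header information
    row_size: int      # bytes occupied by each row of field data


def read_dbf_header(path: str) -> DbfHeader:
    """Read the size fields from the start of a dbf file (assumed layout)."""
    with open(path, "rb") as f:
        prefix = f.read(12)
    if len(prefix) < 12:
        raise ValueError("file too short to contain a dbf header")
    # Offset 4: uint32 record count; offset 8: uint16 header length;
    # offset 10: uint16 record length. "<" selects little-endian, no padding.
    record_count, header_size, row_size = struct.unpack_from("<IHH", prefix, 4)
    return DbfHeader(record_count, header_size, row_size)
```

With `header_size` known, the field data of the first block starts at that byte offset, or at offset 0 if the header has been deleted as in this claim.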

3. The big data file reading method according to any one of claims 1 to 2, characterized by further comprising, after sequentially acquiring the field data of each row in the current field block row by row according to the field data quantity of each row:

if the data size of the last row of field data in the current field block is less than the field data quantity of each row, temporarily storing the field data of the last row as continuation data;

after mapping the next block of field data of the target dbf file into the specified memory as a new current field block, and before returning to the step of sequentially acquiring the field data of each row in the current field block row by row according to the field data quantity of each row, merging the continuation data onto the head of the current field block.
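The block-by-block loop with continuation data can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the function name `iter_rows` and its parameters are hypothetical, and because Python's `mmap` requires the mapping offset to be a multiple of `mmap.ALLOCATIONGRANULARITY`, the sketch maps from the nearest aligned offset and slices out the wanted bytes.

```python
import mmap
import os
from typing import Iterator


def iter_rows(path: str, header_size: int, row_size: int,
              block_size: int) -> Iterator[bytes]:
    """Yield fixed-width rows, mapping one block of the file at a time."""
    gran = mmap.ALLOCATIONGRANULARITY
    file_size = os.path.getsize(path)
    carry = b""           # continuation data: incomplete last row of a block
    offset = header_size  # field data starts right after the header

    with open(path, "rb") as f:
        while offset < file_size:
            length = min(block_size, file_size - offset)
            # mmap offsets must be aligned, so map from the aligned offset
            # below `offset` and skip the first `delta` bytes.
            aligned = offset - (offset % gran)
            delta = offset - aligned
            with mmap.mmap(f.fileno(), delta + length,
                           access=mmap.ACCESS_READ, offset=aligned) as m:
                # Merge the continuation data onto the head of the new block.
                block = carry + m[delta:delta + length]

            usable = len(block) - (len(block) % row_size)
            for i in range(0, usable, row_size):
                yield block[i:i + row_size]

            # Keep any incomplete trailing row for the next block.
            carry = block[usable:]
            offset += length
    # Bytes left in `carry` here (for example a trailing end-of-file
    # marker) do not form a full row and are discarded.
```

When `block_size` is a whole multiple of `row_size`, as produced by the preset data size steps of claim 1, `carry` stays empty; the merge only matters when a block ends in the middle of a row.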

4. A big data file reading apparatus, comprising:

a field data mapping module, configured to map data of a preset mapping data size, starting from the header information of a target dbf file, into a specified memory, wherein the preset mapping data size is equal to the sum of a preset data size and the data size of the header information, and to determine the data mapped into the specified memory, other than the header information, as a current field block; wherein the first block of field data refers to the field data of the preset data size that is located after, and immediately adjacent to, the header information in the target dbf file;

a loop processing module, configured to, until all field data of the target dbf file has been acquired, map the next block of field data of the target dbf file into the specified memory as a new current field block, and to return to trigger the progressive acquisition module and the field data analysis module in sequence, wherein the next block of field data refers to the field data of the preset data size that is located after, and immediately adjacent to, the current field block in the target dbf file;

the preset data size can be preset through the following steps:

5. The big data file reading apparatus of claim 4, wherein the field data mapping module comprises:

6. The big data file reading apparatus of claim 4, wherein the big data file reading apparatus further comprises:

7. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the big data file reading method according to any of claims 1 to 3 when the computer program is executed.

8. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the big data file reading method according to any of claims 1 to 3.