CN113590554A - File processing method and device, electronic equipment and storage medium - Google Patents

Publication number
CN113590554A
Authority
CN
China
Prior art keywords
file
array
record
fields
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110916317.XA
Other languages
Chinese (zh)
Inventor
马玲玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202110916317.XA
Publication of CN113590554A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing

Abstract

A file processing method is provided, which can be applied in the financial field. The method includes: receiving a target file; identifying a file type of the target file; determining a file parsing policy according to the file type, wherein the file parsing policy corresponds to the file type; parsing file information of the target file according to the determined file parsing policy, wherein the file information includes a plurality of file fields and at least one record; and storing the parsed plurality of file fields and at least one record of the target file in a predetermined format, wherein the predetermined format includes at least two arrays, and the storing includes: storing the parsed plurality of file fields of the target file in two arrays including a first array and a second array; and storing the parsed at least one record of the target file in at least two arrays including a third array and a fourth array.

Description

File processing method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies and big data technologies, and in particular, to a file processing method, an apparatus, an electronic device, a computer-readable storage medium, and a program product.
Background
With the continuing development of the information age, companies and organizations now maintain dedicated data management systems to store and manage working data, and the large amount of data generated in daily work needs to be uploaded to these systems.
Generally, the data management system needs to parse the uploaded data files. Existing file parsing is typically implemented directly in a background program, so whenever the requirements change or the fields are modified or extended, the background program must be modified again, which makes the program difficult to extend.
Disclosure of Invention
In view of the above, some embodiments of the present disclosure provide a file processing method, apparatus, electronic device, computer-readable storage medium, and program product.
According to an aspect of the embodiments of the present disclosure, there is provided a file processing method performed by an electronic device, the method including: receiving a target file; identifying a file type of the target file; determining a file parsing policy according to the file type, wherein the file parsing policy corresponds to the file type; parsing file information of the target file according to the determined file parsing policy, wherein the file information includes a plurality of file fields and at least one record; and storing the parsed plurality of file fields and at least one record of the target file in a predetermined format, wherein the predetermined format includes at least two arrays, and the storing includes: storing the parsed plurality of file fields of the target file in two arrays including a first array and a second array; and storing the parsed at least one record of the target file in at least two arrays including a third array and a fourth array.
According to some exemplary embodiments, the values of the first array are subscripts of the second array; and/or the value of the third array is the subscript of the fourth array.
According to some exemplary embodiments, each of the file fields includes a field identification, location information of a column in which the field is located, and a field name, the subscript of the first array includes the field identification, the value of the first array includes the location information of the column in which the field is located, and the value of the second array includes the field name.
According to some exemplary embodiments, each record includes a plurality of record data, the number of the plurality of record data is consistent with the number of the plurality of file fields; and each of the record data includes a data identifier, position information of a column in which the record data is located, and a data value, the subscript of the third array includes the data identifier, the value of the third array includes the position information of the column in which the record data is located, and the value of the fourth array includes the data value.
According to some exemplary embodiments, the file information further includes a file version number, and the file processing method further includes: checking the target file according to the file version number and the number of the file fields.
According to some exemplary embodiments, the file type includes a text file type, and the parsing of the file information of the target file according to the determined file parsing policy specifically includes: parsing the target file line by line to obtain the file information of the target file.
According to some exemplary embodiments, the file type includes a table file type, and the parsing of the file information of the target file according to the determined file parsing policy specifically includes: parsing the header part and the data part of the table file separately to obtain the file information of the target file.
According to some exemplary embodiments, the values of the second array respectively correspond to the values of the fourth array, and the file processing method further includes: checking, according to the values of the second array, the validity of the values of the fourth array that correspond one by one to the values of the second array.
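The one-by-one validity check described above might be sketched as follows. This is an illustration only: the disclosure does not specify any validity rules, so the `RULES` table, the regular expressions, and the function name `check_values` are hypothetical, and the sketch assumes that both arrays keep their elements in column order.

```python
import re

# Hypothetical validity rules keyed by field name, i.e. by the values of the
# second array. The disclosure does not define concrete rules.
RULES = {
    "payee account": re.compile(r"\d+"),
    "transfer amount": re.compile(r"\d+(\.\d{1,2})?"),
}


def check_values(second, fourth):
    """Return the field names whose corresponding data value is invalid.

    The values of the two arrays are paired one by one in column order;
    fields without a rule are accepted as they are.
    """
    invalid = []
    for name, value in zip(second.values(), fourth.values()):
        rule = RULES.get(name)
        if rule is not None and not rule.fullmatch(value):
            invalid.append(name)
    return invalid
```

Because the rules are looked up by field name, supporting a new field in this sketch means adding an entry to the table, not changing the checking loop.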
According to another aspect, there is provided a file processing apparatus including: a receiving module configured to receive a target file; an identification module configured to identify a file type of the target file; a policy determination module configured to determine a file parsing policy according to the file type, wherein the file parsing policy corresponds to the file type; a parsing module configured to parse file information of the target file according to the determined file parsing policy, wherein the file information includes a plurality of file fields and at least one record; and a storage module configured to store the parsed plurality of file fields and at least one record of the target file in a predetermined format, wherein the predetermined format includes at least two arrays, and the storing includes: storing the parsed plurality of file fields of the target file in two arrays including a first array and a second array; and storing the parsed at least one record of the target file in at least two arrays including a third array and a fourth array.
According to some exemplary embodiments, the values of the first array are subscripts of the second array; and/or the value of the third array is the subscript of the fourth array.
According to still another aspect of the embodiments of the present disclosure, there is provided an electronic device including: one or more processors; a storage device for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the file processing method described above.
According to yet another aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the above-mentioned file processing method.
According to yet another aspect of the embodiments of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the file processing method described above.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be apparent from the following description of embodiments of the disclosure, which proceeds with reference to the accompanying drawings, in which:
FIG. 1 schematically shows an application scenario of a file processing method according to an embodiment of the present disclosure;
FIG. 2 schematically shows a flowchart of a file processing method according to an embodiment of the present disclosure;
FIG. 3A is a schematic diagram of a target file processed by a file processing method according to an embodiment of the present disclosure, wherein the target file is schematically shown as a text file;
FIG. 3B is a schematic diagram of a target file processed by a file processing method according to an embodiment of the present disclosure, wherein the target file is schematically shown as a table file;
FIG. 4 schematically shows a block diagram of a file processing apparatus according to an embodiment of the present disclosure; and
FIG. 5 schematically shows a block diagram of an electronic device adapted to implement a file processing method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together, etc.).
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
For example, in this document, the expressions "field", "record", "array" are understood as follows. In a database, the "columns" of a table are called "fields," which are collections of data having the same attributes, and each field typically has a unique name, called a field name. For example, in the "address book" database, "name" and "contact phone" are attributes that are common to all rows in the table, so these columns are referred to as the "name" field and the "contact phone" field. Records refer to rows in a table. For example, in the "address book" database, fields such as name, age, sex, and email may be included, and each member added to the table includes data such as name, age, sex, and email, and the data for each member constitutes a record. The array is an ordered sequence of elements. Each variable that makes up an array is referred to as a component of the array, also referred to as an element of the array or a value of the array, and sometimes referred to as a subscript variable. The numbers used to distinguish the various elements of the array are referred to as subscripts.
Embodiments of the present disclosure provide a file processing method, apparatus, electronic device, computer-readable storage medium, and program product. The file processing method is executed by an electronic device and may, for example, include the following steps: receiving a target file; identifying a file type of the target file; determining a file parsing policy according to the file type, wherein the file parsing policy corresponds to the file type; parsing file information of the target file according to the determined file parsing policy, wherein the file information includes a plurality of file fields and at least one record; and storing the parsed plurality of file fields and at least one record of the target file in a predetermined format, wherein the predetermined format includes at least two arrays, and the storing includes: storing the parsed plurality of file fields of the target file in two arrays including a first array and a second array; and storing the parsed at least one record of the target file in at least two arrays including a third array and a fourth array. In the embodiments of the present disclosure, the file processing method supports multiple file types and defines the relationship between the uploaded file and the storage variables of the back end, so that the back-end program is easy to extend.
Fig. 1 schematically shows an application scenario diagram of a file processing method according to an embodiment of the present disclosure.
As shown in FIG. 1, the application scenario 100 according to this embodiment may include various terminal devices 101, 102, 103, a server 105, and a network 104 connecting the terminal devices and the server. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber optic cables.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have various client applications installed thereon, such as a bank APP, a shopping APP, a web browser application, a search APP, an instant messaging tool, a mailbox client, social platform software, etc. (by way of example only).
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a backend server that parses files uploaded by the user using the terminal devices 101, 102, 103 or a backend management server that provides support for browsed websites (just an example). The background management server may analyze and perform other processing on the received data such as the user request, and feed back a processing result (e.g., a webpage, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that the file processing method provided by the embodiment of the present disclosure may be generally executed by the server 105. Accordingly, the file processing apparatus provided by the embodiment of the present disclosure may be generally disposed in the server 105. The file processing method provided by the embodiment of the present disclosure may also be executed by a server or a server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the file processing apparatus provided by the embodiment of the present disclosure may also be disposed in a server or a server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
FIG. 2 schematically shows a flow chart of a file processing method according to an embodiment of the present disclosure. As shown in fig. 2, a file processing method according to an embodiment of the present disclosure may include operations S210 to S250, and the file processing method may be performed by an electronic device, for example, by a processor of the electronic device.
In operation S210, a target file is received. For example, the target file may be a file uploaded by the terminal device 101, 102, 103. The target file may be a file generated by a bank APP on the terminal device 101, 102, 103, or may be a file created by a user through input of the terminal device 101, 102, 103.
In some exemplary embodiments, the target file may include a financial file. In the embodiments of the present disclosure, a financial file is used as an example, but the embodiments of the present disclosure are not limited thereto.
In operation S220, a file type of the target file is identified. For example, the file types of the target file may include a text file type and a table file type. The text file type may include a txt type and the like, and the table file type may include an Excel type, a CSV type, and the like.
In operation S230, a file parsing policy is determined according to the file type, where the file parsing policy corresponds to the file type.
In the embodiment of the present disclosure, different file parsing policies corresponding to file types may be adopted according to different file types of the target file. For example, for text file types, a line-by-line parsing strategy may be adopted. For the table file type, a strategy that the header part and the data part are respectively analyzed can be adopted.
It should be noted that the embodiments of the present disclosure are not limited to the two types of text files and table files, and accordingly, the embodiments of the present disclosure are not limited to the above two strategies.
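The correspondence between file type and parsing policy in operations S220 and S230 can be pictured as a lookup table. The following Python sketch is illustrative only: the names `TYPE_BY_EXTENSION`, `POLICY_BY_TYPE`, `identify_file_type`, and `determine_policy` are hypothetical, and identifying the type from the filename extension is an assumption, since the disclosure does not state how the type is identified.

```python
from pathlib import Path

# Hypothetical mapping from filename extension to file type (an assumption;
# the disclosure does not say how the file type is identified).
TYPE_BY_EXTENSION = {
    ".txt": "text",
    ".csv": "table",
    ".xls": "table",
    ".xlsx": "table",
}

# Each file type corresponds to one parsing policy.
POLICY_BY_TYPE = {
    "text": "line_by_line",
    "table": "header_then_data",
}


def identify_file_type(filename: str) -> str:
    """Identify the file type of the target file (operation S220)."""
    extension = Path(filename).suffix.lower()
    if extension not in TYPE_BY_EXTENSION:
        raise ValueError(f"unsupported file type: {extension!r}")
    return TYPE_BY_EXTENSION[extension]


def determine_policy(file_type: str) -> str:
    """Determine the parsing policy for a file type (operation S230)."""
    return POLICY_BY_TYPE[file_type]
```

In such a sketch, supporting a further file type would mean adding entries to the two tables and leaving the lookup functions unchanged.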
In operation S240, the file information of the target file is parsed according to the determined file parsing policy, where the file information includes a plurality of file fields and at least one record.
For example, for a text file, the file information of the text file is parsed line by line according to the determined policy of parsing line by line to obtain a plurality of file fields and at least one record. For the table file, according to the determined strategy for respectively analyzing the header part and the data part, the file information of the table file can be respectively analyzed to obtain a plurality of file fields and at least one record.
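The two parsing policies might be sketched as follows. The sketch rests on assumptions that are not in the disclosure: the text file is taken to hold the file fields on its first non-empty line with tab-separated columns, and the table file is represented as CSV text so that Python's standard `csv` module can be used. The function names `parse_text_lines` and `parse_table` are hypothetical.

```python
import csv
import io


def parse_text_lines(content: str, delimiter: str = "\t"):
    """Line-by-line policy.

    Assumed layout: the first non-empty line holds the file fields and
    every later non-empty line is one record.
    """
    fields = None
    records = []
    for line in content.splitlines():
        if not line.strip():
            continue  # skip blank lines
        parts = line.split(delimiter)
        if fields is None:
            fields = parts
        else:
            records.append(parts)
    return fields or [], records


def parse_table(content: str):
    """Header/data policy: parse the header part, then the data part."""
    reader = csv.reader(io.StringIO(content))
    header = next(reader, [])              # header part -> file fields
    data = [row for row in reader if row]  # data part -> records
    return header, data
```

Both functions return the same pair (file fields, records), so the storage step in operation S250 does not need to know which policy produced them.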
In operation S250, the parsed plurality of file fields and at least one record of the target file are stored in a predetermined format.
In an embodiment of the present disclosure, the predetermined format includes at least two arrays. Storing the parsed plurality of file fields and at least one record of the target file in the predetermined format includes: storing the parsed plurality of file fields of the target file in two arrays including a first array and a second array; and storing the parsed at least one record of the target file in at least two arrays including a third array and a fourth array.
For example, the values of the first array are subscripts of the second array; and/or the value of the third array is the subscript of the fourth array.
For example, each file field includes a field identification, location information of a column in which the field is located, and a field name, the subscript of the first array includes the field identification, the value of the first array includes the location information of the column in which the field is located, and the value of the second array includes the field name.
For example, each record includes a plurality of record data, the number of the plurality of record data is consistent with the number of the plurality of file fields; and each of the record data includes a data identifier, position information of a column in which the record data is located, and a data value, the subscript of the third array includes the data identifier, the value of the third array includes the position information of the column in which the record data is located, and the value of the fourth array includes the data value.
FIG. 3A is a schematic diagram of a target file processed by the file processing method according to an embodiment of the present disclosure, in which the target file is schematically shown as a text file. FIG. 3B is a schematic diagram of a target file processed by the file processing method according to an embodiment of the present disclosure, in which the target file is schematically shown as a table file.
In some embodiments, the target file may include a financial data file, such as a payroll file; however, embodiments of the present disclosure are not limited thereto.
Referring to fig. 2 and 3A in combination, in operation S220, the file type of the target file is identified as a text file type, for example, a txt file type.
In operation S230, the file parsing policy is determined to be a line-by-line parsing policy according to the identified text file type, that is, the determined line-by-line parsing policy corresponds to the identified text file type.
In operation S240, the text file shown in fig. 3A is parsed line by line according to the determined line-by-line parsing strategy to obtain file information. That is, a plurality of file fields and at least one record may be obtained.
Taking the text file shown in FIG. 3A as an example, the plurality of file fields may include a sequence number, a payee name, a payee account, a transfer amount, a remark, and the like, and the at least one record includes 3 records, namely "1 zhangsan 11111111 300.00 July payroll", "2 lisi 22222222 400.00 July payroll", and "3 wangwu 33333333 400.00 July payroll".
In operation S250, the parsed plurality of file fields and at least one record of the target file are stored in a predetermined format.
Taking the text file shown in fig. 3A as an example, the parsed file fields are stored in two arrays including a first array and a second array.
For example, the first array may take the following form:
{ ['A1', 'lie1'], ['A2', 'lie2'], ['A3', 'lie3'], ['A4', 'lie4'], ['A5', 'lie5'] }.
The second array may take the following form:
{ ['lie1', 'sequence number'], ['lie2', 'payee name'], ['lie3', 'payee account'], ['lie4', 'transfer amount'], ['lie5', 'remark'] }.
In an embodiment of the present disclosure, each of the file fields includes a field identification, position information of the column in which the field is located, and a field name. For example, A1, A2, A3, A4 and A5 are the field identifications of the file fields, lie1, lie2, lie3, lie4 and lie5 are the position information of the columns in which the file fields are located, and sequence number, payee name, payee account, transfer amount and remark are the field names of the file fields. The subscripts of the first array include the field identifications, and the values of the first array include the position information of the columns in which the fields are located, i.e., lie1, lie2, lie3, lie4 and lie5. The values lie1, lie2, lie3, lie4 and lie5 of the first array are in turn the subscripts of the second array, and the values of the second array include the field names.
In this way, two arrays are used to store the file fields of the target file. When the file fields in the target file need to be modified, added or deleted, only the elements in the two arrays need to be modified correspondingly, the background program does not need to be changed, and development, maintenance and expansion of the background program are facilitated.
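The chained first and second arrays can be sketched in Python with dictionaries standing in for arrays whose subscripts are strings. This is a minimal illustration under that assumption; the helper names `build_field_arrays` and `field_name` are hypothetical and do not appear in the disclosure.

```python
def build_field_arrays(field_names):
    """Build the first and second arrays from the parsed field names.

    first array:  field identification -> column position ('A1' -> 'lie1')
    second array: column position -> field name ('lie1' -> 'sequence number')
    """
    first = {}
    second = {}
    for index, name in enumerate(field_names, start=1):
        position = f"lie{index}"
        first[f"A{index}"] = position
        second[position] = name
    return first, second


def field_name(first, second, field_id):
    """Resolve a field identification to its field name through both arrays."""
    return second[first[field_id]]


# Field names of the text file example (FIG. 3A).
names = ["sequence number", "payee name", "payee account",
         "transfer amount", "remark"]
first_array, second_array = build_field_arrays(names)
```

Here `field_name(first_array, second_array, "A3")` follows A3 to lie3 and then lie3 to "payee account". Adding a sixth field changes only the contents of the two dictionaries, which is the extensibility property the paragraph above describes.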
Likewise, in an embodiment of the present disclosure, the parsed at least one record of the target file is stored in at least two arrays including a third array and a fourth array.
For example, taking the 1st record shown in FIG. 3A as an example, the third array may take the following form:
{ ['B1', 'mxlie1'], ['B2', 'mxlie2'], ['B3', 'mxlie3'], ['B4', 'mxlie4'], ['B5', 'mxlie5'] }.
The fourth array may take the following form:
{ ['mxlie1', '1'], ['mxlie2', 'zhangsan'], ['mxlie3', '11111111'], ['mxlie4', '300.0'], ['mxlie5', 'July payroll'] }.
For example, the at least two arrays may further include a fifth array and a sixth array. The fifth array may store the 2nd record, for example in the following form:
{ ['mxlie1', '2'], ['mxlie2', 'lisi'], ['mxlie3', '22222222'], ['mxlie4', '400.0'], ['mxlie5', 'July payroll'] }.
The sixth array may store the 3rd record, for example in the following form:
{ ['mxlie1', '3'], ['mxlie2', 'wangwu'], ['mxlie3', '33333333'], ['mxlie4', '400.0'], ['mxlie5', 'July payroll'] }.
Taking the file shown in FIG. 3A as an example, each record includes a plurality of record data, for example, 5 items of record data. The number of items of record data is the same as the number of file fields. For example, the 1st record includes 5 items of record data: 1, zhangsan, 11111111, 300.00, and July payroll. Each item of record data includes a data identification, position information of the column in which the record data is located, and a data value. For example, the record data "1" includes the data identification B1, the position information mxlie1 of the column in which the record data is located, and the data value 1; the record data "zhangsan" includes the data identification B2, the position information mxlie2 of the column in which the record data is located, and the data value zhangsan; and so on. The subscripts of the third array include the data identifications, and the values of the third array include the position information of the columns in which the record data is located. The subscripts of the fourth, fifth, and sixth arrays include the position information of the columns in which the record data is located, and the values of the fourth, fifth, and sixth arrays include the data values.
In this manner, at least one record of the target file is stored using at least two arrays. When the records in the target file need to be modified, added or deleted, the elements in at least two arrays only need to be modified correspondingly, the background program does not need to be changed, and the development, maintenance and extension of the background program are facilitated.
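The record storage can be sketched the same way, with one dictionary for the third array and one further dictionary per record. The sketch assumes that dictionaries stand in for the arrays, and the helper name `build_record_arrays` is hypothetical.

```python
def build_record_arrays(records):
    """Build the third array and one value array per record.

    third array:  data identification -> column position ('B1' -> 'mxlie1')
    value arrays: column position -> data value, one array per record
                  (the fourth, fifth, sixth, ... arrays)
    """
    width = len(records[0]) if records else 0
    third = {f"B{i}": f"mxlie{i}" for i in range(1, width + 1)}
    value_arrays = []
    for record in records:
        if len(record) != width:
            raise ValueError("every record must have the same number of data items")
        value_arrays.append(
            {f"mxlie{i}": value for i, value in enumerate(record, start=1)}
        )
    return third, value_arrays


# The three records of the text file example (FIG. 3A).
records = [
    ["1", "zhangsan", "11111111", "300.0", "July payroll"],
    ["2", "lisi", "22222222", "400.0", "July payroll"],
    ["3", "wangwu", "33333333", "400.0", "July payroll"],
]
third_array, (fourth_array, fifth_array, sixth_array) = build_record_arrays(records)
```

A data value is then reached in two steps, for example `fourth_array[third_array["B2"]]` for the payee name of the 1st record. Adding a record appends one more value array and leaves the third array unchanged.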
Referring to FIG. 2 and FIG. 3B in combination, in operation S220, the file type of the target file is identified as a table file type, for example, an Excel file type.
In operation S230, a file parsing policy is determined as a policy for parsing the header part and the data part respectively according to the identified table file type, that is, the determined parsing policy corresponds to the identified table file type.
In operation S240, the table file shown in fig. 3B is parsed according to the determined policies for parsing the header part and the data part, respectively, to obtain file information. That is, a plurality of file fields and at least one record may be obtained.
Taking the table file shown in FIG. 3B as an example, the plurality of file fields may include a sequence number, a payee name, a payee account, a transfer amount, a payee contact phone, a remark, and the like, and the at least one record includes 3 records, namely "1 zhangsan 11111111 300.00 34567899 July payroll", "2 lisi 22222222 400.00 45678969 July payroll", and "3 wangwu 33333333 400.00 55678969 July payroll".
In operation S250, the parsed plurality of file fields and at least one record of the target file are stored in a predetermined format.
Taking the table file shown in fig. 3B as an example, the parsed file fields are stored in two arrays including a first array and a second array.
For example, the first array may take the following form:
{ ['A1', 'lie1'], ['A2', 'lie2'], ['A3', 'lie3'], ['A4', 'lie4'], ['A5', 'lie5'], ['A6', 'lie6'] }.
The second array may take the following form:
{ ['lie1', 'sequence number'], ['lie2', 'payee name'], ['lie3', 'payee account'], ['lie4', 'transfer amount'], ['lie5', 'payee contact phone'], ['lie6', 'remark'] }.
In an embodiment of the present disclosure, each of the file fields includes a field identification, position information of the column in which the field is located, and a field name. For example, A1, A2, A3, A4, A5 and A6 are the field identifications of the file fields, lie1, lie2, lie3, lie4, lie5 and lie6 are the position information of the columns in which the file fields are located, and sequence number, payee name, payee account, transfer amount, payee contact phone and remark are the field names of the file fields. The subscripts of the first array include the field identifications, and the values of the first array include the position information of the columns in which the fields are located, i.e., lie1, lie2, lie3, lie4, lie5 and lie6. The values lie1, lie2, lie3, lie4, lie5 and lie6 of the first array are in turn the subscripts of the second array, and the values of the second array include the field names.
In this way, two arrays are used to store the file fields of the target file. When the file fields in the target file need to be modified, added or deleted, only the elements in the two arrays need to be modified correspondingly, the background program does not need to be changed, and development, maintenance and expansion of the background program are facilitated.
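The two-level lookup described above can be sketched as follows. This is only an illustrative sketch, not the implementation of the disclosure: Python dictionaries stand in for the subscripted arrays, and the helper names `field_name` and `add_field` as well as the extra field "currency" are hypothetical.

```python
# Illustrative sketch only: dicts stand in for the subscripted arrays.

# First array: field identification -> location of the column.
first_array = {
    "A1": "lie1", "A2": "lie2", "A3": "lie3",
    "A4": "lie4", "A5": "lie5", "A6": "lie6",
}

# Second array: location of the column -> field name.
second_array = {
    "lie1": "sequence number",
    "lie2": "payee name",
    "lie3": "payee account",
    "lie4": "transfer amount",
    "lie5": "payee contact phone",
    "lie6": "remark",
}


def field_name(field_id):
    """Resolve a field identification to its field name via both arrays."""
    return second_array[first_array[field_id]]


def add_field(field_id, column, name):
    """Add a file field by adding one element to each array (hypothetical helper)."""
    first_array[field_id] = column
    second_array[column] = name


# Adding a field touches only the array contents, not the lookup code.
add_field("A7", "lie7", "currency")  # "currency" is an invented example field

print(field_name("A4"))  # transfer amount
print(field_name("A7"))  # currency
```

In this sketch, adding, renaming or deleting a field changes only the contents of the two dictionaries, which mirrors the stated benefit that the background program itself stays unchanged.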
Likewise, in an embodiment of the present disclosure, the parsed at least one record of the target file is stored in at least two arrays including a third array and a fourth array.
For example, taking the 1st record shown in fig. 3B as an example, the third array may take the following form:
{['B1','mxlie1'],['B2','mxlie2'],['B3','mxlie3'],['B4','mxlie4'],['B5','mxlie5'],['B6','mxlie6']}.
The fourth array may take the following form:
{['mxlie1','1'],['mxlie2','zhangsan'],['mxlie3','11111111'],['mxlie4','300.0'],['mxlie5','34567899'],['mxlie6','July payroll']}.
For example, the at least two arrays may further include a fifth array and a sixth array. The fifth array may store the 2nd record, for example, in the following form:
{['mxlie1','2'],['mxlie2','lisi'],['mxlie3','22222222'],['mxlie4','400.0'],['mxlie5','45678969'],['mxlie6','July payroll']}.
The sixth array may store the 3rd record, for example, in the following form:
{['mxlie1','3'],['mxlie2','wangwu'],['mxlie3','33333333'],['mxlie4','400.0'],['mxlie5','55678969'],['mxlie6','July payroll']}.
Taking the file shown in fig. 3B as an example, each record includes a plurality of record data, for example, 6 record data. The number of the plurality of record data is identical to the number of the plurality of file fields. For example, the 1st record includes 6 record data, namely 1, zhangsan, 11111111, 300.00, 34567899, and July payroll. Each of the record data includes a data identification, location information of a column in which the record data is located, and a data value. For example, the record data "1" includes a data identification B1, location information mxlie1 of the column in which the record data is located, and a data value 1; the record data "zhangsan" includes a data identification B2, location information mxlie2 of the column in which the record data is located, and a data value zhangsan; and so on. The subscripts of the third array include the data identifications, and the values of the third array include the location information of the columns in which the record data are located. The subscripts of the fourth, fifth, and sixth arrays include the location information of the columns in which the record data are located, and the values of the fourth, fifth, and sixth arrays include the data values.
In this manner, at least one record of the target file is stored using at least two arrays. When the records in the target file need to be modified, added or deleted, the elements in at least two arrays only need to be modified correspondingly, the background program does not need to be changed, and the development, maintenance and extension of the background program are facilitated.
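The record storage described above can be sketched in the same way. This is a hedged illustration under assumptions: dictionaries stand in for the arrays, the list `record_arrays` and the helper `data_value` are hypothetical names, and the remark text is rendered here as "July payroll".

```python
# Illustrative sketch only: dicts stand in for the subscripted arrays.

# Third array: data identification -> location of the column.
third_array = {
    "B1": "mxlie1", "B2": "mxlie2", "B3": "mxlie3",
    "B4": "mxlie4", "B5": "mxlie5", "B6": "mxlie6",
}

# One value array per record: the fourth, fifth and sixth arrays.
record_arrays = [
    {"mxlie1": "1", "mxlie2": "zhangsan", "mxlie3": "11111111",
     "mxlie4": "300.0", "mxlie5": "34567899", "mxlie6": "July payroll"},
    {"mxlie1": "2", "mxlie2": "lisi", "mxlie3": "22222222",
     "mxlie4": "400.0", "mxlie5": "45678969", "mxlie6": "July payroll"},
    {"mxlie1": "3", "mxlie2": "wangwu", "mxlie3": "33333333",
     "mxlie4": "400.0", "mxlie5": "55678969", "mxlie6": "July payroll"},
]


def data_value(record_index, data_id):
    """Look up a data value: data identification -> column -> value."""
    column = third_array[data_id]
    return record_arrays[record_index][column]


print(data_value(0, "B2"))  # zhangsan
print(data_value(2, "B4"))  # 400.0
```

Under this layout, adding a record amounts to appending one more value array, and the third array is shared by all records.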
In the embodiment of the present disclosure, the relationship between the uploaded target file and the background storage variables is made explicit, and the file processing method and system may support multiple file types. For example, for both the text file shown in fig. 3A and the table file shown in fig. 3B, the background storage is represented as multiple arrays. When a field or a record is changed, only the elements of the arrays need to be changed and the background program does not need to be changed, which makes the program easier to extend.
For example, the file information further includes a file version number, for example, the file version number "1.1" shown in fig. 3A and 3B. The file processing method further includes: checking the target file according to the file version number and the number of the file fields. In some embodiments, the target file may have multiple versions, and different versions may have different numbers of file fields. For example, the target file may have version 1.1, version 1.2, and version 1.3, where version 1.1 has 6 file fields, version 1.2 has 8 file fields, and version 1.3 has 9 file fields. In the file processing method, the target file may be checked according to the correspondence between the file version number and the number of the file fields. For example, when the file version number of the target file is 1.1, the target file is judged to be correct if the number of file fields is 6, and incorrect if the number of file fields is 7.
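The version check can be sketched as a lookup table from version number to expected field count. The table name `EXPECTED_FIELD_COUNT` and the function `check_target_file` are hypothetical; the counts are the example values given in the text, and rejecting unknown versions is an assumption of this sketch.

```python
# Illustrative sketch only: example correspondence taken from the text above.
EXPECTED_FIELD_COUNT = {"1.1": 6, "1.2": 8, "1.3": 9}


def check_target_file(version, field_names):
    """Return True when the number of fields matches the file version."""
    expected = EXPECTED_FIELD_COUNT.get(version)
    # Assumption: an unknown version number is treated as incorrect.
    return expected is not None and len(field_names) == expected


fields = ["sequence number", "payee name", "payee account",
          "transfer amount", "payee contact phone", "remark"]

print(check_target_file("1.1", fields))              # True
print(check_target_file("1.1", fields + ["extra"]))  # False
```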
In some exemplary embodiments, the values of the second array respectively correspond to the values of the fourth array, and the file processing method further includes: checking, one by one according to the values of the second array, the validity of the values of the fourth array corresponding to the values of the second array. The check rule may be based on data type; for example, the data type may include a character string, a numerical value, and the like. The check rule may also be based on data length, character string length, and the like. In the embodiment of the disclosure, the check is performed after the data in the target file has been stored in the arrays, so the target file itself does not need to be checked item by item, which can improve the efficiency of data verification.
For example, for the target file shown in fig. 3A or 3B, for the second array value of "payee name" (i.e., field name "payee name"), it is checked whether the value of the fourth array corresponding to "payee name" is a character string; for the value of the second array being "transfer amount" (i.e., field name "transfer amount"), it is checked whether the value of the fourth array corresponding to the "transfer amount" is a numeric value.
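A minimal sketch of such a type-based validity check is given below. It rests on several assumptions that are not stated in the disclosure: the rule table `RULES`, the helpers `is_string` and `is_numeric`, and the pairing of column `lie<N>` with column `mxlie<N>` are all hypothetical choices made for illustration.

```python
from decimal import Decimal, InvalidOperation


def is_string(value):
    """Rule: the value is a non-empty character string."""
    return isinstance(value, str) and len(value) > 0


def is_numeric(value):
    """Rule: the value can be read as a finite decimal number."""
    try:
        return Decimal(value).is_finite()
    except (InvalidOperation, TypeError, ValueError):
        return False


# Hypothetical rule table keyed by field name (a value of the second array).
RULES = {"payee name": is_string, "transfer amount": is_numeric}


def check_record(second_array, value_array):
    """Return the field names whose record values fail their rule."""
    failures = []
    for column, name in second_array.items():
        rule = RULES.get(name)
        # Assumption: column "lie4" of the fields pairs with "mxlie4" of the record.
        value = value_array.get("mx" + column)
        if rule is not None and not rule(value):
            failures.append(name)
    return failures


second_array = {"lie2": "payee name", "lie4": "transfer amount"}
fourth_array = {"mxlie2": "zhangsan", "mxlie4": "300.0"}
bad_array = {"mxlie2": "lisi", "mxlie4": "four hundred"}

print(check_record(second_array, fourth_array))  # []
print(check_record(second_array, bad_array))     # ['transfer amount']
```

Length-based rules mentioned in the text could be added to the same table as further predicates.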
Some exemplary embodiments of the present disclosure also provide a file processing apparatus, which is described in detail below with reference to fig. 4. Fig. 4 schematically shows a block diagram of the structure of a file processing apparatus according to an embodiment of the present disclosure. As shown in fig. 4, the file processing apparatus 40 includes: a receiving module 41, configured to receive a target file; an identifying module 42, configured to identify a file type of the target file; a policy determining module 43, configured to determine a file parsing policy according to the file type, where the file parsing policy corresponds to the file type; a parsing module 44, configured to parse file information of the target file according to the determined file parsing policy, where the file information includes a plurality of file fields and at least one record; and a storage module 45, configured to store the parsed plurality of file fields and the at least one record of the target file according to a predetermined format.
In an embodiment of the present disclosure, the predetermined format includes at least two arrays, and the storing of the parsed plurality of file fields and the at least one record of the target file according to the predetermined format includes: storing the parsed plurality of file fields of the target file in two arrays including a first array and a second array; and storing the parsed at least one record of the target file in at least two arrays including a third array and a fourth array.
For example, the values of the first array are subscripts of the second array; and/or the value of the third array is the subscript of the fourth array.
For example, each file field includes a field identification, location information of a column in which the field is located, and a field name, the subscript of the first array includes the field identification, the value of the first array includes the location information of the column in which the field is located, and the value of the second array includes the field name.
For example, each record includes a plurality of record data, the number of the plurality of record data is consistent with the number of the plurality of file fields; and each of the record data includes a data identifier, position information of a column in which the record data is located, and a data value, the subscript of the third array includes the data identifier, the value of the third array includes the position information of the column in which the record data is located, and the value of the fourth array includes the data value.
For example, the file information further includes a file version number, and the file processing apparatus further includes a checking module 46, configured to check the target file according to the file version number and the number of the file fields.
For example, the file type includes a text file type, and the parsing of the file information of the target file according to the determined file parsing policy specifically includes: parsing the target file line by line to obtain the file information of the target file.
For example, the file type includes a table file type, and the parsing of the file information of the target file according to the determined file parsing policy specifically includes: parsing the header part and the data part of the table file respectively to obtain the file information of the target file.
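The selection of a parsing policy by file type can be sketched as a dispatch table. Everything concrete here is an assumption made for illustration: the disclosure does not specify the file layouts, so the sketch assumes a "|"-delimited text file whose first line holds the field names, uses CSV as a stand-in for the table file type, and identifies the type from the file extension. The names `parse_text`, `parse_table`, `STRATEGIES`, `identify_file_type` and `process` are hypothetical.

```python
import csv
import io


def parse_text(content, delimiter="|"):
    """Text policy: parse line by line (assumed layout: first line = fields)."""
    fields, records = [], []
    for line in content.splitlines():
        if not line.strip():
            continue  # skip blank lines
        cells = [cell.strip() for cell in line.split(delimiter)]
        if not fields:
            fields = cells
        else:
            records.append(cells)
    return fields, records


def parse_table(content):
    """Table policy: parse the header part and the data part separately."""
    rows = list(csv.reader(io.StringIO(content)))
    if not rows:
        return [], []
    header_part, data_part = rows[0], rows[1:]
    return header_part, data_part


# Hypothetical correspondence between file type and parsing policy.
STRATEGIES = {"txt": parse_text, "csv": parse_table}


def identify_file_type(filename):
    """Assumption: the file type is identified from the file extension."""
    return filename.rsplit(".", 1)[-1].lower()


def process(filename, content):
    """Identify the type, pick the matching policy, and parse the content."""
    strategy = STRATEGIES.get(identify_file_type(filename))
    if strategy is None:
        raise ValueError("unsupported file type: " + filename)
    return strategy(content)


print(process("pay.txt", "payee name|transfer amount\nzhangsan|300.00\n"))
print(process("pay.csv", "payee name,transfer amount\nzhangsan,300.00\n"))
```

Supporting a further file type in this sketch means registering one more function in `STRATEGIES`; the output of every policy has the same shape, so the storage step does not depend on the file type.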
For example, the values of the second array respectively correspond to the values of the fourth array, and the file processing apparatus further includes a validity checking module 47, configured to check, one by one according to the values of the second array, the validity of the values of the fourth array corresponding to the values of the second array.
According to some embodiments of the present disclosure, any of the receiving module 41, the identifying module 42, the policy determining module 43, the parsing module 44, the storing module 45, the checking module 46 and the validity checking module 47 may be combined into one module to be implemented, or any one of them may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to some embodiments of the present disclosure, at least one of the receiving module 41, the identifying module 42, the policy determining module 43, the parsing module 44, the storing module 45, the checking module 46, and the validity checking module 47 may be at least partially implemented as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or implemented by any one of three implementations of software, hardware, and firmware, or by a suitable combination of any of them. Alternatively, at least one of the receiving module 41, the identifying module 42, the policy determining module 43, the parsing module 44, the storing module 45, the checking module 46 and the validity checking module 47 may be at least partially implemented as a computer program module which, when executed, may perform a corresponding function.
Fig. 5 schematically shows a block diagram of an electronic device adapted to implement the above-described file processing method according to an embodiment of the present disclosure.
As shown in fig. 5, an electronic device 500 according to an embodiment of the present disclosure includes a processor 501 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. The processor 501 may comprise, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 501 may also include onboard memory for caching purposes. Processor 501 may include a single processing unit or multiple processing units for performing different actions of a method flow according to embodiments of the disclosure.
In the RAM 503, various programs and data necessary for the operation of the electronic apparatus 500 are stored. The processor 501, the ROM502, and the RAM 503 are connected to each other by a bus 504. The processor 501 performs various operations of the method flows according to the embodiments of the present disclosure by executing programs in the ROM502 and/or the RAM 503. Note that the programs may also be stored in one or more memories other than the ROM502 and the RAM 503. The processor 501 may also perform various operations of method flows according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to some embodiments of the present disclosure, the electronic device 500 may also include an input/output (I/O) interface 505, which is also connected to the bus 504. The electronic device 500 may also include one or more of the following components connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card, a modem, or the like. The communication section 509 performs communication processing via a network such as the internet. A drive 510 is also connected to the I/O interface 505 as necessary. A removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 510 as necessary, so that a computer program read out therefrom is installed into the storage section 508 as necessary.
The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.
According to some embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to some embodiments of the present disclosure, a computer-readable storage medium may include ROM502 and/or RAM 503 and/or one or more memories other than ROM502 and RAM 503 described above.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the method illustrated in the flow chart. When the computer program product runs in a computer system, the program code causes the computer system to implement the file processing method provided by the embodiments of the disclosure.
The computer program performs the above-described functions defined in the system/apparatus of the embodiments of the present disclosure when executed by the processor 501. According to some embodiments of the present disclosure, the systems, devices, modules, units, etc. described above may be implemented by computer program modules.
In one embodiment, the computer program may be hosted on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted, distributed in the form of a signal on a network medium, downloaded and installed through the communication section 509, and/or installed from the removable medium 511. The computer program containing program code may be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 509, and/or installed from the removable medium 511. The computer program, when executed by the processor 501, performs the above-described functions defined in the system of the embodiments of the present disclosure. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to some embodiments of the present disclosure.
According to some embodiments of the present disclosure, program code for executing computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages, and in particular, these computer programs may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. The programming languages include, but are not limited to, Java, C++, Python, the "C" language, or the like. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined and/or integrated in various ways, even if such combinations or integrations are not expressly recited in the present disclosure. In particular, such combinations and/or integrations may be made without departing from the spirit or teaching of the present disclosure. All such combinations and/or integrations are within the scope of the present disclosure.
The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims (13)

1. A file processing method, the file processing method being performed by an electronic device, the method comprising:
receiving a target file;
identifying a file type of the target file;
determining a file analysis strategy according to the file type, wherein the file analysis strategy corresponds to the file type;
analyzing the file information of the target file according to the determined file analysis strategy, wherein the file information comprises a plurality of file fields and at least one record; and
storing the parsed file fields and at least one record of the target file in a predetermined format,
wherein the predetermined format includes at least two arrays, and
the storing the parsed file fields and the at least one record of the target file according to the predetermined format includes:
storing the parsed plurality of file fields of the target file in two arrays including a first array and a second array; and
storing the parsed at least one record of the target file in at least two arrays including a third array and a fourth array.
2. The file processing method according to claim 1, wherein the values of the first array are subscripts of the second array; and/or the value of the third array is the subscript of the fourth array.
3. The file processing method according to claim 2, wherein each of the file fields includes a field identification, location information of a column in which the field is located, and a field name, the subscript of the first array includes the field identification, the value of the first array includes the location information of the column in which the field is located, and the value of the second array includes the field name.
4. The file processing method according to claim 2 or 3, wherein each record includes a plurality of record data, the number of the plurality of record data being identical to the number of the plurality of file fields; and
each of the record data includes a data identifier, position information of a column in which the record data is located, and a data value, the subscript of the third array includes the data identifier, the value of the third array includes the position information of the column in which the record data is located, and the value of the fourth array includes the data value.
5. The file processing method according to claim 2 or 3, wherein the file information further includes a file version number,
the file processing method further comprises: checking the target file according to the file version number and the number of the file fields.
6. The file processing method according to claim 2 or 3, wherein the file type includes a text file type,
the analyzing the file information of the target file according to the determined file analysis strategy specifically includes:
analyzing the target file line by line to obtain the file information of the target file.
7. The file processing method according to claim 2 or 3, wherein the file type includes a table file type,
the analyzing the file information of the target file according to the determined file analysis strategy specifically includes:
analyzing the header part and the data part of the table file respectively to obtain the file information of the target file.
8. The file processing method according to claim 2 or 3, wherein the values of the second array respectively correspond to the values of the fourth array,
the file processing method further comprises: checking, one by one according to the values of the second array, the validity of the values of the fourth array corresponding to the values of the second array.
9. A file processing apparatus, characterized by comprising:
a receiving module, configured to receive a target file;
an identification module, configured to identify a file type of the target file;
a strategy determining module, configured to determine a file analysis strategy according to the file type, wherein the file analysis strategy corresponds to the file type;
an analysis module, configured to analyze file information of the target file according to the determined file analysis strategy, wherein the file information comprises a plurality of file fields and at least one record; and
a storage module, configured to store the parsed file fields and at least one record of the target file according to a predetermined format,
wherein the predetermined format includes at least two arrays, and
the storing the parsed file fields and the at least one record of the target file according to the predetermined format includes:
storing the parsed plurality of file fields of the target file in two arrays including a first array and a second array; and
storing the parsed at least one record of the target file in at least two arrays including a third array and a fourth array.
10. The file processing apparatus according to claim 9, wherein the values of the first array are subscripts of the second array; and/or the value of the third array is the subscript of the fourth array.
11. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the file processing method of any of claims 1-8.
12. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the file processing method according to any one of claims 1 to 8.
13. A computer program product comprising a computer program which, when executed by a processor, implements a file processing method according to any one of claims 1 to 8.
CN202110916317.XA 2021-08-10 2021-08-10 File processing method and device, electronic equipment and storage medium Pending CN113590554A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110916317.XA CN113590554A (en) 2021-08-10 2021-08-10 File processing method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110916317.XA CN113590554A (en) 2021-08-10 2021-08-10 File processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113590554A true CN113590554A (en) 2021-11-02

Family

ID=78256997

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110916317.XA Pending CN113590554A (en) 2021-08-10 2021-08-10 File processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113590554A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110069449A (en) * 2019-03-20 2019-07-30 平安科技(深圳)有限公司 Document handling method, device, computer equipment and storage medium
CN111125225A (en) * 2019-12-24 2020-05-08 北京数衍科技有限公司 Bill data analysis method and device and server
CN111427899A (en) * 2020-03-17 2020-07-17 中国建设银行股份有限公司 Method, device, equipment and computer readable medium for storing file
CN112445866A (en) * 2019-08-13 2021-03-05 北京京东振世信息技术有限公司 Data processing method and device, computer readable medium and electronic equipment
US10983815B1 (en) * 2019-10-31 2021-04-20 Jpmorgan Chase Bank, N.A. System and method for implementing a generic parser module

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination