CN110909523B - Data processing method and device - Google Patents

Info

Publication number
CN110909523B
Authority
CN
China
Prior art keywords
file
json
hashmap
mapfile
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911212915.8A
Other languages
Chinese (zh)
Other versions
CN110909523A (en)
Inventor
朱晓峰
翁星晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Ltd
Original Assignee
Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Ltd
Priority to CN201911212915.8A
Publication of CN110909523A
Application granted
Publication of CN110909523B
Active
Anticipated expiration

Abstract

The invention provides a data processing method and device. The method acquires a JSON file whose data structure is to be converted, together with a JSON file tag that indicates the type of the JSON file; acquires a first HashMap file and a tag chain set file (mapfile) that are stored in advance in a configuration file and correspond to that JSON file type; and converts the JSON file into a target structured file based on the first HashMap file and the mapfile. Because the first HashMap file and the mapfile are stored in the configuration file in advance, JSON files of different formats can be converted quickly without modifying the code corresponding to each JSON file format, which improves the efficiency of converting unstructured files into structured files.

Description

Data processing method and device
Technical Field
The invention belongs to the field of data processing, and particularly relates to a data processing method and device.
Background
At present, unstructured documents can be converted into structured documents by various methods, so that the converted documents follow a regular specification and are better suited to data processing.
In practice, when a JSON file is treated as an unstructured document and converted into a structured document, the conversion is strongly coupled to the JSON file format, so handling a different format requires modifying code.
However, JSON files come in multiple formats, and each format has its own code. Whenever a JSON file of a different format is encountered, the code corresponding to that format must be modified, which reduces the efficiency of converting unstructured files into structured files.
Disclosure of Invention
In view of the above, the present invention aims to provide a data processing method and device that solve the problem that, when JSON files of different formats are encountered, the code corresponding to each JSON file format must be modified separately, which reduces the efficiency of converting unstructured files into structured files. The technical solution is as follows:
An embodiment of the invention discloses a data processing method, comprising:
acquiring a JSON file whose data structure is to be converted and a JSON file tag of the JSON file, wherein the JSON file tag indicates the JSON file type;
acquiring, based on the JSON file type, a first HashMap file and a tag chain set file (mapfile) that are stored in advance in a configuration file and correspond to the JSON file type, wherein the mapfile defines the mapping between the target structured file and the JSON file; and
converting the JSON file into the target structured file based on the first HashMap file and the mapfile.
Optionally, acquiring, based on the JSON file type, the first HashMap file and the mapfile that are stored in advance in the configuration file and correspond to the JSON file type comprises:
searching the configuration file for the directory that indicates the JSON file type;
acquiring the first HashMap file and the mapfile that are stored in advance under that directory and correspond to the JSON file type;
wherein each directory of the configuration file stores a first HashMap file and a mapfile generated in advance from a JSON file of a different JSON file type; and
the configuration file comprises an XML configuration file.
Optionally, the method further comprises:
if the configuration file does not store a first HashMap file and a mapfile corresponding to the JSON file type, creating a directory corresponding to the JSON file in the configuration file; and
generating the first HashMap file and the mapfile based on the JSON file corresponding to the JSON file type, and storing them under the directory.
Optionally, generating the first HashMap file based on the JSON file comprises:
parsing the JSON file to obtain node information for all nodes in the JSON file, wherein the node information comprises a JsonSpot object and the tag chain of the JsonSpot object; and
storing each JsonSpot object and its tag chain in a first HashMap as a key-value pair, with the tag chain as the key and the JsonSpot object as the value, to obtain a first HashMap file containing the node information of all nodes in the JSON file.
Optionally, generating the tag chain set file mapfile based on the JSON file comprises:
parsing the JSON file to obtain the mapping between the JSON file and the structured file;
setting the number of rows N of the mapfile based on the number of JsonSpot objects whose type is object array (OA), wherein N is the number of OAs plus 1;
taking the set of non-OA tag chains in the JSON file as the first layer of the mapfile;
taking the tag chain set of each OA in the JSON file as the second layer of the mapfile, wherein each OA tag chain set in the second layer produces one row, and, if one or more OAs are nested in an OA-type node, the tag chain set of the row for that node is contained in the tag chain set of the row for every OA nested in it; and
defining a structured file name for each row in the first layer and the second layer, and generating the mapfile containing the correspondence between structured file names and tag chain sets.
Optionally, the method further comprises:
modifying the structured file names, the order of the tag chain sets, and the filter fields in the mapfile to obtain a new mapfile.
Optionally, converting the JSON file whose data structure is to be converted into the target structured file based on the first HashMap file and the mapfile comprises:
parsing the JSON file to obtain a second HashMap file and a third HashMap file, wherein the second HashMap file stores the tag chain values of all JsonSpot objects in the JSON file, and the third HashMap file stores the number of objects in each OA in the JSON file;
traversing each row of the mapfile, wherein each row corresponds to one structured file, and acquiring the tag chain set corresponding to that structured file;
determining the type of each tag chain in the tag chain set according to the first HashMap file;
if a tag chain is of a non-OA type, acquiring from the second HashMap file the value whose key equals that tag chain, and appending the value and a field separator to the string to be written to the structured file;
if a tag chain is of an OA type, determining the number of objects in each OA based on the third HashMap file, looking up the second HashMap file based on that number, and appending the obtained values and field separators to the string to be written to the structured file; and
writing the string to the structured file each time one row of the mapfile has been traversed, until the whole mapfile has been traversed, to obtain the target structured file corresponding to the JSON file.
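The row-by-row traversal above can be illustrated with a minimal Java sketch. It is not the patented implementation, and several details are assumptions made only for illustration: the class `RowWriter` and its parameters are hypothetical; `oaOf` stands in for the type lookup against the first HashMap (it maps an OA-type tag chain to the chain of its enclosing OA); OA element values in the second HashMap are assumed to be keyed as `oaChain[i]` plus the rest of the chain; and a row is assumed to reference at most one OA, so nested OAs are out of scope here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class RowWriter {
    static final String FIELD_SEPARATOR = "|";

    /**
     * Renders one mapfile row into output lines.
     * oaOf:     OA-type tag chain -> chain of its enclosing OA (assumed lookup)
     * values:   second HashMap; OA values keyed as oaChain[i] + rest (assumed)
     * oaCounts: third HashMap; OA chain -> number of objects in that OA
     */
    static List<String> render(List<String> rowChains,
                               Map<String, String> oaOf,
                               Map<String, String> values,
                               Map<String, Integer> oaCounts) {
        // Find the single OA this row refers to, if any.
        String oa = null;
        for (String chain : rowChains) {
            String enclosing = oaOf.get(chain);
            if (enclosing == null) {
                continue;
            }
            if (oa != null && !oa.equals(enclosing)) {
                throw new IllegalArgumentException(
                        "this sketch handles one OA per row only");
            }
            oa = enclosing;
        }

        // A non-OA row yields one line; an OA row yields one line per object.
        int lineCount = (oa == null) ? 1 : oaCounts.getOrDefault(oa, 0);
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < lineCount; i++) {
            List<String> fields = new ArrayList<>();
            for (String chain : rowChains) {
                boolean isOaChain = oaOf.get(chain) != null;
                String key = isOaChain
                        ? oa + "[" + i + "]" + chain.substring(oa.length())
                        : chain;
                fields.add(values.getOrDefault(key, ""));
            }
            lines.add(String.join(FIELD_SEPARATOR, fields));
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> row = List.of("transactionCode", "ruleEngine-->decisionTreeId");
        Map<String, String> oaOf = Map.of("ruleEngine-->decisionTreeId", "ruleEngine");
        Map<String, String> values = Map.of(
                "transactionCode", "TX",
                "ruleEngine[0]-->decisionTreeId", "T1",
                "ruleEngine[1]-->decisionTreeId", "T2");
        Map<String, Integer> oaCounts = Map.of("ruleEngine", 2);
        render(row, oaOf, values, oaCounts).forEach(System.out::println);
    }
}
```

In this sketch the non-OA fields are repeated on every line produced for an OA, which is one plausible reading of the method; the indexed key scheme is purely an illustrative choice.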
An embodiment of the invention further discloses a data processing device, comprising:
a first acquisition module, configured to acquire a JSON file whose data structure is to be converted and a JSON file tag of the JSON file, wherein the JSON file tag indicates the JSON file type;
a second acquisition module, configured to acquire, based on the JSON file type, a first HashMap file and a tag chain set file (mapfile) that are stored in advance in a configuration file and correspond to the JSON file type, wherein the mapfile defines the mapping between the target structured file and the JSON file; and
a conversion module, configured to convert the JSON file into the target structured file based on the first HashMap file and the mapfile.
Optionally, the second acquisition module comprises:
a searching unit, configured to search the configuration file for the directory that indicates the JSON file type; and
a first acquisition unit, configured to acquire the first HashMap file and the mapfile that are stored in advance under that directory and correspond to the JSON file type; wherein each directory of the configuration file stores a first HashMap file and a mapfile generated in advance from a JSON file of a different JSON file type, and the configuration file comprises an XML configuration file.
Optionally, the conversion module comprises:
a parsing unit, configured to parse the JSON file whose data structure is to be converted to obtain a second HashMap file and a third HashMap file, wherein the second HashMap file stores the tag chain values of all JsonSpot objects in the JSON file, and the third HashMap file stores the number of objects in each OA in the JSON file;
a first traversing unit, configured to traverse each row of the mapfile, wherein each row corresponds to one structured file, and to acquire the tag chain set corresponding to that structured file;
a determining unit, configured to determine the type of each tag chain in the tag chain set according to the first HashMap file;
a second acquisition unit, configured to, if a tag chain is of a non-OA type, acquire from the second HashMap file the value whose key equals that tag chain, and append the value and a field separator to the string to be written to the structured file;
a third acquisition unit, configured to, if a tag chain is of an OA type, determine the number of objects in each OA based on the third HashMap file, look up the second HashMap file based on that number, and append the obtained values and field separators to the string to be written to the structured file; and
a fourth acquisition unit, configured to write the string to the structured file each time one row of the mapfile has been traversed, until the whole mapfile has been traversed, to obtain the target structured file corresponding to the JSON file.
Compared with the prior art, the technical solution provided by the invention has the following advantages:
The method acquires a JSON file whose data structure is to be converted and its JSON file tag, the JSON file tag indicating the JSON file type; acquires a first HashMap file and a tag chain set file (mapfile) that are stored in advance in a configuration file and correspond to the JSON file type; and converts the JSON file into a target structured file based on the first HashMap file and the mapfile. Because the first HashMap file and the mapfile are stored in the configuration file in advance, JSON files of different formats can be converted quickly without modifying the code corresponding to each JSON file format, which improves the efficiency of converting unstructured files into structured files.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a flowchart of a data processing method according to an embodiment of the present invention;
FIG. 2 is a flowchart of acquiring the first HashMap file and the tag chain set file mapfile that are stored in advance in a configuration file and correspond to the JSON file type, according to an embodiment of the present invention;
FIG. 3 is a flowchart of another data processing method according to an embodiment of the present invention;
FIG. 4 is a flowchart of generating a first HashMap file based on a JSON file according to an embodiment of the present invention;
FIG. 5 is a flowchart of generating a tag chain set file mapfile based on a JSON file according to an embodiment of the present invention;
FIG. 6 is a flowchart of converting a JSON file whose data structure is to be converted into a target structured file according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a data processing device according to an embodiment of the present invention.
Detailed Description
The invention provides a data processing method and device that solve the problem that, when JSON files of different formats are encountered, the code corresponding to each JSON file format must be modified separately, which reduces the efficiency of converting unstructured files into structured files.
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of protection of the invention.
For a better understanding, structured and unstructured data are explained as follows:
Structured data: row data stored in a database that can be expressed logically with a two-dimensional table structure; its structure definition does not change easily, and its length is fixed.
Unstructured data: data that is inconvenient to represent with a two-dimensional logical table of a database; its field lengths are variable, and the record of each field may consist of repeatable or non-repeatable sub-fields.
As noted in the Background, in the prior art, converting a JSON file treated as an unstructured document is strongly coupled to the JSON file format, so handling a different JSON file format requires modifying code.
Therefore, the invention provides a data processing method and device that, based on a first HashMap file and a tag chain set file mapfile stored in advance in a configuration file, can quickly convert JSON files of different formats without modifying the code corresponding to each JSON file format, thereby improving the efficiency of converting unstructured files into structured files.
FIG. 1 is a flowchart of a data processing method provided by an embodiment of the present invention. The method includes the following steps:
S101, acquiring a JSON file whose data structure is to be converted and a JSON file tag of the JSON file.
In S101, JSON (JavaScript Object Notation) is a lightweight data-interchange format.
In implementing S101, acquiring the JSON file to be converted makes its subsequent parsing possible. In addition to the JSON file itself, the JSON file tag of the JSON file must be acquired; the tag indicates the JSON file type.
It should be noted that JSON files come in different file types, that is, their formats differ.
S102, acquiring, based on the JSON file type, the first HashMap file and the tag chain set file mapfile that are stored in advance in the configuration file and correspond to the JSON file type.
In S102, a configuration file, in computer science, is a computer file that configures parameters and initial settings for a computer program. JSON file types are used to distinguish JSON files of different formats.
In implementing S102, because the configuration file associates each JSON file type with its first HashMap file and mapfile, the first HashMap file and the mapfile can be retrieved from the configuration file based on the type of the JSON file.
Specifically, the first HashMap file consists of <key: value> pairs, so a value can be retrieved quickly by its key. The tag chain is the key, and the JsonSpot object is the value.
The tag chain set file mapfile consists of <structured file name: tag chain set> entries and defines the mapping between the target structured file and the JSON file.
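As an illustration of the <structured file name: tag chain set> layout, the following minimal Java sketch reads mapfile lines into an ordered map. The class `MapfileParser` is hypothetical, and the concrete syntax it assumes (a `:` after the file name and `|` between tag chains) is taken from the example mapfile given later in this description rather than from a formal grammar.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MapfileParser {
    /**
     * Parses lines of the assumed form NAME:chain|chain|... into an ordered
     * map from structured file name to its list of tag chains.
     */
    static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> rows = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // ignore blank lines
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("missing ':' in line: " + line);
            }
            String fileName = line.substring(0, colon).trim();
            List<String> chains = new ArrayList<>();
            for (String chain : line.substring(colon + 1).split("\\|")) {
                chains.add(chain.trim());
            }
            rows.put(fileName, chains);
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, List<String>> rows = parse(List.of(
                "TABLE0:transactionCode|finalDecision-->actionCode",
                "TABLE1:transactionCode|ruleEngine-->decisionTreeId"));
        rows.forEach((name, chains) -> System.out.println(name + " -> " + chains));
    }
}
```

A `LinkedHashMap` is used so that the rows keep the order in which they appear in the mapfile.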
S103, converting the JSON file of the data structure to be converted into a target structured file based on the first HashMap file and the label chain set file mapfile.
In implementing S103, the JSON file whose data structure is to be converted is converted into the target structured file based on the first HashMap file and the mapfile. The generated target structured file has the following structure: xxx|xxx|xxx|xxxx.
In the data processing method disclosed by this embodiment of the invention, a JSON file whose data structure is to be converted and its JSON file tag are acquired, the JSON file tag indicating the JSON file type; the first HashMap file and the tag chain set file mapfile that are stored in advance in the configuration file and correspond to the JSON file type are acquired; and the JSON file is converted into the target structured file based on the first HashMap file and the mapfile. Because the first HashMap file and the mapfile are stored in the configuration file in advance, JSON files of different formats can be converted quickly without modifying the code corresponding to each JSON file format, which improves the efficiency of converting unstructured files into structured files.
Based on the data processing method disclosed in FIG. 1, the specific implementation of S102, in which the first HashMap file and the mapfile that are stored in advance in the configuration file and correspond to the JSON file type are acquired based on the JSON file type, is shown in FIG. 2 and mainly includes:
S201, searching the configuration file for the directory that indicates the JSON file type.
In implementing S201, the configuration file stores directories for a plurality of JSON file types, so the information corresponding to the required JSON file type can be found through the directories.
For example, if the directories in the configuration file cover JSON file type 1, JSON file type 2, and JSON file type 3, and the JSON file to be converted is of JSON file type 2, the information for JSON file type 2 is obtained simply by looking up its directory in the configuration file.
It should be noted that the configuration file stores in advance the first HashMap file and the mapfile for JSON files of different JSON file types, so when JSON files of different types are encountered, multiple threads can be started to parse them.
It should also be noted that the first HashMap file and the mapfile for each JSON file type are generated in advance by parsing a JSON file of that type and are then stored under the corresponding directory in the configuration file, where the configuration file includes but is not limited to an XML configuration file. In this way, the first HashMap file and the mapfile corresponding to a JSON file can be obtained efficiently, which supports the subsequent efficient conversion of the JSON file into a structured file.
How the JSON file type corresponding to a JSON file is stored in the configuration file can be chosen according to the actual situation and is not described further here.
S202, acquiring the first HashMap file and the mapfile that are stored in advance under the directory and correspond to the JSON file type.
In implementing S202, the required JSON file type is located in the configuration file through its directory, and the first HashMap file and the mapfile stored under that directory for JSON files of that type are then acquired.
In the data processing method disclosed by this embodiment of the invention, a JSON file whose data structure is to be converted and its JSON file tag are acquired, the JSON file tag indicating the JSON file type; the first HashMap file and the tag chain set file mapfile that are stored in advance in the configuration file and correspond to the JSON file type are acquired; and the JSON file is converted into the target structured file based on the first HashMap file and the mapfile. Because the first HashMap file and the mapfile are stored in the configuration file in advance, JSON files of different formats can be converted quickly without modifying the code corresponding to each JSON file format, which improves the efficiency of converting unstructured files into structured files.
FIG. 3 is a flowchart of another data processing method provided by an embodiment of the present invention, which mainly includes:
S301, acquiring a JSON file whose data structure is to be converted and a JSON file tag of the JSON file.
The execution principle of S301 is the same as that of S101 and is not described again here.
S302, determining whether the configuration file stores a first HashMap file and a mapfile corresponding to the JSON file type; if yes, executing S303; if not, executing S304.
In implementing S302, after the JSON file whose data structure is to be converted and its JSON file tag have been acquired, it must be determined whether a first HashMap file and a mapfile corresponding to the JSON file are already stored in the configuration file. If they are, the data structure of the JSON file is converted based on that first HashMap file and mapfile. If they are not, the JSON file must first be parsed to obtain the corresponding first HashMap file and mapfile, after which the data structure of the JSON file is converted.
S303, acquiring, based on the JSON file type, the first HashMap file and the mapfile that are stored in advance in the configuration file and correspond to the JSON file type.
The execution principle of S303 is the same as that of S102 and is not described again here.
S304, creating a directory corresponding to the JSON file in the configuration file.
In implementing S304, if the configuration file does not already store a first HashMap file and a mapfile corresponding to the JSON file to be converted, a directory corresponding to the JSON file type of that JSON file is first created in the configuration file, so that the first HashMap file and the mapfile generated by parsing the JSON file can be stored under it.
It should be noted that the directory created in the configuration file for the JSON file type of the JSON file to be converted may be kept in the configuration file permanently or deleted when it is no longer needed.
S305, generating a first HashMap file and a mapfile based on the JSON file corresponding to the JSON file type, and storing them under the directory.
In implementing S305, after the directory corresponding to the JSON file to be converted has been created in the configuration file, the JSON file is parsed to generate a first HashMap file and a mapfile, which are then stored under the created directory.
S306, converting the JSON file whose data structure is to be converted into the target structured file based on the first HashMap file and the mapfile.
The execution principle of S306 is the same as that of S103 and is not described again here.
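The decision in S302 to S305 (reuse the stored files when the type is already configured, otherwise create an entry and generate them) can be sketched in Java as follows. This is a simplified illustration, not the method as implemented: the class `TypeConfigRegistry` is hypothetical, the in-memory map stands in for the directories of the XML configuration file, and the generator function stands in for parsing a JSON file into a first HashMap file and a mapfile.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** C is whatever bundle holds the first HashMap file and the mapfile. */
class TypeConfigRegistry<C> {
    // Stand-in for the per-type directories of the configuration file.
    private final Map<String, C> byType = new HashMap<>();
    // Stand-in for "parse the JSON file and generate both files".
    private final Function<String, C> generator;

    TypeConfigRegistry(Function<String, C> generator) {
        this.generator = generator;
    }

    boolean isConfigured(String jsonFileType) {
        return byType.containsKey(jsonFileType);
    }

    C resolve(String jsonFileType) {
        C stored = byType.get(jsonFileType);          // S302: already stored?
        if (stored != null) {
            return stored;                            // S303: reuse it
        }
        C generated = generator.apply(jsonFileType);  // S305: generate files
        byType.put(jsonFileType, generated);          // S304/S305: create entry, store
        return generated;
    }

    public static void main(String[] args) {
        TypeConfigRegistry<String> registry =
                new TypeConfigRegistry<>(type -> "generated-config-for-" + type);
        System.out.println(registry.isConfigured("type2"));
        System.out.println(registry.resolve("type2"));
        System.out.println(registry.isConfigured("type2"));
    }
}
```

With this shape, supporting a new JSON file type adds a configuration entry rather than new conversion code, which is the decoupling the description aims for.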
The specific implementation process of generating the first HashMap file based on the JSON file, as shown in FIG. 4, mainly includes:
S401, parsing the JSON file to obtain node information for all nodes in the JSON file.
In S401, each node in a JSON file is one of the following types: array, object, or ordinary element. Arrays are divided into ordinary arrays and object arrays (OA); ordinary elements are divided into numeric, string, Boolean, and NULL elements.
In implementing S401, the JSON file is parsed with the org.json.jar package and a recursive algorithm to obtain the node information of all nodes in the JSON file, where the node information includes a JsonSpot object and the tag chain of that JsonSpot object. A JsonSpot object contains the following elements for a node: full path spotname, node type spottype, node depth, parent node parent, and node value type spotvaluetype.
S402, storing each JsonSpot object and its tag chain in a first HashMap as a key-value pair, with the tag chain as the key and the JsonSpot object as the value, to obtain a first HashMap file containing the node information of all nodes in the JSON file.
In S402, HashMap is a data structure of the Java language that consists of <key: value> pairs, so a value can be retrieved quickly by its key. That is, a JsonSpot object can be retrieved quickly through its tag chain.
In implementing S402, the tag chain is used as the key and the JsonSpot object as the value, and each JsonSpot object and its tag chain are stored in the first HashMap as a <key: value> pair, yielding a first HashMap file that contains the node information of all nodes in the JSON file.
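Steps S401 and S402 can be sketched in Java as a recursive walk that records one entry per tag chain. The sketch below is an approximation under stated assumptions: to stay self-contained it walks plain `java.util.Map` and `java.util.List` values as a stand-in for the objects an org.json parser would produce; the `JsonSpot` class here is a reduced, hypothetical version of the one described (the value type is omitted); and the `-->` separator between tags is borrowed from the example mapfile in this description.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TagChainIndex {
    enum SpotType { OBJECT, OBJECT_ARRAY, PLAIN_ARRAY, ELEMENT }

    /** Reduced, hypothetical stand-in for the JsonSpot object. */
    static final class JsonSpot {
        final String spotName;   // full path of the node (its tag chain)
        final SpotType spotType; // node type
        final int depth;         // node depth, 0 for top-level tags
        final String parent;     // parent tag chain, "" at the top level

        JsonSpot(String spotName, SpotType spotType, int depth, String parent) {
            this.spotName = spotName;
            this.spotType = spotType;
            this.depth = depth;
            this.parent = parent;
        }
    }

    /** Builds the first HashMap: tag chain -> JsonSpot. */
    static Map<String, JsonSpot> build(Map<String, ?> root) {
        Map<String, JsonSpot> index = new HashMap<>();
        walk(root, "", 0, index);
        return index;
    }

    private static void walk(Map<?, ?> object, String parentChain, int depth,
                             Map<String, JsonSpot> index) {
        for (Map.Entry<?, ?> entry : object.entrySet()) {
            String tag = String.valueOf(entry.getKey());
            String chain = parentChain.isEmpty() ? tag : parentChain + "-->" + tag;
            Object value = entry.getValue();
            SpotType type = classify(value);
            index.put(chain, new JsonSpot(chain, type, depth, parentChain));
            if (type == SpotType.OBJECT) {
                walk((Map<?, ?>) value, chain, depth + 1, index);
            } else if (type == SpotType.OBJECT_ARRAY) {
                // Every element of an OA contributes to the same tag chains.
                for (Object element : (List<?>) value) {
                    walk((Map<?, ?>) element, chain, depth + 1, index);
                }
            }
        }
    }

    private static SpotType classify(Object value) {
        if (value instanceof Map) {
            return SpotType.OBJECT;
        }
        if (value instanceof List) {
            List<?> list = (List<?>) value;
            boolean allObjects = !list.isEmpty();
            for (Object element : list) {
                if (!(element instanceof Map)) {
                    allObjects = false;
                }
            }
            return allObjects ? SpotType.OBJECT_ARRAY : SpotType.PLAIN_ARRAY;
        }
        return SpotType.ELEMENT;
    }

    public static void main(String[] args) {
        Map<String, Object> rule = Map.of("decisionTreeId", "T1");
        Map<String, Object> root = Map.of("transactionCode", "TX",
                                          "ruleEngine", List.of(rule));
        build(root).forEach((chain, spot) ->
                System.out.println(chain + " : " + spot.spotType));
    }
}
```

Treating a non-empty array whose elements are all objects as an OA is an assumption of this sketch; the description does not spell out how empty or mixed arrays are classified.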
In implementing S401 and S402, the JSON file is parsed and the mapping between the JSON file and the structured file is obtained; in that mapping, the key is the structured file name and the value is the JSON tag chain set.
The specific implementation process of generating the tag chain set file mapfile based on the JSON file, as shown in FIG. 5, mainly includes:
S501, parsing the JSON file to obtain the mapping between the JSON file and the structured file.
S502, setting the number of rows N of the mapfile based on the number of JsonSpot objects whose type is object array (OA), where N is the number of OAs plus 1.
In implementing S502, the number of rows of the mapfile is set from the number of JsonSpot objects whose type is object array (OA). The one extra row is the row for the non-OA type, which holds the non-OA tag chain set.
S503, taking the set of non-OA tag chains in the JSON file as the first layer of the mapfile.
In S503, a tag chain of the non-OA type is a tag chain that contains no OA-type tag.
S504, taking the tag chain set of each OA in the JSON file as the second layer of the mapfile.
In S504, a tag chain of the OA type is a tag chain that contains an OA-type tag.
In implementing S504, each row in the second layer of the mapfile is the tag chain set of one OA. If one or more OAs are nested in an OA-type node, the tag chain set of the row for that node is contained in the tag chain set of the row for every OA nested in it. That is, the tag chain set of any OA appears in the tag chain sets of all of its descendant OAs.
For example, if OA2 is nested in the OA-type node OA1, and OA3 is nested in OA2, then the tag chain set of OA2 contains the tag chain set of OA1, and the tag chain set of OA3 contains the tag chain set of OA2 and therefore also that of OA1.
S505, defining a structured file name for each row in the first layer and the second layer, and generating a mapfile containing the correspondence between structured file names and tag chain sets.
In implementing S505, a structured file name is defined for the non-OA row in the first layer and for each OA row in the second layer, generating a mapfile that contains the correspondence between structured file names and tag chain sets.
Specifically, the file structure of mapfile is < structured filename: tag chain set >.
For example: the content of the label chain set file mapfile obtained by analyzing the JSON file is as follows:
TABLE0:transactionCode|bypassed|isCheck|requestSeq|_class|_id|serviceCode|finalDecision-->actionCode|finalDecision-->_class|extension-->financeCommon-->opponentInstitution|extension-->financeCommon-->transAmount
TABLE1:transactionCode|bypassed|isCheck|requestSeq|_class|_id|serviceCode|finalDecision-->actionCode|finalDecision-->_class|extension-->financeCommon-->opponentInstitution|extension-->financeCommon-->transAmount|ruleEngine-->decisionTreeId|ruleEngine-->sugg-->actionCode|ruleEngine-->sugg-->_class
TABLE2:transactionCode|bypassed|isCheck|requestSeq|_class|_id|serviceCode|finalDecision-->actionCode|finalDecision-->_class|extension-->financeCommon-->opponentInstitution|extension-->financeCommon-->transAmount|ruleEngine-->decisionTreeId|ruleEngine-->sugg-->actionCode|ruleEngine-->sugg-->_class|ruleEngine-->ruleEngineDetail-->ruleId|ruleEngine-->ruleEngineDetail-->result|ruleEngine-->ruleEngineDetail-->ruleName
where TABLE0 is the structured filename of the first layer.
transactionCode|bypassed|isCheck|requestSeq|_class|_id|serviceCode|finalDecision-->actionCode|finalDecision-->_class|extension-->financeCommon-->opponentInstitution|extension-->financeCommon-->transAmount is the set of non-OA tag chains.
TABLE1 is the first structured filename for the second layer.
transactionCode|bypassed|isCheck|requestSeq|_class|_id|serviceCode|finalDecision-->actionCode|finalDecision-->_class|extension-->financeCommon-->opponentInstitution|extension-->financeCommon-->transAmount|ruleEngine-->decisionTreeId|ruleEngine-->sugg-->actionCode|ruleEngine-->sugg-->_class is the tag chain set of an OA, where ruleEngine is a tag of OA type.
TABLE2 is the second structured filename for the second layer.
transactionCode|bypassed|isCheck|requestSeq|_class|_id|serviceCode|finalDecision-->actionCode|finalDecision-->_class|extension-->financeCommon-->opponentInstitution|extension-->financeCommon-->transAmount|ruleEngine-->decisionTreeId|ruleEngine-->sugg-->actionCode|ruleEngine-->sugg-->_class|ruleEngine-->ruleEngineDetail-->ruleId|ruleEngine-->ruleEngineDetail-->result|ruleEngine-->ruleEngineDetail-->ruleName is the tag chain set of an OA, where ruleEngineDetail is also of OA type and is nested in the OA ruleEngine.
Here TABLE0 represents the first layer, and TABLE1 and TABLE2 represent the second layer; these three are also the target structured file names. From the content of the tag chain set file mapfile, it can be seen that TABLE1 contains the tag chain set of TABLE0, and TABLE2 contains the tag chain sets of TABLE0 and TABLE1. The tag chains are separated by "|".
Note that another OA is nested in the OA corresponding to TABLE1. That is, ruleEngineDetail is nested in ruleEngine, and ruleEngine-->ruleEngineDetail-->ruleId|ruleEngine-->ruleEngineDetail-->result|ruleEngine-->ruleEngineDetail-->ruleName is the tag chain set of the OA nested in the OA of TABLE1.
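Steps S501 to S505 can be sketched in Python as follows. This is only an illustrative reading of the description, not the patented implementation: the function names `oa_ancestors` and `build_mapfile`, the use of a plain dict of `(spottype, spotvaluetype)` pairs in place of the first HashMap file, and the automatic `TABLE<n>` naming are all assumptions made for the example. Field order within a row follows the input order, which need not match the order shown in the example above.

```python
SEP = "-->"  # tag separator used in the tag chains of this description

def oa_ancestors(chain, spots):
    """Return the OA-type proper prefixes of a tag chain, outermost first."""
    parts = chain.split(SEP)
    prefixes = (SEP.join(parts[:i]) for i in range(1, len(parts)))
    return [p for p in prefixes if spots.get(p, ("", ""))[1] == "OA"]

def build_mapfile(spots):
    """Build mapfile lines from a dict: tag chain -> (spottype, spotvaluetype).

    Hypothetical helper: row 0 holds the non-OA tag chains (first layer) and
    each OA node gets one further row (second layer), so N = number of OA + 1.
    """
    own = {None: []}  # leaf chains grouped by their innermost enclosing OA
    for chain, (_, valuetype) in spots.items():
        if valuetype == "OA":
            own.setdefault(chain, [])
    for chain, (spottype, _) in spots.items():
        if spottype == "L":
            ancestors = oa_ancestors(chain, spots)
            own[ancestors[-1] if ancestors else None].append(chain)
    # Outer OAs before nested OAs, so each row extends its ancestors' rows.
    oa_chains = sorted((c for c in own if c is not None), key=lambda c: c.count(SEP))
    rows = [own[None]]
    for oa in oa_chains:
        lineage = oa_ancestors(oa, spots) + [oa]
        rows.append(own[None] + [c for a in lineage for c in own[a]])
    return [f"TABLE{i}:{'|'.join(row)}" for i, row in enumerate(rows)]

# Reduced version of the example in this description (assumed input).
spots = {
    "_id": ("L", "S"),
    "ruleEngine": ("M", "OA"),
    "ruleEngine-->decisionTreeId": ("L", "S"),
    "ruleEngine-->ruleEngineDetail": ("M", "OA"),
    "ruleEngine-->ruleEngineDetail-->ruleId": ("L", "S"),
    "finalDecision": ("M", "O"),
    "finalDecision-->actionCode": ("L", "S"),
}
mapfile_lines = build_mapfile(spots)
for line in mapfile_lines:
    print(line)
```

In this sketch each second-layer row begins with the complete tag chain set of its ancestor rows, which mirrors the containment between TABLE0, TABLE1 and TABLE2 described above.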
If the mapfile needs to be modified, optionally, the structured file name, the arrangement sequence of the tag chain set and the filtering field in the mapfile can be modified to obtain a new mapfile.
For example, the name TABLE1 can be changed, or the tag chain set of the last nested layer within a tag chain set can be modified, so as to optimize mapfile.
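The kinds of modification mentioned here (renaming a structured file, reordering the tag chains, and filtering fields) can be sketched in Python as below. The helpers `parse_mapfile`, `modify` and `dump_mapfile`, the `rename` and `keep` parameters, and the sample names are hypothetical and serve only to illustrate the idea; the description does not specify how the modification is carried out.

```python
def parse_mapfile(lines):
    """Parse '<structured filename>:<tag chain set>' lines into a dict."""
    tables = {}
    for line in lines:
        name, _, chain_set = line.strip().partition(":")
        tables[name] = chain_set.split("|")
    return tables

def modify(tables, rename=None, keep=None):
    """Return a new mapping with renamed tables and filtered, reordered chains.

    rename: dict old name -> new name (assumed parameter)
    keep:   list of tag chains to retain, in the desired output order
            (assumed parameter); None keeps every chain unchanged
    """
    rename = rename or {}
    result = {}
    for name, chains in tables.items():
        if keep is not None:
            chains = [c for c in keep if c in chains]
        result[rename.get(name, name)] = chains
    return result

def dump_mapfile(tables):
    """Serialize the mapping back into mapfile lines."""
    return [f"{name}:{'|'.join(chains)}" for name, chains in tables.items()]

lines = ["TABLE0:a|b|c", "TABLE1:a|b|c|d-->e"]
new_lines = dump_mapfile(
    modify(parse_mapfile(lines), rename={"TABLE1": "RULE_ENGINE"}, keep=["c", "a", "d-->e"])
)
print(new_lines)
```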
According to the data processing method disclosed by the embodiment of the invention, the JSON file of the data structure to be converted and the JSON file tag of the JSON file are obtained, the JSON file tag being used for indicating the type of the JSON file; the first HashMap file and the tag chain set file mapfile, which are stored in the configuration file in advance and correspond to the type of the JSON file, are obtained based on the type of the JSON file, the mapfile being used for defining the target structured file; and the JSON file of the data structure to be converted is converted into the target structured file based on the first HashMap file and the tag chain set file mapfile. Through the first HashMap file and the tag chain set file mapfile, the JSON file of the data structure to be converted is converted into the target structured file. This solves the problem that, when JSON files of different formats are encountered, the code corresponding to each JSON file format has to be modified separately, which reduces the efficiency of converting unstructured files into structured files.
Based on the data processing method disclosed in fig. 1 according to the embodiment of the present invention, the specific implementation of S103 shown in fig. 1, namely converting the JSON file of the data structure to be converted into the target structured file based on the first HashMap file and the tag chain set file mapfile, is shown in fig. 6 and mainly includes:
S601, parsing the JSON file of the data structure to be converted to obtain a second HashMap file and a third HashMap file.
In S601, the second HashMap file stores all tag chains in the JSON file of the data structure to be converted, with subscripts appended, together with their corresponding values. It should be noted that a tag chain of non-OA type has no subscript.
The third HashMap file stores the number of objects of each type object array OA in the JSON file of the data structure to be converted.
In the specific implementation of S601, the JSON files of the data structure to be converted are parsed line by line: all tag chains in each line of the JSON file, with subscripts appended, and their corresponding values are parsed into the second HashMap file, and the number of loop iterations of each OA in each line of the JSON file is parsed into the third HashMap file, so that the second HashMap file and the third HashMap file are obtained.
Specifically, the second HashMap file is composed of <key : value> pairs and covers all values of the JSON message. Each pair of the second HashMap is <tag chain + subscript set : value in the JSON file>, wherein the subscript set contains one subscript for each OA in the tag chain.
The third HashMap file is also composed of <key : value> pairs, where each pair is <OA tag chain + subscript set : number of objects contained in the OA array>. This file relates only to OA nodes: the number of objects is the number of objects in the OA array, and because an OA may be nested inside another OA tag, a subscript is needed here as well.
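The parsing of S601 can be sketched in Python as follows, using two dicts in place of the second and third HashMap files. The key layout (`#` before the subscript set, `|` between subscripts, zero-based subscripts) is taken from the example later in this description; the function names `make_key` and `flatten` and the treatment of empty arrays as plain values are assumptions of this sketch.

```python
import json

SEP = "-->"  # separator between tags in a tag chain

def make_key(chain, subs):
    """Append the subscript set to a tag chain, e.g. 'a-->b#1|0'."""
    return chain if not subs else f"{chain}#{'|'.join(map(str, subs))}"

def flatten(node, chain="", subs=(), values=None, counts=None):
    """Fill `values` (second HashMap) and `counts` (third HashMap) recursively."""
    if values is None:
        values, counts = {}, {}
    if isinstance(node, dict):
        for tag, child in node.items():
            child_chain = f"{chain}{SEP}{tag}" if chain else tag
            flatten(child, child_chain, subs, values, counts)
    elif isinstance(node, list) and node and all(isinstance(x, dict) for x in node):
        # Object array (OA): record its size, then descend with one more subscript.
        counts[make_key(chain, subs)] = len(node)
        for i, item in enumerate(node):
            flatten(item, chain, subs + (i,), values, counts)
    else:
        # Leaf value (assumption: empty or non-object arrays are kept as values).
        values[make_key(chain, subs)] = node
    return values, counts

doc = json.loads(
    '{"_id":"X","ruleEngine":['
    '{"decisionTreeId":"520","ruleEngineDetail":[{"ruleId":"12876"},{"ruleId":"12968"}]},'
    '{"decisionTreeId":"1510","ruleEngineDetail":[{"ruleId":"2876"}]}]}'
)
values, counts = flatten(doc)
print(values)
print(counts)
```

A non-OA tag chain such as `_id` carries no subscript, while a chain below two nested OAs carries two, one per enclosing OA.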
S602, traversing each row of the label chain set file mapfile, wherein each row of the label chain set file mapfile corresponds to one structured file, and acquiring a label chain set corresponding to the structured file.
S603, judging the type of each tag chain in the tag chain set according to the first HashMap file.
In S603, it is actually determined whether each tag chain in the tag chain set includes an OA tag.
S604, if the label chain is of a non-OA type, obtaining a value of the label chain corresponding to the non-OA type in the second HashMap file, and adding the value and the field separator to a character string of the structured file to be written. In S604, the tag chain of the non-OA type indicates that no OA tag is included in the tag chain.
S605, if the label chain is of the OA type, determining the number of type objects of each OA based on the third HashMap file, searching for the second HashMap file based on the number of type objects of each OA, and adding the obtained value and the field separator to a character string of the structured file to be written.
In S605, the OA-type tag chain means that the OA tag is included in the tag chain.
In the specific implementation of S605, the value corresponding to the OA tag chain, that is, the number of objects contained in the OA array, is obtained from the third HashMap file; this number of objects corresponds to the number of rows of the structured file. The second HashMap is then searched in turn by appending a subscript (from 1 to the number of objects) to the OA tag chain to obtain the values, and each value and the field separator are appended to the character string to be written to the structured file.
In the process of concretely implementing S602 to S605:
If the parent node is an OA node, the number of loop iterations of the OA node is found from the third HashMap and the loop is started; the values corresponding to the tag chains of the same level and the same depth in the tag chain set are read from the second HashMap; and if an O object is still nested in this layer or the tag chain set has not been fully processed, processing continues downward recursively.
Here, the same level means having a common parent node.
If the parent node is an O node or is empty, the values corresponding to the tag chains of the same level and the same depth under the parent node are read from the second HashMap. If the tag chain set has not been fully processed, processing continues downward recursively.
S606, writing the character string into the structured file after the traversal of each row of the tag chain set file mapfile is completed, until the traversal of the whole tag chain set file mapfile is completed, so as to obtain the target structured file corresponding to the JSON file of the data structure to be converted.
In S606, the character string refers to the character string to be written to the structured file, to which the values and field separators were appended as described above.
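Steps S602 to S606 can be sketched in Python as below. This is a simplified reading rather than the exact recursive procedure described above: it assumes that each mapfile row involves a single lineage of nested OAs, that subscripts are zero-based as in the example later in this description, and that a missing value is written as an empty field. The names `convert`, `index_tuples`, `oa_prefixes` and `make_key` are hypothetical, and `oa_types` stands for the set of OA-type tag chains that would be looked up in the first HashMap file.

```python
SEP = "-->"
FIELD_SEP = "|"

def make_key(chain, subs):
    """Key layout of the second/third HashMap: chain, '#', subscripts joined by '|'."""
    return chain if not subs else f"{chain}#{'|'.join(map(str, subs))}"

def oa_prefixes(chain, oa_types):
    """OA-type proper prefixes of a tag chain (S603: does the chain contain an OA tag?)."""
    parts = chain.split(SEP)
    prefixes = (SEP.join(parts[:i]) for i in range(1, len(parts)))
    return [p for p in prefixes if p in oa_types]

def index_tuples(lineage, counts, subs=()):
    """Yield every subscript combination for the nested OAs, using the third HashMap."""
    if len(subs) == len(lineage):
        yield subs
        return
    n = counts.get(make_key(lineage[len(subs)], subs), 0)
    for i in range(n):
        yield from index_tuples(lineage, counts, subs + (i,))

def convert(mapfile_lines, oa_types, values, counts):
    """Return {structured filename: [row strings]} for each mapfile row."""
    tables = {}
    for line in mapfile_lines:                      # S602: one row per structured file
        name, _, chain_set = line.partition(":")
        chains = chain_set.split(FIELD_SEP)
        lineage = sorted({p for c in chains for p in oa_prefixes(c, oa_types)},
                         key=lambda p: p.count(SEP))
        rows = []
        for subs in index_tuples(lineage, counts):  # S605: loop over OA objects
            row = ""
            for c in chains:                        # S604/S605: look up each value
                depth = len(oa_prefixes(c, oa_types))
                row += str(values.get(make_key(c, subs[:depth]), "")) + FIELD_SEP
            rows.append(row)
        tables[name] = rows                         # S606: rows to be written out
    return tables

oa_types = {"ruleEngine", "ruleEngine-->ruleEngineDetail"}
values = {
    "_id": "X",
    "ruleEngine-->decisionTreeId#0": "520",
    "ruleEngine-->decisionTreeId#1": "1510",
    "ruleEngine-->ruleEngineDetail-->ruleId#0|0": "12876",
    "ruleEngine-->ruleEngineDetail-->ruleId#0|1": "12968",
    "ruleEngine-->ruleEngineDetail-->ruleId#1|0": "2876",
}
counts = {"ruleEngine": 2, "ruleEngine-->ruleEngineDetail#0": 2, "ruleEngine-->ruleEngineDetail#1": 1}
mapfile_lines = [
    "TABLE0:_id",
    "TABLE1:_id|ruleEngine-->decisionTreeId",
    "TABLE2:_id|ruleEngine-->decisionTreeId|ruleEngine-->ruleEngineDetail-->ruleId",
]
tables = convert(mapfile_lines, oa_types, values, counts)
for name, rows in tables.items():
    print(name, rows)
```

A row without any OA tag chain yields exactly one output row, and each nested OA multiplies the rows by the number of objects it contains for the current parent subscript.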
According to the data processing method disclosed by the embodiment of the invention, the method obtains the JSON file and the JSON file tag of the data structure to be converted, the JSON file tag is used for indicating the type of the JSON file, the first HashMap file and the tag chain set file mapfile which are stored in the configuration file in advance and correspond to the type of the JSON file are obtained, and the JSON file of the data structure to be converted is converted into the target structured file based on the first HashMap file and the mapfile. Based on a first HashMap file and a label chain set file mapfile which are stored in a configuration file in advance, under the condition that codes corresponding to JSON file formats do not need to be modified, conversion of JSON files with different formats can be completed quickly, and the purpose of improving efficiency of converting unstructured files into structured files is achieved.
Based on the data processing method disclosed in the above embodiment of the present invention, the following JSON file is exemplified here:
The content of the JSON file is:
{"_id":"CREDIT_CARD_CHN:4807172220181118025222","_class":"com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog","serviceCode":"MONI","transactionCode":"0005","requestSeq":"4807172220181118025222","ruleEngine":[{"decisionTreeId":"520","sugg":{"_class":"com.bocsoft.ruleProcess.dto.MonitorResponseInfo","actionCode":"ALLOW"},"ruleEngineDetail":[{"ruleId":"12876","result":"false","ruleName":"combine_cond_2876"},{"ruleId":"12968","result":"false","ruleName":"combine_cond_2968"}]},{"decisionTreeId":"1510","sugg":{"_class":"com.bocsoft.ruleProcess.dto.MonitorResponseInfo","actionCode":"ALLOW"},"ruleEngineDetail":[{"ruleId":"2876","result":"false","ruleName":"combine_cond_2876"},{"ruleId":"AAAA","result":"false","ruleName":"combine_cond_2876"},{"ruleId":"BBBBB","result":"false","ruleName":"combine_cond_2876"},{"ruleId":"2968","result":"false","ruleName":"combine_cond_2968"}]}],"finalDecision":{"_class":"com.bocsoft.ruleProcess.dto.MonitorResponseInfo","actionCode":"ALLOW"},"bypassed":"false","isCheck":"false","extension":{"financeCommon":{"transAmount":"2100.0","opponentInstitution":"48021240"}}}
Firstly, the JSON file type is determined based on the content of the JSON file, and the first HashMap file and the tag chain set file mapfile corresponding to the JSON file type are retrieved from the configuration file according to the JSON file type.
Specifically, the file structure of the first HashMap file is < tag chain, JsonSpot object >.
The content of the first HashMap file is:
ruleEngine-->ruleEngineDetail-->ruleId:
JsonSpot[spotname=ruleEngine-->ruleEngineDetail-->ruleId,spottype=L,parent=ruleEngine-->ruleEngineDetail,degree=3,spotvaluetype=S]
extension-->financeCommon-->opponentInstitution:
JsonSpot[spotname=extension-->financeCommon-->opponentInstitution,spottype=L,parent=extension-->financeCommon,degree=3,spotvaluetype=S]
finalDecision-->actionCode:
JsonSpot[spotname=finalDecision-->actionCode,spottype=L,parent=finalDecision,degree=2,spotvaluetype=S]
extension-->financeCommon-->transAmount:
JsonSpot[spotname=extension-->financeCommon-->transAmount,spottype=L,parent=extension-->financeCommon,degree=3,spotvaluetype=S]
transactionCode:
JsonSpot[spotname=transactionCode,spottype=L,parent=,degree=1,spotvaluetype=S]
ruleEngine-->sugg:
JsonSpot[spotname=ruleEngine-->sugg,spottype=M,parent=ruleEngine,degree=2,spotvaluetype=O]
extension-->financeCommon:
JsonSpot[spotname=extension-->financeCommon,spottype=M,parent=extension,degree=2,spotvaluetype=O]
ruleEngine-->ruleEngineDetail-->result:
JsonSpot[spotname=ruleEngine-->ruleEngineDetail-->result,spottype=L,parent=ruleEngine-->ruleEngineDetail,degree=3,spotvaluetype=S]
bypassed:
JsonSpot[spotname=bypassed,spottype=L,parent=,degree=1,spotvaluetype=S]
isCheck:
JsonSpot[spotname=isCheck,spottype=L,parent=,degree=1,spotvaluetype=S]
ruleEngine-->sugg-->actionCode:
JsonSpot[spotname=ruleEngine-->sugg-->actionCode,spottype=L,parent=ruleEngine-->sugg,degree=3,spotvaluetype=S]
ruleEngine-->ruleEngineDetail:
JsonSpot[spotname=ruleEngine-->ruleEngineDetail,spottype=M,parent=ruleEngine,degree=2,spotvaluetype=OA]
finalDecision:
JsonSpot[spotname=finalDecision,spottype=M,parent=,degree=1,spotvaluetype=O]
ruleEngine-->decisionTreeId:
JsonSpot[spotname=ruleEngine-->decisionTreeId,spottype=L,parent=ruleEngine,degree=2,spotvaluetype=S]
ruleEngine-->sugg-->_class:
JsonSpot[spotname=ruleEngine-->sugg-->_class,spottype=L,parent=ruleEngine-->sugg,degree=3,spotvaluetype=S]
extension:
JsonSpot[spotname=extension,spottype=M,parent=,degree=1,spotvaluetype=O]
requestSeq:
JsonSpot[spotname=requestSeq,spottype=L,parent=,degree=1,spotvaluetype=S]
_class:
JsonSpot[spotname=_class,spottype=L,parent=,degree=1,spotvaluetype=S]
finalDecision-->_class:
JsonSpot[spotname=finalDecision-->_class,spottype=L,parent=finalDecision,degree=2,spotvaluetype=S]
_id:
JsonSpot[spotname=_id,spottype=L,parent=,degree=1,spotvaluetype=S]
serviceCode:
JsonSpot[spotname=serviceCode,spottype=L,parent=,degree=1,spotvaluetype=S]
ruleEngine:
JsonSpot[spotname=ruleEngine,spottype=M,parent=,degree=1,spotvaluetype=OA]
ruleEngine-->ruleEngineDetail-->ruleName:
JsonSpot[spotname=ruleEngine-->ruleEngineDetail-->ruleName,spottype=L,parent=ruleEngine-->ruleEngineDetail,degree=3,spotvaluetype=S]
Here, spotname is the tag chain name; spottype is the node type, where M is an intermediate node and L is a leaf node; parent is the tag chain name of the parent node of the tag chain; degree is the depth, i.e., the number of tags on the tag chain; and spotvaluetype is the value type of the tag, where S is a character (string) type, N is a numerical value, B is a Boolean type, SA is a character array, O is an object, NA is a numerical array, and OA is an object array.
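The structure of the first HashMap file can be sketched in Python as follows. The field names of `JsonSpot` follow the listing above, but the class itself, the helper names `value_type` and `collect_spots`, and the type classification rules (for example, treating an empty or mixed array as SA and null as S) are assumptions made for this illustration only.

```python
import json
from dataclasses import dataclass

SEP = "-->"

@dataclass
class JsonSpot:
    spotname: str       # tag chain name
    spottype: str       # "M" intermediate node, "L" leaf node
    parent: str         # tag chain name of the parent node ("" at top level)
    degree: int         # number of tags on the tag chain
    spotvaluetype: str  # S, N, B, SA, O, NA or OA

def value_type(node):
    """Classify a JSON value (assumed rules, see lead-in)."""
    if isinstance(node, dict):
        return "O"
    if isinstance(node, bool):  # checked before int: bool is a subclass of int
        return "B"
    if isinstance(node, (int, float)):
        return "N"
    if isinstance(node, list):
        if node and all(isinstance(x, dict) for x in node):
            return "OA"
        if node and all(isinstance(x, (int, float)) and not isinstance(x, bool) for x in node):
            return "NA"
        return "SA"
    return "S"

def collect_spots(node, chain="", spots=None):
    """Build {tag chain: JsonSpot} for every node of a parsed JSON document."""
    if spots is None:
        spots = {}
    if isinstance(node, dict):
        for tag, child in node.items():
            name = f"{chain}{SEP}{tag}" if chain else tag
            vtype = value_type(child)
            spottype = "M" if vtype in ("O", "OA") else "L"
            spots[name] = JsonSpot(name, spottype, chain, name.count(SEP) + 1, vtype)
            collect_spots(child, name, spots)
    elif isinstance(node, list):
        for item in node:
            if isinstance(item, dict):
                collect_spots(item, chain, spots)  # array items share one tag chain
    return spots

doc = json.loads(
    '{"_id":"X","ruleEngine":[{"decisionTreeId":"520",'
    '"sugg":{"actionCode":"ALLOW"},"ruleEngineDetail":[{"ruleId":"12876"}]}]}'
)
spots = collect_spots(doc)
for chain, spot in spots.items():
    print(chain, spot)
```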
Specifically, the file structure of mapfile is < structured filename: tag chain set >.
The content of the mapfile is:
TABLE0:transactionCode|bypassed|isCheck|requestSeq|_class|_id|serviceCode|finalDecision-->actionCode|finalDecision-->_class|extension-->financeCommon-->opponentInstitution|extension-->financeCommon-->transAmount
TABLE1:transactionCode|bypassed|isCheck|requestSeq|_class|_id|serviceCode|finalDecision-->actionCode|finalDecision-->_class|extension-->financeCommon-->opponentInstitution|extension-->financeCommon-->transAmount|ruleEngine-->decisionTreeId|ruleEngine-->sugg-->actionCode|ruleEngine-->sugg-->_class
TABLE2:transactionCode|bypassed|isCheck|requestSeq|_class|_id|serviceCode|finalDecision-->actionCode|finalDecision-->_class|extension-->financeCommon-->opponentInstitution|extension-->financeCommon-->transAmount|ruleEngine-->decisionTreeId|ruleEngine-->sugg-->actionCode|ruleEngine-->sugg-->_class|ruleEngine-->ruleEngineDetail-->ruleId|ruleEngine-->ruleEngineDetail-->result|ruleEngine-->ruleEngineDetail-->ruleName
Then, the JSON file of the data structure to be converted is parsed to obtain the second HashMap file and the third HashMap file.
The specific content of the second HashMap file is as follows: maplebelvalue < tag chain+subscript of OA tag: value >.
ruleEngine-->ruleEngineDetail-->result#1|1:false
ruleEngine-->ruleEngineDetail-->result#1|0:false
transactionCode:0005
bypassed:false
isCheck:false
finalDecision-->_class:com.bocsoft.ruleProcess.dto.MonitorResponseInfo
ruleEngine-->sugg-->_class#1:com.bocsoft.ruleProcess.dto.MonitorResponseInfo
ruleEngine-->ruleEngineDetail-->result#0|1:false
_id:CREDIT_CARD_CHN:4807172220181118025222
ruleEngine-->sugg-->_class#0:com.bocsoft.ruleProcess.dto.MonitorResponseInfo
ruleEngine-->ruleEngineDetail-->result#0|0:false
ruleEngine-->ruleEngineDetail-->result#1|2:false
ruleEngine-->ruleEngineDetail-->result#1|3:false
extension-->financeCommon-->opponentInstitution:48021240
finalDecision-->actionCode:ALLOW
extension-->financeCommon-->transAmount:2100.0
ruleEngine-->ruleEngineDetail-->ruleId#1|1:AAAA
ruleEngine-->ruleEngineDetail-->ruleId#1|0:2876
ruleEngine-->ruleEngineDetail-->ruleId#1|3:2968
ruleEngine-->ruleEngineDetail-->ruleId#0|0:12876
ruleEngine-->ruleEngineDetail-->ruleId#1|2:BBBBB
ruleEngine-->ruleEngineDetail-->ruleId#0|1:12968
ruleEngine-->ruleEngineDetail-->ruleName#1|3:combine_cond_2968
ruleEngine-->ruleEngineDetail-->ruleName#0|0:combine_cond_2876
ruleEngine-->ruleEngineDetail-->ruleName#1|2:combine_cond_2876
ruleEngine-->ruleEngineDetail-->ruleName#1|1:combine_cond_2876
_class:com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog
requestSeq:4807172220181118025222
ruleEngine-->ruleEngineDetail-->ruleName#1|0:combine_cond_2876
ruleEngine-->decisionTreeId#1:1510
ruleEngine-->decisionTreeId#0:520
serviceCode:MONI
ruleEngine-->sugg-->actionCode#0:ALLOW
ruleEngine-->sugg-->actionCode#1:ALLOW
ruleEngine-->ruleEngineDetail-->ruleName#0|1:combine_cond_2968
The specific contents of the third HashMap file are: mapoanum: < OA-type tag chain: number of times >.
ruleEngine:2
ruleEngine-->ruleEngineDetail#0:2
ruleEngine-->ruleEngineDetail#1:4
It should be noted that these are all tag chains of OA type.
ruleEngine:2 indicates that the object array ruleEngine contains two objects.
ruleEngine-->ruleEngineDetail#0:2 indicates that ruleEngineDetail is an object array nested under ruleEngine that contains two objects when the ruleEngine subscript is 0.
ruleEngine-->ruleEngineDetail#1:4 indicates that ruleEngineDetail is an object array nested under ruleEngine that contains four objects when the ruleEngine subscript is 1.
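As a small worked check, the counts in the third HashMap file determine how many rows each OA contributes to its structured file. The Python sketch below sums the counts per OA tag chain; the function name `rows_per_oa` is hypothetical, and the assumption that every object of an OA yields exactly one row is a reading of this example rather than a statement from the description.

```python
def rows_per_oa(counts):
    """Sum the object counts of each OA tag chain over all parent subscripts."""
    totals = {}
    for key, n in counts.items():
        chain = key.split("#", 1)[0]  # drop the subscript set, if any
        totals[chain] = totals.get(chain, 0) + n
    return totals

counts = {
    "ruleEngine": 2,
    "ruleEngine-->ruleEngineDetail#0": 2,
    "ruleEngine-->ruleEngineDetail#1": 4,
}
totals = rows_per_oa(counts)
print(totals)
```

Under that assumption, the row for ruleEngine produces 2 rows and the row for ruleEngineDetail produces 2 + 4 = 6 rows, which agrees with the TABLE1 and TABLE2 contents of this example.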
Finally, based on the second HashMap file and the third HashMap file obtained by parsing the JSON file of the data structure to be converted, the tag chain set file mapfile is traversed to obtain the corresponding tag chains, and the JsonSpot object corresponding to each obtained tag chain is looked up in the first HashMap file.
According to a target structured file defined by mapfile, converting the JSON file into the target structured file, wherein the content of the target structured file is as follows:
TABLE0:
0005|false|false|4807172220181118025222|com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog|CREDIT_CARD_CHN:4807172220181118025222|MONI|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|48021240|2100.0|
TABLE1:
0005|false|false|4807172220181118025222|com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog|CREDIT_CARD_CHN:4807172220181118025222|MONI|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|48021240|2100.0|520|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|
0005|false|false|4807172220181118025222|com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog|CREDIT_CARD_CHN:4807172220181118025222|MONI|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|48021240|2100.0|1510|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|
TABLE2:
0005|false|false|4807172220181118025222|com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog|CREDIT_CARD_CHN:4807172220181118025222|MONI|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|48021240|2100.0|520|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|12876|false|combine_cond_2876|
0005|false|false|4807172220181118025222|com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog|CREDIT_CARD_CHN:4807172220181118025222|MONI|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|48021240|2100.0|520|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|12968|false|combine_cond_2968|
0005|false|false|4807172220181118025222|com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog|CREDIT_CARD_CHN:4807172220181118025222|MONI|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|48021240|2100.0|1510|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|2876|false|combine_cond_2876|
0005|false|false|4807172220181118025222|com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog|CREDIT_CARD_CHN:4807172220181118025222|MONI|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|48021240|2100.0|1510|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|AAAA|false|combine_cond_2876|
0005|false|false|4807172220181118025222|com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog|CREDIT_CARD_CHN:4807172220181118025222|MONI|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|48021240|2100.0|1510|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|BBBBB|false|combine_cond_2876|
0005|false|false|4807172220181118025222|com.bocsoft.ruleProcess.mongo.bo.RuleEngineLog|CREDIT_CARD_CHN:4807172220181118025222|MONI|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|48021240|2100.0|1510|ALLOW|com.bocsoft.ruleProcess.dto.MonitorResponseInfo|2968|false|combine_cond_2968|
According to the data processing method disclosed by the embodiment of the invention, based on the second HashMap file and the third HashMap file obtained by parsing the JSON file of the data structure to be converted, the tag chain set file mapfile is traversed to obtain the corresponding tag chains, and the JsonSpot object corresponding to each obtained tag chain is then looked up in the first HashMap file. The JSON file is converted into the target structured file defined by mapfile, so that JSON files of different formats can be converted quickly without modifying the code corresponding to each JSON file format, which improves the efficiency of converting unstructured files into structured files.
Based on the data processing method disclosed in the above embodiment of the present invention, the embodiment of the present invention also correspondingly discloses a data processing device, as shown in fig. 7, which is a schematic structural diagram of the data processing device provided in the embodiment of the present invention, and mainly includes: a first acquisition module 70, a second acquisition module 71 and a conversion module 72.
The first obtaining module 70 is configured to obtain a JSON file of the data structure to be converted, and a JSON file tag of the JSON file, where the JSON file tag is used to indicate a JSON file type.
The second obtaining module 71 is configured to obtain, based on the JSON file type, a first HashMap file and a label chain set file mapfile, which are stored in the configuration file in advance and correspond to the JSON file type, where the mapfile is used to define a mapping relationship between the target structured file and the JSON file.
The conversion module 72 is configured to convert the JSON file of the data structure to be converted into the target structured file based on the first HashMap file and the label chain assembly file mapfile.
An alternative structure of the second acquisition module 71 in the embodiment of the present invention is: the second acquisition module 71 includes a search unit and a first acquisition unit.
The search unit is used for searching for the directory that indicates the JSON file type and is stored in the configuration file.
The first acquisition unit is used for acquiring the first HashMap file and the tag chain set file mapfile of the corresponding JSON file type that are pre-stored in the directory; each directory of the configuration file stores a first HashMap file and a tag chain set file mapfile generated in advance based on JSON files of a different JSON file type; the configuration file includes an XML configuration file.
Optionally, the first obtaining unit includes: a first generation subunit.
The first generation subunit is used for parsing the JSON file to obtain node information of all nodes in the JSON file, wherein the node information comprises JsonSpot objects and the tag chains of the JsonSpot objects; and for storing each JsonSpot object and its tag chain in the first HashMap as a key-value pair with the tag chain as the key, so as to obtain a first HashMap file comprising all node information of the JSON file.
Optionally, the first obtaining unit includes: a second generation subunit.
The second generation subunit is used for parsing the JSON file to obtain the mapping relationship between the JSON file and the structured file; setting the number N of rows of mapfile based on the number of JsonSpot objects whose type is object array (OA), wherein N is the number of OA-type JsonSpot objects plus 1; taking the tag chain set of the non-OA type in the JSON file as the first layer of mapfile; taking the tag chain set of each OA type in the JSON file as the second layer of mapfile, and correspondingly generating one row for the tag chain set of each OA type in the second layer, wherein if one or more OAs are nested in an OA-type node, the tag chain set of the row for that node is included in the tag chain set of the row of every OA nested in it; and defining a structured file name for each row in the first layer and the second layer, and generating a mapfile containing the correspondence between the structured file names and the tag chain sets.
An alternative structure of the conversion module 72 in the embodiment of the present invention is: the conversion module 72 includes a parsing unit, a first traversing unit, a judging unit, a second acquiring unit, a third acquiring unit, a second traversing unit, and a fourth acquiring unit.
The parsing unit is used for parsing the JSON file of the data structure to be converted to obtain a second HashMap file and a third HashMap file, wherein the second HashMap file stores the tag chain values of all JsonSpot objects in the JSON file of the data structure to be converted, and the third HashMap file stores the number of objects of each OA array in the JSON file of the data structure to be converted.
The first traversing unit is used for traversing each row of the label chain set file mapfile, wherein each row of the label chain set file mapfile corresponds to one structured file, and a label chain set corresponding to the structured file is obtained.
The judging unit judges the type of each tag chain in the tag chain set according to the first HashMap.
The second obtaining unit is used for, if the tag chain is of a non-OA type, obtaining the value corresponding to the non-OA-type tag chain in the second HashMap file, and appending the value and the field separator to the character string to be written to the structured file.
The third obtaining unit is configured to determine, if the label chain is of an OA type, based on the third HashMap file, the number of type objects of each OA, search for the second HashMap file based on the number of type objects of each OA, and append the obtained value and field separator to a character string of the structured file to be written.
The fourth acquisition unit is used for writing the character string into the structured file after the traversal of each row of the tag chain set file mapfile is completed, until the traversal of the whole tag chain set file mapfile is completed, so as to obtain the target structured file corresponding to the JSON file of the data structure to be converted.
According to the data processing device disclosed by the embodiment of the invention, by acquiring the JSON file and the JSON file tag of the data structure to be converted, the JSON file tag is used for indicating the JSON file type, the first HashMap file and the tag chain set file mapfile which are stored in the configuration file in advance and correspond to the JSON file type are acquired, and the JSON file of the data structure to be converted is converted into the target structured file based on the first HashMap file and the mapfile. Based on a first HashMap file and a label chain set file mapfile which are stored in a configuration file in advance, under the condition that codes corresponding to JSON file formats do not need to be modified, conversion of JSON files with different formats can be completed quickly, and the purpose of improving efficiency of converting unstructured files into structured files is achieved.
Based on the data processing device disclosed in the embodiment of the present invention, the data processing device further includes: an establishing module and a storage module.
The establishing module is used for establishing a directory corresponding to the JSON file in the configuration file if the first HashMap file and the tag chain set file mapfile corresponding to the JSON file type are not stored in the configuration file.
The storage module is used for generating a first HashMap file and a label chain set file mapfile based on the JSON file corresponding to the JSON file type, and storing the first HashMap file and the label chain set file mapfile under the directory.
Based on the data processing device disclosed in the embodiment of the present invention, the data processing device further includes: a modification module.
The modification module is used for modifying the structured file names, the arrangement order of the tag chain sets, and the filtering fields in the mapfile to obtain a new mapfile.
According to the data processing device disclosed by the embodiment of the invention, by acquiring the JSON file and the JSON file tag of the data structure to be converted, the JSON file tag is used for indicating the JSON file type, the first HashMap file and the tag chain set file mapfile which are stored in the configuration file in advance and correspond to the JSON file type are acquired, and the JSON file of the data structure to be converted is converted into the target structured file based on the first HashMap file and the mapfile. Based on a first HashMap file and a label chain set file mapfile which are stored in a configuration file in advance, under the condition that codes corresponding to JSON file formats do not need to be modified, conversion of JSON files with different formats can be completed quickly, and the purpose of improving efficiency of converting unstructured files into structured files is achieved.
It should be noted that the embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and identical or similar parts of the embodiments may be referred to one another. Because the apparatus embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the description of the method embodiments.
Finally, it is further noted that relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present invention, and such modifications and adaptations are intended to fall within the scope of the present invention.

Claims (8)

1. A data processing method, the method comprising:
acquiring a JSON file of a data structure to be converted and a JSON file tag of the JSON file, wherein the JSON file tag is used for indicating the JSON file type;
acquiring, based on the JSON file type, a first HashMap file and a tag chain set file mapfile which are pre-stored in a configuration file and correspond to the JSON file type, wherein the tag chain set file mapfile is used for defining a mapping relation between a target structured file and the JSON file, and the first HashMap file comprises a JSOnSpot object and a tag chain corresponding to the JSOnSpot object;
converting the JSON file of the data structure to be converted into the target structured file based on the first HashMap file and the tag chain set file mapfile;
wherein the converting the JSON file of the data structure to be converted into the target structured file based on the first HashMap file and the tag chain set file mapfile comprises:
parsing the JSON file of the data structure to be converted to obtain a second HashMap file and a third HashMap file, wherein the second HashMap file stores the tag chain values of all JSOnSpot objects in the JSON file of the data structure to be converted, and the third HashMap file stores the number of objects of each OA type in the JSON file of the data structure to be converted;
traversing each row of the tag chain set file mapfile, wherein each row of the tag chain set file mapfile corresponds to one structured file, and acquiring the tag chain set corresponding to the structured file;
determining the type of each tag chain in the tag chain set according to the first HashMap file;
if the tag chain is of a non-OA type, acquiring from the second HashMap file the value whose key equals the non-OA-type tag chain, and adding the value and a field separator to a character string to be written to the structured file;
if the tag chain is of an OA type, determining the number of objects of each OA type based on the third HashMap file, looking up the second HashMap file based on the number of objects of each OA type, and adding the obtained values and field separators to the character string to be written to the structured file;
and writing the character string into the structured file each time the traversal of one row of the tag chain set file mapfile is completed, until the traversal of the tag chain set file mapfile is completed, so as to obtain the target structured file corresponding to the JSON file of the data structure to be converted.
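The conversion steps of claim 1 can be sketched as below. This is a simplified illustration, not the claimed implementation: it handles only one level of object array (OA), keys the second map by indexed tag chains such as `items[0].sku`, represents the first map as "tag chain -> enclosing OA chain or None", and collects output in a dictionary instead of writing files. All of these choices, and the names `parse`, `convert` and `SEP`, are assumptions.

```python
SEP = "|"  # hypothetical field separator


def parse(node, chain="", values=None, counts=None):
    """Build the second map (indexed tag chain -> value) and third map (OA chain -> object count)."""
    values = {} if values is None else values
    counts = {} if counts is None else counts
    if isinstance(node, dict):
        for key, child in node.items():
            parse(child, f"{chain}.{key}" if chain else key, values, counts)
    elif isinstance(node, list) and node and all(isinstance(x, dict) for x in node):
        counts[chain] = len(node)                      # number of objects in this OA
        for i, element in enumerate(node):
            parse(element, f"{chain}[{i}]", values, counts)
    else:
        values[chain] = node                           # scalar value (or non-object list)
    return values, counts


def convert(doc, first_map, mapfile_rows):
    """Produce {structured file name: text}; one text line per OA object, one line if no OA."""
    values, counts = parse(doc)
    output = {}
    for file_name, chains in mapfile_rows:             # each mapfile row is one structured file
        oa = next((first_map[c] for c in chains if first_map[c]), None)
        repeat = counts.get(oa, 0) if oa else 1
        lines = []
        for i in range(repeat):
            fields = []
            for c in chains:
                owner = first_map[c]                   # None means a non-OA tag chain
                key = c if owner is None else f"{owner}[{i}]" + c[len(owner):]
                fields.append(str(values.get(key, "")))
            lines.append(SEP.join(fields))
        output[file_name] = "\n".join(lines)
    return output


doc = {"id": 7, "items": [{"sku": "a", "qty": 1}, {"sku": "b", "qty": 2}]}
first_map = {"id": None, "items.sku": "items", "items.qty": "items"}
mapfile_rows = [("header", ["id"]), ("items", ["id", "items.sku", "items.qty"])]
result = convert(doc, first_map, mapfile_rows)
```

Supporting a new JSON layout in this sketch means supplying a different `first_map` and `mapfile_rows`, with no change to `convert`, which is the efficiency argument the description makes.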
2. The method according to claim 1, wherein the acquiring, based on the JSON file type, a first HashMap file and a tag chain set file mapfile which are pre-stored in a configuration file and correspond to the JSON file type comprises:
searching the configuration file for the directory that indicates the JSON file type;
acquiring the first HashMap file and the tag chain set file mapfile which are pre-stored under the directory and correspond to the JSON file type;
wherein a first HashMap file and a tag chain set file mapfile, generated in advance based on JSON files of different JSON file types, are stored under each directory of the configuration file;
and the configuration file comprises an XML configuration file.
3. The method according to claim 1, further comprising:
if the configuration file does not store the first HashMap file and the tag chain set file mapfile corresponding to the JSON file type, establishing a directory corresponding to the JSON file in the configuration file;
generating the first HashMap file and the tag chain set file mapfile based on the JSON file corresponding to the JSON file type, and storing the first HashMap file and the tag chain set file mapfile under the directory.
4. The method according to claim 3, wherein the generating the first HashMap file based on the JSON file corresponding to the JSON file type comprises:
parsing the JSON file to obtain node information of all nodes in the JSON file, wherein the node information comprises a JSOnSpot object and the tag chain of the JSOnSpot object;
and taking the tag chain as a key and the JSOnSpot object as a value, and storing the JSOnSpot object and the tag chain in a first HashMap as key-value pairs to obtain the first HashMap file comprising all node information of the JSON file.
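A minimal sketch of this key-value construction follows. It assumes a tag chain is the dot-joined path of keys from the root to a node, and it stores a small type record as a stand-in for the JSOnSpot object, whose exact structure the claim does not spell out. The function name `build_first_map` and the type labels are hypothetical.

```python
import json


def build_first_map(node, chain="", out=None):
    """Map each tag chain (key) to a record describing the node at that chain (value)."""
    out = {} if out is None else out
    if isinstance(node, dict):
        kind = "object"
    elif isinstance(node, list) and node and all(isinstance(x, dict) for x in node):
        kind = "OA"        # object array
    else:
        kind = "value"     # scalar, or a list that is not an array of objects
    if chain:
        out[chain] = {"type": kind}
    # Elements of an OA share the tag chain of the array itself.
    members = [node] if kind == "object" else node if kind == "OA" else []
    for member in members:
        for key, child in member.items():
            build_first_map(child, f"{chain}.{key}" if chain else key, out)
    return out


sample = json.loads('{"id": 7, "owner": {"name": "Li"}, "items": [{"sku": "a"}, {"sku": "b"}]}')
first_map = build_first_map(sample)
```

Here the `"type"` field is what a later conversion step would consult to decide whether a tag chain is of the OA type or a non-OA type.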
5. The method according to claim 3, wherein the generating the tag chain set file mapfile based on the JSON file corresponding to the JSON file type comprises:
parsing the JSON file to obtain the mapping relation between the JSON file and the structured file;
setting the number N of rows of the mapfile based on the number of JSOnSpot objects of the object array (OA) type, wherein N equals the number of OAs plus 1;
taking the tag chain set of the non-OA type in the JSON file as a first layer of the mapfile;
taking the tag chain set of each OA type in the JSON file as a second layer of the mapfile, wherein the tag chain set of each OA type generates one corresponding row in the second layer, and if one or more OAs are nested within an OA-type node, the tag chain set of the row of that OA-type node is included in the tag chain sets of the rows of all OAs nested within it;
and defining a structured file name for each row in the first layer and the second layer, and generating the mapfile containing the correspondence between the structured file names and the tag chain sets.
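The row layout of claim 5 can be illustrated with the sketch below. It follows one reading of the nesting rule, namely that the row of a nested OA also carries the tag chains of every OA that encloses it, and it assumes an input dictionary mapping each tag chain to a type label. The function name, the type labels and the generated file names are hypothetical.

```python
def build_mapfile_rows(chain_types):
    """chain_types: tag chain -> "OA" | "object" | "value" (hypothetical encoding)."""
    oa_chains = [c for c, t in chain_types.items() if t == "OA"]
    leaves = [c for c, t in chain_types.items() if t == "value"]

    def enclosing_oas(chain):
        # OAs that strictly contain this chain, outermost first given input order.
        return [oa for oa in oa_chains if chain.startswith(oa + ".")]

    def innermost(chain):
        found = enclosing_oas(chain)
        return max(found, key=len) if found else None

    # First layer: one row holding the tag chains that sit outside every OA.
    rows = [("root", [c for c in leaves if innermost(c) is None])]
    # Second layer: one row per OA; a nested OA also includes its ancestors' chains.
    for oa in oa_chains:
        lineage = enclosing_oas(oa) + [oa]
        rows.append((oa.replace(".", "_"), [c for c in leaves if innermost(c) in lineage]))
    return rows


chain_types = {
    "id": "value",
    "orders": "OA",
    "orders.no": "value",
    "orders.items": "OA",
    "orders.items.sku": "value",
}
rows = build_mapfile_rows(chain_types)
```

With two OAs in the input, the sketch yields three rows, matching the rule that the row count N is the number of OAs plus 1.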
6. The method according to claim 5, further comprising:
modifying the structured file names, the arrangement order of the tag chain sets, and the screening fields in the mapfile to obtain a new mapfile.
7. A data processing apparatus, the apparatus comprising:
a first acquisition module, used for acquiring a JSON file of a data structure to be converted and a JSON file tag of the JSON file, wherein the JSON file tag is used for indicating the JSON file type;
a second acquisition module, used for acquiring, based on the JSON file type, a first HashMap file and a tag chain set file mapfile which are stored in a configuration file in advance and correspond to the JSON file type, wherein the tag chain set file mapfile is used for defining the mapping relation between a target structured file and the JSON file, and the first HashMap file comprises a JSOnSpot object and a tag chain corresponding to the JSOnSpot object;
a conversion module, used for converting the JSON file of the data structure to be converted into the target structured file based on the first HashMap file and the tag chain set file mapfile;
wherein the conversion module comprises:
a parsing unit, used for parsing the JSON file of the data structure to be converted to obtain a second HashMap file and a third HashMap file, wherein the second HashMap file stores the tag chain values of all JSOnSpot objects in the JSON file of the data structure to be converted, and the third HashMap file stores the number of objects of each OA type in the JSON file of the data structure to be converted;
a first traversing unit, used for traversing each row of the tag chain set file mapfile, wherein each row of the tag chain set file mapfile corresponds to one structured file, and acquiring the tag chain set corresponding to the structured file;
a judging unit, used for determining the type of each tag chain in the tag chain set according to the first HashMap file;
a second obtaining unit, used for, if the tag chain is of a non-OA type, obtaining from the second HashMap file the value whose key equals the non-OA-type tag chain, and adding the value and a field separator to a character string to be written to the structured file;
a third obtaining unit, used for, if the tag chain is of an OA type, determining the number of objects of each OA type based on the third HashMap file, looking up the second HashMap file based on the number of objects of each OA type, and adding the obtained values and field separators to the character string to be written to the structured file;
and a fourth obtaining unit, used for writing the character string into the structured file each time the traversal of one row of the tag chain set file mapfile is completed, until the traversal of the tag chain set file mapfile is completed, so as to obtain the target structured file corresponding to the JSON file of the data structure to be converted.
8. The apparatus of claim 7, wherein the second acquisition module comprises:
a searching unit, used for searching the configuration file for the directory that indicates the JSON file type;
a first acquisition unit, used for acquiring the first HashMap file and the tag chain set file mapfile which are pre-stored under the directory and correspond to the JSON file type; wherein a first HashMap file and a tag chain set file mapfile, generated in advance based on JSON files of different JSON file types, are stored under each directory of the configuration file; and the configuration file comprises an XML configuration file.
CN201911212915.8A 2019-12-02 2019-12-02 Data processing method and device Active CN110909523B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911212915.8A CN110909523B (en) 2019-12-02 2019-12-02 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911212915.8A CN110909523B (en) 2019-12-02 2019-12-02 Data processing method and device

Publications (2)

Publication Number Publication Date
CN110909523A (en) 2020-03-24
CN110909523B (en) 2023-10-27

Family

ID=69821264

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911212915.8A Active CN110909523B (en) 2019-12-02 2019-12-02 Data processing method and device

Country Status (1)

Country Link
CN (1) CN110909523B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113448923B (en) * 2020-04-17 2023-09-12 北京新氧科技有限公司 File generation method, device and terminal
CN114185855B (en) * 2022-02-15 2022-05-24 中博信息技术研究院有限公司 Simplified method and system for generating OFD file based on JSON

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103389991A (en) * 2012-05-09 2013-11-13 阿里巴巴集团控股有限公司 Data interaction method, data interaction device, data conversion method and data conversion device
CN105787128A (en) * 2016-03-29 2016-07-20 四川秘无痕信息安全技术有限责任公司 Method for recovering Java serialized file data
CN106934011A (en) * 2017-03-09 2017-07-07 济南浪潮高新科技投资发展有限公司 A kind of structuring analysis method and device of JSON data
CN108037915A (en) * 2017-11-07 2018-05-15 福建天泉教育科技有限公司 A kind of method and terminal of acquisition json configuration files

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101579493B1 (en) * 2015-01-08 2015-12-23 주식회사 파수닷컴 Staging control method for source code, Computer program for the same, Recording medium storing computer program for the same

Also Published As

Publication number Publication date
CN110909523A (en) 2020-03-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant