CN109284254A - A general-purpose Excel file processing method - Google Patents
A general-purpose Excel file processing method
- Publication number
- CN109284254A (application CN201811203170.4A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention provides a general-purpose Excel file processing method in the technical field of data parsing for web crawlers on big data platforms. The Excel file is treated as an object whose attributes are described by a configuration file, which allows Excel files of different versions and different header types to be parsed dynamically.
Description
Technical field
The present invention relates to data parsing techniques for web crawlers on big data platforms, and in particular to a general-purpose Excel file processing method.
Background
At present, China Mobile, China Unicom, China Telecom and China Tower are all building their own data centers. Some data sources are standard CSV files, such as network management data on resources, performance, signaling and billing; others, such as alarms, are provided through a database. Many departments, however, offer no corresponding file or database interface because of information security concerns. Their data has to be exported manually as xls files using an account and password. To reduce manual intervention, a crawler script that simulates the login is generally used to export the xls files automatically. These files differ in type (xls or xlsx), and their headers vary widely: some have unique Chinese headers, some have repeated compound headers, and some use financial column-style headers.
The prior art has the following deficiencies:
1. xls version problem: xls and xlsx differ by more than the file suffix. The two formats are encoded differently (the 2003 format uses GBK, the 2007 and later format uses UTF-8) and in most cases must be handled separately, so a given parser is usually compatible with only one version.
2. Chinese header changes or adjustments: because the data comes from other systems, the other party may change the Chinese field names or the field order of the template, so a verification and fault-tolerance mechanism is needed to handle the change or to detect abnormal data.
3. Too many xls header types:
1) unique Chinese headers;
2) repeated compound headers;
3) column-style headers;
4) double-column-style headers.
A simple and practical way of handling all of these is needed.
Traditional processing is generally developed in Java. The version has to be specified even for files of the same type, and a header change requires modifying and recompiling the source code, so the development cycle is long and maintenance demands a high level of technical skill.
Summary of the invention
To solve the above technical problems, the invention proposes a general-purpose Excel file processing method. It is developed in the scripting language Perl in an object-oriented style and provides a general parsing, loading and verification mechanism that helps maintenance personnel configure the processing easily and detect abnormal data.
The technical solution of the invention is as follows:
A general-purpose Excel file processing method:
The Excel file is treated as an object whose attributes are described by a configuration file, so that Excel files of different versions and different header types can be parsed dynamically.
Further, the invention is developed in the scripting language Perl.
Further, the suffix of the Excel file must be xls or xlsx.
Further, the header of the Excel file must fall within the conventional header types.
Further, for periodic types, the differences between two consecutive parses are compared.
Further, concrete operation step are as follows:
1) loading configuration file obtains list of fields and saves as array, is added to absent field and provides corresponding prompt letter
Breath;
2) being determined according to parsing type needs analytic method to be loaded;
3) determine that file is xls or xlsx according to excel file suffixes, to generate file handle, and then to determine that file is compiled
Code processing mode;
4) title data is obtained according to the position of title in configuration, title is traversed according to parsing type, determines title
Array length and name-matches situation, the corresponding relationship of output header and column, and inconsistent place is prompted;
5) storage control file handle is opened, storage control file is generated;
6) data file handle is opened, according to the corresponding relationship of the title and position that 4) obtain, starts to be parsed line by line, is written
Data file;
7) storage is executed;
8) being compared this warehousing quantity and last time carries out alarm prompt for abnormal.
Further, in step 1), after the field list has been obtained and saved as an array, an array of the current field information is obtained according to the physical table name, and the two arrays are compared.
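The field comparison in step 1) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the Perl code of the invention: the function names are hypothetical, the table fields are passed in as a plain list instead of being read from a database, and "adding" a missing field is modeled simply as appending it to the merged list.

```python
def compare_fields(config_fields, table_fields):
    """Compare the configured field list with the fields of the physical table.

    Returns (missing, unused): fields that are configured but absent from the
    table, and fields that exist in the table but are not configured.
    """
    table_set = set(table_fields)
    config_set = set(config_fields)
    missing = [name for name in config_fields if name not in table_set]
    unused = [name for name in table_fields if name not in config_set]
    return missing, unused


def sync_fields(config_fields, table_fields):
    """Add missing fields to the table field list and collect prompt messages."""
    missing, _unused = compare_fields(config_fields, table_fields)
    messages = [
        f"field '{name}' is missing from the table and will be added"
        for name in missing
    ]
    merged = list(table_fields) + missing
    return merged, messages


merged, messages = sync_fields(["id", "name", "city"], ["id", "name"])
for message in messages:
    print(message)
```

Returning the messages instead of printing them inside the function leaves it to the caller to decide whether a mismatch is only logged or stops the run.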
Further, an objectified definition of the Excel file is required:
Tag meanings:
Brief description of the drawings
Fig. 1 is a schematic diagram of the workflow of the invention;
Fig. 2 shows the objectified definition of the Excel file.
Specific embodiment
The invention is described in more detail below.
The invention provides a general-purpose Excel file processing method that objectifies the Excel file and parses it in a uniform way, handling inconsistent file types and solving problems such as changed headers or inconsistent header types.
The Excel file is treated as an object whose attributes are described by a configuration file, so that Excel files of different versions and different header types can be parsed dynamically.
The suffix of the Excel file must be xls or xlsx, and its header must fall within the conventional header types.
For periodic types, the differences between two consecutive parses can be compared.
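The suffix check can be illustrated with a short Python sketch (the source uses Perl; Python is used here only for illustration). The function name `detect_format` and the lookup table are hypothetical, and the encodings are simply the ones the text attributes to each format.

```python
from pathlib import Path

# Encoding per format as stated in the text above:
# GBK for the 2003 format (xls), UTF-8 for 2007 and later (xlsx).
FORMAT_BY_SUFFIX = {
    ".xls": {"format": "xls", "encoding": "gbk"},
    ".xlsx": {"format": "xlsx", "encoding": "utf-8"},
}


def detect_format(path):
    """Return the format description for an Excel file, judged by its suffix.

    Raises ValueError for any suffix other than .xls or .xlsx.
    """
    suffix = Path(path).suffix.lower()
    if suffix not in FORMAT_BY_SUFFIX:
        raise ValueError(f"unsupported Excel suffix: {suffix!r}")
    return FORMAT_BY_SUFFIX[suffix]


info = detect_format("exports/alarms_20181016.XLSX")
print(info["format"], info["encoding"])
```

Rejecting unknown suffixes early keeps a wrongly named export from reaching the parsing stage.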
As shown in Fig. 1, the specific operation steps are:
1) Load the configuration file, obtain the field list and save it as an array; then obtain an array of the current field information according to the physical table name, compare the two arrays, add any missing fields and give a corresponding prompt;
2) Determine the parsing method to load according to the parsing type;
3) Determine from the Excel file suffix whether the file is xls or xlsx, generate the file handle accordingly, and determine how the file encoding is to be handled;
4) Obtain the header data from the header position given in the configuration, traverse the header according to the parsing type, determine the header array length and how well the names match, output the correspondence between headers and columns, and report any inconsistencies;
5) Open the load control file handle and generate the load control file;
6) Open the data file handle and, using the header-to-position correspondence obtained in step 4), parse the file row by row and write the data file;
7) Execute the load;
8) Compare the number of records loaded this time with the previous load and raise an alarm if the result is abnormal.
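Steps 4) and 6) above can be sketched for the simplest case, a single-row header with unique names. This is an illustrative Python sketch rather than the Perl code of the invention; the function names are hypothetical, rows are assumed to be already read into plain lists, and compound or column-style headers would need their own traversal logic.

```python
def map_headers(header_row, expected_fields):
    """Map each expected field name to its column index in the header row.

    Returns (positions, problems), where problems lists every inconsistency
    found between the file header and the configured field list.
    """
    cleaned = ["" if cell is None else str(cell).strip() for cell in header_row]
    positions = {}
    problems = []
    if len(cleaned) != len(expected_fields):
        problems.append(
            f"header has {len(cleaned)} columns, "
            f"configuration expects {len(expected_fields)}"
        )
    for name in expected_fields:
        if name in cleaned:
            positions[name] = cleaned.index(name)
        else:
            problems.append(f"expected header '{name}' not found")
    for index, name in enumerate(cleaned):
        if name and name not in expected_fields:
            problems.append(f"unexpected header '{name}' in column {index}")
    return positions, problems


def extract_row(row, positions, expected_fields):
    """Reorder one data row into the configured field order (None if absent)."""
    return [
        row[positions[name]]
        if name in positions and positions[name] < len(row)
        else None
        for name in expected_fields
    ]


expected = ["id", "time"]
positions, problems = map_headers(["time", " id ", "note"], expected)
record = extract_row(["2018-10-16", "A1", "x"], positions, expected)
```

Because data rows are read through the name-to-position map, a reordered template still loads correctly, while renamed or added columns surface as reported problems.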
In the above process, the key operations are the objectification of the Excel file and the comparison of the data before and after.
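The before-and-after comparison can be illustrated with a small Python sketch. The source only says that the two counts are compared and that an alarm is raised when the result is abnormal; the relative threshold used here (`tolerance`, defaulting to 20%) and the function name are assumptions made for the example.

```python
def check_row_count(current, previous, tolerance=0.2):
    """Return an alert message if the loaded row count looks abnormal, else None.

    `tolerance` is a hypothetical relative threshold: a change larger than
    this fraction of the previous count is reported.
    """
    if previous is None:
        return None  # first load: nothing to compare against
    if previous == 0:
        return None if current == 0 else f"row count went from 0 to {current}"
    change = abs(current - previous) / previous
    if change > tolerance:
        return f"row count changed by {change:.0%} ({previous} -> {current})"
    return None


alert = check_row_count(500, 1000)
if alert is not None:
    print("ALERT:", alert)
```

A relative threshold is only one possible definition of "abnormal"; an absolute difference or a comparison against several earlier loads would fit the same step.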
The objectified definition of the Excel file is shown in Fig. 2.
Tag meanings:
Claims (8)
1. A general-purpose Excel file processing method, characterized in that
the Excel file is treated as an object whose attributes are described by a configuration file, so that Excel files of different versions and different header types can be parsed dynamically.
2. The method according to claim 1, characterized in that
it further includes:
development in the scripting language Perl.
3. The method according to claim 1, characterized in that
it further includes:
the suffix of the Excel file must be xls or xlsx.
4. The method according to claim 1, characterized in that
it further includes:
the header of the Excel file must fall within the conventional header types.
5. The method according to claim 1, characterized in that
it further includes:
for periodic types, comparing the differences between two consecutive parses.
6. The method according to claim 1, characterized in that
it further includes the following specific operation steps:
1) Load the configuration file, obtain the field list and save it as an array; add any missing fields and give a corresponding prompt;
2) Determine the parsing method to load according to the parsing type;
3) Determine from the Excel file suffix whether the file is xls or xlsx, generate the file handle accordingly, and determine how the file encoding is to be handled;
4) Obtain the header data from the header position given in the configuration, traverse the header according to the parsing type, determine the header array length and how well the names match, output the correspondence between headers and columns, and report any inconsistencies;
5) Open the load control file handle and generate the load control file;
6) Open the data file handle and, using the header-to-position correspondence obtained in step 4), parse the file row by row and write the data file;
7) Execute the load;
8) Compare the number of records loaded this time with the previous load and raise an alarm if the result is abnormal.
7. The method according to claim 6, characterized in that
it further includes:
in step 1), after the field list has been obtained and saved as an array, an array of the current field information is obtained according to the physical table name, and the two arrays are compared.
8. The method according to claim 7, characterized in that
it further includes:
an objectified definition of the Excel file is required:
Tag meanings:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811203170.4A CN109284254A (en) | 2018-10-16 | 2018-10-16 | A kind of general excel document handling method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109284254A (en) | 2019-01-29 |
Family
ID=65177235
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811203170.4A Pending CN109284254A (en) | 2018-10-16 | 2018-10-16 | A kind of general excel document handling method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109284254A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102200968A (en) * | 2011-05-30 | 2011-09-28 | 深圳市五巨科技有限公司 | Method and device for removing duplications of EXCEL form data |
CN103500196A (en) * | 2013-09-22 | 2014-01-08 | 成都交大光芒科技股份有限公司 | EXCEL data export method and export device in multi-concurrence large data volume environment |
CN104991776A (en) * | 2015-07-09 | 2015-10-21 | 国云科技股份有限公司 | Excel reading and writing method based on configuration |
CN106844324A (en) * | 2017-02-22 | 2017-06-13 | 浪潮通用软件有限公司 | It is a kind of to change the method that column data exports as Excel forms |
CN106933835A (en) * | 2015-12-29 | 2017-07-07 | 航天信息软件技术有限公司 | The data lead-in method and system of a kind of compatibility parsing Excel file |
CN107180019A (en) * | 2016-03-11 | 2017-09-19 | 阿里巴巴集团控股有限公司 | Form methods of exhibiting and device |
CN107291674A (en) * | 2017-06-12 | 2017-10-24 | 广东川田卫生用品有限公司 | A kind of method that Excel list datas are converted to database format |
CN108280056A (en) * | 2017-12-26 | 2018-07-13 | 北京市天元网络技术股份有限公司 | A kind of Excel file analytic method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190129 |