CN109284254A - A general Excel file processing method - Google Patents


Info

Publication number
CN109284254A
CN109284254A CN201811203170.4A CN201811203170A CN109284254A CN 109284254 A CN109284254 A CN 109284254A CN 201811203170 A CN201811203170 A CN 201811203170A CN 109284254 A CN109284254 A CN 109284254A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811203170.4A
Other languages
Chinese (zh)
Inventor
邱建波
马涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Tianyuan Communication Information System Co Ltd
Original Assignee
Inspur Tianyuan Communication Information System Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Tianyuan Communication Information System Co Ltd
Priority to CN201811203170.4A
Publication of CN109284254A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a general Excel file processing method and belongs to the technical field of web-crawler data parsing for big data platforms. By treating the Excel file as an object and describing its attributes in a configuration file, the invention dynamically parses Excel files of different versions and different header types.

Description

A general Excel file processing method
Technical field
The present invention relates to web-crawler data parsing technology for big data platforms, and in particular to a general Excel file processing method.
Background art
At present, China Mobile, China Unicom, China Telecom and China Tower are all building their own data centers nationwide. Their data sources include standard CSV files, such as resource, performance, signaling and billing data from network management, and some data, such as alarms, is supplied through databases. However, many departments provide no file or database interface for reasons of information security. The data they provide has to be exported manually as xls files using an account and password; to reduce manual intervention, a crawler script with simulated login is generally used to export the xls files automatically. Such files differ in type (xls or xlsx) and vary widely in header layout (some have non-repeating Chinese headers, some have repeated compound headers, and some have headers in a financial-column style).
The prior art has the following deficiencies:
1. xls version problem: the difference between xls and xlsx is more than the suffix. The two formats are encoded differently (the 03 format as GBK, the 07+ format as UTF-8), so in most cases they must be handled separately, and a tool is generally compatible with only one version.
2. Changes or adjustments to Chinese headers:
Because the data comes from other systems, the other party may adjust the Chinese field names or the field order of the template, so a verification and fault-tolerance mechanism is needed to handle such changes or to detect data anomalies.
3. Too many xls header types:
1) non-repeating Chinese headers;
2) repeated compound headers;
3) column-style headers;
4) double-column-style headers.
A simple and practical way of handling all of these is needed.
The traditional approach is generally developed in Java. The version must be specified even for files of the same type, and a header change requires modifying and recompiling the source code, so the development cycle is long and maintenance demands a high level of technical skill.
Summary of the invention
To solve the above technical problems, the invention proposes a general Excel file processing method. It is developed in the scripting language Perl using an object-oriented approach, and provides a general mechanism for parsing, loading into the database and verification, helping maintenance personnel configure sources easily and detect data anomalies.
The technical solution of the invention is as follows:
A general Excel file processing method:
The Excel file is treated as an object whose attributes are described by a configuration file, so that Excel files of different versions and different header types can be parsed dynamically.
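As a rough illustration of what describing the attributes of an Excel file in a configuration file could look like, the sketch below models one file description as a small object. The patent does not publish its configuration format, and its implementation is in Perl; the Python class `ExcelDescriptor`, the function `load_descriptor`, the JSON layout and all field names (`table_name`, `header_row`, `parse_type`, `fields`) are assumptions made purely for illustration.

```python
import json
from dataclasses import dataclass, field
from typing import List


# Hypothetical description of one Excel source; every key name is an assumption.
@dataclass
class ExcelDescriptor:
    table_name: str   # physical table the rows are loaded into
    header_row: int   # 0-based index of the row that holds the headers
    parse_type: str   # header layout, e.g. "simple" or "compound"
    fields: List[str] = field(default_factory=list)  # expected headers, in order


def load_descriptor(text: str) -> ExcelDescriptor:
    """Build a descriptor from a JSON configuration string."""
    raw = json.loads(text)
    return ExcelDescriptor(
        table_name=raw["table_name"],
        header_row=int(raw.get("header_row", 0)),
        parse_type=raw.get("parse_type", "simple"),
        fields=list(raw.get("fields", [])),
    )


config_text = (
    '{"table_name": "t_alarm", "header_row": 1, '
    '"parse_type": "simple", "fields": ["site", "level"]}'
)
descriptor = load_descriptor(config_text)
```

Keeping the header position, header type and field list in data rather than in code is what would let a header change be handled by editing the configuration instead of recompiling a program.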
Further, the invention is developed in the scripting language Perl.
Further, the suffix of the Excel file concerned must be xls or xlsx.
Further, the header of the Excel file concerned must fall within the regular range of header types.
Further, the parsing results of two consecutive cycles are compared for differences.
Further, the specific operating steps are:
1) load the configuration file, obtain the field list and save it as an array, and give a corresponding prompt for added or missing fields;
2) determine from the parsing type which parsing method needs to be loaded;
3) determine from the Excel file suffix whether the file is xls or xlsx, generate the file handle accordingly, and thereby determine how the file encoding is handled;
4) obtain the header data according to the header position given in the configuration, traverse the headers according to the parsing type, determine the header array length and how the names match, output the correspondence between headers and columns, and prompt for any inconsistencies;
5) open the load control file handle and generate the load control file;
6) open the data file handle and, using the correspondence between headers and positions obtained in step 4), parse row by row and write the data file;
7) execute the load into the database;
8) compare the number of records loaded this time with the previous load, and raise an alarm if the difference is abnormal.
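Steps 3) and 4) above can be sketched as follows. This is a minimal Python illustration, not the patent's Perl implementation: the function names `detect_format` and `map_headers`, the message wording, and the rule that the first matching column wins for a repeated header are all assumptions.

```python
import os
from typing import Dict, List, Tuple


def detect_format(path: str) -> str:
    """Return 'xls' or 'xlsx' from the file suffix; reject anything else."""
    suffix = os.path.splitext(path)[1].lower().lstrip(".")
    if suffix not in ("xls", "xlsx"):
        raise ValueError(f"unsupported suffix: {suffix!r}")
    return suffix


def map_headers(
    header_row: List[str], expected: List[str]
) -> Tuple[Dict[str, int], List[str]]:
    """Map each expected field to its column index and collect mismatches."""
    positions: Dict[str, int] = {}
    problems: List[str] = []
    if len(header_row) != len(expected):
        problems.append(
            f"length mismatch: file has {len(header_row)}, "
            f"config has {len(expected)}"
        )
    for name in expected:
        if name in header_row:
            # Assumption: for a repeated header, the first occurrence wins.
            positions[name] = header_row.index(name)
        else:
            problems.append(f"missing header: {name}")
    return positions, problems


positions, problems = map_headers(
    ["site", "level", "time"], ["site", "time", "owner"]
)
```

Because the mapping is built by name rather than by fixed column number, a reordered template still parses, while a renamed or dropped header shows up in `problems` instead of silently shifting data into the wrong column.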
Further, in step 1), after the field list has been obtained and saved as an array, an array of the current field information is obtained according to the physical table name, and the two arrays are compared.
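The comparison of the two arrays might look like the sketch below. It is written in Python for illustration only; the function name `diff_fields` is hypothetical, and the direction of "added" (in the configuration but not in the table) versus "missing" (in the table but not in the configuration) is an assumption, since the patent text does not define it.

```python
from typing import List, Tuple


def diff_fields(
    configured: List[str], current: List[str]
) -> Tuple[List[str], List[str]]:
    """Return (added, missing) between the config fields and the table fields."""
    configured_set = set(configured)
    current_set = set(current)
    # Assumed meaning: "added" = only in the configuration,
    # "missing" = only in the physical table.
    added = [name for name in configured if name not in current_set]
    missing = [name for name in current if name not in configured_set]
    return added, missing


added, missing = diff_fields(
    ["site", "level", "owner"], ["site", "level", "time"]
)
for name in added:
    print(f"field in configuration but not in table: {name}")
for name in missing:
    print(f"field in table but not in configuration: {name}")
```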
Further, an object definition of the Excel file is required:
Tag meanings:
Brief description of the drawings
Fig. 1 is a schematic workflow diagram of the invention;
Fig. 2 shows the object definition of the Excel file.
Detailed description of the embodiments
The content of the invention is described in more detail below:
The invention provides a general Excel file processing method which, by treating the Excel file as an object, unifies parsing, handles inconsistent file types, and solves problems such as header changes and inconsistent header types.
The Excel file is treated as an object whose attributes are described by a configuration file, so that Excel files of different versions and different header types can be parsed dynamically.
The suffix of the Excel file concerned must be xls or xlsx, and the Excel header must fall within the regular range of header types.
The parsing results of two consecutive cycles can be compared for differences.
As shown in Fig. 1, the specific operating steps are:
1) load the configuration file, obtain the field list and save it as an array, then obtain an array of the current field information according to the physical table name, compare the two arrays, and give a corresponding prompt for added or missing fields;
2) determine from the parsing type which parsing method needs to be loaded;
3) determine from the Excel file suffix whether the file is xls or xlsx, generate the file handle accordingly, and thereby determine how the file encoding is handled;
4) obtain the header data according to the header position given in the configuration, traverse the headers according to the parsing type, determine the header array length and how the names match, output the correspondence between headers and columns, and prompt for any inconsistencies;
5) open the load control file handle and generate the load control file;
6) open the data file handle and, using the correspondence between headers and positions obtained in step 4), parse row by row and write the data file;
7) execute the load into the database;
8) compare the number of records loaded this time with the previous load, and raise an alarm if the difference is abnormal.
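Steps 5) and 6) can be pictured with the following Python sketch, which writes a control file and a data file from already-parsed rows. The patent does not specify either file format, so the `table=`/`columns=` control layout, the comma-separated data layout, the padding of short rows with empty strings, and the function names `write_control` and `write_data` are all assumptions.

```python
import csv
import io
from typing import Dict, Iterable, List, TextIO


def write_control(out: TextIO, table: str, columns: List[str]) -> None:
    """Write a made-up control file: target table plus column order."""
    out.write(f"table={table}\n")
    out.write("columns=" + ",".join(columns) + "\n")


def write_data(
    out: TextIO,
    rows: Iterable[List[str]],
    positions: Dict[str, int],
    columns: List[str],
) -> int:
    """Write the configured columns of each row in config order; return row count."""
    writer = csv.writer(out, lineterminator="\n")
    count = 0
    for row in rows:
        # Short rows are padded with empty strings (an assumed policy).
        writer.writerow(
            [row[positions[c]] if positions[c] < len(row) else "" for c in columns]
        )
        count += 1
    return count


columns = ["site", "time"]
positions = {"site": 0, "time": 2}  # header name -> column index, from step 4)
rows = [["A01", "major", "10:00"], ["B02", "minor"]]  # second row is short
control, data = io.StringIO(), io.StringIO()
write_control(control, "t_alarm", columns)
written = write_data(data, rows, positions, columns)
```

The in-memory `io.StringIO` buffers stand in for the file handles named in the steps; the returned row count is the figure that step 8) would compare against the previous load.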
In the above process, the key steps are the object-based handling of the Excel file and the comparison of data between the current and previous runs.
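The comparison of the current and previous loads could be as simple as the sketch below. The patent does not say what counts as abnormal, so the relative-change rule, the default `tolerance` of 50%, the handling of an empty previous load, and the function name `check_load_count` are assumptions chosen for illustration.

```python
def check_load_count(current: int, previous: int, tolerance: float = 0.5) -> str:
    """Return 'ok' or an alarm message based on the relative change in row count."""
    if previous == 0:
        # No baseline to compare against: only an empty load counts as unchanged.
        return "ok" if current == 0 else "alarm: previous load was empty"
    change = abs(current - previous) / previous
    if change > tolerance:
        return f"alarm: row count changed by {change:.0%} ({previous} -> {current})"
    return "ok"


steady = check_load_count(1000, 980)
dropped = check_load_count(100, 1000)
```

A sudden drop in row count is a typical symptom of a header change upstream that caused rows to be skipped, which is why this check complements the header verification in step 4).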
The object definition of the Excel file is shown in Fig. 2.
Tag meanings:

Claims (8)

1. A general Excel file processing method, characterized in that
the Excel file is treated as an object whose attributes are described by a configuration file, so that Excel files of different versions and different header types can be parsed dynamically.
2. The method according to claim 1, characterized in that
it further comprises:
development in the scripting language Perl.
3. The method according to claim 1, characterized in that
it further comprises:
the suffix of the Excel file concerned must be xls or xlsx.
4. The method according to claim 1, characterized in that
it further comprises:
the header of the Excel file concerned must fall within the regular range of header types.
5. The method according to claim 1, characterized in that
it further comprises:
comparing the parsing results of two consecutive cycles for differences.
6. The method according to claim 1, characterized in that
it further comprises the following specific operating steps:
1) load the configuration file, obtain the field list and save it as an array, and give a corresponding prompt for added or missing fields;
2) determine from the parsing type which parsing method needs to be loaded;
3) determine from the Excel file suffix whether the file is xls or xlsx, generate the file handle accordingly, and thereby determine how the file encoding is handled;
4) obtain the header data according to the header position given in the configuration, traverse the headers according to the parsing type, determine the header array length and how the names match, output the correspondence between headers and columns, and prompt for any inconsistencies;
5) open the load control file handle and generate the load control file;
6) open the data file handle and, using the correspondence between headers and positions obtained in step 4), parse row by row and write the data file;
7) execute the load into the database;
8) compare the number of records loaded this time with the previous load, and raise an alarm if the difference is abnormal.
7. The method according to claim 6, characterized in that
it further comprises:
in step 1), after the field list has been obtained and saved as an array, an array of the current field information is obtained according to the physical table name, and the two arrays are compared.
8. The method according to claim 7, characterized in that
it further comprises:
an object definition of the Excel file is required:
Tag meanings:
CN201811203170.4A 2018-10-16 2018-10-16 A kind of general excel document handling method Pending CN109284254A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811203170.4A CN109284254A (en) 2018-10-16 2018-10-16 A kind of general excel document handling method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811203170.4A CN109284254A (en) 2018-10-16 2018-10-16 A kind of general excel document handling method

Publications (1)

Publication Number Publication Date
CN109284254A true CN109284254A (en) 2019-01-29

Family

ID=65177235

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811203170.4A Pending CN109284254A (en) 2018-10-16 2018-10-16 A kind of general excel document handling method

Country Status (1)

Country Link
CN (1) CN109284254A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102200968A (en) * 2011-05-30 2011-09-28 深圳市五巨科技有限公司 Method and device for removing duplications of EXCEL form data
CN103500196A (en) * 2013-09-22 2014-01-08 成都交大光芒科技股份有限公司 EXCEL data export method and export device in multi-concurrence large data volume environment
CN104991776A (en) * 2015-07-09 2015-10-21 国云科技股份有限公司 Excel reading and writing method based on configuration
CN106844324A (en) * 2017-02-22 2017-06-13 浪潮通用软件有限公司 It is a kind of to change the method that column data exports as Excel forms
CN106933835A (en) * 2015-12-29 2017-07-07 航天信息软件技术有限公司 The data lead-in method and system of a kind of compatibility parsing Excel file
CN107180019A (en) * 2016-03-11 2017-09-19 阿里巴巴集团控股有限公司 Form methods of exhibiting and device
CN107291674A (en) * 2017-06-12 2017-10-24 广东川田卫生用品有限公司 A kind of method that Excel list datas are converted to database format
CN108280056A (en) * 2017-12-26 2018-07-13 北京市天元网络技术股份有限公司 A kind of Excel file analytic method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190129