CN109376120A - Memory-optimized data file format conversion method, device, and storage medium - Google Patents

Memory-optimized data file format conversion method, device, and storage medium

Publication number
CN109376120A
Authority
CN
China
Prior art keywords
data
format
file
transformed
subtype
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811287516.3A
Other languages
Chinese (zh)
Inventor
许爱东
黄文琦
明哲
颜学谨
陈华军
杨航
关泽武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CSG Electric Power Research Institute
China Southern Power Grid Co Ltd
Research Institute of Southern Power Grid Co Ltd
Original Assignee
China Southern Power Grid Co Ltd
Research Institute of Southern Power Grid Co Ltd
Application filed by China Southern Power Grid Co Ltd and Research Institute of Southern Power Grid Co Ltd
Priority to CN201811287516.3A
Publication of CN109376120A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0625 Power saving in storage systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a memory-optimized data file format conversion method. After a data file to be converted is obtained, the fundamental type of each piece of data in the file and the subtype corresponding to that fundamental type are determined; the required target memory space is then determined according to the subtypes; finally, the data file is converted into a target format file, which can be stored in a target storage space. During file format conversion, the method uses the subtype of each piece of data to determine the memory space the data will occupy once it is read into memory, so the memory actually required is determined accurately. Compared with the conventional approach of allocating memory according to the largest subtype of each column's fundamental type, this reduces computer memory occupancy and improves memory utilization and data processing speed. The invention also discloses a memory-optimized data file format conversion device and a storage medium, which have the same effects.

Description

Memory-optimized data file format conversion method, device, and storage medium
Technical field
The present invention relates to the field of file format conversion, and in particular to a memory-optimized data file format conversion method, device, and storage medium.
Background
With the development of electronic technology, the number of sensor devices keeps growing and the amount of data received is increasing explosively. At the same time, since the concept of cloud computing was proposed, cloud computing has been widely adopted across industries, especially in the industrial field, thanks to its very large scale, virtualization, and high scalability. With the sharp growth of data volume and computing power, artificial intelligence is also developing ever faster, and the machine learning algorithms that keep emerging are widely used because they can automatically learn patterns from data and use those patterns to predict unknown data. Using cloud computing to mine massive data for prediction, so as to reduce labor costs and raise the level of automation, has become a trend.
When a machine learning processing platform is built, the platform has to interact with operators, data, and sensors. First, the platform may have many operators, who need to inspect all kinds of data visually. Second, a standard should be set for the data file formats required by machine learning algorithms, so that many algorithms can process them. Finally, sensor devices produced by different manufacturers differ, and so do the formats of the data received from them.
Therefore, so that operators can view the data and machine learning algorithms can analyze and process it, the various file formats received from sensors need to be converted into one another. At present, when a data file format is converted, the memory for the converted data is usually set to the maximum: however much memory is available, that much is allocated. Some converted data, however, does not need nearly that much memory, so computer memory occupancy is high, memory utilization is low, and data processing is slow under highly concurrent workloads.
How to overcome the high memory occupancy, low memory utilization, and slow data processing that arise during data file format conversion is therefore a problem that those skilled in the art urgently need to solve.
Summary of the invention
The embodiments of the present application provide a memory-optimized data file format conversion method, device, and storage medium, to solve the prior-art problems of high computer memory occupancy, low memory utilization, and slow data processing during data file format conversion.
To solve the above technical problem, the present invention provides a memory-optimized data file format conversion method, comprising:
Obtaining a data file to be converted, and reading the data in the data file to be converted into a memory space;
Determining the fundamental type of each piece of data in the data file to be converted and the subtype corresponding to the fundamental type;
Determining the required target memory space according to each subtype, and performing downcasting;
Performing format conversion on the data file to be converted to obtain a target format file, and storing the target format file in a target storage space.
Preferably, the format of the data file to be converted is csv, xml, xlsx, xls, or txt.
Preferably, obtaining the data file to be converted and reading the data in it into the memory space is specifically:
Obtaining the data file to be converted using the pandas library, and reading the data in the data file to be converted into the memory space.
Preferably, determining the fundamental type of each piece of data in the data file to be converted and the subtype corresponding to the fundamental type is specifically:
Traversing the data in the data file to be converted column by column using the pandas library to obtain the fundamental type and the subtype.
Preferably, determining the required target memory space according to each subtype and performing downcasting specifically comprises:
Determining a first memory space according to the fundamental type;
Optimizing the first memory space according to the subtype to determine the target memory space, and performing downcasting.
Preferably, when the format of the data file to be converted is xml, performing format conversion on the data file to be converted to obtain the target format file specifically comprises:
Converting the xml data file to be converted into dict format;
Converting the dict-format data into the target format file.
Preferably, the fundamental type is int, float, datetime, or bool.
To solve the above technical problem, the present invention also provides a memory-optimized data file format conversion device corresponding to the memory-optimized data file format conversion method, comprising:
An obtaining module, for obtaining a data file to be converted and reading the data in the data file to be converted into a memory space;
A first determining module, for determining the fundamental type of each piece of data in the data file to be converted and the subtype corresponding to the fundamental type;
A second determining module, for determining the required target memory space according to each subtype and performing downcasting;
A format conversion module, for performing format conversion on the data file to be converted to obtain a target format file, and storing the target format file in a target storage space.
To solve the above technical problem, the present invention also provides another memory-optimized data file format conversion device corresponding to the memory-optimized data file format conversion method, comprising:
A memory, for storing a computer program;
A processor, for executing the computer program to implement the steps of any of the memory-optimized data file format conversion methods described above.
To solve the above technical problem, the present invention also provides a computer-readable storage medium corresponding to the memory-optimized data file format conversion method. A computer program is stored on the computer-readable storage medium, and the computer program is executed by a processor to implement the steps of any of the memory-optimized data file format conversion methods described above.
Compared with the prior art, the memory-optimized data file format conversion method provided by the present invention obtains a data file to be converted and reads its data into a memory space, and then determines the fundamental type of each piece of data in the file and the subtype corresponding to that fundamental type. It next determines the required target memory space according to each subtype and performs downcasting. Once the target memory space is determined, the data file to be converted is converted into a target format file, and the converted target format file is stored in a target storage space. With this method, the computer memory that a conversion needs at most is determined from the subtype of each piece of data in the file, so the memory to be occupied is determined accurately. Compared with the conventional approach of allocating memory according to the largest subtype of each column's fundamental type, this reduces computer memory occupancy and improves memory utilization and data processing speed. The present invention also provides a memory-optimized data file format conversion device and a storage medium, which have the same effects.
Brief description of the drawings
Fig. 1 is a flowchart of a memory-optimized data file format conversion method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of mutual conversion between common file formats provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a memory-optimized data file format conversion device provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of another memory-optimized data file format conversion device provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The core of the present invention is to provide a memory-optimized data file format conversion method, device, and storage medium that solve the problems of high computer memory occupancy, low memory utilization, and slow data processing during data file format conversion.
So that those skilled in the art can better understand the solution of the present invention, the present invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 is a flowchart of a memory-optimized data file format conversion method provided by an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
S101: obtain a data file to be converted, and read the data in the data file to be converted into a memory space.
Specifically, after the data file to be converted is obtained, the data in it is first read into computer memory for subsequent processing. In view of practical use, the format of the data file to be converted in the embodiments of the present application is preferably csv, xml, xlsx, xls, or txt. A csv file stores tabular data as plain text with comma-separated values; xls and xlsx files are the common Microsoft Excel spreadsheet formats; a txt file is a format that can be read by a system terminal or a simple text editor; an xml file is written in Extensible Markup Language.
To speed up acquisition of the data file, in a preferred embodiment, obtaining the data file to be converted and reading the data in it into the memory space is specifically:
Obtaining the data file to be converted using the pandas library, and reading the data in the data file to be converted into the memory space.
The pandas library is a data analysis library based on Python; Python is an interpreted, object-oriented, dynamically typed high-level programming language.
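As a minimal sketch of step S101, the following reads a small csv sample with pandas. The sample text and its column names are hypothetical stand-ins for a real sensor file, not taken from the patent.

```python
import io

import pandas as pd

# Hypothetical sample standing in for a sensor data file (column names are assumptions).
csv_text = "sensor_id,reading,ok\n1,0.5,True\n2,1.5,False\n3,2.5,True\n"

# pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# With no further hints, integer and float columns get the widest subtype.
print(df.dtypes)
```

Here the integer column is read as int64 and the float column as float64, which is the starting point the later steps try to improve on.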
S102: determine the fundamental type of each piece of data in the data file to be converted and the subtype corresponding to the fundamental type.
S103: determine the required target memory space according to each subtype, and perform downcasting.
Specifically, after the data file to be converted is obtained, the fundamental type of each piece of data in it and the subtype corresponding to that fundamental type are determined first. A fundamental type is a broad category of data type; integer and floating point, for example, are fundamental types. A subtype is a narrower data type within a fundamental type: when the fundamental type is integer, the subtype is the specific bit width of the integer data; when the fundamental type is floating point, the subtype refers to the number of digits after the decimal point. The meaning of the other fundamental types and their subtypes follows in the same way and is not repeated here. In a preferred embodiment, the fundamental type is int, float, datetime, or bool. The required target memory space is then determined according to each subtype, and downcasting is performed. In other words, the subtypes determine the computer memory needed when the data file to be converted is converted.
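To make the relation between an integer fundamental type and its subtypes concrete, here is a small pure-Python sketch. The table lists the usual fixed-width integer subtypes; the helper names `smallest_integer_subtype` and `bytes_per_value` are hypothetical, and the rule of trying unsigned types first is an assumption rather than something the patent prescribes.

```python
# Value ranges of the integer subtypes, narrowest first (unsigned before signed).
INTEGER_SUBTYPES = [
    ("uint8", 0, 2**8 - 1),
    ("int8", -(2**7), 2**7 - 1),
    ("uint16", 0, 2**16 - 1),
    ("int16", -(2**15), 2**15 - 1),
    ("uint32", 0, 2**32 - 1),
    ("int32", -(2**31), 2**31 - 1),
    ("uint64", 0, 2**64 - 1),
    ("int64", -(2**63), 2**63 - 1),
]


def smallest_integer_subtype(minimum, maximum):
    """Return the name of the narrowest subtype that holds [minimum, maximum]."""
    for name, low, high in INTEGER_SUBTYPES:
        if low <= minimum and maximum <= high:
            return name
    raise ValueError("range does not fit any 64-bit integer subtype")


def bytes_per_value(subtype):
    """Width in bytes, derived from the bit count in the subtype name."""
    bits = int("".join(ch for ch in subtype if ch.isdigit()))
    return bits // 8


print(smallest_integer_subtype(0, 255), bytes_per_value("uint8"))
print(smallest_integer_subtype(-128, 127), bytes_per_value("int64"))
```

The only inputs needed are a column's minimum and maximum, which is why the method records those two values for each column.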
To quickly determine the fundamental type of each piece of data in the file to be converted and the subtype corresponding to that fundamental type, in a preferred embodiment, determining the fundamental type and the corresponding subtype is specifically:
Traversing the data in the data file to be converted column by column using the pandas library to obtain the fundamental type and the subtype.
Specifically, the pandas library is used to traverse the data in the data file to be converted column by column, recording the maximum value, minimum value, and fundamental data type of each column, from which the subtype of each column is determined. In practical applications, the name of each column and its subtype also need to be stored in a database so that they can be looked up when the data is read later. Finally, each column is downcast according to its subtype in the database, which cuts memory consumption.
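The column-by-column traversal might look like the following sketch. The sample DataFrame, the `KIND_NAMES` mapping, and the function name `column_stats` are illustrative assumptions; only `DataFrame.columns`, `Series.min`, `Series.max`, and `dtype.kind` are actual pandas/NumPy features.

```python
import pandas as pd

# Hypothetical data; in the method this would be the file read in step S101.
df = pd.DataFrame({
    "count": [0, 17, 255],
    "offset": [-128, 0, 127],
    "ratio": [0.25, 0.5, 0.75],
})

# Map NumPy dtype kind codes to the fundamental types named in the text.
KIND_NAMES = {"i": "int", "u": "int", "f": "float", "b": "bool", "M": "datetime"}


def column_stats(frame):
    """Record fundamental type, minimum, and maximum for every column."""
    stats = {}
    for name in frame.columns:
        col = frame[name]
        stats[name] = {
            "fundamental_type": KIND_NAMES.get(col.dtype.kind, "other"),
            "min": col.min(),
            "max": col.max(),
        }
    return stats


stats = column_stats(df)
print(stats["count"])
```

The recorded minimum and maximum are what later decide which subtype each column can be narrowed to.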
To cut memory consumption and improve computer memory utilization, in a preferred embodiment, determining the required target memory space according to each subtype and performing downcasting specifically comprises:
Determining a first memory space according to the fundamental type;
Optimizing the first memory space according to the subtype to determine the target memory space, and performing downcasting.
Specifically, once the fundamental type of each piece of data in the data file to be converted has been determined, the maximum memory space needed after conversion (the first memory space) is determined from the fundamental type, since the storage required by data of each fundamental type is known. The first memory space is then optimized according to the subtype corresponding to the fundamental type to determine the target memory space actually required; the first memory space is greater than or equal to the target memory space. Downcasting is then performed.
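A minimal sketch of the two memory figures, assuming a single integer column whose values happen to fit in one unsigned byte. The variable names are hypothetical; `Series.memory_usage` and `Series.astype` are standard pandas calls.

```python
import pandas as pd

# 200 values in the range 0..199, stored with the widest integer subtype.
column = pd.Series(range(200), dtype="int64")

# First memory space: follows from the fundamental type alone (8 bytes per value).
first_memory_space = column.memory_usage(index=False)

# Target memory space: follows from the subtype the value range actually needs.
target_memory_space = column.astype("uint8").memory_usage(index=False)

print(first_memory_space, target_memory_space)
```

The first figure is never smaller than the second, which matches the statement that the first memory space is greater than or equal to the target memory space.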
S104: perform format conversion on the data file to be converted to obtain a target format file, and store the target format file in a target storage space.
After the target memory space is determined, the data file to be converted is converted within that target memory space, and the converted target format file is then stored in the target storage space, for example a USB flash drive or a hard disk. When the format of the data file to be converted is converted, the file generally has to be converted into DataFrame format first, and the subsequent file format conversion follows; that is, DataFrame is the intermediate format of the file format conversion, also called the transitional format. Specifically, the subtype of each column in the data file to be converted is specified and the file is read into DataFrame format; the resulting DataFrame is then converted into the target format file. In Python, a DataFrame is a potentially heterogeneous, size-mutable, two-dimensional tabular data structure with labeled axes. By determining the subtypes of the data, the embodiments of the present application compress the space the data occupies in memory, which leaves memory available for running machine learning algorithms and for concurrent multi-user operation.
The memory-optimized data file format conversion method provided by the present invention obtains a data file to be converted and reads its data into a memory space, and then determines the fundamental type of each piece of data in the file and the subtype corresponding to that fundamental type. It next determines the required target memory space according to each subtype and performs downcasting. Once the target memory space is determined, the data file to be converted is converted into a target format file, and the converted target format file is stored in a target storage space. With this method, the computer memory that a conversion needs at most is determined from the subtype of each piece of data in the file, so the memory to be occupied is determined accurately. Compared with the conventional approach of allocating memory according to the largest subtype of each column's fundamental type, this reduces computer memory occupancy and improves memory utilization and data processing speed.
In view of the particular nature of xml files, on the basis of the above embodiments, in a preferred embodiment, when the format of the data file to be converted is xml, performing format conversion on the data file to be converted to obtain the target format file specifically comprises:
Converting the xml data file to be converted into dict format;
Converting the dict-format data into the target format file.
Specifically, if the data file to be converted is in xml format, that is, if an xml file is to be converted into a file of another format (the target format file), the xml file first has to be converted into dict format; the pandas library is then called to convert the dict-format data into DataFrame format, which is in turn converted into the other format. dict is a mutable container type in Python that can store objects of any type.
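The xml to dict to DataFrame path can be sketched as follows. The XML layout (a `table` root with `row` children) and the function name `xml_to_dict` are assumptions made for illustration; the patent does not specify the layout of its xml files.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Hypothetical XML layout: one <row> element per record, one child per column.
xml_text = """<table>
  <row><id>1</id><temp>20.5</temp></row>
  <row><id>2</id><temp>21.0</temp></row>
</table>"""


def xml_to_dict(text):
    """Build {column name: {row number: value}} from the XML text."""
    root = ET.fromstring(text)
    columns = {}
    for row_number, row in enumerate(root):
        for cell in row:
            columns.setdefault(cell.tag, {})[row_number] = cell.text
    return columns


data = xml_to_dict(xml_text)

# A nested dict maps directly onto a DataFrame: outer keys become columns.
df = pd.DataFrame(data)

# XML text is untyped, so numeric columns are converted explicitly.
df = df.apply(pd.to_numeric)
print(df)
```

Once the data is in a DataFrame, it can be written out with the usual pandas writers such as `to_csv`.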
So that those skilled in the art can better understand the solution, it is described in detail below with reference to a practical application scenario. Assume that the system reads the data file to be converted for the first time; the main steps are as follows:
The first step is to read the data file to be converted with the pandas library, which loads the data into memory in DataFrame format.
The second step is to traverse the data column by column with the pandas library, computing the maximum and minimum of each column and recording the fundamental data type of each column.
The third step is to determine the subtype of each column. Because the subtype of each column is unknown when the data is first read, pandas allocates storage for each column according to the largest subtype of its fundamental type, such as int64 or float64, so every value occupies 8 bytes of memory. The actual subtype of each column can be determined from the maximum and minimum recorded in the previous step. After the actual subtypes are obtained, the column names and subtypes of the whole data set are stored as key-value pairs in a MongoDB database; MongoDB is a database based on distributed document storage. For example, for a column that consists entirely of integers, pandas automatically treats the subtype as int64. In fact, if the column holds integers from 0 to 255, its subtype should be uint8; if it holds integers from -128 to 127, its subtype should be int8. An int64 value needs 8 bytes of storage, whereas an int8 or uint8 value needs only 1 byte.
The fourth step is to downcast each column of the data according to the subtypes obtained in the third step. If a column can be optimized, its memory footprint shrinks by at least 50% and by up to 87.5%. After this optimization, the machine can hold more files in memory at the same time, and effective memory utilization improves greatly. In multi-user scenarios, users can explore the data with Python, or feed it directly to a machine learning algorithm, without extra waiting, which improves the user experience.
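The third and fourth steps can be sketched with `pd.to_numeric` and its `downcast` option. This is one way to do it under stated assumptions, not necessarily the patent's implementation: the sample columns and the helper `downcast_integer_column` are hypothetical, and a JSON string stands in for the key-value record that the text stores in MongoDB.

```python
import json

import pandas as pd

# Hypothetical integer data, read with the widest subtype as pandas would by default.
df = pd.DataFrame(
    {"count": [0, 17, 255], "offset": [-128, 0, 127], "big": [0, 1, 100000]},
    dtype="int64",
)


def downcast_integer_column(col):
    """Narrow an integer column to the smallest subtype that holds its values."""
    kind = "unsigned" if col.min() >= 0 else "integer"
    return pd.to_numeric(col, downcast=kind)


before = df.memory_usage(index=False)
small = pd.DataFrame({name: downcast_integer_column(df[name]) for name in df.columns})
after = small.memory_usage(index=False)

# Fraction of memory saved per column.
reduction = 1 - after / before

# Column name -> subtype record; a JSON string stands in for the database entry.
subtype_record = json.dumps({name: str(dtype) for name, dtype in small.dtypes.items()})
print(reduction)
print(subtype_record)
```

Narrowing int64 to a 1-byte subtype saves 7 of every 8 bytes (87.5%), and narrowing to a 4-byte subtype saves half, which is where the 50% to 87.5% range in the text comes from.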
The solution is illustrated below taking file format conversion as an example. The specific steps are as follows:
The first step is to read the subtype of each column of the data from the database according to the file name. In the MongoDB database, column names and their subtypes are stored as key-value pairs, and the program reads these key-value pairs as strings.
The second step is for pandas, when reading the data, to specify the exact subtype of each column according to the key-value strings obtained in the first step, so that the data is read into memory directly in the most memory-efficient way. In other words, the target memory space is determined from the data subtypes.
The third step is format conversion. Fig. 2 is a schematic diagram of mutual conversion between common data file types provided by an embodiment of the present invention. As shown in Fig. 2, to make mutual conversion convenient and fast, an intermediate format has to be chosen, and it should have the following properties: it can hold numbers and strings at the same time; its memory can be reduced by row or by column; conversion to and from other formats is cheapest and fastest; and it can be fed directly to machine learning algorithms. In the embodiments of the present application, the intermediate format is DataFrame. A DataFrame is a two-dimensional data structure that is heterogeneous (the data can contain both numbers and strings), has labeled axes (row labels and column labels; the data can be sliced by row number, column number, row label, or column label), is size-mutable (rows and columns can be added or deleted freely), and can be fed directly to machine learning or deep learning algorithms. Specifically, during file conversion, if the source format is csv, xls, xlsx, or txt, the data is read directly into DataFrame format, and if the target format is also csv, xls, xlsx, or txt, the conversion is made directly from the DataFrame. If the source format is xml, which cannot be converted directly into DataFrame format, the data is read as nested key-value pairs: each column name is an outer key whose value is an inner set of key-value pairs, in which the key is the row number and the value is the corresponding datum. In this way the xml data is converted into dict format; the pandas library is then called to convert the dict format into DataFrame format, which is in turn converted into other formats.
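The first two steps above, fetching the stored subtypes and reading with them, can be sketched as follows. The JSON string stands in for the key-value record fetched from the database, and the column names and sample data are hypothetical; the `dtype` parameter of `pd.read_csv` is a standard pandas feature.

```python
import io
import json

import pandas as pd

# Stand-in for the key-value string fetched from the database (an assumption).
stored_record = '{"count": "uint8", "offset": "int8"}'
column_subtypes = json.loads(stored_record)

# Hypothetical file contents.
csv_text = "count,offset\n0,-128\n255,127\n"

# Passing dtype makes pandas allocate the narrow subtypes while reading,
# so the wide int64 columns are never created.
df = pd.read_csv(io.StringIO(csv_text), dtype=column_subtypes)

print(df.dtypes)
print(df.memory_usage(index=False).sum())
```

Reading with explicit subtypes skips the downcasting pass entirely on every read after the first.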
The format conversion process is illustrated below by converting csv format to xml format. After the csv data has been read and its memory footprint reduced, the data is traversed column by column. For each column, every datum in the column is first stored as a key-value pair of the form "{row label: row value}"; the data from that step is then combined with the column label as "{column label: {{row label 1: row value 1} ... {row label n: row value n}}}". Next, all the columns obtained are placed inside one pair of braces, which yields data in dict format (the intermediate format between xml and DataFrame). Finally, string operations are used to split the dict string and convert the dict-format data into xml data row by row.
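A sketch of the csv to dict to xml path follows. Note the substitution: the text builds the xml with string operations on the dict string, whereas this sketch uses the standard library's `xml.etree.ElementTree` to emit the elements. The `table` and `row` element names and the sample data are assumptions.

```python
import io
import xml.etree.ElementTree as ET

import pandas as pd

# Hypothetical csv input.
df = pd.read_csv(io.StringIO("id,temp\n1,20.5\n2,21.0\n"))

# DataFrame.to_dict() yields {column label: {row label: value}},
# the nested form described in the text.
nested = df.to_dict()

# Emit one <row> element per row label, one child element per column.
# Column labels must be valid XML tag names for this to work.
root = ET.Element("table")
for row_label in df.index:
    row = ET.SubElement(root, "row")
    for column in nested:
        ET.SubElement(row, column).text = str(nested[column][row_label])

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Using an XML library rather than string splitting also takes care of escaping characters such as `<` and `&` in the data.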
The embodiments of the memory-optimized data file format conversion method have been described in detail above. Based on that method, an embodiment of the present invention also provides a corresponding memory-optimized data file format conversion device. Because the device embodiments correspond to the method embodiments, refer to the description of the method embodiments for details, which are not repeated here.
Fig. 3 is a schematic structural diagram of a memory-optimized data file format conversion device provided by an embodiment of the present invention. As shown in Fig. 3, the device includes an obtaining module 301, a first determining module 302, a second determining module 303, and a format conversion module 304.
The obtaining module 301 is for obtaining a data file to be converted and reading the data in the data file to be converted into a memory space;
The first determining module 302 is for determining the fundamental type of each piece of data in the data file to be converted and the subtype corresponding to the fundamental type;
The second determining module 303 is for determining the required target memory space according to each subtype and performing downcasting;
The format conversion module 304 is for performing format conversion on the data file to be converted to obtain a target format file, and storing the target format file in a target storage space.
The memory-optimized data file format conversion device provided by the present invention obtains a data file to be converted and reads its data into a memory space, and then determines the fundamental type of each piece of data in the file and the subtype corresponding to that fundamental type. It next determines the required target memory space according to each subtype and performs downcasting. Once the target memory space is determined, the data file to be converted is converted into a target format file, and the converted target format file is stored in a target storage space. With this device, the computer memory that a conversion needs at most is determined from the subtype of each piece of data in the file, so the memory to be occupied is determined accurately. Compared with the conventional approach of allocating memory according to the largest subtype of each column's fundamental type, this reduces computer memory occupancy and improves memory utilization and data processing speed.
The embodiments of the memory-optimized data file format conversion method have been described in detail above. Based on the method described in those embodiments, an embodiment of the present invention further provides another memory-optimized data file format conversion device corresponding to the method. Because the device embodiments correspond to the method embodiments, refer to the description of the method embodiments for the device embodiments; details are not repeated here.
Fig. 4 is a schematic diagram of the composition of another memory-optimized data file format conversion device provided by an embodiment of the present invention. As shown in Fig. 4, the device includes a memory 401 and a processor 402.
The memory 401 is configured to store a computer program;
The processor 402 is configured to execute the computer program to implement the steps of the memory-optimized data file format conversion method provided by any one of the above embodiments.
This other memory-optimized data file format conversion device provided by the present invention can determine, during file format conversion, the maximum computer memory space that the conversion needs to occupy from the subtype of each data item in the data file to be converted, so the memory space to be occupied is determined accurately. Compared with the traditional approach of directly allocating memory space according to the largest subtype of each column's basic type, this reduces computer memory usage and improves memory utilization and data processing speed.
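As a hedged illustration of the kind of program the processor might execute, the sketch below uses the pandas library named in the claims to read csv data into memory, inspect each column's dtype, and downcast numeric columns. The sample data, the column names, and the helper `downcast_numeric_columns` are assumptions made for illustration, and the final conversion to a target format is omitted because this section does not name that format.

```python
import io

import pandas as pd

# Hypothetical sample standing in for a csv data file to be converted.
CSV_TEXT = "id,reading\n1,0.5\n2,1.5\n3,2.5\n"


def downcast_numeric_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy in which each numeric column uses a narrower subtype if possible."""
    out = df.copy()
    for name in out.columns:
        col = out[name]
        if pd.api.types.is_integer_dtype(col):
            # Integer basic type: pick the smallest signed integer subtype that fits.
            out[name] = pd.to_numeric(col, downcast="integer")
        elif pd.api.types.is_float_dtype(col):
            # Float basic type: pandas narrows to float32 at most.
            out[name] = pd.to_numeric(col, downcast="float")
    return out


df = pd.read_csv(io.StringIO(CSV_TEXT))
optimized = downcast_numeric_columns(df)

before = int(df.memory_usage(deep=True).sum())
after = int(optimized.memory_usage(deep=True).sum())
print(df.dtypes.to_dict(), before)
print(optimized.dtypes.to_dict(), after)
```

Here `pd.read_csv` infers 64-bit types by default, which corresponds to the traditional allocation, while `pd.to_numeric(..., downcast=...)` performs the per-column downcast.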
The embodiments of the memory-optimized data file format conversion method have been described in detail above. Based on the method described in those embodiments, an embodiment of the present invention further provides a computer-readable storage medium corresponding to the method. Because the embodiments of the computer-readable storage medium correspond to the method embodiments, refer to the description of the method embodiments for the storage medium embodiments; details are not repeated here.
A computer-readable storage medium stores a computer program, and the computer program is executed by a processor to implement the steps of the memory-optimized data file format conversion method provided by any one of the above embodiments.
With the computer-readable storage medium provided by the present invention, a processor can read the program stored in the readable storage medium and thereby implement the memory-optimized data file format conversion method provided by any one of the above embodiments. During file format conversion, the maximum computer memory space that the conversion needs to occupy can be determined from the subtype of each data item in the data file to be converted, so the memory space to be occupied is determined accurately. Compared with the traditional approach of directly allocating memory space according to the largest subtype of each column's basic type, this reduces computer memory usage and improves memory utilization and data processing speed.
The memory-optimized data file format conversion method, device, and storage medium provided by the present invention have been described in detail above. Several examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. For a person of ordinary skill in the art, the specific implementation and the scope of application may vary according to the idea of the present invention. In conclusion, the content of this specification should not be construed as limiting the present invention, and any modification, equivalent replacement, or improvement made to the present invention by a person skilled in the art without creative effort shall fall within the scope of the present application.
It should also be noted that, in this specification, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the term "includes" and similar words are intended to cover non-exclusive inclusion, so that a unit, device, or system that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to the unit, device, or system.

Claims (10)

1. A memory-optimized data file format conversion method, characterized by comprising:
obtaining a data file to be converted, and reading the data in the data file to be converted into memory space;
determining the basic type of each data item in the data file to be converted and the subtype corresponding to the basic type;
determining the required target memory space according to each subtype, and performing downcasting;
performing format conversion on the data file to be converted to obtain a target format file, and storing the target format file in a target storage space.
2. The memory-optimized data file format conversion method according to claim 1, characterized in that the format of the data file to be converted is specifically csv format, xml format, xlsx format, xls format, or txt format.
3. The memory-optimized data file format conversion method according to claim 2, characterized in that obtaining the data file to be converted and reading the data in the data file to be converted into memory space is specifically:
obtaining the data file to be converted using the pandas library, and reading the data in the data file to be converted into the memory space.
4. The memory-optimized data file format conversion method according to claim 3, characterized in that determining the basic type of each data item in the data file to be converted and the subtype corresponding to the basic type is specifically:
traversing each data item in the data file to be converted column by column using the pandas library to obtain the basic type and the subtype.
5. The memory-optimized data file format conversion method according to claim 4, characterized in that determining the required target memory space according to each subtype and performing downcasting specifically comprises:
determining a first memory space according to the basic type;
optimizing the first memory space according to the subtype to determine the target memory space, and performing downcasting.
6. The memory-optimized data file format conversion method according to claim 2, characterized in that, when the format of the data file to be converted is xml format, performing format conversion on the data file to be converted to obtain the target format file specifically comprises:
converting the xml-format data file to be converted into a dict-format file;
converting the dict-format file into the target format file.
7. The memory-optimized data file format conversion method according to any one of claims 1 to 6, characterized in that the basic type includes an int type, a float type, a datetime type, or a bool type.
8. A memory-optimized data file format conversion device, characterized by comprising:
an obtaining module, configured to obtain a data file to be converted and to read the data in the data file to be converted into memory space;
a first determining module, configured to determine the basic type of each data item in the data file to be converted and the subtype corresponding to the basic type;
a second determining module, configured to determine the required target memory space according to each subtype and to perform downcasting;
a format conversion module, configured to perform format conversion on the data file to be converted to obtain a target format file and to store the target format file in a target storage space.
9. A memory-optimized data file format conversion device, characterized by comprising:
a memory, configured to store a computer program;
a processor, configured to execute the computer program to implement the steps of the memory-optimized data file format conversion method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program is executed by a processor to implement the steps of the memory-optimized data file format conversion method according to any one of claims 1 to 7.
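
The two-stage xml conversion recited in claim 6 (xml to dict, then dict to the target format) can be sketched as follows. This is only an illustrative assumption: the sample XML layout, the function names `xml_to_dicts` and `dicts_to_target`, and the use of JSON as a stand-in for the target format are all hypothetical, since the claims do not name the target format.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical xml data to be converted: one <row> element per record.
XML_TEXT = """<rows>
  <row><id>1</id><name>a</name></row>
  <row><id>2</id><name>b</name></row>
</rows>"""


def xml_to_dicts(xml_text: str) -> list:
    """Stage 1: turn each child of the root into a dict of tag -> text."""
    root = ET.fromstring(xml_text)
    return [{field.tag: field.text for field in record} for record in root]


def dicts_to_target(records: list) -> str:
    """Stage 2: serialize the dict records; JSON stands in for the target format."""
    return json.dumps(records)


records = xml_to_dicts(XML_TEXT)
target_text = dicts_to_target(records)
print(target_text)
```

Going through a plain dict keeps the xml parsing separate from the writer, so the second stage could be swapped for another target format without touching the first.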
CN201811287516.3A 2018-10-31 2018-10-31 A kind of document format data method for transformation, device and the storage medium of internal memory optimization Pending CN109376120A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811287516.3A CN109376120A (en) 2018-10-31 2018-10-31 A kind of document format data method for transformation, device and the storage medium of internal memory optimization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811287516.3A CN109376120A (en) 2018-10-31 2018-10-31 A kind of document format data method for transformation, device and the storage medium of internal memory optimization

Publications (1)

Publication Number Publication Date
CN109376120A true CN109376120A (en) 2019-02-22

Family

ID=65391058

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811287516.3A Pending CN109376120A (en) 2018-10-31 2018-10-31 A kind of document format data method for transformation, device and the storage medium of internal memory optimization

Country Status (1)

Country Link
CN (1) CN109376120A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110059129A (en) * 2019-04-28 2019-07-26 顶象科技有限公司 Date storage method, device and electronic equipment
CN110674199A (en) * 2019-08-13 2020-01-10 中国电建集团贵阳勘测设计研究院有限公司 Method and device for converting csv format data into SEG-2 format data
CN112363672A (en) * 2020-11-09 2021-02-12 北京大豪科技股份有限公司 Data processing method, device and equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
机器之心 (Synced): "Simple and practical pandas tips: how to reduce memory usage by 90%", 《HTTPS://JUEJIN.CN/POST/6844903573696806926》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110059129A (en) * 2019-04-28 2019-07-26 顶象科技有限公司 Date storage method, device and electronic equipment
CN110674199A (en) * 2019-08-13 2020-01-10 中国电建集团贵阳勘测设计研究院有限公司 Method and device for converting csv format data into SEG-2 format data
CN110674199B (en) * 2019-08-13 2022-11-08 中国电建集团贵阳勘测设计研究院有限公司 Method and device for converting csv format data into SEG-2 format data
CN112363672A (en) * 2020-11-09 2021-02-12 北京大豪科技股份有限公司 Data processing method, device and equipment

Similar Documents

Publication Publication Date Title
CN109376120A (en) A kind of document format data method for transformation, device and the storage medium of internal memory optimization
CN102411616B (en) Method and system for storing data and data management method
CN104732574B (en) The compression method and device of a kind of role play
CN104040542A (en) Techniques for maintaining column vectors of relational data within volatile memory
CN103425772A (en) Method for searching massive data with multi-dimensional information
CN110738037A (en) Method, apparatus, device and storage medium for automatically generating electronic form
CN107563557A (en) Determine the method and device of oil well output lapse rate
CN108363559A (en) Multiplication processing method, equipment and the computer-readable medium of neural network
CN103425692A (en) Data exporting method and data exporting device
CN102004787A (en) Method for combining multiple application scene forms based on office software plugins
CN106372008A (en) Data caching method and device
CN105095255A (en) Data index creating method and device
CN104978325B (en) A kind of web page processing method, device and user terminal
CN106095991A (en) A kind of automatically generate from relevant database to the method for the code of MongoDB database data migration
CN105229625A (en) Obtain the mixing Hash scheme of effective HMM
CN110134398A (en) Analytic method, system and the equipment of list data
CN109102141A (en) A kind of service level methods of marking and device
CN102724506A (en) JPEG (joint photographic experts group)_LS (laser system) general coding hardware implementation method
CN115438114B (en) Storage format conversion method, system, device, electronic equipment and storage medium
CN112000628A (en) Multi-channel laser radar data storage method and device and electronic equipment
WO2023103334A1 (en) Data processing method and apparatus of neural network simulator, and terminal
CN112765960B (en) Text matching method and device and computer equipment
CN102339342B (en) Method for fast materializing of parameterization device unit
CN114331071A (en) Model training method, fracturing parameter determination device and computer equipment
CN111309988B (en) Character string retrieval method and device based on coding and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190222
