CN113254443A - Big data partition storage device and method - Google Patents

Big data partition storage device and method

Info

Publication number
CN113254443A
Authority
CN
China
Prior art keywords
data
storage
small
unit
beginning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110570574.2A
Other languages
Chinese (zh)
Inventor
唐丙寅
李萌
刘杨涛
杨思佳
史彰民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanyang Institute of Technology
Original Assignee
Nanyang Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-05-19
Filing date
2021-05-19
Publication date
2021-08-13
Application filed by Nanyang Institute of Technology
Priority to CN202110570574.2A
Publication of CN113254443A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2291 User-Defined Types; Storage management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a big data partition storage device and method. Big data of different kinds is split into a plurality of small data items, and the beginning and end of each small data item are marked. The marks make it easy to reassemble the data at query time, and because only the position information of the beginning and end marks has to be stored, the position of a piece of content can be found quickly during a search. Keywords are extracted from the split small data and a mapping relation is established between the keywords and the small data, so that users can store different big data in partitions and accurately retrieve either the stored big data as a whole or part of its content.

Description

Big data partition storage device and method
Technical Field
The invention belongs to the technical field of data storage, and relates to a big data partition storage device and method.
Background
Current databases contain data tables, which are used to store big data in the database. The storage area of a data table has an upper capacity limit, so a data table cannot store big data that exceeds the capacity of its storage area. The existing approach is to divide the data belonging to one data table into a plurality of storage areas: large-capacity data is split into several smaller data items, which are then stored in the storage areas of the same data table. However, the existing partition storage method can only partition ASCII characters and cannot partition non-ASCII characters. In addition, when data is read, the stored pieces must first be merged back into the complete data before it can be retrieved and viewed, so when only part of the content of the big data is needed, it cannot be retrieved effectively. Partition storage is therefore used only to split big data, its advantages are not fully exploited, and retrieval and viewing efficiency is greatly reduced. A new partition storage device and method is urgently needed.
Disclosure of Invention
The invention aims to provide a big data partition storage device and method.
The purpose of the invention can be realized by the following technical scheme:
A big data partition storage method comprises the following specific steps:
1) acquiring big data to be stored, and splitting the big data to be stored into a plurality of smaller data items (small data);
2) marking the beginning and the end of each small data item as Ann and Bn(n+1) respectively (for example A11 and B12 for the first item), wherein n is any non-zero number;
3) acquiring the storage space to be occupied by the data to be stored, dividing the storage space into a plurality of partitions, sequentially storing the split small data in the partitions, extracting Ann and Bn(n+1) from the small data, recording the positions in the storage space of the small data corresponding to Ann and Bn(n+1), and denoting the storage position data as Zn, wherein n is any non-zero number;
4) combining the data Z1 to Zn into address library data;
5) storing the address library data in an address storage library;
6) extracting keywords from the small data;
7) establishing a mapping relation between the keywords and Ann and Bn(n+1).
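The seven steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the names `split_text` and `store`, the fixed chunk length in characters, the use of a list index as the storage position Zn, and the "every word is a keyword" rule are all hypothetical choices that the source does not specify. Splitting on Python string indices works on code points, so non-ASCII characters are never cut in half.

```python
import re
from collections import defaultdict


def split_text(text: str, chunk_chars: int) -> list[str]:
    """Step 1: split on character (code point) boundaries into small data items."""
    return [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]


def store(text: str, chunk_chars: int):
    data_store = []                      # one storage partition per small data item
    address_store = {}                   # (begin mark, end mark) -> Zn (assumed: list index)
    keyword_index = defaultdict(list)    # keyword -> list of (begin mark, end mark)

    for n, chunk in enumerate(split_text(text, chunk_chars), start=1):
        marks = (f"A{n}{n}", f"B{n}{n + 1}")          # step 2: A11/B12, A22/B23, ...
        data_store.append(chunk)                      # step 3: store sequentially
        address_store[marks] = len(data_store) - 1    # steps 3-5: record Zn in the address library
        # Step 6 (assumed rule): every distinct word in the item is a keyword.
        for word in sorted(set(re.findall(r"\w+", chunk.lower()))):
            keyword_index[word].append(marks)         # step 7: keyword -> marks
    return data_store, address_store, dict(keyword_index)


data, addresses, keywords = store("big data is split into small data", 17)
```

With this input the text is stored as two items, and the keyword "data" maps to the marks of both of them, which is what later allows a query to fetch only the relevant items.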
Further, the working steps of step 2) are as follows:
1) recording the beginning of the first small data item as A11, and recording its end as B12;
2) recording the beginning of the small data item following the first one (the second item) as A22, and recording its end as B23;
3) recording the beginning of the small data item following the (n-1)-th one (the n-th item) as Ann, and recording its end as Bn(n+1), wherein n is any non-zero number.
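The marking scheme can be illustrated with a short helper. This is a sketch under assumptions: it treats n as a positive integer, and it reads the end mark Bn(n+1) as naming the item that follows, which is one plausible interpretation of the notation rather than something the source states. The delimited variant and the `successor` helper are hypothetical additions that only make the marks easier to parse once n has more than one digit.

```python
def make_marks(n: int) -> tuple[str, str]:
    """Return the (begin, end) marks Ann and Bn(n+1) for the n-th small data item."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return f"A{n}{n}", f"B{n}{n + 1}"


def make_marks_delimited(n: int) -> tuple[str, str]:
    """Hypothetical variant: separate the two indices so multi-digit n parses easily."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return f"A{n}_{n}", f"B{n}_{n + 1}"


def successor(delimited_end_mark: str) -> int:
    """Read the index of the following item out of a delimited end mark."""
    return int(delimited_end_mark.split("_")[1])


first = make_marks(1)
second = make_marks(2)
twelfth = make_marks_delimited(12)
```

Under this reading, the end mark of item n and the begin mark of item n+1 share the index n+1, so adjacent items can be chained back together during reassembly.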
Further, the working steps of step 3) are as follows:
1) acquiring the storage space to be occupied by the data to be stored, and dividing the storage space into an address storage library and a data storage library;
2) dividing the data storage library into a plurality of storage partitions, using a set storage size as the unit;
3) sequentially storing the small data into the storage partitions of the data storage library;
4) extracting Ann and Bn(n+1) of the small data stored in the data storage library;
5) denoting the data storage locations of A11 and B12 as Z1;
6) denoting the data storage locations of A22 and B23 as Z2;
7) denoting the data storage locations of Ann and Bn(n+1) as Zn, wherein n is any non-zero number.
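A minimal sketch of the data storage library and address storage library follows. The layout is an assumption made for illustration: each small data item is encoded as UTF-8, placed in its own fixed-size partition padded with zero bytes, and Zn is recorded as the pair of byte offsets where the item begins and ends. The names `store_in_partitions` and `read_item` are hypothetical.

```python
def store_in_partitions(chunks: list[str], partition_size: int):
    """Store each small data item in its own fixed-size partition and record Zn."""
    data_library = bytearray()   # the data storage library
    address_library = {}         # the address storage library: "Zn" -> (begin, end) offsets
    for n, chunk in enumerate(chunks, start=1):
        payload = chunk.encode("utf-8")
        if len(payload) > partition_size:
            raise ValueError(f"small data item {n} does not fit in one partition")
        begin = len(data_library)                                # partition n starts here
        data_library += payload.ljust(partition_size, b"\x00")   # pad to the partition size
        address_library[f"Z{n}"] = (begin, begin + len(payload))
    return bytes(data_library), address_library


def read_item(data_library: bytes, address_library: dict, n: int) -> str:
    """Fetch the n-th small data item using only its recorded storage position."""
    begin, end = address_library[f"Z{n}"]
    return data_library[begin:end].decode("utf-8")


library, addresses = store_in_partitions(["héllo", "wörld"], 8)
second_item = read_item(library, addresses, 2)
```

Because the offsets are byte positions of whole encoded items, multi-byte (non-ASCII) characters are read back intact without scanning the rest of the library.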
Further, the query steps of the partition storage method comprise:
(1) receiving input keywords and comparing them with the stored keywords;
(2) extracting, from the stored keywords, the keywords identical to the input keywords;
(3) using the extracted keywords to look up the beginning and end marks of the corresponding small data;
(4) using the beginning and end marks to look up the corresponding storage positions in the address storage library;
(5) exporting the data at those storage positions;
(6) splicing the exported data into a single output.
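The query steps can be sketched as three dictionary and list lookups. The sample contents of `DATA_STORE`, `ADDRESS_STORE` and `KEYWORD_INDEX` are made up for illustration, as are exact, case-insensitive keyword matching and the use of list indices as storage positions; the source does not fix any of these details.

```python
# Hypothetical stored state: three small data items, their positions, and a keyword index.
DATA_STORE = ["partition storage ", "splits big data ", "into small data"]
ADDRESS_STORE = {("A11", "B12"): 0, ("A22", "B23"): 1, ("A33", "B34"): 2}
KEYWORD_INDEX = {
    "storage": [("A11", "B12")],
    "data": [("A22", "B23"), ("A33", "B34")],
}


def query(keyword: str) -> str:
    # Steps (1)-(3): match the input keyword and fetch the begin/end marks it maps to.
    marks = KEYWORD_INDEX.get(keyword.lower(), [])
    # Step (4): look up the storage position of each marked item in the address library.
    positions = sorted(ADDRESS_STORE[pair] for pair in marks)
    # Step (5): export only the items at those positions.
    pieces = [DATA_STORE[z] for z in positions]
    # Step (6): splice the exported items, in storage order, into one result.
    return "".join(pieces)


result = query("data")
```

Only the items whose marks are mapped from the keyword are read, so the complete data never has to be reassembled to answer a partial-content query.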
A big data partition storage device comprises a data acquisition unit, a big data splitting unit, a marking unit, a storage unit and a mapping unit. The data acquisition unit acquires the big data to be stored and transmits it to the big data splitting unit. The big data splitting unit splits the acquired data into a plurality of small data items and transmits them to the marking unit. The marking unit marks the beginning and the end of each small data item. The storage unit stores the small data items, marks their storage positions, and stores the marked storage position data. The mapping unit extracts keywords from the small data items and maps each keyword to the beginning and end of the corresponding small data item.
Further, the working steps of the device are as follows:
1) data acquisition unit: acquiring the big data to be stored, and transmitting the data to the big data splitting unit;
2) big data splitting unit: splitting the acquired data into a plurality of small data items, and transmitting the small data items to the marking unit;
3) marking unit: marking the beginning and the end of each small data item;
4) storage unit: storing the small data items, marking their storage positions, and storing the marked storage position data;
5) mapping unit: extracting the keywords of the small data items, and mapping the keywords to the beginning and the end of the corresponding small data items.
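The five units can be modelled as small components wired into one pipeline. This is only an illustrative sketch: the class and function names, the pass-through acquisition step, splitting by a fixed number of characters, and whitespace-separated words as keywords are all assumptions, not details given in the source.

```python
from dataclasses import dataclass, field


def acquire(source: str) -> str:
    """Data acquisition unit (assumed here to receive the data directly as a string)."""
    return source


def split(data: str, size: int) -> list[str]:
    """Big data splitting unit: fixed-length split on character boundaries."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def mark(n: int) -> tuple[str, str]:
    """Marking unit: begin and end marks of the n-th small data item."""
    return f"A{n}{n}", f"B{n}{n + 1}"


@dataclass
class StorageUnit:
    partitions: list = field(default_factory=list)   # stored small data items
    addresses: dict = field(default_factory=dict)    # marks -> storage position

    def save(self, marks: tuple[str, str], chunk: str) -> None:
        self.partitions.append(chunk)
        self.addresses[marks] = len(self.partitions) - 1


@dataclass
class MappingUnit:
    index: dict = field(default_factory=dict)        # keyword -> list of marks

    def map(self, marks: tuple[str, str], chunk: str) -> None:
        for word in sorted(set(chunk.lower().split())):
            self.index.setdefault(word, []).append(marks)


def run_device(source: str, size: int) -> tuple[StorageUnit, MappingUnit]:
    storage, mapping = StorageUnit(), MappingUnit()
    for n, chunk in enumerate(split(acquire(source), size), start=1):
        marks = mark(n)
        storage.save(marks, chunk)
        mapping.map(marks, chunk)
    return storage, mapping


storage, mapping = run_device("alpha beta gamma delta", 11)
```

Keeping the storage and mapping units as separate objects mirrors the description: positions live in one structure and keyword mappings in another, linked only through the begin and end marks.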
The invention has the following beneficial effects. Different data can be split, and the beginning and end of each of the resulting small data items are marked. On the one hand, marking the small data facilitates data reassembly at query time; on the other hand, because the beginning and end are marked, only the position information of the beginning and end marks has to be stored, so the position of a piece of content can be found quickly during a search. Keywords are extracted from the split small data and a mapping relation is established between the keywords and the small data, so that users can store different big data in partitions and accurately retrieve either the stored big data as a whole or part of its content.
Drawings
In order to facilitate understanding for those skilled in the art, the present invention will be further described with reference to the accompanying drawings.
FIG. 1 is a simplified flow chart of a method for partitioned storage of big data according to the present invention;
FIG. 2 is a schematic diagram of a big data partition storage device according to the present invention.
Detailed Description
The invention is described in detail below by way of example with reference to FIGS. 1-2.
A big data partition storage method comprises the following specific steps:
1) acquiring big data to be stored, and splitting the big data to be stored into a plurality of smaller data items (small data);
2) marking the beginning and the end of each small data item as Ann and Bn(n+1) respectively, wherein n is any non-zero number;
3) acquiring the storage space to be occupied by the data to be stored, dividing the storage space into a plurality of partitions, sequentially storing the split small data in the partitions, extracting Ann and Bn(n+1) from the small data, recording the positions in the storage space of the small data corresponding to Ann and Bn(n+1), and denoting the storage position data as Zn, wherein n is any non-zero number;
4) combining the data Z1 to Zn into address library data;
5) storing the address library data in an address storage library;
6) extracting keywords from the small data;
7) establishing a mapping relation between the keywords and Ann and Bn(n+1).
The working steps of step 2) are as follows:
1) recording the beginning of the first small data item as A11, and recording its end as B12;
2) recording the beginning of the small data item following the first one (the second item) as A22, and recording its end as B23;
3) recording the beginning of the small data item following the (n-1)-th one (the n-th item) as Ann, and recording its end as Bn(n+1), wherein n is any non-zero number.
The working steps of step 3) are as follows:
1) acquiring the storage space to be occupied by the data to be stored, and dividing the storage space into an address storage library and a data storage library;
2) dividing the data storage library into a plurality of storage partitions, using a set storage size as the unit;
3) sequentially storing the small data into the storage partitions of the data storage library;
4) extracting Ann and Bn(n+1) of the small data stored in the data storage library;
5) denoting the data storage locations of A11 and B12 as Z1;
6) denoting the data storage locations of A22 and B23 as Z2;
7) denoting the data storage locations of Ann and Bn(n+1) as Zn, wherein n is any non-zero number.
The query steps of the partition storage method comprise:
(1) receiving input keywords and comparing them with the stored keywords;
(2) extracting, from the stored keywords, the keywords identical to the input keywords;
(3) using the extracted keywords to look up the beginning and end marks of the corresponding small data;
(4) using the beginning and end marks to look up the corresponding storage positions in the address storage library;
(5) exporting the data at those storage positions;
(6) splicing the exported data into a single output.
A big data partition storage device comprises a data acquisition unit, a big data splitting unit, a marking unit, a storage unit and a mapping unit. The data acquisition unit acquires the big data to be stored and transmits it to the big data splitting unit. The big data splitting unit splits the acquired data into a plurality of small data items and transmits them to the marking unit. The marking unit marks the beginning and the end of each small data item. The storage unit stores the small data items, marks their storage positions, and stores the marked storage position data. The mapping unit extracts keywords from the small data items and maps each keyword to the beginning and end of the corresponding small data item.
The working steps of the device are as follows:
1) data acquisition unit: acquiring the big data to be stored, and transmitting the data to the big data splitting unit;
2) big data splitting unit: splitting the acquired data into a plurality of small data items, and transmitting the small data items to the marking unit;
3) marking unit: marking the beginning and the end of each small data item;
4) storage unit: storing the small data items, marking their storage positions, and storing the marked storage position data;
5) mapping unit: extracting the keywords of the small data items, and mapping the keywords to the beginning and the end of the corresponding small data items.
The preferred embodiments of the invention disclosed above are intended to be illustrative only. The preferred embodiments are not intended to be exhaustive or to limit the invention to the precise embodiments disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best utilize the invention. The invention is limited only by the claims and their full scope and equivalents.

Claims (6)

1. A big data partition storage method, characterized by comprising the following specific steps:
1) acquiring big data to be stored, and splitting the big data to be stored into a plurality of smaller data items (small data);
2) marking the beginning and the end of each small data item as Ann and Bn(n+1) respectively, wherein n is any non-zero number;
3) acquiring the storage space to be occupied by the data to be stored, dividing the storage space into a plurality of partitions, sequentially storing the split small data in the partitions, extracting Ann and Bn(n+1) from the small data, recording the positions in the storage space of the small data corresponding to Ann and Bn(n+1), and denoting the storage position data as Zn, wherein n is any non-zero number;
4) combining the data Z1 to Zn into address library data;
5) storing the address library data in an address storage library;
6) extracting keywords from the small data;
7) establishing a mapping relation between the keywords and Ann and Bn(n+1).
2. The big data partition storage method according to claim 1, wherein the working steps of step 2) are as follows:
1) recording the beginning of the first small data item as A11, and recording its end as B12;
2) recording the beginning of the small data item following the first one (the second item) as A22, and recording its end as B23;
3) recording the beginning of the small data item following the (n-1)-th one (the n-th item) as Ann, and recording its end as Bn(n+1), wherein n is any non-zero number.
3. The big data partition storage method according to claim 1, wherein the working steps of step 3) are as follows:
1) acquiring the storage space to be occupied by the data to be stored, and dividing the storage space into an address storage library and a data storage library;
2) dividing the data storage library into a plurality of storage partitions, using a set storage size as the unit;
3) sequentially storing the small data into the storage partitions of the data storage library;
4) extracting Ann and Bn(n+1) of the small data stored in the data storage library;
5) denoting the data storage locations of A11 and B12 as Z1;
6) denoting the data storage locations of A22 and B23 as Z2;
7) denoting the data storage locations of Ann and Bn(n+1) as Zn, wherein n is any non-zero number.
4. The big data partition storage method according to claim 1, wherein the query steps of the partition storage method comprise:
(1) receiving input keywords and comparing them with the stored keywords;
(2) extracting, from the stored keywords, the keywords identical to the input keywords;
(3) using the extracted keywords to look up the beginning and end marks of the corresponding small data;
(4) using the beginning and end marks to look up the corresponding storage positions in the address storage library;
(5) exporting the data at those storage positions;
(6) splicing the exported data into a single output.
5. A big data partition storage device, characterized by comprising a data acquisition unit, a big data splitting unit, a marking unit, a storage unit and a mapping unit, wherein the data acquisition unit is used for acquiring big data to be stored and transmitting the data to the big data splitting unit; the big data splitting unit splits the acquired data into a plurality of small data items and transmits the small data items to the marking unit; the marking unit is used for marking the beginning and the end of each small data item; the storage unit is used for storing the small data items, marking their storage positions, and storing the marked storage position data; and the mapping unit is used for extracting keywords of the small data items and mapping the keywords to the beginning and the end of the corresponding small data items.
6. The big data partition storage device according to claim 5, wherein the working steps of the device are as follows:
(1) data acquisition unit: acquiring the big data to be stored, and transmitting the data to the big data splitting unit;
(2) big data splitting unit: splitting the acquired data into a plurality of small data items, and transmitting the small data items to the marking unit;
(3) marking unit: marking the beginning and the end of each small data item;
(4) storage unit: storing the small data items, marking their storage positions, and storing the marked storage position data;
(5) mapping unit: extracting the keywords of the small data items, and mapping the keywords to the beginning and the end of the corresponding small data items.
CN202110570574.2A 2021-05-19 2021-05-19 Big data partition storage device and method Pending CN113254443A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110570574.2A CN113254443A (en) 2021-05-19 2021-05-19 Big data partition storage device and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110570574.2A CN113254443A (en) 2021-05-19 2021-05-19 Big data partition storage device and method

Publications (1)

Publication Number Publication Date
CN113254443A true CN113254443A (en) 2021-08-13

Family

ID=77184204

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110570574.2A Pending CN113254443A (en) 2021-05-19 2021-05-19 Big data partition storage device and method

Country Status (1)

Country Link
CN (1) CN113254443A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115248578A (en) * 2022-09-22 2022-10-28 南京旭上数控技术有限公司 Industrial equipment data acquisition method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination