CN105677826A - Resource management method for massive unstructured data - Google Patents

Resource management method for massive unstructured data

Info

Publication number
CN105677826A
Authority
CN
China
Prior art keywords
data
metadata
label
unstructured data
tables
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610003635.6A
Other languages
Chinese (zh)
Inventor
张善海
熊贵喜
蔡朝辉
杜博文
凌萍
谢志普
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BOCOM SMART NETWORK TECHNOLOGIES Inc
Original Assignee
BOCOM SMART NETWORK TECHNOLOGIES Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BOCOM SMART NETWORK TECHNOLOGIES Inc
Priority to CN201610003635.6A
Publication of CN105677826A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G06F16/134 Distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a resource management method for massive unstructured data. The method includes the following steps: a, the storage mode of the unstructured data is determined according to file size, and the file is stored on HDFS or in HBase; b, metadata information of the data is stored in HBase, and index tables of the metadata are built from the themes, tags and other information of the metadata to increase query speed; c, when metadata is queried, the metadata index tables are searched by the theme or tag sought, so that the data table is located quickly; d, when unstructured data records are queried, the data index table corresponding to the data table is found according to the naming rule of data index tables, the semantic tags of the data are looked up in the data index table to obtain the primary keys of the records sought, and the data is located quickly in the data table by those primary keys. With this method, massive unstructured data can be organized and managed effectively and queried quickly and efficiently.

Description

A resource management method for massive unstructured data
Technical field
The present invention relates to the field of the distributed database HBase and the distributed file system HDFS, and in particular to a resource management method for massive unstructured data.
Background
HDFS, in full the Hadoop Distributed File System, is the flagship file system of Hadoop. Its design derives from the Google File System (GFS). It suits a write-once, read-many access pattern, which meets the needs of urban multi-source data scenarios. It is a distributed file system suited to storing large files and can serve as a data source for Hadoop and Spark.
HBase is an open-source distributed database modeled on Google Bigtable. It is not a traditional relational database; its original goal was to overcome the theoretical and practical shortcomings of traditional relational databases when processing massive data. Because HBase stores its underlying data on HDFS, it is likewise highly fault tolerant. The main features of HBase are:
1) High scalability. HBase scales linearly in storage capacity. When the data volume reaches a certain threshold, HBase splits the data horizontally and distributes the resulting blocks across the servers of the cluster, which may number in the thousands. When the data approaches the limit of the cluster, HBase also supports enlarging the cluster, so capacity can be expanded dynamically and seamlessly without downtime.
2) High performance. HBase was designed to serve highly concurrent queries over massive data. Two mechanisms support efficient concurrent queries. The first is data partitioning: HBase spreads data across the nodes of the cluster, so when a user queries data, the nodes can return their respective data blocks at the same time. The second is caching: HBase has an efficient caching mechanism, in particular a MemStore unit that acts as a cache when data is read and written, which can significantly raise the hit rate of data access.
3) High availability. HBase stores its data on HDFS, which is itself highly fault tolerant. When the data on a machine is lost, HBase can locate the backup of that data through HDFS, replicate it again, and update the system log table. This ensures the high availability of the HBase system.
As urban data volumes grow ever larger and the kinds of unstructured data multiply, how to store and manage massive unstructured data in a distributed system has become a research direction.
Summary of the invention
To address the technical problem that urban unstructured data is growing ever larger and ever more time-consuming to process, the present invention proposes a resource management method for massive unstructured data, with which massive unstructured data can be organized and managed effectively.
A resource management method for massive unstructured data comprises the following steps:
Step a: determine the storage mode according to the size of the unstructured data file. When the file size exceeds a given threshold, store the file in the HDFS file system, and store its basic information and its path on HDFS in a data table created on HBase. When the file size is less than or equal to the given threshold, serialize the file and store it directly in the HBase database;
Step b: build a metadata table and a data index table from the unstructured data, and build a metadata index table from the metadata table;
Step c: when querying metadata, search the metadata index table by the theme or tag of the metadata sought, to obtain the corresponding data tables; and
Step d: when querying unstructured data records, find the data index table corresponding to the data table according to the naming rule of data index tables, look up the semantic tag of the unstructured data record in that data index table to obtain the primary keys of the data records sought, and then quickly locate the data in the data table by those primary keys.
With the resource management method according to the present invention, massive unstructured data can be organized and managed effectively and queried quickly and efficiently, which greatly improves data processing efficiency.
Brief description of the drawings
Fig. 1 is a flow chart of data processing according to the method of the invention.
Fig. 2 is an example of unified storage of unstructured data.
Fig. 3 shows the hierarchy of the tables in the database.
Fig. 4 is a schematic diagram of data resource queries.
Detailed description
The present invention is described in detail below with reference to the drawings. The following embodiments do not limit the invention. Changes and advantages that those skilled in the art can conceive without departing from the spirit and scope of the inventive concept are all included in the invention.
Fig. 1 is a flow chart of data processing according to the method of the invention. It comprises three parts: the data storage process; the creation of the metadata table, the data index table and the metadata index table; and the processing of data requests.
The data storage process is described in detail first. It comprises creating a data table on HBase and storing the raw data on HBase or HDFS as required. As shown in Fig. 1, the steps are as follows:
Step a1: create a data table on HBase for the data to be uploaded. The data table mainly holds basic information about the unstructured data, such as table name, file size, access mode, content and semantic tag.
Step a2: the user selects the file to upload and calls the upload interface to transfer the file.
Step a3: determine the size of the data file.
Step a4: if the file is larger than 1MB, store it on HDFS and store its basic information and its HDFS path in the HBase data table; otherwise go to step a5.
Step a5: serialize the file and store it directly in the HBase database.
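Steps a3 to a5 can be sketched as follows. This is only an illustration under stated assumptions: two plain Python dictionaries stand in for the HBase data table and for HDFS, the function name `store_object` and the `/unstructured/` path layout are hypothetical, and the column names follow those the description gives for Fig. 2.

```python
# Hypothetical in-memory stand-ins: plain dicts instead of real HBase/HDFS clients.
THRESHOLD_BYTES = 1 * 1024 * 1024  # the 1MB cut-off named in step a4


def store_object(name, payload, fmt, tags, data_table, hdfs_files):
    """Steps a3-a5: route one file by size and record it in the data table."""
    row = {"Name": name, "Size": len(payload), "Format": fmt, "Tags": tags}
    if len(payload) > THRESHOLD_BYTES:
        # Step a4: a large file goes to HDFS; the table keeps only its path.
        path = "/unstructured/" + name  # hypothetical path layout
        hdfs_files[path] = payload
        row["Access"] = "hdfs"
        row["Content"] = path
    else:
        # Step a5: a small file is kept inline as bytes.
        row["Access"] = "hbase"
        row["Content"] = payload
    data_table[name] = row  # the file name serves as the primary key
    return row
```

A file of exactly 1MB takes the inline branch, matching the "less than or equal to" wording of step a.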
HBase is a distributed database system that is highly reliable, high-performance, column-oriented, scalable and capable of real-time reads and writes, and it can meet the basic storage requirements of urban unstructured data. However, because of the way HBase is designed, its performance suffers when large data objects are stored in it directly. For example, when an HBase region grows to a certain size (256MB by default), it is split automatically, and all writes to that region are blocked in the meantime. In addition, heavy writing to the same region causes repeated flush operations, which frequently trigger compaction and consume the I/O of the cluster.
One solution to these problems is to keep the unstructured data object on HDFS and write its file path into a specific column family of HBase; the object is then accessed by reading the corresponding file on HDFS. This avoids the performance problems that large data objects cause in HBase. However, if the scheme is applied regardless of object size, HDFS ends up holding too many small files, and too many small files degrade HDFS performance. The reasonable approach is therefore to choose the storage mode according to the size of the unstructured data object. On this basis, the present invention uses 1MB as the dividing line between the two modes. When an unstructured data object is less than or equal to 1MB, it is serialized into a byte array and stored in a specific column family of HBase. Otherwise, it is stored on HDFS as a file, and the path of the file is stored in a specific column family of HBase. HBase also keeps a dedicated column that identifies the storage mode of the data.
Fig. 2 is an example of unified storage of unstructured data. Suppose two unstructured data objects need to be stored: a is a file in txt format and b is a picture in png format. Object a is small, 3KB, so its content can be converted directly into a byte array and stored in an HBase column. Object b is larger, 5MB, so it is stored on HDFS and its path is written into HBase. An identifier column indicates the storage mode of each data object.
As shown in Fig. 2, HBase stores unstructured data objects in one column family (ColumnFamily) comprising six columns: Name, Size, Format, Access, Content and Tags, which hold respectively the file name, size, file type, storage mode, content and semantic tags of the unstructured data object. When Access indicates storage directly in HBase, Content holds the byte array of the unstructured data. Otherwise, Content holds the path of the unstructured data on HDFS.
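On the read side, the storage-mode column decides how the Content column is interpreted. The sketch below assumes rows are plain dictionaries and that HDFS is a dictionary from path to bytes; the function name `load_content`, the marker values "hbase" and "hdfs", the sample path and the tag values are all hypothetical, and the payloads are shortened for illustration.

```python
def load_content(row, hdfs_files):
    """Return the raw bytes of one data-table row, following its Access column."""
    if row["Access"] == "hbase":
        return row["Content"]            # bytes stored inline in the table
    return hdfs_files[row["Content"]]    # Content holds a path on HDFS


# The two objects of Fig. 2 (a: 3KB txt, b: 5MB png), with shortened payloads.
hdfs_files = {"/data/b.png": b"\x89PNG..."}
row_a = {"Name": "a.txt", "Size": 3 * 1024, "Format": "txt",
         "Access": "hbase", "Content": b"hello", "Tags": "notes"}
row_b = {"Name": "b.png", "Size": 5 * 1024 * 1024, "Format": "png",
         "Access": "hdfs", "Content": "/data/b.png", "Tags": "image"}
```

Callers then read both kinds of object through the same function, without needing to know where the bytes live.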
After the data is stored, the data index table, the metadata table and the metadata index table are built, as described below with reference to Fig. 1 and Fig. 3.
First, the data index table and the metadata table are built (step b1).
A data index table is created on the HBase database for all unstructured data. Its row key is the semantic tag field of the data information, and its content is the primary keys of all unstructured data records related to that semantic tag. Multiple primary keys are separated by "#", in the format <"semantic tag", "primary key 1#primary key 2#primary key 3#...">. The semantic tag field is a description of each unstructured data record, and the primary key of an unstructured data record is its file name (filename).
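The data index table is in effect an inverted index from semantic tag to record keys. A minimal sketch, assuming the data table is a dictionary keyed by file name and that each row lists its semantic tags under "Tags" (the function name `build_data_index` and the sample rows are hypothetical):

```python
def build_data_index(data_table):
    """Map each semantic tag to the '#'-joined primary keys of the records carrying it."""
    keys_by_tag = {}
    for primary_key, row in data_table.items():
        for tag in row["Tags"]:
            keys_by_tag.setdefault(tag, []).append(primary_key)
    return {tag: "#".join(keys) for tag, keys in keys_by_tag.items()}


# Hypothetical records; the primary key is the file name.
data_table = {
    "cam01.mp4": {"Tags": ["intersection", "night"]},
    "cam02.mp4": {"Tags": ["intersection"]},
}
data_index = build_data_index(data_table)
```

Here `data_index["intersection"]` holds both file names joined by "#", in the same shape as the format given above.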
Further, a metadata table is created on the HBase database for all unstructured data. The metadata table holds the metadata information of the unstructured data, each row corresponding to one unstructured data resource. Metadata information describes the attributes of unstructured data and supports functions such as indicating storage location, resource lookup and file recording. It includes the fields table name (name), theme (subject), tag (tages) and file format (format), as shown in Fig. 3. The table name is the row key (rowkey) of the metadata table, so that lookups can be made by table name.
The theme field classifies urban data into broad categories, including traffic, environment and people's livelihood.
The tag field further subdivides the data under each urban theme; for example, the traffic theme includes data such as traffic surveillance video and checkpoint images.
When a user uploads data, the system updates the metadata table.
Because clients query by keyword, the present invention creates a metadata index table from the metadata information in the metadata table to speed up queries (step b2). The row key of the metadata index table is the theme or tag field of the metadata information, and its content is the table names of all data tables related to that theme or tag. Each row key thus corresponds to a list of the row keys of the metadata entries for the data resources that carry the queried tag. In the content, multiple table names are separated by "#", in the format <"theme or tag", "table name 1#table name 2#table name 3#...">, as shown in Fig. 3.
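Step b2 can be sketched the same way: every theme and every tag becomes a row key whose value is the "#"-joined list of table names. The sketch assumes the metadata table is a dictionary keyed by table name with the field names given in the description (`subject`, `tages`, `format`); the function name `build_metadata_index` and the sample rows are hypothetical.

```python
def build_metadata_index(metadata_table):
    """Map each theme or tag to the '#'-joined names of the data tables that carry it."""
    names_by_key = {}
    for table_name, meta in metadata_table.items():
        for key in [meta["subject"]] + list(meta["tages"]):
            names = names_by_key.setdefault(key, [])
            if table_name not in names:   # avoid listing one table twice
                names.append(table_name)
    return {key: "#".join(names) for key, names in names_by_key.items()}


# Hypothetical metadata rows; the row key is the table name.
metadata_table = {
    "traffic_video": {"subject": "traffic", "tages": ["surveillance video"], "format": "mp4"},
    "checkpoint_images": {"subject": "traffic", "tages": ["checkpoint image"], "format": "png"},
}
metadata_index = build_metadata_index(metadata_table)
```

A keyword lookup then costs one row-key read on the index instead of a scan over the whole metadata table.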
In the data tables and index tables shown in Fig. 3, the fields in bold boxes are row keys.
Finally, the processing of data requests is described with reference to Fig. 1 and Fig. 4.
When a user queries metadata by theme or tag, the metadata index table is searched by the tag or theme sought, in the following steps:
Step c1: determine the theme or tag of the data sought;
Step c2: using the theme or tag from the previous step as the row key, search the metadata index table in HBase;
Step c3: determine whether the requested theme or tag exists in the metadata index table; if it does, return the table names of all data tables corresponding to it;
Step c4: using each table name from the previous step as the row key, find the corresponding metadata in the metadata table and return it. The returned information includes table name, city, theme, tag and file format; if nothing is found, an empty result is returned.
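Steps c2 to c4 amount to two row-key lookups. A minimal sketch, assuming both tables are plain dictionaries; the function name `query_metadata` and the sample rows are hypothetical, and the sample rows omit the city field for brevity.

```python
def query_metadata(theme_or_tag, metadata_index, metadata_table):
    """Steps c2-c4: resolve a theme or tag to the metadata of its data tables."""
    joined = metadata_index.get(theme_or_tag)    # step c2: row-key lookup in the index
    if joined is None:                           # step c3: theme or tag not present
        return []                                # empty result
    results = []
    for table_name in joined.split("#"):
        meta = metadata_table.get(table_name)    # step c4: table name is the row key
        if meta is not None:
            results.append(dict(meta, name=table_name))
    return results


# Hypothetical tables in the formats described above.
metadata_table = {
    "traffic_video": {"subject": "traffic", "tages": ["surveillance video"], "format": "mp4"},
    "checkpoint_images": {"subject": "traffic", "tages": ["checkpoint image"], "format": "png"},
}
metadata_index = {
    "traffic": "traffic_video#checkpoint_images",
    "surveillance video": "traffic_video",
    "checkpoint image": "checkpoint_images",
}
found = query_metadata("traffic", metadata_index, metadata_table)
```

The table names in the result feed directly into the record query that follows.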
After the table names of all data tables containing the unstructured data records sought have been obtained, the records themselves are queried in the following steps:
Step d1: using the table name of the data table obtained in step c4, find the data index table for the unstructured data records. The name of a data index table is the table name of the data table joined with "Index", as in "table name_Index", so the data index table can be found quickly.
Step d2: search the data index table by semantic tag to obtain the primary keys of all data records corresponding to that tag, that is, the file names of the unstructured data. Here the semantic tag is supplied by the user: the semantic information the user wants the data to contain determines the tag.
Step d3: look up the obtained primary keys in the data table and return the relevant unstructured data records.
Through the above steps, the unstructured data records sought can be located quickly in the data table.
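Steps d1 to d3 can be sketched as below. The sketch assumes a dictionary `tables` that maps each table name to its contents, and an underscore join for the index name as in the "table name_Index" example; the function name `query_records` and the sample rows are hypothetical.

```python
def query_records(table_name, semantic_tag, tables):
    """Steps d1-d3: find records in one data table by semantic tag."""
    index_table = tables.get(table_name + "_Index")      # step d1: derive the index name
    if index_table is None:
        return []
    joined = index_table.get(semantic_tag, "")           # step d2: tag -> "pk1#pk2#..."
    primary_keys = [pk for pk in joined.split("#") if pk]
    data_table = tables.get(table_name, {})
    # Step d3: fetch each record by its primary key (the file name).
    return [data_table[pk] for pk in primary_keys if pk in data_table]


# Hypothetical data table and its index table.
tables = {
    "traffic_video": {
        "cam01.mp4": {"Name": "cam01.mp4", "Access": "hdfs", "Content": "/video/cam01.mp4"},
        "cam02.mp4": {"Name": "cam02.mp4", "Access": "hdfs", "Content": "/video/cam02.mp4"},
    },
    "traffic_video_Index": {"intersection": "cam01.mp4#cam02.mp4", "night": "cam01.mp4"},
}
records = query_records("traffic_video", "night", tables)
```

Because the index name is derived from the data table name, no extra catalog lookup is needed to locate the index.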
Obviously, those of ordinary skill in the art will appreciate that the above embodiments are intended only to illustrate the present invention and not to limit it; any change or modification of the embodiments described above that stays within the spirit of the invention falls within the scope of the claims.

Claims (10)

1. A resource management method for massive unstructured data, characterized by comprising the following steps:
Step a: determine the storage mode according to the size of the unstructured data file; when the file size exceeds a given threshold, store the file in the HDFS file system, and store its basic information and its path on HDFS in a data table created on HBase; when the file size is less than or equal to the given threshold, serialize the file and store it directly in the HBase database;
Step b: build a metadata table and a data index table from the unstructured data, and build a metadata index table from the metadata table;
Step c: when querying metadata, search the metadata index table by the theme or tag of the metadata sought, to obtain the corresponding data tables; and
Step d: when querying unstructured data records, find the data index table corresponding to the data table according to the naming rule of data index tables, look up the semantic tag of the unstructured data record in that data index table to obtain the primary keys of the data records sought, and then quickly locate the data in the data table by those primary keys.
2. The method according to claim 1, characterized in that in step a the given threshold is 1MB, and step a further comprises the following steps:
Step a1: create on HBase a data table for the data to be uploaded;
Step a2: select the data file to upload;
Step a3: determine the size of the data file;
Step a4: if the file is larger than 1MB, store it on HDFS and store its basic information and its HDFS path in the HBase table; otherwise go to step a5; and
Step a5: serialize the file and store it directly in the HBase database.
3. The method according to claim 2, characterized in that step b further comprises:
Step b1: create a metadata table and a data index table on the HBase database for all unstructured data, the metadata table comprising the metadata information of the unstructured data;
Step b2: create a metadata index table from the metadata information in the metadata table.
4. The method according to claim 3, characterized in that the metadata information describes the attributes of the unstructured data and supports functions including indicating storage location, resource lookup and file recording; the metadata information comprises the following fields: table name, theme, tag and file format; and the table name is the row key of the metadata table, for lookup by table name.
5. The method according to claim 4, characterized in that the theme field classifies urban data into broad categories, including traffic, environment and people's livelihood.
6. The method according to claim 5, characterized in that the tag field further subdivides the data under each urban theme, the traffic theme including traffic surveillance video and checkpoint image data.
7. The method according to claim 6, characterized in that the row key of the data index table is the semantic tag of the unstructured data records, and its content is the primary keys of all unstructured data records related to that semantic tag; multiple primary keys are separated by "#", in the format <"semantic tag", "primary key 1#primary key 2#primary key 3#...">; the semantic tag field is a description of each unstructured data record, and the primary key of an unstructured data record is its file name.
8. The method according to claim 7, characterized in that the row key of the metadata index table is the theme or tag field of the metadata information, and its content is the table names of all data tables related to that theme or tag; multiple table names are separated by "#", in the format <"theme or tag", "table name 1#table name 2#table name 3#...">.
9. The method according to claim 8, characterized in that in step c, when metadata is queried, the metadata index table is searched by the tag or theme sought, comprising the following steps:
Step c1: determine the theme or tag of the data sought;
Step c2: using the theme or tag from the previous step as the row key, search the metadata index table in HBase;
Step c3: determine whether the requested theme or tag exists in the metadata index table; if it does, return the table names of all data tables corresponding to it; and
Step c4: using each table name from the previous step as the row key, find the corresponding metadata in the metadata table and return it, the returned information including table name, city, theme, tag and file format; if nothing is found, return an empty result.
10. The method according to claim 9, characterized in that in step d, querying unstructured data records further comprises the following steps:
Step d1: using the table name of the data table obtained in step c4, find the data index table for the unstructured data records, the name of the data index table being the table name of the data table joined with "Index";
Step d2: search the data index table by semantic tag to obtain the primary keys of all data records corresponding to that tag, that is, the file names of the unstructured data; and
Step d3: look up the obtained primary keys in the data table and return the unstructured data records.
CN201610003635.6A 2016-01-04 2016-01-04 Resource management method for massive unstructured data Pending CN105677826A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610003635.6A CN105677826A (en) 2016-01-04 2016-01-04 Resource management method for massive unstructured data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610003635.6A CN105677826A (en) 2016-01-04 2016-01-04 Resource management method for massive unstructured data

Publications (1)

Publication Number Publication Date
CN105677826A true CN105677826A (en) 2016-06-15

Family

ID=56190357

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610003635.6A Pending CN105677826A (en) 2016-01-04 2016-01-04 Resource management method for massive unstructured data

Country Status (1)

Country Link
CN (1) CN105677826A (en)

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106297291A (en) * 2016-08-29 2017-01-04 苏州金螳螂怡和科技有限公司 Urban expressway traffic information acquisition system
CN106407355A (en) * 2016-09-07 2017-02-15 中国农业银行股份有限公司 Data storage method and device
CN106506661A (en) * 2016-11-18 2017-03-15 浪潮软件集团有限公司 Method, server and system for dynamically returning data
CN106844236A (en) * 2016-12-27 2017-06-13 北京五八信息技术有限公司 The date storage method and device of terminal device
CN107169083A (en) * 2017-05-11 2017-09-15 聚龙融创科技有限公司 Public security bayonet socket magnanimity vehicle data storage and retrieval method and device, electronic equipment
CN107291889A (en) * 2017-06-20 2017-10-24 郑州云海信息技术有限公司 A kind of date storage method and system
CN107391765A (en) * 2017-09-01 2017-11-24 云南电网有限责任公司电力科学研究院 A kind of power network natural calamity data warehouse model implementation method
CN108241724A (en) * 2017-05-11 2018-07-03 新华三大数据技术有限公司 A kind of metadata management method and device
CN108248641A (en) * 2017-12-06 2018-07-06 中国铁道科学研究院电子计算技术研究所 A kind of urban track traffic data processing method and device
CN108268614A (en) * 2017-12-29 2018-07-10 郑州轻工业学院 A kind of distribution management method of forest reserves spatial data
CN108268517A (en) * 2016-12-30 2018-07-10 希姆通信息技术(上海)有限公司 The management method and system of label in database
CN108470040A (en) * 2018-02-11 2018-08-31 中国石油天然气股份有限公司 Method and device for warehousing unstructured data
CN108595589A (en) * 2018-04-19 2018-09-28 中国科学院电子学研究所苏州研究院 A kind of efficient access method of magnanimity science data picture
CN108647290A (en) * 2018-05-06 2018-10-12 深圳市保千里电子有限公司 Internet cell phone cloud photograph album backup querying method based on HBase and system
CN108897859A (en) * 2018-06-29 2018-11-27 郑州云海信息技术有限公司 A kind of metadata retrieval method, apparatus, equipment and computer readable storage medium
CN109446296A (en) * 2018-09-10 2019-03-08 上海勋立信息科技有限公司 A kind of magnanimity unstructured data treating method and apparatus
CN109582643A (en) * 2018-11-20 2019-04-05 中国石油大学(华东) A kind of real-time dynamic data management system based on HBase
CN109669925A (en) * 2018-11-21 2019-04-23 北京市天元网络技术股份有限公司 The management method and device of unstructured data
WO2019116167A1 (en) * 2017-12-12 2019-06-20 International Business Machines Corporation Storing unstructured data in a structured framework
CN110109890A (en) * 2019-05-10 2019-08-09 京东方科技集团股份有限公司 Unstructured data processing method and unstructured data processing system
CN110555021A (en) * 2018-03-26 2019-12-10 深圳先进技术研究院 Data storage method, query method and related device
CN110633281A (en) * 2019-09-12 2019-12-31 北京百度网讯科技有限公司 Method and device for processing multi-type data sources
CN111190949A (en) * 2018-11-15 2020-05-22 杭州海康威视数字技术股份有限公司 Data storage and processing method, device, equipment and medium
CN111367857A (en) * 2020-03-03 2020-07-03 中国联合网络通信集团有限公司 Data storage method and device, FTP server and storage medium
CN111459945A (en) * 2020-04-07 2020-07-28 中科曙光(南京)计算技术有限公司 Hierarchical index query method based on HBase
WO2020192663A1 (en) * 2019-03-26 2020-10-01 华为技术有限公司 Data management method and related device
CN111881332A (en) * 2020-06-17 2020-11-03 武汉光庭信息技术股份有限公司 Automatic driving simulation data management server and method
CN112003956A (en) * 2020-10-27 2020-11-27 武汉中科通达高新技术股份有限公司 Traffic management system
CN112148938A (en) * 2020-10-16 2020-12-29 成都中科大旗软件股份有限公司 Cross-domain heterogeneous data retrieval system and retrieval method
CN113220945A (en) * 2021-04-28 2021-08-06 广州宸祺出行科技有限公司 Method and system for field retrieval and path display of data blood margin
US20220083589A1 (en) * 2020-09-14 2022-03-17 Olympus Corporation Information processing apparatus, information processing system, information processing method, metadata creation method, recording control method, and non-transitory computer-readable recording medium recording information processing program
CN117349401A (en) * 2023-12-06 2024-01-05 之江实验室 Metadata storage method, device, medium and equipment for unstructured data

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103164534A (en) * 2013-04-11 2013-06-19 苏州阔地网络科技有限公司 Method and system of data search based on cloud education platform
CN103198129A (en) * 2013-04-11 2013-07-10 苏州阔地网络科技有限公司 Data search realizing method and data search realizing system based on cloud education platform
CN103246700A (en) * 2013-04-01 2013-08-14 厦门市美亚柏科信息股份有限公司 Mass small file low latency storage method based on HBase
CN103647850A (en) * 2013-12-25 2014-03-19 北京京东尚科信息技术有限公司 Data processing method, device and system of distributed version control system

Cited By (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106297291A (en) * 2016-08-29 2017-01-04 苏州金螳螂怡和科技有限公司 Urban expressway traffic information acquisition system
CN106407355A (en) * 2016-09-07 2017-02-15 中国农业银行股份有限公司 Data storage method and device
CN106506661A (en) * 2016-11-18 2017-03-15 浪潮软件集团有限公司 Method, server and system for dynamically returning data
CN106844236A (en) * 2016-12-27 2017-06-13 北京五八信息技术有限公司 The date storage method and device of terminal device
CN108268517A (en) * 2016-12-30 2018-07-10 希姆通信息技术(上海)有限公司 The management method and system of label in database
CN107169083A (en) * 2017-05-11 2017-09-15 聚龙融创科技有限公司 Public security bayonet socket magnanimity vehicle data storage and retrieval method and device, electronic equipment
CN107169083B (en) * 2017-05-11 2020-03-31 聚龙融创科技有限公司 Mass vehicle data storage and retrieval method and device for public security card port and electronic equipment
CN108241724A (en) * 2017-05-11 2018-07-03 新华三大数据技术有限公司 Metadata management method and device
WO2018205981A1 (en) * 2017-05-11 2018-11-15 新华三大数据技术有限公司 Metadata management
CN107291889A (en) * 2017-06-20 2017-10-24 郑州云海信息技术有限公司 Data storage method and system
CN107391765A (en) * 2017-09-01 2017-11-24 云南电网有限责任公司电力科学研究院 Implementation method for a power grid natural disaster data warehouse model
CN108248641A (en) * 2017-12-06 2018-07-06 中国铁道科学研究院电子计算技术研究所 Urban rail transit data processing method and device
WO2019116167A1 (en) * 2017-12-12 2019-06-20 International Business Machines Corporation Storing unstructured data in a structured framework
GB2582234A (en) * 2017-12-12 2020-09-16 Ibm Storing unstructured data in a structured framework
CN108268614A (en) * 2017-12-29 2018-07-10 郑州轻工业学院 Distributed management method for forest resource spatial data
CN108268614B (en) * 2017-12-29 2020-08-18 郑州轻工业学院 Distributed management method for forest resource spatial data
CN108470040B (en) * 2018-02-11 2021-03-09 中国石油天然气股份有限公司 Method and device for warehousing unstructured data
CN108470040A (en) * 2018-02-11 2018-08-31 中国石油天然气股份有限公司 Method and device for warehousing unstructured data
CN110555021B (en) * 2018-03-26 2023-09-19 深圳先进技术研究院 Data storage method, query method and related device
CN110555021A (en) * 2018-03-26 2019-12-10 深圳先进技术研究院 Data storage method, query method and related device
CN108595589A (en) * 2018-04-19 2018-09-28 中国科学院电子学研究所苏州研究院 Efficient access method for massive scientific data images
CN108647290A (en) * 2018-05-06 2018-10-12 深圳市保千里电子有限公司 HBase-based backup and query method and system for Internet mobile phone cloud photo albums
CN108897859A (en) * 2018-06-29 2018-11-27 郑州云海信息技术有限公司 Metadata retrieval method, apparatus, device, and computer-readable storage medium
CN109446296A (en) * 2018-09-10 2019-03-08 上海勋立信息科技有限公司 Massive unstructured data processing method and apparatus
CN111190949A (en) * 2018-11-15 2020-05-22 杭州海康威视数字技术股份有限公司 Data storage and processing method, device, equipment and medium
CN111190949B (en) * 2018-11-15 2023-09-26 杭州海康威视数字技术股份有限公司 Data storage and processing method, device, equipment and medium
CN109582643A (en) * 2018-11-20 2019-04-05 中国石油大学(华东) Real-time dynamic data management system based on HBase
CN109669925A (en) * 2018-11-21 2019-04-23 北京市天元网络技术股份有限公司 Management method and device for unstructured data
CN111753141B (en) * 2019-03-26 2024-06-11 华为技术有限公司 Data management method and related equipment
WO2020192663A1 (en) * 2019-03-26 2020-10-01 华为技术有限公司 Data management method and related device
CN111753141A (en) * 2019-03-26 2020-10-09 华为技术有限公司 Data management method and related equipment
CN110109890A (en) * 2019-05-10 2019-08-09 京东方科技集团股份有限公司 Unstructured data processing method and unstructured data processing system
WO2020228452A1 (en) * 2019-05-10 2020-11-19 京东方科技集团股份有限公司 Unstructured data processing method and unstructured data processing system
CN110633281A (en) * 2019-09-12 2019-12-31 北京百度网讯科技有限公司 Method and device for processing multi-type data sources
CN111367857A (en) * 2020-03-03 2020-07-03 中国联合网络通信集团有限公司 Data storage method and device, FTP server and storage medium
CN111459945A (en) * 2020-04-07 2020-07-28 中科曙光(南京)计算技术有限公司 Hierarchical index query method based on HBase
CN111459945B (en) * 2020-04-07 2023-11-10 中科曙光(南京)计算技术有限公司 Hierarchical index query method based on HBase
CN111881332A (en) * 2020-06-17 2020-11-03 武汉光庭信息技术股份有限公司 Automatic driving simulation data management server and method
US20220083589A1 (en) * 2020-09-14 2022-03-17 Olympus Corporation Information processing apparatus, information processing system, information processing method, metadata creation method, recording control method, and non-transitory computer-readable recording medium recording information processing program
CN112148938A (en) * 2020-10-16 2020-12-29 成都中科大旗软件股份有限公司 Cross-domain heterogeneous data retrieval system and retrieval method
CN112003956B (en) * 2020-10-27 2021-01-15 武汉中科通达高新技术股份有限公司 Traffic management system
CN112003956A (en) * 2020-10-27 2020-11-27 武汉中科通达高新技术股份有限公司 Traffic management system
CN113220945A (en) * 2021-04-28 2021-08-06 广州宸祺出行科技有限公司 Method and system for field retrieval and path display of data lineage
CN113220945B (en) * 2021-04-28 2024-05-31 广州宸祺出行科技有限公司 Method and system for field retrieval and path display of data lineage
CN117349401A (en) * 2023-12-06 2024-01-05 之江实验室 Metadata storage method, device, medium and equipment for unstructured data
CN117349401B (en) * 2023-12-06 2024-03-15 之江实验室 Metadata storage method, device, medium and equipment for unstructured data

Similar Documents

Publication Publication Date Title
CN105677826A (en) Resource management method for massive unstructured data
CN110825748B (en) High-performance and easily-expandable key value storage method by utilizing differentiated indexing mechanism
US9710535B2 (en) Object storage system with local transaction logs, a distributed namespace, and optimized support for user directories
CN107423422B (en) Spatial data distributed storage and search method and system based on grid
CN103544261B (en) Global index management method and device for massive structured log data
JP5996088B2 (en) Cryptographic hash database
US8495036B2 (en) Blob manipulation in an integrated structured storage system
CN104346357B (en) File access method and system for an embedded terminal
CN110119425A (en) Solid state drive, distributed data storage system, and method using key-value storage
CN103282899B (en) Data storage method, access method, and device in a file system
CN110347852B (en) File system embedded with transverse expansion key value storage system and file management method
CN105912687B (en) Massive distributed database storage unit
CN103067461B (en) File metadata management system and metadata management method
CN104850572A (en) HBase non-primary key index building and inquiring method and system
CN103595797B (en) Caching method for distributed storage system
CN102169507A (en) Distributed real-time search engine
CN107807787B (en) Distributed data storage method and system
CN104408111A (en) Method and device for deleting duplicate data
CN109284273B (en) Massive small file query method and system adopting suffix array index
CN104978330A (en) Data storage method and device
WO2020125630A1 (en) File reading
CN103942301B (en) Distributed file system oriented to access and application of multiple data types
CN109634911A (en) Storage method based on HDFS CD server
CN104516945A (en) Hadoop distributed file system metadata storage method based on relational data base
CN109213760B (en) High-load service storage and retrieval method for non-relational data storage

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 201209 Chuansha Road, Shanghai, No. 221, room 11, building 955

Applicant after: Bocom Intelligent Network Technology Co. Ltd.

Address before: 201209 Chuansha Road, Shanghai, No. 221, room 11, building 955

Applicant before: BOCOM Smart Network Technologies Inc.

COR Change of bibliographic data
RJ01 Rejection of invention patent application after publication

Application publication date: 20160615
