CN105677826A - Resource management method for massive unstructured data - Google Patents
- Publication number
- CN105677826A (application number CN201610003635.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- metadata
- label
- unstructured data
- tables
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
- G06F16/134—Distributed indices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/14—Details of searching files based on file metadata
- G06F16/148—File search processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Library & Information Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a resource management method for massive unstructured data. The method comprises the following steps: a, the storage mode of the massive unstructured data is determined according to file size, and the file is stored on HDFS or in HBase; b, the metadata information of the data is stored in HBase, and query speed is increased by building metadata index tables from the themes, tags and other information in the metadata; c, when metadata is queried, the metadata index tables are searched by the theme or tag of the metadata sought, and the data table is located quickly; d, when unstructured data records are queried, the data index table corresponding to the data table is found according to the naming rule of data index tables, the semantic tags of the data are looked up in the data index table to find the primary keys of the records sought, and the data is located quickly in the data table by those primary keys. With this resource management method, massive unstructured data can be organized and managed effectively and queried quickly and efficiently.
Description
Technical field
The present invention relates to the field of the distributed database HBase and the distributed file system HDFS, and in particular to a resource management method for massive unstructured data.
Background technology
The full name of HDFS is Hadoop Distributed File System; it is the flagship file system of Hadoop. Its design originates from the Google File System (GFS). It suits a write-once, read-many access pattern, which matches urban multi-source data application scenarios. It is a distributed file system suited to storing large files and can serve as a data source for Hadoop and Spark.
HBase is an open-source distributed database developed on the basis of Google's Bigtable. It is not a traditional relational database; its original goal was to address the theoretical and practical shortcomings of traditional relational databases when handling large-scale data. Because HBase stores its underlying data on HDFS, it is likewise highly fault tolerant. The main features of HBase are:
1) High scalability. In storage capacity, HBase scales linearly. When the data volume reaches a certain threshold, HBase splits the data horizontally and distributes the resulting blocks across the servers of the cluster, which may number in the thousands. When the data size reaches the limit of the cluster, HBase also supports expanding the cluster, achieving dynamic, seamless expansion without downtime.
2) High performance. HBase was designed to serve highly concurrent queries over massive data. Two mechanisms ensure efficient concurrent queries. The first is data partitioning: HBase distributes data across the nodes of the cluster, so when a user queries data, the nodes can return their data blocks simultaneously, giving concurrent queries. The second is caching: HBase has an efficient caching mechanism, in particular the MemStore unit, which serves as a cache for data reads and writes and can significantly raise the hit rate of data access.
3) High availability. HBase stores its data on HDFS, which is itself highly fault tolerant. When the data on a machine is lost, HBase can locate the backup of that data through HDFS, re-replicate it, and update the system log table. This ensures the high availability of the HBase system.
As the volume of urban data grows ever larger and the variety of unstructured data keeps increasing, how to store and manage massive unstructured data in a distributed system has become a research direction.
Summary of the invention
To address the technical problem that urban unstructured data is growing ever larger and its processing ever more time-consuming, the present invention proposes a resource management method for massive unstructured data, which can organize and manage such data effectively.
A resource management method for massive unstructured data comprises the following steps:
Step a: determine the storage mode according to the size of the unstructured data file. When the file size exceeds a given threshold, store the file in the HDFS file system, and store its basic information and its path on HDFS in a data table created on HBase; when the file size is less than or equal to the given threshold, serialize the file and store it directly in the HBase database;
Step b: build a metadata table and a data index table from the unstructured data, and use the metadata table to build a metadata index table;
Step c: when querying metadata, search the metadata index table by the theme or label of the metadata sought, to obtain the corresponding data table; and
Step d: when querying unstructured data records, find the data index table corresponding to the data table according to the naming rule of data index tables, then look up the semantic label of the unstructured data record in that data index table to obtain the primary keys of the records sought, and then quickly locate the data in the data table by those primary keys.
With the resource management method according to the present invention, massive unstructured data can be organized and managed effectively and queried quickly and efficiently, which greatly improves data-processing efficiency.
Brief description of the drawings
Fig. 1 is the data processing flow chart of the method according to the invention.
Fig. 2 is an example of unified storage of unstructured data.
Fig. 3 is the hierarchy diagram of the data tables in the database.
Fig. 4 is a schematic diagram of a data resource query.
Detailed description of the invention
The present invention is described in detail below with reference to the accompanying drawings. The following embodiments do not limit the invention. Variations and advantages that those skilled in the art can conceive of without departing from the spirit and scope of the inventive concept are all included in the invention.
Fig. 1 is the data processing flow chart of the method according to the invention. It comprises three parts: the data storage process; the creation of the metadata table, data index table and metadata index table; and the data request handling process.
The data storage process is described first. It consists of creating a data table on HBase and storing the raw data on HBase or HDFS as required. As shown in Fig. 1, the steps are as follows:
Step a1: first, create the corresponding data table on HBase for the data to be uploaded. The data table mainly holds basic information about the unstructured data, such as table name, file size, access mode, content and semantic labels.
Step a2: the user then selects the file to upload and calls the upload interface to transfer it.
Step a3: determine the size of the data file.
Step a4: if the file is larger than 1 MB, store it in HDFS and store its basic information and its path on HDFS in the HBase data table; otherwise go to step a5.
Step a5: serialize the file and store it directly in the HBase database.
HBase is a distributed database system that is highly reliable, high-performance, column-oriented, scalable and capable of real-time reads and writes, so it can meet the basic storage needs of urban unstructured data. However, because of HBase's own design, its performance suffers when it stores large data objects directly. For example, when an HBase region grows to a certain size (256 MB by default), it is split automatically, and all writes to that region are blocked during the split. In addition, heavy writes to the same region cause repeated flush operations, which in turn frequently trigger compaction and consume the cluster's I/O.
One solution to these problems is to store the unstructured data object on HDFS and write the file path into a designated column family of HBase; to access the object, the corresponding file on HDFS is read. Although this solves the HBase performance problem caused by large data objects, applying it indiscriminately, without checking the size of the object to be stored, leaves too many small files on HDFS, and too many small files degrade HDFS performance. A reasonable approach is therefore to choose the storage mode according to the size of the unstructured data object. Based on the above, the present invention uses 1 MB as the dividing line between the two modes. When an unstructured data object is less than or equal to 1 MB, it is serialized into a byte array and stored in the designated column family of HBase. Otherwise it is stored on HDFS as a file, and the file's path is stored in the designated column family of HBase. HBase also sets aside a dedicated column to identify the storage mode of the data.
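The size-based choice described above can be sketched in a few lines. This is a minimal illustration and not the patented implementation: the names `plan_storage` and `StoragePlan`, the flag values `"hbase"` and `"hdfs"`, the default directory `/data/unstructured`, and the reading of 1 MB as 1024 × 1024 bytes are all assumptions made for the example. No real HBase or HDFS client is called; the function only decides where the bytes would go.

```python
from dataclasses import dataclass

# Assumption: "1 MB" is read as 1024 * 1024 bytes; the text does not specify.
THRESHOLD_BYTES = 1024 * 1024


@dataclass
class StoragePlan:
    access: str     # hypothetical flag: "hbase" (inline) or "hdfs" (path only)
    content: bytes  # the file bytes, or the UTF-8 encoded HDFS path


def plan_storage(name: str, payload: bytes,
                 hdfs_dir: str = "/data/unstructured") -> StoragePlan:
    """Decide where an object goes, following the 1 MB dividing line."""
    if len(payload) <= THRESHOLD_BYTES:
        # Small object: the serialized bytes go straight into the HBase column.
        return StoragePlan(access="hbase", content=payload)
    # Large object: the file would be written to HDFS; HBase keeps only the path.
    path = f"{hdfs_dir}/{name}"
    return StoragePlan(access="hdfs", content=path.encode("utf-8"))
```

An object of exactly 1 MB is kept in HBase, matching the "less than or equal to" wording of the description.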
Fig. 2 shows an example of unified storage of unstructured data. Suppose two unstructured data objects need to be stored: a is a file in txt format and b is a picture in png format. a is small, 3 KB, so its content can be converted directly to a byte array and stored in a column family of HBase. b is larger, 5 MB, so it is stored on HDFS and its path is written to HBase. An identifier column indicates the storage mode of each data object.
As shown in Fig. 2, HBase stores unstructured data objects in one column family (ColumnFamily), which contains six columns: Name, Size, Format, Access, Content and Tags. These hold, respectively, the file name of the unstructured data object, its size, its file type, its storage mode, its content and its semantic labels. When Access indicates direct storage in HBase, Content holds the byte array of the unstructured data; otherwise it holds the path of the unstructured data on HDFS.
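As a rough illustration of this row layout, the sketch below builds one data-table row as a plain column-to-bytes mapping. The column names come from the description; everything else is an assumption for the example: the helper name `make_row`, the `"hbase"`/`"hdfs"` values of the Access column, deriving Format from the file extension, and joining several semantic labels with "#" inside the Tags column.

```python
from typing import Dict, List, Optional


def make_row(name: str, payload: bytes, tags: List[str],
             hdfs_path: Optional[str] = None) -> Dict[str, bytes]:
    """Build one data-table row; hdfs_path is given only for large objects."""
    # Assumption: the file type is taken from the extension of the file name.
    fmt = name.rsplit(".", 1)[-1] if "." in name else ""
    inline = hdfs_path is None
    return {
        "Name": name.encode("utf-8"),
        "Size": str(len(payload)).encode("utf-8"),
        "Format": fmt.encode("utf-8"),
        # Hypothetical flag values for the storage-mode column.
        "Access": b"hbase" if inline else b"hdfs",
        # Either the object bytes themselves or the HDFS path of the file.
        "Content": payload if inline else hdfs_path.encode("utf-8"),
        # Assumption: multiple labels are joined with "#".
        "Tags": "#".join(tags).encode("utf-8"),
    }
```

For the two objects in the Fig. 2 example, the 3 KB text file would be passed without `hdfs_path`, while the 5 MB picture would be passed with the path it was written to.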
After data storage is complete, the data index table, the metadata table and the metadata index table are built, as described below with reference to Fig. 1 and Fig. 3.
First, the data index table and the metadata table are built (step b1).
For all unstructured data, a data index table is created on the HBase database. The data index table uses the semantic label field of the data information as its row key, and its content is the primary keys of all unstructured data records related to that semantic label. Multiple primary keys are separated by "#", in the form <"semantic label", "primary key 1#primary key 2#primary key 3#...">. The semantic label field is a description of each unstructured data record, and the primary key of an unstructured data record is its file name (filename).
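The data index table is in effect an inverted index from semantic label to primary keys. The sketch below shows one way to build it in memory; it is illustrative only. The function name `build_data_index` and the input shape (file name mapped to a list of labels) are assumptions, and the keys are joined without a trailing "#", which is one possible reading of the format shown above.

```python
from collections import defaultdict
from typing import Dict, List


def build_data_index(records: Dict[str, List[str]]) -> Dict[str, str]:
    """Map each semantic label to a '#'-separated string of primary keys.

    records: primary key (file name) -> semantic labels of that record.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for primary_key, labels in records.items():
        for label in labels:
            grouped[label].append(primary_key)
    # Row key = semantic label, cell value = "key1#key2#key3".
    return {label: "#".join(keys) for label, keys in grouped.items()}
```

With this layout, finding every record for a label is a single row lookup followed by a split on "#".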
Likewise, for all unstructured data, a metadata table is created on the HBase database. The metadata table holds the metadata information of the unstructured data, with each row corresponding to one item of unstructured data. Metadata information describes the attributes of the unstructured data and supports functions including indicating storage location, resource lookup and file records. It includes the fields table name (name), theme (subject), label (tages) and file format (format), as shown in Fig. 3. The metadata table uses the table name as its row key (rowkey), so that lookups can be made by table name.
The theme field classifies urban data broadly, into themes such as traffic, environment and people's livelihood.
The label field subdivides the data of each urban theme; for example, the traffic theme includes data such as traffic surveillance video and checkpoint images.
When a user uploads data, the system updates the metadata table.
Because clients query by keyword, the present invention creates a metadata index table from the metadata information in the metadata table in order to speed up queries (step b2). The metadata index table uses the theme or label field of the metadata information as its row key, and its content is the table names of all data tables related to that theme or label. Each row key thus corresponds to a list: the row keys of the metadata entries of the data resources that carry the queried label. In the content, multiple table names are separated by "#", in the form <"theme or label", "table name 1#table name 2#table name 3#...">, as shown in Fig. 3.
In the data tables and index tables shown in Fig. 3, the fields in bold boxes are the row keys.
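The metadata index table can be derived from the metadata table in the same inverted-index fashion. The following sketch assumes an in-memory metadata table keyed by table name, with the field names `subject` and `tages` taken from the description; the function name `build_metadata_index`, the sample table names, and storing labels as a list are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def build_metadata_index(metadata_table: Dict[str, dict]) -> Dict[str, str]:
    """Map each theme or label to a '#'-separated string of table names."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for table_name, meta in metadata_table.items():
        # Both the theme and every label become row keys of the index.
        for key in [meta["subject"], *meta["tages"]]:
            if table_name not in grouped[key]:
                grouped[key].append(table_name)
    return {key: "#".join(names) for key, names in grouped.items()}


# Hypothetical metadata rows, keyed by table name (the row key).
SAMPLE_METADATA = {
    "traffic_video": {"subject": "traffic", "tages": ["surveillance video"],
                      "format": "mp4"},
    "traffic_checkpoint": {"subject": "traffic", "tages": ["checkpoint image"],
                           "format": "png"},
}
```

Calling `build_metadata_index(SAMPLE_METADATA)` yields one row for the theme "traffic" that lists both tables, and one row per label.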
Finally, the data request handling process is described with reference to Fig. 1 and Fig. 4.
When a user queries metadata by theme or label, the metadata index table is searched by the theme or label sought, in the following steps:
Step c1: determine the theme or label of the data sought;
Step c2: using the theme or label from the previous step as the row key, search the metadata index table in HBase;
Step c3: determine whether the requested theme or label exists; if it does, return the table names of all data tables in the metadata index table that correspond to it;
Step c4: using the table names obtained in the previous step as row keys, find the corresponding metadata in the metadata table and return it; the returned information includes table name, city, theme, label and file format. If nothing is found, return empty.
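Steps c1 to c4 amount to two keyed lookups. The sketch below walks through them against plain dictionaries standing in for the two HBase tables; the function name `query_metadata` and the shape of the returned records are assumptions, and an empty list stands in for "return empty".

```python
from typing import Dict, List


def query_metadata(keyword: str,
                   metadata_index: Dict[str, str],
                   metadata_table: Dict[str, dict]) -> List[dict]:
    """Return the metadata of every data table tagged with the keyword."""
    # c2: the theme or label is the row key of the metadata index table.
    cell = metadata_index.get(keyword)
    # c3: an unknown theme or label produces an empty result.
    if not cell:
        return []
    table_names = cell.split("#")
    # c4: each table name is a row key of the metadata table.
    return [dict(metadata_table[name], name=name)
            for name in table_names if name in metadata_table]
```

Because both tables are addressed by row key, neither lookup needs to scan the table.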
Once the table names of all data tables containing the unstructured data records sought have been obtained, the unstructured data records themselves are queried, in the following steps:
Step d1: using the table name of the data table obtained in step c4, find the data index table for the unstructured data records. The name of a data index table is the table name of its data table concatenated with "Index", for example "table name_Index", so the data index table can be found quickly.
Step d2: search the data index table by semantic label to obtain the primary keys of all data records corresponding to that semantic label, that is, the file names of the unstructured data. The semantic label is supplied by the user: it is determined by the semantic information the user wants the data to contain.
Step d3: use the primary keys obtained to search the data table, and return the matching unstructured data records.
These steps quickly locate the unstructured data records sought in the data table.
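Steps d1 to d3 can be sketched the same way. In this illustration a single dictionary named `tables` stands in for the HBase namespace, mapping each table name to its rows; that structure, the function name `query_records`, and the `_Index` suffix with an underscore (following the "table name_Index" example) are assumptions for the sketch.

```python
from typing import Dict, List


def query_records(table_name: str, semantic_label: str,
                  tables: Dict[str, dict]) -> List[dict]:
    """Return the records of table_name that carry semantic_label."""
    # d1: derive the index table name from the data table name.
    index_table = tables.get(f"{table_name}_Index", {})
    # d2: the semantic label is the row key; the cell lists primary keys.
    cell = index_table.get(semantic_label)
    if not cell:
        return []
    data_table = tables.get(table_name, {})
    # d3: fetch each record from the data table by its primary key.
    return [data_table[key] for key in cell.split("#") if key in data_table]
```

The naming rule means no extra catalogue lookup is needed to find the index table once the data table name is known.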
Obviously, those of ordinary skill in the art will recognize that the above embodiments serve only to illustrate the present invention and do not limit it; any change or modification to the embodiments described above that stays within the spirit of the invention falls within the scope of the claims of the invention.
Claims (10)
1. A resource management method for massive unstructured data, characterized by comprising the following steps:
Step a: determine the storage mode according to the size of the unstructured data file; when the file size exceeds a given threshold, store the file in the HDFS file system, and store its basic information and its path on HDFS in a data table created on HBase; when the file size is less than or equal to the given threshold, serialize the file and store it directly in the HBase database;
Step b: build a metadata table and a data index table from the unstructured data, and use the metadata table to build a metadata index table;
Step c: when querying metadata, search the metadata index table by the theme or label of the metadata sought, to obtain the corresponding data table; and
Step d: when querying unstructured data records, find the data index table corresponding to the data table according to the naming rule of data index tables, then look up the semantic label of the unstructured data record in that data index table to obtain the primary keys of the records sought, and then quickly locate the data in the data table by those primary keys.
2. The method according to claim 1, characterized in that in step a the given threshold is 1 MB, and step a further comprises the following steps:
Step a1: first create on HBase the data table for the data to be uploaded;
Step a2: select the data file to upload;
Step a3: determine the size of the data file;
Step a4: if the file is larger than 1 MB, store it in HDFS and store its basic information and its path on HDFS in the HBase table; otherwise go to step a5; and
Step a5: serialize the file and store it directly in the HBase database.
3. The method according to claim 2, characterized in that step b further comprises:
Step b1: for all unstructured data, create a metadata table and a data index table on the HBase database, the metadata table comprising the metadata information of the unstructured data;
Step b2: create a metadata index table from the metadata information in the metadata table.
4. The method according to claim 3, characterized in that the metadata information is information describing the attributes of the unstructured data and supports functions including indicating storage location, resource lookup and file records; the metadata information comprises the following fields: table name, theme, label and file format; and the metadata table uses the table name as its row key, for lookup by table name.
5. The method according to claim 4, characterized in that the theme field classifies urban data broadly, into themes including traffic, environment and people's livelihood.
6. The method according to claim 5, characterized in that the label field subdivides the data of each urban theme, the traffic theme comprising traffic surveillance video and checkpoint image data.
7. The method according to claim 6, characterized in that the data index table uses the semantic label of the unstructured data record as its row key and the primary keys of all unstructured data records related to that semantic label as its content; multiple primary keys are separated by "#", in the form <"semantic label", "primary key 1#primary key 2#primary key 3#...">; the semantic label field is a description of each unstructured data record, and the primary key of the unstructured data is its file name.
8. The method according to claim 7, characterized in that the metadata index table uses the theme or label field of the metadata information as its row key, and its content is the table names of all data tables related to that theme or label; multiple table names are separated by "#", in the form <"theme or label", "table name 1#table name 2#table name 3#...">.
9. The method according to claim 8, characterized in that in step c, when metadata is queried, the metadata index table is searched by the theme or label sought, comprising the following steps:
Step c1: determine the theme or label of the data sought;
Step c2: using the theme or label from the previous step as the row key, search the metadata index table in HBase;
Step c3: determine whether the requested theme or label exists; if it does, return the table names of all data tables in the metadata index table that correspond to it; and
Step c4: using the table names obtained in the previous step as row keys, find the corresponding metadata in the metadata table and return it, the returned information including table name, city, theme, label and file format; if nothing is found, return empty.
10. The method according to claim 9, characterized in that in step d, when unstructured data records are queried, the method further comprises the following steps:
Step d1: using the table name of the data table obtained in step c4, find the data index table for the unstructured data records, the name of the data index table being the table name of the data table concatenated with "Index";
Step d2: search the data index table by semantic label to obtain the primary keys of all data records corresponding to that semantic label, that is, the file names of the unstructured data; and
Step d3: use the primary keys obtained to search the data table and return the unstructured data records.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610003635.6A CN105677826A (en) | 2016-01-04 | 2016-01-04 | Resource management method for massive unstructured data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105677826A | 2016-06-15 |
Family
ID=56190357
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610003635.6A Pending CN105677826A (en) | 2016-01-04 | 2016-01-04 | Resource management method for massive unstructured data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105677826A (en) |
Cited By (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106297291A (en) * | 2016-08-29 | 2017-01-04 | 苏州金螳螂怡和科技有限公司 | Urban expressway traffic information acquisition system |
CN106407355A (en) * | 2016-09-07 | 2017-02-15 | 中国农业银行股份有限公司 | Data storage method and device |
CN106506661A (en) * | 2016-11-18 | 2017-03-15 | 浪潮软件集团有限公司 | Method, server and system for dynamically returning data |
CN106844236A (en) * | 2016-12-27 | 2017-06-13 | 北京五八信息技术有限公司 | The date storage method and device of terminal device |
CN107169083A (en) * | 2017-05-11 | 2017-09-15 | 聚龙融创科技有限公司 | Public security bayonet socket magnanimity vehicle data storage and retrieval method and device, electronic equipment |
CN107291889A (en) * | 2017-06-20 | 2017-10-24 | 郑州云海信息技术有限公司 | A kind of date storage method and system |
CN107391765A (en) * | 2017-09-01 | 2017-11-24 | 云南电网有限责任公司电力科学研究院 | A kind of power network natural calamity data warehouse model implementation method |
CN108241724A (en) * | 2017-05-11 | 2018-07-03 | 新华三大数据技术有限公司 | A kind of metadata management method and device |
CN108248641A (en) * | 2017-12-06 | 2018-07-06 | 中国铁道科学研究院电子计算技术研究所 | A kind of urban track traffic data processing method and device |
CN108268614A (en) * | 2017-12-29 | 2018-07-10 | 郑州轻工业学院 | A kind of distribution management method of forest reserves spatial data |
CN108268517A (en) * | 2016-12-30 | 2018-07-10 | 希姆通信息技术(上海)有限公司 | The management method and system of label in database |
CN108470040A (en) * | 2018-02-11 | 2018-08-31 | 中国石油天然气股份有限公司 | Method and device for warehousing unstructured data |
CN108595589A (en) * | 2018-04-19 | 2018-09-28 | 中国科学院电子学研究所苏州研究院 | A kind of efficient access method of magnanimity science data picture |
CN108647290A (en) * | 2018-05-06 | 2018-10-12 | 深圳市保千里电子有限公司 | Internet cell phone cloud photograph album backup querying method based on HBase and system |
CN108897859A (en) * | 2018-06-29 | 2018-11-27 | 郑州云海信息技术有限公司 | A kind of metadata retrieval method, apparatus, equipment and computer readable storage medium |
CN109446296A (en) * | 2018-09-10 | 2019-03-08 | 上海勋立信息科技有限公司 | A kind of magnanimity unstructured data treating method and apparatus |
CN109582643A (en) * | 2018-11-20 | 2019-04-05 | 中国石油大学(华东) | A kind of real-time dynamic data management system based on HBase |
CN109669925A (en) * | 2018-11-21 | 2019-04-23 | 北京市天元网络技术股份有限公司 | The management method and device of unstructured data |
WO2019116167A1 (en) * | 2017-12-12 | 2019-06-20 | International Business Machines Corporation | Storing unstructured data in a structured framework |
CN110109890A (en) * | 2019-05-10 | 2019-08-09 | 京东方科技集团股份有限公司 | Unstructured data processing method and unstructured data processing system |
CN110555021A (en) * | 2018-03-26 | 2019-12-10 | 深圳先进技术研究院 | Data storage method, query method and related device |
CN110633281A (en) * | 2019-09-12 | 2019-12-31 | 北京百度网讯科技有限公司 | Method and device for processing multi-type data sources |
CN111190949A (en) * | 2018-11-15 | 2020-05-22 | 杭州海康威视数字技术股份有限公司 | Data storage and processing method, device, equipment and medium |
CN111367857A (en) * | 2020-03-03 | 2020-07-03 | 中国联合网络通信集团有限公司 | Data storage method and device, FTP server and storage medium |
CN111459945A (en) * | 2020-04-07 | 2020-07-28 | 中科曙光(南京)计算技术有限公司 | Hierarchical index query method based on HBase |
WO2020192663A1 (en) * | 2019-03-26 | 2020-10-01 | 华为技术有限公司 | Data management method and related device |
CN111881332A (en) * | 2020-06-17 | 2020-11-03 | 武汉光庭信息技术股份有限公司 | Automatic driving simulation data management server and method |
CN112003956A (en) * | 2020-10-27 | 2020-11-27 | 武汉中科通达高新技术股份有限公司 | Traffic management system |
CN112148938A (en) * | 2020-10-16 | 2020-12-29 | 成都中科大旗软件股份有限公司 | Cross-domain heterogeneous data retrieval system and retrieval method |
CN113220945A (en) * | 2021-04-28 | 2021-08-06 | 广州宸祺出行科技有限公司 | Method and system for field retrieval and path display of data blood margin |
US20220083589A1 (en) * | 2020-09-14 | 2022-03-17 | Olympus Corporation | Information processing apparatus, information processing system, information processing method, metadata creation method, recording control method, and non-transitory computer-readable recording medium recording information processing program |
CN117349401A (en) * | 2023-12-06 | 2024-01-05 | 之江实验室 | Metadata storage method, device, medium and equipment for unstructured data |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103164534A (en) * | 2013-04-11 | 2013-06-19 | 苏州阔地网络科技有限公司 | Method and system of data search based on cloud education platform |
CN103198129A (en) * | 2013-04-11 | 2013-07-10 | 苏州阔地网络科技有限公司 | Data search realizing method and data search realizing system based on cloud education platform |
CN103246700A (en) * | 2013-04-01 | 2013-08-14 | 厦门市美亚柏科信息股份有限公司 | Mass small file low latency storage method based on HBase |
CN103647850A (en) * | 2013-12-25 | 2014-03-19 | 北京京东尚科信息技术有限公司 | Data processing method, device and system of distributed version control system |
-
2016
- 2016-01-04 CN CN201610003635.6A patent/CN105677826A/en active Pending
Cited By (46)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106297291A (en) * | 2016-08-29 | 2017-01-04 | 苏州金螳螂怡和科技有限公司 | Urban expressway traffic information acquisition system |
CN106407355A (en) * | 2016-09-07 | 2017-02-15 | 中国农业银行股份有限公司 | Data storage method and device |
CN106506661A (en) * | 2016-11-18 | 2017-03-15 | 浪潮软件集团有限公司 | Method, server and system for dynamically returning data |
CN106844236A (en) * | 2016-12-27 | 2017-06-13 | 北京五八信息技术有限公司 | The date storage method and device of terminal device |
CN108268517A (en) * | 2016-12-30 | 2018-07-10 | 希姆通信息技术(上海)有限公司 | The management method and system of label in database |
CN107169083A (en) * | 2017-05-11 | 2017-09-15 | 聚龙融创科技有限公司 | Public security bayonet socket magnanimity vehicle data storage and retrieval method and device, electronic equipment |
CN107169083B (en) * | 2017-05-11 | 2020-03-31 | 聚龙融创科技有限公司 | Mass vehicle data storage and retrieval method and device for public security card port and electronic equipment |
CN108241724A (en) * | 2017-05-11 | 2018-07-03 | 新华三大数据技术有限公司 | A kind of metadata management method and device |
WO2018205981A1 (en) * | 2017-05-11 | 2018-11-15 | 新华三大数据技术有限公司 | Metadata management |
CN107291889A (en) * | 2017-06-20 | 2017-10-24 | 郑州云海信息技术有限公司 | A kind of date storage method and system |
CN107391765A (en) * | 2017-09-01 | 2017-11-24 | 云南电网有限责任公司电力科学研究院 | A kind of power network natural calamity data warehouse model implementation method |
CN108248641A (en) * | 2017-12-06 | 2018-07-06 | 中国铁道科学研究院电子计算技术研究所 | A kind of urban track traffic data processing method and device |
WO2019116167A1 (en) * | 2017-12-12 | 2019-06-20 | International Business Machines Corporation | Storing unstructured data in a structured framework |
GB2582234A (en) * | 2017-12-12 | 2020-09-16 | Ibm | Storing unstructured data in a structured framework |
CN108268614A (en) * | 2017-12-29 | 2018-07-10 | 郑州轻工业学院 | Distributed management method for forest resource spatial data |
CN108268614B (en) * | 2017-12-29 | 2020-08-18 | 郑州轻工业学院 | Distributed management method for forest resource spatial data |
CN108470040B (en) * | 2018-02-11 | 2021-03-09 | 中国石油天然气股份有限公司 | Method and device for warehousing unstructured data |
CN108470040A (en) * | 2018-02-11 | 2018-08-31 | 中国石油天然气股份有限公司 | Method and device for warehousing unstructured data |
CN110555021B (en) * | 2018-03-26 | 2023-09-19 | 深圳先进技术研究院 | Data storage method, query method and related device |
CN110555021A (en) * | 2018-03-26 | 2019-12-10 | 深圳先进技术研究院 | Data storage method, query method and related device |
CN108595589A (en) * | 2018-04-19 | 2018-09-28 | 中国科学院电子学研究所苏州研究院 | Efficient access method for massive scientific data images |
CN108647290A (en) * | 2018-05-06 | 2018-10-12 | 深圳市保千里电子有限公司 | HBase-based backup and query method and system for Internet mobile phone cloud photo albums |
CN108897859A (en) * | 2018-06-29 | 2018-11-27 | 郑州云海信息技术有限公司 | Metadata retrieval method, apparatus, equipment and computer readable storage medium |
CN109446296A (en) * | 2018-09-10 | 2019-03-08 | 上海勋立信息科技有限公司 | Method and apparatus for processing massive unstructured data |
CN111190949A (en) * | 2018-11-15 | 2020-05-22 | 杭州海康威视数字技术股份有限公司 | Data storage and processing method, device, equipment and medium |
CN111190949B (en) * | 2018-11-15 | 2023-09-26 | 杭州海康威视数字技术股份有限公司 | Data storage and processing method, device, equipment and medium |
CN109582643A (en) * | 2018-11-20 | 2019-04-05 | 中国石油大学(华东) | Real-time dynamic data management system based on HBase |
CN109669925A (en) * | 2018-11-21 | 2019-04-23 | 北京市天元网络技术股份有限公司 | Management method and device for unstructured data |
CN111753141B (en) * | 2019-03-26 | 2024-06-11 | 华为技术有限公司 | Data management method and related equipment |
WO2020192663A1 (en) * | 2019-03-26 | 2020-10-01 | 华为技术有限公司 | Data management method and related device |
CN111753141A (en) * | 2019-03-26 | 2020-10-09 | 华为技术有限公司 | Data management method and related equipment |
CN110109890A (en) * | 2019-05-10 | 2019-08-09 | 京东方科技集团股份有限公司 | Unstructured data processing method and unstructured data processing system |
WO2020228452A1 (en) * | 2019-05-10 | 2020-11-19 | 京东方科技集团股份有限公司 | Unstructured data processing method and unstructured data processing system |
CN110633281A (en) * | 2019-09-12 | 2019-12-31 | 北京百度网讯科技有限公司 | Method and device for processing multi-type data sources |
CN111367857A (en) * | 2020-03-03 | 2020-07-03 | 中国联合网络通信集团有限公司 | Data storage method and device, FTP server and storage medium |
CN111459945A (en) * | 2020-04-07 | 2020-07-28 | 中科曙光(南京)计算技术有限公司 | Hierarchical index query method based on HBase |
CN111459945B (en) * | 2020-04-07 | 2023-11-10 | 中科曙光(南京)计算技术有限公司 | Hierarchical index query method based on HBase |
CN111881332A (en) * | 2020-06-17 | 2020-11-03 | 武汉光庭信息技术股份有限公司 | Automatic driving simulation data management server and method |
US20220083589A1 (en) * | 2020-09-14 | 2022-03-17 | Olympus Corporation | Information processing apparatus, information processing system, information processing method, metadata creation method, recording control method, and non-transitory computer-readable recording medium recording information processing program |
CN112148938A (en) * | 2020-10-16 | 2020-12-29 | 成都中科大旗软件股份有限公司 | Cross-domain heterogeneous data retrieval system and retrieval method |
CN112003956B (en) * | 2020-10-27 | 2021-01-15 | 武汉中科通达高新技术股份有限公司 | Traffic management system |
CN112003956A (en) * | 2020-10-27 | 2020-11-27 | 武汉中科通达高新技术股份有限公司 | Traffic management system |
CN113220945A (en) * | 2021-04-28 | 2021-08-06 | 广州宸祺出行科技有限公司 | Method and system for field retrieval and path display of data lineage |
CN113220945B (en) * | 2021-04-28 | 2024-05-31 | 广州宸祺出行科技有限公司 | Method and system for field retrieval and path display of data lineage |
CN117349401A (en) * | 2023-12-06 | 2024-01-05 | 之江实验室 | Metadata storage method, device, medium and equipment for unstructured data |
CN117349401B (en) * | 2023-12-06 | 2024-03-15 | 之江实验室 | Metadata storage method, device, medium and equipment for unstructured data |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105677826A (en) | Resource management method for massive unstructured data | |
CN110825748B (en) | High-performance and easily-expandable key value storage method by utilizing differentiated indexing mechanism | |
US9710535B2 (en) | Object storage system with local transaction logs, a distributed namespace, and optimized support for user directories | |
CN107423422B (en) | Spatial data distributed storage and search method and system based on grid | |
CN103544261B (en) | Global index management method and device for massive structured log data | |
JP5996088B2 (en) | Cryptographic hash database | |
US8495036B2 (en) | Blob manipulation in an integrated structured storage system | |
CN104346357B (en) | File access method and system for an embedded terminal | |
CN110119425A (en) | Solid state drive, distributed data storage system, and method using key-value storage | |
CN103282899B (en) | Data storage method, access method and device in a file system | |
CN110347852B (en) | File system embedded with a scale-out key-value storage system, and file management method | |
CN105912687B (en) | Massive distributed database storage unit | |
CN103067461B (en) | File metadata management system and metadata management method | |
CN104850572A (en) | HBase non-primary key index building and inquiring method and system | |
CN103595797B (en) | Caching method for distributed storage system | |
CN102169507A (en) | Distributed real-time search engine | |
CN107807787B (en) | Distributed data storage method and system | |
CN104408111A (en) | Method and device for deleting duplicate data | |
CN109284273B (en) | Massive small file query method and system adopting suffix array index | |
CN104978330A (en) | Data storage method and device | |
WO2020125630A1 (en) | File reading | |
CN103942301B (en) | Distributed file system oriented to access and application of multiple data types | |
CN109634911A (en) | Storage method based on an HDFS CD server | |
CN104516945A (en) | Hadoop distributed file system metadata storage method based on relational database | |
CN109213760B (en) | High-load service storage and retrieval method for non-relational data storage |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | | Address after: 201209 Chuansha Road, Shanghai, No. 221, room 11, building 955; Applicant after: Bocom Intelligent Network Technology Co. Ltd.; Address before: 201209 Chuansha Road, Shanghai, No. 221, room 11, building 955; Applicant before: BOCOM Smart Network Technologies Inc. |
COR | Change of bibliographic data | ||
RJ01 | Rejection of invention patent application after publication | | Application publication date: 20160615 |