CN112347104A - Column storage layout optimization method based on deep reinforcement learning - Google Patents

Column storage layout optimization method based on deep reinforcement learning

Info

Publication number
CN112347104A
Authority
CN
China
Prior art keywords
columns
column
sequence
query
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011228158.6A
Other languages
Chinese (zh)
Other versions
CN112347104B (en)
Inventor
覃雄派
陈跃国
杜小勇
赵丽萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Renmin University of China
Original Assignee
Renmin University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Renmin University of China
Priority to CN202011228158.6A
Publication of CN112347104A
Application granted
Publication of CN112347104B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/221 Column-oriented storage; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a column storage layout optimization method based on deep reinforcement learning, comprising the following steps: receiving a query load; analyzing the query load to generate query features; acquiring feature data of the data columns according to the query features; determining the output order of the columns based on a column output order policy and the feature data of the data columns; quantitatively evaluating the output order, wherein the quantitative evaluation policy is adjusted based on the reward given by the system; and adjusting the output order of the columns according to the quantitative evaluation result. With this method, the model parameters can be continuously adjusted in the desired direction of reducing the disk skip time, the neural network can automatically learn the optimal column order from the feature data of the columns, and incremental training is supported, so the column order does not need to be recalculated every time the layout is optimized, which greatly reduces the computation cost.

Description

Column storage layout optimization method based on deep reinforcement learning
Technical Field
The invention relates to the field of computers, in particular to a column storage layout optimization method based on deep reinforcement learning, which is mainly used for carrying out layout optimization on column storage of big data so as to improve data reading performance.
Background
OLAP (Online Analytical Processing) over relational data plays a crucial role in many analysis and decision-support applications. In the big data era, many big data analysis systems, such as Hive and Spark SQL, use HDFS (Hadoop Distributed File System) as the underlying storage. Large amounts of data are continuously accumulated and stored on HDFS, while the real-time requirements of data analysis keep rising. As a de facto standard for low-cost distributed storage and processing of big data, HDFS provides big data analysis systems with unified data storage that is fault-tolerant, portable, and scalable and that offers high read/write throughput. Big data analysis systems on HDFS are typically used to support batch and interactive query analysis on massive amounts of data.
In these systems, data tables typically use columnar storage formats such as RCFile, ORC, Parquet, and CarbonData. Columnar storage provides flexible and efficient data encoding and compression and can read only the necessary data columns, thereby avoiding unnecessary I/O. However, we have found that the query analysis performance on HDFS data can be further improved by optimizing the storage layout. When a query accesses data columns in one horizontal partition of an HDFS data block, multiple disk skip reads are needed, and an optimal column order yields the minimum disk skip cost. The column ordering problem has been proven NP-hard in the academic literature, so the challenge is to design an efficient column ordering algorithm that finds a near-optimal column order for a given query load. Existing heuristic search methods optimize with a high degree of randomness and easily fall into suboptimal solutions; moreover, the column order must be recalculated for every optimization, which makes the computation cost high.
Disclosure of Invention
In view of the above, the present invention has been developed to provide a solution that overcomes, or at least partially solves, the above-mentioned problems. Therefore, in one aspect of the present invention, a column storage layout optimization method based on deep reinforcement learning is provided, the method includes:
receiving a query load;
analyzing the query load to generate query features;
acquiring characteristic data of the data column according to the query characteristic;
determining the output sequence of the columns based on the strategy of the output sequence of the columns and the characteristic data of the data columns;
performing quantitative evaluation on the output sequence, wherein the quantitative evaluation strategy is adjusted based on the reward of a system;
and adjusting the output sequence of the columns according to the quantitative evaluation result.
Optionally, the output sequence is quantitatively evaluated according to the disk skipping time.
Optionally, an Actor-Critic algorithm is adopted to implement the deep reinforcement learning policy for the column output order, and adjusting the quantitative evaluation policy based on the reward of the system includes adjusting parameters in the Critic neural network according to the reward given by the system.
Optionally, the decision on the output order is made by a Pointer Net neural network, which includes mapping one sequence to another.
Optionally, determining the output order of the columns based on the policy of the output order of the columns and the characteristic data of the data columns includes:
obtaining the weight of an element at a certain position of the output sequence and each position of the input sequence by using an attention mechanism;
and combining the input sequence with the weight to calculate the element with the maximum relation between the current output and the input sequence, and taking the element of the input sequence as an output element.
Optionally, the method further includes uniformly encoding the input query load, which specifically comprises:
initializing each query in the input query load to a set;
determining a corresponding column access characteristic for each query;
and carrying out binary coding on the elements in the set corresponding to the query according to the column access characteristics.
The technical solution provided by the application has at least the following technical effects or advantages: the invention implements a column storage layout optimization method based on deep reinforcement learning that, in experimental comparison with existing heuristic column ordering algorithms, further reduces the disk skip cost. The model parameters can be continuously adjusted in the desired direction of reducing the disk skip time, so that the neural network automatically learns the optimal column order from the feature data of the columns. Incremental training is supported: the latest query load is fed directly into the model, and the column order does not need to be recalculated every time optimization is performed, which greatly reduces the computation cost.
The above description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the technical solutions of the present invention and the objects, features, and advantages thereof more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a flow chart of a deep reinforcement learning-based column storage layout optimization method proposed by the present invention;
FIG. 2 is a diagram illustrating a skip cost model of a disk in a wide-table-based column storage layout optimization scheme;
FIG. 3 shows an overall framework diagram of the column storage layout optimization based on deep reinforcement learning proposed by the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
In a big data analysis system, I/O is often the main performance bottleneck, so the design and optimization of the storage system are crucial to big data analysis performance. In terms of data organization, columnar storage (e.g., ORC, Parquet) provides flexible and efficient data encoding and compression and can read only the necessary data columns, thereby avoiding unnecessary I/O. Under a columnar storage layout, how to adjust the physical data layout to adapt to a changing query load and system environment is an urgent problem. The invention aims to design and implement, in a Hadoop environment, a column storage layout optimization method based on deep reinforcement learning, DRL-COA (Deep Reinforcement Learning based Column Ordering Algorithm), and to apply it to adaptive column storage layout optimization. Compared with existing heuristic column ordering algorithms, the method further reduces the disk skip cost.
In the DRL-COA provided by the invention, an Actor-Critic algorithm model is used for reinforcement learning, and the Pointer Net network structure is applied to the Actor neural network. The Actor neural network continuously outputs new actions (column orders) starting from the initial input, and the Critic neural network evaluates each action according to the reward obtained after the action, so that new actions keep being selected. Because an action is a column order, the choice of column for each position matters; this choice is made by using an attention mechanism to obtain the weights between the element at a given output position and each position of the input sequence.
In one aspect of the present invention, a column storage layout optimization method based on deep reinforcement learning is provided. It uses deep reinforcement learning to solve the sequence decision problem of column ordering and optimizes the storage layout by training a model. Specifically, as shown in FIG. 1, the method includes:
receiving a query load;
analyzing the query load to generate query features;
acquiring characteristic data of the data column according to the query characteristic;
determining the output sequence of the columns based on the strategy of the output sequence of the columns and the characteristic data of the data columns;
performing quantitative evaluation on the output sequence, wherein the quantitative evaluation strategy is adjusted based on the reward of a system;
and adjusting the output sequence of the columns according to the quantitative evaluation result.
As a preferred embodiment, an Actor-Critic algorithm is adopted to implement the deep reinforcement learning policy for the column output order, and adjusting the quantitative evaluation policy based on the reward of the system includes adjusting parameters in the Critic neural network according to the reward given by the system.
The implementation of the Actor-Critic algorithm is described in detail below. The input of the DRL-COA network model in this scheme can be represented as a 1 × n matrix:
Figure BDA0002764276500000051
Here $c_i$ indicates whether each data column appears in the query (1 means it appears, 0 means it does not), and n is the number of queries; the input is a mathematical representation of the query load Q. The output of the Actor-Critic model is the order of the data columns, denoted o (order). The output of each model iteration is evaluated using the disk skip time. The process mainly comprises the following steps:
(1) according to the current state, the Actor takes an action, namely outputting a column order;
(2) the Critic scores the Actor's latest action according to the state and the action;
(3) according to the Critic's score, the Actor adjusts its current policy (i.e., the parameters in the Actor neural network) and executes the next action;
(4) the Critic also adjusts its current scoring policy (i.e., the parameters in the Critic neural network) according to the reward given by the system;
(5) initially, the Actor acts randomly and the Critic scores randomly; because of the reward, however, the Critic's scores become increasingly accurate and the Actor performs better and better.
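Steps (1) to (5) can be illustrated with a deliberately small sketch. It is not the patent's Pointer Net model: as a labeled simplification, the Actor is reduced to one learnable score per column (orders are sampled position by position with a softmax over the remaining columns), the Critic is reduced to a single scalar estimate of the expected cost, and `toy_cost` is a hypothetical stand-in for the disk skip time. All function names and hyperparameters are assumptions made for illustration.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def sample_order(logits: Sequence[float], rng: random.Random) -> Tuple[List[int], List[float]]:
    """Sample a column order and return it with the gradient of its log-probability."""
    remaining = list(range(len(logits)))
    order: List[int] = []
    grad = [0.0] * len(logits)
    while remaining:
        peak = max(logits[i] for i in remaining)
        weights = [math.exp(logits[i] - peak) for i in remaining]
        total = sum(weights)
        pick = rng.choices(remaining, weights=weights)[0]
        for i, w in zip(remaining, weights):
            grad[i] -= w / total          # softmax term of d log p / d logit
        grad[pick] += 1.0                 # indicator term for the chosen column
        order.append(pick)
        remaining.remove(pick)
    return order, grad


def train(cost: Callable[[List[int]], float], n_columns: int, steps: int = 500,
          lr_actor: float = 0.1, lr_critic: float = 0.1, seed: int = 0) -> List[float]:
    rng = random.Random(seed)
    logits = [0.0] * n_columns            # Actor parameters: uniform (random) policy at first
    baseline = 0.0                        # Critic: scalar estimate of the expected cost
    for _ in range(steps):
        order, grad = sample_order(logits, rng)          # (1) the Actor outputs an order
        observed = cost(order)                           # cost reported by the environment
        advantage = observed - baseline                  # (2) the Critic scores the action
        for i in range(n_columns):                       # (3) the Actor adjusts its policy:
            logits[i] -= lr_actor * advantage * grad[i]  #     costly orders become less likely
        baseline += lr_critic * (observed - baseline)    # (4) the Critic adjusts its estimate
    return logits


def toy_cost(order: List[int]) -> float:
    # Hypothetical cost: columns 0 and 1 are read together, so the cost is the
    # number of columns stored between them.
    return abs(order.index(0) - order.index(1)) - 1


logits = train(toy_cost, n_columns=4)
greedy = sorted(range(4), key=lambda i: -logits[i])
print(greedy, toy_cost(greedy))
```

The subtraction `observed - baseline` is what makes the Critic useful here: an order is reinforced only when it is cheaper than what the Critic currently expects, which reduces the variance of the Actor's updates.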
In this method, the output order is preferably evaluated quantitatively based on the disk skip time. The stochastic policy of the network model, $p_\theta(o \mid c)$, can be described as follows: when the input is c (columns) and the output is o (order), the model evaluation SC(o | c) is the corresponding disk skip time. The goal of model training is that the smaller the value of SC(o | c), the greater the probability that output o is selected. Because the output of each iteration is evaluated by the disk skip time during training, this design ensures that the model parameters are continuously adjusted in the desired direction of reducing the disk skip time.
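The text does not write the training objective as a formula. One common way to formalize "the smaller SC(o | c), the more probable o" is the policy-gradient objective with a baseline, sketched below as an assumption rather than as the patent's stated method. The baseline symbol $b(c)$, standing for the Critic's estimate of the expected skip time for input c, is introduced here for illustration only.

```latex
% Expected disk skip time under the policy (to be minimized):
J(\theta \mid c) = \mathbb{E}_{o \sim p_\theta(\cdot \mid c)} \left[ SC(o \mid c) \right]

% Policy gradient with a Critic baseline b(c) (assumed form):
\nabla_\theta J(\theta \mid c)
  = \mathbb{E}_{o \sim p_\theta(\cdot \mid c)}
    \left[ \bigl( SC(o \mid c) - b(c) \bigr) \, \nabla_\theta \log p_\theta(o \mid c) \right]
```

Under this formulation, a gradient descent step on $\theta$ lowers the probability of orders whose skip time exceeds the Critic's estimate and raises the probability of orders that beat it.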
FIG. 2 shows the disk skip cost model in the wide-table-based column storage layout optimization scheme. In that scheme, a disk-based skip read time cost model is designed for the characteristics of data access on traditional magnetic disks: multiple skip reads with equal distance are executed over a series of HDFS files, and the average skip read time at skip distance d is taken, in a statistical sense, as the skip cost corresponding to d. After the skip costs at different skip distances are obtained, a piecewise skip cost function is constructed by linear fitting. FIG. 2 shows the skip cost functions obtained by this method on three different types of disks.
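A minimal sketch of such a piecewise-linear cost function follows. The sample points, units, and column sizes are hypothetical (the measured values behind FIG. 2 are not given in the text), and the rule used in `order_cost`, which charges one skip for each gap between consecutive accessed columns and ignores the initial seek, is likewise an assumption.

```python
from bisect import bisect_right
from typing import Callable, Dict, Sequence, Set, Tuple


def make_skip_cost(samples: Sequence[Tuple[float, float]]) -> Callable[[float], float]:
    """Build a piecewise-linear cost function through (distance, avg_time) samples."""
    points = sorted(samples)
    xs = [d for d, _ in points]
    ys = [t for _, t in points]

    def cost(distance: float) -> float:
        if distance <= 0:
            return 0.0                       # adjacent columns: no skip needed
        if distance <= xs[0]:
            return ys[0]                     # clamp below the first measured distance
        if distance >= xs[-1]:
            return ys[-1]                    # clamp above the last measured distance
        i = bisect_right(xs, distance)       # xs[i-1] <= distance < xs[i]
        x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (distance - x0) / (x1 - x0)

    return cost


def order_cost(order: Sequence[str], sizes: Dict[str, float],
               accessed: Set[str], cost: Callable[[float], float]) -> float:
    """Sum the skip costs between consecutive accessed columns for one query."""
    total, gap, started = 0.0, 0.0, False
    for name in order:
        if name in accessed:
            if started:
                total += cost(gap)           # skip over the unread columns in between
            started, gap = True, 0.0
        elif started:
            gap += sizes[name]               # unread column widens the gap
    return total


# Hypothetical measurements: (skip distance in MB, average skip time in ms).
skip_cost = make_skip_cost([(1.0, 2.0), (4.0, 5.0), (16.0, 9.0)])
sizes = {"c1": 1.0, "c2": 2.0, "c3": 1.5, "c4": 1.0, "c5": 1.0}
query = {"c1", "c4"}
print(order_cost(["c1", "c2", "c3", "c4", "c5"], sizes, query, skip_cost))
print(order_cost(["c1", "c4", "c2", "c3", "c5"], sizes, query, skip_cost))
```

Placing the co-accessed columns `c1` and `c4` next to each other removes the gap entirely, which is the effect a good column order aims for across the whole query load.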
In this patent, the Actor-Critic algorithm is adopted to realize deep reinforcement learning. The Actor-Critic algorithm, as used for example in game data processing, can handle discrete values and supports single-step updates; by adopting the Actor-Critic deep reinforcement learning algorithm, this application likewise obtains discrete value processing and single-step updating.
FIG. 3 is an overall framework diagram of the deep-reinforcement-learning-based column storage layout optimization proposed by the present invention. The figure shows the components of the DRL-COA model, from the collection and analysis of the query load to feature input and model training. An agent (a deep learning agent) learns the feature data of the columns through continuous interaction with the environment, and the Critic neural network evaluates the output of each iteration of the Actor neural network according to the skip estimator component, so that the model parameters are continuously adjusted in the desired direction of reducing the disk skip time. Compared with the randomness of heuristic search, the DRL-COA model is more directed and less prone to falling into suboptimal solutions. Moreover, as a deep reinforcement learning model, it does not need to recalculate the column order every time optimization is performed, and its incremental training mode greatly reduces the computation cost.
In this patent, determining the output order of the columns based on the policy of the output order of the columns and the characteristic data of the data columns may include:
obtaining the weight of an element at a certain position of the output sequence and each position of the input sequence by using an attention mechanism;
and combining the input sequence with the weight to calculate the element with the maximum relation between the current output and the input sequence, and taking the element of the input sequence as an output element.
Preferably, the decision on the output order is made by a Pointer Net neural network, which includes mapping one sequence to another.
The column ordering problem is treated here as akin to a combinatorial optimization problem: adjusting the column order requires a sequence-to-sequence decision. The DRL-COA model uses a Pointer Net neural network to solve the sequence decision problem in column ordering, that is, the problem of mapping one sequence to another. When the output sequence is computed, the attention mechanism is used to obtain the weights between the element at a given position of the output sequence and each position of the input sequence, and the input sequence and the weights are then combined in a certain way to influence the output. In this way, the input element most strongly related to the current output can be determined and used as the output element, so that each output element points to an input element like a pointer. This design ensures that each input element is pointed to by only one output element, so that no input element appears more than once.
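The pointer selection with masking can be sketched as follows. This is a simplified illustration, not the patent's network: the attention score is a plain dot product (the text does not specify the scoring function), the "decoder state" is simply the embedding of the last chosen column instead of a recurrent hidden state, and the embeddings are hypothetical numbers.

```python
import math
from typing import List, Sequence, Set, Tuple


def pointer_step(inputs: Sequence[Sequence[float]], state: Sequence[float],
                 used: Set[int]) -> Tuple[int, List[float]]:
    """Return the input index with the highest attention weight, plus all weights."""
    if len(used) >= len(inputs):
        raise ValueError("every input element has already been selected")
    scores = []
    for j, vec in enumerate(inputs):
        if j in used:
            scores.append(-math.inf)         # mask: an input may be pointed to only once
        else:
            scores.append(sum(a * b for a, b in zip(vec, state)))
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]   # masked entries become exactly 0.0
    total = sum(exps)
    weights = [e / total for e in exps]
    best = max(range(len(weights)), key=weights.__getitem__)
    return best, weights


def decode_order(inputs: Sequence[Sequence[float]]) -> List[int]:
    """Greedily emit one input index per output position without repetition."""
    used: Set[int] = set()
    order: List[int] = []
    state: Sequence[float] = [0.0] * len(inputs[0])   # stand-in for the start token
    for _ in inputs:
        index, _weights = pointer_step(inputs, state, used)
        used.add(index)
        order.append(index)
        state = inputs[index]                # simplified decoder state (assumption)
    return order


# Hypothetical two-dimensional column embeddings.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(decode_order(embeddings))
```

Because masked positions receive weight zero, the output is always a permutation of the input indices, which matches the requirement that the output sequence contain the same elements as the input sequence.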
In the present invention, the input query load samples are encoded before model training. Each query may access only a subset of the columns, whereas the output is the set of all columns; encoding the input satisfies the requirement of the Pointer Net network structure that the output sequence contain exactly the same elements as the input sequence, with only their order changed. In FIG. 3, $C_1, C_2, C_3, C_4, C_5$ are the data columns input to the encoder, and <g>, $C_4, C_5, C_1, C_2$ are the data columns output by the decoder.
Assuming that the load Q contains n queries and that the set of accessed data columns has length N, we initialize each query q to a set N' of length N
Figure BDA0002764276500000071
in which every $c_i$ is set to 0.
Therefore, the uniformly encoding the input query load may specifically include:
initializing each query in the input query load to a set;
determining a corresponding column access characteristic for each query; specifically, for each query q in the load Q (involving only m data columns), its column access characteristic is $C_q = \{c_{q,1}, c_{q,2}, \ldots, c_{q,m}\}$;
binary-coding the elements in the set corresponding to the query according to the column access characteristic; specifically, the positions in N' whose subscripts correspond to the data columns $c_{q,1}, \ldots, c_{q,m}$ accessed by query q are set to 1, and the other positions remain 0 (indicating that query q does not access those columns). In this way, the load is uniformly encoded into patterns of the form {1, 0, ..., 1}.
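The encoding described above can be sketched in a few lines. The function name `encode_workload`, the column names, and the two sample queries are hypothetical and serve only to illustrate the 0/1 pattern.

```python
from typing import Dict, List, Sequence, Set


def encode_workload(queries: Dict[str, Set[str]],
                    columns: Sequence[str]) -> Dict[str, List[int]]:
    """Encode each query as a 0/1 vector over the full set of accessed columns."""
    position = {name: i for i, name in enumerate(columns)}
    encoded: Dict[str, List[int]] = {}
    for query_id, accessed in queries.items():
        vector = [0] * len(columns)          # initialize the set N' with all zeros
        for name in accessed:
            vector[position[name]] = 1       # mark each column the query reads
        encoded[query_id] = vector
    return encoded


columns = ["c1", "c2", "c3", "c4", "c5"]
workload = {"q1": {"c1", "c4"}, "q2": {"c2", "c4", "c5"}}
codes = encode_workload(workload, columns)
print(codes["q1"])  # [1, 0, 0, 1, 0]
print(codes["q2"])  # [0, 1, 0, 1, 1]
```

Every query yields a vector of the same length N regardless of how many columns it touches, which is what allows queries of different widths to be fed to the same network input.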
Through techniques such as the Actor-Critic deep reinforcement learning algorithm, the Pointer Net neural network, the attention mechanism, and disk skip cost simulation, the method effectively encodes the input load samples, takes the column order as output, and uses the disk skip cost to evaluate the output of each model iteration, so that the model parameters are continuously adjusted in the desired direction of reducing the disk skip time. Under this implementation, the neural network automatically learns the optimal column order from the feature data of the columns, and the DRL-COA model can be trained incrementally: the latest query load is fed directly into the model, and the column order does not need to be recalculated every time optimization is performed, which greatly reduces the computation cost.
The technical solution provided by the application has at least the following technical effects or advantages: the invention implements a column storage layout optimization method based on deep reinforcement learning that, in experimental comparison with existing heuristic column ordering algorithms, further reduces the disk skip cost. The model parameters can be continuously adjusted in the desired direction of reducing the disk skip time, so that the neural network automatically learns the optimal column order from the feature data of the columns. Incremental training is supported: the latest query load is fed directly into the model, and the column order does not need to be recalculated every time optimization is performed, which greatly reduces the computation cost.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the invention and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim.

Claims (6)

1. A column storage layout optimization method based on deep reinforcement learning is characterized by comprising the following steps:
receiving a query load;
analyzing the query load to generate query features;
acquiring characteristic data of the data column according to the query characteristic;
determining the output sequence of the columns based on the strategy of the output sequence of the columns and the characteristic data of the data columns;
performing quantitative evaluation on the output sequence, wherein the quantitative evaluation strategy is adjusted based on the reward of a system;
and adjusting the output sequence of the columns according to the quantitative evaluation result.
2. The column storage layout optimization method of claim 1, further characterized by quantitatively evaluating the output order according to disk skip time.
3. The column storage layout optimization method according to claim 1, further characterized in that an Actor-Critic algorithm is adopted to implement the deep reinforcement learning policy for the column output order, and adjusting the quantitative evaluation policy based on the reward of the system comprises adjusting parameters in a Critic neural network according to the reward given by the system.
4. The column storage layout optimization method of claim 1, further characterized in that the decision on the output order is made by a Pointer Net neural network, including mapping from one sequence to another.
5. The method of optimizing column storage layout of claim 1, further characterized in that determining the output order of the columns based on the policy of the output order of the columns and the characteristic data of the data columns comprises:
obtaining the weight of an element at a certain position of the output sequence and each position of the input sequence by using an attention mechanism;
and combining the input sequence with the weight to calculate the element with the maximum relation between the current output and the input sequence, and taking the element of the input sequence as an output element.
6. The method of optimizing column storage layout of claim 4, further characterized in that determining the output order of the columns based on the policy of the output order of the columns and the characteristic data of the data columns comprises: the input query load is uniformly coded, specifically: initializing each query in the input query load to a set; determining a corresponding column access characteristic for each query; and carrying out binary coding on the elements in the set corresponding to the query according to the column access characteristics.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011228158.6A CN112347104B (en) 2020-11-06 2020-11-06 Column storage layout optimization method based on deep reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011228158.6A CN112347104B (en) 2020-11-06 2020-11-06 Column storage layout optimization method based on deep reinforcement learning

Publications (2)

Publication Number Publication Date
CN112347104A true CN112347104A (en) 2021-02-09
CN112347104B CN112347104B (en) 2023-09-29

Family

ID=74429231

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011228158.6A Active CN112347104B (en) 2020-11-06 2020-11-06 Column storage layout optimization method based on deep reinforcement learning

Country Status (1)

Country Link
CN (1) CN112347104B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117332229A (en) * 2023-09-27 2024-01-02 天津大学 Fault diagnosis-oriented inter-satellite interaction information optimization method
CN117332229B (en) * 2023-09-27 2024-05-10 天津大学 Fault diagnosis-oriented inter-satellite interaction information optimization method

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120143913A1 (en) * 2010-12-03 2012-06-07 International Business Machines Corporation Encoding Data Stored in a Column-Oriented Manner
CN102609493A (en) * 2012-01-20 2012-07-25 Donghua University Join-order query optimization method based on column-storage model
CN103294831A (en) * 2013-06-27 2013-09-11 Renmin University of China Multidimensional-array-based grouping aggregation calculating method in column storage database
CN103324765A (en) * 2013-07-19 2013-09-25 Xidian University Multi-core synchronization data query optimization method based on column storage
CN106528737A (en) * 2016-10-27 2017-03-22 Zhongqi Power Technology Co., Ltd. Website navigation display method and system
US20180107696A1 (en) * 2015-06-04 2018-04-19 Microsoft Technology Licensing, Llc Column ordering for input/output optimization in tabular data
CN108197275A (en) * 2018-01-08 2018-06-22 Renmin University of China Distributed file column-store indexing method
CN108804473A (en) * 2017-05-04 2018-11-13 Huawei Technologies Co., Ltd. Data query method, apparatus, and database system
CN110032604A (en) * 2019-02-02 2019-07-19 Alibaba Group Holding Ltd. Data storage device, transfer device and database access method
CN110084375A (en) * 2019-04-26 2019-08-02 Southeast University Hierarchical partitioning framework based on deep reinforcement learning
CN110114783A (en) * 2016-11-04 2019-08-09 DeepMind Technologies Ltd. Reinforcement learning with auxiliary tasks
CN110192206A (en) * 2017-05-23 2019-08-30 Google LLC Attention-based sequence transduction neural networks
CN110278149A (en) * 2019-06-20 2019-09-24 Nanjing University Multipath transmission control protocol data packet scheduling method based on deep reinforcement learning
CN111612126A (en) * 2020-04-18 2020-09-01 Huawei Technologies Co., Ltd. Method and device for reinforcement learning
US20200311585A1 (en) * 2019-03-31 2020-10-01 Palo Alto Networks Multi-model based account/product sequence recommender
CN111797860A (en) * 2019-04-09 2020-10-20 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Feature extraction method and device, storage medium and electronic equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HAOQIONG BIAN et al.: "Wide Table Layout Optimization based on Column Ordering and Duplication", ACM INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, pages 299 - 314 *
JIN GUODONG et al.: "Survey on Storage and Optimization Techniques of HDFS", JOURNAL OF SOFTWARE, vol. 31, no. 1, pages 137 - 161 *

Also Published As

Publication number Publication date
CN112347104B (en) 2023-09-29

Similar Documents

Publication Publication Date Title
US9665572B2 (en) Optimal data representation and auxiliary structures for in-memory database query processing
CN114117153B (en) Online cross-modal retrieval method and system based on similarity relearning
JP2012008725A (en) Device and method for sorting data
CN111191002A (en) Neural code searching method and device based on hierarchical embedding
CN112417381B (en) Method and device for rapidly positioning infringement image applied to image copyright protection
CN113312505A (en) Cross-modal retrieval method and system based on discrete online hash learning
CN114186084A (en) Online multi-mode Hash retrieval method, system, storage medium and equipment
Wang et al. Research on maize disease recognition method based on improved resnet50
CN115577144A (en) Cross-modal retrieval method based on online multi-hash code joint learning
CN115511071A (en) Model training method and device and readable storage medium
CN115952277A (en) Knowledge relationship based retrieval enhancement method, model, device and storage medium
CN113377991B (en) Image retrieval method based on most difficult positive and negative samples
CN112347104A (en) Column storage layout optimization method based on deep reinforcement learning
CN113611354A (en) Protein torsion angle prediction method based on lightweight deep convolutional network
CN117058235A (en) Visual positioning method crossing various indoor scenes
CN1045343C (en) Chaos processor
CN116306321A (en) Particle swarm-based adsorbed water treatment scheme optimization method, device and equipment
CN116483337A (en) API completion method based on prompt learning and data enhancement
CN115757694A (en) Recruitment industry text recall method, system, device and medium
CN117009621A (en) Information searching method, device, electronic equipment, storage medium and program product
CN115309929A (en) Cross-modal Hash retrieval method and system for maintaining nonlinear semantics
CN111062477B (en) Data processing method, device and storage medium
CN114020948A (en) Sketch image retrieval method and system based on sorting clustering sequence identification selection
CN114329181A (en) Question recommendation method and device and electronic equipment
CN116737607B (en) Sample data caching method, system, computer device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant