WO2023032013A1 - Arithmetic device, arithmetic method, and program - Google Patents

Arithmetic device, arithmetic method, and program

Info

Publication number
WO2023032013A1
WO2023032013A1 PCT/JP2021/031784 JP2021031784W WO2023032013A1 WO 2023032013 A1 WO2023032013 A1 WO 2023032013A1 JP 2021031784 W JP2021031784 W JP 2021031784W WO 2023032013 A1 WO2023032013 A1 WO 2023032013A1
Authority
WO
WIPO (PCT)
Prior art keywords
record
component
selection component
elements
order
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2021/031784
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
Shinji Furusho (古庄 晋二)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to PCT/JP2021/031784 (WO2023032013A1)
Priority to JP2023544818A (JPWO2023032013A1)
Publication of WO2023032013A1
Priority to US18/588,343 (US20240211454A1)
Legal status: Ceased (current)

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures

Definitions

  • the present invention relates to an arithmetic device, an arithmetic method, and a program.
  • tabular data is uniquely decomposed into a component group composed of record-related components and column-related components.
  • the record-related component is an ordered set called OrdSet (Ordered Set), which is used to record the record number of the record group hit by the search and the record number of the record group rearranged by sorting.
  • a mechanism (algorithm group) that implements various operations such as searching, sorting, aggregation, and relational algebra operations using such a group of components is called a natural number index.
  • In a natural number index, OrdSet may be used as it is, but it is sometimes used after being decomposed into a component representing the record numbers selected from the original tabular data (selection component) and a component representing the order of those record numbers (order component). It is therefore preferable that such decomposition can be performed at high speed.
  • An embodiment of the present invention has been made in view of the above points, and aims to decompose an ordered set into selected components and ordered components at high speed.
  • The arithmetic device according to an embodiment has a decomposing unit. With a record number set consisting of the integers from $Q$ to $Q+N-1$ (where $Q$ is a predetermined integer and $N$ is a predetermined integer of 1 or more), the decomposing unit decomposes an ordered record selection component $S^{(n)}_{(N)}$, which is an ordered set consisting of $n$ (where $0 \le n \le N$) elements selected from the record number set, into a record selection component $L^{(n)}_{(N)}$, which is a set of $n$ elements selected from the record number set, and a record order component $P^{(n)}_{(n)}$, which represents the order of the elements of the record selection component. The decomposition is such that $S^{(n)}_{(N)}$ is expressed by a predetermined operation on $L^{(n)}_{(N)}$ and $P^{(n)}_{(n)}$.
  • An ordered set can be decomposed into selected components and ordered components at high speed.
  • FIG. 11 is a diagram for explaining an example of an ordered record selection component;
  • FIG. 11 is a flowchart (part 1) for explaining an example of decomposition processing of ordered record selection components;
  • FIG. 10 is a diagram for explaining an example of a Map array after initialization;
  • FIG. 10 is a diagram for explaining an example of updating a Map array;
  • FIG. 10 is a diagram for explaining an example of creating an inverse of a record selection component and a record order component;
  • FIG. 10 is a diagram for explaining an example of creating record order components;
  • FIG. 11 is a flowchart (part 2) for explaining an example of decomposition processing of an ordered record selection component;
  • FIG. 11 is a diagram for explaining an example of sorting of the position array of ordered record selection components;
  • A diagram for explaining an example of tables T0, T1, and T2;
  • FIG. 10 is a diagram for explaining an example of display processing of a reverse sort result;
  • A diagram for explaining an example of internal sorting and creation of a time-order cumulative total;
  • FIG. 11 is a diagram for explaining an example of a display of table T2 and time-order cumulative totals;
  • FIG. 10 is a diagram for explaining an example of a symmetrical arrangement other than base 0;
  • Tabular data is a data structure consisting of $N$ records and $M$ columns. Each column has $N$ values of the same data type, and the $i$-th column has $K_i$ kinds of values.
  • Hereinafter, $N$, $M$, and $K_i$ are used with the above meanings without any particular note. Note that $K_i$ is abbreviated as $K$ when the target column is specified.
  • Arbitrary tabular data can be uniquely decomposed into components related to records and components related to columns.
  • a mechanism (algorithm group) that realizes various operations such as searching, sorting, aggregation, and relational algebra operations using these component groups is the Natural Numbered Index (NNI).
  • the component for records is an ordered set called OrdSet, and the component for columns is a set called Sorted Value List (SVL) and Natural Numbered Column (NNC). All of these components are represented by one-dimensional arrays whose elements are values of the same data type. Note that ascending order value list components and natural number item value components are obtained for each column.
  • For the advantages of using the natural number index, see Non-Patent Document 1.
  • All components used in the natural number index are held as one-dimensional arrays with base 0.
  • the elements of the array can be of various data types, such as integers, floating point numbers, and strings. Therefore, the type of the elements of the array is specified, and the array is also called a natural number array, a character string array, or the like.
  • Most of the components (that is, one-dimensional arrays) used in the natural number index are complete sequential N arrays, whose elements are natural numbers from 0 to $N-1$ (called complete sequential numbers $N$, described later).
  • A one-dimensional array $A$ of size $n$ is denoted as $A^{(n)}$.
  • The $i$-th element of the one-dimensional array $A^{(n)}$ is denoted as $A^{(n)}[i]$.
  • When it is desired to indicate explicitly that the elements of the one-dimensional array $A$ are complete sequential numbers $N$, it is written as $A_{(N)}$.
  • When the size of the one-dimensional array $A$ is $n$ and its elements are complete sequential numbers $N$, it is written as $A^{(n)}_{(N)}$, where $n$ and $N$ are independent of each other.
  • Complete sequential number N: the continuous natural numbers from 0 to $N-1$ are called the complete sequential number $N$. Given an element $i$ of a complete sequential number $N$, the total number of kinds of values (that is, $N$), the number of kinds of values smaller than $i$ (that is, $i$), and the number of kinds of values greater than $i$ (that is, $N-i-1$) are readily apparent.
  • For example, for the element 5 of the complete sequential number 7, the total number of kinds of values is 7 (0 to 6), the number of kinds of values smaller than 5 is 5 (0 to 4), and the number of kinds of values greater than 5 is 1 (only 6).
  • A one-dimensional array whose elements are complete sequential numbers $N$ is called a complete sequential N array. That is, the value of an arbitrary element $A_{(N)}[i]$ of the complete sequential N array $A_{(N)}$ is one of the values from 0 to $N-1$.
  • The symmetric array N is a one-dimensional array of size $N$ whose elements are the values from 0 to $N-1$ without duplication. A symmetric array can also be thought of as representing a permutation of the values from 0 to $N-1$. Note that symmetric arrays form a group with respect to the index operator, which will be described later.
  • $A_{(N)}$ is called an increasing array if it is a non-decreasing array without duplicate values, that is, if it satisfies $A_{(N)}[i] < A_{(N)}[j]$ whenever $i < j$.
  • The index operation has the properties shown in (1), (2), and (3) below.
  • (1) The size of the one-dimensional array representing the operation result is the size of the one-dimensional array on the right side of the index operator (that is, the right operand).
  • (2) The data type of the elements of the one-dimensional array representing the operation result is the data type of the elements of the one-dimensional array on the left side of the index operator (that is, the left operand).
  • (3) Let $G$ be the set of symmetric arrays N (that is, any one-dimensional array $P_{(N)} \in G$ is a symmetric array). Then $(G, \circ)$ forms a group, where $\circ$ denotes the index operator.
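The properties above can be illustrated with a minimal sketch. It assumes that the index operation means `result[i] = left[right[i]]`, which is inferred from properties (1) and (2) rather than stated in the text; the function names `index_op`, `inverse`, and `is_symmetric_array` are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def index_op(left: Sequence[T], right: Sequence[int]) -> List[T]:
    """Assumed index operation: result[i] = left[right[i]].

    The result has the size of the right operand and the element
    type of the left operand, matching properties (1) and (2).
    """
    return [left[r] for r in right]


def inverse(p: Sequence[int]) -> List[int]:
    """Inverse of a symmetric array (a permutation of 0..N-1)."""
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return inv


def is_symmetric_array(p: Sequence[int]) -> bool:
    """True if p holds each value 0..N-1 exactly once."""
    return sorted(p) == list(range(len(p)))


# Properties (1) and (2): size from the right, element type from the left.
names = ["a", "b", "c", "d"]
picked = index_op(names, [3, 0])

# Property (3): closure, identity, inverse, and non-commutativity.
p = [1, 0, 2]
q = [0, 2, 1]
identity = [0, 1, 2]
pq = index_op(p, q)
qp = index_op(q, p)
```

Here `pq` and `qp` differ, which is the non-commutativity that later makes left and right sorts distinct.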
  • the natural number index uses a component group composed of record-related components and column-related components.
  • tabular data has N records.
  • the top record of tabular data is identified by record number 0, and the last record is identified by record number N-1.
  • This record number is a complete sequential number $N$.
  • A complete sequential N array $S^{(n)}_{(N)}$ of size $n$ ($\le N$) with no duplicate values (i.e., one that satisfies $0 \le n \le N$ and $S^{(n)}_{(N)}[i] \ne S^{(n)}_{(N)}[j]$ whenever $i \ne j$) can be used to represent any search and sort result for tabular data.
  • This S stands for OrdSet, which is hereinafter referred to as an ordered record selection component.
  • For example, $S^{(3)}_{(N)} = (3, 0, 4)$ is an ordered set in which the 3rd, 0th, and 4th records of the $N$ records are selected and arranged in that order.
  • Algorithms that implement various operations on the natural number index (e.g., searching, sorting, aggregation, relational algebra operations) sometimes use the ordered record selection component in the decomposed form $S^{(n)}_{(N)} = L^{(n)}_{(N)} \circ P^{(n)}_{(n)}$, in which the record selection component $L$ is an increasing array.
  • In this embodiment, the case where the decomposition $S^{(n)}_{(N)} = L^{(n)}_{(N)} \circ P^{(n)}_{(n)}$ is performed at high speed (more specifically, with a computational cost on the order of $O(n)$ or $O(n \log n)$) will be described. This makes it possible to increase the speed of algorithms that implement various operations on natural number indices.
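As a worked illustration of this decomposition, take the ordered record selection component $(3, 0, 4)$ used as an example earlier. The sketch below assumes that the index operation means $(L \circ P)[i] = L[P[i]]$, an interpretation inferred from the stated properties of the index operator rather than given explicitly in the text.

```latex
\begin{aligned}
S^{(3)}_{(N)} &= (3, 0, 4), \qquad
L^{(3)}_{(N)} = (0, 3, 4), \qquad
P^{(3)}_{(3)} = (1, 0, 2) \\
(L \circ P)[0] &= L[P[0]] = L[1] = 3 \\
(L \circ P)[1] &= L[P[1]] = L[0] = 0 \\
(L \circ P)[2] &= L[P[2]] = L[2] = 4 \\
\therefore\; L^{(3)}_{(N)} \circ P^{(3)}_{(3)} &= (3, 0, 4) = S^{(3)}_{(N)}
\end{aligned}
```

$L$ holds the selected record numbers in increasing order, and $P$ is a symmetric array of size 3 that records where each element of $S$ sits within $L$.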
  • A column of tabular data can be regarded as a non-natural number array $C^{(N)}$ holding $N$ values.
  • An ascending value list component SVL is obtained by retrieving the $K$ kinds of values contained in $C^{(N)}$ and storing them in ascending order. SVL is also a non-natural number array. Next, replacing each element of $C^{(N)}$ with its storage position $(0, 1, \ldots, K-1)$ on the SVL yields the natural number item value component NNC.
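The construction of SVL and NNC described above can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function name `decompose_column` and the sample column values are hypothetical.

```python
from typing import Hashable, List, Sequence, Tuple


def decompose_column(column: Sequence[Hashable]) -> Tuple[list, List[int]]:
    """Split a column into (SVL, NNC).

    SVL: the K distinct values of the column in ascending order.
    NNC: for each record, the position (0..K-1) of its value in SVL.
    """
    svl = sorted(set(column))
    position = {value: k for k, value in enumerate(svl)}
    nnc = [position[value] for value in column]
    return svl, nnc


# Hypothetical column with N = 4 records and K = 3 kinds of values.
column = ["pear", "apple", "pear", "fig"]
svl, nnc = decompose_column(column)

# The original column is recovered by looking up each NNC entry in SVL.
restored = [svl[k] for k in nnc]
```

Because NNC holds only positions in 0..K-1, comparisons on the column reduce to integer comparisons on NNC.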
  • This $P_S$ is called a sort from $P_1$ to $P_2$, and its inverse $P_S^{-1}$ is called an inverse sort.
  • Because the group $(G, \circ)$ is non-commutative, there are two such sorts: $P_{SL}$, which yields $P_2$ when applied to $P_1$ from the left by the index operation ($P_{SL} \circ P_1 = P_2$), and $P_{SR}$, which yields $P_2$ when applied to $P_1$ from the right ($P_1 \circ P_{SR} = P_2$).
  • $P_{SL} = P_2 \circ P_{SR} \circ P_2^{-1}$
  • $P_{SR} = P_2^{-1} \circ P_{SL} \circ P_2$
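The relation between the left and right sorts can be checked numerically. The sketch below assumes that the index operation means `result[i] = left[right[i]]` and that the left sort satisfies "left sort applied to P1 gives P2" while the right sort satisfies "P1 applied to right sort gives P2"; the concrete permutations and the helper names are hypothetical.

```python
from typing import List, Sequence


def index_op(left: Sequence[int], right: Sequence[int]) -> List[int]:
    """Assumed index operation: result[i] = left[right[i]]."""
    return [left[r] for r in right]


def inverse(p: Sequence[int]) -> List[int]:
    """Inverse of a symmetric array (a permutation of 0..N-1)."""
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return inv


# Two hypothetical record order components of size 4.
p1 = [2, 0, 3, 1]
p2 = [0, 2, 1, 3]

# Left sort:  p_sl applied to p1 from the left gives p2.
p_sl = index_op(p2, inverse(p1))
# Right sort: p_sr applied to p1 from the right gives p2.
p_sr = index_op(inverse(p1), p2)

# The two sorts are conjugate to each other through p2.
conj_to_left = index_op(index_op(p2, p_sr), inverse(p2))
conj_to_right = index_op(index_op(inverse(p2), p_sl), p2)
```

For these values the two sorts differ, so which side the sort is applied from has to be fixed by convention.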
  • In this embodiment, a case will be described in which a sort $P_S$ is applied to a certain column of certain tabular data, the inverse sort $P_S^{-1}$ is created, and then the inverse sort $P_S^{-1}$ is applied to a one-dimensional array whose elements are the cumulative totals of a certain column's values after sorting, and the result is displayed. As a result, it is possible to display the totals in a form corresponding to the tabular data while maintaining the display of the original tabular data.
  • FIG. 1 shows the hardware configuration of the index calculation device 10 in this embodiment.
  • The index calculation device 10 in this embodiment is realized by the hardware configuration of a general computer or computer system, and includes an input device 101, a display device 102, an external I/F 103, a communication I/F 104, a processor 105, a memory device 106, and a storage device 107.
  • each of these pieces of hardware is connected via a bus 108 so as to be able to communicate with each other.
  • the input device 101 is, for example, a keyboard, mouse, touch panel, various physical buttons, and the like.
  • the display device 102 is, for example, a display, a display panel, or the like.
  • the external I/F 103 is an interface with an external device such as the recording medium 103a.
  • the index calculation device 10 can perform reading and writing of the recording medium 103a via the external I/F 103 .
  • Examples of the recording medium 103a include flexible disks, CDs (Compact Discs), DVDs (Digital Versatile Disks), SD memory cards (Secure Digital memory cards), USB (Universal Serial Bus) memory cards, and the like.
  • the communication I/F 104 is an interface for connecting the index calculation device 10 to a communication network.
  • the processor 105 is, for example, various arithmetic units such as a CPU (Central Processing Unit) and a GPU (Graphics Processing Unit). Note that when the processor 105 is a CPU, it may be a multi-core CPU.
  • the memory device 106 is, for example, a main storage device such as RAM (Random Access Memory).
  • the storage device 107 is, for example, an auxiliary storage device such as a HDD (Hard Disk Drive) or SSD (Solid State Drive).
  • the index calculation device 10 in this embodiment can implement various processes described later.
  • The hardware configuration shown in FIG. 1 is an example, and the index calculation device 10 may have, for example, a plurality of processors 105, a plurality of memory devices 106, or a plurality of storage devices 107.
  • the index calculation device 10 may have various hardware other than the illustrated hardware.
  • FIG. 2 shows the functional configuration of the index calculation device 10 in this embodiment.
  • the index calculation device 10 in this embodiment has a decomposition section 201 , a reverse sorting section 202 , a display control section 203 and a storage section 204 .
  • the decomposing unit 201, the reverse sorting unit 202, and the display control unit 203 are realized by, for example, processing that one or more programs installed in the index calculation device 10 cause the processor 105 to execute.
  • the storage unit 204 is implemented by the memory device 106 or the storage device 107, or both.
  • The storage unit 204 may be realized by, for example, a storage device (e.g., a database server or NAS (Network Attached Storage)) connected to the index calculation device 10 via a communication network.
  • The reverse sorting unit 202 creates a reverse sort $P_S^{-1}$ for a certain sort $P_S$.
  • The display control unit 203 displays tabular data, the result of applying the inverse sort $P_S^{-1}$ to a one-dimensional array, and the like.
  • the storage unit 204 stores various information such as ordered record selection components S and tabular data. In addition to these, the storage unit 204 also stores calculation results during some process.
  • the Map sequence is hereinafter also referred to as Map.
  • Assuming that the processor 105 provided in the index calculation device 10 in this embodiment is a multi-core CPU, the elements in positions 0 to 2 of the Map array are hereinafter processed by Core0, and the elements in positions 3 to 5 are processed by Core1. This enables Core0 and Core1 to execute subsequent processes in parallel. Note that this is only an example; if the processor 105 is a multi-core CPU having three or more cores, three or more ranges may be processed in parallel.
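One way to read the Map-based decomposition (initialize the Map array, update it, create the record selection component and the inverse of the record order component, then create the record order component) is sketched below. This is a single-threaded illustration under stated assumptions: the sentinel value `-1`, the function name `decompose_with_map`, and the choice of input are hypothetical, and the per-core range split is only indicated in a comment.

```python
from typing import List, Sequence, Tuple

UNUSED = -1  # assumed sentinel for "record number not selected"


def decompose_with_map(s: Sequence[int], n_records: int) -> Tuple[List[int], List[int]]:
    """Decompose S into (L, P) such that S[i] == L[P[i]]."""
    # Initialize: one Map slot per record number 0..N-1.
    position_map = [UNUSED] * n_records
    # Update: store, for each selected record number, its position in S.
    for pos, rec in enumerate(s):
        position_map[rec] = pos
    # Scan the Map in ascending record-number order. Disjoint ranges
    # (e.g. positions 0-2 and 3-5) could be scanned by separate cores.
    l: List[int] = []
    p_inv: List[int] = []
    for rec, pos in enumerate(position_map):
        if pos != UNUSED:
            l.append(rec)      # L: selected record numbers, increasing
            p_inv.append(pos)  # inverse of P: position in S of L[k]
    # Invert to obtain the record order component P.
    p = [0] * len(s)
    for k, pos in enumerate(p_inv):
        p[pos] = k
    return l, p


# S = (3, 0, 4) with N = 6, matching a Map array with positions 0 to 5.
s = [3, 0, 4]
l, p = decompose_with_map(s, 6)
recomposed = [l[i] for i in p]
```

This sketch touches every Map slot once, so its cost grows with N as well as n; it trades memory for avoiding a comparison sort.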
  • the decomposing unit 201 sorts the position array by the elements of the ordered record selection component S (sorting in ascending order) to create a record selection component L and a record order component P (step S202).
  • FIG. 10 shows how this sorting is performed.
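The sort-based variant of step S202 can be sketched as follows. It assumes that "position array" means the array of positions 0..n-1 of S and that these positions are ordered by the corresponding elements of S; the function name `decompose_by_sort` is hypothetical.

```python
from typing import List, Sequence, Tuple


def decompose_by_sort(s: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Decompose S into (L, P) with S[i] == L[P[i]] in O(n log n)."""
    # Sort the positions 0..n-1 by the element of S found at each position.
    positions = sorted(range(len(s)), key=lambda i: s[i])
    # L: the elements of S in ascending order (an increasing array).
    l = [s[i] for i in positions]
    # The sorted position array is the inverse of P; invert it to get P.
    p = [0] * len(s)
    for k, i in enumerate(positions):
        p[i] = k
    return l, p


s = [5, 2, 9, 0]
l, p = decompose_by_sort(s)
recomposed = [l[i] for i in p]
```

Unlike a Map-based scan, this version does not depend on N, which may be preferable when n is much smaller than N.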
  • a table T 2 is the result of sorting by product name.
  • The reverse sorting unit 202 sorts the records of the table T2 by time in ascending order (step S301). Note that the display by the display control unit 203 remains the table T2. As a result of this sorting, a table T3 shown in FIG. 13 is obtained, together with an ordered record selection component $S_3$, a record selection component $L_3$, and a record order component $P_3$.
  • The sorting in step S301 is the sort $P_S$ from $P_2$ to $P_3$.
  • the reverse sorting unit 202 accumulates the sales of each record in the table T3 in chronological order, thereby creating a chronological total of sales (step S302).
  • the display by the display control unit 203 remains the table T2 .
  • A one-dimensional array whose elements are the time-order cumulative totals will be referred to as $R^{(5)}$, and this will be referred to as a time-order cumulative total array.
  • $R^{(5)} = (200, 500, 900, 1200, 1600)$.
  • A time-order cumulative total array $R^{(5)}$ is obtained.
  • The reverse sorting unit 202 applies the reverse sort $P_S^{-1}$ to the time-order cumulative total array $R^{(5)}$ (step S304).
  • The display control unit 203 displays the one-dimensional array $R'^{(5)}$ obtained in step S304 together with the table T2 (step S305).
  • This display result is shown in FIG.
  • the chronological total can be displayed in a manner corresponding to the table T2 .
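The flow from sorting by time to displaying the cumulative totals alongside the unsorted table can be sketched as follows. Only the cumulative totals (200, 500, 900, 1200, 1600) come from the text; the product names, times, per-record sales, and helper names are hypothetical values chosen to reproduce those totals, and the index operation is assumed to mean `result[i] = left[right[i]]`.

```python
from itertools import accumulate
from typing import List, Sequence


def index_op(left: Sequence, right: Sequence[int]) -> list:
    """Assumed index operation: result[i] = left[right[i]]."""
    return [left[r] for r in right]


def inverse(p: Sequence[int]) -> List[int]:
    """Inverse of a symmetric array (a permutation of 0..N-1)."""
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return inv


# Hypothetical table T2 (sorted by product): (product, time, sales).
t2 = [
    ("apple", 3, 300),
    ("apple", 1, 300),
    ("fig", 0, 200),
    ("pear", 4, 400),
    ("pear", 2, 400),
]

# Sort by time; p_s plays the role of the sort P_S.
p_s = sorted(range(len(t2)), key=lambda i: t2[i][1])
t3 = index_op(t2, p_s)

# Accumulate sales in time order.
r = list(accumulate(row[2] for row in t3))

# Apply the inverse sort so each total lines up with its row in T2.
r_prime = index_op(r, inverse(p_s))

for row, total in zip(t2, r_prime):
    print(row, total)
```

Each printed row of T2 keeps its original position, and the value next to it is the running total of sales up to and including that row's time.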
  • The definition of the complete sequential number $N$ can be extended to "the continuous natural numbers from $Q$ to $Q+N-1$". Therefore, the definition of a complete sequential N array can be extended as well. That is, the definition that a one-dimensional array $A_{(N)}$ with base $Q$ is a complete sequential N array can be extended to "the value of an arbitrary element $A_{(N)}[i]$ is one of $Q$ to $Q+N-1$".
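The extension to a base other than 0 can be illustrated with a small sketch. The function names and the normalization-by-subtraction approach are assumptions made for illustration; the text only states that the definitions can be extended.

```python
from typing import List, Sequence


def is_complete_sequential(a: Sequence[int], n: int, base: int = 0) -> bool:
    """True if every element lies in base..base+n-1."""
    return all(base <= v <= base + n - 1 for v in a)


def is_symmetric_array(a: Sequence[int], base: int = 0) -> bool:
    """True if a holds each value base..base+len(a)-1 exactly once."""
    return sorted(a) == list(range(base, base + len(a)))


def to_base0(a: Sequence[int], base: int) -> List[int]:
    """Shift a base-Q array to base 0 so base-0 routines can be reused."""
    return [v - base for v in a]


# Record numbers 1..3 (Q = 1, N = 3) instead of 0..2.
perm_base1 = [3, 1, 2]
normalized = to_base0(perm_base1, 1)
```

Shifting to base 0 and back is one simple way to reuse routines written for base 0, at the cost of an extra pass over the array.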
  • various operations for example, searching, sorting, aggregation, relational algebraic operations, etc.
  • The index calculation device 10 can calculate the reverse sort $P_S^{-1}$ for the sort $P_S$ from $P_1$ to $P_2$ with respect to arbitrary record order components $P_1, P_2 \in G$. Therefore, various applications using this inverse sort $P_S^{-1}$ can be realized. For example, after performing the sort $P_S$ on a column of tabular data, if it is desired to obtain the cumulative values of that column and display them together with the tabular data before the sort $P_S$, applying the reverse sort $P_S^{-1}$ to the cumulative values makes it possible to display them in the same order as the record order of the tabular data before the sort $P_S$.
  • Reference signs: 10 index calculation device; 101 input device; 102 display device; 103 external I/F; 103a recording medium; 104 communication I/F; 105 processor; 106 memory device; 107 storage device; 108 bus; 201 decomposing unit; 202 reverse sorting unit; 203 display control unit; 204 storage unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
PCT/JP2021/031784 2021-08-30 2021-08-30 Arithmetic device, arithmetic method, and program Ceased WO2023032013A1 (ja)

Priority Applications (3)

Application Number Priority Date Filing Date Title
PCT/JP2021/031784 WO2023032013A1 (ja) 2021-08-30 2021-08-30 Arithmetic device, arithmetic method, and program
JP2023544818A JPWO2023032013A1 2021-08-30 2021-08-30
US18/588,343 US20240211454A1 (en) 2021-08-30 2024-02-27 Calculation device, calculation method, and recording medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/031784 WO2023032013A1 (ja) 2021-08-30 2021-08-30 Arithmetic device, arithmetic method, and program

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/588,343 Continuation US20240211454A1 (en) 2021-08-30 2024-02-27 Calculation device, calculation method, and recording medium

Publications (1)

Publication Number Publication Date
WO2023032013A1 (ja) 2023-03-09

Family

ID=85412285

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/031784 Ceased WO2023032013A1 (ja) 2021-08-30 2021-08-30 Arithmetic device, arithmetic method, and program

Country Status (3)

Country Link
US (1) US20240211454A1
JP (1) JPWO2023032013A1
WO (1) WO2023032013A1

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
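JP2016095639A (ja) * 2014-11-13 2016-05-26 NEC Corporation Database device, data management method, and program
JP2018045441A (ja) * 2016-09-14 2018-03-22 Turbo Data Laboratories, Inc. Data integration method, data integration device, data processing system, and computer program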

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6380952B2 (ja) * 2014-12-12 2018-08-29 International Business Machines Corporation Device, method, and program for sorting an array consisting of a large number of elements

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FURUSHO SHINJI, IIZAWA ATSUSHI, NAGAO TADASHI, YAMAMOTO YUKIO, HAYABE SHUICHI, OZAMOTO YOSHIKATSU, KOBAYASHI MASAHIDE: "Accelerating Big Data Processing in Space Sciences with Natural Number Index (NNI)", JOURNAL OF SPACE SCIENCE INFORMATION ANALYSIS, no. 10, 22 February 2021 (2021-02-22), pages 1 - 65, XP093041489, ISSN: 2433-2216, DOI: 10.20637/00047371 *

Also Published As

Publication number Publication date
JPWO2023032013A1 2023-03-09
US20240211454A1 (en) 2024-06-27


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21955908

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2023544818

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21955908

Country of ref document: EP

Kind code of ref document: A1