WO2016188280A1 - Method and apparatus for writing to database sub-tables - Google Patents

Method and apparatus for writing to database sub-tables

Info

Publication number
WO2016188280A1
Authority
WO
WIPO (PCT)
Prior art keywords
database
record
written
sub
belongs
Prior art date
Application number
PCT/CN2016/080016
Other languages
English (en)
French (fr)
Inventor
何健超
Original Assignee
阿里巴巴集团控股有限公司
何健超
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2015-05-25
Filing date
2016-04-22
Publication date
2016-12-01
Application filed by 阿里巴巴集团控股有限公司, 何健超
Publication of WO2016188280A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof

Definitions

  • The present application relates to the field of data processing technologies, and in particular to a method and an apparatus for writing to database sub-tables.
  • In the prior art, when the table to be written is split into sub-databases and sub-tables, the write function must be implemented through database middleware.
  • The database middleware hides the sub-databases and sub-tables from the application layer.
  • The application layer knows neither how the sub-databases and sub-tables are organized nor which sub-table the data is written to.
  • This is very friendly for online applications: it hides the cumbersome details of the sub-tables, so the application effectively reads and writes a single logical table.
  • For offline, large-scale data reflow (that is, writing data into the database), however, writing through the middleware is slow and can hardly meet the desired performance requirements, and because the sub-table being written cannot be determined, the data cannot be managed flexibly.
  • The present application provides a method for writing to database sub-tables, including:
  • obtaining the database to which each sub-table belongs and its access parameters;
  • determining, by an allocation rule, the sub-table to be written according to at least one field value of a record to be written into the database;
  • writing the record into the determined sub-table based on the access parameters of the database to which the determined sub-table belongs.
  • The application also provides an apparatus for writing to database sub-tables, comprising:
  • a topology and parameter unit for obtaining the database to which each sub-table belongs and its access parameters;
  • a record allocation unit for determining, by an allocation rule, the sub-table to be written according to at least one field value of a record to be written into the database;
  • a record writing unit for writing the record into the determined sub-table based on the access parameters of the database to which the determined sub-table belongs.
  • The embodiments of the present application use an allocation rule to determine which sub-table a record is written to and complete the write to the determined sub-table using the access parameters of the database to which it belongs. Writing directly to the sub-tables improves the efficiency of reflowing large-scale data into the sub-tables and enables flexible management of data through allocation rules.
  • FIG. 1 is a flowchart of a method for writing to database sub-tables in an embodiment of the present application;
  • FIG. 2 is a schematic diagram of the process of writing to database sub-tables in an application example of the present application;
  • FIG. 3 is a hardware structure diagram of a host to which an embodiment of the present application is applied;
  • FIG. 4 is a logical structure diagram of an apparatus for writing to database sub-tables in an embodiment of the present application.
  • An embodiment proposes a method for writing to database sub-tables that, in a data reflow scenario, allocates data to the sub-tables according to an allocation rule and writes it there, so as to solve the problems of the prior art.
  • The method in the embodiment of the present application may be applied in application-layer software, or in software that application-layer software can call to write data to sub-tables.
  • The flow of the embodiment of the present application is shown in FIG. 1.
  • Step 110: obtain the database to which each sub-table belongs and its access parameters.
  • All the sub-tables split from the table to be written may be in one database or in multiple databases.
  • The access parameters required to access a database often differ with the type of the database and its location in the network. For example, for a relational database that is not local (that is, one accessed over the network), the access parameters usually include the IP address of the server hosting the database, the port number the database uses, and the database name.
  • The connection string of a mysql database is usually jdbc:mysql://ip:port/database name (the port may be omitted; the default is 3306);
  • the connection string of an oracle database is jdbc:oracle:thin:@ip:port:database name (the port may be omitted; the default is 1521).
  • The databases to which all the sub-tables belong and their access parameters can be generated automatically by database management software, generated manually by the system administrator, or produced by the system administrator modifying information generated automatically by the management software; the embodiments do not limit this.
  • Step 120: determine, by an allocation rule, the sub-table to be written according to at least one field value of the record to be written into the database.
  • In a database, a column of a table is called a field and a row of a table is called a record; each record includes one or more field values corresponding to the columns of the table.
  • Data is written into a database table in units of records. Taking the user table shown in Table 1 as an example, the table includes two records, each of which includes five field values.
  • A predetermined allocation rule is applied to the field values of the record to be written into the database to determine which sub-table the record is written to.
  • The specific allocation rule can be determined by the needs of the actual application scenario.
  • Still taking the user table in Table 1 as an example: if splitting the table by province speeds up retrieval, the allocation rule can be set as: write the record into the sub-table that stores users of the province given by the record's province field. If splitting by age group suits the application better, the rule can be set as: derive the age group from the value of the record's age field and write the record into the sub-table that stores users of that age group.
  • An allocation rule may also be based on two or more field values; for example, the sub-table for a record may be determined from the values of both the province field and the gender field.
  • In one implementation, a unique index value is first established for each sub-table; then, with at least one field value of the record to be written into the database as input, the allocation rule yields the index value of the sub-table to be written.
  • This approach makes it convenient to describe the allocation rule as an expression and is easy to implement in a program.
  • One or more field values of the record are fed into the expression embodying the allocation rule, and the output is the index value of the corresponding sub-table.
  • The condition the expression must satisfy is that, for every possible value of the fields it uses, its result falls within the range of possible sub-table index values.
  • For example, when the table is split by province, the index value of each sub-table can be set to the corresponding province name, and the allocation rule can be described as: the index value equals the value of the province field.
  • As another example, the index values of the sub-tables can run from 0 to the number of sub-tables minus 1, with the allocation rule: take the predetermined field value of the record to be written into the database modulo the number of sub-tables. In this example the predetermined field value should be an integer.
  • Step 130: write the record into the determined sub-table based on the access parameters of the database to which the determined sub-table belongs.
  • Using those access parameters, a write operation can be initiated on the sub-table, and the record is written into the determined sub-table.
  • The specific way of using the access parameters to write to a sub-table depends on factors such as the type of the database to which the sub-table belongs and the location of that database in the network; it can be implemented with reference to the prior art and is not limited in this embodiment.
  • If the databases to which the sub-tables belong must be accessed over the network, connections to the database of each sub-table can first be established and maintained according to the access parameters of those databases. When a record is to be written into a sub-table in some database, the write is performed over the connection to that database, and the record is written into the sub-table determined in step 120. In this way each write no longer has to establish and tear down a connection but proceeds directly over an already established one, which improves the efficiency of write operations.
  • For example, for a relational database accessed over the network via TCP/IP, the access parameters usually include the IP address of the host where the database resides, the port number, and the database name. (Although relational databases generally have a default port number, an administrator can change it, so in most application scenarios the access parameters state the port number the database uses.) For the sub-tables in such a database, TCP connections to the database of each sub-table can be established and maintained using the IP address of the database's host and the database's port number, and records are written into the determined sub-tables over these TCP connections.
  • A corresponding buffer can be set up for each sub-table. After the sub-table for a record is determined, the record is written into that sub-table's buffer; when the usage of a sub-table's buffer satisfies a predetermined condition (for example, the number of records in the buffer reaches a certain count, or the buffer's storage usage reaches a predetermined threshold), all records in that buffer are written into the sub-table. This reduces the impact on other operations of the online database and greatly improves the performance of writing to the database.
  • The software that performs the database write uses the allocation rule to determine the sub-table to be written and completes the write to the determined sub-table using the access parameters of the database to which it belongs.
  • The writing software can thus directly control which records go into which sub-table, organize the data in each sub-table according to actual business needs, and reflow large batches of data into the sub-databases and sub-tables efficiently and flexibly, meeting the functional and performance requirements of data reflow for sub-databases and sub-tables.
  • A reflow server writes user data from a data source (the data source can be any storage capable of holding data; Table 1 is used here as an example) into the user table.
  • The user table consists of 8 sub-tables distributed over 4 databases: sub-tables user00 and user01 are in database db0, user02 and user03 in db1, user04 and user05 in db2, and user06 and user07 in db3.
  • The administrator configures on the reflow server the topology of the user table's sub-databases and sub-tables (and the correspondence between each sub-table and its database) and the access parameters of each database.
  • The access parameters include the IP address of the host where each database resides and the port number for accessing the database.
  • One possible form of the configuration is shown as an image (PCTCN2016080016-appb-000003) in the original publication.
  • In it, jdbcUrl describes the access parameters of each database, and table describes the sub-tables in each database.
  • The reflow server obtains from the administrator's configuration the four databases to which the eight sub-tables belong and the access parameters of each database.
  • The reflow server establishes an index value for each sub-table.
  • The index values run from 0 to 7 (that is, to the number of sub-tables minus 1).
  • The correspondence between index values and sub-tables is: in db0, user00 → 0 and user01 → 1; in db1, user02 → 2 and user03 → 3; in db2, user04 → 4 and user05 → 5; in db3, user06 → 6 and user07 → 7.
  • The administrator configures the allocation rule as: take the value of the field in column 0 of the record to be written into the database (for example, the value of the serial number field in Table 1) modulo 8, the number of sub-tables.
  • Its expression in Groovy (a programming language) is: def route(line){ return line.get(0).toInteger() % 8; }
  • Using the access parameters of the four databases db0, db1, db2, and db3 in which the sub-tables reside, the reflow server establishes a TCP connection to each database and keeps the connections alive. On the reflow server, a buffer is maintained for each sub-table (for example, a storage area with room for 256 records).
  • For a record from the data source that is to be written into a sub-table, the reflow server applies the allocation rule: it takes the value in column 0 of the record (the serial number field) modulo 8 to obtain the index value of the sub-table to be written. For example, for the record in row 1 of Table 1, the reflow server obtains index value 1, so the sub-table to be written is user01. The reflow server writes that record into the buffer of sub-table user01. When the buffer of user01 is full (for example, it reaches 256 records), the reflow server writes all the records in the buffer (256 records) into sub-table user01 over its connection to database db0.
  • An apparatus for writing to database sub-tables includes a topology and parameter unit, a record allocation unit, and a record writing unit, where: the topology and parameter unit is configured to obtain the database to which each sub-table belongs and its access parameters; the record allocation unit is configured to determine, by an allocation rule, the sub-table to be written according to at least one field value of a record to be written into the database; and the record writing unit is configured to write the record into the determined sub-table based on the access parameters of the database to which the determined sub-table belongs.
  • Optionally, the apparatus further includes a buffer setting unit, configured to set up a corresponding buffer for each sub-table;
  • the record writing unit includes a buffer module and a write module, where the buffer module is configured to write the record into the buffer of the determined sub-table, and the write module is configured to write all records in a sub-table's buffer into that sub-table when the usage of the buffer satisfies a predetermined condition.
  • Optionally, the apparatus further includes an index value establishing unit, configured to establish a unique index value for each sub-table; the record allocation unit is specifically configured to take at least one field value of the record to be written into the database as input and obtain, by the allocation rule, the index value of the sub-table to be written.
  • In one example, the index values run from 0 to the number of sub-tables minus 1;
  • the allocation rule includes taking a predetermined field value of the record to be written into the database modulo the number of sub-tables;
  • the predetermined field value is an integer.
  • Optionally, the apparatus further includes a connection unit, configured to establish, according to the access parameters of the database to which each sub-table belongs, a connection to that database; the record writing unit is specifically configured to write the record into the determined sub-table over the connection to the database to which the determined sub-table belongs.
  • Optionally, the access parameters of the database to which a sub-table belongs include: an IP address, a port number, and a database name.
  • In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • The memory may include non-persistent memory in computer-readable media, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory.
  • Memory is an example of a computer-readable medium.
  • Computer-readable media include permanent and non-permanent, removable and non-removable media.
  • Information storage can be implemented by any method or technology.
  • The information can be computer-readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device.
  • As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
  • Embodiments of the present application can be provided as a method, a system, or a computer program product.
  • The present application can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware.
  • The application can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

Abstract

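The present application provides a method for writing to database sub-tables, including: obtaining the database to which each sub-table belongs and its access parameters; determining, by an allocation rule, the sub-table to be written according to at least one field value of a record to be written into the database; and writing the record into the determined sub-table based on the access parameters of the database to which the determined sub-table belongs. The technical solution of the present application improves the efficiency of reflowing large-scale data into sub-tables and allows data to be managed flexibly through allocation rules.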

Description

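Method and Apparatus for Writing to Database Sub-Tables
Technical Field
The present application relates to the field of data processing technologies, and in particular to a method and an apparatus for writing to database sub-tables.
Background
With the development of information technology, more and more Internet applications involve the storage of and access to massive amounts of data. Data is usually stored in databases in the form of tables, and both the capacity of a table and the capacity of a database are limited by the server's hardware resources. When the volume of data in a table grows with the business beyond a certain point, the table often has to be split into multiple sub-tables in multiple databases (sub-databases and sub-tables) to maintain the performance of operations on the data in the table.
In the era of big data, data must flow and be exchanged continuously to maximize its value. In building an enterprise's data warehouse and business intelligence, online data stored in various databases is usually extracted to offline storage and computing platforms for unified processing; conversely, data from offline storage platforms, computing platforms, or other sources is also written into online databases.
In the prior art, if the table to be written takes the form of sub-databases and sub-tables, the write function must be implemented through database middleware. The database middleware hides the sub-databases and sub-tables from the application layer, which knows neither how they are organized nor which sub-table the data is actually written to. This is usually very friendly for online applications: the cumbersome details of the sub-databases and sub-tables are hidden, and the application effectively reads and writes a single logical table. For offline, large-batch data reflow (that is, writing data into the database), however, writing through the middleware is slow and can hardly meet the desired performance requirements, and because the sub-table being written cannot be determined, the data cannot be managed flexibly.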
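Summary
In view of this, the present application provides a method for writing to database sub-tables, including:
obtaining the database to which each sub-table belongs and its access parameters;
determining, by an allocation rule, the sub-table to be written according to at least one field value of a record to be written into the database;
writing the record into the determined sub-table based on the access parameters of the database to which the determined sub-table belongs.
The present application also provides an apparatus for writing to database sub-tables, including:
a topology and parameter unit, configured to obtain the database to which each sub-table belongs and its access parameters;
a record allocation unit, configured to determine, by an allocation rule, the sub-table to be written according to at least one field value of a record to be written into the database;
a record writing unit, configured to write the record into the determined sub-table based on the access parameters of the database to which the determined sub-table belongs.
As can be seen from the above technical solutions, the embodiments of the present application use an allocation rule to determine which sub-table a record is written to and complete the write operation on the determined sub-table using the access parameters of the database to which it belongs. Writing directly to the sub-tables improves the efficiency of reflowing large-scale data into sub-tables and allows data to be managed flexibly through allocation rules.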
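Brief Description of the Drawings
FIG. 1 is a flowchart of a method for writing to database sub-tables in an embodiment of the present application;
FIG. 2 is a schematic diagram of the process of writing to database sub-tables in an application example of the present application;
FIG. 3 is a hardware structure diagram of a host to which an embodiment of the present application is applied;
FIG. 4 is a logical structure diagram of an apparatus for writing to database sub-tables in an embodiment of the present application.
Detailed Description
An embodiment of the present application proposes a method for writing to database sub-tables that, in a data reflow scenario, allocates data to the sub-tables according to an allocation rule and writes it there, so as to solve the problems of the prior art. The method may be applied in application-layer software, or in software that application-layer software can call to write data to sub-tables. The flow of the embodiment is shown in FIG. 1.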
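Step 110: obtain the database to which each sub-table belongs and its access parameters.
All the sub-tables split from the table to be written may reside in one database or in multiple databases. To write data into a determined sub-table, it is necessary to know the database to which each sub-table belongs and the access parameters required to access that database or those databases. The access parameters often differ with the type of the database and its location in the network. For example, for a non-local relational database (that is, one accessed over the network), the access parameters usually include the IP (Internet Protocol) address of the server hosting the database, the port number the database uses, and the database name.
In practice, different database types have different concrete formats for their access parameters, but the IP address, port number, and database name are usually required. For simplicity, some database types provide a default port number, which can be used when the access parameters do not include one. For example, the connection string of a mysql database is usually jdbc:mysql://ip:port/database name (the port may be omitted; the default is 3306), and the connection string of an oracle database is jdbc:oracle:thin:@ip:port:database name (the port may be omitted; the default is 1521).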
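As a minimal sketch of the two connection-string formats quoted above, the following Python helper assembles a string from an IP address, an optional port, and a database name, falling back to the default port when none is given. The function name `jdbc_url`, its parameters, and the sample addresses are hypothetical and chosen only for illustration.

```python
# Default ports quoted in the text: 3306 for mysql, 1521 for oracle.
DEFAULT_PORTS = {"mysql": 3306, "oracle": 1521}


def jdbc_url(db_type, ip, database, port=None):
    """Build a connection string in one of the two formats quoted in the text.

    Hypothetical helper: the name and signature are assumptions of this sketch.
    """
    if db_type not in DEFAULT_PORTS:
        raise ValueError(f"unsupported database type: {db_type}")
    # Fall back to the type's default port when the caller gives none.
    port = DEFAULT_PORTS[db_type] if port is None else port
    if db_type == "mysql":
        return f"jdbc:mysql://{ip}:{port}/{database}"
    return f"jdbc:oracle:thin:@{ip}:{port}:{database}"


# Sample addresses are made up for the example.
print(jdbc_url("mysql", "10.0.0.1", "db0"))
print(jdbc_url("oracle", "10.0.0.2", "db1", port=1522))
```

Writing the port out explicitly even when it is the default keeps the access parameters unambiguous if an administrator later changes the database's port.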
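The databases to which all the sub-tables belong and their access parameters may be generated automatically by database management software, generated manually by a system administrator, or produced by a system administrator modifying information generated automatically by management software; the embodiments of the present application do not limit this.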
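Step 120: determine, by an allocation rule, the sub-table to be written according to at least one field value of the record to be written into the database.
In a database, a column of a table is called a field and a row of a table is called a record; each record includes one or more field values corresponding to the columns of the table. Data is written into a database table in units of records. Taking the user table shown in Table 1 as an example, the table includes two records, each of which includes five field values.
[Table 1 (user table): rendered as images PCTCN2016080016-appb-000001 and PCTCN2016080016-appb-000002 in the original publication]
In this embodiment, a predetermined allocation rule is applied to the field values of the record to be written into the database to decide which sub-table the record is written to. The specific allocation rule can be determined by the needs of the actual application scenario. Still taking the user table in Table 1 as an example: if splitting the table by province speeds up retrieval, the rule can be set as: write the record into the sub-table that stores users of the province given by the record's province field. If splitting by age group suits the application better, the rule can be set as: derive the age group from the value of the record's age field and write the record into the sub-table that stores users of that age group. In addition, an allocation rule can be based on two or more field values; for example, the sub-table for a record can be determined from the values of both the province field and the gender field.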
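The three kinds of allocation rule described above (by province, by age group, and by province plus gender) can be sketched as plain functions from a record to a sub-table name. The field names, the ten-year age bands, the sub-table naming scheme, and the sample record are all assumptions of this sketch; the source does not specify them.

```python
def by_province(record):
    # One sub-table per province.
    return f"user_{record['province']}"


def by_age_group(record, width=10):
    # Map the age to the band that contains it, e.g. 34 -> 30..39.
    low = (record["age"] // width) * width
    return f"user_age_{low}_{low + width - 1}"


def by_province_and_gender(record):
    # A rule may combine two or more field values.
    return f"user_{record['province']}_{record['gender']}"


# Hypothetical record; the real Table 1 is only available as an image.
record = {"serial": 1, "province": "zhejiang", "age": 34, "gender": "f"}

print(by_province(record))
print(by_age_group(record))
print(by_province_and_gender(record))
```

Because every rule has the same shape (record in, sub-table out), the writing software can swap rules without touching the write path.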
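In one implementation, a unique index value is first established for each sub-table; then, taking at least one field value of the record to be written into the database as input, the allocation rule yields the index value of the sub-table to be written. This approach makes it convenient to describe the allocation rule as an expression and is easy to implement in a program: one or more field values of the record are fed into the expression embodying the allocation rule, and the result of the computation is the index value of a sub-table. The condition such an expression must satisfy is that, for every possible value of the fields it uses, its result falls within the range of possible sub-table index values.
For example, when the table is split by province, the index value of each sub-table can be set to the corresponding province name, and the allocation rule can be described as: the index value equals the value of the province field. As another example, the index values of the sub-tables can run from 0 to the number of sub-tables minus 1, with the allocation rule: take the predetermined field value of the record to be written into the database modulo the number of sub-tables. In this example the predetermined field value should be an integer.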
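The modulo rule and the range condition it must satisfy can be illustrated with a short Python sketch. `modulo_rule` and `rule_is_valid` are hypothetical names; the validity check simply samples field values and confirms that each result is one of the established index values, which is a spot check rather than a proof.

```python
def modulo_rule(field_value, table_count):
    """Index = predetermined field value modulo the number of sub-tables."""
    if not isinstance(field_value, int):
        raise TypeError("the predetermined field value must be an integer")
    return field_value % table_count


def rule_is_valid(rule, sample_values, valid_indexes):
    """Check that the rule maps every sampled value to an existing index."""
    return all(rule(value) in valid_indexes for value in sample_values)


table_count = 8
valid_indexes = set(range(table_count))  # index values 0 .. table_count - 1

print(modulo_rule(13, table_count))
print(rule_is_valid(lambda v: modulo_rule(v, table_count), range(1000), valid_indexes))
# The identity rule fails the condition as soon as a value exceeds 7.
print(rule_is_valid(lambda v: v, [3, 9], valid_indexes))
```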
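Step 130: write the record into the determined sub-table based on the access parameters of the database to which the determined sub-table belongs.
Once the sub-table for the record is determined, a write operation can be initiated on that sub-table using the access parameters of its database, and the record is written into the determined sub-table. The specific way of using the access parameters to write to a sub-table depends on factors such as the type of the database to which the sub-table belongs and the location of that database in the network; it can be implemented with reference to the prior art and is not limited in this embodiment.
If the databases to which the sub-tables belong must be accessed over the network, connections to the database of each sub-table can first be established and maintained according to the access parameters of those databases. When a record is to be written into a sub-table in some database, the write is performed over the connection to that database, and the record is written into the sub-table determined in step 120. In this way each write no longer has to establish and tear down a connection but proceeds directly over an already established one, which improves the efficiency of write operations.
For example, for a relational database accessed over the network via TCP/IP (Transmission Control Protocol/Internet Protocol), the access parameters usually include the IP address of the host where the database resides, the port number, and the database name. (Although relational databases generally have a default port number, an administrator can change it, so in the vast majority of application scenarios the access parameters state the port number the database uses.) For the sub-tables in such a database, TCP connections to the database of each sub-table can be established and maintained using, for instance, the IP address of the database's host and the database's port number, and records are written into the determined sub-tables over these TCP connections.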
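The idea of establishing one connection per database up front and reusing it for every write can be sketched as follows. To keep the example runnable without a network, in-memory SQLite databases stand in for the networked relational databases, so the "access parameter" here is just the string ":memory:" rather than an IP address and port; that substitution, and the class name `ConnectionRegistry`, are assumptions of this sketch.

```python
import sqlite3


class ConnectionRegistry:
    """Open one connection per database and reuse it for every write."""

    def __init__(self, connect):
        self._connect = connect    # factory: access parameters -> connection
        self._connections = {}     # database name -> open connection

    def open_all(self, access_params):
        for db_name, params in access_params.items():
            self._connections[db_name] = self._connect(params)

    def get(self, db_name):
        return self._connections[db_name]

    def close_all(self):
        for conn in self._connections.values():
            conn.close()
        self._connections.clear()


# Stand-in access parameters: each name maps to its own in-memory database.
access_params = {"db0": ":memory:", "db1": ":memory:"}
registry = ConnectionRegistry(sqlite3.connect)
registry.open_all(access_params)

conn = registry.get("db0")
conn.execute("CREATE TABLE user00 (serial INTEGER, name TEXT)")
for serial in (8, 16):
    # Both writes travel over the connection opened once above.
    registry.get("db0").execute("INSERT INTO user00 VALUES (?, ?)", (serial, "n"))

row_count = conn.execute("SELECT COUNT(*) FROM user00").fetchone()[0]
same_connection = registry.get("db0") is conn
registry.close_all()
print(row_count, same_connection)
```

Passing the connection factory in as a parameter keeps the registry independent of the database type, in line with the text's remark that the concrete write mechanism depends on the database.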
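In some applications, for example when data from offline storage platforms, computing platforms, or other sources is written into an online database, frequent write operations may slow the online database's responses to other real-time applications. In this case a buffer can be set up for each sub-table. After the sub-table for a record is determined, the record is written into that sub-table's buffer; when the usage of a sub-table's buffer satisfies a predetermined condition (for example, the number of records in the buffer reaches a certain count, or the buffer's storage usage reaches a predetermined threshold), all records in that buffer are written into the sub-table. This reduces the impact on other operations of the online database and greatly improves the performance of writing to the database.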
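A per-sub-table buffer with a record-count threshold might look like the sketch below. The class name `SubTableBuffer`, the injected `flush` callable, and the final drain of leftover records at the end of input are assumptions of this sketch; the source only describes flushing when the predetermined condition is met.

```python
class SubTableBuffer:
    """Collect records for one sub-table and write them out in batches."""

    def __init__(self, table, flush, max_records):
        self.table = table
        self._flush = flush              # callable(table, records): the batch write
        self._max_records = max_records  # the predetermined condition
        self._records = []

    def add(self, record):
        self._records.append(record)
        if len(self._records) >= self._max_records:
            self.flush()

    def flush(self):
        if self._records:
            self._flush(self.table, self._records)
            self._records = []


batches = []  # stands in for the database: records (table, batch size) per write


def fake_write(table, records):
    batches.append((table, len(records)))


buf = SubTableBuffer("user01", fake_write, max_records=3)
for serial in (1, 9, 17, 25):
    buf.add((serial,))
buf.flush()  # assumption: drain what is left once the input ends
print(batches)
```

With a threshold of 3 and four records, the database sees two batch writes instead of four single-record writes, which is the effect the buffering is meant to achieve.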
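It can be seen that, in the embodiments of the present application, the software performing the database write uses an allocation rule to determine the sub-table into which a record is to be written and completes the write to the determined sub-table using the access parameters of its database. The writing software can thus directly control which records go into which sub-table, organize the data in each sub-table according to actual business needs, and reflow large batches of data into the sub-databases and sub-tables efficiently and flexibly, meeting the functional and performance requirements of data reflow for sub-databases and sub-tables.
In an application example of the present application, a reflow server writes user data from a data source (the data source can be any storage capable of holding data; Table 1 is used here as an example) into the user table. The user table consists of 8 sub-tables distributed over 4 databases: sub-tables user00 and user01 are in database db0, user02 and user03 in db1, user04 and user05 in db2, and user06 and user07 in db3.
The administrator configures on the reflow server the topology of the user table's sub-databases and sub-tables (and the correspondence between each sub-table and its database) and the access parameters of each database; the access parameters include the IP address of the host where each database resides and the port number for accessing the database. One possible form of the configuration is as follows: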
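[Configuration listing: rendered as image PCTCN2016080016-appb-000003 in the original publication]
Here, jdbcUrl describes the access parameters of each database, and table describes the sub-tables in each database.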
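Since the original configuration listing is only available as an image, the following Python structure is a guess at its shape rather than a reproduction: only the keys jdbcUrl and table, the database names, and the sub-table names come from the text, while the host addresses, the port, the mysql URL scheme, and the helper `table_to_jdbc_url` are hypothetical.

```python
# Hypothetical hosts and ports; keys and table names follow the text.
CONFIG = [
    {"jdbcUrl": "jdbc:mysql://10.0.0.1:3306/db0", "table": ["user00", "user01"]},
    {"jdbcUrl": "jdbc:mysql://10.0.0.2:3306/db1", "table": ["user02", "user03"]},
    {"jdbcUrl": "jdbc:mysql://10.0.0.3:3306/db2", "table": ["user04", "user05"]},
    {"jdbcUrl": "jdbc:mysql://10.0.0.4:3306/db3", "table": ["user06", "user07"]},
]


def table_to_jdbc_url(config):
    """Derive the sub-table -> database access parameter mapping (the topology)."""
    mapping = {}
    for entry in config:
        for table in entry["table"]:
            if table in mapping:
                raise ValueError(f"sub-table {table} is listed twice")
            mapping[table] = entry["jdbcUrl"]
    return mapping


topology = table_to_jdbc_url(CONFIG)
print(topology["user05"])
```

Rejecting a sub-table that appears under two databases catches a configuration mistake before any record is routed.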
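The reflow server obtains from the administrator's configuration the 4 databases to which the 8 sub-tables belong and the access parameters of each database. The reflow server establishes an index value for each sub-table; the index values run from 0 to 7 (that is, to the number of sub-tables minus 1), and the correspondence between index values and sub-tables is as follows:
Database db0:
user00 → index value: 0
user01 → index value: 1
Database db1:
user02 → index value: 2
user03 → index value: 3
Database db2:
user04 → index value: 4
user05 → index value: 5
Database db3:
user06 → index value: 6
user07 → index value: 7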
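The index values listed above can be derived mechanically by numbering the sub-tables in topology order. In this sketch the dictionary `TOPOLOGY` restates the example's layout, and `build_index` is a hypothetical helper; it relies on Python dictionaries preserving insertion order.

```python
# Layout of the application example: 4 databases, 2 sub-tables each.
TOPOLOGY = {
    "db0": ["user00", "user01"],
    "db1": ["user02", "user03"],
    "db2": ["user04", "user05"],
    "db3": ["user06", "user07"],
}


def build_index(topology):
    """Assign index values 0..N-1 to the sub-tables in topology order."""
    index = {}
    for db_name, tables in topology.items():
        for table in tables:
            # The next free index equals the number of entries so far.
            index[len(index)] = (db_name, table)
    return index


INDEX = build_index(TOPOLOGY)
for value, (db_name, table) in INDEX.items():
    print(value, db_name, table)
```

Storing the database name alongside each sub-table means a single lookup by index value yields both the target table and the connection to use.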
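The administrator configures the allocation rule as: take the value of the field in column 0 of the record to be written into the database (for example, the value of the serial number field in Table 1) modulo 8, the number of sub-tables. Its expression in Groovy (a programming language) is: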
def route(line){
     // Column 0 holds the serial number; its value modulo 8 is the sub-table index.
     return line.get(0).toInteger() % 8;
}
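For readers more familiar with Python, the Groovy routing expression can be approximated as below, with `int()` standing in for the string-to-integer conversion. The sample lines (serial numbers 0 to 15 with a made-up second column) are hypothetical and serve only to show how records spread over the eight index values.

```python
from collections import defaultdict


def route(line, table_count=8):
    """Return the sub-table index for a record given as a list of column values."""
    # Column 0 may arrive as text, so convert it before taking the modulus.
    return int(line[0]) % table_count


# Hypothetical input: column 0 is the serial number, column 1 is payload.
lines = [[str(serial), f"user-{serial}"] for serial in range(16)]

by_index = defaultdict(list)
for line in lines:
    by_index[route(line)].append(line[0])

for index_value in sorted(by_index):
    print(index_value, by_index[index_value])
```

Consecutive serial numbers land on consecutive index values, so sequentially numbered input is spread evenly across the sub-tables.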
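Using the access parameters of the 4 databases db0, db1, db2, and db3 in which the sub-tables reside, the reflow server establishes a TCP connection to each database and keeps the connections alive. On the reflow server, a buffer is maintained for each sub-table (for example, a storage area with room for 256 records).
Referring to FIG. 2, for a record from the data source that is to be written into a sub-table, the reflow server applies the allocation rule: it takes the value in column 0 of the record (the serial number field) modulo 8 to obtain the index value of the sub-table to be written. For example, for the record in row 1 of Table 1, the reflow server obtains index value 1, so the sub-table to be written is user01. The reflow server writes that record into the buffer of sub-table user01. When the buffer of user01 is full (for example, it reaches 256 records), the reflow server writes all the records in the buffer (256 records) into sub-table user01 over its connection to database db0.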
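The whole flow of the application example (route by modulo 8, buffer per sub-table, flush 256 records over the database's connection) can be sketched end to end. In-memory SQLite databases stand in for db0 to db3 so that the example runs without a network, and the two-column table schema and the generated records are hypothetical; none of this is claimed to be the reflow server's actual implementation.

```python
import sqlite3

TOPOLOGY = {
    "db0": ["user00", "user01"],
    "db1": ["user02", "user03"],
    "db2": ["user04", "user05"],
    "db3": ["user06", "user07"],
}
BUFFER_SIZE = 256

# Stand-in for the four networked databases: one in-memory SQLite each.
connections = {db: sqlite3.connect(":memory:") for db in TOPOLOGY}
index = {}
for db, tables in TOPOLOGY.items():
    for table in tables:
        connections[db].execute(f"CREATE TABLE {table} (serial INTEGER, name TEXT)")
        index[len(index)] = (db, table)

buffers = {i: [] for i in index}


def flush(i):
    """Write every buffered record of sub-table i over its database connection."""
    db, table = index[i]
    with connections[db] as conn:  # commits the batch on success
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", buffers[i])
    buffers[i].clear()


def write(record):
    i = record[0] % len(index)  # allocation rule: column 0 modulo 8
    buffers[i].append(record)
    if len(buffers[i]) >= BUFFER_SIZE:
        flush(i)


# 256 records whose serial numbers are all congruent to 1 modulo 8.
for n in range(BUFFER_SIZE):
    write((1 + 8 * n, f"user-{n}"))

stored = connections["db0"].execute("SELECT COUNT(*) FROM user01").fetchone()[0]
print(stored)
```

The 256th record fills the buffer of user01 and triggers a single batch insert, so the database receives one write instead of 256.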
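Corresponding to the above flow, an embodiment of the present application also provides an apparatus for writing to database sub-tables. The apparatus can be implemented in software, in hardware, or in a combination of the two. Taking a software implementation as an example, the apparatus in the logical sense is formed by the host's CPU reading the corresponding computer program instructions into memory and running them. At the hardware level, in addition to the CPU (Central Processing Unit), memory, and non-volatile storage shown in FIG. 3, the host on which the apparatus resides usually includes other hardware such as a board for network communication.
FIG. 4 shows an apparatus for writing to database sub-tables provided by an embodiment of the present application, including a topology and parameter unit, a record allocation unit, and a record writing unit, where: the topology and parameter unit is configured to obtain the database to which each sub-table belongs and its access parameters; the record allocation unit is configured to determine, by an allocation rule, the sub-table to be written according to at least one field value of a record to be written into the database; and the record writing unit is configured to write the record into the determined sub-table based on the access parameters of the database to which the determined sub-table belongs.
Optionally, the apparatus further includes a buffer setting unit, configured to set up a corresponding buffer for each sub-table; the record writing unit includes a buffer module and a write module, where the buffer module is configured to write the record into the buffer of the determined sub-table, and the write module is configured to write all records in a sub-table's buffer into that sub-table when the usage of the buffer satisfies a predetermined condition.
Optionally, the apparatus further includes an index value establishing unit, configured to establish a unique index value for each sub-table; the record allocation unit is specifically configured to take at least one field value of the record to be written into the database as input and obtain, by the allocation rule, the index value of the sub-table to be written.
In one example, the index values run from 0 to the number of sub-tables minus 1; the allocation rule includes taking a predetermined field value of the record to be written into the database modulo the number of sub-tables; the predetermined field value of the record is an integer.
Optionally, the apparatus further includes a connection unit, configured to establish, according to the access parameters of the database to which each sub-table belongs, a connection to that database; the record writing unit is specifically configured to write the record into the determined sub-table over the connection to the database to which the determined sub-table belongs.
Optionally, the access parameters of the database to which a sub-table belongs include: an IP address, a port number, and a database name.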
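One way to picture the three mandatory units in software is as three small classes wired together, as in the sketch below. The class and attribute names mirror the unit names in the text, but the structure, the injected `send` callable, and the sample parameters are assumptions of this sketch; the optional buffer, index, and connection units are omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class TopologyAndParameterUnit:
    table_to_db: Dict[str, str]      # sub-table name -> database name
    access_params: Dict[str, dict]   # database name -> access parameters

    def params_for(self, table: str) -> dict:
        return self.access_params[self.table_to_db[table]]


@dataclass
class RecordAllocationUnit:
    rule: Callable[[Tuple], str]     # allocation rule: record -> sub-table name

    def allocate(self, record: Tuple) -> str:
        return self.rule(record)


@dataclass
class RecordWritingUnit:
    topology: TopologyAndParameterUnit
    send: Callable[[dict, str, Tuple], None]   # performs the actual write

    def write(self, table: str, record: Tuple) -> None:
        self.send(self.topology.params_for(table), table, record)


# Hypothetical wiring: two sub-tables in one database, writes captured in a list.
written = []


def capture(params, table, record):
    written.append((params["database"], table, record))


topology = TopologyAndParameterUnit(
    table_to_db={"user00": "db0", "user01": "db0"},
    access_params={"db0": {"ip": "10.0.0.1", "port": 3306, "database": "db0"}},
)
allocator = RecordAllocationUnit(rule=lambda rec: f"user{rec[0] % 2:02d}")
writer = RecordWritingUnit(topology, capture)

record = (3, "example")
writer.write(allocator.allocate(record), record)
print(written)
```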
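The above are merely preferred embodiments of the present application and are not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within its scope of protection.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include non-persistent memory in computer-readable media, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", or any other variant are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes it.
Those skilled in the art should understand that the embodiments of the present application can be provided as a method, a system, or a computer program product. Accordingly, the present application can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.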

Claims (12)

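  1. A method for writing to database sub-tables, characterized by comprising:
    obtaining the database to which each sub-table belongs and its access parameters;
    determining, by an allocation rule, the sub-table to be written according to at least one field value of a record to be written into the database;
    writing the record into the determined sub-table based on the access parameters of the database to which the determined sub-table belongs.
  2. The method according to claim 1, characterized in that the method further comprises: setting up a corresponding buffer for each sub-table;
    the writing of the record into the determined sub-table comprises:
    writing the record into the buffer of the determined sub-table;
    when the usage of a sub-table's buffer satisfies a predetermined condition, writing all records in that sub-table's buffer into the sub-table.
  3. The method according to claim 1, characterized in that the method further comprises: establishing a unique index value for each sub-table;
    the determining, by an allocation rule, of the sub-table to be written according to at least one field value of a record to be written into the database comprises: taking at least one field value of the record to be written into the database as input, and obtaining, by the allocation rule, the index value of the sub-table to be written.
  4. The method according to claim 3, characterized in that the index values run from 0 to the number of sub-tables minus 1;
    the allocation rule comprises: taking a predetermined field value of the record to be written into the database modulo the number of sub-tables; the predetermined field value of the record is an integer.
  5. The method according to claim 1, characterized in that the method further comprises: establishing, according to the access parameters of the database to which each sub-table belongs, a connection to the database to which each sub-table belongs;
    the writing of the record into the determined sub-table based on the access parameters of the database to which the determined sub-table belongs comprises: writing the record into the determined sub-table over the connection to the database to which the determined sub-table belongs.
  6. The method according to claim 1, characterized in that the access parameters of the database to which a sub-table belongs comprise: an IP address, a port number, and a database name.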
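  7. An apparatus for writing to database sub-tables, characterized by comprising:
    a topology and parameter unit, configured to obtain the database to which each sub-table belongs and its access parameters;
    a record allocation unit, configured to determine, by an allocation rule, the sub-table to be written according to at least one field value of a record to be written into the database;
    a record writing unit, configured to write the record into the determined sub-table based on the access parameters of the database to which the determined sub-table belongs.
  8. The apparatus according to claim 7, characterized in that the apparatus further comprises: a buffer setting unit, configured to set up a corresponding buffer for each sub-table;
    the record writing unit comprises:
    a buffer module, configured to write the record into the buffer of the determined sub-table;
    a write module, configured to write all records in a sub-table's buffer into that sub-table when the usage of the buffer satisfies a predetermined condition.
  9. The apparatus according to claim 7, characterized in that the apparatus further comprises: an index value establishing unit, configured to establish a unique index value for each sub-table;
    the record allocation unit is specifically configured to: take at least one field value of the record to be written into the database as input, and obtain, by the allocation rule, the index value of the sub-table to be written.
  10. The apparatus according to claim 9, characterized in that the index values run from 0 to the number of sub-tables minus 1;
    the allocation rule comprises: taking a predetermined field value of the record to be written into the database modulo the number of sub-tables; the predetermined field value of the record is an integer.
  11. The apparatus according to claim 7, characterized in that the apparatus further comprises: a connection unit, configured to establish, according to the access parameters of the database to which each sub-table belongs, a connection to the database to which each sub-table belongs;
    the record writing unit is specifically configured to: write the record into the determined sub-table over the connection to the database to which the determined sub-table belongs.
  12. The apparatus according to claim 7, characterized in that the access parameters of the database to which a sub-table belongs comprise: an IP address, a port number, and a database name.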
PCT/CN2016/080016 2015-05-25 2016-04-22 Method and apparatus for writing to database sub-tables WO2016188280A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510272535.9A CN106294423A (zh) 2015-05-25 2015-05-25 Method and apparatus for writing to database sub-tables
CN201510272535.9 2015-05-25

Publications (1)

Publication Number Publication Date
WO2016188280A1 true WO2016188280A1 (zh) 2016-12-01

Family

ID=57393774

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/080016 WO2016188280A1 (zh) 2015-05-25 2016-04-22 Method and apparatus for writing to database sub-tables

Country Status (2)

Country Link
CN (1) CN106294423A (zh)
WO (1) WO2016188280A1 (zh)

Also Published As

Publication number Publication date
CN106294423A (zh) 2017-01-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16799169

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16799169

Country of ref document: EP

Kind code of ref document: A1