WO2019161645A1 - Shell-based data table extraction method, terminal, device, and storage medium - Google Patents

Shell-based data table extraction method, terminal, device, and storage medium

Info

Publication number
WO2019161645A1
WO2019161645A1 PCT/CN2018/101880 CN2018101880W WO2019161645A1 WO 2019161645 A1 WO2019161645 A1 WO 2019161645A1 CN 2018101880 W CN2018101880 W CN 2018101880W WO 2019161645 A1 WO2019161645 A1 WO 2019161645A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
data information
data table
different types
target
Prior art date
Application number
PCT/CN2018/101880
Other languages
French (fr)
Chinese (zh)
Inventor
林林
戴建明
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2019161645A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 - Querying
    • G06F16/245 - Query processing
    • G06F16/2455 - Query execution

Definitions

  • the present application relates to the field of computer technology, and in particular, to a Shell-based data table extraction method, terminal, device, and storage medium.
  • Shell is a free programming language used to automate communication with interactive tasks without human intervention. It can be used to create scripts that supply input to a command or program; following the program's prompts, the shell simulates the standard input that the program expects, so that interactive programs can be executed automatically.
  • in existing applications, shell scripts often involve a large number of data tables. Manually going through each shell script to collect the data tables it uses is very time-consuming and labor-intensive. In addition, the statements that reference data tables in a shell script change as the application version changes; manually organizing and updating this information also takes a great deal of manpower, and the resulting table lists are prone to errors.
  • the embodiment of the present application provides a Shell-based data table extraction method, terminal, device, and storage medium, which can simplify the collation and updating process to the greatest extent and save a large amount of human resources.
  • the embodiment of the present application provides a Shell-based data table extraction method, where the method includes: identifying the data tables in a shell script; extracting the table names of the data tables; classifying the data tables according to the extracted table names, where the data tables include source tables and target tables; and obtaining the data information corresponding to the different types of data tables and outputting it to the same preset document.
  • the embodiment of the present application provides a shell-based data table extraction terminal, where the terminal includes:
  • an identification unit configured to identify the data tables in the shell script;
  • an extraction unit configured to extract the table names of the data tables;
  • a classification unit configured to classify the data tables according to the extracted table names, where the data tables include source tables and target tables;
  • an obtaining unit configured to obtain the data information corresponding to the different types of data tables, and to output the obtained data information of the different types to the same preset document.
  • the embodiment of the present application further provides a Shell-based data table extraction device, including:
  • a memory for storing a program implementing the data table extraction method
  • a processor for running a program for implementing the data table extraction method stored in the memory to perform the following operations:
  • the embodiment of the present application further provides a computer-readable storage medium storing one or more computer programs, which can be executed by one or more processors to implement the following steps:
  • with the improved data table extraction method of the present application, there is no need to search manually and laboriously for the data tables related to each script; the collation and updating process is greatly simplified, and a large amount of human resources can be saved.
  • FIG. 1 is a schematic flow chart of a Shell-based data table extraction method according to an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a Shell-based data table extraction method according to an embodiment of the present application
  • FIG. 3 is a schematic flowchart of a method for extracting a data table based on a shell provided by an embodiment of the present application
  • FIG. 4 is a schematic flowchart of a Shell-based data table extraction method according to an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of a method for extracting a data table based on a shell according to another embodiment of the present application.
  • FIG. 6 is a schematic block diagram of a shell-based data table extraction terminal according to an embodiment of the present application.
  • FIG. 7 is another schematic block diagram of a shell-based data table extracting terminal according to an embodiment of the present application.
  • FIG. 8 is another schematic block diagram of a shell-based data table extracting terminal according to an embodiment of the present application.
  • FIG. 9 is another schematic block diagram of a shell-based data table extraction terminal according to an embodiment of the present application.
  • FIG. 10 is another schematic block diagram of a shell-based data table extraction terminal according to an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a Shell-based data table extraction device according to an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of a method for extracting a data table based on a shell according to an embodiment of the present application.
  • the method can be run on terminals such as smartphones (for example, Android phones and iOS phones), tablets, laptops, and smart devices.
  • the data table extraction method described in the embodiment of the present application removes the need to search manually and laboriously for the data tables related to each script, greatly simplifies the collation and updating process, and can save a large amount of human resources.
  • the method shown in FIG. 1 includes steps S101 to S104.
  • the data table refers to a table that a shell script accesses in a database through an SQL statement. By connecting to the database from the shell script and calling the data table, the data in the database can be obtained, for example to monitor certain information in the database and thereby track the performance of the device in real time.
  • identifying the data tables in the shell script can be implemented by identifying the keywords in the SQL statements.
  • for example, the insert statement "insert into" can be identified in order to find the data table that follows it;
  • likewise, the query statement "select * from" can be identified to find the data table that follows the query; the update statement "update" can be identified to find the data table that follows it; the delete statement "delete from" can be identified to find the data table that follows it; and so on.
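The keyword-based identification described above can be sketched in Python. This is only an illustration under stated assumptions: the regular expression, the function name `find_tables`, and the sample script are hypothetical and not taken from the application, and a plain keyword match like this will also pick up the word "from" in comments or unrelated text.

```python
import re

# Keywords named in the text. The pattern itself is an illustrative assumption:
# it captures the identifier that directly follows one of the keywords.
TABLE_AFTER_KEYWORD = re.compile(
    r"\b(insert\s+into|delete\s+from|from|update)\s+([A-Za-z0-9_.]+)",
    re.IGNORECASE,
)

def find_tables(script_text):
    """Return (keyword, table_name) pairs in order of appearance."""
    pairs = []
    for match in TABLE_AFTER_KEYWORD.finditer(script_text):
        # Normalize the keyword: lower case, single spaces.
        keyword = " ".join(match.group(1).lower().split())
        pairs.append((keyword, match.group(2)))
    return pairs

# Hypothetical shell script content used only for demonstration.
sample = '''
hive -e "insert into tableA select * from tableB"
sqlplus -s user/pw <<EOF
update tableC set flag = 1;
delete from tableD;
EOF
'''

for keyword, table in find_tables(sample):
    print(keyword, "->", table)
```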
  • the extracted table names are stored in a temporary file, and the types of data table include a source table and a target table.
  • the type of a data table may be classified according to its table name as follows: if the table name is a standalone string delimited by spaces or line breaks and is preceded by the from keyword, the data table is a source table;
  • if the table name is a standalone string delimited by spaces or line breaks and is preceded by a keyword other than from, the data table is a target table.
  • optionally, the non-from keyword may be into, update, or another SQL statement keyword.
  • the step S103 includes steps S201 to S202.
  • the character string refers to the string of characters that makes up the table name of the data table. Since a table name may be composed of digits, letters, and underscores, the character string likewise consists of digits, letters, and underscores.
  • the data table is classified according to the character string, and the classification depends on the SQL statement keyword in front of the table name: if the table name is a standalone string delimited by spaces or line breaks and is preceded by the from keyword, the data table is a source table; if the table name is a standalone string delimited by spaces or line breaks and is preceded by a keyword other than from, the data table is a target table.
  • the non-from keyword may be an SQL statement keyword such as into or update.
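The classification rule of steps S201 to S202 can be sketched as follows. This is a minimal sketch, assuming whitespace tokenization stands in for "a standalone string delimited by spaces or line breaks"; the function names and the keyword set are hypothetical.

```python
def classify_table(preceding_keyword):
    """Apply the stated rule: 'from' means source table, any other keyword means target table."""
    keyword = preceding_keyword.strip().lower()
    return "source" if keyword == "from" else "target"

def classify_statement(statement):
    """Split on whitespace and classify the token that follows each listed keyword."""
    keywords = {"from", "into", "update"}  # keywords mentioned in the text
    tokens = statement.split()
    result = {}
    for previous, current in zip(tokens, tokens[1:]):
        if previous.lower() in keywords:
            result[current.strip(";")] = classify_table(previous)
    return result

print(classify_statement("insert into tableA select * from tableB"))
```

Under a literal reading of this rule, a table that follows "delete from" would also be labeled a source table; the text does not say how that case should be handled, so the sketch does not special-case it.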
  • for example, suppose the data table is a Hadoop JOB association table. The Hadoop JOB association table is written using Hadoop statements and SQL statements and is saved in the corresponding database, and its table name is written into the corresponding shell script.
  • when Hadoop JOB association tables need to be identified, the table names in the shell script are extracted first to determine which JOB association tables the script involves, and each JOB association table is then identified as either a source table or a target table.
  • the source table refers to a table inside Hadoop or a table of an external relational database; the string of a source table is delimited by spaces or line breaks, and the table name is preceded by the from keyword.
  • the target table is divided by write mode into an insertion target table and an overlay target table: in "insert into tableA", tableA is an insertion target table; in "insert overwrite tableB", tableB is an overlay target table.
  • the preset document may be a data table in a preset database. For example, keywords and their related content are captured from the script of the JOB association table and recorded in a temporary file; this is done at the HDFS level of Hadoop, and the results in the temporary file are then loaded into a Hive table in Hadoop.
  • the data in the Hive table is then exported to the specified preset Oracle database through Sqoop; specifically, the data information is written to a pre-established data table in the preset Oracle database.
  • the user can wrap the pre-established data table that stores the data information in an Oracle Pkg (Oracle package). If the data table needs to be optimized, only the Oracle Pkg needs to be optimized.
  • the step S104 includes steps S301 to S303.
  • the source table is divided into an internal source table and an external source table.
  • the internal source table refers to a table inside Hadoop (for example, a Hive table of Hadoop), and the external source table refers to a table of an external relational database.
  • the data information includes table information, field information, and the like, where the table information may be a table name, a table type, or the like, and the field information may be a field name, a field type, or the like.
  • the preset document may be a pre-established data table in the preset Oracle database.
  • the acquired data information may be output to the pre-established data table in the preset Oracle database.
  • the user can wrap the pre-established data table that stores the data information in an Oracle Pkg (Oracle package). To optimize the pre-established data table in the preset Oracle database, only the Oracle Pkg needs to be optimized.
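The data information handled in steps S301 to S303 (table name, table type, field name, field type) can be pictured as one record per field, all written to a single output. The sketch below is an assumption-laden illustration: it writes CSV rows to an in-memory buffer instead of the Oracle table the text describes, and the class `TableInfo`, the function `write_preset_document`, and the sample tables are hypothetical.

```python
import csv
import io
from dataclasses import dataclass, field

@dataclass
class TableInfo:
    name: str                                   # table name
    table_type: str                             # e.g. "internal source", "external source"
    fields: list = field(default_factory=list)  # (field_name, field_type) pairs

def write_preset_document(tables, out):
    """Write every table's field information into the same output (CSV stands in for the preset document)."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["table_name", "table_type", "field_name", "field_type"])
    for table in tables:
        for field_name, field_type in table.fields:
            writer.writerow([table.name, table.table_type, field_name, field_type])

# Hypothetical tables used only for demonstration.
tables = [
    TableInfo("tableB", "internal source", [("id", "bigint")]),
    TableInfo("ext_orders", "external source", [("order_no", "varchar2")]),
]
buffer = io.StringIO()
write_preset_document(tables, buffer)
print(buffer.getvalue())
```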
  • the step S104 includes steps S401 to S403.
  • the target table is divided into an insertion target table and an overlay target table.
  • the insertion target table is, for example, the table tableA in the SQL statement "insert into tableA"; the overlay target table is, for example, the table tableB in the SQL statement "insert overwrite tableB".
  • the type of a target table is determined by the SQL statement keyword in front of the table name: into is followed by an insertion target table, and overwrite is followed by an overlay target table.
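The split of target tables by write mode can be sketched in the same style. The regular expression and the function name `target_write_modes` are hypothetical; accepting an optional `table` keyword after `insert overwrite` is an assumption based on Hive's `INSERT OVERWRITE TABLE` syntax and is not stated in the text.

```python
import re

# "into" marks an insertion target table, "overwrite" an overlay target table.
# The optional "table" keyword is an assumption (Hive-style syntax).
WRITE_MODE = re.compile(
    r"\binsert\s+(into|overwrite)\s+(?:table\s+)?([A-Za-z0-9_.]+)",
    re.IGNORECASE,
)

def target_write_modes(sql):
    """Map each target table name to 'insertion' or 'overlay'."""
    modes = {}
    for match in WRITE_MODE.finditer(sql):
        mode = "insertion" if match.group(1).lower() == "into" else "overlay"
        modes[match.group(2)] = mode
    return modes

print(target_write_modes("insert into tableA select 1; insert overwrite tableB select 2"))
```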
  • the data information includes table information, field information, and the like, where the table information may be a table name, a table type, or the like, and the field information may be a field name, a field type, or the like.
  • the preset document may be a pre-established data table in the preset Oracle database.
  • the acquired data information may be output to the pre-established data table in the preset Oracle database.
  • the user can wrap the pre-established data table that stores the data information in an Oracle Pkg (Oracle package). To optimize the pre-established data table in the preset Oracle database, only the Oracle Pkg needs to be optimized.
  • the embodiment of the present application identifies the data tables in the shell script, extracts their table names, classifies the data tables according to the extracted table names (the data tables including source tables and target tables), obtains the data information corresponding to the different types of data tables, and outputs the obtained data information to the same preset document.
  • with the improved data table extraction method of the present application, there is no need to search manually and laboriously for the data tables related to each script; the collation and updating process is greatly simplified, and a large amount of human resources can be saved.
  • FIG. 5 is a schematic flowchart of a Shell-based data table extraction method according to an embodiment of the present application.
  • the method can be run on terminals such as smartphones (for example, Android phones and iOS phones), tablets, laptops, and smart devices.
  • the method includes steps S501 to S506.
  • the shell script is traversed according to a preset keyword.
  • the traversal follows a rule of first traversing the data tables whose creation time is short and then the data tables whose creation time is long, thereby traversing the data tables of the shell script.
  • traversing the shell script in order from short creation time to long creation time can improve the efficiency of processing the shell script.
  • the traversal result is used to show the positions of the data tables in the shell script, and the data tables in the shell script are located using this position information.
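Traversing a script by preset keywords and recording where they occur can be sketched as below. The keyword list, the function name `locate_keywords`, and the sample script are hypothetical, and the creation-time ordering rule is left out because the text does not specify how creation time is determined.

```python
import re

def locate_keywords(script_text, keywords=("insert into", "from", "update")):
    """Return (line_number, keyword) for every line that contains a preset keyword."""
    patterns = [
        # Allow any run of whitespace between the words of a multi-word keyword.
        (keyword, re.compile(r"\b" + r"\s+".join(map(re.escape, keyword.split())) + r"\b", re.IGNORECASE))
        for keyword in keywords
    ]
    hits = []
    for line_number, line in enumerate(script_text.splitlines(), start=1):
        for keyword, pattern in patterns:
            if pattern.search(line):
                hits.append((line_number, keyword))
    return hits

# Hypothetical script content used only for demonstration.
script = "#!/bin/sh\nhive -e 'insert into tableA select * from tableB'\necho done\n"
print(locate_keywords(script))
```

The recorded line numbers play the role of the position information that the locating step uses to find the data tables.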
  • the data tables in the shell script are then identified, their table names are extracted, the data tables are classified into source tables and target tables according to the extracted table names, and the data information is output to the preset document, in the same way as described above for steps S101 to S104 of the embodiment of FIG. 1, including the handling of Hadoop JOB association tables and the export from the Hive table to the preset Oracle database through Sqoop.
  • the embodiment of the present application further provides a Shell-based data table extraction terminal, where the terminal 100 includes: an identification unit 101, an extraction unit 102, a classification unit 103, and an obtaining unit 104.
  • the identification unit 101 is configured to identify a data table in the shell script.
  • the extracting unit 102 is configured to extract a table name of the data table.
  • the classification unit 103 is configured to classify the data table according to the extracted table name, wherein the data table includes a source table and a target table.
  • the obtaining unit 104 is configured to acquire data information corresponding to different types of data tables, and output the acquired different types of data information into the same preset document.
  • the classification unit 103 includes:
  • the determining unit 1031 is configured to determine a character string corresponding to the data table table name.
  • the classification subunit 1032 is configured to classify the data table according to the character string.
  • the obtaining unit 104 includes:
  • the first execution unit 1041 is configured to divide the source table into an internal source table and an external source table.
  • the first obtaining subunit 1042 is configured to acquire data information corresponding to the internal source table and the external source table.
  • the first output unit 1043 is configured to output the acquired data information into the preset document.
  • the obtaining unit 104 includes:
  • the second execution unit 1044 is configured to divide the target table into an insertion target table and an overlay target table.
  • the second obtaining subunit 1045 is configured to acquire data information corresponding to the insertion target table and the overlay target table.
  • the second output unit 1046 is configured to output the acquired data information into the preset document.
  • the embodiment of the present application further provides a Shell-based data table extraction terminal, where the terminal 200 includes: a traversing unit 201, a locating unit 202, an identification unit 203, an extracting unit 204, a classification unit 205, and an obtaining unit 206.
  • the traversing unit 201 is configured to traverse the shell script according to a preset keyword.
  • the locating unit 202 is configured to locate the data table in the shell script according to the result of the traversal.
  • the identification unit 203 is configured to identify a data table in the shell script.
  • the extracting unit 204 is configured to extract a table name of the data table.
  • the classification unit 205 is configured to classify the data table according to the extracted table name, wherein the data table includes a source table and a target table.
  • the obtaining unit 206 is configured to acquire data information corresponding to different types of data tables, and output the acquired different types of data information into the same preset document.
  • the above Shell-based data table extraction terminal can be implemented in the form of a computer program that can be run on a device as shown in FIG.
  • FIG. 11 is a schematic structural diagram of a Shell-based data table extraction device according to the present application.
  • the device may be a terminal or a server, wherein the terminal may be a communication device such as a smart phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant, and a wearable device.
  • the server can be a standalone server or a server cluster consisting of multiple servers.
  • the computer device 500 includes a processor 502, a non-volatile storage medium 503, an internal memory 504, and a network interface 505 connected by a system bus 501.
  • the non-volatile storage medium 503 of the computer device 500 can store an operating system 5031 and a computer program 5032; when executed, the computer program 5032 causes the processor 502 to perform a Shell-based data table extraction method.
  • the processor 502 of the computer device 500 is used to provide computing and control capabilities to support the operation of the entire computer device 500.
  • the internal memory 504 provides an environment for the operation of the computer program 5032 in the non-volatile storage medium 503, which when executed by the processor, causes the processor 502 to perform a shell-based data table extraction method.
  • the network interface 505 of the computer device 500 is used to perform network communications, such as sending assigned tasks.
  • it will be understood by those skilled in the art that the structure shown in FIG. 11 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • the processor 502 performs the following operations:
  • the processor 502 also performs the following operations:
  • the data table in the shell script is located according to the result of the traversal.
  • the classifying the data table according to the extracted table name comprises:
  • the data table is classified according to the character string.
  • the acquiring data information corresponding to different types of data tables, and outputting the acquired different types of data information to the same preset document includes:
  • the acquired data information is output to a preset document.
  • the acquiring data information corresponding to different types of data tables, and outputting the acquired different types of data information to the same preset document includes:
  • the acquired data information is output to a preset document.
  • the embodiment of the Shell-based data table extraction device shown in FIG. 11 does not constitute a limitation on the specific configuration of the Shell-based data table extraction device.
  • the Shell-based data table extraction device may include more or fewer components than illustrated, combine certain components, or have a different arrangement of components.
  • in some embodiments, the Shell-based data table extraction device includes only the memory and the processor. In such an embodiment, the structure and function of the memory and the processor are consistent with the embodiment described above and are not repeated here.
  • the application provides a computer-readable storage medium storing one or more computer programs, the one or more computer programs being executable by one or more processors to implement the above-described Shell-based data table extraction method.
  • the foregoing storage medium of the present application includes: a magnetic disk, an optical disk, a read-only memory (ROM), and the like, which can store various program codes.
  • the units in all the embodiments of the present application may be implemented by a general-purpose integrated circuit, such as a CPU (Central Processing Unit), or by an ASIC (Application Specific Integrated Circuit).
  • the steps of the Shell-based data table extraction method in the embodiment of the present application may be reordered, merged, or deleted according to actual needs.
  • the units of the Shell-based data table extraction terminal may be merged, divided, or deleted according to actual needs.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed in the present application are a Shell-based data table extraction method, a terminal, a device, and a storage medium. The method comprises: identifying data tables in a Shell script; extracting the table names of the data tables; classifying the data tables according to the extracted table names, wherein the data tables comprise a source table and a target table; and obtaining data information corresponding to different types of data tables, and outputting the obtained different types of data information into the same preset document. With the improved data table extraction method of the present application, there is no need to search manually and laboriously for the data tables related to each script; the collation and updating processes are simplified to the maximum extent, and a large amount of manpower can be saved.

Description

基于Shell的数据表提取方法、终端、设备及存储介质Shell-based data table extraction method, terminal, device and storage medium
本申请要求于2018年2月24日提交中国专利局、申请号为CN201810156612.8、申请名称为“基于Shell的数据表提取方法、终端、设备及存储介质”以及于2018年3月9日提交中国专利局、申请号为CN201810196485.4、申请名称为“基于Shell的数据表提取方法、终端、设备及存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。This application is required to be submitted to the China Patent Office on February 24, 2018, the application number is CN201810156612.8, and the application name is “Shell-based data table extraction method, terminal, equipment and storage medium” and submitted on March 9, 2018. The Chinese Patent Office, Application No. CN201810196485.4, the priority of the entire disclosure of the entire disclosure of the entire disclosure of the disclosure of the disclosure of the disclosure of the disclosure of the disclosure of the disclosure of the disclosure of the disclosure of the disclosure of
技术领域Technical field
本申请涉及计算机技术领域,尤其涉及一种基于Shell的数据表提取方法、终端、设备及存储介质。The present application relates to the field of computer technology, and in particular, to a Shell-based data table extraction method, terminal, device, and storage medium.
背景技术Background technique
Shell是一个免费的编程语言,用来实现自动和交互式任务进行通信,而无需人的干预。使用它可以创建脚本用来实现对命令或程序提供输入,Shell则可以根据程序的提示模拟标准输入提供给程序需要的输入来实现交互程序执行。Shell is a free programming language for communicating between automated and interactive tasks without human intervention. Use it to create a script to provide input to a command or program. The shell can simulate the input of the standard input provided by the program according to the prompt of the program to implement interactive program execution.
在现有的Shell脚本的应用中,Shell脚本往往涉及了比较多的数据表,如果通过人工对每个Shell脚本进行整理,以获取Shell脚本中的数据表,会导致提取过程非常耗时,并且工作量非常大;另外,Shell脚本中的数据表的语句会随着应用版本的修改而发生变化,若依靠人工整理并更新这些信息同样需要耗费大量人力,且整理出来的数据表也很容易出现错误。In the existing shell script application, the shell script often involves more data tables. If you manually sort each shell script to get the data table in the shell script, the extraction process will be very time consuming, and The workload is very large; in addition, the statement of the data table in the shell script will change with the modification of the application version. It takes a lot of manpower to manually organize and update the information, and the compiled data table is also prone to appear. error.
Summary
In view of this, embodiments of the present application provide a Shell-based data table extraction method, terminal, device, and storage medium that simplify the collation and update process as far as possible and can save substantial human resources.
In one aspect, an embodiment of the present application provides a Shell-based data table extraction method, the method comprising:
identifying the data tables in a Shell script;
extracting the table names of the data tables;
classifying the data tables according to the extracted table names, wherein the data tables include source tables and target tables; and
acquiring the data information corresponding to the different types of data tables, and outputting the acquired data information of the different types to the same preset document.
In another aspect, an embodiment of the present application provides a Shell-based data table extraction terminal, the terminal comprising:
an identification unit, configured to identify the data tables in a Shell script;
an extraction unit, configured to extract the table names of the data tables;
a classification unit, configured to classify the data tables according to the extracted table names, wherein the data tables include source tables and target tables; and
an acquisition unit, configured to acquire the data information corresponding to the different types of data tables, and to output the acquired data information of the different types to the same preset document.
In yet another aspect, an embodiment of the present application further provides a Shell-based data table extraction device, comprising:
a memory, configured to store a program that implements the data table extraction method; and
a processor, configured to run the program stored in the memory that implements the data table extraction method, so as to perform the following operations:
identifying the data tables in a Shell script;
extracting the table names of the data tables;
classifying the data tables according to the extracted table names, wherein the data tables include source tables and target tables; and
acquiring the data information corresponding to the different types of data tables, and outputting the acquired data information of the different types to the same preset document.
In still another aspect, an embodiment of the present application further provides a computer-readable storage medium storing one or more computer programs, the one or more computer programs being executable by one or more processors to implement the following steps:
identifying the data tables in a Shell script;
extracting the table names of the data tables;
classifying the data tables according to the extracted table names, wherein the data tables include source tables and target tables; and
acquiring the data information corresponding to the different types of data tables, and outputting the acquired data information of the different types to the same preset document.
With the improved data table extraction method of the embodiments of the present application, there is no need to search laboriously by hand for the data tables related to each script; the collation and update process is simplified as far as possible, and substantial human resources can be saved.
Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a Shell-based data table extraction method provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of a Shell-based data table extraction method provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of a Shell-based data table extraction method provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of a Shell-based data table extraction method provided by an embodiment of the present application;
FIG. 5 is a schematic flowchart of a Shell-based data table extraction method provided by another embodiment of the present application;
FIG. 6 is a schematic block diagram of a Shell-based data table extraction terminal provided by an embodiment of the present application;
FIG. 7 is another schematic block diagram of a Shell-based data table extraction terminal provided by an embodiment of the present application;
FIG. 8 is another schematic block diagram of a Shell-based data table extraction terminal provided by an embodiment of the present application;
FIG. 9 is another schematic block diagram of a Shell-based data table extraction terminal provided by an embodiment of the present application;
FIG. 10 is another schematic block diagram of a Shell-based data table extraction terminal provided by an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a Shell-based data table extraction device provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present application.
It should be understood that, when used in this specification and the appended claims, the terms "comprising" and "including" indicate the presence of the described features, integers, steps, operations, elements, and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or collections thereof.
It should also be understood that the terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit the present application. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a Shell-based data table extraction method provided by an embodiment of the present application. The method can run on terminals such as smartphones (for example, Android phones and iOS phones), tablet computers, notebook computers, and smart devices. With the data table extraction method described in this embodiment, there is no need to search laboriously by hand for the data tables related to each script; the collation and update process is simplified as far as possible, and substantial human resources can be saved. The method includes steps S101 to S104.
S101: Identify the data tables in the Shell script.
In this embodiment, a data table is a table retrieved from a database to which the Shell script connects by means of SQL statements. The Shell script connects to the database and calls the data tables in order to obtain data from the database. In routine operation and maintenance work, obtaining data from the database makes it possible to monitor certain information in the database and thus to follow device performance in real time.
The data tables in a Shell script can be identified by recognizing keywords in its SQL statements. For example, recognizing the insert statement "insert into" identifies the data table that follows the insert statement; recognizing the query statement "select * from" identifies the data table that follows the query statement; recognizing the update statement "update" identifies the data table that follows the update statement; and recognizing the delete statement "delete from" identifies the data table that follows the delete statement, and so on.
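The keyword-based identification described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the regular expression, the function name identify_tables, and the sample script are hypothetical and do not appear in the application, which names only the keywords themselves.

```python
import re

# Keywords taken from the text; the pattern itself is an assumption.
# Longer keywords come first so that "delete from" is not reported as "from".
TABLE_KEYWORDS = re.compile(
    r"\b(insert\s+into|delete\s+from|from|update)\s+([A-Za-z0-9_.{}$]+)",
    re.IGNORECASE,
)


def identify_tables(script_text):
    """Return (keyword, table) pairs for every keyword match in the script."""
    return [
        (" ".join(m.group(1).lower().split()), m.group(2))
        for m in TABLE_KEYWORDS.finditer(script_text)
    ]


# Hypothetical Shell script that embeds SQL statements.
script = '''
hive -e "insert into tableA select * from tableB"
sqlplus -s user/pw <<EOF
update tableC set flag = 1;
delete from tableD;
EOF
'''
print(identify_tables(script))
```

A plain keyword scan like this does not understand comments, quoting, or subqueries, so it should be read as a starting point rather than a parser.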
S102: Extract the table names of the data tables.
In this embodiment, after the data tables in the Shell script have been identified, the table name of each identified data table is extracted. For example, the table name extracted from the insert statement "insert into {TABLENAME}" is "TABLENAME"; the table name extracted from the query statement "select * from {USERNAME}" is "USERNAME"; the table name extracted from the update statement "update {DBNAME}" is "DBNAME"; and the table name extracted from the delete statement "delete from {KBNAME}" is "KBNAME".
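The four examples above can be reproduced with a small Python sketch. The pattern and the function name extract_table_names are assumptions made for illustration; in particular, treating the braces as an optional wrapper around the name is one reading of the examples, not something the application specifies.

```python
import re

# Hypothetical pattern: a keyword, whitespace, then a name that may be
# wrapped in braces as in the examples of the text.
NAME_AFTER_KEYWORD = re.compile(
    r"\b(?:insert\s+into|delete\s+from|from|update)\s+\{?([A-Za-z0-9_]+)\}?",
    re.IGNORECASE,
)


def extract_table_names(statement):
    """Return the bare table names found in one statement, in order."""
    return NAME_AFTER_KEYWORD.findall(statement)


for stmt in (
    "insert into {TABLENAME}",
    "select * from {USERNAME}",
    "update {DBNAME}",
    "delete from {KBNAME}",
):
    print(stmt, "->", extract_table_names(stmt))
```

Because the pattern requires whitespace after the keyword, a function call such as from_unixtime(ts) yields no table name.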
S103: Classify the data tables according to the extracted table names, wherein the data tables include source tables and target tables.
In this embodiment, after the table names have been extracted by means of a series of SQL keywords, the table names are stored in a temporary file. The types of data table include source tables and target tables. The data tables may be classified by table name as follows: if the table name is a standalone character string with a space or line break before and after it, and the table name is immediately preceded by the "from" keyword, the data table is a source table; if the table name is a standalone character string with a space or line break before and after it, and the table name is immediately preceded by a keyword other than "from", the data table is a target table. Optionally, the keyword other than "from" may be an SQL keyword such as "into" or "update".
Further, as shown in FIG. 2, step S103 includes steps S201 to S202.
S201: Determine the character string corresponding to each table name.
In this embodiment, the character string is the sequence of characters that corresponds to the table name. Because a table name may consist of digits, letters, and underscores, the character string may likewise consist of digits, letters, and underscores.
S202: Classify the data tables according to the character strings.
In this embodiment, the data tables are classified according to the character strings, and the classification depends on the SQL keyword that precedes the table name. The classification may be performed as follows: if the table name is a standalone character string with a space or line break before and after it, and the table name is immediately preceded by the "from" keyword, the data table is a source table; if the table name is a standalone character string with a space or line break before and after it, and the table name is immediately preceded by a keyword other than "from", the data table is a target table. Optionally, the keyword other than "from" may be an SQL keyword such as "into" or "update". Classifying by character string excludes interference from invalid matches. For example, in "from_unixtime" the characters that follow "from" do not meet the stated conditions, so the match is treated as an invalid data table and is not considered.
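The classification rule above can be sketched in Python as follows. The regular expression and the function name classify_tables are assumptions; the sketch only requires whitespace (or the start or end of the text) around the keyword and the name, and it does not handle statement terminators such as a trailing semicolon.

```python
import re

# A table name is a standalone token of letters, digits, and underscores,
# delimited by whitespace or the text boundary, and preceded by a keyword.
TOKEN = re.compile(
    r"(?<!\S)(from|into|update)\s+([A-Za-z0-9_]+)(?!\S)",
    re.IGNORECASE,
)


def classify_tables(sql_text):
    """Split table names into source tables ("from") and target tables (others)."""
    result = {"source": [], "target": []}
    for keyword, name in TOKEN.findall(sql_text):
        kind = "source" if keyword.lower() == "from" else "target"
        if name not in result[kind]:
            result[kind].append(name)
    return result


sql = "insert into tableA\nselect id, from_unixtime(ts) from tableB"
print(classify_tables(sql))
```

Here "from_unixtime(ts)" is skipped because "from" is not followed by whitespace. Applied literally, the rule also labels the table in a "delete from" statement as a source table; the text does not say how that case is meant to be handled.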
S104: Acquire the data information corresponding to the different types of data tables, and output the acquired data information of the different types to the same preset document.
In this embodiment, the data tables are Hadoop JOB association tables. A Hadoop JOB association table is written with Hadoop statements and SQL statements and is stored in the corresponding database, and its table name is written into the corresponding Shell script. When the Hadoop JOB association tables need to be identified, the table names in the Shell script are extracted first, which reveals which JOB association tables the script involves and whether each of them is a source table or a target table.
It should be noted that source tables are tables inside Hadoop and tables of external relational databases; the character string of a source table has a space or line break before and after it, and the table name is preceded by the "from" keyword. Target tables are divided by write mode into insert target tables and overwrite target tables: in "insert into tableA", tableA is an insert target table, and in "insert overwrite tableB", tableB is an overwrite target table. The preset document may be a data table in a preset database. For example, the keywords and their related content are captured from the script of the JOB association tables and recorded in a temporary file; this is done at the HDFS level of Hadoop. The results in the temporary file are then loaded into a Hive table in Hadoop, and the data in the Hive table is exported by Sqoop to a specified preset Oracle database; specifically, the data information is output to a pre-established data table in the preset Oracle database. Optionally, the user may wrap the pre-established data table that stores the data information into an Oracle Pkg (Oracle packaging, an Oracle package file); if the data table needs to be optimized, only this Oracle Pkg has to be optimized.
Further, as shown in FIG. 3, if the data table is a source table, step S104 includes steps S301 to S303.
S301: Divide the source tables into internal source tables and external source tables.
In this embodiment, an internal source table is a table inside Hadoop (for example, a Hive table in Hadoop), and an external source table is a table of an external relational database.
S302: Acquire the data information corresponding to the internal source tables and the external source tables.
In this embodiment, the data information includes table information, field information, and the like. The table information may be the table name, the table type, and so on; the field information may be the field name, the field type, and so on.
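One way to picture this data information is as a small record type. The Python sketch below is an assumption made for illustration: the class names, attribute names, and the to_rows helper are hypothetical, and the application does not define a concrete layout for the table and field information.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FieldInfo:
    """Field information: field name and field type."""
    field_name: str
    field_type: str


@dataclass
class TableInfo:
    """Table information: table name and table type, plus its fields."""
    table_name: str
    table_type: str  # for example "internal source" or "external source"
    fields: List[FieldInfo] = field(default_factory=list)

    def to_rows(self):
        """Flatten to one row per field, ready to write to a single output table."""
        return [
            (self.table_name, self.table_type, f.field_name, f.field_type)
            for f in self.fields
        ]


# Hypothetical example values.
info = TableInfo(
    "tableB",
    "internal source",
    [FieldInfo("id", "bigint"), FieldInfo("name", "string")],
)
print(info.to_rows())
```

Flattening to one row per field keeps source-table and target-table information in the same shape, which suits the stated goal of writing all types to the same preset document.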
S303: Output the acquired data information to the preset document.
In this embodiment, the preset document may be a pre-established data table in a preset Oracle database. Specifically, the acquired data information may be output to that pre-established data table, and the user may wrap the pre-established data table that stores the data information into an Oracle Pkg (Oracle packaging, an Oracle package file). If the data table needs to be optimized, only this Oracle Pkg has to be optimized in order to optimize the preset data table established in the preset Oracle database.
Further, as shown in FIG. 4, if the data table is a target table, step S104 includes steps S401 to S403.
S401: Divide the target tables into insert target tables and overwrite target tables.
In this embodiment, an insert target table is, for example, the table tableA in the SQL statement "insert into tableA", and an overwrite target table is, for example, the table tableB in the SQL statement "insert overwrite tableB". In both cases the type of the target table is determined by the SQL keyword that precedes the table name: a table that follows "into" is an insert target table, and a table that follows "overwrite" is an overwrite target table.
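The split between insert and overwrite target tables can be sketched in Python as follows. The pattern and the function name classify_targets are assumptions. The optional word "table" after the keyword is also an assumption added here, because HiveQL statements are commonly written as "insert overwrite table tableB"; the text itself shows only the shorter form.

```python
import re

# "into" marks an insert target table, "overwrite" an overwrite target table.
# The optional "table" keyword is an assumption, not part of the text's examples.
TARGET = re.compile(
    r"\binsert\s+(into|overwrite)\s+(?:table\s+)?([A-Za-z0-9_]+)",
    re.IGNORECASE,
)


def classify_targets(sql_text):
    """Return (table name, write mode) pairs, where mode is "insert" or "overwrite"."""
    modes = {"into": "insert", "overwrite": "overwrite"}
    return [
        (name, modes[keyword.lower()])
        for keyword, name in TARGET.findall(sql_text)
    ]


sql = "insert into tableA select * from s1;\ninsert overwrite tableB select * from s2;"
print(classify_targets(sql))
```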
S402: Acquire the data information corresponding to the insert target tables and the overwrite target tables.
In this embodiment, the data information includes table information, field information, and the like. The table information may be the table name, the table type, and so on; the field information may be the field name, the field type, and so on.
S403: Output the acquired data information to the preset document.
In this embodiment, the preset document may be a pre-established data table in a preset Oracle database. Specifically, the acquired data information may be output to that pre-established data table, and the user may wrap the pre-established data table that stores the data information into an Oracle Pkg (Oracle packaging, an Oracle package file). If the data table needs to be optimized, only this Oracle Pkg has to be optimized in order to optimize the preset data table established in the preset Oracle database.
As can be seen from the above, the embodiment of the present application identifies the data tables in a Shell script; extracts the table names of the data tables; classifies the data tables according to the extracted table names, wherein the data tables include source tables and target tables; acquires the data information corresponding to the different types of data tables; and outputs the acquired data information of the different types to the same preset document. With this improved data table extraction method, there is no need to search laboriously by hand for the data tables related to each script; the collation and update process is simplified as far as possible, and substantial human resources can be saved.
Referring to FIG. 5, FIG. 5 is a schematic flowchart of a Shell-based data table extraction method provided by an embodiment of the present application. The method can run on terminals such as smartphones (for example, Android phones and iOS phones), tablet computers, notebook computers, and smart devices. As shown in FIG. 5, the method includes steps S501 to S506.
S501: Traverse the Shell script according to preset keywords.
In this embodiment, the Shell script is traversed under a rule that visits data tables with a short creation time first and data tables with a long creation time afterwards. Traversing the data tables of the Shell script in order from short to long creation time improves the efficiency of processing the Shell script.
S502: Locate the data tables in the Shell script according to the traversal result.
In this embodiment, the traversal result is used to display the position of each data table in the Shell script, and the displayed position information is used to locate the data tables in the Shell script.
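The traversal and positioning steps can be sketched in Python by scanning the script line by line and recording where each table name occurs. The keyword list, the function name locate_tables, and the choice of a line number as the "position" are assumptions; the application does not say how position information is represented, and the creation-time ordering is not modeled here.

```python
import re

# Preset keywords (an assumed list); the table name must follow on the same line.
KEYWORD = re.compile(r"(?<!\S)(from|into|update)\s+([A-Za-z0-9_]+)", re.IGNORECASE)


def locate_tables(script_text):
    """Return (line number, table name) pairs, with lines counted from 1."""
    positions = []
    for line_no, line in enumerate(script_text.splitlines(), start=1):
        for match in KEYWORD.finditer(line):
            positions.append((line_no, match.group(2)))
    return positions


# Hypothetical Shell script with an SQL statement spanning two lines.
script = "#!/bin/sh\nhive -e 'insert into tableA\nselect * from tableB'\n"
print(locate_tables(script))
```

A keyword at the end of one line with its table name on the next line is missed by this line-based scan; handling that case would require matching across the whole text.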
S503: Identify the data tables in the Shell script.
In this embodiment, a data table is a table retrieved from a database to which the Shell script connects by means of SQL statements. The Shell script connects to the database and calls the data tables in order to obtain data from the database. In routine operation and maintenance work, obtaining data from the database makes it possible to monitor certain information in the database and thus to follow device performance in real time.
The data tables in a Shell script can be identified by recognizing keywords in its SQL statements. For example, recognizing the insert statement "insert into" identifies the data table that follows the insert statement; recognizing the query statement "select * from" identifies the data table that follows the query statement; recognizing the update statement "update" identifies the data table that follows the update statement; and recognizing the delete statement "delete from" identifies the data table that follows the delete statement, and so on.
S504: Extract the table names of the data tables.
In this embodiment, after the data tables in the Shell script have been identified, the table name of each identified data table is extracted. For example, the table name extracted from the insert statement "insert into {TABLENAME}" is "TABLENAME"; the table name extracted from the query statement "select * from {USERNAME}" is "USERNAME"; the table name extracted from the update statement "update {DBNAME}" is "DBNAME"; and the table name extracted from the delete statement "delete from {KBNAME}" is "KBNAME".
S505: Classify the data tables according to the extracted table names, wherein the data tables include source tables and target tables.
In this embodiment, after the table names have been extracted by means of a series of SQL keywords, the table names are stored in a temporary file. The types of data table include source tables and target tables. The data tables may be classified by table name as follows: if the table name is a standalone character string with a space or line break before and after it, and the table name is immediately preceded by the "from" keyword, the data table is a source table; if the table name is a standalone character string with a space or line break before and after it, and the table name is immediately preceded by a keyword other than "from", the data table is a target table. Optionally, the keyword other than "from" may be an SQL keyword such as "into" or "update".
S506: Acquire the data information corresponding to the different types of data tables, and output the acquired data information of the different types to the same preset document.
In this embodiment, the data tables are Hadoop JOB association tables. A Hadoop JOB association table is written with Hadoop statements and SQL statements and is stored in the corresponding database, and its table name is written into the corresponding Shell script. When the Hadoop JOB association tables need to be identified, the table names in the Shell script are extracted first, which reveals which JOB association tables the script involves and whether each of them is a source table or a target table.
It should be noted that source tables are tables inside Hadoop and tables of external relational databases; the character string of a source table has a space or line break before and after it, and the table name is preceded by the "from" keyword. Target tables are divided by write mode into insert target tables and overwrite target tables: in "insert into tableA", tableA is an insert target table, and in "insert overwrite tableB", tableB is an overwrite target table. The preset document may be a data table in a preset database. For example, the keywords and their related content are captured from the script of the JOB association tables and recorded in a temporary file; this is done at the HDFS level of Hadoop. The results in the temporary file are then loaded into a Hive table in Hadoop, and the data in the Hive table is exported by Sqoop to a specified preset Oracle database; specifically, the data information is output to a pre-established data table in the preset Oracle database. Optionally, the user may wrap the pre-established data table that stores the data information into an Oracle Pkg (Oracle packaging, an Oracle package file); if the data table needs to be optimized, only this Oracle Pkg has to be optimized.
Referring to FIG. 6, corresponding to the Shell-based data table extraction method described above, an embodiment of the present application further provides a Shell-based data table extraction terminal. The terminal 100 includes an identification unit 101, an extraction unit 102, a classification unit 103, and an acquisition unit 104.
The identification unit 101 is configured to identify the data tables in a Shell script.
The extraction unit 102 is configured to extract the table names of the data tables.
The classification unit 103 is configured to classify the data tables according to the extracted table names, wherein the data tables include source tables and target tables.
The acquisition unit 104 is configured to acquire the data information corresponding to the different types of data tables, and to output the acquired data information of the different types to the same preset document.
As can be seen from the above, the embodiment of the present application identifies the data tables in a Shell script; extracts the table names of the data tables; classifies the data tables according to the extracted table names, wherein the data tables include source tables and target tables; acquires the data information corresponding to the different types of data tables; and outputs the acquired data information of the different types to the same preset document. With this improved data table extraction method, there is no need to search laboriously by hand for the data tables related to each script; the collation and update process is simplified as far as possible, and substantial human resources can be saved.
As shown in FIG. 7, the classification unit 103 includes:
A determination unit 1031, configured to determine the character string corresponding to the table name of the data table.
A classification subunit 1032, configured to classify the data table according to the character string.
As shown in FIG. 8, if the data table is a source table, the acquisition unit 104 includes:
A first execution unit 1041, configured to divide the source tables into internal source tables and external source tables.
A first acquisition subunit 1042, configured to acquire the data information corresponding to the internal source tables and the external source tables.
A first output unit 1043, configured to output the acquired data information to the preset document.
As shown in FIG. 9, if the data table is a target table, the acquisition unit 104 includes:
A second execution unit 1044, configured to divide the target tables into insert target tables and overwrite target tables.
A second acquisition subunit 1045, configured to acquire the data information corresponding to the insert target tables and the overwrite target tables.
A second output unit 1046, configured to output the acquired data information to the preset document.
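The split of target tables into insert and overwrite targets might look like the sketch below. It assumes Hive-style statements, in which `insert into` appends rows and `insert overwrite` replaces the existing contents; the text here does not say how the two kinds are told apart, so the pattern `TARGET_MODE_RE` and the function `split_targets` are hypothetical.

```python
import re

# Assumption: the keyword after "insert" decides the kind of target table.
TARGET_MODE_RE = re.compile(
    r"\binsert\s+(into|overwrite)\s+(?:table\s+)?([A-Za-z_][\w.]*)",
    re.IGNORECASE,
)


def split_targets(script_text):
    """Group target table names into insert targets and overwrite targets."""
    groups = {"insert": [], "overwrite": []}
    for mode, table in TARGET_MODE_RE.findall(script_text):
        kind = "insert" if mode.lower() == "into" else "overwrite"
        if table not in groups[kind]:  # keep first-seen order, drop repeats
            groups[kind].append(table)
    return groups


# Made-up statements used only for illustration.
groups = split_targets(
    "insert into table dw.log_events select * from ods.events;\n"
    "INSERT OVERWRITE TABLE dw.daily_summary select * from dw.log_events;"
)
```

Both groups can then be written to the preset document, each entry labeled with its kind.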
Referring to FIG. 10, corresponding to the Shell-based data table extraction method described above, an embodiment of the present application further provides a Shell-based data table extraction terminal. The terminal 200 includes a traversal unit 201, a positioning unit 202, an identification unit 203, an extraction unit 204, a classification unit 205, and an acquisition unit 206.
The traversal unit 201 is configured to traverse the Shell script according to preset keywords.
The positioning unit 202 is configured to locate the data tables in the Shell script according to the result of the traversal.
The identification unit 203 is configured to identify the data tables in the Shell script.
The extraction unit 204 is configured to extract the table names of the data tables.
The classification unit 205 is configured to classify the data tables according to the extracted table names, where the data tables include source tables and target tables.
The acquisition unit 206 is configured to acquire the data information corresponding to the different types of data tables and to output the acquired data information of the different types to the same preset document.
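The traversal and positioning performed by units 201 and 202 could be sketched as a line-by-line scan that records where each preset keyword occurs. The keyword list `PRESET_KEYWORDS`, the function `locate_keyword_lines`, and the choice to skip Shell comment lines are assumptions made for illustration; the text does not specify them.

```python
import re

# Hypothetical preset keywords; the text does not enumerate them.
PRESET_KEYWORDS = ("from", "join", "insert into", "insert overwrite")


def locate_keyword_lines(script_text, keywords=PRESET_KEYWORDS):
    """Return (line_number, keyword) pairs for every keyword occurrence."""
    patterns = [
        # Allow any run of whitespace between the words of a keyword.
        (kw, re.compile(r"\b" + r"\s+".join(map(re.escape, kw.split())) + r"\b",
                        re.IGNORECASE))
        for kw in keywords
    ]
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # skip Shell comment lines (and the shebang)
        for kw, pattern in patterns:
            if pattern.search(line):
                hits.append((number, kw))
    return hits


# Made-up example script used only for illustration.
SCRIPT = """#!/bin/bash
# load from staging
hive -e "
insert overwrite table dw.sales
select * from ods.sales_raw
"
"""

hits = locate_keyword_lines(SCRIPT)
```

The recorded line numbers give the positions at which the later identification and extraction steps would look for table names.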
The Shell-based data table extraction terminal described above may be implemented in the form of a computer program, which can run on the device shown in FIG. 11.
FIG. 11 is a schematic structural diagram of a Shell-based data table extraction device according to the present application. The device may be a terminal or a server. The terminal may be an electronic device with communication capability, such as a smartphone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant, or a wearable device. The server may be a standalone server or a server cluster composed of multiple servers. Referring to FIG. 11, the computer device 500 includes a processor 502, a non-volatile storage medium 503, an internal memory 504, and a network interface 505, which are connected by a system bus 501. The non-volatile storage medium 503 of the computer device 500 can store an operating system 5031 and a computer program 5032; when the computer program 5032 is executed, it causes the processor 502 to perform a Shell-based data table extraction method. The processor 502 of the computer device 500 provides computing and control capabilities and supports the operation of the entire computer device 500. The internal memory 504 provides an environment for running the computer program 5032 stored in the non-volatile storage medium 503; when the computer program is executed by the processor, it causes the processor 502 to perform a Shell-based data table extraction method. The network interface 505 of the computer device 500 is used for network communication, such as sending assigned tasks. Those skilled in the art will understand that the structure shown in FIG. 11 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, may combine certain components, or may have a different arrangement of components.
The processor 502 performs the following operations:
identifying the data tables in a Shell script;
extracting the table names of the data tables;
classifying the data tables according to the extracted table names, where the data tables include source tables and target tables;
acquiring the data information corresponding to the different types of data tables, and outputting the acquired data information of the different types to the same preset document.
In one embodiment, the processor 502 further performs the following operations:
traversing the Shell script according to preset keywords;
locating the data tables in the Shell script according to the result of the traversal.
In one embodiment, classifying the data tables according to the extracted table names includes:
determining the character string corresponding to the table name of the data table;
classifying the data table according to the character string.
In one embodiment, if the data table is a source table, acquiring the data information corresponding to the different types of data tables and outputting the acquired data information of the different types to the same preset document includes:
dividing the source tables into internal source tables and external source tables;
acquiring the data information corresponding to the internal source tables and the external source tables;
outputting the acquired data information to the preset document.
In one embodiment, if the data table is a target table, acquiring the data information corresponding to the different types of data tables and outputting the acquired data information of the different types to the same preset document includes:
dividing the target tables into insert target tables and overwrite target tables;
acquiring the data information corresponding to the insert target tables and the overwrite target tables;
outputting the acquired data information to the preset document.
Those skilled in the art will understand that the embodiment of the Shell-based data table extraction device shown in FIG. 11 does not limit the specific composition of the Shell-based data table extraction device. In other embodiments, the device may include more or fewer components than illustrated, may combine certain components, or may have a different arrangement of components. For example, in some embodiments the Shell-based data table extraction device includes only a memory and a processor; in such embodiments, the structure and function of the memory and the processor are consistent with the embodiment shown in FIG. 11 and are not described again here.
The present application provides a computer-readable storage medium storing one or more computer programs, which can be executed by one or more processors to implement the Shell-based data table extraction method described above.
The storage media mentioned in the present application include magnetic disks, optical disks, read-only memory (ROM), and other media capable of storing program code.
The units in all embodiments of the present application may be implemented by a general-purpose integrated circuit, such as a CPU (Central Processing Unit), or by an ASIC (Application-Specific Integrated Circuit).
The steps of the Shell-based data table extraction method in the embodiments of the present application may be reordered, combined, or omitted according to actual needs.
The units of the Shell-based data table extraction terminal in the embodiments of the present application may be combined, divided, or omitted according to actual needs.
The above are merely specific embodiments of the present application, but the scope of protection of the present application is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed in the present application, and such modifications or substitutions shall fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be determined by the scope of protection of the claims.

Claims (20)

  1. A Shell-based data table extraction method, characterized in that the method comprises:
    identifying the data tables in a Shell script;
    extracting the table names of the data tables;
    classifying the data tables according to the extracted table names, wherein the data tables include source tables and target tables;
    acquiring the data information corresponding to the different types of data tables, and outputting the acquired data information of the different types to the same preset document.
  2. The method according to claim 1, characterized in that, before identifying the data tables in the Shell script, the method further comprises:
    traversing the Shell script according to preset keywords;
    locating the data tables in the Shell script according to the result of the traversal.
  3. The method according to claim 1, characterized in that classifying the data tables according to the extracted table names comprises:
    determining the character string corresponding to the table name of the data table;
    classifying the data table according to the character string.
  4. The method according to claim 1, characterized in that, if the data table is a source table, acquiring the data information corresponding to the different types of data tables and outputting the acquired data information of the different types to the same preset document comprises:
    dividing the source tables into internal source tables and external source tables;
    acquiring the data information corresponding to the internal source tables and the external source tables;
    outputting the acquired data information to the preset document.
  5. The method according to claim 1, characterized in that, if the data table is a target table, acquiring the data information corresponding to the different types of data tables and outputting the acquired data information of the different types to the same preset document comprises:
    dividing the target tables into insert target tables and overwrite target tables;
    acquiring the data information corresponding to the insert target tables and the overwrite target tables;
    outputting the acquired data information to the preset document.
  6. A Shell-based data table extraction terminal, characterized in that the terminal comprises:
    an identification unit, configured to identify the data tables in a Shell script;
    an extraction unit, configured to extract the table names of the data tables;
    a classification unit, configured to classify the data tables according to the extracted table names, wherein the data tables include source tables and target tables;
    an acquisition unit, configured to acquire the data information corresponding to the different types of data tables and to output the acquired data information of the different types to the same preset document.
  7. The terminal according to claim 6, characterized in that the terminal further comprises:
    a traversal unit, configured to traverse the Shell script according to preset keywords;
    a positioning unit, configured to locate the data tables in the Shell script according to the result of the traversal.
  8. The terminal according to claim 6, characterized in that the classification unit comprises:
    a determination unit, configured to determine the character string corresponding to the table name of the data table;
    a classification subunit, configured to classify the data table according to the character string.
  9. The terminal according to claim 6, characterized in that, if the data table is a source table, the acquisition unit comprises:
    a first execution unit, configured to divide the source tables into internal source tables and external source tables;
    a first acquisition subunit, configured to acquire the data information corresponding to the internal source tables and the external source tables;
    a first output unit, configured to output the acquired data information to the preset document.
  10. The terminal according to claim 6, characterized in that, if the data table is a target table, the acquisition unit comprises:
    a second execution unit, configured to divide the target tables into insert target tables and overwrite target tables;
    a second acquisition subunit, configured to acquire the data information corresponding to the insert target tables and the overwrite target tables;
    a second output unit, configured to output the acquired data information to the preset document.
  11. A Shell-based data table extraction device, characterized by comprising:
    a memory, configured to store a program implementing the data table extraction method; and
    a processor, configured to run the program implementing the data table extraction method stored in the memory, so as to perform the following operations:
    identifying the data tables in a Shell script;
    extracting the table names of the data tables;
    classifying the data tables according to the extracted table names, wherein the data tables include source tables and target tables;
    acquiring the data information corresponding to the different types of data tables, and outputting the acquired data information of the different types to the same preset document.
  12. The device according to claim 11, characterized in that the processor further performs the following operations:
    traversing the Shell script according to preset keywords;
    locating the data tables in the Shell script according to the result of the traversal.
  13. The device according to claim 11, characterized in that classifying the data tables according to the extracted table names comprises:
    determining the character string corresponding to the table name of the data table;
    classifying the data table according to the character string.
  14. The device according to claim 11, characterized in that, if the data table is a source table, acquiring the data information corresponding to the different types of data tables and outputting the acquired data information of the different types to the same preset document comprises:
    dividing the source tables into internal source tables and external source tables;
    acquiring the data information corresponding to the internal source tables and the external source tables;
    outputting the acquired data information to the preset document.
  15. The device according to claim 11, characterized in that, if the data table is a target table, acquiring the data information corresponding to the different types of data tables and outputting the acquired data information of the different types to the same preset document comprises:
    dividing the target tables into insert target tables and overwrite target tables;
    acquiring the data information corresponding to the insert target tables and the overwrite target tables;
    outputting the acquired data information to the preset document.
  16. A computer-readable storage medium, characterized in that the computer-readable storage medium stores one or more computer programs, and the one or more computer programs can be executed by one or more processors to implement the following steps:
    identifying the data tables in a Shell script;
    extracting the table names of the data tables;
    classifying the data tables according to the extracted table names, wherein the data tables include source tables and target tables;
    acquiring the data information corresponding to the different types of data tables, and outputting the acquired data information of the different types to the same preset document.
  17. The computer-readable storage medium according to claim 16, characterized in that the steps further comprise:
    traversing the Shell script according to preset keywords;
    locating the data tables in the Shell script according to the result of the traversal.
  18. The computer-readable storage medium according to claim 16, characterized in that classifying the data tables according to the extracted table names comprises:
    determining the character string corresponding to the table name of the data table;
    classifying the data table according to the character string.
  19. The computer-readable storage medium according to claim 16, characterized in that, if the data table is a source table, acquiring the data information corresponding to the different types of data tables and outputting the acquired data information of the different types to the same preset document comprises:
    dividing the source tables into internal source tables and external source tables;
    acquiring the data information corresponding to the internal source tables and the external source tables;
    outputting the acquired data information to the preset document.
  20. The computer-readable storage medium according to claim 16, characterized in that, if the data table is a target table, acquiring the data information corresponding to the different types of data tables and outputting the acquired data information of the different types to the same preset document comprises:
    dividing the target tables into insert target tables and overwrite target tables;
    acquiring the data information corresponding to the insert target tables and the overwrite target tables;
    outputting the acquired data information to the preset document.
PCT/CN2018/101880 2018-02-24 2018-08-23 Shell-based data table extraction method, terminal, device, and storage medium WO2019161645A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201810156612 2018-02-24
CN201810156612.8 2018-02-24
CN201810196485.4 2018-03-09
CN201810196485.4A CN108536745B (en) 2018-02-24 2018-03-09 Shell-based data table extraction method, terminal, equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2019161645A1 true WO2019161645A1 (en) 2019-08-29

Family

ID=63483448

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/101880 WO2019161645A1 (en) 2018-02-24 2018-08-23 Shell-based data table extraction method, terminal, device, and storage medium

Country Status (2)

Country Link
CN (1) CN108536745B (en)
WO (1) WO2019161645A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111984659A (en) * 2020-07-28 2020-11-24 招联消费金融有限公司 Data updating method and device, computer equipment and storage medium
CN116578651A (en) * 2023-07-12 2023-08-11 北京集度科技有限公司 Data table structure synchronization method, system and equipment

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109359160A (en) * 2018-10-12 2019-02-19 平安科技(深圳)有限公司 Method of data synchronization, device, computer equipment and storage medium
CN110647564B (en) * 2019-08-14 2023-11-24 中国平安财产保险股份有限公司 Hive table building method, electronic device and computer readable storage medium
CN111460241B (en) * 2020-04-26 2024-01-23 甬矽电子(宁波)股份有限公司 Data query method and device, electronic equipment and storage medium
CN113190603A (en) * 2021-04-28 2021-07-30 中国邮政储蓄银行股份有限公司 Data processing method, data processing device, computer readable storage medium and processor

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11265293A (en) * 1998-03-17 1999-09-28 Nec Corp Script processor
CN102375826A (en) * 2010-08-13 2012-03-14 中国移动通信集团公司 Structured query language script analysis method, device and system
CN107169023A (en) * 2017-04-07 2017-09-15 广东精点数据科技股份有限公司 Data lineage analysis system and method based on sql semantic automatic analysis

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101944128A (en) * 2010-09-25 2011-01-12 中兴通讯股份有限公司 Data export and import method and device
US8612487B2 (en) * 2011-09-07 2013-12-17 International Business Machines Corporation Transforming hierarchical language data into relational form
US8589450B2 (en) * 2011-12-28 2013-11-19 Business Objects Software Limited Mapping non-relational database objects into a relational database model
CN104536987B (en) * 2014-12-08 2017-12-05 联动优势电子商务有限公司 A kind of method and device for inquiring about data
CN105868204B (en) * 2015-01-21 2019-06-21 中移信息技术有限公司 A kind of method and device for converting Oracle scripting language SQL
CN104866595B (en) * 2015-05-29 2019-05-03 北京京东尚科信息技术有限公司 Relational database script is added the method and device of transaction controlling

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111984659A (en) * 2020-07-28 2020-11-24 招联消费金融有限公司 Data updating method and device, computer equipment and storage medium
CN116578651A (en) * 2023-07-12 2023-08-11 北京集度科技有限公司 Data table structure synchronization method, system and equipment
CN116578651B (en) * 2023-07-12 2023-11-17 北京集度科技有限公司 Data table structure synchronization method, system and equipment

Also Published As

Publication number Publication date
CN108536745B (en) 2021-03-16
CN108536745A (en) 2018-09-14

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 07/12/2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18907012

Country of ref document: EP

Kind code of ref document: A1