WO2017162026A1 - Method and apparatus for generating description information - Google Patents

Method and apparatus for generating description information

Info

Publication number
WO2017162026A1
Authority
WO
WIPO (PCT)
Prior art keywords
data table
data
traversed
adjacent
tables
Prior art date
Application number
PCT/CN2017/075947
Other languages
English (en)
French (fr)
Inventor
殷琳君
林沛坤
罗净
朱洪波
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2017162026A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries

Definitions

  • the present invention relates to the field of computers, and in particular to a method and apparatus for generating description information.
  • the first data table contains three fields, namely device identification (ID), date, and device status; the second data table contains two fields, namely device ID and company ID; and the third data table contains two fields, namely company ID and company name. From the meta information of the data tables it can then be known that the first data table and the second data table can be connected by the "device ID" field, and that the second data table and the third data table can be connected by the "company ID" field.
  • a preset time period (for example, the last month).
  • an event (for example, the equipment failure rate).
  • table join statements are written manually according to the specific case; that is, business personnel need to customize a different SQL statement for each case.
  • the related art does not provide a way to complete data analysis by automatically generating a legal table join SQL statement when a plurality of data tables are given and the association relationships between them are known.
  • Embodiments of the present invention provide a method and an apparatus for generating description information, so as to at least solve the technical problem that the related art cannot complete data analysis by automatically generating a legal table join SQL statement.
  • a method for generating description information includes: acquiring field information included in each data table of a plurality of data tables; determining association relationships between the plurality of data tables according to the field information; and generating description information by using the association relationships between the plurality of data tables, wherein the description information records the order in which the plurality of data tables are connected to each other and the connection conditions used between adjacent data tables.
  • determining, according to the field information, the association relationships between the plurality of data tables includes: a selecting step of selecting, from the plurality of data tables, any data table to be traversed; a searching step of searching for data tables that have one or more fields identical to the field information included in the selected data table, and setting each found data table as a data table to be connected to the selected data table; an establishing step of establishing an association relationship between the selected data table and the data table to be connected; and returning to the selecting step until all of the plurality of data tables have been traversed.
  • generating the description information by using the association relationships between the plurality of data tables includes: starting from the initially traversed data table, sequentially acquiring, according to the association relationships between the plurality of data tables, the identification information of the next traversed data table, the identification information of the data tables adjacent to the next traversed data table, and the connection conditions between the next traversed data table and the adjacent data tables, wherein an adjacent data table is a data table that has already been traversed; and generating the description information by using the acquired identification information of the next traversed data table, the identification information of the adjacent data tables, and the connection conditions between the next traversed data table and the adjacent data tables.
  • generating the description information by using the acquired identification information of the next traversed data table, the identification information of the adjacent data tables, and the connection conditions between the next traversed data table and the adjacent data tables includes: first establishing an association relationship between the next traversed data table and the adjacent data tables according to the identification information of the next traversed data table and the identification information of the adjacent data tables, and then recording the connection conditions between the next traversed data table and the adjacent data tables, so as to generate segment connection information corresponding to each traversed data table; and combining all the segment connection information to generate the description information.
  • if the next traversed data table is adjacent to multiple data tables at the same time and/or the next traversed data table has multiple identical fields with some of its adjacent data tables, the recorded connection condition between the next traversed data table and the adjacent data tables is the union of the connection conditions generated by the adjacency to the multiple data tables and/or by the multiple identical fields shared with those adjacent data tables.
  • an apparatus for generating description information includes: an obtaining module, configured to acquire field information included in each data table of a plurality of data tables; a determining module, configured to determine association relationships between the plurality of data tables according to the field information; and a generating module, configured to generate description information by using the association relationships between the plurality of data tables, wherein the description information records the order in which the plurality of data tables are connected to each other and the connection conditions used between adjacent data tables.
  • the determining module includes: a selecting unit, configured to select, from the plurality of data tables, any data table to be traversed; a searching unit, configured to search for data tables that have one or more fields identical to the field information included in the selected data table, and to set each found data table as a data table to be connected to the selected data table; and an establishing unit, configured to establish an association relationship between the selected data table and the data table to be connected, and to return to the selecting unit until all of the plurality of data tables have been traversed.
  • the generating module includes: an acquiring unit, configured to sequentially acquire, starting from the initially traversed data table and according to the association relationships between the plurality of data tables, the identification information of the next traversed data table, the identification information of the data tables adjacent to the next traversed data table, and the connection conditions between the next traversed data table and the adjacent data tables, wherein an adjacent data table is a data table that has already been traversed; and a generating unit, configured to generate the description information by using the acquired identification information of the next traversed data table, the identification information of the adjacent data tables, and the connection conditions between the next traversed data table and the adjacent data tables.
  • the generating unit includes: a first generating subunit, configured to first establish an association relationship between the next traversed data table and the adjacent data tables according to the identification information of the next traversed data table and the identification information of the adjacent data tables, and then record the connection conditions between the next traversed data table and the adjacent data tables, so as to generate segment connection information corresponding to each traversed data table; and a second generating subunit, configured to combine all the segment connection information to generate the description information.
  • the first generating subunit is configured so that, if the next traversed data table is adjacent to multiple data tables at the same time and/or the next traversed data table has multiple identical fields with some of its adjacent data tables, the recorded connection condition between the next traversed data table and the adjacent data tables is the union of the connection conditions generated by the adjacency to the multiple data tables and/or by the multiple identical fields shared with those adjacent data tables.
  • in the embodiments of the present invention, the association relationships between the plurality of data tables are used to generate description information that records the order in which the data tables are connected and the connection conditions used between adjacent data tables. The description information (for example, a legal table join SQL statement) can thus be generated automatically, without manual participation, to complete the data analysis. Automatically generating the description information of the association relationships between data tables improves both the efficiency of data analysis and the accuracy of its results, and solves the technical problem that the related art cannot complete data analysis by automatically generating a legal table join SQL statement when a plurality of data tables are given and the association relationships between them are known.
  • FIG. 1 is a block diagram showing the hardware structure of a computer terminal for a method of generating description information according to an embodiment of the present invention
  • FIG. 2 is a flowchart of a method of generating description information according to an embodiment of the present invention
  • FIG. 3 is a schematic diagram of establishing a complete topology between a plurality of data tables in accordance with a preferred embodiment of the present invention
  • FIG. 4 is a schematic diagram of a method of generating description information in accordance with a preferred embodiment of the present invention.
  • FIG. 5 is a structural block diagram of an apparatus for generating description information according to an embodiment of the present invention.
  • FIG. 6 is a structural block diagram of an apparatus for generating description information according to a preferred embodiment of the present invention.
  • FIG. 7 is a structural block diagram of a computer terminal according to an embodiment of the present invention.
  • FIG. 1 is a hardware structural block diagram of a computer terminal for a method of generating description information according to an embodiment of the present invention.
  • computer terminal 10 may include one or more processors 102 (only one is shown; processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission device 106 for communication functions.
  • computer terminal 10 may also include more or fewer components than those shown in FIG. 1, or have a configuration different from that shown in FIG. 1.
  • the memory 104 can be used to store software programs and modules of application software, such as the program instructions/modules corresponding to the method for generating description information in the embodiments of the present invention; the processor 102 performs the method by running the software programs and modules stored in the memory 104.
  • Memory 104 may include high speed random access memory, and may also include non-volatile memory such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory.
  • memory 104 may further include memory remotely located relative to processor 102, which may be coupled to computer terminal 10 via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • Transmission device 106 is for receiving or transmitting data via a network.
  • specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal 10.
  • the transmission device 106 includes a Network Interface Controller (NIC) that can be connected to other network devices through a base station to communicate with the Internet.
  • the transmission device 106 can be a Radio Frequency (RF) module for communicating with the Internet wirelessly.
  • FIG. 2 is a flow chart of a method of generating description information in accordance with an embodiment of the present invention. As shown in FIG. 2, the method may include the following processing steps:
  • Step S20: acquire field information included in each data table of the plurality of data tables;
  • Step S22: determine, according to the field information, the association relationships between the plurality of data tables;
  • Step S24: generate description information by using the association relationships between the plurality of data tables, wherein the description information records the order in which the plurality of data tables are connected to each other and the connection conditions used between adjacent data tables.
  • in the related art, the description information of the association relationships between the plurality of data tables, which mainly takes the form of table join statements, has to be written manually by business personnel according to the specific case; that is, business personnel need to customize a different SQL statement for each case. This not only lowers work efficiency but also raises the probability of error.
  • the technical solution provided by the embodiments of the present invention acquires the field information included in each data table of the plurality of data tables, determines the association relationships between the plurality of data tables according to the field information, and uses those association relationships to generate description information that records the order in which the data tables are connected and the connection conditions used between adjacent data tables. The description information (for example, a legal table join SQL statement) can thus be generated automatically, without manual participation, to complete the data analysis. Automatically generating the description of the association relationships between data tables improves both the efficiency of data analysis and the accuracy of its results, and the efficiency improvement is especially significant. This solves the technical problem that the related art cannot complete data analysis by automatically generating a legal table join SQL statement when a plurality of data tables are given and the association relationships between them are known.
  • in step S22, determining the association relationships between the plurality of data tables according to the field information may include the following steps:
  • Step S220, selecting step: select, from the plurality of data tables, any data table to be traversed;
  • Step S222, searching step: search for data tables that have one or more fields identical to the field information included in the selected data table, and set each found data table as a data table to be connected to the selected data table;
  • Step S224, establishing step: establish an association relationship between the selected data table and the data table to be connected, and return to the selecting step until all of the plurality of data tables have been traversed.
  • one of the data tables may be randomly selected as the data table at which the traversal starts, or as the reference data table, and the remaining data tables are then accessed in sequence. Each data table contains multiple fields (for example, field 1 is the product name, field 2 is the product model, field 3 is the manufacturer, and field 4 is the lifetime), so after one of the data tables is selected, the other data tables that share fields with it need to be found. There may be one such data table or several. Once they are found, they can be set as the data tables to be connected to the previously selected data table, and a connection relationship can be established between them. After the data tables to be connected to the currently traversed data table have been found, the traversal continues with the next data table, and so on, until every data table has been traversed once, thereby establishing a complete topology between the plurality of data tables.
  • FIG. 3 is a schematic diagram of establishing a complete topology between multiple data tables in accordance with a preferred embodiment of the present invention.
  • data table 1 contains the fields field 1, field 2, field 3, and field 4; data table 2 contains the fields field 1, field 3, and field 5; data table 3 contains the fields field 2, field 5, and field 7; and data table 4 contains the fields field 7, field 8, field 9, and field 10.
  • the data table 1 is first randomly selected as the starting traversal data table or as the reference data table, and then the remaining data tables are sequentially accessed.
  • data table 1 has the same fields (field 1 and field 3) as data table 2, and data table 1 has the same field (field 2) as data table 3. Both data table 2 and data table 3 can therefore be determined to be data tables to be connected to data table 1, and the association relationships between data table 1 and data table 2 and between data table 1 and data table 3 can be established accordingly.
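The field-matching step can be sketched in a few lines of Python. This is a minimal illustration and not the patent's implementation: the names `TABLE_FIELDS` and `build_associations` are hypothetical, and comparing every pair of tables is assumed here to give the same result as selecting one table at a time and searching the others. The data is the FIG. 3 example.

```python
from itertools import combinations

# Field sets from the FIG. 3 example; the dict layout is an assumption.
TABLE_FIELDS = {
    "table1": {"field1", "field2", "field3", "field4"},
    "table2": {"field1", "field3", "field5"},
    "table3": {"field2", "field5", "field7"},
    "table4": {"field7", "field8", "field9", "field10"},
}


def build_associations(table_fields):
    """Map each pair of tables that share at least one field to the shared fields."""
    associations = {}
    for a, b in combinations(table_fields, 2):
        shared = table_fields[a] & table_fields[b]
        if shared:
            # Tables with identical fields become "tables to be connected".
            associations[(a, b)] = sorted(shared)
    return associations


ASSOCIATIONS = build_associations(TABLE_FIELDS)
```

The resulting mapping plays the role of the topology: its keys are the edges between data tables, and its values are the identical fields that can later serve as connection conditions.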
  • in step S24, generating the description information by using the association relationships between the plurality of data tables may include the following steps:
  • Step S240: starting from the initially traversed data table, sequentially acquire, according to the association relationships between the plurality of data tables, the identification information of the next traversed data table, the identification information of the data tables adjacent to the next traversed data table, and the connection conditions between the next traversed data table and the adjacent data tables, wherein an adjacent data table is a data table that has already been traversed;
  • Step S242: generate the description information by using the acquired identification information of the next traversed data table, the identification information of the adjacent data tables, and the connection conditions between the next traversed data table and the adjacent data tables.
  • starting from the data table at which the traversal starts (the reference data table), the identification information of the next traversed data table may be acquired in sequence, together with the identification information of the data tables adjacent to the next traversed data table. It should be noted that the data tables adjacent to the next traversed data table are those data tables, among all data tables already traversed from the starting data table up to the next traversed data table, that are adjacent to the next traversed data table. From this identification information it can be determined which already-traversed data tables the next traversed data table is associated with, and on that basis the connection conditions between the next traversed data table and the adjacent data tables (that is, which identical fields exist) can be obtained.
  • the specific manner of generating the description information is as follows:
  • the traversal starts from data table 1, which serves as the starting or reference data table, and the remaining data tables are then accessed in turn.
  • since both data table 2 and data table 3 are data tables to be connected to data table 1, and the association relationships between data table 1 and data table 2 and between data table 1 and data table 3 have been established accordingly, either data table 2 or data table 3 can be selected at random.
  • assume the next traversed data table is data table 2. Only data table 1 has been traversed and is adjacent to data table 2, so the connection condition at this time is that data table 1 and data table 2 have the same fields (field 1 and field 3).
  • the next traversed data table is then data table 3. Data table 1 and data table 3 have the same field (field 2), and data table 2 and data table 3 have the same field (field 5), so the connection condition at this time is that data table 1 has the same field (field 2) as data table 3 and data table 2 has the same field (field 5) as data table 3.
  • the next traversed data table is then data table 4. Data table 4 has an identical field (field 7) only with data table 3, so the connection condition at this time is that data table 3 and data table 4 have the same field (field 7). At this point, the description information about the association relationships among data table 1, data table 2, data table 3, and data table 4 has been formed.
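The walk-through above can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the function name `connection_conditions`, the tuple layout, and the fixed traversal order are hypothetical, and the association data is copied from the FIG. 3 example rather than computed.

```python
# Association relationships from the FIG. 3 walk-through (pair -> shared fields).
ASSOCIATIONS = {
    ("table1", "table2"): ["field1", "field3"],
    ("table1", "table3"): ["field2"],
    ("table2", "table3"): ["field5"],
    ("table3", "table4"): ["field7"],
}


def connection_conditions(order, associations):
    """For each table after the first, collect its conditions against traversed tables."""
    visited = [order[0]]
    segments = []
    for table in order[1:]:
        conditions = []
        for prev in visited:
            # Only tables that were already traversed count as adjacent tables.
            fields = associations.get((prev, table)) or associations.get((table, prev)) or []
            conditions.extend((prev, table, field) for field in fields)
        if not conditions:
            raise ValueError(f"{table} is not adjacent to any traversed table")
        segments.append((table, conditions))
        visited.append(table)
    return segments


SEGMENTS = connection_conditions(["table1", "table2", "table3", "table4"], ASSOCIATIONS)
```

Each entry of `SEGMENTS` corresponds to one traversal step: the table being connected and the identical fields it shares with tables that were reached earlier.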
  • in step S242, generating the description information by using the acquired identification information of the next traversed data table, the identification information of the adjacent data tables, and the connection conditions between the next traversed data table and the adjacent data tables may include the following steps:
  • Step S2420: first establish an association relationship between the next traversed data table and the adjacent data tables according to the identification information of the next traversed data table and the identification information of the adjacent data tables, and then record the connection conditions between the next traversed data table and the adjacent data tables, so as to generate the segment connection information corresponding to each traversed data table;
  • Step S2422: combine all the segment connection information to generate the description information.
  • the connection conditions include: data table 1 and data table 2 have the same fields (field 1 and field 3); data table 1 and data table 3 have the same field (field 2); data table 2 and data table 3 have the same field (field 5); and data table 3 and data table 4 have the same field (field 7). Each connection condition is used exactly once.
  • when traversing data table 2, only data table 1 has been traversed and is adjacent to data table 2, so the connection condition at this time includes that data table 1 and data table 2 have the same field 1 and the same field 3; when traversing data table 3, data table 1 and data table 2 have been traversed and are adjacent to data table 3, so the connection condition at this time may include that data table 1 and data table 3 have the same field 2 and that data table 2 and data table 3 have the same field 5; when traversing data table 4, only data table 3 has been traversed and is adjacent to data table 4, so the connection condition at this time includes only that data table 3 and data table 4 have the same field 7.
  • connect data table 2; the connection condition is: data table 1 and data table 2 have the same field 1 and the same field 3;
  • connect data table 3; the connection conditions are: data table 1 and data table 3 have the same field 2, and data table 2 and data table 3 have the same field 5;
  • connect data table 4; the connection condition is: data table 3 and data table 4 have the same field 7.
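Combining the segment connection information into one statement might look like the following Python sketch. The text only says that the description information can be, for example, a legal table join SQL statement; the `SELECT *` projection, the plain `JOIN ... ON` form, joining multiple conditions with `AND`, and the names `to_sql` and `SEGMENTS` are all assumptions made for illustration.

```python
# Segment connection information from the example: (table, [(left, right, field), ...]).
SEGMENTS = [
    ("table2", [("table1", "table2", "field1"), ("table1", "table2", "field3")]),
    ("table3", [("table1", "table3", "field2"), ("table2", "table3", "field5")]),
    ("table4", [("table3", "table4", "field7")]),
]


def to_sql(first_table, segments):
    """Combine per-table segment connection information into one join statement."""
    lines = [f"SELECT * FROM {first_table}"]
    for table, conditions in segments:
        # Assumption: all conditions recorded for one table are applied together.
        on = " AND ".join(
            f"{left}.{field} = {right}.{field}" for left, right, field in conditions
        )
        lines.append(f"JOIN {table} ON {on}")
    return "\n".join(lines)


SQL = to_sql("table1", SEGMENTS)
print(SQL)
```

Because every `ON` clause refers only to the table being joined and to tables listed earlier, the statement stays well-formed regardless of which valid traversal order produced the segments.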
  • in step S2420, if the next traversed data table is adjacent to multiple data tables at the same time and/or the next traversed data table has multiple identical fields with some of its adjacent data tables, the recorded connection condition between the next traversed data table and the adjacent data tables is the union of the connection conditions generated by the adjacency to the multiple data tables and/or by the multiple identical fields shared with those adjacent data tables.
  • the topology includes not only the single connection relationship formed between data table 3 and data table 4, but also the complex connection relationships between data table 1 and data table 2 and among data table 1, data table 2, and data table 3.
  • for the connection relationship between data table 1 and data table 2, when data table 2 is traversed, the connection condition includes that data table 1 and data table 2 have the same field 1 and the same field 3. Because there are two identical fields between data table 1 and data table 2, there are two connection relationships between them, and the connection condition is the union of the same field 1 and the same field 3 of data table 1 and data table 2.
  • for the connection relationship among data table 1, data table 2, and data table 3, when data table 3 is traversed, the connection condition is the union of data table 1 having the same field 2 as data table 3 and data table 2 having the same field 5 as data table 3.
  • FIG. 4 is a schematic diagram of a method of generating description information in accordance with a preferred embodiment of the present invention. As shown in FIG. 4, it is assumed that there are currently four data tables, namely data table A, data table B, data table C, and data table D, and that f1, f2, and f3 are defined in the meta information of the data tables as connection fields; that is, data tables containing these fields can be joined to each other.
  • the connection fields included in the above four data tables are:
  • data table A contains: connection field f3;
  • data table B contains: connection field f1, connection field f2, and connection field f3;
  • data table C contains: connection field f1;
  • data table D contains: connection field f1 and connection field f2.
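  • the connection conditions existing between the data tables are:
  • data table A and data table B: data table A.connection field f3 = data table B.connection field f3;
  • data table B and data table C: data table B.connection field f1 = data table C.connection field f1;
  • data table B and data table D: data table B.connection field f1 = data table D.connection field f1, and data table B.connection field f2 = data table D.connection field f2;
  • data table C and data table D: data table C.connection field f1 = data table D.connection field f1.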
  • that is, data table A and data table B are connected by the f3 field; data table B and data table C are connected by the f1 field; data table B and data table D are connected by the two fields f1 and f2; and data table C and data table D are connected by the f1 field.
  • the preferred embodiment abstracts the problem into a graph traversal problem: the vertices of the graph represent the data tables, and the edges of the graph represent the connection relationships between the data tables.
  • a legal table join statement is generated as follows: starting from vertex A of the graph, the other vertices are reached in a certain order, and the edges between the current vertex and the vertices that have already been reached are added to the join conditions. This yields a legal table join statement in which the vertices of the graph are the joined tables and the edges of the graph are the join conditions. Each table and each join condition appears in the statement once and only once, and every table that appears in a join condition has already been joined (that is, the corresponding graph vertex has already been reached).
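The two legality properties stated above can be checked mechanically. The following Python sketch is an illustration only, not part of the described method: the `is_legal` function and the tuple layout for joins are hypothetical, and the sample data follows the FIG. 4 connection fields with the order A, B, C, D.

```python
def is_legal(first_table, joins):
    """Check the two stated properties for a sequence of joins.

    joins is a list of (table, [(left_table, right_table, field), ...]).
    """
    joined = {first_table}
    seen_conditions = set()
    for table, conditions in joins:
        if table in joined:
            return False  # each table may appear only once
        joined.add(table)
        for condition in conditions:
            left, right, _field = condition
            if condition in seen_conditions:
                return False  # each join condition may appear only once
            if left not in joined or right not in joined:
                return False  # a condition may only use tables already joined
            seen_conditions.add(condition)
    return True


# FIG. 4 example, reaching the vertices in the order A, B, C, D.
FIG4_JOINS = [
    ("B", [("A", "B", "f3")]),
    ("C", [("B", "C", "f1")]),
    ("D", [("B", "D", "f1"), ("B", "D", "f2"), ("C", "D", "f1")]),
]
# Same joins, but C is joined before B, so B is referenced too early.
OUT_OF_ORDER = [FIG4_JOINS[1], FIG4_JOINS[0], FIG4_JOINS[2]]

LEGAL = is_legal("A", FIG4_JOINS)
ILLEGAL = is_legal("A", OUT_OF_ORDER)
```

A check of this kind is useful mainly as a test harness for a generator: any order produced by a graph traversal that starts at A should pass it.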
  • the preferred embodiment determines the order in which the vertices are reached by using a graph traversal method (including depth-first traversal and breadth-first traversal). Taking breadth-first traversal as an example, starting from vertex A, that vertex is counted as the first layer; the second-layer vertices connected to it by an edge, that is, the vertices at distance 1 from A, are reached first, followed by the vertices of each subsequent layer at increasing distance from A. Vertices within the same layer are ordered randomly.
  • vertex A is the first layer and vertex B is the second layer; vertices C and D are both connected to B and belong to the third layer, so the order of arrival of the four vertices is A, B, C, D or A, B, D, C.
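The layer assignment for FIG. 4 can be reproduced with a standard breadth-first traversal. The Python sketch below is a minimal illustration: `bfs_layers` and `CONNECTION_FIELDS` are hypothetical names, and ordering vertices alphabetically within a layer is an arbitrary choice made for reproducibility, whereas the text allows any order within a layer.

```python
from collections import deque

# Connection fields of the four data tables in the FIG. 4 example.
CONNECTION_FIELDS = {
    "A": {"f3"},
    "B": {"f1", "f2", "f3"},
    "C": {"f1"},
    "D": {"f1", "f2"},
}


def bfs_layers(start, fields):
    """Assign each reachable table a layer number (the start table is layer 1)."""
    layers = {start: 1}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for other in sorted(fields):
            # An edge exists when two tables share at least one connection field.
            if other not in layers and fields[current] & fields[other]:
                layers[other] = layers[current] + 1
                queue.append(other)
    return layers


LAYERS = bfs_layers("A", CONNECTION_FIELDS)
# Alphabetical order inside a layer is an assumption; any order would be valid.
ORDER = sorted(LAYERS, key=lambda table: (LAYERS[table], table))
```

A depth-first traversal would also satisfy the requirement, since it too reaches every vertex through an edge to a vertex that was reached earlier.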
  • the method for generating description information according to the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course can also be implemented by hardware, but in many cases the former is the better implementation.
  • the part of the technical solution of the present invention that is essential or that contributes to the prior art may be embodied in the form of a software product that is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions for causing a terminal device (which may be a cell phone, a computer, a server, a network device, or the like) to perform the methods described in the various embodiments of the present invention.
  • the apparatus includes: an obtaining module 10, configured to acquire the field information included in each data table of a plurality of data tables; a determining module 20, configured to determine, according to the field information, the association relationships between the plurality of data tables; and a generating module 30, configured to generate the description information by using the association relationships between the plurality of data tables, wherein the description information records the order in which the plurality of data tables are connected to each other and the connection conditions used between adjacent data tables.
  • FIG. 6 is a structural block diagram of an apparatus for generating description information according to a preferred embodiment of the present invention.
  • the determining module 20 includes: a selecting unit 200, configured to select any one data table to be traversed from the plurality of data tables; a searching unit 202, configured to search for data tables that share one or more fields with the field information of the selected data table and to set each data table found as a to-be-connected data table of the selected data table; and an establishing unit 204, configured to establish an association between the selected data table and the to-be-connected data table and to return to the selecting unit until all of the data tables have been traversed.
  • the generating module 30 includes: an obtaining unit 300, configured to start from the initially traversed data table and, following the associations among the plurality of data tables, sequentially acquire the identification information of the next traversed data table, the identification information of the data tables adjacent to it, and the connection conditions between the next traversed data table and the adjacent data tables, where the adjacent data tables are data tables that have already been traversed; and a generating unit 302, configured to generate the description information from the acquired identification information of the next traversed data table, the identification information of the adjacent data tables, and the connection conditions between them.
  • the generating unit 302 includes: a first generating subunit (not shown in the figure), configured to first establish the association between the next traversed data table and the adjacent data tables according to their identification information, and then record the connection conditions between them, thereby generating segment connection information for each traversed data table; and a second generating subunit (not shown in the figure), configured to combine all the segment connection information to generate the description information.
  • the first generating subunit (not shown in the figure) is configured so that, when the next traversed data table is adjacent to multiple data tables at the same time and/or shares multiple fields with some of its adjacent data tables, the recorded connection condition between the next traversed data table and the adjacent data tables is the union of the connection conditions arising from its adjacency to the multiple data tables and/or the connection conditions arising from the multiple shared fields.
  • Embodiments of the present invention may provide a computer terminal, which may be any one of computer terminal groups.
  • the foregoing computer terminal may also be replaced with a terminal device such as a mobile terminal.
  • the computer terminal may be located in at least one network device of the plurality of network devices of the computer network.
  • FIG. 7 is a structural block diagram of a computer terminal according to an embodiment of the present invention.
  • the computer terminal can include one or more (only one shown in the figure) processor and memory.
  • the memory can be used to store software programs and modules, such as the program instructions/modules corresponding to the method and apparatus for generating description information in the embodiments of the present invention. The processor runs the software programs and modules stored in the memory to execute various functional applications and data processing, that is, to implement the method for generating description information described above.
  • the memory may include a high speed random access memory, and may also include non-volatile memory such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory.
  • the memory can further include memory remotely located relative to the processor, which can be connected to the terminal over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the processor can call the memory stored information and the application through the transmission device to perform the following steps:
  • the foregoing processor may further execute program code for the following steps: a selecting step of selecting any one data table to be traversed from the plurality of data tables; a searching step of searching for data tables that share one or more fields with the field information of the selected data table, and setting each data table found as a to-be-connected data table of the selected data table; establishing an association between the selected data table and the to-be-connected data table; and returning to the selecting step until all of the data tables have been traversed.
  • the foregoing processor may further execute the following program code: according to the relationship between the plurality of data tables, the identification information of the next traversed data table is sequentially obtained from the initially traversed data table, and the next traversal The identification information of the data table adjacent to the data table and the connection condition between the next traversed data table and the adjacent data table, wherein the adjacent data table is the data table that has been traversed; using the acquired next traversal The identification information of the data table, the identification information of the adjacent data table, and the connection condition between the next traversed data table and the adjacent data table generate description information.
  • the foregoing processor may further execute the following program code: firstly, according to the identification information of the data table of the next traversal and the identification information of the adjacent data table, establishing a data table between the next traversal and the adjacent data table. After the association relationship, the connection condition between the next traversed data table and the adjacent data table is recorded, and the segment connection information corresponding to the data table that has been traversed is respectively generated; the all segment connection information is combined to generate the description information.
  • the foregoing processor may further execute program code for the following step: if the next traversed data table is adjacent to multiple data tables at the same time and/or shares multiple fields with some of its adjacent data tables, the recorded connection condition between the next traversed data table and the adjacent data tables is the union of the connection conditions arising from its adjacency to the multiple data tables and/or the connection conditions arising from the multiple shared fields.
  • the relationship between the plurality of data tables is utilized.
  • the technical problem of data analysis is completed by adopting an implementation method of automatically generating a legal table connection SQL statement.
  • the structure shown in FIG. 7 is merely illustrative, and the computer terminal may also be a terminal device such as a smartphone (for example, an Android phone or an iOS phone), a tablet computer, a handheld computer, a Mobile Internet Device (MID), or a PAD.
  • FIG. 7 does not limit the structure of the above electronic device.
  • the computer terminal may also include more or fewer components (such as a network interface, display device, etc.) than shown in FIG. 7, or have a different configuration than that shown in FIG.
  • Embodiments of the present invention also provide a storage medium.
  • the foregoing storage medium may be used to save the program code executed by the method for generating description information provided by Embodiment 1 above.
  • the foregoing storage medium may be located in any one of the computer terminal groups in the computer network, or in any one of the mobile terminal groups.
  • the storage medium is arranged to store program code for performing the following steps:
  • the storage medium is further configured to store program code for performing the following steps: a selecting step of selecting any one data table to be traversed from the plurality of data tables; a searching step of searching for data tables that share one or more fields with the field information of the selected data table, and setting each data table found as a to-be-connected data table of the selected data table; establishing an association between the selected data table and the to-be-connected data table; and returning to the selecting step until all of the data tables have been traversed.
  • the storage medium is further configured to store program code for performing the following steps: sequentially obtaining the identification information of the next traversed data table from the initially traversed data table according to the relationship between the plurality of data tables; The identification information of the data table adjacent to the next traversed data table and the connection condition between the next traversed data table and the adjacent data table, wherein the adjacent data table is a data table that has been traversed; The identification information of the next traversed data table, the identification information of the adjacent data table, and the connection condition between the next traversed data table and the adjacent data table generate description information.
  • the storage medium is further configured to store program code for performing the following steps: first establishing a next traversed data table and adjacent data according to the identification information of the data table of the next traversal and the identification information of the adjacent data table. After the association relationship between the tables, the connection condition between the next traversed data table and the adjacent data table is recorded, and the segment connection information corresponding to the data table that has been traversed is separately generated; the all segment connection information is combined to generate Description.
  • the storage medium is further configured to store program code for performing the following step: if the next traversed data table is adjacent to multiple data tables at the same time and/or shares multiple fields with some of its adjacent data tables, the recorded connection condition between the next traversed data table and the adjacent data tables is the union of the connection conditions arising from its adjacency to the multiple data tables and/or the connection conditions arising from the multiple shared fields.
  • the disclosed technical contents may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division into units is only a division by logical function.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, unit or module, and may be electrical or otherwise.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
  • the technical solution of the present invention, in essence or in the part that contributes to the prior art, or all or part of the technical solution, may be embodied as a software product stored in a storage medium and including a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods of the embodiments of the present invention.
  • the foregoing storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disc, and the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Fuzzy Systems (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)

Abstract

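A method and apparatus for generating description information. The method includes: acquiring the field information contained in each of a plurality of data tables (S20); determining the associations among the plurality of data tables according to the field information (S22); and generating description information from the associations among the plurality of data tables, where the description information records the connection order among the data tables and the connection conditions used between adjacent data tables (S24). The method solves the technical problem in the related art that, given several data tables whose relationships can be known, data analysis could not be completed by automatically generating a valid table-join SQL statement.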

Description

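Method and apparatus for generating description information
This application claims priority to Chinese Patent Application No. 201610162827.1, filed on March 21, 2016 and entitled "Method and Apparatus for Generating Description Information", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of computers, and in particular to a method and apparatus for generating description information.
Background
With the rapid development of big data technology, enabling business personnel to analyze data quickly has become a key problem for big data applications.
Business personnel often need to perform various statistical analyses on base data, which usually requires writing SQL statements and running them in a database. For example, for a statistical analysis based on multiple data tables, the connection relationships between the tables have to be obtained manually from the tables' meta information. Suppose the first data table contains three fields: device identifier (ID), date, and device status; the second data table contains two fields: device ID and company ID; and the third data table contains two fields: company ID and company name. From the meta information it can then be seen that the first and second data tables can be connected through "device ID", and the second and third data tables through "company ID". By connecting these three tables, one can analyze a specific metric (for example, the device failure rate) for each company within a preset period (for example, the most recent month).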
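To make the hand-written analysis concrete, the following is a minimal sketch of the three-table join described above, run through Python's standard `sqlite3` module. The table names, column names, sample rows, and the definition of failure rate as the share of status records equal to `'fault'` are all illustrative assumptions and do not come from the application.

```python
import sqlite3


def build_db():
    """Create an in-memory database with three hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE device_status (device_id TEXT, date TEXT, status TEXT);
        CREATE TABLE device_company (device_id TEXT, company_id TEXT);
        CREATE TABLE company (company_id TEXT, company_name TEXT);
        INSERT INTO device_status VALUES
            ('d1', '2016-03-01', 'ok'),
            ('d1', '2016-03-02', 'fault'),
            ('d2', '2016-03-01', 'ok');
        INSERT INTO device_company VALUES ('d1', 'c1'), ('d2', 'c2');
        INSERT INTO company VALUES ('c1', 'Acme'), ('c2', 'Globex');
        """
    )
    return conn


def failure_rates(conn):
    """Join the three tables on device_id and company_id, then aggregate."""
    query = """
        SELECT c.company_name, AVG(s.status = 'fault') AS failure_rate
        FROM device_status AS s
        JOIN device_company AS dc ON s.device_id = dc.device_id
        JOIN company AS c ON dc.company_id = c.company_id
        GROUP BY c.company_name
        ORDER BY c.company_name
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(failure_rates(build_db()))
```

The two `JOIN ... ON` clauses are exactly the part that has to be rewritten by hand whenever the set of tables changes.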
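For case-by-case statistical analysis, the table-join statement can be written for the specific case, which means business personnel must tailor a different SQL statement for each case. Precisely because each case needs its own SQL statement, this approach is both inefficient and error-prone.
Therefore, the related art does not provide a way to complete data analysis by automatically generating a valid table-join SQL statement when several data tables are given and the relationships between them can be known.
No effective solution to this problem has yet been proposed.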
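Summary of the Invention
Embodiments of the present invention provide a method and apparatus for generating description information, to at least solve the technical problem in the related art that, given several data tables whose relationships can be known, data analysis cannot be completed by automatically generating a valid table-join SQL statement.
According to one aspect of the embodiments of the present invention, a method for generating description information is provided, including: acquiring the field information contained in each of a plurality of data tables; determining the associations among the plurality of data tables according to the field information; and generating description information from the associations among the plurality of data tables, where the description information records the connection order among the data tables and the connection conditions used between adjacent data tables.
Optionally, determining the associations among the plurality of data tables according to the field information includes: a selecting step of selecting any one data table to be traversed from the plurality of data tables; a searching step of searching for data tables that share one or more fields with the field information of the selected data table, and setting each data table found as a to-be-connected data table of the selected data table; establishing an association between the selected data table and the to-be-connected data table; and returning to the selecting step until all of the data tables have been traversed.
Optionally, generating the description information from the associations among the plurality of data tables includes: starting from the initially traversed data table and following the associations among the plurality of data tables, sequentially acquiring the identification information of the next traversed data table, the identification information of the data tables adjacent to it, and the connection conditions between the next traversed data table and the adjacent data tables, where the adjacent data tables are data tables that have already been traversed; and generating the description information from the acquired identification information of the next traversed data table, the identification information of the adjacent data tables, and the connection conditions between them.
Optionally, generating the description information from the acquired identification information of the next traversed data table, the identification information of the adjacent data tables, and the connection conditions between them includes: first establishing the association between the next traversed data table and the adjacent data tables according to their identification information, and then recording the connection conditions between them, thereby generating segment connection information for each traversed data table; and combining all the segment connection information to generate the description information.
Optionally, if the next traversed data table is adjacent to multiple data tables at the same time and/or shares multiple fields with some of its adjacent data tables, the recorded connection condition between the next traversed data table and the adjacent data tables is the union of the connection conditions arising from its adjacency to the multiple data tables and/or the connection conditions arising from the multiple shared fields.
According to another aspect of the embodiments of the present invention, an apparatus for generating description information is also provided, including: an obtaining module, configured to acquire the field information contained in each of a plurality of data tables; a determining module, configured to determine the associations among the plurality of data tables according to the field information; and a generating module, configured to generate description information from the associations among the plurality of data tables, where the description information records the connection order among the data tables and the connection conditions used between adjacent data tables.
Optionally, the determining module includes: a selecting unit, configured to select any one data table to be traversed from the plurality of data tables; a searching unit, configured to search for data tables that share one or more fields with the field information of the selected data table and to set each data table found as a to-be-connected data table of the selected data table; and an establishing unit, configured to establish an association between the selected data table and the to-be-connected data table and to return to the selecting unit until all of the data tables have been traversed.
Optionally, the generating module includes: an obtaining unit, configured to start from the initially traversed data table and, following the associations among the plurality of data tables, sequentially acquire the identification information of the next traversed data table, the identification information of the data tables adjacent to it, and the connection conditions between the next traversed data table and the adjacent data tables, where the adjacent data tables are data tables that have already been traversed; and a generating unit, configured to generate the description information from the acquired identification information of the next traversed data table, the identification information of the adjacent data tables, and the connection conditions between them.
Optionally, the generating unit includes: a first generating subunit, configured to first establish the association between the next traversed data table and the adjacent data tables according to their identification information, and then record the connection conditions between them, thereby generating segment connection information for each traversed data table; and a second generating subunit, configured to combine all the segment connection information to generate the description information.
Optionally, the first generating subunit is configured so that, when the next traversed data table is adjacent to multiple data tables at the same time and/or shares multiple fields with some of its adjacent data tables, the recorded connection condition between the next traversed data table and the adjacent data tables is the union of the connection conditions arising from its adjacency to the multiple data tables and/or the connection conditions arising from the multiple shared fields.
In the embodiments of the present invention, the field information contained in each of a plurality of data tables is acquired and the associations among the data tables are determined from it; these associations are then used to generate description information that records the connection order among the data tables and the connection conditions used between adjacent data tables. The description information (for example, a valid table-join SQL statement) can thus be generated automatically, without manual involvement, to complete data analysis. Automatically generating the description of the relationships between data tables improves both the efficiency of data analysis and the accuracy of its results, and thereby solves the technical problem in the related art that, given several data tables whose relationships can be known, data analysis could not be completed by automatically generating a valid table-join SQL statement.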
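Brief Description of the Drawings
The drawings described here provide a further understanding of the present invention and form part of this application. The exemplary embodiments of the present invention and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:
FIG. 1 is a block diagram of the hardware structure of a computer terminal for a method for generating description information according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for generating description information according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of establishing a complete topology among a plurality of data tables according to a preferred embodiment of the present invention;
FIG. 4 is a schematic diagram of a method for generating description information according to a preferred embodiment of the present invention;
FIG. 5 is a structural block diagram of an apparatus for generating description information according to an embodiment of the present invention;
FIG. 6 is a structural block diagram of an apparatus for generating description information according to a preferred embodiment of the present invention;
FIG. 7 is a structural block diagram of a computer terminal according to an embodiment of the present invention.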
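Detailed Description
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second", and so on in the description, claims, and drawings are used to distinguish similar objects and do not necessarily describe a particular order or sequence. Data so used may be interchanged where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described. In addition, the terms "include" and "have", and any variants of them, are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.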
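Embodiment 1
According to an embodiment of the present invention, a method embodiment for generating description information is provided. It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in a different order.
The method embodiment provided in Embodiment 1 of this application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a computer terminal as an example, FIG. 1 is a block diagram of the hardware structure of a computer terminal for a method for generating description information according to an embodiment of the present invention. As shown in FIG. 1, the computer terminal 10 may include one or more (only one is shown) processors 102 (a processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission device 106 for communication. A person of ordinary skill in the art will understand that the structure shown in FIG. 1 is only illustrative and does not limit the structure of the electronic device. For example, the computer terminal 10 may include more or fewer components than shown in FIG. 1, or have a configuration different from that shown in FIG. 1.
The memory 104 may be used to store the software programs and modules of application software, such as the program instructions/modules corresponding to the method for generating description information in the embodiments of the present invention. The processor 102 runs the software programs and modules stored in the memory 104 to execute various functional applications and data processing, that is, to implement the method for generating description information described above. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory may be connected to the computer terminal 10 over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or send data over a network. A specific example of such a network is a wireless network provided by the communication provider of the computer terminal 10. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC), which can connect to other network devices through a base station and thereby communicate with the Internet. In another example, the transmission device 106 may be a Radio Frequency (RF) module, which communicates with the Internet wirelessly.
In the above operating environment, this application provides the method for generating description information shown in FIG. 2. FIG. 2 is a flowchart of a method for generating description information according to an embodiment of the present invention. As shown in FIG. 2, the method may include the following processing steps:
Step S20: acquire the field information contained in each of a plurality of data tables;
Step S22: determine the associations among the plurality of data tables according to the field information;
Step S24: generate description information from the associations among the plurality of data tables, where the description information records the connection order among the data tables and the connection conditions used between adjacent data tables.
In the related art, description information recording the relationships among multiple data tables, chiefly table-join statements, has to be written by hand by business personnel for each specific case; that is, a different SQL statement must be tailored for each case, which is both inefficient and error-prone. With the technical solution provided by the embodiments of the present invention, by contrast, the field information contained in each of a plurality of data tables is acquired and the associations among the data tables are determined from it; these associations are then used to generate description information that records the connection order among the data tables and the connection conditions used between adjacent data tables. The description information (for example, a valid table-join SQL statement) can thus be generated automatically, without manual involvement, to complete data analysis. Automatically generating the description of the relationships between data tables improves both the efficiency of data analysis and the accuracy of its results, and the efficiency gain is especially significant for batch data analysis work. This solves the technical problem in the related art that, given several data tables whose relationships can be known, data analysis could not be completed by automatically generating a valid table-join SQL statement.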
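Optionally, in step S22, determining the associations among the plurality of data tables according to the field information may include the following steps:
Selecting step S220: select any one data table to be traversed from the plurality of data tables;
Searching step S222: search for data tables that share one or more fields with the field information of the selected data table, and set each data table found as a to-be-connected data table of the selected data table;
Establishing step S224: establish an association between the selected data table and the to-be-connected data table; return to the selecting step until all of the data tables have been traversed.
Among the data tables whose mutual associations need to be established, one table can first be chosen at random as the starting table of the traversal (the reference table), and the remaining tables are then visited in turn. Because each data table contains multiple fields (for example, field 1 is the product name, field 2 the product model, field 3 the manufacturer, and field 4 the service life), once a table has been selected, the other tables that share fields with it must be found. There may be one such table or several. Each table found is set as a to-be-connected data table of the selected table, and a connection relationship is established between them. After the search for the current table's to-be-connected tables is finished, the traversal moves on to the next table, and so on, until every table has been traversed once. This establishes a complete topology among the data tables.
For example, FIG. 3 is a schematic diagram of establishing a complete topology among a plurality of data tables according to a preferred embodiment of the present invention. As shown in FIG. 3 (which shows only the fields shared between the data tables), in this example data table 1 contains field 1, field 2, field 3, and field 4; data table 2 contains field 1, field 3, and field 5; data table 3 contains field 2, field 5, and field 7; and data table 4 contains field 7, field 8, field 9, and field 10. Following the analysis above, data table 1 is first chosen at random as the starting (reference) table, and the remaining tables are then visited in turn. Comparison shows that data tables 1 and 2 share fields (field 1 and field 3) and that data tables 1 and 3 share a field (field 2). Data tables 2 and 3 are therefore both to-be-connected data tables of data table 1, and associations can be established between data tables 1 and 2 and between data tables 1 and 3. Next, data table 2 is traversed. Comparison shows that, apart from data table 1, data table 2 shares a field only with data table 3 (field 5), so data table 3 is a to-be-connected data table of data table 2 and an association can be established between them. Then data table 3 is traversed. Comparison shows that, apart from the tables already associated with it, data table 3 shares a field only with data table 4 (field 7), so data table 4 is a to-be-connected data table of data table 3 and an association can be established between them. Finally, data table 4 is traversed. Comparison shows that, apart from data table 3, with which an association already exists, it has no other to-be-connected data tables. At this point the topology of the associations among data tables 1, 2, 3, and 4 is complete.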
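The association-building step of this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_associations` and the dictionary layout are hypothetical, and the pairwise comparison via `itertools.combinations` is a simplification of the select-and-search loop in steps S220 to S224 that yields the same set of associations.

```python
from itertools import combinations


def build_associations(tables):
    """Map each pair of tables that share fields to the sorted shared fields.

    `tables` maps a table name to the list of its field names.
    """
    associations = {}
    for left, right in combinations(tables, 2):
        shared = sorted(set(tables[left]) & set(tables[right]))
        if shared:  # only tables with at least one common field are associated
            associations[(left, right)] = shared
    return associations


# Field lists taken from the FIG. 3 example.
tables = {
    "table1": ["field1", "field2", "field3", "field4"],
    "table2": ["field1", "field3", "field5"],
    "table3": ["field2", "field5", "field7"],
    "table4": ["field7", "field8", "field9", "field10"],
}

if __name__ == "__main__":
    for pair, fields in build_associations(tables).items():
        print(pair, fields)
```

The result contains exactly the four associations named in the example: tables 1 and 2 (field 1, field 3), tables 1 and 3 (field 2), tables 2 and 3 (field 5), and tables 3 and 4 (field 7).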
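Optionally, in step S24, generating the description information from the associations among the plurality of data tables may include the following steps:
Step S240: starting from the initially traversed data table and following the associations among the plurality of data tables, sequentially acquire the identification information of the next traversed data table, the identification information of the data tables adjacent to it, and the connection conditions between the next traversed data table and the adjacent data tables, where the adjacent data tables are data tables that have already been traversed;
Step S242: generate the description information from the acquired identification information of the next traversed data table, the identification information of the adjacent data tables, and the connection conditions between them.
After the associations among the data tables have been determined, the identification information of the next traversed data table can be acquired in turn, starting from the starting (reference) table, and the identification information of the data tables adjacent to the next traversed table can then be acquired. Note that "the data tables adjacent to the next traversed data table" here means those tables, among all the tables already traversed from the starting table up to the next traversed table, that are adjacent to the next traversed table. This identification information determines which already-traversed table or tables the next traversed table is associated with, and on that basis the connection conditions between the next traversed table and its adjacent tables (that is, which fields they share) can be acquired.
For example, based on the topology already formed for the associations among data tables 1, 2, 3, and 4, the description information is generated as follows:
Data table 1 is taken as the starting (reference) table, and the remaining tables are then visited in turn. Since data tables 2 and 3 are both to-be-connected data tables of data table 1, and associations have already been established between data tables 1 and 2 and between data tables 1 and 3, one of data tables 2 and 3 can be chosen at random. Suppose data table 2 is chosen. The next traversed table is then data table 2, and the only traversed table adjacent to it is data table 1, so the connection condition at this point is that data tables 1 and 2 share fields (field 1 and field 3). Next, after data table 2 has been traversed, because data table 2 is connected only to data table 3 among the tables not yet traversed, the next traversed table is data table 3. Among the traversed data tables 1 and 2, data tables 1 and 3 share a field (field 2) and data tables 2 and 3 share a field (field 5), so the connection condition at this point is that data tables 1 and 3 share field 2 and data tables 2 and 3 share field 5. Then, after data table 3 has been traversed, because data table 3 is connected only to data table 4 among the tables not yet traversed, the next traversed table is data table 4. Among the traversed data tables 1, 2, and 3, only data table 3 shares a field with data table 4 (field 7), so the connection condition at this point is that data tables 3 and 4 share field 7. At this point the description information for the associations among data tables 1, 2, 3, and 4 is complete.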
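Optionally, in step S242, generating the description information from the acquired identification information of the next traversed data table, the identification information of the adjacent data tables, and the connection conditions between them may include the following steps:
Step S2420: first establish the association between the next traversed data table and the adjacent data tables according to their identification information, and then record the connection conditions between them, thereby generating segment connection information for each traversed data table;
Step S2422: combine all the segment connection information to generate the description information.
When the description information is generated, a valid table-connection description must satisfy all three of the following conditions:
Condition 1: every data table is connected exactly once;
Condition 2: every connection condition is used exactly once;
Condition 3: every data table that appears in a connection condition must already have been connected.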
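The three conditions can be checked mechanically. The sketch below assumes a hypothetical representation in which the description information is a list of `(table, conditions)` segments and each condition is a `(left_table, right_table, field)` tuple; neither this layout nor the name `is_valid_description` comes from the application. The sample data uses data tables 1 to 4 (abbreviated `t1` to `t4`) and their shared fields from the example above.

```python
def is_valid_description(segments, all_tables, all_conditions):
    """Return True if the segments satisfy conditions 1, 2 and 3."""
    joined = []  # tables connected so far, in order
    used = []    # connection conditions consumed so far
    for table, conditions in segments:
        joined.append(table)
        for left, right, field in conditions:
            # Condition 3: both tables in a condition must already be connected.
            if left not in joined or right not in joined:
                return False
            used.append((left, right, field))
    # Conditions 1 and 2: each table and each condition appears exactly once.
    return (sorted(joined) == sorted(all_tables)
            and sorted(used) == sorted(all_conditions))


all_tables = ["t1", "t2", "t3", "t4"]
all_conditions = [
    ("t1", "t2", "f1"), ("t1", "t2", "f3"),
    ("t1", "t3", "f2"), ("t2", "t3", "f5"),
    ("t3", "t4", "f7"),
]

valid = [
    ("t1", []),
    ("t2", [("t1", "t2", "f1"), ("t1", "t2", "f3")]),
    ("t3", [("t1", "t3", "f2"), ("t2", "t3", "f5")]),
    ("t4", [("t3", "t4", "f7")]),
]

# Same tables and conditions, but t3 is referenced before it is connected.
too_early = [
    ("t1", []),
    ("t2", [("t1", "t2", "f1"), ("t1", "t2", "f3"), ("t2", "t3", "f5")]),
    ("t3", [("t1", "t3", "f2")]),
    ("t4", [("t3", "t4", "f7")]),
]

if __name__ == "__main__":
    print(is_valid_description(valid, all_tables, all_conditions))
    print(is_valid_description(too_early, all_tables, all_conditions))
```

A checker like this is useful mainly as a test oracle for a generator: any traversal-based generator should only ever produce descriptions that pass it.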
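In the example above, in the topology already formed for data tables 1, 2, 3, and 4, each of the four tables has been connected exactly once. All the connection conditions, namely that data tables 1 and 2 share field 1 and field 3, that data tables 1 and 3 share field 2, that data tables 2 and 3 share field 5, and that data tables 3 and 4 share field 7, have each been used exactly once. Whenever a data table is traversed, every table appearing in its connection conditions must already have been traversed. When data table 2 is traversed, the only traversed table adjacent to it is data table 1, so the connection condition is that data tables 1 and 2 share field 1 and field 3. When data table 3 is traversed, the traversed tables adjacent to it are data tables 1 and 2, so the connection condition can include that data tables 1 and 3 share field 2 and that data tables 2 and 3 share field 5. When data table 4 is traversed, the only traversed table adjacent to it is data table 3, so the connection condition is only that data tables 3 and 4 share field 7.
Combining the segment connection information obtained above yields the final description information: start with data table 1; connect data table 2, with the connection condition that data tables 1 and 2 share field 1 and field 3; connect data table 3, with the connection condition that data tables 1 and 3 share field 2 and data tables 2 and 3 share field 5; connect data table 4, with the connection condition that data tables 3 and 4 share field 7.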
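As a rough illustration of steps S2420 and S2422, the sketch below builds one segment per traversed table and returns the combined list. The function name `describe`, the dictionary-based segment format, and the fixed traversal order passed in by the caller are assumptions made for the example; the application does not prescribe a concrete data format.

```python
def describe(order, tables):
    """Build segment connection information for tables visited in `order`.

    Each segment records the table being connected and, for every
    already-visited table that shares fields with it, the shared fields.
    """
    segments = []
    visited = []
    for name in order:
        joins = {}
        for earlier in visited:
            shared = sorted(set(tables[name]) & set(tables[earlier]))
            if shared:
                joins[earlier] = shared
        segments.append({"table": name, "joins": joins})
        visited.append(name)
    return segments


# Field lists taken from the FIG. 3 example.
tables = {
    "table1": ["field1", "field2", "field3", "field4"],
    "table2": ["field1", "field3", "field5"],
    "table3": ["field2", "field5", "field7"],
    "table4": ["field7", "field8", "field9", "field10"],
}

if __name__ == "__main__":
    for segment in describe(["table1", "table2", "table3", "table4"], tables):
        print(segment)
```

Because a segment only ever looks back at tables in `visited`, the condition that every table named in a connection condition has already been connected holds by construction.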
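In a preferred implementation, in step S2420, if the next traversed data table is adjacent to multiple data tables at the same time and/or shares multiple fields with some of its adjacent data tables, the recorded connection condition between the next traversed data table and the adjacent data tables is the union of the connection conditions arising from its adjacency to the multiple data tables and/or the connection conditions arising from the multiple shared fields.
The topology already formed for data tables 1, 2, 3, and 4 contains not only the simple connection between data tables 3 and 4, which arises because they share field 7, but also the more complex connections between data tables 1 and 2 and among data tables 1, 2, and 3. For the connection between data tables 1 and 2: when data table 2 is traversed, the only traversed table adjacent to it is data table 1, and the connection condition is that the two tables share field 1 and field 3. Because the two tables share two fields, there are two connection relationships between them, and the recorded condition is the union of the condition for field 1 and the condition for field 3. For the connection among data tables 1, 2, and 3: when data table 3 is traversed, the traversed tables adjacent to it are data tables 1 and 2; data tables 1 and 3 share field 2, and data tables 2 and 3 share field 5. Data table 3 therefore has a connection with each of the traversed data tables 1 and 2, and the recorded condition is the union of the condition for field 2 (data tables 1 and 3) and the condition for field 5 (data tables 2 and 3).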
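The preferred implementation above is described further below with reference to the following preferred embodiment.
FIG. 4 is a schematic diagram of a method for generating description information according to a preferred embodiment of the present invention. As shown in FIG. 4, suppose there are currently four data tables: data table A, data table B, data table C, and data table D. The meta information of the data tables defines f1, f2, and f3 as connection fields; that is, data tables containing these fields can be connected together.
Specifically, the connection fields contained in the four data tables are:
Data table A contains: connection field f3;
Data table B contains: connection fields f1, f2, and f3;
Data table C contains: connection field f1;
Data table D contains: connection fields f1 and f2.
The following connection conditions therefore exist among the four data tables:
Between (data table A, data table B): A.f3 = B.f3;
Between (data table B, data table C): B.f1 = C.f1;
Between (data table B, data table D): B.f1 = D.f1 and B.f2 = D.f2;
Between (data table C, data table D): C.f1 = D.f1;
That is, data tables A and B are connected through field f3, data tables B and C through field f1, data tables B and D through the two fields f1 and f2, and data tables C and D through field f1.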
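To generate valid description information (for example, a table-join statement) from the connection relationships between data tables, this preferred embodiment abstracts the task as a graph traversal problem: graph vertices represent data tables, and graph edges represent the connection relationships between them. The data tables in the example above and their connection relationships can be represented as a graph with vertices A, B, C, and D and edges A-B, B-C, B-D, and C-D.
Starting from data table A, the other three data tables are connected in turn. A valid table-join statement is generated as follows: start from vertex A, reach the other vertices in some order, and add the edges between the currently reached vertex and the vertices reached before it to the connection condition. This yields a valid table-join statement in which the vertices are the joined tables and the edges are the connection conditions: every table and every connection condition appears in the statement exactly once, and every table that appears in a connection condition has already been joined (that is, its vertex has already been reached).
This preferred embodiment determines the order in which vertices are reached by using a graph traversal method (depth-first or breadth-first traversal). Taking breadth-first traversal as an example, start from vertex A and count it as layer 1; first reach the layer-2 vertices connected to it by an edge, that is, the vertices at distance 1 from A, and then reach the vertices of each subsequent layer in order of increasing distance from A. Vertices within the same layer are reached in random order. In the example above, vertex A is in layer 1 and vertex B is in layer 2; vertices C and D are both connected to B and belong to layer 3, so the four vertices are reached in the order A, B, C, D or A, B, D, C.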
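The layering described above is ordinary breadth-first search. The following sketch computes the layers for the example graph; the adjacency-list encoding and the function name `bfs_layers` are assumptions made for illustration. Within a layer the order here simply follows the adjacency lists, whereas the text allows any order inside a layer.

```python
def bfs_layers(graph, start):
    """Group the vertices reachable from `start` by their distance from it."""
    layers = [[start]]
    seen = {start}
    frontier = [start]
    while frontier:
        next_frontier = []
        for vertex in frontier:
            for neighbor in graph[vertex]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        if next_frontier:
            layers.append(next_frontier)
        frontier = next_frontier
    return layers


# Edges A-B, B-C, B-D and C-D from the example.
graph = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B", "D"],
    "D": ["B", "C"],
}

if __name__ == "__main__":
    print(bfs_layers(graph, "A"))
```

Flattening the layers gives the arrival order A, B, C, D; swapping the two vertices in the last layer gives the other admissible order A, B, D, C.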
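Taking the arrival order A, B, C, D as an example, the table-join statement is generated as follows:
Step 1: starting from vertex A, reach vertex B, join vertex B, and add the edges between vertex B and the vertices reached before it to the connection condition. Only vertex A was reached before vertex B, so the edge between vertex B and vertex A is added to the connection condition, generating A join B on A.f3=B.f3;
Step 2: starting from vertex B, reach vertex C, join vertex C, and add the edges between vertex C and the vertices reached before it to the connection condition. Vertices A and B were reached before vertex C, but vertex C has an edge only with vertex B, so the edge between vertex C and vertex B is added to the connection condition, generating join C on B.f1=C.f1;
Step 3: starting from vertex C, reach vertex D, join vertex D, and add the edges between vertex D and the vertices reached before it to the connection condition. Vertices A, B, and C were reached before vertex D, and vertex D has edges with vertices B and C, so the edges between vertex D and vertices B and C are added to the connection condition, generating join D on B.f1=D.f1 and B.f2=D.f2 and C.f1=D.f1;
After all vertices have been reached, the final table-join statement is generated: A join B on A.f3=B.f3 join C on B.f1=C.f1 join D on B.f1=D.f1 and B.f2=D.f2 and C.f1=D.f1.
A valid table-join statement for data tables A, B, C, and D is therefore:
A join B on A.f3=B.f3 join C on B.f1=C.f1 join D on B.f1=D.f1 and B.f2=D.f2 and C.f1=D.f1.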
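Taking the arrival order A, B, D, C as an example, the table-join statement is generated as follows:
Step 1: starting from vertex A, reach vertex B, join vertex B, and add the edges between vertex B and the vertices reached before it to the connection condition. Only vertex A was reached before vertex B, so the edge between vertex B and vertex A is added to the connection condition, generating A join B on A.f3=B.f3;
Step 2: starting from vertex B, reach vertex D, join vertex D, and add the edges between vertex D and the vertices reached before it to the connection condition. Vertices A and B were reached before vertex D, but among them vertex D has an edge only with vertex B, so the edge between vertex D and vertex B is added to the connection condition, generating join D on B.f1=D.f1 and B.f2=D.f2;
Step 3: starting from vertex D, reach vertex C, join vertex C, and add the edges between vertex C and the vertices reached before it to the connection condition. Vertices A, B, and D were reached before vertex C, and vertex C has edges with vertices B and D, so the edges between vertex C and vertices B and D are added to the connection condition, generating join C on B.f1=C.f1 and C.f1=D.f1;
After all vertices have been reached, the final table-join statement is generated: A join B on A.f3=B.f3 join D on B.f1=D.f1 and B.f2=D.f2 join C on B.f1=C.f1 and C.f1=D.f1.
A valid table-join statement for data tables A, B, C, and D is therefore:
A join B on A.f3=B.f3 join D on B.f1=D.f1 and B.f2=D.f2 join C on B.f1=C.f1 and C.f1=D.f1.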
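Both walkthroughs follow the same rule, which can be sketched as a small generator. The function name `generate_join_statement`, the `edges` dictionary keyed by alphabetically ordered table pairs, and the error raised for an unreachable table are illustrative assumptions; the generator emits the same bare `join ... on ...` text as the walkthroughs, not a complete SQL query.

```python
def generate_join_statement(order, edges):
    """Build a join statement for tables visited in `order`.

    `edges` maps an alphabetically sorted pair of table names to the list
    of fields the two tables share.
    """
    visited = [order[0]]
    parts = [order[0]]
    for table in order[1:]:
        conditions = []
        # Only edges to tables that were reached earlier may be used.
        for earlier in visited:
            left, right = sorted((earlier, table))
            for field in edges.get((left, right), []):
                conditions.append(f"{left}.{field}={right}.{field}")
        if not conditions:
            raise ValueError(f"{table} shares no field with any earlier table")
        parts.append(f"join {table} on " + " and ".join(conditions))
        visited.append(table)
    return " ".join(parts)


# Connection conditions of data tables A, B, C and D from the example.
edges = {
    ("A", "B"): ["f3"],
    ("B", "C"): ["f1"],
    ("B", "D"): ["f1", "f2"],
    ("C", "D"): ["f1"],
}

if __name__ == "__main__":
    print(generate_join_statement(["A", "B", "C", "D"], edges))
    print(generate_join_statement(["A", "B", "D", "C"], edges))
```

Each edge is consumed when its later endpoint is reached, so every connection condition appears exactly once regardless of which admissible order is chosen.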
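In contrast to the valid table-join statements, the following is an example of an invalid one:
A join B on A.f3=B.f3 join C on B.f1=C.f1 and C.f1=D.f1 join D on B.f1=D.f1 and B.f2=D.f2;
In this statement, the connection condition "C.f1=D.f1" appears before "join D", which is invalid because data table D in the connection condition must be joined first. The table currently being traversed is data table C, and the tables already traversed are only data tables A and B, not data table D. Data table D therefore cannot appear in the connection condition for joining data table C; it can only appear in the connection condition for joining data table D, once the traversal has continued to data table D.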
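It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.
From the description of the above implementations, those skilled in the art can clearly understand that the method for generating description information according to the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, may be embodied as a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.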
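Embodiment 2
According to an embodiment of the present invention, an apparatus for implementing the above generation of description information is also provided; its structural block diagram is shown in FIG. 5. The apparatus includes: an obtaining module 10, configured to acquire the field information contained in each of a plurality of data tables; a determining module 20, configured to determine the associations among the plurality of data tables according to the field information; and a generating module 30, configured to generate description information from the associations among the plurality of data tables, where the description information records the connection order among the data tables and the connection conditions used between adjacent data tables.
Optionally, FIG. 6 is a structural block diagram of an apparatus for generating description information according to a preferred embodiment of the present invention. As shown in FIG. 6, the determining module 20 includes: a selecting unit 200, configured to select any one data table to be traversed from the plurality of data tables; a searching unit 202, configured to search for data tables that share one or more fields with the field information of the selected data table and to set each data table found as a to-be-connected data table of the selected data table; and an establishing unit 204, configured to establish an association between the selected data table and the to-be-connected data table and to return to the selecting unit until all of the data tables have been traversed.
Optionally, as shown in FIG. 6, the generating module 30 includes: an obtaining unit 300, configured to start from the initially traversed data table and, following the associations among the plurality of data tables, sequentially acquire the identification information of the next traversed data table, the identification information of the data tables adjacent to it, and the connection conditions between the next traversed data table and the adjacent data tables, where the adjacent data tables are data tables that have already been traversed; and a generating unit 302, configured to generate the description information from the acquired identification information of the next traversed data table, the identification information of the adjacent data tables, and the connection conditions between them.
Optionally, the generating unit 302 includes: a first generating subunit (not shown in the figure), configured to first establish the association between the next traversed data table and the adjacent data tables according to their identification information, and then record the connection conditions between them, thereby generating segment connection information for each traversed data table; and a second generating subunit (not shown in the figure), configured to combine all the segment connection information to generate the description information.
Optionally, the first generating subunit (not shown in the figure) is configured so that, when the next traversed data table is adjacent to multiple data tables at the same time and/or shares multiple fields with some of its adjacent data tables, the recorded connection condition between the next traversed data table and the adjacent data tables is the union of the connection conditions arising from its adjacency to the multiple data tables and/or the connection conditions arising from the multiple shared fields.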
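Embodiment 3
An embodiment of the present invention may provide a computer terminal, which may be any computer terminal device in a group of computer terminals. Optionally, in this embodiment, the computer terminal may also be replaced with a terminal device such as a mobile terminal.
Optionally, in this embodiment, the computer terminal may be located in at least one of a plurality of network devices of a computer network.
Optionally, FIG. 7 is a structural block diagram of a computer terminal according to an embodiment of the present invention. As shown in FIG. 7, the computer terminal may include one or more (only one is shown) processors and a memory.
The memory may be used to store software programs and modules, such as the program instructions/modules corresponding to the method and apparatus for generating description information in the embodiments of the present invention. The processor runs the software programs and modules stored in the memory to execute various functional applications and data processing, that is, to implement the method for generating description information described above. The memory may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory located remotely from the processor, and such remote memory may be connected to the terminal over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor may call, through the transmission device, the information and application programs stored in the memory to perform the following steps:
S1: acquire the field information contained in each of a plurality of data tables;
S2: determine the associations among the plurality of data tables according to the field information;
S3: generate description information from the associations among the plurality of data tables, where the description information records the connection order among the data tables and the connection conditions used between adjacent data tables.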
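Optionally, the processor may further execute program code for the following steps: a selecting step of selecting any one data table to be traversed from the plurality of data tables; a searching step of searching for data tables that share one or more fields with the field information of the selected data table, and setting each data table found as a to-be-connected data table of the selected data table; establishing an association between the selected data table and the to-be-connected data table; and returning to the selecting step until all of the data tables have been traversed.
Optionally, the processor may further execute program code for the following steps: starting from the initially traversed data table and following the associations among the plurality of data tables, sequentially acquiring the identification information of the next traversed data table, the identification information of the data tables adjacent to it, and the connection conditions between the next traversed data table and the adjacent data tables, where the adjacent data tables are data tables that have already been traversed; and generating the description information from the acquired identification information of the next traversed data table, the identification information of the adjacent data tables, and the connection conditions between them.
Optionally, the processor may further execute program code for the following steps: first establishing the association between the next traversed data table and the adjacent data tables according to their identification information, and then recording the connection conditions between them, thereby generating segment connection information for each traversed data table; and combining all the segment connection information to generate the description information.
Optionally, the processor may further execute program code for the following step: if the next traversed data table is adjacent to multiple data tables at the same time and/or shares multiple fields with some of its adjacent data tables, the recorded connection condition between the next traversed data table and the adjacent data tables is the union of the connection conditions arising from its adjacency to the multiple data tables and/or the connection conditions arising from the multiple shared fields.
With this embodiment of the present invention, the field information contained in each of a plurality of data tables is acquired and the associations among the data tables are determined from it; these associations are then used to generate description information that records the connection order among the data tables and the connection conditions used between adjacent data tables. The description information (for example, a valid table-join SQL statement) can thus be generated automatically, without manual involvement, to complete data analysis. Automatically generating the description of the relationships between data tables improves both the efficiency of data analysis and the accuracy of its results, and thereby solves the technical problem in the related art that, given several data tables whose relationships can be known, data analysis could not be completed by automatically generating a valid table-join SQL statement.
A person of ordinary skill in the art will understand that the structure shown in FIG. 7 is only illustrative. The computer terminal may also be a terminal device such as a smartphone (for example, an Android phone or an iOS phone), a tablet computer, a handheld computer, a Mobile Internet Device (MID), or a PAD. FIG. 7 does not limit the structure of the electronic device. For example, the computer terminal may include more or fewer components (such as a network interface or a display device) than shown in FIG. 7, or have a configuration different from that shown in FIG. 7.
A person of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments may be completed by a program instructing hardware related to the terminal device. The program may be stored in a computer-readable storage medium, which may include a flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disc, or the like.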
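Embodiment 4
An embodiment of the present invention also provides a storage medium. Optionally, in this embodiment, the storage medium may be used to store the program code executed by the method for generating description information provided in Embodiment 1.
Optionally, in this embodiment, the storage medium may be located in any computer terminal in a group of computer terminals in a computer network, or in any mobile terminal in a group of mobile terminals.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps:
S1: acquire the field information contained in each of a plurality of data tables;
S2: determine the associations among the plurality of data tables according to the field information;
S3: generate description information from the associations among the plurality of data tables, where the description information records the connection order among the data tables and the connection conditions used between adjacent data tables.
Optionally, the storage medium is further configured to store program code for performing the following steps: a selecting step of selecting any one data table to be traversed from the plurality of data tables; a searching step of searching for data tables that share one or more fields with the field information of the selected data table, and setting each data table found as a to-be-connected data table of the selected data table; establishing an association between the selected data table and the to-be-connected data table; and returning to the selecting step until all of the data tables have been traversed.
Optionally, the storage medium is further configured to store program code for performing the following steps: starting from the initially traversed data table and following the associations among the plurality of data tables, sequentially acquiring the identification information of the next traversed data table, the identification information of the data tables adjacent to it, and the connection conditions between the next traversed data table and the adjacent data tables, where the adjacent data tables are data tables that have already been traversed; and generating the description information from the acquired identification information of the next traversed data table, the identification information of the adjacent data tables, and the connection conditions between them.
Optionally, the storage medium is further configured to store program code for performing the following steps: first establishing the association between the next traversed data table and the adjacent data tables according to their identification information, and then recording the connection conditions between them, thereby generating segment connection information for each traversed data table; and combining all the segment connection information to generate the description information.
Optionally, the storage medium is further configured to store program code for performing the following step: if the next traversed data table is adjacent to multiple data tables at the same time and/or shares multiple fields with some of its adjacent data tables, the recorded connection condition between the next traversed data table and the adjacent data tables is the union of the connection conditions arising from its adjacency to the multiple data tables and/or the connection conditions arising from the multiple shared fields.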
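The numbering of the above embodiments of the present invention is for description only and does not indicate the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the relevant descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other ways. The apparatus embodiments described above are only illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, units, or modules, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as a standalone product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, or all or part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods of the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred implementations of the present invention. It should be noted that a person of ordinary skill in the art may make improvements and refinements without departing from the principles of the present invention, and such improvements and refinements shall also be regarded as falling within the scope of protection of the present invention.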

Claims (10)

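  1. A method for generating description information, characterized by comprising:
    acquiring the field information contained in each of a plurality of data tables;
    determining the associations among the plurality of data tables according to the field information;
    generating description information from the associations among the plurality of data tables, wherein the description information records the connection order among the plurality of data tables and the connection conditions used between adjacent data tables.
  2. The method according to claim 1, characterized in that determining the associations among the plurality of data tables according to the field information comprises:
    a selecting step: selecting any one data table to be traversed from the plurality of data tables;
    a searching step: searching for data tables that share one or more fields with the field information of the selected data table, and setting each data table found as a to-be-connected data table of the selected data table;
    establishing an association between the selected data table and the to-be-connected data table; and returning to the selecting step until all of the plurality of data tables have been traversed.
  3. The method according to claim 2, characterized in that generating the description information from the associations among the plurality of data tables comprises:
    starting from the initially traversed data table and following the associations among the plurality of data tables, sequentially acquiring the identification information of the next traversed data table, the identification information of the data tables adjacent to the next traversed data table, and the connection conditions between the next traversed data table and the adjacent data tables, wherein the adjacent data tables are data tables that have already been traversed;
    generating the description information from the acquired identification information of the next traversed data table, the identification information of the adjacent data tables, and the connection conditions between the next traversed data table and the adjacent data tables.
  4. The method according to claim 3, characterized in that generating the description information from the acquired identification information of the next traversed data table, the identification information of the adjacent data tables, and the connection conditions between the next traversed data table and the adjacent data tables comprises:
    first establishing the association between the next traversed data table and the adjacent data tables according to the identification information of the next traversed data table and the identification information of the adjacent data tables, and then recording the connection conditions between the next traversed data table and the adjacent data tables, thereby generating segment connection information for each traversed data table;
    combining all the segment connection information to generate the description information.
  5. The method according to claim 4, characterized in that, if the next traversed data table is adjacent to multiple data tables at the same time and/or the next traversed data table shares multiple fields with some of its adjacent data tables, the recorded connection condition between the next traversed data table and the adjacent data tables is the union of the connection conditions arising from the adjacency of the next traversed data table to the multiple data tables and/or the connection conditions arising from the multiple fields that the next traversed data table shares with some of its adjacent data tables.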
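  6. An apparatus for generating description information, characterized by comprising:
    an obtaining module, configured to obtain the field information contained in each of multiple data tables;
    a determining module, configured to determine the association relationships among the multiple data tables according to the field information;
    a generating module, configured to generate description information using the association relationships among the multiple data tables, wherein the description information records the connection order of the multiple data tables and the connection conditions used between adjacent data tables.
  7. The apparatus according to claim 6, characterized in that the determining module comprises:
    a selecting unit, configured to select any one data table to be traversed from the multiple data tables;
    a searching unit, configured to search for data tables that have one or more fields identical to those in the field information contained in the selected data table, and to set the data tables found as the to-be-connected data tables of the selected data table;
    an establishing unit, configured to establish an association relationship between the selected data table and the to-be-connected data tables, and to return to the selecting unit until all of the multiple data tables have been traversed.
  8. The apparatus according to claim 7, characterized in that the generating module comprises:
    an obtaining unit, configured to, starting from the initially traversed data table and following the association relationships among the multiple data tables, sequentially obtain the identification information of the next traversed data table, the identification information of the data table adjacent to the next traversed data table, and the connection condition between the next traversed data table and the adjacent data table, wherein the adjacent data table is a data table that has already been traversed;
    a generating unit, configured to generate the description information from the obtained identification information of the next traversed data table, the identification information of the adjacent data table, and the connection condition between the next traversed data table and the adjacent data table.
  9. The apparatus according to claim 8, characterized in that the generating unit comprises:
    a first generating subunit, configured to first establish the association relationship between the next traversed data table and the adjacent data table according to the identification information of the next traversed data table and the identification information of the adjacent data table, and then record the connection condition between the next traversed data table and the adjacent data table, thereby generating segmented connection information corresponding to each data table already traversed;
    a second generating subunit, configured to combine all the segmented connection information to generate the description information.
  10. The apparatus according to claim 9, characterized in that the first generating subunit is configured so that, when the next traversed data table is adjacent to multiple data tables at the same time and/or the next traversed data table shares multiple identical fields with some of its adjacent data tables, the recorded connection condition between the next traversed data table and the adjacent data tables is the union of the connection conditions produced by the adjacency of the next traversed data table to multiple data tables and/or the connection conditions produced by the multiple identical fields that the next traversed data table shares with some of its adjacent data tables.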
PCT/CN2017/075947 2016-03-21 2017-03-08 Method and apparatus for generating description information WO2017162026A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610162827.1A CN107220251A (zh) 2016-03-21 2016-03-21 Method and apparatus for generating description information
CN201610162827.1 2016-03-21

Publications (1)

Publication Number Publication Date
WO2017162026A1 true WO2017162026A1 (zh) 2017-09-28

Family

ID=59899308

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/075947 WO2017162026A1 (zh) 2016-03-21 2017-03-08 Method and apparatus for generating description information

Country Status (3)

Country Link
CN (1) CN107220251A (zh)
TW (1) TW201734861A (zh)
WO (1) WO2017162026A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116756125A (zh) * 2023-08-14 2023-09-15 中信证券股份有限公司 Description information generation method and apparatus, electronic device, and computer-readable medium

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
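CN111382198B (zh) * 2018-12-28 2023-09-19 中国移动通信集团山西有限公司 Data restoration method, apparatus, device, and storage medium
CN111708918A (zh) * 2020-05-15 2020-09-25 北京明略软件系统有限公司 Data processing method, electronic device, and storage medium
CN113407536B (zh) * 2021-06-10 2024-05-31 平安科技(深圳)有限公司 Table data association method and apparatus, terminal device, and medium
CN113434507B (zh) * 2021-06-29 2023-07-07 中国联合网络通信集团有限公司 Data textualization method, apparatus, device, and storage medium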

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050080820A1 (en) * 2003-10-11 2005-04-14 Koppel Carl Arnold Method and system for generating, associating and employing user-defined fields in a relational database within an information technology system
US20060271957A1 (en) * 2005-05-31 2006-11-30 Dave Sullivan Method for utilizing audience-specific metadata
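CN101482821A (zh) * 2009-02-13 2009-07-15 山东浪潮齐鲁软件产业股份有限公司 Model-driven development method for implementing an assisted information entry function
CN101673287A (zh) * 2009-10-16 2010-03-17 金蝶软件(中国)有限公司 SQL statement generation method and system
CN102110134A (zh) * 2010-12-28 2011-06-29 青岛海信网络科技股份有限公司 Real-time database for rail transit, operating method, and operating apparatus
CN104317939A (zh) * 2014-10-31 2015-01-28 北京思特奇信息技术股份有限公司 Log statistics method and system based on a digital cinema playback server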

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
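CN103135976B (zh) * 2011-11-30 2016-05-11 阿里巴巴集团控股有限公司 Automatic code generation method and apparatus
CN104182405B (zh) * 2013-05-22 2017-05-24 阿里巴巴集团控股有限公司 Join query method and apparatus
CN105224597A (zh) * 2015-08-28 2016-01-06 上海斐讯数据通信技术有限公司 System and method for generating images from foreign key relationships in a database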

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
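CN116756125A (zh) * 2023-08-14 2023-09-15 中信证券股份有限公司 Description information generation method and apparatus, electronic device, and computer-readable medium
CN116756125B (zh) * 2023-08-14 2023-10-27 中信证券股份有限公司 Description information generation method and apparatus, electronic device, and computer-readable medium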

Also Published As

Publication number Publication date
TW201734861A (zh) 2017-10-01
CN107220251A (zh) 2017-09-29

Similar Documents

Publication Publication Date Title
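WO2017162026A1 (zh) Method and apparatus for generating description information
TWI740901B (zh) Method and apparatus for performing a data recovery operation
US9170977B2 (en) Method and system for managing server information data based on position information of a server baseboard
US10412035B2 (en) Method and device for pushing information based on communication group
CN105653630B (zh) Data migration method and apparatus for a distributed database
WO2018000607A1 (zh) Method and electronic device for identifying the cause of test case failure
WO2021043064A1 (zh) Community detection method and apparatus, computer device, and storage medium
CN105824846B (zh) Data migration method and apparatus
WO2018149137A1 (zh) Wireless Fidelity (Wi-Fi) connection method and related products
CN110377570A (zh) Node switching method and apparatus, computer device, and storage medium
WO2021129379A1 (zh) Method and apparatus for generating an information sharing chain, electronic device, and storage medium
WO2018149138A1 (zh) Wireless Fidelity (Wi-Fi) connection method and related products
EP2990947A1 (en) Method and apparatus for backing up data and electronic device
CN110224899B (zh) Method and apparatus for obtaining the call chain of a TCP application
CN104615658A (zh) Method for determining user identity
CN106446300A (zh) Transaction processing method and system based on a shared storage pool
WO2019080719A1 (zh) Data processing method, apparatus, storage medium, processor, and system
CN103701653B (zh) Method for processing interface hot-swap configuration data and network configuration server
CN108733698A (zh) Log message processing method and back-end service system
CN115238062A (zh) Technology property rights matching method and system
WO2016061947A1 (zh) Method and apparatus for detecting IP address conflicts on Layer 3 interfaces
WO2017059778A1 (zh) Method, apparatus, and system for detecting shell websites
CN107609197B (zh) Data synchronization method, data synchronization apparatus, and mobile terminal
CN106789446A (zh) Peer-node cluster distributed testing framework and method
CN104182348A (zh) Software testing method and apparatus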

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17769303

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 17769303

Country of ref document: EP

Kind code of ref document: A1