CN112668287A - Data table determination method, system and device - Google Patents

Info

Publication number
CN112668287A
Authority
CN
China
Prior art keywords
schema information
data
field
data value
schema
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910944542.7A
Other languages
Chinese (zh)
Inventor
张安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd
Priority to CN201910944542.7A
Publication of CN112668287A

Abstract

The application discloses a data table determination method, system and device. The method includes: acquiring first Schema information, where the first Schema information includes at least one of the following: an attribute of a first field, a first data value corresponding to the first field, and a first data type of the first data value; broadcasting the first Schema information to an execution device; receiving second Schema information fed back by the execution device based on the first Schema information, an attribute of a second field, and a second data value corresponding to the second field, where the attribute of the second field and the second data value are obtained by the execution device by parsing acquired user data; and constructing a data table based on the first Schema information and the second Schema information. The application solves the technical problem in the prior art that data parsing is inefficient because the parsing code corresponding to the Schema must be modified whenever the data format changes.

Description

Data table determination method, system and device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, a system, and an apparatus for determining a data table.
Background
With the rapid development of the Internet and Internet of Things technologies, the volume of data worldwide is growing explosively, and the demand for analyzing, mining, and predicting from massive data keeps increasing. Data comes in many different formats, and to interpret it effectively a data structure must be constructed onto which the unorganized data can be mapped. The most common approach is to parse data of different structures into a data warehouse through predefined parsers, converting it into structured data for further processing.
In distributed big data computing scenarios, most big data frameworks do not support defining a Schema for tables that change arbitrarily. Here a Schema is a collection of database objects, such as tables, indexes, views, and stored procedures, that describe the structure of data. When parsing a client's data, the conventional approach generally requires the developer and the client to agree on the data structure in advance and fix a schema, after which the developer parses the client's data according to that fixed schema. This approach has significant limitations: different clients may need different schemas, and the same client may need to add fields to its schema at any time. Because most existing schemes require the developer and the client to agree in advance on the format of the data the client sends and on its content (fields and values), any later change or addition of fields by the client forces the developer to modify the parsing code corresponding to the schema. This takes a long time, so data parsing is inefficient.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
Embodiments of the present application provide a data table determination method, system and device, to solve at least the technical problem in the prior art that data parsing is inefficient because the parsing code corresponding to the Schema must be modified whenever the data format changes.
According to an aspect of an embodiment of the present application, there is provided a data table determination method, including: acquiring first Schema information, where the first Schema information includes at least one of the following: an attribute of a first field, a first data value corresponding to the first field, and a first data type of the first data value; broadcasting the first Schema information to an execution device; receiving second Schema information fed back by the execution device based on the first Schema information, an attribute of a second field, and a second data value corresponding to the second field, where the attribute of the second field and the second data value are obtained by the execution device by parsing acquired user data; and constructing a data table based on the first Schema information and the second Schema information.
Optionally, constructing a data table based on the first Schema information and the second Schema information includes: merging the first Schema information and the second Schema information, and performing deduplication processing to obtain third Schema information; and determining the data table according to the third Schema information.
According to another aspect of an embodiment of the present application, there is provided a data table determination method, including: receiving first Schema information, where the first Schema information includes at least one of the following: an attribute of a first field, a first data value corresponding to the first field, and a first data type of the first data value; acquiring user data, and parsing the user data to obtain an attribute of a second field and a second data value corresponding to the second field; determining a second data type of the second data value; determining second Schema information based on the attribute of the second field, the second data value, and the second data type; determining third Schema information based on the first Schema information and the second Schema information; and sending the third Schema information to a driving device.
Optionally, determining the third Schema information based on the first Schema information and the second Schema information includes: deleting the Schema information which is the same as the first Schema information in the second Schema information to obtain the third Schema information.
Optionally, determining the second Schema information based on the attribute of the second field, the second data value, and the second data type includes: accumulating and/or removing the attribute of the second field, the second data value and the second data type to obtain the second Schema information.
According to an aspect of an embodiment of the present application, there is provided a data table determination system, including: a driving device, configured to acquire first Schema information from a database, where the first Schema information includes at least one of the following: an attribute of a first field, a first data value corresponding to the first field, and a first data type of the first data value, and further configured to broadcast the first Schema information to an execution device, receive second Schema information, construct a data table based on the first Schema information and the second Schema information, and send the data table to the database; and the execution device, configured to determine the second Schema information based on the first Schema information, an attribute of a second field, and a second data value corresponding to the second field, where the attribute of the second field and the second data value are obtained by the execution device by parsing acquired user data.
According to an aspect of an embodiment of the present application, there is provided a data table determination apparatus, including: an obtaining module, configured to obtain first Schema information, where the first Schema information includes at least one of the following: an attribute of a first field, a first data value corresponding to the first field, and a first data type of the first data value; a broadcasting module, configured to broadcast the first Schema information to an execution device; a receiving module, configured to receive second Schema information fed back by the execution device based on the first Schema information, an attribute of a second field, and a second data value corresponding to the second field, where the attribute of the second field and the second data value are obtained by the execution device by parsing acquired user data; and a constructing module, configured to construct the data table based on the first Schema information and the second Schema information.
Optionally, the constructing module includes: a merging module, configured to merge the first Schema information and the second Schema information, and perform deduplication processing to obtain third Schema information; and a determining module, configured to determine the data table according to the third Schema information.
According to an aspect of the embodiments of the present application, there is provided a storage medium, where the storage medium includes a stored program, and when the program runs, a device in which the storage medium is located is controlled to execute the above-mentioned data table determination method.
According to an aspect of the embodiments of the present application, there is provided an electronic device, the electronic device including at least one processor, and at least one memory and a bus connected to the processor, wherein the processor and the memory complete communication with each other through the bus; the processor is used for calling the program instructions in the memory so as to execute the data table determination method.
In the embodiments of the present application, first Schema information is acquired, where the first Schema information includes at least one of the following: an attribute of a first field, a first data value corresponding to the first field, and a first data type of the first data value; the first Schema information is broadcast to an execution device; second Schema information is received, which the execution device feeds back based on the first Schema information, an attribute of a second field, and a second data value corresponding to the second field, where the attribute of the second field and the second data value are obtained by the execution device by parsing acquired user data; and a data table is constructed based on the first Schema information and the second Schema information. In this way, the execution device automatically parses the acquired user data to generate new Schema information, and the driving device generates a new data table from the new Schema information together with the original Schema information, so that data in different formats is parsed automatically and a new data table is created. This achieves the technical effect of improving data parsing efficiency when the data format changes, and thereby solves the technical problem in the prior art that data parsing is inefficient because the parsing code corresponding to the Schema must be modified whenever the data format changes.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a schematic flow chart diagram illustrating an alternative data table determination method according to an embodiment of the present application;
FIG. 2 is a schematic data flow diagram of an alternative data table determination method according to an embodiment of the present application;
FIG. 3 is a schematic flow chart diagram illustrating an alternative data table determination method according to an embodiment of the present application;
FIG. 4 is a block diagram of an alternative data table determination system according to an embodiment of the present application;
FIG. 5 is a schematic diagram of an alternative data table determination apparatus according to an embodiment of the present application;
FIG. 6 is a block diagram of an alternative electronic device according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In accordance with an embodiment of the present application, there is provided an embodiment of a data table determination method. It should be noted that the steps illustrated in the flowcharts of the drawings may be performed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in a different order.
Fig. 1 is a schematic flow chart of a data table determination method according to an embodiment of the present application, and as shown in fig. 1, the method at least includes the following steps:
Step S102, obtaining first Schema information, where the first Schema information includes at least one of the following: an attribute of a first field, a first data value corresponding to the first field, and a first data type of the first data value;
Optionally, the driving device may obtain the first Schema information, and the driving device may be a computer device.
The first Schema information may be the Schema information of an original table read by the driving device from a database, and the attribute of the first field may be a field name.
Step S104, broadcasting the first Schema information to an execution device;
In some optional embodiments of the present application, the execution device may be a computer, and there may be multiple execution devices. After reading the first Schema information, the driving device broadcasts it to each execution device. This informs each execution device of the first Schema information already existing in the database, so that after acquiring the user data the execution device deletes the information that is the same as the first Schema information.
Step S106, receiving second Schema information fed back by the execution device based on the first Schema information, the attribute of the second field, and the second data value corresponding to the second field, where the attribute of the second field and the second data value are obtained by the execution device by parsing the acquired user data;
In some optional embodiments of the present application, each execution device needs to acquire user data before feeding back the second Schema information to the driving device. The user data includes data in multiple formats, for example user data in csv format, json format, and txt format. After acquiring the user data, the execution device parses the user data in its various formats to obtain the attribute of the second field and the second data value corresponding to the second field.
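As a rough illustration of format-specific parsing on an execution device, the following Python sketch dispatches to one parser per format and returns records of field names and values. The function names, the `PARSERS` registry, and the `key=value` layout assumed for txt data are all hypothetical; the application does not specify them.

```python
import csv
import io
import json


def parse_csv(text):
    """Parse CSV text with a header row into a list of {field: value} records."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def parse_json(text):
    """Parse a JSON object, or a JSON array of objects, into a list of records."""
    data = json.loads(text)
    return data if isinstance(data, list) else [data]


def parse_txt(text):
    """Parse lines of whitespace-separated 'key=value' pairs (an assumed layout)."""
    records = []
    for line in text.splitlines():
        if line.strip():
            records.append(dict(pair.split("=", 1) for pair in line.split()))
    return records


# Hypothetical registry: one parser per supported format.
PARSERS = {"csv": parse_csv, "json": parse_json, "txt": parse_txt}


def parse_user_data(fmt, text):
    """Dispatch to the parser registered for the given format."""
    return PARSERS[fmt](text)
```

Under this arrangement, supporting a further format would only require registering another parser in the registry.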
Optionally, after the execution device parses the user data to obtain the attribute of the second field and the second data value corresponding to the second field, it may determine the data type of the second data value. For type inference, several suitable data types may be preset, and the execution device attempts to infer from the second data value which of these types it belongs to. For example, the value 3242 is inferred to be of integer type, and the value "Chinese" is inferred to be of string type.
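The type inference described above can be sketched in Python as trying a small set of preset types in order. The particular preset types (`integer`, `double`, `boolean`, `string`) and the order in which they are tried are assumptions made for illustration only.

```python
def infer_type(value):
    """Return the first preset type that accepts the value (assumed type set)."""
    # Values already typed by the parser (e.g. from JSON) are classified directly.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "double"
    # Otherwise try to interpret the textual form, narrowest type first.
    text = str(value).strip()
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "double"
    except ValueError:
        pass
    if text.lower() in ("true", "false"):
        return "boolean"
    return "string"
```

With these assumptions, `infer_type("3242")` yields `"integer"` and `infer_type("Chinese")` yields `"string"`, matching the two examples in the text.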
In some optional embodiments of the present application, after parsing the user data, each execution device may dynamically perform data type inference and accumulate the attribute of the second field, the second data value, and the data type of the second data value. The accumulation may also be performed on a preset independent device; that is, the independent device obtains the attribute of the second field, the second data value, and the data type of the second data value from each execution device and then accumulates them. The second Schema information includes the attribute of the second field, the second data value, and the data type of the second data value.
Step S108, a data table is constructed based on the first Schema information and the second Schema information.
Optionally, after acquiring the accumulated attribute of the second field, second data value, and data type of the second data value, the driving device may perform additional fault-tolerance processing. For example, when the second data values corresponding to a given second field vary so that different data types are inferred for them, fault-tolerance processing is required.
Optionally, the fault-tolerance processing may be as follows: when multiple data types are inferred for the second data value, the target data type of the second data value is determined according to the attribute of the second field.
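One possible reading of this fault-tolerance step is sketched below in Python. The application only states that the attribute of the field decides the target type; the rule table `FIELD_TYPE_HINTS`, its entries, and the fallback to `string` are hypothetical choices made for the example.

```python
# Hypothetical rules keyed by field name; not taken from the application.
FIELD_TYPE_HINTS = {"id": "string", "price": "double", "count": "integer"}


def resolve_type(field_name, inferred_types):
    """Pick one target type for a field whose values yielded several inferred types."""
    types = set(inferred_types)
    if len(types) == 1:
        # No conflict: keep the single inferred type.
        return next(iter(types))
    # Conflict: decide by the field attribute (here its name), else fall back to string.
    return FIELD_TYPE_HINTS.get(field_name, "string")
```

Falling back to a string type is a conservative choice, since any value can be stored as text without loss.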
Optionally, constructing the data table based on the first Schema information and the second Schema information may be implemented in the following manner: merging the first Schema information and the second Schema information, and performing deduplication processing to obtain third Schema information; and determining the data table according to the third Schema information.
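The merge and deduplication step can be illustrated with a short Python sketch. It assumes, for illustration only, that Schema information is represented as a list of `(field_name, data_type)` pairs and that duplicates are identified by field name; `create_table_sql` is a hypothetical helper showing how a table definition might be derived from the third Schema information.

```python
def merge_schemas(first_schema, second_schema):
    """Concatenate both schemas and drop repeated field names, keeping first-seen order."""
    third_schema = []
    seen = set()
    for name, data_type in list(first_schema) + list(second_schema):
        if name not in seen:
            seen.add(name)
            third_schema.append((name, data_type))
    return third_schema


def create_table_sql(table_name, schema):
    """Render a CREATE TABLE statement from (field_name, data_type) pairs."""
    columns = ", ".join(f"{name} {data_type}" for name, data_type in schema)
    return f"CREATE TABLE {table_name} ({columns})"
```

Because entries of the first Schema information come first, an existing field keeps its original type when the same field name is reported again.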
The application provides a solution for dynamically creating the schema of a table and inferring data types in distributed processing of large data volumes. Its main technical means are broadcasting and accumulation. Broadcasting means that the existing schema structure is broadcast so that every distributed machine knows it. Accumulation means that new schema information is collected from each distributed machine and handed to a host machine for accumulation and aggregation. After this processing, the complete schema and data information are obtained and a complete data table is constructed, which retains the original data information and also includes the newly added data information.
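The broadcast-and-accumulate idea can be simulated in a single process, as in the Python sketch below. Worker threads stand in for the distributed machines, and a schema is modeled as a dictionary from field name to type name; both are simplifying assumptions, and the function names are hypothetical. A real deployment would rely on the broadcast and accumulation facilities of its distributed framework.

```python
from concurrent.futures import ThreadPoolExecutor


def executor_task(broadcast_schema, records):
    """Runs on one execution device: report fields absent from the broadcast schema."""
    known = set(broadcast_schema)
    new_fields = {}
    for record in records:
        for name, value in record.items():
            if name not in known:
                # Python type names stand in for inferred data types.
                new_fields[name] = type(value).__name__
    return new_fields


def driver(first_schema, partitions):
    """Broadcast the schema, then accumulate what every execution device sends back."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda part: executor_task(first_schema, part), partitions)
        accumulated = {}
        for new_fields in results:
            accumulated.update(new_fields)
    # The complete schema keeps the original fields and adds the new ones.
    return {**first_schema, **accumulated}
```

For example, calling `driver({"id": "int"}, [[{"id": 1, "name": "Ann"}], [{"id": 2, "age": 30}]])` combines the original `id` field with the `name` and `age` fields discovered on the two partitions.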
This method of dynamically constructing a schema in a distributed scenario enables dynamic inference of data types. First, it supports a variety of client data formats: the execution devices only need to be equipped with several parsers, and each format is parsed by its own parser. Second, it allows a client using a given data format to add fields at will, which broadens the application scenarios. In addition, distributed computing allows large volumes of data to be processed quickly.
In the embodiments of the present application, first Schema information is acquired, where the first Schema information includes at least one of the following: an attribute of a first field, a first data value corresponding to the first field, and a first data type of the first data value; the first Schema information is broadcast to an execution device; second Schema information is received, which the execution device feeds back based on the first Schema information, an attribute of a second field, and a second data value corresponding to the second field, where the attribute of the second field and the second data value are obtained by the execution device by parsing acquired user data; and a data table is constructed based on the first Schema information and the second Schema information. In this way, the execution device automatically parses the acquired user data to generate new Schema information, and the driving device generates a new data table from the new Schema information together with the original Schema information, so that data in different formats is parsed automatically and a new data table is created. This achieves the technical effect of improving data parsing efficiency when the data format changes, and thereby solves the technical problem in the prior art that data parsing is inefficient because the parsing code corresponding to the Schema must be modified whenever the data format changes.
Fig. 2 is a schematic data flow diagram of a data table determination method according to an embodiment of the present application, as shown in fig. 2, the method at least includes the following steps:
S202, the driving device 22 reads the first Schema information from the database 20;
The first Schema information includes at least one of the following: an attribute of a first field, a first data value corresponding to the first field, and a first data type of the first data value;
Optionally, the driving device 22 may acquire the first Schema information, and the driving device 22 may be a computer device.
The first Schema information may be the Schema information of an original table read by the driving device 22 from a database, and the attribute of the first field may be a field name.
S204, the driving device 22 broadcasts the first Schema information to the execution device 24;
In some optional embodiments of the present application, the execution device 24 may be a computer, and there may be multiple execution devices 24, such as the execution device 242, the execution device 244, the execution device 246, and the execution device 248 in the figure. After reading the first Schema information, the driving device 22 broadcasts it to each execution device. This informs each execution device of the first Schema information already existing in the database, so that after acquiring the user data from the user data device 26, the execution device deletes the information that is the same as the first Schema information.
S206, the execution device 24 obtains the user data from the user data device 26;
S208, the execution device 24 parses the user data to obtain the attribute of the second field and the second data value corresponding to the second field;
S210, the execution device 24 automatically determines the data type of the second data value;
S212, the execution device 24 accumulates the second Schema information and sends the accumulated second Schema information to the driving device 22;
In some optional embodiments of the present application, each execution device needs to acquire user data before feeding back the second Schema information to the driving device 22. The user data includes data in multiple formats, for example user data in csv format, json format, and txt format. After acquiring the user data, the execution device parses the user data in its various formats to obtain the attribute of the second field and the second data value corresponding to the second field.
Optionally, after each execution device parses the user data to obtain the attribute of the second field and the second data value corresponding to the second field, it may determine the data type of the second data value. For type inference, several suitable data types may be preset, and the execution device attempts to infer from the second data value which of these types it belongs to. For example, the value 3242 is inferred to be of integer type, and the value "Chinese" is inferred to be of string type.
In some optional embodiments of the present application, after parsing the user data, each execution device may dynamically perform data type inference and accumulate the attribute of the second field, the second data value, and the data type of the second data value. The accumulation may also be performed on a preset independent device; that is, the independent device obtains the attribute of the second field, the second data value, and the data type of the second data value from each execution device and then accumulates them.
S214, the driving device 22 performs additional processing, Schema processing, and data merging, and sends the processing result to the database 20.
Optionally, after acquiring the accumulated attribute of the second field, second data value, and data type of the second data value, the driving device 22 may perform some additional processing, namely fault-tolerance processing. For example, when the second data values corresponding to a given second field vary so that different data types are inferred for them, fault-tolerance processing is required.
Optionally, the fault-tolerance processing may be as follows: when multiple data types are inferred for the second data value, the target data type of the second data value is determined according to the attribute of the second field.
Optionally, Schema processing and data merging mean that the driving device 22 constructs a data table based on the first Schema information and the second Schema information, which may be implemented as follows: merging the first Schema information and the second Schema information, and performing deduplication processing to obtain third Schema information; and determining the data table according to the third Schema information.
Fig. 3 is a schematic flowchart of a data table determining method according to an embodiment of the present application, and as shown in fig. 3, the method at least includes the following steps:
step S302, receiving first Schema information, where the first Schema information includes at least one of the following: the method comprises the following steps of obtaining a first data value corresponding to a first field and a first data type of the first data value;
optionally, the execution device receives the first Schema information broadcasted from the drive device, both the execution device and the drive device may be computing devices, and the number of the execution devices may be one or more.
Step S304, acquiring user data, and analyzing the user data to obtain the attribute of a second field and a second data value corresponding to the second field;
the first Schema information may be Schema information of an original table read by the drive device from a database, and the attribute of the first field may be a field name.
In some alternative embodiments of the present application, the execution device may be a computer, and the number of the execution devices may be multiple. And after reading the first Schema information, the drive device broadcasts the first Schema to each execution device. And notifying the first Schema information existing in the databases of the execution devices so that the execution devices delete the information same as the first Schema information after acquiring the user data.
Step S306, determining a second data type of the second data value;
step S308, determining second Schema information based on the attribute of the second field, the second data value, and the second data type;
in some optional embodiments of the present application, the user data comprises data having a plurality of data formats, for example: user data in csv format, user data in json format, and user data in txt format. After the execution equipment acquires the user data, analyzing the user data in different data formats to obtain the attribute of the second field and a second data value corresponding to the second field.
Optionally, after the execution device analyzes the user data to obtain the attribute of the second field and the second data value corresponding to the second field, the data type of the second data value may be determined. For the type inference, several suitable data types may be preset first, and an attempt may be made to infer that the data type belongs to a certain type according to the value of the second data, for example: a value of 3242, it is inferred to be of integer type; with a value of "Chinese", it is inferred to be of the string type.
In some optional embodiments of the present application, after each execution device parses the user data, the data type inference and the accumulation of the attribute of the second field, the second data value, and the data type of the second data value may be performed dynamically. The accumulation may also be performed on a preset independent device; that is, the independent device obtains the attribute of the second field, the second data value, and the data type of the second data value from each execution device, and then accumulates them. The second Schema information comprises the attribute of the second field, the second data value, and the data type of the second data value.
Step S310, determining third Schema information based on the first Schema information and the second Schema information;
Step S312, sending the third Schema information to the driving device.
Optionally, determining the third Schema information based on the first Schema information and the second Schema information may be implemented in the following manner: deleting the Schema information which is the same as the first Schema information in the second Schema information to obtain the third Schema information.
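The deletion described above amounts to a difference between two collections of Schema entries. The sketch below assumes, purely for illustration, that each entry is a `(field_name, data_type)` tuple; the function name `third_schema` is hypothetical.

```python
def third_schema(first_schema, second_schema):
    """Return the entries of second_schema that do not appear in first_schema.

    Order is preserved, and an entry repeated inside second_schema is kept once.
    """
    known = set(first_schema)
    result = []
    for entry in second_schema:
        if entry not in known:
            result.append(entry)
            known.add(entry)
    return result
```

Only entries that the database does not already know are sent back, which keeps the payload returned to the driving device small.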
Optionally, the third Schema information in fig. 3 is the second Schema information in fig. 1.
Optionally, determining the second Schema information based on the attribute of the second field, the second data value, and the second data type may be implemented by accumulating and/or removing the attribute of the second field, the second data value, and the second data type to obtain the second Schema information.
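One possible reading of the accumulation step is to group the observed data types by field name, so that repeated observations collapse into a single entry. The sketch below follows that reading; the function name `accumulate` and the `(field, value, type)` triple layout are assumptions, not details given in the application.

```python
from collections import defaultdict


def accumulate(observations):
    """Group observed data types by field name across (field, value, type) triples."""
    types_by_field = defaultdict(set)
    for field, _value, data_type in observations:
        # A set keeps each type once, however often it was observed.
        types_by_field[field].add(data_type)
    return dict(types_by_field)
```

A field that ends up with more than one type in its set is a candidate for the fault-tolerance processing discussed elsewhere in the description.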
Fig. 4 is a schematic structural diagram of a data table determination system according to an embodiment of the present application, and as shown in fig. 4, the system at least includes:
the driving device 42 is configured to: obtain first Schema information from the database, where the first Schema information includes at least one of the following: an attribute of a first field, a first data value corresponding to the first field, and a first data type of the first data value; broadcast the first Schema information to an execution device and receive the second Schema information; construct a data table based on the first Schema information and the second Schema information; and send the data table to the database.
Optionally, the driving device 42 acquires the first Schema information, and the driving device 42 may be a computer device.
The first Schema information may be the Schema information of an original table that the driving device 42 reads from a database, and the attribute of the first field may be a field name.
The execution device 44 is configured to determine the second Schema information based on the first Schema information, the attribute of the second field, and the second data value corresponding to the second field, where the attribute of the second field and the second data value corresponding to the second field are obtained by the execution device 44 parsing the acquired user data.
In some optional embodiments of the present application, the execution device 44 may be a computer, and there may be a plurality of execution devices 44. After reading the first Schema information, the driving device 42 broadcasts the first Schema information to each execution device 44, in order to notify each execution device 44 of the first Schema information already present in the database, so that, after acquiring the user data, the execution device 44 deletes the information that is the same as the first Schema information.
In some optional embodiments of the present application, before each executing device 44 feeds back the second Schema information to the driving device 42, user data needs to be acquired, where the user data includes data with multiple data formats, for example: user data in csv format, user data in json format, and user data in txt format. After the execution device 44 obtains the user data, the user data in different data formats is analyzed to obtain the attribute of the second field and the second data value corresponding to the second field.
Optionally, the execution device 44 may determine the data type of the second data value after parsing the user data to obtain the attribute of the second field and the second data value corresponding to the second field. For type inference, several suitable data types may be preset, and the execution device 44 attempts to infer from the second data value which of these types it belongs to, for example: the value 3242 is inferred to be of integer type, and the value "Chinese" is inferred to be of string type.
In some alternative embodiments of the present application, after each execution device 44 parses the user data, the data type inference and the accumulation of the attribute of the second field, the second data value, and the data type of the second data value may be dynamically performed, or may be performed on a preset independent device, that is, the independent device obtains the attribute of the second field, the second data value, and the data type of the second data value from each execution device 44, and then accumulates the attribute of the second field, the second data value, and the data type of the second data value. And the second Schema information comprises the attribute of the second field, the second data value and the data type of the second data value.
Optionally, after obtaining the accumulated attribute of the second field, the second data value, and the data type of the second data value, the driving device 42 may perform additional fault-tolerance processing. For example, when the second data values corresponding to a given second field vary so that different data types are inferred for them, fault-tolerance processing is required.
Optionally, the fault tolerance processing mode may be: when the data type judgment result of the second data value is multiple data types, the target data type corresponding to the second data value can be determined according to the attribute of the second field.
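A sketch of this kind of conflict resolution follows. The lookup table `preferred_types`, keyed by field name, and the fallback to `"string"` are assumptions introduced for illustration; the application states only that the target type can be determined from the attribute of the second field.

```python
def resolve_type(field_name, inferred_types, preferred_types):
    """Pick one target type for a field whose values produced several inferred types."""
    if len(inferred_types) == 1:
        # No conflict: the single inferred type is the target type.
        return next(iter(inferred_types))
    # Conflict: consult the field attribute (here, its name) first.
    if field_name in preferred_types:
        return preferred_types[field_name]
    # Assumed fallback: any value can be stored as a string.
    return "string"
```

For example, a field named `age` whose values were inferred as both integer and string could be forced to integer by an entry in `preferred_types`.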
Optionally, constructing the data table based on the first Schema information and the second Schema information may be implemented in the following manner: merging the first Schema information and the second Schema information, and performing deduplication processing to obtain third Schema information; and determining the data table according to the third Schema information.
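The merge-and-deduplicate step, followed by deriving a table from the result, might look like the sketch below. Entries are assumed to be `(field_name, data_type)` tuples, and rendering the table as a SQL `CREATE TABLE` statement is an illustrative choice rather than something the application prescribes; both function names are hypothetical.

```python
def merge_schemas(first_schema, second_schema):
    """Merge two lists of (field_name, data_type) entries, keeping first occurrences."""
    merged, seen = [], set()
    for entry in list(first_schema) + list(second_schema):
        if entry not in seen:
            seen.add(entry)
            merged.append(entry)
    return merged


def create_table_sql(table_name, schema):
    """Render a CREATE TABLE statement; names are assumed to be trusted identifiers."""
    columns = ", ".join(f"{name} {data_type}" for name, data_type in schema)
    return f"CREATE TABLE {table_name} ({columns})"
```

Because deduplication here compares whole entries, the same field reported with two different types would survive as two entries; resolving such conflicts is left to the fault-tolerance processing.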
It should be noted that, reference may be made to the description related to the embodiment shown in fig. 1 for a preferred implementation of the embodiment shown in fig. 4, and details are not described here again.
Fig. 5 is a schematic structural diagram of a data table determining apparatus according to an embodiment of the present application, and as shown in fig. 5, the apparatus at least includes:
an obtaining module 52, configured to obtain first Schema information, where the first Schema information includes at least one of the following: an attribute of a first field, a first data value corresponding to the first field, and a first data type of the first data value;
a broadcasting module 54, configured to broadcast the first Schema information to an execution device;
a receiving module 56, configured to receive second Schema information fed back by the execution device based on the first Schema information, the attribute of a second field, and a second data value corresponding to the second field, where the attribute of the second field and the second data value corresponding to the second field are obtained by the execution device parsing the acquired user data;
a constructing module 58, configured to construct the data table based on the first Schema information and the second Schema information.
Optionally, the constructing module 58 comprises: a merging module, configured to merge the first Schema information and the second Schema information, and perform deduplication processing to obtain third Schema information; and a determining module, configured to determine the data table according to the third Schema information.
It should be noted that, reference may be made to the description related to the embodiment shown in fig. 1 for a preferred implementation of the embodiment shown in fig. 5, and details are not described here again.
The data table determining device comprises a processor and a memory, the acquiring module 52, the broadcasting module 54, the receiving module 56, the constructing module 58 and the like are all stored in the memory as program units, and the processor executes the program units stored in the memory to realize corresponding functions.
The processor comprises a kernel, and the kernel calls the corresponding program unit from the memory. One or more kernels may be provided. By adjusting the kernel parameters, the technical problem in the prior art that data parsing efficiency is low when the data format changes and the parsing code corresponding to the Schema needs to be modified is solved.
An embodiment of the present application provides a storage medium on which a program is stored, the program implementing the data table determination method when executed by a processor.
The embodiment of the application provides a processor, wherein the processor is used for running a program, and the data table determining method is executed when the program runs.
An embodiment of the present application provides an electronic device. As shown in fig. 6, the electronic device 60 includes at least one processor 601, at least one memory 602 connected to the processor 601, and a bus 603; the processor 601 and the memory 602 communicate with each other through the bus 603; the processor 601 is used to call program instructions in the memory 602 to perform the above-described data table determination method. The electronic device 60 herein may be a server, a PC, a PAD, a mobile phone, etc.
The present application further provides a computer program product adapted, when executed on a data processing device, to execute a program initialized with the following method steps: acquiring first Schema information, wherein the first Schema information comprises at least one of the following: an attribute of a first field, a first data value corresponding to the first field, and a first data type of the first data value; broadcasting the first Schema information to an execution device; receiving second Schema information fed back by the execution device based on the first Schema information, the attribute of a second field, and a second data value corresponding to the second field, where the attribute of the second field and the second data value corresponding to the second field are obtained by the execution device parsing the acquired user data; and constructing a data table based on the first Schema information and the second Schema information.
Optionally, constructing a data table based on the first Schema information and the second Schema information includes: merging the first Schema information and the second Schema information, and performing deduplication processing to obtain third Schema information; and determining the data table according to the third Schema information.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a device includes one or more processors (CPUs), memory, and a bus. The device may also include input/output interfaces, network interfaces, and the like.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or nonvolatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip. The memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media) such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A method for determining a data table, comprising:
acquiring first Schema information, wherein the first Schema information comprises at least one of the following: an attribute of a first field, a first data value corresponding to the first field, and a first data type of the first data value;
broadcasting the first Schema information to an execution device;
receiving second Schema information fed back by the execution device based on the first Schema information, the attribute of a second field, and a second data value corresponding to the second field, where the attribute of the second field and the second data value corresponding to the second field are obtained by the execution device after analyzing the obtained user data;
and constructing a data table based on the first Schema information and the second Schema information.
2. The method according to claim 1, wherein constructing a data table based on the first Schema information and the second Schema information comprises:
merging the first Schema information and the second Schema information, and performing deduplication processing to obtain third Schema information;
and determining the data table according to the third Schema information.
3. A method for determining a data table, comprising:
receiving first Schema information, wherein the first Schema information comprises at least one of the following: an attribute of a first field, a first data value corresponding to the first field, and a first data type of the first data value;
acquiring user data, and analyzing the user data to obtain the attribute of a second field and a second data value corresponding to the second field;
determining a second data type for the second data value;
determining second Schema information based on the attribute of the second field, the second data value, and the second data type;
determining third Schema information based on the first Schema information and the second Schema information;
and sending the third Schema information to a driving device.
4. The method of claim 3, wherein determining third Schema information based on the first Schema information and the second Schema information comprises:
deleting the Schema information which is the same as the first Schema information in the second Schema information to obtain the third Schema information.
5. The method of claim 3, wherein determining second Schema information based on the attribute of the second field, the second data value, and the second data type comprises:
and accumulating and/or removing the attribute of the second field, the second data value and the second data type to obtain the second Schema information.
6. A data table determination system, comprising:
a driving device, configured to acquire first Schema information from a database, wherein the first Schema information comprises at least one of the following: an attribute of a first field, a first data value corresponding to the first field, and a first data type of the first data value; broadcast the first Schema information to an execution device; receive second Schema information; construct a data table based on the first Schema information and the second Schema information; and send the data table to the database;
the execution device, configured to determine the second Schema information based on the first Schema information, an attribute of a second field, and a second data value corresponding to the second field, where the attribute of the second field and the second data value corresponding to the second field are obtained by the execution device parsing the acquired user data.
7. A data table determination apparatus, comprising:
an obtaining module, configured to obtain first Schema information, where the first Schema information includes at least one of the following: an attribute of a first field, a first data value corresponding to the first field, and a first data type of the first data value;
a broadcasting module, configured to broadcast the first Schema information to an execution device;
a receiving module, configured to receive second Schema information fed back by the execution device based on the first Schema information, an attribute of a second field, and a second data value corresponding to the second field, where the attribute of the second field and the second data value corresponding to the second field are obtained by the execution device parsing the acquired user data;
and a constructing module, configured to construct the data table based on the first Schema information and the second Schema information.
8. The apparatus of claim 7, wherein the constructing module comprises:
a merging module, configured to merge the first Schema information and the second Schema information, and perform deduplication processing to obtain third Schema information;
and the determining module is used for determining the data table according to the third Schema information.
9. A storage medium, characterized in that the storage medium comprises a stored program, wherein when the program runs, a device in which the storage medium is located is controlled to execute the data table determination method of any one of claims 1 to 2 or any one of claims 3 to 5.
10. An electronic device, comprising at least one processor, at least one memory connected to the processor, and a bus, wherein the processor and the memory communicate with each other via the bus; the processor is configured to call program instructions in the memory to perform the data table determination method of any of claims 1 to 2 or any of claims 3 to 5.
CN201910944542.7A 2019-09-30 2019-09-30 Data table determination method, system and device Pending CN112668287A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910944542.7A CN112668287A (en) 2019-09-30 2019-09-30 Data table determination method, system and device

Publications (1)

Publication Number Publication Date
CN112668287A true CN112668287A (en) 2021-04-16

Family

ID=75399683

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910944542.7A Pending CN112668287A (en) 2019-09-30 2019-09-30 Data table determination method, system and device

Country Status (1)

Country Link
CN (1) CN112668287A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114610959A (en) * 2022-05-12 2022-06-10 恒生电子股份有限公司 Data processing method, device, equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006009768A1 (en) * 2004-06-23 2006-01-26 Oracle International Corporation Efficient evaluation of queries using translation
CN106921614A (en) * 2015-12-24 2017-07-04 北京国双科技有限公司 Business data processing method and device
CN107368517A (en) * 2017-06-02 2017-11-21 上海恺英网络科技有限公司 A kind of method and apparatus of high amount of traffic inquiry
CN108573006A (en) * 2017-06-06 2018-09-25 北京金山云网络技术有限公司 Across computer room data synchronous system, method and device, electronic equipment
CN109582726A (en) * 2018-12-18 2019-04-05 网易(杭州)网络有限公司 The treating method and apparatus of tables of data
CN109815228A (en) * 2018-12-14 2019-05-28 深圳壹账通智能科技有限公司 Creation method, device, computer equipment and the readable storage medium storing program for executing of database table
CN110019242A (en) * 2017-12-29 2019-07-16 北京京东尚科信息技术有限公司 Processing method, device and system for tables of data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
吴冬冬: "SparkSQL:Parquet数据源之合并元数据", Retrieved from the Internet <URL:http://blog.csdn.net/lastsweetop/article/details/9900129> *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination