CN111143464B - Data acquisition method and device and electronic equipment - Google Patents

Data acquisition method and device and electronic equipment

Info

Publication number
CN111143464B
Authority
CN
China
Prior art keywords
data
storage carrier
query
data model
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911256334.4A
Other languages
Chinese (zh)
Other versions
CN111143464A (en)
Inventor
陈昌源
周亮
黄璞
周杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Application filed by Beijing ByteDance Network Technology Co Ltd
Priority to CN201911256334.4A
Publication of CN111143464A
Application granted
Publication of CN111143464B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 Visual data mining; Browsing structured data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/243 Natural language query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the disclosure discloses a data acquisition method, a data acquisition device, electronic equipment and a computer readable storage medium. The data acquisition method comprises the following steps: acquiring a first data model, wherein the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier; scheduling data from a first storage carrier into a second storage carrier according to the first data model, wherein the access speed of the second storage carrier is greater than that of the first storage carrier; and in response to receiving a second query command for data, acquiring the data from the second storage carrier. By the method, the first data model is expressed by using the first query language, and the data is scheduled into the second storage carrier with higher access speed, so that the technical problems of data solidification and low access speed in the prior art are solved.

Description

Data acquisition method and device and electronic equipment
Technical Field
The present disclosure relates to the field of data processing, and in particular, to a data acquisition method, apparatus, electronic device, and computer readable storage medium.
Background
In the data-driven era, big data analysis has become a necessary means of driving enterprise business development, and the granularity of the data directly determines the value that can be mined during analysis. Ever finer-grained data causes data volume to grow exponentially, so the ability to analyze massive data (TB level or even PB level) has become indispensable.
Analysis of massive data inevitably causes a sharp increase in data-processing time, and how to respond within seconds during massive data analysis has become an urgent problem in the industry.
In the prior art, there is a scheme that performs data analysis with a visual data analysis platform. The scheme uniformly imports data from various data sources into hive data tables to support the storage of massive data. On this basis, the data stored in the hive data tables is logically managed, and finally the underlying data is presented in various chart forms by visual analysis so that data analysts can grasp business details. Although this scheme can support massive data, it uses a traditional data analysis method: the form of the hive data table is fixed (solidified), and the whole hive table must be queried, read and written during analysis. Because hive data tables are generally large, the scheme cannot respond quickly, and an analysis can take minutes or even hours. The data analysis process is therefore unfriendly, and the data can hardly play its due role.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
In a first aspect, an embodiment of the present disclosure provides a data acquisition method, including:
acquiring a first data model, wherein the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier;
scheduling data from a first storage carrier into a second storage carrier according to the first data model, wherein the access speed of the second storage carrier is greater than that of the first storage carrier;
and in response to receiving a second query command for data, acquiring the data from the second storage carrier.
Further, the acquiring the first data model includes:
obtaining a first query language file describing the first data model.
Further, the scheduling data from the first storage carrier into the second storage carrier according to the first data model includes:
acquiring a first query command of a first query language corresponding to the first data model;
generating a dependency tree corresponding to the data in the first data model according to the first query command;
reading data corresponding to the first query command from the first storage carrier in response to the data in the dependency tree being ready;
storing said data in said second storage carrier.
Further, a second query command for the data is represented by a second query language.
Further, the second query command is a solidification query command, wherein the solidification query command is a periodic fixed query command.
Further, the data acquisition method further includes:
importing the data from the second storage carrier into a third storage carrier, wherein the access speed of the third storage carrier is greater than the access speed of the second storage carrier;
and in response to receiving a third query command for data, acquiring the data from the third storage carrier.
Further, the first data model represented by the first query language includes:
data in the data model is extracted from a single table in the first storage carrier by a first query language; and/or,
data in the data model is combined by a first query language from a plurality of tables in the first storage carrier.
In a second aspect, embodiments of the present disclosure provide a data acquisition system comprising:
a first storage carrier for storing the raw data;
a second storage carrier for storing data corresponding to the first data model scheduled from the first storage carrier;
wherein the first data model is represented by a first query language, the access speed of the second storage carrier being greater than the access speed of the first storage carrier;
a human-computer interface for receiving a second query command for data;
the second query command is used for acquiring data corresponding to the second query command from the second storage carrier.
In a third aspect, an embodiment of the present disclosure provides a data acquisition apparatus, including:
a data model acquisition module for acquiring a first data model, wherein the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier;
a scheduling module for scheduling data from a first storage carrier into a second storage carrier according to the first data model, wherein the access speed of the second storage carrier is greater than the access speed of the first storage carrier;
a query module for acquiring the data from the second storage carrier in response to receiving a second query command for the data.
In a fourth aspect, an embodiment of the present disclosure provides an electronic device, including: at least one processor; and
a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform any one of the data acquisition methods of the first aspect described above.
In a fifth aspect, an embodiment of the present disclosure provides a non-transitory computer-readable storage medium, wherein the non-transitory computer-readable storage medium stores computer instructions for causing a computer to perform any one of the data acquisition methods of the first aspect.
The embodiment of the disclosure discloses a data acquisition method, a data acquisition device, electronic equipment and a computer readable storage medium. The data acquisition method comprises the following steps: acquiring a first data model, wherein the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier; scheduling data from a first storage carrier into a second storage carrier according to the first data model, wherein the access speed of the second storage carrier is greater than that of the first storage carrier; and in response to receiving a second query command for data, acquiring the data from the second storage carrier. By the method, the first data model is expressed by using the first query language, and the data is scheduled into the second storage carrier with higher access speed, so that the technical problems of data solidification and low access speed in the prior art are solved.
The foregoing is only an overview of the technical solutions of the present disclosure. So that the technical means of the disclosure can be understood more clearly and implemented in accordance with the description, and so that the above and other objects, features and advantages of the present disclosure are more apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
Fig. 1 is a schematic view of an application scenario in an embodiment of the disclosure;
Fig. 2 is a flow chart of a data acquisition method according to an embodiment of the disclosure;
Fig. 3 is a schematic diagram of a specific example of step S202 in the data acquisition method according to the embodiment of the present disclosure;
Fig. 4 is a further flowchart of a data acquisition method according to an embodiment of the present disclosure;
Fig. 5 is a schematic diagram of a data acquisition system provided by an embodiment of the present disclosure;
Fig. 6 is a schematic structural diagram of an embodiment of a data acquisition device according to an embodiment of the present disclosure;
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and variations thereof as used herein are intended to be open-ended, i.e., "including, but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Related definitions of other terms will be given in the description below.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that the modifiers "one" and "a plurality" in this disclosure are intended to be illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
Fig. 1 is a schematic diagram of an application scenario in an embodiment of the disclosure. As shown in Fig. 1, a user 103 performs data analysis at a terminal 101. The data analysis is based on a data model; illustratively, the data model is an analysis model of article reading data, including data such as article names, click volume, reading time and users. The specific data of the data model is stored in a storage carrier 102, which is a mass data storage carrier. Illustratively, the storage carrier is HDFS (Hadoop Distributed File System), in which the data is stored as HIVE tables. When the user 103 analyzes data based on the data model, the data must be acquired from HDFS, that is, the HIVE tables are operated on to read and write the data, and an analysis tool displays the acquired data on a display device of the terminal 101 for the user 103 to view.
Fig. 2 is a flowchart of an embodiment of a data acquisition method according to an embodiment of the present disclosure, where the data acquisition method provided by the embodiment may be performed by a data acquisition device, and the data acquisition device may be implemented as software, or implemented as a combination of software and hardware, and the data acquisition device may be integrally provided in a device in a data acquisition system, such as a data acquisition server or a data acquisition terminal device. As shown in fig. 2, the method comprises the steps of:
step S201, a first data model is obtained;
In the present disclosure, the first data model is a model used when analyzing data. Illustratively, the first data model is an analysis model of article reading data, which includes data such as the names of articles, the click volumes of articles, the reading times of articles and the users reading the articles. The raw data of the first data model is stored in a first storage carrier, which is a mass data storage carrier. Illustratively, the first storage carrier is HDFS, and data on HDFS is stored in the form of HIVE tables. HIVE is a Hadoop-based data warehouse tool that provides a mechanism for extracting, transforming and loading data, and can store, query and analyze large-scale data stored in Hadoop. The HIVE data warehouse tool can map structured data files into database tables, i.e., HIVE tables.
In the present disclosure, the first data model is represented by a first query language, and data corresponding to the first data model is stored in the first storage carrier. Illustratively, when the data is stored in the first storage carrier in the form of HIVE tables, the first query language is HIVE SQL; that is, the first data model is logically expressed in HIVE SQL. For example, suppose there are multiple HIVE tables in HDFS, where table A has the fields (user id, age, access time, article id) and table B has the fields (article id, article name, click rate). Then the HIVE SQL code: select A.*, B.* from A, B where A.article_id = B.article_id yields a logical description of a data model, i.e., an analysis table whose data satisfies the condition A.article_id = B.article_id. It will be appreciated that the first data model in actual use is more complex than in this example. The first query language code describing the first data model may be saved as a first query language file, in which case obtaining the first data model includes: obtaining a first query language file describing the first data model. The first data model can then be maintained simply by maintaining the first query language file.
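The idea of a data model that exists only as query text can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: SQLite from the Python standard library stands in for the HIVE tables on HDFS, and the table contents, column names and the helper names build_first_carrier and analysis_table are hypothetical.

```python
import sqlite3

# Assumption: MODEL_SQL plays the role of the "first query language file".
# The data model is nothing more than this text; no analysis table is stored.
MODEL_SQL = """
SELECT A.user_id, A.age, A.access_time,
       B.article_id, B.article_name, B.click_rate
FROM A JOIN B ON A.article_id = B.article_id
ORDER BY A.user_id
"""

def build_first_carrier() -> sqlite3.Connection:
    """Create a tiny stand-in for the first storage carrier (tables A and B)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE A (user_id INTEGER, age INTEGER, "
                 "access_time TEXT, article_id INTEGER)")
    conn.execute("CREATE TABLE B (article_id INTEGER, article_name TEXT, "
                 "click_rate INTEGER)")
    conn.executemany("INSERT INTO A VALUES (?, ?, ?, ?)", [
        (1, 30, "2019-12-01", 10),
        (2, 25, "2019-12-02", 11),
        (3, 41, "2019-12-03", 99),   # no matching article in B
    ])
    conn.executemany("INSERT INTO B VALUES (?, ?, ?)", [
        (10, "Intro to Hive", 500),
        (11, "HDFS basics", 120),
    ])
    return conn

def analysis_table(conn: sqlite3.Connection, model_sql: str) -> list:
    """Materialize the analysis table described by the model text."""
    return conn.execute(model_sql).fetchall()

conn = build_first_carrier()
rows = analysis_table(conn, MODEL_SQL)
for row in rows:
    print(row)
```

Only rows whose article id appears in both tables reach the analysis table, which mirrors the join condition in the example above.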
The first query language is more flexible than the storage format in the first storage carrier. For example, when the data is stored in HIVE tables, the format of a HIVE table is fixed, and if the data model needs to be modified, a new HIVE table has to be built, which is very inconvenient. When the data model is represented in the first query language, the data model can be modified by modifying the code statements of the first query language. Illustratively, the first data model being represented by the first query language includes: data in the data model is extracted from a single table in the first storage carrier by the first query language; and/or data in the data model is combined from a plurality of tables in the first storage carrier by the first query language. Because a hive table is a storage carrier for massive data, the data volume of a single hive table can reach hundreds of TB or even be measured in PB, so operating on it directly is impractical and the operation latency is unacceptable. In most cases, the data a single user needs to operate on is only part of the hive data table, so a data extraction method is used to extract the data the user actually needs. Data extraction comprises two steps. The first step is column extraction: the original hive table may have more than a hundred field columns, and the user selects only the field columns needed. The second step is data screening: the original hive table covers the full data, for example, table A stores ten years of access behavior of all users of a certain APP, whereas the user only needs to analyze the data of the last year, so only the last year's data needs to be screened out, and finer-grained screening can further be performed on user attributes.
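The two extraction steps described above, column extraction and data screening, can be sketched as a single generated query. This is an illustrative sketch only: SQLite stands in for a HIVE table, and the table name visits, its columns and the helper extraction_sql are hypothetical. The helper interpolates trusted strings and is not meant for untrusted input.

```python
import sqlite3

def extraction_sql(table: str, columns: list, condition: str) -> str:
    """Build a query that keeps only `columns` (column extraction)
    and only the rows matching `condition` (data screening)."""
    return f"SELECT {', '.join(columns)} FROM {table} WHERE {condition}"

# Stand-in for a wide HIVE table holding many years of access behavior.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (user_id INTEGER, age INTEGER, "
             "device TEXT, access_date TEXT, article_id INTEGER)")
conn.executemany("INSERT INTO visits VALUES (?, ?, ?, ?, ?)", [
    (1, 30, "ios", "2012-05-01", 10),
    (2, 25, "android", "2019-03-15", 11),
    (3, 41, "web", "2019-11-20", 10),
])

# Keep two of the five columns and only the rows from 2019 onward.
sql = extraction_sql("visits", ["user_id", "article_id"],
                     "access_date >= '2019-01-01'")
extracted = conn.execute(sql + " ORDER BY user_id").fetchall()
print(sql)
print(extracted)
```

ISO-formatted date strings compare correctly as text, which is why the screening condition can use a plain string comparison in this sketch.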
Both extraction steps are described in HIVE SQL code, which implements column extraction and row screening and thus completes data extraction from a whole hive table. Different data can also be combined from multiple hive tables. The task of a HIVE table is to store massive data permanently, so changes to HIVE table data have a relatively large impact and are usually made incrementally, and different data may be stored in different HIVE tables. Data analysis therefore often needs to combine the data of several HIVE tables. A multi-table join is used for this: different tables can be joined in different ways as required, and the joins are expressed in HIVE SQL code. Illustratively, the first data model being represented by the first query language further includes: generating the first data model by modifying the first query language code of a historical data model. In this embodiment, the historical data model is also described in the first query language, and the data model can be modified by modifying the first query language code that represents the historical data model, without creating an additional new data model. A user may need to adjust the data model in several ways, such as the joins between the original HIVE tables, the fields extracted from a HIVE table, or the screening conditions. Because the data model is described in HIVE SQL code in the embodiments of the present disclosure, the user only needs to modify the generated HIVE SQL code to modify the data model. It will be appreciated that in the embodiments of the present disclosure, the data model is logically expressed in HIVE SQL.
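Deriving a new data model by editing the query text of a historical one can be sketched as below. This is a hedged illustration, not the patent's code: SQLite stands in for HIVE, and the names HISTORICAL_MODEL, FIRST_MODEL and the visits table are hypothetical. The point is that the underlying table is never rebuilt; only the text changes.

```python
import sqlite3

# Stand-in for an unchanged HIVE table on the first storage carrier.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (user_id INTEGER, age INTEGER, "
             "access_date TEXT, article_id INTEGER)")
conn.executemany("INSERT INTO visits VALUES (?, ?, ?, ?)", [
    (1, 30, "2012-05-01", 10),
    (2, 25, "2019-03-15", 11),
    (3, 41, "2019-11-20", 10),
])

# The historical data model, expressed purely as query text.
HISTORICAL_MODEL = ("SELECT user_id, article_id FROM visits "
                    "WHERE access_date >= '2019-01-01'")

# The new model extracts one more field and tightens the screening
# condition. Both changes are text edits; no table is created or altered.
FIRST_MODEL = HISTORICAL_MODEL.replace(
    "user_id, article_id", "user_id, age, article_id") + " AND age < 40"

old_rows = conn.execute(HISTORICAL_MODEL + " ORDER BY user_id").fetchall()
new_rows = conn.execute(FIRST_MODEL + " ORDER BY user_id").fetchall()
print(old_rows)
print(new_rows)
```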
Step S202, dispatching data from a first storage carrier into a second storage carrier according to the first data model;
In the present disclosure, the access speed of the second storage carrier is greater than that of the first storage carrier. Illustratively, the second storage carrier is memory, implemented specifically as an in-memory distributed database. Because an in-memory distributed database runs in memory, implementing the data model in the second storage carrier speeds up access to the data. Before data analysis is required, the data of the data model used for the analysis is scheduled from the first storage carrier into the second storage carrier, which includes:
step S301, a first query command of a first query language corresponding to the data model is acquired;
step S302, generating a dependency tree corresponding to data in the data model according to the first query command;
step S303, in response to the data in the dependency tree being ready, reading the data corresponding to the first query command from the first storage carrier;
step S304, storing the data in the second storage carrier.
When scheduling data, it is first necessary to know which data to schedule. The data model is represented by first query language code, which can be broken down into one or more first query commands. Illustratively, a first query command is a query command written in HIVE SQL that queries the data corresponding to the first data model in a HIVE table. In step S302, a data dependency tree is generated according to the first query command. Illustratively, data in the data model depends on data in other data sources: for example, if data A represents the reading volume of an article in a quarter and the data it depends on represents the reading volume of the article in each month of that quarter, the quarterly reading volume can be obtained only after the monthly reading volumes are available. In step S303, when the data of every node in the dependency tree is ready, the data corresponding to the first data model is ready, and the data corresponding to the first query command is then read from the first storage carrier. In step S304, the data is stored in the second storage carrier, still in the form of the first data model. In this way, the data used for data analysis is carried by the second storage carrier, which speeds up access to the data.
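Steps S302 to S304 can be sketched with a small dependency tree, using the quarterly and monthly reading volumes from the example above. This is a minimal sketch under stated assumptions: plain dictionaries stand in for the first and second storage carriers, and the names Node, run_first_query and schedule are hypothetical. How readiness is detected in a real deployment is not specified in the text, so a boolean flag is assumed here.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One node of the dependency tree built from a first query command."""
    name: str
    ready: bool = False                      # only meaningful for leaf nodes
    children: list = field(default_factory=list)

    def is_ready(self) -> bool:
        # A leaf is ready when its flag is set; an inner node is ready
        # when every node it depends on is ready.
        if not self.children:
            return self.ready
        return all(child.is_ready() for child in self.children)

def run_first_query(first_carrier: dict, node: Node) -> int:
    """Stand-in for the first query command: sum the dependent data."""
    return sum(first_carrier[child.name] for child in node.children)

def schedule(node: Node, first_carrier: dict, second_carrier: dict) -> bool:
    """Read from the first carrier and store in the second carrier,
    but only once the whole dependency tree is ready (S303, S304)."""
    if not node.is_ready():
        return False
    second_carrier[node.name] = run_first_query(first_carrier, node)
    return True

months = [Node("2019-10"), Node("2019-11"), Node("2019-12")]
quarter = Node("2019-Q4", children=months)
first_carrier = {"2019-10": 100, "2019-11": 150, "2019-12": 50}
second_carrier = {}

early = schedule(quarter, first_carrier, second_carrier)  # months not ready
for month in months:
    month.ready = True
late = schedule(quarter, first_carrier, second_carrier)   # tree is ready
print(early, late, second_carrier)
```

Scheduling is attempted twice to show that nothing is copied while any dependency is still pending.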
Step S203, in response to receiving a second query command of data, acquiring the data from the second storage carrier.
In an embodiment of the present disclosure, the second query command is represented by a second query language. Illustratively, the second query language is SQL. That is, the user inputs a second query command through the human-machine interface to retrieve data of the first data model from the second storage carrier for data analysis.
In one embodiment, the second query command is a solidification query command, that is, a periodic fixed query command. For example, a user builds the data to be viewed into a dashboard (billboard) and checks the data on the dashboard every day. The query behind the dashboard is fixed and acts directly on the second storage carrier. The query has a time attribute, for example it queries the number of users on the most recent day, so its data must be updated every day; here one day is the period of the query. On the basis of this embodiment, the data acquisition method further includes:
step S401, importing the data from the second storage carrier into a third storage carrier, wherein the access speed of the third storage carrier is greater than that of the second storage carrier;
Step S402, in response to receiving a third query command of data, acquiring the data from the third storage carrier.
In step S401, when the second query command is a solidification query command, the data is imported from the second storage carrier into the third storage carrier. The data corresponding to the second query command is the data corresponding to the first data model or to a second data model, where the data of the second data model is a subset of the data of the first data model. In the embodiments of the disclosure, the execution time of the solidification query command may be preset, so that the data in the second data model is refreshed at a fixed time every day. Illustratively, the third storage carrier is a cache, whose access speed is higher than that of the memory and the system memory. For a high-frequency data analysis scenario, such as the daily updated dashboard in the example above, scheduling the data into the cache can further improve the access speed of the data during analysis.
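Steps S401 and S402 can be sketched as a periodic refresh that runs the fixed query against the second storage carrier and keeps the result in a faster third tier. This is an illustrative sketch only: an in-memory SQLite database stands in for the in-memory distributed database, a dictionary stands in for the cache, and the names SOLIDIFIED_SQL, refresh and dashboard_value are hypothetical. A real system would trigger refresh from a scheduler at the preset time.

```python
import sqlite3

# Stand-in for the second storage carrier (in-memory database).
second = sqlite3.connect(":memory:")
second.execute("CREATE TABLE model (user_id INTEGER, access_date TEXT)")
second.executemany("INSERT INTO model VALUES (?, ?)", [
    (1, "2019-12-09"),
    (2, "2019-12-10"),
    (3, "2019-12-10"),
])

# The fixed, periodic query behind the daily view: users on a given day.
SOLIDIFIED_SQL = ("SELECT COUNT(DISTINCT user_id) FROM model "
                  "WHERE access_date = ?")

third = {}  # stand-in for the third storage carrier (cache)

def refresh(day: str) -> None:
    """Run the fixed query once per period and import the result
    into the third storage carrier (step S401)."""
    third[day] = second.execute(SOLIDIFIED_SQL, (day,)).fetchone()[0]

def dashboard_value(day: str) -> int:
    """Answer the third query command from the cache only (step S402)."""
    return third[day]

refresh("2019-12-10")
value = dashboard_value("2019-12-10")
print(value)
```

Between refreshes the cached value does not change even if the second storage carrier does, which is the trade-off accepted for a query whose period is one day.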
Fig. 5 is a schematic diagram of a data acquisition system according to an embodiment of the present disclosure. As shown in Fig. 5, the data acquisition system includes a first storage carrier 501, a second storage carrier 502 and a human-machine interface 504. The first storage carrier 501 is configured to store raw data. The second storage carrier 502 is configured to store data that is scheduled from the first storage carrier and corresponds to a first data model, wherein the first data model is represented by a first query language and the access speed of the second storage carrier is greater than that of the first storage carrier. The human-machine interface 504 is configured to receive a second query command for data, wherein the second query command is used for acquiring the corresponding data from the second storage carrier. Further, the data acquisition system includes a third storage carrier 503 for storing data that is scheduled from the second storage carrier and corresponds to the first data model or the second data model, wherein the access speed of the third storage carrier is greater than that of the second storage carrier; the human-machine interface 504 is further configured to receive a third query command for data, wherein the third query command is used for acquiring the corresponding data from the third storage carrier. This storage hierarchy accelerates access to the data used in data analysis and thus greatly speeds up the analysis, and because the data model is logically described in a flexible query language, its expression is more flexible and is not limited to the logical expression in the first storage carrier.
The embodiment of the disclosure discloses a data acquisition method, a data acquisition device, electronic equipment and a computer readable storage medium. The data acquisition method comprises the following steps: acquiring a first data model, wherein the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier; scheduling data from a first storage carrier into a second storage carrier according to the first data model, wherein the access speed of the second storage carrier is greater than that of the first storage carrier; and in response to receiving a second query command for data, acquiring the data from the second storage carrier. By the method, the first data model is expressed by using the first query language, and the data is scheduled into the second storage carrier with higher access speed, so that the technical problems of data solidification and low access speed in the prior art are solved.
Although the steps in the foregoing method embodiments are described in the order given above, it should be clear to those skilled in the art that the steps in the embodiments of the disclosure need not be performed in that order; they may also be performed in reverse order, in parallel, or interleaved. Those skilled in the art may also add further steps on the basis of the foregoing steps. Such obvious modifications or equivalents also fall within the protection scope of the disclosure and are not repeated herein.
Fig. 6 is a schematic structural diagram of an embodiment of a data acquisition device according to an embodiment of the disclosure. As shown in fig. 6, the device 600 includes a data model acquisition module 601, a scheduling module 602, and a query module 603, wherein:
the data model acquisition module 601 is configured to obtain a first data model, where the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier;
the scheduling module 602 is configured to schedule data from the first storage carrier into a second storage carrier according to the first data model, where an access speed of the second storage carrier is greater than an access speed of the first storage carrier;
the query module 603 is configured to obtain the data from the second storage carrier in response to receiving a second query command for the data.
Further, the data model acquisition module 601 is further configured to:
obtain a first query language file describing the first data model.
Further, the scheduling module 602 is further configured to:
acquiring a first query command of a first query language corresponding to the first data model;
generating a dependency tree corresponding to the data in the first data model according to the first query command;
reading data corresponding to the first query command from the first storage carrier in response to the data in the dependency tree being ready;
storing said data in said second storage carrier.
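The four scheduling steps above can be sketched as follows, purely as an illustration. The sketch assumes an SQL-like first query language, uses an in-memory SQLite database to stand in for the first storage carrier and a dictionary for the second, and derives a one-level dependency tree with a naive regular expression rather than a real SQL parser. The names `FIRST_QUERY`, `build_dependency_tree`, `schedule`, and `ready_tables` are hypothetical.

```python
import re
import sqlite3

# Hypothetical first query command describing the data model.
FIRST_QUERY = (
    "SELECT u.name, SUM(o.amount) AS total "
    "FROM orders o JOIN users u ON o.user_id = u.id "
    "GROUP BY u.name ORDER BY u.name"
)


def build_dependency_tree(model_name, query):
    """Map the model to the tables named after FROM/JOIN (naive regex, not a SQL parser)."""
    tables = re.findall(r"\b(?:FROM|JOIN)\s+([A-Za-z_]\w*)", query, flags=re.IGNORECASE)
    return {model_name: sorted(set(tables))}


def schedule(model_name, query, first_carrier, second_carrier, ready_tables):
    """Copy the model's rows into the second carrier once every dependency is ready."""
    tree = build_dependency_tree(model_name, query)
    if not set(tree[model_name]) <= ready_tables:
        return False  # some upstream data has not landed yet
    second_carrier[model_name] = first_carrier.execute(query).fetchall()
    return True


first_carrier = sqlite3.connect(":memory:")  # stands in for the slower first storage carrier
first_carrier.executescript(
    "CREATE TABLE users (id INTEGER, name TEXT);"
    "CREATE TABLE orders (user_id INTEGER, amount REAL);"
    "INSERT INTO users VALUES (1, 'ann'), (2, 'bob');"
    "INSERT INTO orders VALUES (1, 10.0), (1, 5.0), (2, 7.5);"
)
second_carrier = {}  # stands in for the faster second storage carrier

tree = build_dependency_tree("spend_by_user", FIRST_QUERY)
not_ready = schedule("spend_by_user", FIRST_QUERY, first_carrier, second_carrier, {"orders"})
ready = schedule("spend_by_user", FIRST_QUERY, first_carrier, second_carrier, {"orders", "users"})
print(tree, not_ready, ready, second_carrier["spend_by_user"])
```

The first call to `schedule` does nothing because the `users` table is not yet marked ready; the second call reads the data from the first carrier and stores it in the second.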
Further, a second query command for the data is represented by a second query language.
Further, the second query command is a solidification query command, wherein the solidification query command is a periodic fixed query command.
Further, the scheduling module 602 is further configured to: import the data from the second storage carrier into a third storage carrier, wherein the access speed of the third storage carrier is greater than the access speed of the second storage carrier; and the query module 603 is further configured to: acquire the data from the third storage carrier in response to receiving a third query command for the data.
Further, the first data model represented by the first query language includes: data in the data model is extracted from a single table in the first storage carrier by a first query language; and/or, the data in the data model is combined by a first query language from a plurality of tables in the first storage carrier.
The apparatus shown in fig. 6 may perform the methods of the embodiments shown in figs. 2-4. For parts of this embodiment not described in detail, refer to the description of the embodiments shown in figs. 2-4, which also covers the implementation process and technical effects of this technical solution; these are not repeated here.
Referring now to fig. 7, a schematic diagram of an electronic device 700 suitable for use in implementing embodiments of the present disclosure is shown. The terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 7 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.
As shown in fig. 7, the electronic device 700 may include a processing means (e.g., a central processor, a graphics processor, etc.) 701, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage means 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the electronic device 700 are also stored. The processing device 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
In general, the following devices may be connected to the I/O interface 705: an input device 706 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 707 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; a storage device 708 including, for example, magnetic tape, hard disk, etc.; and a communication device 709. The communication device 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. While fig. 7 shows an electronic device 700 having various devices, it is to be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via communication device 709, or installed from storage 708, or installed from ROM 702. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing device 701.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some implementations, clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring a first data model, wherein the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier; scheduling data from a first storage carrier into a second storage carrier according to the first data model, wherein the access speed of the second storage carrier is greater than that of the first storage carrier; and in response to receiving a second query command for data, acquiring the data from the second storage carrier.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including, but not limited to, object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. In some cases, the names of the units do not constitute a limitation of the units themselves.
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, there is provided a data acquisition method including:
acquiring a first data model, wherein the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier;
scheduling data from a first storage carrier into a second storage carrier according to the first data model, wherein the access speed of the second storage carrier is greater than that of the first storage carrier;
and in response to receiving a second query command for data, acquiring the data from the second storage carrier.
Further, the acquiring the first data model includes:
a first query language file describing the first data model is obtained.
Further, the scheduling data from the first storage carrier into the second storage carrier according to the data model includes:
acquiring a first query command of a first query language corresponding to the first data model;
generating a dependency tree corresponding to the data in the first data model according to the first query command;
reading data corresponding to the first query command from the first storage carrier in response to the data in the dependency tree being ready;
storing said data in said second storage carrier.
Further, a second query command for the data is represented by a second query language.
Further, the second query command is a solidification query command, wherein the solidification query command is a periodic fixed query command.
Further, the data acquisition method further includes:
importing the data from the second storage carrier into a third storage carrier, wherein the access speed of the third storage carrier is greater than the access speed of the second storage carrier;
and in response to receiving a third query command for data, acquiring the data from the third storage carrier.
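The import into the third storage carrier can be sketched as follows, as one possible reading rather than the disclosed implementation. The class `TieredQueryService` and the `periodic_fixed` flag are hypothetical; how a query command is recognized as a periodic fixed (solidified) query is not shown here and is simply passed in by the caller.

```python
class TieredQueryService:
    """Hypothetical query front end over a second and a third storage carrier."""

    def __init__(self, second_carrier):
        self.second_carrier = second_carrier  # model name -> rows
        self.third_carrier = {}               # faster tier, initially empty

    def query(self, model_name, periodic_fixed=False):
        """Serve a query; import data into the third carrier for periodic fixed queries."""
        if model_name in self.third_carrier:
            return self.third_carrier[model_name], "third"
        rows = self.second_carrier[model_name]
        if periodic_fixed:
            # Import a copy so later runs of the same fixed query hit the faster tier.
            self.third_carrier[model_name] = list(rows)
        return rows, "second"


service = TieredQueryService({"daily_clicks": [("u1", 3), ("u2", 5)]})
rows_a, tier_a = service.query("daily_clicks", periodic_fixed=True)
rows_b, tier_b = service.query("daily_clicks")
print(tier_a, tier_b)
```

The first query is answered from the second carrier and triggers the import; the later query for the same data is answered from the third carrier.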
Further, the first data model represented by the first query language includes:
data in the data model is extracted from a single table in the first storage carrier by a first query language; and/or,
the data in the data model is combined by a first query language from a plurality of tables in the first storage carrier.
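To make the two forms of data model concrete, the following sketch assumes an SQL-like first query language and uses an in-memory SQLite database to stand in for the first storage carrier. The table names, columns, and the constants `SINGLE_TABLE_MODEL` and `MULTI_TABLE_MODEL` are hypothetical examples, not taken from the disclosure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stands in for the first storage carrier
conn.executescript(
    "CREATE TABLE users (id INTEGER, name TEXT, country TEXT);"
    "CREATE TABLE orders (user_id INTEGER, amount REAL);"
    "INSERT INTO users VALUES (1, 'ann', 'CN'), (2, 'bob', 'US');"
    "INSERT INTO orders VALUES (1, 10.0), (2, 7.5), (2, 2.5);"
)

# Data model whose data is extracted from a single table.
SINGLE_TABLE_MODEL = "SELECT id, name FROM users WHERE country = 'CN'"

# Data model whose data is combined from several tables.
MULTI_TABLE_MODEL = (
    "SELECT u.country, SUM(o.amount) AS total "
    "FROM users u JOIN orders o ON o.user_id = u.id "
    "GROUP BY u.country ORDER BY u.country"
)

single_rows = conn.execute(SINGLE_TABLE_MODEL).fetchall()
multi_rows = conn.execute(MULTI_TABLE_MODEL).fetchall()
print(single_rows)
print(multi_rows)
```

Because each model is just a query, changing the model means editing the query text rather than changing how the raw tables are stored.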
According to one or more embodiments of the present disclosure, there is provided a data acquisition system comprising:
a first storage carrier for storing the raw data;
a second storage carrier for storing data corresponding to the first data model scheduled from the first storage carrier;
wherein the first data model is represented by a first query language, the access speed of the second storage carrier being greater than the access speed of the first storage carrier;
a human-computer interface for receiving a second query command for data;
the second query command is used for acquiring data corresponding to the second query command from the second storage carrier.
According to one or more embodiments of the present disclosure, there is provided a data acquisition apparatus including:
the data model acquisition module is used for acquiring a first data model, wherein the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier;
a scheduling module for scheduling data from a first storage carrier into a second storage carrier according to the first data model, wherein the access speed of the second storage carrier is greater than the access speed of the first storage carrier;
and a query module, used for acquiring the data from the second storage carrier in response to receiving a second query command for the data.
Further, the data model acquisition module is further configured to:
a first query language file describing the first data model is obtained.
Further, the scheduling module is further configured to:
acquiring a first query command of a first query language corresponding to the first data model;
generating a dependency tree corresponding to the data in the first data model according to the first query command;
reading data corresponding to the first query command from the first storage carrier in response to the data in the dependency tree being ready;
storing said data in said second storage carrier.
Further, a second query command for the data is represented by a second query language.
Further, the second query command is a solidification query command, wherein the solidification query command is a periodic fixed query command.
Further, the scheduling module is further configured to: import the data from the second storage carrier into a third storage carrier, wherein the access speed of the third storage carrier is greater than the access speed of the second storage carrier; and the query module is further configured to: acquire the data from the third storage carrier in response to receiving a third query command for the data.
Further, the first data model represented by the first query language includes: data in the data model is extracted from a single table in the first storage carrier by a first query language; and/or, the data in the data model is combined by a first query language from a plurality of tables in the first storage carrier.
According to one or more embodiments of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform any one of the data acquisition methods of the first aspect described above.
According to one or more embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium, characterized in that the non-transitory computer-readable storage medium stores computer instructions for causing a computer to perform any of the data acquisition methods of the foregoing first aspect.
The foregoing description is only of the preferred embodiments of the present disclosure and an explanation of the technical principles employed. It will be appreciated by persons skilled in the art that the scope of the disclosure is not limited to the specific combinations of features described above, but also covers other embodiments formed by any combination of the features described above or their equivalents without departing from the spirit of the disclosure, for example, embodiments formed by mutually substituting the features described above with (but not limited to) technical features having similar functions disclosed in the present disclosure.

Claims (9)

1. A method of data acquisition, comprising:
acquiring a first query language file describing a first data model for data analysis, wherein the first data model is represented by a first query language in the first query language file, and data corresponding to the first data model is stored in a first storage carrier;
scheduling data from a first storage carrier into a second storage carrier according to the first data model, wherein the access speed of the second storage carrier is greater than that of the first storage carrier;
acquiring the data from the second storage carrier in response to receiving a second query command for the data;
importing the data from the second storage carrier into a third storage carrier when the second query command is a periodic fixed query command; wherein the access speed of the third storage carrier is greater than the access speed of the second storage carrier.
2. The data acquisition method of claim 1, wherein said scheduling data from a first storage carrier into a second storage carrier according to said data model comprises:
acquiring a first query command of a first query language corresponding to the first data model;
generating a dependency tree corresponding to the data in the first data model according to the first query command;
reading data corresponding to the first query command from the first storage carrier in response to the data in the dependency tree being ready;
storing said data in said second storage carrier.
3. The data acquisition method of claim 1, wherein the second query command for data is represented by a second query language.
4. The data acquisition method of claim 1, further comprising:
and in response to receiving a third query command for data, acquiring the data from the third storage carrier.
5. The data acquisition method of claim 1, wherein the first data model is represented by a first query language comprising:
data in the data model is extracted from a single table in the first storage carrier by a first query language; and/or,
the data in the data model is combined by a first query language from a plurality of tables in the first storage carrier.
6. A data acquisition system, comprising:
a first storage carrier for storing the raw data;
a second storage carrier for storing data corresponding to a first data model for data analysis, which is scheduled from the first storage carrier;
wherein the first data model is described by a first query language file, the first data model is represented by a first query language in the first query language file, and the access speed of the second storage carrier is greater than the access speed of the first storage carrier;
a human-computer interface for receiving a second query command for data;
the second query command is used for acquiring data corresponding to the second query command from the second storage carrier;
a third storage carrier for storing data imported from the second storage carrier when the second query command is a periodic fixed query command; wherein the access speed of the third storage carrier is greater than the access speed of the second storage carrier.
7. A data acquisition device, comprising:
the data model acquisition module is used for acquiring a first query language file describing a first data model for data analysis, wherein the first data model is represented by a first query language in the first query language file, and data corresponding to the first data model is stored in a first storage carrier;
a scheduling module for scheduling data from a first storage carrier into a second storage carrier according to the first data model, wherein the access speed of the second storage carrier is greater than the access speed of the first storage carrier; and importing the data from the second storage carrier into a third storage carrier when the received second query command is a periodic fixed query command; wherein the access speed of the third storage carrier is greater than the access speed of the second storage carrier;
and a query module, used for acquiring the data from the second storage carrier in response to receiving a second query command for the data.
8. An electronic device, comprising:
a memory for storing computer readable instructions; and
a processor for executing the computer readable instructions such that the processor when run implements the data acquisition method according to any one of claims 1-5.
9. A non-transitory computer readable storage medium storing computer readable instructions which, when executed by a computer, cause the computer to perform the data acquisition method of any one of claims 1-5.
CN201911256334.4A 2019-12-10 2019-12-10 Data acquisition method and device and electronic equipment Active CN111143464B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911256334.4A CN111143464B (en) 2019-12-10 2019-12-10 Data acquisition method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911256334.4A CN111143464B (en) 2019-12-10 2019-12-10 Data acquisition method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN111143464A CN111143464A (en) 2020-05-12
CN111143464B true CN111143464B (en) 2023-07-18

Family

ID=70517886

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911256334.4A Active CN111143464B (en) 2019-12-10 2019-12-10 Data acquisition method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN111143464B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112966129A (en) * 2021-02-26 2021-06-15 北京奇艺世纪科技有限公司 Multimedia data attention parameter query method, device and equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107066499A (en) * 2016-12-30 2017-08-18 江苏瑞中数据股份有限公司 The data query method of multi-source data management and visualization system is stored towards isomery
CN109726217A (en) * 2019-01-10 2019-05-07 北京字节跳动网络技术有限公司 A kind of database operation method, device, equipment and storage medium
CN110489427A (en) * 2019-08-26 2019-11-22 杭州城市大数据运营有限公司 A kind of data query method, apparatus, computer equipment and storage medium

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090193004A1 (en) * 2008-01-30 2009-07-30 Business Objects, S.A. Apparatus and method for forming database tables from queries
JP4385387B1 (en) * 2009-07-02 2009-12-16 修平 西山 Database system with attributed key-value store
CN103324724B (en) * 2013-06-26 2017-02-08 华为技术有限公司 Method and device for processing data
US8914323B1 (en) * 2014-04-10 2014-12-16 Sqrrl Data, Inc. Policy-based data-centric access control in a sorted, distributed key-value data store
US10970280B2 (en) * 2015-10-07 2021-04-06 International Business Machines Corporation Query plan based on a data storage relationship
CN107315751A (en) * 2016-04-26 2017-11-03 北京京东尚科信息技术有限公司 Multidimensional data query method and device
US11042560B2 (en) * 2016-06-19 2021-06-22 data. world, Inc. Extended computerized query language syntax for analyzing multiple tabular data arrangements in data-driven collaborative projects
US11138201B2 (en) * 2017-11-29 2021-10-05 Omics Data Automation, Inc. System and method for integrating data for precision medicine
CN108009250B (en) * 2017-12-01 2021-09-07 武汉斗鱼网络科技有限公司 Multi-classification event data cache establishing and querying method and device
CN108052569A (en) * 2017-12-07 2018-05-18 深圳市康必达控制技术有限公司 Data bank access method, device, computer readable storage medium and computing device

Also Published As

Publication number Publication date
CN111143464A (en) 2020-05-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant