CN111143464A - Data acquisition method and device and electronic equipment - Google Patents

Info

Publication number
CN111143464A
Authority
CN
China
Prior art keywords
data
storage carrier
data model
query
query command
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911256334.4A
Other languages
Chinese (zh)
Other versions
CN111143464B (en)
Inventor
陈昌源
周亮
黄璞
周杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd
Priority to CN201911256334.4A
Publication of CN111143464A
Application granted
Publication of CN111143464B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 Visual data mining; Browsing structured data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/243 Natural language query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the disclosure discloses a data acquisition method, a data acquisition device, electronic equipment and a computer-readable storage medium. The data acquisition method comprises the following steps: obtaining a first data model, wherein the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier; dispatching data from a first storage carrier into a second storage carrier according to the first data model, wherein the access speed of the second storage carrier is greater than the access speed of the first storage carrier; retrieving data from the second storage carrier in response to receiving a second query command for the data. By the method, the first query language is used for representing the first data model, and the data are dispatched into the second storage carrier with higher access speed, so that the technical problems of data solidification and low access speed in the prior art are solved.

Description

Data acquisition method and device and electronic equipment
Technical Field
The present disclosure relates to the field of data processing, and in particular, to a data acquisition method and apparatus, an electronic device, and a computer-readable storage medium.
Background
In the data-driven era, big data analysis has become a necessary means of driving enterprise business development, and the granularity of the data directly determines how much value can be mined during analysis. Increasingly fine-grained data causes the data volume to grow exponentially, so the ability to analyze massive data (at the TB or even PB level) has become indispensable.
The analysis of massive data inevitably leads to a rapid increase in data processing time, and achieving second-level response during massive data analysis has become a difficulty that the industry urgently needs to solve.
In the prior art, one scheme for analyzing data with a data visualization analysis platform uniformly imports data from various data sources into HIVE data tables to support the storage of massive data. On this basis, the data in the HIVE data tables is managed logically, and the underlying data is finally presented in various charts by visual analysis means, so that data analysts can grasp business details and analyze them. Although this scheme can support massive data, it adopts a traditional data analysis method in which the format of the HIVE data table is fixed, and the whole HIVE table must be queried, read, and written during data analysis. Because HIVE data tables are generally very large, the scheme cannot provide a quick response during data analysis, and the analysis process takes minutes or even hours. The data analysis workflow is therefore unfriendly, and the data can hardly deliver its due value.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
In a first aspect, an embodiment of the present disclosure provides a data obtaining method, including:
obtaining a first data model, wherein the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier;
dispatching data from a first storage carrier into a second storage carrier according to the first data model, wherein the access speed of the second storage carrier is greater than the access speed of the first storage carrier;
retrieving data from the second storage carrier in response to receiving a second query command for the data.
Further, the obtaining the first data model includes:
a first query language file describing the first data model is obtained.
Further, the dispatching data from the first storage carrier to the second storage carrier according to the data model includes:
acquiring a first query command of a first query language corresponding to the first data model;
generating a dependency relation tree corresponding to the data in the first data model according to the first query command;
reading data corresponding to the first query command from the first storage carrier in response to the data in the dependency tree being ready;
storing said data in said second storage carrier.
Further, a second query command for the data is represented by a second query language.
Further, the second query command is a fixed query command, wherein the fixed query command is a periodic fixed query command.
Further, the data acquisition method further includes:
importing the data from a second storage carrier to a third storage carrier, wherein the access speed of the third storage carrier is greater than the access speed of the second carrier;
retrieving data from the third storage carrier in response to receiving a third query command for the data.
Further, the first data model represented by a first query language comprises:
data in the data model is extracted from a single table in the first storage carrier by a first query language; and/or
data in the data model is combined from a plurality of tables in the first storage carrier by a first query language.
In a second aspect, an embodiment of the present disclosure provides a data acquisition system, including:
a first storage carrier for storing original data;
a second storage carrier for storing data corresponding to the first data model scheduled from the first storage carrier;
wherein the first data model is represented by a first query language, the access speed of the second storage carrier being greater than the access speed of the first storage carrier;
the human-computer interface is used for receiving a second query command of the data;
wherein the second query command is used to retrieve data corresponding to the second query command from the second storage carrier.
In a third aspect, an embodiment of the present disclosure provides a data acquisition apparatus, including:
the data model acquisition module is used for acquiring a first data model, wherein the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier;
a scheduling module for scheduling data from a first storage carrier to a second storage carrier according to the first data model, wherein an access speed of the second storage carrier is greater than an access speed of the first storage carrier;
and a query module for retrieving the data from the second storage carrier in response to receiving a second query command for the data.
In a fourth aspect, an embodiment of the present disclosure provides an electronic device, including: at least one processor; and
a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data acquisition method of any one of the preceding first aspects.
In a fifth aspect, the disclosed embodiments provide a non-transitory computer-readable storage medium, wherein the non-transitory computer-readable storage medium stores computer instructions for causing a computer to execute the data acquisition method of any one of the foregoing first aspects.
The embodiment of the disclosure discloses a data acquisition method, a data acquisition device, electronic equipment and a computer-readable storage medium. The data acquisition method comprises the following steps: obtaining a first data model, wherein the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier; dispatching data from a first storage carrier into a second storage carrier according to the first data model, wherein the access speed of the second storage carrier is greater than the access speed of the first storage carrier; retrieving data from the second storage carrier in response to receiving a second query command for the data. By the method, the first query language is used for representing the first data model, and the data are dispatched into the second storage carrier with higher access speed, so that the technical problems of data solidification and low access speed in the prior art are solved.
The foregoing is only a summary of the technical solutions of the present disclosure, provided to promote a clear understanding of its technical means; the present disclosure may be embodied in other specific forms without departing from its spirit or essential attributes.
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale.
Fig. 1 is a schematic view of an application scenario of an embodiment of the present disclosure;
Fig. 2 is a schematic flow chart of a data acquisition method according to an embodiment of the present disclosure;
Fig. 3 is a schematic diagram of a specific example of step S202 in the data acquisition method according to the embodiment of the disclosure;
Fig. 4 is a schematic flow chart illustrating a data acquisition method according to an embodiment of the present disclosure;
Fig. 5 is a schematic diagram of a data acquisition system provided by an embodiment of the present disclosure;
Fig. 6 is a schematic structural diagram of an embodiment of a data acquisition apparatus provided in an embodiment of the present disclosure;
Fig. 7 is a schematic structural diagram of an electronic device provided according to an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that the modifiers "a", "an", and "the" in this disclosure are illustrative rather than limiting, and those skilled in the art will understand them to mean "one or more" unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
Fig. 1 is a schematic view of an application scenario of the embodiment of the present disclosure. As shown in fig. 1, a user 103 performs data analysis at a terminal 101. The data analysis is performed based on a data model, for example an analysis model of article reading data, whose data includes the name of the article, the click amount, the reading time, the user, and the like. The specific data in the data model is stored in a storage carrier 102, which is a mass data storage carrier. For example, the storage carrier is HDFS (Hadoop Distributed File System), and the data is stored in HIVE tables. When the user 103 analyzes data based on the data model, the data needs to be acquired from HDFS, that is, the HIVE tables are read and written, and an analysis tool displays the acquired data on a display device of the terminal 101 for the user 103 to view.
Fig. 2 is a flowchart of an embodiment of a data acquisition method provided in an embodiment of the present disclosure, where the data acquisition method provided in this embodiment may be executed by a data acquisition device, and the data acquisition device may be implemented as software, or implemented as a combination of software and hardware, and the data acquisition device may be integrated in some device in a data acquisition system, such as a data acquisition server or a data acquisition terminal device. As shown in fig. 2, the method comprises the steps of:
Step S201, acquiring a first data model;
In the present disclosure, the first data model is a model used when analyzing data. Illustratively, it is an analysis model of article reading data, which includes the name of an article, the click rate of the article, the reading time of the article, the users reading the article, and the like. The original data of the first data model is stored in a first storage carrier, which is a mass data storage carrier, illustratively HDFS. Data on HDFS is stored in the form of HIVE tables. HIVE is a Hadoop-based data warehouse tool used for data extraction, transformation, and loading; it provides a mechanism for storing, querying, and analyzing large-scale data stored in Hadoop. The HIVE data warehouse tool can map a structured data file into a database table, namely a HIVE table.
In the present disclosure, the first data model is represented by a first query language, and data corresponding to the first data model is stored in the first storage carrier. For example, when the data is stored in the first storage carrier in the form of HIVE tables, the first query language is HIVE SQL, that is, the first data model is logically expressed in HIVE SQL. For example, suppose there are multiple HIVE tables in HDFS, where table A has the fields (user id, age, access time, article id, etc.) and table B has the fields (article id, article name, click rate). Then the HIVE SQL code: select A.article_id, B.article_id from A, B where A.article_id = B.article_id gives a logical description of a data model, i.e. an analysis table, whose data satisfies the condition A.article_id = B.article_id. It will be appreciated that the first data model in actual use is much more complex than in the above example. The code in the first query language describing the first data model may be saved as a first query language file, in which case obtaining the first data model includes: obtaining a first query language file describing the first data model. Maintaining the first data model then only requires maintaining the first query language file.
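The join example above can be sketched in runnable form. This is a minimal illustration only: Python's built-in sqlite3 module stands in for HIVE on HDFS, the table contents are invented, and `load_data_model` and `DATA_MODEL_SQL` are hypothetical names that do not come from the disclosure. The point it shows is that the data model is nothing more than query-language text that can be kept in a file and maintained there.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the first storage
# carrier (HIVE tables on HDFS) so that the sketch runs anywhere.
first_store = sqlite3.connect(":memory:")
first_store.executescript("""
    CREATE TABLE A (user_id INTEGER, age INTEGER, access_time TEXT, article_id INTEGER);
    CREATE TABLE B (article_id INTEGER, article_name TEXT, click_rate INTEGER);
    INSERT INTO A VALUES (1, 30, '2019-12-01', 10),
                         (2, 25, '2019-12-02', 11),
                         (3, 41, '2019-12-03', 99);
    INSERT INTO B VALUES (10, 'intro', 500), (11, 'guide', 120);
""")

# The data model is only text in the query language. In practice it would be
# read from a query-language file rather than inlined here.
DATA_MODEL_SQL = """
    SELECT A.user_id, A.article_id, B.article_name
    FROM A, B
    WHERE A.article_id = B.article_id
"""

def load_data_model(sql_text: str) -> str:
    """Hypothetical loader: returns the model text with whitespace trimmed."""
    return sql_text.strip()

model = load_data_model(DATA_MODEL_SQL)
# ORDER BY is appended only to make the printed order deterministic.
analysis_rows = first_store.execute(model + " ORDER BY A.user_id").fetchall()
print(analysis_rows)
```

Only rows whose article id appears in both tables survive the join, so the row for article 99 is absent from the analysis table.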
The first query language is more flexible than the storage format in the first storage carrier. If the data model is stored in a HIVE table, the format of the HIVE table is fixed, and modifying the data model requires establishing a new HIVE table, which is very inconvenient. If the first query language is used to represent the data model, the data model can be modified by modifying code statements of the first query language. Illustratively, the first data model being represented by the first query language includes: data in the data model is extracted from a single table in the first storage carrier by the first query language; and/or data in the data model is combined from multiple tables in the first storage carrier by the first query language. Because a HIVE table is a storage carrier for mass data, the amount of data stored in one HIVE table can reach several hundred TB or even be measured in PB; operability is very weak and the operation delay is unacceptable. In most cases, the data a single user needs to operate on is only a part of the HIVE table, so a data extraction method is used to extract the data the user really needs. Data extraction comprises two steps. The first step is column extraction: the original HIVE table may have hundreds of field columns or more, and the user can select the subset of field columns that meets his or her needs. The second step is data screening: the original HIVE table covers the full data set, for example table A stores ten years of access behavior of all users of a certain APP, while the user only needs to analyze the data of the last year, so only the data of the last year needs to be screened out, and finer-grained screening can be performed through the attributes of the user.
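The two extraction steps can be illustrated with a short sketch. It assumes sqlite3 as a stand-in for HIVE, an invented table A, and a hypothetical helper `build_extraction_query`; none of these names come from the disclosure. Column extraction becomes the SELECT list and data screening becomes the WHERE clause.

```python
import sqlite3

# Assumption: SQLite stands in for a very wide, very long HIVE table.
store = sqlite3.connect(":memory:")
store.executescript("""
    CREATE TABLE A (user_id INTEGER, age INTEGER, access_time TEXT,
                    article_id INTEGER, device TEXT);
    INSERT INTO A VALUES
        (1, 30, '2012-05-01', 10, 'ios'),
        (2, 25, '2019-03-15', 11, 'android'),
        (3, 41, '2019-11-20', 12, 'ios');
""")

def build_extraction_query(table, columns, since):
    """Hypothetical helper that renders both extraction steps as one query.

    Step 1 (column extraction): only the chosen columns appear in SELECT.
    Step 2 (data screening): the WHERE clause keeps rows on or after `since`.
    Table and column names are trusted here; only the date is parameterized.
    """
    column_list = ", ".join(columns)
    sql = (f"SELECT {column_list} FROM {table} "
           f"WHERE access_time >= ? ORDER BY user_id")
    return sql, (since,)

sql, params = build_extraction_query("A", ["user_id", "article_id"], "2019-01-01")
extracted = store.execute(sql, params).fetchall()
print(extracted)
```

Because the dates are stored as ISO-formatted strings, plain string comparison orders them chronologically, which is what the screening condition relies on.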
Both extraction steps are described by HIVE SQL code, which implements column extraction and row screening of the data and thereby completes data extraction from the whole HIVE table. As for combining different data from a plurality of HIVE tables: the task of a HIVE table is to permanently store mass data, so changes to the data of a HIVE table have a large impact and are generally made incrementally, and different data may therefore be stored in different HIVE tables. Consequently, data analysis often needs to combine the data of a plurality of HIVE tables. A multi-table connection (join) method is adopted: different forms of connection can be performed on a plurality of different tables as required, and the plurality of HIVE tables are likewise connected through HIVE SQL code. Illustratively, the first data model being represented by the first query language further includes: generating the first data model by modifying first query language code of a historical data model. In this embodiment, the historical data model is also described in the first query language, and the data model can be modified by modifying the first query language code representing the historical data model, without additionally creating a new data model. A user may need to adjust the data model many times, for example changing which original HIVE tables are connected, which fields are extracted from the HIVE tables, and which screening conditions are applied. Because the embodiments of the present disclosure describe the data model in HIVE SQL code, the user can complete such a modification simply by modifying the previously generated HIVE SQL code. It is to be appreciated that in the disclosed embodiments, the data model is expressed logically using HIVE SQL.
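One way to picture "modifying the historical data model" is to treat the model as a small description that renders to query text, so that a revised model is just an edited copy. The sketch below is an assumption-laden illustration: the `DataModel` class, its fields, and the sample tables are hypothetical, and sqlite3 again stands in for HIVE.

```python
import sqlite3
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class DataModel:
    """Hypothetical description of a data model that renders to SQL text."""
    columns: tuple
    tables: tuple
    join_condition: str
    row_filter: str = "1 = 1"  # no extra screening by default

    def to_sql(self) -> str:
        return (f"SELECT {', '.join(self.columns)} "
                f"FROM {', '.join(self.tables)} "
                f"WHERE {self.join_condition} AND {self.row_filter}")

store = sqlite3.connect(":memory:")
store.executescript("""
    CREATE TABLE A (user_id INTEGER, age INTEGER, article_id INTEGER);
    CREATE TABLE B (article_id INTEGER, article_name TEXT, click_rate INTEGER);
    INSERT INTO A VALUES (1, 30, 10), (2, 25, 11);
    INSERT INTO B VALUES (10, 'intro', 500), (11, 'guide', 120);
""")

historical = DataModel(
    columns=("A.user_id", "B.article_name"),
    tables=("A", "B"),
    join_condition="A.article_id = B.article_id",
)

# Revising the model changes only its description: one more extracted field
# and a screening condition. No table in the store is created or altered.
revised = replace(
    historical,
    columns=historical.columns + ("B.click_rate",),
    row_filter="A.age >= 30",
)

historical_rows = sorted(store.execute(historical.to_sql()).fetchall())
revised_rows = store.execute(revised.to_sql()).fetchall()
print(historical_rows, revised_rows)
```

The stored tables are untouched by the revision, which mirrors the flexibility argument made above for describing the model in query code instead of in a fixed table format.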
Step S202, dispatching data from a first storage carrier into a second storage carrier according to the first data model;
In the present disclosure, the access speed of the second storage carrier is greater than the access speed of the first storage carrier. Illustratively, the second storage carrier is a memory, specifically implemented as an in-memory distributed database. Since the in-memory distributed database runs in memory, implementing the data model in the second storage carrier speeds up data access. Before data analysis is required, the data of the data model used for data analysis is dispatched from the first storage carrier to the second storage carrier, which comprises:
step S301, acquiring a first query command of a first query language corresponding to the data model;
step S302, generating a dependency relationship tree corresponding to the data in the data model according to the first query command;
step S303, in response to that the data in the dependency tree are ready, reading the data corresponding to the first query command from the first storage carrier;
step S304, storing the data into the second storage carrier.
When scheduling data, it is first necessary to know which data to schedule. The data model is represented by first query language code, which may be broken down into one or more first query commands. For example, the first query command is a query command written in HIVE SQL that queries the HIVE table for the data corresponding to the first data model. In step S302, a data dependency tree is generated according to the first query command, because data in the data model may depend on data in other data sources. For example, if one data item represents the reading amount of an article in a certain quarter, its dependency data is the reading amount of the article in each month of that quarter, and the quarterly reading amount can be obtained only after the reading amount of each month has been obtained. In step S303, when the data of each node in the dependency tree is ready, the data corresponding to the first data model is ready; at this time, the data corresponding to the first query command is read from the first storage carrier. Then, in step S304, the data is stored in the second storage carrier, where it is also stored in the form of the first data model. In this way, the data for data analysis is carried by the second storage carrier, which speeds up access to the data.
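Steps S302 to S304 can be sketched with the quarterly reading example. The disclosure does not say how the dependency tree is derived from the query command, so the sketch simply builds it by hand; `DependencyNode`, `is_ready`, `build_tree`, and `schedule` are hypothetical names, and plain dictionaries stand in for the two storage carriers.

```python
from dataclasses import dataclass, field

@dataclass
class DependencyNode:
    """Hypothetical tree node: a leaf is ready when its source data exists."""
    name: str
    ready: bool = False
    children: list = field(default_factory=list)

def is_ready(node: DependencyNode) -> bool:
    # An inner node is ready only when every node beneath it is ready.
    if not node.children:
        return node.ready
    return all(is_ready(child) for child in node.children)

def build_tree(months, first_store) -> DependencyNode:
    # Assumption: the quarterly figure depends on one monthly figure per month.
    leaves = [DependencyNode(m, ready=m in first_store) for m in months]
    return DependencyNode("quarter_reads", children=leaves)

def schedule(months, first_store, second_store) -> bool:
    """Copy the model's data to the faster store once dependencies are ready."""
    root = build_tree(months, first_store)           # step S302
    if not is_ready(root):                           # step S303 gate
        return False
    quarter_total = sum(first_store[m] for m in months)  # read from first store
    second_store[root.name] = quarter_total          # step S304
    return True

first_store = {"2019-10": 100, "2019-11": 150}  # December has not landed yet
second_store = {}
months = ["2019-10", "2019-11", "2019-12"]

first_attempt = schedule(months, first_store, second_store)   # dependencies missing
first_store["2019-12"] = 200
second_attempt = schedule(months, first_store, second_store)  # all months present
print(first_attempt, second_attempt, second_store)
```

The first call leaves the second store empty because one monthly figure is missing; only after December arrives is the quarterly value read and stored.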
Step S203, in response to receiving a second query command for data, obtaining the data from the second storage carrier.
In an embodiment of the present disclosure, the second query command is represented by a second query language. Illustratively, the second query language is SQL. Namely, the user inputs a second query command through the man-machine interface to acquire the data in the first data model from the second storage carrier for data analysis.
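As a rough illustration of step S203, the sketch below materializes the model's rows in a second store and then answers an ordinary SQL query from that store alone. Both stores are in-memory SQLite databases here, purely as stand-ins for HIVE on HDFS and for an in-memory distributed database; the table `article_reads` and the function `run_second_query` are hypothetical.

```python
import sqlite3

# Assumption: two SQLite connections stand in for the slow and fast carriers.
first_store = sqlite3.connect(":memory:")
second_store = sqlite3.connect(":memory:")

first_store.executescript("""
    CREATE TABLE A (user_id INTEGER, article_id INTEGER);
    CREATE TABLE B (article_id INTEGER, article_name TEXT);
    INSERT INTO A VALUES (1, 10), (2, 10), (3, 11);
    INSERT INTO B VALUES (10, 'intro'), (11, 'guide');
""")

# First query command: the data model expressed in the first query language.
MODEL_SQL = ("SELECT A.user_id, B.article_name FROM A, B "
             "WHERE A.article_id = B.article_id")

# Dispatch the model's rows into the second store, keeping the model's shape.
second_store.execute(
    "CREATE TABLE article_reads (user_id INTEGER, article_name TEXT)")
second_store.executemany(
    "INSERT INTO article_reads VALUES (?, ?)",
    first_store.execute(MODEL_SQL).fetchall(),
)

def run_second_query(sql: str):
    """Answer an analysis query from the second store only."""
    return second_store.execute(sql).fetchall()

reads_per_article = run_second_query(
    "SELECT article_name, COUNT(*) FROM article_reads "
    "GROUP BY article_name ORDER BY article_name"
)
print(reads_per_article)
```

After the dispatch, the analysis query never touches the first store, which is the source of the speed-up the method describes.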
In one embodiment, the second query command is a fixed query command, and the fixed query command is a periodic fixed query command. Illustratively, the user builds the data to be viewed into a dashboard (billboard) that is used to view the data every day. The query corresponding to the dashboard is fixed and acts directly on the second storage carrier. The query has a time attribute, for example it queries the number of users in the last day, so it needs to be updated every day; here one day is the period of the query. On the basis of this embodiment, the data acquisition method further includes:
step S401, importing the data from a second storage carrier to a third storage carrier, wherein the access speed of the third storage carrier is greater than that of the second carrier;
step S402, in response to receiving a third query command for data, retrieving the data from the third storage carrier.
In step S401, when the second query command is a fixed query command, the data is imported from the second storage carrier to a third storage carrier, where the data corresponding to the second query command is data corresponding to the first data model or a second data model, and the data of the second data model is a subset of the data of the first data model. In the embodiment of the present disclosure, the execution time of the fixed query command may be preset, so that the data in the second data model is refreshed at a fixed time every day. Illustratively, the third storage carrier is a cache whose access speed is greater than that of the memory and the system memory. Therefore, for a high-frequency data analysis scenario, such as the daily updated dashboard (billboard) mentioned in the example above, scheduling the data into the cache further improves the access speed of the data in data analysis.
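The fixed, periodic query and the third storage carrier can be sketched as a small cache keyed by day. This is an illustrative assumption rather than the disclosed implementation: `FixedQueryCache`, the `visits` table, and the query text are hypothetical, a dictionary stands in for the cache, and SQLite stands in for the second storage carrier.

```python
import sqlite3
from datetime import date

# Assumption: SQLite stands in for the second storage carrier.
second_store = sqlite3.connect(":memory:")
second_store.executescript("""
    CREATE TABLE visits (user_id INTEGER, visit_day TEXT);
    INSERT INTO visits VALUES (1, '2019-12-09'), (2, '2019-12-09'),
                              (1, '2019-12-09'), (3, '2019-12-10');
""")

# The fixed query: its text never changes, only the day parameter does.
FIXED_QUERY = "SELECT COUNT(DISTINCT user_id) FROM visits WHERE visit_day = ?"

class FixedQueryCache:
    """Hypothetical third-tier cache holding one result per period (day)."""

    def __init__(self, store, sql):
        self.store = store
        self.sql = sql
        self.cache = {}
        self.store_reads = 0  # counts trips to the second store

    def refresh(self, day: date) -> None:
        # Meant to run once per period at a preset time, e.g. from a scheduler.
        self.store_reads += 1
        row = self.store.execute(self.sql, (day.isoformat(),)).fetchone()
        self.cache[day] = row[0]

    def get(self, day: date) -> int:
        # Serve from the cache; fall back to a refresh only on a miss.
        if day not in self.cache:
            self.refresh(day)
        return self.cache[day]

dashboard = FixedQueryCache(second_store, FIXED_QUERY)
day = date(2019, 12, 9)
dashboard.refresh(day)             # the daily scheduled refresh
first_view = dashboard.get(day)    # served from the cache
second_view = dashboard.get(day)   # served from the cache again
print(first_view, second_view, dashboard.store_reads)
```

Repeated views of the same day cost a single read of the second store, which is the benefit claimed for high-frequency scenarios.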
Fig. 5 is a schematic diagram of a data acquisition system according to an embodiment of the present disclosure. As shown in fig. 5, the data acquisition system comprises a first storage carrier 501, a second storage carrier 502, and a human-machine interface 504. The first storage carrier 501 is used for storing original data, and the second storage carrier 502 is used for storing data corresponding to a first data model scheduled from the first storage carrier, wherein the first data model is represented by a first query language, and the access speed of the second storage carrier is greater than that of the first storage carrier. The human-machine interface 504 is configured to receive a second query command for data, where the second query command is used to obtain data corresponding to the second query command from the second storage carrier. Further, the data acquisition system comprises a third storage carrier 503 for storing data, corresponding to the first data model or the second data model, scheduled from the second storage carrier, wherein the access speed of the third storage carrier is greater than the access speed of the second storage carrier. The human-machine interface 504 is also configured to receive a third query command for data, where the third query command is used to acquire the data corresponding to the third query command from the third storage carrier. This hierarchical storage structure accelerates access to the data used in data analysis and thus greatly speeds up the analysis itself. In addition, the data model used for data analysis is described logically in a flexible query language, so the expression of the data model is more flexible and is not limited to the logical expression in the first storage carrier.
The embodiment of the disclosure discloses a data acquisition method, a data acquisition device, electronic equipment and a computer-readable storage medium. The data acquisition method comprises the following steps: obtaining a first data model, wherein the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier; dispatching data from a first storage carrier into a second storage carrier according to the first data model, wherein the access speed of the second storage carrier is greater than the access speed of the first storage carrier; retrieving data from the second storage carrier in response to receiving a second query command for the data. By the method, the first query language is used for representing the first data model, and the data are dispatched into the second storage carrier with higher access speed, so that the technical problems of data solidification and low access speed in the prior art are solved.
Although the steps in the above method embodiments are described in the above sequence, it should be clear to those skilled in the art that the steps in the embodiments of the present disclosure are not necessarily performed in that sequence and may also be performed in other orders, such as in reverse, in parallel, or interleaved. Moreover, those skilled in the art may add other steps on the basis of the above steps. These obvious modifications or equivalents also fall within the protection scope of the present disclosure and are not described herein again.
Fig. 6 is a schematic structural diagram of an embodiment of a data acquisition apparatus provided in an embodiment of the present disclosure, and as shown in fig. 6, the apparatus 600 includes: a data model acquisition module 601, a scheduling module 602, and a query module 603.
Wherein,
a data model obtaining module 601, configured to obtain a first data model, where the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier;
a scheduling module 602, configured to schedule data from a first storage carrier to a second storage carrier according to the first data model, where an access speed of the second storage carrier is greater than an access speed of the first storage carrier;
a query module 603 configured to retrieve data from the second storage carrier in response to receiving a second query command for the data.
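As a rough illustration of how the three modules could cooperate, the following sketch uses two in-memory SQLite databases to stand in for the first and second storage carriers. SQL as the first query language, the table names `orders` and `customer_totals`, and the function names are all assumptions made for this example; the disclosure does not prescribe them.

```python
import sqlite3

# Hypothetical first data model, written in SQL (assumed first query language).
FIRST_DATA_MODEL = "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"


def acquire_data_model() -> str:
    """Role of module 601: obtain the data model as a query-language string."""
    return FIRST_DATA_MODEL


def schedule(model_sql: str, first: sqlite3.Connection, second: sqlite3.Connection) -> None:
    """Role of module 602: evaluate the model on the first carrier, store rows in the second."""
    rows = first.execute(model_sql).fetchall()
    second.execute("CREATE TABLE IF NOT EXISTS customer_totals (customer TEXT, total REAL)")
    second.execute("DELETE FROM customer_totals")  # replace any earlier snapshot
    second.executemany("INSERT INTO customer_totals VALUES (?, ?)", rows)
    second.commit()


def query(second: sqlite3.Connection, command: str) -> list:
    """Role of module 603: answer a second query command from the second carrier only."""
    return second.execute(command).fetchall()


first_carrier = sqlite3.connect(":memory:")   # stands in for the slower raw store
second_carrier = sqlite3.connect(":memory:")  # stands in for the faster store
first_carrier.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
first_carrier.executemany(
    "INSERT INTO orders VALUES (?, ?)", [("a", 1.0), ("a", 2.5), ("b", 4.0)]
)

schedule(acquire_data_model(), first_carrier, second_carrier)
result = query(second_carrier, "SELECT customer, total FROM customer_totals ORDER BY customer")
```

Here `result` holds one aggregated row per customer, read without touching the first carrier.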
Further, the data model obtaining module 601 is further configured to:
a first query language file describing the first data model is obtained.
Further, the scheduling module 602 is further configured to:
acquiring a first query command of a first query language corresponding to the first data model;
generating a dependency relation tree corresponding to the data in the first data model according to the first query command;
reading data corresponding to the first query command from the first storage carrier in response to the data in the dependency tree being ready;
storing said data in said second storage carrier.
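The dependency-tree step can be sketched as follows. This is only one possible reading: the regular expression that pulls table names out of a query, the `definitions` mapping from derived tables to their queries, and the readiness rule are hypothetical, and the sketch assumes the definitions contain no cycles.

```python
import re
from typing import Dict, List, Set


def referenced_tables(sql: str) -> List[str]:
    """Naive extraction of table names after FROM/JOIN (illustration only)."""
    return re.findall(r"\b(?:FROM|JOIN)\s+([A-Za-z_][A-Za-z0-9_]*)", sql, flags=re.IGNORECASE)


def build_dependency_tree(root: str, definitions: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each derived table to its upstream tables, starting from `root`."""
    tree: Dict[str, List[str]] = {}
    pending = [root]
    while pending:
        name = pending.pop()
        if name in tree or name not in definitions:
            continue  # already visited, or a raw table with no definition
        tree[name] = referenced_tables(definitions[name])
        pending.extend(tree[name])
    return tree


def is_ready(name: str, tree: Dict[str, List[str]], ready: Set[str]) -> bool:
    """A node is ready if marked ready, or if all of its upstream nodes are ready."""
    if name in ready:
        return True
    children = tree.get(name)
    if not children:
        return False  # a raw table that has not been marked ready
    return all(is_ready(child, tree, ready) for child in children)


# Hypothetical query commands: the model depends on a cleaned table and a raw table.
definitions = {
    "daily_model": "SELECT o.customer, u.region FROM orders_clean o JOIN users u ON o.customer = u.id",
    "orders_clean": "SELECT * FROM raw_orders WHERE amount > 0",
}
tree = build_dependency_tree("daily_model", definitions)
can_read_now = is_ready("daily_model", tree, ready={"raw_orders", "users"})
```

Only when `can_read_now` is true would the scheduler read the model data from the first storage carrier and store it in the second.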
Further, a second query command for the data is represented by a second query language.
Further, the second query command is a solidified query command, where the solidified query command is a periodic fixed query command.
Further, the scheduling module 602 is further configured to: import the data from the second storage carrier to a third storage carrier, wherein the access speed of the third storage carrier is greater than the access speed of the second storage carrier; the query module 603 is further configured to: retrieve data from the third storage carrier in response to receiving a third query command for the data.
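One way to picture the third storage carrier is as a store of precomputed answers. In the sketch below an in-memory SQLite database stands in for the second storage carrier and a plain dictionary for the third. Treating the imported data as the results of named, fixed queries is an assumption of this example; the disclosure only states that data is imported into a faster carrier.

```python
import sqlite3
from typing import Dict, List, Tuple

# Second carrier (assumed): holds data already scheduled for the data model.
second_carrier = sqlite3.connect(":memory:")
second_carrier.execute("CREATE TABLE customer_totals (customer TEXT, total REAL)")
second_carrier.executemany(
    "INSERT INTO customer_totals VALUES (?, ?)", [("a", 3.5), ("b", 4.0)]
)

# Hypothetical fixed (periodically repeated) query commands, keyed by name.
FIXED_QUERIES = {
    "top_customer": "SELECT customer, total FROM customer_totals ORDER BY total DESC LIMIT 1",
}

# Third carrier (assumed): an in-memory mapping from query name to result rows.
third_carrier: Dict[str, List[Tuple]] = {}


def import_to_third(second: sqlite3.Connection, third: Dict[str, List[Tuple]]) -> None:
    """Evaluate each fixed query on the second carrier and keep the rows in the third."""
    for name, sql in FIXED_QUERIES.items():
        third[name] = second.execute(sql).fetchall()


def answer_third_query(name: str, third: Dict[str, List[Tuple]]) -> List[Tuple]:
    """Serve a third query command directly from the third carrier."""
    return third[name]


import_to_third(second_carrier, third_carrier)
top = answer_third_query("top_customer", third_carrier)
```

Re-running `import_to_third` on a schedule would keep the precomputed rows current for queries that repeat periodically.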
Further, the first data model represented by a first query language comprises: data in the data model is extracted from a single table in the first storage carrier by a first query language; and/or the data in the data model is combined from multiple tables in the first storage carrier by a first query language.
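The two cases can be illustrated with SQL, assuming SQL is the first query language and an in-memory SQLite database stands in for the first storage carrier. The tables `orders` and `customers` and their columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stands in for the first storage carrier
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    CREATE TABLE customers (customer_id INTEGER, region TEXT);
    INSERT INTO orders VALUES (1, 10, 5.0), (2, 10, 7.0), (3, 11, 2.0);
    INSERT INTO customers VALUES (10, 'north'), (11, 'south');
    """
)

# Data model whose data is extracted from a single table.
single_table_model = """
    SELECT customer_id, SUM(amount) AS total
    FROM orders
    GROUP BY customer_id
    ORDER BY customer_id
"""

# Data model whose data is combined from multiple tables.
multi_table_model = """
    SELECT c.region, SUM(o.amount) AS total
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.region
    ORDER BY c.region
"""

single_rows = conn.execute(single_table_model).fetchall()
multi_rows = conn.execute(multi_table_model).fetchall()
```

Because each model is just a query, changing what the model contains means editing the query text rather than restructuring the tables in the first storage carrier.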
The apparatus shown in fig. 6 can perform the method of the embodiment shown in fig. 2-4, and the detailed description of this embodiment can refer to the related description of the embodiment shown in fig. 2-4. The implementation process and technical effect of the technical solution refer to the descriptions in the embodiments shown in fig. 2 to fig. 4, and are not described herein again.
Referring now to FIG. 7, shown is a schematic diagram of an electronic device 700 suitable for use in implementing embodiments of the present disclosure. The terminal device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle terminal (e.g., a car navigation terminal), and the like, and a stationary terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 7, the electronic device 700 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 701 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage device 708 into a Random Access Memory (RAM) 703. The RAM 703 also stores various programs and data necessary for the operation of the electronic device 700. The processing device 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
Generally, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 707 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 708 including, for example, magnetic tape, hard disk, etc.; and a communication device 709. The communication means 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. While fig. 7 illustrates an electronic device 700 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via the communication means 709, or may be installed from the storage means 708, or may be installed from the ROM 702. The computer program, when executed by the processing device 701, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (hypertext transfer protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communications network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: obtaining a first data model, wherein the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier; dispatching data from a first storage carrier into a second storage carrier according to the first data model, wherein the access speed of the second storage carrier is greater than the access speed of the first storage carrier; retrieving data from the second storage carrier in response to receiving a second query command for the data.
Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including but not limited to object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. The name of a unit does not, in some cases, constitute a limitation on the unit itself.
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, there is provided a data acquisition method including:
obtaining a first data model, wherein the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier;
dispatching data from a first storage carrier into a second storage carrier according to the first data model, wherein the access speed of the second storage carrier is greater than the access speed of the first storage carrier;
retrieving data from the second storage carrier in response to receiving a second query command for the data.
Further, the obtaining the first data model includes:
a first query language file describing the first data model is obtained.
Further, the dispatching data from the first storage carrier to the second storage carrier according to the data model includes:
acquiring a first query command of a first query language corresponding to the first data model;
generating a dependency relation tree corresponding to the data in the first data model according to the first query command;
reading data corresponding to the first query command from the first storage carrier in response to the data in the dependency tree being ready;
storing said data in said second storage carrier.
Further, a second query command for the data is represented by a second query language.
Further, the second query command is a fixed query command, wherein the fixed query command is a periodic fixed query command.
Further, the data acquisition method further includes:
importing the data from a second storage carrier to a third storage carrier, wherein the access speed of the third storage carrier is greater than the access speed of the second storage carrier;
retrieving data from the third storage carrier in response to receiving a third query command for the data.
Further, the first data model represented by a first query language comprises:
data in the data model is extracted from a single table in the first storage carrier by a first query language; and/or,
data in the data model is combined from a plurality of tables in the first storage carrier by a first query language.
According to one or more embodiments of the present disclosure, there is provided a data acquisition system including:
a first storage carrier for storing original data;
a second storage carrier for storing data corresponding to the first data model scheduled from the first storage carrier;
wherein the first data model is represented by a first query language, the access speed of the second storage carrier being greater than the access speed of the first storage carrier;
a human-machine interface for receiving a second query command for the data;
wherein the second query command is used to retrieve data corresponding to the second query command from the second storage carrier.
According to one or more embodiments of the present disclosure, there is provided a data acquisition apparatus including:
a data model acquisition module for acquiring a first data model, wherein the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier;
a scheduling module for scheduling data from a first storage carrier to a second storage carrier according to the first data model, wherein an access speed of the second storage carrier is greater than an access speed of the first storage carrier;
a query module for acquiring the data from the second storage carrier in response to receiving a second query command for the data.
Further, the data model obtaining module is further configured to:
a first query language file describing the first data model is obtained.
Further, the scheduling module is further configured to:
acquiring a first query command of a first query language corresponding to the first data model;
generating a dependency relation tree corresponding to the data in the first data model according to the first query command;
reading data corresponding to the first query command from the first storage carrier in response to the data in the dependency tree being ready;
storing said data in said second storage carrier.
Further, a second query command for the data is represented by a second query language.
Further, the second query command is a solidified query command, where the solidified query command is a periodic fixed query command.
Further, the scheduling module is further configured to: import the data from the second storage carrier to a third storage carrier, wherein the access speed of the third storage carrier is greater than the access speed of the second storage carrier; the query module is further configured to: retrieve data from the third storage carrier in response to receiving a third query command for the data.
Further, the first data model represented by a first query language comprises: data in the data model is extracted from a single table in the first storage carrier by a first query language; and/or the data in the data model is combined from multiple tables in the first storage carrier by a first query language.
According to one or more embodiments of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data acquisition method of any one of the preceding first aspects.
According to one or more embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium, characterized in that the non-transitory computer-readable storage medium stores computer instructions for causing a computer to execute the data acquisition method of any one of the preceding first aspects.
The foregoing description is only illustrative of the preferred embodiments of the disclosure and of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure herein is not limited to the particular combination of features described above, but also encompasses other embodiments formed by any combination of the features described above or their equivalents without departing from the spirit of the disclosure. For example, a technical solution may be formed by replacing the above features with features of similar function disclosed in (but not limited to) this disclosure.

Claims (11)

1. A method of data acquisition, comprising:
obtaining a first data model, wherein the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier;
dispatching data from a first storage carrier into a second storage carrier according to the first data model, wherein the access speed of the second storage carrier is greater than the access speed of the first storage carrier;
retrieving data from the second storage carrier in response to receiving a second query command for the data.
2. The data acquisition method as recited in claim 1, wherein said acquiring a first data model comprises:
a first query language file describing the first data model is obtained.
3. A method of data acquisition as claimed in claim 1 or 2, wherein said scheduling data from a first storage carrier to a second storage carrier according to said data model comprises:
acquiring a first query command of a first query language corresponding to the first data model;
generating a dependency relation tree corresponding to the data in the first data model according to the first query command;
reading data corresponding to the first query command from the first storage carrier in response to the data in the dependency tree being ready;
storing said data in said second storage carrier.
4. The data acquisition method as in claim 1, wherein the second query command for the data is represented by a second query language.
5. The data acquisition method as claimed in claim 1, wherein the second query command is a fixed query command, wherein the fixed query command is a periodic fixed query command.
6. The data acquisition method as set forth in claim 5, further comprising:
importing the data from a second storage carrier to a third storage carrier, wherein the access speed of the third storage carrier is greater than the access speed of the second storage carrier;
retrieving data from the third storage carrier in response to receiving a third query command for the data.
7. The data acquisition method as in claim 1, wherein the first data model is represented by a first query language comprising:
data in the data model is extracted from a single table in the first storage carrier by a first query language; and/or,
data in the data model is combined from a plurality of tables in the first storage carrier by a first query language.
8. A data acquisition system, comprising:
a first storage carrier for storing original data;
a second storage carrier for storing data corresponding to the first data model scheduled from the first storage carrier;
wherein the first data model is represented by a first query language, the access speed of the second storage carrier being greater than the access speed of the first storage carrier;
a human-machine interface for receiving a second query command for the data;
wherein the second query command is used to retrieve data corresponding to the second query command from the second storage carrier.
9. A data acquisition apparatus, comprising:
a data model acquisition module for acquiring a first data model, wherein the first data model is represented by a first query language, and data corresponding to the first data model is stored in a first storage carrier;
a scheduling module for scheduling data from a first storage carrier to a second storage carrier according to the first data model, wherein an access speed of the second storage carrier is greater than an access speed of the first storage carrier;
a query module for acquiring the data from the second storage carrier in response to receiving a second query command for the data.
10. An electronic device, comprising:
a memory for storing computer readable instructions; and
a processor for executing the computer readable instructions such that the processor when running implements the data acquisition method of any one of claims 1-7.
11. A non-transitory computer readable storage medium storing computer readable instructions which, when executed by a computer, cause the computer to perform the data acquisition method of any one of claims 1-7.
CN201911256334.4A 2019-12-10 2019-12-10 Data acquisition method and device and electronic equipment Active CN111143464B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911256334.4A CN111143464B (en) 2019-12-10 2019-12-10 Data acquisition method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911256334.4A CN111143464B (en) 2019-12-10 2019-12-10 Data acquisition method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN111143464A true CN111143464A (en) 2020-05-12
CN111143464B CN111143464B (en) 2023-07-18

Family

ID=70517886

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911256334.4A Active CN111143464B (en) 2019-12-10 2019-12-10 Data acquisition method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN111143464B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112966129A (en) * 2021-02-26 2021-06-15 北京奇艺世纪科技有限公司 Multimedia data attention parameter query method, device and equipment

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090193004A1 (en) * 2008-01-30 2009-07-30 Business Objects, S.A. Apparatus and method for forming database tables from queries
US20110004638A1 (en) * 2009-07-02 2011-01-06 Shuhei Nishiyama Attributed key-value-store database system
CN103324724A (en) * 2013-06-26 2013-09-25 华为技术有限公司 Method and device for processing data
US20150294120A1 (en) * 2014-04-10 2015-10-15 Sqrrl Data, Inc. Policy-based data-centric access control in a sorted, distributed key-value data store
US20170103104A1 (en) * 2015-10-07 2017-04-13 International Business Machines Corporation Query plan based on a data storage relationship
CN107066499A (en) * 2016-12-30 2017-08-18 江苏瑞中数据股份有限公司 The data query method of multi-source data management and visualization system is stored towards isomery
CN107315751A (en) * 2016-04-26 2017-11-03 北京京东尚科信息技术有限公司 Multidimensional data query method and device
CN108009250A (en) * 2017-12-01 2018-05-08 武汉斗鱼网络科技有限公司 A kind of more foundation of classification race data buffer storage, querying method and devices
CN108052569A (en) * 2017-12-07 2018-05-18 深圳市康必达控制技术有限公司 Data bank access method, device, computer readable storage medium and computing device
US20190065567A1 (en) * 2016-06-19 2019-02-28 data. world, Inc. Extended computerized query language syntax for analyzing multiple tabular data arrangements in data-driven collaborative projects
CN109726217A (en) * 2019-01-10 2019-05-07 北京字节跳动网络技术有限公司 A kind of database operation method, device, equipment and storage medium
US20190163679A1 (en) * 2017-11-29 2019-05-30 Omics Data Automation, Inc. System and method for integrating data for precision medicine
CN110489427A (en) * 2019-08-26 2019-11-22 杭州城市大数据运营有限公司 A kind of data query method, apparatus, computer equipment and storage medium

Also Published As

Publication number Publication date
CN111143464B (en) 2023-07-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant