CN104268172B - Method and apparatus for extracting data - Google Patents

Method and apparatus for extracting data

Info

Publication number
CN104268172B
Authority
CN
China
Prior art keywords
data
data extraction
destination file
storage device
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410467821.6A
Other languages
Chinese (zh)
Other versions
CN104268172A (en)
Inventor
刘彦伟
王晓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong three hundred and sixty degree e-commerce Co., Ltd.
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201410467821.6A
Publication of CN104268172A
Application granted
Publication of CN104268172B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6209 Protecting access to data via a platform, e.g. using keys or access control rules to a single file or object, e.g. in a secure envelope, encrypted and accessed using a key, or with access control rules appended to the object itself

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a method and apparatus for extracting data that reduce the manual effort of extracting data from a data warehouse and improve data security. The method includes: saving data extraction tasks; when it is detected that a new data extraction task has been saved, executing the new task to extract data from a data source and obtain a result file of the data extraction; and sending the result file to a storage device so that a user can obtain the result file from the storage device.

Description

Method and apparatus for extracting data
Technical field
The present invention relates to a method and apparatus for extracting data.
Background
With the development of the Internet, ever more data is generated, and data analysis receives increasing attention. In this context data warehouses play an increasingly important role, and business teams have stronger motivation to keep investing in data analysis. To meet the varied data analysis requirements of business teams, data mining engineers are frequently asked to extract the required data from the data warehouse by hand and deliver it to the business team as a file. This process is data extraction.
To perform a data extraction, the data mining engineer analyzes where the requested data is stored in the data warehouse, manually executes statements of the database used by the data warehouse to convert the data into an ordinary text file, downloads the text file from the online data warehouse server to a personal work computer, and finally sends it to the business team through the enterprise's internal communication tool, which completes one data extraction flow.
Database statements generally take a long time to execute, and downloading and sending the text file also take considerable time. These three stages are sequential, and a failure in any one of them requires manual rework, so the data mining engineer has to keep watching while they run and can hardly do other work in parallel, which consumes a great deal of manpower. In addition, the whole process made up of these three stages is carried out manually and offline. The data changes hands several times along the way, leaving multiple copies in multiple places. These copies are not adequately recorded or supervised, which creates a risk of data leakage.
The main problems with the current approach to extracting data from a data warehouse are therefore that it consumes a great deal of manpower and that data security is inadequate.
Summary of the invention
In view of this, the present invention provides a method and apparatus for extracting data that reduce the manual effort of extracting data from a data warehouse and improve data security.
To achieve the above object, according to one aspect of the present invention, a method for extracting data is provided.
The method for extracting data of the present invention includes: saving data extraction tasks; when it is detected that a new data extraction task has been saved, executing the new data extraction task to extract data from a data source and obtain a result file of the data extraction; and sending the result file to a storage device so that a user can obtain the result file from the storage device.
Optionally, before the data extraction task is saved, the method further includes: receiving a data extraction statement through a form, and then generating the data extraction task from the data extraction statement.
Optionally, the data extraction statement is a data extraction statement of the database used by the data source, and the data extraction task is a data extraction task of that database.
Optionally, the step of sending the result file to the storage device includes: saving the result file in a temporary storage directory; uploading the data in the temporary storage directory to a cloud storage device, and then deleting the data in the temporary storage directory.
According to another aspect of the present invention, an apparatus for extracting data is provided.
The apparatus for extracting data of the present invention includes: a saving module for saving data extraction tasks; a listening module for detecting whether the saving module has saved a new data extraction task; an execution module for executing, when the listening module detects that a new data extraction task has been saved, the new data extraction task to extract data from a data source and obtain a result file of the data extraction; and a processing module for sending the result file to a storage device so that a user can obtain the result file from the storage device.
Optionally, the apparatus further includes a receiving module and a generation module, wherein the receiving module is configured to receive a data extraction statement through a form, and the generation module is configured to generate a data extraction task from the data extraction statement.
Optionally, the data extraction statement is a data extraction statement of the database used by the data source, and the data extraction task is a data extraction task of that database.
Optionally, the processing module is further configured to: save the result file in a temporary storage directory; upload the data in the temporary storage directory to a cloud storage device, and then delete the data in the temporary storage directory.
According to the technical solution of the present invention, data extraction tasks are saved in advance, the saved tasks are monitored, each detected task is executed, and the data obtained by executing the task is then made available for the user to download. The combination of these steps allows data extraction to be completed essentially automatically: the data mining engineer only needs to enter a data extraction statement in the human-machine interface according to the business team's data extraction requirements, and the business team can then obtain the data from a storage device, such as a cloud storage device, without further attention from the engineer. In this solution, the data extracted from the data source is first stored in a temporary directory; after the data has been transferred to the cloud storage device, which offers higher security, the contents of the temporary directory are deleted, which helps ensure data security.
Brief description of the drawings
The drawings are provided for a better understanding of the present invention and do not constitute an improper limitation of it. In the drawings:
Fig. 1 is a schematic diagram of the main steps of a method for extracting data according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the main modules of an apparatus for extracting data according to an embodiment of the present invention.
Detailed description
Exemplary embodiments of the present invention are described below with reference to the drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Those of ordinary skill in the art should therefore recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present invention. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted below.
Fig. 1 is a schematic diagram of the main steps of a method for extracting data according to an embodiment of the present invention. The method can be implemented by a data extraction apparatus in the form of software. As shown in Fig. 1, the method mainly includes the following steps S11 to S17.
Step S11: Receive a data extraction statement through a form. The data extraction apparatus can provide a human-machine interface for receiving data extraction statements, for example a form or another control through which the data mining engineer enters the statement. The data extraction statement is a statement of the database used by the data source; for example, if the data source uses an SQL database, the data extraction statement is an SQL statement.
Step S12: Generate a data extraction task from the received data extraction statement and save it. The data mining engineer may also generate the data extraction task with another tool and have the data extraction apparatus save it.
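Generating a task from a submitted statement and saving it (steps S11 and S12) might look like the following. This is a minimal sketch rather than the patented implementation: the `ExtractionTask` class, the `extraction_tasks` table, and the use of SQLite as the task store are all assumptions made for illustration.

```python
import sqlite3
import uuid
from dataclasses import dataclass


@dataclass
class ExtractionTask:
    """Hypothetical task record built from a submitted extraction statement."""
    task_id: str
    statement: str
    status: str = "new"


def create_task(statement: str) -> ExtractionTask:
    """Generate a task from the statement received through the form (S12)."""
    cleaned = statement.strip()
    if not cleaned:
        raise ValueError("extraction statement must not be empty")
    return ExtractionTask(task_id=uuid.uuid4().hex, statement=cleaned)


def save_task(conn: sqlite3.Connection, task: ExtractionTask) -> None:
    """Persist the task so that a listener can pick it up later."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS extraction_tasks "
        "(task_id TEXT PRIMARY KEY, statement TEXT NOT NULL, status TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT INTO extraction_tasks (task_id, statement, status) VALUES (?, ?, ?)",
        (task.task_id, task.statement, task.status),
    )
    conn.commit()
```

Keeping the task in a persistent store, instead of executing the statement directly in the form handler, is what lets the engineer submit the statement and walk away.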
Step S13: Determine whether a newly saved data extraction task has been detected. In this embodiment the data extraction apparatus listens continuously to determine whether there is a new data extraction task. If there is, the method proceeds to step S14; otherwise, it waits for the delay given by the listening frequency and then returns to this step to continue listening.
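The listening behavior of step S13 can be illustrated with a simple polling loop. The sketch below is only one possible reading: `fetch_new_task`, `handle_task`, `poll_interval`, and `max_polls` are hypothetical names, and the text does not say whether the apparatus polls or is notified some other way.

```python
import time
from typing import Callable, Optional


def listen_for_tasks(
    fetch_new_task: Callable[[], Optional[str]],
    handle_task: Callable[[str], None],
    poll_interval: float = 1.0,
    max_polls: Optional[int] = None,
) -> int:
    """Poll for newly saved tasks; sleep for poll_interval when none is found.

    Returns the number of tasks handled. max_polls bounds the loop so the
    sketch can terminate; a real listener would typically run indefinitely.
    """
    handled = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        task = fetch_new_task()
        if task is None:
            time.sleep(poll_interval)  # delay according to the listening frequency
            continue
        handle_task(task)  # corresponds to moving on to step S14
        handled += 1
    return handled
```

A shorter `poll_interval` picks up new tasks sooner at the cost of more frequent checks against the task store.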
Step S14: Execute the newly detected data extraction task. Executing the task extracts data from the data source and produces a result file of the data extraction.
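As an illustration of step S14, the following sketch executes an SQL extraction statement and writes the rows to a result file. It assumes an SQLite connection as a stand-in for the data source and CSV as the result format; the patent only speaks of a database statement and a result file, so both choices, along with the `run_extraction` name, are hypothetical.

```python
import csv
import sqlite3
from pathlib import Path


def run_extraction(conn: sqlite3.Connection, statement: str, result_path: Path) -> int:
    """Execute the extraction statement and write the rows to a CSV result file.

    Returns the number of data rows written (the header row is not counted).
    """
    cursor = conn.execute(statement)
    header = [column[0] for column in cursor.description]
    count = 0
    with result_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        for row in cursor:  # stream rows instead of loading them all at once
            writer.writerow(row)
            count += 1
    return count
```

Because the statement comes from a form, a real deployment would presumably run it under a read-only database account; that safeguard is not part of this sketch.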
Step S15: Save the result file in a temporary storage directory. Because data extraction takes some time, saving the result file also takes some time; subsequent processing is carried out only after the extraction results have formed a complete result file.
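One common way to make sure later steps only see complete result files is to write under a temporary name and rename once writing has finished. The sketch below uses that convention; the `.part` suffix and both function names are assumptions, since the text only requires waiting until the result file is complete.

```python
import os
from pathlib import Path
from typing import Iterable


def write_result_atomically(temp_dir: Path, name: str, lines: Iterable[str]) -> Path:
    """Write to '<name>.part' and rename to '<name>' once writing has finished."""
    temp_dir.mkdir(parents=True, exist_ok=True)
    partial = temp_dir / (name + ".part")
    final = temp_dir / name
    with partial.open("w", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")
    os.replace(partial, final)  # the final name only appears for complete files
    return final


def completed_results(temp_dir: Path) -> list:
    """List result files that are complete, i.e. not still marked '.part'."""
    return sorted(p for p in temp_dir.iterdir() if p.is_file() and p.suffix != ".part")
```

With this convention, an upload step can call `completed_results` at any time without picking up a file that is still being written.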
Step S16: Upload the data in the temporary storage directory to a cloud storage device. The data here is the result file described above. If several tasks are executed at the same time, the data may also be the several result files they produce. The purpose of steps S15 and S16 is to store the extracted data in a storage device so that the user can obtain it. A cloud storage device has data security measures, so storing the data there in the end helps improve data security. A user, such as the business team, can log in to the cloud storage device with an account to download the data.
Step S17: Delete the data in the temporary storage directory. After the data has been uploaded from the temporary storage directory to the cloud storage device, the contents of the temporary storage directory are preferably cleared to ensure data security.
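Steps S16 and S17 can be sketched as an upload followed by a cleanup. No particular cloud storage service is named in the text, so the example takes the upload operation as a callable and, for demonstration only, provides a stand-in that copies files into a local directory. The names `upload_and_clear` and `make_local_uploader` are hypothetical, and deleting only after every upload has succeeded is a design choice of this sketch.

```python
import shutil
from pathlib import Path
from typing import Callable, List


def upload_and_clear(temp_dir: Path, upload: Callable[[Path], None]) -> List[str]:
    """Upload every file in temp_dir (S16), then delete the uploaded files (S17).

    Files are deleted only after all uploads succeed, so a failed upload
    leaves the temporary data in place for a retry.
    """
    files = sorted(p for p in temp_dir.iterdir() if p.is_file())
    for path in files:
        upload(path)  # raises on failure, which skips the deletion below
    for path in files:
        path.unlink()
    return [p.name for p in files]


def make_local_uploader(target_dir: Path) -> Callable[[Path], None]:
    """Stand-in for a cloud storage client: copies files into target_dir."""
    target_dir.mkdir(parents=True, exist_ok=True)

    def upload(path: Path) -> None:
        shutil.copy2(path, target_dir / path.name)

    return upload
```

In practice the `upload` callable would wrap whatever client the chosen cloud storage service provides.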
Fig. 2 is a schematic diagram of the main modules of an apparatus for extracting data according to an embodiment of the present invention. As shown in Fig. 2, the apparatus 20 for extracting data mainly includes a saving module 21, a listening module 22, an execution module 23, and a processing module 24. The saving module 21 saves data extraction tasks. The listening module 22 detects whether the saving module 21 has saved a new data extraction task. The execution module 23, when the listening module 22 detects that a new data extraction task has been saved, executes the new task to extract data from the data source and obtain a result file of the data extraction. The processing module 24 sends the result file to a storage device so that the user can obtain it from the storage device. The processing module 24 can also save the result file in a temporary storage directory, upload the data in the temporary storage directory to a cloud storage device, and then delete the data in the temporary storage directory.
The apparatus 20 for extracting data may also include a receiving module and a generation module (not shown). The receiving module receives a data extraction statement through a form. The generation module generates a data extraction task from the data extraction statement.
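The module structure of Fig. 2 might be wired together roughly as follows. This is a sketch under stated assumptions: the class names are hypothetical, tasks are kept in memory, and the execution and processing modules are reduced to plain callables so that the interaction between modules 21 to 24 stays visible.

```python
from typing import Callable, List, Optional


class SavingModule:
    """Module 21: stores extraction tasks (in memory for this sketch)."""

    def __init__(self) -> None:
        self._tasks: List[str] = []
        self._next = 0

    def save(self, task: str) -> None:
        self._tasks.append(task)

    def next_unseen(self) -> Optional[str]:
        """Return the oldest task not yet handed out, or None."""
        if self._next < len(self._tasks):
            task = self._tasks[self._next]
            self._next += 1
            return task
        return None


class ListeningModule:
    """Module 22: checks whether the saving module holds a new task."""

    def __init__(self, saving: SavingModule) -> None:
        self._saving = saving

    def poll(self) -> Optional[str]:
        return self._saving.next_unseen()


class ExtractionApparatus:
    """Apparatus 20: wires modules 21 to 24 together."""

    def __init__(self, execute: Callable[[str], str], deliver: Callable[[str], None]) -> None:
        self.saving = SavingModule()
        self.listening = ListeningModule(self.saving)
        self._execute = execute  # stands in for execution module 23
        self._deliver = deliver  # stands in for processing module 24

    def run_once(self) -> bool:
        """Handle at most one new task; return True if a task was handled."""
        task = self.listening.poll()
        if task is None:
            return False
        self._deliver(self._execute(task))
        return True
```

Passing the execution and delivery steps in as callables keeps the sketch independent of any particular database or storage service.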
According to the technical solution of the embodiments of the present invention, data extraction tasks are saved in advance, the saved tasks are monitored, each detected task is executed, and the data obtained by executing the task is then made available for the user to download. The combination of these steps allows data extraction to be completed essentially automatically: the data mining engineer only needs to enter a data extraction statement in the human-machine interface according to the business team's data extraction requirements, and the business team can then obtain the data from a storage device, such as a cloud storage device, without further attention from the engineer. In this solution, the data extracted from the data source is first stored in a temporary directory; after the data has been transferred to the cloud storage device, which offers higher security, the contents of the temporary directory are deleted, which helps ensure data security.
The basic principles of the present invention have been described above with reference to specific embodiments. It should be noted, however, that those of ordinary skill in the art will understand that all or any of the steps or components of the method and apparatus of the present invention can be implemented in hardware, firmware, software, or a combination thereof, in any computing device (including processors, storage media, and the like) or network of computing devices, and that they can do so with their basic programming skills after reading the description of the present invention.
The object of the present invention can therefore also be achieved by running a program or a set of programs on any computing device. The computing device can be a well-known general-purpose device. The object of the present invention can thus also be achieved merely by providing a program product containing program code that implements the method or apparatus. That is, such a program product also constitutes the present invention, and a storage medium storing such a program product also constitutes the present invention. Obviously, the storage medium can be any well-known storage medium or any storage medium developed in the future.
It should also be noted that in the apparatus and method of the present invention, the components or steps can obviously be decomposed and/or recombined. Such decomposition and/or recombination should be regarded as equivalent solutions of the present invention. In addition, the steps of the series of processes described above can naturally be performed in chronological order in the order described, but they need not be. Some steps can be performed in parallel or independently of one another.
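As one illustration of steps running in parallel, independent extraction tasks could be executed concurrently with a thread pool. The sketch below is an assumption about how such parallelism might be arranged, not something the text prescribes; `run_tasks_in_parallel` and `run_task` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def run_tasks_in_parallel(
    task_ids: Iterable[str],
    run_task: Callable[[str], str],
    max_workers: int = 4,
) -> Dict[str, str]:
    """Run independent extraction tasks concurrently and map each ID to its result."""
    ids = list(task_ids)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_task, ids))  # map preserves input order
    return dict(zip(ids, results))
```

Threads suit this case because each task spends most of its time waiting on the database rather than computing.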
The specific embodiments above do not limit the scope of protection of the present invention. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can occur depending on design requirements and other factors. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (2)

  1. A method for extracting data, characterized by comprising:
    receiving a data extraction statement through a form, and then generating a data extraction task from the data extraction statement, wherein the data extraction statement is a data extraction statement of the database used by a data source, and the data extraction task is a data extraction task of that database;
    saving the data extraction task;
    when it is detected that a new data extraction task has been saved, executing the new data extraction task to extract data from the data source and obtain a result file of the data extraction;
    sending the result file to a storage device so that a user can obtain the result file from the storage device;
    wherein the step of sending the result file to the storage device comprises: saving the result file in a temporary storage directory; uploading the data in the temporary storage directory to a cloud storage device, and then deleting the data in the temporary storage directory.
  2. An apparatus for extracting data, characterized by comprising:
    a saving module for saving data extraction tasks;
    a listening module for detecting whether the saving module has saved a new data extraction task;
    an execution module for executing, when the listening module detects that a new data extraction task has been saved, the new data extraction task to extract data from a data source and obtain a result file of the data extraction;
    a processing module for sending the result file to a storage device so that a user can obtain the result file from the storage device;
    and further comprising a receiving module and a generation module, wherein:
    the receiving module is configured to receive a data extraction statement through a form;
    the generation module is configured to generate a data extraction task from the data extraction statement, wherein the data extraction statement is a data extraction statement of the database used by the data source, and the data extraction task is a data extraction task of that database;
    the processing module is further configured to: save the result file in a temporary storage directory; upload the data in the temporary storage directory to a cloud storage device, and then delete the data in the temporary storage directory.
CN201410467821.6A 2014-09-15 2014-09-15 The method and apparatus for extracting data Active CN104268172B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410467821.6A CN104268172B (en) 2014-09-15 2014-09-15 The method and apparatus for extracting data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410467821.6A CN104268172B (en) 2014-09-15 2014-09-15 The method and apparatus for extracting data

Publications (2)

Publication Number Publication Date
CN104268172A CN104268172A (en) 2015-01-07
CN104268172B true CN104268172B (en) 2018-06-26

Family

ID=52159694

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410467821.6A Active CN104268172B (en) 2014-09-15 2014-09-15 The method and apparatus for extracting data

Country Status (1)

Country Link
CN (1) CN104268172B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018203351A1 (en) * 2017-05-05 2018-11-08 Vidhi Techinnovation Opportunities Network Private Limited A method and system for extraction of event data from user devices

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101556586A (en) * 2008-04-07 2009-10-14 华为技术有限公司 Method, system and device of automatic data collection
CN102012912A (en) * 2010-11-19 2011-04-13 清华大学 Management method for unstructured data based on cloud computing environment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8775544B2 (en) * 2009-02-04 2014-07-08 Citrix Systems, Inc. Methods and systems for dynamically switching between communications protocols

Also Published As

Publication number Publication date
CN104268172A (en) 2015-01-07

Similar Documents

Publication Publication Date Title
CN107784026B (en) ETL data processing method and device
WO2018059056A1 (en) Service system data processing method and device
CN107463709A (en) ETL processing method and device based on multiple data sources
CN108628906A (en) Short text template mining method and device, electronic equipment, and readable storage medium
CN105095059A (en) Method and device for automated testing
US20150370616A1 (en) Method and system for recommending computer products on the basis of observed usage patterns of a computational device of known configuration
CN104915359A (en) Theme label recommending method and device
CN106411650A (en) Distributed security and confidentiality checking method
CN109241247A (en) Problem processing method, system and server for multi-party collaboration projects
CN104268172B (en) Method and apparatus for extracting data
CN109191078A (en) Traffic flow modeling method, device and equipment
CN105312275A (en) Ultrasonic cleaner with cloud service function
CN102999566B (en) Method and apparatus for removing device usage traces
CN102999565B (en) Method and device for cleaning device usage traces
CN112699314A (en) Hot event determination method and device, electronic equipment and storage medium
US10885013B2 (en) Automated application lifecycle tracking using batch processing
US20180189038A1 (en) Automatic conversion of application program code listing segments for off-line environment
CN104539449A (en) Handling method and related device for fault information
CN107436883A (en) Method, apparatus and system for data extraction based on complementation
CN106484912A (en) Processing method and apparatus for cloud disk resources
CN110851519A (en) Method for processing data through ETL tool based on NLP natural language
CN102368235B (en) Method and system for associating words with numerical information in an input method
CN113347075A (en) WeChat group message response method and device
CN105512230A (en) Data storage method and device
CN105512232A (en) Data storage method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20191128

Address after: 100176 room 222, 2f, building C, No. 18, Kechuang 11th Street, Beijing Economic and Technological Development Zone, Daxing District, Beijing

Patentee after: Beijing Jingdong three hundred and sixty degree e-commerce Co., Ltd.

Address before: 100195 1-4 layer, 1-4 layer, western section of 11C building, building, West District, Haidian District, Beijing, China

Co-patentee before: Beijing Jingdong Century Commerce Co., Ltd.

Patentee before: Beijing Jingdong Shangke Information Technology Co., Ltd.