CN110515969A - Data query processing method, electronic device, computer equipment and storage medium - Google Patents


Info

Publication number
CN110515969A
Authority
CN
China
Prior art keywords
data
mapping function
query result
query
processing method
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910608830.5A
Other languages
Chinese (zh)
Inventor
刘行行
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Ping An Life Insurance Company of China Ltd


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2452 Query translation
    • G06F16/2455 Query execution
    • G06F16/24564 Applying rules; Deductive queries
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Abstract

The invention discloses a data query processing method, electronic device, computer equipment, and storage medium. The method comprises: receiving a query request from a user; converting the query request, based on a data warehouse tool, to obtain a corresponding MapReduce task; setting a mapping function and a reduction function based on the MapReduce task; and obtaining a query result based on the mapping function and the reduction function. The data query processing method, electronic device, computer equipment, and storage medium provided by the embodiments of the present invention can solve the prior-art problem that Hive cannot be run effectively for data processing.

Description

Data query processing method, electronic device, computer equipment and storage medium
Technical field
The present invention relates to the technical field of data query on the Hadoop platform, and in particular to a data query processing method, electronic device, computer equipment, and storage medium.
Background art
The Hadoop platform is an open-source software platform implemented by the Apache Foundation on the MapReduce parallel processing model. It scales well and can be deployed simply and quickly on clusters of tens to thousands of computers, so that massive data can be processed efficiently and in parallel in batch mode. MapReduce is a computation model, framework, and platform for the parallel processing of big data; it groups unordered data according to certain features and then processes the groups to obtain the final result. A user only needs to write processing programs that follow the MapReduce programming paradigm in order to process data stored in distributed key-value form.
However, compared with database query languages such as SQL, writing such data processing programs remains complicated for ordinary users who lack a professional database background, and it hinders communication and coordination between users. Hive is a data warehouse tool based on Hadoop: it can map structured data files to database tables, provides simple SQL query functionality, and can convert SQL statements into MapReduce tasks for execution.
In the prior art, the data volume on a data service platform is typically on the order of one billion records, for example Jin Guanjia (Golden Housekeeper) data, event-tracking data, TalkingData data, Webtrends data, and ubars data, and many of the tables have no partitions, indexes, or primary keys. Querying a single piece of data often takes several hours or cannot complete at all, and scripts run slowly, so the data cannot be processed effectively through Hive.
Therefore, providing a way to process large batches of data effectively based on Hive tasks is a technical problem to be solved urgently.
Summary of the invention
In view of this, the present invention proposes a data query processing method, electronic device, computer equipment, and storage medium, aiming to solve the prior-art problem that large batches of data cannot be processed effectively based on Hive tasks.
First, to achieve the above object, the present invention proposes a data query processing method, the method comprising the steps of:
receiving a query request from a user;
converting the query request based on a data warehouse tool to obtain a corresponding MapReduce task;
setting a mapping function and a reduction function based on the MapReduce task;
obtaining a query result based on the mapping function and the reduction function.
Further, the step of converting the query request based on the data warehouse tool to obtain a corresponding MapReduce task comprises:
obtaining the partition description information corresponding to the data warehouse tool in the database cluster;
generating the MapReduce task according to the query request, the data warehouse tool, and the partition description information.
Further, the step of obtaining a query result based on the mapping function and the reduction function comprises:
assigning a mapping function to each partition table to obtain a first query result;
feeding the first query result back to the reduction function, and obtaining a second query result from the first query result;
determining the second query result as the query result corresponding to the query request.
Further, the step of assigning a mapping function to each partition table to obtain a first query result comprises:
assigning a mapping function to each partition;
executing the mapping function according to the set execution parameter and time parameter to obtain the first query result.
Further, the step of assigning a mapping function to each partition comprises:
converting each partition table of the database cluster into a corresponding input split according to the partitions and the input format of the database cluster in Hadoop;
for each input split, starting the pre-assigned mapping function to execute the query on the corresponding partition.
Further, the step of assigning a mapping function to each partition table to obtain a first query result comprises:
for each mapping function, generating a partition query statement according to the query condition corresponding to the partition table;
reading data from the input split corresponding to the partition table by means of the partition query statement, as the first query result.
Further, the step of executing the mapping function according to the set execution parameter and time parameter to obtain the first query result comprises:
calculating, from the historical execution logs, the mean execution duration of MapReduce jobs and the mean of the mapping function;
setting the execution parameter and the time parameter based on the mean execution duration and the mean of the mapping function;
executing the mapping function according to the set execution parameter and time parameter to obtain the first query result.
In addition, to achieve the above object, the present invention further provides an electronic device, the device comprising:
a receiving module, configured to receive a query request from a user;
a conversion module, configured to convert the query request based on a data warehouse tool to obtain a corresponding MapReduce task;
a setup module, configured to set a mapping function and a reduction function based on the MapReduce task;
an enquiry module, configured to obtain a query result based on the mapping function and the reduction function.
Further, the conversion module comprises:
an acquisition submodule, configured to obtain the partition description information corresponding to the data warehouse tool in the database cluster;
a generation submodule, configured to generate the MapReduce task according to the query request, the data warehouse tool, and the partition description information.
Further, the enquiry module comprises:
a distribution submodule, configured to assign a mapping function to each partition table to obtain a first query result;
a feedback submodule, configured to feed the first query result back to the reduction function and obtain a second query result from the first query result;
a determination submodule, configured to determine the second query result as the query result corresponding to the query request.
Further, the distribution submodule comprises:
an allocation unit, configured to assign a mapping function to each partition;
an execution unit, configured to execute the mapping function according to the set execution parameter and time parameter to obtain the first query result.
Further, the allocation unit is specifically configured to: convert each partition table of the database cluster into a corresponding input split according to the partitions and the input format of the database cluster in Hadoop; and, for each input split, start the pre-assigned mapping function to execute the query on the corresponding partition.
Further, the distribution submodule is configured to: for each mapping function, generate a partition query statement according to the query condition corresponding to the partition table; and read data from the input split corresponding to the partition table by means of the partition query statement, as the first query result.
Further, the execution unit is configured to: calculate, from the historical execution logs, the mean execution duration of MapReduce jobs and the mean of the mapping function; set the execution parameter and the time parameter based on the mean execution duration and the mean of the mapping function; and execute the mapping function according to the set execution parameter and time parameter to obtain the first query result.
In addition, to achieve the above object, the present invention further provides a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of any one of the above data query processing methods when executing the computer program.
In addition, to achieve the above object, the present invention further provides a storage medium on which a computer program is stored, wherein the steps of any one of the above data query processing methods are implemented when the computer program is executed by a processor.
Compared with the prior art, the data query processing method, device, equipment, and storage medium proposed by the present invention receive a query request from a user; convert the query request based on the data warehouse tool Hive to obtain a corresponding MapReduce task; and set a mapping function and a reduction function. The mapping function Map maps one set of key-value pairs to a new set of key-value pairs, and a concurrent reduction function Reduce is specified to ensure that all mapped key-value pairs sharing the same key are grouped together, from which the query result is obtained. Because every user request is converted into a MapReduce task, and the mapping function and reduction function are then set based on that task, large amounts of data in a database that has no partitions, indexes, or primary keys can be split into small-granularity data and processed in a loop, so that large batches of data can be processed effectively based on Hive tasks. This solves the prior-art problem that data cannot be processed effectively through Hive.
Brief description of the drawings
Fig. 1 is the flow diagram of the data query processing method of first embodiment of the invention;
Fig. 2 is the flow diagram of the data query processing method of second embodiment of the invention;
Fig. 3 is the flow diagram of the data query processing method of third embodiment of the invention;
Fig. 4 is a schematic diagram of an optional application environment of the electronic device of an embodiment of the present invention;
Fig. 5 is the hardware structure schematic diagram of the electronic device of first embodiment of the invention;
Fig. 6 is the program module schematic diagram of the electronic device of first embodiment of the invention;
Fig. 7 is the program module schematic diagram of the electronic device of second embodiment of the invention;
Fig. 8 is the program module schematic diagram of the electronic device of third embodiment of the invention.
Appended drawing reference:
The realization of the objects, the functional features, and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Specific embodiment
In order to make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are intended only to explain the present invention, not to limit it. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
It should be noted that descriptions involving "first", "second", and the like in the present invention are for descriptive purposes only and shall not be understood as indicating or implying relative importance or as implicitly indicating the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with one another, provided that the combination can be realized by those of ordinary skill in the art; where a combination of technical solutions is contradictory or cannot be realized, the combination shall be deemed not to exist and is not within the protection scope claimed by the present invention.
To solve the problems of the prior art, as shown in Fig. 1, an embodiment of the present invention provides a data query processing method comprising the following steps:
S101: receive a query request from a user.
In this embodiment of the present invention, the user enters the query statement corresponding to the query request through the interactive interface of the primary database and submits it; once the query statement has been submitted, the Hadoop platform receives the query request sent by the user.
S102: convert the query request based on a data warehouse tool to obtain a corresponding MapReduce task.
It should be noted that Hive is a data warehouse tool based on Hadoop that can map structured data files to database tables and provides simple SQL query functionality. Hive is a data warehouse infrastructure built on Hadoop; it can be used for data extraction, transformation, and loading (ETL) with its preset tools. In addition, Hive defines a simple SQL-like query language, called HQL, which allows users familiar with SQL to query the data.
In a specific implementation of this embodiment, the subquery statements in a parallel query statement may be split. The splitting process is as follows: first, a custom function, such as a split function, is created and stored in the database, for example with a statement beginning "CREATE FUNCTION mysplit--". The function splits the string to be segmented at a delimiter and extracts the substrings in a specified order; the extracted strings are then converted into at least one MapReduce task.
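The delimiter-based splitting described above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's `mysplit` function: the name `split_query`, the default `;` delimiter, and the sample statements are assumptions, and the sketch does not handle delimiters that appear inside string literals.

```python
def split_query(statement: str, delimiter: str = ";") -> list[str]:
    """Split a compound query string into sub-queries, keeping their order.

    Hypothetical helper: a naive split that ignores quoting rules.
    """
    pieces = (piece.strip() for piece in statement.split(delimiter))
    # Drop empty fragments produced by trailing or repeated delimiters.
    return [piece for piece in pieces if piece]


compound = "SELECT * FROM t1 WHERE d = '2019-07-01'; SELECT COUNT(*) FROM t2"
sub_queries = split_query(compound)
for index, sub_query in enumerate(sub_queries):
    print(index, sub_query)
```

Each extracted sub-query would then be handed to the conversion step that produces a MapReduce task.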
S103: set a mapping function and a reduction function based on the MapReduce task.
In one implementation, for each MapReduce task, the parallel computation is programmed with a mapping function Map and a reduction function Reduce, which provide abstract operations and a parallel programming interface so that large-scale data programming and computation can be completed simply and conveniently. Specifically, each subquery statement corresponds to one processing stage of the mapping function Map and the reduction function Reduce.
S104: obtain a query result based on the mapping function and the reduction function.
In one implementation of the present invention, the split subquery statements form a MapReduce operation flow consisting of two stages, the mapping function Map and the reduction function Reduce. The mapping function Map decomposes a complex task into multiple tasks, so that the scale of data or computation of each task is much smaller than that of the original task; each task can be assigned to the node that stores the required data for computation; and the tasks have no dependencies on one another.
The reduction function Reduce then aggregates the results of the Map stage. The reduction function Reduce can be redefined according to the results of the mapping function Map; according to this embodiment, the user can determine the number of reduction functions according to actual conditions, and multiple reduction functions can be run in parallel to obtain the results.
Each consecutive pair of operations (that is, a mapping function Map operation and a reduction function Reduce operation) is processed according to the MapReduce rules, and in each stage the corresponding Map and Reduce tasks are assigned to the computers in the cluster, thereby obtaining the query result.
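The two-stage Map and Reduce flow can be illustrated with a small in-memory sketch. It is not Hadoop code: `run_map_reduce`, `map_channel`, `reduce_count`, and the sample rows are hypothetical names and data, and the grouping step stands in for the shuffle that a real cluster performs between the two stages.

```python
from collections import defaultdict


def run_map_reduce(records, map_fn, reduce_fn):
    """Run one Map stage and one Reduce stage over in-memory records."""
    groups = defaultdict(list)
    # Map stage: every record is turned into zero or more (key, value) pairs.
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)  # group values that share a key
    # Reduce stage: each key group is collapsed into a single result.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def map_channel(record):
    """Emit (channel, 1) for each row, in the style of a counting query."""
    yield record["channel"], 1


def reduce_count(key, values):
    """Sum the counts collected for one key."""
    return sum(values)


rows = [{"channel": "app"}, {"channel": "web"}, {"channel": "app"}]
result = run_map_reduce(rows, map_channel, reduce_count)
print(result)
```

Because the map calls do not depend on one another, they are the part that a cluster can distribute across nodes.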
Compared with the prior art, the data query processing method proposed by the present invention receives a query request from a user; converts the query request based on the data warehouse tool Hive to obtain a corresponding MapReduce task; and sets a mapping function and a reduction function. The mapping function Map maps one set of key-value pairs to a new set of key-value pairs, and a concurrent reduction function Reduce is specified to ensure that all mapped key-value pairs sharing the same key are grouped together, from which the query result is obtained. Because every user request is converted into a MapReduce task, and the mapping function and reduction function are then set based on that task, large amounts of data in a database that has no partitions, indexes, or primary keys can be split into small-granularity data and processed in a loop, so that large batches of data can be processed effectively based on Hive tasks. This solves the prior-art problem that data cannot be processed effectively through Hive.
In one embodiment of the present invention, the step of converting the query request based on the data warehouse tool to obtain a corresponding MapReduce task, as shown in Fig. 2, comprises:
S201: obtain the partition description information corresponding to the data warehouse tool in the database cluster.
S202: generate the MapReduce task according to the query request, the data warehouse tool, and the partition description information.
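Steps S201 and S202 can be pictured as building a task description from the query and the partition metadata. The sketch below is an assumption about one possible shape for that data: `PartitionDescription`, `MapReduceTask`, `generate_task`, and the key-range fields are hypothetical and do not come from Hive or Hadoop.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PartitionDescription:
    """Hypothetical partition description: a table name and its key range."""
    table: str
    low: int   # inclusive lower bound of the partition key
    high: int  # exclusive upper bound of the partition key


@dataclass
class MapReduceTask:
    """Hypothetical task plan: one map function per listed partition."""
    query: str
    partitions: list[PartitionDescription] = field(default_factory=list)

    @property
    def num_map_tasks(self) -> int:
        return len(self.partitions)


def generate_task(query: str, descriptions: list[PartitionDescription]) -> MapReduceTask:
    """Combine the query request with the partition description information."""
    ordered = sorted(descriptions, key=lambda description: description.low)
    return MapReduceTask(query=query, partitions=ordered)


task = generate_task(
    "SELECT channel, COUNT(*) FROM orders GROUP BY channel",
    [PartitionDescription("orders_p1", 1000, 2000), PartitionDescription("orders_p0", 0, 1000)],
)
print(task.num_map_tasks, [p.table for p in task.partitions])
```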
It should be noted that Hive can map structured data files to database tables. The database cluster can therefore be described by partition description information to refine the granularity, and the MapReduce task is then generated from the query request on a per-partition basis, so that the data query can be carried out in each partition. For the specific query process, the present invention provides a specific embodiment, as shown in Fig. 3, comprising:
S301: assign a mapping function to each partition table to obtain a first query result.
It should be noted that the mapping function returns an iterator. By executing each MapReduce task, the objects are iterated until the elements that satisfy the function's condition are obtained, and the results are passed to this iterator, thereby obtaining the first query result. In this embodiment of the present invention, the mapping function specifically maps one set of key-value pairs to a new set of key-value pairs, and this new set of key-value pairs is the first query result.
In a specific embodiment of the present invention, step S301 is implemented by: assigning a mapping function to each partition; and executing the mapping function according to the set execution parameter and time parameter to obtain the first query result.
In a specific embodiment of the present invention, the mapping function is executed with the set execution parameter and execution time; the correspondence between mapping functions and partitions then improves overall query efficiency, thereby reducing the time required for data processing.
To make the execution parameter and time parameter more reasonable, this embodiment calculates, from the historical execution logs, the mean execution duration of MapReduce jobs and the mean of the mapping function; sets the execution parameter and time parameter based on the mean execution duration and the mean of the mapping function; and executes the mapping function according to the set execution parameter and time parameter to obtain the first query result.
This embodiment can obtain the execution times of the corresponding mapping function and reduction function from the mean execution duration of historical MapReduce jobs within a preset time period. The embodiment can therefore predict the mean running time of the mapping functions that may be generated and the mean execution time of the reduction function.
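The averaging over historical execution logs can be sketched as follows. The log layout (`job_seconds`, `map_seconds`), the function names, and the `safety_factor` rule for turning means into time limits are all assumptions made for illustration; the source does not say how the parameters are derived from the means.

```python
from statistics import mean


def mean_durations(history):
    """Return (mean job duration, mean map duration) from history records."""
    job_mean = mean(entry["job_seconds"] for entry in history)
    map_mean = mean(entry["map_seconds"] for entry in history)
    return job_mean, map_mean


def derive_parameters(history, safety_factor=1.5):
    """Hypothetical rule: scale each historical mean by a safety factor."""
    job_mean, map_mean = mean_durations(history)
    return {
        "map_timeout_seconds": map_mean * safety_factor,
        "job_timeout_seconds": job_mean * safety_factor,
    }


history = [
    {"job_seconds": 100, "map_seconds": 20},
    {"job_seconds": 140, "map_seconds": 40},
]
parameters = derive_parameters(history)
print(parameters)
```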
In a specific embodiment of the present invention, each mapping function generates a partition query statement according to the query condition corresponding to its partition table, and uses the partition query statement to read data from the input split corresponding to the partition table as the first query result.
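A per-partition query statement can be sketched with Python's built-in `sqlite3` module standing in for the real database cluster. The table `orders_p0`, its columns, and the helper names are hypothetical; the identifier check is a simple guard added for the sketch because table names cannot be bound as SQL parameters.

```python
import sqlite3


def build_partition_query(partition_table: str, condition: str) -> str:
    """Build the query statement for one partition table (hypothetical schema)."""
    if not partition_table.isidentifier():
        raise ValueError(f"unsafe partition table name: {partition_table!r}")
    return f"SELECT user_id, amount FROM {partition_table} WHERE {condition}"


def map_partition(conn, partition_table, condition, params):
    """Run the partition query and return its rows as the first query result."""
    statement = build_partition_query(partition_table, condition)
    return conn.execute(statement, params).fetchall()


# An in-memory database plays the role of a single partition table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_p0 (user_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders_p0 VALUES (?, ?)", [(1, 5.0), (2, 50.0)])

first_result = map_partition(conn, "orders_p0", "amount > ?", (10,))
print(first_result)
```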
In another implementation, the step of assigning a mapping function to each partition comprises: converting each partition table of the database cluster into a corresponding input split according to the partitions and the input format of the database cluster in Hadoop; and, for each input split, starting the pre-assigned mapping function to execute the query on the corresponding partition.
It can be understood that, after obtaining the MapReduce execution plan, the job tracker in Hadoop converts each partition table of the database cluster into its corresponding input split according to the partition description information and the input format of the database cluster in Hadoop. Taking a MySQL database as an example, the job tracker splits each partition table into one input split according to the partition description information and the MySQL input format.
S302: feed the first query result back to the reduction function, and obtain a second query result from the first query result.
A concurrent reduction function is then specified to ensure that all key-value pairs produced by the mapping functions that share the same key are grouped together. Thus, by screening the key-value pairs in the first query result, the groups with identical keys in the first query result are obtained as the second query result.
S303: determine the second query result as the query result corresponding to the query request.
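Grouping the first query results by key can be sketched with `itertools.groupby`. The function name `reduce_first_results`, the summing reducer, and the sample pairs are assumptions for illustration; the input is the list of per-partition key-value pairs that the mapping functions produced.

```python
from itertools import groupby
from operator import itemgetter


def reduce_first_results(first_results):
    """Merge per-partition (key, value) pairs and sum the values per key."""
    merged = sorted(
        (pair for partition in first_results for pair in partition),
        key=itemgetter(0),
    )
    # groupby only groups adjacent items, hence the sort by key above.
    return {
        key: sum(value for _, value in group)
        for key, group in groupby(merged, key=itemgetter(0))
    }


first_results = [
    [("app", 1), ("app", 1)],
    [("web", 1), ("app", 1)],
]
second_result = reduce_first_results(first_results)
print(second_result)
```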
Fig. 4 is a schematic diagram of an optional application environment of the electronic device 40 of the present invention.
In this embodiment, the electronic device 40 can communicate with the terminal device 20 and the database 30 in a wired or wireless manner. The electronic device 40 obtains the input information of the terminal device 20 through the network interface 43, retrieves the corresponding query result from the database 30 according to the obtained input information, and, after processing, sends the query result through the network interface 43 to the display interface of the terminal device 20. The terminal device 20 includes a mobile phone, a tablet, a personal computer, and the like. The database 30 includes at least a data server.
Fig. 5 is a schematic diagram of an optional hardware architecture of the electronic device 40 of the present invention. The electronic device 40 includes, but is not limited to, a memory 41, a processor 42, and a network interface 43 that are communicatively connected to one another through a system bus. Fig. 5 shows only the electronic device 40 with components 41-43; it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.
The memory 41 includes at least one type of readable storage medium, including flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disc, and the like. In some embodiments, the memory 41 may be an internal storage unit of the electronic device 40, such as a hard disk or internal memory of the electronic device 40. In other embodiments, the memory 41 may be an external storage device of the electronic device 40, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the electronic device 40. Of course, the memory 41 may also include both an internal storage unit of the electronic device 40 and an external storage device thereof. In this embodiment, the memory 41 is generally used to store the operating system and the various application software installed on the electronic device 40, such as the program code of the data query processing system 44. In addition, the memory 41 may also be used to temporarily store various data that has been output or is to be output.
In some embodiments, the processor 42 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 42 is generally used to control the overall operation of the electronic device 40. In this embodiment, the processor 42 is used to run the program code stored in the memory 41 or to process data, for example to run the data query processing system 44.
The network interface 43 may include a wireless network interface or a wired network interface, and is generally used to establish a communication connection between the electronic device 40 and other electronic devices.
So far, the hardware structure and functions of the devices related to the present invention have been described in detail. The embodiments of the present invention are presented below on the basis of the above description.
Firstly, the present invention proposes a kind of electronic device 40.
Fig. 6 is a schematic diagram of the program modules of the electronic device 40 according to the first embodiment of the present invention.
In this embodiment, the electronic device 40 includes a series of computer program instructions stored in the memory 41; when the computer program instructions are executed by the processor 42, the data query processing operations of the embodiments of the present invention can be implemented. In some embodiments, the electronic device 40 may be divided into one or more modules based on the specific operations implemented by the respective parts of the computer program instructions. For example, in Fig. 6, the electronic device 40 may be divided into a receiving module 401, a conversion module 402, a setup module 403, and an enquiry module 404, wherein:
the receiving module 401 is configured to receive a query request from a user;
the conversion module 402 is configured to convert the query request based on a data warehouse tool to obtain a corresponding MapReduce task;
the setup module 403 is configured to set a mapping function and a reduction function based on the MapReduce task;
the enquiry module 404 is configured to obtain a query result based on the mapping function and the reduction function.
Further, as shown in fig. 7, the conversion module 402, comprising:
Acquisition submodule 701, for obtaining Tool for Data Warehouse in the corresponding divisional description information of data-base cluster;
Submodule 702 is generated, for according to the inquiry request, the Tool for Data Warehouse and the divisional description information Generate MapReduce task.
Further, as shown in figure 8, the enquiry module 404, comprising:
Distribution sub module 801 distributes a mapping function for respectively each partition table, obtains the first query result;
Feedback submodule 802 is looked into for first query result to be fed back to the reduction function by described first It askes result and obtains the second query result;
It determines submodule 803, is tied for second query result to be determined as inquiry corresponding with the inquiry request Fruit.
Further, the allocation submodule 801 includes an allocation unit and an execution unit (not shown in the figures). The allocation unit is configured to allocate a mapping function to each partition; the execution unit is configured to execute the mapping function according to the set execution parameter and time parameter to obtain the first query result.
Further, the allocation unit is specifically configured to: convert each partition table of the database cluster into a corresponding input split according to the partitions and the input format of the database cluster in Hadoop; and, for each input split, start the pre-allocated mapping function to execute the query of the corresponding partition.
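As a rough illustration of the split conversion, the sketch below models an input split as a small record and looks up a pre-allocated mapping function per split. It does not use the Hadoop API; `InputSplit`, `to_input_splits`, `run_mappers` and the sample tables are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class InputSplit:
    """Hypothetical stand-in for an input split covering one partition table."""
    table: str
    row_count: int


def to_input_splits(partition_tables: Dict[str, List[int]]) -> List[InputSplit]:
    """Convert each partition table into exactly one input split."""
    return [InputSplit(name, len(rows)) for name, rows in partition_tables.items()]


def run_mappers(splits, tables, mappers):
    """Start the mapping function that was pre-allocated to each split."""
    return {s.table: mappers[s.table](tables[s.table]) for s in splits}


# Assumed sample data: two partition tables, each with its own mapping function.
tables = {"p0": [1, 2, 3], "p1": [10, 20]}
mappers: Dict[str, Callable[[List[int]], int]] = {"p0": sum, "p1": max}

splits = to_input_splits(tables)
partial_results = run_mappers(splits, tables, mappers)
print(partial_results)  # {'p0': 6, 'p1': 20}
```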
Further, the allocation submodule 801 is configured to: for each mapping function, generate a partition query statement according to the query condition corresponding to the partition table; and read data from the input split corresponding to the partition table through the partition query statement as the first query result.
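One way to picture the partition query statement is a range-restricted `SELECT` per partition. The sketch below uses an in-memory SQLite database purely as a stand-in for the database cluster; the `orders` table, the `id` ranges and the `build_partition_query` helper are assumptions made for illustration.

```python
import sqlite3


def build_partition_query(table: str, column: str, low: int, high: int):
    """Build a parameterized query restricted to one partition's key range."""
    # Identifiers cannot be bound as parameters, so they are validated instead.
    if not (table.isidentifier() and column.isidentifier()):
        raise ValueError("unsafe identifier")
    sql = (
        f"SELECT * FROM {table} "
        f"WHERE {column} >= ? AND {column} < ? ORDER BY {column}"
    )
    return sql, (low, high)


# Assumed sample data: ten rows split into two key ranges.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(i, i * 10) for i in range(10)])

first_results = []
for low, high in [(0, 5), (5, 10)]:
    sql, params = build_partition_query("orders", "id", low, high)
    # Each mapping function reads only the rows of its own partition.
    first_results.append(conn.execute(sql, params).fetchall())

print([len(rows) for rows in first_results])  # [5, 5]
```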
Further, the execution unit is configured to: calculate, from the historical execution log, the mean execution duration of MapReduce jobs and the mean of the mapping function; set the execution parameter and the time parameter based on the mean duration and the mean of the mapping function; and execute the mapping function according to the set execution parameter and time parameter to obtain the first query result.
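The text does not say how the two means are turned into parameters, so the following sketch rests on stated assumptions: the "mean of the mapping function" is read as the mean number of map tasks per job, the execution parameter is a cap on concurrent map tasks, and the time parameter is a timeout set to a multiple of the mean duration. The log format, `derive_parameters` and `timeout_factor` are hypothetical.

```python
from statistics import mean

# Assumed shape of a historical execution log: one entry per MapReduce job.
HISTORY = [
    {"duration_s": 120.0, "map_tasks": 8},
    {"duration_s": 90.0, "map_tasks": 6},
    {"duration_s": 150.0, "map_tasks": 10},
]


def derive_parameters(history, timeout_factor=2.0):
    """Derive an execution parameter and a time parameter from job history."""
    if not history:
        raise ValueError("history log is empty")
    mean_duration = mean(entry["duration_s"] for entry in history)
    mean_map_tasks = mean(entry["map_tasks"] for entry in history)
    return {
        # Execution parameter (assumption): cap on concurrent mapping functions.
        "max_map_tasks": max(1, round(mean_map_tasks)),
        # Time parameter (assumption): timeout as a multiple of the mean duration.
        "timeout_s": mean_duration * timeout_factor,
    }


params = derive_parameters(HISTORY)
print(params)  # {'max_map_tasks': 8, 'timeout_s': 240.0}
```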
Compared with the prior art, the electronic device provided by the embodiment of the present invention receives a query request from a user, converts the query request based on the data warehouse tool Hive to obtain a corresponding MapReduce task, and sets a mapping function and a reduction function. The mapping function Map maps a set of key-value pairs to a new set of key-value pairs, and the concurrent reduction function Reduce ensures that all mapped key-value pairs sharing the same key are grouped together, so that the query result is obtained. By converting each user request into a MapReduce task and then resetting the mapping function and the reduction function based on that task, a large amount of data in the database that has no partition, index or primary key is split into small-granularity data and processed in cycles. This enables effective large-batch data processing based on Hive tasks and avoids the prior-art problem that data processing cannot be effectively optimized through Hive.
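The key-value behaviour of Map and Reduce described above can be illustrated with a small in-process example. It is a generic MapReduce sketch rather than Hive's actual execution path; the record layout and the `map_fn`, `shuffle` and `reduce_fn` names are assumptions.

```python
from collections import defaultdict


def map_fn(key, value):
    """Map one (record_id, record) pair to a new (region, amount) pair."""
    yield value["region"], value["amount"]


def shuffle(pairs):
    """Group the mapped pairs so that all values sharing a key end up together."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_fn(key, values):
    """Reduce all values that share one key to a single total."""
    return key, sum(values)


# Assumed sample records keyed by record id.
records = {
    1: {"region": "north", "amount": 10},
    2: {"region": "south", "amount": 5},
    3: {"region": "north", "amount": 7},
}

mapped = [pair for key, value in records.items() for pair in map_fn(key, value)]
result = dict(reduce_fn(key, values) for key, values in shuffle(mapped).items())
print(result)  # {'north': 17, 'south': 5}
```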
The present invention also provides a computer device capable of executing programs, such as a smartphone, tablet computer, notebook computer, desktop computer, rack server, blade server, tower server or cabinet server (including an independent server or a server cluster composed of multiple servers). The computer device of this embodiment includes at least, but is not limited to, a memory and a processor that can be communicatively connected to each other through a system bus.
This embodiment also provides a storage medium, such as a flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disc, server or app store, on which a computer program is stored that implements the corresponding functions when executed by a processor. The storage medium of this embodiment is used to store the electronic device 20 and, when executed by a processor, implements the data query processing of the present invention.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the advantages or disadvantages of the embodiments.
Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, magnetic disk or optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, computer, server, air conditioner, network device or the like) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. A data query processing method, characterized in that the method comprises the steps of:
Receiving a query request from a user;
Converting the query request based on a data warehouse tool to obtain a corresponding MapReduce task;
Setting a mapping function and a reduction function based on the MapReduce task;
Obtaining a query result based on the mapping function and the reduction function.
2. The data query processing method according to claim 1, characterized in that the step of converting the query request based on the data warehouse tool to obtain the corresponding MapReduce task comprises:
Obtaining the partition description information of the data warehouse tool corresponding to the database cluster;
Generating the MapReduce task according to the query request, the data warehouse tool and the partition description information.
3. The data query processing method according to claim 1, characterized in that the step of obtaining the query result based on the mapping function and the reduction function comprises:
Allocating a mapping function to each partition table to obtain a first query result;
Feeding the first query result back to the reduction function and obtaining a second query result from the first query result;
Determining the second query result as the query result corresponding to the query request.
4. The data query processing method according to claim 3, characterized in that the step of allocating a mapping function to each partition table to obtain the first query result comprises:
Allocating a mapping function to each partition;
Executing the mapping function according to the set execution parameter and time parameter to obtain the first query result.
5. The data query processing method according to claim 4, characterized in that the step of allocating a mapping function to each partition comprises:
Converting each partition table of the database cluster into a corresponding input split according to the partitions and the input format of the database cluster in Hadoop;
For each input split, starting the pre-allocated mapping function to execute the query of the corresponding partition.
6. The data query processing method according to claim 3, characterized in that the step of allocating a mapping function to each partition table to obtain the first query result comprises:
For each mapping function, generating a partition query statement according to the query condition corresponding to the partition table;
Reading data from the input split corresponding to the partition table through the partition query statement as the first query result.
7. The data query processing method according to claim 4, characterized in that the step of executing the mapping function according to the set execution parameter and time parameter to obtain the first query result comprises:
Calculating, from the historical execution log, the mean execution duration of MapReduce jobs and the mean of the mapping function;
Setting the execution parameter and the time parameter based on the mean duration and the mean of the mapping function;
Executing the mapping function according to the set execution parameter and time parameter to obtain the first query result.
8. An electronic device, characterized in that the device comprises:
A receiving module, configured to receive a query request from a user;
A conversion module, configured to convert the query request based on a data warehouse tool to obtain a corresponding MapReduce task;
A setup module, configured to set a mapping function and a reduction function based on the MapReduce task;
A query module, configured to obtain a query result based on the mapping function and the reduction function.
9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the data query processing method according to any one of claims 1 to 7.
10. A storage medium, characterized in that a computer program is stored in the storage medium, the computer program being executable by at least one processor to cause the at least one processor to execute the steps of the data query processing method according to any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910608830.5A CN110515969A (en) 2019-07-08 2019-07-08 Data query processing method, electronic device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110515969A true CN110515969A (en) 2019-11-29

Family

ID=68622385

Country Status (1)

Country Link
CN (1) CN110515969A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination