Detailed Description of the Embodiments
The embodiments of the present application propose a new data filtering method. Marks are used to represent one or more columns of the raw data, and a filtering rule is built from one or more marks and the screening conditions corresponding to those marks. When data is filtered, the screening conditions can be applied one by one, each operating only on the columns of the raw data specified by its mark, which simplifies the computation and reduces the amount of data involved in it. At the same time, a developer can match the data structure of the raw data by specifying columns for a mark, without modifying statements or converting data structures, which solves the problems in the prior art.
The embodiments of the present application can run on any device with computing and storage capabilities, such as a mobile phone, a tablet computer, a PC (Personal Computer), a notebook computer, or a server. The functions of the embodiments can also be implemented by logical nodes running on two or more devices.
In the embodiments of the present application, the raw data comes from one or more systems, and the raw data of different systems can have the same or different data structures. A filtering rule for raw data is set against the rows of the raw data: for a row record of the raw data, when the value of one or more of its columns meets a given condition, the row record passes the screening and becomes target data.
A filtering rule can be expressed with one or more marks and their corresponding screening conditions. Each mark includes at least one column, and the set of columns included in all marks is the set of all columns that the filtering rule uses. The screening condition corresponding to a mark is defined on the values of the columns the mark specifies (that is, the columns the mark includes) and is one component of the filtering rule. It can be seen that marks and their corresponding screening conditions, together with the operation relations between the screening conditions of two or more marks (for a filtering rule that includes two or more marks), can be used to express all filtering rules. Note that several marks used in the same filtering rule may include the same column, and a filtering rule may also use the same mark more than once; neither is limited.
For two kinds of raw data with different data structures, a developer can specify, for the same mark, the columns that match each data structure. The filtering rule then applies to every kind of raw data without changing the program that implements the filtering rule and without converting the data structure.
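A minimal sketch of this idea follows. It assumes two hypothetical source systems, `system_a` and `system_b`, whose raw data stores the bank code under different column names; the mark name `channel`, the column names and the prefix-based condition are illustrative assumptions, not part of the embodiments.

```python
# Hypothetical mapping: the same mark points at different columns
# depending on which source system produced the raw data.
MARK_COLUMNS = {
    "system_a": {"channel": ["bank_code"]},
    "system_b": {"channel": ["inst_id"]},
}


def mark_values(row, source, mark):
    """Return the values of the columns that `mark` specifies for `source`."""
    return tuple(row[col] for col in MARK_COLUMNS[source][mark])


def passes_channel(row, source):
    """Screening condition written once against the mark, not the columns."""
    (value,) = mark_values(row, source, "channel")
    # Assumed condition for illustration: the code starts with "icbc".
    return value.lower().startswith("icbc")


row_a = {"bank_code": "ICBC701", "amount": 100}  # data structure of system_a
row_b = {"inst_id": "ccb701", "total": 103}      # data structure of system_b
```

Only the `MARK_COLUMNS` entry differs between the two systems; `passes_channel` itself is unchanged, which is the point of binding conditions to marks rather than to column names.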
In the embodiments of the present application, the flow of the data filtering method is shown in Fig. 1.
Step 110: extract from the raw data the columns specified by all marks involved in the filtering rule, to obtain source data.
For raw data from a given source, the columns specified by each mark used in the filtering rule are extracted, using the column assignments that match the data structure of that source, to form the source data used for filtering row records in step 120. The source data is usually a part of the raw data (it may also be the raw data itself). In other words, each row record of the source data is part or all of the corresponding row record of the raw data, and each row record of the source data contains the values of the columns included in every mark used to express the filtering rule.
For example, mark 1 and mark 2 are used in a filtering rule, where mark 1 consists of column A and column B of a raw data table and mark 2 consists of column B and column C of that table. The columns specified by all marks involved in the filtering rule are then column A, column B and column C. These columns are extracted from each row record of the raw data table to obtain source data made up of columns A, B and C.
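The extraction in step 110 can be sketched as follows, using the column A/B/C example above. The dictionary-based row layout, the extra column D and the function names are assumptions made for illustration only.

```python
# Marks from the example: mark 1 = columns A and B, mark 2 = columns B and C.
marks = {"mark1": ["A", "B"], "mark2": ["B", "C"]}


def columns_used(marks):
    """Union of the columns of all marks, in first-seen order."""
    seen = []
    for cols in marks.values():
        for col in cols:
            if col not in seen:
                seen.append(col)
    return seen


def extract_source(raw_rows, marks):
    """Step 110: keep only the columns that some mark specifies."""
    cols = columns_used(marks)
    return [{col: row[col] for col in cols} for row in raw_rows]


# Hypothetical raw data with an extra column D that no mark uses.
raw = [
    {"A": 1, "B": 2, "C": 3, "D": 4},
    {"A": 5, "B": 6, "C": 7, "D": 8},
]
source = extract_source(raw, marks)
```

Column B is shared by both marks but appears only once in the source data, and column D is dropped before any screening condition is evaluated.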
Step 120: according to the filtering rule, filter the row records of the source data by applying the screening condition corresponding to each mark in turn, to obtain target data.
After the source data has been extracted according to the columns specified by the marks, each row record of the source data is filtered by applying the screening conditions of the marks one by one, according to the screening condition of each mark used in the filtering rule and the operation relations between the screening conditions of different marks. The row records that pass the screening of the marks as the filtering rule requires are taken as the target data.
In a practical application scenario, the specific process of filtering the row records of the source data can be determined by the number of marks involved in the filtering rule and, when two or more marks are involved, by the operation relations between their screening conditions. The embodiments of the present application impose no limitation on this. Examples are given below.
In a first application scenario, the filtering rule involves one mark. For each row record in the source data, the screening result is determined by whether the values of the columns specified by the mark meet the screening condition corresponding to the mark. If the row record passes the screening of the mark, it is taken as target data; otherwise the row record is discarded.
In a second application scenario, the filtering rule involves two or more marks, and no mark is reused in the filtering rule (that is, each mark is used only once when the filtering rule is expressed with marks). Suppose the filtering rule involves N marks (N is a natural number greater than 1), denoted mark 1, mark 2, and so on up to mark N. The target data can then be obtained by the following process:
The source data is taken as the pending data of mark 1. For each row record in the pending data, whether the row record passes the screening of mark 1 is determined from the values of the columns specified by mark 1 in that row record and the screening condition corresponding to mark 1. According to whether it passes and to the filtering rule, the row record is taken as target data, taken as processed data, or discarded. Whether a row record that passes the screening of mark 1 becomes target data, becomes processed data, or is discarded depends on the operation relation between the screening condition of mark 1 and the screening conditions of the other marks in the filtering rule;
The processed data of mark 1 is taken as the pending data of mark 2, and the above screening process is repeated using mark 2 (similarly to the processing with mark 1) to obtain the processed data of mark 2;
And so on, until the processed data of mark (N-1) is taken as the pending data of mark N; after the screening by mark N, the target data is obtained.
Two examples in the second application scenario are given below to show how the operation relation between screening conditions affects what is done with a row record when it is screened by a given mark.
In example one, the filtering rule involves a first mark and a second mark, and the filtering rule is: pass the screening of the first mark and pass the screening of the second mark. When data is filtered, the source data is first taken as the pending data of the first mark; the row records in this pending data that pass the screening of the first mark are saved as processed data, and the row records that do not pass are discarded. Next, the processed data of the first mark is taken as the pending data of the second mark; the row records that pass the screening of the second mark are saved as target data, and the row records that do not pass are discarded.
In example two, the filtering rule involves a third mark and a fourth mark, and the filtering rule is: pass the screening of the third mark or pass the screening of the fourth mark. When data is filtered, the source data is first taken as the pending data of the third mark; the row records in this pending data that pass the screening of the third mark are saved as target data, and the row records that do not pass are saved as the processed data of the third mark. Next, the processed data of the third mark is taken as the pending data of the fourth mark; the row records that pass the screening of the fourth mark are saved as target data, and the row records that do not pass are discarded.
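The mark-by-mark process and the two examples above can be sketched as a small staged filter. Representing each stage as a `(predicate, on_pass, on_fail)` triple is an assumption made for this sketch; the embodiments do not prescribe how the operation relations are encoded, and the predicates and rows below are hypothetical.

```python
TARGET, PROCESSED, DISCARD = "target", "processed", "discard"


def run_stages(source_rows, stages):
    """Apply one stage per mark; each stage is (predicate, on_pass, on_fail)."""
    target, pending = [], list(source_rows)
    for predicate, on_pass, on_fail in stages:
        processed = []
        for row in pending:
            outcome = on_pass if predicate(row) else on_fail
            if outcome == TARGET:
                target.append(row)
            elif outcome == PROCESSED:
                processed.append(row)
            # DISCARD: the row is dropped and never examined again.
        pending = processed  # processed data becomes the next mark's pending data
    return target


def first(row):   # hypothetical screening condition of the first (or third) mark
    return row["x"] > 0


def second(row):  # hypothetical screening condition of the second (or fourth) mark
    return row["y"] > 0


# Example one: pass the first mark AND pass the second mark.
and_rule = [(first, PROCESSED, DISCARD), (second, TARGET, DISCARD)]
# Example two: pass the third mark OR pass the fourth mark.
or_rule = [(first, TARGET, PROCESSED), (second, TARGET, DISCARD)]

rows = [
    {"x": 1, "y": 1},
    {"x": 1, "y": -1},
    {"x": -1, "y": 1},
    {"x": -1, "y": -1},
]
and_target = run_stages(rows, and_rule)
or_target = run_stages(rows, or_rule)
```

In both rules the second stage only sees the rows the first stage left as processed data, which is how the amount of data taking part in each later computation shrinks.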
A filtering direction can be applied to a mark in the filtering rule, to specify how the screening condition corresponding to each mark involved is applied. The filtering direction is either forward or reverse. When the filtering direction of a mark is forward, a row record passes the screening of the mark if the values of the columns specified by the mark in that row record meet the screening condition corresponding to the mark. When the filtering direction of a mark is reverse, a row record passes the screening of the mark if the values of the columns specified by the mark in that row record do not meet the screening condition corresponding to the mark.
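The effect of the filtering direction can be sketched as a simple negation of the screening result. The function and constant names, the `bank` column and the condition are hypothetical and only illustrate the forward and reverse cases described above.

```python
FORWARD, REVERSE = "forward", "reverse"


def passes(row, columns, condition, direction=FORWARD):
    """Return True if `row` passes the screening of a mark.

    `columns` are the columns the mark specifies; `condition` is the
    screening condition evaluated on their values.
    """
    met = condition(*(row[c] for c in columns))
    # Forward: pass when the condition is met. Reverse: pass when it is not.
    return met if direction == FORWARD else not met


def is_icbc(bank):
    return bank == "icbc"


row = {"bank": "icbc", "amount": 100}
forward_result = passes(row, ["bank"], is_icbc, FORWARD)
reverse_result = passes(row, ["bank"], is_icbc, REVERSE)
```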
To make it easier to build and use filtering rules from a business perspective, a filtering rule can be expressed with one or more labels. A label describes a business that has a certain feature, and consists of one or more marks and the screening condition corresponding to each mark, where at least one of the columns specified by each mark is related to that feature of the business described by the label.
When a label includes two or more marks, the label also includes the operation relations between the screening conditions of these marks. Similarly, when a filtering rule uses two or more labels, the operation relations between these labels can also be specified. It can be seen that the filtering rule is still essentially determined by the marks, their corresponding screening conditions and the operation relations between the screening conditions. A label is merely a way of organizing marks from a business perspective and does not affect the filtering process applied to the source data as described above.
In application scenarios where labels are used to store marks and their corresponding screening conditions, when a mark is to be used during data filtering, the screening condition corresponding to the mark can be looked up in the label to which the mark belongs, and the row records of the source data are filtered according to the screening condition found. The target data is obtained after all marks have been traversed according to the filtering rule.
In application scenarios where businesses have different levels, a label can be used to describe a subdivided business that has a certain feature, and a tag combination can be used to describe a higher-level business that includes that subdivided business and other subdivided businesses of the same type (that is, other businesses at the same level that share a certain business commonality with it). A tag combination includes at least two labels, and each label is a component of the tag combination; in other words, the labels belonging to the same tag combination are in an OR relation. For example, the ICBC online banking channel, the CCB online banking channel and the ABC online banking channel are three labels, and the online banking channel is a tag combination that includes these three labels.
In application scenarios that use tag combinations, a filtering rule can be expressed with at least one label and/or at least one tag combination. When a mark is to be used during data filtering, the screening condition corresponding to the mark can be looked up in the label or tag combination to which the mark belongs, and the row records of the source data are filtered according to the screening condition found. The target data is obtained after all marks have been traversed according to the filtering rule.
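The relation between marks, labels and tag combinations can be sketched as follows, using the online banking example. The sketch assumes that the marks inside one label are combined with AND (a label may specify other operation relations) and, as stated above, that the labels inside one tag combination are combined with OR. The column names `bank` and `channel` and all label names are hypothetical.

```python
def online_label(bank):
    """Build a label as a list of (mark columns, screening condition) pairs."""
    return [
        (("bank",), lambda b: b == bank),
        (("channel",), lambda c: c == "online"),
    ]


# Labels: each describes one subdivided business.
LABELS = {
    "icbc_online": online_label("icbc"),
    "ccb_online": online_label("ccb"),
    "abc_online": online_label("abc"),
}
# Tag combination: the higher-level business made up of those labels.
TAG_COMBINATIONS = {"online": ["icbc_online", "ccb_online", "abc_online"]}


def passes_label(row, label):
    # Assumption for this sketch: marks within a label are ANDed.
    return all(cond(*(row[c] for c in cols)) for cols, cond in LABELS[label])


def passes_combination(row, name):
    # Labels within a tag combination are in an OR relation.
    return any(passes_label(row, label) for label in TAG_COMBINATIONS[name])


rows = [
    {"bank": "icbc", "channel": "online"},
    {"bank": "ccb", "channel": "quick"},
    {"bank": "abc", "channel": "online"},
]
results = [passes_combination(r, "online") for r in rows]
```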
It can be seen that, in the embodiments of the present application, marks are used to represent one or more columns of the raw data, and marks together with their corresponding screening conditions are used to express the filtering rule. After the columns specified by the marks have been extracted from the raw data, the screening conditions can be applied one by one, each operating only on the columns of the raw data specified by its mark. This simplifies the computation while reducing the amount of data processed, and so speeds up data filtering. At the same time, a developer can match the data structure of the raw data by specifying columns for a mark, without modifying statements or converting data structures, which reduces the developer's workload.
In an application example of the present application, on a third-party payment platform, payments are made between merchants, between users, and between merchants and users through several channels provided by the third-party payment platform and multiple banks. The multiple business systems responsible for processing the different payment services generate a log for each service. These logs serve as raw data: they are classified by bank, channel (for example quick payment or online banking) and funds flow (for example inflow or outflow), the corresponding data is dynamically filtered out of the logs, and the result is presented in monitoring reports such as tables, pie charts and bar charts.
On the third-party payment platform, according to the statistical requirements for the log data, marks are used to describe the columns of the raw data that are relevant to the data filtering and statistics of one or more businesses; that is, one or more columns of the raw data are defined as the columns specified by a mark. When a mark is applied to a data filtering process, a screening condition that meets the needs of that filtering process is set for the mark, and several marks associated with the characteristics of a business, together with their screening conditions, are combined into a label. In application scenarios where the target data is the collection of data obtained by filtering with two or more labels separately, the set of these labels can also be made into a tag combination. One relation between tag combinations, labels and marks is shown in Fig. 2.
The third-party payment platform uses labels and/or tag combinations to express the filtering rule for data filtering. When the filtering rule is expressed, a filtering direction can be specified for a label or tag combination: either the row records that meet the screening conditions corresponding to all marks in the label or tag combination are treated as the row records that pass the screening (forward filtering), or the row records that do not meet the screening conditions corresponding to all marks in the label or tag combination are treated as the row records that pass the screening (reverse filtering). An example configuration of the filtering direction is shown in Fig. 3.
After receiving raw data from a business system, the server responsible for data filtering on the third-party payment platform (hereinafter, the filtering server) writes into a cache the configuration of the filtering rule used to filter that raw data (including the labels and tag combinations used, the marks and their corresponding screening conditions, the operation relations between screening conditions, and so on).
The filtering server reads from the cache the specified columns of all marks used by the filtering rule, extracts these columns from the raw data, and forms the source data to be filtered. According to the operation relations between the marks in the filtering rule, the filtering server starts with the first mark: it takes the source data as the pending data of that mark, queries the cache for the screening condition and filtering direction corresponding to the mark in the label or tag combination to which the mark belongs, and judges row by row, based on the screening condition and filtering direction, whether each row record of the pending data passes the screening of this mark. Then, according to the operation relation with the other marks, it decides whether the row records that pass, and those that do not pass, the screening of this mark are written to the target data, written to the processed data, or discarded. After all row records of the pending data have been traversed with the first mark, the filtering server takes the second mark, uses the processed data of the first mark as the pending data of the second mark, and repeats the above process, until the target data is obtained after processing with the last mark.
For example, the processing for a given mark can be as follows. A virtual mark is formed from the specified columns of the mark, and the cache is searched for a mark consistent with the virtual mark. If there is none, the filtering rule in the cache may have been updated, and the filtering process exits. If there is one, the label or tag combination to which the mark belongs is found, and the screening condition corresponding to the mark in that label or tag combination is applied to process the row records of the pending data.
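A minimal sketch of this lookup follows. The cache layout (a dictionary keyed by a string built from the mark's column names), the `RuleChanged` exception and all column and label names are assumptions made for illustration; the embodiments do not specify how the virtual mark is encoded or how the cache is organized.

```python
class RuleChanged(Exception):
    """Raised when no cached mark matches; the rule may have been updated."""


def virtual_mark(columns):
    """Assemble a virtual mark from the specified columns of a mark."""
    return "|".join(columns)


# Hypothetical cache: virtual mark -> (owning label or tag combination, condition).
cache = {
    virtual_mark(["payer", "amount"]): ("label_large_payment", lambda p, a: a >= 100),
}


def screen(pending_rows, columns, cache):
    """Look up the mark in the cache, then apply its screening condition."""
    key = virtual_mark(columns)
    if key not in cache:
        # No consistent mark found: exit the filtering process.
        raise RuleChanged(key)
    _owner, condition = cache[key]
    return [r for r in pending_rows if condition(*(r[c] for c in columns))]


pending = [
    {"payer": "u1", "amount": 250},
    {"payer": "u2", "amount": 40},
]
kept = screen(pending, ["payer", "amount"], cache)
```

Raising an exception is only one way to "exit the filtering process"; a real filtering server might instead reload the rule configuration and retry.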
For example, trading volume data for all channels obtained from a data source is shown in Table 1, where dimension 2 is the related institution that originated the transactions:
Dimension 1 | Dimension 2 | Trading volume
Fininflux | Icbc9011 | 100
Fininflux | Icbc701 | 101
supergw | ICBC51_ICBCSH010120 | 102
Fininflux | CCb701 | 103
… | … | …
Table 1
If the data whose related institution is ICBC is to be filtered out, dimension 1 and dimension 2 are formed into mark 1, label A is set to include mark 1, and the screening condition corresponding to mark 1 is that dimension 2 is an ICBC channel. When a row record in Table 1 is processed, dimension 1 and dimension 2 are assembled into a virtual mark according to the definition of mark 1, and the cache is searched for a consistent mark. After the consistent mark 1 is found, the screening condition corresponding to mark 1 is looked up in label A, to which mark 1 belongs. The screening condition corresponding to mark 1 is then applied: if dimension 2 of the row record is an ICBC channel, the row record is taken as target data.
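The Table 1 example can be worked through as follows. The text does not say how an ICBC channel is recognized from the value of dimension 2, so this sketch assumes a case-insensitive "icbc" prefix; that rule, the tuple layout of the rows and the dictionary layout of label A are all assumptions. Only the four rows listed in Table 1 are used.

```python
# The four rows shown in Table 1: (dimension 1, dimension 2, trading volume).
TABLE_1 = [
    ("Fininflux", "Icbc9011", 100),
    ("Fininflux", "Icbc701", 101),
    ("supergw", "ICBC51_ICBCSH010120", 102),
    ("Fininflux", "CCb701", 103),
]


def is_icbc_channel(dimension_2):
    # Assumption: ICBC channels start with "icbc", ignoring case.
    return dimension_2.lower().startswith("icbc")


# Label A includes mark 1. Mark 1 specifies dimension 1 and dimension 2
# (tuple positions 0 and 1); its condition only inspects dimension 2.
LABEL_A = {
    "mark 1": {"columns": (0, 1), "condition": lambda d1, d2: is_icbc_channel(d2)},
}


def filter_with_label(rows, label):
    """Apply the screening condition of every mark in the label in turn."""
    target = rows
    for mark in label.values():
        target = [
            r for r in target
            if mark["condition"](*(r[i] for i in mark["columns"]))
        ]
    return target


target = filter_with_label(TABLE_1, LABEL_A)
total_volume = sum(volume for _, _, volume in target)
```

Under the assumed prefix rule, the three ICBC rows become target data and the CCb701 row is discarded.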
Corresponding to the above flow, the embodiments of the present application also provide a data filtering device. The device can be implemented in software, or in hardware or a combination of software and hardware. Taking a software implementation as an example, the device in the logical sense is formed by the CPU (Central Processing Unit) of the host equipment reading the corresponding computer program instructions into memory and running them. At the hardware level, in addition to the CPU, memory and non-volatile storage shown in Fig. 4, the equipment hosting the data filtering device usually also includes other hardware such as chips for wireless signal transmission and reception, and/or other hardware such as boards for network communication.
Fig. 5 shows a data filtering device provided by an embodiment of the present application. The filtering rule is expressed by one or more marks and the screening condition corresponding to each mark, and each mark includes at least one column of the raw data. The device includes a source data extraction unit and a source data filtering unit, where: the source data extraction unit is configured to extract from the raw data the columns specified by all marks involved in the filtering rule, to obtain source data; and the source data filtering unit is configured to filter the row records of the source data according to the filtering rule, applying the screening condition corresponding to each mark in turn, to obtain target data.
In one implementation, the filtering rule involves N marks and each mark is used only once when the filtering rule is expressed, N being a natural number. The source data filtering unit is specifically configured to: take the source data as the pending data of mark 1; for each row record of the pending data, determine whether the row record passes the screening of mark 1 from the values of the columns specified by mark 1 and the screening condition corresponding to mark 1; according to whether it passes and to the filtering rule, take the row record as target data, take it as processed data, or discard it; take the processed data of the previous mark as the pending data of the next mark and repeat the above screening process using the next mark; and obtain the target data after the screening by mark N.
In the above implementation, the filtering rule includes: pass the screening of the first mark and pass the screening of the second mark. The source data filtering unit is specifically configured to: save as processed data the row records in the pending data of the first mark that pass the screening of the first mark; take the processed data of the first mark as the pending data of the second mark; and save as target data the row records that pass the screening of the second mark.
In the above implementation, the filtering rule includes: pass the screening of the third mark or pass the screening of the fourth mark. The source data filtering unit is specifically configured to: save as target data the row records in the pending data of the third mark that pass the screening of the third mark, and save as the processed data of the third mark the row records that do not pass; take the processed data of the third mark as the pending data of the fourth mark; and save as target data the row records that pass the screening of the fourth mark.
In the above implementation, the filtering rule also includes a filtering direction. When the filtering direction is forward, a row record passes the screening of a mark if the values of the columns specified by the mark in that row record meet the screening condition corresponding to the mark. When the filtering direction is reverse, a row record passes the screening of a mark if the values of the columns specified by the mark in that row record do not meet the screening condition corresponding to the mark.
In one example, the filtering rule is expressed by at least one label. A label describes a business that has a certain feature and includes one or more marks and the screening condition corresponding to each mark, where at least one of the columns specified by each mark is related to the feature. The source data filtering unit is specifically configured to: for each mark, look up the screening condition corresponding to the mark in the label to which the mark belongs, and filter the row records of the source data according to that screening condition; and obtain the target data after all marks have been traversed according to the filtering rule.
In the above example, the filtering rule is expressed by at least one label and/or at least one tag combination, where a tag combination includes at least two labels. The source data filtering unit is specifically configured to: for each mark, look up the screening condition corresponding to the mark in the label or tag combination to which the mark belongs, and filter the row records of the source data according to that screening condition; and obtain the target data after all marks have been traversed according to the filtering rule.
The foregoing describes merely preferred embodiments of the present application and is not intended to limit the present application. Any modification, equivalent replacement or improvement made within the spirit and principles of the present application shall fall within the scope of protection of the present application.
In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface and memory.
Memory may include, among computer-readable media, non-permanent memory, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information can be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to the process, method, commodity or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, commodity or device that includes the element.
Those skilled in the art will understand that the embodiments of the present application can be provided as a method, a system or a computer program product. The present application can therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM and optical storage) that contain computer-usable program code.