CN114925092A - Data processing method and device, electronic equipment and storage medium
- Publication number
- CN114925092A (application number CN202210498840.XA)
- Authority
- CN
- China
- Prior art keywords
- nodes
- node
- target
- tree
- rule tree
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24564—Applying rules; Deductive queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present disclosure relates to a data processing method and apparatus, an electronic device, and a storage medium. The method includes: generating an initial rule tree according to a pre-acquired data processing rule; traversing nodes in the initial rule tree, and merging leaf nodes that share the same parent node according to their corresponding label rules to obtain a simplified rule tree; sequentially determining target calculation engines, traversing nodes in the simplified rule tree from top to bottom, and marking any unmarked node that meets the applicable condition of the target calculation engine as a calculation node of that engine, until the root node of the simplified rule tree is marked; converting the simplified rule tree into an execution plan tree; and processing the data to be processed based on the execution plan tree to obtain a processing result. In this way, an appropriate computing engine is selected for each node based on the characteristics of the various computing engines, which improves computing speed and resource utilization.
Description
Technical Field
The present disclosure relates to the field of data analysis, and in particular, to a data processing method and apparatus, an electronic device, and a storage medium.
Background
In many service scenarios, service personnel need to select object groups by object attributes, specifying information such as tag names, operators, and tag values under a custom rule, so that corresponding service activities can be carried out for each object group.
In the existing technical solution, objects are selected in a Hive SQL (Structured Query Language) based manner: all selection conditions are spliced into one SQL statement, and the final result is obtained by executing that statement.
However, for complex selection rules, the spliced SQL statement is long and contains multiple nested queries. When the Hive engine runs such a query, it takes 1-2 hours at best and 10 hours or more at worst; the query is slow, occupies substantial resources, and may block other queries.
Disclosure of Invention
The present disclosure provides a data processing method, an apparatus, an electronic device, and a storage medium, to at least solve the problems of slow query speed, severe resource occupation, and possible blocking of other queries in the related art. The technical scheme of the disclosure is as follows:
according to a first aspect of the embodiments of the present disclosure, there is provided a data processing method, including:
generating an initial rule tree according to a pre-acquired data processing rule, wherein the data processing rule comprises at least one label rule and a logical relationship between the label rules, the initial rule tree comprises leaf nodes and non-leaf nodes, the leaf nodes are used for representing the label rules, and the non-leaf nodes are used for representing the logical relationship between connected child nodes;
traversing the nodes in the initial rule tree in order from bottom to top, and merging leaf nodes that share the same parent node according to their corresponding label rules to obtain a simplified rule tree;
sequentially determining target calculation engines according to a preset priority order, traversing nodes in the simplified rule tree according to the sequence from top to bottom, and marking any unmarked node as a calculation node of the target calculation engine under the condition that any unmarked node meets the applicable condition of the target calculation engine until the marking of the root node in the simplified rule tree is finished;
converting the simplified rule tree into an execution plan tree according to the node marks, wherein leaf nodes of the execution plan tree are used for representing a computing unit, and non-leaf nodes of the execution plan tree are used for representing the logical relation between the connected child nodes;
and processing the data to be processed based on the execution plan tree to obtain a processing result.
Optionally, traversing the nodes in the initial rule tree according to the sequence from bottom to top, and merging the leaf nodes having the same parent node according to the corresponding label rule to obtain a simplified rule tree, where the method includes:
traversing the nodes in the initial rule tree according to the sequence from bottom to top, and taking the currently traversed nodes as target nodes;
under the condition that the child nodes of the target node comprise a plurality of leaf nodes, acquiring label rules corresponding to the leaf child nodes, wherein the label rules comprise label names, operation rules and label values;
and taking the leaf nodes that have the same label name among the included leaf nodes as candidate nodes, and merging the candidate nodes under the condition that the operation rules and the label values of the candidate nodes meet the merging conditions to obtain the simplified rule tree.
Optionally, the traversing the nodes in the simplified rule tree according to the sequence from top to bottom, and if any unmarked node meets the applicable condition of the target computation engine, marking the unmarked node as a computation node of the target computation engine until marking of the root node in the simplified rule tree is completed, includes:
traversing the nodes in the simplified rule tree according to the sequence from top to bottom, and taking the currently traversed nodes as target nodes;
under the condition that the target node is not marked, if the target node meets the applicable condition of the target computing engine, marking the target node as the computing node of the target computing engine;
and under the condition that the target node is marked, traversing the next node until no unmarked child node exists, and marking the root node.
Optionally, under the condition that the target node is not marked, if the target node does not satisfy the applicable condition of the target computing engine, the method further includes:
and under the condition that the lower nodes of the target nodes meet the adjustment conditions of the target computing engine, merging the lower nodes meeting the adjustment conditions into merged sub-nodes of the target nodes, and marking the merged sub-nodes as computing nodes of the target computing engine.
Optionally, the converting the simplified rule tree into an execution plan tree according to the node label includes:
and converting any node in the simplified rule tree into an execution node of a calculation engine corresponding to the any node according to the node mark to obtain an execution plan tree.
According to a second aspect of an embodiment of the present disclosure, there is provided a data processing apparatus including:
the generating unit is configured to execute generating an initial rule tree according to a pre-acquired data processing rule, wherein the data processing rule comprises at least one label rule and a logic relationship between the label rules, the initial rule tree comprises leaf nodes and non-leaf nodes, the leaf nodes are used for representing the label rules, and the non-leaf nodes are used for representing the logic relationship between connected child nodes;
the merging unit is configured to traverse the nodes in the initial rule tree in order from bottom to top, and merge leaf nodes that share the same parent node according to their corresponding label rules to obtain a simplified rule tree;
the marking unit is configured to execute the steps of sequentially determining target calculation engines according to a preset priority order, traversing nodes in the simplified rule tree according to the sequence from top to bottom, and marking any unmarked node as a calculation node of the target calculation engine under the condition that any unmarked node meets the applicable condition of the target calculation engine until the root node in the simplified rule tree is marked;
a conversion unit configured to perform conversion of the simplified rule tree into an execution plan tree according to the node marks, wherein leaf nodes of the execution plan tree are used for representing the calculation unit, and non-leaf nodes of the execution plan tree are used for representing the logic relationship between the connected child nodes;
and the processing unit is configured to execute processing on the data to be processed based on the execution plan tree to obtain a processing result.
Optionally, the merging unit is configured to perform:
traversing the nodes in the initial rule tree according to the sequence from bottom to top, and taking the currently traversed nodes as target nodes;
under the condition that the child nodes of the target node comprise a plurality of leaf nodes, obtaining label rules corresponding to the leaf nodes, wherein the label rules comprise label names, operation rules and label values;
and taking the leaf nodes that have the same label name among the included leaf nodes as candidate nodes, and merging the candidate nodes under the condition that the operation rules and the label values of the candidate nodes meet the merging conditions to obtain the simplified rule tree.
Optionally, the marking unit is further configured to perform:
traversing the nodes in the simplified rule tree according to the sequence from top to bottom, and taking the currently traversed nodes as target nodes;
under the condition that the target node is not marked, if the target node meets the applicable condition of the target computing engine, marking the target node as a computing node of the target computing engine;
and under the condition that the target node is marked, traversing the next node until no unmarked child node exists, and marking the root node.
Optionally, the marking unit is further configured to perform:
under the condition that the target node is not marked, if the target node does not meet the applicable condition of the target computing engine, under the condition that the lower layer node of the target node meets the adjustment condition of the target computing engine, combining the lower layer nodes meeting the adjustment condition into a combined sub-node of the target node, and marking the combined sub-node as the computing node of the target computing engine.
Optionally, the conversion unit is configured to perform:
and converting any node in the simplified rule tree into an execution node of a calculation engine corresponding to the any node according to the node mark to obtain an execution plan tree.
According to a third aspect of embodiments of the present disclosure, there is provided a data processing electronic device including:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the data processing method of any one of the above.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium, in which instructions, when executed by a processor of a data processing electronic device, enable the data processing electronic device to perform any one of the data processing methods described above.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product comprising computer programs/instructions which, when executed by a processor, implement the data processing method of any one of the above.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
generating an initial rule tree according to a pre-acquired data processing rule, wherein the data processing rule comprises at least one label rule and a logical relationship between the label rules, the initial rule tree comprises leaf nodes and non-leaf nodes, the leaf nodes are used for representing the label rules, and the non-leaf nodes are used for representing the logical relationship between connected child nodes; traversing the nodes in the initial rule tree according to the sequence from bottom to top, and combining the leaf nodes with the same father nodes according to the corresponding label rules to obtain a simplified rule tree; sequentially determining target calculation engines according to a preset priority sequence, traversing nodes in the simplified rule tree according to a sequence from top to bottom, and marking any unmarked node as a calculation node of the target calculation engine under the condition that any unmarked node meets the applicable condition of the target calculation engine until the root node in the simplified rule tree is marked; converting the simplified rule tree into an execution plan tree according to the node marks, wherein leaf nodes of the execution plan tree are used for representing a computing unit, and non-leaf nodes of the execution plan tree are used for representing the logical relationship between the connected child nodes; and processing the data to be processed based on the execution plan tree to obtain a processing result.
Therefore, based on the characteristics of various computing engines, on the premise of not changing the operation result, the nodes in the initial rule tree are subjected to structural adjustment such as combination, splitting, moving and the like, and appropriate computing engines are selected for different nodes and are respectively executed to obtain a data processing result, so that the advantages of each computing engine can be fully exerted, and the computing speed and the resource utilization rate are improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is a flow chart illustrating a method of data processing according to an exemplary embodiment.
Fig. 2 is a flow diagram illustrating a process of merging initial rule trees to obtain a simplified rule tree according to an exemplary embodiment.
FIG. 3 is a flow diagram illustrating tagging of a simplified rule tree, according to an example embodiment.
FIG. 4 is a logical schematic diagram illustrating an exemplary embodiment.
FIG. 5 is a block diagram illustrating a data processing device according to an example embodiment.
FIG. 6 is a block diagram illustrating an electronic device for data processing in accordance with an exemplary embodiment.
FIG. 7 is a block diagram illustrating an apparatus for data processing in accordance with an example embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Fig. 1 is a flow chart illustrating a data processing method according to an exemplary embodiment, which includes the following steps, as shown in fig. 1.
In step S11, an initial rule tree is generated according to a pre-obtained data processing rule, where the data processing rule includes at least one label rule and a logical relationship between the label rules, the initial rule tree includes leaf nodes and non-leaf nodes, the leaf nodes are used for representing the label rules, and the non-leaf nodes are used for representing the logical relationship between the connected child nodes.
For example, in the object group selection service, the tag may refer to attributes of the object, such as gender, age, and interest, that is, the tag rule refers to conditions for data processing, and when the service person performs rule selection, the service person may specify a tag name, an operator, and a tag value, such as an age range of the object. One round of selection can comprise a plurality of labels, and the plurality of labels can have a combination relationship of intersection, union, difference and nesting.
In the present disclosure, the data processing rule input by the service personnel is parsed: each label rule is encapsulated into a leaf node, each condition between labels is encapsulated into a non-leaf node, and the initial rule tree is assembled recursively. The initial rule tree includes leaf nodes and non-leaf nodes; the leaf nodes represent label rules, and the non-leaf nodes represent the logical relationships between connected child nodes. For example, a selection rule input by a service person can be mapped to a tree whose leaf nodes are label rules, each including a label name, an operator, and a label value, and whose non-leaf nodes are logical relationships such as intersection, union, and difference.
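The mapping from a selection rule to an initial rule tree can be sketched as follows. This is a minimal illustration only: the class names `TagRule` and `Relation`, the function `build_tree`, and the nested-dictionary input format are assumptions made for the example and do not appear in the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class TagRule:
    """Leaf node: one label rule (label name, operator, label value)."""
    name: str
    op: str
    value: object


@dataclass
class Relation:
    """Non-leaf node: logical relationship between its child nodes."""
    logic: str  # hypothetical encoding: "and" (intersection), "or" (union), "diff"
    children: List["Node"] = field(default_factory=list)


Node = Union[TagRule, Relation]


def build_tree(rule: dict) -> Node:
    """Recursively turn a nested rule description into a rule tree."""
    if "logic" in rule:
        return Relation(rule["logic"], [build_tree(c) for c in rule["children"]])
    return TagRule(rule["name"], rule["op"], rule["value"])


# Hypothetical input format: age >= 18 AND (gender = male OR gender = female)
rule = {
    "logic": "and",
    "children": [
        {"name": "age", "op": ">=", "value": 18},
        {
            "logic": "or",
            "children": [
                {"name": "gender", "op": "=", "value": "male"},
                {"name": "gender", "op": "=", "value": "female"},
            ],
        },
    ],
}
tree = build_tree(rule)
```

Here the outer `Relation` is the root, and nesting in the input corresponds directly to depth in the tree.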
In step S12, the nodes in the initial rule tree are traversed in the order from bottom to top, and the leaf nodes having the same parent node are merged according to the corresponding label rule, so as to obtain a simplified rule tree.
In one implementation, traversing the nodes in the initial rule tree according to a sequence from bottom to top, and merging the leaf nodes having the same parent node according to the corresponding label rule to obtain a simplified rule tree, including:
traversing the nodes in the initial rule tree in order from bottom to top, and taking the currently traversed node as the target node; under the condition that the child nodes of the target node include a plurality of leaf nodes, acquiring the label rules corresponding to the included leaf nodes, wherein each label rule includes a label name, an operation rule, and a label value; and taking the leaf nodes that have the same label name among the included leaf nodes as candidate nodes, and merging the candidate nodes under the condition that the operation rules and the label values of the candidate nodes meet the merging conditions to obtain the simplified rule tree.
FIG. 2 is a schematic flow diagram of merging the initial rule tree to obtain the simplified rule tree. That is, if the child nodes of a node include a plurality of leaf nodes, and some of those leaf nodes have the same label name, it is determined whether the leaf nodes with the same label name satisfy the merging condition; if so, the nodes satisfying the condition are merged into one node. For example, the conditions [gender = male or gender = female] may be combined into [gender in (male, female)].
It can be understood that traversing the initial rule tree sorts out and merges the label rules represented by the leaf nodes. Under the initial rule tree, two leaf nodes representing different label rules require two separate judgment processes; after merging, a single leaf node represents the combined rule, so only one judgment process is needed.
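The merging step can be sketched as a bottom-up pass, shown below under simplifying assumptions. The merging condition used here (sibling leaves under an "or" node with the same label name and an "=" or "in" operator) is only one hypothetical condition chosen to reproduce the gender example; the disclosure leaves the merging conditions open. Replacing a relation node that is left with a single child by that child is also an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class TagRule:
    name: str
    op: str       # only "=" and "in" are merged in this sketch
    value: object


@dataclass
class Relation:
    logic: str
    children: List["Node"] = field(default_factory=list)


Node = Union[TagRule, Relation]


def simplify(node: Node) -> Node:
    """Bottom-up pass: simplify children first, then merge sibling leaves."""
    if isinstance(node, TagRule):
        return node
    children = [simplify(c) for c in node.children]
    if node.logic == "or":
        merged: Dict[str, list] = {}   # label name -> accumulated values
        rest: List[Node] = []
        for c in children:
            if isinstance(c, TagRule) and c.op in ("=", "in"):
                values = list(c.value) if c.op == "in" else [c.value]
                bucket = merged.setdefault(c.name, [])
                for v in values:
                    if v not in bucket:
                        bucket.append(v)
            else:
                rest.append(c)
        leaves = [
            TagRule(n, "=", v[0]) if len(v) == 1 else TagRule(n, "in", tuple(v))
            for n, v in merged.items()
        ]
        children = leaves + rest   # order is irrelevant under "or"
    if len(children) == 1:
        return children[0]         # assumption: drop a one-child relation
    return Relation(node.logic, children)


tree = Relation("and", [
    TagRule("age", ">=", 18),
    Relation("or", [TagRule("gender", "=", "male"),
                    TagRule("gender", "=", "female")]),
])
simplified = simplify(tree)
```

In this example the two gender leaves collapse into one `in` leaf, so the simplified tree has one relation node and two leaves instead of two relation nodes and three leaves.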
In step S13, target calculation engines are determined sequentially according to a preset priority order, the nodes in the simplified rule tree are traversed in order from top to bottom, and any unmarked node that meets the applicable condition of the target calculation engine is marked as a calculation node of the target calculation engine, until the root node in the simplified rule tree is marked.
A computing engine is a tool capable of performing data processing on a given storage medium using a specific method. For different services, customized computing engines can be created in advance to further improve computing efficiency; that is, the applicable conditions of a computing engine can be custom-defined. For example, the ClickHouse bitmap computing engine is suited to enumeration-type rule conditions, such as screening by labels like an object's gender or region; the ClickHouse SQL computing engine is suited to continuous-numeric rule conditions, such as screening by labels like an object's payment amount or online duration.
A bitmap is an efficient data structure that stores data in contiguous binary bits and is used to query and deduplicate large amounts of integer data; it is widely applied in indexing, data compression, and similar areas. ClickHouse is a storage engine characterized by high computing speed, but it is not suited to complex computation.
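To illustrate why a bitmap suits enumeration-type labels, the following sketch uses plain Python integers as bitmaps, with bit i set when object id i carries a label value. It does not use ClickHouse or its bitmap functions; the object ids and label names are made up for the example.

```python
def to_bitmap(ids):
    """Set bit i for every object id i; duplicate ids collapse naturally."""
    bits = 0
    for i in ids:
        bits |= 1 << i
    return bits


def to_ids(bits):
    """List the object ids whose bit is set, in ascending order."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


# Hypothetical label data: which object ids carry each enumerated value.
male = to_bitmap([1, 3, 5, 8, 8])     # the repeated 8 is stored once
north = to_bitmap([3, 4, 8, 9])

both = male & north          # intersection: male AND in the north region
either = male | north        # union: male OR in the north region
only_male = male & ~north    # difference: male but not in the north region
```

Each set operation is a single bitwise operation over the whole population, which is the property that makes bitmap engines fast for intersection, union, and difference over enumerated labels.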
In one implementation, this step may include:
traversing the nodes in the simplified rule tree according to the sequence from top to bottom, and taking the currently traversed nodes as target nodes; under the condition that the target node is not marked, if the target node meets the applicable condition of the target computing engine, marking the target node as the computing node of the target computing engine; and under the condition that the target node is marked, traversing the next node until no unmarked child node exists, and marking the root node.
That is to say, the simplified rule tree is traversed, and an applicable calculation engine is selected for each node according to the specific situation of the node, until every node in the simplified rule tree has been marked, that is, until a corresponding calculation engine has been found for every node. In subsequent processing, the computation of each node can then call its applicable calculation engine; the better the engine fits the node, the faster the computation and the higher the efficiency of object selection.
In one implementation, if the target node does not satisfy the applicable condition of the target computing engine under the condition that the target node is not marked, the method further includes:
and under the condition that the lower nodes of the target nodes meet the adjustment conditions of the target computing engine, merging the lower nodes meeting the adjustment conditions into merged sub-nodes of the target nodes, and marking the merged sub-nodes as computing nodes of the target computing engine. The adjustment conditions are similar to the aforementioned applicable conditions, and can be set by self-definition, which is not limited in this disclosure.
It can be understood that the simplified rule tree is a multi-layer tree structure, and therefore, in the traversal process, when the target node does not satisfy the applicable condition of the target computing engine, the child nodes of the target node can be further traversed to ensure that the nodes of each layer can determine the computing engine to which the node is applicable, thereby further improving the efficiency of object selection.
FIG. 3 is a schematic flow chart of marking the simplified rule tree. That is, all custom computing engines are traversed in order of priority from high to low, and for each computing engine the simplified rule tree is traversed from top to bottom. For each node of the simplified rule tree, the following logic is performed. It is determined whether the node has already been marked; if not, it is further determined whether the current computing engine is applicable to the node. If the engine is applicable, the current node is marked as processed and as handled by the current computing engine. If the engine is not applicable and the current node has child nodes, the lower-layer nodes of the current node are adjusted: the lower-layer nodes that meet the condition are merged into one child node of the current node, and that merged child node is marked as processed and as handled by the current computing engine. If all child nodes of the current node have been marked, the current node is marked as processed. The adjusting and marking are repeated in a loop until the root node is marked, at which point the loop exits.
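The marking loop can be sketched as follows. This is an illustrative reading of the flow, not the patented implementation: the `Node` shape, the `tag_type` attribute, the engine names, the applicability predicates, and the `"combine"` mark for nodes that only combine sub-results are all assumptions. The sketch also only merges children under "and"/"or" nodes, where regrouping does not change the result.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    kind: str                      # "tag" for leaves; "and" / "or" for relations
    tag_type: str = ""             # hypothetical leaf attribute: "enum" or "numeric"
    children: List["Node"] = field(default_factory=list)
    engine: Optional[str] = None   # the mark; None means unmarked


def fits(node: Node, ok: Callable[[Node], bool]) -> bool:
    """True if the whole, still unmarked subtree can run on one engine."""
    if node.engine is not None:
        return False
    if node.kind == "tag":
        return ok(node)
    return all(fits(c, ok) for c in node.children)


def mark(node: Node, name: str, ok: Callable[[Node], bool]) -> None:
    """Top-down marking pass for a single target engine."""
    if node.engine is not None:
        return                              # already marked: skip
    if fits(node, ok):
        node.engine = name                  # applicable: mark the node
        return
    if node.kind == "tag":
        return                              # leaf waits for a later engine
    # Adjustment: merge applicable lower nodes into one marked child.
    fit = [c for c in node.children if fits(c, ok)]
    if len(fit) >= 2:
        keep = [c for c in node.children if all(c is not f for f in fit)]
        node.children = [Node(node.kind, children=fit, engine=name)] + keep
    for c in node.children:
        mark(c, name, ok)
    if all(c.engine is not None for c in node.children):
        node.engine = "combine"             # node only combines sub-results


ENGINES = [  # priority order, highest first; names and rules are illustrative
    ("bitmap", lambda n: n.tag_type == "enum"),
    ("sql", lambda n: n.tag_type in ("enum", "numeric")),
]


def mark_tree(root: Node) -> Node:
    for name, ok in ENGINES:
        mark(root, name, ok)
        if root.engine is not None:         # root marked: exit the loop
            break
    return root


root = mark_tree(Node("and", children=[
    Node("tag", "enum"), Node("tag", "enum"), Node("tag", "numeric"),
]))
```

In this example the first pass groups the two enumeration leaves under one merged child marked for the bitmap engine, and the second pass marks the numeric leaf for the SQL engine, after which the root is marked and the loop ends.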
In step S14, the simplified rule tree is converted into an execution plan tree according to the node labels, leaf nodes of the execution plan tree are used for representing the computing units, and non-leaf nodes of the execution plan tree are used for representing the logical relationship between the connected child nodes.
In one implementation, converting the reduced rule tree to an execution plan tree based on the node labels includes:
and converting any node in the simplified rule tree into an execution node of a calculation engine corresponding to any node according to the node mark to obtain an execution plan tree.
That is to say, the simplified rule tree is traversed, and the nodes marked with the computation engine are converted into execution plan nodes, wherein the execution plan nodes are computation units which can be directly identified by the computation engine, the leaf nodes of the execution plan tree are computation units, and the non-leaf nodes are in a logical relationship, so that the subsequent processing of the execution plan tree is facilitated, and the object selection efficiency is further improved.
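The conversion can be sketched as a single recursive pass over a marked tree. The classes `Marked`, `ComputeUnit`, and `Combine`, the `"combine"` mark, and the expression strings are hypothetical; a real computing unit would hold whatever form the target engine can execute directly.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Marked:
    """Node of a marked simplified rule tree (hypothetical shape)."""
    kind: str                       # "tag", "and", "or"
    engine: str                     # engine mark, or "combine"
    rule: str = ""                  # leaves only, e.g. "pay > 100"
    children: List["Marked"] = field(default_factory=list)


@dataclass
class ComputeUnit:
    """Leaf of the execution plan tree: one expression for one engine."""
    engine: str
    expr: str


@dataclass
class Combine:
    """Non-leaf of the execution plan tree: logical relationship only."""
    logic: str
    children: List[Union["Combine", ComputeUnit]]


def expr_of(node: Marked) -> str:
    """Flatten a subtree handled by one engine into a single expression."""
    if node.kind == "tag":
        return node.rule
    return "(" + f" {node.kind} ".join(expr_of(c) for c in node.children) + ")"


def to_plan(node: Marked) -> Union[Combine, ComputeUnit]:
    """Engine-marked nodes become computing units; the rest stay relations."""
    if node.engine != "combine":
        return ComputeUnit(node.engine, expr_of(node))
    return Combine(node.kind, [to_plan(c) for c in node.children])


marked = Marked("and", "combine", children=[
    Marked("or", "bitmap", children=[
        Marked("tag", "bitmap", "gender = male"),
        Marked("tag", "bitmap", "region = north"),
    ]),
    Marked("tag", "sql", "pay > 100"),
])
plan = to_plan(marked)
```

The subtree marked for one engine collapses into a single computing unit, so the plan tree has fewer nodes than the rule tree it came from.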
In step S15, the data to be processed is processed based on the execution plan tree, and a processing result is obtained.
In this step, during data processing, the user-defined execution engine is used to execute the calculation unit by traversing the leaf nodes of the execution plan tree, so as to obtain a sub-result. And according to the operation rule of the execution plan tree, recursively assembling the sub-results into a final result.
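Execution of the plan tree can be sketched as a recursive evaluation in which each computing unit returns a set of object ids and each relation node combines its children's sets. The engine callables below are stand-ins backed by hard-coded data, and the names `ComputeUnit`, `Combine`, `execute`, and `OPS` are assumptions for the example.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Dict, List, Set, Union


@dataclass
class ComputeUnit:
    engine: str
    expr: str


@dataclass
class Combine:
    logic: str   # "and", "or", or "diff"
    children: List[Union["Combine", ComputeUnit]]


# Logical relationship -> set operation used to assemble sub-results.
OPS = {"and": set.intersection, "or": set.union, "diff": set.difference}


def execute(node, engines: Dict[str, Callable[[str], Set[int]]]) -> Set[int]:
    """Run leaves on their engine, then assemble sub-results recursively."""
    if isinstance(node, ComputeUnit):
        return engines[node.engine](node.expr)
    parts = [execute(c, engines) for c in node.children]
    return reduce(OPS[node.logic], parts)


# Stand-in engines: each maps an expression to a fixed set of object ids.
FAKE_DATA = {
    "gender in (male, female)": {1, 2, 3, 4},
    "pay > 100": {2, 4, 6},
}
engines = {
    "bitmap": lambda expr: FAKE_DATA[expr],
    "sql": lambda expr: FAKE_DATA[expr],
}

plan = Combine("and", [
    ComputeUnit("bitmap", "gender in (male, female)"),
    ComputeUnit("sql", "pay > 100"),
])
result = execute(plan, engines)
```

Because the leaves are independent of one another, an implementation could also run them concurrently before combining; this sketch evaluates them sequentially for simplicity.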
Therefore, the simplified rule tree is obtained by adjusting the structure of the initial rule tree, the simplified rule tree is lower in complexity compared with the initial rule tree, and then nodes in the simplified rule tree are aggregated according to different applicable calculation engines to obtain an execution plan tree, so that the efficiency of object selection is further improved.
FIG. 4 is a logic diagram of the present solution. When applied to the object selection service, compared with the traditional Hive SQL approach, the average time for object selection is reduced from 2 hours to 5 minutes, and the computing speed and resource utilization are greatly improved.
Specifically, an initial rule tree is first generated according to the object selection rule, where a "relation" node is a non-leaf node used to characterize the logical relationship between connected child nodes, and a "label" node is a leaf node used to characterize a label rule.
Then, the nodes in the initial rule tree are traversed in order from bottom to top, and leaf nodes having the same parent node are merged according to their corresponding label rules. As shown in FIG. 4, the two "label" nodes inside the dotted line are two leaf nodes that can be merged; merging them yields the simplified rule tree.
Next, target calculation engines are determined sequentially according to a preset priority order, the nodes in the simplified rule tree are traversed in order from top to bottom, and any unmarked node that meets the applicable condition of the target calculation engine is marked as a calculation node of that engine, until the root node of the simplified rule tree is marked. As shown in FIG. 4, the "relation" node inside the dotted line, its corresponding "label" leaf node, and the "label" leaf node on the other side are all applicable to the calculation engine "engine 1"; on this basis, these three nodes are marked separately from the nodes applicable to other calculation engines.
Then, the simplified rule tree is converted into an execution plan tree according to the node marks, where leaf nodes of the execution plan tree represent computing units and non-leaf nodes represent the logical relationship between connected child nodes. As shown in FIG. 4, nodes carrying the same calculation engine mark in the simplified rule tree correspond to one node in the execution plan tree, and the nodes are processed by different computing units, namely computing unit 1, computing unit 2, and computing unit 3.
Finally, the data to be processed is processed based on the execution plan tree to obtain a processing result: the sub-result of each computing unit is obtained, and the sub-results are then aggregated into the final result corresponding to the object selection rule.
As can be seen from the above, the technical solution provided in the embodiments of the present disclosure makes structural adjustments to the nodes in the initial rule tree, such as merging, splitting, and moving, based on the characteristics of multiple computing engines and without changing the operation result. An appropriate computing engine is then selected to execute each node, and the data processing result is obtained. This makes full use of the advantages of each computing engine and improves computing speed and resource utilization.
FIG. 5 is a block diagram illustrating a data processing apparatus according to an example embodiment, the apparatus comprising:
a generating unit configured to generate an initial rule tree according to a pre-acquired data processing rule, wherein the data processing rule comprises at least one label rule and a logical relationship between the label rules, the initial rule tree comprises leaf nodes and non-leaf nodes, the leaf nodes are used for representing the label rules, and the non-leaf nodes are used for representing the logical relationship between connected child nodes;
a merging unit configured to traverse the nodes in the initial rule tree in order from bottom to top, and merge the leaf nodes having the same parent node according to the corresponding label rules to obtain a simplified rule tree;
a marking unit configured to sequentially determine target calculation engines according to a preset priority order, traverse the nodes in the simplified rule tree in order from top to bottom, and, when any unmarked node meets the applicable condition of the target calculation engine, mark that node as a calculation node of the target calculation engine, until the root node in the simplified rule tree is marked;
a conversion unit configured to convert the simplified rule tree into an execution plan tree according to the node marks, wherein leaf nodes of the execution plan tree are used for representing computing units, and non-leaf nodes of the execution plan tree are used for representing the logical relationship between the connected child nodes;
and a processing unit configured to process the data to be processed based on the execution plan tree to obtain a processing result.
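As an illustration of what the processing unit does with an execution plan tree, the following sketch evaluates such a tree and aggregates the sub-results. It rests on several assumptions that are not taken from the disclosure: each computing unit is modelled as a Python callable returning a set of object IDs, the only logical relationships are "and" and "or", and the `PlanNode` and `execute` names are hypothetical.

```python
import operator
from dataclasses import dataclass, field
from functools import reduce
from typing import Callable, List, Optional, Set


@dataclass
class PlanNode:
    op: Optional[str] = None                       # "and" / "or" on non-leaf nodes
    unit: Optional[Callable[[], Set[int]]] = None  # computing unit on leaf nodes
    children: List["PlanNode"] = field(default_factory=list)


def execute(node: PlanNode) -> Set[int]:
    """Evaluate the plan tree bottom-up and aggregate the sub-results."""
    if node.unit is not None:
        return set(node.unit())                    # sub-result of one computing unit
    results = [execute(child) for child in node.children]
    # "and" aggregates by intersection, "or" by union.
    combine = operator.and_ if node.op == "and" else operator.or_
    return reduce(combine, results)


# Three stand-in computing units, each returning a fixed set of object IDs.
plan = PlanNode(op="and", children=[
    PlanNode(unit=lambda: {1, 2, 3}),
    PlanNode(op="or", children=[
        PlanNode(unit=lambda: {2}),
        PlanNode(unit=lambda: {3, 4}),
    ]),
])
result = execute(plan)  # {2, 3}
```

In a real deployment each leaf would dispatch to a different computing engine; the lambdas here only stand in for those calls.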
In one implementation, the merging unit is configured to perform:
traversing the nodes in the initial rule tree according to the sequence from bottom to top, and taking the currently traversed nodes as target nodes;
under the condition that the child nodes of the target node comprise a plurality of leaf nodes, obtaining label rules corresponding to the leaf nodes, wherein the label rules comprise label names, operation rules and label values;
and taking the leaf nodes having the same label name as candidate nodes, and merging the candidate nodes when the operation rules and the label values of the candidate nodes meet the merging condition, to obtain the simplified rule tree.
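The merging steps above can be sketched as follows. The disclosure does not spell out the merging condition, so this example assumes one: sibling leaves with the same label name whose operation rule is "in", applied to a single-valued label, are merged by taking the union of their values under an "or" parent and the intersection under an "and" parent. The `Node`, `merge_leaves`, and `leaf` names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    kind: str                        # "relation" or "label"
    op: str                          # "and" / "or", or an operation rule such as "in"
    name: str = ""                   # label name
    values: frozenset = frozenset()  # label values
    children: List["Node"] = field(default_factory=list)


def merge_leaves(node: Node) -> Node:
    """Bottom-up traversal: children are simplified before their parent."""
    if node.kind == "label":
        return node
    node.children = [merge_leaves(child) for child in node.children]
    kept: List[Node] = []
    first_by_name: Dict[str, Node] = {}
    for child in node.children:
        # Assumed merging condition: a leaf whose operation rule is "in".
        if child.kind == "label" and child.op == "in":
            first = first_by_name.get(child.name)
            if first is not None:
                # OR of two "in" rules is a union; AND is an intersection.
                first.values = (first.values | child.values if node.op == "or"
                                else first.values & child.values)
                continue
            first_by_name[child.name] = child
        kept.append(child)
    node.children = kept
    return node


def leaf(name: str, *values) -> Node:
    return Node(kind="label", op="in", name=name, values=frozenset(values))


tree = Node(kind="relation", op="and", children=[
    Node(kind="relation", op="or", children=[leaf("city", "A"), leaf("city", "B")]),
    leaf("age", 18),
])
merge_leaves(tree)
```

After the call, the "or" node holds a single "city" leaf whose values are A and B, so the simplified tree selects the same objects as the original one.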
In one implementation, the marking unit is further configured to perform:
traversing the nodes in the simplified rule tree according to the sequence from top to bottom, and taking the currently traversed nodes as target nodes;
under the condition that the target node is not marked, if the target node meets the applicable condition of the target computing engine, marking the target node as a computing node of the target computing engine;
and under the condition that the target node is marked, traversing the next node until no unmarked child node exists, and marking the root node.
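A minimal sketch of this marking procedure is given below. The applicable condition is an assumption made for illustration: an engine is treated as applicable to a node when it supports every label name found under that node. The `Node`, `Engine`, `leaves`, and `mark_tree` names, the label names, and the engine names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import FrozenSet, Iterator, List, Optional


@dataclass
class Node:
    name: str                          # label name, or "and" / "or" for relation nodes
    children: List["Node"] = field(default_factory=list)
    engine: Optional[str] = None       # node mark; None means unmarked


@dataclass
class Engine:
    name: str
    labels: FrozenSet[str]             # label names this engine can compute (assumed)


def leaves(node: Node) -> Iterator[Node]:
    if not node.children:
        yield node
    for child in node.children:
        yield from leaves(child)


def mark_tree(root: Node, engines: List[Engine]) -> Node:
    for engine in engines:             # engines are listed in preset priority order
        queue = deque([root])          # top-down traversal, level by level
        while queue:
            node = queue.popleft()
            # Assumed applicable condition: every label below the node is supported.
            if node.engine is None and all(l.name in engine.labels for l in leaves(node)):
                node.engine = engine.name
            queue.extend(node.children)
        if root.engine is not None:    # once the root is marked, no node is left unmarked
            break
    return root


city, age, tag = Node("city"), Node("age"), Node("video_tag")
inner = Node("or", [city, age])
root = Node("and", [inner, tag])
engines = [Engine("engine1", frozenset({"city", "age"})),
           Engine("engine2", frozenset({"city", "age", "video_tag"}))]
mark_tree(root, engines)
```

With these inputs, the "or" node and its two leaves are marked for "engine1" on the first pass, and the root and the remaining leaf are marked for "engine2" on the second pass.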
In one implementation, the marking unit is further configured to perform:
when the target node is unmarked and does not meet the applicable condition of the target computing engine, and the lower-layer nodes of the target node meet the adjustment condition of the target computing engine, merging the lower-layer nodes that meet the adjustment condition into a merged child node of the target node, and marking the merged child node as a computing node of the target computing engine.
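One way to picture this adjustment is the sketch below. The adjustment condition is not defined in detail in the text, so an assumed one is used: at least two, but not all, of the target node's unmarked leaf children are supported by the target engine. Those children are grouped under a new child that reuses the parent's relation, which leaves the result unchanged because "and" and "or" are associative and commutative. The `Node` and `adjust` names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional


@dataclass
class Node:
    name: str                          # label name, or "and" / "or" for relation nodes
    children: List["Node"] = field(default_factory=list)
    engine: Optional[str] = None       # node mark; None means unmarked


def adjust(target: Node, engine: str, labels: FrozenSet[str]) -> Node:
    """Group the unmarked leaf children the engine supports under one merged child."""
    movable = [c for c in target.children
               if c.engine is None and not c.children and c.name in labels]
    # Assumed adjustment condition: at least two such children, but not all of them.
    if len(movable) < 2 or len(movable) == len(target.children):
        return target
    moved = {id(c) for c in movable}
    for child in movable:
        child.engine = engine
    # The merged child keeps the parent's relation: and(a, b, c) == and(and(a, c), b).
    merged = Node(target.name, movable, engine)
    target.children = [merged] + [c for c in target.children if id(c) not in moved]
    return target


root = Node("and", [Node("city"), Node("video_tag"), Node("age")])
adjust(root, "engine1", frozenset({"city", "age"}))
```

After the call, the root has two children: a merged "and" node marked for "engine1" that holds the "city" and "age" leaves, and the untouched "video_tag" leaf. The root itself stays unmarked for a later engine.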
In one implementation, the conversion unit is configured to perform:
converting each node in the simplified rule tree into an execution node of the computing engine corresponding to that node according to the node marks, to obtain the execution plan tree.
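The conversion can be sketched roughly as follows. This is a simplification under stated assumptions rather than the disclosed method: a subtree whose nodes all carry the same engine mark is collapsed into a single computing-unit leaf, and any node whose subtree mixes marks becomes a non-leaf plan node that keeps only the logical relationship. The `Node`, `PlanNode`, `walk`, and `to_plan` names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Node:
    name: str                          # label name, or "and" / "or" for relation nodes
    children: List["Node"] = field(default_factory=list)
    engine: Optional[str] = None       # node mark


@dataclass
class PlanNode:
    kind: str                          # "unit" (leaf) or "relation" (non-leaf)
    name: str                          # engine name for a unit, "and" / "or" otherwise
    rule: Optional[Node] = None        # rule subtree a computing unit evaluates
    children: List["PlanNode"] = field(default_factory=list)


def walk(node: Node) -> Iterator[Node]:
    yield node
    for child in node.children:
        yield from walk(child)


def to_plan(node: Node) -> PlanNode:
    # A subtree marked entirely for one engine becomes one computing unit.
    if len({n.engine for n in walk(node)}) == 1:
        return PlanNode("unit", node.engine, rule=node)
    # Otherwise keep the logical relationship and convert each child.
    return PlanNode("relation", node.name,
                    children=[to_plan(child) for child in node.children])


inner = Node("or", [Node("city", engine="engine1"), Node("age", engine="engine1")], "engine1")
root = Node("and", [inner, Node("video_tag", engine="engine2")], "engine2")
plan = to_plan(root)
```

Here the plan tree has an "and" root with two leaves: one computing unit for "engine1" covering the whole "or" subtree, and one for "engine2" covering the "video_tag" leaf.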
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
FIG. 6 is a block diagram illustrating an electronic device for data processing in accordance with an exemplary embodiment.
In an exemplary embodiment, a computer-readable storage medium comprising instructions, such as a memory comprising instructions, is also provided; the instructions are executable by a processor of an electronic device to perform the above-described method. The computer-readable storage medium may be, for example, a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.
In an exemplary embodiment, a computer program product is also provided, which, when run on a computer, causes the computer to implement the above-described method of data processing.
Fig. 7 is a block diagram illustrating an apparatus 800 for data processing in accordance with an example embodiment.
For example, the apparatus 800 may be a mobile phone, a computer, a digital broadcast electronic device, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 7, the apparatus 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation at the device 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power component 806 provides power to the various components of device 800. The power component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the device 800 is in an operational mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the apparatus 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 814 includes one or more sensors for providing status assessments of various aspects of the device 800. For example, the sensor component 814 may detect the open/closed state of the device 800 and the relative positioning of components, such as the display and keypad of the device 800. The sensor component 814 may also detect a change in position of the device 800 or of a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in temperature of the device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate communication between the apparatus 800 and other devices in a wired or wireless manner. The apparatus 800 may access a wireless network based on a communication standard, such as WiFi, an operator network (such as 2G, 3G, 4G, or 5G), or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the methods of the first and second aspects.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 804 comprising instructions, is also provided; the instructions are executable by the processor 820 of the device 800 to perform the above-described method. For example, the non-transitory computer-readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.
In an exemplary embodiment, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the data processing method of any of the above embodiments.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements that have been described above and shown in the drawings, and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
Claims (10)
1. A method of data processing, comprising:
generating an initial rule tree according to a pre-acquired data processing rule, wherein the data processing rule comprises at least one label rule and a logical relationship between the label rules, the initial rule tree comprises leaf nodes and non-leaf nodes, the leaf nodes are used for representing the label rules, and the non-leaf nodes are used for representing the logical relationship between connected child nodes;
traversing the nodes in the initial rule tree in order from bottom to top, and merging the leaf nodes having the same parent node according to the corresponding label rules to obtain a simplified rule tree;
sequentially determining target calculation engines according to a preset priority order, traversing nodes in the simplified rule tree according to the sequence from top to bottom, and marking any unmarked node as a calculation node of the target calculation engine under the condition that any unmarked node meets the applicable condition of the target calculation engine until the marking of the root node in the simplified rule tree is finished;
converting the simplified rule tree into an execution plan tree according to the node marks, wherein leaf nodes of the execution plan tree are used for representing a computing unit, and non-leaf nodes of the execution plan tree are used for representing the logical relation between the connected child nodes;
and processing the data to be processed based on the execution plan tree to obtain a processing result.
2. The data processing method of claim 1, wherein traversing the nodes in the initial rule tree in the order from bottom to top, and merging the leaf nodes having the same parent node according to the corresponding label rule to obtain a simplified rule tree, comprises:
traversing the nodes in the initial rule tree according to the sequence from bottom to top, and taking the currently traversed nodes as target nodes;
under the condition that the child nodes of the target node comprise a plurality of leaf nodes, obtaining label rules corresponding to the leaf nodes, wherein the label rules comprise label names, operation rules and label values;
and taking the leaf nodes having the same label name as candidate nodes, and merging the candidate nodes when the operation rules and the label values of the candidate nodes meet the merging condition, to obtain the simplified rule tree.
3. The data processing method of claim 1, wherein traversing the nodes in the simplified rule tree in a top-to-bottom order, and in a case that any unmarked node satisfies an applicable condition of the target computation engine, marking the unmarked node as a computation node of the target computation engine until a root node in the simplified rule tree is marked completely comprises:
traversing the nodes in the simplified rule tree according to the sequence from top to bottom, and taking the currently traversed nodes as target nodes;
under the condition that the target node is not marked, if the target node meets the applicable condition of the target computing engine, marking the target node as a computing node of the target computing engine;
and under the condition that the target node is marked, traversing the next node until no unmarked child node exists, and marking the root node.
4. The data processing method of claim 3, wherein if the target node is not marked, the method further comprises:
and under the condition that the lower nodes of the target nodes meet the adjustment conditions of the target computing engine, merging the lower nodes meeting the adjustment conditions into merged sub-nodes of the target nodes, and marking the merged sub-nodes as computing nodes of the target computing engine.
5. The data processing method of claim 1, wherein said converting the simplified rule tree into an execution plan tree according to the node marks comprises:
and converting any node in the simplified rule tree into an execution node of a calculation engine corresponding to the any node according to the node mark to obtain an execution plan tree.
6. A data processing apparatus, comprising:
a generating unit configured to generate an initial rule tree according to a pre-acquired data processing rule, wherein the data processing rule comprises at least one label rule and a logical relationship between the label rules, the initial rule tree comprises leaf nodes and non-leaf nodes, the leaf nodes are used for representing the label rules, and the non-leaf nodes are used for representing the logical relationship between connected child nodes;
a merging unit configured to traverse the nodes in the initial rule tree in order from bottom to top, and merge the leaf nodes having the same parent node according to the corresponding label rules to obtain a simplified rule tree;
a marking unit configured to sequentially determine target calculation engines according to a preset priority order, traverse the nodes in the simplified rule tree in order from top to bottom, and, when any unmarked node meets the applicable condition of the target calculation engine, mark that node as a calculation node of the target calculation engine, until the root node in the simplified rule tree is marked;
a conversion unit configured to convert the simplified rule tree into an execution plan tree according to the node marks, wherein leaf nodes of the execution plan tree are used for representing computing units, and non-leaf nodes of the execution plan tree are used for representing the logical relationship between the connected child nodes;
and a processing unit configured to process the data to be processed based on the execution plan tree to obtain a processing result.
7. The data processing apparatus according to claim 6, wherein the merging unit is configured to perform:
traversing the nodes in the initial rule tree according to the sequence from bottom to top, and taking the currently traversed nodes as target nodes;
under the condition that the child nodes of the target node comprise a plurality of leaf nodes, obtaining label rules corresponding to the leaf nodes, wherein the label rules comprise label names, operation rules and label values;
and taking the leaf nodes having the same label name as candidate nodes, and merging the candidate nodes when the operation rules and the label values of the candidate nodes meet the merging condition, to obtain the simplified rule tree.
8. An electronic device, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the data processing method of any one of claims 1 to 5.
9. A computer-readable storage medium, wherein instructions in the computer-readable storage medium, when executed by a processor of a data processing electronic device, enable the data processing electronic device to perform the data processing method of any one of claims 1 to 5.
10. A computer program product comprising a computer program, characterized in that the computer program realizes the data processing method of any of claims 1-5 when executed by a processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210498840.XA CN114925092B (en) | 2022-05-09 | 2022-05-09 | Data processing method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210498840.XA CN114925092B (en) | 2022-05-09 | 2022-05-09 | Data processing method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114925092A (en) | 2022-08-19 |
CN114925092B (en) | 2023-05-30 |
Family
ID=82808182
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210498840.XA (Active), CN114925092B (en) | Data processing method and device, electronic equipment and storage medium | 2022-05-09 | 2022-05-09 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114925092B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117038002A (en) * | 2023-10-08 | 2023-11-10 | 之江实验室 | Method and device for generating observation variable in drug evaluation research |
CN117675507A (en) * | 2023-11-13 | 2024-03-08 | 北京国电通网络技术有限公司 | Abnormal node terminal alarm method, electronic device and computer readable medium |
CN118394348A (en) * | 2024-06-28 | 2024-07-26 | 浪潮电子信息产业股份有限公司 | Operator scheduling method, device, equipment and computer readable storage medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050187907A1 (en) * | 2004-02-20 | 2005-08-25 | Microsoft Corporation | Systems and methods for updating a query engine opcode tree |
CN103678589A (en) * | 2013-12-12 | 2014-03-26 | 用友软件股份有限公司 | Database kernel query optimization method based on equivalence class |
CN107943929A (en) * | 2017-11-22 | 2018-04-20 | 福州大学 | The automatic generating method of wrapper being abstracted based on dom tree |
CN108038215A (en) * | 2017-12-22 | 2018-05-15 | 上海达梦数据库有限公司 | Data processing method and system |
CN109815389A (en) * | 2019-02-02 | 2019-05-28 | 北京三快在线科技有限公司 | Using the node matching method, apparatus and computer equipment of regulation engine |
US20190197141A1 (en) * | 2017-12-22 | 2019-06-27 | International Business Machines Corporation | Interactive adjustment of decision rules |
CN112464620A (en) * | 2020-09-23 | 2021-03-09 | 航天信息股份有限公司企业服务分公司 | Implementation method and implementation system of financial rule engine |
Also Published As
Publication number | Publication date |
---|---|
CN114925092B (en) | 2023-05-30 |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |