CN114925092B - Data processing method and device, electronic equipment and storage medium - Google Patents

Data processing method and device, electronic equipment and storage medium

Info

Publication number
CN114925092B
Authority
CN
China
Prior art keywords
node
nodes
rule
tree
label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210498840.XA
Other languages
Chinese (zh)
Other versions
CN114925092A (en)
Inventor
李昂
吴兆跃
张型龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202210498840.XA priority Critical patent/CN114925092B/en
Publication of CN114925092A publication Critical patent/CN114925092A/en
Application granted granted Critical
Publication of CN114925092B publication Critical patent/CN114925092B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24564Applying rules; Deductive queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure relates to a data processing method, a device, an electronic device and a storage medium, comprising: generating an initial rule tree according to a pre-acquired data processing rule; traversing the nodes in the initial rule tree, and merging the leaf nodes with the same parent nodes according to the corresponding label rules to obtain a simplified rule tree; sequentially determining target computing engines, traversing nodes in the simplified rule tree according to the sequence from top to bottom, and marking any unlabeled node as a computing node of the target computing engine under the condition that any unlabeled node meets the applicable condition of the target computing engine until the marking of a root node in the simplified rule tree is completed; converting the simplified rule tree into an execution plan tree; and processing the data to be processed based on the execution plan tree to obtain a processing result. Therefore, based on the characteristics of various calculation engines, proper calculation engines are selected for different nodes, so that the calculation speed and the resource utilization rate are improved.

Description

Data processing method and device, electronic equipment and storage medium
Technical Field
The disclosure relates to the field of data analysis, and in particular relates to a data processing method, a data processing device, electronic equipment and a storage medium.
Background
In many business scenarios, business personnel need to circle-select object groups based on the attributes of different objects, using custom rules that specify information such as a label identifier, an operator, and a label value, so that corresponding business activities can be carried out for each object group.
In the prior art, when object circle selection is performed, all circle-selection conditions are spliced into one SQL statement using Hive SQL (Structured Query Language based on the Hive storage engine), and the final result is obtained after the SQL statement is executed.
However, for complex circle-selection rules, a long SQL statement containing multiple nested queries is spliced. Querying with the Hive engine takes as little as 1-2 hours and as much as 10 hours or more; the query speed is low, resource occupation is severe, and other queries may be blocked.
Disclosure of Invention
The disclosure provides a data processing method, a data processing device, an electronic device and a storage medium, so as to at least solve the problems that in the related art, the query speed is low, the resource occupation condition is serious, and other queries may be blocked. The technical scheme of the present disclosure is as follows:
According to a first aspect of an embodiment of the present disclosure, there is provided a data processing method, including:
generating an initial rule tree according to a pre-acquired data processing rule, wherein the data processing rule comprises at least one label rule and a logic relationship between the label rules, the initial rule tree comprises leaf nodes and non-leaf nodes, the leaf nodes are used for representing the label rules, and the non-leaf nodes are used for representing the logic relationship between connected child nodes;
traversing the nodes in the initial rule tree according to the sequence from bottom to top, and merging the leaf nodes with the same parent nodes according to the corresponding label rules to obtain a simplified rule tree;
sequentially determining target computing engines according to a preset priority order, traversing the nodes in the simplified rule tree in order from top to bottom, and, when any unmarked node meets the applicable condition of the target computing engine, marking that node as a computing node of the target computing engine, until the root node in the simplified rule tree has been marked;
converting the simplified rule tree into an execution plan tree according to the node marks, wherein leaf nodes of the execution plan tree are used for representing computing units, and non-leaf nodes of the execution plan tree are used for representing logical relationships between connected child nodes;
and processing the data to be processed based on the execution plan tree to obtain a processing result.
Optionally, traversing the nodes in the initial rule tree according to the sequence from bottom to top, and merging the leaf nodes with the same parent node according to the corresponding label rule to obtain a simplified rule tree, which comprises:
traversing the nodes in the initial rule tree according to the sequence from bottom to top, and taking the currently traversed node as a target node;
under the condition that a plurality of leaf nodes are included in the child nodes of the target node, acquiring a label rule corresponding to the included leaf nodes, wherein the label rule comprises a label name, an operation rule and a label value;
and taking the nodes with the same label name among the included leaf nodes as candidate nodes, and merging the candidate nodes when the operation rules and label values of the candidate nodes meet a merging condition, to obtain the simplified rule tree.
Optionally, traversing the nodes in the simplified rule tree in order from top to bottom, and, when any unmarked node meets the applicable condition of the target computing engine, marking that node as a computing node of the target computing engine until the root node in the simplified rule tree has been marked, includes:
traversing the nodes in the simplified rule tree in order from top to bottom, and taking the currently traversed node as a target node;
when the target node is not marked, if the target node meets the applicable condition of the target computing engine, marking the target node as a computing node of the target computing engine;
and when the target node is already marked, moving on to the next node until no unmarked child nodes remain, and then marking the root node.
Optionally, if the target node does not meet the applicable condition of the target computing engine, the method further includes:
when a lower-layer node of the target node meets the adjustment condition of the target computing engine, merging the lower-layer nodes that meet the adjustment condition into a merged child node of the target node, and marking the merged child node as a computing node of the target computing engine.
Optionally, the converting the simplified rule tree into an execution plan tree according to the node mark includes:
and according to the node mark, converting any node in the simplified rule tree into an execution node of a calculation engine corresponding to the any node to obtain an execution plan tree.
According to a second aspect of embodiments of the present disclosure, there is provided a data processing apparatus comprising:
a generation unit configured to generate an initial rule tree according to a pre-acquired data processing rule, wherein the data processing rule comprises at least one label rule and a logical relationship between the label rules, the initial rule tree comprises leaf nodes and non-leaf nodes, the leaf nodes are used for representing the label rules, and the non-leaf nodes are used for representing the logical relationship between connected child nodes;
a merging unit configured to traverse the nodes in the initial rule tree in order from bottom to top, and merge leaf nodes with the same parent node according to the corresponding label rules to obtain a simplified rule tree;
a marking unit configured to sequentially determine target computing engines according to a preset priority order, traverse the nodes in the simplified rule tree in order from top to bottom, and, when any unmarked node meets the applicable condition of the target computing engine, mark that node as a computing node of the target computing engine, until the root node in the simplified rule tree has been marked;
a conversion unit configured to convert the simplified rule tree into an execution plan tree according to the node marks, wherein leaf nodes of the execution plan tree are used for representing computing units, and non-leaf nodes of the execution plan tree are used for representing logical relationships between connected child nodes;
and a processing unit configured to process the data to be processed based on the execution plan tree to obtain a processing result.
Optionally, the merging unit is configured to perform:
traversing the nodes in the initial rule tree according to the sequence from bottom to top, and taking the currently traversed node as a target node;
under the condition that a plurality of leaf nodes are included in the child nodes of the target node, acquiring a label rule corresponding to the included leaf nodes, wherein the label rule comprises a label name, an operation rule and a label value;
and taking the nodes with the same label name among the included leaf nodes as candidate nodes, and merging the candidate nodes when the operation rules and label values of the candidate nodes meet a merging condition, to obtain the simplified rule tree.
Optionally, the marking unit is further configured to perform:
traversing the nodes in the simplified rule tree in order from top to bottom, and taking the currently traversed node as a target node;
when the target node is not marked, if the target node meets the applicable condition of the target computing engine, marking the target node as a computing node of the target computing engine;
and when the target node is already marked, moving on to the next node until no unmarked child nodes remain, and then marking the root node.
Optionally, the marking unit is further configured to perform:
when the target node is not marked and does not meet the applicable condition of the target computing engine, if a lower-layer node of the target node meets the adjustment condition of the target computing engine, merging the lower-layer nodes that meet the adjustment condition into a merged child node of the target node, and marking the merged child node as a computing node of the target computing engine.
Optionally, the conversion unit is configured to perform:
and according to the node mark, converting any node in the simplified rule tree into an execution node of a calculation engine corresponding to the any node to obtain an execution plan tree.
According to a third aspect of embodiments of the present disclosure, there is provided a data processing electronic device comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the data processing method of any of the above.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium, wherein instructions stored in the computer-readable storage medium, when executed by a processor of a data processing electronic device, cause the data processing electronic device to perform any one of the data processing methods described above.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product comprising a computer program/instruction which, when executed by a processor, implements a data processing method according to any one of the preceding claims.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
An initial rule tree is generated according to a pre-acquired data processing rule, wherein the data processing rule comprises at least one label rule and a logical relationship between the label rules, the initial rule tree comprises leaf nodes and non-leaf nodes, the leaf nodes are used for representing the label rules, and the non-leaf nodes are used for representing the logical relationship between connected child nodes; the nodes in the initial rule tree are traversed in order from bottom to top, and leaf nodes with the same parent node are merged according to the corresponding label rules to obtain a simplified rule tree; target computing engines are determined sequentially according to a preset priority order, the nodes in the simplified rule tree are traversed in order from top to bottom, and, when any unmarked node meets the applicable condition of the target computing engine, that node is marked as a computing node of the target computing engine, until the root node in the simplified rule tree has been marked; according to the node marks, the simplified rule tree is converted into an execution plan tree, wherein leaf nodes of the execution plan tree are used for representing computing units, and non-leaf nodes of the execution plan tree are used for representing the logical relationships between connected child nodes; and the data to be processed is processed based on the execution plan tree to obtain a processing result.
Based on the characteristics of various calculation engines, on the premise of not changing the operation result, the nodes in the initial rule tree are subjected to structural adjustment such as combination, splitting and movement, and appropriate calculation engines are selected for different nodes and respectively executed, so that the data processing result is obtained, the advantages of each calculation engine can be fully exerted, and the calculation speed and the resource utilization rate are improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure and do not constitute an undue limitation on the disclosure.
FIG. 1 is a flow chart illustrating a method of data processing according to an exemplary embodiment.
FIG. 2 is a flow diagram of a simplified rule tree obtained by merging an initial rule tree according to an exemplary embodiment.
FIG. 3 is a flow diagram of marking a simplified rule tree in accordance with an exemplary embodiment.
FIG. 4 is a logic diagram of the scheme according to an exemplary embodiment.
FIG. 5 is a block diagram of a data processing apparatus according to an exemplary embodiment.
FIG. 6 is a block diagram of an electronic device for data processing, according to an exemplary embodiment.
FIG. 7 is a block diagram illustrating an apparatus for data processing according to an exemplary embodiment.
Detailed Description
In order to enable those skilled in the art to better understand the technical solutions of the present disclosure, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
FIG. 1 is a flowchart illustrating a data processing method according to an exemplary embodiment. As shown in FIG. 1, the method includes the following steps.
In step S11, an initial rule tree is generated according to a pre-acquired data processing rule, where the data processing rule includes at least one label rule and a logical relationship between label rules, the initial rule tree includes leaf nodes and non-leaf nodes, the leaf nodes are used to characterize the label rules, and the non-leaf nodes are used to characterize the logical relationship between connected child nodes.
In the object group circle-selection service, a label may refer to an attribute of an object, such as gender, age, or interest; that is, a label rule is a condition for performing data processing. When defining a circle-selection rule, business personnel specify a target label identifier, an operator, and a label value, for example an age range of the object. A single circle selection may include multiple labels, which may be combined through intersection, union, difference, and nesting.
In the present disclosure, by parsing the data processing rule entered by the business personnel, the label rules may be encapsulated into leaf nodes and the conditions between labels into non-leaf nodes, and these are recursively combined into an initial rule tree. In the initial rule tree, the leaf nodes are used for representing label rules and the non-leaf nodes are used for representing logical relationships between connected child nodes. For example, the circle-selection rule entered by business personnel can be mapped to a tree whose leaf nodes are label rules, each comprising a label identifier, an operator, and a label value, and whose non-leaf nodes are logical relationships such as intersection, union, and difference.
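The structure described above can be illustrated with a minimal Python sketch. The class names `LabelRule`, `Leaf`, and `Relation`, the nested-dict input format, and the example rule are all assumptions made for illustration; the disclosure does not specify a concrete representation.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class LabelRule:
    """A label rule: label identifier, operator, and label value."""
    label: str
    op: str
    value: object


@dataclass
class Leaf:
    """Leaf node: carries exactly one label rule."""
    rule: LabelRule


@dataclass
class Relation:
    """Non-leaf node: logical relationship between its child nodes."""
    logic: str  # assumed vocabulary: "and" (intersection), "or" (union), "diff"
    children: List[Union["Leaf", "Relation"]] = field(default_factory=list)


def build_tree(spec):
    """Recursively build a rule tree from a nested dict (hypothetical format)."""
    if "logic" in spec:
        return Relation(spec["logic"], [build_tree(c) for c in spec["children"]])
    return Leaf(LabelRule(spec["label"], spec["op"], spec["value"]))


# Hypothetical rule: (gender = male OR gender = female) AND age >= 18
rule = {
    "logic": "and",
    "children": [
        {"logic": "or", "children": [
            {"label": "gender", "op": "=", "value": "male"},
            {"label": "gender", "op": "=", "value": "female"},
        ]},
        {"label": "age", "op": ">=", "value": 18},
    ],
}
tree = build_tree(rule)
```

In this sketch nesting is expressed simply by placing a `Relation` among the children of another `Relation`.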
In step S12, the nodes in the initial rule tree are traversed in order from bottom to top, and leaf nodes with the same parent node are merged according to the corresponding label rules to obtain a simplified rule tree.
In one implementation, traversing the nodes in the initial rule tree in order from bottom to top and merging leaf nodes with the same parent node according to the corresponding label rules to obtain a simplified rule tree includes:
traversing the nodes in the initial rule tree in order from bottom to top, and taking the currently traversed node as a target node; when the child nodes of the target node include a plurality of leaf nodes, acquiring the label rules corresponding to the included leaf nodes, wherein each label rule comprises a label name, an operation rule, and a label value; and taking the nodes with the same label name among the included leaf nodes as candidate nodes, and merging the candidate nodes when the operation rules and label values of the candidate nodes meet a merging condition, to obtain the simplified rule tree.
FIG. 2 shows a flow diagram of merging the initial rule tree to obtain the simplified rule tree. That is, if the child nodes of a node include a plurality of leaf nodes, and some of these leaf nodes have the same label name, it is determined whether the leaf nodes with the same label name satisfy the merging condition; if so, the nodes satisfying the condition are merged into one node. For example, the condition [gender = male OR gender = female] may be merged into [gender IN (male, female)].
It can be understood that traversing the initial rule tree sorts out and merges the label rules represented by the leaf nodes. Under the initial rule tree, two different label rules represented by two leaf nodes require two separate evaluations; in the simplified rule tree obtained in this step, they correspond to a single leaf node, so the same processing result is obtained with only one evaluation. The efficiency of object circle selection is thereby further improved.
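The merging step can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the only merging condition implemented is "equality tests on the same label under an OR node become one IN test", and collapsing a relation node that is left with a single child is an added simplification. The names `Leaf`, `Relation`, and `simplify` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Leaf:
    label: str
    op: str          # "=" or "in" are mergeable in this sketch
    value: object


@dataclass
class Relation:
    logic: str       # "and" / "or"
    children: List[Union["Leaf", "Relation"]] = field(default_factory=list)


def simplify(node):
    """Bottom-up pass: simplify the children first, then merge sibling leaves."""
    if isinstance(node, Leaf):
        return node
    node.children = [simplify(c) for c in node.children]
    if node.logic == "or":
        merged, by_label = [], {}
        for child in node.children:
            # Assumed merging condition: equality tests on the same label.
            if isinstance(child, Leaf) and child.op in ("=", "in"):
                values = list(child.value) if child.op == "in" else [child.value]
                if child.label in by_label:
                    target = by_label[child.label]
                    target.value += [v for v in values if v not in target.value]
                else:
                    target = Leaf(child.label, "in", values)
                    by_label[child.label] = target
                    merged.append(target)
            else:
                merged.append(child)
        # Groups that ended up with one value go back to the plain "=" form.
        for leaf in by_label.values():
            if len(leaf.value) == 1:
                leaf.op, leaf.value = "=", leaf.value[0]
        node.children = merged
    # Assumption: a relation left with a single child collapses into it.
    return node.children[0] if len(node.children) == 1 else node


tree = Relation("and", [
    Relation("or", [Leaf("gender", "=", "male"), Leaf("gender", "=", "female")]),
    Leaf("age", ">=", 18),
])
simplified = simplify(tree)
```

Here the two gender leaves become the single leaf `gender in [male, female]`, which then sits directly under the top-level `and` node.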
In step S13, target computing engines are determined sequentially according to a preset priority order, and the nodes in the simplified rule tree are traversed in order from top to bottom; when any unmarked node meets the applicable condition of the target computing engine, that node is marked as a computing node of the target computing engine, until the root node in the simplified rule tree has been marked.
A computing engine is a tool that processes data in a given storage medium using a specific method. For different services, custom computing engines can be created in advance to further improve computing efficiency; that is, the applicable conditions of a computing engine can be customized. For example, the ClickHouse bitmap computing engine is applicable to enumeration-type rule conditions, such as filtering on labels like an object's gender or region; the ClickHouse SQL computing engine is applicable to continuous-numerical rule conditions, such as filtering on labels like an object's payment amount or online duration.
A bitmap is an efficient data structure that stores data in contiguous binary bits. It is used for querying and deduplicating large volumes of integer data and is widely applied in indexing, data compression, and similar areas. As used in a storage engine, the bitmap offers high calculation speed but is not suitable for complex calculations.
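To make the bitmap idea concrete, the following Python sketch uses an arbitrary-precision integer as a bit set. It illustrates only the general principle (one bit per integer ID, set operations as bitwise operations) and is not a description of how ClickHouse implements bitmaps; the object IDs and the `male`/`beijing` label groups are invented for the example.

```python
def to_bitmap(ids):
    """Set bit i for every integer id i; duplicate ids collapse automatically."""
    bits = 0
    for i in ids:
        bits |= 1 << i
    return bits


def to_ids(bits):
    """Return the ids whose bits are set, in ascending order."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


male = to_bitmap([1, 3, 3, 5, 8])      # the duplicate 3 is stored once
beijing = to_bitmap([2, 3, 8, 9])

both = male & beijing                  # intersection
either = male | beijing                # union
only_male = male & ~beijing            # difference
count = bin(both).count("1")           # number of objects in the intersection
```

Intersection, union, and difference each reduce to a single bitwise operation, which is why enumeration-style conditions suit this representation, while range or arithmetic conditions on continuous values do not map onto it as directly.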
In one implementation, this step may include:
traversing the nodes in the simplified rule tree in order from top to bottom, and taking the currently traversed node as a target node; when the target node is not marked, if the target node meets the applicable condition of the target computing engine, marking the target node as a computing node of the target computing engine; and when the target node is already marked, moving on to the next node until no unmarked child nodes remain, and then marking the root node.
That is, the simplified rule tree is traversed and an applicable computing engine is selected for each node according to its specific condition, until every node in the simplified rule tree is marked, that is, until every node has been assigned a corresponding computing engine. In subsequent processing, each node can then be computed by the engine applicable to it; the higher the applicability, the faster and better the computation, which helps improve the efficiency of object circle selection.
In one implementation, if the target node does not meet the applicable condition of the target computing engine under the condition that the target node is not marked, the method further includes:
when a lower-layer node of the target node meets the adjustment condition of the target computing engine, merging the lower-layer nodes that meet the adjustment condition into a merged child node of the target node, and marking the merged child node as a computing node of the target computing engine. The adjustment condition is similar to the applicable condition described above and may be customized, which is not limited in this disclosure.
It can be understood that the simplified rule tree is a multi-layer tree structure, so that in the traversal process, under the condition that the target node does not meet the applicable conditions of the target computing engine, the child nodes of the target node can be further traversed, so that the nodes of each layer can determine the applicable computing engine, and the efficiency of object ring selection is further improved.
FIG. 3 shows a schematic flow chart of marking the simplified rule tree. All custom computing engines are traversed in priority order from high to low, and for each computing engine the simplified rule tree is traversed from top to bottom. For each node of the simplified rule tree, the following logic is performed: determine whether the node has already been marked; if not, determine whether the current computing engine is applicable; if it is, mark the current node as processed and as handled by the current computing engine. If the current computing engine is not applicable and the current node has child nodes, attempt to adjust the lower-layer nodes of the current node: merge the lower-layer nodes that meet the condition into one child node of the current node, and mark that child node as processed and as handled by the current computing engine. If all child nodes of the current node are marked, mark the current node as processed. The marking and adjustment are repeated in a loop until the root node is marked, at which point the loop exits.
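The marking flow can be sketched in Python as below. Several points are assumptions rather than details from the disclosure: the applicable condition is reduced to "every leaf under the node has the kind the engine handles", the adjustment step only regroups direct children under `and`/`or` nodes, a relation whose children are all marked receives the placeholder mark `"combine"`, and each engine gets a single pass. All class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    kind: str = "relation"            # "enum" or "numeric" for leaves
    logic: str = "and"                # used by relation nodes only
    children: List["Node"] = field(default_factory=list)
    engine: Optional[str] = None      # the mark; None means unmarked


@dataclass
class Engine:
    name: str
    leaf_kind: str                    # assumed applicable condition


def leaves(node):
    if not node.children:
        return [node]
    return [leaf for c in node.children for leaf in leaves(c)]


def applies(engine, node):
    """Assumed condition: every leaf under the node has the engine's kind."""
    return all(leaf.kind == engine.leaf_kind for leaf in leaves(node))


def visit(node, engine):
    if node.engine is not None:                 # already marked
        return
    if applies(engine, node):                   # the whole subtree fits
        node.engine = engine.name
        return
    if not node.children:
        return
    # Adjustment (assumed form): group fitting children under one merged child.
    fit = [c for c in node.children if c.engine is None and applies(engine, c)]
    if len(fit) >= 2 and node.logic in ("and", "or"):
        fit_ids = {id(c) for c in fit}
        rest = [c for c in node.children if id(c) not in fit_ids]
        merged = Node("merged", logic=node.logic, children=fit, engine=engine.name)
        node.children = rest + [merged]
    for child in node.children:
        visit(child, engine)
    if all(c.engine is not None for c in node.children):
        node.engine = "combine"                 # only combines child results


def mark(root, engines):
    for engine in engines:                      # highest priority first
        visit(root, engine)
        if root.engine is not None:
            break
    return root


root = Node("root", logic="and", children=[
    Node("r1", logic="or", children=[Node("gender", "enum"), Node("region", "enum")]),
    Node("pay_amount", "numeric"),
    Node("level", "enum"),
])
mark(root, [Engine("bitmap", "enum"), Engine("sql", "numeric")])
```

In this example the `r1` subtree and the `level` leaf are regrouped under one merged child marked for the bitmap engine, the `pay_amount` leaf is marked for the SQL engine, and the root is marked once both children are marked.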
In step S14, the simplified rule tree is converted into an execution plan tree according to the node labels, the leaf nodes of the execution plan tree being used to characterize the computing units, the non-leaf nodes of the execution plan tree being used to characterize the logical relationships between the connected child nodes.
In one implementation, converting a reduced rule tree into an execution plan tree based on node labels, includes:
and according to the node marks, converting any node in the simplified rule tree into an execution node of a calculation engine corresponding to any node to obtain an execution plan tree.
That is, the simplified rule tree is traversed, and the nodes marked with computing engines are converted into execution plan nodes, where an execution plan node is a computing unit that the computing engine can recognize directly. The leaf nodes of the execution plan tree are computing units and the non-leaf nodes are logical relationships, which facilitates subsequent processing of the execution plan tree and further improves the efficiency of object circle selection.
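A minimal Python sketch of this conversion is given below. It assumes that a marked subtree is flattened into a single expression string for its engine and that nodes carrying the placeholder mark `"combine"` become plan-level relation nodes; the classes `Node`, `ComputeUnit`, and `PlanRelation`, the expression syntax, and the example labels are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Node:
    """A node of the marked, simplified rule tree."""
    expr: str = ""                    # label rule text, for leaves
    logic: str = ""                   # "and" / "or", for relation nodes
    children: List["Node"] = field(default_factory=list)
    engine: Optional[str] = None      # engine mark, or "combine"


@dataclass
class ComputeUnit:
    """Leaf of the execution plan tree: one query for one engine."""
    engine: str
    expr: str


@dataclass
class PlanRelation:
    """Non-leaf of the execution plan tree: logic over child results."""
    logic: str
    children: List[Union["ComputeUnit", "PlanRelation"]]


def render(node):
    """Flatten a subtree into one expression string for its engine."""
    if not node.children:
        return node.expr
    return "(" + f" {node.logic} ".join(render(c) for c in node.children) + ")"


def to_plan(node):
    if node.engine == "combine":
        return PlanRelation(node.logic, [to_plan(c) for c in node.children])
    return ComputeUnit(node.engine, render(node))


root = Node(logic="and", engine="combine", children=[
    Node(logic="or", engine="bitmap", children=[
        Node(expr="gender in (male, female)"),
        Node(expr="region = beijing"),
    ]),
    Node(expr="pay_amount > 100", engine="sql"),
])
plan = to_plan(root)
```

The resulting plan has one bitmap computing unit covering the whole `or` subtree and one SQL computing unit for the numeric condition, joined by an `and` relation.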
In step S15, the data to be processed is processed based on the execution plan tree, and a processing result is obtained.
In this step, during data processing, the leaf nodes of the execution plan tree are traversed and each computing unit is executed by a custom execution engine to obtain a sub-result. The sub-results are then recursively aggregated into the final result according to the operation rules of the execution plan tree.
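The recursive aggregation can be sketched as follows. The engines are stubbed with an in-memory lookup table so the example runs on its own; in practice each would submit its query to a real backend. The names `ComputeUnit`, `PlanRelation`, `execute`, and `FAKE_RESULTS`, as well as the object IDs, are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class ComputeUnit:
    engine: str
    query: str


@dataclass
class PlanRelation:
    logic: str   # "and" (intersection), "or" (union), "diff" (difference)
    children: List[Union["ComputeUnit", "PlanRelation"]]


def execute(node, engines):
    """Run leaf computing units, then combine sub-results bottom-up."""
    if isinstance(node, ComputeUnit):
        return engines[node.engine](node.query)
    results = [execute(child, engines) for child in node.children]
    out = set(results[0])              # copy so engine results are not mutated
    for sub in results[1:]:
        if node.logic == "and":
            out &= sub
        elif node.logic == "or":
            out |= sub
        elif node.logic == "diff":
            out -= sub
        else:
            raise ValueError(f"unknown logic: {node.logic}")
    return out


# Stub engines: each returns a set of object ids for a query (invented data).
FAKE_RESULTS = {
    "gender in (male, female)": {1, 2, 3, 4},
    "pay_amount > 100": {2, 4, 6},
}
engines = {
    "bitmap": lambda q: FAKE_RESULTS[q],
    "sql": lambda q: FAKE_RESULTS[q],
}

plan = PlanRelation("and", [
    ComputeUnit("bitmap", "gender in (male, female)"),
    ComputeUnit("sql", "pay_amount > 100"),
])
result = execute(plan, engines)
```

With the stub data above, the two sub-results are intersected, so `result` contains the object IDs present in both sets.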
In this way, through structure adjustment of the initial rule tree, a simplified rule tree is obtained, the complexity of the simplified rule tree is lower than that of the initial rule tree, and further, according to different applicable calculation engines, nodes in the simplified rule tree are aggregated to obtain an execution plan tree, so that the object circling and selecting efficiency is further improved.
FIG. 4 shows a logic diagram of the present solution. Compared with the traditional Hive SQL approach, when the method is applied to the object circle-selection service, the average time for object circle selection is reduced from 2 hours to 5 minutes, and the computing speed and resource utilization are greatly improved.
Specifically, first, an initial rule tree may be generated according to an object selection rule, where a "relation" node is a non-leaf node used to characterize the logical relationship between its connected child nodes, and a "label" node is a leaf node used to characterize a label rule.
Then, the nodes in the initial rule tree may be traversed from bottom to top, and leaf nodes that share the same parent node are merged according to their corresponding label rules to obtain a simplified rule tree. As shown in Fig. 4, the two "label" nodes inside the dotted line are two leaf nodes that can be merged; merging them yields the simplified rule tree.
Next, target computing engines are determined one after another according to a preset priority order, and the nodes in the simplified rule tree are traversed from top to bottom. Whenever an unmarked node satisfies the applicable condition of the target computing engine, that node is marked as a computing node of the target computing engine, until the marking of the root node of the simplified rule tree is completed. As shown in Fig. 4, the "relation" node inside the dotted line, its corresponding "label" leaf node, and the "label" leaf node on the other side are all nodes to which the computing engine "engine 1" is applicable; on this basis, these three nodes can be marked separately from the nodes applicable to other computing engines.
Then, according to the node marks, the simplified rule tree is converted into an execution plan tree, whose leaf nodes characterize computing units and whose non-leaf nodes characterize the logical relationships between their connected child nodes. As shown in Fig. 4, nodes in the simplified rule tree that carry the same computing engine mark correspond to one node in the execution plan tree, and each such node is processed by a different computing unit, namely computing unit 1, computing unit 2 and computing unit 3.
Finally, the data to be processed can be processed based on the execution plan tree to obtain a processing result. Specifically, the sub-result ("sub result") corresponding to each computing unit is obtained separately, and the sub-results are then aggregated to obtain the final processing result ("final result") corresponding to the object selection rule.
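The first step of this flow, building the initial rule tree, can be illustrated with a minimal Python sketch. The class names `LabelNode` and `RelationNode`, the function `build_rule_tree`, and the nested-dictionary input format are assumptions made for illustration only; the disclosure does not prescribe a concrete data format.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class LabelNode:
    """Leaf node: one label rule (label name, operation rule, label value)."""
    name: str
    op: str
    value: object


@dataclass
class RelationNode:
    """Non-leaf node: a logical relation ("and" / "or") over its children."""
    relation: str
    children: List["Node"] = field(default_factory=list)


Node = Union[LabelNode, RelationNode]


def build_rule_tree(rule: dict) -> Node:
    """Recursively wrap a nested rule description into tree nodes."""
    if "relation" in rule:
        children = [build_rule_tree(r) for r in rule["rules"]]
        return RelationNode(rule["relation"], children)
    return LabelNode(rule["label"], rule["op"], rule["value"])


# Hypothetical selection rule: age >= 18 AND (city in Beijing OR city in Shanghai)
rule = {
    "relation": "and",
    "rules": [
        {"label": "age", "op": ">=", "value": 18},
        {"relation": "or", "rules": [
            {"label": "city", "op": "in", "value": ["Beijing"]},
            {"label": "city", "op": "in", "value": ["Shanghai"]},
        ]},
    ],
}
tree = build_rule_tree(rule)
```

In this sketch the two `city` leaves under the `or` node play the role of the two "label" nodes inside the dotted line of Fig. 4, that is, the candidates for the later merging step.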
Based on the characteristics of multiple computing engines, the technical scheme provided by the embodiment of the disclosure performs structural adjustment such as merging, splitting, moving and the like on the nodes in the initial rule tree on the premise of not changing the operation result, selects proper computing engines for different nodes, and respectively executes the computing engines to obtain a data processing result, so that the advantages of each computing engine can be fully exerted, and the computing speed and the resource utilization rate can be improved.
FIG. 5 is a block diagram of a data processing apparatus according to an exemplary embodiment, the apparatus comprising:
a generation unit configured to generate an initial rule tree according to a pre-acquired data processing rule, wherein the data processing rule comprises at least one label rule and a logical relationship between the label rules, the initial rule tree comprises leaf nodes and non-leaf nodes, the leaf nodes are used to characterize the label rules, and the non-leaf nodes are used to characterize the logical relationships between connected child nodes;
a merging unit configured to traverse the nodes in the initial rule tree from bottom to top, and merge leaf nodes having the same parent node according to their corresponding label rules to obtain a simplified rule tree;
a marking unit configured to sequentially determine target computing engines according to a preset priority order, traverse the nodes in the simplified rule tree from top to bottom, and mark any unmarked node as a computing node of the target computing engine when that node meets the applicable condition of the target computing engine, until the marking of the root node in the simplified rule tree is completed;
a conversion unit configured to perform conversion of the simplified rule tree into an execution plan tree according to the node marking, leaf nodes of the execution plan tree being used to characterize the calculation unit, non-leaf nodes of the execution plan tree being used to characterize logical relationships between connected child nodes;
and a processing unit configured to process the data to be processed based on the execution plan tree to obtain a processing result.
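As a rough illustration of the processing unit, the sketch below walks an execution plan tree, obtains the sub-result of each computing unit, and aggregates the sub-results according to the logical relation of each non-leaf node. It assumes that every sub-result is a set of object identifiers, that "and" maps to set intersection and "or" to set union, and that the plan is a plain nested dictionary; `run_plan` and the sample data are hypothetical.

```python
from functools import reduce


def run_plan(plan, run_unit):
    """Evaluate an execution plan tree.

    plan     -- nested dict; leaves are {"type": "unit", "id": ...},
                non-leaves are {"type": "and" | "or", "children": [...]}
    run_unit -- callable returning the sub-result (a set) of one computing unit
    """
    if plan["type"] == "unit":
        return run_unit(plan["id"])
    results = [run_plan(child, run_unit) for child in plan["children"]]
    combine = set.intersection if plan["type"] == "and" else set.union
    return reduce(combine, results)


# Hypothetical sub-results, keyed by computing unit id.
sub_results = {1: {1, 2, 3, 4}, 2: {2, 4, 6}, 3: {4, 5}}

plan = {"type": "and", "children": [
    {"type": "unit", "id": 1},
    {"type": "or", "children": [
        {"type": "unit", "id": 2},
        {"type": "unit", "id": 3},
    ]},
]}

final_result = run_plan(plan, sub_results.__getitem__)
```

In a real deployment each computing unit would be dispatched to its own computing engine, possibly in parallel; the lookup in `sub_results` merely stands in for that call.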
In one implementation, the merging unit is configured to perform:
traversing the nodes in the initial rule tree according to the sequence from bottom to top, and taking the currently traversed node as a target node;
under the condition that a plurality of leaf nodes are included in the child nodes of the target node, acquiring a label rule corresponding to the included leaf nodes, wherein the label rule comprises a label name, an operation rule and a label value;
and taking the nodes with the same label name among the included leaf nodes as candidate nodes, and merging the candidate nodes when the operation rules and the label values of the candidate nodes meet a merging condition, to obtain the simplified rule tree.
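The merging steps above can be sketched as follows. The disclosure does not spell out the merging condition, so this example assumes one simple case: sibling leaves with the same label name and the operation `in`, under an `or` parent, are merged by taking the union of their label values. The names `Label`, `Relation` and `simplify` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Label:
    name: str     # label name
    op: str       # operation rule
    value: list   # label value(s)


@dataclass
class Relation:
    relation: str                      # "and" / "or"
    children: list = field(default_factory=list)


def simplify(node):
    """Bottom-up pass: simplify the children first, then merge sibling leaves."""
    if isinstance(node, Label):
        return node
    children = [simplify(child) for child in node.children]
    merged, first_seen = [], {}
    for child in children:
        # Assumed merging condition: "in" leaves under an "or" parent.
        mergeable = (isinstance(child, Label) and child.op == "in"
                     and node.relation == "or")
        if mergeable and child.name in first_seen:
            target = first_seen[child.name]
            target.value = sorted(set(target.value) | set(child.value))
        else:
            if mergeable:
                # Copy so that the input tree is left untouched.
                child = Label(child.name, child.op, list(child.value))
                first_seen[child.name] = child
            merged.append(child)
    return Relation(node.relation, merged)


tree = Relation("and", [
    Label("age", "in", [18]),
    Relation("or", [
        Label("city", "in", ["Beijing"]),
        Label("city", "in", ["Shanghai"]),
        Label("vip", "in", [1]),
    ]),
])
simplified = simplify(tree)
```

Other merging conditions, for example combining range comparisons on the same label under an `and` parent, would follow the same pattern with a different `mergeable` test.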
In one implementation, the marking unit is further configured to perform:
traversing the nodes in the simplified rule tree according to the sequence from top to bottom, and taking the node traversed currently as a target node;
if the target node is unmarked and meets the applicable condition of the target computing engine, marking the target node as a computing node of the target computing engine;
and if the target node is already marked, traversing the next node, until no unmarked child nodes remain, and then marking the root node.
In one implementation, the marking unit is further configured to perform:
if the target node is unmarked and does not meet the applicable condition of the target computing engine, but lower-layer nodes of the target node meet the adjustment condition of the target computing engine, merging the lower-layer nodes that meet the adjustment condition into a merged child node of the target node, and marking the merged child node as a computing node of the target computing engine.
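A minimal sketch of the marking pass, including the adjustment step, is given below. Several points are assumptions rather than details taken from the disclosure: an engine is treated as applicable to a node when it supports every label in the still-unmarked subtree; the adjustment groups the applicable children of a relation node into one merged child that repeats the parent's relation (valid because "and" and "or" are associative); and relation nodes that end up spanning several engines receive the hypothetical mark `"combine"`.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    kind: str                        # "relation" or "label"
    payload: str                     # relation type or label name
    children: List["Node"] = field(default_factory=list)
    engine: Optional[str] = None     # filled in once the node is marked


@dataclass
class Engine:
    name: str
    supports: Callable[[str], bool]  # hypothetical test on a label name


def applicable(engine: Engine, node: Node) -> bool:
    """Assumed condition: the whole unmarked subtree can run on this engine."""
    if node.engine is not None:
        return False
    if node.kind == "label":
        return engine.supports(node.payload)
    return all(applicable(engine, c) for c in node.children)


def mark_subtree(node: Node, name: str) -> None:
    node.engine = name
    for child in node.children:
        mark_subtree(child, name)


def mark(node: Node, engine: Engine) -> None:
    """One top-down pass for a single target engine."""
    if node.engine is not None:
        return
    if applicable(engine, node):
        mark_subtree(node, engine.name)
        return
    if node.kind != "relation":
        return
    # Assumed adjustment: group the children this engine can handle
    # into one merged child that repeats the parent's relation.
    ok = [c for c in node.children if applicable(engine, c)]
    if len(ok) >= 2:
        ok_ids = {id(c) for c in ok}
        merged = Node("relation", node.payload, ok)
        mark_subtree(merged, engine.name)
        rest = [c for c in node.children if id(c) not in ok_ids]
        node.children = rest + [merged]
    for child in node.children:
        mark(child, engine)


def finish(node: Node) -> None:
    """Nodes still unmarked only combine the results of marked children."""
    if node.engine is None:
        if node.kind == "label":
            raise ValueError(f"no engine supports label {node.payload!r}")
        node.engine = "combine"      # hypothetical marker name
    for child in node.children:
        finish(child)


def mark_tree(root: Node, engines: List[Engine]) -> Node:
    for engine in engines:           # preset priority order
        mark(root, engine)
    finish(root)
    return root


engines = [
    Engine("engine1", lambda name: name in {"age", "city"}),
    Engine("engine2", lambda name: True),
]
root = Node("relation", "and", [
    Node("label", "age"), Node("label", "city"), Node("label", "embedding"),
])
mark_tree(root, engines)
```

Here `engine1` cannot handle the `embedding` label, so the `age` and `city` leaves are moved into a merged `and` child marked for `engine1`, the `embedding` leaf falls through to `engine2`, and the root only combines the two results.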
In one implementation, the conversion unit is configured to perform:
and converting, according to the node marks, each node in the simplified rule tree into an execution node of the computing engine corresponding to that node, to obtain the execution plan tree.
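The conversion can be pictured with the short sketch below. It assumes that each topmost node marked with a computing engine becomes one computing-unit leaf that carries its sub-rule, and that nodes without an engine mark (or with the hypothetical mark `"combine"`) become non-leaf nodes of the execution plan tree. `RuleNode`, `PlanNode` and `to_plan` are illustrative names, not part of the disclosure.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Iterator, List, Optional


@dataclass
class RuleNode:
    kind: str                        # "relation" or "label"
    payload: str                     # relation type or label name
    children: List["RuleNode"] = field(default_factory=list)
    engine: Optional[str] = None     # engine mark from the marking step


@dataclass
class PlanNode:
    kind: str                        # "unit" (leaf) or "relation" (non-leaf)
    payload: str                     # engine name or relation type
    unit_id: Optional[int] = None    # number of the computing unit
    rule: Optional[RuleNode] = None  # sub-rule handed to the engine
    children: List["PlanNode"] = field(default_factory=list)


def to_plan(node: RuleNode, ids: Optional[Iterator[int]] = None) -> PlanNode:
    """Turn a marked rule tree into an execution plan tree."""
    ids = count(1) if ids is None else ids
    if node.engine not in (None, "combine"):
        # The whole marked subtree is executed by one computing unit.
        return PlanNode("unit", node.engine, next(ids), node)
    if node.kind == "label":
        raise ValueError(f"label {node.payload!r} has no engine mark")
    children = [to_plan(child, ids) for child in node.children]
    return PlanNode("relation", node.payload, children=children)


marked = RuleNode("relation", "and", [
    RuleNode("relation", "or", [
        RuleNode("label", "city", engine="engine1"),
        RuleNode("label", "age", engine="engine1"),
    ], engine="engine1"),
    RuleNode("label", "embedding", engine="engine2"),
])
plan = to_plan(marked)
```

The resulting plan has an `and` root with two computing-unit leaves, numbered 1 and 2 in traversal order, which mirrors how Fig. 4 assigns marked regions of the simplified rule tree to separate computing units.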
From the above, it can be seen that the technical solution provided by the embodiments of the present disclosure, based on the characteristics of multiple computing engines and without changing the operation result, performs structural adjustments such as merging, splitting and moving on the nodes in the initial rule tree, selects an appropriate computing engine for each node, and executes the nodes on their respective engines to obtain a data processing result, thereby giving full play to the advantages of each computing engine and improving the computing speed and the resource utilization rate.
The specific manner in which the various modules of the apparatus in the above embodiments perform their operations has been described in detail in connection with the embodiments of the method, and will not be described in detail herein.
FIG. 6 is a block diagram of an electronic device for data processing, according to an example embodiment.
In an exemplary embodiment, a computer-readable storage medium is also provided, such as a memory, comprising instructions executable by a processor of an electronic device to perform the above-described method. For example, the computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.
In an exemplary embodiment, a computer program product is also provided which, when run on a computer, causes the computer to carry out the method of data processing described above.
Fig. 7 is a block diagram illustrating an apparatus 800 for data processing according to an example embodiment.
For example, apparatus 800 may be a mobile phone, computer, digital broadcast electronic device, messaging device, game console, tablet device, medical device, exercise device, personal digital assistant, or the like.
Referring to fig. 7, apparatus 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the apparatus 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interactions between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the device 800. Examples of such data include instructions for any application or method operating on the device 800, contact data, phonebook data, messages, pictures, videos, and the like. The memory 804 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power component 806 provides power to the various components of the device 800. The power component 806 can include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 800.
The multimedia component 808 includes a screen between the device 800 and the user that provides an output interface. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. The front camera and/or the rear camera may receive external multimedia data when the device 800 is in an operational mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 further includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, click wheel, buttons, etc. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.
The sensor assembly 814 includes one or more sensors for providing status assessment of various aspects of the apparatus 800. For example, the sensor assembly 814 may detect an on/off state of the apparatus 800 and the relative positioning of components, such as the display and keypad of the apparatus 800. The sensor assembly 814 may also detect a change in position of the apparatus 800 or of one component of the apparatus 800, the presence or absence of user contact with the apparatus 800, the orientation or acceleration/deceleration of the apparatus 800, and a change in temperature of the apparatus 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscopic sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate communication between the apparatus 800 and other devices, either in a wired or wireless manner. The apparatus 800 may access a wireless network based on a communication standard, such as WiFi, an operator network (e.g., 2G, 3G, 4G, or 5G), or a combination thereof. In one exemplary embodiment, the communication component 816 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra-Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic elements for executing the methods described in the first and second aspects.
In an exemplary embodiment, a non-transitory computer-readable storage medium is also provided, such as the memory 804 including instructions executable by the processor 820 of the apparatus 800 to perform the above-described method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product containing instructions is also provided which, when run on a computer, cause the computer to perform the data processing method of any of the above embodiments.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (9)

1. A method of data processing, comprising:
generating an initial rule tree according to a pre-acquired data processing rule, wherein the data processing rule comprises at least one label rule and a logic relation between the label rules, the initial rule tree comprises leaf nodes and non-leaf nodes, the leaf nodes are used for representing the label rules, the non-leaf nodes are used for representing the logic relation between connected child nodes, the labels refer to data items used for data processing, the label rules refer to conditions used for data processing, the initial rule tree is formed by analyzing the data processing rule input by a service staff, packaging the label rules into leaf nodes, packaging the logic relation between the label rules into non-leaf nodes and recursively combining the label rules;
traversing the nodes in the initial rule tree according to the sequence from bottom to top, merging the leaf nodes with the same parent nodes according to the corresponding label rules to obtain a simplified rule tree, wherein the method comprises the following steps: traversing the nodes in the initial rule tree according to the sequence from bottom to top, taking the node traversed currently as a target node, acquiring a label rule corresponding to the included leaf node when a plurality of leaf nodes are included in a child node of the target node, wherein the label rule comprises a label name, an operation rule and a label value, taking the node with the same label name in the included leaf node as a candidate node, and merging the candidate nodes when the operation rule and the label value of the candidate node meet a merging condition to obtain a simplified rule tree;
sequentially determining target computing engines according to a preset priority order, traversing nodes in the simplified rule tree according to the order from top to bottom, marking any unlabeled node as a computing node of the target computing engine under the condition that any unlabeled node meets the applicable condition of the target computing engine until the marking of a root node in the simplified rule tree is completed, wherein the computing node of the target computing engine is obtained by marking merging sub-nodes, and the merging sub-nodes are obtained by merging the lower-layer node meeting the adjustable condition with the unlabeled node under the condition that any unlabeled node does not meet the applicable condition of the target computing engine and the lower-layer node of the unlabeled node meets the adjustable condition of the target computing engine;
converting the simplified rule tree into an execution plan tree according to the node mark, wherein leaf nodes of the execution plan tree are used for representing a computing unit, and non-leaf nodes of the execution plan tree are used for representing logic relations among connected child nodes;
and processing the data to be processed based on the execution plan tree to obtain a processing result.
2. The method according to claim 1, wherein traversing the nodes in the simplified rule tree in the order from top to bottom, marking any unmarked node as a computing node of the target computing engine if the unmarked node satisfies the applicable condition of the target computing engine, until the marking of the root node in the simplified rule tree is completed, comprises:
traversing the nodes in the simplified rule tree according to the sequence from top to bottom, and taking the node traversed currently as a target node;
if the target node meets the applicable conditions of the target computing engine under the condition that the target node is not marked, marking the target node as the computing node of the target computing engine;
and under the condition that the target node is marked, traversing the next node until no unmarked child nodes exist, and marking the root node.
3. The data processing method according to claim 1, wherein said converting the simplified rule tree into an execution plan tree according to the node mark includes:
and according to the node mark, converting any node in the simplified rule tree into an execution node of a calculation engine corresponding to the any node to obtain an execution plan tree.
4. A data processing apparatus, comprising:
a generation unit configured to generate an initial rule tree according to a pre-acquired data processing rule, wherein the data processing rule comprises at least one label rule and a logic relation between the label rules, the initial rule tree comprises leaf nodes and non-leaf nodes, the leaf nodes are used for representing the label rule, the non-leaf nodes are used for representing the logic relation between connected child nodes, the labels refer to data items used for data processing, the label rules refer to conditions used for data processing, the initial rule tree is formed by analyzing the data processing rule input by a service personnel, encapsulating the label rule into the leaf nodes, encapsulating the logic relation between the label rules into the non-leaf nodes and recursively combining the label rules;
a merging unit configured to perform traversing of nodes in the initial rule tree in a sequence from bottom to top, and merging processing of leaf nodes with identical parent nodes according to corresponding label rules to obtain a simplified rule tree, where the merging unit is configured to perform: traversing the nodes in the initial rule tree according to the sequence from bottom to top, taking the node traversed currently as a target node, acquiring a label rule corresponding to the included leaf node when a plurality of leaf nodes are included in a child node of the target node, wherein the label rule comprises a label name, an operation rule and a label value, taking the node with the same label name in the included leaf node as a candidate node, and merging the candidate nodes when the operation rule and the label value of the candidate node meet a merging condition to obtain a simplified rule tree;
a marking unit configured to perform sequentially determining target computing engines according to a preset priority order, traversing nodes in the simplified rule tree according to the order from top to bottom, marking any unlabeled node as a computing node of the target computing engine under the condition that any unlabeled node meets the applicable condition of the target computing engine until the marking of a root node in the simplified rule tree is completed, wherein the computing node of the target computing engine is obtained by marking merging sub-nodes, and the merging sub-nodes are obtained by merging the lower node meeting the adjustable condition with the unlabeled node under the condition that any unlabeled node does not meet the applicable condition of the target computing engine and the lower node of the unlabeled node meets the adjustable condition of the target computing engine;
a conversion unit configured to perform conversion of the simplified rule tree into an execution plan tree according to the node marking, leaf nodes of the execution plan tree being used to characterize the calculation unit, non-leaf nodes of the execution plan tree being used to characterize logical relationships between connected child nodes;
and a processing unit configured to perform processing of the data to be processed based on the execution plan tree to obtain a processing result.
5. The data processing apparatus of claim 4, wherein the marking unit is further configured to perform:
traversing the nodes in the simplified rule tree according to the sequence from top to bottom, and taking the node traversed currently as a target node;
if the target node meets the applicable conditions of the target computing engine under the condition that the target node is not marked, marking the target node as the computing node of the target computing engine;
and under the condition that the target node is marked, traversing the next node until no unmarked child nodes exist, and marking the root node.
6. The data processing apparatus according to claim 4, wherein the conversion unit is configured to perform:
and according to the node mark, converting any node in the simplified rule tree into an execution node of a calculation engine corresponding to the any node to obtain an execution plan tree.
7. An electronic device, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the data processing method of any one of claims 1 to 3.
8. A computer readable storage medium, characterized in that instructions in the computer readable storage medium, when executed by a processor of a data processing electronic device, enable the data processing electronic device to perform the data processing method of any one of claims 1 to 3.
9. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the data processing method of any of claims 1-3.
CN202210498840.XA 2022-05-09 2022-05-09 Data processing method and device, electronic equipment and storage medium Active CN114925092B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210498840.XA CN114925092B (en) 2022-05-09 2022-05-09 Data processing method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210498840.XA CN114925092B (en) 2022-05-09 2022-05-09 Data processing method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114925092A CN114925092A (en) 2022-08-19
CN114925092B CN114925092B (en) 2023-05-30

Family

ID=82808182

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210498840.XA Active CN114925092B (en) 2022-05-09 2022-05-09 Data processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114925092B (en)

Also Published As

Publication number Publication date
CN114925092A (en) 2022-08-19

Similar Documents

Publication Publication Date Title
US20210117726A1 (en) Method for training image classifying model, server and storage medium
US20210232847A1 (en) Method and apparatus for recognizing text sequence, and storage medium
US11120078B2 (en) Method and device for video processing, electronic device, and storage medium
CN114925092B (en) Data processing method and device, electronic equipment and storage medium
EP2457183B1 (en) System and method for tagging multiple digital images
US20220272161A1 (en) Method and device for displaying group
CN109961094B (en) Sample acquisition method and device, electronic equipment and readable storage medium
CN109255128B (en) Multi-level label generation method, device and storage medium
EP3767488A1 (en) Method and device for processing untagged data, and storage medium
CN110781323A (en) Method and device for determining label of multimedia resource, electronic equipment and storage medium
CN110930984A (en) Voice processing method and device and electronic equipment
CN111046927B (en) Method and device for processing annotation data, electronic equipment and storage medium
CN113609380B (en) Label system updating method, searching device and electronic equipment
CN116822924A (en) Workflow configuration method, device, equipment and storage medium
CN113849723A (en) Search method and search device
CN113779257A (en) Method, device, equipment, medium and product for analyzing text classification model
CN112783779A (en) Test case generation method and device, electronic equipment and storage medium
CN112328809A (en) Entity classification method, device and computer readable storage medium
CN111079421B (en) Text information word segmentation processing method, device, terminal and storage medium
CN111275089A (en) Classification model training method and device and storage medium
CN114219443A (en) Document data processing method, device and equipment
CN115376504A (en) Voice interaction method and device for intelligent product and readable storage medium
CN114338587B (en) Multimedia data processing method and device, electronic equipment and storage medium
CN115303218B (en) Voice instruction processing method, device and storage medium
CN114154465B (en) Structure reconstruction method and device of structure diagram, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant