CN113297335A - Data processing method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN113297335A
Authority
CN
China
Prior art keywords
entity
data
task
entities
task data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110571687.4A
Other languages
Chinese (zh)
Inventor
吴辰侣
刘明鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Weikun Shanghai Technology Service Co Ltd
Original Assignee
Weikun Shanghai Technology Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Weikun Shanghai Technology Service Co Ltd filed Critical Weikun Shanghai Technology Service Co Ltd
Priority to CN202110571687.4A priority Critical patent/CN113297335A/en
Publication of CN113297335A publication Critical patent/CN113297335A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079 Root cause analysis, i.e. error or fault diagnosis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/548 Queue

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application relates to research and development management and discloses a data processing method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring first task data from a first message queue, wherein the first task data is predefined task data; acquiring second task data from a second message queue, wherein the second task data is data generated when the first task data is executed; generating a graph model according to the first task data and the second task data, wherein the graph model comprises at least two entities, association relationships among different entities of the at least two entities, and access information of each of the at least two entities; and determining, according to the graph model, a first entity and a second entity that are faulty among the at least two entities, wherein the first entity is determined according to the access information of the first entity, and the second entity is determined according to the association relationship between the first entity and the second entity. By implementing the embodiments of the application, a plurality of faulty entities can be determined.

Description

Data processing method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method and apparatus, an electronic device, and a storage medium.
Background
With the prevalence of cloud computing, big data, and artificial intelligence, enterprises accumulate massive amounts of data, and metadata management systems are often designed to manage it. However, when a metadata management system is used to manage such data, generally only one faulty entity can be determined, so the application scenarios are limited.
Disclosure of Invention
The embodiment of the application provides a data processing method and device, electronic equipment and a storage medium, which can determine a plurality of entities with faults.
A first aspect of the present application provides a data processing method, including:
acquiring first task data from a first message queue, wherein the first task data is predefined task data;
acquiring second task data from a second message queue, wherein the second task data is data when the first task data is executed;
generating a graph model according to the first task data and the second task data, wherein the graph model comprises at least two entities, association relationships among different entities of the at least two entities, and access information of each of the at least two entities;
and determining, according to the graph model, a first entity and a second entity that are faulty among the at least two entities, wherein the first entity is determined according to the access information of the first entity, and the second entity is determined according to the association relationship between the first entity and the second entity.
A second aspect of the present application provides a data processing apparatus comprising a first obtaining module, a second obtaining module, a generating module, and a determining module, wherein,
the first obtaining module is configured to obtain first task data from a first message queue, where the first task data is predefined task data;
the second obtaining module is configured to obtain second task data from a second message queue, where the second task data is data when the first task data is executed;
the generating module is configured to generate a graph model according to the first task data and the second task data, where the graph model includes at least two entities, an association relationship between different entities in the at least two entities, and access information of each entity in the at least two entities;
the determining module is configured to determine, according to the graph model, a first entity and a second entity that are faulty in the at least two entities, where the first entity is determined according to the access information of the first entity, and the second entity is determined according to an association relationship between the first entity and the second entity.
A third aspect of the present application provides an electronic device for data processing, comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, and the programs comprise instructions for performing the steps of any of the data processing methods described above.
A fourth aspect of the present application provides a computer-readable storage medium storing a computer program, wherein the computer program is executed by a processor to implement any of the data processing methods described above.
In the above technical solution, the definition-time data and the execution-time data of the same task are acquired from different message queues, and a graph model is generated from the acquired data, so that multiple faulty entities can be determined according to the graph model, which enriches the application scenarios. Because the data used to generate the graph model covers both the definition and the execution of the same task, a more complete graph model can be built once enough data has been collected, multiple faulty entities can be determined comprehensively, and developers are prepared to update those entities afterwards. In addition, acquiring both kinds of data from message queues avoids the complex process of interacting with different data sources and improves data integration efficiency.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Wherein:
FIG. 1 is a schematic diagram of a data processing system provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of a data processing method provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a graph model;
FIG. 4 is a schematic flowchart of another data processing method provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of a data processing apparatus provided by an embodiment of the present application;
FIG. 6 is a schematic structural diagram of an electronic device in a hardware operating environment provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Details are described below.
The terms "first" and "second" in the description and claims of the present application and the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may optionally include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Referring first to fig. 1, fig. 1 is a schematic diagram of a data processing system provided in an embodiment of the present application. The data processing system 100 includes a data processing apparatus 110, which is used to acquire and process task data, among other operations. The data processing system 100 may be a single integrated device or may comprise multiple devices. For convenience of description, the data processing apparatus 110 is referred to below as an electronic device or a server, which is not limited herein.
The electronic device may include various handheld devices with wireless communication functions, vehicle-mounted devices, wearable devices, computing devices, or other processing devices connected to a wireless modem, as well as various forms of User Equipment (UE), Mobile Stations (MS), terminal devices, and so on.
Referring to fig. 2, fig. 2 is a schematic flowchart of a data processing method according to an embodiment of the present application. The data processing method can be applied to an electronic device or a server, and as shown in fig. 2, the method includes:
201. Acquire first task data from a first message queue, wherein the first task data is predefined task data.
The first task data may include a task name, a task number, task data associated with the first task data, task content, a task type, a predefined time for scheduling the first task data, and the like. It is to be understood that the first task data may further include a data type of a task name, a data type of a task number, a data type of task data associated with the first task data, a data type of a task content, a data type of a task type, a data type of a predefined time for scheduling the first task data, and the like.
It should be noted that the data type may include, for example, an integer type, a floating point type, a character type, a boolean type, a character string type, an array type, and the like. Integer types may include, for example: byte, short, int, long, etc. Floating point types may include, for example: float, double, etc. Character types may include, for example: char, varchar, etc. Boolean values may include, for example: true, false, etc. The string type may be, for example, string. The array type may be, for example, array.
For example, the first task data may be the task data of a task that reflows Hive data to a relational database. Specifically, as shown in Table 1: the task name is "Hive data reflow to the relational database", and the data type of the task name is string; the task number is 2, and the data type of the task number is int; the task data associated with the first task data is 1, and the data type of the task data associated with the first task data is array; the task content is shown in Figures BDA0003082809680000041 to BDA0003082809680000043 of the original publication (images, not reproduced as text), and the data type of the task content is string; the task type is hive2Rds, and the data type of the task type is string; the predefined time for scheduling the first task data is 0509?, and the data type of that predefined time is string.
Table 1: Task data for reflowing Hive data to a relational database
[Table 1 appears as images BDA0003082809680000044 and BDA0003082809680000051 in the original publication; its values are those listed in the preceding paragraph.]
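As an illustration only, the fields listed for Table 1 can be sketched as a Python record. The class name `TaskDefinition` and its attribute names are hypothetical and do not come from the application; the task content is left as a placeholder because the source shows it only as figures, and the schedule string is copied verbatim from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskDefinition:
    """Hypothetical container for predefined (first) task data."""
    task_name: str               # data type: string
    task_number: int             # data type: int
    associated_tasks: List[int]  # data type: array
    task_content: str            # data type: string
    task_type: str               # data type: string
    schedule: str                # predefined scheduling time, data type: string


hive_to_rds = TaskDefinition(
    task_name="Hive data reflow to the relational database",
    task_number=2,
    associated_tasks=[1],
    # Placeholder: the real task content appears only as images in the source.
    task_content="<task content not reproduced>",
    task_type="hive2Rds",
    schedule="0509?",  # copied as printed in the source text
)
```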
That the first task data is predefined task data can be understood as follows: the first task data is data obtained by embedding task content in predefined task data.
The first message queue may also include predefined task data other than the first task data, which is not limited herein. It is understood that such other predefined task data is likewise obtained by embedding task content in the other predefined task data.
Optionally, step 201 includes: acquiring a preset task type; and acquiring first task data from the first message queue according to the preset task type.
The preset task type may be, for example, a data cleaning task, a data reflow task, an interface call task, and the like, which is not limited herein.
Acquiring the preset task type may include: acquiring the preset task type when an input operation on a display interface is detected.
The display interface may include an input box; in that case, the preset task type is acquired from the input box when the input operation on the display interface is detected.
Acquiring the first task data from the first message queue according to the preset task type may include: acquiring, from the first message queue, first task data whose task type is the preset task type.
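The selection by preset task type can be sketched as follows. This is a minimal sketch that assumes an in-process `queue.Queue` stands in for the first message queue and that each message is a dictionary with a `task_type` key; the function name `fetch_tasks_by_type` and the task type `dataClean` are hypothetical.

```python
import queue
from typing import Dict, List


def fetch_tasks_by_type(message_queue: "queue.Queue", preset_type: str) -> List[Dict]:
    """Drain the queue and keep only messages whose task type matches.

    Note: in this sketch, non-matching messages are consumed and discarded.
    """
    matched = []
    while True:
        try:
            message = message_queue.get_nowait()
        except queue.Empty:
            break
        if message.get("task_type") == preset_type:
            matched.append(message)
    return matched


first_queue = queue.Queue()
first_queue.put({"task_number": 1, "task_type": "dataClean"})
first_queue.put({"task_number": 2, "task_type": "hive2Rds"})

selected = fetch_tasks_by_type(first_queue, "hive2Rds")
```

A production message broker would normally filter by topic or consumer group rather than draining a queue; the sketch only shows the matching rule.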
202. Acquire second task data from a second message queue, wherein the second task data is data generated when the first task data is executed.
The second message queue may further include data generated when other predefined task data is executed, which is not limited herein.
Optionally, step 202 may include: acquiring the second task data from the second message queue according to the preset task type.
203. Generate a graph model according to the first task data and the second task data, wherein the graph model comprises at least two entities, association relationships between different entities of the at least two entities, and access information of each of the at least two entities.
Each of the at least two entities may include a task type, a name of a data table, a name of an interface, a name of a task, a name of a report, and the like in the first task data and the second task data, which is not limited herein.
The interface may be, for example, an Application Programming Interface (API) or a Dynamic Link Library (DLL), which is not limited herein.
Each entity of the at least two entities may be a node in the graph model, and an association relationship between different entities of the at least two entities may be at least one edge in the graph model, where the at least one edge is a directed edge.
The access information of each of the at least two entities may include the number of accesses of that entity, the access time of that entity, and the like. The access time may include, for example, the earliest access time and the latest access time of the entity. The access information of each entity may be understood as an edge attribute of each of the at least one edge; that is, the edge attribute of each edge may include the number of accesses of each of the at least two entities, the access time of each of the at least two entities, and the like.
The graph model may further comprise attributes of each of the at least two entities, for example: the task number, the task type, the data type of the task number, the data type of the task type, and the like, which are not limited herein.
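One possible in-memory shape for such a graph model is sketched below: entities are nodes with attributes, association relationships are directed edges, and access information is stored as an edge attribute. The class names `AccessInfo` and `GraphModel`, the entity names, and the integer timestamps are assumptions made for illustration; the application does not prescribe a storage format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class AccessInfo:
    """Edge attribute: number of accesses and the recorded access times."""
    count: int = 0
    times: List[int] = field(default_factory=list)  # timestamps, unit assumed

    @property
    def earliest(self) -> Optional[int]:
        return min(self.times) if self.times else None

    @property
    def latest(self) -> Optional[int]:
        return max(self.times) if self.times else None


@dataclass
class GraphModel:
    # Node name -> node attributes (e.g. task number, task type).
    entities: Dict[str, Dict[str, object]] = field(default_factory=dict)
    # Directed edge (source, target) -> access information.
    edges: Dict[Tuple[str, str], AccessInfo] = field(default_factory=dict)

    def add_entity(self, name: str, **attributes: object) -> None:
        self.entities.setdefault(name, {}).update(attributes)

    def record_access(self, source: str, target: str, timestamp: int) -> None:
        """Register one access along the directed edge source -> target."""
        self.add_entity(source)
        self.add_entity(target)
        info = self.edges.setdefault((source, target), AccessInfo())
        info.count += 1
        info.times.append(timestamp)


graph = GraphModel()
graph.add_entity("task_2", task_type="hive2Rds", task_number=2)
graph.record_access("task_2", "table_a", timestamp=100)
graph.record_access("task_2", "table_a", timestamp=160)
edge = graph.edges[("task_2", "table_a")]
```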
Illustratively, referring to fig. 3, fig. 3 is a schematic diagram of a graph model. As shown in fig. 3, the graph model includes entities 1 to 4, where entity 1 is associated with entity 2, entity 3, and entity 4; entity 2 is associated with entity 1 and entity 4; entity 3 is associated with entity 1 and entity 4; and entity 4 is associated with entity 1, entity 2, and entity 3. That is, the graph model further comprises an association relationship between entity 1 and entity 2, an association relationship between entity 1 and entity 3, an association relationship between entity 1 and entity 4, an association relationship between entity 2 and entity 4, an association relationship between entity 3 and entity 1, and an association relationship between entity 3 and entity 4. Further, the graph model comprises the access time of each of entities 1 to 4 and the number of accesses of each of entities 1 to 4.
Optionally, step 203 may include: parsing the task content in the first task data to obtain the name of a first data table, the name of a second data table, and the association relationship between the first data table and the second data table; acquiring the task data associated with the first task data; acquiring the access information of the first data table and the access information of the second data table from the second task data; and generating the graph model according to the name of the first data table, the name of the second data table, the association relationship between the first data table and the second data table, the task data associated with the first task data, the access information of the first data table, and the access information of the second data table.
It can be seen that, in the above technical solution, by acquiring data of the same task data during definition and execution, and generating a graph model according to the acquired data, a more complete graph model can be generated under the condition that enough data is collected. In addition, the access information of different entities in the scheduling process can be acquired by acquiring the access information of the first data table and the access information of the second data table from the second task data, so that when a fault entity is determined according to the graph model, a plurality of entities with faults can be determined more accurately, and application scenes are enriched.
Optionally, parsing the task content in the first task data to obtain the name of the first data table, the name of the second data table, and the association relationship between the first data table and the second data table includes: acquiring a first field in the first task data, wherein the first field indicates the task content in the first task data; and parsing the task content indicated by the first field to obtain the name of the first data table, the name of the second data table, and the association relationship between the first data table and the second data table.
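The parsing step can be illustrated with a deliberately small sketch. It assumes, purely for illustration, that the task content is a single SQL-like `INSERT ... SELECT ... FROM ...` statement; the actual task content appears only as figures in the source, so the statement, the table names, and the helper `parse_task_content` are all hypothetical. The association is modeled here as a directed pair from the table that is read to the table that is written.

```python
import re
from typing import NamedTuple, Optional


class TableLineage(NamedTuple):
    source_table: str  # table that is read
    target_table: str  # table that is written


# Simplified patterns; a real parser would need a proper SQL grammar.
_TARGET = re.compile(
    r"INSERT\s+(?:INTO|OVERWRITE)\s+(?:TABLE\s+)?([\w.]+)", re.IGNORECASE
)
_SOURCE = re.compile(r"\bFROM\s+([\w.]+)", re.IGNORECASE)


def parse_task_content(task_content: str) -> Optional[TableLineage]:
    """Extract the two table names and their relationship, if both are found."""
    target = _TARGET.search(task_content)
    source = _SOURCE.search(task_content)
    if target is None or source is None:
        return None
    return TableLineage(source.group(1), target.group(1))


# Hypothetical task content, not taken from the application.
lineage = parse_task_content(
    "INSERT INTO TABLE rds_orders SELECT id, amount FROM hive_orders"
)
```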
204. Determine, according to the graph model, a first entity and a second entity that are faulty among the at least two entities, wherein the first entity is determined according to the access information of the first entity, and the second entity is determined according to the association relationship between the first entity and the second entity.
The first entity may be one or more entities, which are not limited herein. It will be appreciated that the second entity may be an entity with which each of the one or more entities is associated.
Optionally, the access information of each of the at least two entities includes the number of accesses of each entity, and step 204 includes: determining a first entity of which the access times are greater than or equal to a first threshold corresponding to each entity in at least two entities according to the access times of each entity; determining a second entity associated with the first entity according to the association relationship between different entities in at least two entities; the first entity and the second entity are determined to be faulty entities.
It can be seen that, in the above technical solution, the entity with the access frequency greater than or equal to the corresponding first threshold and the entities associated therewith are used as multiple entities with faults, so that developers can update the entities, such as updating data tables, interfaces, and the like, conveniently, and application scenarios are enriched.
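The count-based variant of step 204 can be sketched as below. The sketch assumes per-entity thresholds are already known and treats an association as linking both endpoints regardless of edge direction; both choices, along with the function name and the sample data, are assumptions for illustration.

```python
from typing import Dict, Iterable, Set, Tuple


def find_faulty_by_count(
    access_counts: Dict[str, int],
    thresholds: Dict[str, int],
    associations: Iterable[Tuple[str, str]],
) -> Tuple[Set[str], Set[str]]:
    """Return (first entities, second entities).

    First entities have an access count >= their own first threshold.
    Second entities are those associated with any first entity.
    """
    first = {e for e, c in access_counts.items() if c >= thresholds[e]}
    second: Set[str] = set()
    for a, b in associations:
        if a in first:
            second.add(b)
        if b in first:
            second.add(a)
    second -= first  # an entity already flagged as first is not repeated
    return first, second


first_entities, second_entities = find_faulty_by_count(
    access_counts={"A": 10, "B": 2, "C": 1},
    thresholds={"A": 8, "B": 5, "C": 5},
    associations=[("A", "B")],
)
```

In this sample, only A reaches its threshold, so A is the first entity and B, being associated with A, is the second entity; C is unaffected.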
Optionally, before determining, according to the number of accesses of each entity, a first entity of which the number of accesses is greater than or equal to a first threshold corresponding to each entity in the at least two entities, the method further includes: determining the number of entities associated with each entity according to the association relationship between different entities in at least two entities; determining the weight corresponding to each entity according to the number of entities associated with each entity; and determining a first threshold corresponding to each entity according to the weight corresponding to each entity, the access frequency of each entity and the average value of the access frequency of the entity associated with each entity.
The greater the number of entities associated with an entity, the greater the weight of that entity.
Illustratively, entity 1 is associated with entity 2, entity 1 is associated with entity 3, entity 2 is associated with entity 4, and entity 2 is associated with entity 5. It can be seen that the number of entities associated with entity 1 is 2, the number of entities associated with entity 2 is 3, the number of entities associated with entity 3 is 1, the number of entities associated with entity 4 is 1, and the number of entities associated with entity 5 is 1. The weight of entity 2 is the largest, the weight of entity 3, the weight of entity 4 and the weight of entity 5 are the same, the weight of entity 1 is less than the weight of entity 2 and the weight of entity 1 is greater than the weight of entity 3.
It can be seen that, in the above technical solution, by determining the number of entities associated with each entity, the first threshold corresponding to each entity can be determined according to the number of entities associated with each entity, so that when a faulty entity is determined, more accurate determination of multiple faulty entities can be achieved by combining the association relationship between the entities, and application scenarios are enriched.
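The entity counts in the example above can be reproduced with a short sketch. The application only states that the weight grows with the number of associated entities; the normalized weight used here (count divided by the sum of all counts) is one hypothetical choice that satisfies this, and the formula that turns weights into first thresholds is not given in the text, so it is not sketched.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Tuple


def association_degrees(
    associations: Iterable[Tuple[Hashable, Hashable]]
) -> Dict[Hashable, int]:
    """Count how many entities each entity is associated with."""
    degrees = Counter()
    for a, b in associations:
        degrees[a] += 1
        degrees[b] += 1
    return dict(degrees)


def degree_weights(degrees: Dict[Hashable, int]) -> Dict[Hashable, float]:
    """Hypothetical weight: share of all association endpoints."""
    total = sum(degrees.values())
    if total == 0:
        return {}
    return {entity: count / total for entity, count in degrees.items()}


# Associations from the example: 1-2, 1-3, 2-4, 2-5.
associations = [(1, 2), (1, 3), (2, 4), (2, 5)]
degrees = association_degrees(associations)
weights = degree_weights(degrees)
```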
Optionally, the access information of each of the at least two entities includes the access time of each entity, and step 204 includes: sorting the access times of each of the at least two entities from latest to earliest to obtain the sorted access times of each entity; determining, according to the sorted access times of each entity and the association relationships between different entities of the at least two entities, the difference between the access time of each entity and the access time at the same position in the sorted access times of the entity associated with it; and determining the entities whose difference is greater than a second threshold as the faulty first entity and second entity.
The second threshold may be set by an administrator or configured in a configuration file, which is not limited herein.
Illustratively, entity 1 has 3 access times after sorting, and entity 2, which is associated with entity 1, has 2 access times after sorting. The difference between times at the same position in the sorted access times of entity 1 and entity 2 may be, for example, the difference between the earliest access time of entity 1 and the earliest access time of entity 2, or the difference between the latest access time of entity 1 and the latest access time of entity 2.
It can be seen that, in the above technical solution, by determining the access time of each entity, when determining a faulty entity, more accurate determination of multiple faulty entities can be achieved according to the difference between the access time of each sequenced entity and the time of the same position in the access time of the entity associated with each sequenced entity, thereby enriching application scenarios.
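The time-based variant can be sketched as follows. Following the example above, the sketch compares only two positions, latest with latest and earliest with earliest, and flags both entities of an associated pair when either gap exceeds the second threshold. The integer timestamps, the absolute-difference rule, and the function name are assumptions for illustration.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def find_faulty_by_time(
    access_times: Dict[Hashable, List[int]],
    associations: Iterable[Tuple[Hashable, Hashable]],
    second_threshold: int,
) -> Set[Hashable]:
    """Flag associated entity pairs whose access times drift too far apart."""
    # Sort each entity's access times from latest to earliest.
    ordered = {e: sorted(t, reverse=True) for e, t in access_times.items()}
    faulty: Set[Hashable] = set()
    for a, b in associations:
        times_a, times_b = ordered.get(a), ordered.get(b)
        if not times_a or not times_b:
            continue  # no recorded accesses to compare
        latest_gap = abs(times_a[0] - times_b[0])
        earliest_gap = abs(times_a[-1] - times_b[-1])
        if max(latest_gap, earliest_gap) > second_threshold:
            faulty.update((a, b))
    return faulty


faulty = find_faulty_by_time(
    access_times={1: [100, 130, 160], 2: [105, 400], 3: [110, 150]},
    associations=[(1, 2), (1, 3)],
    second_threshold=60,
)
```

Here entities 1 and 2 are flagged because their latest access times differ by 240, while entities 1 and 3 differ by only 10 at both positions.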
Referring to fig. 4, fig. 4 is a schematic flowchart of another data processing method provided in the embodiment of the present application. The data processing method can be applied to an electronic device or a server, and as shown in fig. 4, the method includes:
401. Acquire a preset task type.
For the preset task type, refer to the related description of step 201 in fig. 2, which is not repeated here.
Step 401 may include: acquiring the preset task type when an input operation on a display interface is detected.
The display interface may include an input box; in that case, the preset task type is acquired from the input box when the input operation on the display interface is detected.
402. Acquire first task data from the first message queue according to the preset task type.
Step 402 may include: acquiring, from the first message queue, first task data whose task type is the preset task type.
403. Acquire second task data from the second message queue, wherein the second task data is data generated when the first task data is executed.
For step 403, refer to step 202 in fig. 2, which is not repeated here.
404. Generate a graph model according to the first task data and the second task data, wherein the graph model comprises at least two entities, association relationships between different entities of the at least two entities, and access information of each of the at least two entities.
For step 404, refer to step 203 in fig. 2, which is not repeated here.
405. Determine, according to the graph model, a first entity and a second entity that are faulty among the at least two entities, wherein the first entity is determined according to the access information of the first entity, and the second entity is determined according to the association relationship between the first entity and the second entity.
For step 405, refer to step 204 in fig. 2, which is not repeated here.
In the above technical solution, the definition-time data and the execution-time data of the same task are acquired from different message queues, and a graph model is generated from the acquired data, so that multiple faulty entities can be determined according to the graph model, which enriches the application scenarios. Because the data used to generate the graph model covers both the definition and the execution of the same task, a more complete graph model can be built once enough data has been collected, multiple faulty entities can be determined comprehensively, and developers are prepared to update those entities afterwards. In addition, acquiring both kinds of data from message queues avoids the complex process of interacting with different data sources and improves data integration efficiency.
Referring to fig. 5, fig. 5 is a schematic diagram of a data processing apparatus according to an embodiment of the present application. As shown in fig. 5, a data processing apparatus 500 provided in an embodiment of the present application includes a first obtaining module 501, a second obtaining module 502, a generating module 503, and a determining module 504.
Optionally, the first obtaining module 501 is configured to obtain first task data from the first message queue, where the first task data is predefined task data;
a second obtaining module 502, configured to obtain second task data from the second message queue, where the second task data is data when the first task data is executed;
a generating module 503, configured to generate a graph model according to the first task data and the second task data, where the graph model includes at least two entities, an association relationship between different entities in the at least two entities, and access information of each entity in the at least two entities;
a determining module 504, configured to determine, according to the graph model, a first entity and a second entity that have a failure in at least two entities, where the first entity is determined according to the access information of the first entity, and the second entity is determined according to an association relationship between the first entity and the second entity.
In the above technical solution, the definition-time data and the execution-time data of the same task are obtained from different message queues, and the graph model is generated from the obtained data, so that multiple faulty entities can be determined according to the graph model, which enriches the application scenarios. Because the data used to generate the graph model covers both the definition and the execution of the same task, a more complete graph model can be built once enough data has been collected, multiple faulty entities can be determined comprehensively on that basis, and developers are prepared to update those entities afterwards. In addition, obtaining both kinds of data from message queues avoids the complex process of interacting with different data sources and improves data integration efficiency.
In a possible implementation manner, when acquiring the first task data from the first message queue, the first obtaining module 501 is configured to: acquire a preset task type; and acquire the first task data from the first message queue according to the preset task type.
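The type-based acquisition described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that messages are JSON strings carrying a hypothetical `task_type` field and that an in-process `queue.Queue` stands in for the first message queue; the application itself does not specify the message format or the queue product.

```python
import json
import queue


def fetch_first_task_data(message_queue, preset_task_type):
    """Drain the queue and keep only task definitions of the preset type.

    The "task_type" key is a hypothetical field name used for illustration.
    """
    matched = []
    while True:
        try:
            raw = message_queue.get_nowait()
        except queue.Empty:
            break  # nothing left to read
        task = json.loads(raw)
        if task.get("task_type") == preset_task_type:
            matched.append(task)
    return matched


# Usage: an in-process queue stands in for the first message queue.
first_queue = queue.Queue()
first_queue.put(json.dumps({"task_id": "t1", "task_type": "sql"}))
first_queue.put(json.dumps({"task_id": "t2", "task_type": "shell"}))
first_queue.put(json.dumps({"task_id": "t3", "task_type": "sql"}))
sql_tasks = fetch_first_task_data(first_queue, "sql")
```

In this sketch, messages of other types are simply discarded; a real consumer would more likely leave them for other subscribers.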
In a possible implementation manner, when generating the graph model according to the first task data and the second task data, the generating module 503 is configured to: parse the task content in the first task data to obtain the name of a first data table, the name of a second data table, and the association relationship between the first data table and the second data table; acquire other task data associated with the first task data; acquire the access information of the first data table and the access information of the second data table from the second task data; and generate the graph model according to the name of the first data table, the name of the second data table, the association relationship between the first data table and the second data table, the other task data associated with the first task data, the access information of the first data table, and the access information of the second data table.
It can be seen that, in the above technical solution, the definition-time data and the execution-time data of the same task are acquired and the graph model is generated from them, so that a more complete graph model can be generated once enough data has been collected. In addition, acquiring the access information of the first data table and of the second data table from the second task data captures how different entities are accessed during scheduling, so that multiple faulty entities can be determined more accurately from the graph model, which enriches the application scenarios.
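The graph model generation described above can be sketched as follows. This is an illustrative data structure only: the class name `GraphModel`, the function `build_graph_model`, and the representation of relationships as `(source, target)` pairs are assumptions, not part of the application.

```python
from dataclasses import dataclass, field


@dataclass
class GraphModel:
    """Entities, association relationships, and per-entity access counts."""
    entities: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # (source, target) pairs
    access_counts: dict = field(default_factory=dict)

    def add_relation(self, source, target):
        self.entities.update((source, target))
        self.edges.add((source, target))

    def neighbors(self, entity):
        """Entities associated with `entity`, ignoring edge direction."""
        outgoing = {t for s, t in self.edges if s == entity}
        incoming = {s for s, t in self.edges if t == entity}
        return outgoing | incoming


def build_graph_model(table_relations, task_relations, access_info):
    """Combine definition-time relations with execution-time access info."""
    graph = GraphModel()
    for source, target in table_relations:
        graph.add_relation(source, target)
    for source, target in task_relations:
        graph.add_relation(source, target)
    for entity in graph.entities:
        # Entities with no recorded access default to zero accesses.
        graph.access_counts[entity] = access_info.get(entity, 0)
    return graph


graph = build_graph_model(
    table_relations=[("orders", "daily_report")],  # from the first task data
    task_relations=[("task_1", "task_2")],         # associated task data
    access_info={"orders": 12, "daily_report": 3}, # from the second task data
)
```

Keeping access counts in a separate mapping, rather than on the edges, mirrors the description, where access information belongs to each entity rather than to a relationship.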
In a possible implementation manner, when parsing the task content in the first task data to obtain the name of the first data table, the name of the second data table, and the association relationship between the first data table and the second data table, the generating module 503 is configured to: acquire a first field in the first task data, where the first field indicates the task content in the first task data; and parse the task content in the first task data to obtain the name of the first data table, the name of the second data table, and the association relationship between the first data table and the second data table.
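As a rough illustration of this parsing step, the sketch below assumes that the first field is a hypothetical `content` key and that the task content is a simple `INSERT INTO ... SELECT ... FROM ...` statement; neither assumption comes from the application, and a regular expression is only adequate for such simple statements, not for general SQL.

```python
import re

# Matches "INSERT INTO <target> ... FROM <source>" in simple statements only.
INSERT_SELECT = re.compile(
    r"insert\s+into\s+(?P<target>[\w.]+).*?\bfrom\s+(?P<source>[\w.]+)",
    re.IGNORECASE | re.DOTALL,
)


def parse_task_content(first_task_data, content_field="content"):
    """Return (source_table, target_table), or None if nothing matches.

    `content_field` plays the role of the first field that indicates the
    task content; its name here is a placeholder.
    """
    content = first_task_data[content_field]
    match = INSERT_SELECT.search(content)
    if match is None:
        return None
    return match.group("source"), match.group("target")


task = {
    "task_id": "t1",
    "content": "INSERT INTO daily_report SELECT id, amount FROM orders "
               "WHERE dt = '2021-05-25'",
}
relation = parse_task_content(task)
```

Here the returned pair expresses the association relationship as a data flow from the source table to the target table.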
In a possible implementation manner, the access information of each of the at least two entities includes the number of accesses of that entity, and when determining, according to the graph model, the first entity and the second entity that are faulty among the at least two entities, the determining module 504 is configured to: determine, according to the number of accesses of each entity, a first entity among the at least two entities whose number of accesses is greater than or equal to the first threshold corresponding to that entity; determine a second entity associated with the first entity according to the association relationships among different entities in the at least two entities; and determine the first entity and the second entity to be faulty entities.
It can be seen that, in the above technical solution, an entity whose number of accesses is greater than or equal to its corresponding first threshold, together with the entities associated with it, is taken as a faulty entity, which makes it convenient for developers to update those entities, for example a data table or an interface, and enriches the application scenarios.
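The determination described above can be sketched as follows, under the assumptions that association relationships are treated as undirected pairs and that only directly associated entities count as second entities; the function name and example values are illustrative.

```python
def find_faulty_entities(access_counts, relations, thresholds):
    """Return (first_entities, second_entities).

    first_entities: entities whose access count reaches their own threshold.
    second_entities: entities directly associated with a first entity that
    were not already flagged themselves.
    """
    first = {e for e, count in access_counts.items() if count >= thresholds[e]}
    second = set()
    for a, b in relations:
        if a in first and b not in first:
            second.add(b)
        if b in first and a not in first:
            second.add(a)
    return first, second


access_counts = {"orders": 120, "daily_report": 5, "users": 7}
relations = [("orders", "daily_report")]
thresholds = {"orders": 100, "daily_report": 50, "users": 50}
first, second = find_faulty_entities(access_counts, relations, thresholds)
```

In this example `users` is neither over its threshold nor associated with a flagged entity, so it is not reported.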
In a possible implementation manner, before determining, according to the number of accesses of each entity, the first entity among the at least two entities whose number of accesses is greater than or equal to the first threshold corresponding to that entity, the determining module 504 is further configured to: determine the number of entities associated with each entity according to the association relationships among different entities in the at least two entities; determine the weight corresponding to each entity according to the number of entities associated with it; and determine the first threshold corresponding to each entity according to the weight corresponding to the entity, the number of accesses of the entity, and the average number of accesses of the entities associated with it.
It can be seen that, in the above technical solution, the number of entities associated with each entity is determined, and the first threshold corresponding to each entity is derived from it, so that the association relationships among entities are taken into account when faulty entities are determined, multiple faulty entities can be determined more accurately, and the application scenarios are enriched.
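The passage above names the inputs to the first threshold (weight, the entity's own number of accesses, and the average number of accesses of its associated entities) but not how they are combined. The sketch below uses one hypothetical combination purely for illustration: the weight is `degree / (degree + 1)` and the threshold is a weighted blend of the entity's own count and its neighbors' average. Both formulas are assumptions, not the method of the application.

```python
from statistics import mean


def compute_thresholds(access_counts, relations):
    """Derive a per-entity threshold from an assumed weighting formula."""
    neighbors = {entity: set() for entity in access_counts}
    for a, b in relations:
        neighbors[a].add(b)
        neighbors[b].add(a)

    thresholds = {}
    for entity, own_count in access_counts.items():
        degree = len(neighbors[entity])
        if degree == 0:
            # Illustrative choice: isolated entities are never flagged.
            thresholds[entity] = float("inf")
            continue
        weight = degree / (degree + 1)  # assumed weight formula
        neighbor_avg = mean(access_counts[n] for n in neighbors[entity])
        # Assumed blend: more associations pull the threshold toward the
        # neighbors' average access count.
        thresholds[entity] = (1 - weight) * own_count + weight * neighbor_avg
    return thresholds


access_counts = {"orders": 120, "daily_report": 5, "users": 7, "logs": 9}
relations = [("orders", "daily_report"), ("orders", "users")]
thresholds = compute_thresholds(access_counts, relations)
```

With this particular blend, an entity meets its threshold exactly when its own count is at least the average of its neighbors, so `orders` would be flagged while `daily_report` and `users` would not.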
Referring to fig. 6, fig. 6 is a schematic structural diagram of an electronic device in a hardware operating environment according to an embodiment of the present application.
Embodiments of the present application provide an electronic device for data processing, comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory, are configured to be executed by the processor, and comprise instructions for performing the steps of any of the data processing methods. As shown in fig. 6, an electronic device of a hardware operating environment according to an embodiment of the present application may include:
a processor 601, such as a CPU.
A memory 602, which may be a high-speed RAM or a non-volatile memory such as a disk memory.
A communication interface 603 for implementing connection communication between the processor 601 and the memory 602.
Those skilled in the art will appreciate that the configuration of the electronic device shown in fig. 6 is not intended to be limiting and may include more or fewer components than shown, or some components in combination, or a different arrangement of components.
As shown in fig. 6, the memory 602 may include an operating system, a network communication module, and one or more programs. An operating system is a program that manages and controls the server hardware and software resources, supporting the execution of one or more programs. The network communication module is used for communication among the components in the memory 602 and with other hardware and software in the electronic device.
In the electronic device shown in fig. 6, the processor 601 is configured to execute one or more programs in the memory 602, and implement the following steps:
acquiring first task data from a first message queue, wherein the first task data is predefined task data;
acquiring second task data from the second message queue, wherein the second task data is data when the first task data is executed;
generating a graph model according to the first task data and the second task data, wherein the graph model comprises at least two entities, the association relationships among different entities in the at least two entities, and the access information of each entity in the at least two entities;
and determining, according to the graph model, a first entity and a second entity that are faulty among the at least two entities, wherein the first entity is determined according to the access information of the first entity, and the second entity is determined according to the association relationship between the first entity and the second entity.
For specific implementation of the electronic device related to the present application, reference may be made to various embodiments of the data processing method, which are not described herein again.
The present application also provides a computer readable storage medium for storing a computer program, the stored computer program being executable by a processor to perform the steps of:
acquiring first task data from a first message queue, wherein the first task data is predefined task data;
acquiring second task data from the second message queue, wherein the second task data is data when the first task data is executed;
generating a graph model according to the first task data and the second task data, wherein the graph model comprises at least two entities, the association relationships among different entities in the at least two entities, and the access information of each entity in the at least two entities;
and determining, according to the graph model, a first entity and a second entity that are faulty among the at least two entities, wherein the first entity is determined according to the access information of the first entity, and the second entity is determined according to the association relationship between the first entity and the second entity.
For specific implementation of the computer-readable storage medium related to the present application, reference may be made to the embodiments of the data processing method, which are not described herein again.
It should be noted that, for simplicity of description, the above method embodiments are described as a series of acts or combinations of acts, but those skilled in the art will recognize that the present application is not limited by the described order of acts, because some steps may be performed in other orders or concurrently according to the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are preferred embodiments, and that the acts and modules involved are not necessarily required by the present application.
The above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A data processing method, comprising:
acquiring first task data from a first message queue, wherein the first task data is predefined task data;
acquiring second task data from a second message queue, wherein the second task data is data when the first task data is executed;
generating a graph model according to the first task data and the second task data, wherein the graph model comprises at least two entities, the association relationships among different entities in the at least two entities, and the access information of each entity in the at least two entities;
and determining, according to the graph model, a first entity and a second entity that are faulty among the at least two entities, wherein the first entity is determined according to the access information of the first entity, and the second entity is determined according to the association relationship between the first entity and the second entity.
2. The method of claim 1, wherein the acquiring the first task data from the first message queue comprises:
acquiring a preset task type;
and acquiring the first task data from the first message queue according to the preset task type.
3. The method of claim 1 or 2, wherein generating a graph model from the first task data and the second task data comprises:
parsing the task content in the first task data to obtain the name of a first data table, the name of a second data table, and the association relationship between the first data table and the second data table;
acquiring other task data associated with the first task data;
acquiring access information of the first data table and access information of the second data table from the second task data;
and generating the graph model according to the name of the first data table, the name of the second data table, the association relation between the first data table and the second data table, other task data associated with the first task data, the access information of the first data table and the access information of the second data table.
4. The method according to claim 2, wherein the parsing the task content in the first task data to obtain a name of a first data table, a name of a second data table, and an association relationship between the first data table and the second data table comprises:
acquiring a first field in the first task data, wherein the first field in the first task data indicates task content in the first task data;
parsing the task content in the first task data to obtain the name of the first data table, the name of the second data table, and the association relationship between the first data table and the second data table.
5. The method of claim 1, wherein the access information of each of the at least two entities comprises a number of accesses of the each entity, and wherein the determining, according to the graph model, a first entity and a second entity that have a failure in the at least two entities comprises:
determining, according to the number of accesses of each entity, the first entity among the at least two entities whose number of accesses is greater than or equal to a first threshold corresponding to that entity;
determining the second entity associated with the first entity according to the association relationship between different entities in the at least two entities;
determining the first entity and the second entity as faulty entities.
6. The method of claim 5, wherein before determining, according to the number of accesses of each entity, a first entity of the at least two entities whose number of accesses is greater than or equal to a first threshold corresponding to each entity, the method further comprises:
determining the number of entities associated with each entity according to the association relationship between different entities in the at least two entities;
determining the weight corresponding to each entity according to the number of entities associated with each entity;
and determining the first threshold corresponding to each entity according to the weight corresponding to each entity, the number of accesses of each entity, and the average number of accesses of the entities associated with each entity.
7. A data processing apparatus, characterized in that the apparatus comprises a first obtaining module, a second obtaining module, a generating module and a determining module, wherein,
the first obtaining module is configured to obtain first task data from a first message queue, where the first task data is predefined task data;
the second obtaining module is configured to obtain second task data from a second message queue, where the second task data is data when the first task data is executed;
the generating module is configured to generate a graph model according to the first task data and the second task data, where the graph model includes at least two entities, an association relationship between different entities in the at least two entities, and access information of each entity in the at least two entities;
the determining module is configured to determine, according to the graph model, a first entity and a second entity that are faulty in the at least two entities, where the first entity is determined according to the access information of the first entity, and the second entity is determined according to an association relationship between the first entity and the second entity.
8. The apparatus of claim 7, wherein, when obtaining the first task data from the first message queue, the first obtaining module is configured to:
acquire a preset task type;
and acquire the first task data from the first message queue according to the preset task type.
9. An electronic device for data processing, comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory, are configured to be executed by the processor, and comprise instructions for performing the steps of the method of any of claims 1-6.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium is used to store a computer program, and the computer program, when executed by a processor, implements the method of any of claims 1-6.
CN202110571687.4A 2021-05-25 2021-05-25 Data processing method and device, electronic equipment and storage medium Pending CN113297335A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110571687.4A CN113297335A (en) 2021-05-25 2021-05-25 Data processing method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110571687.4A CN113297335A (en) 2021-05-25 2021-05-25 Data processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113297335A true CN113297335A (en) 2021-08-24

Family

ID=77324749

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110571687.4A Pending CN113297335A (en) 2021-05-25 2021-05-25 Data processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113297335A (en)

Similar Documents

Publication Publication Date Title
CN109448100B (en) Three-dimensional model format conversion method, system, computer device and storage medium
CN111124906A (en) Tracking method, compiling method and device based on dynamic embedded points and electronic equipment
CN107103064B (en) Data statistical method and device
CN112311617A (en) Configured data monitoring and alarming method and system
CN112800095A (en) Data processing method, device, equipment and storage medium
CN110716848A (en) Data collection method and device, electronic equipment and storage medium
CN111352800A (en) Big data cluster monitoring method and related equipment
CN109062807B (en) Method and device for testing application program, storage medium and electronic device
CN111400170A (en) Data permission testing method and device
CN112559538A (en) Incidence relation generation method and device, computer equipment and storage medium
US11822961B2 (en) Method and apparatus for data processing, server and storage medium
CN113472555A (en) Fault detection method, system, device, server and storage medium
CN113297335A (en) Data processing method and device, electronic equipment and storage medium
WO2019153546A1 (en) Ten-thousand-level dimension data generation method, apparatus and device, and storage medium
CN114281549A (en) Data processing method and device
CN115220131A (en) Meteorological data quality inspection method and system
CN114756301A (en) Log processing method, device and system
CN114064712A (en) Data access method and device, electronic equipment and computer readable storage medium
CN112527486A (en) Scheduling optimization method and device
CN112994976A (en) Gateway testing method and device, electronic equipment and storage medium
CN112363774A (en) Storm real-time task configuration method and device
CN106528577B (en) Method and device for setting file to be cleaned
CN109901990B (en) Method, device and equipment for testing service system
CN109857632B (en) Test method, test device, terminal equipment and readable storage medium
CN113553320B (en) Data quality monitoring method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination