CN114661830A - Data processing method, device, terminal and storage medium - Google Patents

Publication number
CN114661830A
Authority
CN
China
Prior art keywords
accessed
item
attribute
item object
entity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210225367.8A
Other languages
Chinese (zh)
Other versions
CN114661830B (en)
Inventor
张硕
孟越
徐地
田春华
袁文飞
胡坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Industrial Big Data Innovation Center Co ltd
Original Assignee
Suzhou Industrial Big Data Innovation Center Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Industrial Big Data Innovation Center Co ltd filed Critical Suzhou Industrial Big Data Innovation Center Co ltd
Priority to CN202210225367.8A priority Critical patent/CN114661830B/en
Publication of CN114661830A publication Critical patent/CN114661830A/en
Application granted granted Critical
Publication of CN114661830B publication Critical patent/CN114661830B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/289 Object oriented databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Fuzzy Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application disclose a data processing method, a data processing apparatus, a terminal and a storage medium. The method acquires a target current request field in a current access request; searches a preset object model for the item object corresponding to the entity name of the item object to be accessed, and marks it as the target item object, each item object being connected with at least one database; and searches the at least one database connected with the target item object for the attribute information corresponding to the attribute name of the item object to be accessed. In the embodiments of the application, the item object corresponding to the name of the item object to be accessed can be found in the preset object model according to the entity name and the attribute name in the target current request field, so that the target item object and the database connected with it are determined. Data is then searched in the database connected with the target item object, which improves the data searching speed.

Description

Data processing method, device, terminal and storage medium
Technical Field
The present application relates to the technical field of database optimization, and in particular, to a data processing method, apparatus, terminal, and storage medium.
Background
A database is a "warehouse" that organizes, stores, and manages data according to a data structure; it is an organized, sharable, uniformly managed collection of large amounts of data stored long term in a computer. The required data is extracted from the database by means of a query language. However, because existing databases store large amounts of data, searching is slow and cannot meet users' requirements, so a data processing method is needed to improve the data searching speed.
Disclosure of Invention
The embodiment of the application provides a data processing method, a data processing device, a terminal and a storage medium, which can improve the data searching speed.
An embodiment of the present application provides a data processing method, including:
acquiring a target current request field in a current access request, wherein the target current request field comprises an entity name of a project object to be accessed and an attribute name of the project object to be accessed;
searching a project object corresponding to the entity name of the project object to be accessed from a preset object model, and marking as a target project object; the preset object model comprises a plurality of project objects, and each project object is connected with at least one database;
and searching attribute information corresponding to the attribute name of the item object to be accessed from at least one database connected with the target item object.
In some embodiments, the method comprises:
performing field analysis on the entity name and the attribute name of the item object to be accessed to obtain a main object of the item object to be accessed, wherein the main object comprises entity characteristics for representing the entity in the item object to be accessed and attribute characteristics for representing the attribute in the item object to be accessed;
and correspondingly storing the attribute information on the position of the entity characteristic and the position of the attribute characteristic in the main body object to obtain first statistical data information.
In some embodiments, when the target current request field includes a first to-be-accessed item object and a second to-be-accessed item object,
the method further comprises:
determining the item object association relationship recorded in the object model;
determining the association relationship between the first item object to be accessed and the second item object to be accessed in the target current request field according to the item object association relationship recorded in the object model;
judging whether the association relationship between the first item object to be accessed and the second item object to be accessed meets a preset association condition;
if so, determining the association relationship between a first subject object of the first item object to be accessed and a second subject object of the second item object to be accessed;
and storing the attribute information according to the association relationship between the first subject object of the first item object to be accessed and the second subject object of the second item object to be accessed, to obtain second statistical data information.
In some embodiments, the preset association condition is that the number of hops δ between the first item object to be accessed and the second item object to be accessed is not greater than N, wherein N is a positive integer not less than 1.
In some embodiments, the association relationship between the first subject object of the first item object to be accessed and the second subject object of the second item object to be accessed comprises:
the association relationship between the first entity feature in the first subject object and the second entity feature in the second subject object;
the association relationship between the first attribute feature in the first subject object and the second entity feature in the second subject object;
the association relationship between the first entity feature in the first subject object and the second attribute feature in the second subject object;
and the association relationship between the first attribute feature in the first subject object and the second attribute feature in the second subject object.
In some embodiments, the method further comprises:
acquiring a history request field in a history access request, wherein the history request field comprises an entity name of an accessed item object and an attribute name of the accessed item object;
performing field analysis on the entity name and the attribute name of the accessed item object to obtain a history main object of the accessed item object, wherein the history main object comprises history entity characteristics and history attribute characteristics;
performing statistical processing on the historical access request to obtain the usage amount of the historical access request, wherein the usage amount is used for representing the number of times that the historical access request is accessed in a preset historical time period;
determining the usage amount of the history subject object according to the usage amount of the history access request;
determining a history subject object which is the same as the subject object of the item object to be accessed in the history subject objects of all the accessed item objects, and taking the usage amount of the determined same history subject object as the usage amount of the subject object of the item object to be accessed;
and sorting the entity characteristics and the attribute characteristics of the subject object according to the usage amount of the subject object of the item object to be accessed, to obtain sorted first statistical data information.
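The usage statistics and sorting steps above can be sketched as follows. This is only an illustration under stated assumptions: the history requests are assumed to be already field-analysed into (entity name, attribute names) pairs, usage is assumed to be counted once per request for the entity feature and once per requested attribute, and the names `usage_by_subject` and `sort_features` are hypothetical and do not come from the application.

```python
from collections import Counter

def usage_by_subject(history_requests):
    """Count how often each entity feature and attribute feature was accessed.

    history_requests: iterable of (entity_name, [attribute_names]) pairs,
    assumed to be the result of field analysis on historical access requests.
    """
    usage = Counter()
    for entity, attributes in history_requests:
        usage[(entity,)] += 1                 # usage of the entity feature
        for attribute in attributes:
            usage[(entity, attribute)] += 1   # usage of each attribute feature
    return usage

def sort_features(features, usage):
    """Order features by usage amount, most used first (ties keep input order)."""
    return sorted(features, key=lambda f: usage.get(f, 0), reverse=True)

history = [
    ("process_segment", ["display_name"]),
    ("process_segment", ["display_name", "equipment_id"]),
    ("product", ["display_name"]),
]
usage = usage_by_subject(history)
ordered = sort_features(
    [("process_segment", "equipment_id"), ("process_segment", "display_name")],
    usage,
)
```

Features that never appear in the history are given a usage of zero here, which places them last; the application does not state how such features should be treated.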
In some embodiments, the method further comprises:
according to the usage amount of a main object in the item object to be accessed, calculating the sampling degree and accuracy of the entity features and the attribute features in the main object to obtain sampling degree data and accuracy data of the entity features and sampling degree data and accuracy data of the attribute features;
and storing the sampling degree data and the accuracy data of the entity characteristics and the sampling degree data and the accuracy data of the attribute characteristics in the first statistical data information to obtain first target statistical data information.
In some embodiments, the method further comprises:
determining the usage amount of the association relation between the first item object to be accessed and the second item object to be accessed, which meets a preset association condition, according to the usage amount of the main object in the item objects to be accessed;
wherein the usage amount of the association relationship between the first item object to be accessed and the second item object to be accessed includes:
the usage amount of the association relationship between the first entity name in the first to-be-accessed item object and the second entity name in the second to-be-accessed item object;
the usage amount of the association relationship between the first entity name in the first item object to be accessed and the second attribute name in the second item object to be accessed;
the usage amount of the association relationship between the first attribute name in the first to-be-accessed item object and the second entity name in the second to-be-accessed item object;
the usage amount of the association relationship between the first attribute name in the first item object to be accessed and the second attribute name in the second item object to be accessed;
and sorting the second statistical data information according to the usage amount of the association relationship between the first to-be-accessed item object and the second to-be-accessed item object to obtain sorted second target statistical data information.
In some embodiments, the method further comprises:
according to the usage amount of the subject object in the item objects to be accessed, calculating the sampling degree and accuracy of the association relationship between the first item object to be accessed and the second item object to be accessed to obtain sampling degree data and accuracy data of the association relationship;
and storing the sampling degree data and the accuracy data of the association relationship in the second statistical data information to obtain second target statistical data information.
In some embodiments, the method for searching for the item object corresponding to the entity name of the item object to be accessed from the preset object model includes:
determining a project object corresponding to the entity name of the project object to be accessed in the object model, and recording the project object as an initial target project object;
and judging whether the attribute name of the initial target item object corresponds to the attribute name of the item object to be accessed, if so, determining that the initial target item object is the target item object.
An embodiment of the present application further provides a data processing apparatus, including:
an acquisition unit, configured to acquire a target current request field in a current access request, wherein the target current request field comprises the name of an item object to be accessed and the attribute name of the item object to be accessed;
a first searching unit, configured to search a preset object model for an item object corresponding to the name of the item object to be accessed and mark it as a target item object; the preset object model comprises a plurality of item objects, and each item object is connected with at least one database;
and a second searching unit, configured to search at least one database connected with the target item object for the attribute information corresponding to the attribute name of the item object to be accessed.
A terminal comprising a processor and a memory, the memory storing a plurality of instructions; the processor loads instructions from the memory to execute the steps of any one of the data processing methods provided by the embodiments of the present application.
A computer readable storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the steps of any of the data processing methods provided by the embodiments of the present application.
A computer program product comprising a computer program or instructions which, when executed by a processor, performs the steps of any of the data processing methods provided by the embodiments of the present application.
The embodiments of the present application can acquire a target current request field in a current access request, wherein the target current request field comprises the name of the item object to be accessed and the attribute name of the item object to be accessed; search a preset object model for an item object corresponding to the name of the item object to be accessed, and mark it as a target item object, wherein the preset object model comprises a plurality of item objects and each item object is connected with at least one database; and search at least one database connected with the target item object for the attribute information corresponding to the attribute name of the item object to be accessed.
In the application, the data processing apparatus may search, according to the name of the item object to be accessed and the attribute name of the item object to be accessed in the target current request field, an item object corresponding to the name of the item object to be accessed in a preset object model, thereby determining a target item object and a database connected to the target item object. Therefore, data are searched from the database correspondingly connected with the target project object, and the data searching speed is improved.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1a is a schematic view of a scenario of a data processing method provided in an embodiment of the present application;
FIG. 1b is a schematic flow chart of a data processing method provided in an embodiment of the present application;
FIG. 2 is a schematic flow chart illustrating an embodiment of a data processing method according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The embodiment of the application provides a data processing method, a data processing device, a terminal and a storage medium.
The data processing apparatus may be specifically integrated in an electronic device, and the electronic device may be a terminal, a server, or other devices. The terminal can be a mobile phone, a tablet Computer, an intelligent bluetooth device, a notebook Computer, or a Personal Computer (PC), and the like; the server may be a single server or a server cluster composed of a plurality of servers.
In some embodiments, the data processing apparatus may also be integrated in a plurality of electronic devices, for example, the data processing apparatus may be integrated in a plurality of servers, and the data processing method of the present application is implemented by the plurality of servers.
In some embodiments, the server may also be implemented in the form of a terminal.
For example, referring to fig. 1a, the electronic device may be a server, in which a data processing apparatus is integrated, the server in this embodiment of the present application is configured to obtain a target current request field in a current access request, where the target current request field includes a name of an item object to be accessed and an attribute name of the item object to be accessed; searching a project object corresponding to the name of the project object to be accessed from a preset object model, and marking as a target project object; the preset object model comprises a plurality of project objects, and each project object is connected with at least one database; and searching attribute information corresponding to the attribute name of the item object to be accessed from at least one database connected with the target item object.
Each of these is described in detail below. The numbering of the following embodiments is not intended to limit their order of preference.
In this embodiment, a data processing method is provided, and as shown in fig. 1b, a specific flow of the data processing method may be as follows:
110. Acquiring a target current request field in the current access request, wherein the target current request field comprises the name of the item object to be accessed and the attribute name of the item object to be accessed.
The access request is a request signal sent to the server when a user needs to browse or search resources such as data information, webpage information, video information and the like, and the server responds to the access request so as to provide the resources such as the data information, the webpage information, the video information and the like for the user. For example, in some embodiments of the present application, the access request issued by the user may be data information used to find the item object.
The current access request refers to a request signal sent when a user needs to search the data information of the item object currently.
In some embodiments, the current access request may be written in a data query language, that is, a language used to retrieve a desired data set from a database or a data file.
For example, in some embodiments of the present application, the data query language may be the GraphQL data query language, or may be Structured Query Language (SQL).
The target current request field is a field in the current access request, and is used for representing an item object which needs to be queried by a user.
The to-be-accessed item object may be an entity device, a process segment, a device category, or a product, and the name of the to-be-accessed item object may be the name of the entity device, the name of the process segment, the name of the device category, or the name of the product. The attribute name of the to-be-accessed item object corresponds to the to-be-accessed item object and is used for representing the attribute name of specific content in the to-be-accessed item object, and in some embodiments, the attribute name of the to-be-accessed item object may be an attribute name of a use mode and an attribute name of use time corresponding to the entity device, or an attribute name of each process step in the process segment, an attribute name of each device in the device category, or an attribute name of each component in the product.
Obtaining the target current request field in the current access request may mean that the user inputs a statement in a data query language and the target current request field is extracted from that statement.
120. Searching a project object corresponding to the name of the project object to be accessed from a preset object model, and marking as a target project object; the preset object model comprises a plurality of project objects, and each project object is connected with at least one database.
The object model may be a domain model, which is a visual representation of concept classes within a domain or of objects in the real world; it is also known as a conceptual model or a domain object model. It focuses on analyzing the problem domain itself, exploring important business domain concepts, and establishing relationships between the business domain concepts. In some embodiments, the domain model includes a plurality of item objects, each item object is connected to at least one database, and the database connected to an item object may store the data information of that item object.
A database is a "warehouse" that organizes, stores, and manages data according to a data structure; it is an organized, sharable, uniformly managed collection of large amounts of data stored long term in a computer.
Searching the project object corresponding to the name of the project object to be accessed from the preset object model means that the name and the attribute name of the project object in the object model are compared with the name and the attribute name of the project object to be accessed, the project object with the same name and the same attribute name is determined, and the determined project object is marked as a target project object.
The method for searching the project object corresponding to the name of the project object to be accessed from the preset object model comprises the following steps:
121. Determining the item object corresponding to the entity name of the item object to be accessed in the object model, and recording it as an initial target item object.
The initial target item object refers to an item object having the same name as the entity name of the item object to be accessed. During searching, the name of each item object in the object model and the entity name of the item object to be accessed can be subjected to field matching, and the item objects with the same entity names are determined.
122. Judging whether the attribute name of the initial target item object corresponds to the attribute name of the item object to be accessed, and if so, determining that the initial target item object is the target item object.
Judging whether the attribute name of the initial target item object corresponds to the attribute name of the item object to be accessed means obtaining each attribute name in the initial target item object, performing field matching between the attribute name of the item object to be accessed and each attribute name in the initial target item object, and determining whether they correspond. The attribute name of the item object to be accessed corresponds to the attribute names in the initial target item object in either of the following cases:
the attribute names of the item object to be accessed are the same as the attribute names in the initial target item object; or
the set of attribute names of the item object to be accessed is contained in the set of attribute names in the initial target item object.
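Steps 121 and 122 can be sketched as follows. This is a minimal illustration, assuming the object model is held as a mapping from entity name to item object; the `ItemObject` class and the `find_target_item_object` function are hypothetical names, not part of the application. Both correspondence cases (identical attribute names, or a contained set) reduce to a single subset test.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class ItemObject:
    """Hypothetical item object held in the object model."""
    name: str
    attribute_names: set[str]
    databases: list[str] = field(default_factory=list)  # ids of connected databases

def find_target_item_object(object_model, entity_name, requested_attributes):
    """Return the target item object, or None if no item object corresponds."""
    # Step 121: field-match the entity name to get the initial target item object.
    initial = object_model.get(entity_name)
    if initial is None:
        return None
    # Step 122: the requested attribute names must equal, or be contained in,
    # the attribute names of the initial target item object.
    if set(requested_attributes) <= initial.attribute_names:
        return initial
    return None

object_model = {
    "process_segment": ItemObject(
        "process_segment", {"display_name", "equipment_id"}, ["db_process"]
    ),
}
target = find_target_item_object(object_model, "process_segment", ["display_name"])
```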
130. Searching at least one database connected with the target item object for the attribute information corresponding to the attribute name of the item object to be accessed.
Searching in a database refers to extracting required data from a database according to the query requirement. The attribute information may refer to data information corresponding to an attribute name of the item object to be accessed. By determining the database connected with the target project object, the attribute information corresponding to the attribute name related in the project object to be accessed can be quickly found, wherein the attribute information can also be used for representing the information of the entity name.
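The effect of step 130 can be sketched as follows: once the target item object is known, only the databases connected to it are consulted. This is an illustration under assumptions, not the application's implementation: databases are modelled as in-memory dictionaries rather than real database connections, and `lookup_attribute_info` and the database identifiers are hypothetical.

```python
def lookup_attribute_info(object_model, databases, entity_name, attribute_names):
    """Collect the requested attributes from the databases connected to the entity.

    object_model: entity name -> list of connected database ids (assumed shape)
    databases:    database id -> {entity name -> list of row dicts} (assumed shape)
    """
    results = []
    # Only the databases connected to the target item object are searched.
    for db_id in object_model.get(entity_name, []):
        for row in databases[db_id].get(entity_name, []):
            results.append(
                {name: row[name] for name in attribute_names if name in row}
            )
    return results

object_model = {"process_segment": ["db_process"], "product": ["db_product"]}
databases = {
    "db_process": {
        "process_segment": [
            {"display_name": "material injection machine",
             "equipment_id": "e_filler",
             "status": "running"},
        ]
    },
    "db_product": {"product": [{"display_name": "widget"}]},
}
info = lookup_attribute_info(
    object_model, databases, "process_segment", ["display_name", "equipment_id"]
)
```

In this sketch `db_product` is never touched when `process_segment` is requested, which is the source of the speed-up the application describes.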
In this embodiment of the present application, after the attribute information is obtained, the data processing method further includes:
140. Performing field analysis on the entity name and the attribute name of the item object to be accessed to obtain a main object of the item object to be accessed, wherein the main object comprises entity characteristics for representing the entity in the item object to be accessed and attribute characteristics for representing the attribute in the item object to be accessed.
The field analysis refers to identifying and extracting the entity name and the attribute name of the item object to be accessed, and eliminating the filter item in the target current request field, so as to determine the main object in the target current request field.
In embodiments of the present application, the subject object includes entity features and attribute features. For example, in some embodiments, the entity features may include "entity in the item object to be accessed", and the attribute features may include "as a return attribute", "as an IS NULL or IS NOT NULL filter attribute", "as a value filter attribute other than IS NULL or IS NOT NULL, as a return attribute", and "as an attribute set of a grouping clause". One subject object may include one entity feature and one attribute feature, or one entity feature and a plurality of attribute features.
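Field analysis of this kind can be sketched as follows. The sketch assumes a request field written in a simple brace form, `entity{attribute: value, ...}`, and uses regular expressions instead of a real GraphQL parser; the function name `analyse_request_field` is hypothetical. Filter values are discarded so that only the entity name and attribute names remain.

```python
import re

def analyse_request_field(request_field):
    """Extract the entity name and attribute names, dropping filter values.

    Simplification: values that themselves contain a colon are not handled.
    """
    match = re.fullmatch(r"\s*(\w+)\s*\{(.*)\}\s*", request_field, flags=re.DOTALL)
    if match is None:
        raise ValueError("unsupported request field format")
    entity_name, body = match.group(1), match.group(2)
    # An attribute name is the identifier immediately before a colon.
    attribute_names = re.findall(r"(\w+)\s*:", body)
    return {"entity": entity_name, "attributes": attribute_names}

subject = analyse_request_field(
    'process_segment{display_name: "material injection machine", '
    'equipment_id: "e_filler"}'
)
```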
150. Correspondingly storing the attribute information at the position of the entity characteristic and the position of the attribute characteristic in the main body object to obtain first statistical data information.
The attribute information is correspondingly stored in the position of the entity feature and the position of the attribute feature in the main body object, namely, the content in the attribute information is correspondingly stored in the position of the entity feature and the position of the attribute feature according to the position of the entity feature and the position of the attribute feature in the main body object.
For example, after the entity characteristics and the attribute characteristics are determined, the entity characteristics and the attribute characteristics may be stored in a manner of creating a new folder, and then the attribute information is stored in the folder corresponding to the entity characteristics and the folder corresponding to the attribute characteristics, respectively, so that a user can conveniently and quickly refer to the attribute information.
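The folder-based storage described above can be sketched with an in-memory mapping in place of folders. This is an assumption made for illustration: each "folder" is a dictionary key, `(entity,)` for the entity feature and `(entity, attribute)` for each attribute feature, and `store_attribute_info` is a hypothetical name.

```python
from collections import defaultdict

def store_attribute_info(entity, attribute_info):
    """Store each piece of attribute information under its entity feature
    and under its attribute feature.

    attribute_info: attribute name -> attribute information (any value)
    """
    store = defaultdict(list)
    for attribute, info in attribute_info.items():
        store[(entity,)].append(info)            # entity feature "folder"
        store[(entity, attribute)].append(info)  # attribute feature "folder"
    return dict(store)

first_statistics = store_attribute_info(
    "process_segment",
    {"display_name": "usage data A", "equipment_id": "usage data B"},
)
```

The entity key collects all attribute information for the entity, while each attribute key holds only its own, mirroring the one entity folder plus one folder per attribute in the text.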
For example, in some embodiments, when the current access request is in the GraphQL data query language, the format of the current access request may be:
process_segment{
display_name: "material injection machine",
equipment_id: "e_filler"}
Here, the entity feature may be "process_segment"; the attribute features may be "display_name" and "equipment_id"; and the attribute information is the usage data of the "material injection machine" and the usage data of "e_filler".
Then, an entity feature folder named "<process_segment>" is created, and the usage data of the "material injection machine" and the usage data of "e_filler" are stored in it.
An attribute feature folder named "<process_segment, display_name>" is created, and the usage data of the "material injection machine" is stored in it.
An attribute feature folder named "<process_segment, equipment_id>" is created, and the usage data of "e_filler" is stored in it.
In this embodiment of the application, when the target current request field includes a first to-be-accessed item object and a second to-be-accessed item object, the data processing method further includes: determining the item object association relationship recorded in the object model.
The item object association relationship refers to the connection relationship between item objects in the object model. For example, if item object A is connected to item object B, then A has an association relationship with B. If item object A is connected to item object C through item object B, then A has an association relationship with C. If item object A is not connected to item object D, or the number of item objects connected between A and D exceeds a preset number, then A does not have an association relationship with D.
Determine the association relationship between the first to-be-accessed item object and the second to-be-accessed item object in the target current request field according to the item object association relationships recorded in the object model.
The first to-be-accessed item object and the second to-be-accessed item object refer to any two of the plurality of to-be-accessed item objects involved in the current access request.
The association relationship is determined as follows: determine a first target item object corresponding to the first to-be-accessed item object in the object model; determine a second target item object corresponding to the second to-be-accessed item object in the object model; and determine the association relationship between the first and second to-be-accessed item objects in the target current request field according to the association relationship between the first target item object and the second target item object.
Judge whether the association relationship between the first to-be-accessed item object and the second to-be-accessed item object satisfies a preset association condition.
The preset association condition is that the hop count δ between the first to-be-accessed item object and the second to-be-accessed item object is not more than N, where the hop count is the number of associations traversed between the two item objects (one less than the number of associated to-be-accessed item objects on the path) and can be set as required. For example, when the hop count δ is 1, there are two associated to-be-accessed item objects, and the first and second to-be-accessed item objects are directly connected; when δ is 2, there are three associated to-be-accessed item objects, with one to-be-accessed item object connected between the first and the second; when δ is 5, there are six associated to-be-accessed item objects, with four connected between the first and the second. In some embodiments, when the hop count δ is 1, the preset association condition is satisfied if the first and second to-be-accessed item objects are directly connected in the object model.
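The hop-count condition can be illustrated with a breadth-first search over the object model. The sketch below assumes the object model is available as an undirected adjacency mapping; the names `hop_count`, `is_associated` and the sample objects A to D are hypothetical and only mirror the A/B/C/D example given earlier.

```python
from collections import deque


def hop_count(model, start, goal):
    """Return the fewest association hops between two item objects, or None."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in model.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # the two item objects are not connected at all


def is_associated(model, first, second, max_hops):
    """Preset association condition: connected within max_hops hops."""
    hops = hop_count(model, first, second)
    return hops is not None and hops <= max_hops


# Hypothetical object model: A-B and B-C are connected, D is isolated.
object_model = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": []}
```

With this model, A and C are two hops apart, so they satisfy the condition only when the permitted hop count is at least 2, and D never satisfies it.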
If so, determine the association relationship between a first subject object of the first to-be-accessed item object and a second subject object of the second to-be-accessed item object.
The first subject object is the subject object obtained by performing field analysis on the entity name and the attribute name of the first to-be-accessed item object; the second subject object is the subject object obtained by performing field analysis on the entity name and the attribute name of the second to-be-accessed item object.
The association relationship between the first subject object and the second subject object includes:
the association relationship between the first entity feature in the first subject object and the second entity feature in the second subject object;
the association relationship between the first attribute feature in the first subject object and the second entity feature in the second subject object;
the association relationship between the first entity feature in the first subject object and the second attribute feature in the second subject object;
and the association relationship between the first attribute feature in the first subject object and the second attribute feature in the second subject object.
There may be one or more of each of the following: the association relationship between the first entity name and the second entity name, between the first entity name and the second attribute name, between the first attribute name and the second entity name, and between the first attribute name and the second attribute name.
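The four kinds of association listed above amount to pairing every feature of the first subject object with every feature of the second. A minimal sketch, assuming each subject object is represented as a dictionary with one entity and a list of attributes (the representation, the function name `feature_pairs`, and the `equipment` entity are hypothetical):

```python
from itertools import product


def feature_pairs(first, second):
    """Pair each feature of the first subject object with each of the second."""
    def features(subject):
        entity = subject["entity"]
        yield ("entity", (entity,))
        for attr in subject["attributes"]:
            yield ("attribute", (entity, attr))

    return list(product(features(first), features(second)))


first_subject = {"entity": "process_segment", "attributes": ["display_name"]}
second_subject = {"entity": "equipment", "attributes": ["id", "name"]}
pairs = feature_pairs(first_subject, second_subject)
# The kinds of pairing that occur: entity-entity, entity-attribute, and so on.
kinds = {(a[0], b[0]) for a, b in pairs}
```

Because a subject object may carry several attribute features, each of the four kinds can occur more than once, which matches the statement that there may be one or more of each association.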
Store the attribute information according to the association relationship between the first subject object of the first to-be-accessed item object and the second subject object of the second to-be-accessed item object to obtain second statistical data information.
The second statistical data information refers to the data information in which the attribute information is stored according to the association relationship between the first subject object of the first to-be-accessed item object and the second subject object of the second to-be-accessed item object.
Storing the attribute information according to this association relationship may include:
According to the association relationship between the first entity feature in the first subject object and the second entity feature in the second subject object, store the attribute information of the first entity feature at the position corresponding to the first entity feature, and store the attribute information of the second entity feature at the position corresponding to the second entity feature, to obtain second statistical data information representing the association relationship between the first entity feature and the second entity feature.
According to the association relationship between the first attribute feature in the first subject object and the second entity feature in the second subject object, store the attribute information corresponding to the first attribute feature at the position corresponding to the first entity feature, and store the attribute information of the second entity feature at the position corresponding to the second entity feature, to obtain second statistical data information representing the association relationship between the first attribute feature and the second entity feature.
According to the association relationship between the first entity feature in the first subject object and the second attribute feature in the second subject object, store the attribute information of the first entity feature at the position corresponding to the first entity feature, and store the attribute information corresponding to the second attribute feature at the position corresponding to the second entity feature, to obtain second statistical data information representing the association relationship between the first entity feature and the second attribute feature.
According to the association relationship between the first attribute feature in the first subject object and the second attribute feature in the second subject object, store the attribute information corresponding to the first attribute feature at the position corresponding to the first entity feature, and store the attribute information corresponding to the second attribute feature at the position corresponding to the second entity feature, to obtain second statistical data information representing the association relationship between the first attribute feature and the second attribute feature.
In the embodiment of the application, in order to present the first statistical data information and the second statistical data information more clearly, the number of times that the entity features and the attribute features were accessed in historical access requests within a historical time period is calculated, the usage amount of each piece of data information in the first and second statistical data information is determined from these access counts, and the first statistical data information and the second statistical data information are sorted accordingly.
The method for determining the usage amount of the entity features and the usage amount of the attribute features includes the following steps:
Acquire a history request field in the historical access request, where the history request field includes the entity name of an accessed item object and the attribute name of the accessed item object.
The historical access request refers to a request sent by the user before the current search for the data information of an item object, or a request sent within a certain time period.
In some embodiments, the historical access request may be expressed in a data query language, which is a language used to retrieve a desired data set from a database or data file.
For example, in some embodiments of the present application, the data query language may be the GraphQL data query language or the Structured Query Language (SQL).
The history request field is a field in the historical access request and represents the item object queried by the user. The accessed item object refers to an item object that was accessed before the current access request; it may be the same as or different from the item object of the current access request.
Acquiring the history request field in the historical access request may mean collecting the data query language input by the user and extracting its fields, thereby obtaining the entity name and the attribute name of the accessed item object in the history request field.
The entity name and the attribute name of the accessed item object refer to the entity name and the attribute name of the accessed item object that appear in the history request field.
Perform field analysis on the entity name and the attribute name of the accessed item object to obtain a history subject object of the accessed item object, where the history subject object includes history entity features and history attribute features.
Field analysis means identifying and extracting the entity name and the attribute name of the accessed item object and excluding the filter items in the history request field, thereby determining the history subject object of the history request field.
In embodiments of the present application, the history subject object includes history entity features and history attribute features. For example, in some embodiments, the history entity features may include "entity in accessed item object", and the history attribute features may include "as a return attribute", "as a NULL (IS NULL) filter attribute or a non-NULL (IS NOT NULL) filter attribute", "as a value filter attribute other than an IS NULL or IS NOT NULL filter, as a return attribute", and "as a set of attributes of a grouping clause".
Perform statistical processing on the historical access requests to obtain the usage amount of each historical access request, where the usage amount represents the number of times the historical access request was made within a preset historical time period.
Statistical processing means counting the number of times the historical access request was made within the preset historical time period, where the historical time period may be set manually. For example, the number of times a historical access request was made within the past week is obtained.
Determine the usage amount of the history subject object according to the usage amount of the historical access request.
Determining the usage amount of the history subject object means counting the history subject objects that appear in each historical access request to obtain the number of times each history subject object was accessed in the historical time period, and thus the number of times the history entity features and the history attribute features were accessed.
Among the history subject objects of all accessed item objects, determine the history subject object that is the same as the subject object of the to-be-accessed item object, and take the usage amount of that history subject object as the usage amount of the subject object of the to-be-accessed item object.
A history subject object is the same as the subject object when its history entity feature is the same as the entity feature and its history attribute feature is the same as the attribute feature. The usage amount of the entity feature and the usage amount of the attribute feature in the to-be-accessed item object are then determined from the number of times the history entity feature and the history attribute feature were accessed.
Sort the entity features and the attribute features of the subject object according to the usage amount of the subject object in the to-be-accessed item object to obtain sorted first statistical data information.
Sorting the subject objects in the to-be-accessed item objects means ordering the entity features and the attribute features in the subject objects by the usage amount of the subject objects, thereby obtaining the sorted first statistical data information. The sorted first statistical data information may be presented in the form of a list.
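The counting and sorting steps above can be sketched as follows. The sketch assumes each historical request has already been reduced by field analysis to a mapping from entity name to the attribute names it touches; that representation, the function names, and the sample history are hypothetical.

```python
from collections import Counter


def subject_usage(history_requests):
    """Count how often each entity feature and attribute feature was accessed."""
    usage = Counter()
    for request in history_requests:
        for entity, attributes in request.items():
            usage[(entity,)] += 1           # history entity feature
            for attr in attributes:
                usage[(entity, attr)] += 1  # history attribute feature
    return usage


def rank_by_usage(usage):
    """Sort features by descending usage; ties are broken by feature name."""
    return sorted(usage.items(), key=lambda kv: (-kv[1], kv[0]))


history = [
    {"process_segment": ["display_name", "equipment_id"]},
    {"process_segment": ["display_name"]},
    {"equipment": ["id"]},
]
usage = subject_usage(history)
ranking = rank_by_usage(usage)
```

The usage of a subject object in the current request is then a plain lookup, for example `usage[("process_segment", "display_name")]`, and `ranking` is the list form in which the sorted statistics could be presented.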
The method for sorting the first statistical data information according to the usage amount of the entity features and the usage amount of the attribute features further includes:
According to the usage amount of the subject object in the to-be-accessed item object, calculate the sampling degree and accuracy of the entity features and the attribute features in the subject object to obtain sampling degree data and accuracy data of the entity features and sampling degree data and accuracy data of the attribute features.
The sampling degree of an entity feature represents how frequently the entity feature occurs: the higher the sampling degree, the more frequent the occurrence and the higher the statistical accuracy of the data for the entity feature. The sampling degree of an attribute feature represents how frequently the attribute feature occurs: the higher the sampling degree, the more frequent the occurrence and the higher the accuracy for the attribute feature.
Store the sampling degree data and accuracy data of the entity features and the sampling degree data and accuracy data of the attribute features in the first statistical data information to obtain first target statistical data information.
Storing the sampling degree data and the accuracy data in the first statistical data information means storing the sampling degree data and accuracy data of the entity feature at the position of the entity feature in the first statistical data information, and storing the sampling degree data and accuracy data of the attribute feature at the position of the attribute feature in the first statistical data information.
In an embodiment of the present application, a method for sorting the second statistical data information includes:
Determine, according to the usage amount of the subject objects in the to-be-accessed item objects, the usage amount of the association relationship between the first to-be-accessed item object and the second to-be-accessed item object that satisfies the preset association condition.
Determining the usage amount of the association relationship between the first and second to-be-accessed item objects means determining the usage amount of the association relationships between the first entity name and first attribute name in the first to-be-accessed item object and the second entity name and second attribute name in the second to-be-accessed item object.
The usage amount of the association relationship between the first to-be-accessed item object and the second to-be-accessed item object includes:
the usage amount of the association relationship between the first entity name in the first to-be-accessed item object and the second entity name in the second to-be-accessed item object;
the usage amount of the association relationship between the first entity name in the first to-be-accessed item object and the second attribute name in the second to-be-accessed item object;
the usage amount of the association relationship between the first attribute name in the first to-be-accessed item object and the second entity name in the second to-be-accessed item object;
and the usage amount of the association relationship between the first attribute name in the first to-be-accessed item object and the second attribute name in the second to-be-accessed item object.
Sort the second statistical data information according to the usage amount of the association relationship between the first to-be-accessed item object and the second to-be-accessed item object to obtain sorted second target statistical data information.
Sorting means ordering the four association relationships (first entity name with second entity name, first entity name with second attribute name, first attribute name with second entity name, and first attribute name with second attribute name) by their usage amounts, thereby obtaining second target statistical data information in which the association relationships are sorted.
The method for sorting the second statistical data information further includes:
According to the usage amount of the subject object in the to-be-accessed item object, calculate the sampling degree and accuracy of the association relationship between the first to-be-accessed item object and the second to-be-accessed item object to obtain sampling degree data and accuracy data of the association relationship.
The sampling degree of the association relationship represents how frequently the association relationship occurs: the higher the sampling degree, the more frequent the occurrence and the higher the statistical accuracy of the data for the association relationship.
Store the sampling degree data and the accuracy data of the association relationship in the second statistical data information to obtain second target statistical data information.
The following describes a data processing method in an embodiment of the present invention with reference to a specific application scenario.
Referring to Fig. 2, which is a schematic flow chart of an embodiment in which the data processing method is applied in an experimental scenario according to an embodiment of the present invention, the data processing method is applied to a server and includes:
201. Determine the information of the preset object model.
The object model may be a domain model.
The information of the object model includes, but is not limited to, entities, attributes, associations, ID attributes, access attributes of the domain model.
Determining the information of the preset object model refers to performing a reading operation on the domain model.
202. Acquire the target current request field in the current access request.
The current access request is expressed in the GraphQL query language. The GraphQL request can be obtained from a service log of the GraphQL engine, from an aggregated log service center of the GraphQL engine, or forwarded directly from the client (for example, when the method is applied to a GraphQL query server).
Acquiring the target current request field in the current access request means acquiring the log content of the GraphQL request, where the log content includes the request initiation time and the request body content (e.g., returned entities and hierarchy, returned attribute set, attributes and conditions involved in filter conditions, grouping attributes, etc.).
203. Determine whether the current access request matches the object model.
Analyzing the target current request field in the current access request means parsing the syntactic structure of the GraphQL request and then checking, on the basis of the domain model, whether the GraphQL request conforms to the domain model structure.
When the entity names in the GraphQL request and the attribute names corresponding to those entity names correspond to item objects in the domain model, the GraphQL request conforms to the domain model structure.
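The conformance check of step 203 can be sketched as a lookup of every requested entity and attribute in the domain model. The sketch assumes the request has already been parsed into a mapping from entity name to attribute names and that the domain model maps entity names to their known attributes; both representations and the function name `matches_model` are hypothetical.

```python
def matches_model(domain_model, request_fields):
    """Return True if every entity and attribute in the request exists in the model."""
    for entity, attributes in request_fields.items():
        known_attributes = domain_model.get(entity)
        if known_attributes is None:
            return False  # entity name not present in the domain model
        if not set(attributes) <= known_attributes:
            return False  # at least one attribute name is unknown for this entity
    return True


# Hypothetical domain model with a single entity.
domain_model = {"process_segment": {"display_name", "equipment_id"}}
valid_request = {"process_segment": ["display_name", "equipment_id"]}
invalid_request = {"process_segment": ["serial_number"]}
```

A request that fails this check would not proceed to the analysis of step 204.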
204. Analyze the target current request field in the current access request to obtain an analysis result.
Analyzing the GraphQL request yields the request initiation time, the filter items in the request (including entities and attributes, and the comparison expressions in the filter conditions), the returned entities and the hierarchy among entities in the request, the returned Distinct clause in the request, and the grouping aggregation information in the request (including grouping clauses and aggregation functions). From these, the following are derived for the GraphQL request: the entity set; the association set; the multi-hop associations not greater than a specified hop threshold (the hop threshold δ is a system parameter); the entities and attributes used as return attributes; the entities and attributes used as IS NULL or IS NOT NULL filter attributes; the entities and attributes used as value filter attributes other than IS NULL or IS NOT NULL filters; the attribute sets used in Distinct clauses (a case of return attributes); and the attribute sets used in grouping clauses.
205. Calculate the usage of the target current request field in the current access request.
The usage of the target current request field includes:
the usage u_e(e1) of an entity (e.g., e1);
the usage u_r(r1) of an association (e.g., r1);
the usage u(<r1, r2, …, rd>) of a multi-hop association {<r1, r2, …, rd> | d ≤ δ}, where the hop threshold δ is a system parameter;
the usage u_{a-ret}(e1.a1) of an entity and attribute (e.g., e1.a1) as a return attribute;
the usage u_{a-null}(e1.a1) of an entity and attribute (e.g., e1.a1) as an IS NULL or IS NOT NULL filter attribute;
the usage u_{a-pred}(e1.a1) of an entity and attribute (e.g., e1.a1) as a value filter attribute other than an IS NULL or IS NOT NULL filter;
the usage u_{as-dist}(e1.a1, …, e1.an) of an attribute set of a Distinct clause (a case of return attributes, e.g., {e1.a1, e1.a2, …, e1.an});
the usage u_{as-grp}(e1.a1, …, e1.an) of an attribute set of a grouping clause (e.g., {e1.a1, e1.a2, …, e1.an}).
The method for calculating the usage may include calculating the usage over a preset historical time period using a sliding window algorithm.
For example, the sliding window size is denoted W and the step size is denoted S. In practice, W and S may be chosen as, for example, 1 month and 1 day (or 1 week and 1 hour). For each step period of the sliding window (e.g., 1 hour), the usage of the target current request field is calculated and stored. The system retains the element usage for all step periods (e.g., 1 hour each) within one window-size period (e.g., 1 week). When the window slides forward by one step (toward the future), the usage set of the target current request field for the new step period (1 hour) is generated and retained, and the usage set for the oldest step period (1 hour) and anything earlier is discarded.
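The sliding-window bookkeeping can be sketched with a bounded queue of per-step counters. This is a minimal illustration, assuming the window size W is expressed as a whole number of step periods S; the class name `SlidingUsage` and its methods are hypothetical.

```python
from collections import Counter, deque


class SlidingUsage:
    """Keep per-step usage counts for the most recent `window_steps` steps."""

    def __init__(self, window_steps):
        # A deque with maxlen drops the oldest step when a new one is pushed.
        self.steps = deque(maxlen=window_steps)

    def push_step(self, step_counts):
        """Record the usage counted during one step period (e.g., 1 hour)."""
        self.steps.append(Counter(step_counts))

    def usage(self, element):
        """Total usage of an element over all retained step periods."""
        return sum(step[element] for step in self.steps)


window = SlidingUsage(window_steps=3)
for counts in ({"e1": 2}, {"e1": 1}, {"e1": 4}):
    window.push_step(counts)
usage_before_slide = window.usage("e1")  # 2 + 1 + 4
window.push_step({"e1": 0})              # slide forward; the oldest step is dropped
usage_after_slide = window.usage("e1")   # 1 + 4 + 0
```

With W = 1 week and S = 1 hour, `window_steps` would be 168 under this representation.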
206. Extract attribute information from the database of the object model according to the relationship between the target current request field and the object model, and obtain first target statistical data information and second target statistical data information according to the usage of the target current request field.
The first target statistical data information and the second target statistical data information each include the subject object for which statistical information is to be generated, the form and content of the statistical information to be generated for that subject object, and the timeliness requirement of the statistical information to be generated, which may be, for example, a decimal value in [0, 1].
The first target statistical data information includes:
Statistical information of entity features:
Subject object: an entity (e1);
1) Content: the number of instances of the entity in the underlying data;
Parameters: sampling degree: 1 - (1 - ε)/(1 + γ·u_e(e1)), where ε is a system parameter, 0 ≤ ε < 1, used to control how strongly each metric value in the statistical information generation plan depends on historical usage (the smaller ε is, the greater the dependence); γ is a system parameter, 0 < γ ≤ 1, used to control the influence of usage on the sampling degree in the statistical information generation plan (the larger γ is, the greater the influence).
2) Content: the amount of change in the number of instances of the entity in the underlying data;
Parameters: sampling degree: 1 - (1 - ε)/(1 + γ·u_e(e1));
Timeliness: 1 - (1 - ε)/(1 + α·u_e(e1)), where α is a system parameter, 0 < α ≤ 1, used to control the influence of usage on timeliness in the statistical information generation plan (the larger α is, the greater the influence).
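The sampling degree and the entity timeliness share the form 1 - (1 - ε)/(1 + w·u), with w = γ for the sampling degree and w = α for timeliness. A small sketch of that expression (the function name `usage_degree` and the chosen parameter values are illustrative assumptions):

```python
def usage_degree(usage, eps, weight):
    """Evaluate 1 - (1 - eps) / (1 + weight * usage).

    eps is the floor reached when usage is 0; weight (gamma or alpha)
    controls how quickly the value approaches 1 as usage grows.
    """
    if not 0 <= eps < 1:
        raise ValueError("eps must satisfy 0 <= eps < 1")
    if not 0 < weight <= 1:
        raise ValueError("weight must satisfy 0 < weight <= 1")
    if usage < 0:
        raise ValueError("usage must be non-negative")
    return 1 - (1 - eps) / (1 + weight * usage)


unused = usage_degree(usage=0, eps=0.2, weight=0.5)    # equals eps when never used
moderate = usage_degree(usage=3, eps=0.0, weight=1.0)  # 1 - 1/4
heavy = usage_degree(usage=999, eps=0.0, weight=1.0)   # 1 - 1/1000
```

The value equals ε when the element has never been used and approaches 1 as usage grows, which is why a smaller ε makes the result depend more strongly on historical usage.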
Statistical information of the attributes:
1. An attribute as a return value;
Subject object: an attribute (e1.a1) listed in the return list of the access request, where e1 denotes the entity and a1 denotes the attribute;
Content: the average byte size of the attribute value in the underlying data;
Parameters: sampling degree: 1 - (1 - ε)/(1 + γ·u_{a-ret}(e1.a1));
Timeliness: (1 - (1 - ρ)/(1 + β·variantnum(e1))) × (1 - (1 - ε)/(1 + α·u_{a-ret}(e1.a1))), where β is a system parameter, 0 ≤ β ≤ 1, used to control the influence of the amount of change in the number of entity instances on timeliness in the statistical information generation plan (the larger β is, the greater the influence); ρ is a system parameter, 0 ≤ ρ ≤ 1, used to control how strongly timeliness in the statistical information generation plan depends on the historical data situation (the smaller ρ is, the greater the dependence); and variantnum denotes the amount of change in the number of instances of the entity in the underlying data.
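As a worked illustration of attribute timeliness, the sketch below assumes the product form that the other attribute statistics in this section use: a data-change factor 1 - (1 - ρ)/(1 + β·variantnum) multiplied by a usage factor 1 - (1 - ε)/(1 + α·u). The function name and the sample values are hypothetical.

```python
def attribute_timeliness(usage, variant_num, eps, rho, alpha, beta):
    """Product of a data-change factor and a usage factor (assumed form)."""
    change_factor = 1 - (1 - rho) / (1 + beta * variant_num)
    usage_factor = 1 - (1 - eps) / (1 + alpha * usage)
    return change_factor * usage_factor


# With rho = 0 the result depends fully on how much the entity's data changed.
changing = attribute_timeliness(usage=1, variant_num=1, eps=0.0, rho=0.0, alpha=1.0, beta=1.0)
# With rho = 1 the data-change factor is always 1, so only usage matters.
ignore_change = attribute_timeliness(usage=1, variant_num=1, eps=0.0, rho=1.0, alpha=1.0, beta=1.0)
# An entity whose instance count did not change and rho = 0 gives timeliness 0.
static = attribute_timeliness(usage=1, variant_num=0, eps=0.0, rho=0.0, alpha=1.0, beta=1.0)
```

Under this form, statistics for attributes of frequently used, rapidly changing entities receive the highest timeliness requirement.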
2. An attribute associated with a value cardinality;
Subject object: an attribute (e1.a2) listed in a single-attribute Distinct clause or a single-attribute grouping clause in the access request;
Content: the number of distinct values of the attribute in the underlying data;
Parameters: sampling degree: 1 - (1 - ε)/(1 + γ1·u_{as-dist}(e1.a2) + γ2·u_{as-grp}(e1.a2));
Timeliness: (1 - (1 - ρ)/(1 + β·variantnum(e1))) × (1 - (1 - ε)/(1 + α1·u_{as-dist}(e1.a2) + α2·u_{as-grp}(e1.a2))).
3. The attribute of a single-attribute grouping clause;
Subject object: the attribute listed in a single-attribute grouping clause in the access request (denoted e1.a3);
Content: the count-min sketch count of the attribute value in the underlying data;
Parameters: space size: 1 - (1 - ε)/(1 + γ·u_{as-grp}(e1.a3));
Timeliness: (1 - (1 - ρ)/(1 + β·variantnum(e1))) × (1 - (1 - ε)/(1 + α·u_{as-grp}(e1.a3))).
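For readers unfamiliar with the count-min sketch mentioned above, a minimal generic version is shown below. It is a textbook-style illustration rather than the implementation used in this application; the class name, the SHA-256 based hashing, and the width and depth values are assumptions. The "space size" parameter above could, for instance, be used to scale `width`, which is likewise an assumption.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter: estimates never undercount."""

    def __init__(self, width, depth):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # Derive one hash per row by salting the item with the row number.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        # Collisions can only inflate a cell, so the minimum is the best estimate.
        return min(self.table[row][self._index(row, item)] for row in range(self.depth))


sketch = CountMinSketch(width=64, depth=4)
for value in ["e_filler", "e_filler", "e_filler", "e_mixer"]:
    sketch.add(value)
```

A wider table reduces collisions and therefore overestimation, at the cost of more space.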
4. Attributes in NULL filter conditions;
Subject object: an attribute listed in an IS NULL or IS NOT NULL filter condition in the access request (denoted e1.a4);
Content: the null ratio of the attribute value in the underlying data;
Parameters: sampling degree: 1 - (1 - ε)/(1 + γ·u_{a-null}(e1.a4));
Timeliness: (1 - (1 - ρ)/(1 + β·variantnum(e1))) × (1 - (1 - ε)/(1 + α·u_{a-null}(e1.a4))).
5. Attributes in value filter conditions other than NULL filters;
Subject object: an attribute listed in the access request in a value filter condition other than an IS NULL or IS NOT NULL filter (denoted e1.a5);
1) Content: the number of distinct values of the attribute in the underlying data, denoted dist_val_num;
Parameters: sampling degree: 1 - (1 - ε)/(1 + γ·u_{a-pred}(e1.a5));
2) Content: an equal-depth statistical histogram of the attribute values in the underlying data;
Parameters: number of buckets: max(1, [log_μ dist_val_num]) × (1 - (1 - ε)/(1 + γ·u_{a-pred}(e1.a5)));
3) Content: the count-min sketch count of the attribute value in the underlying data;
Parameters: space size: 1 - (1 - ε)/(1 + γ·u_{a-pred}(e1.a5));
Timeliness: (1 - (1 - ρ)/(1 + β·variantnum(e1))) × (1 - (1 - ε)/(1 + α·u_{a-pred}(e1.a5))).
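The bucket-count parameter and the equal-depth histogram can be illustrated together. Several points below are assumptions: the bracket [·] is read as rounding down, the final bucket count is rounded to an integer of at least 1, and μ is treated as a logarithm base greater than 1 because the text does not define it. The function names are hypothetical.

```python
import math


def bucket_count(dist_val_num, usage, eps, gamma, mu):
    """max(1, floor(log_mu(dist_val_num))) scaled by the usage-dependent factor."""
    base = max(1, math.floor(math.log(dist_val_num, mu)))
    scaled = base * (1 - (1 - eps) / (1 + gamma * usage))
    return max(1, round(scaled))  # assumption: at least one bucket is always kept


def equal_depth_histogram(values, buckets):
    """Split sorted values into buckets holding (almost) the same number of rows.

    Returns (lowest value, highest value, row count) for every non-empty bucket.
    """
    ordered = sorted(values)
    n = len(ordered)
    result = []
    for i in range(buckets):
        lo = i * n // buckets
        hi = (i + 1) * n // buckets
        if lo < hi:
            result.append((ordered[lo], ordered[hi - 1], hi - lo))
    return result


n_buckets = bucket_count(dist_val_num=1000, usage=1, eps=0.5, gamma=1.0, mu=2)
histogram = equal_depth_histogram(range(1, 9), buckets=4)
```

Unlike an equal-width histogram, each bucket here covers the same number of rows, so skewed value ranges receive narrower buckets.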
Statistical information of multiple attributes:
6. a combination of attributes associated with a value cardinality;
subject object: the combination of attributes listed in the multi-attribute Distingt clause or multi-attribute grouping clause in the access request (let it be { e1.a61, …, e1.a6n });
the content is as follows: the number of different values (at least one-dimensional values being different) of the multidimensional values of the combination of attributes in the underlying data;
parameters are as follows: sampling degree: 1- (1-epsilon)/(1 + gamma 1. u)as―dist(e1.a61,…,e1.a6n)+γ2·uas―grp(e1.a61,…,e1.a6n));
And timeliness: (1- (1-. rho.)/(1. beta. variantnum (e1))) × (1- (1-. epsilon.)/(1. alpha.1. u.)/(1.))as―dist(e1.a61,…,e1.a6n)+α2·uas―grp(e1.a61,…,e1.a6n)))。
7. All of the attributes that together make up a multi-attribute grouping clause;
Subject object: the combination of attributes listed in the multi-attribute grouping clause in the access request (let it be {e1.a71, …, e1.a7n});
Content: a count-min sketch of the multidimensional values of the attribute combination in the underlying data;
Parameters: space size: $1 - (1-\varepsilon)/(1+\gamma \cdot u_{\text{as-grp}}(e1.a71,\dots,e1.a7n))$;
Timeliness: $(1 - (1-\rho)/(1+\beta \cdot \text{variantnum}(e1))) \times (1 - (1-\varepsilon)/(1+\alpha \cdot u_{\text{as-grp}}(e1.a71,\dots,e1.a7n)))$.
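The count-min sketch named in item 7 can be illustrated with a small Python class keyed on tuples of attribute values. This is a generic textbook count-min sketch, not code from the application. The class and function names, the choice of BLAKE2b as the hash, and the way the "space size" fraction is turned into a column count (a fraction of a hypothetical maximum width) are assumptions.

```python
import hashlib


def degree(eps: float, weight: float, usage: float) -> float:
    """1 - (1 - eps) / (1 + weight * usage), the space-size fraction used above."""
    return 1.0 - (1.0 - eps) / (1.0 + weight * usage)


def sketch_width(max_width: int, eps: float, gamma: float, usage: float) -> int:
    """Assumption: the space-size fraction scales a configured maximum width."""
    return max(1, round(max_width * degree(eps, gamma, usage)))


class CountMinSketch:
    """Approximate counter; estimates never fall below the true count."""

    def __init__(self, width: int, depth: int):
        self.width, self.depth = width, depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, key: tuple, row: int) -> int:
        # One independent hash per row, derived by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(
            repr(key).encode(), salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest[:8], "little") % self.width

    def add(self, key: tuple, count: int = 1) -> None:
        for r in range(self.depth):
            self.rows[r][self._index(key, r)] += count

    def estimate(self, key: tuple) -> int:
        return min(self.rows[r][self._index(key, r)] for r in range(self.depth))


# Hypothetical grouping on (site, device_type): each tuple is one multidimensional value.
cms = CountMinSketch(width=sketch_width(1024, 0.1, 1.0, 9), depth=4)
for row in [("suzhou", "pump"), ("suzhou", "pump"), ("beijing", "valve")]:
    cms.add(row)
```

A wider sketch lowers the chance that two value combinations collide, so tying the width to the usage of the grouping clause spends memory where queries actually group.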
The second target statistical data information includes:
Single-association statistical information:
Subject object: a one-hop association in the access request (e.g., r1 = <e11, e12>);
1) Content: the number of instances of the one-hop association r1 in the underlying data (intra-association type);
Parameters: sampling degree: $1 - (1-\varepsilon)/(1+\gamma \cdot u_r(r1))$;
2) Content: the amount of change in the number of instances of the one-hop association r1 in the underlying data, denoted variantnum';
Parameters: sampling degree: $1 - (1-\varepsilon)/(1+\gamma \cdot u_r(r1))$;
Timeliness: $1 - (1-\varepsilon)/(1+\alpha \cdot u_r(r1))$.
Joint statistical information of an attribute and a single association:
1. Attributes in NULL filter conditions;
Subject object: an attribute listed in an IS NULL or IS NOT NULL filter condition in the access request, together with a one-hop association that starts at the entity to which the attribute belongs (e.g., r2 = <e21, e22> and e21.a1);
Content: the number of instances (intra-association type) of the one-hop association r2 in the underlying data for which the entity attribute value (e21.a1) is null;
Parameters: sampling degree: $1 - (1-\varepsilon)/(1+\min\{\gamma_1 \cdot u_{\text{a-null}}(e21.a1),\ \gamma_2 \cdot u_r(r2)\})$;
Timeliness: $(1 - (1-\rho)/(1+\max\{\beta_1 \cdot \text{variantnum}(e21)+\beta_2 \cdot \text{variantnum}'(r2)\})) \times (1 - (1-\varepsilon)/(1+\min\{\alpha_1 \cdot u_{\text{a-null}}(e21.a1),\ \alpha_2 \cdot u_r(r2)\}))$.
2. Attributes in non-NULL filter conditions;
Subject object: an attribute listed in a filter condition other than IS NULL or IS NOT NULL in the access request, together with a one-hop association that starts at the entity to which the attribute belongs (e.g., r3 = <e31, e32> and e31.a2);
1) Content: a statistical histogram over the underlying data, with the entity attribute value (e31.a2) as the independent variable and the number of instances (intra-association type) of the one-hop association r3 that satisfy the attribute value filter condition as the dependent variable;
Parameters: number of buckets: $\max(1, [\log_{\mu} \text{dist\_val\_num}]) \times (1 - (1-\varepsilon)/(1+\gamma \cdot u_{\text{a-pred}}(e31.a2)))$;
2) Content: a count-min sketch over the underlying data, with the entity attribute value (e31.a2) as the independent variable and the number of instances (intra-association type) of the one-hop association r3 that satisfy the attribute value filter condition as the dependent variable;
Parameters: space size: $1 - (1-\varepsilon)/(1+\min\{\gamma_1 \cdot u_{\text{a-pred}}(e31.a2),\ \gamma_2 \cdot u_r(r3)\})$;
Timeliness: $(1 - (1-\rho)/(1+\max\{\beta_1 \cdot \text{variantnum}(e31)+\beta_2 \cdot \text{variantnum}'(r3)\})) \times (1 - (1-\varepsilon)/(1+\min\{\alpha_1 \cdot u_{\text{a-pred}}(e31.a2),\ \alpha_2 \cdot u_r(r3)\}))$.
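The joint timeliness expressions combine a change term and a usage term. A minimal Python sketch of that arithmetic follows. The function and parameter names are hypothetical and the numeric inputs are invented for illustration. The text writes the max over a single summed expression, which the sketch evaluates simply as that sum.

```python
def joint_degree(eps: float, w_attr: float, u_attr: float, w_rel: float, u_rel: float) -> float:
    """1 - (1 - eps) / (1 + min{w_attr * u_attr, w_rel * u_rel}).

    The min means the less-used side (attribute or association) limits the result.
    """
    return 1.0 - (1.0 - eps) / (1.0 + min(w_attr * u_attr, w_rel * u_rel))


def joint_timeliness(rho: float, eps: float,
                     beta1: float, var_entity: float,
                     beta2: float, var_rel: float,
                     alpha1: float, u_attr: float,
                     alpha2: float, u_rel: float) -> float:
    """Product of the change term and the usage term, as in the text."""
    # max{beta1 * variantnum(e) + beta2 * variantnum'(r)} has one element: the sum itself.
    change_term = 1.0 - (1.0 - rho) / (1.0 + beta1 * var_entity + beta2 * var_rel)
    usage_term = joint_degree(eps, alpha1, u_attr, alpha2, u_rel)
    return change_term * usage_term


# Invented inputs: entity changed once, association changed twice,
# attribute used 3 times in filters, association used once.
t = joint_timeliness(rho=0.5, eps=0.5,
                     beta1=1.0, var_entity=1, beta2=1.0, var_rel=2,
                     alpha1=1.0, u_attr=3, alpha2=1.0, u_rel=1)
```

With these inputs the change term is 1 - 0.5/4 = 0.875 and the usage term is 1 - 0.5/2 = 0.75, so the timeliness is 0.65625.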
Multi-association statistical information used for generating a plan:
Subject object: a multi-hop association in the access request whose hop count is no greater than δ (e.g., <r1, r2, …, rd>, where d ≤ δ, δ is a system parameter, and ri = <e_{i,1}, e_{i,2}>) and whose timeliness is greater than or equal to τ (where τ is a system parameter);
Content: the number of instances of the multi-hop association <r1, r2, …, rd> (intra-association type) in the underlying data;
Parameters: sampling degree: $1 - (1-\varepsilon)/(1+\gamma \cdot u(\langle r1, r2, \dots, rd \rangle))$;
Timeliness: $1 - (1-\varepsilon)/(1+\alpha \cdot u(\langle r1, r2, \dots, rd \rangle))$.
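The selection rule for multi-hop associations (hop count bounded by δ, timeliness at least τ) can be sketched as a simple filter. The `HopPath` type, the function names and the parameter values are hypothetical, and the hop bound is read here as "no greater than δ".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HopPath:
    """A multi-hop association <r1, ..., rd>; each hop is an (entity, entity) pair."""
    hops: tuple
    usage: float  # u(<r1, ..., rd>): how often the path appears in access requests


def timeliness(eps: float, alpha: float, usage: float) -> float:
    """1 - (1 - eps) / (1 + alpha * u(<r1, ..., rd>))."""
    return 1.0 - (1.0 - eps) / (1.0 + alpha * usage)


def select_paths(paths, delta: int, tau: float, eps: float, alpha: float):
    """Keep paths with at most delta hops whose timeliness reaches tau."""
    return [
        p for p in paths
        if len(p.hops) <= delta and timeliness(eps, alpha, p.usage) >= tau
    ]


# Invented candidates.
hot = HopPath(hops=(("e1", "e2"), ("e2", "e3")), usage=9)
cold = HopPath(hops=(("e1", "e2"), ("e2", "e4")), usage=0)
long_path = HopPath(hops=(("e1", "e2"), ("e2", "e3"), ("e3", "e4"), ("e4", "e5")), usage=100)

kept = select_paths([hot, cold, long_path], delta=3, tau=0.5, eps=0.1, alpha=1.0)
```

Only `hot` survives in this example: `cold` has timeliness 0.1, below τ, and `long_path` exceeds the hop bound however often it is used.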
In order to better implement the method, embodiments of the present application further provide a data processing apparatus, where the data processing apparatus may be specifically integrated in an electronic device, and the electronic device may be a terminal, a server, or the like. The terminal can be a mobile phone, a tablet computer, an intelligent Bluetooth device, a notebook computer, a personal computer and other devices; the server may be a single server or a server cluster composed of a plurality of servers.
For example, in the present embodiment, the method of the present embodiment will be described in detail by taking an example in which a data processing device is specifically integrated in a server.
For example, as shown in fig. 3, the data processing apparatus may include:
an obtaining unit 301, configured to obtain a target current request field in a current access request, where the target current request field includes an entity name of an item object to be accessed and an attribute name of the item object to be accessed;
a first searching unit 302, configured to search a preset object model for the item object corresponding to the entity name of the item object to be accessed, and to mark it as a target item object, where the preset object model comprises a plurality of item objects and each item object is connected to at least one database;
a second searching unit 303, configured to search, from at least one database connected to the target item object, attribute information corresponding to an attribute name of the item object to be accessed.
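The division of work among units 301 to 303 can be sketched as one function with three steps. This is only an illustration: the dictionary-based object model, the request layout, the table and column names, and the use of an in-memory SQLite database are assumptions and are not taken from the application.

```python
import sqlite3


def build_demo_model():
    """Hypothetical object model: entity name -> known attributes and connected databases."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE pump (id INTEGER, pressure REAL)")
    db.execute("INSERT INTO pump VALUES (1, 2.5), (2, 3.1)")
    return {"pump": {"attributes": {"id", "pressure"}, "databases": [db]}}


def handle_request(request: dict, object_model: dict) -> list:
    # Obtaining unit 301: read the entity name and attribute name from the request field.
    entity, attribute = request["entity"], request["attribute"]

    # First searching unit 302: find the target item object by entity name.
    target = object_model.get(entity)
    # Guard (an assumption of this sketch): only known attribute names reach the SQL text.
    if target is None or attribute not in target["attributes"]:
        return []

    # Second searching unit 303: query only the databases connected to the target.
    values = []
    for db in target["databases"]:
        cursor = db.execute(f'SELECT "{attribute}" FROM "{entity}"')
        values.extend(row[0] for row in cursor)
    return values


model = build_demo_model()
pressures = handle_request({"entity": "pump", "attribute": "pressure"}, model)
```

Because the object model maps each item object to its own databases, the lookup never scans databases that belong to other item objects.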
In some embodiments of the present application, the data processing apparatus may further include:
a first field analysis unit, configured to perform field analysis on the entity name and the attribute name of the item object to be accessed to obtain a subject object of the item object to be accessed, where the subject object includes an entity feature used to characterize an entity in the item object to be accessed and an attribute feature used to characterize an attribute in the item object to be accessed;
a first storage unit, configured to store the attribute information at the position of the entity feature and the position of the attribute feature in the subject object, to obtain first statistical data information.
In some embodiments of the present application, the data processing apparatus may further include:
a first determination unit, configured to determine the item object association relationships recorded in the object model;
a second determining unit, configured to determine, according to the item object association relationships recorded in the object model, the association relationship between a first item object to be accessed and a second item object to be accessed in the target current request field;
a judging unit, configured to judge whether the association relationship between the first item object to be accessed and the second item object to be accessed meets a preset association condition and, if so, to determine the association relationship between a first subject object of the first item object to be accessed and a second subject object of the second item object to be accessed;
a second storage unit, configured to store the attribute information according to the association relationship between the first subject object of the first item object to be accessed and the second subject object of the second item object to be accessed, to obtain second statistical data information.
In some embodiments of the present application, the data processing apparatus may further include:
a history acquisition unit, configured to acquire a history request field in a historical access request, where the history request field includes an entity name of an accessed item object and an attribute name of the accessed item object;
a second field analysis unit, configured to perform field analysis on the entity name and the attribute name of the accessed item object to obtain a history subject object of the accessed item object, where the history subject object includes a history entity feature and a history attribute feature;
a first statistical unit, configured to perform statistical processing on the historical access request to obtain the usage amount of the historical access request, where the usage amount represents the number of times the historical access request was accessed within a preset historical time period;
a third determination unit, configured to determine the usage amount of the history subject object according to the usage amount of the historical access request;
a fourth determination unit, configured to determine, among the history subject objects of all accessed item objects, the history subject object that is the same as the subject object of the item object to be accessed, and to use the usage amount of that history subject object as the usage amount of the subject object of the item object to be accessed;
a sorting processing unit, configured to sort the entity features and the attribute features of the subject object according to the usage amount of the subject object in the item object to be accessed, to obtain sorted first statistical data information.
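The usage-based ordering performed by these units can be approximated as counting how often each entity and attribute appears in the request history and then sorting the features of the current request by that count. The tuple encoding of features and the function names below are hypothetical.

```python
from collections import Counter


def subject_usage(history):
    """Count usage per history subject object.

    history: iterable of (entity_name, attribute_name) pairs, one per historical
    access request seen in the chosen time window.
    """
    usage = Counter()
    for entity, attribute in history:
        usage[("entity", entity)] += 1
        usage[("attribute", entity, attribute)] += 1
    return usage


def rank_features(current_features, usage):
    """Order the features of the current request by historical usage, most used first."""
    return sorted(current_features, key=lambda f: usage.get(f, 0), reverse=True)


# Invented request history.
history = [("pump", "pressure"), ("pump", "pressure"), ("pump", "id"), ("valve", "state")]
usage = subject_usage(history)
ranked = rank_features(
    [("attribute", "pump", "id"), ("attribute", "pump", "pressure"), ("entity", "pump")],
    usage,
)
```

A feature with no matching history subject object falls back to a usage of zero and sorts last.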
In some embodiments of the present application, the data processing apparatus may further include:
a calculation processing unit, configured to calculate the sampling degree and the accuracy of the entity feature and the attribute feature in the subject object according to the usage amount of the subject object in the item object to be accessed, to obtain sampling degree data and accuracy data of the entity feature and sampling degree data and accuracy data of the attribute feature;
an obtaining unit, configured to store the sampling degree data and the accuracy data of the entity feature and the sampling degree data and the accuracy data of the attribute feature in the first statistical data information, to obtain first target statistical data information.
In some embodiments of the present application, the data processing apparatus may further include:
a calculation processing unit, configured to calculate the sampling degree and the accuracy of the association relationship between the first item object to be accessed and the second item object to be accessed according to the usage amount of the subject object in the item objects to be accessed, to obtain sampling degree data and accuracy data of the association relationship;
an obtaining unit, configured to store the sampling degree data and the accuracy data of the association relationship in the second statistical data information, to obtain second target statistical data information.
In some embodiments of the present application, when searching the preset object model for the item object corresponding to the entity name of the item object to be accessed, the first searching unit 302 is further configured for:
determining, in the object model, an item object corresponding to the entity name of the item object to be accessed, and recording it as an initial target item object;
judging whether the attribute name of the initial target item object corresponds to the attribute name of the item object to be accessed and, if so, determining that the initial target item object is the target item object.
In a specific implementation, the above units may be implemented as independent entities, or may be combined arbitrarily to be implemented as the same or several entities, and the specific implementation of the above units may refer to the foregoing method embodiments, which are not described herein again.
As can be seen from the above, in the data processing apparatus of this embodiment, the obtaining unit 301 obtains a target current request field in the current access request, where the target current request field includes an entity name of an item object to be accessed and an attribute name of the item object to be accessed; the first searching unit 302 searches a preset object model for the item object corresponding to the entity name of the item object to be accessed and records it as the target item object, where the preset object model comprises a plurality of item objects and each item object is connected to at least one database; and the second searching unit 303 searches at least one database connected to the target item object for the attribute information corresponding to the attribute name of the item object to be accessed.
Therefore, based on the entity name and the attribute name of the item object to be accessed in the target current request field, the embodiment of the application can find the corresponding item object in the preset object model, and thereby determine the target item object and the database connected to it. Data is then looked up only in the database connected to the target item object, which improves the data lookup speed.
The embodiment of the application also provides the electronic equipment which can be equipment such as a terminal and a server. The terminal can be a mobile phone, a tablet computer, an intelligent Bluetooth device, a notebook computer, a personal computer and the like; the server may be a single server, a server cluster composed of a plurality of servers, or the like.
In some embodiments, the data processing apparatus may also be integrated in a plurality of electronic devices; for example, the data processing apparatus may be integrated in a plurality of servers, and the data processing method of the present application is implemented by the plurality of servers.
In this embodiment, the electronic device is described in detail by taking the data processing device as an example. Fig. 4 shows a schematic structural diagram of the data processing device according to the embodiment of the present application. Specifically:
the data processing apparatus may include components such as a processor 401 of one or more processing cores, memory 402 of one or more computer-readable storage media, a power supply 403, an input module 404, and a communication module 405. Those skilled in the art will appreciate that the data processing arrangement depicted in FIG. 4 does not constitute a limitation of the data processing arrangement and may include more or fewer components than those shown, or some of the components may be combined, or a different arrangement of components. Wherein:
the processor 401 is a control center of the data processing apparatus, connects various parts of the entire data processing apparatus with various interfaces and lines, performs various functions of the data processing apparatus and processes data by running or executing software programs and/or modules stored in the memory 402 and calling data stored in the memory 402, thereby performing overall monitoring of the data processing apparatus. In some embodiments, processor 401 may include one or more processing cores; in some embodiments, processor 401 may integrate an application processor, which primarily handles operating systems, user interfaces, applications, etc., and a modem processor, which primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 401.
The memory 402 may be used to store software programs and modules, and the processor 401 executes various functional applications and data processing by running the software programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data created according to use of the data processing apparatus, and the like. Further, the memory 402 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 access to the memory 402.
The data processing apparatus further comprises a power supply 403 for supplying power to the various components, and in some embodiments, the power supply 403 may be logically connected to the processor 401 via a power management system, so as to implement functions of managing charging, discharging, and power consumption via the power management system. The power supply 403 may also include any component of one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.
The data processing apparatus may also include an input module 404, the input module 404 being operable to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
The data processing device may also include a communication module 405, and in some embodiments the communication module 405 may include a wireless module, through which the data processing device may wirelessly transmit over short distances, thereby providing wireless broadband internet access to the user. For example, the communication module 405 may be used to assist a user in sending and receiving e-mails, browsing web pages, accessing streaming media, and the like.
Although not shown, the data processing apparatus may further include a display unit and the like, which are not described in detail herein. Specifically, in this embodiment, the processor 401 in the data processing apparatus loads the executable file corresponding to the process of one or more application programs into the memory 402 according to the following instructions, and the processor 401 runs the application programs stored in the memory 402, thereby implementing various functions.
In some embodiments, a computer program product is also proposed, which comprises a computer program or instructions that, when executed by a processor, implement the steps of any of the data processing methods described above.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by instructions or by associated hardware controlled by the instructions, which may be stored in a computer readable storage medium and loaded and executed by a processor.
To this end, embodiments of the present application provide a computer-readable storage medium, in which a plurality of instructions are stored, and the instructions can be loaded by a processor to execute the steps in any data processing method provided by the embodiments of the present application.
The storage medium may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium.
Since the instructions stored in the storage medium can execute the steps of any data processing method provided in the embodiments of the present application, they can achieve the beneficial effects of any such method; for details, see the foregoing embodiments, which are not repeated here.
The foregoing detailed description is directed to a data processing method, an apparatus, a terminal and a storage medium provided in the embodiments of the present application, and specific examples are applied in the present application to explain the principles and implementations of the present application, and the descriptions of the foregoing embodiments are only used to help understand the method and the core ideas of the present application; meanwhile, for those skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (14)

1. A data processing method, the method comprising:
acquiring a target current request field in a current access request, wherein the target current request field comprises an entity name of an item object to be accessed and an attribute name of the item object to be accessed;
searching a preset object model for the item object corresponding to the entity name of the item object to be accessed, and marking it as a target item object, wherein the preset object model comprises a plurality of item objects, and each item object is connected to at least one database;
and searching, in at least one database connected to the target item object, for attribute information corresponding to the attribute name of the item object to be accessed.
2. The data processing method according to claim 1, characterized in that the method further comprises:
performing field analysis on the entity name and the attribute name of the item object to be accessed to obtain a subject object of the item object to be accessed, wherein the subject object comprises an entity feature for characterizing an entity in the item object to be accessed and an attribute feature for characterizing an attribute in the item object to be accessed;
and storing the attribute information at the position of the entity feature and the position of the attribute feature in the subject object, to obtain first statistical data information.
3. The data processing method of claim 2, wherein, when the target current request field includes a first item object to be accessed and a second item object to be accessed,
the method further comprises:
determining the item object association relationships recorded in the object model;
determining the association relationship between the first item object to be accessed and the second item object to be accessed in the target current request field according to the item object association relationships recorded in the object model;
judging whether the association relationship between the first item object to be accessed and the second item object to be accessed meets a preset association condition;
if so, determining the association relationship between a first subject object of the first item object to be accessed and a second subject object of the second item object to be accessed;
and storing the attribute information according to the association relationship between the first subject object of the first item object to be accessed and the second subject object of the second item object to be accessed, to obtain second statistical data information.
4. The data processing method according to claim 3, wherein the preset association condition is that the number of hops δ between the first item object to be accessed and the second item object to be accessed is not greater than N, wherein N is not less than 1, and N is a positive integer.
5. The data processing method according to claim 3, wherein the association relationship between the first subject object of the first item object to be accessed and the second subject object of the second item object to be accessed comprises:
the association relationship between the first entity feature in the first subject object and the second entity feature in the second subject object;
the association relationship between the first attribute feature in the first subject object and the second entity feature in the second subject object;
the association relationship between the first entity feature in the first subject object and the second attribute feature in the second subject object;
and the association relationship between the first attribute feature in the first subject object and the second attribute feature in the second subject object.
6. The data processing method of claim 3, wherein the method further comprises:
acquiring a history request field in a historical access request, wherein the history request field comprises an entity name of an accessed item object and an attribute name of the accessed item object;
performing field analysis on the entity name and the attribute name of the accessed item object to obtain a history subject object of the accessed item object, wherein the history subject object comprises a history entity feature and a history attribute feature;
performing statistical processing on the historical access request to obtain the usage amount of the historical access request, wherein the usage amount represents the number of times the historical access request was accessed within a preset historical time period;
determining the usage amount of the history subject object according to the usage amount of the historical access request;
determining, among the history subject objects of all the accessed item objects, the history subject object that is the same as the subject object of the item object to be accessed, and taking the usage amount of that history subject object as the usage amount of the subject object of the item object to be accessed;
and sorting the entity features and the attribute features of the subject object according to the usage amount of the subject object in the item object to be accessed, to obtain sorted first statistical data information.
7. The data processing method of claim 6, wherein the method further comprises:
calculating, according to the usage amount of the subject object in the item object to be accessed, the sampling degree and the accuracy of the entity feature and the attribute feature in the subject object, to obtain sampling degree data and accuracy data of the entity feature and sampling degree data and accuracy data of the attribute feature;
and storing the sampling degree data and the accuracy data of the entity feature and the sampling degree data and the accuracy data of the attribute feature in the first statistical data information, to obtain first target statistical data information.
8. The data processing method of claim 6, wherein the method further comprises:
determining, according to the usage amount of the subject object in the item objects to be accessed, the usage amount of the association relationship between the first item object to be accessed and the second item object to be accessed that meets the preset association condition;
wherein the usage amount of the association relationship between the first item object to be accessed and the second item object to be accessed includes:
the usage amount of the association relationship between the first entity name in the first item object to be accessed and the second entity name in the second item object to be accessed;
the usage amount of the association relationship between the first entity name in the first item object to be accessed and the second attribute name in the second item object to be accessed;
the usage amount of the association relationship between the first attribute name in the first item object to be accessed and the second entity name in the second item object to be accessed;
the usage amount of the association relationship between the first attribute name in the first item object to be accessed and the second attribute name in the second item object to be accessed;
and sorting the second statistical data information according to the usage amount of the association relationship between the first item object to be accessed and the second item object to be accessed, to obtain sorted second target statistical data information.
9. The data processing method of claim 8, wherein the method further comprises:
calculating, according to the usage amount of the subject object in the item objects to be accessed, the sampling degree and the accuracy of the association relationship between the first item object to be accessed and the second item object to be accessed, to obtain sampling degree data and accuracy data of the association relationship;
and storing the sampling degree data and the accuracy data of the association relationship in the second statistical data information, to obtain second target statistical data information.
10. The data processing method according to claim 1, wherein the method of searching for the item object corresponding to the entity name of the item object to be accessed from the preset object model comprises:
determining, in the object model, an item object corresponding to the entity name of the item object to be accessed, and recording it as an initial target item object;
and judging whether the attribute name of the initial target item object corresponds to the attribute name of the item object to be accessed and, if so, determining that the initial target item object is the target item object.
11. A data processing apparatus, comprising:
an obtaining unit, configured to obtain a target current request field in a current access request, where the target current request field includes an entity name of an item object to be accessed and an attribute name of the item object to be accessed;
a first searching unit, configured to search a preset object model for the item object corresponding to the entity name of the item object to be accessed, and to mark it as a target item object, wherein the preset object model comprises a plurality of item objects, and each item object is connected to at least one database;
and the second searching unit is used for searching the attribute information corresponding to the attribute name of the item object to be accessed from at least one database connected with the target item object.
12. A terminal comprising a processor and a memory, said memory storing a plurality of instructions; the processor loads instructions from the memory to perform the steps of the data processing method of any of claims 1 to 10.
13. A computer readable storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the steps of the data processing method according to any one of claims 1 to 10.
14. A computer program product comprising a computer program or instructions, characterized in that the computer program or instructions, when executed by a processor, implement the steps in the data processing method of any of claims 1 to 10.
CN202210225367.8A 2022-03-09 2022-03-09 Data processing method, device, terminal and storage medium Active CN114661830B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210225367.8A CN114661830B (en) 2022-03-09 2022-03-09 Data processing method, device, terminal and storage medium

Publications (2)

Publication Number Publication Date
CN114661830A true CN114661830A (en) 2022-06-24
CN114661830B CN114661830B (en) 2023-03-24

Family

ID=82028780

Country Status (1)

Country Link
CN (1) CN114661830B (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101485178A (en) * 2006-07-14 2009-07-15 诺基亚公司 Method for obtaining information objects in a communication system
CN101777047A (en) * 2009-01-08 2010-07-14 国际商业机器公司 System, equipment and method for accessing database under multiple-tenant environment
CN103853803A (en) * 2013-06-26 2014-06-11 携程计算机技术(上海)有限公司 Database configuration file encapsulation method and operation method as well as operation device thereof
WO2015181814A2 (en) * 2014-05-29 2015-12-03 Cidabra Technologies Ltd System, method and computer program product for assisted information collection
CN106033466A (en) * 2015-03-20 2016-10-19 华为技术有限公司 Database query method and device
CN106202438A (en) * 2016-07-13 2016-12-07 乐视控股(北京)有限公司 Method and system for storing associated data
CN109241165A (en) * 2018-08-30 2019-01-18 联动优势科技有限公司 Method, apparatus and device for determining database synchronization delay
CN109284326A (en) * 2018-11-26 2019-01-29 北京中创碳投科技有限公司 Database access method and device
CN109656980A (en) * 2018-12-27 2019-04-19 Oppo(重庆)智能科技有限公司 Data processing method, electronic device, apparatus and readable storage medium
CN110457382A (en) * 2019-08-12 2019-11-15 中国联合网络通信集团有限公司 Service processing method and device
CN111400507A (en) * 2020-06-05 2020-07-10 浙江口碑网络技术有限公司 Entity matching method and device
CN113127102A (en) * 2021-05-18 2021-07-16 中国农业银行股份有限公司 Method, device, equipment, storage medium and program for processing service data
CN113157965A (en) * 2021-05-07 2021-07-23 杭州网易云音乐科技有限公司 Audio visual model training and audio visual method, device and equipment
US20210390103A1 (en) * 2020-06-15 2021-12-16 Blue Light LLC Federated search of heterogeneous data sources

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
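KRISZTIAN BALOG et al.: "EntiTables: Smart Assistance for Entity-Focused Tables", Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval *
LIU Qiang (刘强): "Research on BIM Data Integration and Management Technology Based on Cloud Computing" (基于云计算的BIM数据集成与管理技术研究), China Doctoral Dissertations Full-text Database, Engineering Science and Technology II *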

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115374109A (en) * 2022-07-29 2022-11-22 华为技术有限公司 Data access method, device, computing equipment and system
CN115374109B (en) * 2022-07-29 2023-09-01 华为技术有限公司 Data access method, device, computing equipment and system
CN116384939A (en) * 2023-04-13 2023-07-04 华腾建信科技有限公司 Engineering project safety management method, device, equipment and storage medium
CN116384939B (en) * 2023-04-13 2023-12-01 华腾建信科技有限公司 Engineering project safety management method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN114661830B (en) 2023-03-24

Similar Documents

Publication Publication Date Title
CN114661830B (en) Data processing method, device, terminal and storage medium
US10042911B2 (en) Discovery of related entities in a master data management system
US6470333B1 (en) Knowledge extraction system and method
US20200042626A1 (en) Identifying similar field sets using related source types
US20020143797A1 (en) File classification management system and method used in operating systems
CN106682097A (en) Method and device for processing log data
US20040267693A1 (en) Method and system for evaluating the suitability of metadata
US20090171938A1 (en) Context-based document search
US20060230012A1 (en) System and method for dynamically tracking user interests based on personal information
US20180218285A1 (en) Search input recommendations
WO2014028300A1 (en) Managing cross-correlated data
CN107341033A (en) A kind of data statistical approach, device, electronic equipment and storage medium
CN106682096A (en) Method and device for log data management
US20130080466A1 (en) Query servicing with access path security in a relational database management system
US7861154B2 (en) Integration of annotations to dynamic data sets
US20160041975A1 (en) Document tagging and retrieval using per-subject dictionaries including subject-determining-power scores for entries
US20230315727A1 (en) Cost-based query optimization for untyped fields in database systems
CN115187331A (en) Product recommendation method, device, equipment and storage medium based on multi-modal data
CN111414410A (en) Data processing method, device, equipment and storage medium
US11663109B1 (en) Automated seasonal frequency identification
CN116628228B (en) RPA flow recommendation method and computer readable storage medium
US20060149731A1 (en) System and method for deriving affinity relationships between objects
WO2023093783A1 (en) Distributed recommendation method for mass digital information
CN111159435B (en) Multimedia resource processing method, system, terminal and computer readable storage medium
CN116089417A (en) Information acquisition method, information acquisition device, storage medium and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant