CN114610769A - Data analysis method, device, equipment and storage medium - Google Patents

Data analysis method, device, equipment and storage medium

Publication number
CN114610769A
Authority
CN
China
Prior art keywords
data
service
bean object
analysis
annotation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210288360.0A
Other languages
Chinese (zh)
Inventor
赵勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kangjian Information Technology Shenzhen Co Ltd
Original Assignee
Kangjian Information Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kangjian Information Technology Shenzhen Co Ltd
Priority to CN202210288360.0A
Publication of CN114610769A
Legal status: Pending

Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06F ELECTRIC DIGITAL DATA PROCESSING
          • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
            • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
              • G06F16/24 Querying
                • G06F16/245 Query processing
                  • G06F16/2455 Query execution
                  • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
                    • G06F16/2462 Approximate or statistical queries
                    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
                • G06F16/248 Presentation of query results
          • G06F40/00 Handling natural language data
            • G06F40/20 Natural language analysis
              • G06F40/279 Recognition of textual entities
                • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
            • G06F40/30 Semantic analysis
          • G06F9/00 Arrangements for program control, e.g. control units
            • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
              • G06F9/44 Arrangements for executing specific programs
                • G06F9/448 Execution paradigms, e.g. implementations of programming paradigms
                  • G06F9/4488 Object-oriented
                    • G06F9/449 Object-oriented method invocation or resolution
              • G06F9/46 Multiprogramming arrangements
                • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
                  • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
                    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
                  • G06F9/5061 Partitioning or combining of resources
                • G06F9/54 Interprogram communication
                  • G06F9/546 Message passing systems or structures, e.g. queues
          • G06F2209/00 Indexing scheme relating to G06F9/00
            • G06F2209/50 Indexing scheme relating to G06F9/50
              • G06F2209/5011 Pool
              • G06F2209/5018 Thread allocation
            • G06F2209/54 Indexing scheme relating to G06F9/54
              • G06F2209/548 Queue
          • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
            • G06F2216/03 Data mining
      • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
        • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
          • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
            • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
          • G16H40/00 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
            • G16H40/20 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Medical Informatics (AREA)
  • Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • Public Health (AREA)
  • General Business, Economics & Management (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of big data and discloses a data analysis method, apparatus, device and storage medium. The method comprises: parsing the annotation attributes of each field in a blank Bean object through a preset function of the Bean object; extracting the corresponding data content from the acquired business data according to the annotation attributes, and filling the data content into the blank Bean object to obtain a target Bean object; converting each target Bean object into a business data matrix according to a preset self-attention mechanism to obtain business flow data of the internet hospital; analyzing the business flow data through timed tasks and natural language processing to obtain key values; and performing anomaly analysis and monitoring on the corresponding business data according to the key values, and generating a monitoring report based on the analysis and monitoring results. By applying a unified supervision back-end system, the invention improves the efficiency of business volume statistics, analyzes the supervision data in each dimension, and solves the technical problem that the business growth rate cannot be observed from the analysis results.

Description

Data analysis method, device, equipment and storage medium
Technical Field
The present invention relates to the field of big data technologies, and in particular, to a data analysis method, apparatus, device, and storage medium.
Background
After a number of internet hospitals have been successfully connected to a supervision platform and business data is uploaded and accumulated daily, the SaaS platform that integrates these internet hospitals counts and analyzes the uploaded data; the value contained in this large volume of uploaded data is something internet hospitals have not yet explored. For example, it may be known that a certain internet hospital has successfully connected to a certain provincial supervision platform, yet the total upload count, total success count, total failure count, failure reasons, success ratio, period-over-period success rate and the like cannot be determined.
Analyzing supervision data in multiple dimensions helps to keep the compliance status of an internet hospital under control, to analyze its growth rate, and to self-check the health of the internet hospitals being operated in a timely manner; meanwhile, the supervision platform monitors the operating status of a given internet hospital based on the data that hospital uploads. If problems are not dealt with in time, the internet hospital may receive rectification requirements from the supervision platform, and if rectification is not carried out in time, this may cause inestimable economic loss or lead to penalties such as revocation of the internet hospital's license. Some rectification items are simple: for example, the Guangxi supervision platform requires connection to a population master index interface when a patient card is created, and requires real-name authentication of the patient at card creation. Many medical institutions in the industry merely connect their internet hospitals to the supervision platform without paying enough attention to the data generated during uploading, and how to review themselves from a supervisory perspective is likewise a problem that internet hospitals tend to overlook.
Disclosure of Invention
The main purpose of the invention is to improve the efficiency of business volume statistics by applying a unified supervision back-end system, to analyze supervision data in multiple dimensions, and to solve the technical problem that the business growth rate cannot be observed from the analysis results.
A first aspect of the present invention provides a data analysis method, comprising: determining a preset function of the Bean object, and analyzing annotation attributes of fields in a pre-created blank Bean object by using the preset function; acquiring service data uploaded by each internet hospital; extracting corresponding data content from each business data according to the annotation attribute, and filling the data content into the blank Bean object to obtain a target Bean object; converting each target Bean object into a business data matrix according to a preset self-attention mechanism to obtain business flow data of each internet hospital; analyzing the service flow data through a preset timing task and natural language processing to obtain a key numerical value of the service flow data; and performing anomaly analysis and monitoring on the corresponding service data according to the key values, and generating a monitoring report based on the analysis and monitoring results.
Optionally, in a first implementation manner of the first aspect of the present invention, before determining a preset function of a Bean object and parsing annotation attributes of fields in a pre-created blank Bean object by using the preset function, the method further includes: acquiring service summary data of a service table and field data of each field in the service table from a preset database; and analyzing the service summary data, converting the analyzed service summary data into a Bean object in a preset format, and creating a blank Bean object.
Optionally, in a second implementation manner of the first aspect of the present invention, the parsing, by using the preset function, of the annotation attributes of each field in the pre-created blank Bean object includes: acquiring all attribute names of the blank Bean object, wherein the attribute names comprise a service source, a supervision platform code, a service code, a retry count and an error classification label; looking up all annotation contents attached to those attribute names; selecting, from all the annotation contents, the target annotation content that carries a specific identifier; and acquiring the target annotation content together with its corresponding attribute name, thereby determining the annotation attributes of the blank Bean object.
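The lookup described above can be sketched with plain Java reflection. This is a minimal illustration under stated assumptions, not the patent's implementation: the annotation `MonitorField`, its `source` element, the `UploadRecord` class and all field names are hypothetical, chosen only to mirror the attribute names listed in the text.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical marker annotation (an assumption): it plays the role of the
// "specific identifier" that selects the target annotation content.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface MonitorField {
    String source();
}

// Hypothetical blank Bean; the fields mirror the attribute names in the text.
class UploadRecord {
    @MonitorField(source = "biz_source") private String serviceSource;
    @MonitorField(source = "platform_code") private String platformCode;
    @MonitorField(source = "biz_code") private String serviceCode;
    @MonitorField(source = "retry_times") private Integer retryCount;
    @MonitorField(source = "error_tag") private String errorLabel;
    private String internalNote; // not annotated, so the scan skips it

    public UploadRecord() { }
}

class AnnotationScanner {
    // Returns attribute name -> annotation content for every marked field.
    static Map<String, String> annotationAttributes(Class<?> beanType) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Field field : beanType.getDeclaredFields()) {
            MonitorField marker = field.getAnnotation(MonitorField.class);
            if (marker != null) {
                result.put(field.getName(), marker.source());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(annotationAttributes(UploadRecord.class));
    }
}
```

The annotation needs `RetentionPolicy.RUNTIME`; with the default retention it would not be visible to reflection and the scan would return an empty map.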
Optionally, in a third implementation manner of the first aspect of the present invention, the filling of the data content into the blank Bean object to obtain the target Bean object includes: acquiring the attribute names in the annotation attributes, and extracting the corresponding data content from each piece of service data according to the attribute names, wherein the data content comprises a data configuration type and a data storage position; acquiring the attribute names and corresponding values in the data content according to the data configuration type, and determining the data dimension corresponding to the data content from those attribute names and values; and filling the data content into the blank Bean object according to the data dimension to obtain the target Bean object.
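A simplified sketch of the filling step follows. It assumes that each piece of service data arrives as a key-value map and that a field annotation names the key to read; the annotation `SourceKey`, the `VisitBean` class and the `BeanFiller` helper are hypothetical. The data configuration type, storage position and data dimension mentioned in the text are not modeled here.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Hypothetical annotation naming the key to read from the service data.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface SourceKey {
    String value();
}

// Hypothetical Bean type with a public no-argument constructor.
class VisitBean {
    @SourceKey("biz_code") String serviceCode;
    @SourceKey("retry_times") Integer retryCount;
    String untouched = "n/a"; // no annotation, never overwritten

    public VisitBean() { }
}

class BeanFiller {
    // Creates a blank Bean and copies matching values from the record into it.
    static <T> T fill(Class<T> type, Map<String, Object> record) {
        try {
            T bean = type.getDeclaredConstructor().newInstance();
            for (Field field : type.getDeclaredFields()) {
                SourceKey key = field.getAnnotation(SourceKey.class);
                if (key == null || !record.containsKey(key.value())) {
                    continue;
                }
                Object value = record.get(key.value());
                // Skip values whose type does not match the field.
                if (value != null && !field.getType().isInstance(value)) {
                    continue;
                }
                field.setAccessible(true);
                field.set(bean, value);
            }
            return bean;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot fill " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> record = new HashMap<>();
        record.put("biz_code", "RX-01");
        record.put("retry_times", 2);
        VisitBean bean = fill(VisitBean.class, record);
        System.out.println(bean.serviceCode + " " + bean.retryCount);
    }
}
```

Mismatched values are skipped silently in this sketch; a production version would more likely record them, since the text treats failure reasons as something worth analyzing.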
Optionally, in a fourth implementation manner of the first aspect of the present invention, the converting of each target Bean object into a business data matrix according to a preset self-attention mechanism to obtain business flow data of each internet hospital includes: converting each target Bean object into a business data matrix according to the preset self-attention mechanism; loading the data content into the business data matrix based on the data dimension to obtain a first business data matrix and a second business data matrix containing business data; and decoding the first business data matrix and the second business data matrix by means of a decoder to obtain the business flow data of the internet hospital.
Optionally, in a fifth implementation manner of the first aspect of the present invention, after the converting, according to a preset self-attention mechanism, each target Bean object into a business data matrix to obtain business flow data of each internet hospital, the method further includes: generating MQ messages based on preset timing task data and the service flow data, and storing the MQ messages in an MQ message queue of Redis; monitoring the MQ message queue to acquire MQ messages in the MQ message queue; establishing a timing task according to the MQ message, and storing the timing task into a timing task table; and creating a thread pool, and regularly running the timed tasks in the timed task table based on the thread pool.
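The message-to-scheduled-task flow above can be sketched with the JDK's concurrency utilities. This is an illustration under stated assumptions: an in-memory `BlockingQueue` stands in for the Redis-backed MQ queue, a map stands in for the timing task table, and the "task" merely increments a counter. The names `TaskMessage` and `TimedTaskRunner` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical message carrying a task name and its run period.
class TaskMessage {
    final String taskName;
    final long periodMillis;

    TaskMessage(String taskName, long periodMillis) {
        this.taskName = taskName;
        this.periodMillis = periodMillis;
    }
}

class TimedTaskRunner {
    // In-memory stand-in (assumption) for the MQ message queue kept in Redis.
    private final BlockingQueue<TaskMessage> queue = new LinkedBlockingQueue<>();
    // Stand-in for the timing task table.
    private final Map<String, TaskMessage> taskTable = new ConcurrentHashMap<>();
    private final Map<String, AtomicInteger> runCounts = new ConcurrentHashMap<>();
    // Thread pool that runs the timed tasks; daemon threads let the JVM exit.
    private final ScheduledExecutorService pool =
            Executors.newScheduledThreadPool(2, r -> {
                Thread t = new Thread(r, "timed-task");
                t.setDaemon(true);
                return t;
            });

    void publish(TaskMessage message) {
        queue.add(message);
    }

    // Reads pending messages, stores new tasks in the table and schedules them.
    int drainAndSchedule() {
        List<TaskMessage> pending = new ArrayList<>();
        queue.drainTo(pending);
        int scheduled = 0;
        for (TaskMessage message : pending) {
            if (taskTable.putIfAbsent(message.taskName, message) != null) {
                continue; // already in the task table
            }
            AtomicInteger counter = new AtomicInteger();
            runCounts.put(message.taskName, counter);
            pool.scheduleAtFixedRate(counter::incrementAndGet,
                    0, message.periodMillis, TimeUnit.MILLISECONDS);
            scheduled++;
        }
        return scheduled;
    }

    int runCount(String taskName) {
        AtomicInteger counter = runCounts.get(taskName);
        return counter == null ? 0 : counter.get();
    }

    // Waits until the task has run at least minRuns times or the timeout expires.
    boolean awaitRuns(String taskName, int minRuns, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (System.nanoTime() < deadline) {
            if (runCount(taskName) >= minRuns) {
                return true;
            }
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return runCount(taskName) >= minRuns;
    }

    void shutdown() {
        pool.shutdownNow();
    }

    public static void main(String[] args) {
        TimedTaskRunner runner = new TimedTaskRunner();
        runner.publish(new TaskMessage("daily-upload-stats", 50));
        runner.drainAndSchedule();
        System.out.println("ran twice: " + runner.awaitRuns("daily-upload-stats", 2, 2000));
        runner.shutdown();
    }
}
```

Polling with `drainTo` is a simplification of the queue listener described in the text; a real listener would block on the queue or subscribe to it.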
Optionally, in a sixth implementation manner of the first aspect of the present invention, the performing of anomaly analysis and monitoring on the corresponding service data according to the key values, and the generating of a monitoring report based on the analysis and monitoring results, includes: querying a preset database according to the key values to obtain the original service data corresponding to the service flow data; performing average path analysis on the original service data based on an isolation forest algorithm and the attribute names in the target Bean object to obtain the average path length of the original service data; determining data anomaly points by analyzing the average path length together with the expected path length of each data item in the original service data; and calling an association rule analysis model to analyze and monitor the data anomaly points, and generating a monitoring report based on the analysis and monitoring results.
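The comparison between a data item's expected path length and the average path length can be illustrated with the standard isolation forest scoring formula, which is taken from the general literature on the algorithm and is not stated in the patent text: c(n) = 2H(n-1) - 2(n-1)/n, where H is the harmonic number, and the score is s = 2^(-E[h(x)] / c(n)). The sketch below assumes the expected path length has already been obtained from a trained forest; the class name and the threshold are hypothetical.

```java
class IsolationScore {
    // Average path length c(n) of an unsuccessful search in a binary search
    // tree built from n samples; used to normalize path lengths.
    static double averagePathLength(int sampleSize) {
        if (sampleSize < 2) {
            return 0.0;
        }
        double harmonic = 0.0; // H(n - 1), summed exactly
        for (int i = 1; i <= sampleSize - 1; i++) {
            harmonic += 1.0 / i;
        }
        return 2.0 * harmonic - 2.0 * (sampleSize - 1) / sampleSize;
    }

    // Anomaly score in (0, 1]; shorter expected paths give scores closer to 1.
    static double score(double expectedPathLength, int sampleSize) {
        if (sampleSize < 2) {
            throw new IllegalArgumentException("sampleSize must be at least 2");
        }
        return Math.pow(2.0, -expectedPathLength / averagePathLength(sampleSize));
    }

    // The threshold is an illustrative assumption, not a value from the text.
    static boolean isAnomaly(double expectedPathLength, int sampleSize, double threshold) {
        return score(expectedPathLength, sampleSize) > threshold;
    }

    public static void main(String[] args) {
        int n = 256;
        System.out.println("c(n) = " + averagePathLength(n));
        System.out.println("short path anomalous: " + isAnomaly(3.0, n, 0.6));
        System.out.println("long path anomalous: " + isAnomaly(12.0, n, 0.6));
    }
}
```

A data item whose expected path length equals c(n) scores exactly 0.5, so scores well above 0.5 indicate items that were isolated unusually quickly.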
A second aspect of the present invention provides a data analysis apparatus comprising: a first creating module, configured to determine a preset function of a Bean object and parse annotation attributes of fields in a pre-created blank Bean object by using the preset function; a first acquisition module, configured to acquire service data uploaded by each internet hospital; a filling module, configured to extract corresponding data content from each piece of service data according to the annotation attributes and fill the data content into the blank Bean object to obtain a target Bean object; a conversion module, configured to convert each target Bean object into a business data matrix according to a preset self-attention mechanism to obtain service flow data of each internet hospital; an analysis module, configured to analyze the service flow data through a preset timing task and natural language processing to obtain key values of the service flow data; and a monitoring module, configured to perform anomaly analysis and monitoring on the corresponding service data according to the key values and generate a monitoring report based on the analysis and monitoring results.
Optionally, in a first implementation manner of the second aspect of the present invention, the data analysis apparatus further includes: the second acquisition module is used for acquiring the service summary data of the service table and the field data of each field in the service table from a preset database; and the second creating module is used for analyzing the service summary data, converting the analyzed service summary data into a Bean object in a preset format and creating a blank Bean object.
Optionally, in a second implementation manner of the second aspect of the present invention, the first creating module is specifically configured to: acquire all attribute names of the blank Bean object, wherein the attribute names comprise a service source, a supervision platform code, a service code, a retry count and an error classification label; look up all annotation contents attached to those attribute names; select, from all the annotation contents, the target annotation content that carries a specific identifier; and acquire the target annotation content together with its corresponding attribute name, thereby determining the annotation attributes of the blank Bean object.
Optionally, in a third implementation manner of the second aspect of the present invention, the filling module is specifically configured to: acquiring an attribute name in the annotation attribute, and extracting corresponding data content from each service data according to the attribute name, wherein the data content comprises a data configuration type and a data storage position; acquiring an attribute name and a corresponding value in data content according to the data configuration type, and determining a data dimension corresponding to the data content according to the data of the attribute name and the value; and filling the data content into the blank Bean object according to the data dimension to obtain a target Bean object.
Optionally, in a fourth implementation manner of the second aspect of the present invention, the conversion module includes: a conversion unit, configured to convert each target Bean object into a business data matrix according to a preset self-attention mechanism; a loading unit, configured to load the data content into the business data matrix based on the data dimension to obtain a first business data matrix and a second business data matrix containing business data; and a decoding unit, configured to decode the first business data matrix and the second business data matrix by means of a decoder to obtain the business flow data of the internet hospital.
Optionally, in a fifth implementation manner of the second aspect of the present invention, the data analysis apparatus further includes: the generating module is used for generating MQ messages based on preset timing task data and the service flow data and storing the MQ messages in an MQ message queue of Redis; the monitoring module is used for monitoring the MQ message queue and acquiring the MQ messages in the MQ message queue; the storage module is used for creating a timing task according to the MQ message and storing the timing task into a timing task table; and the running module is used for creating a thread pool and running the timing tasks in the timing task table at regular time based on the thread pool.
Optionally, in a sixth implementation manner of the second aspect of the present invention, the monitoring module is specifically configured to: query a preset database according to the key values to obtain the original service data corresponding to the service flow data; perform average path analysis on the original service data based on an isolation forest algorithm and the attribute names in the target Bean object to obtain the average path length of the original service data; determine data anomaly points by analyzing the average path length together with the expected path length of each data item in the original service data; and call an association rule analysis model to analyze and monitor the data anomaly points, and generate a monitoring report based on the analysis and monitoring results.
A third aspect of the present invention provides a data analysis device comprising: a memory having instructions stored therein and at least one processor, the memory and the at least one processor being interconnected by a line;
the at least one processor invokes the instructions in the memory to cause the data analysis device to perform the steps of the data analysis method described above.
A fourth aspect of the present invention provides a computer-readable storage medium having stored therein instructions, which, when run on a computer, cause the computer to perform the steps of the data analysis method described above.
In the technical scheme provided by the invention, a preset function of the Bean object is determined and used to parse the annotation attributes of each field in the pre-created blank Bean object; service data uploaded by each internet hospital is acquired; corresponding data content is extracted from each piece of service data according to the annotation attributes and filled into the blank Bean object to obtain a target Bean object; each target Bean object is converted into a business data matrix according to a preset self-attention mechanism to obtain service flow data of each internet hospital; the service flow data is analyzed through a preset timing task and natural language processing to obtain key values of the service flow data; and anomaly analysis and monitoring are performed on the corresponding service data according to the key values, and a monitoring report is generated based on the analysis and monitoring results. The original service data corresponding to the service flow data is obtained by query and analyzed according to the attribute names in the target Bean object to obtain a data analysis result. By exploiting the decoupled, general-purpose, fast and simple nature of a unified supervision back-end system, the invention improves the efficiency of log query and business volume statistics, analyzes supervision data in each dimension, is decoupled from individual projects and can be reused once its JAR package is imported, and solves the technical problems of adding data supervision channels, analyzing each dimension of the supervision data and observing the service growth rate from the data analysis results.
Drawings
FIG. 1 is a schematic diagram of a first embodiment of a data analysis method provided by the present invention;
FIG. 2 is a schematic diagram of a second embodiment of a data analysis method provided by the present invention;
FIG. 3 is a schematic diagram of a third embodiment of the data analysis method provided by the present invention;
FIG. 4 is a schematic diagram of a fourth embodiment of the data analysis method provided by the present invention;
FIG. 5 is a schematic diagram of a fifth embodiment of a data analysis method provided by the present invention;
FIG. 6 is a schematic diagram of a first embodiment of a data analysis apparatus provided in the present invention;
FIG. 7 is a schematic diagram of a second embodiment of the data analysis apparatus provided by the present invention;
FIG. 8 is a schematic diagram of an embodiment of a data analysis device provided by the present invention.
Detailed Description
According to the data analysis method, apparatus, device and storage medium of the embodiments of the invention, a preset function of the Bean object is determined and used to parse the annotation attributes of each field in the pre-created blank Bean object; service data uploaded by each internet hospital is acquired; corresponding data content is extracted from each piece of service data according to the annotation attributes and filled into the blank Bean object to obtain a target Bean object; each target Bean object is converted into a business data matrix according to a preset self-attention mechanism to obtain service flow data of each internet hospital; the service flow data is analyzed through a preset timing task and natural language processing to obtain key values of the service flow data; and anomaly analysis and monitoring are performed on the corresponding service data according to the key values, and a monitoring report is generated based on the analysis and monitoring results. The original service data corresponding to the service flow data is obtained by query and analyzed according to the attribute names in the target Bean object to obtain a data analysis result. By exploiting the decoupled, general-purpose, fast and simple nature of a unified supervision back-end system, the invention improves the efficiency of log query and business volume statistics, analyzes supervision data in each dimension, is decoupled from individual projects and can be reused once its JAR package is imported, and solves the technical problems of adding data supervision channels, analyzing each dimension of the supervision data and observing the service growth rate from the data analysis results.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," or "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
For ease of understanding, a specific flow of an embodiment of the present invention is described below. Referring to FIG. 1, a first embodiment of the data analysis method according to an embodiment of the present invention includes:
101. Determining a preset function of the Bean object, and parsing annotation attributes of fields in a pre-created blank Bean object by using the preset function;
In this embodiment, a preset function of the Bean object is determined and used to parse the annotation attributes of the fields in the pre-created blank Bean object. Here, the preset function is the default constructor, that is, the constructor that is called when no initializer is explicitly provided. It is defined as a constructor that takes no parameters, or one that supplies default arguments for all of its parameters. If no initialization is provided when a variable of a class type is defined, the default constructor is used.
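As a minimal sketch of how a blank Bean object might be created through its default constructor, the following uses Java reflection to call the no-argument constructor. The class names `BlankRecord` and `BlankBeanFactory` and the field names are hypothetical. Note that Java has no default arguments, so only the no-parameter form of the definition above applies in this language.

```java
// Hypothetical Bean type; all fields start out unset.
class BlankRecord {
    String serviceCode;
    Integer retryCount;

    // No-argument constructor, called when no initializer is supplied.
    public BlankRecord() { }
}

class BlankBeanFactory {
    // Creates a blank instance by invoking the no-argument constructor.
    static <T> T newBlank(Class<T> type) {
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                    type.getName() + " has no usable no-argument constructor", e);
        }
    }

    public static void main(String[] args) {
        BlankRecord blank = newBlank(BlankRecord.class);
        System.out.println(blank.serviceCode + " " + blank.retryCount);
    }
}
```

Using wrapper types such as `Integer` rather than `int` keeps "not yet filled" distinguishable from a real value of zero.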
Specifically, a preset interface is registered in Spring, which manages a plurality of bean objects. When a bean object is requested from Spring through the getBean(beanName) method, that is, before the bean object is instantiated, a predefined function in the preset interface needs to be called to operate on the bean object, so that the bean object is instantiated.
Since the preset interface is only registered in Spring, it serves Spring merely as a classification marker. For Spring to be able to call the preset interface, the interface needs a concrete implementation. The preset interface is implemented as follows: a user-defined extension class is called to implement the business process of the preset interface, so that Spring can operate on the bean object through the predefined function in the preset interface.
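The idea of step 101 can be sketched in plain Java reflection, without Spring: a blank Bean object is created through its default (no-argument) constructor, and the annotation on each field is read. The annotation `SourceKey`, the bean `UploadRecord`, and its field names are hypothetical and serve only as an illustration; the embodiment does not specify them.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical annotation naming the source key a field is filled from.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface SourceKey {
    String value();
}

// Hypothetical bean; it declares no constructor, so it has a default one.
class UploadRecord {
    @SourceKey("service_source")
    String serviceSource;

    @SourceKey("retry_count")
    Integer retryCount;
}

class BlankBeanDemo {
    // Create a blank bean through its no-argument constructor.
    static <T> T newBlank(Class<T> type) {
        try {
            Constructor<T> ctor = type.getDeclaredConstructor();
            ctor.setAccessible(true);
            return ctor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no usable default constructor", e);
        }
    }

    // Map each annotated field name to the value carried by its annotation.
    static Map<String, String> annotationAttributes(Class<?> type) {
        Map<String, String> attributes = new LinkedHashMap<>();
        for (Field field : type.getDeclaredFields()) {
            SourceKey key = field.getAnnotation(SourceKey.class);
            if (key != null) {
                attributes.put(field.getName(), key.value());
            }
        }
        return attributes;
    }

    public static void main(String[] args) {
        UploadRecord blank = newBlank(UploadRecord.class);
        System.out.println("fields still empty: " + (blank.serviceSource == null));
        System.out.println(annotationAttributes(UploadRecord.class));
    }
}
```

In a Spring application the same lookup would typically live inside the extension class that implements the preset interface; that wiring is omitted here.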
102. Acquiring service data uploaded by each internet hospital;
In this embodiment, first, a data acquisition template is formulated at the central end according to the data acquisition requirements, and sample service data is automatically generated from the template;
secondly, in the hospital system at the acquisition end, user access operations are simulated to perform data entry, the entered content being the generated service data; after the simulated entry is completed, the background database of the acquisition-end hospital aligns the service data according to its database structure, so as to establish a data acquisition channel between the central end and the acquisition-end hospital and realize automatic data acquisition;
thirdly, after the data alignment is finished, conventional data acquisition and entry can be carried out in the acquisition-end hospital; after the conventional entry is finished, each data acquisition item acquired by the acquisition-end hospital is checked for duplication; duplicated data acquisition items are deleted, and the remaining data acquisition items are stored in the background database.
In this embodiment, a data acquisition template is formulated, service data is generated from the template and filled or entered into the electronic medical record system of the opposite-end hospital, and the data is aligned according to the database structure of the background database of the acquisition-end hospital, so that the database fields of the electronic medical record systems of different hospitals are aligned automatically; once the database fields are aligned, staff can acquire data by customizing a database view.
103. Extracting corresponding data content from each service data according to the annotation attribute, and filling the data content into the blank Bean object to obtain a target Bean object;
In this embodiment, first, the method for obtaining the content information of the field information is derived from the name of the field information, that is, the name is converted into the corresponding method name according to the JavaBean specification, where a JavaBean is a reusable component written in the Java language. The specific conversion rule is as follows: if the name of the field information is field, the method for obtaining its content information is getField and the method for setting it is setField, that is, a get or set prefix is added in front of the name of the field information, with the first letter of the name capitalized. The name of the method for obtaining the content information of the field information is obtained through this rule.
For reading the content information of the field information, a specific implementation may be: getFieldMethod = cls.getDeclaredMethod("getField"). Here the specified method is obtained through the class object cls, "getField" is the name of the method to be obtained, and the obtained method is denoted getFieldMethod.
When the obtained method is a private method, the private method is first set to an accessible state and then called to obtain the content information of the field information. The private method is set to the accessible state through the function setAccessible.
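The getter lookup described above can be sketched with the standard reflection API. The bean `PatientVisit` and its field are hypothetical; only `getDeclaredMethod`, `setAccessible`, and the get-prefix rule come from the text.

```java
import java.lang.reflect.Method;

// Hypothetical bean with a private getter, used only for illustration.
class PatientVisit {
    private String hospitalCode = "H001";

    private String getHospitalCode() {
        return hospitalCode;
    }
}

class GetterReflectionDemo {
    // Apply the naming rule: "field" becomes "getField".
    static String getterName(String fieldName) {
        return "get" + Character.toUpperCase(fieldName.charAt(0)) + fieldName.substring(1);
    }

    // Look the getter up on the class object, make it accessible, and call it.
    static Object readField(Object bean, String fieldName) {
        try {
            Class<?> cls = bean.getClass();
            Method getFieldMethod = cls.getDeclaredMethod(getterName(fieldName));
            getFieldMethod.setAccessible(true); // needed when the getter is private
            return getFieldMethod.invoke(bean);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot read " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readField(new PatientVisit(), "hospitalCode"));
    }
}
```

Note that `getDeclaredMethod` only finds methods declared on the class itself; getters inherited from a superclass would need a walk up the class hierarchy, which this sketch leaves out.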
104. Converting each target Bean object into a business data matrix according to a preset self-attention mechanism to obtain business flow data of each internet hospital;
In this embodiment, the self-attention mechanism (self-attention) is one kind of attention mechanism and is also an important component of the Transformer. The purpose of an attention mechanism is to focus on the details relevant to a target rather than analyzing everything globally, so the core question is how to determine, based on the target, which parts to focus on, and then to analyze those parts further. For example, two sentences, "I am a student" and "you and me are students", can be analyzed by combining a BiLSTM with an attention mechanism. Word embedding is performed first; the BiLSTM then processes the context to obtain new word vectors; these enter the attention part, which calculates the similarity between the words of the two sentences, normalizes the similarities into weights, and combines the weights with the word vectors of the other sentence to obtain a representation of each word in terms of the other sentence. For example, suppose the word currently being analyzed is "I", that is, the target is "I". The part to focus on is determined by calculating similarity: the word vector of "I" is multiplied with the word vectors of "you", "and", "me", and so on to obtain the similarities (this is a property of word vectors: generally, the closer the meanings of two words, the smaller the distance and the included angle between their vectors, and the larger their product); the similarities are normalized to obtain the weights; each weight is multiplied with the corresponding word vector of "you", "and", "me", and so on; and the products are summed to obtain a word vector for "I" constructed from the second sentence.
Finally, whether the two sentences are duplicates is determined by comparing the degree of difference between the new sentence obtained in this way and the original sentence. This is an application of the attention mechanism to text duplication detection.
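The similarity, normalization, and weighted-sum steps of this example can be sketched as follows. The text only says that the similarities are normalized; using softmax for that step is an assumption, and the two-dimensional vectors in `main` are made-up illustrative values, not real word embeddings.

```java
class AttentionDemo {
    // Dot product of two word vectors, used as the similarity score.
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Softmax normalization (an assumed choice): weights are positive and sum to 1.
    static double[] softmax(double[] scores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double[] weights = new double[scores.length];
        double total = 0.0;
        for (int i = 0; i < scores.length; i++) {
            weights[i] = Math.exp(scores[i] - max); // subtract max for numerical stability
            total += weights[i];
        }
        for (int i = 0; i < weights.length; i++) {
            weights[i] /= total;
        }
        return weights;
    }

    // Rebuild the target word as a weighted sum of the other sentence's word vectors.
    static double[] attend(double[] target, double[][] otherSentence) {
        double[] scores = new double[otherSentence.length];
        for (int i = 0; i < otherSentence.length; i++) {
            scores[i] = dot(target, otherSentence[i]);
        }
        double[] weights = softmax(scores);
        double[] result = new double[target.length];
        for (int i = 0; i < otherSentence.length; i++) {
            for (int d = 0; d < result.length; d++) {
                result[d] += weights[i] * otherSentence[i][d];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[] wordI = {1.0, 0.0};
        double[][] youAndMe = {{0.2, 0.9}, {0.0, 0.1}, {0.9, 0.1}};
        System.out.println(java.util.Arrays.toString(attend(wordI, youAndMe)));
    }
}
```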
105. Analyzing the service flow data through a preset timing task and natural language processing to obtain a key numerical value of the service flow data;
In this embodiment, natural language processing (NLP) refers to the theories and methods for realizing effective communication between humans and computers in natural language. Semantic parsing means understanding the semantic content expressed by a piece of text using the various methods of natural language processing. The word segmentation structure is a structure that splits the service flow data into a subject, a time, a qualifier, and a target.
Taking as an example service flow data that reads "how is the WeChat activity in the last three months", natural language processing is applied through named-entity recognition (NER, the recognition of entities with specific meanings in text), part-of-speech tagging (marking the part of speech of each word according to its meaning and context), stemming (normalizing word forms, for example removing noun plurals and verb tenses), construction of a syntax tree of the sentence (a graphical representation of the sentence structure), and coreference resolution (determining what each word or symbol in the service flow data refers to). The text is thereby split into the word segmentation structure "WeChat + in the last three months + activity + how", where the subject is "WeChat", the time is "in the last three months", the qualifier is "activity", and the target is "how".
106. And performing anomaly analysis and monitoring on the corresponding service data according to the key values, and generating a monitoring report based on the analysis and monitoring results.
In this embodiment, anomaly analysis means mining abnormal data from the original service data, determining from the mining whether the data is abnormal and locating the anomaly points, performing correlation analysis on the data corresponding to each anomaly point to find the reason for the anomaly, and obtaining a data analysis result from the anomaly points and their causes.
In the embodiment of the invention, a preset function of the Bean object is determined and used to parse the annotation attribute of each field in a pre-created blank Bean object; service data uploaded by each internet hospital is acquired; the corresponding data content is extracted from each item of service data according to the annotation attributes and filled into the blank Bean object to obtain a target Bean object; each target Bean object is converted into a business data matrix according to a preset self-attention mechanism to obtain the service flow data of each internet hospital; the service flow data is analyzed through a preset timed task and natural language processing to obtain the key values of the service flow data; anomaly analysis and monitoring are performed on the corresponding service data according to the key values, and a monitoring report is generated from the analysis and monitoring results; and the original service data corresponding to the service flow data is obtained by query and parsed according to the attribute names in the target Bean object to obtain a data analysis result. By relying on the decoupled, general-purpose, fast, and simple nature of a unified supervision background system, the invention improves the efficiency of log query and traffic statistics and analyzes the supervision data in each dimension; the system is decoupled from individual projects and can be reused once its jar package is introduced. Based on the data analysis results, data supervision channels can be added, the supervision data can be analyzed in each dimension, and the service growth rate can be observed.
Referring to fig. 2, a second embodiment of the data analysis method according to the embodiment of the present invention includes:
201. acquiring service summary data of a service table and field data of each field in the service table from a preset database;
In this embodiment, the summary information of the tables in the database and the field information of each field in the tables are obtained through JDBC.
Specifically, JDBC (Java DataBase Connectivity) is a Java API for executing SQL statements. JDBC provides a unified access interface for multiple types of relational databases, and the summary information of the tables in the database and the field information of each field in the tables are obtained through this interface.
202. Analyzing the service summary data, converting the analyzed service summary data into a Bean object in a preset format, and creating a blank Bean object;
In this embodiment, the summary information of a table includes the name of the table and the remark information of the table. The summary information of the table is segmented, the content of each segment is converted separately, and the conversion results of all segments are combined into a Bean object conforming to the Java specification, so as to provide a uniform, standard Bean object for subsequent database information identification operations.
Specifically, the field information includes a field name, a field type, field remark information, a field primary key relationship, and the like. The field name is segmented, the content of each segment is converted separately, and the conversion results of all segments are combined into a variable name of the Bean object conforming to the Java specification; the field type is converted into a type conforming to the Java specification; and a method name of the Bean object is generated from the variable name of the Bean object and the name of a preset method.
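A minimal sketch of the name and type conversion follows. It assumes that field names are segmented on underscores and joined in camel case, and it maps only a handful of SQL type names; neither the delimiter nor the type table is specified in the text, and the class name `BeanNameDemo` is hypothetical.

```java
import java.util.Locale;

class BeanNameDemo {
    // "hospital_code" becomes "hospitalCode": split on underscores, capitalize later segments.
    static String toVariableName(String fieldName) {
        StringBuilder name = new StringBuilder();
        for (String segment : fieldName.toLowerCase(Locale.ROOT).split("_")) {
            if (segment.isEmpty()) {
                continue; // ignore leading or doubled underscores
            }
            if (name.length() == 0) {
                name.append(segment);
            } else {
                name.append(Character.toUpperCase(segment.charAt(0))).append(segment.substring(1));
            }
        }
        return name.toString();
    }

    // Combine a preset method prefix such as "get" or "set" with the variable name.
    static String toMethodName(String prefix, String variableName) {
        return prefix + Character.toUpperCase(variableName.charAt(0)) + variableName.substring(1);
    }

    // Map a few common SQL type names to Java types; anything else falls back to String.
    static String toJavaType(String sqlType) {
        switch (sqlType.toUpperCase(Locale.ROOT)) {
            case "INT":
            case "INTEGER":
                return "Integer";
            case "BIGINT":
                return "Long";
            case "DECIMAL":
                return "java.math.BigDecimal";
            case "DATETIME":
            case "TIMESTAMP":
                return "java.util.Date";
            default:
                return "String";
        }
    }

    public static void main(String[] args) {
        String variable = toVariableName("HOSPITAL_CODE");
        System.out.println(variable + " " + toMethodName("get", variable) + " " + toJavaType("varchar"));
    }
}
```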
203. Determining a preset function of the Bean object, and analyzing annotation attributes of each field in a pre-created blank Bean object by using the preset function;
204. acquiring service data uploaded by each internet hospital;
205. extracting corresponding data content from each service data according to the annotation attribute, and filling the data content into the blank Bean object to obtain a target Bean object;
206. converting each target Bean object into a business data matrix according to a preset self-attention mechanism to obtain business flow data of each internet hospital;
207. analyzing the service flow data through a preset timing task and natural language processing to obtain a key numerical value of the service flow data;
208. and performing anomaly analysis and monitoring on the corresponding service data according to the key values, and generating a monitoring report based on the analysis and monitoring results.
The steps 203-208 in this embodiment are similar to the steps 101-106 in the first embodiment, and are not described herein again.
Referring to fig. 3, a third embodiment of the data analysis method according to the embodiment of the present invention includes:
301. acquiring all attribute names of the blank bean object, wherein the attribute names comprise a service source, a supervision platform code, a service code, retry times and an error classification label;
In this embodiment, the annotation attribute includes an attribute name and annotation content, and the annotation content includes the type of resource configuration and the location of the resource configuration.
After Spring calls the predefined function, the resource configuration content is read in the predefined function. To obtain the resource configuration content, the annotation attribute of the bean object, which includes an attribute name and annotation content, needs to be obtained first.
302. Searching all annotation contents in all attribute names;
In this embodiment, the bean object has a plurality of attribute names of different types depending on the function it implements. For example, if the bean object implements a game function, its attribute names may include skills, levels, and the like. If a resource needs to be loaded on a certain attribute name of the bean object, the attribute name needs to be annotated; the annotation yields the annotation content corresponding to the attribute name, which includes the type of resource configuration and the location of the resource configuration.
303. Selecting target annotation content with specific identification from all annotation contents;
In this embodiment, each attribute name in the set propertyList is traversed, and the annotation content of each attribute name is obtained and stored in annotationList. If annotation content in annotationList carries the specific identifier "resource field", that annotation content is selected; it includes the type of resource configuration and the location of the resource configuration. It should be noted that, in the embodiment of the present invention, the attribute names on which resources are to be loaded are marked with the "resource field" identifier, but other specific identifiers may also be used.
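Steps 301 to 303 can be sketched with Java reflection as below. The annotation `ResourceField` is a hypothetical stand-in for the "resource field" identifier, with `type` and `location` elements assumed from the description; the bean `GameRole` follows the game example in the text, and the file path is made up.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical marker for attributes on which a resource is to be loaded.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface ResourceField {
    String type();      // type of resource configuration
    String location();  // location of resource configuration
}

// Hypothetical bean following the game example.
class GameRole {
    @ResourceField(type = "table", location = "conf/skills.csv")
    String skills;

    int level; // carries no annotation, so it is not selected
}

class AnnotationSelectDemo {
    // Return attribute name -> {type, location} for every attribute marked with ResourceField.
    static Map<String, String[]> selectTargets(Class<?> beanClass) {
        Map<String, String[]> targets = new LinkedHashMap<>();
        Field[] propertyList = beanClass.getDeclaredFields();
        for (Field property : propertyList) {
            Annotation[] annotationList = property.getDeclaredAnnotations();
            for (Annotation annotation : annotationList) {
                if (annotation instanceof ResourceField) {
                    ResourceField target = (ResourceField) annotation;
                    targets.put(property.getName(), new String[] {target.type(), target.location()});
                }
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String[]> entry : selectTargets(GameRole.class).entrySet()) {
            System.out.println(entry.getKey() + " -> " + String.join(", ", entry.getValue()));
        }
    }
}
```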
304. Acquiring an attribute name and target annotation content corresponding to the target annotation content, and determining annotation attributes of the blank Bean object;
In this embodiment, the annotation attribute thus includes an attribute name and the annotation content corresponding to that attribute name; the attribute name and its corresponding annotation content are stored as temporary variables.
305. Acquiring service data uploaded by each internet hospital;
306. acquiring an attribute name in the annotation attribute, and extracting corresponding data content from each service data according to the attribute name, wherein the data content comprises a data configuration type and a data storage position;
in this embodiment, the annotation attribute of the bean object is obtained according to the predefined function, where the annotation attribute includes an attribute name and annotation content, and the annotation content includes a type of resource configuration and a location of the resource configuration.
After the spring calls the predefined function, the resource configuration content is read in the predefined function, and in order to obtain the resource configuration content, the annotation attribute of the bean object needs to be obtained first, wherein the annotation attribute comprises an attribute name and annotation content. The bean object has a plurality of attribute names of different types according to different implemented functions, for example, if the bean object implements a game function, the attribute names that the bean object may include are skills, levels, and the like. If a resource needs to be loaded on a certain attribute name of the bean object, the attribute name needs to be annotated, and annotation content corresponding to the attribute name is obtained after annotation, wherein the annotation content comprises the type of resource configuration and the position of the resource configuration.
307. Acquiring an attribute name and a corresponding value in the data content according to the data configuration type, and determining a data dimension corresponding to the data content according to the data of the attribute name and the value;
In this embodiment, the resource configuration file has a plurality of configuration items, each with a unique identifier, and each configuration item takes the form "attribute name: value". It should be noted that a configuration item can be understood as a piece of data, such as "age: 23", and the resource configuration file can be understood as a collection of configuration items; it may be a file such as a table. In addition, the resource configuration file supports loading, updating, and reading data. In the prior art, configuration files cannot be shared between different systems because the format of resource configuration files is not uniform. A predefined dimension is therefore defined for the resource configuration file; the resource objects (used to operate on the resource configuration file) are classified and summarized through the predefined dimension, and the resource configuration file is changed by loading resource configuration content in the predefined dimension. The predefined dimension is defined as: class ResourceConf { CONF_TYPE; List<Conf> conss; load(String confLocation, String confType); update(); readOne(); readAll() }.
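As a small illustration of the "attribute name: value" form, the sketch below parses such lines into name and value pairs. The class and method names are hypothetical, and how the embodiment actually loads a resource configuration file is not specified; the sketch only shows the item format.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ConfigItemDemo {
    // Parse lines of the form "attribute name: value" into an ordered map.
    static Map<String, String> parseItems(List<String> lines) {
        Map<String, String> items = new LinkedHashMap<>();
        for (String line : lines) {
            int separator = line.indexOf(':');
            if (separator < 0) {
                continue; // not a configuration item
            }
            String name = line.substring(0, separator).trim();
            String value = line.substring(separator + 1).trim();
            items.put(name, value);
        }
        return items;
    }

    public static void main(String[] args) {
        System.out.println(parseItems(Arrays.asList("age: 23", "level: 7")));
    }
}
```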
308. Converting each target Bean object into a business data matrix according to a preset self-attention mechanism to obtain business flow data of each internet hospital;
309. analyzing the service flow data through a preset timing task and natural language processing to obtain a key numerical value of the service flow data;
310. and performing anomaly analysis and monitoring on the corresponding service data according to the key values, and generating a monitoring report based on the analysis and monitoring results.
Steps 305 and 308-310 in this embodiment are similar to steps 102 and 104-106 in the first embodiment, and are not described herein again.
Referring to fig. 4, a fourth embodiment of the data analysis method according to the embodiment of the present invention includes:
401. determining a preset function of the Bean object, and analyzing annotation attributes of fields in a pre-created blank Bean object by using the preset function;
402. acquiring service data uploaded by each internet hospital;
403. extracting corresponding data content from each service data according to the annotation attribute, and filling the data content into the blank Bean object to obtain a target Bean object;
404. converting each target Bean object into a service data matrix according to a preset self-attention mechanism;
In this embodiment, the internet hospital data filling system obtains the merchant transaction orders within a preset time length and the transaction variables corresponding to each transaction. The transaction orders of the internet hospital within the preset time length are sorted in chronological order, and the transaction variables in each transaction order are sorted according to a preset variable order.
In the business data matrix of the internet hospital, the elements of the first row are all the transaction variables of the first transaction order within the preset time length, and the elements of the second row are all the transaction variables of the second transaction order. The elements of the first column are the first type of transaction variable across all transaction orders within the preset time length, and the elements of the second column are the second type of transaction variable across all transaction orders. If there are T transaction orders within the preset time length and each transaction has V transaction variables, the business data matrix of the internet hospital is a T × V matrix.
Specifically, because the row vectors and column vectors of the business data matrix are defined separately, transaction variable elements can be deleted or added, and the set of transaction variable types extended, directly through row and column transformations of the matrix.
The transaction variables include at least one of: a service source, a supervision platform code, a hospital code, a service primary key, an uploading state, a supervision receipt result, an error classification label, and a retry count. Elements may be added to or removed from the set of variables.
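The T × V layout can be sketched as follows, with one row per transaction order (already sorted by time) and one column per variable in the preset order. Encoding every variable as a number and marking missing values with NaN are assumptions made for the illustration, as are the variable names in `sample`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ServiceMatrixDemo {
    // Build a T x V matrix: row t is order t, column v is the v-th variable in the preset order.
    static double[][] toMatrix(List<Map<String, Double>> orders, List<String> variableOrder) {
        double[][] matrix = new double[orders.size()][variableOrder.size()];
        for (int t = 0; t < orders.size(); t++) {
            for (int v = 0; v < variableOrder.size(); v++) {
                Double value = orders.get(t).get(variableOrder.get(v));
                matrix[t][v] = (value == null) ? Double.NaN : value; // NaN marks an entry to be filled later
            }
        }
        return matrix;
    }

    // Two hypothetical orders with three variables; the second order lacks "errorLabel".
    static double[][] sample() {
        List<String> variableOrder = Arrays.asList("uploadState", "retryCount", "errorLabel");
        Map<String, Double> first = new HashMap<>();
        first.put("uploadState", 1.0);
        first.put("retryCount", 0.0);
        first.put("errorLabel", 2.0);
        Map<String, Double> second = new HashMap<>();
        second.put("uploadState", 0.0);
        second.put("retryCount", 3.0);
        List<Map<String, Double>> orders = new ArrayList<>();
        orders.add(first);
        orders.add(second);
        return toMatrix(orders, variableOrder);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.deepToString(sample()));
    }
}
```

Adding a variable then amounts to appending a name to the variable order, which adds a column; adding an order appends a row.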
405. Based on the data dimension, uploading the data content to a service data matrix to obtain a first service data matrix and a second service data matrix with service data;
In this embodiment, the internet hospital data filling system fills the service data matrix of the hospital using the self-attention mechanism module, and obtains a first service data matrix and a second service data matrix after filling. The first service data matrix is the result of filling by the variable attention mechanism, and the second service data matrix is the result of filling by the time attention mechanism.
406. Decoding the first service matrix and the second service matrix according to a decoder to obtain service flow data of the internet hospital;
In this embodiment, the internet hospital data filling system decodes the filled first service matrix and second service matrix with a pre-configured decoder; decoding yields service flow data that can be analyzed directly and inspected visually.
In another embodiment, decoding the first service matrix and the second service matrix with the decoder to obtain the service flow data to be analyzed specifically includes: inputting the first service matrix and the second service matrix, filled in the variable dimension and the time dimension respectively, into the decoder; and performing weighted-sum decoding on the two filled matrices to obtain the service flow data.
The first service matrix and the second service matrix, filled in the different dimensions of time and variable, are input into the decoder together, and a weighted summation is performed over the two matrices, so that completely filled service flow data that can be analyzed and used directly is obtained. The decoder is a multi-layer perceptron (MLP).
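The weighted-summation step alone can be sketched as below. This is not the MLP decoder of the embodiment: it uses a single fixed weight `alpha`, a hypothetical parameter, where a trained decoder would learn its weights.

```java
class WeightedDecodeDemo {
    // Element-wise weighted sum: alpha * first + (1 - alpha) * second.
    static double[][] combine(double[][] first, double[][] second, double alpha) {
        double[][] result = new double[first.length][];
        for (int t = 0; t < first.length; t++) {
            result[t] = new double[first[t].length];
            for (int v = 0; v < first[t].length; v++) {
                result[t][v] = alpha * first[t][v] + (1.0 - alpha) * second[t][v];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] variableFilled = {{1.0, 3.0}}; // filled by the variable attention mechanism
        double[][] timeFilled = {{3.0, 5.0}};     // filled by the time attention mechanism
        System.out.println(java.util.Arrays.deepToString(combine(variableFilled, timeFilled, 0.5)));
    }
}
```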
407. Generating MQ messages based on preset timing task data and service flow data, and storing the MQ messages in an MQ message queue of Redis;
in this embodiment, the MQ message queue of Redis is used to decouple the grouping process and the data storage process, so that the grouping process and the data storage process can be parallel, the length of the total grouping process is shortened, and the time for responding to a single case grouping request is further reduced.
In Redis, the MQ message queue is implemented as a list of strings ordered by insertion, equivalent to an ordinary linked list data structure. The target disease diagnosis grouping data is stored at the head of the MQ message queue, and data is extracted from the tail of the MQ message queue.
When the MQ message queue does not exist, Redis creates the MQ message queue from the target disease diagnosis grouping data and then stores the target disease diagnosis grouping data in it. When all elements in the MQ message queue have been extracted, the MQ message queue is deleted from the database. Each element stores one item of target disease diagnosis grouping data. It will be appreciated that inserting elements and extracting elements may be performed on the MQ message queue at the same time.
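The head-insert, tail-extract behavior can be illustrated with an in-memory stand-in. The class below does not talk to Redis; it only mimics the ordering that a Redis list gives when elements are pushed with LPUSH and popped with RPOP, and the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ListQueueDemo {
    private final Deque<String> list = new ArrayDeque<>();

    // Store a message at the head of the queue.
    void push(String message) {
        list.addFirst(message);
    }

    // Extract a message from the tail of the queue; returns null when the queue is empty.
    String pop() {
        return list.pollLast();
    }

    int size() {
        return list.size();
    }

    public static void main(String[] args) {
        ListQueueDemo queue = new ListQueueDemo();
        queue.push("message-1");
        queue.push("message-2");
        System.out.println(queue.pop()); // the earliest message comes out first
    }
}
```

Pushing at one end and popping at the other gives first-in, first-out order, which is what lets the producer and the consumer work on the queue independently. `ArrayDeque` is not thread-safe, so a concurrent variant would need a structure such as `ConcurrentLinkedDeque`.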
408. Monitoring the MQ message queue to obtain the MQ messages in the MQ message queue;
In this embodiment, the timed task system monitors the MQ message queue, obtains the MQ message according to the message queue name, deserializes it to obtain the corresponding service data, obtains the current machine name, creates a timed task, and stores the timed task in the timed task table. The process tasks can be configured according to the service process of the specific loan service system; for example, individual timed tasks are formed for the loan process, credit granting, loan payment, repayment, the background management system, and the like.
The timed task system can create a thread pool during initialization and run timed tasks periodically, where the tasks include periodically acquiring and updating the current machine name, the data to be verified in each loan process, and the like.
409. Establishing a timing task according to the MQ message, and storing the timing task into a timing task table;
In this embodiment, the MQ message queue is monitored, messages are obtained and forwarded to a buffer queue, and after the buffer duration of the buffer queue expires they are forwarded to the actual consumption queue; messages are then taken out of the actual consumption queue and matched against the peripheral information table. This relieves the backlog in the MQ message queue and the database. The embodiment of the invention does not depend on database insert operations and eliminates repeated updates caused by polling, which reduces the overall load on the database and increases the speed of data receiving and processing.
410. Creating a thread pool, and regularly running the timed tasks in the timed task table based on the thread pool;
In this embodiment, the timed task may be any task submitted by the task system. The terminal can receive the timed task submitted by the task system and detect the task type identifier carried in it. The terminal can then obtain the target thread pool corresponding to the task type identifier of the timed task. The target thread pool may include one or more threads. The correspondence between task types and thread pools can be stored in the terminal in advance, and the threads in one thread pool are used to process tasks of the same task type. For example, the task type identifier "type 1" indicates the first task type, and the thread pool corresponding to "type 1" is thread pool A, meaning that the threads in thread pool A process tasks of the first task type; the task type identifier "type 2" indicates the second task type, and its thread pool is thread pool B, whose threads execute tasks of the second task type; the task type identifier "type 3" indicates the third task type, and its thread pool is thread pool C, whose threads execute tasks of the third task type; and so on.
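The correspondence between task type identifiers and thread pools can be illustrated with Python's standard `concurrent.futures.ThreadPoolExecutor`. This is a minimal sketch; the class `TaskDispatcher` and the pool sizes are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


class TaskDispatcher:
    """Routes each task to the thread pool registered for its task type."""

    def __init__(self, pool_sizes):
        # One pool per task type identifier, e.g. {"type 1": 2, "type 2": 4}.
        self._pools = {
            task_type: ThreadPoolExecutor(max_workers=size)
            for task_type, size in pool_sizes.items()
        }

    def submit(self, task_type, func, *args):
        """Run func on the pool stored for task_type and return its Future."""
        pool = self._pools.get(task_type)
        if pool is None:
            raise KeyError(f"no thread pool registered for {task_type!r}")
        return pool.submit(func, *args)

    def shutdown(self):
        """Wait for running tasks and release all pools."""
        for pool in self._pools.values():
            pool.shutdown(wait=True)
```

Keeping one pool per task type means a slow task type can exhaust only its own threads and does not delay tasks of other types.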
411. Analyzing the service flow data through a preset timing task and natural language processing to obtain a key numerical value of the service flow data;
412. and performing anomaly analysis and monitoring on the corresponding service data according to the key values, and generating a monitoring report based on the analysis and monitoring results.
The steps 401-.
In the embodiment of the invention, a preset function of a Bean object is determined and used to parse the annotation attribute of each field in a pre-created blank Bean object; service data uploaded by each internet hospital is acquired; the corresponding data content is extracted from each item of service data according to the annotation attributes and filled into the blank Bean object to obtain a target Bean object; each target Bean object is converted into a service data matrix according to a preset self-attention mechanism to obtain the service flow data of each internet hospital; the service flow data is analyzed through a preset timed task and natural language processing to obtain the key values of the service flow data; anomaly analysis and monitoring are performed on the corresponding service data according to the key values, and a monitoring report is generated based on the analysis and monitoring results. A query is also performed to obtain the original service data corresponding to the service flow data, and the original service data is analyzed according to the attribute names in the target Bean object to obtain a data analysis result. By relying on a unified supervision background system that is decoupled, general, fast, and simple, the invention improves the efficiency of log queries and traffic statistics, analyzes supervision data in each dimension, is decoupled from individual projects, and can be reused after the jar package is injected. It thereby addresses the technical problems of adding data supervision channels, analyzing each dimension of the supervision data, and observing the service growth rate from the data analysis result.
Referring to fig. 5, a fifth embodiment of the data analysis method according to the embodiment of the present invention includes:
501. determining a preset function of the Bean object, and analyzing annotation attributes of fields in a pre-created blank Bean object by using the preset function;
502. acquiring service data uploaded by each internet hospital;
503. extracting corresponding data content from each service data according to the annotation attribute, and filling the data content into the blank Bean object to obtain a target Bean object;
504. converting each target Bean object into a business data matrix according to a preset self-attention mechanism to obtain business flow data of each internet hospital;
505. analyzing the service flow data through a preset timing task and natural language processing to obtain a key numerical value of the service flow data;
506. inquiring in a preset database through the key numerical value to obtain original service data corresponding to the service flow data;
In this embodiment, an incorrect acquisition mode during the collection of historical service data may leave bad data in the data set (for example, fields that were never filled in). Missing values of this kind usually appear as null values, and if they are simply ignored the clustering process will behave abnormally. In practical applications, therefore, if the acquired historical service data has missing values, the data needs to be preprocessed before clustering, and only the preprocessed normal data is clustered, which can improve clustering efficiency.
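The embodiment does not say how missing values are handled (dropped, imputed, or flagged). As one hedged illustration, the preprocessing step could simply separate complete records from those with null or absent fields before clustering; `split_missing` and its parameters are hypothetical.

```python
def split_missing(records, required_fields):
    """Separate complete records from those with null or absent required fields."""
    complete, incomplete = [], []
    for record in records:
        if all(record.get(name) is not None for name in required_fields):
            complete.append(record)
        else:
            incomplete.append(record)
    return complete, incomplete
```

Only the `complete` list would be passed on to clustering; the `incomplete` list can be logged or repaired separately.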
507. Carrying out average path analysis on the original service data based on an isolated forest algorithm and the attribute name in the target Bean object to obtain the average path length of the original service data;
In this embodiment, performing anomaly analysis on the original service data to obtain a data analysis result includes: analyzing the original service data based on an isolated forest algorithm to obtain data anomaly points; and calling an association rule analysis model to perform correlation analysis on the data anomaly points to obtain the data analysis result.
The isolated forest algorithm (Isolation Forest) is an unsupervised anomaly detection method suitable for continuous data. The original service data is recursively and randomly partitioned in the isolated forest until every point in the data is isolated; under this random partitioning strategy, an outlier is usually isolated after a short path, so points with short paths are taken as data anomaly points. The association rule analysis model is a correlation analysis model trained on a large amount of sample data, and it can be trained based on the Apriori algorithm. Apriori is an association rule mining algorithm that finds the relations among item sets in a database by layer-by-layer iterative search and forms rules from them; the process consists of joining (a matrix-like operation) and pruning (removing unnecessary intermediate results). In the algorithm, an item set is a set of items, and a set containing K items is a K-item set. The frequency of an item set is the number of transactions that contain it, and an item set that meets the minimum support is called a frequent item set.
In another embodiment, the association rule analysis model trained based on the Apriori algorithm scans the data set corresponding to a data anomaly point, {K | A1, A2, B1, B2, B3, ..., N1, N2} (detail data containing the active index K and the dimension items), and screens out from it the frequent item set L containing K. For every non-empty subset S of L, if P(M ∩ N ∩ T | K) ≥ min_conf (a customizable confidence threshold), the frequent item set S(K, M, N, T) is an active relevant set (for example, M is A1, N is B3, and T is N2). The dimension items A1, B3, and N2 are then sorted by degree of influence, and the sorted sequence is the data analysis result.
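The threshold test in this embodiment can be read as comparing, against min_conf, the conditional frequency of a set of dimension items among the transactions that contain K. The sketch below computes only that confidence and the final sorting; it is not a full Apriori implementation (no candidate generation or pruning), and the function names are hypothetical.

```python
def rule_confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from a list of item sets."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = [t for t in transactions if antecedent <= set(t)]
    if not with_antecedent:
        return 0.0
    with_both = [t for t in with_antecedent if consequent <= set(t)]
    return len(with_both) / len(with_antecedent)


def relevant_sets(transactions, key_item, candidates, min_conf):
    """Keep candidate dimension-item sets whose confidence given key_item
    reaches min_conf, sorted from most to least confident."""
    scored = [
        (rule_confidence(transactions, {key_item}, c), tuple(sorted(c)))
        for c in candidates
    ]
    kept = [(conf, items) for conf, items in scored if conf >= min_conf]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)
```

Here confidence is used as a stand-in for the "degree of influence" by which the dimension items are ordered; the embodiment does not define that measure, so this choice is an assumption.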
508. Analyzing according to the average path length and the expectation of the path length of each data in the original service data, and determining data abnormal points;
In this embodiment, average path analysis is performed on the original service data based on the isolated forest algorithm to obtain the average path length of the original service data; the average path length is then analyzed against the expected path length of each data item in the original service data to determine the data anomaly points.
Take as an example a scenario in which it must be determined whether data from the last three months is abnormal. The isolated forest algorithm is first selected for anomalous data mining: the original service data is a data set of n samples from the last three months, and the average path length is calculated according to the isolated forest algorithm.
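For readers unfamiliar with how path lengths yield anomaly points, the following is a compact pure-Python sketch of the standard Isolation Forest scoring, in which the mean path length E(h(x)) over the trees is normalized by c(n) = 2H(n-1) - 2(n-1)/n and converted into the score 2^(-E(h(x))/c(n)). The tree count, sample size, and seed are illustrative defaults, and the embodiment does not state which implementation it uses.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful search in a tree of n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # H(n-1) approximation
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(sample, rng, depth, max_depth):
    """Recursively split the sample on a random axis at a random value."""
    if len(sample) <= 1 or depth >= max_depth:
        return {"size": len(sample)}
    axis = rng.randrange(len(sample[0]))
    low = min(row[axis] for row in sample)
    high = max(row[axis] for row in sample)
    if low == high:
        return {"size": len(sample)}
    split = rng.uniform(low, high)
    left = [row for row in sample if row[axis] < split]
    right = [row for row in sample if row[axis] >= split]
    return {
        "axis": axis,
        "split": split,
        "left": build_tree(left, rng, depth + 1, max_depth),
        "right": build_tree(right, rng, depth + 1, max_depth),
    }


def path_length(point, node, depth=0):
    """Depth at which the point reaches a leaf, plus the leaf adjustment."""
    if "size" in node:
        return depth + c_factor(node["size"])
    child = node["left"] if point[node["axis"]] < node["split"] else node["right"]
    return path_length(point, child, depth + 1)


def anomaly_scores(data, n_trees=100, sample_size=64, seed=0):
    """Return one score per point; values near 1 indicate likely anomalies."""
    if len(data) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(data, sample_size), rng, 0, max_depth)
        for _ in range(n_trees)
    ]
    norm = c_factor(sample_size)
    scores = []
    for point in data:
        mean_path = sum(path_length(point, t) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_path / norm))
    return scores
```

`data` is expected to be a list of equal-length numeric tuples, one per sample. A point whose mean path is much shorter than c(n) receives a score close to 1 and would be treated as a data anomaly point.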
509. And calling an association rule analysis model to analyze and monitor the data abnormal points, and generating a monitoring report based on the analysis and monitoring results.
In this embodiment, in the data analysis method based on natural language processing, a user enters the service flow data to be analyzed in natural language and initiates a data analysis instruction. Semantic parsing based on natural language processing is performed on the service flow data in the data analysis instruction to obtain a word segmentation structure, and a search engine is invoked to retrieve the corresponding data according to the word segmentation structure, yielding the original service data. Anomaly analysis is performed on the original service data to obtain a data analysis result, which is rendered into natural language by natural language generation technology to produce an analysis report for the service flow data. Because the user can specify the service flow data to be analyzed in natural language and receive the corresponding analysis report, the technical threshold of data analysis is lowered, the data is used efficiently, and its value is fully realized.
In one embodiment, performing anomaly analysis on original service data to obtain a data analysis result includes: analyzing original service data based on an isolated forest algorithm to obtain data anomaly points; and calling an association rule analysis model to perform correlation analysis on the data abnormal points to obtain a data analysis result.
Steps 501-506 in this embodiment are similar to steps 101-106 in the first embodiment and are not described again here.
In the embodiment of the invention, a preset function of a Bean object is determined and used to parse the annotation attribute of each field in a pre-created blank Bean object; service data uploaded by each internet hospital is acquired; the corresponding data content is extracted from each item of service data according to the annotation attributes and filled into the blank Bean object to obtain a target Bean object; each target Bean object is converted into a service data matrix according to a preset self-attention mechanism to obtain the service flow data of each internet hospital; the service flow data is analyzed through a preset timed task and natural language processing to obtain the key values of the service flow data; anomaly analysis and monitoring are performed on the corresponding service data according to the key values, and a monitoring report is generated based on the analysis and monitoring results. A query is also performed to obtain the original service data corresponding to the service flow data, and the original service data is analyzed according to the attribute names in the target Bean object to obtain a data analysis result. By relying on a unified supervision background system that is decoupled, general, fast, and simple, the invention improves the efficiency of log queries and traffic statistics, analyzes supervision data in each dimension, is decoupled from individual projects, and can be reused after the jar package is injected. It thereby addresses the technical problems of adding data supervision channels, analyzing each dimension of the supervision data, and observing the service growth rate from the data analysis result.
The data analysis method in the embodiment of the present invention has been described above; the data analysis apparatus in the embodiment of the present invention is described below. Referring to fig. 6, a first embodiment of the data analysis apparatus in the embodiment of the present invention includes:
the first creating module 601 is configured to determine a preset function of a Bean object, and analyze annotation attributes of fields in a pre-created blank Bean object by using the preset function;
a first obtaining module 602, configured to obtain service data uploaded by each internet hospital;
a filling module 603, configured to extract corresponding data content from each service data according to the annotation attribute, and fill the data content into the blank Bean object to obtain a target Bean object;
a conversion module 604, configured to convert each target Bean object into a service data matrix according to a preset self-attention mechanism, so as to obtain service flow data of each internet hospital;
the analysis module 605 is configured to analyze the service flow data through a preset timing task and natural language processing to obtain a key value of the service flow data;
and the monitoring module 606 is configured to perform anomaly analysis and monitoring on the corresponding service data according to the key value, and generate a monitoring report based on the analysis and monitoring result.
In the embodiment of the invention, a preset function of a Bean object is determined and used to parse the annotation attribute of each field in a pre-created blank Bean object; service data uploaded by each internet hospital is acquired; the corresponding data content is extracted from each item of service data according to the annotation attributes and filled into the blank Bean object to obtain a target Bean object; each target Bean object is converted into a service data matrix according to a preset self-attention mechanism to obtain the service flow data of each internet hospital; the service flow data is analyzed through a preset timed task and natural language processing to obtain the key values of the service flow data; anomaly analysis and monitoring are performed on the corresponding service data according to the key values, and a monitoring report is generated based on the analysis and monitoring results. A query is also performed to obtain the original service data corresponding to the service flow data, and the original service data is analyzed according to the attribute names in the target Bean object to obtain a data analysis result. By relying on a unified supervision background system that is decoupled, general, fast, and simple, the invention improves the efficiency of log queries and traffic statistics, analyzes supervision data in each dimension, is decoupled from individual projects, and can be reused after the jar package is injected. It thereby addresses the technical problems of adding data supervision channels, analyzing each dimension of the supervision data, and observing the service growth rate from the data analysis result.
Referring to fig. 7, a second embodiment of the data analysis device according to the embodiment of the present invention specifically includes:
the first creating module 601 is configured to determine a preset function of a Bean object, and analyze annotation attributes of fields in a pre-created blank Bean object by using the preset function;
a first obtaining module 602, configured to obtain service data uploaded by each internet hospital;
the filling module 603 is configured to extract corresponding data content from each service data according to the annotation attribute, and fill the data content into the blank Bean object to obtain a target Bean object;
a conversion module 604, configured to convert each target Bean object into a service data matrix according to a preset self-attention mechanism, so as to obtain service flow data of each internet hospital;
the analysis module 605 is configured to analyze the service flow data through a preset timing task and natural language processing to obtain a key value of the service flow data;
and the monitoring module 606 is configured to perform anomaly analysis and monitoring on the corresponding service data according to the key value, and generate a monitoring report based on the analysis and monitoring result.
In this embodiment, the data analysis apparatus further includes:
a second obtaining module 607, configured to obtain service summary data of a service table and field data of each field in the service table from a preset database;
the second creating module 608 is configured to parse the service summary data, convert the parsed service summary data into a Bean object in a preset format, and create a blank Bean object.
In this embodiment, the first creating module 601 is specifically configured to:
acquiring all attribute names of a blank Bean object, wherein the attribute names comprise a service source, a supervision platform code, a service code, retry times and an error classification label;
searching all annotation contents in all the attribute names;
selecting target annotation content with specific identification from all the annotation contents;
and acquiring the attribute name corresponding to the target annotation content and the target annotation content, and determining the annotation attribute of the blank Bean object.
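The embodiment describes Java Bean objects and annotations; as a language-neutral illustration, the same lookup (list all attribute names, read their annotation contents, keep only those carrying a specific identification) can be sketched in Python using dataclass field metadata as a stand-in for annotations. The marker name `export`, the class `SupervisionBean`, and the annotation values are all hypothetical.

```python
from dataclasses import dataclass, field, fields
from typing import Optional

MARKER = "export"  # hypothetical "specific identification" carried by an annotation


@dataclass
class SupervisionBean:
    """Blank bean: every attribute starts empty; metadata plays the annotation role."""
    service_source: Optional[str] = field(default=None, metadata={MARKER: "source"})
    platform_code: Optional[str] = field(default=None, metadata={MARKER: "platformCode"})
    service_code: Optional[str] = field(default=None, metadata={MARKER: "serviceCode"})
    retry_times: Optional[int] = field(default=None, metadata={"note": "internal"})
    error_label: Optional[str] = field(default=None)


def annotation_attributes(bean_type):
    """Map attribute name -> target annotation content for marked fields only."""
    return {
        f.name: f.metadata[MARKER]
        for f in fields(bean_type)
        if MARKER in f.metadata
    }
```

Fields without the marker (`retry_times`, `error_label` in this example) are skipped, which corresponds to selecting only the target annotation content.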
In this embodiment, the filling module 603 is specifically configured to:
acquiring an attribute name in the annotation attribute, and extracting corresponding data content from each service data according to the attribute name, wherein the data content comprises a data configuration type and a data storage position;
acquiring an attribute name and a corresponding value in data content according to the data configuration type, and determining a data dimension corresponding to the data content according to the data of the attribute name and the value;
and filling the data content into the blank Bean object according to the data dimension to obtain a target Bean object.
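Filling a blank Bean object from one item of service data might look as follows. The sketch again uses Python dataclass metadata in place of Java annotations, ignores the data configuration type and data dimension mentioned above, and uses hypothetical names (`SOURCE_KEY`, `VisitBean`, `fill_bean`) and input keys.

```python
from dataclasses import dataclass, field, fields
from typing import Optional

SOURCE_KEY = "source_key"  # hypothetical annotation naming the input key


@dataclass
class VisitBean:
    service_code: Optional[str] = field(default=None, metadata={SOURCE_KEY: "svcCode"})
    retry_times: Optional[int] = field(default=None, metadata={SOURCE_KEY: "retries"})
    error_label: Optional[str] = field(default=None, metadata={SOURCE_KEY: "errTag"})


def fill_bean(bean_type, service_record):
    """Create a blank bean and fill each annotated field from the record."""
    bean = bean_type()  # blank Bean object: every field starts as None
    for f in fields(bean_type):
        key = f.metadata.get(SOURCE_KEY)
        if key is not None and key in service_record:
            setattr(bean, f.name, service_record[key])
    return bean
```

Keys in the record that no annotation refers to are ignored, and annotated fields with no matching key stay empty in the resulting target Bean object.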
In this embodiment, the converting module 604 includes:
a conversion unit 6041, configured to convert each target Bean object into a service data matrix according to a preset self-attention mechanism;
an uploading unit 6042, configured to upload the data content to the service data matrix based on the data dimension, so as to obtain a first service data matrix and a second service data matrix with service data;
a decoding unit 6043, configured to decode the first service matrix and the second service matrix according to a decoder, so as to obtain service flow data of the internet hospital.
In this embodiment, the data analysis apparatus further includes:
a generating module 609, configured to generate an MQ message based on preset timing task data and the service flow data, and store the MQ message in an MQ message queue of Redis;
a listening module 610, configured to listen to the MQ message queue, and obtain MQ messages in the MQ message queue;
a storage module 611, configured to create a timing task according to the MQ message, and store the timing task in a timing task table;
and an operation module 612, configured to create a thread pool, and periodically operate the timed task in the timed task table based on the thread pool.
In this embodiment, the monitoring module 606 is specifically configured to:
inquiring in a preset database according to the key value to obtain service data corresponding to the service flow data;
carrying out average path analysis on the service data based on an isolated forest algorithm and the attribute name in the target Bean object to obtain the average path length of the original service data;
analyzing according to the average path length and the expectation of the path length of each data in the original service data, and determining a data abnormal point;
and calling an association rule analysis model to analyze and monitor the data abnormal points, and generating a monitoring report based on the analysis and monitoring results.
In the embodiment of the invention, a preset function of a Bean object is determined and used to parse the annotation attribute of each field in a pre-created blank Bean object; service data uploaded by each internet hospital is acquired; the corresponding data content is extracted from each item of service data according to the annotation attributes and filled into the blank Bean object to obtain a target Bean object; each target Bean object is converted into a service data matrix according to a preset self-attention mechanism to obtain the service flow data of each internet hospital; the service flow data is analyzed through a preset timed task and natural language processing to obtain the key values of the service flow data; and anomaly analysis and monitoring are performed on the corresponding service data according to the key values, and a monitoring report is generated based on the analysis and monitoring results. By relying on a unified supervision background system that is decoupled, general, fast, and simple, the invention improves the efficiency of log queries and traffic statistics, analyzes supervision data in each dimension, is decoupled from individual projects, and can be reused after the jar package is injected. It thereby addresses the technical problems of adding data supervision channels, analyzing each dimension of the supervision data, and observing the service growth rate from the data analysis result.
Fig. 6 and fig. 7 describe the data analysis apparatus in the embodiment of the present invention in detail from the perspective of modular functional entities; the data analysis apparatus in the embodiment of the present invention is described in detail below from the perspective of hardware processing.
Fig. 8 is a schematic structural diagram of a data analysis apparatus 800 according to an embodiment of the present invention. The data analysis apparatus 800 may vary considerably depending on its configuration or performance, and may include one or more processors (central processing units, CPUs) 810, a memory 820, and one or more storage media 830 (e.g., one or more mass storage devices) storing an application 833 or data 832. The memory 820 and the storage medium 830 may be transient or persistent storage. The program stored in the storage medium 830 may include one or more modules (not shown), each of which may include a series of instruction operations for the data analysis apparatus 800. Further, the processor 810 may be configured to communicate with the storage medium 830 and execute the series of instruction operations in the storage medium 830 on the data analysis apparatus 800 to implement the steps of the data analysis method provided by the above-described method embodiments.
The data analysis apparatus 800 may also include one or more power supplies 840, one or more wired or wireless network interfaces 850, one or more input-output interfaces 860, and/or one or more operating systems 831, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like. Those skilled in the art will appreciate that the data analysis device configuration shown in FIG. 8 does not constitute a limitation of the data analysis devices provided herein, and may include more or fewer components than shown, or some components may be combined, or a different arrangement of components.
The present invention also provides a computer-readable storage medium, which may be a non-volatile computer-readable storage medium, and which may also be a volatile computer-readable storage medium, having stored therein instructions, which, when run on a computer, cause the computer to perform the steps of the above-mentioned data analysis method.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A data analysis method, characterized in that the data analysis method comprises:
determining a preset function of the Bean object, and analyzing annotation attributes of fields in a pre-created blank Bean object by using the preset function;
acquiring service data uploaded by each Internet hospital;
extracting corresponding data content from each business data according to the annotation attribute, and filling the data content into the blank Bean object to obtain a target Bean object;
converting each target Bean object into a business data matrix according to a preset self-attention mechanism to obtain business flow data of each internet hospital;
analyzing the service flow data through a preset timing task and natural language processing to obtain a key numerical value of the service flow data;
and performing anomaly analysis and monitoring on the corresponding service data according to the key values, and generating a monitoring report based on the analysis and monitoring results.
2. The data analysis method according to claim 1, before the determining a preset function of the Bean object and using the preset function to parse annotation attributes of fields in a pre-created blank Bean object, further comprising:
acquiring service summary data of a service table and field data of each field in the service table from a preset database;
and analyzing the service summary data, converting the analyzed service summary data into a Bean object in a preset format, and creating a blank Bean object.
3. The data analysis method according to claim 1, wherein the parsing annotation attributes of each field in the pre-created blank Bean object using the preset function comprises:
acquiring all attribute names of a blank Bean object, wherein the attribute names comprise a service source, a supervision platform code, a service code, retry times and an error classification label;
searching all annotation contents in all the attribute names;
selecting target annotation content with specific identification from all the annotation contents;
and acquiring the attribute name corresponding to the target annotation content and the target annotation content, and determining the annotation attribute of the blank Bean object.
4. The data analysis method of claim 1, wherein the filling the data content into the blank Bean object to obtain a target Bean object comprises:
acquiring an attribute name in the annotation attribute, and extracting corresponding data content from each service data according to the attribute name, wherein the data content comprises a data configuration type and a data storage position;
acquiring an attribute name and a corresponding value in data content according to the data configuration type, and determining a data dimension corresponding to the data content according to the data of the attribute name and the value;
and filling the data content into the blank Bean object according to the data dimension to obtain a target Bean object.
5. The data analysis method of claim 1, wherein the converting each target Bean object into a business data matrix according to a preset self-attention mechanism to obtain business flow data of each internet hospital comprises:
converting each target Bean object into a service data matrix according to a preset self-attention mechanism;
based on the data dimension, uploading the data content to the business data matrix to obtain a first business data matrix and a second business data matrix with business data;
and decoding the first service matrix and the second service matrix according to a decoder to obtain the service flow data of the Internet hospital.
6. The data analysis method of claim 1, wherein after the converting each target Bean object into a business data matrix according to a preset self-attention mechanism to obtain business pipeline data of each internet hospital, further comprising:
generating MQ messages based on preset timing task data and the service flow data, and storing the MQ messages in an MQ message queue of Redis;
monitoring the MQ message queue to acquire MQ messages in the MQ message queue;
establishing a timing task according to the MQ message, and storing the timing task into a timing task table;
and creating a thread pool, and regularly running the timed tasks in the timed task table based on the thread pool.
7. The data analysis method of claim 1, wherein the performing anomaly analysis and monitoring on the corresponding service data according to the key value, and generating a monitoring report based on the analysis and monitoring result comprises:
inquiring in a preset database according to the key value to obtain service data corresponding to the service flow data;
carrying out average path analysis on the service data based on an isolated forest algorithm and the attribute name in the target Bean object to obtain the average path length of the original service data;
analyzing according to the average path length and the expectation of the path length of each data in the original service data, and determining a data abnormal point;
and calling an association rule analysis model to analyze and monitor the data abnormal points, and generating a monitoring report based on the analysis and monitoring results.
8. A data analysis apparatus, characterized in that the data analysis apparatus comprises:
the first creating module is used for determining a preset function of a Bean object and analyzing annotation attributes of fields in a pre-created blank Bean object by using the preset function;
the first acquisition module is used for acquiring service data uploaded by each Internet hospital;
the filling module is used for extracting corresponding data content from each business data according to the annotation attribute and filling the data content into the blank Bean object to obtain a target Bean object;
the conversion module is used for converting each target Bean object into a business data matrix according to a preset self-attention mechanism to obtain business flow data of each internet hospital;
the analysis module is used for analyzing the service flow data through a preset timing task and natural language processing to obtain a key numerical value of the service flow data;
and the monitoring module is used for carrying out abnormity analysis and monitoring on the corresponding service data according to the key value and generating a monitoring report based on the analysis and monitoring result.
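As a non-limiting illustration, the annotation parsing and filling performed by the first creating module and the filling module might be sketched with Java reflection. The annotation `@SourceKey`, the Bean class `VisitBean`, its fields, and the map-based record are hypothetical names introduced only for this example; the claim does not name the annotation or the Bean's fields.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;

// Hypothetical annotation: names the key in a service record that a field is filled from.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface SourceKey {
    String value();
}

// Hypothetical Bean whose fields start out blank (null).
class VisitBean {
    @SourceKey("hospital_id") String hospitalId;
    @SourceKey("visit_count") Integer visitCount;
    String note; // not annotated, so the filler leaves it untouched
}

class BeanFiller {
    // Read each field's annotation attribute and copy the matching value from the record.
    // Field.set throws IllegalArgumentException if the value's type does not fit the field.
    static <T> T fill(T blankBean, Map<String, Object> record) throws IllegalAccessException {
        for (Field field : blankBean.getClass().getDeclaredFields()) {
            SourceKey key = field.getAnnotation(SourceKey.class);
            if (key == null || !record.containsKey(key.value())) {
                continue; // no annotation, or the record has no data for this field
            }
            field.setAccessible(true);
            field.set(blankBean, record.get(key.value()));
        }
        return blankBean;
    }
}
```

Driving the fill from annotations keeps the mapping between record keys and Bean fields in one place, so a new field only needs a new annotated declaration.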
9. Data analysis equipment, characterized in that the data analysis equipment comprises: a memory having instructions stored therein and at least one processor, the memory and the at least one processor being interconnected by a line;
the at least one processor invokes the instructions in the memory to cause the data analysis equipment to perform the steps of the data analysis method of any of claims 1-7.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the steps of the data analysis method according to any one of claims 1 to 7.
CN202210288360.0A 2022-03-23 2022-03-23 Data analysis method, device, equipment and storage medium Pending CN114610769A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210288360.0A CN114610769A (en) 2022-03-23 2022-03-23 Data analysis method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210288360.0A CN114610769A (en) 2022-03-23 2022-03-23 Data analysis method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114610769A true CN114610769A (en) 2022-06-10

Family

ID=81865754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210288360.0A Pending CN114610769A (en) 2022-03-23 2022-03-23 Data analysis method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114610769A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117930028A (en) * 2024-03-21 2024-04-26 成都赛力斯科技有限公司 Method, system, equipment and medium for predicting thermal failure of new energy vehicle battery
CN117930028B (en) * 2024-03-21 2024-05-17 成都赛力斯科技有限公司 Method, system, equipment and medium for predicting thermal failure of new energy vehicle battery

Similar Documents

Publication Publication Date Title
US20050246353A1 (en) Automated transformation of unstructured data
CN112487140A (en) Question-answer dialogue evaluating method, device, equipment and storage medium
WO2008121862A1 (en) Data merging in distributed computing
CN106341257B (en) Device for self-defining log analysis rule and automatically analyzing log
US7765219B2 (en) Sort digits as number collation in server
CN115576984A (en) Method for generating SQL (structured query language) statement and cross-database query by Chinese natural language
US9754083B2 (en) Automatic creation of clinical study reports
CN111859969B (en) Data analysis method and device, electronic equipment and storage medium
JP2004362223A (en) Information mining system
CN115470338B (en) Multi-scenario intelligent question answering method and system based on multi-path recall
CN113051362A (en) Data query method and device and server
CN110909126A (en) Information query method and device
CN117743371A (en) SQL sentence generation method, device, equipment and medium based on large language model
CN113221570A (en) Processing method, device, equipment and storage medium based on-line inquiry information
CN113657088A (en) Interface document analysis method and device, electronic equipment and storage medium
CN111831624A (en) Data table creating method and device, computer equipment and storage medium
CN114610769A (en) Data analysis method, device, equipment and storage medium
CN113434631B (en) Emotion analysis method and device based on event, computer equipment and storage medium
Jain et al. Database-agnostic workload management
CN115495587A (en) Alarm analysis method and device based on knowledge graph
CN113095073B (en) Corpus tag generation method and device, computer equipment and storage medium
CN115221323A (en) Cold start processing method, device, equipment and medium based on intention recognition model
CN115269862A (en) Electric power question-answering and visualization system based on knowledge graph
CN112527880B (en) Method, device, equipment and medium for collecting metadata information of big data cluster
CN115203057B (en) Low code test automation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination