CN116244410B

CN116244410B - Index data analysis method and system based on knowledge graph and natural language

Info

Publication number: CN116244410B
Application number: CN202310126462.7A
Authority: CN
Inventors: 金震; 张京日; 穆宇浩
Original assignee: Beijing SunwayWorld Science and Technology Co Ltd
Current assignee: Beijing SunwayWorld Science and Technology Co Ltd
Priority date: 2023-02-16
Filing date: 2023-02-16
Publication date: 2023-10-20
Anticipated expiration: 2043-02-16
Also published as: CN116244410A

Abstract

The invention provides an index data analysis method and system based on a knowledge graph and natural language, wherein the method comprises the following steps: acquiring an index data set corresponding to the service to be analyzed, extracting data characteristics of each index data in the index data set, and constructing an index knowledge graph corresponding to the index data set based on the data characteristics; carrying out natural language processing on a target request input by a user, and carrying out key analysis requirement identification on the natural language obtained by processing to obtain a target requirement; matching the target requirements with the index knowledge graph to obtain key index data, generating a target image corresponding to the key index data, and displaying the key index data and the target image. The query or analysis efficiency of index data according to the user requirements is improved, and meanwhile, the accuracy of the query or analysis of the index data is guaranteed.

Description

Index data analysis method and system based on knowledge graph and natural language

Technical Field

The invention relates to the technical field of data processing, in particular to an index data analysis method and system based on a knowledge graph and natural language.

Background

At present, with the development of technology, various industries generate a large amount of useful index data during operation, and querying or analyzing this index data helps to grasp the operating condition of a business accurately and reliably.

However, because the amount of index data is huge, index data can only be queried or analyzed by fuzzy matching on index names. As a result, the index data finally retrieved differs considerably from what the user needs, the accuracy of the query or analysis cannot be ensured, and the user experience is greatly reduced. The huge amount of index data also makes querying or analyzing it inefficient. How to analyze index data directly in natural language, particularly Chinese, with business intuition has therefore become a problem to be solved.

Therefore, the invention provides an index data analysis method and system based on a knowledge graph and natural language.

Disclosure of Invention

The invention provides an index data analysis method and system based on a knowledge graph and natural language, which generate a corresponding index knowledge graph from the index data, so that the required key index data can be located conveniently and rapidly in the index knowledge graph according to the query or analysis requirement of a user. This improves the efficiency of querying or analyzing the index data according to the user requirement while ensuring the accuracy of the query or analysis.

The invention provides an index data analysis method based on a knowledge graph and natural language, which comprises the following steps:

step 1: acquiring an index data set corresponding to the service to be analyzed, extracting data characteristics of each index data in the index data set, and constructing an index knowledge graph corresponding to the index data set based on the data characteristics;

step 2: carrying out natural language processing on a target request input by a user, and carrying out key analysis requirement identification on the natural language obtained by processing to obtain a target requirement;

step 3: matching the target requirements with the index knowledge graph to obtain key index data, generating a target image corresponding to the key index data, and displaying the key index data and the target image.

Preferably, in step 1, an index data set corresponding to a service to be analyzed is obtained, which includes:

acquiring service attributes of a service to be analyzed, and generating an index data acquisition request based on the service attributes and identity information of a data acquisition terminal;

constructing a communication link between the data acquisition terminal and a preset server, transmitting an index data acquisition request to the preset server based on the communication link, and splitting the index data acquisition request into a first sub-response request and a second sub-response request based on the preset server;

performing first analysis on the first sub-response request based on the preset server, matching the analysis result with a preset registration identity information base, performing second analysis on the second sub-response request when matched preset registration identity information exists, and determining an index data identifier to be acquired;

searching a preset index database based on the index data identifier to be acquired to obtain an index data set, packaging the index data set, and feeding back the index data set to the data acquisition terminal based on the communication link.
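The acquisition flow above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `AcquisitionRequest`, `REGISTERED_TERMINALS`, `INDEX_DATABASE`, `split_request` and `handle_request` are hypothetical, the communication link and packaging are omitted, and the service attribute is assumed to serve directly as the index data identifier.

```python
from dataclasses import dataclass

# Hypothetical in-memory stand-ins for the preset registration identity
# information base and the preset index database.
REGISTERED_TERMINALS = {"terminal-001"}
INDEX_DATABASE = {
    "sales": [{"name": "product_sales", "value": 1200}],
    "profit": [{"name": "operating_profit", "value": 300}],
}


@dataclass
class AcquisitionRequest:
    terminal_id: str        # identity information of the data acquisition terminal
    service_attribute: str  # service attribute of the service to be analyzed


def split_request(request):
    """Split the request into the identity part and the data part."""
    first = {"terminal_id": request.terminal_id}
    second = {"service_attribute": request.service_attribute}
    return first, second


def handle_request(request):
    """Verify identity first; only then resolve and return the index data set."""
    first, second = split_request(request)
    if first["terminal_id"] not in REGISTERED_TERMINALS:
        return None  # no matching registered identity: the data part is not parsed
    identifier = second["service_attribute"]  # assumed index data identifier
    return INDEX_DATABASE.get(identifier, [])
```

Parsing the second sub-response request only after the identity check succeeds mirrors the order described in the text: an unregistered terminal never reaches the index database.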

Preferably, in step 1, extracting the data characteristics of each index data in the index data set includes:

acquiring an index data set, dividing each index data in the index data set into N data segments with equal length, determining the protocol type of each index data in the index data set, and sequentially inputting the N data segments corresponding to each index data into a corresponding feature recognition model based on the protocol type;

analyzing the input data segments based on the feature recognition model to obtain data type values and data target values corresponding to the data segments, and obtaining data features of the index data based on the data type values and the data target values corresponding to the data segments;

clustering each index data in the index data set based on the data characteristics, and obtaining a classification result corresponding to each index data based on the clustering result.
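The feature extraction steps above can be sketched as follows. The text does not specify the feature recognition model or the clustering algorithm, so this sketch makes assumptions: each index data is a string, `recognize` is a hypothetical rule-based stand-in for the trained model, and grouping by identical type-value sequences stands in for clustering. All function names are illustrative.

```python
from collections import defaultdict


def split_equal(data, n):
    """Split a string into n segments of equal length, padding the tail with spaces."""
    size = -(-len(data) // n)  # ceiling division
    padded = data.ljust(size * n)
    return [padded[i * size:(i + 1) * size] for i in range(n)]


def recognize(segment):
    """Stand-in for the feature recognition model: returns (type value, target value)."""
    text = segment.strip()
    if text.isdecimal():
        return 1, int(text)  # type value 1: numeric segment, target is its value
    return 0, len(text)      # type value 0: textual segment, target is its length


def extract_features(data, n):
    """Data feature of one index data: the (type, target) pair of every segment."""
    return tuple(recognize(segment) for segment in split_equal(data, n))


def cluster_by_type(records, n):
    """Group records whose segments share the same sequence of type values."""
    groups = defaultdict(list)
    for record in records:
        signature = tuple(type_value for type_value, _ in extract_features(record, n))
        groups[signature].append(record)
    return dict(groups)
```

For example, `cluster_by_type(["ab12", "cd34", "1234"], 2)` places the first two records in one group and the third in another, because their segment types differ.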

Preferably, in step 1, an index knowledge graph corresponding to an index dataset is constructed based on data features, which includes:

acquiring the obtained index data and corresponding data characteristics, determining repeated index data in the index data based on the data characteristics, and performing de-duplication on the repeated index data to obtain standard index data;

determining task fields corresponding to standard index data based on data features of the index data, determining first logic relations between the task fields corresponding to the standard index data based on service processing logic of the service to be processed, and constructing an infrastructure of an index knowledge graph between the task fields based on the first logic relations;

determining sub-index data corresponding to each task field based on the infrastructure construction result, associating the sub-index data with the corresponding task field, determining keywords corresponding to each sub-index data based on the data characteristics, and determining a second logic relationship between the sub-index data based on the keywords;

converting the sub-index data into structured data based on the second logic relationship, and fusing the converted structured data with the constructed basic framework to obtain an index knowledge graph corresponding to the index data set.
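The construction steps above can be sketched as follows. This is an illustrative sketch only: the graph is a plain dictionary of nodes and edge triples, de-duplication is assumed to be by index name, and the second logic relationship is assumed to hold between sub-indexes that share a keyword. The function name, field names and edge labels are hypothetical.

```python
def build_index_graph(index_items, field_relations):
    """Build a small graph from index items and task-field relations.

    index_items: list of dicts with 'name', 'field' and 'keywords'.
    field_relations: list of (field_a, field_b) pairs from the service logic.
    """
    # De-duplicate on name to obtain the standard index data.
    standard = {}
    for item in index_items:
        standard.setdefault(item["name"], item)

    graph = {"nodes": {}, "edges": []}

    # Basic framework: task fields and the first logic relations between them.
    for item in standard.values():
        graph["nodes"][item["field"]] = "task_field"
    for left, right in field_relations:
        if left in graph["nodes"] and right in graph["nodes"]:
            graph["edges"].append((left, "relates_to", right))

    # Associate each sub-index with its task field.
    for item in standard.values():
        graph["nodes"][item["name"]] = "sub_index"
        graph["edges"].append((item["field"], "contains", item["name"]))

    # Second logic relations: sub-indexes that share at least one keyword.
    names = sorted(standard)
    for i, left in enumerate(names):
        for right in names[i + 1:]:
            if set(standard[left]["keywords"]) & set(standard[right]["keywords"]):
                graph["edges"].append((left, "shares_keyword", right))
    return graph
```

Building the task-field framework before attaching sub-indexes follows the order in the text, where the framework is constructed first and the structured sub-index data is fused into it afterwards.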

Preferably, obtaining the index knowledge graph corresponding to the index data set includes:

acquiring the obtained index knowledge graph, randomly extracting, by sampling detection, the sub-index data contained in a task field to be checked in the index knowledge graph, determining the service range of the task field to be checked, and determining theoretical sub-index data corresponding to the task field to be checked based on the service range;

matching the sub-index data with theoretical sub-index data, judging that the constructed index knowledge graph has storage defects when the sub-index data and the theoretical sub-index data are different, and sequentially checking each task field in the index knowledge graph based on a judging result to obtain difference index data of the sub-index data in each task field and the corresponding theoretical sub-index data;

extracting data head information and data tail information of the difference index data, determining the relative position relation between the difference index data and the sub-index data in the corresponding task field based on the data head information and the data tail information, and complementing the difference index data in the index knowledge graph based on the relative position relation to obtain a final index knowledge graph.
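The checking and completion steps above can be sketched as follows. The sketch assumes that each task field is an ordered list of sub-index names and that the theoretical sub-index data is an ordered list as well; the position of a missing item is inferred from its predecessor in the theoretical order, which is a simplification of the data head and data tail comparison described in the text. All names are hypothetical.

```python
import random


def find_missing(actual, expected):
    """Return the expected sub-indexes that are absent, in expected order."""
    present = set(actual)
    return [name for name in expected if name not in present]


def complete_field(actual, expected):
    """Insert each missing sub-index right after its nearest present predecessor."""
    result = list(actual)
    for position, name in enumerate(expected):
        if name in result:
            continue
        earlier = [item for item in expected[:position] if item in result]
        insert_at = result.index(earlier[-1]) + 1 if earlier else 0
        result.insert(insert_at, name)
    return result


def verify_graph(graph, theory, seed=0):
    """Sample one task field; if it is incomplete, check and repair every field."""
    sampled = random.Random(seed).choice(sorted(graph))
    if not find_missing(graph[sampled], theory[sampled]):
        return graph  # sampled field is complete: accept the graph as built
    return {field: complete_field(subs, theory[field]) for field, subs in graph.items()}
```

Sampling a single field first keeps the common case cheap; the full field-by-field check runs only when the sample reveals a defect, as in the text.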

Preferably, in step 2, performing natural language processing on the target request input by the user includes:

acquiring a history input text, splitting the history input text to obtain N sentence segments, and respectively analyzing the parts of speech of the vocabulary contained in the N sentence segments to obtain the part-of-speech composition of the different sentence segments;

respectively carrying out multi-mode semantic learning on N sentence segments based on part-of-speech composition to obtain semantic features in different modes, and training the obtained semantic features and an analysis flow of the semantic features as training samples to obtain a natural language processing model;

acquiring a target request input by a user, converting the target request into a target text in the target request mode carried in the target request, and inputting the target text into the natural language processing model for analysis to obtain target semantic features corresponding to the target request;

matching the target natural language corpus from a preset corpus in a corresponding mode based on the target semantic features, determining target language sequences among all target natural language corpuses in the target natural language corpus based on a preset grammar rule in a target requirement mode, and sequencing the target natural language corpuses based on the target language sequences to obtain a final natural language.
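The first step above, splitting the history input text and obtaining the part-of-speech composition of each sentence segment, can be sketched as follows. This covers only that preprocessing step, not model training. The lexicon `LEXICON` is a hypothetical toy, and splitting words on whitespace is an assumption that suits English; Chinese text would need a word segmenter instead.

```python
import re
from collections import Counter

# Toy part-of-speech lexicon (hypothetical); unknown words are tagged "other".
LEXICON = {
    "sales": "noun",
    "profit": "noun",
    "show": "verb",
    "and": "connective",
    "monthly": "adjective",
}


def split_sentences(text):
    """Split text into sentence segments on Chinese or English end punctuation."""
    return [part.strip() for part in re.split(r"[.!?。！？]", text) if part.strip()]


def pos_composition(sentence):
    """Count how many words of each part of speech a sentence segment contains."""
    return Counter(LEXICON.get(word, "other") for word in sentence.lower().split())
```

The per-segment counts returned by `pos_composition` are one simple way to represent the part-of-speech composition that the later semantic learning step takes as input.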

Preferably, in step 2, performing key analysis requirement identification on the natural language obtained by processing to obtain the target requirement includes:

acquiring a natural language obtained after natural language processing is performed on a target request input by a user, and extracting target semantics corresponding to the natural language;

determining keywords in the natural language based on the target semantics, obtaining key analysis requirements of the user based on the keywords, judging whether the target intention of the user is missing based on the key analysis requirements, and determining missing language components in the natural language when the target intention of the user is missing;

determining a natural corpus label to be supplemented based on the missing language components, searching historical corpus information of a user based on the natural corpus label to be supplemented to obtain target supplementary natural corpus, and supplementing key analysis requirements after natural language processing is performed on the target supplementary natural corpus;

and obtaining the final target requirement of the user based on the supplementing result.
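The requirement identification and supplementation steps above can be sketched as follows. The sketch assumes that a key analysis requirement consists of a small set of slots (here a metric and a period), that a missing slot corresponds to a missing language component, and that the user's history is a list of earlier request texts. The slot vocabularies in `SLOTS` and the function names are hypothetical.

```python
# Hypothetical slot vocabularies; a real system would derive these from the graph.
SLOTS = {
    "metric": {"sales", "profit"},
    "period": {"today", "monthly", "yearly"},
}


def extract_requirement(text):
    """Map each recognised keyword to the slot (language component) it fills."""
    requirement = {}
    for word in text.lower().split():
        for slot, vocabulary in SLOTS.items():
            if word in vocabulary:
                requirement.setdefault(slot, word)
    return requirement


def supplement(requirement, history):
    """Fill missing slots from the user's most recent historical requests."""
    completed = dict(requirement)
    for slot in SLOTS:
        if slot in completed:
            continue
        for past_text in reversed(history):
            past = extract_requirement(past_text)
            if slot in past:
                completed[slot] = past[slot]
                break
    return completed
```

For instance, a request that names only "profit" is completed with the period found in the user's latest earlier request that mentioned one.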

Preferably, in step 3, matching the target requirement with the index knowledge graph to obtain key index data, including:

acquiring the obtained target demand, inputting the target demand into a preset index query engine for conversion to obtain a query element corresponding to the target demand;

generating a query sentence based on the query element, analyzing the index knowledge graph based on the query sentence, and determining the attribution rate of each index data in the index knowledge graph relative to the query sentence;

and judging the index data with the attribution rate being greater than or equal to a preset threshold value as key index data, and sequencing the obtained index data based on the descending order of the attribution rate to obtain final key index data, wherein the number of the key index data is at least one.
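The thresholding and ranking steps above can be sketched as follows. The text does not define how the attribution rate is computed, so this sketch assumes it is the fraction of query elements covered by an index's keywords; the function names and the default threshold of 0.5 are likewise assumptions.

```python
def attribution_rate(query_elements, index_keywords):
    """Fraction of query elements covered by the index's keywords (assumed definition)."""
    query = set(query_elements)
    if not query:
        return 0.0
    return len(query & set(index_keywords)) / len(query)


def select_key_indexes(query_elements, graph_indexes, threshold=0.5):
    """Keep indexes at or above the threshold, sorted by descending rate."""
    scored = [
        (name, attribution_rate(query_elements, keywords))
        for name, keywords in graph_indexes.items()
    ]
    kept = [(name, rate) for name, rate in scored if rate >= threshold]
    # Sort by rate descending; ties are broken by name for a stable order.
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))
```

With the query elements `["sales", "monthly"]`, an index tagged with both keywords scores 1.0 and ranks ahead of one tagged only with "sales", which scores 0.5.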

Preferably, in step 3, a target image corresponding to key index data is generated, and the key index data and the target image are displayed, which includes:

acquiring the obtained key index data, and determining index analysis requirements corresponding to the key index data;

determining a target chart type for displaying key index data based on index analysis requirements, and matching a target chart template from a preset chart template library based on the target chart type;

extracting configuration parameters of a target chart template, determining format requirements of the target chart on data to be displayed based on the configuration parameters, and performing format conversion on key index data based on the format requirements;

fusing the converted key index data with the target chart template to obtain a target image corresponding to the key index data, wherein the target image is a histogram, a pie chart or a line chart.
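The chart selection and format conversion steps above can be sketched as follows. No rendering is performed; the sketch only shows template matching and data conversion. The template library `CHART_TEMPLATES`, the mapping `REQUIREMENT_TO_CHART` from analysis requirement to chart type, and the layout names are hypothetical.

```python
# Hypothetical template library: each template states the data layout it expects.
CHART_TEMPLATES = {
    "line": {"layout": "points"},     # list of (x, y) pairs
    "pie": {"layout": "shares"},      # label -> fraction of the total
    "histogram": {"layout": "bars"},  # parallel label and value lists
}
REQUIREMENT_TO_CHART = {"trend": "line", "proportion": "pie", "comparison": "histogram"}


def convert(data, layout):
    """Convert label -> value data into the layout a template requires."""
    if layout == "points":
        return list(data.items())
    if layout == "shares":
        total = sum(data.values())
        return {label: value / total for label, value in data.items()}
    if layout == "bars":
        return {"labels": list(data), "values": list(data.values())}
    raise ValueError(f"unknown layout: {layout}")


def build_chart(data, requirement):
    """Pick the chart type for the requirement and fuse converted data with it."""
    chart_type = REQUIREMENT_TO_CHART[requirement]
    template = CHART_TEMPLATES[chart_type]
    return {"type": chart_type, "data": convert(data, template["layout"])}
```

Keeping the expected layout inside each template means the conversion step reads its format requirement from the template's configuration, as the text describes, rather than hard-coding it per chart type.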

Preferably, an index data analysis system based on a knowledge graph and natural language comprises:

the map construction module is used for acquiring an index data set corresponding to the service to be analyzed, extracting the data characteristics of each index data in the index data set, and constructing an index knowledge map corresponding to the index data set based on the data characteristics;

the natural language processing module is used for carrying out natural language processing on a target request input by a user, and carrying out key analysis requirement identification on the natural language obtained by processing to obtain a target requirement;

the analysis module is used for matching the target requirements with the index knowledge graph to obtain key index data, generating a target image corresponding to the key index data, and displaying the key index data and the target image.

Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.

The technical scheme of the invention is further described in detail through the drawings and the embodiments.

Drawings

The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:

FIG. 1 is a flow chart of a method for analyzing index data based on knowledge graph and natural language in an embodiment of the invention;

FIG. 2 is a flowchart of step 1 in an index data analysis method based on knowledge graph and natural language according to an embodiment of the present invention;

FIG. 3 is a block diagram of an index data analysis system based on a knowledge graph and natural language in an embodiment of the invention.

Detailed Description

The preferred embodiments of the present invention will be described below with reference to the accompanying drawings, it being understood that the preferred embodiments described herein are for illustration and explanation of the present invention only, and are not intended to limit the present invention.

Example 1:

This embodiment provides an index data analysis method based on a knowledge graph and natural language, as shown in FIG. 1, including:

In this embodiment, the service to be analyzed refers to a service that needs to be analyzed through index data, and specifically may be a product sales amount, an operating profit, and the like.

In this embodiment, the index data set refers to all index data corresponding to the index knowledge graph to be constructed, specifically, all index data related to the service may be obtained from the server according to the service requirement, and the index data contained in the index data set is not unique.

In this embodiment, the index data is the data contained in the index data set; for example, if the product sales of the current month need to be determined, the index data includes the sales volume of the product, the type of the product, the unit price of the product, and the like.

In this embodiment, the data features refer to the type of the index data, the specific value range of the index data, and the like.

In this embodiment, the index knowledge graph refers to determining a logic relationship, a calling relationship and an interaction relationship between index data according to data features of the index data, so as to display the relationship between the index data in a graph form.

In this embodiment, the target request refers to a request such as analysis or query that the user needs to perform on the index data.

In this embodiment, natural language processing refers to converting the data input by the user in the system into natural language. A natural language is a language that evolves naturally with culture, such as Chinese or English; for example, the input target request may be converted into "sales of products in the current month".

In this embodiment, performing the key analysis requirement recognition refers to analyzing the obtained natural language, and extracting keywords in the natural language, so as to facilitate determining a final purpose of user query or analysis.

In this embodiment, the target requirement refers to the type of the index data that the user needs to obtain from the index knowledge graph, the corresponding value, and the like, that is, the analysis purpose of the user.

In this embodiment, the key index data refers to at least one index data which is consistent with the target request input by the user after analyzing the index knowledge graph according to the target requirement.

In this embodiment, the target image corresponding to the key index data is generated according to the index service display requirement corresponding to the key index data; specifically, a corresponding histogram, pie chart or line chart may be generated according to the values of the key index data, the proportion of different data types in the key index data, and the like.

In this embodiment, when matching the target demand with the index knowledge graph, if no index data is matched with the target demand in the knowledge graph, a new index is automatically generated according to parameters such as an atomic index, a service dimension, a service definition, time and the like of the service to be analyzed, and the constructed knowledge graph is perfected through the new index.
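The fallback described above, creating a new index when nothing in the graph matches, can be sketched as follows. The sketch assumes the graph is a dictionary keyed by index name and that a new index is named by combining the period, service dimension and atomic index; the service definition parameter is omitted, and the naming scheme and function name are hypothetical.

```python
def find_or_create_index(graph, atomic_index, dimension, period):
    """Return the matching index name, creating a derived index when none exists."""
    name = f"{period}_{dimension}_{atomic_index}"
    if name not in graph:
        # No match in the knowledge graph: generate the index and record it,
        # so that later requests for the same combination are served directly.
        graph[name] = {
            "atomic_index": atomic_index,
            "dimension": dimension,
            "period": period,
            "generated": True,
        }
    return name
```

Because the generated index is written back into the graph, a repeated request for the same combination finds it without generating it again.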

The beneficial effects of the technical scheme are as follows: by generating the index knowledge graph corresponding to the index data, the required key index data can be located conveniently and rapidly in the index knowledge graph according to the query or analysis requirements of the user, the efficiency of querying or analyzing the index data according to the user requirements is improved, and the accuracy of the query or analysis is ensured.

Example 2:

On the basis of embodiment 1, this embodiment provides an index data analysis method based on a knowledge graph and natural language in which, in step 1, acquiring the index data set corresponding to the service to be analyzed includes:

In this embodiment, the service attribute is a parameter for characterizing a service type of the service to be analyzed, and the like.

In this embodiment, the identity information is used to characterize a device type, a communication address of the device, an access right of the device to a preset server, and the like, which correspond to the data acquisition terminal.

In this embodiment, the index data acquisition request is transmitted from the data acquisition terminal to the preset server, so that the corresponding index data can be acquired from the preset server through the data acquisition terminal.

In this embodiment, the preset server is set in advance, and is used for storing index data corresponding to different services to be analyzed.

In this embodiment, the first sub-response request is the identity verification request, that is, the part of the index data acquisition request that carries the identity information of the data acquisition terminal.

In this embodiment, the second sub-response request is the data acquisition request, that is, the part of the index data acquisition request that characterizes which index data needs to be acquired from the preset server.

In this embodiment, the first parsing refers to verifying the identity information of the data acquisition terminal by the preset server.

In this embodiment, the preset registration identity information base is set in advance, and identity information recorded by different terminals during registration is stored in the preset registration identity information base.

In this embodiment, the matched preset registration identity information is one entry in the preset registration identity information base, that is, the terminal identity information in the base that matches the identity of the current data acquisition terminal.

In this embodiment, the second parsing refers to parsing the second sub-response request, that is, parsing the data acquisition request of the data acquisition terminal.

In this embodiment, the index data identifier is a type of tag for marking the data type corresponding to the different index data.

The beneficial effects of the technical scheme are as follows: corresponding index data are acquired from a preset server according to the service attribute of the service to be analyzed, so that a corresponding index knowledge graph is conveniently constructed according to the acquired index data, and the efficiency and the accuracy of analyzing and inquiring the index data according to the user requirements are ensured.

Example 3:

On the basis of embodiment 1, this embodiment provides an index data analysis method based on a knowledge graph and natural language in which, as shown in FIG. 2, extracting the data features of each index data in the index data set in step 1 includes:

step 101: acquiring an index data set, dividing each index data in the index data set into N data segments with equal length, determining the protocol type of each index data in the index data set, and sequentially inputting the N data segments corresponding to each index data into a corresponding feature recognition model based on the protocol type;

step 102: analyzing the input data segments based on the feature recognition model to obtain data type values and data target values corresponding to the data segments, and obtaining data features of the index data based on the data type values and the data target values corresponding to the data segments;

step 103: clustering each index data in the index data set based on the data characteristics, and obtaining a classification result corresponding to each index data based on the clustering result.

In this embodiment, the data segment refers to different data segments obtained by splitting the index data, and is a part of the original index data.

In this embodiment, the protocol type refers to a requirement or a condition or the like that different index data needs to satisfy when executing the corresponding function.

In this embodiment, the feature recognition model is trained in advance, and is used to recognize data features corresponding to different data.

In this embodiment, the data type value is a numerical value that characterizes the data type of the index data.

In this embodiment, the data target value refers to the specific magnitude of the value corresponding to different index data, and the like.

The beneficial effects of the technical scheme are as follows: the obtained index data is split into different data segments, the obtained data segments are input into the characteristic recognition model for analysis processing, so that the data characteristics of the different index data are accurately and reliably confirmed, and meanwhile, the obtained index data are classified according to the obtained data characteristics, so that index knowledge maps corresponding to the different index data are conveniently constructed, and convenience and guarantee are provided for rapidly and accurately analyzing the index data according to the input requirements of users.

Example 4:

On the basis of embodiment 1, this embodiment provides an index data analysis method based on a knowledge graph and natural language in which, in step 1, constructing the index knowledge graph corresponding to the index data set based on the data features includes:

In this embodiment, repeated index data means that at least two identical index data exist in the acquired index data.

In this embodiment, the standard index data refers to index data without repetition obtained by performing deduplication on repeated index data in the obtained index data.

In this embodiment, the task domain is the function type corresponding to different index data. For example, "customer purchasing power", "commodity attribute" and "holiday effect" all affect "sales volume", so "sales volume" is the task domain of "customer purchasing power", "commodity attribute", "holiday effect", and the like.

In this embodiment, the service processing logic characterizes the order in which the departments or links of the service to be processed are executed during operation, and the influence relationships between those departments or links.

In this embodiment, the first logical relationship is an interaction relationship for characterizing task domains, i.e. a dependency relationship between task domains where interactions exist.

In this embodiment, the infrastructure refers to a rough framework for constructing an index knowledge graph according to task fields corresponding to standard index data, so that specific index data corresponding to respective task fields can be filled into corresponding task fields in sequence, and the purpose is to improve efficiency and accuracy of index knowledge graph construction.

In this embodiment, the sub-index data refers to the index data corresponding to each task domain, and is part of the standard index data.

In this embodiment, associating the sub-index data with the corresponding task domain refers to building an association relationship between the sub-index data and the corresponding task domain, so as to facilitate building an index knowledge graph corresponding to the index data according to the task domain.

In this embodiment, the keywords refer to data segments that can characterize the data cores of different sub-index data.

In this embodiment, the second logical relationship is an interaction relationship or an inter-calling relationship or the like for characterizing sub-index data included in different task fields.

In this embodiment, the structured data refers to performing data format conversion on the second logic relationship between the sub-index data and the specific content corresponding to the sub-index data, so as to ensure that the data content and the association relationship of the sub-index data can be displayed in the form of a knowledge graph.

In this embodiment, fusing the converted structured data with the constructed basic frame refers to filling sub-index data corresponding to each task field and a second logic relationship corresponding to the sub-index data in the constructed basic frame, so as to construct a final required index knowledge graph.

The beneficial effects of the technical scheme are as follows: de-duplicating the index data according to its data features helps ensure that the constructed index knowledge graph is accurate and concise; the task fields involved in the index data are determined according to the data features, and the basic framework of the index knowledge graph is built from the task fields according to the service processing logic of the service to be processed; finally, the sub-index data contained in the different task fields and the logic relations among them are determined, and the sub-index data are filled into the constructed basic framework to obtain the final index knowledge graph. This ensures the reliability of the constructed index knowledge graph and provides convenience and a guarantee for rapid and accurate analysis or query of the index data according to user input.

Example 5:

On the basis of embodiment 4, this embodiment provides an index data analysis method based on a knowledge graph and natural language in which obtaining the index knowledge graph corresponding to the index data set includes:

In this embodiment, the task field to be verified refers to a region, which is obtained by sampling and detecting the index knowledge graph and needs to be verified in the index knowledge graph, so as to verify whether the constructed index knowledge graph is perfect.

In this embodiment, the service scope is used to characterize the type of index data that should be included in the task field to be checked, and is index data that should be included in the index knowledge graph that is theoretically constructed.

In this embodiment, the theoretical sub-index data refers to the type of sub-index data that should be included in the task field to be verified and the data content of the specific sub-index data, which are determined according to the service range.

In this embodiment, the step of determining that the constructed index knowledge graph has a storage defect refers to determining that the constructed index knowledge graph is imperfect, that is, lacks sub-index data.

In this embodiment, the difference index data refers to sub-index data that is not included in the constructed index knowledge graph; there is at least one such item.

In this embodiment, the data header information refers to the data content corresponding to the start position of the difference index data.

In this embodiment, the data trailer information refers to the data content corresponding to the end position of the difference index data.

In this embodiment, the relative positional relationship characterizes the data position that the difference index data should occupy in the corresponding task domain, so that the difference index data can be supplemented at the correct position in the constructed index knowledge graph.

The beneficial effects of the technical scheme are as follows: the constructed index knowledge graph is sampled and the service range of the task field to be checked is determined, so that the theoretical sub-index data of that task field are confirmed accurately and effectively and provide a reference for checking it; the sub-index data of the task field to be checked are then matched with the corresponding theoretical sub-index data to determine the difference index data; once the difference index data are determined, their relative positions in the corresponding task field are confirmed from their data head information and data tail information, so that the constructed index knowledge graph is completed and its accuracy is guaranteed.

Example 6:

On the basis of embodiment 1, the present embodiment provides an index data analysis method based on a knowledge graph and natural language, wherein in step 2, performing natural language processing on a target request input by a user includes:

In this embodiment, the history input text is set in advance and provides data support for building the natural language processing model.

In this embodiment, the sentence segments are the sentences obtained by splitting the history input text; splitting improves the efficiency and accuracy of processing the history input text and thereby helps ensure the accuracy of the trained natural language processing model.

In this embodiment, the part of speech characterizes the type of a word, and may specifically be a connective or the like.

In this embodiment, the part-of-speech composition characterizes the vocabulary types contained in each sentence segment, and may specifically include connectives, words naming persons, and words expressing a result or purpose.

In this embodiment, the multiple modes mean that the history input text is analyzed using natural languages of different modes; for example, different sentence segments may be analyzed in Chinese, English, and the like.

In this embodiment, semantic learning refers to determining the subject content of different sentence segments, so as to confirm the processing flow of natural language by processing the history input text.

In this embodiment, the semantic features are sentence content features for characterizing different sentence segments, i.e. the idea of the gist that each sentence segment is intended to convey.

In this embodiment, the analysis flow refers to splitting of the history text, analysis of parts of speech in sentences, semantic learning conditions under different modes, and the like.

In this embodiment, the natural language processing model is constructed according to semantic features and an analysis flow of the semantic features, so as to convert a target request input by a user into a corresponding natural language, thereby implementing analysis on a constructed index knowledge graph.

In this embodiment, the target requirement mode refers to the mode that the user needs to adopt, and may be, for example, either Chinese or English.

In this embodiment, the target text refers to the script file obtained by converting the target request, so that the target request input by the user can be converted into the corresponding natural language.

In this embodiment, the target semantic feature refers to the specific data content obtained by performing natural language processing, through the natural language processing model, on the target text input by the user, that is, the function that the target text is ultimately meant to achieve.

In this embodiment, the preset corpus is set in advance, and different modes correspond to different corpora, and different natural language vocabularies are stored in the corpus.

In this embodiment, the target natural language corpus refers to a set of natural language vocabularies which are matched from a preset corpus according to target semantic features and have the same content as the target semantic features and different expression forms.

In this embodiment, the preset grammar rules are known in advance; for example, the subject-predicate-object structure is to be followed in the English mode.

In this embodiment, the target natural language corpus is a natural language corpus included in the target natural language corpus set.

In this embodiment, the target word order characterizes the logical order that the target natural language corpus must satisfy when a logical sentence is constructed.

The beneficial effects of the technical scheme are as follows: the history input text is processed, and the semantic features in different modes and the analysis flow of those features are confirmed accurately and effectively from the processing results, which guarantees the accuracy and reliability of the constructed natural language processing model. The target request input by the user is then converted into a target text, and the target text is analyzed by the constructed natural language processing model, so that the target semantic features of the user's request are obtained accurately and reliably. Finally, the target natural language corpus is matched from the preset corpus of the corresponding mode according to the target semantic features, so that the natural language corresponding to the target request is obtained effectively. This guarantees reliable confirmation of the user's target requirement and improves the accuracy and efficiency of analyzing the target data through the target knowledge graph.
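The first two processing stages described in this embodiment, splitting the history input text into sentence segments and determining the part-of-speech composition of each segment, can be sketched in Python as follows. The toy lexicon `POS_LEXICON`, the punctuation-based splitting rule, and the sample text are all assumptions made for illustration; the patent does not specify how segmentation or tagging is implemented.

```python
import re

# Hypothetical toy lexicon; a real system would use a trained tagger.
POS_LEXICON = {
    "and": "connective", "then": "connective",
    "show": "verb", "compare": "verb", "plot": "verb",
    "sales": "noun", "cost": "noun", "revenue": "noun", "trend": "noun",
}


def split_into_segments(text):
    """Split text into sentence segments at sentence-ending punctuation
    (ASCII and full-width forms)."""
    parts = re.split(r"[.!?;。！？；]+", text)
    return [p.strip() for p in parts if p.strip()]


def pos_composition(segment):
    """Return the sorted set of parts of speech found in one segment."""
    words = re.findall(r"[A-Za-z']+", segment.lower())
    return sorted({POS_LEXICON[w] for w in words if w in POS_LEXICON})


history_text = "Show sales for March. Compare cost and revenue; then plot the trend."
segments = split_into_segments(history_text)
compositions = [pos_composition(s) for s in segments]
```

Words missing from the lexicon are simply ignored in this sketch, so the composition reflects only the vocabulary types the lexicon knows about.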

Example 7:

On the basis of embodiment 1, the present embodiment provides an index data analysis method based on a knowledge graph and natural language, wherein in step 2, performing key analysis requirement identification on the natural language obtained by processing to obtain a target requirement includes:

In this embodiment, the target semantics refer to the subject content corresponding to the natural language.

In this embodiment, keywords refer to data pieces that have a large influence on the gist of natural language and are capable of representing natural language content.

In this embodiment, the key analysis requirement is used for characterizing the purpose that the user needs to query or analyze the index data finally, and may specifically be to query a certain type of index data, etc.

In this embodiment, the target intention refers to a final processing result of the index data that the user needs to implement through the index knowledge graph and the natural language.

In this embodiment, determining whether the target intention of the user is missing based on the key analysis requirement serves to verify whether the key analysis requirement derived from the natural language satisfies the query logic for the index data, and thus whether the natural language was identified accurately.

In this embodiment, the missing language component is a sentence component that is absent from the obtained natural language, and may specifically be any of the subject, predicate, object, and the like.

In this embodiment, the natural corpus label to be supplemented is a kind of markup symbol for characterizing the type of natural language that needs to be supplemented.

In this embodiment, the historical corpus information is obtained in advance and is a record of all analysis requests of the user over a period of time.

In this embodiment, the target supplementary natural language refers to natural language that can complete the obtained natural language, that is, natural language in the historical corpus information that matches the natural corpus label to be supplemented.

The beneficial effects of the technical scheme are as follows: the key analysis requirements of the user are determined according to the target semantics of the natural language, the obtained key analysis requirements are checked, and when the target intention is missing, the key analysis requirements of the user are supplemented in time, so that the accuracy and the reliability of the finally obtained target requirements are ensured, the accuracy of index data query or analysis is ensured, and the query or analysis efficiency of the index data according to the user requirements is improved.
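The supplementing step described in this embodiment can be sketched in Python under some simplifying assumptions: the parsed request is represented as a dictionary of sentence components, the label to be supplemented is just the name of the missing component, and the most recent historical request that contains that component is used as the supplement. The names `REQUIRED_COMPONENTS`, `missing_components`, and `supplement` are hypothetical.

```python
# Sentence components a complete request is assumed to need (illustrative).
REQUIRED_COMPONENTS = ("subject", "predicate", "object")


def missing_components(parsed):
    """Return the labels of components that are absent or empty."""
    return [c for c in REQUIRED_COMPONENTS if not parsed.get(c)]


def supplement(parsed, history):
    """Fill each missing component from the newest matching history entry."""
    completed = dict(parsed)
    for label in missing_components(parsed):
        for past in reversed(history):  # newest entry is last in the list
            if past.get(label):
                completed[label] = past[label]
                break
    return completed


# Hypothetical record of earlier analysis requests, oldest first.
history = [
    {"subject": "user", "predicate": "query", "object": "finance indices"},
    {"subject": "user", "predicate": "compare", "object": "quarterly revenue"},
]
request = {"subject": "user", "predicate": "query", "object": ""}

labels = missing_components(request)
completed = supplement(request, history)
```

Preferring the newest history entry is a design choice of this sketch; the patent only states that the supplement is matched from the historical corpus information by label.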

Example 8:

On the basis of embodiment 1, the present embodiment provides an index data analysis method based on a knowledge graph and natural language, wherein in step 3, matching the target requirement with the index knowledge graph to obtain key index data includes:

In this embodiment, the preset index query engine is set in advance, and is configured to generate a corresponding index data query statement according to the target requirement.

In this embodiment, the query element characterizes the main point of the query for the index data corresponding to the target requirement; for example, the query may target index data related to finance.

In this embodiment, the query statement is generated by a preset index query engine, so as to control the system to quickly locate the index data.

In this embodiment, the attribution rate characterizes the degree to which each index data in the index knowledge graph satisfies the requirement of the query statement; a larger value indicates that the query requirement of the query statement is better satisfied.

In this embodiment, the preset threshold is set in advance and represents the minimum criterion for regarding the query requirement as satisfied; it can be adjusted according to the actual situation.

In this embodiment, analyzing the index knowledge graph based on the query statement includes:

obtaining the total number of the queried key index data, and calculating the accuracy of the queried key index data based on the total number, specifically comprising:

calculating the accuracy of the queried key index data according to the following formula:

wherein η represents the accuracy of the queried key index data, with a value range of (0, 1); α represents an error factor, with a value range of (0.01, 0.03); M represents the total number of the queried key index data; m represents the number of key index data, among the queried key index data, that does not meet the query requirement, and its value is smaller than M; s represents the number of key index data that is misjudged as not meeting the query requirement, and its value is smaller than m;

comparing the calculated accuracy with a preset accuracy threshold;

if the calculated accuracy is greater than or equal to a preset accuracy threshold, judging that the analysis effect of the index data based on the index knowledge graph and the natural language is qualified;

otherwise, judging that the analysis effect of the index data based on the index knowledge graph and the natural language is unqualified, and optimizing the construction flow of the index knowledge graph and the processing flow of the natural language until the accuracy rate obtained by calculation is greater than or equal to a preset accuracy rate threshold value.

The preset accuracy threshold is set in advance, is used for representing the minimum requirement for analyzing the index data, and can be adjusted.

The beneficial effects of the technical scheme are as follows: the preset index query engine generates the corresponding query statement according to the target requirement, the index data contained in the index knowledge graph is searched accurately and reliably, and the attribution rate of the retrieved index data relative to the query statement is determined, so that the key index data finally needed is determined accurately and reliably according to the attribution rate. In addition, the accuracy of the queried key index data is calculated, and when the accuracy is smaller than the preset accuracy threshold, the construction flow of the index knowledge graph and the processing flow of the natural language are optimized in time, which ensures the analysis effect of the index data and the accuracy of querying or analyzing the index data.
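The selection of key index data by attribution rate can be illustrated with a short Python sketch. The patent does not state how the attribution rate is computed, so the sketch assumes, purely for illustration, that it is the fraction of query elements that appear among the tags of an index; the inclusive comparison with the threshold is likewise an assumption, and the names `attribution_rate` and `select_key_indices` are hypothetical.

```python
def attribution_rate(query_elements, index_tags):
    """Assumed definition: share of query elements found in the index tags."""
    if not query_elements:
        return 0.0
    matched = sum(1 for element in query_elements if element in index_tags)
    return matched / len(query_elements)


def select_key_indices(query_elements, graph, threshold):
    """Return the names of indices whose attribution rate reaches the
    threshold (inclusive comparison is an assumption of this sketch)."""
    scored = {
        name: attribution_rate(query_elements, tags)
        for name, tags in graph.items()
    }
    return sorted(name for name, rate in scored.items() if rate >= threshold)


# Hypothetical index knowledge graph reduced to name -> tag set.
graph = {
    "net_profit": {"finance", "quarterly"},
    "headcount": {"hr", "quarterly"},
    "cash_flow": {"finance", "monthly"},
}

key_indices = select_key_indices(["finance", "quarterly"], graph, threshold=0.6)
```

With this data only `net_profit` matches both query elements, so it is the only index whose rate clears the 0.6 threshold; lowering the threshold to 0.5 would admit the other two as well.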

Example 9:

On the basis of embodiment 1, the present embodiment provides an index data analysis method based on a knowledge graph and natural language, wherein in step 3, generating a target image corresponding to the key index data and displaying the key index data and the target image includes:

and fusing the converted key index data with a target chart template to obtain a target image corresponding to the key index data, and feeding the key index data and the corresponding target image back to the query terminal for display, wherein the target image is a histogram, a pie chart or a line chart.

In this embodiment, the index analysis requirement refers to the analysis purpose or analysis requirement corresponding to the key index data, and may specifically be determining, from the key index data, the trend of the data or the proportions of different data.

In this embodiment, the target chart type refers to the chart type needed to display the key index data, and may specifically be a histogram, a pie chart, a line chart, or another image.

In this embodiment, the preset chart template library is set in advance, and is used for storing different chart templates.

In this embodiment, the target chart template refers to the chart template suitable for presenting the current key index data.

In this embodiment, the configuration parameters refer to requirements of a target chart template on a value range of data to be displayed, a display format of the data, and the like.

In this embodiment, the target image refers to a final image obtained by fusing key index data with a corresponding target chart template.

The beneficial effects of the technical scheme are as follows: by determining the index analysis requirement on the key index data, the corresponding target image is generated from the index data according to the index analysis requirement, and the target image and the key index data are displayed, so that the reliability of index data analysis is improved, and the running condition of the current service can be known in time according to the index analysis result.
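The flow from index analysis requirement to target image can be sketched in Python as a chart specification builder. The template library `CHART_TEMPLATES`, its `min_points` configuration parameter, and the mapping from requirement to chart type are assumptions made for illustration; the sketch produces a plain dictionary that a rendering layer could consume rather than drawing an actual image.

```python
# Hypothetical preset chart template library keyed by analysis requirement.
CHART_TEMPLATES = {
    "trend": {"type": "line", "min_points": 2},
    "proportion": {"type": "pie", "min_points": 1},
    "comparison": {"type": "histogram", "min_points": 1},
}


def build_chart_spec(requirement, data):
    """Pick the template for the requirement, convert the key index data to
    the format the template expects, and fuse both into a chart spec."""
    template = CHART_TEMPLATES[requirement]
    if len(data) < template["min_points"]:
        raise ValueError("not enough data points for this chart template")
    labels = list(data)
    values = [float(v) for v in data.values()]
    if template["type"] == "pie":
        total = sum(values)
        if total <= 0:
            raise ValueError("pie chart needs a positive total")
        # Pie templates are assumed to expect shares rather than raw values.
        values = [round(v / total, 4) for v in values]
    return {"type": template["type"], "labels": labels, "values": values}


spec = build_chart_spec("proportion", {"north": 30, "south": 70})
```

Keeping the conversion step inside the builder mirrors the description, in which the key index data is first adapted to the configuration parameters of the template and only then fused with it.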

Example 10:

The present embodiment provides an index data analysis system based on a knowledge graph and natural language, as shown in fig. 3, including:

It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims

1. The index data analysis method based on the knowledge graph and the natural language is characterized by comprising the following steps:

step 3: matching the target requirements with the index knowledge graph to obtain key index data, generating a target image corresponding to the key index data, and displaying the key index data and the target image;

in step 1, obtaining an index data set corresponding to a service to be analyzed includes:

Searching a preset index database based on an index data identifier to be acquired to obtain an index data set, packaging the index data set, and feeding back the index data set to a data acquisition terminal based on a communication link;

in step 1, extracting data features of each index data in the index data set includes:

clustering each index data in the index data set based on the data characteristics, and obtaining a classification result corresponding to each index data based on the clustering result;

in step 2, performing natural language processing on a target request input by a user, including:

2. The method for analyzing index data based on knowledge graph and natural language according to claim 1, wherein in step 1, an index knowledge graph corresponding to an index data set is constructed based on data characteristics, comprising:

3. The method for analyzing index data based on knowledge graph and natural language according to claim 2, wherein obtaining the index knowledge graph corresponding to the index data set comprises:

4. The method for analyzing index data based on knowledge graph and natural language according to claim 1, wherein in step 2, the method for identifying the key analysis requirement of the natural language obtained by processing to obtain the target requirement comprises the following steps:

5. The method for analyzing index data based on knowledge graph and natural language according to claim 1, wherein in step 3, matching the target requirement with the index knowledge graph to obtain key index data comprises:

6. The method for analyzing index data based on knowledge graph and natural language according to claim 1, wherein in step 3, a target image corresponding to key index data is generated, and the key index data and the target image are displayed, comprising:

7. An index data analysis system based on a knowledge graph and natural language, comprising:

the analysis module is used for matching the target requirements with the index knowledge graph to obtain key index data, generating a target image corresponding to the key index data, and displaying the key index data and the target image;

wherein, the atlas construction module includes:

wherein, natural language processing module includes: