CN114637831A - Data query method based on semantic analysis and related equipment thereof


Info

Publication number
CN114637831A
Authority
CN
China
Prior art keywords
data
query
information
target information
intention
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210253921.3A
Other languages
Chinese (zh)
Inventor
纪桂锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd filed Critical Ping An International Smart City Technology Co Ltd
Priority to CN202210253921.3A priority Critical patent/CN114637831A/en
Publication of CN114637831A publication Critical patent/CN114637831A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2452 Query translation
    • G06F16/24522 Translation of natural language queries to structured queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/683 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/685 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using automatically derived transcript of audio data, e.g. lyrics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application belong to the technical field of artificial intelligence, are applied to the field of smart communities, and relate to a data query method based on semantic analysis and related equipment thereof. The data query method comprises: receiving voice query data and converting the voice query data into text data; inputting the text data into a pre-trained intention recognition model to obtain intention information, and performing an information extraction operation on the text data based on a pre-trained word vector conversion model to obtain target information; and generating a query statement according to the intention information and the target information, and running the query statement in a database to obtain result data. The result data may additionally be stored in a blockchain. The application improves data query efficiency and accuracy.

Description

Data query method based on semantic analysis and related equipment thereof
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a data query method based on semantic analysis and related equipment thereof.
Background
In recent years, data analysis BI tools have begun to offer a voice user interface (VUI) in place of traditional GUI interaction. Where a user previously had to click through various options and fields with a mouse, the user now only needs to speak one sentence, for example "What was the sales volume in the East China region in 2021?", to obtain the data result immediately. This greatly reduces the operation time and the analysis threshold and improves the user experience. However, existing tools rely mainly on keyword recognition. For example, in "What was the sales volume in the East China region in 2021?", the keywords "2021", "East China region" and "sales volume" are recognized and then used to query the database.
At present, common but more complex questions cannot be handled by keyword recognition alone. For example, the sentence "How much money did department A earn in Nanjing last month?" does not contain the keyword "sales", so the intention of the user cannot be correctly identified. Exhaustively building keyword lists requires significant manual effort and is very inefficient.
Disclosure of Invention
The embodiment of the application aims to provide a data query method based on semantic analysis and related equipment thereof, so that the data query efficiency and accuracy are improved.
In order to solve the above technical problem, an embodiment of the present application provides a data query method based on semantic analysis, which adopts the following technical solutions:
a data query method based on semantic analysis comprises the following steps:
receiving voice query data, and converting the voice query data into text data;
inputting the text data into a pre-trained intention recognition model to obtain intention information, and performing information extraction operation on the text data based on the pre-trained word vector conversion model to obtain target information;
and generating a query statement according to the intention information and the target information, and operating the query statement in a database to obtain result data.
Further, the step of performing information extraction operation on the text data to obtain target information includes:
acquiring preset category labels, wherein each category label is associated with corresponding preset basic data;
carrying out named entity recognition operation on the text data to obtain a plurality of entities;
respectively inputting the entity and the basic data into a word vector conversion model to obtain an entity vector and a basic vector;
calculating a semantic distance between the entity vector and the base vector;
and judging whether the semantic distance is greater than a semantic threshold and, if so, taking the corresponding entity as the target information of the category label associated with the basic data.
Further, the step of performing named entity recognition operation on the text data to obtain a plurality of entities includes:
and inputting the text data into a pre-trained conditional random field model, and carrying out entity labeling operation on the text data by the conditional random field model to obtain the plurality of entities.
Further, the step of generating a query statement according to the intention information and the target information includes:
and filling the intention information and the target information into a preset SQL template to obtain the query statement.
Further, the step of filling the intention information and the target information into a preset SQL template to obtain the query statement includes:
determining a vacancy in the SQL template associated with the category label, filling the target information into the corresponding vacancy according to the category label, filling the intention information into the corresponding vacancy in the SQL template, and generating the query statement.
Further, the step of filling the intention information and the target information into a preset SQL template to obtain the query statement includes:
filling the intention information and the target information into the SQL template to obtain an initial statement;
determining whether a vacancy exists in the initial statement;
if not, taking the initial statement as the query statement;
if so, determining a category label associated with the vacancy, acquiring a default value corresponding to the category label, filling the default value into the vacancy, and generating the query statement.
Further, after the step of running the query statement in the database and obtaining result data, the method further includes:
filling the result data into a corresponding preset style sheet to obtain display data;
and displaying the display data on a front-end page.
In order to solve the above technical problem, an embodiment of the present application further provides a data query device based on semantic analysis, which adopts the following technical solutions:
a data query device based on semantic analysis, comprising:
the receiving module is used for receiving voice query data and converting the voice query data into text data;
the extraction module is used for inputting the text data into a pre-trained intention recognition model to obtain intention information, and performing information extraction operation on the text data based on the pre-trained word vector conversion model to obtain target information;
and the generating module is used for generating a query statement according to the intention information and the target information, operating the query statement in a database and obtaining result data.
In order to solve the above technical problem, an embodiment of the present application further provides a computer device, which adopts the following technical solutions:
a computer device comprising a memory and a processor, the memory having stored therein computer-readable instructions, the processor implementing the steps of the semantic analysis-based data query method described above when executing the computer-readable instructions.
In order to solve the above technical problem, an embodiment of the present application further provides a computer-readable storage medium, which adopts the following technical solutions:
a computer readable storage medium having computer readable instructions stored thereon, which when executed by a processor, implement the steps of the semantic analysis based data query method described above.
Compared with the prior art, the embodiment of the application mainly has the following beneficial effects:
in the present application, the voice query data of the user is converted into text data, the intention information of the user is recognized from the text data by the intention recognition model, and an information extraction operation is performed on the text data to obtain the target information. Because the query statement is generated according to the intention information and the target information, information can be identified more accurately, more types of voice queries can be covered, the time needed to analyze the user's voice query data is reduced, and data query efficiency and accuracy are improved.
Drawings
In order to more clearly illustrate the solution of the present application, the drawings needed for describing the embodiments of the present application will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and that other drawings can be obtained by those skilled in the art without inventive effort.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a semantic analysis based data query method according to the present application;
FIG. 3 is a schematic block diagram of an embodiment of a semantic analysis based data query device according to the present application;
FIG. 4 is a schematic block diagram of one embodiment of a computer device according to the present application.
Reference numerals: 200. a computer device; 201. a memory; 202. a processor; 203. a network interface; 300. a data query device based on semantic analysis; 301. a receiving module; 302. an extraction module; 303. a generating module.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof, in the description and claims of this application and the description of the above figures are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the above-described drawings are used for distinguishing between different objects and not for describing a particular order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that the data query method based on semantic analysis provided in the embodiments of the present application is generally executed by a server/terminal device, and accordingly, the data query apparatus based on semantic analysis is generally disposed in the server/terminal device.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow diagram of one embodiment of a semantic analysis based data query method according to the present application is shown. The data query method based on semantic analysis comprises the following steps:
S1: receiving voice query data, and converting the voice query data into text data.
In this embodiment, the user speaks the voice query data into a microphone, the microphone transmits the voice query data to the server, and ASR technology in the server converts the voice query data into text data for subsequent semantic recognition. ASR (Automatic Speech Recognition) is a technology for converting human speech into text. Its aim is to enable a computer to "take dictation" of continuous speech spoken by different people, that is, to realize the conversion from "voice" to "text".
In this embodiment, the electronic device (for example, the server/terminal device shown in fig. 1) on which the semantic analysis based data query method operates may receive the voice query data through a wired connection or a wireless connection. It should be noted that the wireless connection may include, but is not limited to, a 3G/4G connection, a WiFi connection, a Bluetooth connection, a WiMAX connection, a Zigbee connection, a UWB (ultra-wideband) connection, and other wireless connection means now known or developed in the future.
S2: inputting the text data into a pre-trained intention recognition model to obtain intention information, and performing information extraction operation on the text data based on the pre-trained word vector conversion model to obtain target information.
In this embodiment, semantic recognition is performed on the text data by the intention recognition model, and the corresponding semantic tag is determined and output as the intention information. The intention recognition model is a deep-learning-based RNN (Recurrent Neural Network) model. An RNN is a type of neural network used to process sequence data and includes an input layer, a hidden layer, and an output layer. The information extraction operation includes extracting information in the text data, such as the date, department, and location, as the target information.
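As a rough illustration of the input, hidden, and output layers mentioned above, the following sketch runs the forward pass of a tiny Elman-style recurrent network over a token sequence and selects the highest-scoring semantic tag. The vocabulary, tag names, layer sizes, and randomly initialized weights are all hypothetical; an actual intention recognition model would use trained weights.

```python
import math
import random

# Hypothetical vocabulary and semantic tags, for illustration only.
VOCAB = {"<unk>": 0, "department": 1, "earned": 2, "money": 3, "nanjing": 4}
TAGS = ["sales", "headcount"]
HIDDEN = 4

rng = random.Random(0)  # fixed seed so the untrained weights are reproducible


def matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


W_xh = matrix(HIDDEN, len(VOCAB))  # input layer -> hidden layer
W_hh = matrix(HIDDEN, HIDDEN)      # hidden layer -> hidden layer (recurrence)
W_hy = matrix(len(TAGS), HIDDEN)   # hidden layer -> output layer


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_intent(tokens):
    """Read the tokens one by one and classify the final hidden state."""
    h = [0.0] * HIDDEN
    for token in tokens:
        idx = VOCAB.get(token, VOCAB["<unk>"])
        # With a one-hot input, W_xh @ x is simply column idx of W_xh.
        h = [
            math.tanh(W_xh[i][idx] + sum(W_hh[i][j] * h[j] for j in range(HIDDEN)))
            for i in range(HIDDEN)
        ]
    scores = [sum(W_hy[k][j] * h[j] for j in range(HIDDEN)) for k in range(len(TAGS))]
    probs = softmax(scores)
    best = max(range(len(TAGS)), key=probs.__getitem__)
    return TAGS[best], probs


tag, probs = predict_intent(["department", "earned", "money", "nanjing"])
```

Because the weights here are untrained, the predicted tag is arbitrary; the sketch only shows how the hidden state carries information from one token to the next before the output layer produces a tag distribution.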
Specifically, the step of performing information extraction operation on the text data to obtain the target information includes:
acquiring preset category labels, wherein each category label is associated with corresponding preset basic data;
carrying out named entity recognition operation on the text data to obtain a plurality of entities;
respectively inputting the entity and the basic data into a word vector conversion model to obtain an entity vector and a basic vector;
calculating a semantic distance between the entity vector and the base vector;
and judging whether the semantic distance is greater than a semantic threshold and, if so, taking the corresponding entity as the target information of the category label associated with the basic data.
In this embodiment, if the semantic distance is less than or equal to the semantic threshold, it is determined that the corresponding category label has no corresponding target information. Named entity recognition (NER) is performed on the words of the text data. Category labels are preset in the application, for example: department. Basic data is added under each category label, for example: the finance department. Word vectors are obtained through the word vector conversion model, which may be a BERT model. The semantic distance between the entity vector and the basic vector is calculated, and when the semantic distance is greater than the semantic threshold, the word is regarded as belonging to the same category as the basic data and is used as the target information of the corresponding category label; for example, the legal department or the marketing department is used as target information of the department label. The semantic distance is the cosine similarity between the entity vector and the basic vector, so a larger value indicates closer meaning.
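The matching step described above can be sketched as follows. The three-dimensional vectors stand in for the output of the word vector conversion model (for example BERT) and are invented for illustration, as are the label names and the threshold of 0.9.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical stand-in for the word vector conversion model.
TOY_VECTORS = {
    "finance department": [0.9, 0.1, 0.0],
    "legal department": [0.8, 0.2, 0.1],
    "Nanjing": [0.0, 0.1, 0.9],
    "Shanghai": [0.1, 0.0, 0.8],
}

# Category label -> preset basic data (assumed values).
BASIC_DATA = {"department": "finance department", "location": "Shanghai"}


def match_entities(entities, threshold=0.9):
    """Assign each entity to the category labels whose basic data it resembles."""
    target_info = {}
    for entity in entities:
        for label, basic in BASIC_DATA.items():
            similarity = cosine_similarity(TOY_VECTORS[entity], TOY_VECTORS[basic])
            if similarity > threshold:
                target_info[label] = entity
    return target_info


result = match_entities(["legal department", "Nanjing"])
```

With these toy vectors, "legal department" lands under the department label and "Nanjing" under the location label, while the cross pairs fall well below the threshold and are ignored.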
The step of obtaining a plurality of entities by performing named entity recognition operation on the text data comprises:
and inputting the text data into a pre-trained conditional random field model, and carrying out entity labeling operation on the text data by the conditional random field model to obtain the plurality of entities.
In this embodiment, the text data is input into a CRF (Conditional Random Field) model. The conditional random field model performs a word segmentation operation on the text data to obtain a plurality of words and performs an entity labeling operation on the words, for example labeling them as person names, place names, or organization names. Finally, it is determined whether each labeled word is an entity such as a person name entity, a place name entity, an organization entity, or a date and time. The conditional random field determines the words that belong to entities according to the labeling result.
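To make the last step concrete, the sketch below collects entities from a labeling result. It assumes the conditional random field emits one BIO-style tag per word (B- begins an entity, I- continues it, O is outside any entity); the application does not specify the tagging scheme, and the tag names and example tags here are hypothetical.

```python
def collect_entities(tokens, tags):
    """Group consecutive B-/I- tagged tokens into (text, type) entities."""
    entities = []
    current_tokens, current_type = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity that was open.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O" tag or an inconsistent I- tag: close the open entity, if any.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Hand-written tags standing in for the output of a trained CRF model.
tokens = ["department", "A", "earned", "money", "in", "Nanjing", "last", "month"]
tags = ["B-ORG", "I-ORG", "O", "O", "O", "B-LOC", "B-DATE", "I-DATE"]
entities = collect_entities(tokens, tags)
```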
S3: and generating a query statement according to the intention information and the target information, and operating the query statement in a database to obtain result data.
In this embodiment, a query statement is generated based on the intention information and the target information, the query statement is executed to query data from the database, and the result data is output. The application can identify information more accurately, cover more types of voice query data, and reduce the time needed to analyze the user's voice query data.
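A minimal sketch of running a generated query statement follows, using an in-memory SQLite database as a stand-in for the database. The table name, column names, and sample rows are assumptions made for illustration; the extracted values are bound as parameters rather than concatenated into the statement.

```python
import sqlite3

# Hypothetical schema and sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (month TEXT, department TEXT, city TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("202109", "A", "Nanjing", 23000),
        ("202109", "B", "Nanjing", 5000),
        ("202108", "A", "Nanjing", 7000),
    ],
)


def run_query(connection, sql, params):
    """Execute the query statement and return the rows as dictionaries."""
    cursor = connection.execute(sql, params)
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


sql = (
    "SELECT month, department, city, SUM(amount) AS sales_amount "
    "FROM sales WHERE month = ? AND department = ? AND city = ? "
    "GROUP BY month, department, city"
)
result_data = run_query(conn, sql, ("202109", "A", "Nanjing"))
```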
Specifically, the step of generating a query statement according to the intention information and the target information includes:
and filling the intention information and the target information into a preset SQL template to obtain the query statement.
In this embodiment, a preset SQL template is called and the intention information and the target information are filled into it, so as to obtain the query statement (an SQL statement).
The step of filling the intention information and the target information into a preset SQL template to obtain the query statement comprises the following steps:
determining a vacancy in the SQL template associated with the category label, filling the target information into the corresponding vacancy according to the category label, filling the intention information into the corresponding vacancy in the SQL template, and generating the query statement.
In this embodiment, the target information corresponding to different category labels has been extracted from the text data in the above steps. In the generation of the query statement in this step, the category labels are associated with the vacancies in the SQL template, so the target information is filled into the corresponding vacancies in the SQL template according to the category labels, and the intention information is likewise filled into its corresponding vacancy, thereby generating the query statement. The vacancy corresponding to the intention information is a vacancy marked in advance as the intention in the SQL template; it is identified by this mark and then filled.
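One way to realize label-keyed vacancies is sketched below. The template text, the `{intent}` marker, the mapping from intention to column name, and the label names are all hypothetical. The intention selects a column through an allow-list, because SQL identifiers cannot be bound as parameters, while the target information is passed as named parameters.

```python
# Hypothetical template: {intent} marks the intention vacancy, and the
# :department, :location, and :date names mark the category-label vacancies.
SQL_TEMPLATE = (
    "SELECT SUM({intent}) FROM sales "
    "WHERE department = :department AND city = :location AND month = :date"
)

# Allow-list mapping each intention to a column name (assumed schema).
INTENT_COLUMNS = {"sales": "amount", "profit": "profit"}

CATEGORY_LABELS = ("department", "location", "date")


def fill_template(intent, target_info):
    """Return the query statement and its bound parameters."""
    column = INTENT_COLUMNS[intent]  # raises KeyError for an unknown intention
    statement = SQL_TEMPLATE.format(intent=column)
    params = {label: target_info[label] for label in CATEGORY_LABELS}
    return statement, params


statement, params = fill_template(
    "sales", {"department": "A", "location": "Nanjing", "date": "202109"}
)
```

The returned pair can be passed directly to a database driver that supports named parameters, such as `sqlite3`'s `execute(statement, params)`.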
In addition, the step of filling the intention information and the target information into a preset SQL template to obtain the query statement includes:
filling the intention information and the target information into the SQL template to obtain an initial statement;
determining whether a vacancy exists in the initial statement;
if not, taking the initial statement as the query statement;
if so, determining a category label associated with the vacancy, acquiring a default value corresponding to the category label, filling the default value into the vacancy, and generating the query statement.
In this embodiment, some item of information may be missing from the initial statement; for example, the department and location information can be extracted but the date information cannot. In this case a default value is used. For example, if there is no department information, the query is considered to cover all departments, and the names of all departments are filled into the vacancy as the default value. In this way, a single SQL template can be applied to all cases.
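The vacancy check and default filling can be sketched as below. The `{label}` placeholder syntax, the template, and the default lists are assumptions. Values are inlined with simple quote escaping only to keep the example short; a production system would more likely bind them as parameters.

```python
import re

# Hypothetical template with one vacancy per category label.
TEMPLATE = (
    "SELECT SUM(amount) FROM sales WHERE department IN ({department}) "
    "AND city IN ({location}) AND month IN ({date})"
)

# Assumed default values per category label, e.g. all department names.
DEFAULTS = {
    "department": ["A", "B", "C"],
    "location": ["Nanjing", "Shanghai"],
    "date": ["202109"],
}

VACANCY = re.compile(r"\{(\w+)\}")


def quote(value):
    """Quote a string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def fill_with_defaults(template, target_info):
    # First pass: fill the vacancies for which target information exists.
    def fill_known(match):
        label = match.group(1)
        return quote(target_info[label]) if label in target_info else match.group(0)

    initial_statement = VACANCY.sub(fill_known, template)

    # If no vacancy remains, the initial statement is the query statement.
    if not VACANCY.search(initial_statement):
        return initial_statement

    # Otherwise fill each remaining vacancy with its label's default value.
    def fill_default(match):
        return ", ".join(quote(v) for v in DEFAULTS[match.group(1)])

    return VACANCY.sub(fill_default, initial_statement)


query = fill_with_defaults(TEMPLATE, {"location": "Nanjing", "date": "202109"})
```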
In some optional implementations of this embodiment, after step S1 (receiving the voice query data and converting the voice query data into text data) and before step S2 (inputting the text data into the pre-trained intention recognition model to obtain the intention information), the electronic device may further perform the following steps:
receiving an initial intent recognition model and training data, wherein the training data comprises a query statement and a corresponding semantic tag;
and training the initial intention recognition model through the training data until the initial intention recognition model converges to obtain the pre-trained intention recognition model.
In this embodiment, the intention recognition model is trained by defining semantic tags and labeling an appropriate number of query sentences, so that it can recognize what the user wants to look up. For example, for "How much money did department A earn in Nanjing last month?", the intention recognition model determines that the user wants to look up [sales]. Specifically, the labeled query sentences are used as the training data set, and the initial intention recognition model performs processing such as data cleaning, data conversion, NLP word segmentation, and vectorization on the data in the training data set. The model effect is optimized in the following ways: the model parameters are adjusted, and the model is trained multiple times for different service scenarios and data until the optimal parameters are obtained; and more training data is added so that the model learns more features and achieves a better recognition effect.
In some alternative implementations, after step S3 (running the query statement in the database and obtaining the result data), the electronic device may perform the following steps:
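The train-until-convergence loop can be illustrated with a deliberately simple stand-in. The sketch below is not the RNN of the application: it swaps in a multiclass perceptron over bag-of-words features so that the example stays short and dependency-free. The labeled sentences and semantic tags are invented, and "convergence" here means one full pass over the training data without a misclassification.

```python
# Hypothetical labeled query sentences (sentence, semantic tag).
TRAINING_DATA = [
    ("how much money did department A earn in Nanjing last month", "sales"),
    ("what was the sales volume in east china in 2021", "sales"),
    ("how many people work in department A", "headcount"),
    ("what is the number of employees in Nanjing", "headcount"),
]


def featurize(sentence):
    """Lowercase and split on whitespace: a crude word segmentation step."""
    return set(sentence.lower().split())


def train(data, max_epochs=100):
    labels = sorted({label for _, label in data})
    weights = {label: {} for label in labels}

    def score(label, features):
        return sum(weights[label].get(f, 0.0) for f in features)

    def predict(features):
        return max(labels, key=lambda label: score(label, features))

    for _ in range(max_epochs):
        errors = 0
        for sentence, gold in data:
            features = featurize(sentence)
            guess = predict(features)
            if guess != gold:
                errors += 1
                # Perceptron update: reward the gold tag, penalize the guess.
                for f in features:
                    weights[gold][f] = weights[gold].get(f, 0.0) + 1.0
                    weights[guess][f] = weights[guess].get(f, 0.0) - 1.0
        if errors == 0:  # converged on the training data
            break

    return lambda sentence: predict(featurize(sentence))


classify = train(TRAINING_DATA)
intent = classify("how much money did department B earn in Shanghai")
```

Note that the new sentence contains no "sales" keyword; the tag is inferred from words such as "money" and "earn" that co-occurred with the sales tag during training.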
filling the result data into a corresponding preset style sheet to obtain display data;
and displaying the display data on a front-end page.
In this embodiment, the server fills the result data into the corresponding preset style sheet to obtain the display data, and displays the display data on the front-end page. Or the server transmits the result data to a front-end page; and the front-end page receives the result data, fills the result data into a corresponding preset style sheet (namely a preset chart style), obtains display data and displays the display data. The display data are shown in table 1 for example:
TABLE 1
Date     Department   City      Sales amount
202109   A            Nanjing   23000
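Filling result data into a preset style sheet might look like the following sketch, in which the "style sheet" is reduced to an ordered list of result keys and display titles and the output is a plain-text table. The key names and the `render` helper are hypothetical; a real front end would more likely render a chart or an HTML table.

```python
# Hypothetical style sheet: (key in the result data, column title to display).
STYLE_SHEET = [
    ("month", "Date"),
    ("department", "Department"),
    ("city", "City"),
    ("sales_amount", "Sales amount"),
]


def render(result_data, style_sheet):
    """Fill the rows into the style sheet and return aligned display text."""
    headers = [title for _, title in style_sheet]
    rows = [[str(row[key]) for key, _ in style_sheet] for row in result_data]
    # Each column is as wide as its longest cell, header included.
    widths = [max(len(cell) for cell in column) for column in zip(headers, *rows)]

    def format_line(cells):
        padded = (cell.ljust(width) for cell, width in zip(cells, widths))
        return "  ".join(padded).rstrip()

    return "\n".join(format_line(line) for line in [headers] + rows)


display_data = render(
    [{"month": "202109", "department": "A", "city": "Nanjing", "sales_amount": 23000}],
    STYLE_SHEET,
)
print(display_data)
```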
In the present application, the voice query data of the user is converted into text data, the intention information of the user is recognized from the text data by the intention recognition model, and an information extraction operation is performed on the text data to obtain the target information. Because the query statement is generated according to the intention information and the target information, information can be identified more accurately, more types of voice queries can be covered, the time needed to analyze the user's voice query data is reduced, and data query efficiency and accuracy are improved.
It is emphasized that, in order to further ensure the privacy and security of the result data, the result data may also be stored in a node of a block chain.
The blockchain referred to in this application is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods, where each data block contains information on a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
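The idea of data blocks linked by a cryptographic method can be shown with a single-process hash chain. This is only a sketch of the linking and validity check: it has no network, consensus mechanism, or platform layers, and the block fields and helper names are assumptions.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64


def block_hash(index, prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    body = json.dumps(
        {"index": index, "prev_hash": prev_hash, "payload": payload}, sort_keys=True
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a block whose hash depends on the previous block's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    index = len(chain)
    chain.append(
        {
            "index": index,
            "prev_hash": prev_hash,
            "payload": payload,
            "hash": block_hash(index, prev_hash, payload),
        }
    )


def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    prev_hash = GENESIS_PREV_HASH
    for index, block in enumerate(chain):
        if block["index"] != index or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(index, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True


chain = []
append_block(chain, {"department": "A", "city": "Nanjing", "sales_amount": 23000})
append_block(chain, {"department": "B", "city": "Nanjing", "sales_amount": 5000})
```

Changing any stored payload afterwards makes `is_valid` return `False`, because the recomputed hash no longer matches the stored one.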
The embodiment of the application can acquire and process related data based on an artificial intelligence technology. Among them, Artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
The method and the device can be applied to the field of smart communities, and therefore the construction of smart cities is promoted.
Those skilled in the art will appreciate that all or part of the processes of the methods in the above embodiments can be implemented by computer readable instructions that direct the relevant hardware. The computer readable instructions can be stored in a computer readable storage medium and, when executed, can include the processes of the method embodiments described above. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk or a Read-Only Memory (ROM), or may be a Random Access Memory (RAM).
It should be understood that, although the steps in the flowcharts of the figures are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be performed at different times; nor are they necessarily performed sequentially, and they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
With further reference to fig. 3, as an implementation of the method shown in fig. 2, the present application provides an embodiment of a data query apparatus based on semantic analysis, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be applied to various electronic devices.
As shown in fig. 3, the data query device 300 based on semantic analysis according to the embodiment includes: a receiving module 301, an extracting module 302 and a generating module 303. Wherein: the receiving module 301 is configured to receive voice query data, and convert the voice query data into text data; the extraction module 302 is configured to input the text data into a pre-trained intention recognition model to obtain intention information, and perform information extraction operation on the text data based on the pre-trained word vector conversion model to obtain target information; the generating module 303 is configured to generate a query statement according to the intention information and the target information, and run the query statement in a database to obtain result data.
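The division of work among the three modules can be sketched as a small pipeline. This is a minimal sketch, not the applicant's implementation: the class name, the field names and the idea of injecting each stage as a callable are assumptions made for illustration, and the speech recognizer, intention recognition model and database are left as placeholders to be supplied by the caller.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class SemanticQueryDevice:
    """Hypothetical wiring of modules 301-303; every stage is injected."""
    speech_to_text: Callable[[bytes], str]              # receiving module 301
    recognize_intent: Callable[[str], str]              # extraction module 302
    extract_targets: Callable[[str], Dict[str, str]]    # extraction module 302
    build_query: Callable[[str, Dict[str, str]], Tuple[str, Dict[str, str]]]  # module 303
    run_query: Callable[[str, Dict[str, str]], List[Any]]                     # module 303

    def handle(self, voice_query_data: bytes) -> List[Any]:
        # 301: convert the voice query data into text data.
        text = self.speech_to_text(voice_query_data)
        # 302: obtain intention information and target information from the text.
        intent = self.recognize_intent(text)
        targets = self.extract_targets(text)
        # 303: generate the query statement and run it to obtain result data.
        statement, params = self.build_query(intent, targets)
        return self.run_query(statement, params)
```

Keeping each stage behind a plain callable means a stage can be replaced (for example, a different speech recognizer) without touching the others.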
In this embodiment, the user's voice query data is converted into text data, the user's intention information is then recognized from the text data by the intention recognition model, and an information extraction operation is performed on the text data to obtain the target information. Because the query statement is generated from both the intention information and the target information, information is identified more accurately, more types of voice queries can be covered, the time needed to parse the user's voice query data is reduced, and the efficiency and accuracy of data queries are improved.
The extraction module 302 includes an acquisition submodule, a recognition submodule, an input submodule, a calculation submodule and a judgment submodule. The acquisition submodule is configured to acquire preset category labels, each of which is associated with corresponding preset basic data. The recognition submodule is configured to perform a named entity recognition operation on the text data to obtain a plurality of entities. The input submodule is configured to input the entities and the basic data separately into a word vector conversion model to obtain entity vectors and base vectors. The calculation submodule is configured to calculate the semantic distance between an entity vector and a base vector. The judgment submodule is configured to judge whether the semantic distance is greater than a semantic threshold and, if so, to take the corresponding entity as the target information of the category label associated with the basic data.
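The matching performed by the calculation and judgment submodules can be sketched as follows. The application does not say how the semantic distance is computed; this sketch assumes it is a cosine similarity score, so that a larger value means a closer match, which is consistent with accepting entities whose score is greater than the threshold. The function names, the `embed` callable standing in for the word vector conversion model, and the toy vectors are all hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_entities(
    entities: List[str],
    base_data: Dict[str, str],          # category label -> preset basic data
    embed: Callable[[str], Vector],     # stand-in for the word vector conversion model
    threshold: float,
) -> Dict[str, List[str]]:
    """Assign each entity to every category label whose score exceeds the threshold."""
    targets: Dict[str, List[str]] = {}
    for label, base_text in base_data.items():
        base_vector = embed(base_text)
        for entity in entities:
            score = cosine_similarity(embed(entity), base_vector)
            if score > threshold:
                targets.setdefault(label, []).append(entity)
    return targets


# Toy two-dimensional vectors used only to make the example runnable.
TOY_VECTORS = {
    "Building 3": [0.9, 0.1],
    "last month": [0.1, 0.9],
    "building": [1.0, 0.0],
    "date": [0.0, 1.0],
}
```

With the toy vectors, `match_entities(["Building 3", "last month"], {"location": "building", "time": "date"}, TOY_VECTORS.__getitem__, 0.8)` assigns "Building 3" to the `location` label and "last month" to the `time` label.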
In some optional implementations of this embodiment, the recognition submodule is further configured to input the text data into a pre-trained conditional random field model, which performs an entity labeling operation on the text data to obtain the plurality of entities.
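A conditional random field model typically outputs one label per token, and the entities are then read off the label sequence. The sketch below shows only that decoding step. It assumes a BIO labeling scheme (`B-` begins an entity, `I-` continues it, `O` is outside), which the application does not specify; the function name and the label types are hypothetical.

```python
from typing import List, Optional, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group tokens into (entity text, entity type) pairs from BIO tags."""
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Emit the entity collected so far, if any, and reset the buffer.
        nonlocal current_tokens, current_type
        if current_tokens and current_type is not None:
            entities.append((" ".join(current_tokens), current_type))
        current_tokens, current_type = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            current_tokens.append(token)
        else:
            # "O" tags and stray "I-" tags without a matching "B-" end the entity.
            flush()
    flush()
    return entities
```

For the tokens `show repairs in Building 3 last month` labeled `O O O B-LOC I-LOC B-TIME I-TIME`, the function returns the entities "Building 3" (type `LOC`) and "last month" (type `TIME`).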
In some optional implementations of this embodiment, the generating module 303 is further configured to fill the intention information and the target information into a preset SQL template to obtain the query statement.
In some optional implementations of this embodiment, the generating module 303 is further configured to determine the vacancies in the SQL template that are associated with the category labels, fill the target information into the corresponding vacancies according to the category labels, and fill the intention information into its corresponding vacancy in the SQL template, thereby generating the query statement.
The generating module 303 includes a filling submodule, a determining submodule, a first generation submodule and a second generation submodule. The filling submodule is configured to fill the intention information and the target information into the SQL template to obtain an initial statement. The determining submodule is configured to determine whether the initial statement still has a vacancy. The first generation submodule is configured to take the initial statement as the query statement when no vacancy exists. The second generation submodule is configured, when a vacancy exists, to determine the category label associated with the vacancy, acquire the default value corresponding to that category label, and fill the default value into the vacancy to generate the query statement.
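Template filling with default values can be sketched as below. Everything concrete here is an assumption for illustration: the `repairs` table, the template text, the intent names and the default values are hypothetical. The sketch also makes a design choice the application does not mention: the intention is mapped to a whitelisted SQL expression, and the target information is passed as named parameters rather than pasted into the statement text, which guards against SQL injection from spoken input.

```python
import re
import sqlite3
from typing import Dict, List, Tuple

# Hypothetical preset SQL template: one vacancy for the intention and one
# named vacancy per category label.
SQL_TEMPLATE = (
    "SELECT {intent_expr} FROM repairs "
    "WHERE building = :building AND month = :month"
)
INTENT_EXPRESSIONS = {"count": "COUNT(*)", "list": "building, month"}
DEFAULTS = {"building": "Building 1", "month": "2022-03"}  # per category label
SLOT_PATTERN = re.compile(r":([A-Za-z_]+)")


def build_query(intent: str, targets: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Fill the intention and target information; use defaults for vacancies."""
    if intent not in INTENT_EXPRESSIONS:
        raise ValueError(f"unsupported intent: {intent!r}")
    statement = SQL_TEMPLATE.format(intent_expr=INTENT_EXPRESSIONS[intent])
    params: Dict[str, str] = {}
    for label in SLOT_PATTERN.findall(statement):
        if label in targets:
            params[label] = targets[label]
        else:
            # Vacancy left after filling: fall back to the label's default value.
            params[label] = DEFAULTS[label]
    return statement, params


def run_query(conn: sqlite3.Connection, intent: str, targets: Dict[str, str]) -> List[tuple]:
    """Generate the query statement and run it in the database."""
    statement, params = build_query(intent, targets)
    return conn.execute(statement, params).fetchall()
```

If the spoken query names only a building, `build_query("count", {"building": "Building 1"})` leaves the `month` vacancy unfilled by target information, so the default for that category label is used instead.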
In some optional implementations of this embodiment, the apparatus 300 further includes a first training module and a second training module. The first training module is configured to receive an initial intention recognition model and training data, where the training data includes query statements and corresponding semantic labels. The second training module is configured to train the initial intention recognition model on the training data until the model converges, yielding the pre-trained intention recognition model.
In some optional implementations of this embodiment, the apparatus 300 further includes a display data generating module and a display module. The display data generating module is configured to fill the result data into a corresponding preset style sheet to obtain display data. The display module is configured to display the display data on a front-end page.
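Filling result data into a preset style sheet can be sketched with a simple HTML template. The template markup, the CSS class name and the function name are assumptions for illustration; the application does not describe the format of the style sheet.

```python
import html
from string import Template
from typing import Iterable, Sequence

# Hypothetical preset style sheet: an HTML table with a caption.
TABLE_STYLE = Template(
    '<table class="$css_class"><caption>$title</caption>$rows</table>'
)


def render_result(
    result_rows: Iterable[Sequence[object]],
    title: str,
    css_class: str = "query-result",
) -> str:
    """Fill result rows into the preset template, escaping every cell."""
    rows = "".join(
        "<tr>" + "".join(f"<td>{html.escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in result_rows
    )
    return TABLE_STYLE.substitute(
        css_class=css_class, title=html.escape(title), rows=rows
    )
```

Escaping each cell keeps database values from being interpreted as markup when the display data is shown on the front-end page.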
In the present application, the user's voice query data is converted into text data, the user's intention information is recognized from the text data by the intention recognition model, and an information extraction operation is performed on the text data to obtain the target information. Because the query statement is generated from both the intention information and the target information, information is identified more accurately, more types of voice queries can be covered, the time needed to parse the user's voice query data is reduced, and the efficiency and accuracy of data queries are improved.
To solve the above technical problem, an embodiment of the present application further provides a computer device. Refer to fig. 4, which is a block diagram of the basic structure of the computer device according to this embodiment.
The computer device 200 includes a memory 201, a processor 202 and a network interface 203 that are communicatively connected to each other via a system bus. Note that only a computer device 200 having components 201-203 is shown; it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead. As those skilled in the art will understand, the computer device here is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer device may be a desktop computer, a notebook computer, a palmtop computer, a cloud server or another computing device. The computer device can interact with a user through a keyboard, a mouse, a remote controller, a touch panel, a voice control device or the like.
The memory 201 includes at least one type of readable storage medium, including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read-Only Memory (ROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Programmable Read-Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 201 may be an internal storage unit of the computer device 200, such as a hard disk or internal memory of the computer device 200. In other embodiments, the memory 201 may be an external storage device of the computer device 200, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card provided on the computer device 200. The memory 201 may also include both an internal storage unit and an external storage device of the computer device 200. In this embodiment, the memory 201 is generally used to store the operating system and the various application software installed on the computer device 200, such as the computer readable instructions of the data query method based on semantic analysis. The memory 201 may also be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor 202 may be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor or another data processing chip. The processor 202 is generally used to control the overall operation of the computer device 200. In this embodiment, the processor 202 is configured to execute the computer readable instructions stored in the memory 201 or to process data, for example to execute the computer readable instructions of the data query method based on semantic analysis.
The network interface 203 may include a wireless network interface or a wired network interface, and is generally used to establish a communication connection between the computer device 200 and other electronic devices.
In this embodiment, the query statement is generated from the intention information and the target information, so information is identified more accurately, more types of voice queries can be covered, the time needed to parse the user's voice query data is reduced, and the efficiency and accuracy of data queries are improved.
The present application further provides another embodiment, namely a computer readable storage medium storing computer readable instructions that are executable by at least one processor to cause the at least one processor to perform the steps of the data query method based on semantic analysis described above.
In this embodiment, the query statement is likewise generated from the intention information and the target information, so information is identified more accurately, more types of voice queries can be covered, the time needed to parse the user's voice query data is reduced, and the efficiency and accuracy of data queries are improved.
From the description of the foregoing embodiments, those skilled in the art will clearly understand that the methods of the foregoing embodiments may be implemented by software plus a necessary general-purpose hardware platform, and may of course also be implemented by hardware, although in many cases the former is the better implementation. On this understanding, the technical solutions of the present application may be embodied in the form of a software product that is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disk) and includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device or the like) to execute the methods of the embodiments of the present application.
It is to be understood that the embodiments described above are only some, not all, of the embodiments of the present application, and that the accompanying drawings show preferred embodiments without limiting the scope of the application. The present application may be embodied in many different forms; these embodiments are provided so that the disclosure will be understood thoroughly and completely. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in those embodiments or substitute equivalents for some of their technical features. Any equivalent structure made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, likewise falls within the protection scope of the present application.

Claims (10)

1. A data query method based on semantic analysis is characterized by comprising the following steps:
receiving voice query data, and converting the voice query data into text data;
inputting the text data into a pre-trained intention recognition model to obtain intention information, and performing an information extraction operation on the text data based on a pre-trained word vector conversion model to obtain target information;
and generating a query statement according to the intention information and the target information, and running the query statement in a database to obtain result data.
2. The data query method based on semantic analysis according to claim 1, wherein the step of performing information extraction operation on the text data to obtain target information comprises:
acquiring preset category labels, wherein each category label is associated with corresponding preset basic data;
carrying out named entity recognition operation on the text data to obtain a plurality of entities;
respectively inputting the entity and the basic data into a word vector conversion model to obtain an entity vector and a basic vector;
calculating a semantic distance between the entity vector and the base vector;
and judging whether the semantic distance is greater than a semantic threshold value, if so, taking the corresponding entity as target information of a category label associated with the basic data.
3. The method for querying data based on semantic analysis according to claim 2, wherein the step of performing named entity recognition operation on the text data to obtain a plurality of entities comprises:
and inputting the text data into a pre-trained conditional random field model, and carrying out entity labeling operation on the text data by the conditional random field model to obtain the plurality of entities.
4. The semantic analysis-based data query method according to claim 1, wherein the step of generating a query statement from the intention information and the target information comprises:
and filling the intention information and the target information into a preset SQL template to obtain the query statement.
5. The data query method based on semantic analysis according to claim 4, wherein the step of filling the intention information and the target information into a preset SQL template to obtain the query statement comprises:
determining a vacancy in the SQL template associated with the category label, filling the target information into the corresponding vacancy according to the category label, filling the intention information into the corresponding vacancy in the SQL template, and generating the query statement.
6. The data query method based on semantic analysis according to claim 4, wherein the step of filling the intention information and the target information into a preset SQL template to obtain the query statement comprises:
filling the intention information and the target information into the SQL template to obtain an initial statement;
determining whether a vacancy exists in the initial sentence;
if not, taking the initial statement as the query statement;
if yes, determining a category label associated with the vacancy, acquiring a default value corresponding to the category label, filling the default value into the vacancy, and generating the query statement.
7. The semantic analysis-based data query method according to claim 1, further comprising, after the step of executing the query statement in the database to obtain result data:
filling the result data into a corresponding preset style sheet to obtain display data;
and displaying the display data on a front-end page.
8. A data query device based on semantic analysis, comprising:
a receiving module, configured to receive voice query data and convert the voice query data into text data;
an extraction module, configured to input the text data into a pre-trained intention recognition model to obtain intention information, and to perform an information extraction operation on the text data based on a pre-trained word vector conversion model to obtain target information;
and a generating module, configured to generate a query statement according to the intention information and the target information, and to run the query statement in a database to obtain result data.
9. A computer device comprising a memory and a processor, wherein the memory stores computer readable instructions, and the processor, when executing the computer readable instructions, implements the steps of the semantic analysis based data query method according to any one of claims 1 to 7.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon computer readable instructions, which when executed by a processor, implement the steps of the semantic analysis based data query method according to any one of claims 1 to 7.
CN202210253921.3A 2022-03-15 2022-03-15 Data query method based on semantic analysis and related equipment thereof Pending CN114637831A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210253921.3A CN114637831A (en) 2022-03-15 2022-03-15 Data query method based on semantic analysis and related equipment thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210253921.3A CN114637831A (en) 2022-03-15 2022-03-15 Data query method based on semantic analysis and related equipment thereof

Publications (1)

Publication Number Publication Date
CN114637831A true CN114637831A (en) 2022-06-17

Family

ID=81948390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210253921.3A Pending CN114637831A (en) 2022-03-15 2022-03-15 Data query method based on semantic analysis and related equipment thereof

Country Status (1)

Country Link
CN (1) CN114637831A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117591547A (en) * 2024-01-18 2024-02-23 中昊芯英(杭州)科技有限公司 Database query method and device, terminal equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination