CN110737687A

CN110737687A - Data query method, device, equipment and storage medium

Info

Publication number: CN110737687A
Application number: CN201910846566.9A
Authority: CN
Inventors: 魏佳
Original assignee: Ping An Puhui Enterprise Management Co Ltd
Current assignee: Ping An Puhui Enterprise Management Co Ltd
Priority date: 2019-09-06
Filing date: 2019-09-06
Publication date: 2020-01-31

Abstract

The invention relates to the technical field of data query, and discloses a data query method, device, equipment and storage medium.

Description

Data query method, device, equipment and storage medium

Technical Field

The invention relates to the technical field of data query, and in particular to a data query method, device, equipment and storage medium.

Background

Sublime Text is a code editor, and also an advanced text editor for HyperText Markup Language (HTML) and prose. It is favored by many users because it can query specified data across an entire project.

However, in practical application, when Sublime Text is used to search for specified data, the search result contains not only the data specified by the user but also other data that contains the specified data.

The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.

Disclosure of Invention

The main object of the invention is to provide a data query method, device, equipment and storage medium, aiming to solve the technical problem in the prior art that data meeting a user's requirements cannot be queried rapidly and accurately according to the user's actual needs.

To achieve the above object, the present invention provides a data query method, including the following steps:

receiving a data query request triggered by a user, and acquiring a text to be queried and a problem to be queried corresponding to the data query request;

determining a data structure of the data which needs to be queried by the user according to the problem to be queried, wherein the data structure is used for limiting the position of the data which needs to be queried by the user in an initial regular expression;

searching an initial regular expression from a pre-constructed regular expression management library according to the data structure;

generating a target regular expression according to the problem to be queried and the initial regular expression;

and searching data conforming to the data structure from the text to be queried by adopting the target regular expression.

Preferably, the step of determining a data structure of the data that the user needs to query according to the question to be queried includes:

extracting at least one keyword from the question to be queried based on a keyword extraction technology;

performing semantic analysis on each keyword to obtain semantic information of each keyword;

and determining a data structure of the data which needs to be inquired by the user by combining the semantic information of each keyword.

Preferably, before the step of extracting at least one keyword from the question to be queried based on the keyword extraction technology, the method further includes:

determining the format of the question to be queried;

when the question to be inquired is in a voice format, converting the question to be inquired in the voice format into the question to be inquired in a text format based on a voice recognition technology;

when the question to be inquired is in a picture format, converting the question to be inquired in the picture format into the question to be inquired in a text format based on an image recognition character technology;

wherein the step of extracting at least one keyword from the question to be queried based on the keyword extraction technology comprises:

extracting at least one keyword from the question to be queried in text format based on the keyword extraction technology.

Preferably, before the step of converting the question to be queried in the picture format into the question to be queried in the text format based on an image recognition character technology when the question to be queried is in the picture format, the method further includes:

converting the to-be-inquired question of the picture format from a red, green and blue (RGB) color space to a luminance chromaticity coordinate LUV color space;

segmenting the question to be inquired of the picture format in the LUV color space;

converting the question to be inquired in the picture format into a gray-scale image on the basis of picture segmentation;

when the question to be inquired is in a picture format, the step of converting the question to be inquired in the picture format into the question to be inquired in a text format based on an image recognition character technology comprises the following steps:

traversing characters in the gray-scale image, and extracting the contour features of the current characters based on an image recognition character technology;

carrying out template rough classification and template fine matching on the contour features of the current character and templates in a pre-constructed feature template library to determine a computer character corresponding to the current character, wherein the contour features of the character are recorded in the templates, and a corresponding relation exists between the contour features of the character and the computer character;

and arranging the computer characters in sequence to obtain the problem to be inquired in a text format.

Preferably, before the step of searching for the initial regular expression from the pre-constructed regular expression management library according to the data structure, the method further includes:

acquiring data searching records recorded by each big data platform;

extracting information recorded in the data search record, and taking the extracted information as a first input parameter;

acquiring a preset data structure set, and taking a standard data structure recorded in the data structure set as a second input parameter;

inputting the first input parameter and the second input parameter into a pre-constructed target data structure analysis model respectively, so that the target data structure analysis model analyzes the first input parameter by taking the second input parameter as an analysis standard, and determines a target data structure corresponding to the first input parameter;

and constructing grammar based on the regular expression, and generating an initial regular expression corresponding to the target data structure according to the target data structure and a pre-stored regular expression character table.

Preferably, before the step of inputting the first input parameter and the second input parameter into the pre-constructed target data structure analysis model respectively, the method further comprises:

receiving a data acquisition instruction, and extracting a network address of sample data to be acquired from the data acquisition instruction;

configuring a web crawler according to the network address, and acquiring the sample data from a webpage corresponding to the network address by using the web crawler;

carrying out data cleaning on the sample data to obtain target sample data;

dividing the target sample data by adopting a hold-out method to obtain training data and test data, wherein the training data and the test data are mutually exclusive;

building a training model by adopting a convolutional neural network algorithm;

marking the training data, inputting the marked training data serving as input parameters into the training model for processing to obtain a training result;

judging whether the training result is matched with a marking result corresponding to the marked training data;

if so, determining the training model outputting the training result as an initial data structure analysis model; if not, continuing to train the training model by using the marked training data until the output training result is matched with the marking result;

marking the test data, inputting the marked test data serving as input parameters into the initial data structure analysis model for processing to obtain a verification result;

and judging whether the verification result is matched with a marking result corresponding to the marked test data, and if so, determining the data structure analysis model as the target data structure analysis model.

Preferably, after the step of searching for data conforming to the data structure from the text to be queried by using the target regular expression, the method further includes:

collecting the biological characteristic information of the user;

generating a query record according to the biological characteristic information and the initial regular expression, and storing the query record;

and when the preset conditions are met, analyzing each query record by adopting the target data structure analysis model, and optimizing the initial regular expression stored in the regular expression management library according to the analysis result corresponding to each query record and the regular expression character table.

In addition, to achieve the above object, the present invention further provides a data query apparatus, including:

an acquisition module, configured to receive a data query request triggered by a user and to acquire a text to be queried and a question to be queried corresponding to the data query request;

the determining module is used for determining a data structure of the data which needs to be inquired by the user according to the problem to be inquired;

a first searching module, configured to search for an initial regular expression from a pre-constructed regular expression management library according to the data structure;

the generating module is used for generating a target regular expression according to the problem to be inquired and the initial regular expression;

and the second searching module is used for searching the data which accords with the data structure from the text to be queried by adopting the target regular expression.

In addition, to achieve the above object, the present invention further provides a data query device, where the device includes a memory, a processor, and a data query program stored in the memory and executable on the processor, and the data query program is configured to implement the steps of the data query method as described above.

Furthermore, to achieve the above object, the present invention further proposes a computer-readable storage medium, wherein the computer-readable storage medium stores a data query program, and the data query program, when executed by a processor, implements the steps of the data query method as described above.

According to the data query scheme provided by the invention, in the process of data query, an operator does not need to have professional knowledge of any regular expression and Sublime software, and only needs to provide the problem to be queried according to daily habits, so that the terminal equipment can quickly determine the target regular expression to be used, thereby greatly simplifying user operation and improving user experience.

In addition, in the query process, the target regular expression is specially generated for aiming at the data structure of the problem to be queried provided by the user, so that the scheme can quickly and accurately provide effective query results according to the requirements of the user, and the efficiency of data query is greatly improved.

Drawings

FIG. 1 is a schematic structural diagram of a data query device of a hardware operating environment according to an embodiment of the present invention;

FIG. 2 is a flow chart illustrating an embodiment of a data query method according to the present invention;

FIG. 3 is a flowchart illustrating a data query method according to a second embodiment of the present invention;

FIG. 4 is a block diagram illustrating an embodiment of a data query device according to the present invention.

The objects, features, and advantages of the present invention are further described with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Referring to fig. 1, fig. 1 is a schematic structural diagram of a data query device according to an embodiment of the present invention.

As shown in fig. 1, the data query apparatus may include: a processor 1001, such as a Central Processing Unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable communication between these components. The user interface 1003 may include a display screen and an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless Fidelity (Wi-Fi) interface). The memory 1005 may be a Random Access Memory (RAM), or may be a Non-Volatile Memory (NVM), such as a disk memory. The memory 1005 may alternatively be a storage device separate from the processor 1001.

Those skilled in the art will appreciate that the configuration shown in FIG. 1 does not constitute a limitation of the data query apparatus, and may include more or fewer components than those shown, or some components in combination, or a different arrangement of components.

As shown in fig. 1, the memory 1005, which is a storage medium, may include an operating system, a network communication module, a user interface module, and a data query program.

In the data query apparatus shown in fig. 1, the network interface 1004 is mainly used for data communication with a network server; the user interface 1003 is mainly used for data interaction with a user; the processor 1001 and the memory 1005 in the data query device of the present invention may be disposed in the data query device, and the data query device calls the data query program stored in the memory 1005 through the processor 1001 and executes the data query method provided by the embodiment of the present invention.

An embodiment of the present invention provides a data query method. Referring to fig. 2, fig. 2 is a schematic flowchart of an embodiment of the data query method according to the present invention.

In this embodiment, the data query method includes the following steps:

step S10, receiving a data query request triggered by a user, and acquiring a text to be queried and a question to be queried corresponding to the data query request.

Specifically, the execution subject in this embodiment is a terminal device on which the Sublime software is installed, such as a user's personal computer, tablet computer or smart phone; these are not listed one by one here, and no limitation is imposed.

In addition, the problem to be queried acquired in this embodiment is not a query sentence that can be identified by the database, but a content provided by the user using natural language according to personal habits, so that the data query method provided by this embodiment can greatly facilitate the user unfamiliar with the database and Sublime software operation knowledge to query the data that the user wants to query from the text to be queried.

Further, to facilitate user operation and reduce query difficulty as much as possible, in practical applications the question to be queried provided by the user may be in various formats, such as a text format, a voice format or a picture format; these are not listed one by one here, and no limitation is imposed.

Step S20, determining a data structure of the data that the user needs to query according to the question to be queried.

Specifically, in this embodiment, the data structure is mainly used to define the position of the data that the user needs to query in the initial regular expression.

To facilitate understanding of the operation in step S20 of determining the data structure of the data that the user needs to query according to the question to be queried, a specific implementation is given below, roughly as follows:

firstly, extracting at least one keyword from the question to be queried based on a keyword extraction technology;

then, carrying out semantic analysis on each keyword to obtain semantic information of each keyword;

and then, determining a data structure of the data which needs to be inquired by the user by combining the semantic information of each keyword.

For ease of understanding, the following description is made in conjunction with the examples:

For example, when the question to be queried input by the user is "query only word success", 4 keywords may be extracted based on the keyword extraction technology. Through semantic analysis of the 4 keywords, and by finally combining their semantic information, the structure of the data that the user needs to query is determined as follows: only the entered word itself is queried.

For another example, when the question to be queried input by the user is "query all data beginning with the word success", the keywords extracted based on the keyword extraction technology may be: "query", "in", "word", "success", and "beginning". Through semantic analysis of the 5 keywords, and by finally combining their semantic information, the structure of the data that the user needs to query is determined as follows: data that begins with the entered word is queried.

For another example, when the question to be queried input by the user is "query all data ending with the word success", 5 keywords may be extracted based on the keyword extraction technology. Through semantic analysis of the 5 keywords, and by finally combining their semantic information, the structure of the data that the user needs to query is determined as follows: data that ends with the entered word is queried.

For another example, when the question to be queried input by the user is "query all data with the word success as the intermediate content", 5 keywords may be extracted based on the keyword extraction technology. Through semantic analysis of the 5 keywords, and by finally combining their semantic information, the structure of the data that the user needs to query is determined as follows: data that contains the entered word as intermediate content is queried.

It should be understood that the above is only an example, and the technical solution of the present invention is not limited in any way, and in practical applications, those skilled in the art can make settings according to needs, and the present invention is not limited herein.
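The mapping from a natural-language question to a data structure can be illustrated with a minimal Python sketch. It is not the keyword extraction or semantic analysis of the embodiment: a small stop-word list and a table of cue words stand in for both, and the names `DataStructure`, `CUES`, `extract_keywords` and `determine_structure` are hypothetical. It also assumes the word to be found directly follows the token "word", as in the four example questions.

```python
from enum import Enum
from typing import List, Tuple


class DataStructure(Enum):
    EXACT = "only the entered word itself"
    PREFIX = "data beginning with the entered word"
    SUFFIX = "data ending with the entered word"
    MIDDLE = "data containing the entered word as intermediate content"


# Hypothetical cue words standing in for real semantic analysis.
CUES = [
    ("beginning", DataStructure.PREFIX),
    ("ending", DataStructure.SUFFIX),
    ("intermediate", DataStructure.MIDDLE),
    ("only", DataStructure.EXACT),
]

# Hypothetical stop-word list used by the rough keyword extraction below.
STOP_WORDS = {"all", "data", "with", "the", "as", "content"}


def extract_keywords(question: str) -> List[str]:
    """Rough keyword extraction: lowercase tokens minus the stop words."""
    return [w for w in question.lower().split() if w not in STOP_WORDS]


def determine_structure(question: str) -> Tuple[DataStructure, str]:
    """Return the data structure and the word the user wants to find."""
    keywords = extract_keywords(question)
    structure = DataStructure.EXACT  # default when no cue word is present
    for cue, candidate in CUES:
        if cue in keywords:
            structure = candidate
            break
    # Assumption: the query target is the token that follows "word".
    target = keywords[keywords.index("word") + 1]
    return structure, target
```

A real system would replace the cue table with a trained semantic model; the sketch only shows the shape of the input and output of step S20.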

In addition, in practical application, the format of the question to be queried provided by the user is not limited, so it may be a text format, a voice format, or a picture format. Therefore, after the question to be queried provided by the user is acquired and before keywords are extracted from it based on the keyword extraction technology, the format of the question to be queried needs to be determined, and a corresponding operation is then performed according to the determined format; the keyword extraction operation is executed only once the question to be queried in text format has been obtained.

For example, when the question to be queried is determined to be in the voice format, the question to be queried in the voice format needs to be converted into the question to be queried in the text format based on the voice recognition technology.

For example, when the question to be queried is determined to be in the picture format, the question to be queried in the picture format needs to be converted into the question to be queried in the text format based on the image recognition character technology.

Correspondingly, the operation of extracting at least one keyword from the question to be queried based on the keyword extraction technology is specifically: extracting at least one keyword from the question to be queried in text format based on the keyword extraction technology.

Regarding the specific implementation flows of the keyword extraction technology, the voice recognition technology and the image recognition text technology, a person skilled in the art can check the existing documents by himself, and details are not repeated here.

In addition, it is worth mentioning that in order to ensure that the extracted keywords have a high reference value, in practical applications, before the operation of extracting the keywords from the question to be queried is performed, a text preprocessing operation may be performed on the question to be queried in a text format.

For example, stop words are removed, that is, modal particles and similar filler words that carry no actual meaning are removed.

For example, invalid special characters, such as emoticons and punctuation marks, are removed.

For example, after the above processing operations are performed, the text content is merged into one line.
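The three preprocessing operations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop-word list `STOP_WORDS` is hypothetical and tiny, special characters are taken to be anything that is neither a word character nor whitespace, and the merging step is read as joining the remaining text into a single line.

```python
import re

# Hypothetical stop-word list; a real system would load a fuller one.
STOP_WORDS = {"um", "uh", "oh", "well"}


def preprocess_question(raw: str) -> str:
    """Drop special characters and stop words, then merge into one line."""
    # Replace anything that is not a word character or whitespace
    # (punctuation, emoticons) with a space.
    cleaned = re.sub(r"[^\w\s]", " ", raw)
    # split() with no argument also splits on newlines, so joining the
    # surviving tokens with single spaces merges the text into one line.
    tokens = [t for t in cleaned.split() if t.lower() not in STOP_WORDS]
    return " ".join(tokens)
```

For instance, `preprocess_question("Um, query only\nword success!! :)")` yields a single-line question that is ready for keyword extraction.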

Accordingly, before the question to be queried in the voice format is converted into the question to be queried in the text format, a series of preprocessing operations, such as filtering and removing interfering sounds, may also be performed on the question to be queried in the voice format to ensure that the converted text information is more accurate.

Similarly, before the question to be queried in the picture format is converted into the question to be queried in the text format, a series of preprocessing operations, such as gray-scale processing and denoising, may be performed on the question to be queried in the picture format to ensure that the converted text information is more accurate.

For the convenience of understanding, the present embodiment provides two ways of performing preprocessing operation on the question to be queried in the picture format, which are roughly as follows:

mode 1:

(1) Converting the question to be queried in the picture format from a red, green and blue (RGB) color space into a luminance chromaticity coordinate LUV color space.

Specifically, because the RGB color space represents the three quantities of hue, brightness and saturation together, they are difficult to separate, so many details are difficult to adjust digitally. The LUV color space is a color space consistent with the visual system, i.e., a uniform encoding of visually perceptible color differences, so that digital adjustment of all features of a picture can be achieved.

(2) And segmenting the question to be queried in the picture format in the LUV color space.

Specifically, after the color space conversion, the segmentation operation performed on the to-be-queried question of the picture format in the LUV color space may be performed by using a pyramid clustering segmentation algorithm. Compared with other segmentation algorithms, such as mean shift algorithm, watershed algorithm and the like, the segmentation speed is higher, the total number of the segmented regions is moderate, and the effect is better. However, in a specific implementation, a person skilled in the art may select other segmentation algorithms according to actual needs, and the method is not limited herein.

In addition, regarding the operation of segmentation by using the pyramid clustering segmentation algorithm, a person skilled in the art can search the existing literature by himself to implement the segmentation, and details are not repeated here.

(3) And converting the question to be inquired in the picture format into a gray-scale image on the basis of picture segmentation.

Correspondingly, when the question to be queried is in a picture format, the step of converting the question to be queried in the picture format into the question to be queried in a text format based on an image recognition character technology comprises the following steps:

firstly, traversing characters in the gray-scale image, and extracting the contour features of the current characters based on an image recognition character technology; secondly, carrying out template rough classification and template fine matching on the contour features of the current character and templates in a pre-constructed feature template library to determine a computer character corresponding to the current character, wherein the contour features of the character are recorded in the templates, and a corresponding relation exists between the contour features of the character and the computer character; and finally, arranging the computer characters in sequence to obtain the problem to be inquired in a text format.
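Step (1) of mode 1, the conversion from RGB to LUV, can be sketched for a single pixel as follows. The embodiment does not specify the RGB encoding or the reference white, so the sketch assumes 8-bit sRGB input and the D65 white point and uses the standard CIE 1976 L*u*v* formulas; the function names are hypothetical.

```python
from typing import Tuple

# D65 reference white (assumed; the source does not name an illuminant).
XN, YN, ZN = 0.95047, 1.0, 1.08883


def _linearize(channel: int) -> float:
    """Undo the sRGB gamma curve for one 8-bit channel."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _uv_prime(x: float, y: float, z: float) -> Tuple[float, float]:
    """Chromaticity coordinates u', v' of an XYZ color."""
    denom = x + 15.0 * y + 3.0 * z
    if denom == 0.0:  # pure black: chromaticity is undefined, use 0
        return 0.0, 0.0
    return 4.0 * x / denom, 9.0 * y / denom


def rgb_to_luv(r: int, g: int, b: int) -> Tuple[float, float, float]:
    """Convert one 8-bit sRGB pixel to CIE L*u*v*."""
    rl, gl, bl = _linearize(r), _linearize(g), _linearize(b)
    # Linear sRGB -> CIE XYZ (D65).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    # Lightness L*.
    yr = y / YN
    if yr > (6.0 / 29.0) ** 3:
        lightness = 116.0 * yr ** (1.0 / 3.0) - 16.0
    else:
        lightness = (29.0 / 3.0) ** 3 * yr
    # Chromatic components u*, v* relative to the reference white.
    u_p, v_p = _uv_prime(x, y, z)
    u_n, v_n = _uv_prime(XN, YN, ZN)
    return lightness, 13.0 * lightness * (u_p - u_n), 13.0 * lightness * (v_p - v_n)
```

Because lightness is carried by the first component alone, later steps such as segmentation and gray-scale conversion can operate on brightness and chromaticity separately, which is the property the embodiment relies on.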

Mode 2:

Performing color conversion processing on the question to be queried in the picture format according to a preset degree of blurring.

Specifically, the higher the preset degree of blurring is, the sharper and clearer the colors are after the color conversion processing is performed on the question to be queried in the picture format, and the more accurate the contours of the finally extracted characters are, that is, the more accurate the question to be queried in the text format obtained by conversion is.

In general, in practical applications, colors are divided into 24 standard colors. If the preset degree of blurring is 12, every two adjacent colors among the standard 24 colors are blurred into one, that is, both adjacent colors are represented by the (R, G, B) value of either one of them, so that the colors are finally divided into 12 colors. If the preset degree of blurring is 6, every four adjacent colors among the standard 24 colors are blurred into one, that is, all four adjacent colors are represented by the (R, G, B) value of any one of them, so that the colors are finally divided into 6 colors.
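The grouping of adjacent colors described for mode 2 can be sketched as a palette mapping. The sketch assumes the palette is an ordered list of (R, G, B) tuples in which neighbors are "adjacent" colors, and that the first color of each group is chosen as its representative; the embodiment allows any member of the group. The name `blur_palette` is hypothetical.

```python
from typing import List, Tuple

Color = Tuple[int, int, int]


def blur_palette(palette: List[Color], degree: int) -> List[Color]:
    """Map each palette color to the representative of its group.

    With 24 colors, degree 12 merges every 2 adjacent colors and
    degree 6 merges every 4, leaving `degree` distinct colors.
    """
    if degree <= 0 or len(palette) % degree != 0:
        raise ValueError("degree must evenly divide the palette size")
    group_size = len(palette) // degree
    # Assumption: the first color of each group represents the group.
    return [palette[(i // group_size) * group_size] for i in range(len(palette))]
```

The returned list has the same length as the input palette, so it can be used directly as a lookup table from an original color index to its blurred color.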

It should be understood that the above only provides two specific implementation manners for performing the preprocessing operation on the to-be-queried problem in the picture format, the technical solution of the present invention is not limited at all, and in practical applications, those skilled in the art may perform reasonable setting according to needs, and the present invention is not limited herein.

Step S30, searching for an initial regular expression from a pre-constructed regular expression management library according to the data structure.

Specifically, in this embodiment, there are 4 initial regular expressions stored in the regular expression management library, and the specific format is approximately as follows:

initial regular expression 1: /^data to be queried$/g;

initial regular expression 2: /^data to be queried\w+/g;

initial regular expression 3: /\w+data to be queried$/g;

initial regular expression 4: /\w+data to be queried\w+/g.

Wherein, the initial regular expression 1 only queries the data content of the data to be queried; the initial regular expression 2 represents that the queried data content is the data content with the data to be queried as the beginning; the initial regular expression 3 represents that the queried data content is the data content taking the data to be queried as the end; the initial regular expression 4 represents that the data content of the query is the data content including the data to be queried in the middle.

In addition, regarding the "data to be queried" appearing in the 4 initial regular expressions, in practical application, the data needs to be replaced by the object to be queried, which is extracted from the obtained problem to be queried and needs to be queried actually.
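Looking up an initial regular expression, substituting the object to be queried, and searching the text can be sketched with Python's `re` module. The library keys (`exact`, `prefix`, `suffix`, `middle`), the `{term}` placeholder that stands for "data to be queried", and the function names are hypothetical. The sketch also assumes that each run of word characters in the text is tested on its own, so that `^` and `$` anchor to a single word rather than to the whole text.

```python
import re
from typing import Dict, List

PLACEHOLDER = "{term}"  # stands for "data to be queried"

# The four initial expressions, keyed by hypothetical structure names.
REGEX_LIBRARY: Dict[str, str] = {
    "exact": r"^{term}$",
    "prefix": r"^{term}\w+",
    "suffix": r"\w+{term}$",
    "middle": r"\w+{term}\w+",
}


def build_target_regex(structure: str, term: str) -> "re.Pattern":
    """Fill the initial expression for `structure` with the escaped term."""
    initial = REGEX_LIBRARY[structure]
    return re.compile(initial.replace(PLACEHOLDER, re.escape(term)))


def search_text(text: str, structure: str, term: str) -> List[str]:
    """Return the words of `text` that conform to the data structure."""
    pattern = build_target_regex(structure, term)
    # Assumption: test each word separately so ^ and $ apply per word.
    words = re.findall(r"\w+", text)
    return [w for w in words if pattern.search(w)]
```

With the text "success, successful, unsuccess and unsuccessful" and the term "success", the four structures each select a different word, whereas a plain substring search would return all four.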

The meanings of the other meta-characters appearing in the 4 initial regular expressions above are detailed in table 1.

TABLE 1 regular expression character table

Furthermore, it is worth mentioning that, to ensure that the operation of step S30 can be executed smoothly, in practical application the initial regular expressions stored in the regular expression management library need to be constructed in advance.

Regarding the construction of the initial regular expression, the following may be approximated:

firstly, data searching records recorded by each big data platform are obtained.

Specifically, in practical application, the manner of obtaining the recorded data search records from each big data platform may be to configure the web crawler software with the network addresses of the big data platforms storing the data search records, and then crawl the data from the big data platforms as needed by the web crawler software.

In addition, in practical applications, the data search record may also be stored in a local server.

Accordingly, the data search record may be obtained from the server according to a preset database query statement.

The specific implementation manner can be set by a person skilled in the art according to needs, and is not limited herein.

Then, the information recorded in the data search record is extracted, and the extracted information is used as a first input parameter.

Specifically, since the data search record stored in the server is usually in a text format, the manner of extracting the recorded information from the data search record may be implemented based on a keyword extraction technology, or may be implemented based on other problem extraction manners, which is not limited herein.

In addition, it should be understood that, in practical applications, if the recorded data search record is in other formats, such as a voice format, a picture format, and the like, the manner of extracting the information needs to be adjusted according to the specific format of the recorded data search record, and details are not described here.

And then, acquiring a preset data structure set, and taking a standard data structure recorded in the data structure set as a second input parameter.

Specifically, the standard data structure recorded in the preset data structure set is substantially predetermined by the technician according to the business needs, and is mainly used as the analysis standard for the subsequent analysis process of the input parameters.

Then, the first input parameter and the second input parameter are respectively input into a pre-constructed target data structure analysis model, so that the target data structure analysis model analyzes the first input parameter by taking the second input parameter as an analysis standard, and a target data structure corresponding to the first input parameter is determined.

And finally, constructing grammar based on the regular expression, and generating an initial regular expression corresponding to the target data structure according to the target data structure and a pre-stored regular expression character table.

It should be understood that the above is only one specific implementation of constructing the initial regular expression and does not limit the technical solution of the present invention in any way; in practical applications, those skilled in the art may set the implementation as needed, and no limitation is imposed herein.
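The final construction step, generating an initial regular expression from a target data structure and a character table, can be sketched as a simple composition. The contents of `CHARACTER_TABLE`, the part lists in `STRUCTURE_PARTS`, the structure names and the `{term}` placeholder are all hypothetical; only the four metacharacters used by the initial regular expressions are included.

```python
from typing import Dict, List

# Hypothetical character table: descriptive name -> regex metacharacter.
CHARACTER_TABLE: Dict[str, str] = {
    "start": "^",
    "end": "$",
    "word_char": r"\w",
    "one_or_more": "+",
}

PLACEHOLDER = "{term}"  # stands for "data to be queried"

# Hypothetical description of each target data structure as ordered parts;
# "TERM" marks where the data to be queried is placed.
STRUCTURE_PARTS: Dict[str, List[str]] = {
    "exact": ["start", "TERM", "end"],
    "prefix": ["start", "TERM", "word_char", "one_or_more"],
    "suffix": ["word_char", "one_or_more", "TERM", "end"],
    "middle": ["word_char", "one_or_more", "TERM", "word_char", "one_or_more"],
}


def build_initial_regex(structure: str) -> str:
    """Compose the initial expression for a target data structure."""
    parts = STRUCTURE_PARTS[structure]
    return "".join(
        PLACEHOLDER if part == "TERM" else CHARACTER_TABLE[part]
        for part in parts
    )
```

In the embodiment the target data structure comes from the analysis model; here it is passed in by name so that the composition step can be shown in isolation.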

In addition, since a target data structure analysis model is needed when the initial regular expression is constructed, in order to ensure that the operation of constructing the initial regular expression can be smoothly performed, the target data structure analysis model needs to be constructed first.

The construction of the target data structure analysis model is roughly as follows:

(1) Receiving a data acquisition instruction, and extracting a network address of sample data to be acquired from the data acquisition instruction.

(2) Configuring a web crawler according to the network address, and acquiring the sample data from a webpage corresponding to the network address by using the web crawler.

(3) Carrying out data cleaning on the sample data to obtain target sample data.

Specifically, the data cleansing operation performed on the sample data can be roughly classified into the following operations:

(3-1) removing incomplete data, namely removing data with missing information;

(3-2) removing erroneous data, namely removing data whose format does not meet the requirements, or whose quantity and type are not required by the training;

(3-3) removing duplicate data, namely retaining only one copy of any repeated data so as to reduce the data volume;

(3-4) format conversion, namely converting the sample data into a standard format that can be recognized in the subsequent training and testing process, such as a binary format.

It should be understood that the above is only one specific way of data cleaning, and does not limit the technical solution of the present invention in any way. In practical applications, those skilled in the art can select the data cleaning operations as needed, which is not limited herein.
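The four cleaning operations above can be sketched as a single pass over the sample data. This is a minimal sketch under stated assumptions: the record fields `query` and `structure`, the set of allowed structure labels, and the choice of UTF-8 encoded JSON as the "binary format" are all hypothetical.

```python
import json

# Hypothetical fields every sample must carry, and hypothetical valid labels.
REQUIRED_FIELDS = ("query", "structure")
ALLOWED_STRUCTURES = {"whole_line", "whole_word", "substring"}


def clean_samples(samples):
    """Return cleaned samples as bytes, applying operations (3-1) to (3-4)."""
    seen = set()
    cleaned = []
    for sample in samples:
        # (3-1) remove incomplete data: a required field is missing or empty
        if any(not sample.get(field) for field in REQUIRED_FIELDS):
            continue
        # (3-2) remove erroneous data: the label is not of a required type
        if sample["structure"] not in ALLOWED_STRUCTURES:
            continue
        # (3-3) remove duplicate data: keep only the first copy
        key = (sample["query"], sample["structure"])
        if key in seen:
            continue
        seen.add(key)
        # (3-4) format conversion: serialize to a binary (bytes) format
        record = {"query": sample["query"], "structure": sample["structure"]}
        cleaned.append(json.dumps(record, sort_keys=True).encode("utf-8"))
    return cleaned
```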

(4) Dividing the target sample data by adopting a hold-out method to obtain training data and test data, wherein the training data and the test data are mutually exclusive.

Specifically, the hold-out method is only one specific data division mode. Its principle is to divide a data set into two mutually exclusive sub-data sets, so that the training data used for training and the test data used for testing do not overlap, and the analysis accuracy of the model can be better verified.
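A minimal sketch of the hold-out division described above is given below. The function name, the 30% test ratio, and the fixed random seed are assumptions chosen for illustration; the embodiment does not specify them.

```python
import random


def hold_out_split(samples, test_ratio=0.3, seed=0):
    """Split samples into two mutually exclusive lists: (training, test)."""
    shuffled = list(samples)
    # Shuffle with a private, seeded generator so the split is reproducible.
    random.Random(seed).shuffle(shuffled)
    # Hold out at least one sample for testing.
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


training_data, test_data = hold_out_split(range(10))
# Every sample ends up in exactly one of the two subsets.
print(len(training_data) + len(test_data))  # 10
```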

(5) Constructing a training model by adopting a convolutional neural network algorithm.

Specifically, constructing the training model by adopting the convolutional neural network algorithm means that a data training framework with an input layer, a convolutional layer, a pooling layer and an output layer is constructed based on the convolutional neural network algorithm.
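To make the four-layer structure concrete, the following sketch runs a forward pass through an input layer, one convolutional layer, one pooling layer and one output layer in plain Python. It is illustrative only: the one-dimensional input, the layer sizes and the hand-picked weights are assumptions, no training is performed, and a practical implementation would normally rely on a deep learning library.

```python
def conv1d(signal, kernel):
    """Convolutional layer: valid sliding-window weighted sum, then ReLU."""
    width = len(kernel)
    out = []
    for i in range(len(signal) - width + 1):
        total = sum(signal[i + j] * kernel[j] for j in range(width))
        out.append(max(0.0, total))
    return out


def max_pool(features, size=2):
    """Pooling layer: keep the maximum of each non-overlapping window."""
    return [max(features[i:i + size])
            for i in range(0, len(features) - size + 1, size)]


def dense(features, weights, bias):
    """Output layer: one weighted sum (score) per class."""
    return [sum(f * w for f, w in zip(features, row)) + b
            for row, b in zip(weights, bias)]


def forward(signal, kernel, weights, bias):
    """Input -> convolution -> pooling -> output; returns the best class index."""
    hidden = max_pool(conv1d(signal, kernel))
    scores = dense(hidden, weights, bias)
    return scores.index(max(scores))


signal = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]      # input layer (toy encoding)
kernel = [1.0, -1.0]                         # hand-picked convolution kernel
weights = [[1.0, 1.0], [-1.0, -1.0]]         # hand-picked output weights
bias = [0.0, 0.0]
print(forward(signal, kernel, weights, bias))  # 0
```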

Of course, in practical applications, a person skilled in the art may select other machine learning algorithms as needed, which is not limited herein.

(6) Labeling the training data, and inputting the labeled training data as input parameters into the training model for processing to obtain a training result.

(7) Judging whether the training result matches the labeling result corresponding to the labeled training data.

Correspondingly, if they match, the training model that output the training result is determined as an initial data structure analysis model; if not, the training model continues to be trained with the labeled training data until the output training result matches the labeling result.

(8) Labeling the test data, and inputting the labeled test data as input parameters into the initial data structure analysis model for processing to obtain a verification result.

(9) Judging whether the verification result matches the labeling result corresponding to the labeled test data, and if so, determining the initial data structure analysis model as the target data structure analysis model.

It should be understood that the above is only one specific implementation manner of constructing the target data structure analysis model, and does not limit the technical solution of the present invention in any way. In practical applications, those skilled in the art may set the implementation manner as needed, which is not limited herein.
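The train-then-verify control flow of steps (6) to (9) can be sketched as below. To keep the example short and runnable, a simple perceptron is swapped in for the convolutional neural network; the class, the toy feature vectors, the epoch limit, and returning `None` when training or verification fails are all assumptions of this sketch.

```python
class Perceptron:
    """Stand-in model (not a CNN) with a predict/train interface."""

    def __init__(self, n_features):
        self.weights = [0.0] * n_features
        self.bias = 0.0

    def predict(self, x):
        score = sum(w * v for w, v in zip(self.weights, x)) + self.bias
        return 1 if score > 0 else 0

    def train_epoch(self, labelled):
        # Classic perceptron update: move weights by the prediction error.
        for x, label in labelled:
            error = label - self.predict(x)
            self.weights = [w + error * v for w, v in zip(self.weights, x)]
            self.bias += error


def matches(model, labelled):
    """True when every output equals its labeling result."""
    return all(model.predict(x) == label for x, label in labelled)


def build_target_model(train, test, max_epochs=100):
    model = Perceptron(len(train[0][0]))
    # Steps (6)-(7): keep training until the results match the labels.
    epochs = 0
    while not matches(model, train):
        if epochs == max_epochs:
            return None  # assumption: give up after a fixed budget
        model.train_epoch(train)
        epochs += 1
    # Steps (8)-(9): verify the initial model on the held-out test data.
    return model if matches(model, test) else None


train = [([1, 0], 1), ([0, 1], 0), ([2, 1], 1), ([1, 2], 0)]
test = [([3, 1], 1), ([1, 3], 0)]
target_model = build_target_model(train, test)
```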

Step S40, generating a target regular expression according to the problem to be queried and the initial regular expression.

In order to facilitate understanding of the operation in step S40, the following description is made with reference to an example.

For example, when the user only needs to query the word "json", the initial regular expression found from the pre-constructed regular expression management library is the initial regular expression 1.

At this time, the "data to be queried" appearing in the initial regular expression 1 needs to be replaced by "json".

Accordingly, the generated target regular expression is: /^json$/g.
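The replacement described in this example might be sketched as follows. The `{data}` placeholder standing for the "data to be queried" and the function name are assumptions. Note also that the `/.../g` notation is a JavaScript-style literal, whereas in Python the pattern is a plain string and the global behaviour comes from the search function used later.

```python
import re


def build_target_regex(initial_regex, data, placeholder="{data}"):
    """Fill the user's data into the placeholder of the initial regular expression."""
    # re.escape keeps any metacharacters in the user's data literal.
    return initial_regex.replace(placeholder, re.escape(data))


print(build_target_regex("^{data}$", "json"))  # ^json$
```

Escaping the data is a design choice of this sketch: without it, a query such as "a.b" would also match "axb".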

Step S50, searching data conforming to the data structure from the text to be queried by adopting the target regular expression.

Still taking the case where the data to be queried is "json" and the target regular expression is "/^json$/g" as an example, if the text to be queried also contains content such as "myjsonstring", the data found with the generated target regular expression is only "json" itself, and "myjsonstring" is not matched.
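The effect of the anchored target regular expression can be checked with a short sketch. It assumes that each candidate item in the text to be queried sits on its own line, so that `^` and `$` anchor at line boundaries; the sample text and function name are made up for illustration.

```python
import re


def find_matches(target_regex, text):
    """Return every match of the target regular expression in the text."""
    # MULTILINE lets ^ and $ anchor at each line; findall returns all matches,
    # which plays the role of the "g" flag.
    return re.compile(target_regex, re.MULTILINE).findall(text)


text = "myjsonstring\njson\njsonp\n"
print(find_matches("^json$", text))  # ['json']
print(len(find_matches("json", text)))  # 3
```

The unanchored pattern reproduces the problem described in the background, where items that merely contain the queried data are also returned.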

Through the above description, it is not difficult to find that, in the data query process, an operator does not need any professional knowledge of regular expressions or of the Sublime software, and only needs to provide a problem to be queried according to daily habits; the terminal device can then quickly determine the target regular expression to be used, so that the user operation is greatly simplified and the user experience is improved.

In addition, in the query process, the generated target regular expression is specific to the data structure of the problem to be queried provided by the user, so that effective query results can be provided quickly and accurately according to the requirements of the user, thereby greatly improving the efficiency of data query.

Referring to fig. 3, fig. 3 is a schematic flow chart of a data query method according to a second embodiment of the invention.

Based on the above embodiment, the data query method of this embodiment further includes, after the step S50:

Step S60, collecting the biological characteristic information of the user.

Specifically, the biometric information in this embodiment may be any biometric information capable of identifying the identity of the user, such as one or a combination of several of fingerprint feature information, voiceprint feature information, face feature information, and iris feature information.

Accordingly, the manner in which the biometric information is collected may vary with the type of biometric information.

For example, when the acquired biometric information is fingerprint feature information, the biometric information may be acquired by using a fingerprint identification chip of the terminal device.

For example, when the acquired biometric information is face feature information or iris feature information, the biometric information may be extracted from an image or a video including a face of the user captured by a camera of the terminal device.

For example, when the collected biometric information is voiceprint feature information, the biometric information may be extracted from the voice information collected by the automatic voice collecting unit of the terminal device.

Step S70, generating a query record according to the biological characteristic information and the initial regular expression, and storing the query record.

Specifically, generating the query record according to the collected biometric information and the pre-constructed initial regular expression essentially means establishing a mapping relationship between the collected biometric information and the initial regular expression selected for the user's current query, and then storing the mapping relationship in a specified location.
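A minimal sketch of such a mapping relationship is shown below. The in-memory list standing in for the "specified location", the record fields, and the use of a SHA-256 digest of the raw biometric bytes as a user key are all assumptions. Real biometric matching is approximate rather than byte-exact, so the digest is a simplification made only to keep the example self-contained.

```python
import hashlib

# Stands in for the specified storage location (assumption).
query_records = []


def make_user_key(biometric_bytes):
    """Derive a stable key from the collected biometric information."""
    return hashlib.sha256(biometric_bytes).hexdigest()


def store_query_record(biometric_bytes, initial_regex):
    """Map the user's biometric key to the initial regular expression used."""
    record = {
        "user_key": make_user_key(biometric_bytes),
        "initial_regex": initial_regex,
    }
    query_records.append(record)
    return record


record = store_query_record(b"example-fingerprint-template", "^{data}$")
```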

Step S80, when the preset conditions are met, analyzing each query record by adopting the target data structure analysis model, and optimizing the initial regular expression stored in the regular expression management library according to the analysis result corresponding to each query record and the regular expression character table.

Specifically, optimizing the initial regular expression in this embodiment essentially means managing the initial regular expressions stored in the regular expression management library on a per-user basis, that is, for each user, the corresponding initial regular expressions are only those that the user commonly uses.
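One simple way to picture per-user management is to keep, for each user, the initial regular expressions that appear most often in that user's query records. The sketch below swaps plain frequency counting in for the analysis by the target data structure analysis model, and treats "at least `min_records` records for the user" as the preset condition; both are assumptions, as are the record fields.

```python
from collections import Counter, defaultdict


def optimize_per_user(records, min_records=3, top_n=1):
    """Return {user_key: [most commonly used initial regular expressions]}."""
    usage = defaultdict(Counter)
    for record in records:
        usage[record["user_key"]][record["initial_regex"]] += 1

    optimized = {}
    for user_key, counts in usage.items():
        # Assumed preset condition: enough records have accumulated.
        if sum(counts.values()) < min_records:
            continue
        optimized[user_key] = [regex for regex, _ in counts.most_common(top_n)]
    return optimized


records = [
    {"user_key": "u1", "initial_regex": "^{data}$"},
    {"user_key": "u1", "initial_regex": "{data}"},
    {"user_key": "u1", "initial_regex": "^{data}$"},
    {"user_key": "u2", "initial_regex": "{data}"},
]
print(optimize_per_user(records))  # {'u1': ['^{data}$']}
```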

Through the above description, it is easy to find that the data query method provided in this embodiment optimizes and manages the initial regular expression by combining the biometric information, so that the constructed initial regular expression can be different from person to person, and the user experience can be better improved while the queried data is the data actually required by the user.

Furthermore, an embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores a data query program, and the data query program, when executed by a processor, implements the steps of the data query method as described above.

Referring to fig. 4, fig. 4 is a block diagram illustrating an embodiment of the data query apparatus according to the present invention.

As shown in fig. 4, the data querying apparatus provided in the embodiment of the present invention includes an obtaining module 4001, a determining module 4002, a first searching module 4003, a generating module 4004, and a second searching module 4005.

The obtaining module 4001 is used for receiving a data query request triggered by a user, and acquiring a text to be queried and a problem to be queried corresponding to the data query request. The determining module 4002 is used for determining, according to the problem to be queried, a data structure of the data that the user needs to query. The first searching module 4003 is used for searching an initial regular expression from a pre-constructed regular expression management library according to the data structure. The generating module 4004 is used for generating a target regular expression according to the problem to be queried and the initial regular expression. The second searching module 4005 is used for searching data conforming to the data structure from the text to be queried by adopting the target regular expression.
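The cooperation of the five modules can be pictured as a small pipeline. The sketch below is hypothetical throughout: the request fields, the toy rule that maps a question containing "only" to a whole-line structure, the assumption that the data to be queried is the last word of the question, and the two-entry regular expression management library are all invented for illustration and stand in for the keyword analysis and library described in the embodiment.

```python
import re

# Hypothetical regular expression management library.
REGEX_LIBRARY = {"whole_line": "^{data}$", "substring": "{data}"}


class DataQueryApparatus:
    def obtain(self, request):                      # obtaining module 4001
        return request["text"], request["question"]

    def determine_structure(self, question):        # determining module 4002
        # Toy rule standing in for keyword and semantic analysis.
        return "whole_line" if "only" in question.lower() else "substring"

    def find_initial_regex(self, structure):        # first searching module 4003
        return REGEX_LIBRARY[structure]

    def generate_target_regex(self, question, initial_regex):  # generating module 4004
        data = question.split()[-1]  # toy assumption: last word is the data
        return initial_regex.replace("{data}", re.escape(data))

    def search(self, text, target_regex):           # second searching module 4005
        return re.findall(target_regex, text, re.MULTILINE)

    def query(self, request):
        text, question = self.obtain(request)
        structure = self.determine_structure(question)
        initial = self.find_initial_regex(structure)
        target = self.generate_target_regex(question, initial)
        return self.search(text, target)


apparatus = DataQueryApparatus()
request = {"text": "myjsonstring\njson\n", "question": "find only json"}
print(apparatus.query(request))  # ['json']
```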

Regarding the data structure mentioned above, in the present embodiment, the data structure is mainly used for defining the position of the data that the user needs to query in the initial regular expression.

In addition, regarding the operation of determining the data structure of the data that needs to be queried by the user according to the problem to be queried, in practical application, a person skilled in the art can perform reasonable setting according to business needs.

For ease of understanding, the present embodiment provides a specific implementation manner of determining the data structure of the data that the user needs to query according to the question to be queried, which is roughly as follows:

and finally, determining a data structure of the data which needs to be inquired by the user by combining the semantic information of each keyword.

In addition, it is worth mentioning that, in practical applications, the format of the question to be queried is not limited, so the question to be queried provided by the user may be in a text format, a voice format, or a picture format.

Therefore, before the operation of extracting at least one keyword from the question to be queried based on a keyword extraction technology, the format of the question to be queried needs to be determined, and then the corresponding conversion is performed according to the determined format until a question to be queried in a text format is obtained; details are not repeated here.
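The format check described above amounts to a dispatch on the format of the question. In the sketch below, the format names and the function are assumptions, and the speech recognition and image text recognition steps are passed in as caller-supplied functions rather than tied to any particular library.

```python
def to_text_question(question, fmt, speech_to_text=None, image_to_text=None):
    """Return the question to be queried in text format."""
    if fmt == "text":
        return question
    if fmt == "voice":
        if speech_to_text is None:
            raise ValueError("a speech-to-text converter is required")
        return speech_to_text(question)
    if fmt == "picture":
        if image_to_text is None:
            raise ValueError("an image-to-text converter is required")
        return image_to_text(question)
    raise ValueError(f"unsupported question format: {fmt}")


# Usage with a stand-in converter; a real one would perform text recognition.
fake_ocr = lambda image_bytes: "find only json"
print(to_text_question(b"\x89PNG...", "picture", image_to_text=fake_ocr))
```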

Further, in order to ensure that the extracted keywords have high reference value, the question to be queried in text format may be pre-processed before the keywords are extracted from it.

Mode 1:

Mode 2:

In addition, it is worth mentioning that, to ensure that the first searching module 4003 can smoothly find the initial regular expression matching the determined data structure, the initial regular expression needs to be constructed in advance.

acquiring data searching records recorded by each big data platform;

carrying out data cleaning on the sample data to obtain target sample data;

building a training model by adopting a convolutional neural network algorithm;

It should be understood that the specific implementation manners given above are only one way of building the initial regular expression and of building the target data structure analysis model, and do not limit the technical solution of the present invention. In a specific application, a person skilled in the art may set the implementation manners as needed, which is not limited herein.

Through the above description, it is not difficult to find that, with the data query device provided in this embodiment, an operator does not need any professional knowledge of regular expressions or of the Sublime software in the data query process, and only needs to provide a problem to be queried according to daily habits; the terminal device can then quickly determine the target regular expression to be used, so that the user operation is greatly simplified and the user experience is improved.

It should be noted that the above-described work flows are only exemplary, and do not limit the scope of the present invention, and in practical applications, a person skilled in the art may select some or all of them to achieve the purpose of the solution of the embodiment according to actual needs, and the present invention is not limited herein.

In addition, the technical details that are not described in detail in this embodiment may refer to the data query method provided in any embodiment of the present invention, and are not described herein again.

A second embodiment of the data query device of the present invention is proposed based on the first embodiment of the data query device described above.

In this embodiment, the data query apparatus further includes a biometric information collection module, a query record generation module, and an initial regular expression optimization module.

The biological characteristic information acquisition module is used for acquiring the biological characteristic information of the user.

And the query record generation module is used for generating a query record according to the biological characteristic information and the initial regular expression and storing the query record.

And the initial regular expression optimizing module is used for analyzing each query record by adopting the target data structure analysis model when the preset conditions are met, and optimizing the initial regular expression stored in the regular expression management library according to the analysis result corresponding to each query record and the regular expression character table.

Through the above description, it is easy to find that the data query device provided in this embodiment optimizes and manages the initial regular expression by combining the biometric information, so that the constructed initial regular expression can be different from person to person, and the user experience can be better improved while the queried data is the data actually required by the user.

Furthermore, it should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a series of elements includes not only those elements, but may also include other elements not expressly listed or inherent to such process, method, article, or system.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

Based on the above understanding, the technical solution of the present invention, or the part thereof that contributes to the prior art, can essentially be embodied in the form of a software product, which is stored in a storage medium (e.g., a Read Only Memory (ROM)/RAM, a magnetic disk, or an optical disk) and includes instructions for causing a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the methods described in the embodiments of the present invention.

The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims

1. A method for data query, comprising the steps of:

2. The method of claim 1, wherein the step of determining the data structure of the data that the user needs to query according to the question to be queried comprises:

3. The method of claim 2, wherein before the step of extracting at least one keyword from the question to be queried based on the keyword extraction technique, the method further comprises:

determining the format of the question to be queried;

and when the question to be inquired is in a picture format, converting the question to be inquired in the picture format into the question to be inquired in a text format based on an image recognition character technology.

4. The method as claimed in claim 3, wherein before the step of converting the question to be queried in the picture format into the question to be queried in the text format based on the image recognition text technology when the question to be queried is in the picture format, the method further comprises:

5. The method of any , wherein prior to the step of looking up an initial regular expression from a pre-built regular expression management library according to the data structure, the method further comprises:

acquiring data searching records recorded by each big data platform;

6. The method of claim 5, wherein prior to the step of inputting said first input parameter and said second input parameter, respectively, into a pre-constructed target data structure analysis model, said method further comprises:

carrying out data cleaning on the sample data to obtain target sample data;

building a training model by adopting a convolutional neural network algorithm;

7. The method of claim 5, wherein after the step of using the target regular expression to find data conforming to the data structure from the text to be queried, the method further comprises:

collecting the biological characteristic information of the user;

8. A data query apparatus, characterized in that the apparatus comprises:

9. A data query device, wherein the device comprises a memory, a processor and a data query program stored on the memory and operable on the processor, the data query program being configured to implement the steps of the data query method of any of claims 1 to 7.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a data query program, which when executed by a processor implements the steps of the data query method as claimed in any of claims 1 to 7.