CN111309709B - Database building and searching method and device - Google Patents


Info

Publication number
CN111309709B
Authority
CN
China
Prior art keywords
data
information
database
storage
semantic tag
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010104739.2A
Other languages
Chinese (zh)
Other versions
CN111309709A (en)
Inventor
陆阳
李建岐
黄毕尧
高鸿坚
褚广斌
刘存林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Power Research Institute of State Grid Chongqing Electric Power Co Ltd
State Grid Corp of China SGCC
State Grid Chongqing Electric Power Co Ltd
Global Energy Interconnection Research Institute
Original Assignee
Electric Power Research Institute of State Grid Chongqing Electric Power Co Ltd
State Grid Corp of China SGCC
State Grid Chongqing Electric Power Co Ltd
Global Energy Interconnection Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electric Power Research Institute of State Grid Chongqing Electric Power Co Ltd, State Grid Corp of China SGCC, State Grid Chongqing Electric Power Co Ltd, Global Energy Interconnection Research Institute
Priority to CN202010104739.2A
Publication of CN111309709A
Application granted
Publication of CN111309709B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a database establishing and retrieving method and device. The database establishing method comprises the following steps: establishing a data feature table, wherein the data feature table comprises a first data information table, the first data information table is used for storing a plurality of different types of data information, and the data information comprises a data storage path; acquiring a storage path of data in an original data file; obtaining the storage position of the data in the data feature table according to the storage path of the data; inserting the data into the data feature table according to the storage position; and constructing a data feature database according to the data feature table after the data is inserted. By implementing the invention, data files can be obtained directly for specific information to be retrieved or for a specific condition, which improves the efficiency and flexibility of data file retrieval.

Description

Database building and searching method and device
Technical Field
The invention relates to the technical field of data processing, in particular to a database building and retrieving method and device.
Background
With the nationwide deployment of smart electricity meters, the requirements on the data transmission rate and reliability of the electricity consumption information acquisition system keep rising. Currently, power line carrier and micro-power wireless are the main local communication modes of the electricity consumption information acquisition system. Taking power line carrier communication as an example, because of the diversity of power grid structures and of the geographical environments in which they are located, the power line channel exhibits noise, impedance variation, and attenuation, which affect the efficiency and reliability with which the electricity consumption information acquisition system transmits user information. To assess the reliability of the electricity consumption information transmitted by the system, the characteristics of its local communication channel need to be analyzed. This analysis involves factors such as geographic environment, weather conditions, and power grid structure, so a large amount of data must be stored for later retrieval during channel characteristic analysis.
In the existing approach to retrieving channel data, channel measurement scenes, channel measurement types, and channel measurement data are stored in files in advance, and file names and file paths under a file directory are searched repeatedly to obtain the data files for different channel measurement scenes, channel measurement types, and channel measurement data. Data files cannot be obtained directly for a specific requirement, so searching for data files is inefficient.
Disclosure of Invention
Therefore, the technical problem to be solved by the invention is to overcome the defect of low searching efficiency of the data file caused by the existing data storage mode, thereby providing a database establishing and searching method and device.
According to a first aspect, an embodiment of the present invention provides a database building method, including: establishing a data characteristic table, wherein the data characteristic table comprises a first data information table, the first data information table is used for storing a plurality of different types of data information, and the data information comprises a data storage path; acquiring a storage path of data in an original data file; obtaining the storage position of the data in the data characteristic table according to the storage path of the data; inserting the data into the data feature table according to the storage position; and constructing a data characteristic database according to the data characteristic table after the data is inserted.
With reference to the first aspect, in a first implementation manner of the first aspect, the data feature table further includes a second data information table, where the type of data information stored in the second data information table differs from that stored in the first data information table, and the second data information table and the first data information table contain the same data file id value, which is used to associate the data information in different data tables.
With reference to the first aspect, in a second implementation manner of the first aspect, the obtaining, according to the storage path of the data, a storage location of the data in the data feature table includes: dividing and analyzing the storage path of the data to obtain a semantic tag; searching a semantic tag table according to the semantic tag, wherein the semantic tag table stores the corresponding relation between the semantic tag and a storage position in a data feature table; and determining the storage position of the data in the data characteristic table.
With reference to the first aspect, in a third implementation manner of the first aspect, the obtaining, according to the storage path of the data, a storage location of the data in the data feature table includes: dividing and analyzing the storage path of the data to obtain a semantic tag; and classifying and matching the semantic tags based on natural language processing, and determining the position of the data in the data characteristic table.
With reference to the second implementation manner of the first aspect, in a fourth implementation manner of the first aspect, the searching a semantic tag table according to the semantic tag includes: and when the semantic tag is not found in the semantic tag table, adding the corresponding relation between the semantic tag and the storage position of the semantic tag in the data feature table to the semantic tag table.
With reference to the first implementation manner of the first aspect, in a fifth implementation manner of the first aspect, the data feature table includes a channel data feature table.
With reference to the fifth implementation manner of the first aspect, in a sixth implementation manner of the first aspect, the second data information table includes: a channel type information table including any one or more of a device type, a measurement content type, and measurement time/frequency domain information used in channel data measurement; and a channel scene information table including any one or more of measurement time, measurement environment, measurement place, measurement climate, altitude, humidity and temperature information.
According to a second aspect, an embodiment of the present invention provides a database retrieval method, including: acquiring information to be retrieved; in the data feature database obtained by the database establishing method in the first aspect or any implementation manner of the first aspect, retrieving corresponding one or more storage paths according to the information to be retrieved; and obtaining a retrieval result according to the storage path.
According to a third aspect, an embodiment of the present invention provides a database creation apparatus, including: a data feature table establishing module, used for establishing a data feature table, wherein the data feature table comprises a first data information table, the first data information table is used for storing a plurality of different types of data information, and the data information comprises a data storage path; a storage path acquisition module, used for acquiring the storage path of the data in the original data file; a storage position acquisition module, used for acquiring the storage position of the data in the data feature table according to the storage path of the data; a data insertion module, used for inserting the data into the data feature table according to the storage position; and a database construction module, used for constructing a data feature database according to the data feature table after the data is inserted.
According to a fourth aspect, an embodiment of the present invention provides a database retrieval apparatus, including: the information to be searched acquisition module is used for acquiring information to be searched; a storage path retrieval module, configured to retrieve, in a data feature database obtained by the database creation method according to the first aspect or any implementation manner of the first aspect, one or more corresponding storage paths according to the information to be retrieved; and the retrieval result acquisition module is used for acquiring a retrieval result according to the storage path.
According to a fifth aspect, an embodiment of the present invention provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the steps of the database creation method according to the first aspect or any implementation manner of the first aspect, or the database retrieval method according to the second aspect.
According to a sixth aspect, an embodiment of the present invention provides a storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the database creation method according to the first aspect or any implementation manner of the first aspect, or the database retrieval method according to the second aspect.
The technical scheme of the invention has the following advantages:
1. According to the database establishing method/device provided by the embodiment, a data feature table is constructed, the data storage paths in the original data file are divided according to different data types and names, and the data storage paths are stored in the data feature table to establish the database. Because the data storage paths are stored in the database by data type, the stored data is classified efficiently. When data is searched in the database of the embodiment, searching for keywords that represent the data types and names stored in the data feature table finds all data files meeting the specified conditions, achieving efficient and flexible retrieval.
2. According to the database establishing method/device, the data characteristic table comprises the first data information table and the second data information table, the data files with the same data storage path are associated through the same data file id, the multi-category data are stored in different data information tables respectively, data expansion and data retrieval are facilitated, and when a new data characteristic table is established, the consistency of the data file ids is kept, and the expandability of stored data is further improved.
3. According to the database establishing method/device, the semantic tags are obtained through segmentation and analysis of the storage paths, and are searched in the pre-stored semantic tag table, so that the specific positions of data in the data feature table are determined, and the accuracy of determining the data storage positions and the efficiency of determining the data storage positions are improved.
4. When the storage position of the data is determined through the semantic tag table and the semantic tag is not found in the semantic tag table, the storage position of the data is determined in another way, and the resulting correspondence between the semantic tag and its storage position in the data feature table is added to the semantic tag table. This completes the semantic tag table and makes it more effective for determining storage positions in subsequent use.
5. According to the database retrieval method/device, the database of the embodiment is searched using the acquired information to be retrieved, so that data files can be obtained directly for specific information to be retrieved or for a specific condition, improving the efficiency and flexibility of data file retrieval.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart showing a specific example of a database creation method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a specific example of a database retrieval method in an embodiment of the present invention;
FIG. 3 is a schematic block diagram of a specific example of a database creation apparatus in an embodiment of the present invention;
FIG. 4 is a schematic block diagram of a specific example of a database retrieval apparatus in an embodiment of the present invention;
FIG. 5 is a schematic block diagram of a specific example of an electronic device in an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of protection of the invention.
In the description of the present invention, it should be noted that the directions or positional relationships indicated by the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. are based on the directions or positional relationships shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the description of the present invention, it should be noted that, unless explicitly specified and limited otherwise, the terms "mounted," "coupled," and "connected" are to be construed broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediate medium; it may be internal communication between two components; and it may be wireless or wired. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
In addition, the technical features of the different embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
The embodiment provides a database building method which, as shown in FIG. 1, includes the following steps:
S110: Establish a data feature table, wherein the data feature table comprises a first data information table, the first data information table is used for storing a plurality of different types of data information, and the data information comprises a data storage path.
By way of example, this embodiment takes a database of power line channel characteristic data. The different kinds of data information may be a channel measurement city, a channel measurement location, a channel measurement phase line type, a channel measurement time, a channel measurement scene, a channel measurement temperature and humidity, a channel measurement device type, a channel measurement bandwidth type, a channel measurement content type, a channel measurement domain type, and the like. In this embodiment, the measurement devices are an FSH4 handheld spectrum analyzer and a PicoScope high-speed data acquisition card; the channel measurement bandwidth types are broadband power line carrier and narrowband power line carrier; the measurement content types are noise, attenuation, and impedance; and the channel measurement domain types are time domain and frequency domain.
The storage path of the data is the path of the folder in which the data for each type of data information in the data feature table is stored. When the data is stored as a two-dimensional plot with a horizontal axis X and a vertical axis Y, the storage path information may include the folder path to which the data belongs together with the names and units of the X axis and Y axis of the data in the data file. The specific data feature table format may be as shown in Table 1. The kinds of data information stored in the first data information table are not limited in this embodiment and may be determined by those skilled in the art as required.
TABLE 1
[Table 1: image BDA0002388163280000081 in the original publication]
S120: Acquire a storage path of the data in the original data file.
In this embodiment, the original data file is a data file of channel characteristic measurements for a certain residential area. The data file is stored in a multi-level file directory: the first-level directory stores the measurement year/month and measurement city, the second-level directory stores the measurement content, the third-level directory stores the measurement frequency band, the fourth-level directory stores the specific measurement time, and the fifth-level directory stores the data files, whose file names differ according to the measurement information.
The storage paths may be obtained by using the os module of the Python language to collect the storage path information of all data files under the root directory of the original data file. The storage path of each data file is a character string, and the storage paths of all data files form a list of character strings, which is stored in a data file path list variable in the program for subsequent use. The method for acquiring the storage path of the data in the original data file is not limited in this embodiment and can be determined by a person skilled in the art as needed.
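The path collection described above can be sketched as follows. This is a minimal illustration that assumes `os.walk` is the part of the os module being used; the function name `list_data_file_paths` and the variable name `data_file_paths` are hypothetical and do not come from the patent.

```python
import os


def list_data_file_paths(root_dir):
    """Collect the storage path of every file under root_dir as a list of strings."""
    data_file_paths = []
    # os.walk visits root_dir and each nested directory level in turn.
    for dir_path, _dir_names, file_names in os.walk(root_dir):
        for file_name in file_names:
            data_file_paths.append(os.path.join(dir_path, file_name))
    # Sort so that the list order does not depend on the file system.
    return sorted(data_file_paths)
```

Each returned element is one character string, so the result can be passed unchanged to the later path-splitting step.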
S130: Obtain the storage position of the data in the data feature table according to the storage path of the data.
For example, the storage position may be obtained as follows: the storage path of the data is first split into segments, and a semantic label is manually defined for each segment. For instance, if a segment of the split path is "handle", the label "handle" is manually assigned to it, and the storage position of the data in the data feature table is then determined manually. The storage position includes the name of the column in which the data is stored in the data feature table. The manner of obtaining the storage position of the data in the data feature table according to the storage path is not limited in this embodiment and can be determined by those skilled in the art as needed.
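A rough sketch of this step is given below. It assumes that the manually defined labels are kept in a Python dictionary that maps a path segment to a column name and a stored value; the dictionary contents, the function name, and the example path are all hypothetical and only illustrate the idea of turning path segments into storage positions.

```python
from pathlib import PurePosixPath

# Hypothetical, manually defined labels: path segment -> (column name, stored value).
SEMANTIC_TAG_TABLE = {
    "noise": ("content", "noise"),
    "attenuation": ("content", "attenuation"),
    "impedance": ("content", "impedance"),
    "broadband": ("band", "broadband"),
    "narrowband": ("band", "narrowband"),
}


def locate_storage_positions(storage_path):
    """Split a storage path and map each known segment to a column of the feature table."""
    positions = {}   # column name -> value to store in that column
    unmatched = []   # segments that have no label yet and need manual handling
    for segment in PurePosixPath(storage_path).parts:
        entry = SEMANTIC_TAG_TABLE.get(segment.lower())
        if entry is None:
            unmatched.append(segment)
        else:
            column_name, value = entry
            positions[column_name] = value
    return positions, unmatched
```

Segments returned in `unmatched` are the ones for which a label would still have to be defined by hand.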
S140: Insert the data into the data feature table according to the storage position.
Illustratively, the data may be inserted into the data feature table according to the storage position by defining a dictionary that stores the column names of the data feature table and traversing the dictionary to insert the data.
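One possible shape of the dictionary traversal is sketched below. It uses Python's built-in sqlite3 module purely for illustration; the patent does not name a database engine, and the table name `channel_type`, the helper `insert_row`, and the sample values are assumptions.

```python
import sqlite3


def insert_row(conn, table, row):
    """Traverse a {column name: value} dictionary and insert it as one table row."""
    columns, values = [], []
    for column_name, value in row.items():
        columns.append(column_name)
        values.append(value)
    placeholders = ", ".join("?" for _ in columns)
    # Table and column names are interpolated, so they must come from trusted code,
    # never from user input; the values themselves are bound as parameters.
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    conn.execute(sql, values)
    conn.commit()


# Hypothetical table and row, only to exercise the helper.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE channel_type (id INTEGER, device TEXT, band TEXT, content TEXT, domain TEXT)"
)
insert_row(
    conn,
    "channel_type",
    {"id": 1, "device": "FSH4", "band": "broadband", "content": "noise", "domain": "frequency"},
)
```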
S150: Construct a data feature database according to the data feature table after the data is inserted.
Illustratively, the back-end application and the front-end application of a database retrieval and classified query platform are written on the basis of the data feature table after the data is inserted. In this embodiment, the back-end program is built with the Flask framework for Python, and the front-end program is built with web front-end technology. The back-end program provides the data query interface and the data file acquisition interface requested by the front end. The data query interface provides the query entry point; the data file acquisition interface provides access to the data files and connects to the original data files. The front-end program is implemented with static web page files and a JavaScript program.
The request function for each interface is written on top of the HTTP protocol. It may be a JavaScript-based data query request or a data query request written in another language. The JavaScript-based request is used to access and manage the database from a browser; a request written in another language can be combined with UI programming so that results are displayed to the user through a software interface. The back-end server is written with the Flask microframework. It preprocesses each data query request from the front end, converts the request into a database query, calls the database driver interface, and returns the data returned by the database to the client. Depending on the data resource requested by the user, the program can also read the corresponding CSV file from the file store, preprocess it, and return it to the client. A JavaScript-based interface function sends the data query request to the back-end server and displays the returned data in a browser page for the user to view. The manner of constructing the data feature database is not limited in this embodiment and can be determined by those skilled in the art as needed.
According to the database establishing method provided by the embodiment, a data feature table is constructed, the data storage paths in the original data file are divided according to different data types and names, and the data storage paths are stored in the data feature table to establish the database. Because the data storage paths are stored in the database by data type, the stored data is classified efficiently. When data is searched in the database of the embodiment, searching for keywords that represent the data types and names stored in the data feature table finds all data files meeting the specified conditions, achieving flexible and efficient retrieval.
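The step in which the back end converts a data query request into a database query might look like the following sketch. The web framework routing is left out so that the example runs with the standard library alone; the table names `data_path` and `channel_type`, the function names, and the sample rows are hypothetical, and sqlite3 stands in for whichever database driver is actually used.

```python
import sqlite3

# Columns a client may filter on; any other field in the request is rejected.
FILTERABLE_COLUMNS = ("device", "band", "content", "domain")


def build_path_query(request_params):
    """Convert request parameters into a parameterized SQL query and its values."""
    conditions, values = [], []
    for column, value in request_params.items():
        if column not in FILTERABLE_COLUMNS:
            raise ValueError(f"unsupported query field: {column}")
        conditions.append(f"t.{column} = ?")
        values.append(value)
    sql = "SELECT p.path FROM data_path AS p JOIN channel_type AS t ON t.id = p.id"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql + " ORDER BY p.id", values


def find_storage_paths(conn, request_params):
    """Run the converted query and return the matching storage paths."""
    sql, values = build_path_query(request_params)
    return [row[0] for row in conn.execute(sql, values)]


# Small in-memory database with hypothetical rows, only to exercise the functions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data_path (id INTEGER PRIMARY KEY, path TEXT);
CREATE TABLE channel_type (id INTEGER, device TEXT, band TEXT, content TEXT, domain TEXT);
INSERT INTO data_path VALUES (1, 'root/noise/broadband/a.csv');
INSERT INTO data_path VALUES (2, 'root/impedance/narrowband/b.csv');
INSERT INTO channel_type VALUES (1, 'FSH4', 'broadband', 'noise', 'frequency');
INSERT INTO channel_type VALUES (2, 'PicoScope', 'narrowband', 'impedance', 'time');
""")
print(find_storage_paths(conn, {"content": "noise", "band": "broadband"}))
```

Checking the requested fields against a fixed list keeps column names out of user control, while the values are passed as bound parameters.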
As an optional implementation manner of this embodiment, the data feature table further includes a second data information table, where the type of data information stored in the second data information table differs from that stored in the first data information table, and the second data information table and the first data information table contain the same data file id value, which is used to associate data information in different data tables.
Illustratively, the second data information table may be one or more tables formed by dividing the data information of the first data information table according to different attributes or characteristics. For example, the channel measurement city, channel measurement location, channel measurement phase line type, channel measurement time, channel measurement scene, and channel measurement temperature and humidity from step S110 are classified as channel scene information; the channel measurement device type, channel measurement bandwidth type, channel measurement content type, and channel measurement time domain or frequency domain are classified as channel type information. The data information classified as channel scene information forms a channel scene information table, the data information classified as channel type information forms a channel type information table, and the two tables together constitute the second data information table.
When the second data information table includes the channel scene information table and the channel type information table, the first data information table stores the storage path of the data. When the data is stored as a two-dimensional plot with a horizontal axis X and a vertical axis Y, the storage path information also includes the names and units of the X and Y axes.
In order to associate one or more second data information tables with the first data information table, the same data file id value is established for the data information belonging to the same data file. For the column names and data types of the first and second data information tables, see Tables 2, 3, and 4: Table 2 gives the column names and data types of the first data information table, Table 3 gives those of the channel type information table in the second data information table, and Table 4 gives those of the channel scene information table in the second data information table.
TABLE 2
[Table 2: images BDA0002388163280000111 and BDA0002388163280000121 in the original publication]
TABLE 3
Column No. | Column name | Data type | Description
1 | id | INT | Data file id value
2 | device | SET | Channel measurement device type
3 | band | SET | Channel measurement bandwidth type
4 | content | SET | Channel measurement content type
5 | domain | SET | Channel measurement domain type
TABLE 4
Column No. | Column name | Data type | Description
1 | id | INT | Data file id value
2 | city | VARCHAR | Channel measurement city
3 | position | VARCHAR | Channel measurement site
4 | powerline | SET | Phase line type of channel measurement
5 | time | TIMESTAMP | Channel measurement time
6 | sence | SET | Channel measurement scenario
7 | temperature | FLOAT | Channel measurement temperature
8 | humidity | FLOAT | Channel measurement humidity
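As a hedged illustration, the three tables could be declared as shown below using Python's built-in sqlite3 module. Several points are assumptions: the table names are hypothetical; the column names of the first table are invented here because Table 2 is only available as an image; and SQLite has no SET type, so TEXT is substituted for the SET columns of Tables 3 and 4. The column name `sence` is kept exactly as it appears in Table 4.

```python
import sqlite3

SCHEMA = """
CREATE TABLE data_path (                      -- first data information table
    id      INTEGER PRIMARY KEY AUTOINCREMENT, -- data file id value
    path    TEXT NOT NULL,                     -- data storage path
    x_name  TEXT,                              -- X-axis name (assumed column name)
    x_unit  TEXT,                              -- X-axis unit (assumed column name)
    y_name  TEXT,                              -- Y-axis name (assumed column name)
    y_unit  TEXT                               -- Y-axis unit (assumed column name)
);
CREATE TABLE channel_type (                   -- columns follow Table 3
    id      INTEGER NOT NULL REFERENCES data_path(id),
    device  TEXT,
    band    TEXT,
    content TEXT,
    domain  TEXT
);
CREATE TABLE channel_scene (                  -- columns follow Table 4
    id          INTEGER NOT NULL REFERENCES data_path(id),
    city        TEXT,
    position    TEXT,
    powerline   TEXT,
    time        TIMESTAMP,
    sence       TEXT,
    temperature REAL,
    humidity    REAL
);
"""


def create_feature_database(db_path=":memory:"):
    """Create the three linked tables and return the open connection."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    return conn
```

The shared `id` column is what lets a row in either second-level table be joined back to its storage path.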
It should be noted that, when the data feature table includes a first data information table and a second data information table, and the second data information table includes a channel scene information table and a channel type information table, the first data information table is inserted first and then the channel type data table and the channel scene data table are inserted when the data feature is inserted subsequently.
When data is inserted into the first data information table, before insertion, the X-axis name, the Y-axis name and the unit information of the data in the data file are acquired by reading the content information of the data file. At this time, the data storage path, the data X-axis name, the data X-axis unit, the data Y-axis name, and the data Y-axis unit in the data table 2 are already determined, only the id of the data file is unknown, and the id column of the data file is automatically generated by the database through id self-addition determination, so as to ensure the uniqueness of the id data. After insertion is completed, the data file id value generated by the database is obtained through sql query for insertion of other tables.
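The insert-then-query sequence for the data file id can be sketched as follows. The example uses sqlite3 for illustration only; the table name `data_path`, its column names, and the sample paths are assumptions rather than details taken from the patent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE data_path (
        id     INTEGER PRIMARY KEY AUTOINCREMENT,  -- generated by the database
        path   TEXT UNIQUE NOT NULL,
        x_name TEXT, x_unit TEXT, y_name TEXT, y_unit TEXT
    )
    """
)


def insert_path_and_get_id(conn, path, x_name, x_unit, y_name, y_unit):
    """Insert one row without an id, then query the id the database generated."""
    conn.execute(
        "INSERT INTO data_path (path, x_name, x_unit, y_name, y_unit) VALUES (?, ?, ?, ?, ?)",
        (path, x_name, x_unit, y_name, y_unit),
    )
    conn.commit()
    # Look the row up again by its unique storage path to read the generated id.
    row = conn.execute("SELECT id FROM data_path WHERE path = ?", (path,)).fetchone()
    return row[0]
```

The returned value is the one that would then be written into the id column of the other tables.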
When data is inserted into the channel type information table, the completeness of the column names in the dictionary of step S140 is checked before insertion. If a column name of the channel type information table is absent from the dictionary, the absent column name is added to the dictionary and the column is checked for a default value: if a default value exists, it is used as the value; otherwise the user is prompted to input the value of the column. After the column information is obtained, the data file id value obtained from the first data information table, together with the data, is inserted into the channel type information table.
When data is inserted into the channel scene information table, the completeness of the channel scene columns in the dictionary of step S140 is checked before insertion. If a column of the channel scene information table is absent from the dictionary, the absent column name is added to the dictionary and the column is checked for a default value: if a default value exists, it is used as the value; otherwise the user is prompted to input the value of the column. After the column information is obtained, the data file id value obtained from the first data information table, together with the data, is inserted into the channel scene information table.
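The completeness check described above can be sketched as follows. This is only an illustration under stated assumptions: the "dictionary" is taken to be a mapping from column names to the values about to be inserted, and the function name `complete_record`, the `ask` callback, and the sample values are hypothetical.

```python
def complete_record(record, columns, defaults, ask):
    """Make sure `record` has a value for every column in `columns`.

    record   -- dict of column name -> value gathered so far (assumed shape)
    columns  -- column names of the target table, e.g. those of Table 3
    defaults -- dict of column name -> default value, where one exists
    ask      -- callback used to prompt the user for a missing value
    """
    for col in columns:
        if col not in record:
            # Absent column: use the default if one exists, else ask the user.
            record[col] = defaults[col] if col in defaults else ask(col)
    return record


if __name__ == "__main__":
    row = complete_record(
        {"device": "VNA"},                      # hypothetical known value
        ["device", "band", "content", "domain"],
        {"domain": "frequency"},                # hypothetical default
        ask=lambda col: input(f"Value for column '{col}': "),
    )
    print(row)
```

Passing the prompt as a callback keeps the check testable without user interaction.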
According to the database establishing method provided by this embodiment, the data feature table comprises the first data information table and the second data information table, and data information with the same data storage path is associated through the same data file id. Storing data of multiple categories in separate data information tables facilitates data expansion and data retrieval. When a new data feature table is established, keeping the data file ids consistent further improves the extensibility of the stored data.
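The insertion order described above (first data information table first, then the two second data information tables sharing the generated id) can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not the patent's implementation: the table names `data_info`, `channel_type`, and `channel_scene`, the column names of the first table, and the sample values are assumptions, and SQLite's `TEXT` and `REAL` stand in for the `SET`, `VARCHAR`, `TIMESTAMP`, and `FLOAT` types of Tables 3 and 4. The sketch reads the generated id from `cursor.lastrowid` instead of issuing a separate SQL query.

```python
import sqlite3


def build_schema(conn):
    """Create the three tables; all table names here are hypothetical."""
    conn.executescript("""
        CREATE TABLE data_info (
            id INTEGER PRIMARY KEY AUTOINCREMENT,  -- generated by the database
            path TEXT NOT NULL,
            x_name TEXT, x_unit TEXT, y_name TEXT, y_unit TEXT
        );
        CREATE TABLE channel_type (
            id INTEGER REFERENCES data_info(id),
            device TEXT, band TEXT, content TEXT, domain TEXT
        );
        CREATE TABLE channel_scene (
            id INTEGER REFERENCES data_info(id),
            city TEXT, position TEXT, powerline TEXT, time TEXT,
            sence TEXT, temperature REAL, humidity REAL
        );
    """)


def insert_file(conn, info, type_row, scene_row):
    """Insert one data file into all three tables and return its id."""
    cur = conn.execute(
        "INSERT INTO data_info (path, x_name, x_unit, y_name, y_unit) "
        "VALUES (?, ?, ?, ?, ?)", info)
    file_id = cur.lastrowid  # id generated by auto-increment
    conn.execute(
        "INSERT INTO channel_type (id, device, band, content, domain) "
        "VALUES (?, ?, ?, ?, ?)", (file_id, *type_row))
    conn.execute(
        "INSERT INTO channel_scene (id, city, position, powerline, time, "
        "sence, temperature, humidity) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (file_id, *scene_row))
    conn.commit()
    return file_id


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_schema(conn)
    fid = insert_file(
        conn,
        ("/data/handan/att_01.csv", "frequency", "MHz", "attenuation", "dB"),
        ("VNA", "narrowband", "Attenuation", "frequency"),
        ("Handan", "substation", "A", "2012-09-05 10:00:00", "indoor", 25.0, 40.0),
    )
    print("inserted data file id:", fid)
```

Because every row carries the same id, the three tables can later be joined on `id` to recover all information about one data file.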
As an optional implementation manner of this embodiment, step S130 includes:
First, the storage path of the data is split and parsed to obtain semantic tags.
Illustratively, each item in the character string list obtained in step S120 above, that is, each storage path, is split at path separators, underscores '_', spaces, brackets, and the like. Specifically, the Python string processing function split may be used; it returns a list of strings, which serves as the semantic tags and can be stored in a semantic tag list variable in the algorithm program for subsequent use. This embodiment does not limit how the storage path is split and parsed or how the semantic tags are obtained; those skilled in the art can decide as needed.
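As a rough sketch of this splitting step: Python's `str.split` handles one separator per call, so the example below uses `re.split` with a character class to cover several separators at once. The separator set, the function name `path_to_tags`, and the sample path are assumptions made for illustration.

```python
import re

# Assumed separator set: slashes, backslashes, underscores, whitespace, brackets.
_SEPARATORS = re.compile(r"[/\\_\s()\[\]]+")


def path_to_tags(path):
    """Split a storage path into a list of non-empty semantic tags."""
    return [tag for tag in _SEPARATORS.split(path) if tag]


if __name__ == "__main__":
    print(path_to_tags("measurements/Handan_attenuation (2012-09-05)/ch1.csv"))
```

Hyphens and dots are deliberately not treated as separators here, so a date such as `2012-09-05` survives as a single tag.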
Second, a semantic tag table is searched according to the semantic tag, where the semantic tag table stores the correspondence between semantic tags and storage locations in the data feature table.
Illustratively, the semantic tag table is a pre-stored list representing the correspondence between semantic tags and storage locations in the data feature table. The semantic tag table may include a semantic tag value, a data information table name, a data information table column name, and a data information value. The data information value is the specific value to be filled into the data information table; it may be a string, a number, or the like. In this embodiment, the column names and data types of the semantic tag table are shown in Table 5.
TABLE 5
[Table 5 is provided as an image in the original publication; it lists the column names and data types of the semantic tag table.]
With reference to Table 5, this embodiment gives a practical example: when the second data information table includes a channel type information table and a channel scene information table, the semantic tag table may be populated as shown in Table 6.
TABLE 6
| label | table_name | column_name | column_value |
|---|---|---|---|
| Handan | Channel scene information table | city | Handan |
| Yiwu | Channel scene information table | city | Yiwu |
| Attenuation | Channel type information table | content | Attenuation |
Based on the semantic tag, the semantic tag table may be looked up as follows: when the obtained semantic tag is "Handan", the "label" column in Table 6 above is queried for "Handan".
Then, the storage location of the data in the data characteristics table is determined.
Illustratively, still taking the practical example in Table 6 above, when "Handan" is found, the information corresponding to "Handan" is read. As Table 6 shows, "Handan" should be inserted into the channel measurement city column (city) of the channel scene information table, and the data information value to be stored in this column is "Handan".
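The lookup in Table 6 can be sketched as a dictionary keyed by label. The in-memory representation, the `TagEntry` type, and the function name `locate` are assumptions for illustration; the source only states that the semantic tag table is pre-stored.

```python
from typing import NamedTuple, Optional


class TagEntry(NamedTuple):
    table_name: str    # data information table name
    column_name: str   # column within that table
    column_value: str  # value to store in the column


# Rows of Table 6, keyed by the "label" column.
SEMANTIC_TAG_TABLE = {
    "Handan": TagEntry("channel scene information table", "city", "Handan"),
    "Yiwu": TagEntry("channel scene information table", "city", "Yiwu"),
    "Attenuation": TagEntry("channel type information table", "content", "Attenuation"),
}


def locate(tag: str) -> Optional[TagEntry]:
    """Return the storage location for a semantic tag, or None if unknown."""
    return SEMANTIC_TAG_TABLE.get(tag)


if __name__ == "__main__":
    print(locate("Handan"))
```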
According to the database establishing method provided by the embodiment, the semantic tags are obtained by dividing and analyzing the storage paths, and are searched in the pre-stored semantic tag table, so that the specific positions of the data in the data characteristic table are determined, and the accuracy of determining the data storage positions and the speed of determining the data storage positions are improved.
As an optional implementation manner in this embodiment, step S130 includes:
First, the storage path of the data is split and parsed to obtain semantic tags. For the specific manner, refer to the corresponding part of the above embodiment; the details are not repeated here.
Second, the semantic tags are classified and matched based on natural language processing to determine the position of the data in the data feature table.
Illustratively, classification and matching based on natural language processing means applying a classification algorithm and a string matching algorithm to the semantic tags. String matching uses regular expressions to judge whether a string fits a given pattern. For example, when the date tag '2012-09-05' satisfies the date regular expression "[0-9]{2,4}-[0-9]{1,2}$", the position of the data in the data feature table is determined to be the measurement time column of the channel scene information table, and the data information value is stored as a timestamp. As another example, when the city tag 'Handan' appears, the algorithm checks whether the tag belongs to a city database. The city database is a string dictionary in the algorithm program whose keys are the Chinese names and pinyin names of all cities in China and whose values are the Chinese names of the cities. When the tag matches a key of the city database, the position of the data in the data feature table is determined to be the measurement city column of the channel scene information table, and the data information value is the dictionary value of the matched key, namely the Chinese name of the city.
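A minimal sketch of this regex-plus-dictionary matching follows. The date pattern used here is a stricter full-date pattern chosen for the example rather than the expression quoted in the text, the two-city dictionary is a tiny stand-in for the full city database, and the function name `classify_tag` is hypothetical.

```python
import re
from datetime import datetime

# Assumed pattern for tags of the form YYYY-M-D (not the source's exact regex).
DATE_RE = re.compile(r"^[0-9]{4}-[0-9]{1,2}-[0-9]{1,2}$")

# Tiny stand-in for the city database: pinyin and Chinese names as keys,
# Chinese names as values.
CITY_DB = {"handan": "邯郸", "邯郸": "邯郸", "yiwu": "义乌", "义乌": "义乌"}


def classify_tag(tag):
    """Return (table, column, value) for a recognized tag, else None."""
    if DATE_RE.match(tag):
        try:
            value = datetime.strptime(tag, "%Y-%m-%d")
        except ValueError:  # matches the pattern but is not a real date
            return None
        return ("channel scene information table", "time", value)
    city = CITY_DB.get(tag.lower())
    if city is not None:
        return ("channel scene information table", "city", city)
    return None


if __name__ == "__main__":
    print(classify_tag("2012-09-05"))
    print(classify_tag("Handan"))
```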
According to the database establishing method provided by the embodiment, the semantic tags are obtained through segmentation and analysis of the storage paths, and the specific positions of the data in the data feature table are determined through classification and matching based on natural language processing, so that the accuracy of determining the data storage positions and the speed of determining the data storage positions are improved.
As an optional implementation manner of this embodiment, searching the semantic tag table according to the semantic tag includes: when the semantic tag is not found in the semantic tag table, adding the correspondence between the semantic tag and its storage position in the data feature table to the semantic tag table.
Illustratively, when the semantic tag is "Handan" but "Handan" is not found in the semantic tag table, the position in the data feature table corresponding to "Handan" is obtained by manual input or by the natural-language-processing-based approach described above: the channel measurement city column in the channel scene information table, with the data information value "Handan". The resulting entry {"Handan", channel scene information table, channel measurement city column, Handan} is then added to the semantic tag table.
According to the database establishing method provided by this embodiment, when the storage position of the data is determined through the semantic tag table and the semantic tag is not found in the table, the storage position is determined in another way, and the resulting correspondence between the semantic tag and its storage position in the data feature table is added to the semantic tag table. The semantic tag table thus becomes more complete and remains effective when it is later used to determine the storage position of data.
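The fallback described here can be sketched as a lookup that resolves and records a missing tag. The function name `lookup_or_add` and the `resolve` callback (standing in for manual input or the natural-language-processing-based matching) are hypothetical.

```python
def lookup_or_add(tag, tag_table, resolve):
    """Look up `tag`; if absent, resolve its location and remember it.

    tag_table -- dict of label -> (table_name, column_name, column_value)
    resolve   -- callback that determines the location of an unknown tag
    """
    entry = tag_table.get(tag)
    if entry is None:
        entry = resolve(tag)
        tag_table[tag] = entry  # later lookups hit the table directly
    return entry


if __name__ == "__main__":
    table = {}
    resolver = lambda t: ("channel scene information table", "city", t)
    print(lookup_or_add("Handan", table, resolver))
    print(table)
```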
The present embodiment provides a database retrieval method, as shown in fig. 2, including:
S210, obtaining information to be retrieved.
The information to be retrieved is input by a user. The database exposes a data query interface for front-end requests, and the information to be retrieved can be obtained from this interface.
S220, in the data characteristic database obtained by the database establishing method in the embodiment, corresponding one or more storage paths are obtained according to the information to be searched.
Illustratively, the one or more corresponding storage paths may be retrieved as follows. After the information to be retrieved is obtained from the data query interface, it is parsed to determine, for example, which channel type and which city it refers to. An SQL query statement is generated from the information to be retrieved, and the data list satisfying the conditions is queried from the SQL database with that statement. The back-end program packages the query result and returns it to the front end for parsing, which yields the one or more storage paths satisfying the conditions.
The specific implementation process is as follows: the data query interface receives the information to be retrieved; the data file acquisition interface takes the request information received by the data query interface and resolves the corresponding data file id values; the back-end system then queries the first data information table by those id values to obtain the one or more storage paths of the data files.
S230, obtaining a search result according to the storage path.
Illustratively, a read request is issued for the data file at each of the one or more storage paths, the data file is read, the file offset information of the data stored in the file is obtained, and the data storage information is returned to the front-end interface as the retrieval result.
According to the database retrieval method, the database of the above embodiment is searched with the obtained information to be retrieved, so that data files matching specific information or specific conditions are obtained directly, which improves the efficiency and flexibility of data file retrieval.
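The retrieval steps S210 to S230 can be sketched as a parameterized SQL query that joins the tables on the shared data file id and returns the matching storage paths. This is an illustration under assumptions: the table names, the reduced column set, the sample rows, and the function names `demo_db` and `find_paths` are hypothetical, and SQLite is used in place of whatever SQL database an implementation would choose.

```python
import sqlite3


def demo_db():
    """Build a small in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE data_info (id INTEGER PRIMARY KEY, path TEXT);
        CREATE TABLE channel_type (id INTEGER, content TEXT);
        CREATE TABLE channel_scene (id INTEGER, city TEXT);
        INSERT INTO data_info VALUES
            (1, '/data/handan/att_01.csv'),
            (2, '/data/yiwu/att_02.csv'),
            (3, '/data/handan/noise_03.csv');
        INSERT INTO channel_type VALUES
            (1, 'Attenuation'), (2, 'Attenuation'), (3, 'Noise');
        INSERT INTO channel_scene VALUES
            (1, 'Handan'), (2, 'Yiwu'), (3, 'Handan');
    """)
    return conn


def find_paths(conn, city=None, content=None):
    """Return storage paths of data files matching the given conditions."""
    sql = ("SELECT d.path FROM data_info AS d "
           "JOIN channel_type AS t ON t.id = d.id "
           "JOIN channel_scene AS s ON s.id = d.id")
    clauses, params = [], []
    if city is not None:
        clauses.append("s.city = ?")
        params.append(city)
    if content is not None:
        clauses.append("t.content = ?")
        params.append(content)
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY d.id"
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    print(find_paths(demo_db(), city="Handan", content="Attenuation"))
```

Using `?` placeholders rather than string formatting keeps user-supplied retrieval conditions out of the SQL text. The returned paths would then be used to open and read the data files.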
The present embodiment provides a database creation device, as shown in fig. 3, including:
a data feature table establishing module 310, configured to establish a data feature table, where the data feature table includes a first data information table, and the first data information table is used to store a plurality of different kinds of data information, and the data information includes a data storage path; the specific implementation manner is described in the related description of the step S110 in the method of the foregoing embodiment, which is not repeated herein.
A storage path obtaining module 320, configured to obtain a storage path of data in the original data file; the specific implementation manner is described in the related description of the step S120 in the method of the foregoing embodiment, which is not repeated herein.
A storage location obtaining module 330, configured to obtain a storage location of the data in the data feature table according to the storage path of the data; the specific implementation manner is described in the related description of the step S130 in the method of the foregoing embodiment, which is not repeated herein.
A data inserting module 340, configured to insert data into the data feature table according to the storage location; the specific implementation manner is described in the related description of the step S140 in the method of the foregoing embodiment, which is not repeated herein.
The database construction module 350 is configured to construct a data feature database according to the data feature table after the data is inserted. The specific implementation manner is described in the related description of the step S150 in the method of the foregoing embodiment, which is not repeated herein.
As an optional implementation manner of this embodiment, the data feature table establishing module 310 further includes a second data information table establishing module, configured to establish a second data information table, where the second data information table stores a different type of data information from the first data information table, and the two tables contain the same data file id value, which is used to associate data information across different data tables. The specific implementation manner is described in the relevant description of the corresponding parts in the method of the foregoing embodiment, which is not repeated herein.
As an alternative implementation of this embodiment, the storage location obtaining module 330 includes:
the first semantic tag acquisition module is used for dividing and analyzing a storage path of the data to obtain semantic tags; the specific implementation manner is described in the relevant description of the corresponding parts in the method of the foregoing embodiment, which is not repeated herein.
The first semantic tag searching module is used for searching a semantic tag table according to the semantic tag, and the semantic tag table stores the corresponding relation between the semantic tag and the storage position in the data characteristic table. The specific implementation manner is described in the relevant description of the corresponding parts in the method of the foregoing embodiment, which is not repeated herein.
And the first storage position determining module is used for determining the storage position of the data in the data characteristic table. The specific implementation manner is described in the relevant description of the corresponding parts in the method of the foregoing embodiment, which is not repeated herein.
As an alternative implementation of this embodiment, the storage location obtaining module 330 includes:
the second semantic tag acquisition module is used for dividing and analyzing the storage path of the data to obtain semantic tags; the specific implementation manner is described in the relevant description of the corresponding parts in the method of the foregoing embodiment, which is not repeated herein.
And the second storage position determining module is used for classifying and matching the semantic tags based on natural language processing and determining the position of the data in the data characteristic table. The specific implementation manner is described in the relevant description of the corresponding parts in the method of the foregoing embodiment, which is not repeated herein.
As an optional implementation manner of this embodiment, the first semantic tag searching module includes:
the semantic tag adding module is used for adding the corresponding relation between the semantic tag and the storage position of the semantic tag in the data feature table to the semantic tag table when the semantic tag is not found in the semantic tag table. The specific implementation manner is described in the relevant description of the corresponding parts in the method of the foregoing embodiment, which is not repeated herein.
As an optional implementation manner in this embodiment, the data feature table in the data feature table establishing module 310 includes a channel data feature table. The specific implementation manner is described in the relevant description of the corresponding parts in the method of the foregoing embodiment, which is not repeated herein.
As an optional implementation manner of this embodiment, the second data information table in the second data information table creating module includes:
a channel type information table including any one or more of a device type, a measurement content type, and measurement time/frequency domain information used in channel data measurement;
the channel scene information table comprises any one or more of measurement time, measurement environment, measurement place, measurement climate, altitude, humidity and temperature information. The specific implementation manner is described in the relevant description of the corresponding parts in the method of the foregoing embodiment, which is not repeated herein.
The present embodiment provides a database retrieval apparatus, as shown in fig. 4, including:
the information to be retrieved obtaining module 410 is configured to obtain information to be retrieved; the specific implementation manner is described in the related description of the step S210 in the method of the foregoing embodiment, which is not repeated herein.
The storage path retrieving module 420 is configured to retrieve one or more corresponding storage paths according to information to be retrieved from the data feature database obtained by the database establishing method in the above embodiment; the specific implementation manner is described in the related description of the step S220 in the method of the foregoing embodiment, which is not repeated herein.
The retrieval result obtaining module 430 is configured to obtain a retrieval result according to the storage path. The specific implementation manner is described in the related description of the step S230 in the method of the foregoing embodiment, which is not repeated herein.
Embodiments of the present application also provide an electronic device which, as shown in fig. 5, includes a processor 510 and a memory 520; the processor 510 and the memory 520 may be connected by a bus or by other means.
The processor 510 may be a central processing unit (Central Processing Unit, CPU). Processor 510 may also be a chip such as other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or a combination thereof.
The memory 520, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as the program instructions/modules corresponding to the database creation or retrieval method in the embodiments of the present invention. The processor performs its various functional applications and data processing by running the non-transitory software programs, instructions, and modules stored in the memory.
Memory 520 may include a storage program area that may store an operating system, at least one application program required for functionality, and a storage data area; the storage data area may store data created by the processor, etc. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, memory 520 may optionally include memory located remotely from the processor, such remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The one or more modules are stored in the memory 520 and when executed by the processor 510 perform a database creation or retrieval method as in the embodiments shown in fig. 1 or fig. 2.
The details of the above electronic device may be understood correspondingly with reference to the corresponding related descriptions and effects in the embodiments shown in fig. 1 or fig. 2, which are not repeated here.
The present embodiment also provides a computer storage medium storing computer executable instructions that can perform the database creation or retrieval method of any of the above method embodiments. Wherein the storage medium may be a magnetic Disk, an optical Disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a Flash Memory (Flash Memory), a Hard Disk (HDD), or a Solid State Drive (SSD); the storage medium may also comprise a combination of memories of the kind described above.
It is apparent that the above embodiments are given only for clarity of illustration and do not limit the implementations. Those of ordinary skill in the art can make other variations or modifications on the basis of the above description. It is neither necessary nor possible to enumerate all implementations here. Obvious variations or modifications derived therefrom remain within the protection scope of the invention.

Claims (11)

1. A database creation method, comprising:
establishing a data characteristic table, wherein the data characteristic table comprises a first data information table, the first data information table is used for storing a plurality of different types of data information, and the data information comprises a data storage path;
acquiring a storage path of data in an original data file;
dividing and analyzing the storage path of the data to obtain a semantic tag;
searching a semantic tag table according to the semantic tag, wherein the semantic tag table stores the corresponding relation between the semantic tag and a storage position in a data feature table;
determining a storage position of the data in the data characteristic table;
inserting the data into the data feature table according to the storage position;
and constructing a data characteristic database according to the data characteristic table after the data is inserted.
2. The method of claim 1, wherein the data feature table further comprises a second data information table, the second data information table stores a different type of data information from the first data information table, and the second data information table and the first data information table contain the same data file id value, which is used to associate data information in different data tables.
3. The method according to claim 1, wherein the obtaining, according to the storage path of the data, the storage location of the data in the data feature table includes:
dividing and analyzing the storage path of the data to obtain a semantic tag;
and classifying and matching the semantic tags based on natural language processing, and determining the position of the data in the data characteristic table.
4. The method of claim 1, wherein said looking up a semantic tag table based on said semantic tags comprises:
and when the semantic tag is not found in the semantic tag table, adding the corresponding relation between the semantic tag and the storage position of the semantic tag in the data feature table to the semantic tag table.
5. The method of claim 2, wherein the data characteristics table comprises a channel data characteristics table.
6. The method of claim 5, wherein the second data information table comprises:
a channel type information table including any one or more of a device type, a measurement content type, and measurement time/frequency domain information used in channel data measurement;
a channel scene information table including any one or more of measurement time, measurement environment, measurement place, measurement climate, altitude, humidity and temperature information.
7. A database retrieval method, comprising:
acquiring information to be retrieved;
in the data feature database obtained by the database creation method of any one of claims 1 to 6, retrieving a corresponding one or more storage paths according to the information to be retrieved;
and obtaining a retrieval result according to the storage path.
8. A database creation apparatus, comprising:
a data feature table establishing module, configured to establish a data feature table, wherein the data feature table comprises a first data information table, the first data information table is used for storing a plurality of different types of data information, and the data information comprises a data storage path;
a storage path acquisition module, configured to acquire the storage path of the data in the original data file;
a storage position acquisition module, configured to divide and analyze the storage path of the data to obtain a semantic tag;
search a semantic tag table according to the semantic tag, wherein the semantic tag table stores the corresponding relation between the semantic tag and a storage position in a data feature table; and
determine a storage position of the data in the data characteristic table;
the data insertion module is used for inserting the data into the data characteristic table according to the storage position;
and the database construction module is used for constructing a data characteristic database according to the data characteristic table after the data is inserted.
9. A database retrieval apparatus, comprising:
the information to be searched acquisition module is used for acquiring information to be searched;
a storage path retrieval module, configured to retrieve, in a data feature database obtained by the database creation method of any one of claims 1 to 6, one or more storage paths according to the information to be retrieved;
and the retrieval result acquisition module is used for acquiring a retrieval result according to the storage path.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the database creation method of any of claims 1-6 or the database retrieval method of claim 7 when the program is executed by the processor.
11. A storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the database creation method of any of claims 1-6 or the database retrieval method of claim 7.
CN202010104739.2A 2020-02-20 2020-02-20 Database building and searching method and device Active CN111309709B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010104739.2A CN111309709B (en) 2020-02-20 2020-02-20 Database building and searching method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010104739.2A CN111309709B (en) 2020-02-20 2020-02-20 Database building and searching method and device

Publications (2)

Publication Number Publication Date
CN111309709A CN111309709A (en) 2020-06-19
CN111309709B true CN111309709B (en) 2023-05-23

Family

ID=71160006

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010104739.2A Active CN111309709B (en) 2020-02-20 2020-02-20 Database building and searching method and device

Country Status (1)

Country Link
CN (1) CN111309709B (en)



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant