CN114818709A - Method and device for acquiring field names of data table, storage medium and electronic device - Google Patents

Method and device for acquiring field names of data table, storage medium and electronic device

Info

Publication number
CN114818709A
CN114818709A CN202210434715.2A CN202210434715A CN114818709A CN 114818709 A CN114818709 A CN 114818709A CN 202210434715 A CN202210434715 A CN 202210434715A CN 114818709 A CN114818709 A CN 114818709A
Authority
CN
China
Prior art keywords
field name
annotation
chinese
initial
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210434715.2A
Other languages
Chinese (zh)
Inventor
李俊颉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Haier Technology Co Ltd
Haier Smart Home Co Ltd
Original Assignee
Qingdao Haier Technology Co Ltd
Haier Smart Home Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Haier Technology Co Ltd and Haier Smart Home Co Ltd
Priority to CN202210434715.2A
Publication of CN114818709A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses a method and a device for acquiring field names of a data table, a storage medium and an electronic device, and relates to the technical field of smart homes. The method for acquiring the field names of the data table comprises the following steps: obtaining an initial field name carried in a table building statement and obtaining a Chinese annotation corresponding to the initial field name, wherein the table building statement is used for creating, in a database, a data table carrying a data field indicated by the initial field name, and the Chinese annotation is used for indicating the Chinese meaning of the initial field name; acquiring a reference field name matching the Chinese annotation from annotation vocabulary and field names having a correspondence; and acquiring a target field name from the initial field name and the reference field name, wherein the target field name is used for constructing the data table. This technical scheme addresses problems in the related art such as the low consistency of the field names used for constructing data tables.

Description

Method and device for acquiring field names of data table, storage medium and electronic device
Technical Field
The application relates to the technical field of smart homes, in particular to a method and a device for acquiring field names of a data table, a storage medium and an electronic device.
Background
In daily database development, a database management tool is usually used to construct data tables. However, there is currently no unified standard for naming database fields, and different developers have different naming habits. As a result, the same data field may be given different field names depending on the style of the company, the business and the developer; for example, an order number may be named order_sn, order_number or order_no. When the same data field has several corresponding field names, later use of the data table becomes considerably less convenient for developers, testers and operation and maintenance personnel.
No effective solution has yet been proposed for the problems in the related art, such as the low consistency of the field names used for constructing data tables.
Disclosure of Invention
The embodiment of the application provides a method and a device for acquiring field names of a data table, a storage medium and an electronic device, and aims to at least solve the problems that the field names used for constructing the data table are low in consistency and the like in the related technology.
According to an embodiment of the present application, a method for obtaining a field name of a data table is provided, including:
obtaining an initial field name carried in a table building statement, and obtaining a Chinese annotation corresponding to the initial field name, wherein the table building statement is used for creating, in a database, a data table carrying a data field indicated by the initial field name, and the Chinese annotation is used for indicating the Chinese meaning of the initial field name;
acquiring a reference field name matching the Chinese annotation from annotation vocabulary and field names having a correspondence;
and acquiring a target field name from the initial field name and the reference field name, wherein the target field name is used for constructing the data table.
Optionally, acquiring the reference field name matching the Chinese annotation from the annotation vocabulary and field names having a correspondence includes:
acquiring an annotation vocabulary set corresponding to the Chinese annotation, wherein the annotation vocabulary set comprises target annotation vocabulary capable of expressing the Chinese meaning;
and acquiring, from the annotation vocabulary and field names having a correspondence, the field name corresponding to each target annotation vocabulary, to obtain the reference field name.
Optionally, acquiring the field name corresponding to each target annotation vocabulary from the annotation vocabulary and field names having a correspondence, to obtain the reference field name, includes:
acquiring the current usage rate of the field name corresponding to each target annotation vocabulary;
and determining, as the reference field name, each field name that corresponds to a target annotation vocabulary and whose current usage rate is greater than a first threshold.
Optionally, acquiring the annotation vocabulary set corresponding to the Chinese annotation includes one of:
obtaining, from the annotation vocabulary recorded in a word stock, the annotation vocabulary whose semantic similarity to the Chinese annotation is higher than a first similarity, to obtain the annotation vocabulary set, wherein the word stock is used for recording the annotation vocabulary and field names having a correspondence;
acquiring, from a plurality of vocabulary categories recorded in a word stock, a target vocabulary category whose semantic similarity to the Chinese annotation is higher than a second similarity, wherein the word stock is used for recording the annotation vocabulary and field names having a correspondence, and the vocabulary categories are obtained by classifying the annotation vocabulary recorded in the word stock according to semantics; and obtaining the annotation vocabulary included in the target vocabulary category, to obtain the annotation vocabulary set.
Optionally, acquiring the target field name from the initial field name and the reference field name includes one of:
determining the initial field name as the target field name in a case where the reference field names include the initial field name; and determining the field name with the highest current usage rate among the reference field names as the target field name in a case where the reference field names do not include the initial field name;
displaying the initial field name and the reference field names as candidate field names on a display interface; and determining, as the target field name, the candidate field name on which a selection operation is performed.
Optionally, after acquiring the target field name from the initial field name and the reference field name, the method further includes:
establishing a correspondence between the Chinese annotation and the initial field name in a case where the initial field name is determined to be the target field name and the annotation vocabulary and field names having a correspondence do not include the initial field name;
storing the Chinese annotation and the initial field name having the correspondence into the annotation vocabulary and field names having a correspondence.
Optionally, after storing the Chinese annotation and the initial field name having the correspondence into the annotation vocabulary and field names having a correspondence, the method further includes:
detecting a target usage rate of the initial field name in a case where the storage time of the initial field name reaches a target time;
deleting the Chinese annotation and the initial field name having the correspondence from the annotation vocabulary and field names having a correspondence in a case where the target usage rate is less than or equal to a second threshold.
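The storing and deleting steps above can be illustrated with a minimal Python sketch. The class WordStock, its method names, and the definition of usage rate as a field name's share of all uses recorded for the same annotation are assumptions made for illustration only; the application does not prescribe a data structure or a formula for the usage rate.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    field_name: str
    stored_at: float  # when the correspondence was stored (seconds)
    uses: int = 1     # times this field name was chosen as a target


class WordStock:
    """Hypothetical store of annotation -> field-name correspondences."""

    def __init__(self, target_time: float, second_threshold: float):
        self.target_time = target_time            # storage time that triggers a check
        self.second_threshold = second_threshold  # usage rate needed to be kept
        self.entries: dict[str, list[Entry]] = {}

    def store_if_new(self, annotation: str, field_name: str, now: float) -> bool:
        """Store the pair only when the field name is not recorded yet."""
        known = {e.field_name for es in self.entries.values() for e in es}
        if field_name in known:
            return False
        self.entries.setdefault(annotation, []).append(Entry(field_name, now))
        return True

    def record_use(self, annotation: str, field_name: str, times: int = 1) -> None:
        for e in self.entries.get(annotation, []):
            if e.field_name == field_name:
                e.uses += times

    def prune(self, now: float) -> list[tuple[str, str]]:
        """Delete pairs that reached the target time with too low a usage rate."""
        removed = []
        for annotation, entries in self.entries.items():
            total = sum(e.uses for e in entries)
            kept = []
            for e in entries:
                due = now - e.stored_at >= self.target_time
                # Assumed usage rate: share of uses among names of this annotation.
                if due and e.uses / total <= self.second_threshold:
                    removed.append((annotation, e.field_name))
                else:
                    kept.append(e)
            self.entries[annotation] = kept
        return removed
```

In this sketch a newly stored field name that is rarely chosen is removed once its storage time reaches the target time, while a frequently chosen name is kept.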
According to another embodiment of the present application, there is provided an apparatus for obtaining a field name of a data table, including:
a first obtaining module, configured to obtain an initial field name carried in a table building statement and obtain a Chinese annotation corresponding to the initial field name, wherein the table building statement is used for creating, in a database, a data table carrying a data field indicated by the initial field name, and the Chinese annotation is used for indicating the Chinese meaning of the initial field name;
a second obtaining module, configured to acquire a reference field name matching the Chinese annotation from annotation vocabulary and field names having a correspondence;
a third obtaining module, configured to acquire a target field name from the initial field name and the reference field name, wherein the target field name is used for constructing the data table.
According to another aspect of the embodiments of the present application, there is also provided a computer-readable storage medium, in which a computer program is stored, where the computer program is configured to execute the above-mentioned method for acquiring a name of a field of a data table when the computer program runs.
According to another aspect of the embodiments of the present application, there is also provided an electronic apparatus, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the method for obtaining a name of a data table field by using the computer program.
In the embodiment of the application, an initial field name carried in a table building statement is obtained, and a Chinese annotation corresponding to the initial field name is obtained, wherein the table building statement is used for creating, in a database, a data table carrying a data field indicated by the initial field name, and the Chinese annotation is used for indicating the Chinese meaning of the initial field name; a reference field name matching the Chinese annotation is acquired from annotation vocabulary and field names having a correspondence; and a target field name is acquired from the initial field name and the reference field name, wherein the target field name is used for constructing the data table. That is, the initial field name is first obtained from the table building statement, together with its corresponding Chinese annotation. Although the same data field may have different initial field names, the Chinese meaning of the data field stays the same, so the corresponding reference field name can be determined from the Chinese annotation that carries this meaning: the reference field name corresponding to the Chinese annotation is matched from the annotation vocabulary and field names having a correspondence, which strengthens the naming consistency of field names. Finally, the target field name is determined from the initial field name and the reference field name, and the data table can subsequently be constructed using the target field name. This technical scheme addresses problems in the related art such as the low consistency of the field names used for constructing data tables, and achieves the technical effect of improving that consistency.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly described below. Those skilled in the art can obtain other drawings from these drawings without inventive effort.
FIG. 1 is a diagram illustrating a hardware environment of a method for obtaining names of fields in a data table according to an embodiment of the present application;
FIG. 2 is a flowchart of a method for obtaining field names of a data table according to an embodiment of the present application;
FIG. 3 is an interface diagram of a manner of obtaining Chinese annotations according to an embodiment of the present application;
FIG. 4 is a diagram illustrating a correspondence between annotation vocabulary and field names according to an embodiment of the present application;
FIG. 5 is a first diagram illustrating retrieval of annotation vocabulary according to an embodiment of the present application;
FIG. 6 is a second diagram illustrating retrieval of annotation vocabulary according to an embodiment of the present application;
FIG. 7 is a first diagram illustrating a manner of generating a target field name according to an embodiment of the present application;
FIG. 8 is a second diagram illustrating a manner of generating a target field name according to an embodiment of the present application;
FIG. 9 is a flowchart of a method for obtaining a field name of a data table according to an embodiment of the present application;
FIG. 10 is a block diagram of an apparatus for acquiring a field name of a data table according to an embodiment of the present application.
Detailed Description
In order that those skilled in the art may better understand the technical solutions of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person skilled in the art based on the embodiments given herein without creative effort shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to an aspect of the embodiments of the present application, a method for obtaining a field name of a data table is provided. The method is widely applicable to whole-house intelligent digital control scenarios such as Smart Home, smart household, smart home device ecosystems and Intelligent House ecosystems. Optionally, in this embodiment, the method may be applied to a hardware environment formed by the terminal device 102 and the server 104 as shown in FIG. 1. As shown in FIG. 1, the server 104 is connected to the terminal device 102 through a network and may be configured to provide a service (e.g., an application service) for the terminal or for a client installed on the terminal. A database may be set on the server or independently of the server to provide a data storage service for the server 104, and a cloud computing and/or edge computing service may be configured on the server or independently of the server to provide a data operation service for the server 104.
The network may include, but is not limited to, at least one of: a wired network and a wireless network. The wired network may include, but is not limited to, at least one of: a wide area network, a metropolitan area network and a local area network. The wireless network may include, but is not limited to, at least one of: WIFI (Wireless Fidelity) and Bluetooth. The terminal device 102 may be, but is not limited to, a PC, a mobile phone, a tablet computer, a smart air conditioner, a smart range hood, a smart refrigerator, a smart oven, a smart stove, a smart washing machine, a smart water heater, smart washing equipment, a smart dishwasher, smart projection equipment, a smart TV, a smart clothes hanger, a smart curtain, smart audio-visual equipment, a smart socket, a smart stereo, a smart speaker, smart fresh air equipment, smart kitchen and bathroom equipment, smart bathroom equipment, a smart sweeping robot, a smart window-cleaning robot, a smart mopping robot, smart air purification equipment, a smart steamer, a smart microwave oven, a smart kitchen water heater, a smart water purifier, a smart water dispenser, a smart door lock, and the like.
In this embodiment, a method for acquiring a field name of a data table is provided and is applied to the terminal device described above. FIG. 2 is a flowchart of a method for acquiring a field name of a data table according to an embodiment of the present application, and the flow includes the following steps:
step S202, obtaining an initial field name carried in a table building statement, and obtaining a Chinese annotation corresponding to the initial field name, wherein the table building statement is used for creating, in a database, a data table carrying a data field indicated by the initial field name, and the Chinese annotation is used for indicating the Chinese meaning of the initial field name;
step S204, acquiring a reference field name matching the Chinese annotation from annotation vocabulary and field names having a correspondence;
step S206, acquiring a target field name from the initial field name and the reference field name, wherein the target field name is used for constructing the data table.
Through the above steps, the initial field name is obtained from the table building statement, together with its corresponding Chinese annotation. Although the same data field may have different initial field names, the Chinese meaning of the data field stays the same, so the corresponding reference field name can be determined from the Chinese annotation that carries this meaning: the reference field name corresponding to the Chinese annotation is matched from the annotation vocabulary and field names having a correspondence, which strengthens the naming consistency of field names. The target field name is then determined from the initial field name and the reference field name, and the data table can subsequently be constructed using the target field name. This technical scheme addresses problems in the related art such as the low consistency of the field names used for constructing data tables, and achieves the technical effect of improving that consistency.
In the technical solution provided in step S202, the table building statement may be, but is not limited to, a statement used for creating a data table in a database. The table building statement may indicate the column information included in the data table, such as the field name, the data type and the maximum character length of a specified column. For example, the statement "order_number varchar(255)" indicates that the field name of the corresponding column is "order_number", the data type is "varchar", and the maximum length is 255 characters. Besides the field name "order_number", other names such as order_sn and order_no may be used because developers follow different naming styles.
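As a rough illustration of how an initial field name could be read from one column definition of a table building statement, the following Python sketch uses a regular expression. The function parse_column is hypothetical, the pattern covers only simple definitions, and the handling of a MySQL-style COMMENT clause is an assumption; the English string 'order number' stands in for the Chinese annotation text.

```python
import re

# One column definition, e.g.: order_number varchar(255) COMMENT 'order number'
COLUMN_RE = re.compile(
    r"""
    ^\s*`?(?P<name>\w+)`?\s+                   # field name, optionally back-quoted
    (?P<type>[A-Za-z]+)                        # data type, e.g. varchar
    (?:\s*\(\s*(?P<length>\d+)\s*\))?          # optional maximum length
    (?:.*?\bCOMMENT\s+'(?P<comment>[^']*)')?   # optional MySQL-style column comment
    """,
    re.IGNORECASE | re.VERBOSE,
)


def parse_column(definition):
    """Return name, type, length and comment of a column, or None if no match."""
    match = COLUMN_RE.match(definition)
    if match is None:
        return None
    length = match.group("length")
    return {
        "name": match.group("name"),
        "type": match.group("type").lower(),
        "length": int(length) if length is not None else None,
        "comment": match.group("comment"),
    }


print(parse_column("order_number varchar(255)"))
```

A production parser would need to handle constraints, quoting rules and dialect differences that this pattern ignores.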
Optionally, in this embodiment, although field names may differ between developers, companies and departments, the corresponding Chinese annotations remain consistent. For example, "order_sn", "order_number" and "order_no" are 3 different field names, but because they correspond to the same Chinese meaning, the Chinese annotation of each may be "order number".
Optionally, in this embodiment, the manner of obtaining the Chinese annotation may include one of the following:
Mode one: in a case where the table building statement carries the Chinese annotation, the Chinese annotation is obtained directly.
Mode two: when the initial field name is detected, an input interface pops up so that the user can provide the corresponding Chinese annotation. For example, FIG. 3 is an interface diagram of a manner of obtaining a Chinese annotation according to an embodiment of the present application; as shown in FIG. 3, when the initial field name (Address) is detected, an input interface pops up so that the user can input the corresponding Chinese annotation.
Mode three: the Chinese meaning of the initial field name is identified automatically and the corresponding Chinese annotation is generated. For example, if the parts of order_number are identified as meaning "order" and "number", the generated Chinese annotation may be "order number".
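The three modes above can be sketched in Python as follows. The function obtain_annotation, the ask_user callback and the glossary mapping are hypothetical names; trying the modes in order as a fallback chain is an assumption of this sketch, since the description only states that one of the modes is used. English strings stand in for the Chinese text.

```python
def obtain_annotation(initial_name, statement_comment=None, ask_user=None, glossary=None):
    """Return an annotation for a field name, or None when none can be found."""
    # Mode one: the table building statement already carries the annotation.
    if statement_comment:
        return statement_comment
    # Mode two: ask the user, e.g. through a pop-up input interface.
    if ask_user is not None:
        answer = ask_user(initial_name)
        if answer:
            return answer
    # Mode three: derive the meaning from the parts of the field name,
    # using an assumed glossary from name parts to their meanings.
    glossary = glossary or {}
    parts = [glossary[p] for p in initial_name.lower().split("_") if p in glossary]
    return " ".join(parts) if parts else None


glossary = {"order": "order", "number": "number"}
print(obtain_annotation("order_number", glossary=glossary))
```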
In the technical solution provided in step S204 above, the annotation vocabulary and field names having a correspondence may be stored in a word stock in forms including, but not limited to, key-value pairs and tables. Taking the tabular form as an example, FIG. 4 is a schematic diagram of the correspondence between annotation vocabulary and field names according to an embodiment of the present application. As shown in FIG. 4, the annotation vocabulary and field names having a correspondence are recorded in the "annotation vocabulary & field name table" stored in the word stock; for example, the annotation vocabulary (order number) corresponds to the field name (order_number), and the annotation vocabulary (address) corresponds to the field name (Address).
Optionally, in this embodiment, one Chinese annotation may match a plurality of reference field names. For example, the reference field names matched by the Chinese annotation (order number) may include, but are not limited to, "order_sn", "order_number" and "order_no"; that is, "order_sn", "order_number" and "order_no" may all be used as reference field names corresponding to "order number".
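A key-value form of such a word stock, in which one annotation maps to several field names, might look like the following Python sketch. The names word_stock, add_correspondence and reference_field_names are assumptions for illustration, and exact-match lookup is the simplest possible matching rule rather than the one the application requires.

```python
from collections import defaultdict

# annotation vocabulary -> field names recorded for it (hypothetical contents)
word_stock = defaultdict(list)


def add_correspondence(annotation, field_name):
    """Record a correspondence, ignoring duplicates."""
    if field_name not in word_stock[annotation]:
        word_stock[annotation].append(field_name)


def reference_field_names(annotation):
    """Return a copy of the field names recorded for an annotation."""
    # .get avoids creating an empty entry for unknown annotations.
    return list(word_stock.get(annotation, []))


for name in ("order_sn", "order_number", "order_no"):
    add_correspondence("order number", name)
add_correspondence("address", "address")

print(reference_field_names("order number"))
```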
In an exemplary embodiment, the reference field name matching the Chinese annotation may be acquired from the annotation vocabulary and field names having a correspondence by, but not limited to, the following manner: acquiring an annotation vocabulary set corresponding to the Chinese annotation, wherein the annotation vocabulary set comprises target annotation vocabulary capable of expressing the Chinese meaning; and acquiring, from the annotation vocabulary and field names having a correspondence, the field name corresponding to each target annotation vocabulary, to obtain the reference field name.
Optionally, in this embodiment, the annotation vocabulary set may include, but is not limited to, a plurality of target annotation vocabulary, each of which can express the Chinese meaning. For example, the Chinese meaning of the Chinese annotation (order number) may be "order number", and the annotation vocabulary set for this Chinese meaning may include, but is not limited to, "order number", "goods number", and the like.
Optionally, in this embodiment, taking "order number" and "goods number" as the annotation vocabulary set, acquiring the field name corresponding to each target annotation vocabulary from the annotation vocabulary and field names having a correspondence may include, but is not limited to, the following: the target annotation vocabulary may include, but is not limited to, "order number" and "goods number", and the corresponding reference field names may include, but are not limited to, "order_number", "order_id", "index_number", and the like.
Optionally, in this embodiment, the annotation vocabulary set corresponding to the Chinese annotation may be obtained by, but is not limited to, performing word segmentation on the Chinese annotation. After the syntax check is passed, the field name and the Chinese annotation of the field are transmitted to Elasticsearch, the Chinese annotation is segmented word by word in sequence, and existing similar fields are matched in the word stock. For example, the Chinese annotation "order number" is segmented into "order" and "number", where the similar fields corresponding to "order" may include "order", "purchase" and "index", and the similar fields corresponding to "number" may include "no", "number" and "id". These may be freely combined to obtain "order_number", "order_id", "index_number" and the like as reference field names.
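The free combination of similar fields can be sketched with a Cartesian product, as in the following Python example. It assumes the annotation has already been segmented (the segmentation service is not modelled here), and the table SIMILAR_FIELDS and the function combine_reference_names are hypothetical; joining the parts with an underscore is also an assumption based on the example names.

```python
from itertools import product

# Similar fields matched in the word stock for each segment (example values
# taken from the description above).
SIMILAR_FIELDS = {
    "order": ["order", "purchase", "index"],
    "number": ["no", "number", "id"],
}


def combine_reference_names(segments, similar=SIMILAR_FIELDS):
    """Combine one similar field per segment into candidate field names."""
    if not segments:
        return []
    choices = [similar.get(segment, []) for segment in segments]
    if not all(choices):
        return []  # some segment has no similar field in the word stock
    return ["_".join(parts) for parts in product(*choices)]


names = combine_reference_names(["order", "number"])
print(len(names), names[:3])
```

With three similar fields per segment, the sketch yields nine candidates, among them "order_number", "order_id" and "index_number".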
In an exemplary embodiment, the field name corresponding to each target annotation vocabulary may be acquired from the annotation vocabulary and field names having a correspondence, to obtain the reference field name, by, but not limited to, the following manner: acquiring the current usage rate of the field name corresponding to each target annotation vocabulary; and determining, as the reference field name, each field name that corresponds to a target annotation vocabulary and whose current usage rate is greater than a first threshold.
Optionally, in this embodiment, the current usage rate may be, but is not limited to being, used to indicate how frequently the corresponding field name is used. For example, "order_number: 50%", "order_id: 30%" and "index_number: 2%" may mean that 50% of users in the history records use "order_number" as the field name corresponding to the target annotation vocabulary, while "order_id" and "index_number" account for 30% and 2% respectively; that is, "order_number" is the choice of most users.
Optionally, in this embodiment, the first threshold may be set according to, but not limited to, actual requirements, and is intended to recommend the mainstream names in the history records to the user as a reference. For example, a field name whose current usage rate is greater than 20% may be determined as a reference field name. Taking "order_number: 50%", "order_id: 30%" and "index_number: 2%" as an example, with the first threshold set to 20%, the reference field names may be "order_number" and "order_id".
Optionally, in this embodiment, besides determining the reference field names by the first threshold, the field names ranked in the top 15 by current usage rate in the history records may be used as the reference field names. The number of ranks may of course be set according to specific requirements; the main point is to recommend mainstream names to the user as a reference.
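Both selection rules, the first threshold and the top-ranked names, can be sketched in a few lines of Python. The function names by_threshold and top_ranked are hypothetical, and the usage rates are the example figures from the description expressed as fractions.

```python
# Current usage rates of candidate field names (example figures from the text).
usage = {"order_number": 0.50, "order_id": 0.30, "index_number": 0.02}


def by_threshold(usage_rates, first_threshold):
    """Field names whose current usage rate is greater than the first threshold."""
    return [name for name, rate in usage_rates.items() if rate > first_threshold]


def top_ranked(usage_rates, count):
    """The `count` field names with the highest current usage rate."""
    return sorted(usage_rates, key=usage_rates.get, reverse=True)[:count]


print(by_threshold(usage, 0.20))
print(top_ranked(usage, 15))
```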
In an exemplary embodiment, the set of annotation words corresponding to the chinese annotation may be obtained, but is not limited to, by: obtaining a comment vocabulary with semantic similarity higher than first similarity with the Chinese comment from comment vocabularies recorded in a word stock to obtain a comment vocabulary set, wherein the word stock is used for recording the comment vocabularies and field names with corresponding relations; acquiring a target vocabulary category with semantic similarity higher than a second similarity with the Chinese annotation from a plurality of vocabulary categories recorded in a word stock, wherein the word stock is used for recording annotation vocabularies and field names with corresponding relations, and the vocabulary categories are obtained by classifying the annotation vocabularies recorded in the word stock according to semantics; and obtaining the annotation vocabulary included in the target vocabulary category to obtain the annotation vocabulary set.
Optionally, in this embodiment, fig. 5 is a schematic diagram of obtaining a annotation vocabulary according to an embodiment of the present application, as shown in fig. 5, obtaining an annotation vocabulary with a semantic similarity higher than the first similarity with the chinese annotation from annotation vocabularies recorded in a thesaurus may include, but is not limited to, matching the annotation vocabulary recorded in the thesaurus with semantics of the chinese annotation one by one, obtaining an annotation vocabulary with a similarity higher than the first similarity, and obtaining an annotation vocabulary set, for example: the chinese annotation (order number) may be matched with the annotated words (order number, order code, address, … …, city) recorded in the thesaurus one by one, and the annotated words with a similarity higher than the first similarity (80%) are obtained, resulting in an annotated word set (order number, order code).
Optionally, in this embodiment, FIG. 6 is a schematic diagram of obtaining annotation vocabularies according to an embodiment of the present application. As shown in FIG. 6, the annotation vocabularies recorded in the thesaurus are divided into N vocabulary categories according to semantics, and each vocabulary category includes a plurality of annotation vocabularies that allow the same meaning to be expressed; for example, address, place, and similar words allow the same meaning (address) to be expressed. The Chinese annotation is compared with the semantics of each of the N vocabulary categories, the target vocabulary category (order number, order code) whose semantic similarity is higher than the second similarity (70%) is obtained, and the annotation vocabulary set (order number, order code) is obtained.
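The category-based variant compares the annotation with each category once rather than with every word. A minimal sketch, assuming each category is represented by a label and that a similarity function is supplied by the caller (both assumptions; the scores are made-up data):

```python
from typing import Callable, Dict, List


def match_by_category(
    comment: str,
    categories: Dict[str, List[str]],
    similarity: Callable[[str, str], float],
    second_similarity: float = 0.7,
) -> List[str]:
    """Collect the words of every category whose label is more similar
    to the comment than the second similarity."""
    matched: List[str] = []
    for label, category_words in categories.items():
        if similarity(comment, label) > second_similarity:
            matched.extend(category_words)
    return matched


# Hypothetical categories: label -> annotation words with the same meaning.
categories = {
    "order number": ["order number", "order code"],
    "address": ["address", "place"],
}
# Made-up similarity of the comment "order number" to each category label.
label_scores = {"order number": 0.95, "address": 0.1}

category_matches = match_by_category(
    "order number", categories, lambda _comment, label: label_scores[label]
)
print(category_matches)  # ['order number', 'order code']
```

With N categories this needs N comparisons instead of one per recorded word, which is presumably why a lower threshold (70% in the example) is paired with the coarser match.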
In the technical solution provided in step S206, obtaining the target field name from the initial field name and the reference field name may include, but is not limited to, providing the reference field names to the user as a naming reference. For example, if the initial field name is "index_number" and the corresponding reference field names are ("order_number", "order_id"), the target field name may be chosen with reference to those reference field names.
In an exemplary embodiment, the target field name may be obtained from the initial field name and the reference field name by, but not limited to, one of the following:
determining the initial field name as the target field name in a case where the reference field names include the initial field name, and determining the field name with the highest current usage rate among the reference field names as the target field name in a case where the reference field names do not include the initial field name; or
displaying the initial field name and the reference field names as candidate field names on a display interface, and determining the candidate field name on which a selection operation is performed as the target field name.
Optionally, in this embodiment, the target field name may be obtained from the initial field name and the reference field name either, but not limited to, by an automatic decision of the system or by the user's own decision.
Optionally, in this embodiment, FIG. 7 is a first schematic diagram of a generation manner of a target field name according to an embodiment of the present application. As shown in FIG. 7, the automatic decision of the system may include, but is not limited to, two cases. In a case where the initial field name (index_number) is included in the reference field names (order_number, order_id, index_number), the initial field name (index_number) is directly determined as the target field name. In a case where the reference field names do not include the initial field name, the target field name may be determined according to, but not limited to, the current usage rates of the reference field names, and the field name with the highest current usage rate among the reference field names (order_number) is determined as the target field name.
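The two cases of the automatic decision can be sketched as follows. The function name and the usage-rate values are hypothetical, and the handling of an empty reference set (keep the initial name) is an assumption added so that the function is total.

```python
from typing import Dict


def decide_target_name(initial_name: str, reference_usage: Dict[str, float]) -> str:
    """Pick the target field name automatically.

    reference_usage maps each reference field name to its current usage rate.
    """
    # Assumption: with no reference names at all, keep the initial name.
    if not reference_usage or initial_name in reference_usage:
        return initial_name
    # Otherwise fall back to the most used reference name.
    return max(reference_usage, key=lambda name: reference_usage[name])


kept = decide_target_name(
    "index_number", {"order_number": 0.6, "order_id": 0.3, "index_number": 0.1}
)
replaced = decide_target_name("index_number", {"order_number": 0.6, "order_id": 0.3})
print(kept, replaced)  # index_number order_number
```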
Optionally, in this embodiment, FIG. 8 is a second schematic diagram of a generation manner of a target field name according to an embodiment of the present application. As shown in FIG. 8, the user's own decision may include, but is not limited to, displaying the initial field name (index_number) and the reference field names (order_number, order_id, index_number) as candidate field names on a display interface, and determining the candidate field name on which a selection operation is performed (order_number) as the target field name.
In an exemplary embodiment, after the target field name is obtained from the initial field name and the reference field name, the Chinese annotation and the initial field name having a correspondence may further be constructed in a case where the initial field name is determined to be the target field name and the annotation vocabularies and field names having a correspondence do not include the initial field name; and the Chinese annotation and the initial field name having a correspondence are stored in the annotation vocabularies and field names having a correspondence.
Optionally, in this embodiment, the user may also choose not to use any reference field name and insist on the initial field name "index_number". When "index_number" is not included in the thesaurus, the Chinese annotation (order number) and the initial field name (index_number) having a correspondence may be constructed and then stored in the annotation vocabularies and field names having a correspondence. Alternatively, when the reference field names form an empty set, that is, no reference field name is matched for the initial field name, the above storing operation may also be performed.
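The storing condition can be sketched as below, assuming the thesaurus is modeled as a mapping from an annotation to the set of field names recorded for it. That layout and the function name `store_if_new` are hypothetical.

```python
from typing import Dict, Set


def store_if_new(
    thesaurus: Dict[str, Set[str]],
    comment: str,
    initial_name: str,
    target_name: str,
) -> bool:
    """Store (comment, initial_name) only when the initial name was chosen as
    the target name and the thesaurus does not already contain it.

    Returns True when a new correspondence was stored.
    """
    already_known = any(initial_name in names for names in thesaurus.values())
    if target_name != initial_name or already_known:
        return False
    thesaurus.setdefault(comment, set()).add(initial_name)
    return True


thesaurus = {"order number": {"order_number", "order_id"}}
first = store_if_new(thesaurus, "order number", "index_number", "index_number")
second = store_if_new(thesaurus, "order number", "index_number", "index_number")
print(first, second)  # True False
```

The second call returns False because the name is already recorded, so repeated use of the same name does not create duplicate entries.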
In an exemplary embodiment, after the Chinese annotation and the initial field name having a correspondence are stored in the annotation vocabularies and field names having a correspondence, a target usage rate of the initial field name may further be detected in a case where the storage time of the initial field name reaches a target time; and the Chinese annotation and the initial field name having a correspondence are deleted from the annotation vocabularies and field names having a correspondence in a case where the target usage rate is less than or equal to a second threshold.
Optionally, in this embodiment, when the target usage rate is less than or equal to the second threshold, the usage rate of the initial field name may be considered too low: the initial field name is not a mainstream field name and may be deleted, so as to keep the field names stored in the thesaurus useful.
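A minimal sketch of this clean-up rule follows. The record layout, the 90-day target time, and the 0.05 second threshold are hypothetical values chosen for the example; the text does not specify them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    comment: str        # Chinese annotation
    field_name: str     # stored initial field name
    stored_days: int    # days since the entry was stored
    usage_rate: float   # usage rate measured when the entry is checked


def prune_entries(
    entries: List[Entry],
    target_days: int = 90,            # assumed target time
    second_threshold: float = 0.05,   # assumed second threshold
) -> List[Entry]:
    """Drop entries that have reached the target time and whose usage rate
    is less than or equal to the second threshold; keep everything else."""
    return [
        e for e in entries
        if not (e.stored_days >= target_days and e.usage_rate <= second_threshold)
    ]


entries = [
    Entry("order number", "index_number", stored_days=120, usage_rate=0.01),
    Entry("order number", "order_number", stored_days=120, usage_rate=0.60),
    Entry("city", "ship_city", stored_days=10, usage_rate=0.00),
]
remaining = [e.field_name for e in prune_entries(entries)]
print(remaining)  # ['order_number', 'ship_city']
```

Entries that have not yet reached the target time are kept regardless of usage, which gives a newly stored name a chance to be adopted before it is judged.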
To make the process of acquiring the field names of a data table easier to understand, the acquisition flow is described below with reference to an optional embodiment, which is not intended to limit the technical solution of the embodiments of the present application.
In this embodiment, a method for acquiring field names of a data table is provided. FIG. 9 is a flowchart of the method for acquiring field names of a data table according to an embodiment of the present application. As shown in FIG. 9, the method mainly includes the following steps:
Step S901: inputting a table building statement and checking its syntax;
Step S902: traversing the field names, passing each field name and its corresponding Chinese annotation to Elasticsearch, and segmenting the annotation with the IK segmentation tool;
Step S903: judging whether similar field names exist;
Step S904: in a case where similar field names exist, returning the 15 field names with the highest usage frequency; in a case where no similar field name exists, storing the field name and the Chinese annotation in the thesaurus;
Step S905: in a case where a returned similar field name is used, traversing the next field name; in a case where no returned similar field name is used, storing the field name and the Chinese annotation in the thesaurus and then traversing the next field name;
Step S906: after the traversal is complete, creating the data table.
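The traversal in steps S902 to S906 can be sketched as a single loop. This is an illustration under assumptions: the search and the user's choice are passed in as functions, an exact dictionary lookup stands in for the similarity search, and all names are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple


def process_fields(
    fields: List[Tuple[str, str]],  # (field name, Chinese annotation), syntax already checked (S901)
    search_similar: Callable[[str], List[str]],
    choose: Callable[[str, List[str]], Optional[str]],
    thesaurus: Dict[str, List[str]],
) -> List[str]:
    """Return the field names to use when the data table is created (S906)."""
    final_names: List[str] = []
    for name, comment in fields:                       # S902: traverse the fields
        candidates = search_similar(comment)           # S903/S904: look for similar names
        picked = choose(name, candidates) if candidates else None
        if picked is None:                             # nothing found, or nothing chosen
            thesaurus.setdefault(comment, []).append(name)  # store name and annotation
            picked = name
        final_names.append(picked)                     # S905: move on to the next field
    return final_names


store = {"订单编号": ["order_number", "order_id"]}
names = process_fields(
    [("index_number", "订单编号"), ("ship_city", "城市")],
    search_similar=lambda comment: store.get(comment, [])[:15],
    choose=lambda _name, candidates: candidates[0],  # always accept the first suggestion
    thesaurus=store,
)
print(names)  # ['order_number', 'ship_city']
```

In the example, the first field adopts a suggested name, while the second finds no similar name and is therefore stored as a new entry.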
With the above embodiment, Elasticsearch is added between the database management tool and the database to store field names and Chinese annotations. The user first enters a table building statement in the database management tool; a syntax check is then performed, and the fields are traversed once the check passes. Each field name and its corresponding Chinese annotation are passed to Elasticsearch. The Chinese annotation of the field is segmented with the IK segmentation tool, and the resulting terms are searched one by one for similar annotations. If nothing is retrieved, the Chinese annotation and field name of the field are stored directly in Elasticsearch. If similar entries are retrieved, the 5 with the highest usage rate are returned to the database management tool for the user to choose from. If the user chooses one, the usage count of the corresponding field in Elasticsearch is updated; if the user chooses none, the Chinese annotation and field name of the field are stored directly in Elasticsearch. A large database is thus formed gradually, and the user can refer to the existing names in it for every field when building a table, which avoids a bewildering variety of names for the same field. The method can greatly improve the working efficiency of developers, testers, and operations and maintenance personnel, gradually improve the naming consistency of the same field, and reduce the trouble caused by different developers, or different tables, naming the same field differently.
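As a rough sketch of what such a setup might look like, the request bodies below describe an index whose annotation field is analyzed with the IK plugin and a search that returns the most used matches. The index field names (`comment`, `field_name`, `use_count`) are hypothetical, and sorting purely by usage count is an assumption about how "highest usage rate" is ranked; `ik_max_word` and `ik_smart` are the analyzers provided by the IK analysis plugin. The bodies are plain dictionaries and no client call is made here.

```python
from typing import Any, Dict

# Hypothetical index mapping: the Chinese annotation is segmented by IK,
# the field name is stored verbatim, and the usage count is a number.
INDEX_MAPPING: Dict[str, Any] = {
    "mappings": {
        "properties": {
            "comment": {
                "type": "text",
                "analyzer": "ik_max_word",
                "search_analyzer": "ik_smart",
            },
            "field_name": {"type": "keyword"},
            "use_count": {"type": "integer"},
        }
    }
}


def build_search_body(comment: str, size: int = 5) -> Dict[str, Any]:
    """Build a search body that matches the segmented annotation and returns
    the `size` most used entries."""
    return {
        "query": {"match": {"comment": comment}},
        "sort": [{"use_count": {"order": "desc"}}],
        "size": size,
    }


search_body = build_search_body("订单编号")
print(search_body["query"])
```

These dictionaries would be passed to the index-creation and search endpoints of an Elasticsearch cluster that has the IK plugin installed.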
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method of the embodiments of the present application.
FIG. 10 is a block diagram of an apparatus for acquiring field names of a data table according to an embodiment of the present application. As shown in FIG. 10, the apparatus includes:
a first obtaining module 1002, configured to obtain an initial field name carried in a table building statement, and obtain a chinese comment corresponding to the initial field name, where the table building statement is used to create a data table in a database, where the data table carries a data field indicated by the initial field name, and the chinese comment is used to indicate a chinese meaning of the initial field name;
a second obtaining module 1004, configured to obtain a reference field name matching the chinese annotation from the annotation vocabulary and field names having a corresponding relationship;
a third obtaining module 1006, configured to obtain a target field name from the initial field name and the reference field name, where the target field name is used to construct the data table.
Through this embodiment, the initial field name is first obtained from the table building statement, together with the Chinese annotation corresponding to the initial field name. Although the same data field may be given different initial field names, the Chinese meaning of the data field stays the same, so the corresponding reference field name can be determined from the Chinese annotation that carries that meaning: the reference field name corresponding to the Chinese annotation is matched from the annotation vocabularies and field names having a correspondence, which strengthens the naming consistency of field names. The target field name is then determined from the initial field name and the reference field name and is subsequently used to build the data table. This technical solution solves the problem in the related art that the field names used to build data tables have low consistency, and achieves the technical effect of improving the consistency of those field names.
In an exemplary embodiment, the second obtaining module includes:
a first obtaining unit, configured to obtain an annotation vocabulary set corresponding to the Chinese annotation, wherein the annotation vocabulary set includes target annotation vocabularies that allow the Chinese meaning to be expressed;
a second obtaining unit, configured to obtain the field name corresponding to each target annotation vocabulary from the annotation vocabularies and field names having a correspondence, to obtain the reference field name.
In an exemplary embodiment, the second obtaining unit is configured to:
acquiring the current utilization rate of the field name corresponding to each target annotation word;
and determining the field name corresponding to the target annotation word with the current utilization rate larger than a first threshold value as the reference field name.
In an exemplary embodiment, the first obtaining unit is configured to:
obtaining, from the annotation vocabularies recorded in a thesaurus, the annotation vocabularies whose semantic similarity to the Chinese annotation is higher than a first similarity, to obtain the annotation vocabulary set, wherein the thesaurus is used for recording the annotation vocabularies and field names having a correspondence; or
obtaining, from a plurality of vocabulary categories recorded in a thesaurus, a target vocabulary category whose semantic similarity to the Chinese annotation is higher than a second similarity, wherein the thesaurus is used for recording the annotation vocabularies and field names having a correspondence, and the vocabulary categories are obtained by classifying the annotation vocabularies recorded in the thesaurus according to semantics; and obtaining the annotation vocabularies included in the target vocabulary category to obtain the annotation vocabulary set.
In an exemplary embodiment, the third obtaining module includes one of:
a determination unit configured to determine the initial field name as the target field name in a case where the initial field name is included in the reference field name; determining a field name with the highest current utilization rate in the reference field names as the target field name under the condition that the reference field names do not comprise the initial field names;
the display unit is used for displaying the initial field name and the reference field name as candidate field names on a display interface; and determining the candidate field name of which the selection operation is performed in the candidate field names as the target field name.
In one exemplary embodiment, the apparatus further comprises:
a building module, configured to, after obtaining a target field name from the initial field name and the reference field name, build the chinese annotation and the initial field name having a correspondence relationship when the initial field name is determined to be the target field name and the initial field name is not included in the annotation vocabulary and the field names having a correspondence relationship;
and the storage module is used for storing the Chinese annotation and the initial field name which have the corresponding relation in the annotation vocabulary and the field name which have the corresponding relation.
In one exemplary embodiment, the apparatus further comprises:
a detection module, configured to detect a target usage rate of the initial field name when a storage time of the initial field name reaches a target time after the chinese annotation and the initial field name having a correspondence are stored in the annotated vocabulary and the field name having a correspondence;
a deleting module, configured to delete the chinese annotation and the initial field name having a correspondence relationship from the annotation vocabulary and the field name having a correspondence relationship when the target usage rate is less than or equal to a second threshold.
Embodiments of the present application also provide a storage medium including a stored program, where the program performs any one of the methods described above when executed.
Alternatively, in the present embodiment, the storage medium may be configured to store program codes for performing the following steps:
S1, acquiring an initial field name carried in a table building statement, and acquiring a Chinese annotation corresponding to the initial field name, wherein the table building statement is used for creating a data table carrying a data field indicated by the initial field name in a database, and the Chinese annotation is used for indicating the Chinese meaning of the initial field name;
S2, obtaining the reference field name matched with the Chinese annotation from the annotation vocabularies and field names having a correspondence;
S3, obtaining a target field name from the initial field name and the reference field name, wherein the target field name is used for constructing the data table.
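Step S1 can be illustrated with a small parser that pulls each field name and its annotation out of a table building statement. The sketch assumes MySQL-style column definitions written one per line with an inline `COMMENT '...'` clause; the regular expression is deliberately simple and does not handle types that contain commas (such as `DECIMAL(10,2)`) or quotes inside comments. The table and column names are hypothetical.

```python
import re
from typing import List, Tuple

# One column definition per line: name, anything without a comma, COMMENT '...'.
COLUMN_RE = re.compile(
    r"^\s*`?(\w+)`?\s+[^,\n]*?COMMENT\s+'([^']*)'",
    re.IGNORECASE | re.MULTILINE,
)


def extract_fields(ddl: str) -> List[Tuple[str, str]]:
    """Return (initial field name, Chinese annotation) pairs found in the DDL."""
    return COLUMN_RE.findall(ddl)


ddl = """CREATE TABLE orders (
  index_number VARCHAR(32) NOT NULL COMMENT '订单编号',
  ship_city VARCHAR(64) COMMENT '城市'
);"""

pairs = extract_fields(ddl)
print(pairs)  # [('index_number', '订单编号'), ('ship_city', '城市')]
```

A production tool would more likely rely on the syntax checker's parse tree than on a regular expression; the regex is used here only to keep the example self-contained.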
Embodiments of the present application further provide an electronic device comprising a memory having a computer program stored therein and a processor configured to execute the computer program to perform the steps of any of the above method embodiments.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
S1, acquiring an initial field name carried in a table building statement, and acquiring a Chinese annotation corresponding to the initial field name, wherein the table building statement is used for creating a data table carrying a data field indicated by the initial field name in a database, and the Chinese annotation is used for indicating the Chinese meaning of the initial field name;
S2, obtaining the reference field name matched with the Chinese annotation from the annotation vocabularies and field names having a correspondence;
S3, obtaining a target field name from the initial field name and the reference field name, wherein the target field name is used for constructing the data table.
Optionally, in this embodiment, the storage medium may include, but is not limited to, various media capable of storing program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments and optional implementation manners, and this embodiment is not described herein again.
It will be apparent to those skilled in the art that the modules or steps of the present application described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and alternatively, they may be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present application is not limited to any specific combination of hardware and software.
The foregoing is only a preferred embodiment of the present application and it should be noted that those skilled in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications should also be considered as the protection scope of the present application.

Claims (10)

1. A method for acquiring field names of a data table is characterized by comprising the following steps:
the method comprises the steps of obtaining an initial field name carried in a table building statement and obtaining a Chinese annotation corresponding to the initial field name, wherein the table building statement is used for creating a data table carrying a data field indicated by the initial field name in a database, and the Chinese annotation is used for indicating the Chinese meaning of the initial field name;
acquiring a reference field name matched with the Chinese annotation from the annotation vocabulary and the field name with the corresponding relation;
and acquiring a target field name from the initial field name and the reference field name, wherein the target field name is used for constructing the data table.
2. The method for acquiring field names of data table according to claim 1, wherein the acquiring reference field names matching the chinese annotations from the annotation vocabulary and field names having corresponding relations comprises:
acquiring an annotation vocabulary set corresponding to the Chinese annotation, wherein the annotation vocabulary set comprises target annotation vocabularies which allow the Chinese meaning to be expressed;
and acquiring a field name corresponding to each target annotation word from the annotation words and the field names with corresponding relations, and acquiring the reference field name.
3. The method for acquiring field names of a data table according to claim 2, wherein the acquiring a field name corresponding to each of the target annotation vocabularies from the annotation vocabularies and the field names having a corresponding relationship to obtain the reference field name comprises:
acquiring the current utilization rate of the field name corresponding to each target annotation word;
and determining the field name corresponding to the target annotation word with the current utilization rate larger than a first threshold value as the reference field name.
4. The method of claim 2, wherein the obtaining of the annotated vocabulary set corresponding to the chinese annotation comprises one of:
obtaining, from the annotation vocabularies recorded in a thesaurus, the annotation vocabularies whose semantic similarity to the Chinese annotation is higher than a first similarity, to obtain the annotation vocabulary set, wherein the thesaurus is used for recording the annotation vocabularies and field names having a correspondence;
obtaining, from a plurality of vocabulary categories recorded in a thesaurus, a target vocabulary category whose semantic similarity to the Chinese annotation is higher than a second similarity, wherein the thesaurus is used for recording the annotation vocabularies and field names having a correspondence, and the vocabulary categories are obtained by classifying the annotation vocabularies recorded in the thesaurus according to semantics; and obtaining the annotation vocabularies included in the target vocabulary category to obtain the annotation vocabulary set.
5. The method of claim 1, wherein the obtaining of the target field name from the initial field name and the reference field name comprises one of:
determining the initial field name as the target field name in a case where the initial field name is included in the reference field name; determining a field name with the highest current utilization rate in the reference field names as the target field name under the condition that the reference field names do not comprise the initial field names;
displaying the initial field name and the reference field name as candidate field names on a display interface; and determining the candidate field name of which the selection operation is performed in the candidate field names as the target field name.
6. The method of claim 1, wherein after obtaining the target field name from the initial field name and the reference field name, the method further comprises:
constructing the Chinese annotation and the initial field name having a correspondence relationship in a case where the initial field name is determined to be the target field name and the annotation vocabulary and the field name having a correspondence relationship do not include the initial field name;
storing the Chinese annotation and the initial field name having a correspondence relationship in an annotation vocabulary and a field name having a correspondence relationship.
7. The method of claim 6, wherein after storing the chinese annotations and the initial field names having a correspondence in the annotated vocabulary and the field names having a correspondence, the method further comprises:
under the condition that the storage time of the initial field name reaches the target time, detecting the target utilization rate of the initial field name;
deleting the Chinese annotation and the initial field name having a correspondence from the annotation vocabulary and the field name having a correspondence in a case where the target usage is less than or equal to a second threshold.
8. An apparatus for obtaining field names of a data table, comprising:
a first obtaining module, configured to obtain an initial field name carried in a table building statement and obtain a Chinese annotation corresponding to the initial field name, wherein the table building statement is used for creating a data table carrying a data field indicated by the initial field name in a database, and the Chinese annotation is used for indicating the Chinese meaning of the initial field name;
a second obtaining module, configured to obtain a reference field name matched with the Chinese annotation from the annotation vocabularies and field names having a correspondence;
a third obtaining module, configured to obtain a target field name from the initial field name and the reference field name, where the target field name is used to construct the data table.
9. A computer-readable storage medium, comprising a stored program, wherein the program when executed performs the method of any of claims 1 to 7.
10. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method of any of claims 1 to 7 by means of the computer program.
CN202210434715.2A 2022-04-24 2022-04-24 Method and device for acquiring field names of data table, storage medium and electronic device Pending CN114818709A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210434715.2A CN114818709A (en) 2022-04-24 2022-04-24 Method and device for acquiring field names of data table, storage medium and electronic device

Publications (1)

Publication Number Publication Date
CN114818709A true CN114818709A (en) 2022-07-29

Family

ID=82507100

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210434715.2A Pending CN114818709A (en) 2022-04-24 2022-04-24 Method and device for acquiring field names of data table, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN114818709A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination