CN112364603A - Index code generation method, device, equipment and storage medium - Google Patents

Index code generation method, device, equipment and storage medium

Info

Publication number
CN112364603A
Authority
CN
China
Prior art keywords
index
dimension
code
dimensional
indexes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011298686.9A
Other languages
Chinese (zh)
Other versions
CN112364603B (en)
Inventor
周子豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Property and Casualty Insurance Company of China Ltd filed Critical Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN202011298686.9A priority Critical patent/CN112364603B/en
Publication of CN112364603A publication Critical patent/CN112364603A/en
Application granted granted Critical
Publication of CN112364603B publication Critical patent/CN112364603B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/126 Character encoding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Development Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Educational Administration (AREA)
  • Health & Medical Sciences (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Marketing (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of big data and discloses an index code generation method, device, equipment and storage medium. The method comprises the following steps: acquiring an index name of a service requirement; performing semantic word segmentation on the index name with a preset natural language processing model to obtain a plurality of index words; acquiring a multi-dimensional index table and an index dimension code mapping table, comparing the multi-dimensional index table with each index word, and determining the dimension index and the measurement index corresponding to each index word; looking up the atomic index code corresponding to each dimension index and each measurement index in the index dimension code mapping table, and sorting the atomic index codes according to a preset rule; and splicing the atomic index codes in their sorted order to obtain the index code of the index name. The invention also relates to blockchain technology: the index code is stored in a blockchain. The invention improves the degree of standardization of the service index management architecture.

Description

Index code generation method, device, equipment and storage medium
Technical Field
The present invention relates to the field of big data, and in particular, to a method, an apparatus, a device, and a storage medium for generating an index code.
Background
An index is a quantified measurement obtained by subdividing a service unit; it makes a service target describable, measurable and decomposable. Defining and managing indexes in a standard way promotes the construction of the underlying fact tables and dimension tables, and guarantees a unique data source and a uniform calculation caliber for data statistics. In addition, by combining indexes with modifiers, business personnel can conveniently perform self-service analysis and data use, which improves data acquisition efficiency, so that valuable conclusions are produced, decisions are supported, and the value of the data is fully realized.
At present, service indexes are managed through free-text descriptions of each business process. Because different departments of the same company work in different fields, the management dimensions, index vocabularies, statistical sources, statistical methods, statistical time nodes and the like that they use differ, which leads to the following problems: for the same service scenario, each department measures with different indexes; for the same index, statistics are taken at different time points, and the statistical sources and statistical methods differ; and the calculation calibers differ. In summary, the existing service index management architecture suffers from a low degree of standardization.
Disclosure of Invention
The invention mainly aims to solve the technical problem of low standardization degree of the existing service index management architecture.
The first aspect of the present invention provides an index code generation method, including:
acquiring an index name of a service requirement;
performing semantic word segmentation processing on the index names by adopting a preset natural language processing model to obtain a plurality of index words;
acquiring a multi-dimensional index table and an index dimension code mapping table, and screening out the dimension indexes and measurement indexes corresponding to each index vocabulary from the multi-dimensional index table;
looking up the atomic index code corresponding to each dimension index and each measurement index in the index dimension code mapping table, and sorting the atomic index codes according to a preset rule;
and splicing the atomic index codes in their sorted order to obtain the index code of the index name.
Optionally, in a first implementation manner of the first aspect of the present invention, before the obtaining the index name of the service requirement, the method further includes:
acquiring historical service information, and performing abstract processing on the historical service information to obtain a plurality of feature words;
clustering each feature vocabulary by adopting a preset clustering model to obtain a plurality of dimension indexes with different dimensions and one or more measurement indexes;
fixing the one or more measurement indexes, and combining the dimension indexes based on the dimension of each dimension index to obtain a corresponding multi-dimension index table;
and coding the dimension index and the measurement index to obtain a corresponding atomic index code, and creating an index dimension code mapping table based on the atomic index code.
Optionally, in a second implementation manner of the first aspect of the present invention, the screening out the dimension index and the measurement index corresponding to each index vocabulary from the multi-dimension index table includes:
extracting identification information of an index type in the multi-dimensional index table, and determining a first type index and a second type index in the multi-dimensional index table based on the identification information;
matching the index vocabulary with the first type index, and determining one or more dimension indexes contained in the index vocabulary;
and matching the screened index vocabulary with the second type index, and determining one or more measurement indexes contained in the index vocabulary.
Optionally, in a third implementation manner of the first aspect of the present invention, before the screening out the dimension index and the measurement index corresponding to each index vocabulary from the multi-dimension index table, the method further includes:
judging whether the multi-dimensional index table contains dimension indexes and measurement indexes corresponding to all the index vocabularies;
and if not, generating reminding information based on the index vocabulary not contained so as to prompt a user to perfect the multi-dimensional index table.
Optionally, in a fourth implementation manner of the first aspect of the present invention, after the obtaining the index name of the service requirement, the method further includes:
determining the measuring caliber of the index name, and judging whether the measuring caliber of the index name meets the preset measuring caliber specification;
and if so, performing vocabulary entity standardization processing on the index name by adopting a preset character recognition library, so that the index name conforms to the word segmentation processing rule of the preset natural language processing model.
Optionally, in a fifth implementation manner of the first aspect of the present invention, after the splicing the atomic index codes based on the sorting order of the atomic index codes to obtain the index code of the required index name, the method further includes:
after index codes of all index names in the service demands are obtained, acquiring a plurality of measurement data corresponding to the service demands based on each index code;
and generating a business report according to the measurement data, and displaying the business report.
Optionally, in a sixth implementation manner of the first aspect of the present invention, the index code is further stored in a block chain.
A second aspect of the present invention provides an index code generation apparatus, including:
the acquisition module is used for acquiring the index name of the service requirement;
the word segmentation module is used for performing semantic word segmentation processing on the index names by adopting a preset natural language processing model to obtain a plurality of index words;
the screening module is used for acquiring a multi-dimensional index table and an index dimensional code mapping table and screening out the dimensional indexes and the measurement indexes corresponding to the index vocabularies from the multi-dimensional index table;
the searching module is used for looking up, in the index dimension code mapping table, the atomic index codes corresponding to the dimension indexes and the measurement indexes, and sorting the atomic index codes according to a preset rule;
and the splicing module is used for splicing the atomic index codes in their sorted order to obtain the index code of the index name.
Optionally, in a first implementation manner of the second aspect of the present invention, the index code generating apparatus further includes:
the abstract processing module is used for acquiring historical service information and carrying out abstract processing on the historical service information to obtain a plurality of characteristic vocabularies;
the clustering module is used for clustering each characteristic vocabulary by adopting a preset clustering model to obtain a plurality of dimension indexes with different dimensions and one or more measurement indexes;
the combination module is used for fixing the one or more measurement indexes and combining the dimension indexes based on the dimension of each dimension index to obtain a corresponding multi-dimension index table;
and the coding module is used for coding the dimension index and the measurement index to obtain a corresponding atomic index code, and creating an index dimension code mapping table based on the atomic index code.
Optionally, in a second implementation manner of the second aspect of the present invention, the screening module includes:
the extraction unit is used for extracting identification information of the index types in the multi-dimensional index table and determining a first type index and a second type index in the multi-dimensional index table based on the identification information;
the first matching unit is used for matching the index vocabulary with the first type index and determining one or more dimension indexes contained in the index vocabulary;
and the second matching unit is used for matching the screened index vocabulary with the second type index and determining one or more measurement indexes contained in the index vocabulary.
Optionally, in a third implementation manner of the second aspect of the present invention, the index code generating apparatus further includes:
the first judging module is used for judging whether the multi-dimensional index table contains dimension indexes and measurement indexes corresponding to all the index vocabularies;
and the generating module is used for generating reminding information based on the index vocabularies which are not contained if the multi-dimensional index table does not contain the dimension indexes and the measurement indexes which correspond to all the index vocabularies so as to prompt a user to perfect the multi-dimensional index table.
Optionally, in a fourth implementation manner of the second aspect of the present invention, the index code generating apparatus further includes:
the second judging module is used for determining the measuring caliber of the index name and judging whether the measuring caliber of the index name meets the preset measuring caliber specification;
and the standardization processing module is used for carrying out vocabulary entity standardization processing on the index name by adopting a preset character recognition library if the measuring caliber of the index name accords with a preset measuring caliber specification so as to accord with a word segmentation processing rule of a preset natural language processing model.
Optionally, in a fifth implementation manner of the second aspect of the present invention, the index code generating device further includes a report display module, configured to obtain, based on each index code, a plurality of measurement data corresponding to the service demand after the index codes of all index names in the service demand are obtained; and generating a business report according to the measurement data, and displaying the business report.
Optionally, in a sixth implementation manner of the second aspect of the present invention, the index code is further stored in a block chain.
A third aspect of the present invention provides an index code generation device, including: a memory and at least one processor, the memory having instructions stored therein; the at least one processor invokes the instructions in the memory to cause the index code generation device to perform the index code generation method described above.
A fourth aspect of the present invention provides a computer-readable storage medium having stored therein instructions, which when run on a computer, cause the computer to execute the index code generation method described above.
The technical scheme provided by the invention provides an index management method based on natural language processing. In the method, a natural language processing model performs semantic word segmentation on the obtained index name to extract a plurality of index words; the dimension indexes and measurement indexes corresponding to each index word are screened out through a pre-constructed multi-dimensional index table, and the atomic index code of each dimension index and measurement index is determined through an index dimension code mapping table; the atomic index codes are then sorted according to a preset rule and spliced to obtain an index code with a unique order, which avoids the ambiguity caused by different orderings of the atomic index codes. Expressing each index name by its index code increases the degree of standardization of the service index management architecture.
Drawings
FIG. 1 is a diagram of a first embodiment of an index code generation method according to an embodiment of the present invention;
FIG. 2 is a diagram of a second embodiment of an index code generation method according to an embodiment of the present invention;
FIG. 3 is a diagram of a third embodiment of an index code generation method according to an embodiment of the present invention;
FIG. 4 is a diagram of a fourth embodiment of an index code generation method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an embodiment of an index code generation apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of another embodiment of an index code generation apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of an embodiment of an index code generation device in an embodiment of the present invention.
Detailed Description
The embodiment of the invention provides an index code generation method, device, equipment and storage medium, which are used for: acquiring an index name of a service requirement; performing semantic word segmentation on the index name with a preset natural language processing model to obtain a plurality of index words; acquiring a multi-dimensional index table and an index dimension code mapping table, comparing the multi-dimensional index table with each index word, and determining the dimension index and the measurement index corresponding to each index word; looking up the atomic index code corresponding to each dimension index and each measurement index in the index dimension code mapping table, and sorting the atomic index codes according to a preset rule; and splicing the atomic index codes in their sorted order to obtain the index code of the index name. The invention improves the degree of standardization of the service index management architecture.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," or "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
For convenience of understanding, a specific flow of the embodiment of the present invention is described below, and referring to fig. 1, a first embodiment of the index code generation method in the embodiment of the present invention includes:
101. Acquiring an index name of a service requirement;
It is to be understood that the execution subject of the present invention may be an index code generation apparatus, or a terminal or a server, which is not limited herein. The embodiment of the present invention is described with a server as the execution subject. It is emphasized that, to further ensure the privacy and security of the index code, the index code may also be stored in a node of a blockchain.
In this embodiment, the index name is used to describe a business process, for example, "the sales amount of the product a in the time period B and the channel C" is an index name. The index name is manually input by a user through a front end according to business requirements and is received by the server.
102. Performing semantic word segmentation processing on the index names by adopting a preset natural language processing model to obtain a plurality of index words;
In this embodiment, the natural language processing model performs semantic word segmentation on the index name as follows: first, a plurality of common word segments of the index name are extracted under a limit on segment length; then, text cleaning is performed on the word segments to screen out segments without entity meaning and rare words; finally, a feature template in the preset natural language processing model classifies the cleaned word segments into index words and non-index words. The feature template comprises the characteristic features of index words, for example "xxxx year", "xx month", "xx day" and "xth quarter" for time index words.
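The cleaning and feature-template classification described above can be sketched as follows. This is a minimal illustration, assuming the index name has already been split into word segments; the stop-word set, the regular-expression templates (which cover time index words only) and all function names are hypothetical and do not come from the source.

```python
import re

# Hypothetical stop words with no entity meaning (assumption, not from the source).
STOP_WORDS = {"the", "of", "in", "and"}

# Hypothetical feature templates for time index words, modelled on the
# "xxxx year", "xx month", "xx day" and "xth quarter" features named in the text.
TIME_TEMPLATES = [
    re.compile(r"^\d{4} year$"),
    re.compile(r"^\d{1,2} month$"),
    re.compile(r"^\d{1,2} day$"),
    re.compile(r"^[1-4](st|nd|rd|th) quarter$"),
]


def clean_segments(segments, max_len=20):
    """Drop empty segments, stop words, and segments longer than max_len."""
    return [
        s for s in segments
        if s and s.lower() not in STOP_WORDS and len(s) <= max_len
    ]


def is_time_index_word(segment):
    """Return True if the segment matches one of the time feature templates."""
    return any(p.match(segment) for p in TIME_TEMPLATES)


def classify_segments(segments):
    """Split cleaned segments into (index_words, non_index_words)."""
    index_words, non_index_words = [], []
    for s in clean_segments(segments):
        (index_words if is_time_index_word(s) else non_index_words).append(s)
    return index_words, non_index_words
```

A fuller template set would also recognise product, channel and measurement words; here anything that is not a time expression falls into the second list.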
103. Acquiring a multi-dimensional index table and an index dimension code mapping table, and screening out the dimension indexes and measurement indexes corresponding to each index vocabulary from the multi-dimensional index table;
In this embodiment, the multi-dimensional index table is composed of a plurality of dimension indexes and measurement indexes, and includes all basic data that can be analyzed in an enterprise. A dimension index represents an angle of data analysis: for example, the sales channel dimension may comprise online sales, offline sales, agent sales and the like, and the traffic dimension may comprise air, sea, road and rail. A measurement index represents the data presented by the analysis, such as sales amount, sales quantity, net profit and the like.
The multi-dimensional index table comprises a plurality of dimension categories, such as product type, sales channel, analysis time range, sales amount and the like. One dimension category can comprise a plurality of dimension members; for example, the time dimension can comprise a first quarter, a second quarter, a third quarter and a fourth quarter. Dimension members can be defined by different dimension hierarchies, such as "year, month, day" and "year, quarter, month". A dimension hierarchy may in turn include multiple dimension levels; for example, "year, month, day" includes the three levels year, month and day, and different levels stand in a parent-child coverage relationship.
The measurement index is a special dimension in the multi-dimension index table, and all other dimension indexes need to aggregate and analyze data by taking the measurement index as a direction. For example, the measurement index is the sales amount, and "the sales amount of the product a in the time period B and the channel C" may be analyzed, or "the sales amount of the product B in the time period a, the channel E and the region C" may be analyzed, where the sales amount is fixed.
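As a rough sketch of the screening step, the table can be modelled as a lookup keyed by index word, where each entry carries a type identifier separating dimension indexes from measurement indexes. The table contents, the field names `type` and `category`, and the function name are illustrative assumptions; words missing from the table are collected so that the user could be reminded to complete the table.

```python
# Hypothetical multi-dimensional index table. Each entry carries a type
# identifier ("dimension" or "measure"); the entries are illustrative only.
MULTI_DIM_TABLE = {
    "insurance product": {"type": "dimension", "category": "product"},
    "current month": {"type": "dimension", "category": "time"},
    "online sales": {"type": "dimension", "category": "channel"},
    "sales amount": {"type": "measure", "category": "measure"},
}


def screen_indexes(index_words, table=MULTI_DIM_TABLE):
    """Return (dimension_indexes, measurement_indexes, missing_words)."""
    dimensions, measures, missing = [], [], []
    for word in index_words:
        entry = table.get(word)
        if entry is None:
            # Not in the table: a candidate for a reminder to the user.
            missing.append(word)
        elif entry["type"] == "dimension":
            dimensions.append(word)
        else:
            measures.append(word)
    return dimensions, measures, missing
```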
104. Looking up the atomic index code corresponding to each dimension index and each measurement index in the index dimension code mapping table, and sorting the atomic index codes according to a preset rule;
In this embodiment, the dimension indexes and the measurement indexes in the multi-dimensional index table are both expressed as atomic index codes and are stored uniformly in the index dimension code mapping table; subsequently, the atomic index code corresponding to each dimension index and each measurement index of the target index name is looked up directly in the index dimension code mapping table.
The fields with different specifications can represent dimension indexes and measurement indexes with different dimensions, for example, the field with the specification A represents an atomic index code of a time dimension, the field with the specification B represents an atomic index code of a product dimension, and the field with the specification C represents an atomic index code of a channel dimension; the A01 field identifies the current day, the A02 field identifies the current month, the A03 field identifies the current quarter, the B01 field identifies insurance products, the B02 field identifies financial products, the C01 field identifies online sales channels, the C02 field identifies offline sales channels, and the C03 field identifies agent sales channels.
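The mapping table in this example can be pictured as a simple dictionary. The codes A01 to C03 below follow the example in the text; the exact key strings, the measurement code `M01` and the function name are assumptions added for illustration.

```python
# Index dimension code mapping table. Codes A01..C03 follow the example in the
# text; the key wording and the measurement code "M01" are hypothetical.
CODE_MAP = {
    "current day": "A01",
    "current month": "A02",
    "current quarter": "A03",
    "insurance product": "B01",
    "financial product": "B02",
    "online sales": "C01",
    "offline sales": "C02",
    "agent sales": "C03",
    "sales amount": "M01",  # assumed code for a measurement index
}


def lookup_atomic_codes(indexes, code_map=CODE_MAP):
    """Map each index to its atomic index code, in input order.

    Raises KeyError listing every index that has no code in the table.
    """
    unknown = [i for i in indexes if i not in code_map]
    if unknown:
        raise KeyError(f"no atomic index code for: {unknown}")
    return [code_map[i] for i in indexes]
```

The lookup preserves input order on purpose; ordering the codes is a separate step governed by the preset rule.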
105. Splicing the atomic index codes in their sorted order to obtain the index code of the index name.
In this embodiment, the sorting order of the atomic index codes corresponding to each dimension category is defined first, so as to prevent ambiguity caused by splicing the atomic index codes in different orders. For example, A01B02C03 and B02C03A01 are built from the same atomic index codes and denote the same index name, but without a fixed order they would be two different index codes and would be treated as different indexes.
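One way to make the spliced code unique is to sort by the dimension prefix before concatenating. The sketch below assumes a preset rule that places time (A) first, then product (B), then channel (C); this particular order, the extra prefix `M` for measurement codes and the function name are hypothetical.

```python
# Hypothetical preset rule: order atomic codes by their dimension prefix.
# "M" is an assumed prefix for measurement index codes.
PREFIX_ORDER = {"A": 0, "B": 1, "C": 2, "M": 3}


def splice_index_code(atomic_codes, prefix_order=PREFIX_ORDER):
    """Sort atomic codes by the preset prefix order, then concatenate them."""
    ordered = sorted(atomic_codes, key=lambda code: (prefix_order[code[0]], code))
    return "".join(ordered)
```

With this rule, `["B02", "C03", "A01"]` and `["A01", "B02", "C03"]` both yield `A01B02C03`, so the same index name always maps to the same index code.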
In this embodiment, after the index codes of all the index names in the service demand are obtained, a plurality of measurement data corresponding to the service demand are obtained based on each index code; and generating a business report according to the measurement data, and displaying the business report. The measurement data is a data value corresponding to the measurement index, and the selection of the measurement data depends on the dimension index.
For example, the premium of all vehicle insurance for the current day can be obtained through index code A, the premium for the current month through index code B, the premium for the same day of the previous year through index code C, and the premium for the same month of the previous year through index code D; a premium service report for the current day and the current month and a year-on-year service report are then generated.
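A report of this kind could be assembled roughly as follows. The store is a plain dictionary standing in for whatever system returns measurement data for an index code; the code strings and the values in it are invented placeholders, and the function names are hypothetical.

```python
# Hypothetical measurement store keyed by index code. The keys and values are
# made-up placeholders, not data from the source.
MEASUREMENTS = {
    "CODE_DAY": 120.0,
    "CODE_DAY_LAST_YEAR": 100.0,
}


def year_over_year(current, previous):
    """Return the relative change, or None when the previous value is zero."""
    if previous == 0:
        return None
    return (current - previous) / previous


def build_report(store, pairs):
    """Build report rows from (label, current_code, previous_code) triples."""
    rows = []
    for label, current_code, previous_code in pairs:
        current, previous = store[current_code], store[previous_code]
        rows.append({
            "label": label,
            "current": current,
            "previous": previous,
            "yoy": year_over_year(current, previous),
        })
    return rows
```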
The embodiment of the invention provides an index management method based on natural language processing. In the method, a natural language processing model performs semantic word segmentation on the obtained index name to extract a plurality of index words; the dimension indexes and measurement indexes corresponding to each index word are screened out through a pre-constructed multi-dimensional index table, and the atomic index code of each dimension index and measurement index is determined through an index dimension code mapping table; the atomic index codes are then sorted according to a preset rule and spliced to obtain an index code with a unique order, which avoids the ambiguity caused by different orderings of the atomic index codes. Expressing each index name by its index code increases the degree of standardization of the service index management architecture.
Referring to fig. 2, a second embodiment of the index code generation method according to the embodiment of the present invention includes:
201. Acquiring historical service information, and performing abstract processing on the historical service information to obtain a plurality of feature words;
202. Clustering each feature vocabulary by adopting a preset clustering model to obtain a plurality of dimension indexes with different dimensions and one or more measurement indexes;
In this embodiment, the abstract processing of the historical service information specifically includes the following steps:
performing word segmentation processing on historical service information to obtain a plurality of words, and cleaning the words to obtain a plurality of characteristic words, wherein the word segmentation cleaning comprises (1) constructing a stop word list and removing the words in the stop word list, wherein the list records words with common but no practical meaning, such as mood assistant words: o, ho, Yita, etc.; (2) removing rare words, removing overlength or overlength word segments, known brand names, noise characters and the like;
then, converting each participle into a Word vector by using a conventional model, such as one-hot (one-bit effective coding), BOW (Bag of Words), continuous Bag of Words (CBOW), Skip-Gram model, Word2vec model and the like, wherein, for example, for BOW Bag of Words model, when converting into Word vector values, we need to convert the Word vector values into a three-dimensional TF-IDF (term frequency-inverse text frequency index) matrix, and the three-dimensional TF-IDF can be regarded as a primary weight for extracting text features;
adopting PCA (Principal Component Analysis) to perform data dimension reduction processing on the three-dimensional TF-IDF matrix until the transformed data has the maximum variance to obtain the three-dimensional TF-IDF matrix after dimension reduction, wherein each participle in the TF-IDF matrix can be used as a dimension index or a measurement index after clustering processing of each characteristic vocabulary, and the dimension indexes or the measurement indexes of different dimensions can be classified according to different clustering rules and corresponding storage partitions are defined.
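The cleaning and weighting steps above can be illustrated with a minimal sketch. It assumes the text has already been segmented, uses a hypothetical stop word list and hypothetical length limits, and applies the textbook weight tf * log(N / df); the embodiment does not specify the exact formula, and the PCA and clustering stages are omitted here.

```python
import math
from collections import Counter

# Hypothetical stop word list and length limits (not taken from the embodiment).
STOP_WORDS = {"the", "of", "a", "oh"}
MIN_LEN, MAX_LEN = 2, 12


def clean_tokens(tokens, corpus_counts, min_count=1):
    """Drop stop words, overly short or long tokens, and rare tokens."""
    kept = []
    for tok in tokens:
        if tok in STOP_WORDS:
            continue
        if not (MIN_LEN <= len(tok) <= MAX_LEN):
            continue
        if corpus_counts[tok] < min_count:
            continue
        kept.append(tok)
    return kept


def tf_idf(docs):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


raw_docs = [
    ["sales", "amount", "of", "the", "product", "oh"],
    ["sales", "channel", "of", "a", "product"],
    ["quarter", "sales", "x"],
]
corpus_counts = Counter(tok for doc in raw_docs for tok in doc)
docs = [clean_tokens(doc, corpus_counts) for doc in raw_docs]
weights = tf_idf(docs)
```

A term that occurs in every document, such as "sales" here, receives weight zero, which is why TF-IDF serves as a first filter for distinctive feature vocabularies.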
203. Fixing the one or more measurement indexes, and combining the dimension indexes based on the dimension of each dimension index to obtain a corresponding multi-dimension index table;
in this embodiment, the multi-dimensional index table may be represented as a three-dimensional coordinate system in which each dimension hierarchy corresponds to one coordinate axis, and a cross section along the X-axis, Y-axis, or Z-axis may be taken as one dimension category; according to the number of dimension categories and the dimension hierarchy level, the cross section is divided into equal partitions, and each partition is taken as a dimension member (dimension index) of the corresponding dimension category; the one or more predefined measurement indexes are fixed at the intersection regions of the members of different dimensions.
As an example of equal partitioning, if the dimension hierarchy level of the time dimension is defined as "year and quarter", each partition is one quarter, and each year comprises four partitions. As an example of combining dimension categories, if the time dimension, the product dimension, and the sales channel dimension intersect and the fixed measurement index is the sales amount, the intersection region may be expressed as "the sales amount of the sales channel of product B in the first quarter A".
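As a rough illustration of combining dimension members around a fixed measurement index, the sketch below builds every intersection of three hypothetical dimension categories. The category names, member names, and row layout are assumptions made for the example, not the table format of the embodiment.

```python
from itertools import product

# Hypothetical dimension categories and their members (dimension indexes).
dimensions = {
    "time": ["Q1", "Q2", "Q3", "Q4"],
    "product": ["product A", "product B"],
    "channel": ["online", "agent"],
}
fixed_metric = "sales amount"


def build_multi_dimensional_table(dimensions, metric):
    """Combine one member from every dimension category with the fixed metric."""
    names = list(dimensions)
    rows = []
    for combo in product(*(dimensions[name] for name in names)):
        row = dict(zip(names, combo))
        row["metric"] = metric
        rows.append(row)
    return rows


table = build_multi_dimensional_table(dimensions, fixed_metric)
```

Each row corresponds to one intersection region, for example the sales amount of product A through the online channel in Q1.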
204. Coding the dimension index and the measurement index to obtain a corresponding atomic index code, and creating an index dimension code mapping table based on the atomic index code;
in this embodiment, the dimension indexes and measurement indexes of the different dimensions are encoded according to a predefined encoding rule and written into the index dimension code mapping table in the order of the dimension categories.
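The embodiment does not disclose the predefined encoding rule. The sketch below assumes one possible rule purely for illustration: dimension members receive the prefix "D", the number of their dimension category, and a two-digit serial number, while measurement indexes receive the prefix "M" and a two-digit serial number.

```python
def build_code_mapping(dimensions, metrics):
    """Assign an atomic index code to every dimension member and metric.

    Assumed rule (hypothetical): "D" + category number + two-digit serial
    for dimension members, "M" + two-digit serial for metrics.
    """
    mapping = {}
    for cat_no, members in enumerate(dimensions.values(), start=1):
        for serial, member in enumerate(members, start=1):
            mapping[member] = f"D{cat_no}{serial:02d}"
    for serial, metric in enumerate(metrics, start=1):
        mapping[metric] = f"M{serial:02d}"
    return mapping


# Categories are written in dimension category order (dicts keep insertion order).
dimensions = {
    "time": ["first quarter", "second quarter"],
    "product": ["product B"],
}
code_map = build_code_mapping(dimensions, ["sales amount"])
```

Whatever rule is chosen, the codes must be unique so that a spliced index code can be traced back to exactly one set of dimension and measurement indexes.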
205. Acquiring an index name of a service requirement;
206. performing semantic word segmentation processing on the index names by adopting a preset natural language processing model to obtain a plurality of index words;
207. acquiring the multi-dimensional index table and the index dimension code mapping table, and screening out the dimension indexes and measurement indexes corresponding to each index vocabulary from the multi-dimensional index table;
208. searching the index dimension code mapping table for the atomic index code corresponding to each dimension index and each measurement index, and sorting the atomic index codes according to a preset rule;
209. and splicing the atomic index codes based on their sorting order to obtain the index code of the demand index name.
The embodiment of the invention describes in detail how the multi-dimensional index table and the code mapping table are constructed. Dimension indexes and measurement indexes of different dimensions are obtained by clustering a plurality of feature vocabularies in the historical service information and are combined into the multi-dimensional index table; each dimension index and each measurement index is then encoded according to a preset encoding rule and stored in the code mapping table. The code mapping table can be applied directly when index codes are generated later, which improves the efficiency of index code generation.
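Steps 208 and 209 can be sketched as follows. The preset sorting rule is not given in the embodiment, so this example assumes plain lexicographic ordering of the atomic index codes and a hyphen as the separator; the code values are hypothetical.

```python
def generate_index_code(index_words, code_map, sep="-"):
    """Look up, sort, and splice atomic index codes (steps 208 and 209)."""
    missing = [w for w in index_words if w not in code_map]
    if missing:
        raise KeyError(f"no atomic index code for: {missing}")
    # Assumed preset rule: lexicographic order of the atomic codes.
    codes = sorted(code_map[w] for w in index_words)
    return sep.join(codes)


code_map = {"first quarter": "D101", "product B": "D201", "sales amount": "M01"}
code_a = generate_index_code(["sales amount", "product B", "first quarter"], code_map)
code_b = generate_index_code(["first quarter", "sales amount", "product B"], code_map)
```

Because the codes are sorted before splicing, the two differently worded inputs yield the same index code, which is the ambiguity the method aims to remove.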
Referring to fig. 3, a third embodiment of the index code generation method according to the embodiment of the present invention includes:
301. acquiring an index name of a service requirement;
302. performing semantic word segmentation processing on the index names by adopting a preset natural language processing model to obtain a plurality of index words;
303. acquiring a multi-dimensional index table and an index dimensional coding mapping table;
304. extracting identification information of an index type in the multi-dimensional index table, and determining a first type index and a second type index in the multi-dimensional index table based on the identification information;
in this embodiment, the multi-dimensional index table includes a dimensional index and a measurement index, and the two indexes have different identification information, and the dimensional index (a first type index) and the measurement index (a second type index) can be distinguished through the identification information;
305. matching the index vocabulary with the first type index, and determining one or more dimension indexes contained in the index vocabulary;
306. matching the screened index vocabulary with the second type index, and determining one or more measurement indexes contained in the index vocabulary;
through the measurement indexes in the multi-dimensional index table, the measurement index vocabularies contained in the index vocabularies and their corresponding measurement indexes can be screened out; through the dimension indexes in the multi-dimensional index table, the dimension index vocabularies contained in the index vocabularies and their corresponding dimension indexes can be screened out. The measurement indexes and dimension indexes can be matched with the index vocabularies through similar vocabularies, which include synonyms, homophones, error-corrected words and the like.
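Steps 304 to 306 can be sketched as below. The "type" field stands in for the identification information of the index type, and the similar-vocabulary dictionary is a hypothetical stand-in for synonym and error-correction matching; neither layout comes from the embodiment.

```python
# Hypothetical table rows carrying an identification flag for the index type.
index_table = [
    {"name": "first quarter", "type": "dimension"},
    {"name": "product B", "type": "dimension"},
    {"name": "sales amount", "type": "metric"},
]
# Hypothetical similar-vocabulary dictionary (synonyms, misspellings).
similar = {
    "Q1": "first quarter",
    "sales amout": "sales amount",
    "revenue": "sales amount",
}


def screen(index_words, index_table, similar):
    """Split index words into dimension indexes, metrics, and unmatched words."""
    by_type = {"dimension": set(), "metric": set()}
    for row in index_table:
        by_type[row["type"]].add(row["name"])
    dims, metrics, unmatched = [], [], []
    for word in index_words:
        canonical = similar.get(word, word)
        if canonical in by_type["dimension"]:
            dims.append(canonical)
        elif canonical in by_type["metric"]:
            metrics.append(canonical)
        else:
            unmatched.append(word)
    return dims, metrics, unmatched


dims, metrics, unmatched = screen(
    ["Q1", "product B", "revenue", "region"], index_table, similar
)
```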
307. Judging whether the multi-dimensional index table contains dimension indexes and measurement indexes corresponding to all the index vocabularies;
in this embodiment, because the multi-dimensional index table may not have been updated or some index vocabularies may not be recognized, the multi-dimensional index table may not contain the dimension indexes and measurement indexes corresponding to all the index vocabularies, so it is necessary to determine whether every index vocabulary matches a corresponding dimension index or measurement index.
308. And if not, generating reminding information based on the index vocabulary not contained so as to prompt a user to perfect the multi-dimensional index table.
In this embodiment, if the dimension index or measurement index corresponding to an index vocabulary cannot be found in the multi-dimensional index table, the user needs to maintain the multi-dimensional index table manually by adding the dimension category and the corresponding dimension index or measurement index to it.
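The check in steps 307 and 308 can be sketched as a small helper. The wording of the reminder and the dictionary layout of the table are assumptions for illustration only.

```python
def build_reminder(index_words, index_table):
    """Return a reminder naming index words missing from the table, or None."""
    seen, missing = set(), []
    for word in index_words:
        if word not in index_table and word not in seen:
            seen.add(word)
            missing.append(word)
    if not missing:
        return None
    return (
        "The multi-dimensional index table has no dimension or measurement "
        "index for: " + ", ".join(missing)
    )


# Hypothetical table: index name -> index type.
index_table = {"first quarter": "dimension", "sales amount": "metric"}
reminder = build_reminder(
    ["first quarter", "region", "sales amount", "region"], index_table
)
```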
309. Searching the index dimension code mapping table for the atomic index code corresponding to each dimension index and each measurement index, and sorting the atomic index codes according to a preset rule;
310. and splicing the atomic index codes based on their sorting order to obtain the index code of the demand index name.
The embodiment of the invention describes in detail how the dimension indexes and measurement indexes corresponding to the index vocabularies are screened out from the multi-dimensional index table. For dimension indexes or measurement indexes not contained in the multi-dimensional index table, the user needs to redevelop the underlying technology according to the requirements of the new index vocabularies so as to perfect the multi-dimensional index table, which takes both the standardization and the comprehensiveness of the multi-dimensional index table into account.
Referring to fig. 4, a fourth embodiment of the index code generation method according to the embodiment of the present invention includes:
401. acquiring an index name of a service requirement;
402. determining the measurement caliber of the index name, and judging whether the measurement caliber of the index name meets the preset measurement caliber specification;
in this embodiment, checking the measurement caliber means checking whether the wording of the index name is standard and whether its measurement indexes are recorded in the multi-dimensional index table. For example, a measurement index or dimension index in the index name may contain a wrongly written character or a synonym, or a measurement index may be expressed differently from its record in the multi-dimensional index table: if the index name says "first two quarters", the corresponding record in the multi-dimensional index table can be determined to be "first quarter + second quarter".
403. If so, performing vocabulary entity standardization processing on the index name by adopting a preset character recognition library, so that the index name conforms to the word segmentation processing rule of the preset natural language processing model.
In this embodiment, a plurality of data texts in the business field, including documents, periodicals, newsletters, blogs, web pages, and the like, are acquired; the entity meanings of the single characters and words in each data text are extracted; the single characters and words are associated according to their entity meanings and stored in the character recognition library; subsequently, based on the character recognition library, the single characters and words in the index name are replaced with the corresponding characters and words used for the index vocabularies in the multi-dimensional index table.
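The standardization step can be pictured with the sketch below, in which the character recognition library is reduced to a hypothetical dictionary from variant wording to the canonical wording recorded in the multi-dimensional index table. How the library is actually built from business texts is not reproduced here.

```python
import re

# Hypothetical recognition library: variant wording -> canonical wording.
RECOGNITION_LIBRARY = {
    "first two quarters": "first quarter + second quarter",
    "sales amout": "sales amount",
    "turnover": "sales amount",
}


def normalize_index_name(name, library=RECOGNITION_LIBRARY):
    """Replace variant wording in an index name with its canonical form."""
    # Longer variants are replaced first so that a short variant never
    # rewrites part of a longer one.
    for variant in sorted(library, key=len, reverse=True):
        pattern = re.compile(re.escape(variant), flags=re.IGNORECASE)
        canonical = library[variant]
        name = pattern.sub(lambda match, c=canonical: c, name)
    return name


normalized = normalize_index_name(
    "Turnover of product B in the first two quarters"
)
```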
404. Performing semantic word segmentation processing on the index names by adopting a preset natural language processing model to obtain a plurality of index words;
405. acquiring the multi-dimensional index table and the index dimension code mapping table, and screening out the dimension indexes and measurement indexes corresponding to each index vocabulary from the multi-dimensional index table;
406. searching the index dimension code mapping table for the atomic index code corresponding to each dimension index and each measurement index, and sorting the atomic index codes according to a preset rule;
407. and splicing the atomic index codes based on their sorting order to obtain the index code of the demand index name.
In the embodiment of the invention, after the index name is obtained, whether it conforms to the preset measurement caliber specification is determined, and the index name is standardized to fit the word segmentation processing rule of the preset natural language processing model, which improves the efficiency of the subsequent processing of the index name.
The index code generation method in the embodiment of the present invention is described above. Referring to fig. 5, the index code generation apparatus in the embodiment of the present invention is described below; one embodiment of the apparatus includes:
an obtaining module 501, configured to obtain an index name of a service requirement;
a word segmentation module 502, configured to perform semantic word segmentation on the index name by using a preset natural language processing model to obtain multiple index words;
the screening module 503 is configured to obtain a multidimensional index table and an index dimension code mapping table, and screen out a dimension index and a measurement index corresponding to each index vocabulary from the multidimensional index table;
a searching module 504, configured to search the index dimension code mapping table for the atomic index codes corresponding to the dimension indexes and the measurement indexes, and sort the atomic index codes according to a preset rule;
and a splicing module 505, configured to splice the atomic index codes based on the sorting order of the atomic index codes, so as to obtain the index code of the demand index name.
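One way to picture how modules 501 to 505 cooperate is the class sketch below. The greedy longest-match segmentation is only a stand-in for the preset natural language processing model, and the table layout, code values, sorting rule, and separator are hypothetical.

```python
class IndexCodeGenerator:
    """Illustrative pipeline mirroring modules 501 to 505."""

    def __init__(self, index_table, code_map):
        self.index_table = index_table  # index name -> "dimension" or "metric"
        self.code_map = code_map        # index name -> atomic index code

    def segment(self, name):
        """Stand-in for module 502: greedy longest match on known index names."""
        vocab = sorted(self.index_table, key=len, reverse=True)
        words, i = [], 0
        while i < len(name):
            for term in vocab:
                if name.startswith(term, i):
                    words.append(term)
                    i += len(term)
                    break
            else:
                i += 1  # skip characters that belong to no known index word
        return words

    def screen(self, words):
        """Module 503: keep words that are dimension or measurement indexes."""
        return [w for w in words if w in self.index_table]

    def look_up_and_sort(self, indexes):
        """Module 504: look up atomic codes and sort them (assumed rule)."""
        return sorted(self.code_map[i] for i in indexes)

    def splice(self, codes):
        """Module 505: join the sorted atomic codes into one index code."""
        return "-".join(codes)

    def generate(self, name):
        """Module 501 supplies `name`; the remaining modules run in order."""
        words = self.segment(name)
        return self.splice(self.look_up_and_sort(self.screen(words)))


generator = IndexCodeGenerator(
    {"first quarter": "dimension", "product B": "dimension", "sales amount": "metric"},
    {"first quarter": "D101", "product B": "D201", "sales amount": "M01"},
)
index_code = generator.generate("sales amount of product B in first quarter")
```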
The embodiment of the invention provides an index management method based on natural language processing. In this method, a natural language processing model performs semantic word segmentation on the obtained index name and extracts a plurality of index vocabularies from it. The dimension indexes and measurement indexes corresponding to each index vocabulary are screened out through a pre-constructed multi-dimensional index table, and the atomic index code of each dimension index and measurement index is determined through an index dimension code mapping table. The atomic index codes are then sorted according to a preset rule and spliced into an index code with a unique ordering, which avoids the ambiguity that arises when the same atomic index codes are arranged in different orders. Because each index name is expressed by its index code, the service index management architecture becomes more standardized.
Referring to fig. 6, another embodiment of the index code generating apparatus according to the embodiment of the present invention includes:
an obtaining module 501, configured to obtain an index name of a service requirement;
a word segmentation module 502, configured to perform semantic word segmentation on the index name by using a preset natural language processing model to obtain multiple index words;
the screening module 503 is configured to obtain a multidimensional index table and an index dimension code mapping table, and screen out a dimension index and a measurement index corresponding to each index vocabulary from the multidimensional index table;
a searching module 504, configured to search the index dimension code mapping table for the atomic index codes corresponding to the dimension indexes and the measurement indexes, and sort the atomic index codes according to a preset rule;
and a splicing module 505, configured to splice the atomic index codes based on the sorting order of the atomic index codes, so as to obtain the index code of the demand index name.
Specifically, the index code generation device further includes:
an abstract processing module 506, configured to obtain historical service information, and perform abstract processing on the historical service information to obtain a plurality of feature vocabularies;
a clustering module 507, configured to perform clustering processing on the feature vocabularies by using a preset clustering model to obtain multiple dimension indexes of different dimensions and one or more measurement indexes;
a combination module 508, configured to fix the one or more measurement indexes, and combine the dimension indexes based on the dimension to which the dimension indexes belong to obtain a corresponding multidimensional index table;
an encoding module 509, configured to encode the dimension index and the measurement index to obtain a corresponding atomic index code, and create an index dimension code mapping table based on the atomic index code.
Specifically, the screening module includes:
an extracting unit 5031, configured to extract identification information of an index type in the multi-dimensional index table, and determine a first type index and a second type index in the multi-dimensional index table based on the identification information;
a first matching unit 5032, configured to match the index vocabulary with the first type index, and determine one or more dimension indexes included in the index vocabulary;
a second matching unit 5033, configured to match the screened index vocabulary with the second type index, and determine one or more measurement indexes included in the index vocabulary.
Specifically, the index code generation device further includes:
a first determining module 510, configured to determine whether the multidimensional index table includes dimension indexes and measurement indexes corresponding to all the index vocabularies;
the generating module 511 is configured to generate a prompting message based on the index vocabulary not included if the multidimensional index table does not include the dimension indexes and the measurement indexes corresponding to all the index vocabularies, so as to prompt a user to perfect the multidimensional index table.
Specifically, the index code generation device further includes:
a second judging module 512, configured to determine the measurement caliber of the index name, and judge whether the measurement caliber of the index name meets a preset measurement caliber specification;
and a standardization processing module 513, configured to, if the measurement caliber of the index name meets the preset measurement caliber specification, perform vocabulary entity standardization processing on the index name by using a preset character recognition library so that the index name conforms to the word segmentation processing rule of the preset natural language processing model.
Specifically, the index code generation device further includes a report display module 514, configured to obtain, based on each index code, a plurality of metric data corresponding to the service requirement after the index codes of all index names in the service requirement are obtained; and generating a business report according to the measurement data, and displaying the business report.
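The report step can be sketched as follows, assuming a hypothetical in-memory store that maps index codes to measurement data and a plain tab-separated text layout; the embodiment does not specify the data source or the report format.

```python
def build_report(index_codes, metric_store):
    """Render one tab-separated line per index code with its measurement data."""
    lines = ["index code\tvalue"]
    for code in index_codes:
        value = metric_store.get(code, "n/a")
        lines.append(f"{code}\t{value}")
    return "\n".join(lines)


# Hypothetical store: index code -> measurement data.
metric_store = {"D101-D201-M01": 1200.0}
report = build_report(["D101-D201-M01", "D102-D201-M01"], metric_store)
```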
Specifically, the index code is also stored in a block chain.
In the embodiment of the invention, a multi-dimensional index table is obtained by clustering a plurality of feature vocabularies in the historical service information, and the indexes are encoded according to a preset rule to obtain a code mapping table, which can be applied directly when index codes are generated later and thus improves the efficiency of index code generation. The embodiment also describes how the dimension indexes and measurement indexes corresponding to the index vocabularies are screened out from the multi-dimensional index table and how the multi-dimensional index table is perfected for dimension indexes or measurement indexes it does not contain, which takes both the standardization and the comprehensiveness of the multi-dimensional index table into account. Finally, after the index name is obtained, whether it conforms to the preset measurement caliber specification is determined, and the index name is standardized to fit the word segmentation processing rule of the preset natural language processing model, which improves the efficiency of the subsequent processing of the index name.
Fig. 5 and fig. 6 describe the index code generation apparatus in the embodiment of the present invention in detail from the perspective of modular functional entities; the index code generation apparatus in the embodiment of the present invention is described in detail below from the perspective of hardware processing.
Fig. 7 is a schematic structural diagram of an index code generating apparatus 700 according to an embodiment of the present invention. The index code generating apparatus 700 may vary considerably depending on its configuration or performance, and may include one or more central processing units (CPUs) 710, a memory 720, and one or more storage media 730 (e.g., one or more mass storage devices) storing an application 733 or data 732. The memory 720 and the storage medium 730 may be transient storage or persistent storage. The program stored in the storage medium 730 may include one or more modules (not shown), each of which may include a series of instruction operations on the index code generating apparatus 700. Further, the processor 710 may be configured to communicate with the storage medium 730 and to execute the series of instruction operations in the storage medium 730 on the index code generating apparatus 700.
The index code generating apparatus 700 may also include one or more power supplies 740, one or more wired or wireless network interfaces 750, one or more input-output interfaces 760, and/or one or more operating systems 731, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and so forth. Those skilled in the art will appreciate that the configuration shown in fig. 7 does not limit the index code generating apparatus, which may include more or fewer components than those shown, may combine some components, or may arrange the components differently.
The present invention also provides an index code generation device, which includes a memory and a processor, where the memory stores computer readable instructions, and the computer readable instructions, when executed by the processor, cause the processor to execute the steps of the index code generation method in the above embodiments.
The present invention also provides a computer-readable storage medium, which may be a non-volatile computer-readable storage medium, and which may also be a volatile computer-readable storage medium, having stored therein instructions, which, when run on a computer, cause the computer to perform the steps of the index code generation method.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
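The idea of data blocks linked by a cryptographic method can be illustrated with a minimal hash chain that stores index codes. This is only a sketch of the linking and verification idea: it has no distribution, peer-to-peer transmission, or consensus mechanism, and the block layout is an assumption rather than the storage scheme of the embodiment.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # assumed placeholder hash for the first block


def make_block(index_codes, prev_hash):
    """Create a block whose hash covers its index codes and the previous hash."""
    payload = json.dumps({"codes": index_codes, "prev": prev_hash}, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {"codes": index_codes, "prev": prev_hash, "hash": digest}


def verify_chain(chain):
    """Check that every block links to its predecessor and has a valid hash."""
    prev = GENESIS_PREV
    for block in chain:
        expected = make_block(block["codes"], block["prev"])["hash"]
        if block["prev"] != prev or block["hash"] != expected:
            return False
        prev = block["hash"]
    return True


first = make_block(["D101-D201-M01"], GENESIS_PREV)
second = make_block(["D102-D201-M01"], first["hash"])
chain = [first, second]
```

Changing the index codes in any stored block changes the hash that block should have, so `verify_chain` returns `False` for a tampered chain.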
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. An index code generation method, characterized by comprising:
acquiring an index name of a service requirement;
performing semantic word segmentation processing on the index names by adopting a preset natural language processing model to obtain a plurality of index words;
acquiring a multi-dimensional index table and an index dimensional coding mapping table, and screening out the dimensional indexes and measurement indexes corresponding to the index vocabularies from the multi-dimensional index table;
searching the index dimension code mapping table for the atomic index code corresponding to each dimension index and each measurement index, and sorting the atomic index codes according to a preset rule;
and splicing the atomic index codes based on the sequencing sequence of the atomic index codes to obtain the index code of the demand index name.
2. The index code generation method according to claim 1, further comprising, before the obtaining the index name of the service requirement:
acquiring historical service information, and performing abstract processing on the historical service information to obtain a plurality of feature words;
clustering each feature vocabulary by adopting a preset clustering model to obtain a plurality of dimension indexes with different dimensions and one or more measurement indexes;
fixing the one or more measurement indexes, and combining the dimension indexes based on the dimension of each dimension index to obtain a corresponding multi-dimension index table;
and coding the dimension index and the measurement index to obtain a corresponding atomic index code, and creating an index dimension code mapping table based on the atomic index code.
3. The index code generation method according to claim 1, wherein the screening of the dimension index and the metric index corresponding to each index vocabulary from the multi-dimension index table includes:
extracting identification information of an index type in the multi-dimensional index table, and determining a first type index and a second type index in the multi-dimensional index table based on the identification information;
matching the index vocabulary with the first type index, and determining one or more dimension indexes contained in the index vocabulary;
and matching the screened index vocabulary with the second type index, and determining one or more measurement indexes contained in the index vocabulary.
4. The index code generation method according to claim 1, further comprising, before the screening out the dimension index and the metric index corresponding to each index vocabulary from the multi-dimension index table, the steps of:
judging whether the multi-dimensional index table contains dimension indexes and measurement indexes corresponding to all the index vocabularies;
and if not, generating reminding information based on the index vocabulary not contained so as to prompt a user to perfect the multi-dimensional index table.
5. The index code generation method according to any one of claims 1 to 4, further comprising, after the obtaining of the index name of the service requirement:
determining the measuring caliber of the index name, and judging whether the measuring caliber of the index name meets the preset measuring caliber specification;
if so, performing vocabulary entity standardization processing on the index name by adopting a preset character recognition library, so that the index name conforms to the word segmentation processing rule of the preset natural language processing model.
6. The index code generation method according to claim 5, wherein after the atomic index codes are spliced based on the sorting order of the atomic index codes to obtain the index code of the required index name, the method further comprises:
after index codes of all index names in the service demands are obtained, acquiring a plurality of measurement data corresponding to the service demands based on each index code;
and generating a business report according to the measurement data, and displaying the business report.
7. The index code generation method according to claim 1, wherein the index code is further stored in a block chain.
8. An index code generation device, characterized by comprising:
the acquisition module is used for acquiring the index name of the service requirement;
the word segmentation module is used for performing semantic word segmentation processing on the index names by adopting a preset natural language processing model to obtain a plurality of index words;
the screening module is used for acquiring a multi-dimensional index table and an index dimensional code mapping table and screening out the dimensional indexes and the measurement indexes corresponding to the index vocabularies from the multi-dimensional index table;
the searching module is used for searching the dimensional indexes and the atomic index codes corresponding to the measurement indexes from the index dimensional code mapping table and sequencing the atomic index codes according to a preset rule;
and the splicing module is used for splicing the atomic index codes based on the sequencing sequence of the atomic index codes to obtain the index code of the demand index name.
9. An index code generation device characterized by comprising: a memory and at least one processor, the memory having instructions stored therein;
the at least one processor invokes the instructions in the memory to cause the index code generation device to perform the index code generation method of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements an index code generation method according to any one of claims 1 to 7.
CN202011298686.9A 2020-11-19 2020-11-19 Index code generation method, device, equipment and storage medium Active CN112364603B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011298686.9A CN112364603B (en) 2020-11-19 2020-11-19 Index code generation method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011298686.9A CN112364603B (en) 2020-11-19 2020-11-19 Index code generation method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112364603A true CN112364603A (en) 2021-02-12
CN112364603B CN112364603B (en) 2023-10-03

Family

ID=74532961

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011298686.9A Active CN112364603B (en) 2020-11-19 2020-11-19 Index code generation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112364603B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108038175A (en) * 2017-09-30 2018-05-15 用友金融信息技术股份有限公司 Multidimensional data dynamically associates querying method, device, computer equipment and medium
CN110427434A (en) * 2019-06-28 2019-11-08 苏宁云计算有限公司 A kind of multidimensional data query method and device
CN111125121A (en) * 2020-03-30 2020-05-08 四川新网银行股份有限公司 Real-time data display method based on HBase table
CN111737476A (en) * 2020-08-05 2020-10-02 腾讯科技(深圳)有限公司 Text processing method and device, computer readable storage medium and electronic equipment
CN111949655A (en) * 2020-07-24 2020-11-17 北京每日优鲜电子商务有限公司 Form display method and device, electronic equipment and medium

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113590607A (en) * 2021-09-29 2021-11-02 国网江苏省电力有限公司营销服务中心 Electric power marketing report realization method and system based on report factor
CN115150201A (en) * 2022-09-02 2022-10-04 南通市艺龙科技有限公司 Remote encryption transmission method for cloud computing data
CN115150201B (en) * 2022-09-02 2022-11-08 南通市艺龙科技有限公司 Remote encryption transmission method for cloud computing data

Also Published As

Publication number Publication date
CN112364603B (en) 2023-10-03

Similar Documents

Publication Publication Date Title
US20150134569A1 (en) Domain-specific syntactic tagging in a functional information system
Nakagawa et al. Stock price prediction using k‐medoids clustering with indexing dynamic time warping
CN113283675B (en) Index data analysis method, device, equipment and storage medium
KR101511656B1 (en) Ascribing actionable attributes to data that describes a personal identity
CN110162754B (en) Method and equipment for generating post description document
CN112364603B (en) Index code generation method, device, equipment and storage medium
CN113435202A (en) Product recommendation method and device based on user portrait, electronic equipment and medium
CN111209400A (en) Data analysis method and device
CN110990529A (en) Enterprise industry detail division method and system
CN111127068A (en) Automatic pricing method and device for engineering quantity list
CN114037545A (en) Client recommendation method, device, equipment and storage medium
CN113342977B (en) Invoice image classification method, device, equipment and storage medium
CN111078828A (en) Enterprise historical information extraction method and system
KR101753768B1 (en) A knowledge management system of searching documents on categories by using weights
CN108073678B (en) Document analysis processing method, system and device applied to big data analysis
CN113159118A (en) Logistics data index processing method, device, equipment and storage medium
US20140201193A1 (en) Intellectual property asset information retrieval system
CN115131139B (en) Method, device and medium for obtaining target result based on structural data
CN113240325B (en) Data processing method, device, equipment and storage medium
CN116150185A (en) Data standard extraction method, device, equipment and medium based on artificial intelligence
CN114926082A (en) Artificial intelligence-based data fluctuation early warning method and related equipment
CN112818215A (en) Product data processing method, device, equipment and storage medium
CN113269179A (en) Data processing method, device, equipment and storage medium
CN113571198A (en) Conversion rate prediction method, device, equipment and storage medium
CN110909112B (en) Data extraction method, device, terminal equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant