CN112948429B - Data reporting method, device and equipment - Google Patents

Data reporting method, device and equipment

Info

Publication number
CN112948429B
Authority
CN
China
Prior art keywords
reported
data
check
reporting
subset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110141803.9A
Other languages
Chinese (zh)
Other versions
CN112948429A (en
Inventor
赵乐
王超
李铮杰
池纪锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC filed Critical Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202110141803.9A priority Critical patent/CN112948429B/en
Publication of CN112948429A publication Critical patent/CN112948429A/en
Application granted granted Critical
Publication of CN112948429B publication Critical patent/CN112948429B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2453: Query optimisation
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of this specification provide a data reporting method, device and equipment. The method comprises: acquiring a target data set to be reported, wherein the target data set to be reported comprises a plurality of first data subsets to be reported of a target service type, each first data subset to be reported comprises a plurality of groups of data, and each group of data comprises at least a value of a field to be reported; acquiring a check data set, wherein the check data set contains check rules for the fields to be reported; using the check data set to remove, from each first data subset to be reported, the values of fields to be reported that do not conform to the check rules, thereby obtaining a plurality of second data subsets to be reported; and reporting data based on the plurality of second data subsets to be reported. In these embodiments, the fields to be reported are checked and screened before the second data subsets to be reported are generated, so that every reported field value conforms to the regulatory specification, which effectively improves the accuracy of data reporting.

Description

Data reporting method, device and equipment
Technical Field
The embodiment of the specification relates to the technical field of big data, in particular to a data reporting method, a device and equipment.
Background
As the regulatory reporting regime continues to mature, regulators require not only more reported data; the quality of that data has also become a focus of regulatory inspection. The reporting flow in the prior art is as follows: the fields to be reported are first extracted from the bank's big data and reported to the regulator, and data quality problems in the reported data are then explained as the regulator requires. This approach only ensures that the full volume of data is reported. It cannot go down to the level of individual values and ensure that each value selected for each reported field is the one, among the existing data, that best meets the regulatory reporting requirements, so the quality of the reported data is low. The prior art therefore cannot report data accurately.
In view of the above problems, no effective solution has been proposed at present.
Disclosure of Invention
The embodiment of the specification provides a data reporting method, device and equipment, which are used for solving the problem that the data reporting cannot be accurately carried out in the prior art.
The embodiments of this specification provide a data reporting method, which comprises: acquiring a target data set to be reported, wherein the target data set to be reported comprises a plurality of first data subsets to be reported of a target service type, each first data subset to be reported comprises a plurality of groups of data, and each group of data comprises at least a value of a field to be reported; acquiring a check data set, wherein the check data set contains check rules for the fields to be reported; using the check data set to remove, from each first data subset to be reported, the values of fields to be reported that do not conform to the check rules, thereby obtaining a plurality of second data subsets to be reported; and reporting data based on the plurality of second data subsets to be reported.
The embodiment of the specification also provides a data reporting device, which comprises: the first acquisition module is used for acquiring a target data set to be reported; the target data set to be reported comprises a plurality of first data subsets to be reported of a target service type, each first data subset to be reported comprises a plurality of groups of data, and each group of data at least comprises a value of a field to be reported; the second acquisition module is used for acquiring a check data set; wherein the check data set contains check rules of fields to be reported; the processing module is used for removing the values of the to-be-reported fields which do not accord with the check rule in each first to-be-reported data subset by utilizing the check data set to obtain a plurality of second to-be-reported data subsets; and the data reporting module is used for reporting data based on the plurality of second data subsets to be reported.
The embodiments of this specification also provide data reporting equipment, which comprises a processor and a memory for storing instructions executable by the processor, wherein the steps of the data reporting method are implemented when the processor executes the instructions.
The embodiments of this specification also provide a computer-readable storage medium having computer instructions stored thereon which, when executed, implement the steps of any of the data reporting methods.
The embodiment of the specification provides a data reporting method, which can acquire a target data set to be reported and a check data set, wherein the target data set to be reported comprises a plurality of first data subsets to be reported of a target service type, each first data subset to be reported comprises a plurality of groups of data, each group of data at least comprises a value of a field to be reported, and the check data set comprises a check rule of the field to be reported. Therefore, the check data set can be utilized to remove the values of the to-be-reported fields which do not accord with the check rules in the first to-be-reported data subsets, so as to obtain a plurality of second to-be-reported data subsets, and the data reporting is performed based on the second to-be-reported data subsets. Therefore, the fields to be reported can be checked and screened first, and then the second data subset to be reported is generated, so that all the fields to be reported are numerical values meeting the supervision standard, and the accuracy of data reporting is effectively improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of embodiments of the present specification, are incorporated in and constitute a part of this specification and do not limit the embodiments of the present specification. In the drawings:
Fig. 1 is a schematic diagram of the steps of a data reporting method according to an embodiment of this specification;
Fig. 2 is a schematic structural diagram of a data reporting device according to an embodiment of this specification;
Fig. 3 is a schematic structural diagram of data reporting equipment according to an embodiment of this specification.
Detailed Description
The principles and spirit of the embodiments of the present specification will be described below with reference to several exemplary implementations. It should be understood that these embodiments are presented merely to enable one skilled in the art to better understand and implement the present description embodiments and are not intended to limit the scope of the present description embodiments in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Those skilled in the art will appreciate that the implementations of the embodiments of the present description may be implemented as a system, apparatus, method, or computer program product. Accordingly, the present disclosure may be embodied in the following forms, namely: complete hardware, complete software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
While the flow described below includes a number of operations occurring in a particular order, it should be apparent that these processes may include more or fewer operations, which may be performed sequentially or in parallel (e.g., using a parallel processor or a multi-threaded environment).
In some cases, the reporting procedure may be as follows. First, reporting is completed: the business side analyzes the reporting requirements, and developers provide the data-fetch logic, fetch the data, and report it to the regulator. Then, data quality problems in the reported data are explained as the regulator requires: developers analyze the check specification, check the data, and analyze the check results. Given the massive volume of data a bank holds, there is no guarantee that the personnel who develop the reporting logic have complete information or have fully considered all data relevant to reporting, so the fetch logic they formulate may not be optimal. In addition, the fetch logic in the prior art only ensures that the full volume of data is reported; it cannot ensure that each value selected for each reported field is the one, among the existing data, that best meets the regulatory reporting requirements, so the quality of the reported data is low. Further, in some cases this approach requires communication among business departments, then with developers, and then among the developers themselves. This communication takes a long time and consumes considerable labor, so determining the reporting logic takes a long time and data cannot be reported efficiently and accurately.
Based on this, referring to fig. 1, the present embodiment may provide a data reporting method. The data reporting method can be used for efficiently and accurately reporting data. The data reporting method may include the following steps.
S101: acquiring a target data set to be reported; the target data set to be reported comprises a plurality of first data subsets to be reported of a target service type, each first data subset to be reported comprises a plurality of groups of data, and each group of data at least comprises a value of a field to be reported.
In this embodiment, the target data set to be reported may be acquired first. The target data set to be reported may be the set of data that is stored in the data middle platform and is to be reported, and it may cover all fields to be reported of a target service type. For example, the table to be reported is named "personal demand deposit account"; it covers data of three service types (personal demand deposit, account foreign exchange, and account precious metal) and requires fields such as demand deposit account number, deposit balance, account name, and account opening date to be reported.
In this embodiment, the target data set to be reported may be obtained from the data middle platform, which stores all historical data delivered by upstream applications. Since the data middle platform typically stores more than one data set for each service type, the data to be reported may be determined separately for each service type. The target data set to be reported includes a plurality of first data subsets to be reported of a target service type, each first data subset to be reported includes a plurality of groups of data, and each group of data may include at least a value of one field to be reported.
In this embodiment, when the target service type is personal demand deposit, the tables recorded in the data middle platform whose service type is personal demand deposit include table a, table c and table d. Each of these tables records fields such as user number, demand deposit account number, deposit balance, account name and account opening date. Because different fields have different check rules, the user number column together with each column of a field to be reported in table a, table c and table d may be used as one first data subset to be reported. In some embodiments, table a, table c and table d need not be physically split; instead, the user number column and one column of a field to be reported are simply checked together as a unit during checking. Of course, the tables may also be physically split. This may be determined according to the actual situation and is not limited by the embodiments of this specification.
In this embodiment, the first data subset to be reported may be recorded in a table. Each user has a unique number, which may be called the user number, and the user number may serve as the unique primary key of each table in the data middle platform. Of course, the first data subset to be reported may also be recorded in other forms, for example as text, which may be determined according to the actual situation and is not limited by the embodiments of this specification. A field to be reported may be a field in the table whose field name is the same as a field name specified for the target service type. A column of a table is called a "field", and each field contains information on one topic. For example, in an "address book" database, "name" and "contact" are attributes common to all rows of the table, so these columns are referred to as the "name" field and the "contact" field.
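The splitting described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the in-memory tables, the column names `user_no`, `open_date` and `balance`, and the helper `split_into_first_subsets` are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical source table: a list of rows, each row keyed by column name.
SourceTable = List[Dict[str, str]]


def split_into_first_subsets(
    source_tables: Dict[str, SourceTable],
    key_column: str,
    report_fields: List[str],
) -> Dict[Tuple[str, str], List[Tuple[str, str]]]:
    """Build one subset per (table name, field): a list of (user number, value) pairs."""
    subsets: Dict[Tuple[str, str], List[Tuple[str, str]]] = {}
    for table_name, rows in source_tables.items():
        for field in report_fields:
            subsets[(table_name, field)] = [
                (row[key_column], row[field]) for row in rows if field in row
            ]
    return subsets


tables = {
    "table_a": [
        {"user_no": "0000001", "open_date": "20200115", "balance": "100.00"},
        {"user_no": "0000002", "open_date": "99991231", "balance": "250.50"},
    ],
    "table_c": [
        {"user_no": "0000003", "open_date": "20191201", "balance": "0.00"},
    ],
}
first_subsets = split_into_first_subsets(tables, "user_no", ["open_date", "balance"])
```

Each subset pairs the user number with exactly one field, so a field-specific check rule can later be applied to it in isolation. This corresponds to the "logical" split; the source tables themselves are left untouched.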
S102: acquiring a check data set; the check data set contains check rules for the fields to be reported.
In this embodiment, a check data set may be acquired. The check data set may be used to characterize the reporting requirements of the fields to be reported corresponding to the target service type, and it contains a check rule for at least one field to be reported. For example, the check rule for the account opening date may be: table name: personal demand deposit account; check field: account opening date; reporting requirement: format check, the account opening date of a personal demand deposit account cannot be 99991231. Of course, the check rules are not limited to the above example. Those skilled in the art may make other modifications in light of the technical spirit of the embodiments of this specification, and such modifications fall within the protection scope of these embodiments as long as the functions and effects they achieve are the same as or similar to those of these embodiments.
In this embodiment, the check data set may be acquired by pulling it from a preset database, or by mining it from a text description using a rule extraction method in combination with a corpus. Of course, the check data set may also be acquired in other possible ways, for example by receiving a check data set input by a user, which may be determined according to the actual situation and is not limited by the embodiments of this specification.
S103: using the check data set to remove, from each first data subset to be reported, the values of fields to be reported that do not conform to the check rules, thereby obtaining a plurality of second data subsets to be reported.
In this embodiment, data in each first data subset to be reported may be checked before reporting, and the value of the field to be reported, which does not conform to the check rule, in each first data subset to be reported may be removed by using the check data set, so as to obtain a plurality of second data subsets to be reported. The number of the second data subsets to be reported may be the same as the number of the first data subsets to be reported, and the second data subsets to be reported may be in one-to-one correspondence with the first data subsets to be reported.
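Step S103 can be sketched as a simple filter. The sketch below is illustrative only: the `CheckRule` class and `apply_rule` helper are hypothetical names, the rule is the "cannot be 99991231" example from the text, and dropping the whole (user number, value) pair is one assumed way of "removing" a non-conforming value.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Subset = List[Tuple[str, str]]  # (user number, field value) pairs


@dataclass(frozen=True)
class CheckRule:
    table_name: str
    field_name: str
    is_valid: Callable[[str], bool]  # True when a value satisfies the rule


def apply_rule(subset: Subset, rule: CheckRule) -> Subset:
    """Keep only the pairs whose value satisfies the rule."""
    return [(user_no, value) for user_no, value in subset if rule.is_valid(value)]


# Rule taken from the text: the account opening date cannot be 99991231.
open_date_rule = CheckRule(
    table_name="personal demand deposit account",
    field_name="account opening date",
    is_valid=lambda value: value != "99991231",
)

first_subsets: List[Subset] = [
    [("0000001", "20200115"), ("0000002", "99991231")],
    [("0000003", "20191201")],
]
# One second subset per first subset, in the same order.
second_subsets = [apply_rule(subset, open_date_rule) for subset in first_subsets]
```

Because the list comprehension produces exactly one output per input, the second subsets stay in one-to-one correspondence with the first subsets.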
S104: reporting data based on the plurality of second data subsets to be reported.
In this embodiment, after the verification of the field to be reported is completed, data reporting may be performed based on the second subset of data to be reported. In some embodiments, since the user number does not belong to the field to be reported, the user number columns in the second data subset to be reported may be deleted and then reported, which may be specifically determined according to the actual situation, which is not limited in the embodiments of the present disclosure.
In this embodiment, when the table to be reported is named "personal demand deposit account", it covers data of three service types (personal demand deposit, account foreign exchange, and account precious metal) and requires fields such as demand deposit account number, deposit balance, account name, and account opening date to be reported. Because three different service types are involved, all the data to be reported for the three service types is first determined and then combined into one data set for reporting. The data to be reported for each of the three service types can be determined in parallel, which effectively improves the efficiency of data reporting and reduces the number of reporting connections.
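The parallel determination and final merge can be sketched with a thread pool. This is a minimal sketch under assumptions: `collect_for_service_type` is a hypothetical stand-in for running S101 to S103 for one service type, and the sample rows are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

Row = Dict[str, str]

# Hypothetical checked data per service type (the result of S101 to S103).
CHECKED_DATA: Dict[str, List[Row]] = {
    "personal demand deposit": [{"user_no": "0000001", "balance": "100.00"}],
    "account foreign exchange": [{"user_no": "0000002", "balance": "35.20"}],
    "account precious metal": [{"user_no": "0000003", "balance": "7.00"}],
}


def collect_for_service_type(service_type: str) -> List[Row]:
    """Stand-in for running S101 to S103 for one service type."""
    return CHECKED_DATA[service_type]


service_types = list(CHECKED_DATA)
with ThreadPoolExecutor(max_workers=len(service_types)) as pool:
    # pool.map returns results in the order of service_types.
    per_type = list(pool.map(collect_for_service_type, service_types))

# Merge everything into one data set so it can be reported in a single pass.
combined: List[Row] = [row for rows in per_type for row in rows]
```

Merging before reporting is what allows a single reporting connection to carry the data of all three service types.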
From the above description, it can be seen that the embodiments of this specification achieve the following technical effects. The target data set to be reported and the check data set can be acquired, where the target data set to be reported comprises a plurality of first data subsets to be reported of the target service type, each first data subset to be reported comprises a plurality of groups of data, each group of data comprises at least a value of a field to be reported, and the check data set contains check rules for the fields to be reported. The check data set can therefore be used to remove, from each first data subset to be reported, the values of fields to be reported that do not conform to the check rules, so as to obtain a plurality of second data subsets to be reported, and data is reported based on the plurality of second data subsets to be reported. In this way, the fields to be reported are checked and screened before the second data subsets to be reported are generated, so that every reported field value conforms to the regulatory specification, which effectively improves the accuracy of data reporting.
In one embodiment, acquiring the target data set to be reported may include: acquiring a reporting specification data set, where the reporting specification data set may comprise a plurality of groups of data, and each group of data comprises a table name, the service types it covers, and the fields to be reported. The target service type corresponding to a target table name in the reporting specification data set may be determined, and a plurality of source tables whose service type is the target service type may be obtained from the data middle platform; a source table may include a user number and fields to be reported. Further, when the target service type has a plurality of fields to be reported, a fuzzy matching technique may be used to take each column of a field to be reported, together with the user number column, in each source table as one first data subset to be reported, so as to obtain a plurality of first data subsets to be reported, and the plurality of first data subsets to be reported of the target service type are used as the target data set to be reported.
In this embodiment, each regulatory agency may issue a reporting specification. Open-source word segmentation software, such as the ICTCLAS Chinese word segmentation system and the like, may be used to perform word segmentation on the reporting specification to obtain information such as the table name, the service types covered, and the fields to be reported. For example, performing word segmentation on a reporting specification issued by a regulatory agency may yield the following group of data: the table name is "personal demand deposit account"; it covers data of three service types (personal demand deposit, account foreign exchange, and account precious metal); and the fields to be reported are demand deposit account number, deposit balance, account name, account opening date, and so on. The plurality of groups of data obtained by segmentation may be used as the reporting specification data set, and each group of data may comprise a table name, the service types it covers, and the fields to be reported.
In this embodiment, the target service type may correspond to one or more fields to be reported. The data to be reported for each field to be reported may be determined separately, and the data to be reported for all fields is then combined as the data to be reported of the target service type. Further, a target table name may correspond to one or more service types. The data to be reported for the different service types under the same table name may be determined separately and then combined, and the combined data is reported as the data to be reported for the target table name.
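One way to picture the reporting specification data set is as a list of records, each holding a table name, its service types, and its fields to be reported. The following is a minimal sketch; the `ReportSpec` class and `service_types_for` helper are hypothetical names, and the single record simply restates the example from the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ReportSpec:
    table_name: str
    service_types: Tuple[str, ...]
    report_fields: Tuple[str, ...]


spec_data_set: List[ReportSpec] = [
    ReportSpec(
        table_name="personal demand deposit account",
        service_types=(
            "personal demand deposit",
            "account foreign exchange",
            "account precious metal",
        ),
        report_fields=(
            "demand deposit account number",
            "deposit balance",
            "account name",
            "account opening date",
        ),
    ),
]


def service_types_for(table_name: str, specs: List[ReportSpec]) -> Tuple[str, ...]:
    """Return the service types listed for table_name, or an empty tuple if unknown."""
    for spec in specs:
        if spec.table_name == table_name:
            return spec.service_types
    return ()


target_types = service_types_for("personal demand deposit account", spec_data_set)
```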
In this embodiment, the data middle platform is an intermediate supporting platform that consolidates the services and data of existing and newly built information systems so that data can empower new services and new applications. The data middle platform stores all historical data delivered by upstream applications, and the delivered tables may be used as source tables. A T table in the data middle platform may record the service type of each source table, as shown in Table 1:
TABLE 1
Table name | Service type
Table a | Personal demand deposit
Table b | Deposit account
Table c | Personal demand deposit
Table d | Personal demand deposit
Table e | Account foreign exchange
Table f | Account foreign exchange
... | ...
In this embodiment, the table names of the source tables whose service type is the target service type may be determined according to the T table in the data middle platform, so that a plurality of source tables may be acquired according to the determined table names. A source table may include at least data such as the user number and the fields to be reported.
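The lookup against the T table amounts to filtering a name-to-type mapping. A minimal sketch, assuming the T table is available as an in-memory dictionary whose rows follow Table 1 (the names `T_TABLE` and `source_tables_for` are hypothetical):

```python
from typing import Dict, List

# Rows of Table 1: source table name -> service type.
T_TABLE: Dict[str, str] = {
    "table a": "personal demand deposit",
    "table b": "deposit account",
    "table c": "personal demand deposit",
    "table d": "personal demand deposit",
    "table e": "account foreign exchange",
    "table f": "account foreign exchange",
}


def source_tables_for(service_type: str, t_table: Dict[str, str]) -> List[str]:
    """Return the names of all source tables recorded with the given service type."""
    return [name for name, kind in t_table.items() if kind == service_type]


demand_deposit_tables = source_tables_for("personal demand deposit", T_TABLE)
```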
In this embodiment, since the different fields to be reported need to be checked separately, when the target service type is determined to have a plurality of fields to be reported, a fuzzy matching technique is used to take each column of a field to be reported, together with the user number column, in each source table as one first data subset to be reported, where the first data subset to be reported may use the user number as its unique primary key.
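The text does not say which fuzzy matching technique is used. As one possible stand-in, the sketch below uses `difflib.get_close_matches` from the Python standard library to map a field name from the reporting specification onto the closest column name in a source table. The column names, the `match_column` helper, and the 0.6 similarity cutoff are assumptions made for illustration.

```python
from difflib import get_close_matches
from typing import List, Optional


def match_column(report_field: str, columns: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the source column whose name is closest to report_field, or None."""
    matches = get_close_matches(report_field, columns, n=1, cutoff=cutoff)
    return matches[0] if matches else None


columns = ["user number", "acct opening date", "deposit balance"]
matched = match_column("account opening date", columns)  # close to an abbreviated name
unmatched = match_column("email", columns)               # nothing similar enough
```

A stricter cutoff reduces the risk of pairing a field with the wrong column, at the cost of missing more heavily abbreviated names.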
In this embodiment, when the target service type is personal demand deposit, it can be determined according to Table 1 that the tables recorded in the data middle platform whose service type is personal demand deposit include table a, table c and table d. Each of these tables records fields such as user number, demand deposit account number, deposit balance, account name and account opening date. Because different fields have different check rules, the user number column together with each column of a field to be reported in table a, table c and table d may be used as one first data subset to be reported. In some embodiments, table a, table c and table d need not be physically split; instead, the user number column and one column of a field to be reported are simply checked together as a unit during checking. Of course, the tables may also be physically split. This may be determined according to the actual situation and is not limited by the embodiments of this specification.
In this embodiment, the plurality of source tables of the target service type corresponding to the target table name in the reporting specification data set may be obtained from the data middle platform, and each column of a field to be reported, together with the user number column, in each source table is used as one first data subset to be reported, so that the full set of data to be checked for each field to be reported of the target service type can be obtained efficiently.
In one embodiment, after the plurality of source tables whose service type is the target service type are obtained from the data middle platform, the method may further include: extracting the user numbers from the plurality of source tables using a structured query language, deduplicating them, and inserting them into the user number column of a first summary table to obtain a second summary table. The first summary table contains the user number and the fields to be reported corresponding to the target service type, and the values of the fields to be reported in the first summary table are null. Correspondingly, reporting data based on the plurality of second data subsets to be reported may include: for each field to be reported, writing the values of that field from the second data subsets to be reported into the second summary table to obtain a third summary table, and reporting data based on the third summary table.
In this embodiment, the first summary table may be set up according to the reporting specification data set, and the values in the user number column and in the columns of the fields to be reported in the first summary table may all be null. The user numbers in the plurality of source tables may be extracted using a Structured Query Language (SQL), deduplicated, and inserted into the user number column of the first summary table to obtain the second summary table. This yields the user numbers of all users who hold business of the target service type, and the values of the fields to be reported corresponding to these user numbers are the data to be reported. When the target service type is personal demand deposit, the second summary table may be as shown in Table 2:
TABLE 2
User number | Demand deposit account number | Deposit balance | Account name | Account opening date
0000001 | | | |
0000002 | | | |
0000003 | | | |
0000004 | | | |
0000005 | | | |
... | | | |
n | | | |
In this embodiment, the SQL keyword DISTINCT is used to return only unique values: SELECT DISTINCT removes duplicate rows from the query result. SQL may therefore be used to remove duplicate user numbers across the plurality of source tables. Of course, in some embodiments other deduplication approaches may be used, which may be determined according to the actual situation and is not limited by the embodiments of this specification.
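The deduplication step can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch, assuming two small source tables and a summary table named `summary`; the table names, column names and sample rows are hypothetical, and SQLite merely stands in for whatever database the data actually lives in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_a (user_no TEXT, open_date TEXT);
    CREATE TABLE table_c (user_no TEXT, open_date TEXT);
    INSERT INTO table_a VALUES ('0000001', '20200115'), ('0000002', '99991231');
    INSERT INTO table_c VALUES ('0000002', '20180301'), ('0000003', '20191201');

    -- First summary table: no rows yet, so every field value is still empty.
    CREATE TABLE summary (user_no TEXT PRIMARY KEY, open_date TEXT);

    -- Second summary table: insert the deduplicated user numbers only.
    INSERT INTO summary (user_no)
    SELECT DISTINCT user_no FROM (
        SELECT user_no FROM table_a
        UNION ALL
        SELECT user_no FROM table_c
    );
    """
)
rows = conn.execute(
    "SELECT user_no, open_date FROM summary ORDER BY user_no"
).fetchall()
```

User number 0000002 appears in both source tables but only once in the summary, and the `open_date` column stays NULL until checked values are written in.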
In this embodiment, all the user numbers that need to be reported for the target service type are recorded in the second summary table, and the user number is a unique primary key. Therefore, for each field to be reported, the values of that field in the second data subsets to be reported may be written into the corresponding column of the second summary table, keyed by user number, to obtain a third summary table, and data may be reported based on the third summary table.
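Filling the summary table by user number can be sketched with plain dictionaries. The sketch is illustrative only: `fill_summary` is a hypothetical helper, the field names are invented, and if two subsets supply a value for the same user and field, the later write simply wins, which is an assumption the text does not address.

```python
from typing import Dict, List, Optional, Tuple

Summary = Dict[str, Dict[str, Optional[str]]]


def fill_summary(summary: Summary, field: str, subset: List[Tuple[str, str]]) -> None:
    """Write each (user number, value) pair into the summary row for that user."""
    for user_no, value in subset:
        if user_no in summary:  # skip users that are not in the summary table
            summary[user_no][field] = value


# Second summary table: user numbers are present, field values are still empty.
summary: Summary = {
    user_no: {"open_date": None, "balance": None}
    for user_no in ("0000001", "0000002")
}

# Second data subsets (already checked), one per field to be reported.
fill_summary(summary, "open_date", [("0000001", "20200115")])
fill_summary(summary, "balance", [("0000001", "100.00"), ("0000002", "250.50")])
# summary now plays the role of the third summary table.
```

A value that was removed during checking (here the opening date of user 0000002) simply stays empty in the resulting table.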
In this embodiment, the user numbers in the multiple source tables may be taken out through the structured query language and inserted into the user number column of the first summary table after duplication removal, so as to obtain the second summary table, thereby accurately determining a set of user numbers to be reported corresponding to the target service type, where the values of the fields to be reported corresponding to the user numbers are the data to be reported.
In one embodiment, removing, by using the check data set, the values of the fields to be reported that do not conform to the check rules in each first data subset to be reported, to obtain a plurality of second data subsets to be reported, may include: performing word segmentation on each check rule in the check data set to obtain a word segmentation result of each check rule, wherein the word segmentation result comprises the table name, the field name and the reporting requirement corresponding to the check rule. The structured query language of each check rule can be generated according to the word segmentation result, and the structured query language of each check rule is executed on each first data subset to be reported, so as to obtain the check result of each first data subset to be reported. Further, according to the check result, the values of the fields to be reported that do not conform to the check rules in each first data subset to be reported can be removed, so as to obtain a plurality of second data subsets to be reported.
In this embodiment, the supervision authority may formulate corresponding check rules for the reporting result, thereby forming a check data set. For each check rule in the check data set, the table name, the field name and the reporting requirement to be checked may be separated out by using open source word segmentation software. The reporting requirement may include specification information such as dictionary format, formula requirement, check type, etc. In one embodiment, the word segmentation result of the check rule may be as shown in Table 3:
Table 3
In this embodiment, since the amount of data to be checked is large, the structured query language of each check rule may be generated based on the word segmentation result, so that each first data subset to be reported can be checked efficiently by using the structured query language of each check rule, and the corresponding second data subsets to be reported can be obtained after removing the values of the fields to be reported that do not conform to the check rules in each first data subset to be reported. In this way the communication cost can be effectively reduced, and the check can be performed automatically by using word segmentation technology and the structured query language.
In one embodiment, generating the structured query language of each check rule based on the word segmentation result may include: based on the word segmentation result, splitting the reporting requirement corresponding to each check rule to obtain a check parameter set and a check type of each reporting requirement. Further, a parameter table of the structured query language corresponding to each check type can be obtained, and the structured query language of each check rule is generated from the check parameter set of each reporting requirement according to the parameter table of the structured query language corresponding to each check type.
In this embodiment, the verification rule is divided into verification types such as format verification, length verification, association verification, and the like, and because different verification types have different verification logics, modes, and the like, the parameter table of the corresponding structured query language can be preconfigured for different verification types.
In this embodiment, the word segmentation technique may be used to split the reporting requirement corresponding to each check rule, so as to obtain a check parameter set and a check type of each reporting requirement. For example, the reporting requirement [format check: the account opening date of the personal demand deposit account cannot be 99991231] is split by open source word segmentation software into: format check / personal demand deposit account / account opening date / cannot be / 99991231. The check type is "format check", and the word segmentation result can be inserted, by using the structured query language, into the corresponding positions of the parameter table of the structured query language corresponding to format checks to obtain Table 4, so that a check program can be generated automatically to check whether the data to be reported conforms to the specification.
Table 4
Field name | Field description | Remarks
table_name | Reporting table name | Personal demand deposit account
col_name | Field name | Account opening date
whether | Judgment logic | Cannot be
check | Check description | 99991231
date | Reporting date | Batch time
In this embodiment, for example, the reporting requirement [length check: the account opening date length of the personal demand deposit account is 8] is split by open source word segmentation software into: length check / personal demand deposit account / account opening date / length is 8. According to the "length check" in the word segmentation result, this check rule can be maintained in the parameter table for length checks to obtain Table 5.
Table 5
Field name | Field description | Remarks
table_name | Reporting table name | Personal demand deposit account
col_name | Field name | Account opening date
length | Length | 8
date | Reporting date | Batch time
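The splitting of a reporting requirement into a check type and a check parameter set, as filled into Tables 4 and 5, might look roughly like the sketch below. The description relies on open source word segmentation software that it does not name; here plain regular expressions stand in for that software, and the input phrasing, the `PATTERNS` dictionary and the `split_requirement` function are all assumptions made for illustration.

```python
import re

# Hypothetical patterns, one per check type. They are a stand-in for the
# word segmentation software mentioned in the description, not a replacement for it.
PATTERNS = {
    "format check": re.compile(
        r"(?P<table_name>.+?), (?P<col_name>.+?) (?P<whether>cannot be) (?P<check>\S+)"
    ),
    "length check": re.compile(
        r"(?P<table_name>.+?), (?P<col_name>.+?) length is (?P<length>\d+)"
    ),
}


def split_requirement(requirement):
    """Split 'check type: body' into the check type plus its parameter set."""
    check_type, _, body = requirement.partition(": ")
    pattern = PATTERNS.get(check_type)
    match = pattern.fullmatch(body) if pattern else None
    if match is None:
        raise ValueError(f"unrecognized reporting requirement: {requirement!r}")
    return {"check_type": check_type, **match.groupdict()}


format_params = split_requirement(
    "format check: personal demand deposit account, account opening date cannot be 99991231"
)
length_params = split_requirement(
    "length check: personal demand deposit account, account opening date length is 8"
)
print(format_params)
print(length_params)
```

The keys of the returned dictionaries (`table_name`, `col_name`, `whether`, `check`, `length`) mirror the field names of Tables 4 and 5, so each dictionary corresponds to one row set of the matching parameter table.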
In this embodiment, according to the information in the check parameter set of each reporting requirement, the structured query language of each check rule may be generated and executed, so as to extract the values of the fields to be reported that do not conform to the check rule. For example, the structured query language template for the format check corresponding to Table 4 may be: SELECT COUNT(1) FROM [source table related to the report] WHERE DATE = '' AND khrq NOT IN ('check'); the structured query language template for the length check corresponding to Table 5 may be: SELECT COUNT(1) FROM [source table related to the report] WHERE DATE = '' AND LENGTH(col_name) = length. Of course, the structured query language is not limited to the above examples, and other modifications may be made by those skilled in the art in light of the technical spirit of the embodiments of the present disclosure; such modifications are intended to be covered by the present disclosure as long as their functions and effects are the same as or similar to those of the embodiments of the present disclosure.
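As an illustration of how such templates could be filled from a check parameter set and then executed, the following sketch uses Python's sqlite3 module. The `TEMPLATES` dictionary, the `build_check_sql` helper, the table `a_prime` and the column name `report_date` are assumptions for the example; only `khrq` and the general shape of the two queries come from the templates above.

```python
import re
import sqlite3

# Hypothetical templates, one per check type, modelled on the templates in the text.
TEMPLATES = {
    "format check": "SELECT COUNT(1) FROM {table_name} WHERE report_date = ? AND {col_name} NOT IN (?)",
    "length check": "SELECT COUNT(1) FROM {table_name} WHERE report_date = ? AND LENGTH({col_name}) = ?",
}


def build_check_sql(check_type, params):
    """Fill the template for check_type; identifiers are validated, values are bound."""
    for key in ("table_name", "col_name"):
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", params[key]):
            raise ValueError(f"unsafe identifier: {params[key]!r}")
    sql = TEMPLATES[check_type].format(
        table_name=params["table_name"], col_name=params["col_name"]
    )
    value = params["check"] if check_type == "format check" else int(params["length"])
    return sql, (params["date"], value)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a_prime (report_date TEXT, khrq TEXT)")
conn.executemany(
    "INSERT INTO a_prime VALUES ('2020-12-31', ?)",
    [("20200101",), ("99991231",), ("202001",)],
)

common = {"table_name": "a_prime", "col_name": "khrq", "date": "2020-12-31"}
format_sql, format_args = build_check_sql("format check", {**common, "check": "99991231"})
length_sql, length_args = build_check_sql("length check", {**common, "length": "8"})

format_count = conn.execute(format_sql, format_args).fetchone()[0]
length_count = conn.execute(length_sql, length_args).fetchone()[0]
print(format_count, length_count)
```

Table and column names cannot be bound as query parameters, so the sketch validates them against a simple identifier pattern before substituting them, while the reporting date and the check value are passed as bound parameters.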
In this embodiment, the word segmentation technology may be used to segment the reporting requirement corresponding to each verification rule, so as to obtain a verification parameter set and a verification type of each reporting requirement, and because different verification types may have different verification logics and different verification modes, the corresponding structured query language may be automatically generated for different verification types, so as to verify the specification of the data to be reported.
In one embodiment, the set of verification parameters includes: table name, field name, judgment logic, check description, report date and other parameters. It will be understood, of course, that the above verification parameter set may further include other data, for example, association relationships, etc., which may be specifically determined according to practical situations, and this embodiment of the present disclosure is not limited thereto.
In one embodiment, executing the structured query language of each check rule on each first data subset to be reported to obtain the check result of each first data subset to be reported may include: determining the structured query language of at least one check rule matching the field to be reported of each first data subset to be reported, and executing the structured query language of the matched at least one check rule on each first data subset to be reported to obtain the check result of each first data subset to be reported.
In this embodiment, since the fields to be reported related to each first subset of data to be reported are different, each first subset of data to be reported is checked by using a structured query language of at least one check rule matching the fields to be reported related thereto.
In this embodiment, since a target field to be reported may correspond to the structured query languages of a plurality of check rules, the check may be performed sequentially with the structured query language of each of these check rules, so that the values of the target field to be reported in the resulting second data subset to be reported conform to all of the corresponding check rules at the same time.
In this embodiment, each first data subset to be reported is checked by using the structured query language of at least one check rule matching its field to be reported, so that the checking efficiency is effectively improved.
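The matching and sequential execution described above can be sketched as follows, again with sqlite3. The `RULES` list, the table `first_subset`, the column `user_no` and the `balance` rule are hypothetical; the two `khrq` conditions follow the format and length checks used as examples in this description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE first_subset (user_no TEXT PRIMARY KEY, khrq TEXT)")
conn.executemany(
    "INSERT INTO first_subset VALUES (?, ?)",
    [("u1", "20200101"), ("u2", "99991231"), ("u3", "202001"), ("u4", "20191231")],
)

# Hypothetical rule registry: (field the rule applies to, SQL condition of the rule).
RULES = [
    ("khrq", "khrq NOT IN ('99991231')"),  # format check
    ("khrq", "LENGTH(khrq) = 8"),          # length check
    ("balance", "balance >= 0"),           # does not match: the subset has no balance column
]


def matching_conditions(conn, table):
    """Return the conditions of the rules whose field exists in the given subset."""
    columns = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    return [condition for field, condition in RULES if field in columns]


def second_subset(conn, table):
    """Apply every matching rule in turn; keep user numbers that pass all of them."""
    passed = {row[0] for row in conn.execute(f"SELECT user_no FROM {table}")}
    for condition in matching_conditions(conn, table):
        ok = {row[0] for row in conn.execute(f"SELECT user_no FROM {table} WHERE {condition}")}
        passed &= ok
    return sorted(passed)


checked_users = second_subset(conn, "first_subset")
print(checked_users)
```

Only the rules whose field is present in the subset are executed, and intersecting the result of each rule leaves exactly the user numbers whose value satisfies every matching rule.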
In an embodiment, each group of data of the first data subset to be reported may further include a user number, and each group of data of the corresponding second data subset to be reported may also include a user number. Before the data reporting is performed based on the plurality of second data subsets to be reported, the method may further include: counting the number of user numbers in each second data subset to be reported, and arranging the second data subsets to be reported that have the same field to be reported in descending order of the number of user numbers to obtain a sorting result. Furthermore, according to the sorting result, the reporting logic of the different fields to be reported can be determined; the reporting logic is used for representing the fetch priority of each of the second data subsets to be reported that have the same field to be reported.
In this embodiment, for example, the tables whose service type is personal demand deposit include table a, table c and table d, and the fields to be reported in all three tables include the account opening date. Assume that the personal demand deposit service involves 100 users in total, that 100 account opening dates in table c pass all checks, that 65 account opening dates in table a pass all checks, and that 50 account opening dates in table d pass all checks. If the data is fetched from table c first, all the data to be reported may be obtained directly; if the data in table c is incomplete, data may then be fetched from table a, and if the data to be reported is still incomplete, data may be fetched from table d. However, if the data is fetched from table a first, data must also be fetched from table c or table d. Such reporting logic ensures that the finally reported date conforms to the data specification, but it ignores execution efficiency: at least two associations are needed, which consumes resources and reduces the running speed.
In this embodiment, the number of values conforming to the check rules in each second data subset to be reported may be used as the ranking criterion. Because the user number is a unique primary key, the number of user numbers in each second data subset to be reported may be counted, and the second data subsets to be reported that have the same field to be reported are arranged in descending order of the number of user numbers, so as to obtain the optimal reporting logic of the different fields to be reported and thereby determine the fetch priority of each second data subset to be reported that has the same field to be reported. Therefore, the second data subsets to be reported obtained through checking can be used to automatically and continuously optimize the reporting logic, and the optimal reporting logic of the different fields to be reported can be determined efficiently.
In one embodiment, determining the reporting logic of the different fields to be reported according to the sorting result may include: setting a fetch priority according to the order, in the sorting result, of the second data subsets to be reported that have the same field to be reported, wherein a subset ranked earlier has a higher fetch priority than a subset ranked later. Furthermore, the reporting logic of the different fields to be reported can be determined according to the fetch priority of the second data subsets to be reported that have the same field to be reported.
In this embodiment, for example, the tables whose service type is personal demand deposit include table a, table c and table d, and the fields to be reported in all three tables include the account opening date, wherein 100 account opening dates in table c pass all checks, 65 account opening dates in table a pass all checks, and 50 account opening dates in table d pass all checks. Since 100 > 65 > 50, the fetch priority of table c is higher than that of table a, which in turn is higher than that of table d.
In this embodiment, the determined reporting logic of each reporting field may include the access priority of the second subset of data to be reported corresponding to each reporting field, so that the access may be efficiently performed according to the access priority of the second subset of data to be reported, to obtain the total data to be reported.
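A minimal sketch of deriving the fetch priority from the counts in the example above is given below. The function name `reporting_logic` and the nested dictionary layout are assumptions made for illustration.

```python
def reporting_logic(passed_counts):
    """Map each field to its source tables, ordered by descending count of passing user numbers."""
    return {
        field: [
            table
            for table, _ in sorted(counts.items(), key=lambda item: item[1], reverse=True)
        ]
        for field, counts in passed_counts.items()
    }


# Counts from the example: table c passes 100, table a passes 65, table d passes 50.
passed_counts = {"account opening date": {"a": 65, "c": 100, "d": 50}}
logic = reporting_logic(passed_counts)
print(logic)
```

The list for each field is read from left to right as the fetch order, so table c is tried first for the account opening date, then table a, then table d.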
The above method is described below in connection with a specific embodiment, however, it should be noted that this specific embodiment is only for better illustrating the embodiments of the present specification, and is not meant to be unduly limiting.
The implementation of the invention provides a data reporting method, which can comprise the following steps:
Step 1: For the reporting specification issued by each supervision authority, perform word segmentation by using open source word segmentation software to obtain the table name, the service types involved and the fields to be reported.
For example, it is obtained from the reporting specification of the supervision department that data of three service types, namely personal demand deposit, account foreign exchange and account noble metal, need to be reported, and that the fields to be reported include the demand deposit account number, deposit balance, account name and account opening date.
Step 2: Determine the full data to be reported for the individual demand deposit account.
Because the individual demand deposit account needs to report data of the service types personal demand deposit, account foreign exchange and account noble metal, the source tables of the personal demand deposit service, the account foreign exchange service and the account noble metal service can be screened out from table T of the data center. The structured query language for the screening may be SELECT * FROM T WHERE service type = 'personal demand deposit', etc.
The user numbers in all source tables whose service type is personal demand deposit can be extracted through the structured query language and, after deduplication, inserted into the user number column of table γ, so that the user numbers of all users with personal demand deposit service are obtained, as shown in Table 2. For the personal demand deposit service within the individual demand deposit account, what needs to be reported is the demand deposit account number, deposit balance, account name and account opening date of all users in table γ.
In the same way, table δ containing the user numbers of all account foreign exchange services and table ξ containing the user numbers of all account noble metal services can be obtained. All the data associated with table γ, table δ and table ξ constitute the full data to be reported for the individual demand deposit account.
Furthermore, it can be determined from table T of the data center that the personal demand deposit service has three source tables in total, namely table a, table c and table d. The fields to be reported in table a, table c and table d, such as the user number, demand deposit account number, deposit balance, account name and account opening date, are taken out of the data center to obtain table a', table c' and table d', which are stored in a database to be reported for later checking. Correspondingly, the fields to be reported, such as the user number, demand deposit account number, deposit balance, account name and account opening date, in the source tables of the account foreign exchange service and the account noble metal service can be obtained and stored in the database to be reported.
Step 3: verification is performed using a structured query language.
The table name, the field name and the reporting requirement of each check rule formulated by the supervision authority for the reporting result are separated out by using open source word segmentation software, and the check rules are classified according to the table names and the field names in the word segmentation results, thereby obtaining Table 3.
Taking the account opening date as an example, the rule [format check: the account opening date of the personal demand deposit account cannot be 99991231] is split by open source word segmentation software into: format check / personal demand deposit account / account opening date / cannot be / 99991231, and according to the "format check" in the word segmentation result this check rule may be maintained in the format check parameter table to obtain Table 4. The rule [length check: the account opening date length of the personal demand deposit account is 8] is split by open source word segmentation software into: length check / personal demand deposit account / account opening date / length is 8, and according to the "length check" in the word segmentation result this check rule can be maintained in the parameter table for length checks to obtain Table 5.
The structured query language for checking the account opening date can be generated according to Tables 4 and 5, and the structured query language template of the corresponding check type can be used to generate the corresponding structured query language for each source table. At this time, the user number and the account opening date in a source table can be used as one data subset for checking, and the other fields to be reported in the source table can be ignored, so that the check can be performed separately for the different fields to be reported.
Taking table a', table c' and table d' of the personal demand deposit service as an example, the generated structured query language for checking the account opening date can be:
Format check (SQL1-1): SELECT COUNT(1) FROM table a' WHERE DATE = '2020-12-31' AND khrq NOT IN ('99991231'); SELECT COUNT(1) FROM table c' WHERE DATE = '2020-12-31' AND khrq NOT IN ('99991231'); SELECT COUNT(1) FROM table d' WHERE DATE = '2020-12-31' AND khrq NOT IN ('99991231').
Length check (SQL2-1): SELECT COUNT(1) FROM table a' WHERE DATE = '2020-12-31' AND LENGTH(khrq) = 8; SELECT COUNT(1) FROM table c' WHERE DATE = '2020-12-31' AND LENGTH(khrq) = 8; SELECT COUNT(1) FROM table d' WHERE DATE = '2020-12-31' AND LENGTH(khrq) = 8.
Further, fuzzy matching tools such as Power Query and Elasticsearch can be used to determine that, within the individual demand deposit account, the tables whose service type is personal demand deposit are table a', table c' and table d', and that the fields corresponding to the account opening date are a.account opening date, c.account opening date and d.account opening date respectively. The three fields a.account opening date, c.account opening date and d.account opening date are checked by the generated check SQL1-1 and check SQL2-1, and the check results are exported. Elasticsearch is a search server that provides a distributed, multi-user capable full-text search engine, which facilitates searching, analyzing and exploring large amounts of data.
First, a.account opening date, c.account opening date and d.account opening date are checked by SQL1-1. Assume that the personal demand deposit service involves 100 users in total and that the corresponding 100 user numbers are recorded in table γ. After the 100 user numbers are associated with table a' through the user number, 80 values of a.account opening date pass SQL1-1; likewise, 85 values of c.account opening date pass SQL1-1 and 90 values of d.account opening date pass SQL1-1, as shown in Table 6.
Table 6
Field | Total satisfying SQL1-1
a.account opening date | 80
c.account opening date | 85
d.account opening date | 90
The account opening date values of a.account opening date, c.account opening date and d.account opening date that pass SQL1-1 are then checked by SQL2-1. For example, of the account opening dates in a.account opening date corresponding to the 80 user numbers that pass SQL1-1, 65 pass SQL2-1; these 65 satisfy both SQL1-1 and SQL2-1, that is, all checks related to the account opening date.
The same two rounds of checks are performed on c.account opening date and d.account opening date. For c.account opening date, 80 of the account opening dates corresponding to the 85 user numbers that pass SQL1-1 also pass SQL2-1; for d.account opening date, 50 of the account opening dates corresponding to the 90 user numbers that pass SQL1-1 also pass SQL2-1. Assume the check results are as follows, where "total satisfying both SQL1-1 and SQL2-1" is the number of values of the account opening date field that pass both SQL1-1 and SQL2-1 in each source table whose service type is personal demand deposit, giving Table 7.
Table 7
Field | Total satisfying both SQL1-1 and SQL2-1
a.account opening date | 65
c.account opening date | 80
d.account opening date | 50
Step 4: Determine the reporting logic.
Taking the account opening date as an example, the totals satisfying both SQL1-1 and SQL2-1 can be sorted, and the sorting result is 80 > 65 > 50. After the sorting result is obtained, the fetch priority for the account opening date can be determined according to the sorting result, so that the final reporting logic is obtained. The determined reporting logic may be: first, according to the user numbers in table γ, take the corresponding account opening dates from c.account opening date, which has the highest number of values passing the checks, and insert them into table γ; if the account opening date in table γ still has null values, take the corresponding account opening dates from a.account opening date and insert them into table γ; and if the account opening date in table γ still has null values, take the corresponding account opening dates from d.account opening date and insert them into table γ.
In this embodiment, the corresponding account opening date is first taken from c.account opening date and inserted into table γ, and the corresponding structured query language is:
SELECT c.account opening date AS account opening date
FROM table γ
LEFT JOIN table c
ON table γ.user number = table c.user number
Here, LEFT JOIN is a type of join query in the structured query language, also called a left outer join (LEFT OUTER JOIN), which is one kind of outer join. After the corresponding account opening dates are taken from c.account opening date according to the user numbers in table γ and inserted into table γ, a null value judgment can be performed on the account opening date field in table γ: SELECT COUNT(1) FROM table γ WHERE account opening date IS NULL. If the return value of this structured query language is 0, the account opening dates of all users in table γ have been obtained through c.account opening date. If the return value is not 0, the account opening date in table γ still has null values, and the corresponding account opening dates need to be taken from a.account opening date. The above logic may be repeated until the account opening date in table γ has no null values.
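The repeated fetch and null check can be sketched as a loop over the checked source tables in priority order. This sketch is an illustration under assumptions: it uses sqlite3, hypothetical table names `gamma`, `chk_c`, `chk_a` and `chk_d`, a hypothetical column `user_no`, and an UPDATE with a correlated subquery in place of the LEFT JOIN query of the text, since that writes the fetched values into the summary table in one statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Summary table with all user numbers; the account opening date (khrq) starts out empty.
conn.execute("CREATE TABLE gamma (user_no TEXT PRIMARY KEY, khrq TEXT)")
conn.executemany("INSERT INTO gamma (user_no) VALUES (?)", [(f"u{i}",) for i in range(1, 6)])

# Hypothetical checked subsets, i.e. values that already passed all check rules.
checked = {
    "chk_c": [("u1", "20200101"), ("u2", "20200202"), ("u3", "20200303")],
    "chk_a": [("u3", "20200303"), ("u4", "20200404"), ("u5", "20200505")],
    "chk_d": [("u1", "20200101")],
}
for table, rows in checked.items():
    conn.execute(f"CREATE TABLE {table} (user_no TEXT PRIMARY KEY, khrq TEXT)")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)


def fill_by_priority(conn, priority):
    """Fill empty dates from each table in priority order; stop once no null remains."""
    used = []
    for table in priority:
        conn.execute(
            f"UPDATE gamma SET khrq = (SELECT khrq FROM {table} "
            f"WHERE {table}.user_no = gamma.user_no) WHERE khrq IS NULL"
        )
        used.append(table)
        remaining = conn.execute("SELECT COUNT(1) FROM gamma WHERE khrq IS NULL").fetchone()[0]
        if remaining == 0:
            break
    return used


tables_used = fill_by_priority(conn, ["chk_c", "chk_a", "chk_d"])
print(tables_used)
```

In this sample data the third table is never touched, because the null count reaches 0 after the second table; this early stop is the efficiency gain that ordering the tables by the number of passing values is meant to provide.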
The determination of the optimal reporting logic for the account opening date is described above; the reporting logic of the other fields can be determined in the same way, and the details are not repeated. In this way the optimal reporting logic of the account opening date field of the personal demand deposit service in the individual demand deposit account can be determined. Similarly, the optimal reporting logic of the account foreign exchange service can be determined through the source tables whose service type is account foreign exchange, and the optimal reporting logic of the account noble metal service can be determined through the source tables whose service type is account noble metal. Therefore, table γ, table δ and table ξ filled with data can be obtained. Because the user number is not a field to be reported, the user numbers in table γ, table δ and table ξ can be removed and the tables combined, finally obtaining the data of the individual demand deposit account to be used for reporting.
Depending on the supervision report, reporting may be carried out once a month, once a quarter or once a year, which may be determined according to the actual situation and is not limited in the embodiments of the present specification. Before each report, the corresponding reporting logic needs to be determined. For example, suppose the supervision authority requires the full data to be reported once per quarter. Assume that in the last quarter the full data to be reported was 100, with 80 values passing the checks in table c, 65 in table a and 60 in table d, and that the reporting logic determined in the last quarter is the first reporting logic. In the current quarter, the full data to be reported is 120, with 90 values passing the checks in table a, 80 in table d and 65 in table c. The first reporting logic determined in the last quarter is therefore no longer applicable, and the reporting logic needs to be determined again.
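Using the quarterly counts from the example above, the need to re-determine the reporting logic before each report can be illustrated as follows. The function name `fetch_order` and the variable names are assumptions made for this sketch.

```python
def fetch_order(passed_counts):
    """Order source tables by descending number of values that pass all checks."""
    return sorted(passed_counts, key=passed_counts.get, reverse=True)


# Counts of passing values per source table, taken from the example in the text.
last_quarter = {"c": 80, "a": 65, "d": 60}
this_quarter = {"a": 90, "d": 80, "c": 65}

first_logic = fetch_order(last_quarter)
current_logic = fetch_order(this_quarter)

# If the order differs, the logic from the previous period should not be reused.
logic_changed = first_logic != current_logic
print(first_logic, current_logic, logic_changed)
```

Here the order moves from c, a, d to a, d, c, so reusing the first reporting logic would start with the table that now has the fewest passing values.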
In this embodiment, the check specification may be analyzed first, and the data may be checked and screened multiple times according to the check rules, so as to determine the optimal reporting logic and generate the data that best meets the requirements of supervision reporting. In the era of massive data, the manual work from analyzing the fetch logic and analyzing erroneous data against the supervision specification to forming the processing logic is effectively reduced, and the accuracy and timeliness of supervision reporting are improved. Through the application of word segmentation technology and fuzzy matching, the communication cost can be reduced and the report data can be generated automatically. Because the data is first checked and screened and the report data is then generated, every field in the report is selected from the values in the existing massive data that best conform to the supervision specification, so that the reporting quality is greatly improved.
Based on the same inventive concept, the embodiments of the present disclosure further provide a data reporting device, as described below. Because the principle by which the data reporting device solves the problem is similar to that of the data reporting method, the implementation of the data reporting device can refer to the implementation of the data reporting method, and repeated details are omitted. As used below, the term "unit" or "module" may be a combination of software and/or hardware that implements the intended function. While the device described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated. Fig. 2 is a block diagram of a data reporting device according to the embodiment of the present disclosure. As shown in Fig. 2, the device may include: a first acquisition module 201, a second acquisition module 202, a processing module 203 and a data reporting module 204, which are described below.
A first obtaining module 201, configured to obtain a target data set to be reported; the target data set to be reported comprises a plurality of first data subsets to be reported of a target service type, each first data subset to be reported comprises a plurality of groups of data, and each group of data at least comprises a value of a field to be reported.
A second acquisition module 202, which may be used to acquire a verification data set; the check data set contains check rules of the field to be reported.
The processing module 203 may be configured to remove values of to-be-reported fields that do not conform to the check rule in each first to-be-reported data subset by using the check data set, so as to obtain a plurality of second to-be-reported data subsets.
The data reporting module 204 may be configured to perform data reporting based on the plurality of second data subsets to be reported.
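The division of work among the four modules can be sketched as a small in-memory class. This is only an illustration under assumptions: the class `DataReportingDevice`, its method names, the row layout with `user_no` and `khrq`, and the way the reporting step merges subsets by descending size are hypothetical and are not taken from Fig. 2.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
Rule = Callable[[Row], bool]


class DataReportingDevice:
    """Hypothetical in-memory stand-in for modules 201 to 204."""

    def __init__(self, source: Dict[str, List[Row]], rules: List[Rule]):
        self._source = source
        self._rules = rules

    def acquire_target_data(self) -> Dict[str, List[Row]]:
        # First acquisition module 201: the first data subsets to be reported.
        return self._source

    def acquire_check_rules(self) -> List[Rule]:
        # Second acquisition module 202: the check rules of the field to be reported.
        return self._rules

    def process(self, first: Dict[str, List[Row]], rules: List[Rule]) -> Dict[str, List[Row]]:
        # Processing module 203: drop rows whose value fails any check rule.
        return {
            name: [row for row in rows if all(rule(row) for rule in rules)]
            for name, rows in first.items()
        }

    def report(self, second: Dict[str, List[Row]]) -> Dict[str, str]:
        # Data reporting module 204: fill each user number from the largest subset first.
        result: Dict[str, str] = {}
        for name in sorted(second, key=lambda n: len(second[n]), reverse=True):
            for row in second[name]:
                result.setdefault(row["user_no"], row["khrq"])
        return result

    def run(self) -> Dict[str, str]:
        first = self.acquire_target_data()
        rules = self.acquire_check_rules()
        return self.report(self.process(first, rules))


source = {
    "a": [{"user_no": "u1", "khrq": "20200101"}, {"user_no": "u2", "khrq": "99991231"}],
    "c": [
        {"user_no": "u1", "khrq": "20200102"},
        {"user_no": "u2", "khrq": "20200303"},
        {"user_no": "u3", "khrq": "2020"},
    ],
}
rules = [lambda row: row["khrq"] != "99991231", lambda row: len(row["khrq"]) == 8]
device = DataReportingDevice(source, rules)
reported = device.run()
print(reported)
```

User u3 does not appear in the result because its only value fails the length check, which matches the idea that only values conforming to the check rules are reported.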
The embodiment of the present disclosure further provides an electronic device, which may specifically refer to a schematic structural diagram of an electronic device based on the data reporting method provided in the embodiment of the present disclosure shown in fig. 3, where the electronic device may specifically include an input device 31, a processor 32, and a memory 33. Wherein the input device 31 may be used for inputting the target data set to be reported and the verification data set in particular. The processor 32 may be specifically configured to remove values of to-be-reported fields that do not conform to the check rule in each first to-be-reported data subset by using the check data set, to obtain a plurality of second to-be-reported data subsets; and carrying out data reporting based on the plurality of second data subsets to be reported. The memory 33 may be specifically configured to store parameters such as a plurality of second data subsets to be reported.
In this embodiment, the input device may specifically be one of the main means for exchanging information between the user and the computer system. The input device may include a keyboard, mouse, camera, scanner, light pen, handwriting input board, voice input apparatus, etc.; the input device is used to input raw data and the programs for processing the data into the computer. The input device may also obtain data transmitted from other modules, units and devices. The processor may be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, Application Specific Integrated Circuits (ASICs), programmable logic controllers, embedded microcontrollers, and the like. The memory may in particular be a memory device for storing information in modern information technology. The memory may comprise a plurality of levels: in a digital system, anything that can store binary data may be a memory; in an integrated circuit, a circuit with a storage function but without a physical form is also called a memory, such as a RAM or a FIFO; in a system, a storage device in physical form is also called a memory, such as a memory module or a TF card.
In this embodiment, the specific functions and effects of the electronic device may be explained in comparison with other embodiments, which are not described herein.
The embodiment of the present specification further provides a computer storage medium based on the data reporting method, where the computer storage medium stores computer program instructions that, when executed, may implement: acquiring a target data set to be reported, wherein the target data set to be reported comprises a plurality of first data subsets to be reported of a target service type, each first data subset to be reported comprises a plurality of groups of data, and each group of data at least comprises a value of a field to be reported; acquiring a check data set, wherein the check data set contains the check rules of the field to be reported; removing, by using the check data set, the values of the fields to be reported that do not conform to the check rules in each first data subset to be reported, to obtain a plurality of second data subsets to be reported; and performing data reporting based on the plurality of second data subsets to be reported.
In the present embodiment, the storage medium includes, but is not limited to, a random access memory (Random Access Memory, RAM), a read-only memory (Read-Only Memory, ROM), a cache (Cache), a hard disk (Hard Disk Drive, HDD), or a memory card (Memory Card). The memory may be used to store computer program instructions. The network communication unit may be an interface for performing network connection communication, which is set in accordance with a standard prescribed by a communication protocol.
In this embodiment, the functions and effects of the program instructions stored in the computer storage medium may be explained in comparison with other embodiments, and are not described herein.
It will be apparent to those skilled in the art that the modules or steps of the embodiments described above may be implemented on a general-purpose computing device. They may be concentrated on a single computing device or distributed across a network of computing devices. Alternatively, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases the steps shown or described may be performed in an order different from that given here; the modules or steps may also each be fabricated as an individual integrated circuit module, or several of them may be fabricated as a single integrated circuit module. Thus, embodiments of the present specification are not limited to any specific combination of hardware and software.
Although the present specification provides the method operation steps as described in the above embodiments or flowcharts, the method may include more or fewer operation steps based on routine or non-inventive effort. Where steps have no logically necessary causal relationship, their execution order is not limited to the order provided in the embodiments of the present specification. When the described methods are performed in an actual apparatus or end product, they may be performed sequentially or in parallel (e.g., in a parallel-processor or multithreaded environment) according to the embodiments or figures.
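As an illustration of the sequential-or-parallel point, the sketch below checks independent data subsets first in a loop and then in a thread pool. The check itself (dropping empty values) and the names `check_subset` and `first_subsets` are assumptions made for the example; the specification does not prescribe any particular concurrency mechanism.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Row = Tuple[str, str]  # (user number, field value)


def check_subset(item: Tuple[str, List[Row]]) -> Tuple[str, List[Row]]:
    """Hypothetical check: drop rows whose field value is empty."""
    name, rows = item
    return name, [(user_no, value) for user_no, value in rows if value]


first_subsets = {
    "source_a": [("U1", "111"), ("U2", "")],
    "source_b": [("U2", "222")],
}

# Sequential execution.
sequential = dict(check_subset(item) for item in first_subsets.items())

# Parallel execution: the subsets are independent, so completion order is irrelevant.
with ThreadPoolExecutor(max_workers=2) as pool:
    parallel = dict(pool.map(check_subset, first_subsets.items()))
```

Both variants produce the same second subsets, which is what allows the order of logically independent steps to be left open.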
It is to be understood that the above description is intended to be illustrative, and not restrictive. Many embodiments and many applications other than the examples provided will be apparent to those of skill in the art upon reading the above description. The scope of the embodiments of the specification should, therefore, be determined not with reference to the above description, but instead should be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
The above description covers only preferred embodiments of the present specification and is not intended to limit them; those skilled in the art can make various modifications and variations to these embodiments. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the embodiments of the present specification should be included in the protection scope of the embodiments of the present specification.

Claims (10)

1. A data reporting method, comprising:
acquiring a target data set to be reported; wherein the target data set to be reported comprises a plurality of first data subsets to be reported of a target service type, each first data subset to be reported comprises a plurality of groups of data, and each group of data comprises at least a value of a field to be reported;
acquiring a check data set; wherein the check data set contains check rules of fields to be reported;
removing, by using the check data set, values of fields to be reported that do not conform to the check rules from each first data subset to be reported, to obtain a plurality of second data subsets to be reported;
reporting data based on the plurality of second data subsets to be reported;
wherein acquiring the target data set to be reported comprises:
acquiring a reporting specification data set; wherein the reporting specification data set contains a plurality of groups of data, and each group of data contains: a table name, the service types contained, and the fields to be reported;
determining the target service type corresponding to a target table name in the reporting specification data set;
acquiring, from a data center, a plurality of source tables whose service type is the target service type; wherein each source table contains user numbers and fields to be reported;
when a plurality of fields to be reported are determined to correspond to the target service type, using a fuzzy matching technique to take each to-be-reported field column, together with the user number column, in each source table as one first data subset to be reported, thereby obtaining a plurality of first data subsets to be reported;
taking the plurality of first data subsets to be reported of the target service type as the target data set to be reported;
wherein, after acquiring from the data center the plurality of source tables whose service type is the target service type, the method further comprises:
extracting the user numbers from the source tables by means of a structured query language, removing duplicates, and inserting them into the user number column of a first summary table to obtain a second summary table; wherein the first summary table comprises a user number and the fields to be reported corresponding to the target service type, and the values of the fields to be reported in the first summary table are null;
correspondingly, reporting data based on the plurality of second data subsets to be reported comprises:
writing, for the second data subsets to be reported that share the same field to be reported, the values of that field into the second summary table to obtain a third summary table;
and reporting data based on the third summary table.
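The summary-table steps of claim 1 can be sketched with Python's built-in `sqlite3` module. Everything concrete here is an assumption for illustration: the table names `source_a`, `source_b` and `summary`, the `phone` field, and in particular the choice to fill only values that are still NULL, which is one possible way of combining several second subsets and is not stated in the claim.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Two hypothetical source tables of the same target service type.
cur.execute("CREATE TABLE source_a (user_no TEXT, phone TEXT)")
cur.execute("CREATE TABLE source_b (user_no TEXT, phone TEXT)")
cur.executemany("INSERT INTO source_a VALUES (?, ?)",
                [("U1", "13800000000"), ("U2", "")])
cur.executemany("INSERT INTO source_b VALUES (?, ?)",
                [("U2", "13900000000"), ("U3", "13700000000")])

# First summary table: user number plus the field to be reported, no values yet.
cur.execute("CREATE TABLE summary (user_no TEXT PRIMARY KEY, phone TEXT)")

# Second summary table: UNION removes duplicate user numbers; phone stays NULL.
cur.execute(
    "INSERT INTO summary (user_no) "
    "SELECT user_no FROM source_a UNION SELECT user_no FROM source_b"
)

# Second subsets (already checked), in the order they should be consulted.
second_subsets = [
    [("U2", "13900000000"), ("U3", "13700000000")],  # from source_b
    [("U1", "13800000000")],                          # from source_a, empty row removed
]

# Third summary table: fill each NULL value from the first subset that has one.
for subset in second_subsets:
    cur.executemany(
        "UPDATE summary SET phone = ? WHERE user_no = ? AND phone IS NULL",
        [(value, user_no) for user_no, value in subset],
    )
conn.commit()

third_summary = cur.execute(
    "SELECT user_no, phone FROM summary ORDER BY user_no"
).fetchall()
```

In this toy run every user number ends up with exactly one phone value, and the empty value originally stored for `U2` in `source_a` never reaches the table that is reported.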
2. The method of claim 1, wherein removing, by using the check data set, values of fields to be reported that do not conform to the check rules from each first data subset to be reported, to obtain a plurality of second data subsets to be reported, comprises:
performing word segmentation on each check rule in the check data set to obtain a word segmentation result of each check rule; wherein the word segmentation result comprises the table name, the field name, and the reporting requirement corresponding to the check rule;
generating a structured query language statement for each check rule according to the word segmentation result;
executing the structured query language statement of each check rule on each first data subset to be reported to obtain a check result of each first data subset to be reported;
and removing, according to the check results, the values of fields to be reported that do not conform to the check rules from each first data subset to be reported, to obtain a plurality of second data subsets to be reported.
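The flow of claim 2 (segment a rule, generate SQL, execute it, remove failing values) might look like the following sketch. The `table.field: requirement` rule layout, the two supported requirements, and the helper names are hypothetical; rules written in natural language would need a real word segmenter, and identifiers are interpolated into SQL here only because the rules are assumed to come from a trusted source.

```python
import sqlite3
from typing import Tuple


def segment_rule(rule_text: str) -> Tuple[str, str, str]:
    """Split a rule into (table name, field name, reporting requirement).

    Assumes a hypothetical 'table.field: requirement' layout.
    """
    target, requirement = rule_text.split(":", 1)
    table, field = target.strip().split(".", 1)
    return table, field, requirement.strip()


def rule_to_sql(table: str, field: str, requirement: str) -> str:
    """Build a query returning the user numbers whose value breaks the rule."""
    conditions = {
        "not empty": f"{field} IS NULL OR {field} = ''",
        "length 11": f"LENGTH({field}) <> 11",
    }
    return f"SELECT user_no FROM {table} WHERE {conditions[requirement]}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source_a (user_no TEXT, phone TEXT)")
conn.executemany("INSERT INTO source_a VALUES (?, ?)",
                 [("U1", "13800000000"), ("U2", ""), ("U3", "12345")])

rules = ["source_a.phone: not empty", "source_a.phone: length 11"]

# Check result: the set of user numbers that fail at least one rule.
failed = set()
for text in rules:
    sql = rule_to_sql(*segment_rule(text))
    failed.update(row[0] for row in conn.execute(sql))

# Second subset: the first subset minus the rows that failed a check.
first_subset = conn.execute(
    "SELECT user_no, phone FROM source_a ORDER BY user_no"
).fetchall()
second_subset = [row for row in first_subset if row[0] not in failed]
```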
3. The method of claim 2, wherein generating a structured query language statement for each check rule according to the word segmentation result comprises:
performing, based on the word segmentation result, word segmentation on the reporting requirement corresponding to each check rule to obtain a check parameter set and a check type of each reporting requirement;
acquiring a parameter table of the structured query language corresponding to each check type;
and generating the structured query language statement of each check rule by using the check parameter set of each reporting requirement according to the parameter table of the structured query language corresponding to each check type.
4. The method of claim 3, wherein the check parameter set comprises: a table name, a field name, judgment logic, a check description, and a report date.
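One way to read claims 3 and 4 is that each check type has an SQL template and the check parameter set fills it in. The sketch below follows that reading; the `CheckParams` dataclass, the `SQL_TEMPLATES` mapping, the check type names, and the presence of a `report_date` column are illustrative assumptions rather than the parameter table defined by the specification.

```python
from dataclasses import asdict, dataclass


@dataclass
class CheckParams:
    """Check parameter set with the five items listed in claim 4."""
    table_name: str
    field_name: str
    judgment_logic: str      # e.g. a required length
    check_description: str   # kept for logging; not used in the SQL text
    report_date: str


# Hypothetical "parameter table": one SQL template per check type.
SQL_TEMPLATES = {
    "not_null": (
        "SELECT user_no FROM {table_name} "
        "WHERE report_date = '{report_date}' AND {field_name} IS NULL"
    ),
    "fixed_length": (
        "SELECT user_no FROM {table_name} "
        "WHERE report_date = '{report_date}' "
        "AND LENGTH({field_name}) <> {judgment_logic}"
    ),
}


def generate_sql(check_type: str, params: CheckParams) -> str:
    """Fill the template of the given check type with the parameter set."""
    return SQL_TEMPLATES[check_type].format(**asdict(params))


params = CheckParams("source_a", "phone", "11",
                     "phone must have 11 digits", "2021-02-02")
sql = generate_sql("fixed_length", params)
```

Keeping the templates in a table keyed by check type means a new kind of check only needs a new template entry, not new generation code.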
5. The method of claim 2, wherein executing the structured query language statement of each check rule on each first data subset to be reported to obtain a check result of each first data subset to be reported comprises:
determining the structured query language statement of at least one check rule that matches the field to be reported of each first data subset to be reported;
and executing the matched structured query language statement of the at least one check rule on each first data subset to be reported to obtain the check result of each first data subset to be reported.
6. The method of claim 1, wherein each group of data of the first data subset to be reported further comprises a user number, and wherein, before reporting data based on the plurality of second data subsets to be reported, the method further comprises:
counting the number of user numbers in each second data subset to be reported;
arranging the second data subsets to be reported that share the same field to be reported in descending order of their number of user numbers to obtain a sorting result;
determining, according to the sorting result, the reporting logic of the different fields to be reported; wherein the reporting logic characterizes the fetch priority of each second data subset to be reported among those that share the same field to be reported.
7. The method of claim 6, wherein determining the reporting logic of the different fields to be reported according to the sorting result comprises:
setting a fetch priority according to the order, in the sorting result, of the second data subsets to be reported that share the same field to be reported; wherein a subset earlier in the order has a higher fetch priority than a later one;
and determining the reporting logic of the different fields to be reported according to the fetch priority of each second data subset to be reported that shares the same field to be reported.
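The priority rule of claims 6 and 7 can be illustrated as follows: subsets for the same field are ranked by how many user numbers they contain, and values are then taken from the highest-ranked subset that has one. The function names and sample data are hypothetical, and the "first value wins" merge is an assumed interpretation of how the fetch priority is applied.

```python
from typing import Dict, List, Tuple

Subset = List[Tuple[str, str]]  # (user number, field value)


def fetch_priority(subsets: Dict[str, Subset]) -> List[str]:
    """Order source names by number of distinct user numbers, largest first."""
    counts = {name: len({user_no for user_no, _ in rows})
              for name, rows in subsets.items()}
    return sorted(counts, key=lambda name: counts[name], reverse=True)


def resolve(subsets: Dict[str, Subset], order: List[str]) -> Dict[str, str]:
    """Take each user's value from the highest-priority subset that contains it."""
    merged: Dict[str, str] = {}
    for name in order:
        for user_no, value in subsets[name]:
            merged.setdefault(user_no, value)  # keep the earlier (higher-priority) value
    return merged


# Second subsets that share the same field to be reported.
phone_subsets = {
    "source_a": [("U1", "111")],
    "source_b": [("U1", "222"), ("U2", "333"), ("U3", "444")],
    "source_c": [("U2", "555"), ("U4", "666")],
}

order = fetch_priority(phone_subsets)
reported = resolve(phone_subsets, order)
```

With this data `source_b` covers the most users, so its values for `U1` and `U2` take precedence over the conflicting values in `source_a` and `source_c`.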
8. A data reporting device, comprising:
a first acquisition module, configured to acquire a target data set to be reported; wherein the target data set to be reported comprises a plurality of first data subsets to be reported of a target service type, each first data subset to be reported comprises a plurality of groups of data, and each group of data comprises at least a value of a field to be reported;
a second acquisition module, configured to acquire a check data set; wherein the check data set contains check rules of fields to be reported;
a processing module, configured to remove, by using the check data set, values of fields to be reported that do not conform to the check rules from each first data subset to be reported, to obtain a plurality of second data subsets to be reported;
a data reporting module, configured to report data based on the plurality of second data subsets to be reported;
wherein acquiring the target data set to be reported comprises:
acquiring a reporting specification data set; wherein the reporting specification data set contains a plurality of groups of data, and each group of data contains: a table name, the service types contained, and the fields to be reported;
determining the target service type corresponding to a target table name in the reporting specification data set;
acquiring, from a data center, a plurality of source tables whose service type is the target service type; wherein each source table contains user numbers and fields to be reported;
when a plurality of fields to be reported are determined to correspond to the target service type, using a fuzzy matching technique to take each to-be-reported field column, together with the user number column, in each source table as one first data subset to be reported, thereby obtaining a plurality of first data subsets to be reported;
taking the plurality of first data subsets to be reported of the target service type as the target data set to be reported;
wherein, after the plurality of source tables whose service type is the target service type are acquired from the data center, the following is further performed:
extracting the user numbers from the source tables by means of a structured query language, removing duplicates, and inserting them into the user number column of a first summary table to obtain a second summary table; wherein the first summary table comprises a user number and the fields to be reported corresponding to the target service type, and the values of the fields to be reported in the first summary table are null;
correspondingly, reporting data based on the plurality of second data subsets to be reported comprises:
writing, for the second data subsets to be reported that share the same field to be reported, the values of that field into the second summary table to obtain a third summary table;
and reporting data based on the third summary table.
9. A data reporting device, comprising a processor and a memory for storing processor-executable instructions which, when executed by the processor, implement the steps of the method of any of claims 1 to 7.
10. A computer readable storage medium, having stored thereon computer instructions which, when executed by a processor, implement the steps of the method of any of claims 1 to 7.
CN202110141803.9A 2021-02-02 2021-02-02 Data reporting method, device and equipment Active CN112948429B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110141803.9A CN112948429B (en) 2021-02-02 2021-02-02 Data reporting method, device and equipment

Publications (2)

Publication Number Publication Date
CN112948429A CN112948429A (en) 2021-06-11
CN112948429B true CN112948429B (en) 2024-04-26

Family

ID=76241451

Country Status (1)

Country Link
CN (1) CN112948429B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113450077A (en) * 2021-07-06 2021-09-28 中国工商银行股份有限公司 Foreign exchange voucher processing method and device
CN113468211A (en) * 2021-07-19 2021-10-01 中国银行股份有限公司 Report form checking method and device, electronic equipment and storage medium
CN114022031A (en) * 2021-11-23 2022-02-08 中国工商银行股份有限公司 Data processing method, data processing apparatus, electronic device, medium, and computer program product
CN115186023B (en) * 2022-09-07 2022-12-06 杭州安恒信息技术股份有限公司 Data set generation method, device, equipment and medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107870827A (en) * 2017-11-07 2018-04-03 中国银行股份有限公司 Data quality control method and device based on verification
CN107908725A (en) * 2017-11-14 2018-04-13 中国银行股份有限公司 A kind of batch data method of calibration, device and system
CN110163735A (en) * 2019-04-09 2019-08-23 平安科技(深圳)有限公司 Concerning taxes data processing method, device, computer equipment and storage medium
CN110473080A (en) * 2019-07-30 2019-11-19 阿里巴巴集团控股有限公司 A kind of report processing method, device and computer equipment
CN110515937A (en) * 2019-09-02 2019-11-29 中国农业银行股份有限公司 A kind of data verification method and device
CN111782718A (en) * 2020-08-11 2020-10-16 支付宝(杭州)信息技术有限公司 Plug-in data reporting system and data reporting method
CN112181962A (en) * 2020-09-25 2021-01-05 中国建设银行股份有限公司 Report form checking method, device, equipment and storage medium
CN112183039A (en) * 2020-09-16 2021-01-05 支付宝(杭州)信息技术有限公司 Compliance verification method and device for business report

Also Published As

Publication number Publication date
CN112948429A (en) 2021-06-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant