CN112395415A - Report classification method and device, computer equipment and storage medium - Google Patents

Publication number
CN112395415A
Authority
CN
China
Prior art keywords
report
tab
current
historical
header
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011202385.1A
Other languages
Chinese (zh)
Other versions
CN112395415B (en)
Inventor
杨志召
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Kingdee Tianyanyun Computing Co ltd
Original Assignee
Shenzhen Kingdee Tianyanyun Computing Co ltd
Application filed by Shenzhen Kingdee Tianyanyun Computing Co ltd
Priority to CN202011202385.1A
Publication of CN112395415A
Application granted
Publication of CN112395415B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G06F40/106 Display of layout of documents; Previewing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/451 Execution arrangements for user interfaces

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a report classification method and device, computer equipment and a storage medium. The method comprises the following steps: acquiring a current report and a historical report; extracting a current tab and current header data from the current report, and extracting a historical tab and historical header data from the historical report; determining the tab similarity between the current tab and the historical tab, and determining each historical tab whose tab similarity is greater than a preset tab similarity threshold as a candidate tab; determining the header similarity between the current header data and the historical header data corresponding to each candidate tab; when the maximum header similarity is greater than a preset header similarity threshold, determining the report category to which the candidate tab corresponding to the maximum header similarity belongs as the report category of the current tab; and displaying the reports corresponding to the current tabs that belong to the same report category on the same target page. The method can improve report operation efficiency.

Description

Report classification method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for classifying a report, a computer device, and a storage medium.
Background
With the development of computer technology, many report controls have appeared on the market. A report control is a control that uses graphics and data to provide report design and printing capabilities. It is an on-screen object in a graphical user interface that a user can operate to design and modify a report and finally display it. At present, after a report (for example, a report in Excel format) is imported into a system platform on which a report control is deployed, the report control usually separates the header from the body of the spreadsheet and stores the body data in a database, so that the report is displayed in the format defined by the report control.
However, when the report is a complete set of reports with multiple tabs, the current display mode does not classify the reports intelligently. To view the report data of a given tab, the user must click the link corresponding to each tab one by one and jump to the corresponding page, which is tedious and makes report operation inefficient.
Disclosure of Invention
In view of the above technical problems, it is necessary to provide a report classification method, apparatus, computer device and storage medium that can improve report operation efficiency.
A method of report classification, the method comprising:
acquiring a current report and a historical report;
extracting a current tab and current header data in the current report, and extracting historical tab and historical header data in the historical report;
determining the tab similarity of the current tab and the historical tab, and determining the historical tab of which the tab similarity is greater than a preset tab similarity threshold value as a candidate tab;
determining the header similarity of the current header data and the historical header data corresponding to the candidate tab;
when the maximum header similarity is larger than a preset header similarity threshold, determining the report type to which the candidate tab corresponding to the maximum header similarity belongs as the report type of the current tab;
and displaying the report corresponding to the current tab belonging to the same report category on the same target page.
In one embodiment, the method further comprises:
when the maximum header similarity is less than or equal to a preset header similarity threshold, determining a root category corresponding to the current report;
attributing the current tab to the root category corresponding to the current report, and generating a classification page;
displaying a report category list in the classification page;
and determining the report category of the current tab in response to the classification operation on the report category list.
In one embodiment, the extracting the current tab and the current header data in the current report includes:
converting the current report into a corresponding workbook object;
traversing and extracting a current tab object in the workbook object;
extracting a current tab and current header data from the current tab object; and extracting the current header data according to the style of the current report.
In one embodiment, the method further comprises:
for each current tab, generating a key-value pair with the current tab as the key and the current header data corresponding to the current tab as the value;
generating a relational mapping table corresponding to the current report from the key-value pairs of the current tabs;
for each sub-historical report in the historical report, generating a key-value pair with the year of the sub-historical report as the key and the historical tabs and historical header data in the sub-historical report as the value, where the historical tabs and historical header data are themselves stored with the historical tab as the key and the historical header data as the value;
and generating a relational mapping table corresponding to the historical report from the key-value pairs of the sub-historical reports.
In one embodiment, the extracting of the historical tab and the historical header data in the historical report includes:
determining a sub-historical report corresponding to the year closest to the year of the current report in the historical reports as a target historical report;
and extracting the historical tab and the historical header data in the target historical report.
In one embodiment, the displaying the report corresponding to the current tab belonging to the same report category on the same target page includes:
displaying the current tab belonging to the same report type in a tab area of a target page;
displaying the report corresponding to the current tab in a report area of the target page;
and responding to a tab selection operation triggered in the tab area, and switching the report displayed in the report area into the report corresponding to the current tab determined by the tab selection operation.
A report classification apparatus, the apparatus comprising:
the acquisition module is used for acquiring a current report and a historical report;
the extraction module is used for extracting the current tab and the current header data in the current report and extracting the historical tab and the historical header data in the historical report;
the determining module is used for determining the tab similarity of the current tab and the historical tab and determining the historical tab of which the tab similarity is greater than a preset tab similarity threshold value as a candidate tab; determining the header similarity of the current header data and the historical header data corresponding to the candidate tab; when the maximum header similarity is larger than a preset header similarity threshold, determining the report type to which the candidate tab corresponding to the maximum header similarity belongs as the report type of the current tab;
and the display module is used for displaying the report corresponding to the current tab belonging to the same report category on the same target page.
A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
acquiring a current report and a historical report;
extracting a current tab and current header data in the current report, and extracting historical tab and historical header data in the historical report;
determining the tab similarity of the current tab and the historical tab, and determining the historical tab of which the tab similarity is greater than a preset tab similarity threshold value as a candidate tab;
determining the header similarity of the current header data and the historical header data corresponding to the candidate tab;
when the maximum header similarity is larger than a preset header similarity threshold, determining the report type to which the candidate tab corresponding to the maximum header similarity belongs as the report type of the current tab;
and displaying the report corresponding to the current tab belonging to the same report category on the same target page.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring a current report and a historical report;
extracting a current tab and current header data in the current report, and extracting historical tab and historical header data in the historical report;
determining the tab similarity of the current tab and the historical tab, and determining the historical tab of which the tab similarity is greater than a preset tab similarity threshold value as a candidate tab;
determining the header similarity of the current header data and the historical header data corresponding to the candidate tab;
when the maximum header similarity is larger than a preset header similarity threshold, determining the report type to which the candidate tab corresponding to the maximum header similarity belongs as the report type of the current tab;
and displaying the report corresponding to the current tab belonging to the same report category on the same target page.
According to the report classification method, the report classification device, the computer equipment and the storage medium, a current report and a historical report are acquired, the current tab and the current header data are extracted from the current report, and the historical tab and the historical header data are extracted from the historical report. The tab similarity between the current tab and the historical tabs is determined, and the historical tabs whose tab similarity is greater than a preset tab similarity threshold are determined as candidate tabs, which narrows the range of report classification. The header similarity between the current header data and the historical header data corresponding to the candidate tabs is then determined. When the maximum header similarity is greater than a preset header similarity threshold, the report category to which the candidate tab corresponding to the maximum header similarity belongs is determined as the report category of the current tab. The reports corresponding to the current tabs that belong to the same report category are displayed on the same target page. In this way, by calculating the tab similarity between the current tabs and the historical tabs and the header similarity between the current header data and the historical header data, an imported multi-tab report is classified intelligently according to tab similarity and header similarity. This facilitates unified operation on reports of the same category, and because reports of the same report category are displayed on the same page, frequent page jumps are avoided, the display of the reports is more user-friendly, and report operation efficiency is improved.
Drawings
FIG. 1 is a diagram of an application environment of the report classification method in one embodiment;
FIG. 2 is a flowchart illustrating a report classification method according to an embodiment;
FIG. 3 is a diagram illustrating a report display effect according to the present application in one embodiment;
FIG. 4 is a diagram illustrating a report display effect according to the prior art in one embodiment;
FIG. 5 is a diagram illustrating a report display effect according to the prior art in another embodiment;
FIG. 6 is a flowchart illustrating a report classification method according to another embodiment;
FIG. 7 is a block diagram of a report classification apparatus in one embodiment;
FIG. 8 is a block diagram of a report classification apparatus in another embodiment;
FIG. 9 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The report classification method provided by the application can be applied to the application environment shown in fig. 1. The application environment includes a terminal 102 and a server 104. The terminal 102 and the server 104 communicate via a network. The terminal 102 may specifically include a desktop terminal or a mobile terminal. The mobile terminal may specifically include at least one of a mobile phone, a tablet computer, a notebook computer, and the like. The server 104 may be implemented as a stand-alone server or as a server cluster comprised of multiple servers. Those skilled in the art will understand that the application environment shown in fig. 1 is only a part of the scenario related to the present application, and does not constitute a limitation to the application environment of the present application.
The terminal 102 obtains the current report and the historical report from the server 104. The terminal 102 extracts the current tab and the current header data in the current report, and extracts the historical tab and the historical header data in the historical report. The terminal 102 determines the tab similarity of the current tab and the historical tab, and determines the historical tab with the tab similarity larger than a preset tab similarity threshold as a candidate tab. The terminal 102 determines the header similarity of the current header data and the historical header data corresponding to the candidate tab. When the maximum header similarity is greater than the preset header similarity threshold, the terminal 102 determines the report type to which the candidate tab corresponding to the maximum header similarity belongs as the report type of the current tab. The terminal 102 displays the report corresponding to the current tab belonging to the same report category on the same target page.
In an embodiment, as shown in fig. 2, a report classification method is provided. The method is described as applied to the terminal 102 in fig. 1 and includes the following steps:
S202, acquiring a current report and a historical report.
The current report is a report that is currently imported into the terminal and is to be classified, and the historical report is a report that was previously imported into the terminal and has already been classified.
Specifically, a system platform for managing reports is operated in the terminal, and the terminal can acquire historical reports from a server of the system platform through the system platform. Meanwhile, the user can import the current report to the terminal, and the terminal can acquire the current report imported by the user. In one embodiment, the terminal can also obtain the current report from the server storing the current report.
S204, extracting the current tab and the current header data in the current report, and extracting the historical tab and the historical header data in the historical report.
Specifically, the current report includes a current tab and current header data, and the historical report includes a historical tab and historical header data. The terminal can extract the current tab and the current header data in the current report and extract the historical tab and the historical header data in the historical report.
In an embodiment, when the current report and the historical report are in an Excel document format, the terminal may use Apache POI to parse the current report and extract the current tab and the current header data, and to parse the historical report and extract the historical tab and the historical header data. Apache POI is a free, open-source, cross-platform API (Application Programming Interface) written in Java, and is commonly used to read and write files in Microsoft Office formats such as Excel.
S206, determining the tab similarity of the current tab and the historical tabs, and determining the historical tabs with the tab similarity larger than a preset tab similarity threshold as candidate tabs.
Wherein, the tab similarity is the similarity between the current tab and the historical tab.
Specifically, the terminal can calculate the tab similarity between each current tab and each historical tab. The terminal can define a preset tab similarity threshold locally, which is used to screen out, from all historical tabs, those that are highly similar to the current tab. A tab similarity greater than the preset tab similarity threshold indicates that the current tab and the historical tab are highly similar. The terminal can therefore determine each historical tab whose tab similarity is greater than the preset tab similarity threshold as a candidate tab.
S208, determining the header similarity of the current header data and the historical header data corresponding to the candidate tab.
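The candidate screening in S206 can be sketched as follows. This is a minimal illustration only: the source does not fix a similarity measure for tab names at this point, so `difflib.SequenceMatcher` from the Python standard library is used as a stand-in, and the function names, sample tab names, and the threshold value 0.7 are all hypothetical.

```python
from difflib import SequenceMatcher


def tab_similarity(current_tab, historical_tab):
    """Similarity in [0, 1] between two tab names.

    Assumption: SequenceMatcher is a stand-in measure, not the patent's.
    """
    return SequenceMatcher(None, current_tab, historical_tab).ratio()


def candidate_tabs(current_tab, historical_tabs, threshold=0.7):
    """Keep historical tabs whose similarity is strictly greater than the threshold."""
    return [
        tab for tab in historical_tabs
        if tab_similarity(current_tab, tab) > threshold
    ]


# Hypothetical tab names for illustration.
history = ["Balance Sheet", "Income Statement", "Cash Flow"]
candidates = candidate_tabs("Balance Sheet 2019", history)
```

Only the candidates returned here would go on to the header comparison in S208, which is how the threshold narrows the classification range.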
The header similarity is the similarity between the current header data and the historical header data.
Specifically, each candidate tab has corresponding historical header data. The terminal can determine the header similarity between each piece of current header data and the historical header data corresponding to each candidate tab.
In an embodiment, the tab similarity between the current tab and the historical tab, and the header similarity between the current header data and the historical header data corresponding to the candidate tab, may each be calculated by any one of several similarity calculation methods, including Euclidean distance, the Pearson correlation coefficient, cosine similarity, and the Tanimoto coefficient.
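Two of the measures named above, cosine similarity and the Tanimoto coefficient, can be sketched for header data as follows. This assumes, purely for illustration, that header data is represented as a list of header cell strings; the source does not specify how headers are vectorized. On sets, the Tanimoto coefficient reduces to intersection size divided by union size.

```python
import math
from collections import Counter


def cosine_similarity(header_a, header_b):
    """Cosine similarity between two headers treated as bags of cell strings."""
    counts_a, counts_b = Counter(header_a), Counter(header_b)
    dot = sum(counts_a[cell] * counts_b[cell] for cell in counts_a)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def tanimoto(header_a, header_b):
    """Tanimoto coefficient on the sets of header cells: |A & B| / |A | B|."""
    set_a, set_b = set(header_a), set(header_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Hypothetical header cells for illustration.
current_header = ["Item", "Opening balance", "Closing balance"]
historical_header = ["Item", "Opening balance", "Closing balance", "Notes"]
cos = cosine_similarity(current_header, historical_header)
tan = tanimoto(current_header, historical_header)
```

The two measures give different values for the same pair of headers, so a preset threshold is only meaningful relative to the measure that was chosen.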
S210, when the maximum header similarity is larger than a preset header similarity threshold, determining the report type to which the candidate tab corresponding to the maximum header similarity belongs as the report type of the current tab.
The report type is a type obtained by classifying according to data recorded in the report.
Specifically, the terminal may locally define a preset header similarity threshold, and can determine the report category to which each candidate tab belongs. After obtaining the header similarity between the current header data and the historical header data corresponding to each candidate tab, the terminal compares the maximum header similarity with the preset header similarity threshold. When the maximum header similarity is greater than the preset header similarity threshold, the terminal determines the report category to which the candidate tab corresponding to the maximum header similarity belongs as the report category of the current tab.
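The decision in S210 can be sketched as below, assuming the header similarities have already been computed for each candidate tab. The function name, the dictionary layout, and the threshold value 0.8 are hypothetical; returning `None` stands for the case where no category can be assigned automatically.

```python
def classify_tab(header_similarities, category_of, threshold=0.8):
    """Pick the category of the candidate tab with the highest header similarity.

    header_similarities: {candidate tab: header similarity}
    category_of:         {candidate tab: report category}
    Returns None when there are no candidates or the best similarity
    is not strictly greater than the threshold.
    """
    if not header_similarities:
        return None
    best_tab = max(header_similarities, key=header_similarities.get)
    if header_similarities[best_tab] > threshold:
        return category_of[best_tab]
    return None


# Hypothetical similarities and categories for illustration.
category = classify_tab(
    {"Balance Sheet": 0.92, "Balance Sheet (old)": 0.85},
    {"Balance Sheet": "Financial position", "Balance Sheet (old)": "Archive"},
)
```

Note that the comparison is strict, matching the text: a maximum similarity exactly equal to the threshold does not produce an automatic classification.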
S212, the report corresponding to the current tab belonging to the same report category is displayed on the same target page.
Specifically, after determining the report category to which the report corresponding to each current tab belongs, the terminal may bind together the reports corresponding to the current tabs that belong to the same report category, and then display the bound reports on the same target page.
In the report classification method above, a current report and a historical report are acquired, the current tab and the current header data are extracted from the current report, and the historical tab and the historical header data are extracted from the historical report. The tab similarity between the current tab and the historical tabs is determined, and the historical tabs whose tab similarity is greater than a preset tab similarity threshold are determined as candidate tabs, which narrows the range of report classification. The header similarity between the current header data and the historical header data corresponding to the candidate tabs is then determined. When the maximum header similarity is greater than a preset header similarity threshold, the report category to which the candidate tab corresponding to the maximum header similarity belongs is determined as the report category of the current tab. The reports corresponding to the current tabs that belong to the same report category are displayed on the same target page. In this way, by calculating the tab similarity between the current tabs and the historical tabs and the header similarity between the current header data and the historical header data, an imported multi-tab report is classified intelligently according to tab similarity and header similarity. This facilitates unified operation on reports of the same category, and because reports of the same report category are displayed on the same page, frequent page jumps are avoided, the display of the reports is more user-friendly, and report operation efficiency is improved.
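The binding step can be pictured as grouping the current tabs by their assigned category, so that each group maps to one target page. The sketch below assumes the classification result is available as a dictionary from tab to category; the function name and sample data are hypothetical.

```python
from collections import defaultdict


def group_by_category(tab_categories):
    """Group current tabs by report category: {category: [tabs in input order]}."""
    pages = defaultdict(list)
    for tab, category in tab_categories.items():
        pages[category].append(tab)
    return dict(pages)


# Hypothetical classification result for illustration.
pages = group_by_category({"Tab 1": "A", "Tab 2": "A", "Tab 3": "B"})
```

Each key of the result would correspond to one target page, and the list under it to the tabs shown together on that page.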
In one embodiment, the report classification method further comprises the following steps: when the maximum header similarity is less than or equal to a preset header similarity threshold, determining a root category corresponding to the current report; attributing the current tab to the root category corresponding to the current report, and generating a classification page; displaying a report category list in the classification page; and determining the report category of the current tab in response to the classification operation on the report category list.
The root category is the most basic category under which the current report is classified. For example, if the current report is the report for 2019, the terminal needs to classify the 2019 report, and the root category is 2019.
In an embodiment, the report category in the report category list may specifically be a report category corresponding to the historical report, or may be a self-defined report category.
In an embodiment, the classification operation on the report category list may be that the user directly selects a report category listed in the report category list, which directly determines the report category of the current tab. Alternatively, the classification operation may be that the user creates and defines a custom report category that is not listed in the report category list, and the newly defined report category is then determined as the report category of the current tab.
In the above embodiment, when the maximum header similarity is less than or equal to the preset header similarity threshold, the report category list is displayed so that the user can create a new report category or directly select one from the list. This determines the report category of a report that cannot be classified intelligently by calculating header similarity, and ensures that every current report can be classified successfully.
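The manual fallback can be sketched as follows. This is an assumption-laden illustration: the record layout, the function name, and the rule that choosing a name absent from the list creates a custom category are all hypothetical, and the user interaction itself is reduced to a `chosen_category` argument.

```python
def classify_manually(tab, root_category, category_list, chosen_category):
    """Record a user-chosen category for a tab parked under the root category.

    Assumption: a chosen name that is not in the list is treated as a newly
    created custom category and appended to a copy of the list.
    """
    updated_list = list(category_list)
    if chosen_category not in updated_list:
        updated_list.append(chosen_category)
    record = {"tab": tab, "root": root_category, "category": chosen_category}
    return record, updated_list


# Hypothetical data: the 2019 report has a tab that could not be classified.
record, categories = classify_manually(
    "Tab 8", 2019, ["Financial position", "Cash flow"], "Tax schedule"
)
```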
In an embodiment, the step of extracting the current tab and the current header data in the current report in step S204 specifically includes: converting the current report into a corresponding workbook object; traversing and extracting a current tab object in the workbook object; extracting a current tab and current header data from the current tab object; and extracting the current header data according to the style of the current report.
Specifically, the terminal can convert the current report into a corresponding workbook object through a conversion function in the POI, and traverse and extract a current tab object in the workbook object, where the current tab object includes a current tab and current header data. The terminal can directly extract the current tab from the current tab object and extract the current header data according to the style of the current report.
In the above embodiment, the current report is converted into the corresponding workbook object, the current tab object in the workbook object is extracted in a traversing manner, and the current tab and the current header data are extracted from the current tab object, so that the extraction efficiency of the current tab and the current header data is improved. The current header data is extracted according to the style of the current report, so that the data format of the current header data is consistent with the style data format of the current report, and the header similarity between the current header data and the historical header data can be calculated conveniently.
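The traversal described above can be sketched without any spreadsheet library by modelling the workbook object as a plain list of sheets. This in-memory model is an assumption standing in for the POI workbook object, and the `header_rows` parameter is a hypothetical simplification of "the style of the current report" (here, how many leading rows form the header).

```python
def extract_tabs_and_headers(workbook, header_rows=1):
    """Return {tab name: flattened header cells} for every sheet in the workbook.

    Assumption: each sheet is a dict with "name" and "rows"; the first
    `header_rows` rows of a sheet form its header.
    """
    extracted = {}
    for sheet in workbook:  # traverse each tab object in the workbook
        header = [
            cell
            for row in sheet["rows"][:header_rows]
            for cell in row
            if cell not in (None, "")  # skip empty header cells
        ]
        extracted[sheet["name"]] = header
    return extracted


# Hypothetical two-tab workbook for illustration.
workbook = [
    {"name": "Tab 1", "rows": [["Item", "Amount"], ["Cash", 100]]},
    {"name": "Tab 2", "rows": [["Item", None, "Notes"], ["Rent", 5, ""]]},
]
tabs = extract_tabs_and_headers(workbook)
```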
In one embodiment, the report classification method further comprises the following steps: for each current tab, generating a key-value pair with the current tab as the key and the current header data corresponding to the current tab as the value; generating a relational mapping table corresponding to the current report from the key-value pairs of the current tabs; for each sub-historical report in the historical report, generating a key-value pair with the year of the sub-historical report as the key and the historical tabs and historical header data in the sub-historical report as the value, where the historical tabs and historical header data are themselves stored with the historical tab as the key and the historical header data as the value; and generating a relational mapping table corresponding to the historical report from the key-value pairs of the sub-historical reports.
The relational mapping table corresponding to the current report is a mapping table for recording the corresponding relation between each current tab and each current header data in the current report. The relation mapping table corresponding to the historical report is a mapping table for recording the corresponding relation of each sub-historical report and each historical tab in the historical report and each historical header data. The sub-history report is a history report corresponding to each year in the history report.
In the above embodiment, the data structures of the current tab and the current header data are defined as key value pairs, so as to generate the mapping table corresponding to the current report. And defining the year of the sub-historical report and the data structures of the historical page tags and the historical header data in the sub-historical report as key value pairs, and generating a relational mapping table corresponding to the historical report. Therefore, the tab similarity between each current tab and each historical tab and the header similarity between each current header data and each historical header data can be calculated, and the calculation efficiency is improved.
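The two relational mapping tables can be sketched as ordinary dictionaries: a flat one for the current report and a nested one, keyed by year, for the historical report. The function names and the sample tabs and headers are hypothetical.

```python
def build_current_map(tab_header_pairs):
    """Current report: {current tab: current header data}."""
    return {tab: header for tab, header in tab_header_pairs}


def build_history_map(sub_reports):
    """Historical report: {year: {historical tab: historical header data}}."""
    return {year: dict(pairs) for year, pairs in sub_reports}


# Hypothetical data for illustration.
current_map = build_current_map([("Tab 1", ["Item", "Amount"])])
history_map = build_history_map([
    (2017, [("Tab 1", ["Item", "Amount"])]),
    (2018, [("Tab 1", ["Item", "Amount", "Notes"])]),
])
```

With this layout, the header data for a given year and tab is a two-step lookup, for example `history_map[2018]["Tab 1"]`.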
In an embodiment, the step of extracting the history tab and the history header data in the history report in step S204 specifically includes: determining a sub-historical report corresponding to the year closest to the year of the current report in the historical reports as a target historical report; and extracting historical page tags and historical header data in the target historical report.
For example, the historical report may include 9-year sub-historical reports from 2010 to 2018, and the current report currently imported to the terminal is the 2019-year report. The terminal can determine the sub-history report corresponding to the 2018 year closest to the 2019 year as the target history report. Furthermore, the terminal can extract the historical tab and the historical header data in the target historical report, namely the report corresponding to the 2018 years.
In the above embodiment, the sub-historical report whose year is closest to the year of the current report is determined as the target historical report, and only the historical tabs and historical header data in the target historical report are extracted, which can improve the classification speed of the report.
In an embodiment, step S212, namely the step of displaying the reports corresponding to the current tabs belonging to the same report category on the same target page, specifically includes: displaying the current tabs belonging to the same report category in a tab area of the target page; displaying the report corresponding to a current tab in a report area of the target page; and, in response to a tab selection operation triggered in the tab area, switching the report displayed in the report area to the report corresponding to the current tab determined by the tab selection operation.
The tab area is used for displaying a tab corresponding to each report. The report area is an area for presenting each report.
Specifically, the terminal may determine all current tabs belonging to the same report category and the report corresponding to each of them. The terminal can display these current tabs in the tab area of the target page and display the report corresponding to a current tab in the report area of the target page. The user can perform a tab selection operation on the current tabs in the tab area; in response, the terminal switches the report displayed in the report area to the report corresponding to the current tab determined by the tab selection operation.
In one embodiment, as shown in FIG. 3, the terminal classifies tabs 1-7 into the report category of the reduced pressure table and displays tabs 1-7 on the same page. The terminal displays the current tabs in the tab area of the target page and displays the report corresponding to a current tab in the report area of the target page. When any current tab is selected in the tab area, the report area switches to display the corresponding report. By contrast, in the traditional display mode shown in FIG. 4, the terminal displays each report of a multi-tab report set on a separate page reached through a link, and clicking each link opens the corresponding report page; for example, as shown in FIG. 5, clicking the link for tab 3 opens the displayed report. The report display mode of this embodiment automatically classifies the imported multi-tab reports and displays reports of the same report category on the same page, which avoids frequent page jumps, makes the display more user-friendly, and thereby improves report operation efficiency.
In this embodiment, the current tabs belonging to the same report category are displayed in the tab area of the target page, and the report corresponding to a current tab is displayed in the report area of the target page, so that the current tabs and their reports are displayed in separate areas and the display effect is improved. By switching the report displayed in the report area to the report corresponding to the current tab determined by a tab selection operation triggered in the tab area, the reports corresponding to the current tabs of the same report category are displayed on the same page. This avoids frequent page jumps, makes the report display more user-friendly, and further improves report operation efficiency.
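The interaction between the tab area and the report area can be modeled with a small state object. This is a minimal sketch, not the patent's implementation: the class name `TargetPage`, its fields, and the choice to show the first tab by default are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TargetPage:
    """Hypothetical model of one target page for a single report category."""

    category: str
    reports_by_tab: Dict[str, str]  # current tab -> report content
    selected_tab: str = ""

    def __post_init__(self) -> None:
        if not self.reports_by_tab:
            raise ValueError("a target page needs at least one tab")
        if not self.selected_tab:
            # Assumption: the first tab is shown until the user selects another.
            self.selected_tab = next(iter(self.reports_by_tab))

    def tab_area(self) -> List[str]:
        """Tabs shown in the tab area, in insertion order."""
        return list(self.reports_by_tab)

    def report_area(self) -> str:
        """Report currently shown in the report area."""
        return self.reports_by_tab[self.selected_tab]

    def select_tab(self, tab: str) -> str:
        """Handle a tab selection operation and return the report now displayed."""
        if tab not in self.reports_by_tab:
            raise KeyError(tab)
        self.selected_tab = tab
        return self.report_area()


page = TargetPage("example category", {"tab 1": "report 1", "tab 3": "report 3"})
```

Calling `page.select_tab("tab 3")` changes only the content of the report area; the page itself stays the same, which is the behavior that avoids page jumps.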
In an embodiment, as shown in FIG. 6, the current report is a report in Excel document format. After acquiring the report, the terminal can read it through POI and parse out the current tabs and the current header data. The terminal can then extract the historical tabs and historical header data of the previous year's report, namely the historical report. The terminal compares the current tab with the historical tabs and the current header data with the historical header data to calculate the tab similarity and the header similarity. When the tab and the header are both determined to be similar, the current report is classified automatically. When the similarity of the tab and the header cannot be determined, the user can select a category to classify the current report manually.
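The two-stage comparison in this flow (tab similarity first, then header similarity on the surviving candidates) can be sketched as follows. This is a sketch under stated assumptions: the text here does not name a similarity measure, so `difflib.SequenceMatcher` from the Python standard library is swapped in, and the threshold values, sample data, and the function name `classify_tab` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def similarity(a: str, b: str) -> float:
    """Stand-in similarity measure in [0, 1]; the patent's measure may differ."""
    return SequenceMatcher(None, a, b).ratio()


def classify_tab(
    current_tab: str,
    current_header: List[str],
    historical: Dict[str, List[str]],   # historical tab -> historical header data
    categories: Dict[str, str],         # historical tab -> report category
    tab_threshold: float = 0.6,         # assumed preset tab similarity threshold
    header_threshold: float = 0.8,      # assumed preset header similarity threshold
) -> Optional[str]:
    """Return the matched report category, or None if no confident match exists."""
    # Stage 1: keep only historical tabs whose name is similar enough.
    candidates = [
        tab for tab in historical if similarity(current_tab, tab) > tab_threshold
    ]
    # Stage 2: among candidates, find the most similar header.
    best_tab, best_score = None, 0.0
    for tab in candidates:
        score = similarity("|".join(current_header), "|".join(historical[tab]))
        if score > best_score:
            best_tab, best_score = tab, score
    if best_tab is not None and best_score > header_threshold:
        return categories[best_tab]
    return None  # left for manual classification


historical = {
    "Balance Sheet 2018": ["Item", "Closing balance", "Opening balance"],
    "Cash Flow 2018": ["Item", "Amount"],
}
categories = {
    "Balance Sheet 2018": "Balance sheet",
    "Cash Flow 2018": "Cash flow statement",
}
```

A `None` result corresponds to the case where similarity cannot be determined and the user classifies the tab manually.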
It should be understood that although the steps in FIG. 2 are shown in sequence, they are not necessarily performed in that sequence. Unless explicitly stated otherwise herein, the order of the steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in FIG. 2 may include multiple sub-steps or stages, which are not necessarily completed at the same time but may be performed at different times; these sub-steps or stages are not necessarily performed sequentially, and may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 7, a report classification apparatus 700 is provided, including: an obtaining module 701, an extracting module 702, a determining module 703 and a display module 704, wherein:
The obtaining module 701 is configured to obtain a current report and a historical report.
The extracting module 702 is configured to extract a current tab and current header data in the current report, and extract historical tabs and historical header data in the historical report.
The determining module 703 is configured to determine the tab similarity between the current tab and the historical tabs, and determine the historical tabs whose tab similarity is greater than a preset tab similarity threshold as candidate tabs; determine the header similarity between the current header data and the historical header data corresponding to the candidate tabs; and, when the maximum header similarity is greater than a preset header similarity threshold, determine the report category to which the candidate tab corresponding to the maximum header similarity belongs as the report category of the current tab.
The display module 704 is configured to display the reports corresponding to the current tabs belonging to the same report category on the same target page.
In an embodiment, the determining module 703 is further configured to determine a root category corresponding to the current report when the maximum header similarity is less than or equal to a preset header similarity threshold; attributing the current tab to the root category corresponding to the current report, and generating a classification page; displaying a report category list in the classification page; and determining the report category of the current tab in response to the classification operation on the report category list.
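The fallback branch described above can be sketched as a small decision function followed by a manual step. All names here (`TabClassification`, `assign_category`, `apply_manual_choice`) and the default threshold are hypothetical; the sketch only illustrates that a similarity less than or equal to the threshold sends the tab to the root category for manual classification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TabClassification:
    tab: str
    category: str
    needs_manual_review: bool


def assign_category(
    tab: str,
    max_header_similarity: float,
    matched_category: Optional[str],
    root_category: str,
    header_threshold: float = 0.8,  # assumed preset header similarity threshold
) -> TabClassification:
    """Use the matched category only when the similarity exceeds the threshold."""
    if matched_category is not None and max_header_similarity > header_threshold:
        return TabClassification(tab, matched_category, False)
    # Less than or equal to the threshold: attribute to the root category.
    return TabClassification(tab, root_category, True)


def apply_manual_choice(
    pending: TabClassification, category_list: List[str], chosen: str
) -> TabClassification:
    """Apply the user's selection from the report category list."""
    if chosen not in category_list:
        raise ValueError(f"unknown report category: {chosen}")
    return TabClassification(pending.tab, chosen, False)


pending = assign_category("Sheet7", 0.8, "Balance sheet", "2019 annual report")
resolved = apply_manual_choice(
    pending, ["Balance sheet", "Cash flow statement"], "Cash flow statement"
)
```

Note that a similarity exactly equal to the threshold (0.8 in this example) still falls back to the root category, consistent with the "less than or equal to" condition.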
In one embodiment, the extracting module 702 is further configured to: convert the current report into a corresponding workbook object; traverse the workbook object and extract each current tab object; extract the current tab and the current header data from the current tab object; and extract the current header data according to the style of the current report.
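The traversal of a workbook object can be illustrated without any spreadsheet library by using a plain dictionary as a stand-in. This is only a sketch: the dictionary layout, the function name `extract_tabs_and_headers`, and the rule that the header occupies the first `header_rows` rows are assumptions; the patent says only that header data is extracted according to the style of the report.

```python
from typing import Any, Dict, List

# Stand-in for a workbook object: tab name -> rows of cell values.
Workbook = Dict[str, List[List[Any]]]


def extract_tabs_and_headers(
    workbook: Workbook, header_rows: int = 1
) -> Dict[str, List[str]]:
    """Traverse every tab and collect the non-empty cells of its header rows.

    Assumption: the report style is reduced to a single number, header_rows,
    giving how many leading rows form the header.
    """
    result: Dict[str, List[str]] = {}
    for tab_name, rows in workbook.items():
        header_cells: List[str] = []
        for row in rows[:header_rows]:
            header_cells.extend(
                str(cell).strip() for cell in row if cell not in (None, "")
            )
        result[tab_name] = header_cells
    return result


sample_workbook: Workbook = {
    "Balance Sheet": [["Item", "Closing balance", None], ["Cash", 100, 90]],
    "Empty": [],
}
extracted = extract_tabs_and_headers(sample_workbook)
```

In a real implementation the stand-in dictionary would be replaced by the workbook object produced by the spreadsheet library in use.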
In one embodiment, the extracting module 702 is further configured to determine, as the target historical report, the sub-historical report whose year is closest to the year of the current report among the historical reports; and extract the historical tabs and historical header data in the target historical report.
In one embodiment, the display module 704 is further configured to display the current tabs belonging to the same report category in the tab area of the target page; display the report corresponding to a current tab in the report area of the target page; and, in response to a tab selection operation triggered in the tab area, switch the report displayed in the report area to the report corresponding to the current tab determined by the tab selection operation.
Referring to FIG. 8, in an embodiment, the report classification apparatus 700 further includes a generating module 705, wherein:
the generating module 705 is configured to: for each current tab, take the current tab as a key and the current header data corresponding to the current tab as a value, and generate a key-value pair corresponding to each current tab; generate a relational mapping table corresponding to the current report according to the key-value pair corresponding to each current tab; for each sub-historical report in the historical report, take the year of the sub-historical report as a key and the historical tabs and historical header data in the sub-historical report as a value, and generate a key-value pair corresponding to each sub-historical report, wherein in the data structure of the historical tabs and historical header data, the historical tab is used as a key and the historical header data is used as a value; and generate a relational mapping table corresponding to the historical report according to the key-value pair corresponding to each sub-historical report.
With the above report classification apparatus, a current report and a historical report are acquired, and the current tab and current header data in the current report and the historical tabs and historical header data in the historical report are extracted. The tab similarity between the current tab and each historical tab is determined, and the historical tabs whose tab similarity is greater than a preset tab similarity threshold are determined as candidate tabs, which narrows the range of report categories to consider. The header similarity between the current header data and the historical header data corresponding to each candidate tab is then determined. When the maximum header similarity is greater than a preset header similarity threshold, the report category to which the candidate tab corresponding to the maximum header similarity belongs is determined as the report category of the current tab. The reports corresponding to the current tabs belonging to the same report category are displayed on the same target page. By calculating the tab similarity between the current tab and the historical tabs and the header similarity between the current header data and the historical header data, the imported multi-tab reports are classified automatically according to these two similarities, which facilitates unified operation on reports of the same category. Displaying reports of the same report category on the same page avoids frequent page jumps, makes the report display more user-friendly, and improves report operation efficiency.
For specific limitations of the report classification apparatus, reference may be made to the limitations of the report classification method above, which are not repeated here. Each module in the report classification apparatus can be implemented wholly or partly in software, hardware, or a combination thereof. The modules can be embedded in, or independent of, a processor in the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke them and perform the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be the terminal 102 in FIG. 1, and its internal structure may be as shown in FIG. 9. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements a report classification method. The display screen of the computer device can be a liquid crystal display screen or an electronic ink display screen. The input device of the computer device can be a touch layer covering the display screen; a key, a trackball or a touch pad arranged on the housing of the computer device; or an external keyboard, touch pad, mouse or the like.
Those skilled in the art will appreciate that the structure shown in FIG. 9 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
acquiring a current report and a historical report;
extracting a current tab and current header data in a current report, and extracting historical tab and historical header data in a historical report;
determining the tab similarity of the current tab and the historical tabs, and determining the historical tabs with the tab similarity larger than a preset tab similarity threshold as candidate tabs;
determining the header similarity of the current header data and the historical header data corresponding to the candidate tab;
when the maximum header similarity is greater than a preset header similarity threshold, determining the report category to which the candidate tab corresponding to the maximum header similarity belongs as the report category of the current tab;
and displaying the report corresponding to the current tab belonging to the same report category on the same target page.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
when the maximum header similarity is less than or equal to a preset header similarity threshold, determining a root category corresponding to the current report;
attributing the current tab to the root category corresponding to the current report, and generating a classification page;
displaying a report category list in the classification page;
and determining the report category of the current tab in response to the classification operation on the report category list.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
converting the current report into a corresponding workbook object;
traversing and extracting a current tab object in the workbook object;
extracting a current tab and current header data from the current tab object; and extracting the current header data according to the style of the current report.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
for each current tab, taking the current tab as a key and the current header data corresponding to the current tab as a value, and generating a key-value pair corresponding to each current tab;
generating a relational mapping table corresponding to the current report according to the key-value pair corresponding to each current tab;
for each sub-historical report in the historical report, taking the year of the sub-historical report as a key and the historical tabs and historical header data in the sub-historical report as a value, and generating a key-value pair corresponding to each sub-historical report, wherein in the data structure of the historical tabs and historical header data, the historical tab is used as a key and the historical header data is used as a value;
and generating a relational mapping table corresponding to the historical report according to the key value pair corresponding to each sub-historical report.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
determining a sub-historical report corresponding to the year closest to the year of the current report in the historical reports as a target historical report;
and extracting the historical tabs and historical header data in the target historical report.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
displaying the current tabs belonging to the same report category in a tab area of a target page;
displaying the report corresponding to the current tab in a report area of the target page;
and responding to the tab selection operation triggered in the tab area, and switching the report displayed in the report area into the report corresponding to the current tab determined by the tab selection operation.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
acquiring a current report and a historical report;
extracting a current tab and current header data in a current report, and extracting historical tab and historical header data in a historical report;
determining the tab similarity of the current tab and the historical tabs, and determining the historical tabs with the tab similarity larger than a preset tab similarity threshold as candidate tabs;
determining the header similarity of the current header data and the historical header data corresponding to the candidate tab;
when the maximum header similarity is greater than a preset header similarity threshold, determining the report category to which the candidate tab corresponding to the maximum header similarity belongs as the report category of the current tab;
and displaying the report corresponding to the current tab belonging to the same report category on the same target page.
In one embodiment, the computer program when executed by the processor further performs the steps of:
when the maximum header similarity is less than or equal to a preset header similarity threshold, determining a root category corresponding to the current report;
attributing the current tab to the root category corresponding to the current report, and generating a classification page;
displaying a report category list in the classification page;
and determining the report category of the current tab in response to the classification operation on the report category list.
In one embodiment, the computer program when executed by the processor further performs the steps of:
converting the current report into a corresponding workbook object;
traversing and extracting a current tab object in the workbook object;
extracting a current tab and current header data from the current tab object; and extracting the current header data according to the style of the current report.
In one embodiment, the computer program when executed by the processor further performs the steps of:
for each current tab, taking the current tab as a key and the current header data corresponding to the current tab as a value, and generating a key-value pair corresponding to each current tab;
generating a relational mapping table corresponding to the current report according to the key-value pair corresponding to each current tab;
for each sub-historical report in the historical report, taking the year of the sub-historical report as a key and the historical tabs and historical header data in the sub-historical report as a value, and generating a key-value pair corresponding to each sub-historical report, wherein in the data structure of the historical tabs and historical header data, the historical tab is used as a key and the historical header data is used as a value;
and generating a relational mapping table corresponding to the historical report according to the key value pair corresponding to each sub-historical report.
In one embodiment, the computer program when executed by the processor further performs the steps of:
determining a sub-historical report corresponding to the year closest to the year of the current report in the historical reports as a target historical report;
and extracting the historical tabs and historical header data in the target historical report.
In one embodiment, the computer program when executed by the processor further performs the steps of:
displaying the current tabs belonging to the same report category in a tab area of a target page;
displaying the report corresponding to the current tab in a report area of the target page;
and responding to the tab selection operation triggered in the tab area, and switching the report displayed in the report area into the report corresponding to the current tab determined by the tab selection operation.
Those skilled in the art will understand that all or part of the processes of the methods in the above embodiments can be implemented by a computer program instructing the relevant hardware. The program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, any combination that involves no contradiction should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and although they are described in relative detail, they should not be construed as limiting the scope of the patent. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A report classification method is characterized by comprising the following steps:
acquiring a current report and a historical report;
extracting a current tab and current header data in the current report, and extracting historical tab and historical header data in the historical report;
determining the tab similarity of the current tab and the historical tab, and determining the historical tab of which the tab similarity is greater than a preset tab similarity threshold value as a candidate tab;
determining the header similarity of the current header data and the historical header data corresponding to the candidate tab;
when the maximum header similarity is greater than a preset header similarity threshold, determining the report category to which the candidate tab corresponding to the maximum header similarity belongs as the report category of the current tab;
and displaying the report corresponding to the current tab belonging to the same report category on the same target page.
2. The method of claim 1, further comprising:
when the maximum header similarity is less than or equal to a preset header similarity threshold, determining a root category corresponding to the current report;
attributing the current tab to the root category corresponding to the current report, and generating a classification page;
displaying a report category list in the classification page;
and determining the report category of the current tab in response to the classification operation on the report category list.
3. The method of claim 1, wherein said extracting current tab and current header data in said current report comprises:
converting the current report into a corresponding workbook object;
traversing and extracting a current tab object in the workbook object;
extracting a current tab and current header data from the current tab object; and extracting the current header data according to the style of the current report.
4. The method of claim 1, further comprising:
for each current tab, taking the current tab as a key and the current header data corresponding to the current tab as a value, and generating a key-value pair corresponding to each current tab;
generating a relational mapping table corresponding to the current report according to the key value pair corresponding to each current tab;
for each sub-historical report in the historical report, taking the year of the sub-historical report as a key and the historical tabs and historical header data in the sub-historical report as a value, and generating a key-value pair corresponding to each sub-historical report, wherein in the data structure of the historical tabs and historical header data, the historical tab is used as a key and the historical header data is used as a value;
and generating a relational mapping table corresponding to the historical report according to the key value pair corresponding to each sub-historical report.
5. The method of claim 4, wherein extracting historical tab and historical header data in the historical report comprises:
determining a sub-historical report corresponding to the year closest to the year of the current report in the historical reports as a target historical report;
and extracting the historical tab and the historical header data in the target historical report.
6. The method according to claim 1, wherein the displaying the report corresponding to the current tab belonging to the same report category on the same target page comprises:
displaying the current tabs belonging to the same report category in a tab area of a target page;
displaying the report corresponding to the current tab in a report area of the target page;
and responding to a tab selection operation triggered in the tab area, and switching the report displayed in the report area into the report corresponding to the current tab determined by the tab selection operation.
7. A report classification apparatus, said apparatus comprising:
the acquisition module is used for acquiring a current report and a historical report;
the extraction module is used for extracting the current tab and the current header data in the current report and extracting the historical tab and the historical header data in the historical report;
the determining module is used for determining the tab similarity of the current tab and the historical tab and determining the historical tab of which the tab similarity is greater than a preset tab similarity threshold value as a candidate tab; determining the header similarity of the current header data and the historical header data corresponding to the candidate tab; when the maximum header similarity is greater than a preset header similarity threshold, determining the report category to which the candidate tab corresponding to the maximum header similarity belongs as the report category of the current tab;
and the display module is used for displaying the report corresponding to the current tab belonging to the same report category on the same target page.
8. The apparatus according to claim 7, wherein the determining module is further configured to determine a root category corresponding to the current report when the maximum header similarity is less than or equal to a preset header similarity threshold; attribute the current tab to the root category corresponding to the current report, and generate a classification page; display a report category list in the classification page; and determine the report category of the current tab in response to a classification operation on the report category list.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 6 are implemented by the processor when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
CN202011202385.1A 2020-11-02 2020-11-02 Report classification method, device, computer equipment and storage medium Active CN112395415B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011202385.1A CN112395415B (en) 2020-11-02 2020-11-02 Report classification method, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011202385.1A CN112395415B (en) 2020-11-02 2020-11-02 Report classification method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112395415A true CN112395415A (en) 2021-02-23
CN112395415B CN112395415B (en) 2024-04-02

Family

ID=74597320

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011202385.1A Active CN112395415B (en) 2020-11-02 2020-11-02 Report classification method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112395415B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101419621A (en) * 2008-12-18 2009-04-29 金蝶软件(中国)有限公司 Method for switching page label and apparatus
US20160021181A1 (en) * 2013-07-23 2016-01-21 George Ianakiev Data fusion and exchange hub - architecture, system and method
CN109255082A (en) * 2018-08-10 2019-01-22 天津五八到家科技有限公司 Bookmark display methods and device
CN110019478A (en) * 2017-12-28 2019-07-16 贵州白山云科技股份有限公司 Data lead-in method, medium, equipment and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王暹昊; 朱勇士: "Application analysis of operating Excel reports from Java programs with POI" (POI实现Java程序操作Excel报表的应用分析), 华南金融电脑, no. 07, 10 July 2010 (2010-07-10) *

Also Published As

Publication number Publication date
CN112395415B (en) 2024-04-02

CN110781378A (en) Data graphical processing method and device, computer equipment and storage medium
CN110674093A (en) File data processing method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant