CN113420042A - Data statistics method, device, equipment and storage medium based on presentation - Google Patents

Info

Publication number
CN113420042A
Authority
CN
China
Prior art keywords
data
presentation
chart
target
title
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110690523.3A
Other languages
Chinese (zh)
Inventor
李平梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Pension Insurance Corp
Original Assignee
Ping An Pension Insurance Corp
Application filed by Ping An Pension Insurance Corp
Priority to CN202110690523.3A
Publication of CN113420042A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2428 Query predicate definition using graphical user interfaces, including menus and forms
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance

Abstract

The application relates to the technical field of data visualization and discloses a presentation-based data statistics method, apparatus, device, and storage medium. The method comprises: acquiring a request from a user side and, based on the request, acquiring sample data and a presentation file; accurately acquiring the required target data through the titles of the slides in the presentation file; identifying the charts to be filled in the presentation file; acquiring filling data according to the titles of the charts; generating statistical data for the different chart types according to the data statistics mode preset for each chart; and filling in the statistical data to generate the target presentation. The application also relates to blockchain technology, and the sample data is stored in a blockchain. By identifying the data that corresponds to the presentation through title information, the method and apparatus improve the accuracy of data statistics.

Description

Data statistics method, device, equipment and storage medium based on presentation
Technical Field
The present application relates to the field of data visualization, and in particular, to a method, an apparatus, a device, and a storage medium for data statistics based on a presentation.
Background
Currently, when a salesperson or manager analyzes the service quality for an insured customer, maintains the customer relationship, or performs operations management analysis, the customer's overall situation often has to be gathered across multiple systems, for example: the insurance application situation (number of insured persons, insured amount, amounts added and removed, gender and age distribution, etc.), the service situation (number of persons added and removed, number of offline claims, number of consultations received), and the claims settlement situation (settlement by liability, settlement by age, high-frequency diagnoses, etiology analysis, etc.).
In existing systems the data is scattered and cannot be obtained directly for use. Within the existing system functions, the different data needs must be reported to the IT system, a presentation is then made from the data the IT system returns, and the customer is served on the basis of that presentation. Because the existing approach obtains data through different systems, data acquisition is difficult and prone to omissions, and the data is not screened, so the resulting data statistics are not accurate enough. A method that can improve the accuracy of data statistics is therefore needed.
Disclosure of Invention
Embodiments of the present application aim to provide a presentation-based data statistics method, apparatus, device, and storage medium, so as to improve the accuracy of data statistics.
In order to solve the above technical problem, an embodiment of the present application provides a presentation-based data statistics method, including:
acquiring a request from a user side, wherein the request comprises a time range selected by the user side and a required presentation template type, and taking sample data corresponding to the time range as initial data, wherein the sample data is stored in a data warehouse;
acquiring a corresponding presentation file from a NAS server according to the name corresponding to the presentation template type;
identifying the title of each slide in the presentation file, and acquiring target data from the initial data through the titles;
screening out the slides to be filled from the presentation file, and identifying the chart identifier and the title of each chart in the slides to be filled;
acquiring filling data from the target data through the title corresponding to each chart, and determining the chart type according to the chart identifier;
and generating a target chart and statistical data according to the chart type and the filling data, and filling the target chart and the statistical data into the presentation file to obtain a target presentation.
In order to solve the above technical problem, an embodiment of the present application provides a presentation-based data statistics apparatus, including:
a sample data acquisition module, configured to acquire a request from a user side, wherein the request comprises a time range selected by the user side and a required presentation template type, and to take sample data corresponding to the time range as initial data, wherein the sample data is stored in a data warehouse;
a presentation file acquisition module, configured to acquire a corresponding presentation file from a NAS server according to the name corresponding to the presentation template type;
a target data extraction module, configured to identify the title of each slide in the presentation file and acquire target data from the initial data through the titles;
a to-be-filled slide screening module, configured to screen out the slides to be filled from the presentation file and identify the chart identifier and the title of each chart in the slides to be filled;
a filling data acquisition module, configured to acquire filling data from the target data through the title corresponding to each chart and determine the chart type according to the chart identifier;
and a target presentation generation module, configured to generate a target chart and statistical data according to the chart type and the filling data, and fill the target chart and the statistical data into the presentation file to obtain a target presentation.
In order to solve the above technical problem, a further technical solution adopted by the invention is a computer device comprising one or more processors and a memory for storing one or more programs which, when executed, cause the one or more processors to implement any of the presentation-based data statistics methods described above.
In order to solve the above technical problem, a further technical solution adopted by the invention is a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any of the presentation-based data statistics methods described above.
Embodiments of the invention provide a presentation-based data statistics method, apparatus, device, and storage medium. In the embodiments, a request from a user side is acquired, the request comprising a time range selected by the user side and a required presentation template type, and sample data corresponding to the time range is taken as initial data. Because the sample data is stored in a data warehouse, the data resides in a single store, which facilitates data acquisition. The required data is accurately acquired through the titles of the slides; the charts to be filled are identified in the presentation file; filling data is acquired according to the titles of the charts; statistical data for the different chart types is generated according to the data statistics mode preset for each chart; and the statistical data is filled in to generate the target presentation. In this way, the data to be counted is accurately acquired and converted according to the titles and the statistics modes preset in the presentation template, producing the statistical data and the target charts and thereby improving the accuracy of data statistics.
Drawings
In order to illustrate the solution of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings in the following description obviously show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic application environment diagram of a presentation-based data statistics method provided in an embodiment of the present application;
Fig. 2 is a flowchart of an implementation of a presentation-based data statistics method according to an embodiment of the present application;
Fig. 3 is a flowchart of an implementation of a sub-process in a presentation-based data statistics method according to an embodiment of the present application;
Fig. 4 is a flowchart of another implementation of a sub-process in a presentation-based data statistics method according to an embodiment of the present application;
Fig. 5 is a flowchart of another implementation of a sub-process in a presentation-based data statistics method according to an embodiment of the present application;
Fig. 6 is a flowchart of another implementation of a sub-process in a presentation-based data statistics method according to an embodiment of the present application;
Fig. 7 is a flowchart of another implementation of a sub-process in a presentation-based data statistics method according to an embodiment of the present application;
Fig. 8 is a flowchart of another implementation of a sub-process in a presentation-based data statistics method according to an embodiment of the present application;
Fig. 9 is a schematic diagram of a presentation-based data statistics apparatus according to an embodiment of the present application;
Fig. 10 is a schematic diagram of a computer device provided in an embodiment of the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof, in the description and claims of this application and the description of the above figures are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the above-described drawings are used for distinguishing between different objects and not for describing a particular order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.
The present invention will be described in detail below with reference to the accompanying drawings and embodiments.
Referring to fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as a web browser application, a search-type application, an instant messaging tool, and the like.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that the presentation-based data statistics method provided in the embodiments of the present application is generally executed by a server, and accordingly, the presentation-based data statistics apparatus is generally configured in the server.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring to fig. 2, fig. 2 illustrates an embodiment of a presentation-based data statistics method.
It should be noted that, provided the result is substantially the same, the method of the present invention is not limited to the flow sequence shown in fig. 2. The method includes the following steps:
S1: Acquire a request from a user side, wherein the request comprises a time range selected by the user side and a required presentation template type, and take sample data corresponding to the time range as initial data, wherein the sample data is stored in a data warehouse.
In the embodiments of the present application, in order to make the technical solution easier to understand, the terminals involved in the present application are described in detail below.
First, the server can acquire sample data from a data warehouse stored on the server, or from an externally connected database, and can acquire a presentation template. According to the acquired presentation template, it screens the corresponding data from the data warehouse and fills the data into the presentation file to generate a target presentation. For example, a presentation on the operation of a certain insurance policy over a certain period of time is generated according to the needs of a certain user side.
Second, the user side can select a time period, send the server a request to generate a presentation under certain conditions, and receive the target presentation generated by the server.
Specifically, the data warehouse may be Hive, Oracle, Yellowbrick, etc., and is not limited herein. The preferred data warehouse is Yellowbrick. The Yellowbrick data warehouse is a database for storing and aggregating data; data is stored in and acquired from it because its computation is much faster than that of an Oracle database, which makes it suitable for analyzing and aggregating data. The data warehouse can be configured with a plurality of schemas (such as a core library A, a claims settlement library B, an MIS library, and the like), and the corresponding schema is selected according to the content of the presentation file. For example, a slide may need the insured persons, the total number of insured employees, the total premiums of family members, and claims-related information. The insured-person information is obtained from the core library, the claims information from the claims settlement library, and the service data from the MIS library.
The presentation template is a presentation preset according to different requirements. It comprises a plurality of slides, each of which has a title. The slides are divided into slides to be filled, which need to be filled with data, and basic slides, which do not. The sample data refers to certain business data, for example, insurance policy data in the insurance industry.
It should be noted that the time range is selected by the user side according to the actual situation and is not limited herein. In one embodiment, the time range is from January 1, 2020 to January 31, 2020.
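The selection of initial data in step S1 can be sketched as a date-range filter. This is a minimal illustration only: the `Request` class, the record fields, and the template type name are hypothetical, and an in-memory list stands in for the data warehouse query that a real deployment would issue.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Request:
    """Hypothetical request payload: a time range plus a template type name."""
    start: date
    end: date
    template_type: str


def select_initial_data(sample_data, request):
    """Keep only the records whose date lies inside the requested range (inclusive)."""
    return [
        record for record in sample_data
        if request.start <= record["date"] <= request.end
    ]


# Illustrative records; the field names are assumptions, not taken from the application.
sample_data = [
    {"date": date(2019, 12, 31), "schema": "core", "insured_count": 40},
    {"date": date(2020, 1, 15), "schema": "core", "insured_count": 42},
    {"date": date(2020, 1, 31), "schema": "claims", "claim_count": 3},
    {"date": date(2020, 2, 1), "schema": "claims", "claim_count": 5},
]

request = Request(date(2020, 1, 1), date(2020, 1, 31), "operation_report")
initial_data = select_initial_data(sample_data, request)
```

With the January 2020 range from the embodiment, only the two records dated within that month remain as initial data.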
S2: Acquire a corresponding presentation file from the NAS server according to the name corresponding to the presentation template type.
Specifically, various presentation files are stored on the NAS server in advance. When one of them is to be filled with data, it can be acquired from the NAS server by the name of the presentation template type.
A NAS (network-attached storage) server is a dedicated data storage server comprising storage devices (e.g. disk arrays, CD/DVD drives, tape drives, or removable storage media) and embedded system software, and provides cross-platform file sharing. In the embodiment of the application, the presentation files are stored on the NAS server in advance and acquired when data filling is needed. Because the NAS server supports multiple protocols (such as NFS, CIFS, FTP, and HTTP) and various operating systems, this improves the efficiency of generating the target presentation.
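One possible way to look up the presentation file by template type name in step S2 is sketched below. It assumes the NAS share is already mounted as an ordinary directory (for example over NFS); the mount point, the template type names, and the file names are all hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from template type name to the file stored on the NAS share.
TEMPLATE_FILES = {
    "operation_report": "operation_report.pptx",
    "claims_review": "claims_review.pptx",
}


def resolve_presentation_path(nas_root, template_type):
    """Return the path of the presentation file for a template type name."""
    try:
        file_name = TEMPLATE_FILES[template_type]
    except KeyError:
        raise ValueError(f"unknown presentation template type: {template_type!r}") from None
    return PurePosixPath(nas_root) / file_name


# "/mnt/nas/templates" is an assumed mount point, not one named in the application.
template_path = resolve_presentation_path("/mnt/nas/templates", "operation_report")
```

Rejecting unknown template types early keeps a mistyped name from turning into a confusing file-not-found error later in the pipeline.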
S3: Identify the title of each slide in the presentation file, and acquire target data from the initial data through the titles.
Specifically, each slide of the presentation file needs to have a corresponding title, and the slides to be filled are filled according to the content of their titles, so the title of each slide in the presentation file needs to be acquired. Word segmentation is performed on all the slide titles to obtain the keywords in them, and target data is obtained from the corresponding core library of the data warehouse through the keywords. Because the user side has selected the data within the time range, namely the initial data, the target data can be obtained from the initial data through the keywords.
Referring to fig. 3, fig. 3 shows an embodiment of step S3, which is described in detail as follows:
S31: Identify the title of each slide in the presentation file to obtain a title set.
Specifically, the corresponding title is identified in each slide, and the titles are combined to form a title set. The title set is used to identify keywords so as to obtain the data required by the slides.
S32: Perform word segmentation and keyword extraction on the title set to obtain basic keywords.
Specifically, the title of each slide in the presentation file is read to form the title set, and the title set is then segmented in a preset word segmentation mode to obtain the initial segments. The preset word segmentation modes include, but are not limited to, Jieba segmentation and Viterbi-algorithm segmentation. Preferably, Jieba segmentation is used, because it facilitates the subsequent part-of-speech tagging and hence keyword extraction. The keywords are then extracted, which can be done by means of part-of-speech tagging.
S33: Acquire target data from the initial data of the data warehouse by traversal according to the basic keywords.
Specifically, after the keywords are extracted, the sample data of the data warehouse that falls within the time range, namely the initial data, is traversed according to the keywords to obtain the target data.
In this embodiment, the title of each slide is identified in the presentation file to obtain the title set, word segmentation and keyword extraction are performed on the title set to obtain the basic keywords, and the target data is acquired from the initial data of the data warehouse by traversal according to the basic keywords. The data is thus screened first by the slide titles, which improves the speed and accuracy of the subsequent acquisition of filling data.
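The traversal in step S33 can be sketched as a single pass over the initial data. The application does not specify how a keyword is matched against a record, so the rule used here (each record carries a set of descriptive tags, and a record is kept when it shares at least one tag with the keywords) is an assumption, as are the record fields.

```python
def select_target_data(records, keywords):
    """Traverse the records once and keep those that share a tag with the keywords."""
    keywords = set(keywords)
    return [record for record in records if record["tags"] & keywords]


# Illustrative initial data; "tags" and "value" are hypothetical field names.
initial_data = [
    {"tags": {"insured", "count"}, "value": 42},
    {"tags": {"claim", "amount"}, "value": 1800.0},
    {"tags": {"consultation", "count"}, "value": 7},
]

# Basic keywords as they might come out of the title set after segmentation.
basic_keywords = {"insured", "claim"}
target_data = select_target_data(initial_data, basic_keywords)
```

Here the consultation record is dropped because no slide title mentions it, which is the screening effect the embodiment describes.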
Referring to fig. 4, fig. 4 shows an embodiment of step S32, which is described in detail as follows:
S321: Perform word segmentation on the title set with Jieba to obtain initial segments.
Specifically, Jieba is an open-source Chinese word segmentation package with high performance, accuracy, and extensibility. It mainly supports Python at present, with versions available for other languages. It can segment text quickly and facilitates the subsequent part-of-speech tagging and hence keyword extraction, so the title set is segmented with Jieba.
S322: Perform part-of-speech tagging on the initial segments to obtain tagged segments.
Specifically, part-of-speech tagging is performed on each initial segment, so that each tagged segment carries its part of speech. Parts of speech are divided into content words and function words. Content words are words that carry real meaning, and include nouns, verbs, adjectives, numerals, quantifiers, pronouns, stative words, and distinguishing words. Function words are words that carry grammatical rather than real meaning, and include adverbs, prepositions, conjunctions, auxiliary words, interjections, and onomatopoeia.
S323: Select, from the tagged segments, the initial segments whose part of speech is a content word as the basic keywords.
Specifically, because function words carry grammatical rather than real meaning, they need not be selected. Because slide titles are usually concise, the words that remain after the function words are removed can be used as the basic keywords.
In this embodiment, the title set is segmented with Jieba to obtain the initial segments, part-of-speech tagging is performed on the initial segments to obtain the tagged segments, and the initial segments whose part of speech is a content word are selected from the tagged segments as the basic keywords. Keywords are thus extracted from the slide titles, which facilitates the subsequent extraction of the corresponding data and hence the filling of the presentation.
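Step S323 can be sketched as a filter over (word, flag) pairs. To stay self-contained, the sketch below works on hand-written pairs that stand in for a tagger's output; in practice such pairs could come from Jieba's `jieba.posseg.cut`, which yields word/flag pairs. The flag prefixes treated as content words are an assumption based on the ICTCLAS-style tags Jieba reports and should be checked against the tagger actually used.

```python
# Flag prefixes treated as content words (assumed ICTCLAS-style tags):
# n noun, v verb, a adjective, m numeral, q quantifier, r pronoun,
# z stative word, b distinguishing word.
CONTENT_WORD_PREFIXES = ("n", "v", "a", "m", "q", "r", "z", "b")


def select_basic_keywords(tagged_segments):
    """Keep content words, in order of first appearance, without duplicates."""
    keywords = []
    for word, flag in tagged_segments:
        if flag.startswith(CONTENT_WORD_PREFIXES) and word not in keywords:
            keywords.append(word)
    return keywords


# Hand-written stand-in for a tagged slide title; the flags are illustrative.
tagged_title = [
    ("参保", "v"),   # to be insured
    ("人员", "n"),   # persons
    ("的", "uj"),    # auxiliary particle (function word)
    ("年龄", "n"),   # age
    ("和", "c"),     # conjunction (function word)
    ("性别", "n"),   # gender
    ("分布", "v"),   # distribution
]

basic_keywords = select_basic_keywords(tagged_title)
```

An allow-list of content-word flags is used rather than a deny-list of function-word flags, so that tags outside both groups (such as punctuation) are dropped by default.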
S4: Screen out the slides to be filled from the presentation file, and identify the chart identifier and the title of each chart in the slides to be filled.
Specifically, because the presentation file is a preset presentation template, some slides are already fully edited and do not need to be filled with data. For example, the first slide often contains only a theme and needs no data, so such slides must be excluded during data filling. The slides to be filled are screened out by judging whether each slide has an area to be edited. The chart identifiers in each slide to be filled are then identified; a chart identifier indicates the type of the chart, such as a bar chart, a pie chart, a line chart, or a table. Each chart that needs to be filled describes the content to be filled in, so the title corresponding to the chart also needs to be identified.
Referring to fig. 5, fig. 5 shows an embodiment of step S4, which is described in detail as follows:
S41: Judge whether each slide in the presentation file has an area to be edited, to obtain a judgment result.
S42: If the judgment result is that an area to be edited exists, the slide is a slide to be filled.
Specifically, the presentation template is set in advance as needed, with some slides set to be filled with data and others not. It is therefore enough to judge whether a slide in the presentation file has an area to be edited: if it does, it is a slide to be filled; if not, it needs no data filling.
S43: Identify the chart identifiers in each slide to be filled in order of slide page number, and acquire the title corresponding to each chart in turn.
Specifically, different chart identifiers represent different charts. The presentation template is set in advance, and each chart identifier set in it is later filled with the corresponding chart data. Each chart also has a title describing it, so acquiring the title corresponding to a chart makes it possible to acquire the corresponding data afterwards.
In this embodiment, the slides to be filled are determined by judging whether the slides in the presentation file have an area to be edited, and the chart identifiers and chart titles in the slides to be filled are then identified, which facilitates the subsequent acquisition of filling data and the filling of the slides.
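Steps S41 to S43 can be sketched with plain data classes. The `Slide` and `Chart` classes are hypothetical stand-ins for whatever the parsing library returns, and the boolean `editable` field stands in for the check "has an area to be edited", whose concrete form the application does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class Chart:
    chart_id: str   # hypothetical chart identifier, e.g. "bar" or "pie"
    title: str


@dataclass
class Slide:
    page: int
    title: str
    editable: bool                      # stands in for "has an area to be edited"
    charts: list = field(default_factory=list)


def screen_slides_to_fill(slides):
    """S41/S42: keep slides with an editable area, ordered by page number."""
    return [s for s in sorted(slides, key=lambda s: s.page) if s.editable]


def chart_ids_and_titles(slides_to_fill):
    """S43: collect (page, chart identifier, chart title) in page order."""
    return [(s.page, c.chart_id, c.title) for s in slides_to_fill for c in s.charts]


slides = [
    Slide(3, "Claims settlement", True, [Chart("pie", "Claims by liability")]),
    Slide(1, "Annual service report", False),
    Slide(2, "Insurance application", True,
          [Chart("bar", "Insured persons by age"),
           Chart("table", "Insured amount by gender")]),
]

slides_to_fill = screen_slides_to_fill(slides)
charts = chart_ids_and_titles(slides_to_fill)
```

The cover slide on page 1 has no editable area and is excluded, matching the example given for the first slide.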
Referring to fig. 6, fig. 6 shows a specific implementation performed before step S4, which includes:
S4A: Acquire a presentation object corresponding to the presentation file by a preset method.
S4B: Acquire each slide of the presentation object and the page number of each slide according to the presentation object.
Specifically, a presentation object is obtained through the XMLSlideShow class of Apache POI; each slide is obtained in a loop through the getSlides method of the presentation object, and the page number of each slide is obtained through the getSlideNumber() method. Preferably, the first slide is processed first and is skipped directly if it does not need to be filled. The presentation object is a Java object, an instance of the Apache POI XMLSlideShow class, which is used to parse the presentation file.
Apache POI is a free, open-source, cross-platform Java API written in Java, which provides Java programs with functions for reading and writing files in Microsoft Office formats (Excel, Word, PowerPoint, Visio, and the like).
In this embodiment, a presentation object corresponding to the presentation file is obtained by a preset method, and each slide and its page number are then obtained from the presentation object, laying the foundation for the subsequent data filling. Here, the number of pages of a slide refers to the page number corresponding to each slide.
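The embodiment performs steps S4A and S4B with Apache POI in Java. Purely as an illustration of what such a library does underneath, the sketch below lists the slide parts of a .pptx file using only Python's standard library, relying on the fact that a .pptx file is a ZIP archive whose slides are stored as `ppt/slides/slideN.xml`. This is a swapped-in technique, not the Apache POI API. Note also that the number in the part name is only a convenient ordering: the authoritative slide order is recorded in `ppt/presentation.xml`, which this sketch does not read.

```python
import io
import re
import zipfile

SLIDE_PART = re.compile(r"^ppt/slides/slide(\d+)\.xml$")


def list_slide_parts(pptx_bytes):
    """Return (number, part name) pairs for the slide parts, sorted by number."""
    found = []
    with zipfile.ZipFile(io.BytesIO(pptx_bytes)) as archive:
        for name in archive.namelist():
            match = SLIDE_PART.match(name)
            if match:
                found.append((int(match.group(1)), name))
    return sorted(found)


def build_demo_archive():
    """Build a tiny ZIP that mimics the slide layout of a .pptx (not a valid deck)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for number in (2, 1, 10):
            archive.writestr(f"ppt/slides/slide{number}.xml", "<sld/>")
        archive.writestr("ppt/presentation.xml", "<presentation/>")
    return buffer.getvalue()


slide_parts = list_slide_parts(build_demo_archive())
```

Sorting on the parsed integer rather than on the file name keeps slide 10 after slide 2, which a plain string sort would get wrong.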
S5: Acquire filling data from the target data through the title corresponding to each chart, and determine the chart type according to the chart identifier.
Specifically, because the preceding steps have already acquired the target data according to the presentation template, the target data contains the data required by all slides, including the data required by each slide to be filled. The title of each slide to be filled is identified first, and the data corresponding to that slide is identified through the title. Because a slide to be filled may contain several charts that need data, the filling data needed by each chart is then acquired according to the title of that chart. Moreover, because charts are of different types and each type has its own data statistics mode, the type of each chart is determined through its chart identifier.
Referring to fig. 7, fig. 7 shows an embodiment of step S5, which is described in detail as follows:
S51: identifying the title corresponding to each slide to be filled as the basic title.
S52: acquiring, based on the basic title, the data corresponding to the basic title from the target data as basic data.
Specifically, in the embodiment of the present application, word segmentation and keyword extraction are first performed on the basic title, and the basic data are obtained from the target data through the keywords.
S53: acquiring filling data from the basic data according to the title corresponding to the chart, and judging the chart type according to the chart identifier.
Specifically, in the embodiment of the present application, word segmentation and keyword extraction are performed on the title corresponding to each chart, and the filling data are then obtained from the basic data through the keywords.
It should be noted that the target data are acquired first in order to screen out, from the initial data, all data that the target presentation may need. This narrows the range searched when the basic data and the filling data are acquired, so that the corresponding data can be obtained quickly and accurately.
In this embodiment, the title corresponding to each slide to be filled is identified as the basic title, the data corresponding to the basic title are acquired from the target data as basic data, the filling data are acquired from the basic data according to the title corresponding to the chart, and the chart type is judged according to the chart identifier. The final filling data are therefore obtained accurately, which improves the accuracy of the data statistics.
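The two-stage narrowing of steps S51 to S53 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: records are assumed to carry a set of `tags`, the `keywords` helper is a plain whitespace tokenizer standing in for real word segmentation, and `STOPWORDS`, `select`, and the sample records are hypothetical.

```python
STOPWORDS = {"of", "the", "and", "by"}


def keywords(title):
    """Lower-case whitespace tokenizer; a stand-in for real word segmentation."""
    return {word for word in title.lower().split() if word not in STOPWORDS}


def select(records, title):
    """Keep the records whose tags share at least one keyword with the title."""
    keys = keywords(title)
    return [record for record in records if keys & record["tags"]]


target_data = [
    {"tags": {"insurance", "gender"}, "value": "F"},
    {"tags": {"insurance", "age"}, "value": 34},
    {"tags": {"claims", "age"}, "value": 51},
]

# S51-S52: narrow the target data with the slide title (the basic title).
basic_data = select(target_data, "Insurance applications")
# S53: narrow the basic data again with the chart title.
fill_data = select(basic_data, "Age distribution")
print(fill_data)
```

Because the second lookup runs over `basic_data` rather than `target_data`, the record tagged `claims` is never considered for the chart, which is the narrowing effect the text describes.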
S6: generating a target chart and statistical data according to the chart type and the filling data, and filling the target chart and the statistical data into the presentation file to obtain the target presentation.
Specifically, the different presentation files are templates set in advance, and each chart counts different content, so the data statistics mode corresponding to each chart type is already configured when the template is set. The data statistics mode corresponding to a chart is obtained by identifying the chart type, the filling data are calculated according to that mode to obtain the statistical data, the corresponding target chart is generated from the statistical data, and finally the target chart and the statistical data are filled into the presentation file to obtain the target presentation.
Further, the present application fills the target charts and the statistical data into the presentation using Apache POI.
In this embodiment, a request is obtained from a user, the request including a time range selected by the user and the required presentation template type, and the sample data corresponding to the time range are used as initial data. The sample data are stored in a data warehouse, so that all data reside in the same database, which makes data acquisition convenient. The required data are acquired accurately through the slide titles, the charts to be filled are identified from the presentation file, the filling data are acquired according to the chart titles, statistical data for the different chart types are generated using the preset data statistics mode of each chart, and the statistical data are filled in to generate the target presentation. The data to be counted are thus acquired accurately according to the titles in the preset presentation template and converted according to its preset statistics modes into statistical data and target charts, which improves the accuracy of the data statistics.
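As a rough Python illustration of how a request could select the initial data, the sketch below filters sample records by the requested time range. The `Request` fields, the inclusive bounds, the record layout, and the template name `quarterly_report` are assumptions; the source does not specify them.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Request:
    """Hypothetical request: a time range plus the wanted template type."""
    start: date
    end: date
    template_type: str


def initial_data(sample_data, request):
    """Select the sample records dated within the requested range (inclusive)."""
    return [r for r in sample_data if request.start <= r["date"] <= request.end]


sample_data = [
    {"date": date(2021, 1, 15), "policy": "A"},
    {"date": date(2021, 3, 2), "policy": "B"},
    {"date": date(2021, 7, 9), "policy": "C"},
]
request = Request(date(2021, 1, 1), date(2021, 3, 31), "quarterly_report")
print([r["policy"] for r in initial_data(sample_data, request)])
```

In a real deployment the filter would more likely be pushed down into the data warehouse query than applied in application memory.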
Referring to fig. 8, fig. 8 shows an embodiment of step S6, which is described in detail as follows:
S61: acquiring the data statistics mode corresponding to the chart type.
The data statistics mode is the way in which statistics are computed for each chart type, and is set in advance as required.
S62: calculating the filling data according to the data statistics mode to obtain statistical data.
S63: generating a target chart based on the statistical data.
Specifically, different chart types have different data statistics modes. For example, the mode corresponding to a bar chart may be to count the number of each object, and the mode corresponding to a pie chart may be to compute the proportion accounted for by each case. The corresponding data statistics mode is therefore obtained according to the chart type, the filling data are calculated according to that mode to obtain statistical data, and the target chart is formed from the statistical data.
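One way to picture steps S61 and S62 is a lookup table from chart type to statistics function. The Python sketch below assumes two chart types, `bar` and `pie`, and the names `STAT_MODES` and `compute_statistics` are hypothetical; an actual template could configure different modes.

```python
from collections import Counter


def count_per_category(values):
    """Bar-chart mode: number of items in each category."""
    return dict(Counter(values))


def share_per_category(values):
    """Pie-chart mode: fraction of the total that falls in each category."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


# Preset mapping from chart type to data statistics mode (S61).
STAT_MODES = {"bar": count_per_category, "pie": share_per_category}


def compute_statistics(chart_type, fill_data):
    """Apply the mode configured for the chart type to the filling data (S62)."""
    try:
        mode = STAT_MODES[chart_type]
    except KeyError:
        raise ValueError(f"no statistics mode configured for {chart_type!r}")
    return mode(fill_data)


print(compute_statistics("bar", ["a", "b", "a"]))
print(compute_statistics("pie", ["a", "b", "a", "a"]))
```

Keeping the modes in a table means a new chart type only needs one more entry, without touching the code that fills the presentation.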
S64: filling the target chart and the statistical data into the presentation file to obtain the target presentation.
In a specific embodiment, the title of a slide to be filled is "insurance application status", and the server acquires the target data on insurance applications within the time range according to that title. The slide contains two charts: a pie chart titled "gender distribution" and a bar chart titled "age distribution". The data statistics mode of the pie chart is to count the male and female applicants in the filling data, from which the target pie chart is generated. The data statistics mode of the bar chart is to count the number of policyholders of each age in the filling data, from which the target bar chart is generated.
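The insurance example can be worked through with a few made-up records. In this Python sketch the `applicants` data, the ten-year age bands, and the function names are assumptions added for illustration; the source only says that applicants are counted by gender and by age.

```python
from collections import Counter

# Made-up filling data for the "insurance application status" slide.
applicants = [
    {"gender": "F", "age": 28},
    {"gender": "M", "age": 41},
    {"gender": "F", "age": 35},
    {"gender": "M", "age": 67},
]


def gender_shares(rows):
    """Pie chart: share of applicants per gender."""
    counts = Counter(row["gender"] for row in rows)
    return {gender: n / len(rows) for gender, n in counts.items()}


def age_counts(rows, width=10):
    """Bar chart: number of applicants per age band, youngest band first."""
    counts = Counter(row["age"] // width * width for row in rows)
    return {f"{low}-{low + width - 1}": n for low, n in sorted(counts.items())}


print(gender_shares(applicants))
print(age_counts(applicants))
```

Grouping ages into bands is a design choice made here to keep the bar chart readable; counting each exact age separately would also match the description.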
In this embodiment, the data statistics mode corresponding to the chart type is acquired, the filling data are calculated according to that mode to obtain statistical data, a target chart is generated based on the statistical data, and the target chart and the statistical data are filled into the presentation file to obtain the target presentation.
It is emphasized that, in order to further ensure the privacy and security of the sample data, the sample data may also be stored in a node of a blockchain.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and which, when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, or a Read-Only Memory (ROM), or may be a Random Access Memory (RAM).
Referring to fig. 9, as an implementation of the method shown in fig. 2, the present application provides an embodiment of a presentation-based data statistics apparatus, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be applied to various electronic devices.
As shown in fig. 9, the presentation-based data statistics apparatus of the present embodiment includes: a sample data acquisition module 71, a presentation file acquisition module 72, a target data extraction module 73, a to-be-filled slide screening module 74, a filling data acquisition module 75, and a target presentation generation module 76, where:
a sample data acquisition module 71, configured to obtain a request from a user, where the request includes a time range selected by the user and a required presentation template type, and to use sample data corresponding to the time range as initial data, where the sample data is stored in a data warehouse;
a presentation file acquisition module 72, configured to acquire a corresponding presentation file from the NAS server according to a name corresponding to the presentation template type;
a target data extraction module 73, configured to identify a title of each slide from the presentation file, and obtain target data from the initial data by using the title;
a to-be-filled slide screening module 74, configured to screen out slides to be filled from the presentation file, and identify the chart identifiers and the titles corresponding to the charts in the slides to be filled;
a filling data acquisition module 75, configured to obtain filling data from the target data through the title corresponding to the chart, and determine the chart type according to the chart identifier;
and a target presentation generation module 76, configured to generate a target chart and statistical data according to the chart type and the filling data, and fill the target chart and the statistical data into the presentation file to obtain a target presentation.
Further, the target data extraction module 73 includes:
a title set acquisition unit, configured to identify the title of each slide from the presentation file to obtain a title set;
a basic keyword extraction unit, configured to perform word segmentation and keyword acquisition on the title set to obtain basic keywords;
and a target data acquisition unit, configured to acquire target data from the initial data of the data warehouse by traversal according to the basic keywords.
Further, the basic keyword extraction unit includes:
an initial segmented word acquisition subunit, configured to perform word segmentation on the title set using Jieba word segmentation to obtain initial segmented words;
a segmented word tagging subunit, configured to perform part-of-speech tagging on the initial segmented words to obtain tagged segmented words;
and a basic keyword selection subunit, configured to select, from the tagged segmented words, the initial segmented words whose part of speech is a content word as the basic keywords.
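The selection of content words from tagged segments might look like the Python sketch below. It does not call Jieba: the input is assumed to be a list of (word, tag) pairs of the kind a part-of-speech tagger produces, and treating tags that start with `n`, `v`, or `a` (nouns, verbs, adjectives) as content words is an assumption, since the source does not list the tags it keeps.

```python
# Tag prefixes treated as content words here: nouns, verbs, adjectives.
CONTENT_TAG_PREFIXES = ("n", "v", "a")


def basic_keywords(tagged_tokens):
    """Keep content words, de-duplicated in first-seen order."""
    kept = []
    for word, tag in tagged_tokens:
        if tag.startswith(CONTENT_TAG_PREFIXES) and word not in kept:
            kept.append(word)
    return kept


# Hypothetical tagger output for two slide titles.
# 投保 = apply for insurance, 的 = (particle), 情况 = status, 年龄 = age, 和 = and
tagged = [
    ("投保", "v"), ("的", "uj"), ("情况", "n"),
    ("年龄", "n"), ("和", "c"), ("情况", "n"),
]
print(basic_keywords(tagged))
```

Particles and conjunctions are dropped because they carry no information that could match a field in the data warehouse.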
Further, the to-be-filled slide screening module 74 includes:
a judgment result acquisition unit, configured to judge whether a slide in the presentation file has an area to be edited, to obtain a judgment result;
a to-be-filled slide determining unit, configured to determine that the slide is a slide to be filled if the judgment result indicates that the area to be edited exists;
and a chart identifier acquisition unit, configured to identify the chart identifiers in each slide to be filled in order of slide page number, and to acquire the title corresponding to each chart in turn.
Further, the apparatus also includes, upstream of the to-be-filled slide screening module 74:
a presentation object generation module, configured to acquire the presentation object corresponding to the presentation file by a preset method;
and a slide page number acquisition module, configured to acquire each slide of the presentation object and the page number of each slide according to the presentation object.
Further, the filling data acquisition module 75 includes:
a basic title identifying unit, configured to identify the title corresponding to each slide to be filled as a basic title;
a basic data acquisition unit, configured to acquire, based on the basic title, data corresponding to the basic title from the target data as basic data;
and a filling data acquisition unit, configured to acquire filling data from the basic data according to the title corresponding to the chart, and judge the chart type according to the chart identifier.
Further, the target presentation generating module 76 includes:
a data statistics mode acquisition unit, configured to acquire the data statistics mode corresponding to the chart type;
a statistical data generating unit, configured to calculate the filling data according to the data statistics mode to obtain statistical data;
a target chart generation unit, configured to generate a target chart based on the statistical data;
and a presentation filling unit, configured to fill the target chart and the statistical data into the presentation file to obtain a target presentation.
It is emphasized that, in order to further ensure the privacy and security of the sample data, the sample data may also be stored in a node of a blockchain.
In order to solve the technical problem, an embodiment of the present application further provides a computer device. Referring to fig. 10, fig. 10 is a block diagram of a basic structure of a computer device according to the present embodiment.
The computer device 8 includes a memory 81, a processor 82, and a network interface 83 communicatively connected to each other via a system bus. It is noted that only a computer device 8 having three components, a memory 81, a processor 82, and a network interface 83, is shown, but it is understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead. As will be understood by those skilled in the art, the computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer device may be a desktop computer, a notebook, a palm computer, a cloud server, or other computing devices. The computer equipment can carry out man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch panel or voice control equipment and the like.
The memory 81 includes at least one type of readable storage medium, including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 81 may be an internal storage unit of the computer device 8, such as a hard disk or a memory of the computer device 8. In other embodiments, the memory 81 may also be an external storage device of the computer device 8, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the computer device 8. Of course, the memory 81 may also include both internal and external storage devices of the computer device 8. In this embodiment, the memory 81 is generally used for storing the operating system installed in the computer device 8 and various types of application software, such as the program code of the presentation-based data statistics method. Further, the memory 81 may also be used to temporarily store various types of data that have been output or are to be output.
Processor 82 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 82 is typically used to control the overall operation of the computer device 8. In this embodiment, the processor 82 is configured to execute the program codes or process data stored in the memory 81, for example, the program codes of the above-mentioned presentation-based data statistics method, so as to implement various embodiments of the presentation-based data statistics method.
The network interface 83 may include a wireless network interface or a wired network interface, and the network interface 83 is generally used to establish communication connections between the computer device 8 and other electronic devices.
The present application further provides another embodiment, which is to provide a computer-readable storage medium storing a computer program, which is executable by at least one processor to cause the at least one processor to perform the steps of a presentation-based data statistics method as described above.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method of the embodiments of the present application.
Blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked to one another by cryptographic methods, where each data block contains the information of a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
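The idea of data blocks linked by cryptographic methods can be illustrated with a toy hash chain in Python. This sketch covers only the linking and tamper detection; it has no distribution, peer-to-peer transmission, or consensus, and the block layout and function names are assumptions made for illustration.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, records, prev_hash):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps(
        {"index": index, "records": records, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, records):
    """Append a block that stores the hash of the block before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append({
        "index": index,
        "records": records,
        "prev_hash": prev_hash,
        "hash": block_hash(index, records, prev_hash),
    })


def is_valid(chain):
    """Check every link and recompute every hash to detect tampering."""
    prev_hash = GENESIS_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["records"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


chain = []
append_block(chain, [{"policy": "A"}])
append_block(chain, [{"policy": "B"}])
print(is_valid(chain))
```

Changing any stored record afterwards makes the recomputed hash differ from the stored one, so `is_valid` reports the chain as broken.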
It is to be understood that the embodiments described above are only some, not all, of the embodiments of the present application, and that the accompanying drawings show preferred embodiments without limiting the scope of the application. The application may be embodied in many different forms; these embodiments are provided so that the disclosure will be understood thoroughly. Although the application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in those embodiments or replace some of their features with equivalents. Any equivalent structure made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, likewise falls within the protection scope of the present application.

Claims (10)

1. A presentation-based data statistics method, characterized by comprising the following steps:
acquiring a request from a user side, wherein the request comprises a time range selected by the user side and a required presentation template type, and taking sample data corresponding to the time range as initial data, wherein the sample data is stored in a data warehouse;
acquiring a corresponding presentation file from an NAS server according to the name corresponding to the presentation template type;
identifying the title of each slide from the presentation file, and acquiring target data from the initial data through the title;
screening out slides to be filled from the presentation file, and identifying chart identifiers and titles corresponding to the charts in the slides to be filled;
acquiring filling data from the target data through the title corresponding to the chart, and judging the chart type according to the chart identifier;
and generating a target chart and statistical data according to the chart type and the filling data, and filling the target chart and the statistical data into the presentation file to obtain a target presentation.
2. The presentation-based data statistics method of claim 1, wherein the identifying a title of each slide from the presentation file and obtaining target data from the initial data through the title comprises:
identifying the title of each slide from the presentation file to obtain a title set;
performing word segmentation and keyword acquisition on the title set to obtain basic keywords;
and acquiring target data from the initial data of the data warehouse in a traversal mode according to the basic keywords.
3. The presentation-based data statistics method of claim 2, wherein the performing word segmentation and keyword acquisition on the title set to obtain basic keywords comprises:
performing word segmentation on the title set using Jieba word segmentation to obtain initial segmented words;
performing part-of-speech tagging on the initial segmented words to obtain tagged segmented words;
and selecting, from the tagged segmented words, the initial segmented words whose part of speech is a content word as the basic keywords.
4. The presentation-based data statistics method of claim 1, wherein the screening out slides to be filled from the presentation file and identifying chart identifications and chart-corresponding titles in the slides to be filled comprises:
judging whether a slide in the presentation file has an area to be edited or not to obtain a judgment result;
if the judgment result is that the area to be edited exists, the slide is the slide to be filled;
and sequentially identifying the chart identification in each slide to be filled according to the page number of the slide, and sequentially acquiring the title corresponding to each chart.
5. The presentation-based data statistics method of claim 1, wherein before the screening out slides to be filled from the presentation file and identifying chart identifiers and chart-corresponding titles in the slides to be filled, the method further comprises:
acquiring a presentation object corresponding to the presentation file by a preset method;
and acquiring each slide of the presentation object and the page number of each slide according to the presentation object.
6. The method of claim 1, wherein the obtaining of fill data from the target data via a title corresponding to the chart and determining the chart type according to the chart identifier comprises:
identifying a title corresponding to each slide to be filled as a basic title;
acquiring data corresponding to the basic title from the target data based on the basic title, and taking the data as basic data;
and acquiring the filling data from the basic data according to the title corresponding to the chart, and judging the chart type according to the chart identification.
7. The presentation-based data statistics method according to any one of claims 1 to 6, wherein the generating a target chart and statistical data according to the chart type and the filling data, and filling the target chart and the statistical data into the presentation file to obtain a target presentation comprises:
acquiring a data statistics mode corresponding to the chart type;
calculating the filling data according to the data statistics mode to obtain statistical data;
generating a target chart based on the statistical data;
and filling the target chart and the statistical data into the presentation file to obtain a target presentation.
8. A presentation-based data statistics apparatus, comprising:
a sample data acquisition module, configured to acquire a request from a user side, wherein the request comprises a time range selected by the user side and a required presentation template type, and to take sample data corresponding to the time range as initial data, wherein the sample data is stored in a data warehouse;
a presentation file acquisition module, configured to acquire a corresponding presentation file from the NAS server according to the name corresponding to the presentation template type;
a target data extraction module, configured to identify the title of each slide from the presentation file and acquire target data from the initial data through the title;
a to-be-filled slide screening module, configured to screen out slides to be filled from the presentation file and identify chart identifiers and titles corresponding to the charts in the slides to be filled;
a filling data acquisition module, configured to acquire filling data from the target data through the title corresponding to the chart and judge the chart type according to the chart identifier;
and a target presentation generation module, configured to generate a target chart and statistical data according to the chart type and the filling data, and fill the target chart and the statistical data into the presentation file to obtain a target presentation.
9. A computer device comprising a memory in which a computer program is stored and a processor that implements the presentation-based data statistics method according to any one of claims 1 to 7 when the computer program is executed by the processor.
10. A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, implements the presentation-based data statistics method of any of claims 1 to 7.
CN202110690523.3A 2021-06-22 2021-06-22 Data statistics method, device, equipment and storage medium based on presentation Pending CN113420042A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110690523.3A CN113420042A (en) 2021-06-22 2021-06-22 Data statistics method, device, equipment and storage medium based on presentation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110690523.3A CN113420042A (en) 2021-06-22 2021-06-22 Data statistics method, device, equipment and storage medium based on presentation

Publications (1)

Publication Number Publication Date
CN113420042A true CN113420042A (en) 2021-09-21

Family

ID=77789720

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110690523.3A Pending CN113420042A (en) 2021-06-22 2021-06-22 Data statistics method, device, equipment and storage medium based on presentation

Country Status (1)

Country Link
CN (1) CN113420042A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104850559A (en) * 2014-02-18 2015-08-19 华东师范大学 Slide independent storage, retrieval and recombination method and equipment based on presentation document
CN105531699A (en) * 2013-06-06 2016-04-27 微软技术许可有限责任公司 Automated system for organizing presentation slides
CN108073680A (en) * 2016-11-10 2018-05-25 谷歌有限责任公司 Generation is with the presentation slides for refining content
CN108509405A (en) * 2018-04-11 2018-09-07 北京深度智耀科技有限公司 A kind of generation method of PowerPoint, device and equipment
CN111259069A (en) * 2019-10-29 2020-06-09 浙江浙大中控信息技术有限公司 Data visualization implementation method based on configuration
CN112560406A (en) * 2020-12-17 2021-03-26 中科三清科技有限公司 Method and device for generating forecast consultation demonstration manuscript
CN112711937A (en) * 2021-01-18 2021-04-27 腾讯科技(深圳)有限公司 Template recommendation method, device, equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117272965A (en) * 2023-09-11 2023-12-22 中关村科学城城市大脑股份有限公司 Demonstration manuscript generation method, demonstration manuscript generation device, electronic equipment and computer readable medium
CN117272965B (en) * 2023-09-11 2024-04-12 中关村科学城城市大脑股份有限公司 Demonstration manuscript generation method, demonstration manuscript generation device, electronic equipment and computer readable medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination