WO2020140639A1 - Machine learning-based report generating method, apparatus, and computer device - Google Patents

Machine learning-based report generating method, apparatus, and computer device

Info

Publication number
WO2020140639A1
Authority
WO
WIPO (PCT)
Prior art keywords
report
preliminary
type
current user
user
Prior art date
Application number
PCT/CN2019/119480
Other languages
French (fr)
Chinese (zh)
Inventor
徐凯
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020140639A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Definitions

  • This application relates to the field of computers, and in particular to a method, device, computer equipment, and storage medium for generating reports based on machine learning.
  • the main purpose of the present application is to provide a report generation method, device, computer equipment and storage medium based on machine learning, aiming to use templates to accurately generate the reports required by users and reduce the time and effort for users to adjust the reports.
  • this application proposes a method for generating a report based on machine learning, including the following steps:
  • the characteristic information of the current user includes at least the occupation information of the current user;
  • the feature information is input into a preset machine learning-based report type prediction model for calculation, wherein the report type prediction model is trained on training data composed of user feature information and the report type corresponding to the user feature information;
  • a preset preliminary report is retrieved from the database, wherein the type of the preliminary report is the same as the type of report to be used;
  • the preliminary report is adjusted to obtain a final report.
  • the step of retrieving a preset preliminary report from the database according to the type of report to be used includes:
  • the step of adjusting the preliminary report according to the information input by the current user to obtain the final report includes:
  • the text content input by the current user is filled into the text portion of the preliminary report to obtain the final report.
  • the method for obtaining the report type prediction model includes:
  • the preliminary training model is recorded as the report type prediction model.
  • the step of recording the preliminary training model as the report type prediction model includes:
  • the preliminary training model is recorded as the report type prediction model.
  • the method for obtaining the report type prediction model includes:
  • sample data includes user feature information and a report type corresponding to the user feature information
  • the preliminary CHAID decision tree is recorded as the report type prediction model.
  • the step of inputting the sample data of the training set into the CHAID decision tree model for training to obtain a preliminary CHAID decision tree includes:
  • the modeling standard parameters include the maximum number of layers of the decision tree, the significance level for subdividing a parent node, the minimum number of samples contained in a parent node, and the minimum number of samples contained in a child node;
  • the sample data of the training set is input into the CHAID decision tree model established by the chi-square automatic interactive detection method for training to obtain a preliminary CHAID decision tree.
  • This application provides a report generation device based on machine learning, including:
  • a characteristic information obtaining unit configured to obtain characteristic information of the current user, the characteristic information of the current user includes at least the occupation information of the current user;
  • a report type prediction model operation unit configured to input the feature information into a preset machine learning-based report type prediction model for calculation, wherein the report type prediction model is trained on training data composed of user feature information and the report type corresponding to the user feature information;
  • a report type prediction unit used to output the predicted report type that the current user will use
  • a preliminary report retrieval unit configured to retrieve a preset preliminary report from the database according to the type of report to be used, wherein the type of the preliminary report is the same as the type of report to be used;
  • the final report obtaining unit is used to adjust the preliminary report according to the information input by the current user, so as to obtain the final report.
  • the present application provides a computer device, including a memory and a processor.
  • the memory stores computer-readable instructions.
  • when the processor executes the computer-readable instructions, the steps of any of the methods described above are implemented.
  • the present application provides a computer-readable storage medium on which computer-readable instructions are stored, and when the computer-readable instructions are executed by a processor, implement the steps of any one of the above methods.
  • with the machine learning-based report generation method, device, computer equipment and storage medium of the present application, the machine learning-based report type prediction model predicts the report type that the current user will use; the preset preliminary report is then retrieved from the database according to the report type and adjusted to obtain the final report, thereby avoiding tedious user operations, improving the user experience, and improving the efficiency of report completion.
  • FIG. 1 is a schematic flowchart of a method for generating a report based on machine learning according to an embodiment of the application
  • FIG. 2 is a schematic block diagram of a structure of a report generation device based on machine learning according to an embodiment of the present application
  • FIG. 3 is a schematic block diagram of a computer device according to an embodiment of the present application.
  • an embodiment of the present application provides a report generation method based on machine learning, including the following steps:
  • the feature information is input into a preset machine learning-based report type prediction model for calculation, wherein the report type prediction model is trained on training data composed of user feature information and the report type corresponding to the user feature information.
  • the current user's characteristic information is obtained, and the current user's characteristic information includes at least the current user's occupation information.
  • the characteristic information of the current user refers to information that can reflect the characteristics of the current user, such as the occupation of the current user, the type of report used by the current user (within a specified time), the age of the current user, and the gender of the current user.
  • the occupation information of the current user has a greater relationship with the type of report that may be used, so the characteristic information of the current user includes at least the occupation information of the current user, so as to improve the accuracy of the prediction of the report type. For example, stock commentators in the financial industry have a higher probability of using stock statements.
  • the process of acquiring the characteristic information of the current user includes: extracting the characteristic information of the current user from the registered account information of the current user.
  • the feature information is input into a preset machine learning-based report type prediction model for calculation, wherein the report type prediction model is trained on user feature information and the report type corresponding to the user feature information.
  • the report type prediction model based on machine learning, through continuous self-learning, improves the accuracy of prediction and avoids predicting the wrong report type.
  • the report type prediction model can be generated based on any machine learning model, such as a neural network model and a classification tree model, and then trained through training data. Wherein, the report type prediction model is trained by training data composed of user characteristic information and the report type corresponding to the user characteristic information.
  • the user characteristic information may include any information that can reflect the characteristics of the user, such as the occupation of the user, the type of report recently used by the user (within a specified time), the age of the user, and the gender of the user. Among them, the user refers to a user who uses a report. Wherein, the report type prediction model is used to predict the report type to be used by the user based on the user characteristic information.
  • the predicted report type to be used by the current user is output.
  • the report types can be classified by any classification method. For example, they can be divided by chart type into pie chart reports (reports that contain a pie chart), curve chart reports, and histogram reports; or by content into financial statements and statistical analysis reports.
  • through the report type prediction model, the predicted report type to be used by the current user can be output.
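The prediction step can be illustrated with a minimal sketch. It assumes a toy frequency-count classifier as a stand-in for the trained model; the class name `FrequencyReportTypeModel`, the feature keys, and the sample data are all hypothetical and are not taken from the application, which allows any machine learning model here.

```python
from collections import Counter, defaultdict


class FrequencyReportTypeModel:
    """Toy stand-in for the report type prediction model (hypothetical)."""

    def __init__(self):
        self._by_occupation = defaultdict(Counter)
        self._overall = Counter()

    def fit(self, samples):
        # samples: iterable of (feature_dict, report_type) pairs
        for features, report_type in samples:
            self._by_occupation[features["occupation"]][report_type] += 1
            self._overall[report_type] += 1
        return self

    def predict(self, features):
        # Use the counts for this occupation; fall back to the overall
        # counts when the occupation was never seen in training.
        counts = self._by_occupation.get(features["occupation"])
        if not counts:
            counts = self._overall
        return counts.most_common(1)[0][0]


# Invented training data: user feature information plus the report type used.
training_data = [
    ({"occupation": "stock commentator", "age": 35}, "stock statement"),
    ({"occupation": "stock commentator", "age": 41}, "stock statement"),
    ({"occupation": "stock commentator", "age": 29}, "statistical analysis report"),
    ({"occupation": "accountant", "age": 50}, "financial statement"),
]

model = FrequencyReportTypeModel().fit(training_data)
predicted = model.predict({"occupation": "stock commentator", "age": 30})
print(predicted)
```

Only the occupation is used in this sketch because the application singles it out as the feature most related to report type; a real model would also use the other features.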
  • a preset preliminary report is retrieved from the database, wherein the type of the preliminary report is the same as the type of report to be used.
  • the preliminary reports for the pie chart report type can include three kinds of report: with the pie chart at the top of the report, in the middle of the report, or at the end of the report; the preliminary financial reports can include a report with a K chart, a report with a histogram, a report with a curve chart, and so on.
  • preliminary reports can be directly selected by the user, thereby eliminating the user's trivial process of creating reports from scratch, and improving the speed of report formation.
  • the preset preliminary report may be an already completed report, or a report combined from the chart template and text part template that the current user selects from multiple alternative chart templates and text part templates.
  • the preliminary report is adjusted to obtain a final report.
  • a preliminary report is provided, and then the final report can be obtained based on the information input by the current user.
  • the information input by the current user includes at least one of information for adjusting the parameters of the preliminary report and text content information for describing the preliminary report.
  • the final report includes at least a chart part, and further, may include a text part.
  • the step S4 of retrieving a preset preliminary report from the database according to the type of report to be used, wherein the type of the preliminary report is the same as the type of report to be used, includes:
  • the preset preliminary report is retrieved from the database.
  • the preliminary report is generated by combining chart templates (including chart style templates and chart data templates) with text part templates. Because different reports have different specific requirements, their format, layout, charts, and so on differ. By decomposing the report into a chart part and a text part, the chart templates and text part templates can be designed in advance and stored in the database. When a specific report is needed, the user only needs to choose from the existing chart templates and text part templates and combine them to form the preliminary report.
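The template combination described above can be sketched as follows. The sketch assumes an in-memory dictionary in place of the database; the names `ChartTemplate`, `TextTemplate`, `PreliminaryReport`, `TEMPLATE_DB`, `retrieve_templates`, and `combine` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChartTemplate:
    name: str
    chart_kind: str  # e.g. "pie", "curve", "histogram"
    position: str    # e.g. "top", "middle", "end"


@dataclass(frozen=True)
class TextTemplate:
    name: str
    font_size: int
    color: str


@dataclass
class PreliminaryReport:
    report_type: str
    chart: ChartTemplate
    text: TextTemplate


# In-memory stand-in for the template database, keyed by report type.
TEMPLATE_DB = {
    "pie chart report": {
        "charts": [
            ChartTemplate("pie-top", "pie", "top"),
            ChartTemplate("pie-end", "pie", "end"),
        ],
        "texts": [
            TextTemplate("plain", 12, "black"),
            TextTemplate("emphasis", 14, "navy"),
        ],
    },
}


def retrieve_templates(report_type):
    """Return the candidate chart and text templates for a report type."""
    entry = TEMPLATE_DB[report_type]
    return entry["charts"], entry["texts"]


def combine(report_type, chart, text):
    """Combine the user's selected templates into a preliminary report."""
    return PreliminaryReport(report_type, chart, text)


charts, texts = retrieve_templates("pie chart report")
# The user's selection is simulated by picking fixed indices.
preliminary = combine("pie chart report", charts[0], texts[1])
print(preliminary)
```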
  • the step S5 of adjusting the preliminary report according to the information input by the current user to obtain the final report includes:
  • the preliminary report is adjusted according to the information input by the current user, thereby obtaining the final report.
  • preliminary reports have been obtained.
  • some detailed parameters and specific data content of the preliminary report are not supplemented.
  • adjustments are made according to the specific instructions of the current user, wherein the chart adjustment information includes adjusting the size of the chart and the data display parameters of the chart (such as the unit time length of the time axis); the chart data content information includes the chart data (such as the data points of the graph or the proportion of each block of the pie chart); and the adjustment information of the text part includes adjusting the font size and color. Then, the text content input by the current user is filled into the text part to obtain the final report.
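A minimal sketch of this adjustment step follows. It assumes the preliminary report is held as a nested dictionary; the function name `adjust_report` and all keys and values are hypothetical.

```python
import copy


def adjust_report(preliminary, chart_adjustments, chart_data,
                  text_adjustments, text_content):
    """Apply the current user's input to a copy of the preliminary report."""
    final = copy.deepcopy(preliminary)          # keep the stored template intact
    final["chart"].update(chart_adjustments)    # size, display parameters
    final["chart"]["data"] = list(chart_data)   # chart data content
    final["text"].update(text_adjustments)      # font size, color
    final["text"]["content"] = text_content     # fill in the text part
    return final


preliminary = {
    "type": "curve chart report",
    "chart": {"width": 400, "height": 300, "time_unit": "day", "data": []},
    "text": {"font_size": 12, "color": "black", "content": ""},
}

final = adjust_report(
    preliminary,
    chart_adjustments={"width": 640, "time_unit": "week"},
    chart_data=[(1, 10.2), (2, 11.0), (3, 10.7)],
    text_adjustments={"font_size": 14},
    text_content="Weekly closing prices rose slightly.",
)
print(final)
```

Working on a deep copy is a design choice of this sketch: it leaves the preset preliminary report unchanged so it can be reused for other users.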
  • the method for obtaining the report type prediction model includes:
  • the characteristic information of the user refers to information that can reflect the characteristics of the user, such as the occupation of the user, the type of report the user has used recently (within a specified time), the age of the user, and the gender of the user.
  • the machine learning in this embodiment uses a neural network model, such as the VGG-F model, VGG16 model, InceptionV3 model, Xception model, or AlexNet model; the neural network model is then trained with sample data that includes user feature information and the report type corresponding to the user feature information, wherein the more sample data there is, the more accurate the trained prediction model.
  • BP refers to the backpropagation rule.
  • the backpropagation rule is based on the gradient descent method. An n-input, m-output BP neural network is essentially a mapping: a continuous mapping from n-dimensional Euclidean space to a finite field in m-dimensional Euclidean space. Backpropagation is used to update the parameters of this mapping.
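The training procedure can be illustrated with a small, self-contained sketch of stochastic gradient descent. It assumes a single-layer softmax classifier, so the backward pass reduces to one chain-rule step; the deep networks named above propagate the same kind of gradient through many layers. The function names, learning rate, epoch count, and data are assumptions made for illustration and are not the application's implementation.

```python
import math
import random


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(weights, bias, x):
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def train_sgd(samples, n_features, n_classes, lr=0.5, epochs=200, seed=0):
    rng = random.Random(seed)
    weights = [[0.0] * n_features for _ in range(n_classes)]
    bias = [0.0] * n_classes
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)  # stochastic: one randomly ordered sample per update
        for x, label in data:
            probs = softmax(forward(weights, bias, x))
            for k in range(n_classes):
                # Gradient of the cross-entropy loss w.r.t. score k.
                delta = probs[k] - (1.0 if k == label else 0.0)
                for j in range(n_features):
                    weights[k][j] -= lr * delta * x[j]
                bias[k] -= lr * delta
    return weights, bias


def predict(weights, bias, x):
    scores = forward(weights, bias, x)
    return max(range(len(scores)), key=scores.__getitem__)


# One-hot occupation features; label 0 and label 1 stand for two report types.
samples = [([1.0, 0.0], 0), ([0.0, 1.0], 1)] * 5
weights, bias = train_sgd(samples, n_features=2, n_classes=2)
print(predict(weights, bias, [1.0, 0.0]), predict(weights, bias, [0.0, 1.0]))
```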
  • the step 203 of recording the preliminary training model as the report type prediction model includes:
  • S2031 Obtain a verification set including a specified amount of sample data, where the sample data of the verification set includes user feature information and a report type corresponding to the user feature information;
  • the preliminary training model is recorded as the report type prediction model.
  • the characteristic information of the user refers to information that can reflect the characteristics of the user, such as the occupation of the user, the type of report the user has used recently (within a specified time), the age of the user, and the gender of the user.
  • the sample data of the verification set is used to verify the preliminary training model, whose purpose is to predict the report type of the current user; the verification set is therefore preferably composed of sample data related to the current user, such as sample data with the same occupation and the same age.
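The verification step can be sketched as an accuracy check on the verification set. The text does not state the passing criterion, so the accuracy threshold `min_accuracy` below is an assumption, as are the function names and the sample data.

```python
def validation_accuracy(predict, verification_set):
    """Fraction of verification samples whose report type is predicted correctly."""
    if not verification_set:
        raise ValueError("verification set must not be empty")
    correct = sum(1 for features, expected in verification_set
                  if predict(features) == expected)
    return correct / len(verification_set)


def passes_verification(predict, verification_set, min_accuracy=0.9):
    # min_accuracy is an assumed criterion, not a value from the application.
    return validation_accuracy(predict, verification_set) >= min_accuracy


def toy_predict(features):
    """Hypothetical stand-in for the preliminary training model."""
    if features["occupation"] == "stock commentator":
        return "stock statement"
    return "financial statement"


verification_set = [
    ({"occupation": "stock commentator", "age": 30}, "stock statement"),
    ({"occupation": "stock commentator", "age": 31}, "stock statement"),
    ({"occupation": "stock commentator", "age": 30}, "statistical analysis report"),
    ({"occupation": "accountant", "age": 30}, "financial statement"),
]

accuracy = validation_accuracy(toy_predict, verification_set)
print(accuracy, passes_verification(toy_predict, verification_set, min_accuracy=0.7))
```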
  • the method for obtaining the report type prediction model includes:
  • S211 Obtain a specified amount of sample data, where the sample data includes user feature information and a report type corresponding to the user feature information;
  • the report type prediction model is obtained.
  • the decision tree is classified according to the user's characteristic information to predict the type of report that the user will adopt.
  • the characteristic information of the user refers to information that can reflect the characteristics of the user, such as the occupation of the user, the type of report the user has used recently (within a specified time), the age of the user, and the gender of the user.
  • the CHAID decision tree model refers to the decision tree model that uses the chi-square automatic interactive detection method CHAID.
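The core of CHAID is a chi-square test of independence between a candidate feature and the target. The sketch below computes only the Pearson chi-square statistic and its degrees of freedom; it omits the category merging and adjusted p-values of full CHAID. The function name `chi_square` and the sample data are hypothetical.

```python
from collections import Counter


def chi_square(feature_values, labels):
    """Pearson chi-square statistic and degrees of freedom for one feature."""
    observed = Counter(zip(feature_values, labels))
    row_totals = Counter(feature_values)
    col_totals = Counter(labels)
    n = len(labels)
    stat = 0.0
    for row, row_total in row_totals.items():
        for col, col_total in col_totals.items():
            expected = row_total * col_total / n
            stat += (observed[(row, col)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Invented sample: 20 users, two occupations, two report types.
occupations = ["analyst"] * 10 + ["accountant"] * 10
genders = ["f", "m"] * 10
report_types = (["stock"] * 8 + ["financial"] * 2
                + ["stock"] * 2 + ["financial"] * 8)

occupation_stat, occupation_dof = chi_square(occupations, report_types)
gender_stat, gender_dof = chi_square(genders, report_types)

# 3.841 is the chi-square critical value for 1 degree of freedom at the 0.05 level.
CRITICAL_VALUE_DOF1 = 3.841
print(occupation_stat > CRITICAL_VALUE_DOF1, gender_stat > CRITICAL_VALUE_DOF1)
```

In this invented sample, occupation is strongly associated with report type and would be chosen as the split, while gender is independent of it and would not.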
  • the recording of the preliminary CHAID decision tree as the report type prediction model further includes: verifying the preliminary CHAID decision tree using a verification set composed of sample data obtained in advance; if the verification passes, the preliminary CHAID decision tree is recorded as the report type prediction model.
  • the step S212 of inputting the sample data of the training set into the CHAID decision tree model to obtain a preliminary CHAID decision tree includes:
  • the modeling standard parameters include the maximum number of layers of the decision tree, the significance level for subdividing a parent node, the minimum number of samples contained in a parent node, and the minimum number of samples contained in a child node;
  • the CHAID decision tree model can only be determined by setting standard modeling parameters of the CHAID decision tree model.
  • the modeling standard parameters include the maximum number of layers of the decision tree, the significance level for subdividing a parent node, the minimum number of samples contained in a parent node, and the minimum number of samples contained in a child node. For example, the maximum number of layers of the tree is 3-5, the significance level for subdividing a parent node is 0.05, the minimum number of samples contained in a parent node is 100-200, and the minimum number of samples contained in a child node is 50-100.
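These parameters can be sketched as a small configuration object plus a split check. The defaults below are picked from within the example ranges in the text; the names `ChaidParams` and `may_split`, and the rule that a split requires a p-value at or below the significance level, are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChaidParams:
    max_depth: int = 4             # maximum number of tree layers (3-5 in the text)
    split_alpha: float = 0.05      # significance level for subdividing a parent node
    min_parent_samples: int = 100  # 100-200 in the text
    min_child_samples: int = 50    # 50-100 in the text


def may_split(params, depth, parent_size, child_sizes, p_value):
    """Return True when a node at the given depth may be subdivided."""
    return (
        depth < params.max_depth
        and parent_size >= params.min_parent_samples
        and p_value <= params.split_alpha
        and all(size >= params.min_child_samples for size in child_sizes)
    )


params = ChaidParams()
print(may_split(params, depth=1, parent_size=300, child_sizes=[180, 120], p_value=0.01))
print(may_split(params, depth=1, parent_size=300, child_sizes=[260, 40], p_value=0.01))
```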
  • the machine learning-based report generation method of this application uses a machine-learning-based report type prediction model to predict the type of report that the current user will use, and then retrieves a preset preliminary report from the database according to the report type.
  • the preliminary report is adjusted to obtain the final report, thereby avoiding the user's tedious operations, improving the user experience, and improving the efficiency of report completion.
  • an embodiment of the present application provides a report generation device based on machine learning, including:
  • the characteristic information obtaining unit 10 is configured to obtain characteristic information of the current user, and the characteristic information of the current user includes at least the occupation information of the current user;
  • a report type prediction model calculation unit 20 configured to input the feature information into a preset machine learning-based report type prediction model for calculation, wherein the report type prediction model is trained on training data composed of user feature information and the report type corresponding to the user feature information;
  • the report type prediction unit 30 is used to output the predicted report type to be used by the current user
  • the preliminary report retrieval unit 40 is configured to retrieve a preset preliminary report from the database according to the type of report to be used, wherein the type of the preliminary report is the same as the type of report to be used;
  • the final report obtaining unit 50 is configured to adjust the preliminary report according to the information input by the current user, thereby obtaining the final report.
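The way the five units hand results to one another can be sketched as a simple composition. Each unit is injected as a plain callable; the class name `ReportGenerationDevice`, the method `generate`, and the lambda stand-ins are hypothetical and only illustrate the data flow.

```python
class ReportGenerationDevice:
    """Chains the five units; each unit is injected as a plain callable."""

    def __init__(self, obtain_features, predict_type,
                 retrieve_preliminary, obtain_final):
        self.obtain_features = obtain_features            # unit 10
        self.predict_type = predict_type                  # units 20 and 30
        self.retrieve_preliminary = retrieve_preliminary  # unit 40
        self.obtain_final = obtain_final                  # unit 50

    def generate(self, account, user_input):
        features = self.obtain_features(account)
        report_type = self.predict_type(features)
        preliminary = self.retrieve_preliminary(report_type)
        return self.obtain_final(preliminary, user_input)


# Hypothetical stand-ins for each unit, for illustration only.
PRELIMINARY_REPORTS = {"stock statement": {"type": "stock statement", "text": ""}}

device = ReportGenerationDevice(
    obtain_features=lambda account: {"occupation": account["occupation"]},
    predict_type=lambda features: "stock statement",
    retrieve_preliminary=lambda report_type: dict(PRELIMINARY_REPORTS[report_type]),
    obtain_final=lambda preliminary, user_input: {**preliminary, "text": user_input},
)

final_report = device.generate({"occupation": "stock commentator"},
                               "Index closed higher.")
print(final_report)
```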
  • the characteristic information of the current user is obtained, and the characteristic information of the current user includes at least the occupation information of the current user.
  • the characteristic information of the current user refers to information that can reflect the characteristics of the current user, such as the occupation of the current user, the type of report used by the current user (within a specified time), the age of the current user, and the gender of the current user.
  • the occupation information of the current user has a greater relationship with the type of report that may be used, so the characteristic information of the current user includes at least the occupation information of the current user, so as to improve the accuracy of the prediction of the report type. For example, stock commentators in the financial industry have a higher probability of using stock statements.
  • the process of acquiring the characteristic information of the current user includes: extracting the characteristic information of the current user from the registered account information of the current user.
  • the feature information is input into a preset machine learning-based report type prediction model for calculation, wherein the report type prediction model is trained on user feature information and the report type corresponding to the user feature information.
  • the report type prediction model based on machine learning, through continuous self-learning, improves the accuracy of prediction and avoids predicting the wrong report type.
  • the report type prediction model can be generated based on any machine learning model, such as a neural network model and a classification tree model, and then trained through training data. Wherein, the report type prediction model is trained by training data composed of user characteristic information and the report type corresponding to the user characteristic information.
  • the user characteristic information may include any information that can reflect the characteristics of the user, such as the occupation of the user, the type of report recently used by the user (within a specified time), the age of the user, and the gender of the user. Among them, the user refers to a user who uses a report. Wherein, the report type prediction model is used to predict the report type to be used by the user based on the user characteristic information.
  • the predicted report type to be used by the current user is output.
  • the report types can be classified by any classification method. For example, they can be divided by chart type into pie chart reports (reports that contain a pie chart), curve chart reports, and histogram reports; or by content into financial statements and statistical analysis reports.
  • through the report type prediction model, the predicted report type to be used by the current user can be output.
  • a preset preliminary report is retrieved from the database, wherein the type of the preliminary report is the same as the type of the report to be used.
  • the preliminary reports for the pie chart report type can include three kinds of report: with the pie chart at the top of the report, in the middle of the report, or at the end of the report; the preliminary financial reports can include a report with a K chart, a report with a histogram, a report with a curve chart, and so on.
  • preliminary reports can be directly selected by the user, thereby eliminating the user's trivial process of creating reports from scratch, and improving the speed of report formation.
  • the preset preliminary report may be an already completed report, or a report combined from the chart template and text part template that the current user selects from multiple alternative chart templates and text part templates.
  • the preliminary report is adjusted according to the information input by the current user, thereby obtaining the final report. From the foregoing, a preliminary report is provided, and then the final report can be obtained based on the information input by the current user.
  • the information input by the current user includes at least one of information for adjusting the parameters of the preliminary report and text content information for describing the preliminary report.
  • the final report includes at least a chart part, and further, may include a text part.
  • the preliminary report retrieval unit 40 includes:
  • the chart template and text part template retrieval subunit is used to retrieve a plurality of preset chart templates and a plurality of preset text part templates from a preset database according to the report type to be used;
  • a combination subunit configured to combine the chart template and the text part template selected by the current user into the preliminary report
  • the calling subunit is used for calling the preliminary report.
  • the preset preliminary report is retrieved from the database.
  • the preliminary report is generated by combining chart templates (including chart style templates and chart data templates) with text part templates. Because different reports have different specific requirements, their format, layout, charts, and so on differ. By decomposing the report into a chart part and a text part, the chart templates and text part templates can be designed in advance and stored in the database. When a specific report is needed, the user only needs to choose from the existing chart templates and text part templates and combine them to form the preliminary report.
  • the final report obtaining unit 50 includes:
  • the final report obtaining subunit is used to fill the text content input by the current user into the text portion of the preliminary report to obtain the final report.
  • the preliminary report is adjusted according to the information input by the current user, thereby obtaining the final report.
  • preliminary reports have been obtained.
  • some detailed parameters and specific data content of the preliminary report are not supplemented.
  • the adjustment is made according to the specific instructions of the current user, wherein the chart adjustment information includes adjusting the size of the chart and the data display parameters of the chart (such as the unit time length of the time axis); the chart data content information includes the chart data (such as the data points of the graph or the proportion of each block of the pie chart); and the adjustment information of the text part includes adjusting the font size and color. Then, the text content input by the current user is filled into the text part to obtain the final report.
  • the device includes a report type prediction model acquisition unit, and the report type prediction model acquisition unit includes:
  • a training set acquisition subunit for acquiring a training set including a specified amount of sample data, wherein the sample data includes user feature information and a report type corresponding to the user feature information;
  • the neural network model training subunit is used to input the sample data of the training set into the neural network model for training, wherein the stochastic gradient descent method is used in the training process and the parameters of each layer of the neural network model are updated using the backpropagation rule, to obtain the preliminary training model;
  • the report type prediction model marking subunit is used to record the preliminary training model as the report type prediction model.
  • the characteristic information of the user refers to information that can reflect the characteristics of the user, such as the occupation of the user, the type of report the user has used recently (within a specified time), the age of the user, and the gender of the user.
  • the machine learning in this embodiment uses a neural network model, such as the VGG-F model, VGG16 model, InceptionV3 model, Xception model, or AlexNet model; the neural network model is then trained with sample data that includes user feature information and the report type corresponding to the user feature information, wherein the more sample data there is, the more accurate the trained prediction model.
  • BP refers to the backpropagation rule.
  • the backpropagation rule is based on the gradient descent method. An n-input, m-output BP neural network is essentially a mapping: a continuous mapping from n-dimensional Euclidean space to a finite field in m-dimensional Euclidean space. Backpropagation is used to update the parameters of this mapping.
  • the report type prediction model marking subunit includes:
  • a verification set obtaining module configured to obtain a verification set including a specified amount of sample data, wherein the sample data of the verification set includes user characteristic information and a report type corresponding to the user characteristic information;
  • a verification module configured to use the sample data of the verification set to verify the preliminary training model
  • the report type prediction model marking module is used to record the preliminary training model as the report type prediction model if the verification is passed.
  • the characteristic information of the user refers to information that can reflect the characteristics of the user, such as the occupation of the user, the type of report the user has used recently (within a specified time), the age of the user, and the gender of the user.
  • the sample data of the verification set is used to verify the preliminary training model, whose purpose is to predict the report type of the current user; the verification set is therefore preferably composed of sample data related to the current user, such as sample data with the same occupation and the same age.
  • the device includes a report type prediction model acquisition unit, and the report type prediction model acquisition unit includes:
  • a sample data obtaining subunit configured to obtain a specified amount of sample data, wherein the sample data includes user characteristic information and a report type corresponding to the user characteristic information;
  • the decision tree model training subunit is used to input the sample data of the training set into the CHAID decision tree model for training to obtain a preliminary CHAID decision tree;
  • the decision tree marking subunit is used to record the preliminary CHAID decision tree as the report type prediction model.
  • the report type prediction model is obtained.
  • the decision tree is classified according to the user's characteristic information to predict the type of report that the user will adopt.
  • the characteristic information of the user refers to information that can reflect the characteristics of the user, such as the occupation of the user, the type of report the user has used recently (within a specified time), the age of the user, and the gender of the user.
  • the CHAID decision tree model refers to the decision tree model that uses the chi-square automatic interactive detection method CHAID.
  • the recording of the preliminary CHAID decision tree as the report type prediction model further includes: verifying the preliminary CHAID decision tree using a verification set composed of sample data obtained in advance; if the verification passes, the preliminary CHAID decision tree is recorded as the report type prediction model.
  • the decision tree model training subunit includes:
  • the modeling standard parameter setting module is used to set the modeling standard parameters of the CHAID decision tree model, wherein the modeling standard parameters include the maximum number of layers of the decision tree, the significance level for subdividing a parent node, the minimum number of samples contained in a parent node, and the minimum number of samples contained in a child node;
  • the preliminary CHAID decision tree obtaining module is used for inputting the sample data of the training set into the CHAID decision tree model established by the chi-square automatic interactive detection method for training to obtain a preliminary CHAID decision tree.
  • the CHAID decision tree model can be determined only after its modeling standard parameters have been set.
  • the modeling standard parameters include the maximum number of layers of the decision tree, the significance level at which a parent node may be further split, the minimum number of samples contained in a parent node, and the minimum number of samples contained in a child node.
  • for example, the maximum number of layers of the tree is 3-5, the significance level for splitting a parent node is 0.05, the minimum number of samples contained in a parent node is 100-200, and the minimum number of samples contained in a child node is 50-100.
  • the report generation device based on machine learning of this application uses the report type prediction model based on machine learning to predict the report type that the current user will use, and then retrieves the preset preliminary report from the database according to the report type.
  • the preliminary report is adjusted to obtain the final report, thereby avoiding the user's tedious operations, improving the user experience, and improving the efficiency of report completion.
  • an embodiment of the present invention further provides a computer device.
  • the computer device may be a server, and its internal structure may be as shown in the figure.
  • the computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer-readable instructions, and a database.
  • the internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium.
  • the database of the computer device is used to store data used in the report generation method based on machine learning.
  • the network interface of the computer device is used to communicate with external terminals through a network connection. When the computer-readable instructions are executed by the processor, a report generation method based on machine learning is realized.
  • the above processor executes the above machine learning-based report generation method, including the following steps: acquiring the current user's feature information, wherein the current user's feature information includes at least the current user's occupation information; inputting the feature information into a preset machine learning-based report type prediction model for calculation, wherein the report type prediction model is trained on training data composed of user feature information and the report type corresponding to the user feature information; outputting the predicted report type that the current user will use; retrieving a preset preliminary report from the database according to the report type to be used, wherein the type of the preliminary report is the same as the report type to be used; and adjusting the preliminary report according to the information input by the current user to obtain the final report.
  • the step of retrieving a preset preliminary report from the database according to the report type to be used includes: retrieving a plurality of preset chart templates and a plurality of preset text part templates from a preset database according to the report type to be used; combining the chart template and the text part template selected by the current user into the preliminary report; and retrieving the preliminary report.
  • the step of adjusting the preliminary report according to the information input by the current user to obtain the final report includes: adjusting the chart and text parts in the preliminary report according to the chart adjustment information, chart data content information, and text part adjustment information input by the current user; and filling the text content input by the current user into the text part of the preliminary report to obtain the final report.
  • the method for obtaining the report type prediction model includes: obtaining a training set including a specified amount of sample data, wherein the sample data includes user feature information and the report type corresponding to the user feature information; inputting the sample data of the training set into a neural network model for training, wherein stochastic gradient descent is used during training and the parameters of each layer of the neural network model are updated by backpropagation, to obtain a preliminary training model; and recording the preliminary training model as the report type prediction model.
  • the step of recording the preliminary training model as the report type prediction model includes: obtaining a validation set including a specified amount of sample data, wherein the sample data of the validation set includes user feature information and the report type corresponding to the user feature information; validating the preliminary training model using the sample data of the validation set; and, if the validation passes, recording the preliminary training model as the report type prediction model.
  • the method for obtaining a report type prediction model includes: obtaining a specified amount of sample data, wherein the sample data includes user feature information and a report type corresponding to the user feature information;
  • the sample data of the training set is input into the CHAID decision tree model for training to obtain a preliminary CHAID decision tree; the preliminary CHAID decision tree is recorded as the report type prediction model.
  • the step of inputting the sample data of the training set into the CHAID decision tree model for training to obtain a preliminary CHAID decision tree includes: setting the modeling standard parameters of the CHAID decision tree model, wherein the modeling standard parameters include the maximum number of layers of the decision tree, the significance level at which a parent node may be further split, the minimum number of samples contained in a parent node, and the minimum number of samples contained in a child node; and inputting the sample data of the training set into the CHAID decision tree model established by the chi-square automatic interaction detection method for training to obtain the preliminary CHAID decision tree.
  • the computer device of this application uses the machine learning-based report type prediction model to predict the report type that the current user will use, retrieves the preset preliminary report from the database according to the report type, and adjusts the preliminary report to obtain the final report, thereby avoiding tedious user operations, improving the user experience, and improving the efficiency of report completion.
  • An embodiment of the present application also provides a computer-readable storage medium on which computer-readable instructions are stored.
  • when executed by a processor, the computer-readable instructions implement a machine learning-based report generation method including the following steps: acquiring the current user's feature information, wherein the feature information of the current user includes at least the occupation information of the current user; inputting the feature information into a preset machine learning-based report type prediction model for calculation, wherein the report type prediction model is trained on training data composed of user feature information and the report type corresponding to the user feature information; outputting the predicted report type that the current user will use; retrieving a preset preliminary report from the database according to the report type to be used, wherein the type of the preliminary report is the same as the report type to be used; and adjusting the preliminary report according to the information input by the current user to obtain the final report.
  • the computer-readable storage medium is, for example, a non-volatile computer-readable storage medium or a volatile computer-readable storage medium.
  • the step of retrieving a preset preliminary report from the database according to the report type to be used includes: retrieving a plurality of preset chart templates and a plurality of preset text part templates from a preset database according to the report type to be used; combining the chart template and the text part template selected by the current user into the preliminary report; and retrieving the preliminary report.
  • the step of adjusting the preliminary report according to the information input by the current user to obtain the final report includes: adjusting the chart and text parts in the preliminary report according to the chart adjustment information, chart data content information, and text part adjustment information input by the current user; and filling the text content input by the current user into the text part of the preliminary report to obtain the final report.
  • the method for obtaining the report type prediction model includes: obtaining a training set including a specified amount of sample data, wherein the sample data includes user feature information and the report type corresponding to the user feature information; inputting the sample data of the training set into a neural network model for training, wherein stochastic gradient descent is used during training and the parameters of each layer of the neural network model are updated by backpropagation, to obtain a preliminary training model; and recording the preliminary training model as the report type prediction model.
  • the step of recording the preliminary training model as the report type prediction model includes: obtaining a validation set including a specified amount of sample data, wherein the sample data of the validation set includes user feature information and the report type corresponding to the user feature information; validating the preliminary training model using the sample data of the validation set; and, if the validation passes, recording the preliminary training model as the report type prediction model.
  • the method for obtaining a report type prediction model includes: obtaining a specified amount of sample data, wherein the sample data includes user feature information and a report type corresponding to the user feature information;
  • the sample data of the training set is input into the CHAID decision tree model for training to obtain a preliminary CHAID decision tree; the preliminary CHAID decision tree is recorded as the report type prediction model.
  • the step of inputting the sample data of the training set into the CHAID decision tree model for training to obtain a preliminary CHAID decision tree includes: setting the modeling standard parameters of the CHAID decision tree model, wherein the modeling standard parameters include the maximum number of layers of the decision tree, the significance level at which a parent node may be further split, the minimum number of samples contained in a parent node, and the minimum number of samples contained in a child node; and inputting the sample data of the training set into the CHAID decision tree model established by the chi-square automatic interaction detection method for training to obtain the preliminary CHAID decision tree.
  • the computer-readable storage medium of the present application predicts the report type that the current user will use through a machine learning-based report type prediction model, retrieves the preset preliminary report from the database according to the report type, and adjusts the preliminary report to obtain the final report, thereby avoiding tedious user operations, improving the user experience, and improving the efficiency of report completion.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Operations Research (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Software Systems (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A machine learning-based report generating method and an apparatus, a computer device, and a storage medium, the method comprising: acquiring feature information of a current user; inputting the feature information into a preset machine learning-based report category prediction model and performing calculation on same, the report category prediction model having been trained on training data constituted by user feature information and report categories corresponding to the user feature information; outputting a predicted report category the current user will use; on the basis of the report category that will be used, retrieving a preset initial report from a database; and on the basis of information input by the current user, adjusting the initial report to obtain a final report. The report that the user requires is thereby correctly generated, reducing time and effort spent by the user in adjusting the report.

Description

Machine Learning-Based Report Generation Method, Device, and Computer Device
This application claims priority to the Chinese patent application filed with the China Patent Office on January 2, 2019, with application number 2019100029515 and entitled "Machine learning-based report generation method, device, and computer device", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of computers, and in particular to a machine learning-based report generation method, device, computer device, and storage medium.
Background
Different industries often require different kinds of reports, and even within the same industry different kinds of reports may be needed. The prior art cannot effectively predict what kind of report a user specifically needs, so the user often has to adjust the report type and report parameters manually. This requires tedious manual operations, reduces the user's work efficiency, and results in a very poor user experience. In addition, users of the prior art must adjust reports by hand to obtain the report they ultimately want; that is, there is no report template, or no suitable one, so whenever a new report is needed the user has to adjust the report extensively by hand. Therefore, prior-art report generation solutions cannot spare users the extra time and effort spent adjusting reports.
Technical Problem
The main purpose of the present application is to provide a machine learning-based report generation method, device, computer device, and storage medium that use templates to accurately generate the report a user needs and reduce the time and effort the user spends adjusting the report.
Technical Solution
To achieve the above purpose, this application proposes a machine learning-based report generation method, including the following steps:
acquiring feature information of the current user, wherein the feature information of the current user includes at least the occupation information of the current user;
inputting the feature information into a preset machine learning-based report type prediction model for calculation, wherein the report type prediction model is trained on training data composed of user feature information and the report type corresponding to the user feature information;
outputting the predicted report type that the current user will use;
retrieving a preset preliminary report from a database according to the report type to be used, wherein the type of the preliminary report is the same as the report type to be used;
adjusting the preliminary report according to the information input by the current user to obtain a final report.
Further, the step of retrieving a preset preliminary report from the database according to the report type to be used, wherein the type of the preliminary report is the same as the report type to be used, includes:
retrieving a plurality of preset chart templates and a plurality of preset text part templates from a preset database according to the report type to be used;
combining the chart template and the text part template selected by the current user into the preliminary report;
retrieving the preliminary report.
Further, the step of adjusting the preliminary report according to the information input by the current user to obtain the final report includes:
adjusting the chart and text parts in the preliminary report according to the chart adjustment information, chart data content information, and text part adjustment information input by the current user;
filling the text content input by the current user into the text part of the preliminary report to obtain the final report.
Further, the method for obtaining the report type prediction model includes:
obtaining a training set including a specified amount of sample data, wherein the sample data includes user feature information and the report type corresponding to the user feature information;
inputting the sample data of the training set into a neural network model for training, wherein stochastic gradient descent is used during training and the parameters of each layer of the neural network model are updated by backpropagation, to obtain a preliminary training model;
recording the preliminary training model as the report type prediction model.
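The training step above can be illustrated with a minimal sketch. It assumes a network with a single tanh hidden layer and a softmax output, trained by per-sample stochastic gradient descent with hand-written backpropagation. The layer sizes, learning rate, epoch count, occupations, and report types are hypothetical choices for illustration and are not specified by the application.

```python
import math
import random

def forward(model, x):
    """Return hidden activations and softmax class probabilities for one sample."""
    w1, b1, w2, b2 = model
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    z = [sum(w * hj for w, hj in zip(row, h)) + b for row, b in zip(w2, b2)]
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return h, [v / s for v in e]

def train_mlp(samples, n_in, n_hidden, n_out, lr=0.5, epochs=500, seed=0):
    """Train a one-hidden-layer classifier with SGD and backpropagation."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
    b2 = [0.0] * n_out
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)  # stochastic: visit samples in random order
        for x, y in data:
            h, p = forward((w1, b1, w2, b2), x)
            # Output-layer error for softmax with cross-entropy loss.
            d_out = [p[k] - (1.0 if k == y else 0.0) for k in range(n_out)]
            # Propagate the error back through w2 and the tanh derivative.
            d_hid = [(1.0 - h[j] ** 2) * sum(d_out[k] * w2[k][j] for k in range(n_out))
                     for j in range(n_hidden)]
            for k in range(n_out):
                for j in range(n_hidden):
                    w2[k][j] -= lr * d_out[k] * h[j]
                b2[k] -= lr * d_out[k]
            for j in range(n_hidden):
                for i in range(n_in):
                    w1[j][i] -= lr * d_hid[j] * x[i]
                b1[j] -= lr * d_hid[j]
    return w1, b1, w2, b2

def predict(model, x):
    """Return the index of the most probable report type."""
    _, p = forward(model, x)
    return max(range(len(p)), key=p.__getitem__)

# Hypothetical toy data: a one-hot encoded occupation mapped to a report type index.
OCCUPATIONS = ["stock commentator", "accountant", "statistician"]
REPORT_TYPES = ["stock report", "financial statement", "statistical analysis report"]
samples = [([1.0 if i == j else 0.0 for j in range(3)], i) for i in range(3)]

model = train_mlp(samples, n_in=3, n_hidden=4, n_out=3)
for (x, _), occupation in zip(samples, OCCUPATIONS):
    print(occupation, "->", REPORT_TYPES[predict(model, x)])
```

A real system would encode more features (age, gender, recently used report types) and would typically rely on an established training library rather than hand-written updates.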
Further, the step of recording the preliminary training model as the report type prediction model includes:
obtaining a validation set including a specified amount of sample data, wherein the sample data of the validation set includes user feature information and the report type corresponding to the user feature information;
validating the preliminary training model using the sample data of the validation set;
if the validation passes, recording the preliminary training model as the report type prediction model.
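The validation gate above can be sketched as follows. The application does not state what counts as passing, so the accuracy threshold used here (`min_accuracy`) is an assumption, as are the lookup-table stand-in for the trained model and the sample data.

```python
def validate(predict_fn, validation_set, min_accuracy=0.9):
    """Return (passed, accuracy) for a model on labelled validation samples.

    min_accuracy is a hypothetical pass criterion; the source only says
    that the model is accepted if validation passes.
    """
    if not validation_set:
        raise ValueError("validation set is empty")
    correct = sum(1 for features, report_type in validation_set
                  if predict_fn(features) == report_type)
    accuracy = correct / len(validation_set)
    return accuracy >= min_accuracy, accuracy

# Hypothetical stand-in for a preliminary training model.
LOOKUP = {"stock commentator": "stock report", "accountant": "financial statement"}

def preliminary_model(features):
    return LOOKUP.get(features["occupation"], "statistical analysis report")

validation_set = [
    ({"occupation": "stock commentator"}, "stock report"),
    ({"occupation": "accountant"}, "financial statement"),
    ({"occupation": "statistician"}, "statistical analysis report"),
    ({"occupation": "stock commentator"}, "stock report"),
    ({"occupation": "accountant"}, "financial statement"),
]

passed, accuracy = validate(preliminary_model, validation_set)
# Only a model that passes is recorded as the report type prediction model.
report_type_prediction_model = preliminary_model if passed else None
```

Keeping the validation samples separate from the training samples is what makes the check meaningful; a model scored on its own training data would pass trivially.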
Further, the method for obtaining the report type prediction model includes:
obtaining a specified amount of sample data, wherein the sample data includes user feature information and the report type corresponding to the user feature information;
inputting the sample data of the training set into a CHAID decision tree model for training to obtain a preliminary CHAID decision tree;
recording the preliminary CHAID decision tree as the report type prediction model.
Further, the step of inputting the sample data of the training set into the CHAID decision tree model for training to obtain a preliminary CHAID decision tree includes:
setting the modeling standard parameters of the CHAID decision tree model, wherein the modeling standard parameters include the maximum number of layers of the decision tree, the significance level at which a parent node may be further split, the minimum number of samples contained in a parent node, and the minimum number of samples contained in a child node;
inputting the sample data of the training set into the CHAID decision tree model established by the chi-square automatic interaction detection method for training to obtain the preliminary CHAID decision tree.
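The four modeling standard parameters above can be illustrated with a minimal sketch of how a single node might choose its split. The sketch runs a Pearson chi-square test of independence between each feature and the report type and keeps the most significant feature that satisfies the size limits. It is a simplification: full CHAID also merges similar categories and adjusts p-values for multiple comparisons, which is omitted here. The parameter values and the toy data are hypothetical.

```python
import math
from collections import Counter

# Hypothetical values for the four modeling standard parameters.
PARAMS = {"max_depth": 3, "alpha": 0.05, "min_parent": 100, "min_child": 50}

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with df degrees of freedom."""
    y = x / 2.0
    if df % 2:
        q, a = math.erfc(math.sqrt(y)), 0.5
    else:
        q, a = math.exp(-y), 1.0
    while a < df / 2.0:  # recurrence on the regularized incomplete gamma function
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def chi2_p_value(rows, feature, label="report_type"):
    """Pearson chi-square test of independence between one feature and the label."""
    cells = Counter((r[feature], r[label]) for r in rows)
    f_tot = Counter(r[feature] for r in rows)
    l_tot = Counter(r[label] for r in rows)
    df = (len(f_tot) - 1) * (len(l_tot) - 1)
    if df == 0:
        return 1.0
    n = len(rows)
    stat = sum((cells[(f, l)] - f_tot[f] * l_tot[l] / n) ** 2 / (f_tot[f] * l_tot[l] / n)
               for f in f_tot for l in l_tot)
    return chi2_sf(stat, df)

def best_split(rows, features, depth, params=PARAMS):
    """Return the feature to split on, or None if the node must stay a leaf."""
    if depth >= params["max_depth"] or len(rows) < params["min_parent"]:
        return None
    best, best_p = None, params["alpha"]
    for feat in features:
        child_sizes = Counter(r[feat] for r in rows).values()
        if min(child_sizes) < params["min_child"]:
            continue  # some child node would be too small
        p = chi2_p_value(rows, feat)
        if p < best_p:
            best, best_p = feat, p
    return best

# Toy data: occupation determines the report type, gender is unrelated to it.
rows = ([{"occupation": "analyst", "gender": "f", "report_type": "stock"}] * 60
        + [{"occupation": "analyst", "gender": "m", "report_type": "stock"}] * 60
        + [{"occupation": "accountant", "gender": "f", "report_type": "financial"}] * 60
        + [{"occupation": "accountant", "gender": "m", "report_type": "financial"}] * 60)
print(best_split(rows, ["occupation", "gender"], depth=0))
```

Growing a full tree would apply `best_split` recursively to each child node with `depth + 1` until it returns `None`.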
This application provides a machine learning-based report generation device, including:
a feature information acquisition unit, configured to acquire feature information of the current user, wherein the feature information of the current user includes at least the occupation information of the current user;
a report type prediction model operation unit, configured to input the feature information into a preset machine learning-based report type prediction model for calculation, wherein the report type prediction model is trained on training data composed of user feature information and the report type corresponding to the user feature information;
a report type prediction unit, configured to output the predicted report type that the current user will use;
a preliminary report retrieval unit, configured to retrieve a preset preliminary report from a database according to the report type to be used, wherein the type of the preliminary report is the same as the report type to be used;
a final report obtaining unit, configured to adjust the preliminary report according to the information input by the current user to obtain a final report.
This application provides a computer device, including a memory and a processor, wherein the memory stores computer-readable instructions and the processor, when executing the computer-readable instructions, implements the steps of any one of the methods described above.
This application provides a computer-readable storage medium on which computer-readable instructions are stored, wherein the computer-readable instructions, when executed by a processor, implement the steps of any one of the methods described above.
Beneficial Effects
The machine learning-based report generation method, device, computer device, and storage medium of the present application use a machine learning-based report type prediction model to predict the report type that the current user will use, retrieve a preset preliminary report from the database according to that report type, and adjust the preliminary report to obtain the final report, thereby avoiding tedious user operations, improving the user experience, and improving the efficiency of report completion.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a machine learning-based report generation method according to an embodiment of the present application;
FIG. 2 is a schematic structural block diagram of a machine learning-based report generation device according to an embodiment of the present application;
FIG. 3 is a schematic structural block diagram of a computer device according to an embodiment of the present application.
The realization of the purpose, the functional characteristics, and the advantages of the present application will be further described in conjunction with the embodiments and with reference to the drawings.
Best Mode for Carrying Out the Invention
To make the purpose, technical solutions, and advantages of the present application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it.
Referring to FIG. 1, an embodiment of the present application provides a machine learning-based report generation method, including the following steps:
S1. acquiring feature information of the current user, wherein the feature information of the current user includes at least the occupation information of the current user;
S2. inputting the feature information into a preset machine learning-based report type prediction model for calculation, wherein the report type prediction model is trained on training data composed of user feature information and the report type corresponding to the user feature information;
S3. outputting the predicted report type that the current user will use;
S4. retrieving a preset preliminary report from a database according to the report type to be used, wherein the type of the preliminary report is the same as the report type to be used;
S5. adjusting the preliminary report according to the information input by the current user to obtain a final report.
As described in step S1 above, the feature information of the current user is acquired, and it includes at least the occupation information of the current user. The feature information of the current user refers to information that reflects the characteristics of the current user, such as the current user's occupation, the report types the current user has used recently (within a specified time), the current user's age, and the current user's gender. Because the current user's occupation is strongly related to the report types the user is likely to use, the feature information includes at least the occupation information, which improves the accuracy of report type prediction. For example, a stock commentator in the financial industry has a relatively high probability of using stock reports. The process of acquiring the feature information of the current user includes: extracting the feature information of the current user from the registered account information of the current user.
As described in step S2 above, the feature information is input into a preset machine learning-based report type prediction model for calculation, wherein the report type prediction model is trained on training data composed of user feature information and the report type corresponding to the user feature information. Through continuous self-learning, the machine learning-based report type prediction model improves prediction accuracy so as to avoid predicting the wrong report type. The report type prediction model can be built on any machine learning model, such as a neural network model or a classification tree model, and then trained on the training data. The user feature information may include any information that reflects the characteristics of a user, such as the user's occupation, the report types the user has used recently (within a specified time), the user's age, and the user's gender. Here, the user refers to a user who uses reports. The report type prediction model is used to predict, from the user feature information, the report type that the user will use.
As described in step S3 above, the predicted report type that the current user will use is output. Report types may be defined by any classification scheme. For example, classified by the kind of chart they contain, reports can be divided into pie chart reports (reports that include a pie chart), line chart reports, and histogram reports; classified by purpose, they can be divided into financial statements (accounting), financial-industry reports, and statistical analysis reports. The report type prediction model outputs the predicted report type that the current user will use.
As described in step S4 above, a preset preliminary report is retrieved from the database according to the report type to be used, wherein the type of the preliminary report is the same as the report type to be used. The preset database stores preliminary reports of different report types in advance, and there may be several preliminary reports of the same report type for the user to choose from. For example, the preliminary pie chart reports may include three reports, with the pie chart at the top, in the middle, or at the end of the report; the preliminary financial reports may include a report with a K-line chart, a report with a histogram, a report with a line chart, and so on. The user can select these preliminary reports directly, which spares the user the tedious process of building a report from scratch and speeds up report creation. The preset preliminary report may be a complete report that has already been built, or it may be a report assembled from the exact chart template and text part template that the current user selects from several candidate chart templates and text part templates.
如上述步骤S5所述,根据所述当前用户输入的信息,调整所述初步报表,从而获得最终报表。由前述,提供了初步报表,再根据所述当前用户输入的信息,即可得到最终报表。其中,所述当前用户输入的信息,包括对初步报表的参数进行调整的信息、对初步报表进行描述的文字内容信息中的至少一种。其中,所述最终报表至少包括图表部分,进一步地,还可以包括文字部分。As described in step S5 above, according to the information input by the current user, the preliminary report is adjusted to obtain a final report. From the foregoing, a preliminary report is provided, and then the final report can be obtained based on the information input by the current user. Wherein, the information input by the current user includes at least one of information for adjusting the parameters of the preliminary report and text content information for describing the preliminary report. Wherein, the final report includes at least a chart part, and further, may include a text part.
In one embodiment, step S4 of retrieving a preset preliminary report from the database according to the report type to be used, where the type of the preliminary report is the same as the report type to be used, includes:
S401. Retrieve multiple preset chart templates and multiple preset text part templates from the preset database according to the report type to be used;
S402. Combine the chart template and the text part template selected by the current user into the preliminary report;
S403. Retrieve the preliminary report.
In this way, the preset preliminary report is retrieved from the database. The preliminary report is generated by combining a chart template (comprising a chart style template and a chart data template) with a text part template. Because different reports have different requirements, their format, layout, charts, and other aspects differ. The report is decomposed into a chart part and a text part, and chart templates and text part templates are designed in advance for each and stored in the database. When a particular report is needed, it suffices to select from the existing chart templates and text part templates and combine them to form the preliminary report.
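Steps S401 to S403 can be sketched as follows. This is a minimal illustration, not the application's implementation: the in-memory `TEMPLATE_DB` dictionary stands in for the preset database, and the class names, template names, and the index-based selection are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChartTemplate:
    report_type: str
    name: str


@dataclass(frozen=True)
class TextTemplate:
    report_type: str
    name: str


@dataclass(frozen=True)
class PreliminaryReport:
    report_type: str
    chart: ChartTemplate
    text: TextTemplate


# Hypothetical stand-in for the preset database, keyed by report type.
TEMPLATE_DB = {
    "pie_chart": {
        "charts": [
            ChartTemplate("pie_chart", "pie_top"),
            ChartTemplate("pie_chart", "pie_middle"),
            ChartTemplate("pie_chart", "pie_end"),
        ],
        "texts": [
            TextTemplate("pie_chart", "short_summary"),
            TextTemplate("pie_chart", "detailed_commentary"),
        ],
    },
}


def fetch_templates(report_type):
    """S401: retrieve the candidate chart and text part templates."""
    entry = TEMPLATE_DB[report_type]
    return entry["charts"], entry["texts"]


def build_preliminary_report(report_type, chart_choice, text_choice):
    """S402 and S403: combine the user's selections and return the result."""
    charts, texts = fetch_templates(report_type)
    return PreliminaryReport(report_type, charts[chart_choice], texts[text_choice])


# The user picks the second chart template and the first text part template.
report = build_preliminary_report("pie_chart", chart_choice=1, text_choice=0)
print(report.chart.name, report.text.name)
```

Keeping chart and text part templates as separate records is what lets the same text layout be paired with any chart placement of the same report type.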
In one embodiment, step S5 of adjusting the preliminary report according to the information input by the current user to obtain the final report includes:
S501. Adjust the chart and the text part in the preliminary report according to the chart adjustment information, chart data content information, and text part adjustment information input by the current user;
S502. Fill the text content input by the current user into the text part of the preliminary report to obtain the final report.
In this way, the preliminary report is adjusted according to the information input by the current user to obtain the final report. As described above, the preliminary report has already been obtained, but some of its detail parameters and its specific data content have not yet been filled in. Adjustments are therefore made according to the current user's specific instructions. The chart adjustment information includes adjustments to the chart size, the chart's data display parameters (such as the unit time length of the time axis), and so on; the chart data content information includes the chart data (such as the data points of a line chart or the share of each slice of a pie chart); the text part adjustment information includes adjustments to the font size, color, and so on. The text content input by the current user is then filled into the text part to obtain the final report.
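A minimal sketch of steps S501 and S502 is given below, assuming the preliminary report is held as a nested dictionary. The field names (`display`, `data`, `style`, `content`) and the function name `finalize_report` are hypothetical and chosen only for illustration.

```python
import copy


def finalize_report(preliminary, chart_adjust=None, chart_data=None,
                    text_adjust=None, text_content=""):
    """Apply the user's input to a copy of the preliminary report."""
    report = copy.deepcopy(preliminary)  # leave the stored template untouched
    # S501: chart adjustment, chart data content, and text part adjustment.
    report["chart"]["display"].update(chart_adjust or {})
    if chart_data is not None:
        report["chart"]["data"] = list(chart_data)
    report["text"]["style"].update(text_adjust or {})
    # S502: fill the user's text content into the text part.
    report["text"]["content"] = text_content
    return report


# Hypothetical preliminary report for a line chart.
preliminary = {
    "chart": {"kind": "line", "display": {"width": 400, "time_unit": "day"}, "data": []},
    "text": {"style": {"font_size": 12, "color": "black"}, "content": ""},
}

final = finalize_report(
    preliminary,
    chart_adjust={"time_unit": "week"},
    chart_data=[(1, 10.0), (2, 12.5)],
    text_adjust={"font_size": 14},
    text_content="Weekly revenue rose.",
)
print(final["chart"]["display"], final["text"]["content"])
```

Working on a deep copy means the same preliminary report can be reused for other users without carrying over earlier adjustments.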
In one embodiment, the method for obtaining the report type prediction model includes:
S201. Obtain a training set containing a specified amount of sample data, where the sample data includes user feature information and the report type corresponding to the user feature information;
S202. Input the sample data of the training set into a neural network model for training, where stochastic gradient descent is used during training and the parameters of each layer of the neural network model are updated by backpropagation, yielding a preliminary trained model;
S203. Take the preliminary trained model as the report type prediction model.
In this way, the report type prediction model is obtained. The user feature information is information that reflects the characteristics of a user, for example the user's occupation, the report types the user has used recently (within a specified time), the user's age, and the user's gender. The machine learning in this embodiment uses a neural network model, for example the VGG-F, VGG16, InceptionV3, Xception, or AlexNet model, which is trained with sample data comprising user feature information and the report type corresponding to that user feature information; in general, the more sample data, the more accurate the trained prediction model. When there is a large amount of sample data, training with stochastic gradient descent is preferred, that is, randomly sampled training data is used in place of the entire training set at each update, which speeds up training. Backpropagation (BP) builds on gradient descent and is used to update the parameters. A BP neural network is in essence a mapping: a BP neural network with n inputs and m outputs implements a continuous mapping from n-dimensional Euclidean space to a bounded region of m-dimensional Euclidean space.
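The training procedure of step S202 can be illustrated with a deliberately small example. The sketch below is not one of the architectures named above: it swaps in a tiny fully connected network with one hidden layer, written in plain Python, so that the stochastic gradient descent loop and the backpropagation update are visible in full. The one-hot occupation encoding, the three report type labels, the layer sizes, the learning rate, and all names are assumptions for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


class TinyMLP:
    """One tanh hidden layer followed by a softmax output layer."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        probs = softmax([sum(w * h for w, h in zip(row, hidden)) + b
                         for row, b in zip(self.w2, self.b2)])
        return hidden, probs

    def predict(self, x):
        probs = self.forward(x)[1]
        return probs.index(max(probs))

    def sgd_step(self, x, label, lr):
        """One backpropagation update for a single (features, label) sample."""
        hidden, probs = self.forward(x)
        # Gradient of the cross-entropy loss with respect to the output logits.
        d_out = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
        # Propagate back through the output weights and the tanh derivative.
        d_hid = [(1.0 - h * h) * sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
                 for j, h in enumerate(hidden)]
        for k, dk in enumerate(d_out):
            for j, h in enumerate(hidden):
                self.w2[k][j] -= lr * dk * h
            self.b2[k] -= lr * dk
        for j, dj in enumerate(d_hid):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * dj * xi
            self.b1[j] -= lr * dj


def train(model, samples, steps, lr, seed=0):
    """Stochastic gradient descent: each step uses one randomly drawn sample."""
    rng = random.Random(seed)
    for _ in range(steps):
        x, label = rng.choice(samples)
        model.sgd_step(x, label, lr)
    return model


# Hypothetical data: one-hot occupation -> index of the report type used.
SAMPLES = [([1.0, 0.0, 0.0], 0), ([0.0, 1.0, 0.0], 1), ([0.0, 0.0, 1.0], 2)]
model = train(TinyMLP(3, 4, 3, seed=1), SAMPLES, steps=3000, lr=0.2, seed=1)
print([model.predict(x) for x, _ in SAMPLES])
```

Each update touches only one randomly chosen sample rather than the whole training set, which is the speed advantage the paragraph above attributes to stochastic gradient descent.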
In one embodiment, step S203 of taking the preliminary trained model as the report type prediction model includes:
S2031. Obtain a validation set containing a specified amount of sample data, where the sample data of the validation set includes user feature information and the report type corresponding to the user feature information;
S2032. Validate the preliminary trained model with the sample data of the validation set;
S2033. If the validation passes, take the preliminary trained model as the report type prediction model.
In this way, the report type prediction model is obtained. The user feature information is information that reflects the characteristics of a user, for example the user's occupation, the report types the user has used recently (within a specified time), the user's age, and the user's gender. The sample data of the validation set serves to check whether the preliminary trained model can be used to predict the report type for the current user, so the validation set preferably consists of sample data related to the current user, for example samples with the same occupation and the same age. If the validation passes, the preliminary trained model is usable and is accordingly taken as the report type prediction model.
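The validation gate of steps S2031 to S2033 might look like the sketch below. The text does not say what counts as passing, so the accuracy threshold of 0.8 is purely an assumption, as are the function names, the dictionary-based feature records, and the stand-in `predict` function used in place of a trained model.

```python
def select_validation_samples(pool, current_user, keys=("occupation", "age")):
    """Prefer samples related to the current user (same occupation and age)."""
    return [(features, label) for features, label in pool
            if all(features.get(k) == current_user.get(k) for k in keys)]


def accuracy(predict, samples):
    """Fraction of samples whose predicted report type matches the label."""
    if not samples:
        raise ValueError("validation set is empty")
    correct = sum(1 for features, label in samples if predict(features) == label)
    return correct / len(samples)


def accept_model(predict, validation_samples, threshold=0.8):
    """S2033: accept the model only if validation passes (threshold assumed)."""
    return accuracy(predict, validation_samples) >= threshold


# Hypothetical sample pool and current user.
POOL = [
    ({"occupation": "stock_analyst", "age": 30}, "stock_report"),
    ({"occupation": "stock_analyst", "age": 30}, "stock_report"),
    ({"occupation": "accountant", "age": 45}, "financial_statement"),
]
current_user = {"occupation": "stock_analyst", "age": 30}


def predict(features):
    """Stand-in for the preliminary trained model."""
    if features["occupation"] == "stock_analyst":
        return "stock_report"
    return "pie_chart_report"


validation = select_validation_samples(POOL, current_user)
print(len(validation), accept_model(predict, validation))
```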
In one embodiment, the method for obtaining the report type prediction model includes:
S211. Obtain a specified amount of sample data, where the sample data includes user feature information and the report type corresponding to the user feature information;
S212. Input the sample data of the training set into a CHAID decision tree model for training to obtain a preliminary CHAID decision tree;
S213. Take the preliminary CHAID decision tree as the report type prediction model.
In this way, the report type prediction model is obtained. The decision tree classifies users according to the user feature information in order to predict the report type the user will adopt. The user feature information is information that reflects the characteristics of a user, for example the user's occupation, the report types the user has used recently (within a specified time), the user's age, and the user's gender. The CHAID decision tree model is a decision tree model that uses Chi-squared Automatic Interaction Detection (CHAID). The principle of the CHAID decision tree is briefly as follows: 1. Within each variable, merge the category values whose effects on the decision variable do not differ significantly; 2. Select the variable with the largest chi-square value as the splitting variable of the tree; 3. Repeat steps 1 and 2 until no variable can be selected whose chi-square value exceeds a given value, or the p-value is no longer below a given critical value, or the number of samples falls below a given number. The modeling criteria of the CHAID decision tree model are, for example: a maximum tree depth of 3 layers, a significance level of 0.05 for splitting a parent node, a minimum of 100 samples in a parent node, and a minimum of 50 samples in a child node. Further, taking the preliminary CHAID decision tree as the report type prediction model also includes: validating the preliminary CHAID decision tree with a previously obtained validation set composed of sample data; and, if the validation passes, taking the preliminary CHAID decision tree as the report type prediction model.
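Step 2 of the CHAID outline above, choosing the variable with the largest chi-square value, can be sketched in a few lines. This covers only that one step: category merging, p-value computation, and recursion are left out. The Pearson chi-square statistic is computed from the contingency table of one feature against the report type. The row format, field names, and sample rows are assumptions for illustration.

```python
from collections import Counter


def chi_square(rows, feature, target):
    """Pearson chi-square statistic of the feature x target contingency table."""
    n = len(rows)
    feature_counts = Counter(r[feature] for r in rows)
    target_counts = Counter(r[target] for r in rows)
    observed = Counter((r[feature], r[target]) for r in rows)
    stat = 0.0
    for f_value, f_count in feature_counts.items():
        for t_value, t_count in target_counts.items():
            expected = f_count * t_count / n
            stat += (observed[(f_value, t_value)] - expected) ** 2 / expected
    return stat


def best_split_feature(rows, features, target):
    """Pick the candidate feature with the largest chi-square statistic."""
    return max(features, key=lambda f: chi_square(rows, f, target))


# Hypothetical samples: occupation determines the report type, gender does not.
ROWS = [
    {"occupation": "analyst", "gender": "F", "report": "stock"},
    {"occupation": "analyst", "gender": "M", "report": "stock"},
    {"occupation": "analyst", "gender": "F", "report": "stock"},
    {"occupation": "analyst", "gender": "M", "report": "stock"},
    {"occupation": "accountant", "gender": "F", "report": "financial"},
    {"occupation": "accountant", "gender": "M", "report": "financial"},
    {"occupation": "accountant", "gender": "F", "report": "financial"},
    {"occupation": "accountant", "gender": "M", "report": "financial"},
]

print(best_split_feature(ROWS, ["occupation", "gender"], "report"))
```

On these rows the occupation table is perfectly diagonal and the gender table is perfectly balanced, so occupation is selected as the splitting variable.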
In one embodiment, step S212 of inputting the sample data of the training set into the CHAID decision tree model for training to obtain a preliminary CHAID decision tree includes:
S2121. Set the modeling standard parameters of the CHAID decision tree model, where the modeling standard parameters include the maximum number of layers of the decision tree, the significance level for splitting a parent node, the minimum number of samples in a parent node, and the minimum number of samples in a child node;
S2122. Input the sample data of the training set into the CHAID decision tree model, built with Chi-squared Automatic Interaction Detection, for training to obtain the preliminary CHAID decision tree.
In this way, the preliminary CHAID decision tree is obtained. The CHAID decision tree model is fully specified only once its modeling standard parameters are set. The modeling standard parameters include the maximum number of layers of the decision tree, the significance level for splitting a parent node, the minimum number of samples in a parent node, and the minimum number of samples in a child node; for example, a maximum tree depth of 3 to 5 layers, a significance level of 0.05 for splitting a parent node, a minimum of 100 to 200 samples in a parent node, and a minimum of 50 to 100 samples in a child node.
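The four modeling standard parameters act as stopping rules while the tree is grown. The sketch below groups them in one configuration object and checks whether a node may be split. The class and function names are hypothetical, and the depth convention (a node whose depth has reached the maximum is not split further) is an assumption, since the text does not define how layers are counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChaidParams:
    max_depth: int = 3             # maximum number of layers of the tree
    split_alpha: float = 0.05      # significance level for splitting a parent node
    min_parent_samples: int = 100  # minimum number of samples in a parent node
    min_child_samples: int = 50    # minimum number of samples in a child node


def may_split(params, depth, n_parent, child_sizes, p_value):
    """Return True only if every stopping rule allows the split."""
    if depth >= params.max_depth:
        return False
    if n_parent < params.min_parent_samples:
        return False
    if p_value >= params.split_alpha:
        return False
    return all(n >= params.min_child_samples for n in child_sizes)


params = ChaidParams()
print(may_split(params, depth=1, n_parent=240, child_sizes=[120, 120], p_value=0.01))
```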
The machine learning-based report generation method of this application uses a machine learning-based report type prediction model to predict the report type that the current user will use, retrieves a preset preliminary report from the database according to that report type, and adjusts the preliminary report to obtain the final report, thereby sparing the user tedious operations, improving the user experience, and making report completion more efficient.
Referring to FIG. 2, an embodiment of this application provides a machine learning-based report generation apparatus, including:
a feature information acquisition unit 10, configured to obtain feature information of the current user, where the feature information of the current user includes at least the occupation information of the current user;
a report type prediction model computation unit 20, configured to input the feature information into a preset machine learning-based report type prediction model for computation, where the report type prediction model is trained with training data composed of user feature information and the report type corresponding to the user feature information;
a report type prediction unit 30, configured to output the predicted report type that the current user will use;
a preliminary report retrieval unit 40, configured to retrieve a preset preliminary report from the database according to the report type to be used, where the type of the preliminary report is the same as the report type to be used;
a final report obtaining unit 50, configured to adjust the preliminary report according to the information input by the current user to obtain the final report.
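How the five units cooperate can be sketched as a small pipeline object. This is only one possible arrangement: each unit is modelled as a pluggable callable, units 20 and 30 are folded into a single `predict_type` callable for brevity, and every name, stub function, and sample value below is hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class ReportGenerationApparatus:
    get_features: Callable[[str], Dict[str, Any]]        # unit 10
    predict_type: Callable[[Dict[str, Any]], str]        # units 20 and 30
    fetch_preliminary: Callable[[str], Dict[str, Any]]   # unit 40
    finalize: Callable[[Dict[str, Any], Dict[str, Any]], Dict[str, Any]]  # unit 50

    def generate(self, user_id, user_input):
        features = self.get_features(user_id)
        report_type = self.predict_type(features)
        preliminary = self.fetch_preliminary(report_type)
        return self.finalize(preliminary, user_input)


# Hypothetical stand-ins for account storage, the model, and the database.
ACCOUNTS = {"u1": {"occupation": "stock_analyst"}}


def stub_predict(features):
    if features["occupation"] == "stock_analyst":
        return "stock_report"
    return "pie_chart_report"


apparatus = ReportGenerationApparatus(
    get_features=lambda user_id: ACCOUNTS[user_id],
    predict_type=stub_predict,
    fetch_preliminary=lambda report_type: {"type": report_type, "text": ""},
    finalize=lambda preliminary, user_input: {**preliminary, **user_input},
)

result = apparatus.generate("u1", {"text": "Q3 summary"})
print(result)
```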
As described for unit 10 above, the feature information of the current user is obtained, and it includes at least the occupation information of the current user. The feature information of the current user is information that reflects the characteristics of the current user, for example the current user's occupation, the report types the current user has used recently (within a specified time), the current user's age, and the current user's gender. The current user's occupation is strongly related to the report types the user is likely to use, so the feature information of the current user includes at least the occupation information in order to improve the accuracy of report type prediction. For example, a stock analyst in the finance industry is more likely to use stock reports. Obtaining the feature information of the current user includes: extracting the feature information of the current user from the current user's registered account information.
As described for unit 20 above, the feature information is input into a preset machine learning-based report type prediction model for computation, where the report type prediction model is trained with user feature information and the report type corresponding to the user feature information. A machine learning-based report type prediction model improves its prediction accuracy through continued learning, which helps avoid predicting the wrong report type. The report type prediction model may be built on any machine learning model, for example a neural network model or a classification tree model, and is then trained with training data. The report type prediction model is trained with training data composed of user feature information and the report type corresponding to the user feature information. The user feature information may include any information that reflects the characteristics of a user, for example the user's occupation, the report types the user has used recently (within a specified time), the user's age, and the user's gender. Here, the user is a user who uses reports. The report type prediction model is used to predict, from the user feature information, the report type the user is going to use.
As described for unit 30 above, the predicted report type that the current user will use is output. The report type may follow any classification scheme. For example, reports may be classified by the kind of chart they contain: pie chart reports (reports that include a pie chart), line chart reports, and histogram reports; or by purpose: financial statements (accounting), finance-sector reports, and statistical analysis reports. The report type prediction model outputs the predicted report type that the current user will use.
As described for unit 40 above, a preset preliminary report is retrieved from the database according to the report type to be used, where the type of the preliminary report is the same as the report type to be used. The preset database stores preliminary reports of different report types in advance, and there may be several preliminary reports of the same report type for the user to choose from. For example, preliminary pie chart reports may include three variants, with the pie chart at the top, in the middle, or at the end of the report; preliminary finance-sector reports may include a report with a candlestick (K-line) chart, a report with a histogram, a report with a line chart, and so on. The user can select these preliminary reports directly, which spares the user the tedious process of building a report from scratch and speeds up report creation. The preset preliminary report may be a complete, ready-made report, or a report assembled from the chart template and text part template that the current user selects from multiple candidate chart templates and text part templates.
As described for unit 50 above, the preliminary report is adjusted according to the information input by the current user to obtain the final report. With the preliminary report provided as described above, the final report is obtained from the information input by the current user. The information input by the current user includes at least one of: information for adjusting the parameters of the preliminary report, and text content describing the preliminary report. The final report includes at least a chart part and may further include a text part.
In one embodiment, the preliminary report retrieval unit 40 includes:
a chart template and text part template retrieval subunit, configured to retrieve multiple preset chart templates and multiple preset text part templates from the preset database according to the report type to be used;
a combination subunit, configured to combine the chart template and the text part template selected by the current user into the preliminary report;
a retrieval subunit, configured to retrieve the preliminary report.
In this way, the preset preliminary report is retrieved from the database. The preliminary report is generated by combining a chart template (comprising a chart style template and a chart data template) with a text part template. Because different reports have different requirements, their format, layout, charts, and other aspects differ. The report is decomposed into a chart part and a text part, and chart templates and text part templates are designed in advance for each and stored in the database. When a particular report is needed, it suffices to select from the existing chart templates and text part templates and combine them to form the preliminary report.
In one embodiment, the final report obtaining unit 50 includes:
an adjustment subunit, configured to adjust the chart and the text part in the preliminary report according to the chart adjustment information, chart data content information, and text part adjustment information input by the current user;
a final report obtaining subunit, configured to fill the text content input by the current user into the text part of the preliminary report to obtain the final report.
In this way, the preliminary report is adjusted according to the information input by the current user to obtain the final report. As described above, the preliminary report has already been obtained, but some of its detail parameters and its specific data content have not yet been filled in. Adjustments are therefore made according to the current user's specific instructions. The chart adjustment information includes adjustments to the chart size, the chart's data display parameters (such as the unit time length of the time axis), and so on; the chart data content information includes the chart data (such as the data points of a line chart or the share of each slice of a pie chart); the text part adjustment information includes adjustments to the font size, color, and so on. The text content input by the current user is then filled into the text part to obtain the final report.
In one embodiment, the apparatus includes a report type prediction model acquisition unit, which includes:
a training set acquisition subunit, configured to obtain a training set containing a specified amount of sample data, where the sample data includes user feature information and the report type corresponding to the user feature information;
a neural network model training subunit, configured to input the sample data of the training set into a neural network model for training, where stochastic gradient descent is used during training and the parameters of each layer of the neural network model are updated by backpropagation, yielding a preliminary trained model;
a report type prediction model marking subunit, configured to take the preliminary trained model as the report type prediction model.
In this way, the report type prediction model is obtained. The user feature information is information that reflects the characteristics of a user, for example the user's occupation, the report types the user has used recently (within a specified time), the user's age, and the user's gender. The machine learning in this embodiment uses a neural network model, for example the VGG-F, VGG16, InceptionV3, Xception, or AlexNet model, which is trained with sample data comprising user feature information and the report type corresponding to that user feature information; in general, the more sample data, the more accurate the trained prediction model. When there is a large amount of sample data, training with stochastic gradient descent is preferred, that is, randomly sampled training data is used in place of the entire training set at each update, which speeds up training. Backpropagation (BP) builds on gradient descent and is used to update the parameters. A BP neural network is in essence a mapping: a BP neural network with n inputs and m outputs implements a continuous mapping from n-dimensional Euclidean space to a bounded region of m-dimensional Euclidean space.
In one embodiment, the report type prediction model marking subunit includes:
a validation set acquisition module, configured to obtain a validation set containing a specified amount of sample data, where the sample data of the validation set includes user feature information and the report type corresponding to the user feature information;
a validation module, configured to validate the preliminary trained model with the sample data of the validation set;
a report type prediction model marking module, configured to take the preliminary trained model as the report type prediction model if the validation passes.
In this way, the report type prediction model is obtained. The user feature information is information that reflects the characteristics of a user, for example the user's occupation, the report types the user has used recently (within a specified time), the user's age, and the user's gender. The sample data of the validation set serves to check whether the preliminary trained model can be used to predict the report type for the current user, so the validation set preferably consists of sample data related to the current user, for example samples with the same occupation and the same age. If the validation passes, the preliminary trained model is usable and is accordingly taken as the report type prediction model.
在一个实施方式中,所述装置包括报表类型预测模型获取单元,所述报表类型预测模型获取单元,包括:In one embodiment, the device includes a report type prediction model acquisition unit, and the report type prediction model acquisition unit includes:
样本数据获取子单元,用于获取指定量的样本数据,其中,所述样本数据包括用户特征信息,以及与所述用户特征信息对应的报表类型;A sample data obtaining subunit, configured to obtain a specified amount of sample data, wherein the sample data includes user characteristic information and a report type corresponding to the user characteristic information;
决策树模型训练子单元,用于将所述训练集的样本数据输入到CHAID决策树模型中进行训练,得到初步CHAID决策树;The decision tree model training subunit is used to input the sample data of the training set into the CHAID decision tree model for training to obtain a preliminary CHAID decision tree;
决策树标记子单元,用于将所述初步CHAID决策树记为所述报表类型预测模型。The decision tree marking subunit is used to record the preliminary CHAID decision tree as the report type prediction model.
As described above, the report type prediction model is obtained. The decision tree classifies users according to their feature information in order to predict the report type a user will adopt. The feature information of a user is information that reflects the user's characteristics, such as the user's occupation, the report types the user has used recently (within a specified time), the user's age, and the user's gender. The CHAID decision tree model is a decision tree model that uses the chi-square automatic interaction detection (CHAID) method. The principle of the CHAID decision tree is, briefly: 1. merge the category values within a group whose effects on the decision variable do not differ significantly; 2. select the variable with the largest chi-square value as the splitting variable of the tree; 3. repeat steps 1 and 2 until no variable can be selected whose chi-square value exceeds a given value, or the P value is no longer below a given critical value, or the number of samples falls below a given number. The modeling criteria of the CHAID decision tree model are, for example: a maximum tree depth of 3 levels, a significance level of 0.05 for splitting a parent node, a minimum of 100 samples in a parent node, and a minimum of 50 samples in a child node.
Further, recording the preliminary CHAID decision tree as the report type prediction model also includes: validating the preliminary CHAID decision tree using a previously obtained validation set composed of sample data; and, if the validation passes, recording the preliminary CHAID decision tree as the report type prediction model.
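The split-selection step of the principle above (step 2) can be illustrated with a minimal Python sketch. It assumes categorical features stored in dictionaries; the function names `chi_square` and `best_split_feature` and the toy records are hypothetical and not taken from the application. The category-merging step (step 1) and the P-value stopping test are omitted for brevity.

```python
from collections import Counter

def chi_square(rows, feature, target):
    """Pearson chi-square statistic of the feature-by-target contingency table."""
    n = len(rows)
    feature_counts = Counter(r[feature] for r in rows)
    target_counts = Counter(r[target] for r in rows)
    cell_counts = Counter((r[feature], r[target]) for r in rows)
    stat = 0.0
    for f_value, f_n in feature_counts.items():
        for t_value, t_n in target_counts.items():
            expected = f_n * t_n / n  # count expected under independence
            observed = cell_counts.get((f_value, t_value), 0)
            stat += (observed - expected) ** 2 / expected
    return stat

def best_split_feature(rows, features, target):
    """Pick the feature with the largest chi-square value (step 2)."""
    return max(features, key=lambda f: chi_square(rows, f, target))

# Hypothetical sample data: user features plus the report type each user used.
rows = [
    {"occupation": "analyst", "gender": "F", "report_type": "financial"},
    {"occupation": "analyst", "gender": "M", "report_type": "financial"},
    {"occupation": "sales", "gender": "F", "report_type": "sales"},
    {"occupation": "sales", "gender": "M", "report_type": "sales"},
]
split = best_split_feature(rows, ["gender", "occupation"], "report_type")
```

In this toy data the occupation fully determines the report type while gender is independent of it, so occupation is chosen as the splitting variable. A full CHAID implementation would typically compare adjusted P values rather than raw statistics, because features with more categories have more degrees of freedom.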
In one embodiment, the decision tree model training subunit includes:
A modeling standard parameter setting module, configured to set the modeling standard parameters of the CHAID decision tree model, the modeling standard parameters including the maximum number of levels of the decision tree, the significance level for splitting a parent node, the minimum number of samples contained in a parent node, and the minimum number of samples contained in a child node;
A preliminary CHAID decision tree obtaining module, configured to input the sample data of the training set into the CHAID decision tree model built with the chi-square automatic interaction detection method for training, to obtain a preliminary CHAID decision tree.
As described above, the preliminary CHAID decision tree is obtained. The CHAID decision tree model can be determined only after its modeling standard parameters have been set. The modeling standard parameters include the maximum number of levels of the decision tree, the significance level for splitting a parent node, the minimum number of samples contained in a parent node, and the minimum number of samples contained in a child node; for example, a maximum tree depth of 3 to 5 levels, a significance level of 0.05 for splitting a parent node, a minimum of 100 to 200 samples in a parent node, and a minimum of 50 to 100 samples in a child node.
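As an illustration of how the four modeling standard parameters could act as stopping rules, the following Python sketch groups them in a configuration object. The class name `ChaidParams`, the function `may_split`, and the convention that the root node has depth 0 are assumptions made for this example; the default values are the example values given above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ChaidParams:
    max_depth: int = 3              # maximum number of levels of the tree
    split_alpha: float = 0.05       # significance level for splitting a parent node
    min_parent_samples: int = 100   # minimum samples a parent node must contain
    min_child_samples: int = 50     # minimum samples each child node must contain

def may_split(params, depth, p_value, parent_size, child_sizes):
    """Return True only if every stopping rule allows another split.

    depth is the depth of the node being split (assumed: root = 0).
    """
    return (
        depth < params.max_depth
        and p_value < params.split_alpha
        and parent_size >= params.min_parent_samples
        and all(size >= params.min_child_samples for size in child_sizes)
    )

params = ChaidParams()
allowed = may_split(params, depth=1, p_value=0.01, parent_size=200, child_sizes=[120, 80])
```

Keeping the parameters in one immutable object makes it easy to train trees under different settings (for example, 3 versus 5 levels) and compare them on the validation set.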
The machine-learning-based report generation apparatus of this application uses a machine-learning-based report type prediction model to predict the report type the current user will use, retrieves a preset preliminary report from the database according to that report type, and adjusts the preliminary report to obtain the final report, thereby sparing the user tedious operations, improving the user experience, and increasing the efficiency of report completion.
Referring to FIG. 3, an embodiment of the present invention further provides a computer device. The computer device may be a server, and its internal structure may be as shown in the figure. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The database of the computer device stores the data used by the machine-learning-based report generation method. The network interface of the computer device is used to communicate with external terminals over a network connection. When executed by the processor, the computer-readable instructions implement a machine-learning-based report generation method.
The processor executes the machine-learning-based report generation method, which includes the following steps: acquiring feature information of the current user, the feature information of the current user including at least occupation information of the current user; inputting the feature information into a preset machine-learning-based report type prediction model for computation, wherein the report type prediction model is trained on training data composed of user feature information and the report types corresponding to that user feature information; outputting the predicted report type that the current user will use; retrieving a preset preliminary report from the database according to the report type to be used, wherein the type of the preliminary report is the same as the report type to be used; and adjusting the preliminary report according to information input by the current user to obtain the final report.
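The steps of the method can be sketched end to end in Python. This is a minimal illustration, not the implementation of the application: the function `generate_report`, the stand-in predictor `stub_predict`, and the dictionary used in place of the database are all hypothetical.

```python
def generate_report(features, predict_type, preliminary_reports, user_input):
    """Predict the report type, fetch the preset report, apply the user's input."""
    report_type = predict_type(features)                  # predicted report type
    preliminary = dict(preliminary_reports[report_type])  # copy of the preset preliminary report
    return {**preliminary, **user_input}                  # user input overrides preset fields

def stub_predict(features):
    # Stand-in for the trained report type prediction model (assumption).
    return "financial" if features["occupation"] == "accountant" else "sales"

# A dictionary standing in for the database of preset preliminary reports.
preliminary_reports = {
    "financial": {"type": "financial", "chart": "bar", "text": ""},
    "sales": {"type": "sales", "chart": "line", "text": ""},
}

final_report = generate_report(
    {"occupation": "accountant"}, stub_predict, preliminary_reports, {"text": "Q3 summary"}
)
```

Passing the predictor in as a function keeps the flow independent of whether the model is a neural network or a CHAID decision tree.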
In one embodiment, the step of retrieving a preset preliminary report from the database according to the report type to be used, wherein the type of the preliminary report is the same as the report type to be used, includes: retrieving a plurality of preset chart templates and a plurality of preset text-section templates from a preset database according to the report type to be used; combining the chart template and the text-section template selected by the current user into the preliminary report; and retrieving the preliminary report.
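The template-combination step might look as follows in Python. The layout of `template_db` and the function name `build_preliminary_report` are assumptions for illustration; the application does not specify how templates are stored.

```python
def build_preliminary_report(report_type, template_db, chart_choice, text_choice):
    """Combine the user's chosen chart and text-section templates into a preliminary report."""
    templates = template_db[report_type]       # templates preset for this report type
    chart = templates["charts"][chart_choice]  # chart template selected by the user
    text = templates["texts"][text_choice]     # text-section template selected by the user
    return {"type": report_type, "chart": chart, "text_template": text}

# Hypothetical template store keyed by report type.
template_db = {
    "financial": {
        "charts": {"bar": "<bar chart>", "pie": "<pie chart>"},
        "texts": {"summary": "Summary: ...", "detail": "Details: ..."},
    }
}

preliminary = build_preliminary_report("financial", template_db, "pie", "summary")
```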
In one embodiment, the step of adjusting the preliminary report according to the information input by the current user to obtain the final report includes: adjusting the charts and text sections of the preliminary report according to chart adjustment information, chart data content information, and text-section adjustment information input by the current user; and filling the text content input by the current user into the text section of the preliminary report to obtain the final report.
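A minimal Python sketch of this adjustment step is shown below. The nested-dictionary report layout and the function name `finalize_report` are hypothetical; they only illustrate the order of operations (adjust the chart and text section, then fill in the text content).

```python
import copy

def finalize_report(preliminary, chart_adjust, chart_data, text_adjust, text_content):
    """Apply the user's adjustments to a copy of the preliminary report."""
    report = copy.deepcopy(preliminary)      # leave the preset report untouched
    report["chart"].update(chart_adjust)     # chart adjustment information
    report["chart"]["data"] = chart_data     # chart data content information
    report["text"].update(text_adjust)       # text-section adjustment information
    report["text"]["content"] = text_content # text content entered by the user
    return report

preliminary = {
    "chart": {"kind": "bar", "title": ""},
    "text": {"font_size": 12, "content": ""},
}
final = finalize_report(
    preliminary, {"title": "Revenue"}, [1, 2, 3], {"font_size": 14}, "Revenue grew."
)
```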
In one embodiment, the method for obtaining the report type prediction model includes: acquiring a training set that includes a specified amount of sample data, wherein the sample data includes user feature information and the report type corresponding to that user feature information; inputting the sample data of the training set into a neural network model for training, wherein stochastic gradient descent is used during training and the parameters of each layer of the neural network model are updated by backpropagation, to obtain a preliminary trained model; and recording the preliminary trained model as the report type prediction model.
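The training procedure can be illustrated with a deliberately small Python sketch. As a simplifying assumption it uses a single-layer softmax classifier without bias terms, so the "backpropagation" reduces to one gradient step per sample; in a multi-layer network the same error term would be propagated backward through each layer. The one-hot occupation encoding, the function names, and the hyperparameters are all hypothetical.

```python
import math
import random

def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(weights, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def train_sgd(samples, n_features, n_classes, lr=0.5, epochs=20, seed=0):
    """Train a single-layer softmax classifier with stochastic gradient descent."""
    rng = random.Random(seed)
    weights = [[0.0] * n_features for _ in range(n_classes)]
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)  # visit samples in random order, one update per sample
        for x, label in data:
            probs = softmax(class_scores(weights, x))
            for c in range(n_classes):
                # Gradient of the cross-entropy loss with respect to the score of class c
                grad = probs[c] - (1.0 if c == label else 0.0)
                for j in range(n_features):
                    weights[c][j] -= lr * grad * x[j]
    return weights

def predict(weights, x):
    scores = class_scores(weights, x)
    return max(range(len(scores)), key=scores.__getitem__)

# One-hot occupation features -> index of the report type used (toy data).
samples = [([1, 0, 0], 0), ([0, 1, 0], 1), ([0, 0, 1], 2)]
weights = train_sgd(samples, n_features=3, n_classes=3)
predictions = [predict(weights, x) for x, _ in samples]
```

After training, each one-hot occupation vector is mapped to the report type it was paired with in the toy training set.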
In one embodiment, the step of recording the preliminary trained model as the report type prediction model includes: acquiring a validation set that includes a specified amount of sample data, wherein the sample data of the validation set includes user feature information and the report type corresponding to that user feature information; validating the preliminary trained model using the sample data of the validation set; and, if the validation passes, recording the preliminary trained model as the report type prediction model.
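The application does not state what "validation passes" means. One plausible reading, sketched below in Python, is an accuracy threshold on the validation set; the threshold value, the function name `validate_model`, and the stand-in predictor are assumptions.

```python
def validate_model(predict_fn, validation_set, min_accuracy=0.9):
    """Return (passed, accuracy) for a model evaluated on held-out samples."""
    correct = sum(
        1 for features, report_type in validation_set
        if predict_fn(features) == report_type
    )
    accuracy = correct / len(validation_set)
    return accuracy >= min_accuracy, accuracy

def stub_predict(features):
    # Stand-in for the preliminary trained model (assumption).
    return "financial" if features["occupation"] == "accountant" else "sales"

validation_set = [
    ({"occupation": "accountant"}, "financial"),
    ({"occupation": "salesperson"}, "sales"),
    ({"occupation": "manager"}, "sales"),
    ({"occupation": "auditor"}, "financial"),  # misclassified by the stub
]
passed, accuracy = validate_model(stub_predict, validation_set, min_accuracy=0.7)
```

Only a model for which `passed` is true would be recorded as the report type prediction model; otherwise training would be repeated or the model adjusted.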
In one embodiment, the method for obtaining the report type prediction model includes: acquiring a specified amount of sample data, wherein the sample data includes user feature information and the report type corresponding to that user feature information; inputting the sample data of the training set into a CHAID decision tree model for training to obtain a preliminary CHAID decision tree; and recording the preliminary CHAID decision tree as the report type prediction model.
In one embodiment, the step of inputting the sample data of the training set into the CHAID decision tree model for training to obtain a preliminary CHAID decision tree includes: setting the modeling standard parameters of the CHAID decision tree model, the modeling standard parameters including the maximum number of levels of the decision tree, the significance level for splitting a parent node, the minimum number of samples contained in a parent node, and the minimum number of samples contained in a child node; and inputting the sample data of the training set into the CHAID decision tree model built with the chi-square automatic interaction detection method for training, to obtain a preliminary CHAID decision tree.
Those skilled in the art will understand that the structure shown in the figure is merely a block diagram of the part of the structure relevant to the solution of this application and does not limit the computer device to which the solution is applied.
The computer device of this application uses a machine-learning-based report type prediction model to predict the report type the current user will use, retrieves a preset preliminary report from the database according to that report type, and adjusts the preliminary report to obtain the final report, thereby sparing the user tedious operations, improving the user experience, and increasing the efficiency of report completion.
An embodiment of this application further provides a computer-readable storage medium storing computer-readable instructions which, when executed by a processor, implement a machine-learning-based report generation method including the following steps: acquiring feature information of the current user, the feature information of the current user including at least occupation information of the current user; inputting the feature information into a preset machine-learning-based report type prediction model for computation, wherein the report type prediction model is trained on training data composed of user feature information and the report types corresponding to that user feature information; outputting the predicted report type that the current user will use; retrieving a preset preliminary report from the database according to the report type to be used, wherein the type of the preliminary report is the same as the report type to be used; and adjusting the preliminary report according to information input by the current user to obtain the final report. The computer-readable storage medium is, for example, a non-volatile computer-readable storage medium or a volatile computer-readable storage medium.
In one embodiment, the step of retrieving a preset preliminary report from the database according to the report type to be used, wherein the type of the preliminary report is the same as the report type to be used, includes: retrieving a plurality of preset chart templates and a plurality of preset text-section templates from a preset database according to the report type to be used; combining the chart template and the text-section template selected by the current user into the preliminary report; and retrieving the preliminary report.
In one embodiment, the step of adjusting the preliminary report according to the information input by the current user to obtain the final report includes: adjusting the charts and text sections of the preliminary report according to chart adjustment information, chart data content information, and text-section adjustment information input by the current user; and filling the text content input by the current user into the text section of the preliminary report to obtain the final report.
In one embodiment, the method for obtaining the report type prediction model includes: acquiring a training set that includes a specified amount of sample data, wherein the sample data includes user feature information and the report type corresponding to that user feature information; inputting the sample data of the training set into a neural network model for training, wherein stochastic gradient descent is used during training and the parameters of each layer of the neural network model are updated by backpropagation, to obtain a preliminary trained model; and recording the preliminary trained model as the report type prediction model.
In one embodiment, the step of recording the preliminary trained model as the report type prediction model includes: acquiring a validation set that includes a specified amount of sample data, wherein the sample data of the validation set includes user feature information and the report type corresponding to that user feature information; validating the preliminary trained model using the sample data of the validation set; and, if the validation passes, recording the preliminary trained model as the report type prediction model.
In one embodiment, the method for obtaining the report type prediction model includes: acquiring a specified amount of sample data, wherein the sample data includes user feature information and the report type corresponding to that user feature information; inputting the sample data of the training set into a CHAID decision tree model for training to obtain a preliminary CHAID decision tree; and recording the preliminary CHAID decision tree as the report type prediction model.
In one embodiment, the step of inputting the sample data of the training set into the CHAID decision tree model for training to obtain a preliminary CHAID decision tree includes: setting the modeling standard parameters of the CHAID decision tree model, the modeling standard parameters including the maximum number of levels of the decision tree, the significance level for splitting a parent node, the minimum number of samples contained in a parent node, and the minimum number of samples contained in a child node; and inputting the sample data of the training set into the CHAID decision tree model built with the chi-square automatic interaction detection method for training, to obtain a preliminary CHAID decision tree.
The computer-readable storage medium of this application uses a machine-learning-based report type prediction model to predict the report type the current user will use, retrieves a preset preliminary report from the database according to that report type, and adjusts the preliminary report to obtain the final report, thereby sparing the user tedious operations, improving the user experience, and increasing the efficiency of report completion.
A person of ordinary skill in the art will understand that all or part of the processes of the methods in the above embodiments can be completed by computer-readable instructions that instruct the relevant hardware. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, a database, or other media provided in this application and used in the embodiments may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It should be noted that, in this document, the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, apparatus, article, or method that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, apparatus, article, or method. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, apparatus, article, or method that includes that element.
The above are only preferred embodiments of this application and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of this application.

Claims (20)

  1. A machine-learning-based report generation method, characterized by comprising:
    acquiring feature information of the current user, the feature information of the current user including at least occupation information of the current user;
    inputting the feature information into a preset machine-learning-based report type prediction model for computation, wherein the report type prediction model is trained on training data composed of user feature information and the report types corresponding to that user feature information;
    outputting the predicted report type that the current user will use;
    retrieving a preset preliminary report from a database according to the report type to be used, wherein the type of the preliminary report is the same as the report type to be used;
    adjusting the preliminary report according to information input by the current user to obtain a final report.
  2. The machine-learning-based report generation method according to claim 1, characterized in that the step of retrieving a preset preliminary report from the database according to the report type to be used, wherein the type of the preliminary report is the same as the report type to be used, comprises:
    retrieving a plurality of preset chart templates and a plurality of preset text-section templates from a preset database according to the report type to be used;
    combining the chart template and the text-section template selected by the current user into the preliminary report;
    retrieving the preliminary report.
  3. The machine-learning-based report generation method according to claim 1, characterized in that the step of adjusting the preliminary report according to the information input by the current user to obtain the final report comprises:
    adjusting the charts and text sections of the preliminary report according to chart adjustment information, chart data content information, and text-section adjustment information input by the current user;
    filling the text content input by the current user into the text section of the preliminary report to obtain the final report.
  4. The machine-learning-based report generation method according to claim 1, characterized in that the method for obtaining the report type prediction model comprises:
    acquiring a training set that includes a specified amount of sample data, wherein the sample data includes user feature information and the report type corresponding to that user feature information;
    inputting the sample data of the training set into a neural network model for training, wherein stochastic gradient descent is used during training and the parameters of each layer of the neural network model are updated by backpropagation, to obtain a preliminary trained model;
    recording the preliminary trained model as the report type prediction model.
  5. The machine-learning-based report generation method according to claim 4, characterized in that the step of recording the preliminary trained model as the report type prediction model comprises:
    acquiring a validation set that includes a specified amount of sample data, wherein the sample data of the validation set includes user feature information and the report type corresponding to that user feature information;
    validating the preliminary trained model using the sample data of the validation set;
    if the validation passes, recording the preliminary trained model as the report type prediction model.
  6. The machine-learning-based report generation method according to claim 1, characterized in that the method for obtaining the report type prediction model comprises:
    acquiring a specified amount of sample data, wherein the sample data includes user feature information and the report type corresponding to that user feature information;
    inputting the sample data of the training set into a CHAID decision tree model for training to obtain a preliminary CHAID decision tree;
    recording the preliminary CHAID decision tree as the report type prediction model.
  7. The machine-learning-based report generation method according to claim 6, characterized in that the step of inputting the sample data of the training set into the CHAID decision tree model for training to obtain a preliminary CHAID decision tree comprises:
    setting the modeling standard parameters of the CHAID decision tree model, the modeling standard parameters including the maximum number of levels of the decision tree, the significance level for splitting a parent node, the minimum number of samples contained in a parent node, and the minimum number of samples contained in a child node;
    inputting the sample data of the training set into the CHAID decision tree model built with the chi-square automatic interaction detection method for training, to obtain a preliminary CHAID decision tree.
  8. A machine-learning-based report generation apparatus, characterized by comprising:
    a feature information acquisition unit, configured to acquire feature information of the current user, the feature information of the current user including at least occupation information of the current user;
    a report type prediction model operation unit, configured to input the feature information into a preset machine-learning-based report type prediction model for computation, wherein the report type prediction model is trained on training data composed of user feature information and the report types corresponding to that user feature information;
    a report type prediction unit, configured to output the predicted report type that the current user will use;
    a preliminary report retrieval unit, configured to retrieve a preset preliminary report from a database according to the report type to be used, wherein the type of the preliminary report is the same as the report type to be used;
    a final report obtaining unit, configured to adjust the preliminary report according to information input by the current user to obtain a final report.
  9. The machine-learning-based report generation apparatus according to claim 8, characterized in that the preliminary report retrieval unit comprises:
    a chart template and text-section template retrieval subunit, configured to retrieve a plurality of preset chart templates and a plurality of preset text-section templates from a preset database according to the report type to be used;
    a combination subunit, configured to combine the chart template and the text-section template selected by the current user into the preliminary report;
    a retrieval subunit, configured to retrieve the preliminary report.
  10. The machine-learning-based report generation apparatus according to claim 8, characterized in that the final report obtaining unit comprises:
    an adjustment subunit, configured to adjust the charts and text sections of the preliminary report according to chart adjustment information, chart data content information, and text-section adjustment information input by the current user;
    a final report obtaining subunit, configured to fill the text content input by the current user into the text section of the preliminary report to obtain the final report.
  11. The machine-learning-based report generation apparatus according to claim 8, characterized in that the apparatus comprises a report type prediction model acquisition unit, the report type prediction model acquisition unit comprising:
    a training set acquisition subunit, configured to acquire a training set that includes a specified amount of sample data, wherein the sample data includes user feature information and the report type corresponding to that user feature information;
    a neural network model training subunit, configured to input the sample data of the training set into a neural network model for training, wherein stochastic gradient descent is used during training and the parameters of each layer of the neural network model are updated by backpropagation, to obtain a preliminary trained model;
    a report type prediction model marking subunit, configured to record the preliminary trained model as the report type prediction model.
  12. The machine-learning-based report generation apparatus according to claim 11, characterized in that the report type prediction model marking subunit comprises:
    a validation set acquisition module, configured to acquire a validation set that includes a specified amount of sample data, wherein the sample data of the validation set includes user feature information and the report type corresponding to that user feature information;
    a validation module, configured to validate the preliminary trained model using the sample data of the validation set;
    a report type prediction model marking module, configured to record the preliminary trained model as the report type prediction model if the validation passes.
  13. The machine-learning-based report generation apparatus according to claim 8, characterized in that the apparatus comprises a report type prediction model acquisition unit, the report type prediction model acquisition unit comprising:
    a sample data acquisition subunit, configured to acquire a specified amount of sample data, wherein the sample data includes user feature information and the report type corresponding to that user feature information;
    a decision tree model training subunit, configured to input the sample data of the training set into a CHAID decision tree model for training to obtain a preliminary CHAID decision tree;
    a decision tree marking subunit, configured to record the preliminary CHAID decision tree as the report type prediction model.
  14. The machine learning-based report generation apparatus according to claim 13, wherein the decision tree model training subunit comprises:
    a modeling criterion parameter setting module, configured to set modeling criterion parameters of the CHAID decision tree model, the modeling criterion parameters comprising the maximum number of levels of the decision tree, the significance level at which a parent node may be further split, the minimum number of samples contained in a parent node, and the minimum number of samples contained in a child node;
    a preliminary CHAID decision tree obtaining module, configured to input the sample data of the training set into the CHAID decision tree model, which is built using the chi-square automatic interaction detection method, for training, to obtain the preliminary CHAID decision tree.
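    The four modeling criterion parameters can be illustrated with a simplified CHAID-style tree builder. This is a sketch under stated assumptions: it selects splits with a Pearson chi-square test but omits the category merging and Bonferroni adjustment of full CHAID, and the parameter keys (`max_depth`, `alpha`, `min_parent`, `min_child`) and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def chi2_sf(x, df):
    """P(X >= x) for chi-square with integer df, using Q(a+1,y) = Q(a,y) + y^a e^-y / Gamma(a+1)."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    a, q = (1.0, math.exp(-y)) if df % 2 == 0 else (0.5, math.erfc(math.sqrt(y)))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def split_p_value(samples, feature):
    """Chi-square test of independence between one categorical feature and the report type."""
    table = defaultdict(Counter)
    for features, label in samples:
        table[features[feature]][label] += 1
    col_totals = Counter(label for _, label in samples)
    if len(table) < 2 or len(col_totals) < 2:
        return 1.0  # nothing to split on
    n = len(samples)
    stat = 0.0
    for counts in table.values():
        row_total = sum(counts.values())
        for label, col_total in col_totals.items():
            expected = row_total * col_total / n
            stat += (counts[label] - expected) ** 2 / expected
    return chi2_sf(stat, (len(table) - 1) * (len(col_totals) - 1))

def build_tree(samples, features, params, depth=0):
    """Grow the tree while all four modeling criterion parameters allow it."""
    node = {"predict": Counter(label for _, label in samples).most_common(1)[0][0],
            "children": {}}
    if depth >= params["max_depth"] or len(samples) < params["min_parent"] or not features:
        return node
    p_value, best = min((split_p_value(samples, f), f) for f in features)
    if p_value >= params["alpha"]:
        return node  # split is not significant at the parent node
    groups = defaultdict(list)
    for sample in samples:
        groups[sample[0][best]].append(sample)
    if min(len(group) for group in groups.values()) < params["min_child"]:
        return node
    node["feature"] = best
    remaining = [f for f in features if f != best]
    for value, group in groups.items():
        node["children"][value] = build_tree(group, remaining, params, depth + 1)
    return node

def predict(node, features):
    """Walk down the tree; fall back to the current node's majority report type."""
    while "feature" in node:
        child = node["children"].get(features[node["feature"]])
        if child is None:
            break
        node = child
    return node["predict"]
```

    Each sample is assumed to be a pair of a feature dictionary and a report type label, for example `({"occupation": "sales"}, "sales_summary")`.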
  15. A computer device, comprising a memory and a processor, the memory storing computer-readable instructions, wherein the processor, when executing the computer-readable instructions, implements a machine learning-based report generation method, the machine learning-based report generation method comprising:
    acquiring feature information of a current user, the feature information of the current user comprising at least occupation information of the current user;
    inputting the feature information into a preset machine learning-based report type prediction model for computation, wherein the report type prediction model is trained on training data composed of user feature information and the report types corresponding to the user feature information;
    outputting the predicted report type that the current user will use;
    retrieving a preset preliminary report from a database according to the report type to be used, wherein the type of the preliminary report is the same as the report type to be used;
    adjusting the preliminary report according to information input by the current user, thereby obtaining a final report.
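    The five steps of the claimed method can be sketched end to end as follows. The `Report` structure, the dictionary standing in for the database, the callable standing in for the trained model, and the keys of `user_input` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Report:
    report_type: str
    charts: List[str] = field(default_factory=list)
    text: str = ""

def generate_report(user_features: Dict[str, str],
                    model: Callable[[Dict[str, str]], str],
                    database: Dict[str, Report],
                    user_input: Dict[str, object]) -> Report:
    """Predict the report type, fetch the matching preliminary report, then adjust it."""
    # Step 1: the feature information must include at least the occupation.
    if "occupation" not in user_features:
        raise ValueError("feature information must include the user's occupation")
    # Steps 2 and 3: run the prediction model and take its output as the report type.
    report_type = model(user_features)
    # Step 4: retrieve the preset preliminary report of the same type.
    preliminary = database[report_type]
    # Step 5: adjust a copy so the stored preset is not modified.
    final = Report(preliminary.report_type, list(preliminary.charts), preliminary.text)
    if "charts" in user_input:
        final.charts = list(user_input["charts"])
    if "text" in user_input:
        final.text = str(user_input["text"])
    return final
```

    Any trained classifier can be passed as `model`, provided it maps the feature dictionary to a key that exists in `database`.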
  16. The computer device according to claim 15, wherein the step of retrieving a preset preliminary report from a database according to the report type to be used, wherein the type of the preliminary report is the same as the report type to be used, comprises:
    retrieving a plurality of preset chart templates and a plurality of preset text section templates from a preset database according to the report type to be used;
    combining the chart template and the text section template selected by the current user into the preliminary report;
    retrieving the preliminary report.
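    One way to picture the template selection in this claim is the short sketch below. The layout of the template database (a dictionary keyed by report type, holding `charts` and `texts` dictionaries keyed by template ID) and the function name are hypothetical.

```python
import copy
from typing import Any, Dict, List

def assemble_preliminary_report(report_type: str,
                                template_db: Dict[str, Dict[str, Dict[str, Any]]],
                                chosen_chart_ids: List[str],
                                chosen_text_ids: List[str]) -> Dict[str, Any]:
    """Combine the user's chosen chart and text section templates into a preliminary report."""
    # Retrieve every preset template stored for the predicted report type.
    templates = template_db[report_type]
    # Keep only the templates the current user selected, copied so that
    # later adjustments do not alter the stored presets.
    charts = [copy.deepcopy(templates["charts"][cid]) for cid in chosen_chart_ids]
    texts = [copy.deepcopy(templates["texts"][tid]) for tid in chosen_text_ids]
    return {"type": report_type, "charts": charts, "texts": texts}
```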
  17. The computer device according to claim 15, wherein the step of adjusting the preliminary report according to information input by the current user, thereby obtaining a final report, comprises:
    adjusting the charts and text sections in the preliminary report according to chart adjustment information, chart data content information, and text section adjustment information input by the current user;
    filling the text content input by the current user into the text sections of the preliminary report to obtain the final report.
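    The adjustment step might look like the following sketch. It assumes the preliminary report is a dictionary with `charts` and `texts` lists whose entries carry an `id`, and that each kind of user input is a dictionary keyed by that ID; these structures and the name `finalize_report` are illustrative only.

```python
import copy
from typing import Any, Dict

def finalize_report(preliminary: Dict[str, Any],
                    chart_adjustments: Dict[str, Dict[str, Any]],
                    chart_data: Dict[str, list],
                    text_adjustments: Dict[str, Dict[str, Any]],
                    text_content: Dict[str, str]) -> Dict[str, Any]:
    """Apply the user's chart and text adjustments, then fill in the text content."""
    final = copy.deepcopy(preliminary)  # leave the preliminary report untouched
    for chart in final["charts"]:
        # Chart adjustment information (for example title or chart kind).
        chart.update(chart_adjustments.get(chart["id"], {}))
        # Chart data content information supplied by the user.
        chart["data"] = chart_data.get(chart["id"], chart.get("data", []))
    for section in final["texts"]:
        # Text section adjustment information (for example heading or font).
        section.update(text_adjustments.get(section["id"], {}))
        # Text content entered by the user fills the section body.
        section["body"] = text_content.get(section["id"], "")
    return final
```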
  18. A computer-readable storage medium storing computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, implement a machine learning-based report generation method, the machine learning-based report generation method comprising:
    acquiring feature information of a current user, the feature information of the current user comprising at least occupation information of the current user;
    inputting the feature information into a preset machine learning-based report type prediction model for computation, wherein the report type prediction model is trained on training data composed of user feature information and the report types corresponding to the user feature information;
    outputting the predicted report type that the current user will use;
    retrieving a preset preliminary report from a database according to the report type to be used, wherein the type of the preliminary report is the same as the report type to be used;
    adjusting the preliminary report according to information input by the current user, thereby obtaining a final report.
  19. The computer-readable storage medium according to claim 18, wherein the step of retrieving a preset preliminary report from a database according to the report type to be used, wherein the type of the preliminary report is the same as the report type to be used, comprises:
    retrieving a plurality of preset chart templates and a plurality of preset text section templates from a preset database according to the report type to be used;
    combining the chart template and the text section template selected by the current user into the preliminary report;
    retrieving the preliminary report.
  20. The computer-readable storage medium according to claim 18, wherein the step of adjusting the preliminary report according to information input by the current user, thereby obtaining a final report, comprises:
    adjusting the charts and text sections in the preliminary report according to chart adjustment information, chart data content information, and text section adjustment information input by the current user;
    filling the text content input by the current user into the text sections of the preliminary report to obtain the final report.
PCT/CN2019/119480 2019-01-02 2019-11-19 Machine learning-based report generating method, apparatus, and computer device WO2020140639A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910002951.5 2019-01-02
CN201910002951.5A CN109800333A (en) 2019-01-02 2019-01-02 Report form generation method, device and computer equipment based on machine learning

Publications (1)

Publication Number Publication Date
WO2020140639A1 (en) 2020-07-09

Family

ID=66558403

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/119480 WO2020140639A1 (en) 2019-01-02 2019-11-19 Machine learning-based report generating method, apparatus, and computer device

Country Status (2)

Country Link
CN (1) CN109800333A (en)
WO (1) WO2020140639A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116541565A (en) * 2023-07-07 2023-08-04 中国平安财产保险股份有限公司 Data chart generation method and device, electronic equipment and storage medium

Families Citing this family (3)

Publication number Priority date Publication date Assignee Title
CN109800333A (en) * 2019-01-02 2019-05-24 平安科技(深圳)有限公司 Report form generation method, device and computer equipment based on machine learning
CN113127555A (en) * 2019-12-30 2021-07-16 北京阿博茨科技有限公司 Data visualization drawing matching device and method
CN113283222B (en) * 2021-06-11 2021-10-08 平安科技(深圳)有限公司 Automatic report generation method and device, computer equipment and storage medium

Citations (3)

Publication number Priority date Publication date Assignee Title
CN107632971A (en) * 2016-07-19 2018-01-26 百度在线网络技术(北京)有限公司 Method and apparatus for generating multidimensional form
US20180189655A1 (en) * 2017-01-03 2018-07-05 Electronics And Telecommunications Research Institute Data meta-scaling apparatus and method for continuous learning
CN109800333A (en) * 2019-01-02 2019-05-24 平安科技(深圳)有限公司 Report form generation method, device and computer equipment based on machine learning

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US20090037378A1 (en) * 2007-08-02 2009-02-05 Rockwell Automation Technologies, Inc. Automatic generation of forms based on activity
CN103885956A (en) * 2012-12-20 2014-06-25 北大方正集团有限公司 Method and equipment for generating reports

Also Published As

Publication number Publication date
CN109800333A (en) 2019-05-24

Similar Documents

Publication Publication Date Title
WO2020140639A1 (en) Machine learning-based report generating method, apparatus, and computer device
US20230015665A1 (en) Multi-turn dialogue response generation with template generation
TWI621077B (en) Character recognition method and server for claim documents
WO2022142613A1 (en) Training corpus expansion method and apparatus, and intent recognition model training method and apparatus
WO2020232877A1 (en) Question answer selection method and apparatus, computer device, and storage medium
WO2020237869A1 (en) Question intention recognition method and apparatus, computer device, and storage medium
CN111666401B (en) Document recommendation method, device, computer equipment and medium based on graph structure
CN111159415B (en) Sequence labeling method and system, and event element extraction method and system
CN112036154A (en) Electronic medical record generation method and device based on inquiry dialogue and computer equipment
CN112016274B (en) Medical text structuring method, device, computer equipment and storage medium
US11113478B2 (en) Responsive document generation
KR102294364B1 (en) System for automatically converting document based on artificial intelligence and method thereof
CN111859916B (en) Method, device, equipment and medium for extracting key words of ancient poems and generating poems
CN111191457A (en) Natural language semantic recognition method and device, computer equipment and storage medium
CN112699923A (en) Document classification prediction method and device, computer equipment and storage medium
CN113761193A (en) Log classification method and device, computer equipment and storage medium
CN111435449B (en) Model self-training method, device, computer equipment and storage medium
US20230080261A1 (en) Apparatuses and Methods for Text Classification
KR102532216B1 (en) Method for establishing ESG database with structured ESG data using ESG auxiliary tool and ESG service providing system performing the same
CN110780850B (en) Requirement case auxiliary generation method and device, computer equipment and storage medium
US20220284280A1 (en) Data labeling for synthetic data generation
US20220253694A1 (en) Training neural networks with reinitialization
CN114969544A (en) Hot data-based recommended content generation method, device, equipment and medium
CN113627173A (en) Manufacturer name identification method and device, electronic equipment and readable medium
CN112580309B (en) Document data processing method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19907315

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19907315

Country of ref document: EP

Kind code of ref document: A1