AU2018202420A1 - A System and Method for Generating Documents - Google Patents

Info

Publication number
AU2018202420A1
Authority
AU
Australia
Prior art keywords
document
content
accordance
historical
documents
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU2018202420A
Inventor
Seyed Mohsen NOURI AHMADI GOURAB
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nodapp Pty Ltd
Original Assignee
Nodapp Pty Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nodapp Pty Ltd
Priority to AU2018202420A
Priority to PCT/AU2019/050305 (WO2019191817A1)
Publication of AU2018202420A1
Legal status: Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/131 Fragmentation of text files, e.g. creating reusable text-blocks; Linking to fragments, e.g. using XInclude; Namespaces
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/174 Form filling; Merging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/55 Rule-based translation
    • G06F40/56 Natural language generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F16/972 Access to data in other repository systems, e.g. legacy data or dynamic Web page generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/274 Converting codes to words; Guess-ahead of partial word inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Abstract

A System and Method for Generating Documents

The present invention relates to a system and method for generating documents. The preparation of advice documentation by professional service providers requires much manual input. Although some parts of the document may be "boiler plate", much will be bespoke and require preparation by a professional. Embodiments of the present invention provide a system and method for generating documents which comprises a document content generator having a database storing many document segments. A content prediction engine, which may be in the form of a neural network, is arranged to provide content parameters which are used to select from the plurality of document segments to build a document.

10141985_1

[Figure 5 (sheet 5/14), flow diagram: a web API is used to gather the historical fact finds; the Microsoft Word documents are parsed to extract information by pairing the keys and values in the fact find (like: Age = 15); a training set is made and fed into a deep neural network (input layer, hidden layer, output layer) in the AI engine to understand the relation between personal information and Strategies/Products/Goals; a new fact find is fed into the trained model, which is used to predict the strategy and product needed for the next client; the new fact find information is used for both choosing text and content generation; labelled texts are available for decision making and high-level elements are predicted using ML; text relevant to the predicted strategies, products and fact find is added to the new template; output: generated SOA.]

Description

A System and Method for Generating Documents
Field of the Invention
The present invention relates to a system and method for generating documents, and, particularly, but not exclusively, to a system and method of generating documents that have dynamic content.
Background of the Invention
The preparation of advice documentation by professional service providers is ubiquitous. Contracts, letters of advice, agreements between parties, licences, financial advice and many other professional services deliver their product in the form of documentation which realises the client's requirements. All such documentation requires expert input and can be costly.
An example of the type of documentation that a professional adviser may prepare is a Statement of Advice (SOA) provided by a financial adviser.
A financial adviser is a professional who suggests and renders financial services to clients based on their financial situation. They have to complete specific training and hold a license to provide advice.
During the first meeting with the client, the adviser discusses the client's advice needs and then gives an idea of what they can do to help. The adviser will also be able to say how much the advice will cost, so the client can decide whether to proceed any further. If the client decides to continue, the adviser will prepare a statement of advice (SOA) that will formally document the advice, the strategies and any financial products recommended.
In a nutshell, a SOA is a document that clearly outlines the recommendations that have been made by the industry expert and also explains why these recommendations have been made. In Australia, the SOA must be provided in accordance with the Financial Services Reform Act (2002). It should not only provide details of what has been recommended and why, but also details of how the recommendations are intended to benefit the client.
A range of information is provided by a properly drawn-up SOA. It may include the following information, as well as other information:
• The name, address and contact details of the adviser or the company that has offered the advice;
• Details about the advice that has been given;
• The date that the advice was given;
• The adviser's or company's Financial Services License Number;
• The type of product that the advice relates to;
• The recommendations that have been made and why they have been made;
• Details of the benefits that the products/services recommended should provide;
• The name of the insurer; and
• Details of the client's instructions to the adviser.
The adviser or company must provide a client with a SOA in order to be compliant with regulations. To ensure that a client is dealing with an adviser or company that is compliant, there are a number of items that should be checked. These include:
• Checking the document is official;
• Looking for justification of recommendations;
• Ensuring there are details of products/services recommended;
• Checking on commission disclosure; and
• Ensuring there is an official license number.
The SOA is important for many reasons. First off, it provides full and important details of the services and products that have been recommended, so the client has a written summary of everything that has been discussed, as well as the reasons for recommendations, commission information and other important details. Also, a proper SOA with the license number included helps to ensure that the client is dealing with a compliant firm.
It is important to carefully read the SOA to ensure you fully understand how the adviser is remunerated and whether they have any relationships or associations that may influence their advice. The SOA will also outline any potential conflicts of interest in the provision of recommendations.
SOAs may provide limited advice on a particular product (e.g. insurance) or may provide more general advice on, for example, a financial plan going forward for the client who wishes to invest in superannuation and/or other investment portfolios. The SOA is therefore a complex document. It may include static content. This may include boiler plate passages such as statements regarding cooling off periods for investments. Such static content can be generated by, for example, current systems that store templates incorporating the static content.
Much of the SOA, however, will be complex dynamic content that currently must be prepared by a professional adviser. This dynamic content may include details of strategies, products to be provided in accordance with the strategies, and other expert knowledge.
For generating an SOA, an adviser may currently have some support systems at hand. For example, they may have a CRM (customer relationship management) system, which is used not only for administrative tasks but also for entering information through tools and forms to generate at least some of the SOA. This will provide some static content. Currently, advisers need to go through different menus and wizards in these applications to put together an SOA, and the generated SOA then needs to be revised by a paraplanner for further adjustment. This is a very time-consuming and expensive process that wastes a lot of an advisory firm's resources. The final generated SOAs are usually not consistent across a firm, and the price and time of delivery can lead to a bad customer experience.
As discussed, much of the dynamic content must be prepared bespoke by the adviser and separately input to the document. This requires significant professional time and expertise. It is very costly. The problems relating to preparation of an SOA in the financial advisory industry can be extrapolated to the preparation of any documentation in any professional service that requires expert input (law, finance, medicine, and any other profession). Currently, there are no or limited alternatives to requiring significant amounts of expert time in preparation of advice and documentation. Supporting systems are currently limited. While they are useful for providing templates which contain static content, they are not particularly useful for determining and incorporating dynamic content in a document.
Summary of the Invention
In accordance with a first aspect, the present invention provides a system for generating documents that have dynamic content, comprising a document content generator comprising a database storing a plurality of document segments, and a selection engine arranged to select document segments and build a document from the selected segments, and a content prediction engine arranged to receive input prediction attributes to process the input prediction attributes, to provide content parameters and to use the content parameters to affect the selection engine to select document segments and generate a document.
In an embodiment, document segments are prepared from a corpus of historical documents of the same type of document that is to be generated. For example, if the system is arranged to generate a Statement of Advice (SOA) for financial advice, then the historical corpus will be of many previous SOAs.
In an embodiment, the document content generator comprises a dynamic parser arranged to produce the plurality of document segments by analysis and segmentation of the historical documents. This provides a library (in the database) of content which the selection engine can select from in order to generate the document.
This embodiment requires the segments to be selected and the document to be built. In an embodiment, the content prediction engine facilitates the selection of the documentation based on the input prediction attributes.
In an embodiment, the prediction engine comprises a machine learning arrangement. In an embodiment, this comprises a neural network. In an embodiment, the machine learning arrangement is trained based on input historical prediction attributes. These may be facts parsed from historical documents of the same type as the document to be generated. These historical prediction attributes are used to train the machine learning arrangement. In an embodiment, the outputs of the machine learning arrangement are strategic guidance elements which affect the selection engine to select appropriate document segments in accordance with the required content.
For example, in the case of an SOA for financial advice, the prediction attribute may comprise facts such as age, demographics of clients, appetite for risk, and the like.
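By way of illustration only, the following sketch shows how prediction attributes of this kind could be used to train a predictor. It is a minimal stand-in, not the neural network of the embodiments: it fits a single-layer logistic model by gradient descent in plain Python. The feature encoding (age divided by 100, risk appetite between 0 and 1), the training rows and the "growth strategy" label are all hypothetical.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def train(rows, labels, epochs=1000, lr=0.5):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def predict(weights, bias, x) -> float:
    """Return the modelled probability that the strategy applies."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Hypothetical historical attributes: (age / 100, risk appetite in [0, 1]).
HISTORY = [(0.30, 0.9), (0.35, 0.8), (0.40, 0.7),
           (0.60, 0.2), (0.65, 0.3), (0.70, 0.1)]
# Hypothetical label: 1 if the historical advice used a growth strategy.
GROWTH = [1, 1, 1, 0, 0, 0]

WEIGHTS, BIAS = train(HISTORY, GROWTH)
```

A call such as predict(WEIGHTS, BIAS, (0.32, 0.85)) then yields a score for a new client, which a selection step could compare against a cut-off of its choosing.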
Advantageously, the system in accordance with embodiments of the invention outputs a document which includes static portions and also dynamic portions, which have been selected via the machine learning arrangement. The machine learning arrangement acts as an artificial intelligence expert, governing the selection of dynamic portions of the document. In embodiments, this vastly reduces the workload of the professional adviser, as the strategic advice will automatically be prepared. The professional adviser can review the document prepared by the system, make changes and finalise it. This will take much less time than preparing a document from scratch. Further, the advice is likely to be more consistent and better presented across an organisation which utilises the system.
In an embodiment, documents produced by the system can be fed back into the historical document corpus for the content generator and also for the content prediction engine, so that the content prediction engine continues to learn and improve.
In accordance with a second aspect, the present invention provides a method of generating documents that have dynamic content, comprising the steps of:
processing input prediction attributes by a content prediction process, to provide content parameters;
using the content parameters to affect selection from a plurality of stored document segments; and
generating a document from the selected document segments.
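The three steps of the method can be sketched as follows. This is an illustrative outline under stated assumptions, not the claimed implementation: the Segment structure, the tag names and the rule-based predict_content_parameters function (standing in for the content prediction process) are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    heading: str
    text: str
    kind: str                       # "static" or "dynamic"
    tags: frozenset = frozenset()   # strategies/products the text relates to

def predict_content_parameters(attributes: dict) -> set:
    """Hypothetical stand-in for the content prediction process.
    The embodiments would use a trained machine learning model here."""
    params = set()
    if attributes.get("age", 0) >= 55:
        params.add("retirement_income")
    if attributes.get("risk_appetite") == "high":
        params.add("growth_portfolio")
    return params

def select_segments(segments, params):
    """Keep every static segment, plus dynamic ones whose tags match."""
    return [s for s in segments if s.kind == "static" or s.tags & params]

def generate_document(segments) -> str:
    """Join the selected segments, printing each heading once."""
    lines = []
    last_heading = None
    for s in segments:
        if s.heading != last_heading:
            lines.append(s.heading.upper())
            last_heading = s.heading
        lines.append(s.text)
    return "\n".join(lines)

# Hypothetical segment library.
LIBRARY = [
    Segment("Scope", "This advice is limited to the matters set out below.", "static"),
    Segment("Strategy", "We recommend commencing an account-based pension.",
            "dynamic", frozenset({"retirement_income"})),
    Segment("Strategy", "We recommend a growth-oriented portfolio.",
            "dynamic", frozenset({"growth_portfolio"})),
]
```

For example, generate_document(select_segments(LIBRARY, predict_content_parameters({"age": 57, "risk_appetite": "low"}))) keeps the static scope text and the pension paragraph and omits the growth paragraph.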
In an embodiment, the method comprises the step of implementing the content prediction process by way of a machine learning process. In an embodiment, the machine learning process comprises a neural network.
In accordance with a third aspect, the present invention provides a method of constructing a system for generating documents that have dynamic content, the method comprising the steps of:
parsing a plurality of historical documents of the same type as the document to be generated, analysing and segmenting the historical documents, to prepare a database of a plurality of document segments;
training a machine learning arrangement based on input historical prediction attributes, obtained from a historical document corpus of the same type of document as the document being generated, to prepare the machine learning arrangement to output strategic guidance elements for affecting the selection of a plurality of document segments from the available document segments, based on input current prediction attributes.
2018202420 05 Apr 2018
In accordance with a fourth aspect, the present invention provides a computer programme, comprising instructions for controlling a computer to implement a system in accordance with the first aspect of the invention.
In accordance with a fifth aspect, the present invention provides a computer readable medium, providing a computer programme in accordance with the fourth aspect of the invention.
In accordance with a sixth aspect, the present invention provides a media signal, comprising a computer programme in accordance with the fourth aspect of the invention.
Brief Description of the Figures
Features and advantages of the present invention will become apparent from the following description of embodiments thereof, by way of example only, with reference to the accompanying drawings, in which:
Figure 1: is a schematic diagram of a system in accordance with an embodiment of the invention;
Figure 2: is a schematic flow diagram illustrating overall operation of the system of Figure 1;
Figure 3: is a block diagram of a computing apparatus which may be used to implement the system of Figure 1;
Figure 4: is a more detailed view of one part of the flow diagram of Figure 2;
Figure 5: is a more detailed view of another part of the flow diagram of Figure 2;
Figure 6: is a flow diagram illustrating operation of a dynamic parser implemented in accordance with an embodiment of the present invention;
Figure 7: is a flow diagram illustrating further operation of the dynamic parser;
Figure 8: is a flow diagram illustrating further operation of the dynamic parser;
Figures 9 to 12: are illustrations of example document segments which may be produced by a system in accordance with an embodiment of the present invention;
Figure 13: is a flow diagram illustrating operation of a content prediction engine of the system of Figure 1; and
Figure 14: is a representation of a display showing a portion of the output of a dynamic parser, in accordance with an embodiment of the present invention.
Detailed Description of Embodiments
Figure 1 illustrates a system in accordance with an embodiment of the present invention for generating dynamic documents. The system comprises a computing system 1, in this example comprising a server computing system hosted in the cloud (although it may comprise any other computer architecture). The system 1 comprises one or more processors, memory and also a database 2.
System 1 implements a document content generator 3 arranged to generate documents from segments of documents stored in the database 2. A selection engine 4 implemented by the system 1 is arranged to select the document segments, and a content prediction engine 5 implemented by the system 1 is arranged to receive input prediction attributes and process the input prediction attributes to provide content parameters arranged to affect the selection engine 4 to select the document segments and generate a document.
The document content generator 3, selection engine 4, and content prediction engine 5 may be implemented by any combination of software and/or hardware architecture.
In this embodiment, the system may be accessed by devices 6, 7 which may be remote and connected to the system 1 over a network. Devices 6, 7 may be any type of computing device, tablet device, smart phone or any other processing device arranged to communicate with system 1. These devices 6, 7 may be used by administrators to administer the system (devices 6, for example) or by users who wish to use the system to generate documents (e.g. devices 7). Users may include professional service providers of an organisation who wish to generate documents for their clients using the expert system 1.
Figure 2 shows a flow diagram illustrating operation of the system 1. The operation can be considered to comprise a number of parts. In this example, the document content generator comprises a dynamic parser 10, implemented by appropriate software and hardware, which is arranged to produce the plurality of document segments by analysis and segmentation of historical documents 15 of the same type as the document being generated.
Another part of the system operation is the artificial intelligence (AI) engine 11. The AI engine 11 comprises a machine learning arrangement 12, in this example being a deep neural network. This will be implemented by appropriate hardware and software of system 1. The neural network 12 is trained from input of historical prediction attributes 16. The prediction attributes may comprise factual circumstances and other information and data that an expert would require in order to provide the advice that would be included in a document. These attributes may be obtained from the historical document corpus 15 and/or other sources (e.g. via a web API 17 obtaining information over a network).
Some examples of facts are given in the tables below. It will be appreciated that there may be many other facts.

Personal details
Description                      A                  B
Age                              57                 68
Date of birth                    9 December 1959    23 August 1949
Marital status                   De Facto           De Facto
Preferred Address                Dromana VIC 3936

Health
Current state of health          Good               Good
Private health insurance         TBA                TBA
Smoker                           TBA                TBA

Estate planning
Do you have a Will?              No                 No
Enduring Power of Attorney?      No                 No
Enduring guardianship?           No                 No
Estate planning last reviewed    TBA                TBA

Income
Description                                         Owner    Annual Amount
Pension Income / UK Pension (ADD)                   A        $9,000
Pension Income / Centrelink                         B        $11,000
Pension Income / Income Stream from CBUS            AB       $27,060
Total                                                        $47,060

Expenses
Description
You have no expenses or not provided details

Lifestyle assets
Real Estate / Primary Residence / Dromana                       Joint    $850,000
Life Style / Motor Vehicle / Motor Vehicle(s) plus Caravan      Joint    $45,000
Total                                                                    $895,000
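Facts of the kind tabulated above could be turned into prediction attributes by pairing keys with values, as the Figure 5 annotation in the abstract suggests ("Age = 15"). The following is a minimal sketch under stated assumptions: it assumes the fact find has already been exported to plain "key = value" or "key: value" lines (reading the word-processor files themselves is not shown), and the choice of fields in encode is hypothetical.

```python
import re

# One "key = value" or "key: value" pair per line (assumed input layout).
PAIR = re.compile(r"^\s*(?P<key>[^:=]+?)\s*[:=]\s*(?P<value>.+?)\s*$")

def parse_fact_find(text: str) -> dict:
    """Pair each key with its value; keys are lower-cased for lookup."""
    facts = {}
    for line in text.splitlines():
        m = PAIR.match(line)
        if m:
            facts[m.group("key").lower()] = m.group("value")
    return facts

def to_number(value: str):
    """Convert '57' or '$27,060' to a float; return None for 'TBA' etc."""
    cleaned = value.replace("$", "").replace(",", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None

def encode(facts: dict) -> list:
    """Build a numeric attribute vector; missing values default to 0.0."""
    yes_no = {"yes": 1.0, "no": 0.0}
    return [
        to_number(facts.get("age", "")) or 0.0,
        to_number(facts.get("annual income", "")) or 0.0,
        yes_no.get(facts.get("do you have a will?", "").lower(), 0.0),
    ]
```

Treating unknown entries such as "TBA" as 0.0 is a simplification; a practical encoder would more likely carry an explicit "missing" indicator.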
A further part of the AI engine 11 is a selection engine which is arranged to receive new prediction attributes, together with the outputs produced by the neural network 12 when the new prediction attributes are input to it, and to select document segments prepared by the dynamic parser in order to output a document 25. The document 25 may be finalised and checked by an expert.
An example of a computing apparatus which may be used to implement the system 1, will now be given with reference to Figure 3.
Figure 3 shows a schematic diagram of components of a computer system 900 which may implement the system 1. Computer system 900 may be a high-performance machine, such as a supercomputer, a desktop workstation or a personal computer, or may be a distributed computing array or a computer cluster or a networked cluster of computers. In this example, the server architecture and database architecture are implemented by hardware and software supported in the Cloud. The system 1 may be provided as software/hardware as a service, or may be owned by the organisation.
The computer system 900 comprises a suitable operating system and appropriate software for implementation of the various processes operated by the system 1.
The computing apparatus 900 comprises one or more data processing units (CPUs) 902; memory 904, which may include volatile or non-volatile memory, such as various types of RAM memories, magnetic disks, optical disks and solid-state memories; and a user interface 906, which may comprise a monitor, keyboard, mouse and/or touch-screen display and may enable access by an administrator of the system 3. A network communication interface 908 for communicating with other computers and devices (e.g. 6 and 7) is also provided, together with one or more communication buses 910 for interconnecting the different parts of the system 900.
The computer system 900 may access data stored in a database 914 via network interface 908 (the database 914 may correspond to the database 2 shown in Figure 1).
Database 914 may be a distributed database.
A computing apparatus for implementing embodiments of the invention is not limited to the computer apparatus described above. Any computer system architecture may be utilised, such as standalone computers, networked computers, dedicated computing devices, or any device capable of processing information in accordance with embodiments of the present invention. The architecture may comprise client/server architecture, or any other architecture.
The computing system is provided with an operating system and various computer processes to implement functionality. The computer processes may be implemented as separate modules, which may share common foundations such as routines and sub-routines. The computer processes may be implemented in any suitable way and are not limited to separate modules. Any software/hardware architecture that implements the functionality may be utilised.
Figure 4 shows a detail of the flow diagram of Figure 2, showing the dynamic parser 10 side of the system. The dynamic parser 10 is arranged to receive as input many historical documents 15 of a historical document corpus. In this example, the document corpus is of Statements of Advice (SOA) for financial advice, which may have been prepared by financial advisers, for example. Dynamic parser 10 is arranged to break the documents 15 down into segments which can be stored in the database 2. The segments are labelled so that they can be used for processing by the selection engine in order to generate documents.
Figures 9 to 12 illustrate portions of documents shown in segmented form. In the examples shown in Figures 9 to 12, a number of labels identify various portions of the documents. These labels are applied by the dynamic parser 10 when analysing the documents. The labels include "heading", which relates to a heading of the document. In the Figure 9 example, the heading label is applied to the "scope of our advice" heading of this particular section of the document.
The label "static" relates to content of the document that is considered to appear consistently in the documents across the corpus. This type of content is considered to be static type content, i.e. content that doesn't change from document to document. Static content would include, for example, boiler plate clauses and the like.
The label "sub-heading" relates to parts of the document that have been identified by the parser 10 as being sub-headings.
Dynamic content relates to parts of the document which have been identified as varying from one document to the next.
Figures 9 to 12 give various examples of headings, static content, dynamic content and sub-headings. It will be appreciated that there will be many more than these types in any historical document corpus.
Referring again to Figure 4, the dynamic parser 10 determines headings by checking the style of the text in the documents so that the documents can be segmented later and the same segments can be compared with each other (step 100). At 101 the sub-headings are determined under each heading in the documents. At 102 sections with similar function in the documents are matched. The sentences in matched sections are compared to find similar and same sections and then they are labelled with numbers (step 103). The portions are then labelled dynamic or static based on the frequency of their appearances. Those sentences that consistently appear in documents will be labelled as static (step 104).
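The frequency-based labelling of step 104 can be sketched as follows. This is an illustration only: the 0.8 threshold is an assumed value (the description says only that static sentences "consistently appear"), and matching here is exact after simple whitespace and case normalisation rather than the similarity comparison of step 103.

```python
from collections import Counter

def normalise(sentence: str) -> str:
    """Lower-case and collapse whitespace so trivial variants match."""
    return " ".join(sentence.lower().split())

def label_static_dynamic(documents, threshold=0.8):
    """documents: one list of sentences per historical document.
    A sentence is 'static' if it occurs in at least `threshold`
    (an assumed cut-off) of the documents, otherwise 'dynamic'."""
    doc_count = Counter()
    for sentences in documents:
        # Count each sentence at most once per document.
        doc_count.update({normalise(s) for s in sentences})
    total = len(documents)
    return {
        sentence: ("static" if n / total >= threshold else "dynamic")
        for sentence, n in doc_count.items()
    }
```

With four historical documents that all contain the same cooling-off sentence and one that also contains a pension recommendation, the former is labelled static and the latter dynamic.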
At step 105, the dynamic portions in particular are checked against fact find data to find any correlation between the dynamic parts and fact find information. Fact find may include personal information of the client seeking financial advice, their financial aims, risk profile etc. This may be obtained from the historical document corpus. Also, the combinations of Goals/Products/Strategies from the historical document corpus are examined for pattern recognition.
The dynamic parts that appear on special occasions and are related either to combinations of features or to individual personal situations are therefore understood and labelled. These labelled segments of documents are then available 106 for decision making in preparation of documents (107).
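One simple way to look for the correlation described at step 105 is to count how often each fact value accompanies each dynamic segment across the historical corpus. The sketch below is an assumption about how this could be done, not the method of the embodiments; the segment labels ("D1", "D2") and fact names are hypothetical.

```python
from collections import Counter, defaultdict

def correlate(history):
    """history: one (facts, dynamic_labels) pair per historical document,
    where facts is a dict and dynamic_labels is a set of segment labels.
    Returns {label: {(key, value): share}}, where share is the fraction
    of documents containing that segment that also have that fact."""
    label_docs = Counter()
    co_occurrence = defaultdict(Counter)
    for facts, labels in history:
        for label in labels:
            label_docs[label] += 1
            co_occurrence[label].update(facts.items())
    return {
        label: {fact: n / label_docs[label] for fact, n in counts.items()}
        for label, counts in co_occurrence.items()
    }
```

A share of 1.0 means the fact was present every time the segment was used. This raw share ignores how common the fact is overall, so a practical system would likely compare it against the base rate before treating it as a pattern.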
Figures 6, 7 and 8 illustrate in more detail the process of document segmentation implemented by the dynamic parser 10.
Figure 6 illustrates how the headings and sub-headings are determined. The historical SOAs are input into the system (200). The SOAs are parsed into text objects which can be processed by the system (201). Each SOA is then separately processed to determine headings and sub-headings. At step 202 an SOA is selected. Headings for the SOA are determined (203) and also sub-headings (204). A logical tree structure is constructed for the SOA in the system (205).
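The construction of the logical tree at steps 203 to 205 might look like the following sketch. It assumes that the parsed text objects are (style, text) pairs and that headings and sub-headings carry the style names "Heading 1" and "Heading 2"; both the input shape and the style names are assumptions made for illustration.

```python
def build_tree(paragraphs):
    """paragraphs: iterable of (style, text) pairs in document order.
    Returns {heading: {sub_heading: [sentences]}}. Text that appears
    before any sub-heading is stored under the empty-string key."""
    tree = {}
    heading = None
    sub_heading = None
    for style, text in paragraphs:
        if style == "Heading 1":
            heading, sub_heading = text, ""
            tree.setdefault(heading, {})
        elif style == "Heading 2" and heading is not None:
            sub_heading = text
            tree[heading].setdefault(sub_heading, [])
        elif heading is not None:
            # Body text: attach to the current heading and sub-heading.
            tree[heading].setdefault(sub_heading, []).append(text)
    return tree
```

Keeping the tree as nested dictionaries preserves document order (Python dictionaries keep insertion order) and lets later steps address text by heading and sub-heading.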
Labels are then assigned to the different headings and sub-headings. Headings that are the same as each other are provided with the same label as are sub-headings that are the same as each other.
The process is to, firstly, (step 206) determine if there is a label assigned to the heading. If there is not a label assigned to this heading, then label the heading with a new label (step 207). If a label already exists for the heading (that is, if that heading has already been determined) then the heading is labelled with the same label (208) as existing.
A similar process is carried out for sub-headings. At step 209, it is determined whether there is already a label assigned to the particular sub-heading. If not, a new label is applied to the sub-heading (210). If a label already exists for that sub-heading, then it is labelled with the same label (211).
The process is repeated (step 212) until it is determined that all the headings and sub-headings are labelled for all the SOAs (213).
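The label assignment of steps 206 to 208 amounts to a lookup in a registry of headings already seen. A minimal sketch follows; the label format ("H1", "H2", ...), the case-insensitive comparison and the sample headings are assumptions made for illustration and are not taken from the description.

```python
def assign_labels(soa_headings):
    """Give identical headings the same label across all SOAs.

    soa_headings: dict mapping an SOA name to its list of heading strings.
    Returns (labelled, registry), where registry maps heading text to label.
    """
    registry = {}
    labelled = {}
    for soa_name, headings in soa_headings.items():
        labelled[soa_name] = []
        for heading in headings:
            # Assumption: headings are compared ignoring case and outer spaces.
            key = heading.strip().lower()
            if key not in registry:
                # No label yet for this heading: create a new one (step 207).
                registry[key] = f"H{len(registry) + 1}"
            # Otherwise the existing label is reused (step 208).
            labelled[soa_name].append((heading, registry[key]))
    return labelled, registry


labelled, registry = assign_labels({
    "soa_001": ["Scope of Advice", "Recommendations"],
    "soa_002": ["Scope of advice", "Fees", "Recommendations"],
})
```

The same routine could be run a second time over sub-headings (steps 209 to 211), for example with a separate registry per parent heading.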
The next stage in the dynamic parser 10 process is illustrated in Figure 7. This part of the process relates to comparing sections of documents with each other and labelling the same or similar ones with the same label. At 300, all the SOAs with labelled headings and sub-headings are input into this section of the process. A heading is selected (step 303), a sub-heading under the selected heading is picked (304) and then any unlabelled sentence under the sub-heading is selected (306). The sentence is then compared with text under the same sub-heading from other SOAs (307). The selected sentence is labelled with the same label as similar or identical sentences already labelled (308).
A similar process is applied to further sentences under other sub-headings and headings until all the SOAs are processed (steps 301, 302, 305 and 309).
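The sentence comparison of steps 306 to 308 can be illustrated with a simple string-similarity measure. The description does not say which measure is used; the sketch below assumes the ratio from Python's standard difflib module and a hypothetical threshold of 0.8.

```python
from difflib import SequenceMatcher


def label_sentences(sentences, threshold=0.8):
    """Assign a numeric label to each sentence; similar sentences share a label.

    threshold is a hypothetical similarity cut-off, not taken from the source.
    """
    representatives = []  # (label, first sentence seen with that label)
    labels = []
    for sentence in sentences:
        match = None
        for label, representative in representatives:
            ratio = SequenceMatcher(None, sentence.lower(), representative.lower()).ratio()
            if ratio >= threshold:
                match = label  # reuse the label of the similar sentence
                break
        if match is None:
            match = len(representatives) + 1  # no similar sentence: new label
            representatives.append((match, sentence))
        labels.append(match)
    return labels


# Sentences gathered under the same sub-heading from several SOAs.
under_same_subheading = [
    "We recommend you salary sacrifice $5,000 per year.",
    "We recommend you salary sacrifice $8,000 per year.",
    "You should take out income protection insurance.",
]
labels = label_sentences(under_same_subheading)
```

Here the first two sentences differ only in the dollar amount and receive the same label, while the third receives a new one.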
The next stage of the parsing process is to label or tag the labelled text with associations to Products (e.g. financial products), Goals (e.g. financial goals) and Strategies (e.g. financial strategies) and any other attributes that the document features may require. Goals, Strategies and Products are obtained from parsing the historical documents.
Referring to Figure 8, all the labelled portions of documents are input to the process (400). A dataset is created starting with the SOA file name as the first column (for test and validation) (401). For each labelled text a new column is added (402). A new column is added for the sub-heading label (403). Then a new column is added for each Goal (404), Strategy (405) and Product (406).
A process is then implemented to tag the labelled portions of texts as Goals, Strategies or Products. Text is selected (step 409) and a Spearman correlation is tested between each labelled text and the other columns in the dataset. A Spearman correlation matrix is built (410) and then, for each labelled text in the dataset (step 411), a determination is made as to whether the text has 100% correlation with Goals (412), Strategies (414) or Products (415). Depending on these determinations, the section of text is tagged with Goals, Strategies or Products (steps 413, 416 and 417).
This process is repeated for each sub-heading (steps 407 and 408) until all of the content is appropriately tagged (420).
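The tagging test of steps 409 to 417 can be sketched as below. The Spearman coefficient is computed here as the Pearson correlation of average ranks. The column names, the 0/1 presence encoding of the dataset and the treatment of constant columns are assumptions made for illustration.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant column is treated as uncorrelated
    return cov / (var_x * var_y) ** 0.5


def tag_text(text_column, dataset, tolerance=1e-9):
    """Return the columns that are 100% correlated with the labelled text."""
    return [
        column
        for column, values in dataset.items()
        if column != text_column
        and abs(spearman(dataset[text_column], values) - 1.0) <= tolerance
    ]


# One row per SOA; 1 means the labelled text / Goal / Strategy / Product occurs.
dataset = {
    "text_17": [1, 0, 1, 0, 1],
    "goal_retire_early": [1, 0, 1, 0, 1],
    "strategy_salary_sacrifice": [1, 1, 1, 0, 0],
    "product_income_protection": [0, 1, 0, 1, 1],
}
tags = tag_text("text_17", dataset)
```

In this toy dataset the labelled text occurs in exactly the SOAs that carry the retirement Goal, so it is tagged with that Goal only.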
The labelled texts are then available for document generation.
To summarise the above process, this state-of-the-art technique is used to parse and extract all the dynamic parts of the SOAs and map them across all SOAs, so that the variety of values for different segments of an SOA can be identified. The segments are flagged automatically based on differences and similarities. Identical segments get the same ID, which can be used as a class variable for classification.
The historical corpus of the SOAs also has the Fact Find in the documentation. The dynamic parser searches for and extracts keywords and pairs each key with its corresponding value (e.g. Date of Birth: 54). The extracted data is restructured to form a training set suitable to be used as the input of the AI engine (neural network 12). The dynamic parser also checks the labelled text portions against the Fact Find data to find any correlations between the dynamic parts and the Fact Find provided. The combinations of Goals/Products/Strategies are examined for pattern recognition.
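The key and value pairing can be sketched with a regular expression. The description does not give the layout of a Fact Find; the sketch assumes one "Key: value" pair per line, and the sample fields are hypothetical.

```python
import re

# Assumption: each Fact Find entry sits on its own line as "Key: value".
KEY_VALUE = re.compile(r"^\s*([A-Za-z][A-Za-z /]*?)\s*:\s*(.+?)\s*$")


def extract_fact_find(text):
    """Return a dict of the key/value pairs found on 'Key: value' lines."""
    facts = {}
    for line in text.splitlines():
        match = KEY_VALUE.match(line)
        if match:
            facts[match.group(1)] = match.group(2)
    return facts


sample = """Fact Find
Name: Jane Citizen
Marital status: Married
Risk profile: Balanced
Annual income: 85000
"""
facts = extract_fact_find(sample)
```

Each extracted dictionary would then become one record of the training set, with one column per key.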
Figure 5 is a diagram illustrating the content prediction engine. In this embodiment, this is a machine learning arrangement 150, in the form of a neural network. The neural network 12 is a deep neural network with four fully connected layers (100 neurons in the first layer, 50 in the second layer, 25 in the third layer, and 10 in the final layer), 25% dropout, the ReLU activation function and the Adam optimisation algorithm.
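To make the layer arrangement concrete, the following is a dependency-free sketch of a forward pass through four fully connected layers of 100, 50, 25 and 10 neurons with ReLU activations and 25% dropout. Several details are assumptions not stated in the description: the input width of 20, the He weight initialisation, applying dropout after each hidden layer, and a softmax on the final layer. Training with the Adam optimiser is not shown.

```python
import math
import random

LAYER_SIZES = [100, 50, 25, 10]  # neurons per layer, as in the description
DROPOUT_RATE = 0.25


def init_layers(input_size, rng):
    """Create (weights, biases) for each fully connected layer."""
    layers = []
    fan_in = input_size
    for fan_out in LAYER_SIZES:
        scale = math.sqrt(2.0 / fan_in)  # He initialisation (an assumption)
        weights = [[rng.gauss(0.0, scale) for _ in range(fan_in)] for _ in range(fan_out)]
        layers.append((weights, [0.0] * fan_out))
        fan_in = fan_out
    return layers


def forward(x, layers, rng=None):
    """Forward pass. Passing rng enables dropout (training mode)."""
    for index, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if index < len(layers) - 1:
            x = [max(0.0, v) for v in x]  # ReLU on hidden layers
            if rng is not None:
                # Inverted dropout: zero 25% of activations, rescale the rest.
                keep = 1.0 - DROPOUT_RATE
                x = [0.0 if rng.random() < DROPOUT_RATE else v / keep for v in x]
    # Softmax over the 10 outputs (assumed, for multi-class classification).
    peak = max(x)
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
layers = init_layers(input_size=20, rng=rng)
probabilities = forward([1.0] * 20, layers)
parameter_count = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
```

With the assumed 20 input features, this arrangement has 8,685 trainable parameters (2,100 + 5,050 + 1,275 + 260).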
The neural network 12 is trained based on input historical prediction attributes which are obtained by parsing the historical documents 15 to find the Fact Finds (attributes). Other facts (attributes) can be obtained in many other ways, e.g. manual entry, automatically obtaining information from networks such as the Web (via a Web API 17), or in any other way.
The historical Fact Finds are obtained from the historical document corpus 15 by parsing the documents to extract information and by pairing the keys and values in the Fact Find (step 150). All the attributes obtained from the parsing of the historical documents and by other means are then used to make a training set which is fed into the deep neural network 12 in order for the network to understand the relationship between personal information (and other attributes) and Strategies/Products/Goals (step 151).
The trained model 12 will then be used to predict the Strategy, Product and Goals for the next client (step 152). This will be based on the new Fact Find that will be input to the trained model.
The neural network essentially outputs strategic guidance elements (153) that are input to the selection engine 21 together with the new Fact Find information, so that the selection engine 21 can select from the available sections of document (from database 2) and output a document which includes text relevant to the predicted Strategies and Products and to the Fact Find (154), to generate the final SOA 25. The SOA may be reviewed by an expert to finalise it.
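The role of the selection engine can be sketched as a filter over tagged segments followed by assembly. The segment records, the tag strings and the use of "{name}" placeholders filled from the Fact Find are all assumptions for illustration; the description does not specify how segments are stored or how Fact Find values are merged into the text.

```python
# Hypothetical segment database: static segments are always included,
# dynamic segments only when one of their tags was predicted.
SEGMENT_DATABASE = [
    {"kind": "static", "tags": set(),
     "text": "Statement of Advice prepared for {name}."},
    {"kind": "dynamic", "tags": {"strategy:salary_sacrifice"},
     "text": "We recommend that you salary sacrifice into superannuation."},
    {"kind": "dynamic", "tags": {"product:income_protection"},
     "text": "We recommend an income protection policy."},
    {"kind": "static", "tags": set(),
     "text": "Please contact your adviser with any questions."},
]


def select_segments(database, predicted_elements):
    """Keep static segments and dynamic segments matching a predicted element."""
    return [
        segment for segment in database
        if segment["kind"] == "static" or segment["tags"] & predicted_elements
    ]


def build_document(database, predicted_elements, fact_find):
    """Assemble the selected segments, filling placeholders from the Fact Find."""
    selected = select_segments(database, predicted_elements)
    return "\n".join(segment["text"].format(**fact_find) for segment in selected)


document = build_document(
    SEGMENT_DATABASE,
    predicted_elements={"strategy:salary_sacrifice"},
    fact_find={"name": "Jane Citizen"},
)
```

The resulting draft contains the two static segments and the one dynamic segment matching the predicted Strategy, in database order, ready for expert review.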
Figure 13 is a flow diagram illustrating the operation of the neural network 12 in more detail.
At step 500 the historical prediction attributes are input. For every field in the Fact Find a new column is made in the table (501). A record is added for every Fact Find in the historical SOAs (502). The categorical columns are binarised (503) and all the numerical columns are binned (504). Strategies are then processed. The Strategies are picked (506) and used as a class table (507). The class table is one-hot encoded (508) and a training set is made (509).
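The table preparation of steps 501 to 509 can be sketched as follows. The field names, the sample records and the age bin edges are hypothetical; the description does not state which fields exist or how the bins are chosen.

```python
def binarise(records, column):
    """One 0/1 indicator column per distinct value of a categorical column."""
    categories = sorted({record[column] for record in records})
    return [
        {f"{column}={category}": int(record[column] == category) for category in categories}
        for record in records
    ]


def bin_numeric(value, edges):
    """Index of the bin a value falls into; edges are ascending upper bounds."""
    for index, edge in enumerate(edges):
        if value < edge:
            return index
    return len(edges)


def one_hot(labels):
    """Encode class labels as one-hot rows; returns (classes, rows)."""
    classes = sorted(set(labels))
    return classes, [[int(label == c) for c in classes] for label in labels]


# Hypothetical historical Fact Finds with the Strategy used in each SOA.
fact_finds = [
    {"marital_status": "married", "age": 54, "strategy": "salary_sacrifice"},
    {"marital_status": "single", "age": 31, "strategy": "debt_reduction"},
    {"marital_status": "married", "age": 67, "strategy": "salary_sacrifice"},
]
AGE_EDGES = [35, 50, 65]  # hypothetical bin boundaries

features = []
for record, indicators in zip(fact_finds, binarise(fact_finds, "marital_status")):
    row = dict(indicators)
    row["age_bin"] = bin_numeric(record["age"], AGE_EDGES)
    features.append(row)

classes, targets = one_hot([record["strategy"] for record in fact_finds])
```

The feature rows and the one-hot targets together form the training set. A new client's Fact Find would need to pass through the same binarising and binning, with the same categories and bin edges, before prediction.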
Parameters are adjusted (510) and hyper-parameters are adjusted (511). The deep learning neural model is trained (512) and the hyper-parameters are optimised (513) and the trained model is saved (514). This process is repeated for all Strategies (step 505).
Next the Products are processed (step 516), in a similar manner to the Strategies (steps 517, 518, 519, 520, 521, 522 and 523). The trained model is saved (524). This process is repeated for all Products (step 515) until the machine learning arrangement 12 is trained.
In an example, the training set for SOAs has been used to train the model and after 2000 epochs the model reached maximum accuracy of 91% on the training set.
New client Fact Finds and Goals are used as the prediction set (step 525). For each part of the SOA the model is run to predict the relevant segment that needs to be assembled in a new document. The new Fact Finds are processed by making a new column in a table for every field in the Fact Find (526). The categorical columns are binarised (527) and all the numerical columns are binned (528).
The table is fed into the saved trained model to output prediction elements for predicting the relevant document segment that needs to be assembled in the new document.
Figure 14 shows a display of part of an output of a dynamic parser. Document segments are shown on the left (reference numeral 500) and corresponding strategic guidance elements are shown on the right (reference numeral 501).
The output of the neural network selects the strategic guidance elements and from this the corresponding document segments are selected.
The generated SOA is revised by an expert for further modification and improvement so that it is accurate and ready to be delivered to the client. Further, the final versions will be used to improve the training set, and more weight will be assigned to the newer data points, so that the model will be more strongly influenced by changes.
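The passage does not say how newer data points are given more influence. One possibility, assumed here purely for illustration, is a per-record sample weight that decays exponentially with the age of the record. The half-life of 365 days is a hypothetical parameter.

```python
def recency_weights(ages_in_days, half_life_days=365.0):
    """Sample weights that halve every half_life_days (hypothetical scheme)."""
    return [0.5 ** (age / half_life_days) for age in ages_in_days]


# Ages of three finalised SOAs: reviewed today, one year ago, two years ago.
weights = recency_weights([0, 365, 730])
```

Such weights could be supplied to the training procedure so that errors on recently reviewed SOAs count for more than errors on older ones.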
In the above embodiment, the system is arranged for preparation of SOAs for financial advice. It will be appreciated that the system could be arranged for generating other documents, e.g. legal contracts, etc.
In the above embodiment, the AI engine is implemented by a deep neural network. The AI engine is not limited to a deep neural network. It may be implemented by any other type of neural network, or by any other type of machine learning process and system.
It will be understood by persons skilled in the art of the invention that many modifications may be made without departing from the spirit and scope of the invention.

Claims (17)

  1. A system for generating documents that have dynamic content, comprising a document content generator comprising a database storing a plurality of document segments, and a selection engine arranged to select document segments and build a document from the selected segments;
    and a content prediction engine arranged to receive input attributes to process the input prediction attributes to provide content parameters and to use the content parameters to affect the selection engine to select document segments and generate a document.
  2. A system according to claim 1, wherein the content prediction engine comprises a machine learning arrangement.
  3. A system in accordance with claim 2, wherein the machine learning arrangement comprises a neural network.
  4. A system in accordance with claim 2 or claim 3, wherein the machine learning arrangement is arranged to implement a solution to a multi-class classification problem.
  5. A system in accordance with any one of claims 2 to 4, wherein the content parameters comprise strategic guidance elements, and the machine learning arrangement is arranged to output the strategic guidance elements.
  6. A system in accordance with any one of claims 2 to 5, wherein the machine learning arrangement is trained based on input historical prediction attributes.
  7. A system in accordance with claim 6, wherein the input historical prediction attributes are obtained from historical document corpus of the same type of document as the document being generated.
  8. A system in accordance with any one of the preceding claims, wherein the document content generator comprises a dynamic parser arranged to produce the plurality of document segments by analysis and segmentation of historical documents of the same type as the document being generated.
  9. A system in accordance with claim 8, wherein the dynamic parser is arranged to parse the historical set of documents to identify static portions of the documents (portions with content generally remaining the same) and dynamic portions of the document (portions with variable content).
  10. A system in accordance with any one of the preceding claims, wherein the document is a statement of advice (SOA) for financial advice.
  11. A method of generating documents that have dynamic content, comprising the steps of:
    processing input prediction attributes by a content prediction process, to provide content parameters;
    using the content parameters to affect selection from a plurality of stored document segments; and generating a document from the selected document segments.
  12. A method in accordance with claim 11, comprising implementing the content prediction process by way of a machine learning process.
  13. A method in accordance with claim 11 wherein the machine learning process comprises a neural network.
  14. A method of instructing a system for generating documents that have dynamic content, the method comprising the steps of:
    parsing a plurality of historical documents of the same type as the document to be generated, by analysing and segmenting the historical documents, to prepare a database of a plurality of document segments;
    training a machine learning arrangement based on input historical prediction attributes, obtained from historical document corpus of the same type of document as the document being generated, to prepare the machine learning arrangement to output strategic guidance elements for affecting the selection of a plurality of document segments from the available document segments, based on input current prediction attributes.
  15. A computer program, comprising instructions for controlling a computer to implement a system in accordance with any one of claims 1 to 10.
  16. A computer readable medium, providing a computer program in accordance with claim 15.
  17. A data signal, comprising a computer program in accordance with claim 15.
AU2018202420A 2018-04-05 2018-04-05 A System and Method for Generating Documents Abandoned AU2018202420A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
AU2018202420A AU2018202420A1 (en) 2018-04-05 2018-04-05 A System and Method for Generating Documents
PCT/AU2019/050305 WO2019191817A1 (en) 2018-04-05 2019-04-05 A system and method for generating documents

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2018202420A AU2018202420A1 (en) 2018-04-05 2018-04-05 A System and Method for Generating Documents

Publications (1)

Publication Number Publication Date
AU2018202420A1 true AU2018202420A1 (en) 2019-10-24

Family

ID=68099663

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2018202420A Abandoned AU2018202420A1 (en) 2018-04-05 2018-04-05 A System and Method for Generating Documents

Country Status (2)

Country Link
AU (1) AU2018202420A1 (en)
WO (1) WO2019191817A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112015978B (en) * 2020-07-24 2023-06-23 上海淇玥信息技术有限公司 Custom information sending method and device and electronic equipment
CN112749253B (en) * 2020-12-28 2022-04-05 湖南大学 Multi-text abstract generation method based on text relation graph
AU2023204364A1 (en) * 2022-08-22 2024-03-07 Rohirrim, Inc. Computer-generated content based on text classification, semantic relevance, and activation of deep learning large language models

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8073708B1 (en) * 2006-08-16 2011-12-06 Resource Consortium Limited Aggregating personal healthcare informatoin
US10839149B2 (en) * 2016-02-01 2020-11-17 Microsoft Technology Licensing, Llc. Generating templates from user's past documents
US10579721B2 (en) * 2016-07-15 2020-03-03 Intuit Inc. Lean parsing: a natural language processing system and method for parsing domain-specific languages
US11049190B2 (en) * 2016-07-15 2021-06-29 Intuit Inc. System and method for automatically generating calculations for fields in compliance forms

Also Published As

Publication number Publication date
WO2019191817A1 (en) 2019-10-10

Legal Events

Date Code Title Description
MK4 Application lapsed section 142(2)(d) - no continuation fee paid for the application