CN108563705A - Audit inspection method and system based on text mining analysis technology - Google Patents


Info

Publication number
CN108563705A
Authority
CN
China
Prior art keywords
contract
audit
key information
enterprise
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810253604.5A
Other languages
Chinese (zh)
Inventor
高军
吴建国
罗江筑
巫阳波
李梅
韩进
梁从文
刘从容
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Power Grid Co Ltd
Original Assignee
Guizhou Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Power Grid Co Ltd
Priority to CN201810253604.5A
Publication of CN108563705A
Legal status: Pending

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/151 Transformation

Abstract

The invention discloses an audit inspection method based on text mining analysis technology, comprising the steps of: S1, extracting enterprise contract data from an enterprise contract management system, extracting key contract information from it, and storing that information in structured form; S2, checking the extracted key contract information against fund flow data to find audit issues. Compared with previous methods of auditing enterprise contract data, the invention has the following advantages: first, the key contract information in enterprise contracts is extracted automatically, which saves considerable labor and financial cost compared with manual extraction; second, contract data is checked automatically against the fund flow data returned by the bank, so that mismatches are found and problems are not missed through human oversight. A system that executes the method is also provided, comprising a data acquisition module, a key contract information extraction module, and an audit check module.

Description

Audit inspection method and system based on text mining analysis technology
Technical field
The present invention relates to the field of audit data mining, and specifically to an audit inspection method and system based on text mining technology.
Background
With the arrival of the big data era, auditing, as a comprehensive economic supervision function, faces major challenges. Enterprise systems generate massive amounts of unstructured data, and manual auditing alone is far from sufficient. The auditing of unstructured data has become a blind spot in audit work, and there is an urgent need to use advanced technical means and tools to analyze and mine unstructured data so as to provide data support for audit work.
Text mining is used to obtain information that is of interest or use to the user from unstructured text. It covers multiple technologies, including information extraction, information retrieval, natural language processing, and data mining, and is mainly used to extract previously unknown knowledge from text that was not originally put to use.
Existing audits generally extract data manually, which is prone to omissions, and no automated audit system for unstructured data has yet been formed in the audit field.
Summary of the invention
The technical problem solved by the present invention is that the audit field has not yet formed an automated audit system.
The solution by which the present invention solves this technical problem is as follows. In one aspect, an audit inspection method based on text mining analysis technology comprises the steps of:
S1, extracting enterprise contract data from an enterprise contract management system, extracting key contract information, and storing it in structured form;
S2, checking the extracted key contract information against fund flow data.
Further, the enterprise contract data includes contract documents, and the document format of the contract documents is any of pdf, doc, or docx.
Further, the key contract information includes: contract payment information, total contract price, first payment time, first payment amount, second payment time, and second payment amount.
Further, step S1 includes:
S11, reading the contract documents using document reading technology;
S12, formulating an extraction rule base for the key contract information, and using the rule base to extract the key contract information by text extraction techniques;
S13, establishing a data table and storing the key contract information extracted in step S12 in the data table.
Further, step S2 includes:
S21, extracting fund flow data from the financial system;
S22, matching the fund flow data against the key contract information extracted in step S12 according to pre-formulated audit rules;
S23, grouping the key contract information that fails the matching.
In another aspect, an audit inspection system based on text mining analysis technology is provided, comprising a data acquisition module, a key contract information extraction module, and an audit check module. The data acquisition module is used to extract enterprise contract data; the key contract information extraction module is used to extract key contract information from the enterprise contract data; the audit check module is used to match the key contract information against fund flow data according to pre-formulated audit rules.
Further, the system also includes a front-end display module, which is used to display the enterprise contract data extracted by the data acquisition module and the key contract information extracted from the enterprise contract data by the key contract information extraction module.
The beneficial effects of the invention are as follows. In one aspect, the audit inspection method provided by the present invention uses text mining technology to extract key contract information from enterprise contracts automatically and form structured data, which is compared with the fund flow data returned by the bank in the financial system. By formulating audit issue rules, audit issues are found and grouped, so that they are classified and issues of the same kind can be audited together. Compared with previous methods of auditing enterprise contract data, the invention has the following advantages: first, the key contract information in enterprise contracts is extracted automatically, which saves considerable labor and financial cost compared with manual extraction; second, contract data is checked automatically against the fund flow data returned by the bank, so that mismatches are found and problems are not missed through human oversight.
In another aspect, the present invention also provides a system that executes the method.
Description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the described drawings show only some embodiments of the present invention rather than all of them, and those skilled in the art can obtain other designs and drawings from them without creative effort.
Fig. 1 is a flow chart of the steps of the audit inspection method of the present invention;
Fig. 2 is a system block diagram of the audit inspection system of the present invention.
Detailed description
The concept, specific structure, and technical effects of the present invention are described clearly and completely below with reference to the embodiments and drawings, so that the purpose, features, and effects of the present invention can be fully understood. Obviously, the described embodiments are only some of the embodiments of the present invention rather than all of them; other embodiments obtained by those skilled in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention. In addition, the connection/coupling relationships mentioned herein do not refer solely to the direct connection of components, but mean that a better coupling structure may be formed by adding or removing coupling accessories according to the specific implementation. The technical features of the invention may be combined with one another provided they do not conflict.
Embodiment 1. The invention discloses an audit inspection method based on text mining technology, comprising the following steps:
S1, extracting enterprise contract data from an enterprise contract management system, extracting key contract information, and storing it in structured form; wherein the enterprise contract data includes contract documents, and the document format of the contract documents is any of pdf, doc, or docx; the key contract information includes contract payment information, such as total contract price, first payment time, first payment amount, second payment time, and second payment amount.
S2, checking the extracted key contract information against fund flow data;
With reference to Fig. 1, the specific implementation of the above steps is described in detail as follows:
S11, reading the contract documents using document reading technology;
S12, formulating a rule base for the key contract information, and using the rule base to extract the key contract information by text extraction techniques;
S13, establishing a data table in a database and storing the key contract information extracted in step S12 in the data table;
By establishing a data table, step S13 achieves structured storage of the unstructured information.
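The structured storage of step S13 can be sketched as follows. This is a minimal illustration only: the patent does not name a database product or a table layout, so the use of SQLite, the table name contract_key_info, and all column names are assumptions made for the example.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative and are
# not taken from the patent. Amounts are stored as REAL for brevity; a
# production system would more likely store integer cents or decimals.
SCHEMA = """
CREATE TABLE IF NOT EXISTS contract_key_info (
    contract_id           TEXT PRIMARY KEY,
    total_price           REAL,
    first_payment_date    TEXT,
    first_payment_amount  REAL,
    second_payment_date   TEXT,
    second_payment_amount REAL
)
"""


def store_key_info(conn, record):
    """Store one extracted contract record (a dict) in the data table."""
    conn.execute(SCHEMA)
    conn.execute(
        "INSERT OR REPLACE INTO contract_key_info VALUES ("
        ":contract_id, :total_price, :first_payment_date, "
        ":first_payment_amount, :second_payment_date, :second_payment_amount)",
        record,
    )
    conn.commit()


def load_key_info(conn, contract_id):
    """Return the stored row for a contract as a tuple, or None if absent."""
    cursor = conn.execute(
        "SELECT * FROM contract_key_info WHERE contract_id = ?",
        (contract_id,),
    )
    return cursor.fetchone()
```

Once the information is in a table of this kind, the matching of step S22 can be written as ordinary queries or joins against the fund flow data.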
S21, extracting fund flow data from the financial system;
In step S21, the fund flow data is the data returned by the bank to the financial system, including payment time, payment amount, and payer.
S22, matching the fund flow data against the key contract information extracted in step S12 according to pre-formulated audit rules;
For example: the payment time C1 and payment amount C2 are captured from the fund flow data, and the payment time D1 and payment amount D2 in the contract of company B are captured by text techniques; according to the audit rules, C1 is compared with D1 and C2 with D2 to determine whether the activity between the parties meets the audit rules.
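The comparison in the example above can be sketched as follows. The Payment type and the three result flags are hypothetical names introduced for illustration; the patent only states that C1 is compared with D1 and C2 with D2, and does not specify tolerances or how the comparison result is represented.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Payment:
    """A payment time and amount, from either the bank or the contract."""
    paid_on: date
    amount: Decimal


def compare_payment(actual: Payment, agreed: Payment) -> dict:
    """Compare an actual payment (C1, C2) with the contract terms (D1, D2).

    Exact equality is an assumption of this sketch; a real rule might
    allow a grace period or a rounding tolerance.
    """
    return {
        "date_matches": actual.paid_on == agreed.paid_on,
        "paid_early": actual.paid_on < agreed.paid_on,
        "amount_matches": actual.amount == agreed.amount,
    }
```

Decimal is used for the amounts so that values such as 5000.00 and 5000 compare equal without floating-point rounding effects.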
S23, labeling and grouping the key contract information that fails the matching.
By labeling and grouping the key contract information that fails the matching, step S23 makes it convenient for auditors to handle issues of the same kind together.
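The labeling and grouping of step S23 can be sketched as follows, assuming that each failed match has already been reduced to a pair of a contract identifier and an issue label. Both the pair representation and the function name group_by_label are assumptions made for this example.

```python
from collections import defaultdict


def group_by_label(findings):
    """Group (contract_id, label) pairs so each label maps to its contracts.

    Contracts keep the order in which they were reported.
    """
    groups = defaultdict(list)
    for contract_id, label in findings:
        groups[label].append(contract_id)
    return dict(groups)
```

An auditor can then work through one label at a time, which corresponds to the concentrated handling of same-kind issues described above.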
The key contract information includes contract payment information, such as total contract price, first payment time, first payment amount, second payment time, and second payment amount. Because this information has a fixed format, a rule base is formulated for the key contract information, and the key contract information is extracted with the rule base by text extraction techniques. For example, to extract the data for "total contract price" and "first payment time", the extraction rules are: the keyword "total contract price" followed by an amount (regular expression ((^[-]([1-9]\d*))|^0)(\.\d{1,2})$|(^[-]0\.\d{1,2}$)); and the keyword "first payment time" followed by time-format data (a time in YYYY/MM/DD form, regular expression ^\d{4}(-|/|)\d{1,2}\1\d{1,2}$). Both the rule base and the text capture technique of this embodiment are implemented with the PCRE tool.
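A keyword-plus-pattern rule base of this kind can be sketched as follows. Several points are assumptions: the sketch uses Python's built-in re module rather than PCRE; the amount and date patterns are simplified stand-ins and not the patent's exact expressions; and the English keywords and field names are hypothetical placeholders for the keywords a real contract would contain.

```python
import re

# Simplified stand-in patterns (assumptions, not the patent's originals).
# AMOUNT: optional sign, integer part without leading zeros, up to two decimals.
AMOUNT = r"-?(?:0|[1-9]\d*)(?:\.\d{1,2})?"
# DATE: YYYY-MM-DD or YYYY/MM/DD; the named backreference forces both
# separators to be the same character.
DATE = r"\d{4}(?P<sep>[-/])\d{1,2}(?P=sep)\d{1,2}"

# Rule base: field name -> compiled "keyword + value pattern" rule.
# The colon class accepts both the ASCII and the full-width colon.
EXTRACTION_RULES = {
    "total_price": re.compile(
        r"Total contract price\s*[::]\s*(?P<value>" + AMOUNT + ")"
    ),
    "first_payment_date": re.compile(
        r"First payment date\s*[::]\s*(?P<value>" + DATE + ")"
    ),
}


def extract_key_info(text):
    """Apply every rule to the text; fields without a match are None."""
    result = {}
    for field, pattern in EXTRACTION_RULES.items():
        match = pattern.search(text)
        result[field] = match.group("value") if match else None
    return result
```

Named groups are used instead of numbered backreferences so that wrapping the date pattern in an outer capturing group does not shift the group that the backreference points to.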
The pre-formulated audit rules mentioned in step S22 may be defined according to the audit issues of concern (for example, payment not executed according to the contract terms, A1; advance payment, A2; inconsistent payment amount, A3). An audit issue rule has the form B*: A*...A*, for example B1: A1, B2: A2, B3: A3, B4: A2A3, where B* is the rule number and A*...A* are the conditions that the rule must satisfy.
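The rule form B*: A*...A* can be sketched as a mapping from rule numbers to sets of conditions. The rule table below mirrors the four example rules; the evaluation policy, in which a rule fires whenever all of its conditions are observed, is an assumption of this sketch, since the patent does not say whether a combined rule such as B4 suppresses the simpler rules B2 and B3.

```python
# Rule base mirroring the examples B1: A1, B2: A2, B3: A3, B4: A2A3.
AUDIT_RULES = {
    "B1": {"A1"},        # payment not executed according to contract terms
    "B2": {"A2"},        # advance payment
    "B3": {"A3"},        # inconsistent payment amount
    "B4": {"A2", "A3"},  # advance payment with an inconsistent amount
}


def triggered_rules(conditions, rules=AUDIT_RULES):
    """Return, in sorted order, the rules whose conditions are all observed.

    Assumed policy: a rule fires when its condition set is a subset of
    the observed conditions, so B4 fires together with B2 and B3.
    """
    observed = set(conditions)
    return sorted(
        rule for rule, required in rules.items() if required <= observed
    )
```

The rule numbers returned for a contract can serve directly as the labels used for grouping in step S23.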
In summary, the audit inspection method provided by this embodiment uses text mining technology to extract key contract information from enterprise contracts automatically and form structured data, which is compared with the fund flow data returned by the bank in the financial system. By formulating audit issue rules, audit issues are found and grouped, so that they are classified and issues of the same kind can be audited together.
Compared with previous methods of auditing enterprise contract data, the invention has the following advantages: first, the key contract information in enterprise contracts is extracted automatically, which saves considerable labor and financial cost compared with manual extraction; second, contract data is checked automatically against the fund flow data returned by the bank, so that mismatches are found and problems are not missed through human oversight.
With reference to Fig. 2, the system comprises a data acquisition module A10, a key contract information extraction module A20, and an audit check module A30. The data acquisition module A10 acquires the contract document data stored in the enterprise contract management system and displays it through the front-end display module A4. The information extraction module A20 extracts key contract information from the contract documents, extracts fund flow data from the financial system, and displays both through the front-end display module A4, so that users can promptly confirm whether the extracted key contract information is correct.
The audit check module A30 includes an audit rule formulation module A31 and a matching module A32. The audit rule formulation module A31 formulates audit rules according to the audit issues of concern. The matching module A32 matches the extracted key contract information against the fund flow data, finds audit issues according to the formulated audit rules, and visualizes them through the front-end display module A4.
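The cooperation of the modules can be sketched as a small pipeline. This is only one possible wiring: the function run_audit and its four callable parameters are hypothetical, each standing in for one module (acquire for A10, extract for A20, check for A30, display for A4), and the check callable is assumed to already have access to the fund flow data and the audit rules.

```python
from typing import Callable, Iterable, List


def run_audit(
    acquire: Callable[[], Iterable[str]],
    extract: Callable[[str], dict],
    check: Callable[[dict], list],
    display: Callable[[str, object], None],
) -> List:
    """Run acquisition, extraction, and checking, displaying each stage."""
    findings = []
    for document in acquire():
        display("contract", document)   # A10 output shown on the front end
        info = extract(document)
        display("key_info", info)       # A20 output shown for user review
        issues = check(info)
        display("issues", issues)       # A30 findings visualized
        findings.extend(issues)
    return findings
```

Passing the modules in as callables keeps the sketch independent of any particular document reader, database, or user interface.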
The preferred embodiments of the present invention have been described above, but the invention is not limited to these embodiments. Those skilled in the art may make various equivalent modifications or substitutions without departing from the spirit of the invention, and all such equivalent modifications or substitutions fall within the scope defined by the claims of this application.

Claims (7)

1. An audit inspection method based on text mining analysis technology, characterized by comprising the steps of:
S1, extracting enterprise contract data from an enterprise contract management system, extracting key contract information, and storing it in structured form;
S2, checking the extracted key contract information against fund flow data.
2. The audit inspection method based on text mining analysis technology according to claim 1, characterized in that the enterprise contract data includes contract documents, and the document format of the contract documents is any of pdf, doc, or docx.
3. The audit inspection method based on text mining analysis technology according to claim 2, characterized in that the key contract information includes: contract payment information, total contract price, first payment time, first payment amount, second payment time, and second payment amount.
4. The audit inspection method based on text mining analysis technology according to claim 3, characterized in that step S1 includes:
S11, reading the contract documents using document reading technology;
S12, formulating an extraction rule base for the key contract information, and using the rule base to extract the key contract information by text extraction techniques;
S13, establishing a data table and storing the key contract information extracted in step S12 in the data table.
5. The audit inspection method based on text mining analysis technology according to claim 4, characterized in that step S2 includes:
S21, extracting fund flow data from the financial system;
S22, matching the fund flow data against the key contract information extracted in step S12 according to pre-formulated audit rules;
S23, grouping the key contract information that fails the matching.
6. An audit inspection system based on text mining analysis technology, characterized by comprising: a data acquisition module, a key contract information extraction module, and an audit check module; the data acquisition module is used to extract enterprise contract data; the key contract information extraction module is used to extract key contract information from the enterprise contract data; the audit check module is used to match the key contract information against fund flow data according to pre-formulated audit rules.
7. The audit inspection system based on text mining analysis technology according to claim 6, characterized by further comprising a front-end display module, the front-end display module being used to display the enterprise contract data extracted by the data acquisition module and the key contract information extracted from the enterprise contract data by the key contract information extraction module.
CN201810253604.5A 2018-03-26 2018-03-26 A kind of audit auditing method and system based on text mining analysis technology Pending CN108563705A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810253604.5A CN108563705A (en) 2018-03-26 2018-03-26 A kind of audit auditing method and system based on text mining analysis technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810253604.5A CN108563705A (en) 2018-03-26 2018-03-26 A kind of audit auditing method and system based on text mining analysis technology

Publications (1)

Publication Number Publication Date
CN108563705A true CN108563705A (en) 2018-09-21

Family

ID=63533290

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810253604.5A Pending CN108563705A (en) 2018-03-26 2018-03-26 A kind of audit auditing method and system based on text mining analysis technology

Country Status (1)

Country Link
CN (1) CN108563705A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003036425A2 (en) * 2001-10-23 2003-05-01 Electronic Data Systems Corporation System and method for managing a procurement process
US20030115080A1 (en) * 2001-10-23 2003-06-19 Kasra Kasravi System and method for managing contracts using text mining
CN106815213A (en) * 2016-12-30 2017-06-09 全民互联科技(天津)有限公司 A kind of contract performance clause extraction method and system
CN107608958A (en) * 2017-09-07 2018-01-19 湖南湘君奕成信息技术有限公司 Contract text risk information method for digging and system based on clause unified Modeling

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
罗江筑: "大数据环境下基于文本挖掘的审计数据分析框架", 《会计之友》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109658069A (en) * 2018-12-25 2019-04-19 广东电网有限责任公司 A kind of electricity charge electrovalence risk monitoring method and system
CN112668323A (en) * 2019-10-14 2021-04-16 北京慧点科技有限公司 Text element extraction method based on natural language processing and text examination system thereof
CN112668323B (en) * 2019-10-14 2024-02-02 北京慧点科技有限公司 Text element extraction method based on natural language processing and text examination system thereof
CN111815162A (en) * 2020-07-08 2020-10-23 国网上海市电力公司 Digital auditing tool and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180921