CN113763143A - Auditing processing method, computer equipment and storage device - Google Patents

Publication number
CN113763143A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110888051.2A
Other languages
Chinese (zh)
Inventor
吴士泓
王志刚
李向
谢峰
徐静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yuanguang Software Co Ltd
Original Assignee
Yuanguang Software Co Ltd
Application filed by Yuanguang Software Co Ltd
Priority to CN202110888051.2A
Publication of CN113763143A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/103 Workflow collaboration or project management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/08 Auctions

Abstract

The application discloses an audit processing method, computer equipment and a storage device. The method comprises the following steps: establishing an audit file library of the project to be audited, wherein the audit file library is used for storing the file to be audited related to the project to be audited; acquiring at least one file to be audited from an audit file library, and acquiring audit key information of the file to be audited; and auditing the file to be audited based on the auditing key information. By the scheme, the auditing processing efficiency can be improved.

Description

Auditing processing method, computer equipment and storage device
Technical Field
The present application relates to the field of auditing processing technologies, and in particular, to an auditing processing method, a computer device, and a storage apparatus.
Background
In the course of bidding for each project, there are a lot of bidding documents for each project and bidding enterprise, and the auditing department usually needs to audit and supervise the bidding and other work of each project.
At present, in the process of manual auditing, the bidding documents of each project and bidding enterprise need to be manually collected, and a large number of auditors need to look up the bidding documents one by one, so as to manually search and record various auditing information and audit all the bidding documents. The auditing workload of the process is large, and the efficiency is low.
Disclosure of Invention
The technical problem mainly solved by the application is to provide the auditing processing method, the computer equipment and the storage device, so that the auditing processing efficiency can be improved.
In order to solve the above problem, a first aspect of the present application provides an audit processing method, including: establishing an audit file library of the project to be audited, wherein the audit file library is used for storing the file to be audited related to the project to be audited; acquiring at least one file to be audited from an audit file library, and acquiring audit key information of the file to be audited; and auditing the file to be audited based on the auditing key information.
In order to solve the above problem, a second aspect of the present application provides a computer device, which includes a memory and a processor coupled to each other, wherein the memory stores program data, and the processor is configured to execute the program data to implement any step of the above auditing processing method.
In order to solve the above problem, a third aspect of the present application provides a storage device, which stores program data capable of being executed by a processor, the program data being used for implementing any one of the steps of the above auditing processing method.
According to the scheme, an audit file library of the project to be audited is established, and the audit file library is used for storing the file to be audited related to the project to be audited; acquiring at least one file to be audited from an audit file library, and acquiring audit key information of the file to be audited; based on the audit key information, the files to be audited are audited, so that the manual audit of the files to be audited by auditors can be avoided, the audit workload is saved, and the audit processing efficiency is improved.
Drawings
In order to more clearly illustrate the technical solutions in the present application, the drawings required in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings described below are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive labor. Wherein:
FIG. 1 is a schematic flow chart diagram illustrating one embodiment of an audit processing method of the present application;
FIG. 2 is a flowchart illustrating an embodiment of step S12 in FIG. 1;
FIG. 3 is a schematic flowchart of an embodiment of a method for extracting information from a document according to the present application;
FIG. 4 is a flowchart illustrating an embodiment of step S23 in FIG. 3;
FIG. 5 is a flowchart illustrating an embodiment of step S13 of FIG. 1;
FIG. 6 is a flowchart illustrating an embodiment of a method for calculating similarity of texts according to the present application;
FIG. 7 is a flowchart illustrating an embodiment of step S33 of FIG. 6;
FIG. 8 is a flowchart illustrating a method for calculating similarity of texts according to another embodiment of the present application;
FIG. 9 is a schematic block diagram of an embodiment of a computer apparatus of the present application;
FIG. 10 is a schematic structural diagram of an embodiment of a memory device according to the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first" and "second" in this application are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless explicitly specifically limited otherwise. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The present application provides the following examples, each of which is specifically described below.
Referring to fig. 1, fig. 1 is a schematic flowchart illustrating an auditing method according to an embodiment of the present application. The method may comprise the steps of:
S11: Establish an audit file library for the project to be audited, where the audit file library is used to store the files to be audited that relate to the project to be audited.
As a supervision mechanism, auditing is an independent economic supervision activity. In accordance with national regulations, audit criteria, and accounting theory, it applies specialized methods to examine and supervise the authenticity, correctness, compliance, legality, and profitability of an audited unit's finances, financial revenue and expenditure, operation and management activities, and related data, in order to maintain financial and economic discipline, improve management, and increase economic benefits.
When a project is audited and supervised, a large number of files belonging to that project need to be audited. The present application uses a bidding project as the example project to be audited, but is not limited to this.
In the process of auditing, an audit file library of the project to be audited can be established, where the audit file library can be used to store the files to be audited that relate to the project to be audited. The audit file library can store the bidding documents and bid documents of a plurality of projects to be audited, and the bidding documents and/or the bid documents can be used as the files to be audited.
The bidding document is the outline of the engineering construction put out to tender. It is the working basis on which the construction unit carries out the engineering construction, and it gives the bidders all the conditions required to take part in the bidding.
The bid document is the response document that a bidder is required to compile, and generally comprises a business document, a technical document, a quotation document, and other parts. The bid document generally comprises three parts: a credit standing part, a business part, and a technical part. The credit standing part covers contents such as company qualification and company introduction, as well as other files that the bidding document requires the bidder to provide, including company performance records, certificates, and reports. The technical part comprises the technical scheme, such as the engineering description, the design and construction scheme, the bill of quantities, personnel configuration, drawings, tables, and other technical data. The business part includes the bid quotation specification, the total bid price, the price table of major materials, and the contract terms (general and special).
To establish the audit file library of the project to be audited, the files to be audited can be stored in the audit file library by category, according to the project to be audited and the file format of each file to be audited. The file format includes at least one of an electronic document format and other formats, where the other formats include any one of a picture and the Portable Document Format (PDF). For example, the files to be audited of different projects to be audited are stored in separate categories, and within one project to be audited, the files to be audited of different file formats are also stored in separate categories. In addition, a distributed file system can be used to store the files to be audited in the audit file library by category.
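The classified storage described above can be sketched as follows. This is a minimal illustration only: the extension sets, the category names, the function names `classify_format` and `library_path`, and the `<project>/<format class>/<file name>` layout are all assumptions made for the example and are not specified by the application.

```python
from pathlib import PurePosixPath

# Assumed mapping from file extension to the two format classes named in the text.
ELECTRONIC_DOCUMENT_EXTENSIONS = {".doc", ".docx", ".ppt", ".txt"}
OTHER_FORMAT_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg"}


def classify_format(file_name: str) -> str:
    """Return the format class used as a storage category (hypothetical names)."""
    suffix = PurePosixPath(file_name).suffix.lower()
    if suffix in ELECTRONIC_DOCUMENT_EXTENSIONS:
        return "electronic_document"
    if suffix in OTHER_FORMAT_EXTENSIONS:
        return "other"
    return "unknown"


def library_path(project_id: str, file_name: str) -> PurePosixPath:
    """Build a storage path: first by project to be audited, then by file format."""
    return PurePosixPath(project_id) / classify_format(file_name) / file_name
```

With this layout, all files of one project share a top-level directory, and files that still need conversion (pictures, PDFs) sit in their own sub-directory. The same key scheme could equally be used as an object path in a distributed file system.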
S12: Acquire at least one file to be audited from the audit file library, and acquire the audit key information of the file to be audited.
At least one file to be audited can be obtained from the audit file library. For example, if one target project to be audited is being audited, a plurality of files to be audited of that target project, that is, several or all of its bid documents, can be obtained from the audit file library. The audit key information required for auditing the project is obtained from these files, and which audit key information is extracted can depend on the specific audit project. Examples include the bidding enterprise, the bidding enterprise's qualifications, the bid quotation, and the technical content of the bid; the present application is not limited to these.
S13: Audit the file to be audited based on the audit key information.
Based on the audit key information extracted from the files to be audited of each project to be audited, for example based on the association relationships among the audit key information, the audit key information can be analyzed and mined in order to audit the files to be audited. In addition, the similarity between the files to be audited can be obtained from the audit key information, so that the files to be audited are audited based on that similarity.
In the embodiment, an audit file library of the project to be audited is established, and the audit file library is used for storing the file to be audited related to the project to be audited; acquiring at least one file to be audited from an audit file library, and acquiring audit key information of the file to be audited; based on the audit key information, the files to be audited are audited, so that the manual audit of the files to be audited by auditors can be avoided, the audit workload is saved, and the audit processing efficiency is improved.
In some embodiments, referring to fig. 2, the step S12 may include the following steps:
S121: If the file format of the file to be audited is one of the other formats, convert the file to be audited into an electronic document format file or structured data, where the other formats include any one of a picture and the Portable Document Format.
Before the audit key information of the file to be audited is obtained in step S12, the method may include detecting the file format of the file to be audited and, if the file format is one of the other formats (a picture or the Portable Document Format), converting the file to be audited into an electronic document format file or structured data. The electronic document format may be an e-book format and may include any format that can record text and image information, such as DOC (document), PPT (PowerPoint slide format), or TXT (plain text). Structured data, also called row data, is data that is logically represented and implemented by a two-dimensional table structure, strictly follows data format and length specifications, and is mainly stored and managed by a relational database. Data in Excel document format is one example of structured data; the present application is not limited to this.
Optionally, the file to be audited may be converted into an electronic document format file or structured data as follows. Optical Character Recognition (OCR) may be used to recognize the character information in the file to be audited. For example, OCR may be applied to a file to be audited in picture format to convert the characters in the picture into text, giving a recognition result. An electronic document format file or structured data for the file to be audited can then be generated from the recognition result.
Optionally, additional information entered for the file to be audited may also be obtained; the additional information may be information that a worker enters manually for the file to be audited. The file to be audited may then be converted into an electronic document format file or structured data based on the entered additional information.
Optionally, when OCR is used to recognize the characters in the file to be audited, additional information may be obtained for the characters that cannot be recognized; here the additional information is what a worker enters manually for the characters that OCR failed to recognize. An electronic document format file or structured data for the file to be audited can then be generated from the OCR recognition result together with the entered additional information.
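The conversion step can be sketched as a small dispatch function. This is an illustration under stated assumptions: the OCR engine is passed in as a callable rather than tied to any particular library, the format labels "picture" and "pdf" are hypothetical, and the convention that unrecognized characters are returned as U+FFFD and patched by position is invented for the example.

```python
from typing import Callable, Dict, Optional

OTHER_FORMATS = {"picture", "pdf"}  # hypothetical labels for the "other formats"
UNRECOGNIZED = "\uFFFD"  # assumed placeholder for a character OCR could not read


def convert_to_text(
    file_format: str,
    content: bytes,
    ocr: Callable[[bytes], str],
    additional_info: Optional[Dict[int, str]] = None,
) -> str:
    """Convert a file to be audited into plain text.

    Electronic documents are decoded directly. Pictures and PDFs go through
    the supplied OCR callable, and additional_info maps character positions
    to manually entered replacements for unrecognized characters.
    """
    if file_format not in OTHER_FORMATS:
        return content.decode("utf-8")
    recognized = list(ocr(content))
    for position, replacement in (additional_info or {}).items():
        # Only overwrite positions that OCR actually marked as unreadable.
        if 0 <= position < len(recognized) and recognized[position] == UNRECOGNIZED:
            recognized[position] = replacement
    return "".join(recognized)
```

Keeping the OCR engine behind a callable lets the manual-entry path be tested without any image processing installed.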
Obtaining the audit key information of the file to be audited in step S12 may include the following steps:
S122: Extract the audit business information in the bid business document by using a preset extraction model.
The files to be audited comprise the bid business documents and bid technical documents related to the project to be audited. That is, the files to be audited include at least the bid business document and the bid technical document of each bid document. In some embodiments, the files to be audited may also include the quotation document and other parts of the bid document, which is not limited in this application.
Different extraction methods can be applied to the bid business document and the bid technical document to extract the audit key information that each contains.
For the bid business document, the audit business information can be extracted by using a preset extraction model. The preset extraction model is a model built on machine learning and can be trained before it is used. The training process of the preset extraction model is described in the embodiments below.
The audit business information extracted from the bid business document may include at least one of: qualification information of the bidding enterprise, enterprise information of the bidding enterprise, bid quotation information, delivery date, and the like.
In some embodiments, the audit business information may further include at least one of: bid item name, bid item number, bid enterprise name, bid agent identity card number, and bid time. The present application does not limit the audit business information to these items.
S123: Extract the audit scheme information in the bid technical document by using a regular expression based on a preset extraction rule.
For the bid technical document, the audit scheme information can be extracted with regular expressions based on a preset extraction rule. The preset extraction rule can be an extraction rule configured in advance for the bid technical documents of the project to be audited.
The audit scheme information extracted from the bid technical document may include at least one of: bid item name, bid item number, bid enterprise name, bid agent identity card number, bid time, and chapter structure information. The chapter structure information includes the chapter text of at least one of: project situation, service scheme introduction, service process, service arrangement after the project is finished, progress control measures, and quality measures, which is not limited in this application.
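As an illustration of rule-based extraction of the header fields listed above, the sketch below configures one regular expression per field. The "Label: value" line layout, the English labels, the date format, and the names `FIELD_RULES` and `extract_fields` are assumptions for the example; real bid technical documents would need rules written for their actual wording.

```python
import re
from typing import Dict

# Hypothetical preset extraction rules: one pattern per field,
# assuming each field appears on its own "Label: value" line.
FIELD_RULES = {
    "bid_item_name": re.compile(r"^Bid item name:[ \t]*(.+)$", re.MULTILINE),
    "bid_item_number": re.compile(r"^Bid item number:[ \t]*(\S+)$", re.MULTILINE),
    "bid_enterprise_name": re.compile(r"^Bid enterprise name:[ \t]*(.+)$", re.MULTILINE),
    "bid_time": re.compile(r"^Bid time:[ \t]*(\d{4}-\d{2}-\d{2})$", re.MULTILINE),
}


def extract_fields(text: str) -> Dict[str, str]:
    """Apply every rule to the text and collect the fields that matched."""
    result = {}
    for name, pattern in FIELD_RULES.items():
        match = pattern.search(text)
        if match:
            result[name] = match.group(1).strip()
    return result
```

Fields whose rule does not match are simply absent from the result, which lets a later step flag incomplete documents.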
In some embodiments, after the audit key information of the file to be audited is obtained, the audit key information may also be stored in the audit file library. For example, a bid information table and a bid technical data table may be established in the audit file library. The bid information table may be used to store the audit business information extracted from the bid business document, that is, at least one of the bid item name, bid item number, bid enterprise name, bid agent identity card number, and bid time. The bid technical data table is used to store the audit scheme information extracted from the bid technical document, that is, at least one of the bid item name, bid item number, bid enterprise name, bid agent identity card number, bid time, and the chapter text of the project situation, service scheme introduction, service process, service arrangement after the project is finished, progress control measures, and quality measures.
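The two tables can be sketched with an embedded relational database. The column names, the reduced column set, the one-row-per-chapter layout of the technical data table, and the helper names `create_tables` and `store_chapters` are assumptions for this example; the application only names the tables and the kinds of information they hold.

```python
import sqlite3

# Assumed schema: a reduced set of the fields named in the text.
SCHEMA = """
CREATE TABLE IF NOT EXISTS bid_information (
    bid_item_number      TEXT NOT NULL,
    bid_item_name        TEXT,
    bid_enterprise_name  TEXT NOT NULL,
    bid_agent_id_number  TEXT,
    bid_time             TEXT
);
CREATE TABLE IF NOT EXISTS bid_technical_data (
    bid_item_number      TEXT NOT NULL,
    bid_enterprise_name  TEXT NOT NULL,
    chapter_title        TEXT NOT NULL,
    chapter_text         TEXT
);
"""


def create_tables(connection: sqlite3.Connection) -> None:
    """Create both tables if they do not exist yet."""
    connection.executescript(SCHEMA)


def store_chapters(connection, bid_item_number, bid_enterprise_name, chapters):
    """Store one row per extracted chapter of a bid technical document."""
    connection.executemany(
        "INSERT INTO bid_technical_data VALUES (?, ?, ?, ?)",
        [
            (bid_item_number, bid_enterprise_name, title, text)
            for title, text in chapters.items()
        ],
    )
    connection.commit()
```

Storing one chapter per row makes it straightforward to compare the same chapter across the bids of different enterprises in a later similarity step.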
In some embodiments, for step S12 above, the present application also provides a method of extracting information in a document. Referring to fig. 3, fig. 3 is a flowchart illustrating an embodiment of a method for extracting information from a document according to the present application. The method comprises the following steps:
S21: Acquire the files to be audited of a project to be audited; the files to be audited comprise the bid business documents and bid technical documents related to the project to be audited.
The audit file library stores, by category, the bidding documents and bid documents related to the project to be audited. The bid documents may include bid business documents and bid technical documents.
The files to be audited of the project to be audited can be obtained from the audit file library, where the files to be audited are stored according to the project to be audited and the file format of each file, and the file format includes at least one of an electronic document format and other formats.
After the files to be audited of the project to be audited are obtained, any file whose format is one of the other formats (a picture or the Portable Document Format) is converted into an electronic document format file or structured data. Converting the file to be audited into an electronic document format file or structured data comprises: recognizing the file to be audited by optical character recognition and generating the electronic document format file or structured data from the recognition result; and/or obtaining additional information entered for the file to be audited and converting the file into an electronic document format file or structured data based on that additional information.
S22: Extract the audit business information in the bid business document by using a preset extraction model.
S23: Extract the audit scheme information in the bid technical document by using a regular expression based on a preset extraction rule.
In this embodiment, steps S22 and S23 may be executed simultaneously; the present application does not limit the order in which steps S22 and S23 are executed.
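Because the two extraction steps work on different documents, they can run concurrently. The following is a minimal sketch only: the function name `extract_in_parallel` is hypothetical, and the two extractors are passed in as callables standing in for the preset extraction model and the regular-expression rules.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_in_parallel(business_text, technical_text, extract_business, extract_scheme):
    """Run the two extraction steps at the same time and return both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        business_future = pool.submit(extract_business, business_text)
        scheme_future = pool.submit(extract_scheme, technical_text)
        # result() blocks until each step has finished and re-raises its errors.
        return business_future.result(), scheme_future.result()
```

Running the two steps one after the other would give the same results; the thread pool only shortens the wall-clock time when the steps are slow.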
In this embodiment, reference may be made to the specific implementation process of step S12 in the above embodiment for the specific implementation of step S21, which is not described herein again.
In this embodiment, the files to be audited of the project to be audited are obtained, the files to be audited comprising the bid business documents and bid technical documents related to the project to be audited; the audit business information in the bid business document is extracted by using a preset extraction model; and the audit scheme information in the bid technical document is extracted by using a regular expression based on a preset extraction rule. The information required for the audit is thus extracted automatically from the files to be audited, which assists the auditors, spares them from reading the files to be audited one by one, and improves audit efficiency.
In some embodiments, the preset extraction model may be trained in advance, before step S22 described above. The trained preset extraction model is then used to extract the audit business information in the bid business document.
Specifically, a plurality of file samples to be audited may be collected; these samples may be representative bid documents screened from the collected bid documents. In addition, each file sample to be audited is annotated with reference audit business information, which can be obtained by manually annotating the bid document according to the audit business information that needs to be extracted from it.
The file samples to be audited are input into the preset extraction model, and the model is trained based on a sequence labeling algorithm. The sequence labeling algorithm may be, for example, the conditional random field (CRF) algorithm, which can label and segment ordered data and is used to solve sequence labeling problems. During training of the preset extraction model, the reference audit business information in the file samples to be audited can be labeled and segmented.
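A CRF-style sequence labeler is typically trained on token sequences in which every token carries a label. The sketch below shows only this data-preparation step, not the training itself. It converts manually annotated spans into per-token tags using the common BIO scheme; the BIO scheme, the label names, and the function name `to_bio_tags` are assumptions, since the application does not state how the reference annotations are encoded.

```python
from typing import List, Tuple


def to_bio_tags(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Turn annotated spans into one tag per token.

    Each span is (start, end_exclusive, label) in token indices. The first
    token of a span gets "B-<label>", the rest get "I-<label>", and tokens
    outside every span get "O".
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for index in range(start + 1, end):
            tags[index] = f"I-{label}"
    return tags
```

For example, the tokens `["Bidder", ":", "Acme", "Power", "Co"]` with the span `(2, 5, "ENTERPRISE")` yield two "O" tags followed by one "B-ENTERPRISE" and two "I-ENTERPRISE" tags. Pairs of token and tag sequences of this kind are the usual input to a CRF training routine.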
In some embodiments, referring to fig. 4, the step S23 may include the following steps:
S231: Configure preset extraction rules for the bid technical document according to its chapter structure.
The preset extraction rules can be configured according to the chapter structure of the bid technical document. For example, the chapter structure of a bid technical document includes: project situation, service scheme introduction, service process, service arrangement after the project is finished, progress control measures, and quality measures. A corresponding preset extraction rule may be configured for each chapter.
S232: Extract the chapter structure information of each chapter in the bid technical document by using regular expressions based on the preset extraction rules, so as to obtain the audit scheme information.
The preset extraction rules can be implemented with regular expressions. A regular expression is a logical formula that operates on character strings: predefined specific characters and combinations of them form a "rule string" that expresses a filtering logic for character strings. Regular expressions can be used to retrieve or replace text that matches a given pattern (rule).
With regular expressions, the chapter structure information of each chapter in the bid technical document can be extracted and used as the audit scheme information. The chapter structure information comprises the chapter text of at least one of: project situation, service scheme introduction, service process, service arrangement after the project is finished, progress control measures, and quality measures.
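Chapter extraction by regular expression can be sketched as follows. The heading layout (an optional number, then the chapter title alone on a line), the English titles, and the names `HEADING_PATTERN` and `split_chapters` are assumptions for the example; the actual preset extraction rules would be configured for the real documents.

```python
import re
from typing import Dict

CHAPTER_TITLES = [
    "Project situation",
    "Service scheme introduction",
    "Service process",
    "Service arrangement after the project is finished",
    "Progress control measures",
    "Quality measures",
]

# Assumed heading layout: optional number such as "1." or "2)", then the title on its own line.
HEADING_PATTERN = re.compile(
    r"^[ \t]*(?:\d+[.)]?[ \t]*)?("
    + "|".join(re.escape(title) for title in CHAPTER_TITLES)
    + r")[ \t]*$",
    re.MULTILINE,
)


def split_chapters(text: str) -> Dict[str, str]:
    """Map each recognized chapter title to the text up to the next heading."""
    matches = list(HEADING_PATTERN.finditer(text))
    chapters = {}
    for index, match in enumerate(matches):
        end = matches[index + 1].start() if index + 1 < len(matches) else len(text)
        chapters[match.group(1)] = text[match.end():end].strip()
    return chapters
```

Each chapter's text runs from the end of its heading to the start of the next recognized heading, so chapters that are missing from a document are simply absent from the result.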
In this application, the audit business information and/or audit scheme information obtained in steps S22 and S23 may be used to audit the files to be audited of the project to be audited. The audit business information and/or audit scheme information can serve as the audit key information, and auditing the files to be audited can include the following. The files to be audited of the project to be audited are audited by using the association relationships among the audit business information and/or audit scheme information in the files to be audited. The similarity between the audit scheme information of the files to be audited in the project to be audited is obtained by using a preset similarity calculation method; the files to be audited whose similarity is greater than a preset threshold are taken as abnormal files, and the audit result of the project to be audited is generated. This process is described in the embodiments below.
In some embodiments, referring to fig. 5, in the step S13, performing audit processing on the file to be audited based on the audit key information may include the following steps:
S131: Audit the files to be audited by using the association relationships among the audit key information of the files to be audited.
The audit business information and/or audit scheme information may be used as the audit key information. That is, the audit key information may include at least one of: qualification information of the bidding enterprise, enterprise information of the bidding enterprise, bid quotation information, delivery date, bid item name, bid item number, bid enterprise name, bid agent identity card number, bid time, project situation, service scheme introduction, service process, service arrangement after the project is finished, progress control measures, and quality measures.
The files to be audited are audited by using the association relationships among their audit key information. For example, by analyzing the association relationships among the projects to be audited, the bidding enterprises, and the bid agents, results such as a list of enterprises that frequently change bid agents can be obtained. Analyzing and mining the association relationships among the audit key information in this way yields results that serve as a reference for the audit.
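One simple form of this association analysis is to count how many distinct bid agents each enterprise has used across projects. The sketch below assumes that each record is a (project, enterprise, agent identity card number) triple and that "frequently" means at least `min_agents` distinct agents; both the record layout and the threshold, as well as the function name, are assumptions for the example.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def enterprises_with_frequent_agent_changes(
    records: Iterable[Tuple[str, str, str]], min_agents: int = 2
) -> List[str]:
    """List enterprises that appear with at least min_agents distinct bid agents.

    Each record is (project, enterprise, agent_id).
    """
    agents_by_enterprise = defaultdict(set)
    for _project, enterprise, agent_id in records:
        agents_by_enterprise[enterprise].add(agent_id)
    return sorted(
        enterprise
        for enterprise, agents in agents_by_enterprise.items()
        if len(agents) >= min_agents
    )
```

The same grouping can be inverted, keyed by agent instead of by enterprise, to look at the relationship from the other side.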
S132: Obtain the similarity between the files to be audited in the project to be audited by using a preset similarity calculation method; take the files whose similarity is greater than the preset threshold as abnormal files, and generate the audit result of the project to be audited.
Based on the audit key information of each file to be audited, a preset similarity algorithm can be used to obtain the similarity between the files to be audited in the project to be audited, so that the files can be audited based on the similarity. The repetition rate of each file to be audited can then be obtained from these similarities; that is, a duplicate check is performed on each file, so as to extract a list of bidding enterprises whose files are similar.
Optionally, the similarity between the files to be audited of the project can be obtained by applying the preset similarity algorithm to the audit scheme information in the bidding technical files, where the preset similarity algorithm may be a text similarity algorithm based on edit distance. The preset threshold may be set to 0.4-0.6. If the similarity between two files to be audited is greater than the preset threshold, the two files may be substantially the same; the files whose similarity is greater than the preset threshold are taken as abnormal files, and the audit result of the project to be audited is generated based on the abnormal files. In addition, the chapter structure information with high similarity in the two files can be used as audit evidence.
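The threshold rule can be sketched as follows. This is only an illustration: the input layout (a mapping from file-name pairs to similarity scores), the function name, and the default threshold of 0.5 (chosen from within the 0.4-0.6 range mentioned above) are assumptions.

```python
def flag_abnormal(pair_similarities, threshold=0.5):
    """Return the sorted names of files that take part in at least one
    pair whose similarity is strictly greater than the threshold."""
    abnormal = set()
    for (file_a, file_b), similarity in pair_similarities.items():
        if similarity > threshold:
            abnormal.update((file_a, file_b))
    return sorted(abnormal)

# Hypothetical pairwise scores for three bid files of one project.
scores = {("bidA", "bidB"): 0.7, ("bidA", "bidC"): 0.3, ("bidB", "bidC"): 0.5}
abnormal_files = flag_abnormal(scores)
```

Because the comparison is strict, a pair scoring exactly at the threshold is not flagged.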
In some embodiments, for the step S13, the present application provides a text similarity calculation method. Referring to fig. 6, fig. 6 is a flowchart illustrating an embodiment of a method for calculating similarity of texts according to the present application. The method comprises the following steps:
S31: Obtain a plurality of files to be audited of the project to be audited, and obtain the chapter structure information of each file to be audited.
The files to be audited of the project can be obtained from the audit file library, where the files are stored in the library classified by project to be audited and by file format. The file format includes at least one of an electronic document format and other formats; the other formats may include a picture format or the Portable Document Format (PDF), and the present application is not limited thereto.
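One possible way to organize such a library is sketched below: each file is placed under its project and then under a format category. The directory layout, the category names, and the set of suffixes treated as electronic documents are all assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Assumption: these suffixes count as "electronic document format";
# everything else (pictures, PDF, ...) falls under "other".
ELECTRONIC_SUFFIXES = {".doc", ".docx", ".txt"}

def library_path(project_id, filename):
    """Build a relative storage path: <project>/<format category>/<file>."""
    suffix = PurePosixPath(filename).suffix.lower()
    category = "electronic_document" if suffix in ELECTRONIC_SUFFIXES else "other"
    return PurePosixPath(project_id) / category / filename

path = library_path("P1", "bid.docx")
```

Files landing in the `other` category would be the ones that need conversion before key information can be extracted.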
Optionally, the files to be audited in this embodiment may include the bidding technical files in the bid documents. To obtain the chapter structure information, a preset extraction rule can be configured according to the chapter structure of the bidding technical file, and a regular expression based on that rule is then used to extract the chapter structure information of the bidding technical file, which serves as the chapter structure information of the file to be audited in this embodiment.
The chapter structure information of a file to be audited may include the chapter text of at least one of: project conditions, service scheme introduction, service process, service arrangement after the project is finished, progress control measures, and quality measures.
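A minimal sketch of regular-expression-based chapter extraction follows. It assumes that every chapter starts with a heading on its own line, optionally preceded by a number such as `1.` or `2)`, and that the heading text matches one of a fixed list of titles. The English titles, the heading convention, and the function name are hypothetical; a real extraction rule would be configured for the actual bid template.

```python
import re

# Hypothetical chapter titles; a real rule set depends on the bid template.
CHAPTER_TITLES = [
    "Project Conditions",
    "Service Scheme Introduction",
    "Service Process",
    "Service Arrangement After Project Completion",
    "Progress Control Measures",
    "Quality Measures",
]

def extract_chapters(text, titles=CHAPTER_TITLES):
    """Split `text` into {chapter title: chapter body} using heading lines."""
    alternatives = "|".join(re.escape(title) for title in titles)
    heading = re.compile(
        r"^[ \t]*(?:\d+[.)][ \t]*)?(" + alternatives + r")[ \t]*$",
        re.MULTILINE,
    )
    matches = list(heading.finditer(text))
    chapters = {}
    for i, match in enumerate(matches):
        # A chapter body runs from its heading to the next heading (or the end).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        chapters[match.group(1)] = text[match.end():end].strip()
    return chapters

sample = "1. Project Conditions\nThe site is ready.\n2. Service Process\nStep one.\nStep two.\n"
chapters = extract_chapters(sample)
```

Chapters that do not appear in a file are simply absent from the returned dictionary, which lets later steps compare only the chapters both files contain.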
The specific implementation process of this step in this implementation may refer to the implementation process of the above embodiment, and is not described herein again.
S32: Determine the similarity of each corresponding chapter structure between the files to be audited based on the chapter structure information of the files.
A text similarity calculation method based on edit distance can be used to determine the similarity of each corresponding chapter structure between the files to be audited. That is, the similarity can be obtained for each corresponding chapter of the files: project conditions, service scheme introduction, service process, service arrangement after the project is finished, progress control measures, and quality measures. The edit distance (ED) between two text strings is the minimum number of editing operations required to convert one string into the other. The editing operations are: inserting a character, deleting a character, and substituting a character. The edit distance reflects the degree of literal difference between two texts: the more similar the two texts are, the smaller the edit distance.
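The edit distance described here corresponds to the standard Levenshtein distance. A minimal dynamic-programming sketch is shown below; the function name is chosen for the example, and the application does not prescribe a particular implementation.

```python
def edit_distance(a, b):
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn string `a` into string `b`."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,         # delete char_a
                curr[j - 1] + 1,     # insert char_b
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]

distance = edit_distance("kitten", "sitting")
```

Keeping only two rows of the table limits memory to the length of the second string, which matters when whole chapter texts are compared.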
S33: Determine the similarity between the files to be audited based on the similarity of each corresponding chapter structure between the files and the weight of each chapter structure.
The chapter-structure similarities can be combined using the chapter weights, for example by weighted summation or weighted averaging, and the weighted result is taken as the similarity between the files to be audited.
In this embodiment, a plurality of files to be audited of the project to be audited are obtained together with their chapter structure information; the similarity of each corresponding chapter structure between the files is determined from that information; and the similarity between the files is determined from the chapter-structure similarities and the chapter weights. By obtaining the similarity between files to be audited, the files of a large number of projects can be analyzed and similar files identified, which assists auditors and improves auditing efficiency.
In some embodiments, referring to fig. 7, the step S33 may include the following steps:
S331: Normalize the similarity of each chapter structure of the files to be audited, and take the normalized result as the similarity of each corresponding chapter structure between the files.
The similarity of each corresponding chapter structure between the files can be normalized so that it ranges from 0 to 1, and the normalized result is used as the similarity of that chapter structure. The closer the similarity of a pair of corresponding chapter structures is to 1, the more similar the two chapter structures are; the closer it is to 0, the less similar they are.
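The text does not give the normalization formula. One common choice, used here purely as an assumption, is to divide the edit distance by the length of the longer text and subtract the result from 1, so that identical texts score 1 and completely different texts score 0.

```python
def normalized_similarity(distance, len_a, len_b):
    """Map an edit distance to a similarity in [0, 1].

    Assumed formula: 1 - distance / max(len_a, len_b).
    Two empty texts are treated as identical.
    """
    longest = max(len_a, len_b)
    if longest == 0:
        return 1.0
    return 1.0 - distance / longest

# "kitten" vs "sitting": edit distance 3, lengths 6 and 7.
similarity = normalized_similarity(3, 6, 7)
```

Since the edit distance never exceeds the length of the longer string, the result always stays within the 0 to 1 range.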
S332: Compute the weighted average of the chapter-structure similarities of the files to be audited using the weight of each chapter structure, to obtain the similarity between the files.
A weight can be set for each chapter structure according to the bidding technical files of the specific project to be audited, which is not limited in the present application. For example, the weight of the project conditions chapter is 0.1, the weight of the service scheme introduction is 0.4, the weight of the service process and service arrangement after the project is finished is 0.2, the weight of the progress control measures is 0.15, and the weight of the quality measures is 0.15.
For every two files among the plurality of files to be audited, the weighted average of their chapter-structure similarities under the corresponding weights is taken as the similarity between the two files. In this way, the similarity between every pair of files to be audited can be obtained. Among the similarities between the current file to be audited and the other files, the similarity of the most similar pair is taken as the similarity for the current file.
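The weighted average can be sketched with the example weights given above. The dictionary keys are hypothetical labels, and the sketch assumes that the service process and the post-project service arrangement share a single weight of 0.2, which is how the example weights sum to 1. It also assumes that only chapters present in both files are passed in, with the average taken over the weights of those chapters.

```python
# Example weights from the text; key names are hypothetical labels.
CHAPTER_WEIGHTS = {
    "project_conditions": 0.10,
    "service_scheme_introduction": 0.40,
    "service_process_and_arrangement": 0.20,
    "progress_control_measures": 0.15,
    "quality_measures": 0.15,
}

def file_similarity(chapter_sims, weights=CHAPTER_WEIGHTS):
    """Weighted average of normalized per-chapter similarities.

    `chapter_sims` maps chapter labels to similarities in [0, 1].
    """
    total_weight = sum(weights[name] for name in chapter_sims)
    if total_weight == 0:
        return 0.0
    weighted = sum(weights[name] * sim for name, sim in chapter_sims.items())
    return weighted / total_weight

# Hypothetical per-chapter similarities for one pair of bid files.
pair_sims = {
    "project_conditions": 0.2,
    "service_scheme_introduction": 0.9,
    "service_process_and_arrangement": 0.5,
    "progress_control_measures": 0.4,
    "quality_measures": 0.6,
}
score = file_similarity(pair_sims)  # approximately 0.63
```

With these numbers the heavily weighted service scheme chapter dominates the result, so two bids with near-identical service schemes exceed a threshold of 0.5 even though their other chapters differ.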
Optionally, the similarity between the files to be audited can be used for auditing the files to be audited of the project to be audited. Specifically, if the similarity of the files to be audited is greater than the preset threshold, the files to be audited with the similarity greater than the preset threshold are used as abnormal files, and an auditing result of the project to be audited is generated.
The specific implementation of this embodiment can refer to the implementation process of the above embodiment, and is not described herein again.
Referring to fig. 8, fig. 8 is a schematic flowchart illustrating a method for calculating similarity of texts according to another embodiment of the present application. The method comprises the following steps:
S40: Obtain a plurality of files to be audited of the projects to be audited, and obtain the chapter structure information of each file.
S41: Select one project from the projects to be audited as the target audit project.
S42: Arbitrarily select two files from the files to be audited in the target audit project as the target files to be audited.
S43: Determine the similarity of each corresponding chapter structure between the target files based on their chapter structure information.
S44: Determine the similarity between the target files based on the similarity of each corresponding chapter structure and the weight of each chapter structure.
S45: Judge whether the similarity between the target files is greater than a preset threshold.
If it is greater than the preset threshold, execute step S46; otherwise, execute step S47.
S46: Take the target files whose similarity is greater than the preset threshold as abnormal files, and generate an audit result for the target files.
The audit result for the target files can be generated from the similar chapter structure information in the target files, the bidding enterprises and bidding agents corresponding to the target files, the target project to be audited that was bid on, and the like.
S47: Detect whether all files to be audited under the target audit project have been traversed.
If yes, execute step S48; otherwise, return to step S42.
S48: Detect whether all projects to be audited have been traversed.
If yes, execute step S49; otherwise, return to step S41.
S49: Output the audit result of the files to be audited in the projects to be audited.
The audit results generated for the abnormal files in step S46 can be collected, and the bidding enterprise corresponding to each abnormal file, the project to be audited that was bid on, the bid documents, the similar chapter structures in the bid documents, the bidding agent, and the like can be output as the audit result of the files to be audited. In this way, audit results covering the abnormal projects among the projects to be audited, the bidding enterprises corresponding to the abnormal files, and so on can be generated.
In this embodiment, reference may be made to the implementation process of the above embodiment for specific implementation of steps S40 to S49, which are not described herein again.
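The traversal in steps S40 to S49 can be sketched as two nested loops: an outer loop over projects and an inner loop over every pair of files within a project. The data layout, the function name, the result fields, and the default threshold are assumptions for illustration; the `similarity` argument stands in for the chapter-based calculation of steps S43 and S44.

```python
from itertools import combinations

def audit_projects(projects, similarity, threshold=0.5):
    """projects: {project_id: {file_name: file_content}} (hypothetical layout).
    similarity: callable returning a score in [0, 1] for two file contents."""
    results = []
    for project_id, files in projects.items():        # S41, S48: each project
        named = sorted(files.items())
        for (name_a, doc_a), (name_b, doc_b) in combinations(named, 2):  # S42, S47
            score = similarity(doc_a, doc_b)           # S43-S44
            if score > threshold:                      # S45
                results.append({                       # S46: record abnormal pair
                    "project": project_id,
                    "files": (name_a, name_b),
                    "similarity": score,
                })
    return results                                     # S49

# Placeholder similarity: 1.0 for identical content, otherwise 0.0.
def exact_match(doc_a, doc_b):
    return 1.0 if doc_a == doc_b else 0.0

projects = {
    "P1": {"bidA": "same text", "bidB": "same text", "bidC": "other"},
    "P2": {"bidD": "x"},
}
report = audit_projects(projects, exact_match)
```

A project with fewer than two files produces no pairs and therefore no findings, which matches the flow moving straight on to the next project.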
For the above embodiments, the present application provides a computer device, please refer to fig. 9, and fig. 9 is a schematic structural diagram of an embodiment of the computer device of the present application. The computer device 500 comprises a memory 501 and a processor 502, wherein the memory 501 and the processor 502 are coupled to each other, the memory 501 stores program data, and the processor 502 is configured to execute the program data to implement the steps of any of the above-mentioned methods.
In this embodiment, the processor 502 may also be referred to as a CPU (Central Processing Unit). The processor 502 may be an integrated circuit chip having signal processing capabilities. The processor 502 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components. A general purpose processor may be a microprocessor or the processor 502 may be any conventional processor or the like.
The specific implementation of this embodiment can refer to the implementation process of the above embodiment, and is not described herein again.
For the method of the above embodiment, it can be implemented in the form of a computer program, so that the present application provides a storage device, please refer to fig. 10, where fig. 10 is a schematic structural diagram of an embodiment of the storage device of the present application. The storage means 600 has stored therein program data 601 executable by a processor, the program data being executable by the processor to implement the steps of any of the embodiments of the method described above.
The specific implementation of this embodiment can refer to the implementation process of the above embodiment, and is not described herein again.
The storage device 600 of this embodiment may be a medium that can store program data, such as a usb disk, a portable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk, or may be a server that stores the program data, and the server may transmit the stored program data to other devices for operation, or may self-operate the stored program data.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into modules or units is only a logical division, and an actual implementation may divide them differently: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some interfaces, and may be electrical, mechanical, or in another form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in a storage device, which is a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing an electronic device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods according to the embodiments of the present application.
It will be apparent to those skilled in the art that the modules or steps of the present application described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and they may alternatively be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, or fabricated separately as individual integrated circuit modules, or fabricated as a single integrated circuit module from multiple modules or steps. Thus, the present application is not limited to any specific combination of hardware and software.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings, or which are directly or indirectly applied to other related technical fields, are intended to be included within the scope of the present application.

Claims (10)

1. An audit processing method, comprising:
establishing an audit file library of a project to be audited, wherein the audit file library is used for storing files to be audited related to the project to be audited;
acquiring at least one file to be audited from the audit file library, and acquiring audit key information of the file to be audited;
and auditing the file to be audited based on the auditing key information.
2. The method of claim 1, wherein establishing an audit file library of pending items comprises:
storing the files to be audited in the audit file library in a classified manner according to the project to be audited and the file format of the files to be audited; wherein the file format comprises at least one of an electronic document format and other formats.
3. The method of claim 2, wherein storing the files to be audited in the audit file library in a classified manner comprises:
storing the files to be audited in the audit file library in a classified manner by using a distributed file system.
4. The method of claim 1,
the pending documents comprise bidding business documents and bidding technical documents related to the pending projects;
the obtaining of the auditing key information of the file to be audited comprises the following steps:
extracting audit business information in the bid business file by using a preset extraction model;
and extracting the audit scheme information in the bidding technical file by using a regular expression based on a preset extraction rule.
5. The method of claim 4,
the audit business information comprises: at least one of qualification information of the bidding enterprise, enterprise information of the bidding enterprise and bidding quotation information;
the audit scheme information includes: at least one of bid item name, bid item number, bid enterprise name, bid agent identification number, bid time, and chapter structure information;
the chapter structure information comprises at least one of project conditions, service scheme introduction, service process, service arrangement after the project is finished, progress control measures and quality measure chapter texts.
6. The method of claim 4, wherein before obtaining audit key information of the pending document, the method comprises:
if the file format of the file to be audited is other formats, converting the file to be audited into an electronic document format file or structured data; wherein the other formats comprise any one of pictures and portable document formats.
7. The method of claim 6, wherein converting the pending file to an electronic document format file or structured data comprises:
identifying the file to be audited by using optical character recognition technology, and generating the electronic document format file or the structured data based on the recognition result; and/or
acquiring input additional recording information of the files to be audited, and converting the files to be audited into the electronic document format files or the structured data based on the input additional recording information.
8. The method of claim 1, wherein the auditing the pending document based on the audit key information comprises:
auditing the files to be audited by using the association relationships among the audit key information of the files to be audited; and/or
acquiring the similarity between the files to be audited in the project to be audited by using a preset similarity calculation method; and taking the files to be audited with the similarity larger than a preset threshold value as abnormal files, and generating an auditing result of the items to be audited.
9. A computer device comprising a memory and a processor coupled to each other, the memory having stored therein program data for execution by the processor to perform the steps of the method of any one of claims 1 to 8.
10. A storage device, characterized by program data stored therein which can be executed by a processor for carrying out the steps of the method according to any one of claims 1 to 8.
CN202110888051.2A 2021-08-03 2021-08-03 Auditing processing method, computer equipment and storage device Pending CN113763143A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110888051.2A CN113763143A (en) 2021-08-03 2021-08-03 Auditing processing method, computer equipment and storage device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110888051.2A CN113763143A (en) 2021-08-03 2021-08-03 Auditing processing method, computer equipment and storage device

Publications (1)

Publication Number Publication Date
CN113763143A true CN113763143A (en) 2021-12-07

Family

ID=78788485

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110888051.2A Pending CN113763143A (en) 2021-08-03 2021-08-03 Auditing processing method, computer equipment and storage device

Country Status (1)

Country Link
CN (1) CN113763143A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114444105A (en) * 2022-01-28 2022-05-06 北京中友金审科技有限公司 Intelligent audit data reporting safety method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109993454A (en) * 2019-04-10 2019-07-09 贵州电网有限责任公司 Audit risk processing method, device, computer equipment and storage medium
CN110046973A (en) * 2019-04-17 2019-07-23 成都市审计局 It is a kind of that mark string mark detection method is enclosed based on incidence relation big data analysis
CN111815162A (en) * 2020-07-08 2020-10-23 国网上海市电力公司 Digital auditing tool and method
CN112800113A (en) * 2021-02-04 2021-05-14 天津德尔塔科技有限公司 Bidding auditing method and system based on data mining analysis technology

Similar Documents

Publication Publication Date Title
US11574204B2 (en) Integrity evaluation of unstructured processes using artificial intelligence (AI) techniques
Zhaokai et al. Contract analytics in auditing
US9990356B2 (en) Device and method for analyzing reputation for objects by data mining
CN103154991B (en) Credit risk is gathered
US7389306B2 (en) System and method for processing semi-structured business data using selected template designs
US20150032645A1 (en) Computer-implemented systems and methods of performing contract review
US9025890B2 (en) Information classification device, information classification method, and information classification program
US20110106801A1 (en) Systems and methods for organizing documented processes
Fayaz et al. Ensemble machine learning model for classification of spam product reviews
CN109800354B (en) Resume modification intention identification method and system based on block chain storage
US11880435B2 (en) Determination of intermediate representations of discovered document structures
Sadasivam et al. Corporate governance fraud detection from annual reports using big data analytics
CN112364645A (en) Method and equipment for automatically auditing ERP financial system business documents
Falkner et al. Identifying requirements in requests for proposal: A research preview
CN113763143A (en) Auditing processing method, computer equipment and storage device
CN113626655A (en) Method for extracting information in file, computer equipment and storage device
Liu et al. Tracking disclosure change trajectories for financial fraud detection
TW202018616A (en) Intelligent accounting system and identification method for accounting documents
Heidari et al. Financial footnote analysis: developing a text mining approach
KR20110093398A (en) Device and method for managing mobile terminated service
CN113762719A (en) Text similarity calculation method, computer equipment and storage device
TWM575887U (en) Intelligent accounting system
EP1286284A1 (en) Spreadsheet data processing system
US20140201103A1 (en) System for research and development information assisting in investment, and a method, a computer program, and a readable and recordable media for computer thereof
CN113537964A (en) Application form processing method, device, storage medium and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination