CN110675289B - Method for cataloging electronic file along with criminal investigation

Publication number
CN110675289B
Authority
CN
China
Prior art keywords
file
criminal
catalog
volume
library
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910936642.5A
Other languages
Chinese (zh)
Other versions
CN110675289A (en)
Inventor
何坤
董晶
周鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Application filed by Sichuan University
Priority to CN201910936642.5A
Publication of CN110675289A
Application granted
Publication of CN110675289B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/18 Legal services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/113 Details of archiving
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Primary Health Care (AREA)
  • General Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • General Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Technology Law (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the technical field of electronic case file cataloging and discloses a method for cataloging the electronic files that accompany a criminal first-instance case. The method comprises the following steps: analyzing criminal first-instance case files, extracting file features, and constructing a criminal case file feature library; classifying and identifying the files of a criminal case, extracting file information according to the features, and constructing a criminal case file management library; and compiling a reading file catalog and an archive file catalog from the management library. The catalog lets a reader see the source and general content of each file, and it integrates the files of all departments into one catalog. This overcomes the drawback of the traditional practice in which public security (investigation file), the procuratorate (examination file), the court (litigation file), and the judicial administration (execution file) each compile their own catalog independently. The invention also accommodates the cataloging of new types of material and makes it easy to extend the cataloging technique.

Description

Method for cataloging electronic file along with criminal investigation
Technical Field
The invention belongs to the technical field of electronic file cataloging, and particularly relates to a method for cataloging an electronic file along with criminal investigation.
Background
The closest prior art is as follows. As judicial informatization proceeds, the criminal case files stored by judicial departments at all levels (courts, procuratorates, and judicial administration bodies) grow by tens of millions each year. To facilitate criminal case handling and reduce file management costs, these departments have built preliminary electronic criminal case files and their own online case-handling systems. High and intermediate courts have independently developed in-house systems such as a "trial management system", an "electronic file system", and an "execution system". The national court system has established an industry standard for electronic file catalogs, and private networks built across the courts enable "one-network" case handling with full-process traceability and supervision. The national procuratorial authorities have built a unified business application system that integrates case handling, management, supervision, statistics, and other functions, so that across the four levels of procuratorial authorities case information is entered online, case-handling workflows are managed online, and case-handling activities are supervised and their data generated online. "Judicial administrative authorities" is a collective term for many functional bodies, including judicial administration, notarization, legal aid, grassroots legal services, people's mediation, forensic appraisal, community correction, resettlement assistance and education, prisons, detention centers, and drug rehabilitation centers.
Some of these bodies have established their own business systems, such as the "judicial administrative work information management information system", the "notarization administrative and industry management system", the "judicial evaluation auction management system", the "judicial community correction management system", and the "prison management information system". Electronic case file processing has been widely studied in China, and judicial departments at all levels have largely established their own electronic case file management systems. However, the flow, sharing, and exchange of criminal case file data among these departments has not been fully realized.
Because China started relatively late with electronic case files, criminal case file catalogs are still compiled manually within each department according to its own processing flow. Automatic extraction of document numbers from files, content-based classified cataloging, and linking of catalog entries to files have not been realized. In terms of content, criminal first-instance case file catalogs have the following main deficiencies:
1) The catalog is too simple. A traditional catalog has two levels: the first is a class name within the department and the second is a file name. It carries no key information about the files. For example, in an investigation file catalog, "detention warrant" is a second-level entry under the first-level document class, and "audiovisual material recording" is a second-level entry under the first-level evidence class. The "detention warrant" entry does not say who executed the detention or when, and the "audiovisual material recording" entry does not say what the material is about. The number of files varies from case to case, many cases run to a hundred volumes, and evidence files are disordered and mixed. A reader can hardly grasp the basic situation and evidence structure of a case from such a catalog, so the catalog does not fulfil its role.
2) Judicial departments lack a unified criminal case cataloging standard. Public security, the procuratorate, and the courts each have their own standard, so the same file, for example an evidence file, may carry inconsistent names or categories in different departments.
From the perspective of use, a criminal first-instance case file can be divided into a reading file and an archive file. The reading file is the readable file that circulates among departments and consists of part of the first-instance case file. Its contents differ with the authority of the department or person, and so does its catalog. The archive file is the set of all first-instance files formed during the handling of the criminal case, and consists mainly of documents and evidence produced by public security, judicial departments at all levels, and litigation participants. The archive file catalog should include the names of files readable by public security, the procuratorate, the court, the judicial administration, and litigation participants, the names of submitted non-trusted files, and the names of classified files.
The deficiencies of current reading file catalogs are mainly as follows:
1) Cataloging is poorly automated. Automatic screening and cataloging of existing criminal first-instance files according to the reader's authority has not been realized. When files are transferred between departments, the files and the corresponding catalogs must be screened and compiled manually by professional staff. When a litigation participant (a lawyer) wants to read an investigation file that has been transferred to the procuratorate, the participant generally contacts the case handler or the case management center to make an appointment or to apply for access, and the case handler or the case management center manually screens the files according to the reading authority and the application and compiles a corresponding catalog.
2) Cataloging is not timely. Traditional catalogs are generally compiled at case-handling milestones or at preset times.
3) Cataloging is poorly integrated. A litigation participant (a lawyer) who wants to read the procuratorate's and the court's files at the same time must apply to the procuratorate and the court separately and book different times.
4) The catalog is too simple, so a reader cannot readily get a brief picture of the case from it.
The deficiencies of archive catalogs are as follows:
1) Because each department compiles its catalog independently, integration is poor.
2) The traditional criminal first-instance archive file is provided by the court, and its catalog covers only the legal documents and evidence of part of the authorities among public security, the procuratorate, and the courts. It does not include the material of all authorities, such as prisons, detention centers, and drug rehabilitation centers.
To address the low degree of intelligence and poor integration of current cataloging of criminal first-instance electronic case files, a technique is urgently needed that automatically compiles archive and reading catalogs covering public security, courts, procuratorates, judicial administration bodies, and litigation participants, so that the catalog fully plays its role in different applications and judicial efficiency and transparency are promoted. The method also aims to turn the large volume of criminal first-instance files stored in a distributed manner into usable knowledge.
In summary, the problem with the prior art is that cataloging of criminal first-instance electronic case files has a low degree of intelligence and poor integration.
The difficulties in solving this technical problem are as follows:
(1) Constructing the file feature library: the number of files in a criminal case varies from case to case and can reach a hundred volumes; evidence files are disordered and mixed, and the same kind of document or evidence describes different facts in different cases. To extract key information from documents and evidence, the invention analyzes what samples of the same kind of file from different criminal cases have in common and where they differ, and forms a file feature library from the result. The accuracy of a file's features depends on the number of file samples, and in turn determines the accuracy of file summaries and cataloging.
(2) Constructing the file management library: criminal case files are numerous and varied in form, appearing mainly as text, images, audiovisual media, copies, and tables, and each form expresses information differently. To build the management library from the information in these files, the invention combines character recognition, image processing, and fuzzy recognition.
The significance of solving this technical problem is as follows:
(1) The main purposes of constructing the criminal case file feature library are:
1) enabling the catalog to conform to the file review habits of public security, the procuratorate, and the courts and to the corresponding cataloging specifications;
2) providing guidance for writing the brief description of each case-accompanying file;
3) providing the feature information needed to classify case-accompanying files.
(2) The main purposes of constructing the case-accompanying file management library are:
1) providing the data needed to generate the reading catalog and the archive catalog;
2) providing the data needed to add file abstracts to the reading catalog;
3) making it convenient to arrange the order of files in the catalog.
Disclosure of Invention
Aiming at the problems existing in the prior art, the invention provides a method for cataloging an electronic file for criminal investigation.
The invention is realized as a method for cataloging criminal first-instance electronic case files, which comprises the following steps:
step one, analyzing criminal first-instance case files, extracting file features, and constructing a criminal case file feature library;
step two, classifying and identifying the files of a criminal case, extracting file information according to the features, and constructing a criminal case file management library;
step three, compiling a reading file catalog and an archive file catalog from the management library.
Further, the criminal case file feature library constructed in step one comprises: producing organization, file name, file attribute, file type, file category, directory code, and key information;
the file category is a finer classification of document-class files;
the directory code is the catalog number of the file, assigned according to the catalog ordering specifications of public security, the procuratorate, and the courts;
the key information records what a summary of the file should contain, and is constructed according to what legal workers focus on when reading that file.
Further, constructing the criminal case file management library in step two includes:
(1) File structure: the files of a criminal case comprise files written by public security, the procuratorate, and the courts, and written materials from self-complainants and reported persons; a document template comprises a head, a body, and a tail; the head states the producing organization, file name, main or supplementary volume, and other items; the body states the grounds and the legal provisions applied; the tail states the handling unit, the handling person, and the date;
(2) Extracting file information: information is expressed as text, images, audiovisual media, copies, and tables;
(3) Criminal case file management library:
the management library is built with MySQL 8.1 and mainly comprises the producing organization, file name, file attribute, file category, permission, file ID number, file type, and brief description;
the producing organization is filled in according to the issuing department;
the file name is extracted from the file by character recognition and filled in;
the file attribute is extracted from the file by character recognition and filled in;
the file category is looked up in the file feature library by file name and filled in;
the permission records who may read the file and is filled in by the file's issuer according to the case;
the file ID number indicates both the position of the file in the catalog and the serial number of the file in the file store;
the file type is looked up in the file feature library by file name and filled in;
the brief description records a summary of the file: the key information for the file is looked up in the feature library by file name, the related content is retrieved from the file using that key information, and the corresponding items are filled in.
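The fields of the management library listed above can be sketched as a single table. The source names MySQL 8.1; the sketch below uses Python's built-in sqlite3 module only so that it runs without a database server. The table name, column names, and helper functions are hypothetical and are not taken from the patent.

```python
import sqlite3

# Hypothetical schema: one row per file in the case file.
# Column names are illustrative; the patent only lists the field meanings.
SCHEMA = """
CREATE TABLE case_file (
    file_id      TEXT PRIMARY KEY,  -- ID number; also fixes the catalog order
    organization TEXT NOT NULL,     -- producing organization (issuing department)
    file_name    TEXT NOT NULL,     -- recognized from the file itself
    attribute    TEXT,              -- main volume or supplementary volume
    file_type    TEXT,              -- document class or evidence class
    category     TEXT,              -- looked up in the feature library by name
    permission   TEXT,              -- who may read the file
    summary      TEXT               -- brief description built from key information
)
"""


def open_library(path=":memory:"):
    """Create the (assumed) management library and return the connection."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_file(conn, record):
    """Insert one file record given as a dict keyed by column name."""
    conn.execute(
        "INSERT INTO case_file VALUES (:file_id, :organization, :file_name, "
        ":attribute, :file_type, :category, :permission, :summary)",
        record,
    )


def ordered_files(conn):
    """Return (file_id, file_name) pairs in catalog order."""
    return conn.execute(
        "SELECT file_id, file_name FROM case_file ORDER BY file_id"
    ).fetchall()
```

Using the ID number as the primary key reflects the statement that it serves both as the catalog position and as the serial number in the file store; any other key design would work equally well for this sketch.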
Further, the head of a document-class file takes one of four forms: the first contains only the document name; the second contains the producing organization, file name, and document number; the third contains the producing organization, file name, document number, and other items; the fourth adds the main or supplementary volume marking to the third;
image and audiovisual material consists of two parts, a description and the related media data, where the description states the source, time, place, collecting personnel, and a description of the media content; a copy refers to a valid certificate issued by the relevant unit; a table consists of a table name and the table contents, and the table name appears on its own line on the first page.
Further, extracting the file information includes:
1) Extracting the name and information of a document:
first, the PDF text structure is analyzed; second, each line of text on the first page is extracted by text recognition; finally, each line is fuzzy-matched against the file-name entries in the feature library to identify the file name;
the other information of the document is then extracted according to the document name: first, the corresponding content is retrieved using the document's key information in the feature library to form the brief description; second, the file ID number is generated from the file's directory code in the feature library; finally, the file attribute and category are determined;
2) Extracting information from images and audiovisual media:
the description part of image and audiovisual material is in PDF format; first, the structure of the description's PDF text is analyzed; second, text recognition is used to extract the time, place, and collecting personnel indicated by the file feature library, and related content is retrieved from the description according to the key information in the feature library to form the brief description of the image or audiovisual media; finally, the file ID number is generated from the directory code in the feature library;
3) Extracting information from a copy:
first, the edges of the certificate are found with an edge detection algorithm, the parallel lines along the top, bottom, left, and right borders are detected by applying the Hough transform to the edges, the angle at which the certificate was captured is computed from the slopes of the top/bottom and left/right parallel lines, and the image is rotated by that angle; second, OCR is applied to the rotated copy to extract the certificate type, the holder's name, and the issue date; finally, the file ID number is generated from the directory code in the feature library;
4) Extracting information from a table:
first, the PDF text structure is analyzed; second, each line of text on the first page is extracted and fuzzy-matched against the file-name entries in the feature library to identify the table name; finally, key information is extracted according to the table name and its features to form the brief description, and the ID number is generated.
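The name-identification step for documents and tables, fuzzy-matching each recognized first-page line against the file names in the feature library, can be sketched as follows. The patent does not say which fuzzy-matching measure is used; this sketch assumes the similarity ratio from Python's standard difflib module, and the function name and the 0.6 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def best_name_match(lines, known_names, threshold=0.6):
    """Return (name, score) for the feature-library name that best matches
    any recognized line, or (None, 0.0) if no score reaches the threshold.

    lines: text lines recognized on the first page (may contain OCR errors).
    known_names: file-name entries from the feature library.
    threshold: assumed cut-off; it would need tuning on real samples.
    """
    best_name, best_score = None, 0.0
    for line in lines:
        text = line.strip().lower()
        if not text:
            continue
        for name in known_names:
            score = SequenceMatcher(None, text, name.lower()).ratio()
            if score > best_score:
                best_name, best_score = name, score
    if best_score < threshold:
        return None, 0.0
    return best_name, best_score
```

For example, a first page whose title line was recognized as "Detent1on Warrant" would still be matched to a feature-library entry "Detention Warrant", while a page with no similar line yields no match.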
Further, compiling the reading file catalog and the archive file catalog from the management library in step three specifically comprises:
(1) Catalog framework:
the criminal case file catalog framework is designed as follows: public security materials, procuratorate materials, court materials, executive materials, self-complaint materials, interviewee materials, third-party institution materials, audio and video materials, other litigation-related materials, and others form the first-level catalog; legal documents and evidence form the second-level catalog; the file category forms the third-level catalog; the specific file forms the fourth-level catalog; and the catalog is compiled from the corresponding items of the file management library;
(2) Case file catalog:
from the perspective of use, the case file is divided into an archive file and a reading file, and an archive file catalog and a reading file catalog are generated accordingly; the order of entries is determined by the file ID numbers in the management library.
Further, the first-level catalog indicates the source of the file, that is, the producing organization, and is generated from the producing-organization item of the management library;
the second-level catalog indicates the file type, legal document or evidence, and is generated from the file-type item of the management library;
the third-level catalog indicates the file category and is generated from the category item of the management library;
the fourth-level catalog consists of the name and abstract of each specific file, and the name is generated from the file name in the management library.
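The four-level framework described above can be sketched as a nested mapping built from management-library records. This is a minimal illustration under stated assumptions: the record keys (file_id, organization, file_type, category, file_name, summary) and the function name are hypothetical, and records are assumed to be plain dictionaries.

```python
def build_catalog(records):
    """Group records into organization -> file type -> category -> entries.

    Entries are visited in file ID order, so each level lists its items in
    the order in which they first appear in the ID sequence (Python dicts
    preserve insertion order).
    """
    catalog = {}
    for rec in sorted(records, key=lambda r: r["file_id"]):
        entries = (
            catalog.setdefault(rec["organization"], {})   # first level: source
                   .setdefault(rec["file_type"], {})       # second level: type
                   .setdefault(rec["category"], [])        # third level: category
        )
        # Fourth level: the specific file, as (name, abstract).
        entries.append((rec["file_name"], rec.get("summary", "")))
    return catalog
```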
Further, compiling the case file catalog includes:
1) Compiling the archive catalog:
the archive file is the set of all files formed during the handling of the criminal case, and its catalog lists every file of the case without restriction; because it is used for archiving, the catalog omits the abstract part of the fourth-level catalog and consists of the first-level, second-level, and third-level entries and the file names of the fourth level;
2) Compiling the reading catalog:
the reading file is the part of the case file that the relevant person or department is authorized to read, and its catalog may include only files within that reading authority; the catalog mainly consists of the first-level, second-level, third-level, and fourth-level entries that satisfy the reading authority.
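The difference between the two catalogs can be sketched as two views over the same records: the archive view lists everything without abstracts, and the reading view keeps only what the reader is authorized to see. The patent does not specify how permissions are encoded; the sketch assumes each record carries a set of role names, and all key and function names are hypothetical.

```python
def archive_entries(records):
    """Archive catalog: every file, in ID order, without abstracts."""
    ordered = sorted(records, key=lambda r: r["file_id"])
    return [(r["file_id"], r["file_name"]) for r in ordered]


def reading_entries(records, reader_role):
    """Reading catalog: only files the reader may see, in ID order,
    each with its abstract.

    reader_role is matched against an assumed per-file set of roles.
    """
    ordered = sorted(records, key=lambda r: r["file_id"])
    return [
        (r["file_id"], r["file_name"], r["summary"])
        for r in ordered
        if reader_role in r["permissions"]
    ]
```

Because both views are computed from the management library on demand, a catalog for a given reader can be regenerated whenever files are added, without manual screening.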
Further, the method also comprises linking catalog entries to files: the corresponding file is retrieved and displayed according to its ID number in the file management library.
In summary, the advantages and positive effects of the invention are as follows. As the internet increasingly shapes society, existing electronic case files need to change. Criminal cases are mainly handled by public security, procuratorates, courts, judicial administrative bodies, and litigation participants (lawyers). The criminal first-instance electronic case file is the collective term for all legal documents and evidence that each department produces during handling, and each legal document or piece of evidence is referred to as a file. The number of files varies from case to case, from as few as two volumes to as many as a hundred, and evidence files are disordered and mixed. From the perspective of use, the criminal first-instance electronic case file can be divided into a reading file and an archive file. The archive file is the set of all files formed during the handling of the criminal case, and the main function of its catalog is to list all files of the first-instance case. The reading file is the part that the relevant person or department is authorized to read, and its catalog mainly helps the reader understand the case within that authority: the reader can see the outline of the case and the evidence structure from the abstracts in the reading file catalog.
The invention combines the files provided by public security, procuratorates, courts, judicial administrative bodies, and litigation participants with the corresponding cataloging specifications. It classifies the files of a criminal case into public security materials, procuratorate materials, court materials, executive materials, self-complaint materials, interviewee materials, third-party institution materials (arbitration and notarization), audio and video materials, other litigation-related materials, and others, and uses these as the first-level catalog. This lets a reader see the producing organization or source of each file from the catalog, and it integrates the files of all departments into one catalog. It overcomes the drawback of the traditional practice in which public security (investigation file), the procuratorate (examination file), the court (litigation file), and the judicial administration (execution file) each compile their own catalog independently. The "others" class accommodates new types of material and makes it easy to extend the cataloging technique.
The invention catalogs the files of a criminal case in an integrated way, which helps a reader review the files of different departments and reduces the time and cost of applying to each department separately for access.
Because the method builds a case file management library, catalogs for different purposes can be generated from it automatically at any time without manual intervention. This overcomes the shortcomings of traditional cataloging and saves the labor and cost of producing catalogs.
The invention adds a file abstract to the traditional catalog, so that a reader can quickly learn the basic situation and evidence structure of the case from the catalog, which improves the quality and efficiency of file review.
The invention builds the criminal first-instance case file feature library from the traditional catalogs of public security, the procuratorate, and the courts and the corresponding specifications. This lays a foundation for constructing file feature libraries for other types of cases.
The invention promotes the in-depth, synchronized use of case-accompanying electronic files and reduces the burden on legal workers. It supports the investigation, supervision, trial, and execution stages of criminal cases and improves the quality and efficiency of case handling. It enables paperless case handling, in which the full text of a file is displayed on demand at each stage, and it allows the leaders and managers of all departments to consult the files at the same time. The invention lets the criminal first-instance case file catalog fully play its role and overcomes the shortcomings of traditional catalogs.
Drawings
FIG. 1 is a flow chart of the method for cataloging an electronic file along with criminal investigation according to an embodiment of the present invention.
FIG. 2 is a flow chart of an implementation of the method for cataloging an electronic file along with criminal investigation according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Aiming at the problems existing in the prior art, the invention provides a method for cataloging an electronic file along with criminal investigation, and the invention is described in detail below with reference to the accompanying drawings.
As shown in fig. 1, the method for cataloging a criminal review electronic file provided by the embodiment of the invention comprises the following steps:
S101: analyzing criminal first-instance case files, extracting file features, and constructing a criminal case file feature library;
S102: classifying and identifying the files of a criminal case, extracting file information according to the features, and constructing a criminal case file management library;
S103: compiling a reading file catalog and an archive file catalog from the management library.
The technical scheme of the invention is further described below with reference to the accompanying drawings.
As shown in fig. 2, the method for cataloging an electronic file along with criminal investigation provided by the embodiment of the invention specifically includes the following steps:
first, constructing a criminal volume feature library:
criminal cases have different numbers of files according to different cases, but have the following commonalities: i) The file making institutions mainly comprise public security, inspection homes, courts, executive material self-complaints, reported persons, third party institutions and prisons; ii) file types are classified into a document class and an evidence class; iii) The file names of the same file are unified, for example, the arrest of all criminal cases is named as an arrest; iv) the file is divided into a primary volume and a secondary volume; v) the location of the volume file in the volume catalog of the different departments must meet the relevant specifications. In combination with the above-mentioned criminal-review case electronic file commonality and catalog authoring related specifications, the present invention establishes a file feature library. The feature library mainly comprises contents such as a file making organization, a file name, file attributes (a primary volume and a secondary volume), file types (a document class and a evidence class), file types, directory codes, key information and the like.
The file category is a finer classification of document-type files that each department makes to facilitate file review. For example, the legal documents of public security organs are divided into 3 categories, such as case-filing and jurisdiction documents and investigation and compulsory-measure documents.
The directory code is the catalog number of a volume file. Following the catalog-order specifications of the public security, procuratorial, and court organs, it is designed as the digit sequence AABBB, where AA denotes the file source (public security materials, procuratorate materials, court materials, private prosecutor materials, defendant materials, third-party institution materials (arbitration and notarization), audio-visual materials, and other litigation-related materials) and BBB denotes the sequence number of the volume file within the catalog.
The key information records the summary information of a volume file. Different files have different content, and files with the same name also differ in content. The key information of each file is therefore defined according to the points legal practitioners focus on when reading it. For example, the person detained, the executing officer, and the date are the key information of a detention certificate.
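One possible in-memory shape for a feature library entry and its AABBB directory code is sketched below in Python. The class name `FeatureEntry`, the helper functions, and the example values are illustrative assumptions; the description fixes only the field list and the two-digit/three-digit layout, not any concrete numbering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureEntry:
    """One feature library record (field names are illustrative)."""
    organization: str      # producing institution
    file_name: str         # uniform file name
    attribute: str         # "main" or "supplementary" volume
    file_type: str         # "document" or "evidence"
    category: str          # finer classification of document-type files
    directory_code: str    # AABBB
    key_fields: tuple = () # names of the key information items


def make_directory_code(source: int, sequence: int) -> str:
    """Build AABBB: AA is the file source, BBB the sequence in the catalog."""
    if not (0 <= source <= 99 and 0 <= sequence <= 999):
        raise ValueError("source must fit 2 digits and sequence 3 digits")
    return f"{source:02d}{sequence:03d}"


def split_directory_code(code: str) -> tuple:
    """Split AABBB back into (source, sequence)."""
    if len(code) != 5 or not code.isdigit():
        raise ValueError("directory code must be exactly 5 digits")
    return int(code[:2]), int(code[2:])


# Hypothetical example: source 01 and sequence 007 are made-up values.
entry = FeatureEntry(
    organization="public security organ",
    file_name="detention certificate",
    attribute="main",
    file_type="document",
    category="compulsory-measure documents",
    directory_code=make_directory_code(1, 7),
    key_fields=("person detained", "executing officer", "date"),
)
```

Fixed-width, zero-padded codes keep the later catalog ordering a plain comparison of digits.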
Secondly, constructing the criminal volume management library:
A criminal case file is the collection of files formed by the various judicial organs as the case proceeds. It can be broadly divided into public security materials, procuratorate materials, court materials, execution materials, private prosecutor materials, defendant materials, third-party institution materials (arbitration and notarization), audio-visual materials, other litigation-related materials (identity certificates of litigation participants and entrustment formalities), and others. These materials mainly take the form of text, images, audio-visual media, copies, and tables (such as trial information forms or criminal admission registration forms) in PDF format.
(1) Structure of volume files:
Criminal volume files are numerous and take various forms, but files of the same form have similar writing formats.
The text files of criminal cases mainly comprise official documents written by the public security, procuratorial, and court organs, and text materials written by private prosecutors and defendants. Official documents (legal documents, transcripts, arbitration documents, and powers of attorney) generally follow a unified template consisting of a header, a body, and a tail. The header states the producing authority, the file name, the main or supplementary volume, and other items. The body states the grounds and the legal provisions violated. The tail states the handling unit, the handler, and the date. Document headers fall broadly into four classes. The first class contains only the document name, for example a report submitting evidence or an application. The second class contains the producing authority, the file name, and the document number, where document numbers follow uniform formats such as "× Criminal Investigation No. 〔×〕×", "× Procuratorate (×) ×", and "× Criminal Final No. ×". The third class contains the producing authority, the file name, the document number, and other items; for example, the header of a first-instance criminal judgment includes the producing authority, the file name (Criminal Judgment of the × People's Court), the case number, and other items (the prosecuting authority, the defendant, the defender, the cause and origin of the case, the trial organization, the trial mode, and the trial proceedings). The fourth class adds the main or supplementary volume marking to the third class. The text files of private prosecutors and defendants generally consist of a document name and a body. In the text files of criminal cases, the position of the file name within the file is not standardized, but the name always appears on the first page on a line of its own.
Image and audio-visual materials generally consist of a description and the associated media, where the description states the source, time, place, and collecting personnel of the media and describes its content. Copies are typically of valid certificates issued by the relevant authorities, such as identity cards, marriage certificates, and driver's licenses. A table mainly consists of a table name and the table contents; the table name appears on the first page on a line of its own.
(2) Extracting volume file information:
The content of a volume file is generally expressed as text, images, audio-visual media, copies, or tables, and each form presents its information differently. To extract volume file information, the invention proceeds as follows:
1) Extracting document names and information:
In a criminal case file, text appears in PDF format. To extract information from text files, the invention first analyzes the PDF text structure; second, extracts each line of text on the first page using text recognition; and finally, fuzzy-matches each line against the file name entries in the feature library to identify the file name.
Extracting other information of the document: based on the document name, first retrieve the corresponding content using the document's key information in the feature library to form the brief description of the document; second, generate the file ID number from the document's directory code in the feature library; finally, determine the file attribute and category.
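The fuzzy-matching step can be illustrated with a minimal Python sketch. The description does not name a matching algorithm, so the use of `difflib.SequenceMatcher` from the standard library, the function name `identify_file_name`, and the 0.6 cutoff are all assumptions; the first-page lines are taken as already recognized text.

```python
import difflib


def identify_file_name(first_page_lines, known_names, cutoff=0.6):
    """Return the feature library name that best matches any first-page line.

    Each recognized line is compared with every known file name; the name
    with the highest similarity ratio wins, provided it reaches `cutoff`.
    Returns None when no line is similar enough to any known name.
    """
    best_name, best_score = None, 0.0
    for line in first_page_lines:
        # Text recognition often inserts stray spaces, so compare without them.
        text = "".join(line.split())
        if not text:
            continue
        for name in known_names:
            candidate = "".join(name.split())
            score = difflib.SequenceMatcher(None, text, candidate).ratio()
            if score > best_score:
                best_name, best_score = name, score
    return best_name if best_score >= cutoff else None


# Hypothetical names and recognized lines; "Detent1on" imitates an OCR error.
names = ["Detention Certificate", "Arrest Warrant", "Criminal Judgment"]
lines = ["XX Public Security Bureau", "Detent1on Certificate", "No. 12"]
matched = identify_file_name(lines, names)
```

Because the file name sits on its own line, matching whole lines rather than substrings keeps unrelated header text from being mistaken for a name.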
2) Extracting information from images and audio-visual media:
The description part of image and audio-visual materials is expressed in PDF format. The invention first analyzes the structure of the PDF text of the description; second, using text recognition together with items such as time, place, and collecting personnel in the feature library, retrieves the related content from the description according to the key information in the feature library to form the brief description of the images and audio-visual media; and finally, generates the file ID number from the directory code in the feature library.
3) Extracting information from copies:
Copies appear in the case file as images acquired with a copier. To compensate for the effect of the certificate's placement and angle on information extraction, the invention first detects the edges of the certificate with an edge detection algorithm, applies the Hough transform to the edges to detect the pairs of parallel lines along the top and bottom and the left and right boundaries of the certificate, determines the angle at which the certificate was captured from the slopes of these lines, and rotates the image by that angle; second, applies OCR to the rotated copy to extract information such as the certificate type, the holder's name, and the date of issue; and finally, generates the file ID number from the directory code in the feature library.
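The angle analysis can be sketched in Python as follows, assuming the boundary lines have already been found by edge detection and a Hough transform in some image library and are supplied as line segments. The function names and the averaging rule are illustrative assumptions, not the patented procedure.

```python
import math


def skew_angle(segments):
    """Estimate the certificate's rotation in degrees from boundary segments.

    `segments` holds (x1, y1, x2, y2) tuples for the detected top/bottom and
    left/right edges. Each segment's direction is folded into [-45, 45) so
    that near-horizontal and near-vertical edges report the same deviation
    from the nearest axis; the deviations are then averaged.
    """
    deviations = []
    for x1, y1, x2, y2 in segments:
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
        deviations.append((angle + 45.0) % 90.0 - 45.0)
    if not deviations:
        raise ValueError("at least one boundary segment is required")
    return sum(deviations) / len(deviations)


def undo_skew(x, y, angle_degrees):
    """Rotate a point about the origin by -angle to cancel the skew."""
    t = math.radians(-angle_degrees)
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))


# Synthetic certificate edges rotated by 5 degrees (made-up coordinates).
a = math.radians(5.0)
edges = [
    (0.0, 0.0, 100 * math.cos(a), 100 * math.sin(a)),   # top edge
    (0.0, 0.0, -50 * math.sin(a), 50 * math.cos(a)),    # left edge
]
estimated = skew_angle(edges)
```

The sign of the result depends on whether the image y axis points up or down, so an implementation would fix that convention before rotating the scanned copy.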
4) Extracting information from tables:
A table mainly consists of a table name, the time the table was made, and the table contents, and is expressed as PDF text. To extract table information, first analyze the PDF text structure; second, extract each line of text on the first page and fuzzy-match each line against the file name entries in the feature library to identify the table name; and finally, extract the key information according to the table name and its features to form the brief description, and generate the ID number.
(3) Criminal case volume management library:
The volume files of a criminal case are mainly used for reading and archiving. To facilitate management and the timely generation of catalogs for different purposes, the invention builds the volume management library with MySQL 8.1. The management library mainly contains the producing institution, file name, file attribute, file category, permission, file ID number, file type, and brief description.
Producing institution: filled in according to the issuing department.
File name: extracted from the volume file using character recognition.
File attribute: extracted from the volume file using character recognition; if none is found, "main volume" is filled in.
File category: looked up in the feature library by file name.
Permission: records the reading permission of the file, filled in by the issuer of the file according to the case.
File ID number: indicates both the order of the file in the catalog and its number in the volume repository. It is a digit sequence AABBBCCC, where AABBB is looked up in the feature library by file name and CCC is the serial number of a sub-file under the same file (for example, the detention certificates of different persons in one criminal case), generated automatically in chronological order.
File type: looked up in the feature library by file name.
Brief description: records the summary of the volume file. The key information for the file is looked up in the feature library by file name, the related content is retrieved from the file using that key information, and the corresponding items are filled in.
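As an illustration of the management library fields and the AABBBCCC numbering, the Python sketch below uses the standard-library `sqlite3` module as a stand-in for MySQL. The table name, column names, and helper functions are assumptions; CCC is assigned in insertion order, which matches chronological order only if files are added as they are produced.

```python
import sqlite3

# Column names are illustrative; the description lists the fields only.
SCHEMA = """
CREATE TABLE volume_file (
    file_id      TEXT PRIMARY KEY,              -- AABBBCCC
    organization TEXT NOT NULL,                 -- producing institution
    file_name    TEXT NOT NULL,
    attribute    TEXT NOT NULL DEFAULT 'main',  -- main or supplementary volume
    file_type    TEXT NOT NULL,                 -- document or evidence
    category     TEXT NOT NULL,
    permission   TEXT NOT NULL,                 -- reading permission
    brief        TEXT NOT NULL DEFAULT ''       -- brief description
)
"""


def next_file_id(conn, directory_code):
    """Append the next three-digit serial CCC to the AABBB directory code."""
    count = conn.execute(
        "SELECT COUNT(*) FROM volume_file WHERE file_id LIKE ?",
        (directory_code + "%",),
    ).fetchone()[0]
    if count >= 999:
        raise ValueError("no free serial number under this directory code")
    return f"{directory_code}{count + 1:03d}"


def add_file(conn, directory_code, organization, file_name, file_type,
             category, permission, brief="", attribute=None):
    """Insert one volume file; a missing attribute defaults to the main volume."""
    file_id = next_file_id(conn, directory_code)
    conn.execute(
        "INSERT INTO volume_file VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (file_id, organization, file_name, attribute or "main",
         file_type, category, permission, brief),
    )
    return file_id


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
# Two detention certificates for different persons (made-up values).
first = add_file(conn, "01007", "public security organ",
                 "detention certificate", "document",
                 "compulsory-measure documents", "investigator")
second = add_file(conn, "01007", "public security organ",
                  "detention certificate", "document",
                  "compulsory-measure documents", "investigator")
```

Counting existing rows is the simplest way to derive the serial; a system that allows deletions would need a separate counter to avoid reusing a number.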
Thirdly, compiling the criminal volume catalog along with the case:
To help public security organs, judicial departments at all levels, and litigation participants quickly grasp the basic facts and evidence composition of a criminal case from a massive case file at any time, the invention combines the files provided by public security organs, procuratorates, courts, and litigation participants with the corresponding cataloging norms, and divides the volume files of a criminal case into public security materials, procuratorate materials, court materials, execution materials, private prosecutor materials, defendant materials, third-party institution materials (arbitration and notarization), audio-visual materials, other litigation-related materials, and others. Public security materials are formed by the public security organs during investigation and generally consist of the legal documents used in handling the case and the evidence produced in the course of the investigation. Procuratorate materials are produced by the procuratorate and are divided into a main volume (legal documents and the procuratorate's external formalities) and a supplementary volume (reports and internal formalities formed by the procuratorate). Court materials are the legal documents and evidence formed by the court during trial. Execution materials include the legal documents and evidence formed by the judicial administrative authorities when executing court documents. Private prosecutor materials and defendant materials are the documents and evidence submitted by the private prosecutor or the defendant.
(1) Catalog framework:
To reflect in the catalog the case-handling stages and the sources of the files formed along with the case, the invention designs the criminal case volume catalog framework as follows: public security materials, procuratorate materials, court materials, execution materials, private prosecutor materials, defendant materials, third-party institution materials (arbitration and notarization), audio-visual materials, other litigation-related materials, and others form the first-level catalog; legal documents and evidence form the second-level catalog; the file category forms the third-level catalog; and the specific files form the fourth-level catalog. The catalog is compiled from the corresponding items of the volume management library.
The first-level catalog mainly indicates the source of a volume file, namely the producing institution, and is generated from the producing institution item of the volume management library.
The second-level catalog mainly indicates the type of a volume file, namely legal documents or evidence, and is generated from the file type item of the volume management library.
The third-level catalog mainly indicates the category of a volume file and is generated from the category item of the volume management library.
The fourth-level catalog mainly consists of the names and abstracts of the specific volume files. The file name is generated from the file name in the volume management library. The abstract lets a reader quickly grasp the basic facts and evidence composition of the case from the catalog. Its length varies by file: the brief description of a valid certificate contains only the certificate name and date, whereas that of some files is longer, for example an on-site investigation record (the time the incident was discovered or reported, the name and unit of the person who secured the scene, the time that person arrived, the place of investigation, the names, positions, and units of the on-site commander and investigators, the names, units, and addresses of the witnesses, and the condition of the scene). In the invention, the brief description is presented in the form of an abstract, whose content is written solely from the brief description item of the volume management library.
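The four-level framework maps naturally onto nested groupings of management library records. The Python sketch below is one hypothetical way to build it; the dictionary keys and the function name `build_catalog` are assumptions made for illustration.

```python
def build_catalog(records):
    """Group management library records into the four catalog levels.

    Level 1: source (producing institution item)
    Level 2: file type (legal documents or evidence)
    Level 3: file category
    Level 4: list of (file name, abstract) pairs
    """
    catalog = {}
    for rec in records:
        level2 = catalog.setdefault(rec["organization"], {})
        level3 = level2.setdefault(rec["file_type"], {})
        level4 = level3.setdefault(rec["category"], [])
        level4.append((rec["file_name"], rec["brief"]))
    return catalog


# Made-up records for illustration only.
records = [
    {"organization": "public security materials",
     "file_type": "legal documents",
     "category": "compulsory-measure documents",
     "file_name": "detention certificate",
     "brief": "person detained: A; date: 2019-01-01"},
    {"organization": "public security materials",
     "file_type": "evidence",
     "category": "on-site investigation",
     "file_name": "on-site investigation record",
     "brief": "place: B; investigators: C, D"},
]
catalog = build_catalog(records)
```

Since Python dictionaries keep insertion order, feeding the records in file ID order yields a catalog that is already arranged as the framework requires.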
(2) Compiling the volume catalog:
By use, a case file can be divided into an archive volume and a reading volume, and an archive volume catalog and a reading volume catalog are generated accordingly. The catalog order is determined by the file ID numbers in the volume management library. Specifically, entries are sorted in ascending order of AA; where AA is equal, in ascending order of BBB; and where AABBB is equal, in ascending order of CCC.
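The ordering rule amounts to a three-key sort on the parts of the file ID number. A minimal Python sketch, with hypothetical function names and made-up ID values:

```python
def split_file_id(file_id):
    """Split an AABBBCCC file ID into the integers (AA, BBB, CCC)."""
    if len(file_id) != 8 or not file_id.isdigit():
        raise ValueError("file ID must be exactly 8 digits")
    return int(file_id[:2]), int(file_id[2:5]), int(file_id[5:])


def catalog_order(file_ids):
    """Sort IDs by AA, then BBB, then CCC, each in ascending order."""
    return sorted(file_ids, key=split_file_id)


ordered = catalog_order(["02001001", "01007002", "01007001", "01002001"])
```

Because every part is zero-padded to a fixed width, sorting the ID strings directly gives the same order; splitting them simply makes the three comparison keys explicit.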
1) Compiling the archive volume catalog:
The archive volume is the complete set of volume files formed while the criminal case is handled, and its catalog lists all volume files of the case without any restriction. It is mainly used for archiving. The catalog omits the abstract part of the fourth-level catalog and mainly consists of the first-level, second-level, and third-level catalogs and the file names of the fourth-level catalog.
2) Compiling the reading volume catalog:
The reading volume is the set of volume files that the relevant person or department is permitted to read, and its catalog may include only the volume files within that reading permission. It is mainly intended to let the reader grasp the outline of the case and its evidence from the catalog. The catalog mainly consists of the first-level, second-level, third-level, and fourth-level catalogs (file name and abstract) that fall within the reading permission.
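The difference between the two catalogs can be sketched in Python as two views over the same records. The permission model here (one permission label per file, a set of labels per reader) is an assumption for illustration; the description states only that each file records a reading permission.

```python
def archive_catalog(records):
    """List every file by ID and name; abstracts are omitted."""
    ordered = sorted(records, key=lambda r: r["file_id"])
    return [(r["file_id"], r["file_name"]) for r in ordered]


def reading_catalog(records, reader_permissions):
    """List only the files the reader may read, with their abstracts."""
    ordered = sorted(records, key=lambda r: r["file_id"])
    return [(r["file_id"], r["file_name"], r["brief"])
            for r in ordered
            if r["permission"] in reader_permissions]


# Made-up records; "internal" and "party" are hypothetical permission labels.
records = [
    {"file_id": "02003001", "file_name": "case review report",
     "brief": "internal review", "permission": "internal"},
    {"file_id": "01007001", "file_name": "detention certificate",
     "brief": "person detained: A", "permission": "party"},
]
archive = archive_catalog(records)
reading = reading_catalog(records, {"party"})
```

Keeping both views as filters over one management library means a change in a file's permission is reflected the next time the reading catalog is generated.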
Fourth, linking the volume files:
After gaining a rough understanding of the basic facts and evidence composition of a case from the catalog, a reader typically analyzes the case with a questioning attitude and reviews specific volume files purposefully, which requires reading the complete file. To this end, the invention retrieves the corresponding volume file by the file ID number in the volume management library and displays it.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.

Claims (7)

1. A method for cataloging criminal case electronic files, characterized by comprising the following steps:
step one, analyzing the criminal case file, extracting the features of the volume files, and constructing a criminal volume feature library;
step two, classifying and identifying the volume files of the criminal case, extracting file information according to the features, and constructing a criminal volume management library;
step three, compiling a reading volume catalog and an archive volume catalog using the management library;
in step one, the criminal volume feature library comprises: the producing institution, file name, file attribute, file type, file category, directory code, and key information;
the file attribute indicates whether the file belongs to the main volume or the supplementary volume;
the file type indicates whether the file is a document or evidence;
the file category is a finer classification of document-type files;
the directory code is the catalog number of the volume file and follows the catalog-order specifications of the public security, procuratorial, and court organs;
the key information records the summary information of the volume file and is defined according to the points legal practitioners focus on when reading the file;
in step two, constructing the criminal volume management library comprises:
(1) Structure of volume files: the text files of criminal cases comprise official documents written by the public security, procuratorial, and court organs, and text materials written by private prosecutors and defendants; the template of an official document comprises a header, a body, and a tail; the header states the producing authority, the file name, the main or supplementary volume, and other items; the body states the grounds and the legal provisions violated; the tail states the handling unit, the handler, and the date;
(2) Extracting volume file information: the content of a volume file is expressed as text, images, audio-visual media, copies, and tables, and the information is extracted using text, table, image, and streaming-media processing techniques;
(3) Criminal case volume management library:
the volume management library is built with MySQL 8.1 and mainly contains the producing institution, file name, file attribute, file category, permission, file ID number, file type, and brief description;
the producing institution is filled in according to the issuing department;
the file name is extracted from the volume file using character recognition and filled in;
the file attribute is extracted from the volume file using character recognition and filled in;
the file category is looked up in the feature library by file name and filled in;
the permission records the reading permission of the file and is filled in by the issuer of the file according to the case;
the file ID number indicates both the order of the file in the catalog and its number in the volume repository;
the file type is looked up in the feature library by file name and filled in;
the brief description records the summary of the volume file: the key information of the file is looked up in the feature library by file name, the related content is retrieved from the file using the key information, and the corresponding items are filled in.
2. The method for cataloging criminal case electronic files of claim 1, wherein the header of an official document falls into four classes: the first class contains only the document name; the second class contains the producing authority, the file name, and the document number; the third class contains the producing authority, the file name, the document number, and other items; the fourth class adds the main or supplementary volume marking to the third class;
image and audio-visual materials consist of two parts, a description and the associated media, where the description states the source, time, place, and collecting personnel of the media and describes its content; a copy is of a valid certificate issued by the relevant authority; a table consists of a table name and the table contents; the table name appears on the first page on a line of its own.
3. The method for cataloging criminal case electronic files of claim 1, wherein extracting the volume file information further comprises:
1) Extracting document names and information:
first, analyzing the PDF text structure; second, extracting each line of text on the first page using text recognition; finally, fuzzy-matching each line against the file name entries in the feature library to identify the file name;
extracting other information of the document: based on the document name, first retrieving the corresponding content using the document's key information in the feature library to form the brief description of the document; second, generating the file ID number from the document's directory code in the feature library; finally, determining the file attribute and category;
2) Extracting information from images and audio-visual media:
the description part of image and audio-visual materials is expressed in PDF format; first, the structure of the PDF text of the description is analyzed; second, using text recognition together with the time, place, and collecting personnel items in the feature library, the related content is retrieved from the description according to the key information in the feature library to form the brief description of the images and audio-visual media; finally, the file ID number is generated from the directory code in the feature library;
3) Extracting information from copies:
first, detecting the edges of the certificate with an edge detection algorithm, applying the Hough transform to the edges to detect the pairs of parallel lines along the top and bottom and the left and right boundaries of the certificate, determining the angle at which the certificate was captured from the slopes of these lines, and rotating the image by that angle; second, applying OCR to the rotated copy to extract the certificate type, the holder's name, and the date of issue; finally, generating the file ID number from the directory code in the feature library;
4) Extracting information from tables:
first, analyzing the PDF text structure; second, extracting each line of text on the first page and fuzzy-matching each line against the file name entries in the feature library to identify the table name; finally, extracting the key information according to the table name and its features to form the brief description, and generating the ID number.
4. The method for cataloging criminal case electronic files of claim 1, wherein step three, compiling the reading volume catalog and the archive volume catalog using the management library, specifically comprises:
(1) Catalog framework:
the criminal case volume catalog framework is designed as follows: public security materials, procuratorate materials, court materials, execution materials, private prosecutor materials, defendant materials, third-party institution materials, audio-visual materials, other litigation-related materials, and others form the first-level catalog; legal documents and evidence form the second-level catalog; the file category forms the third-level catalog; the specific files form the fourth-level catalog; and the catalog is compiled from the corresponding items of the volume management library;
(2) Compiling the volume catalog:
by use, the case file is divided into an archive volume and a reading volume, and an archive volume catalog and a reading volume catalog are generated accordingly; the catalog order is determined by the file ID numbers in the volume management library.
5. The method for cataloging criminal case electronic files of claim 4, wherein the first-level catalog indicates the source of a volume file and is generated from the producing institution item of the volume management library;
the second-level catalog indicates the type of a volume file, namely legal documents or evidence, and is generated from the file type item of the volume management library;
the third-level catalog indicates the category of a volume file and is generated from the category item of the volume management library;
the fourth-level catalog consists of the names and abstracts of the specific volume files, and the file names are generated from the file names in the volume management library.
6. The method for cataloging criminal case electronic files of claim 4, wherein compiling the volume catalog comprises:
1) Compiling the archive volume catalog:
the archive volume is the complete set of volume files formed while the criminal case is handled, and its catalog lists all volume files of the case without any restriction; it is used for archiving, and the catalog omits the abstract part of the fourth-level catalog and consists of the first-level, second-level, and third-level catalogs and the file names of the fourth-level catalog;
2) Compiling the reading volume catalog:
the reading volume is the set of volume files that the relevant person or department is permitted to read, and its catalog may include only the volume files within the reading permission; the catalog mainly consists of the first-level, second-level, third-level, and fourth-level catalogs that fall within the reading permission.
7. The method for cataloging criminal case electronic files of claim 4, further comprising linking the volume files: retrieving the corresponding volume file by the file ID number in the volume management library and displaying it.
CN201910936642.5A 2019-09-29 2019-09-29 Method for cataloging electronic file along with criminal investigation Active CN110675289B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910936642.5A CN110675289B (en) 2019-09-29 2019-09-29 Method for cataloging electronic file along with criminal investigation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910936642.5A CN110675289B (en) 2019-09-29 2019-09-29 Method for cataloging electronic file along with criminal investigation

Publications (2)

Publication Number Publication Date
CN110675289A CN110675289A (en) 2020-01-10
CN110675289B true CN110675289B (en) 2023-05-05

Family

ID=69080176

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910936642.5A Active CN110675289B (en) 2019-09-29 2019-09-29 Method for cataloging electronic file along with criminal investigation

Country Status (1)

Country Link
CN (1) CN110675289B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112612893A (en) * 2020-12-29 2021-04-06 广西安怡臣信息技术有限公司 Electronic file case generation system
CN113157642A (en) * 2021-03-19 2021-07-23 浪潮云信息技术股份公司 Method for realizing electronic material digital process automation
CN113222788A (en) * 2021-05-17 2021-08-06 广西安怡臣信息技术有限公司 Intelligent marking method
CN113222417A (en) * 2021-05-17 2021-08-06 广西安怡臣信息技术有限公司 Electronic file data factory full-process intelligent application management system
CN113254396B (en) * 2021-06-23 2021-09-24 昌和云科技有限公司 Case collaborative management system for multiple departments
CN113609856A (en) * 2021-07-21 2021-11-05 浙江建达科技股份有限公司 Electronic file reading system based on artificial intelligence and marking tool thereof
CN115391577B (en) * 2022-09-29 2023-06-23 浙江星汉信息技术股份有限公司 Electronic file management method and system based on machine learning algorithm

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2738368A1 (en) * 1995-09-01 1997-03-07 Finance Christian Design and production of personalised multi-media electronic catalogue
CN101853311A (en) * 2010-06-18 2010-10-06 上海百事通信息技术有限公司 Legal service method and system
CN102955822A (en) * 2011-08-31 2013-03-06 河南新创元信息网络有限公司 Classification-type secretarial document management system and method
CN104636835A (en) * 2013-11-06 2015-05-20 北京航天长峰科技工业集团有限公司 Trans-department case coordination processing system
CN105159968A (en) * 2015-08-25 2015-12-16 浪潮(北京)电子信息产业有限公司 Directory management method for file system and client
CN107085584A (en) * 2016-11-09 2017-08-22 中国长城科技集团股份有限公司 A kind of cloud document management method, system and service end based on content
CN109977073A (en) * 2019-03-11 2019-07-05 厦门纵横集团科技股份有限公司 A kind of law court's electronics folder automation filing system and its method
CN110135715A (en) * 2019-05-06 2019-08-16 江苏新视云科技股份有限公司 A kind of intelligence court management method
CN110209632A (en) * 2019-05-27 2019-09-06 武汉市润普网络科技有限公司 A kind of electronics folder with case production, turn shelves system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
On the Simplification of Archival File Arrangement; Shen Lei et al.; 《档案学通讯》 (Archives Science Bulletin); 2016-06-30 (No. 6); pp. 39-42 *

Also Published As

Publication number Publication date
CN110675289A (en) 2020-01-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant