CN116663549B - Digitized management method, system and storage medium based on enterprise files - Google Patents


Info

Publication number
CN116663549B
Authority
CN
China
Prior art keywords
file
files
text
enterprise
digital
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310567168.XA
Other languages
Chinese (zh)
Other versions
CN116663549A (en
Inventor
陈四娣
潘灵
胡敏
袁虎将
李慢慢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hainan University of Science and Technology
Original Assignee
Hainan University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hainan University of Science and Technology
Priority to CN202310567168.XA
Publication of CN116663549A
Application granted
Publication of CN116663549B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/18 Extraction of features or characteristics of the image
    • G06V30/1801 Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections

Abstract

The invention discloses a digitized management method, system and storage medium based on enterprise files, and relates to the technical field of digital file management. The method comprises the following steps: acquiring the enterprise files to be digitally managed, wherein the enterprise files comprise text files, picture files, audio recording files and video recording files; digitizing the enterprise files to obtain digitized files, wherein the digitizing comprises acquiring a text file, preprocessing its image and the like; extracting keywords from the digitized files and classifying them, wherein the keyword extraction comprises performing word segmentation on the text files to obtain a plurality of classification labels; and archiving and sorting the files according to the classification labels. The method obtains more detailed labels for classifying documents, and solves the problem that prior-art digital management methods for enterprise files have difficulty classifying the enterprise files further.

Description

Digitized management method, system and storage medium based on enterprise files
Technical Field
The invention relates to the technical field of file digital management, in particular to a digital management method, system and storage medium based on enterprise files.
Background
Enterprise archive management is always one of the important links in enterprise management. In the past, enterprise files have been stored in handwritten or printed form, and the management mode has been relatively cumbersome, with the risk of losing and destroying the files. However, with the development of information technology, digital file management is gradually and widely adopted by enterprises, so that comprehensive, unified, efficient and accurate file management is realized. The digital file management technology converts paper files into electronic data, and realizes the rapid storage, retrieval and management of the files through a computer technology, thereby greatly improving the reliability and safety of file management.
Existing enterprise archive digital management methods generally include scanning means, text reading means, data uploading means, data storage means and data modification means, or intelligent and accurate archive processing based on text semantic understanding of the electronically scanned document.
For example, the invention patent with publication No. CN112633042A discloses a digital file management system and method. The system comprises a scanning device, a text reading device, a data uploading device, a data storage device and a data correction device. The scanning device is in signal connection with the text reading device and sends the scanned archive text to it; the text reading device is in signal connection with the data uploading device, reads the scanned text content, converts it into digital content and transmits the digital content to the data uploading device; the data uploading device is in signal connection with the data storage device and uploads the digital content to it; the data storage device is in signal connection with the data correction device and stores the digital content of the uploaded file; and the data correction device detects error content in the digital content of the file and corrects it. That system offers a high degree of automation, an archive text correction function and high management efficiency.
As another example, the invention patent with publication No. CN115827939A discloses a digitized archive management system. Through a context encoder comprising an embedding layer, the system extracts global high-dimensional semantic features of each word in the text description of an electronically scanned document; it then uses a text convolutional neural network with one-dimensional convolution kernels of different scales to extract multi-scale semantic understanding associated features of the text description, classifies the text description into a topic label according to those features, and automatically archives the electronically scanned document accordingly. In this way, intelligent and accurate archiving is performed based on semantic understanding of the text of the electronically scanned document, thereby enabling digitized archive management.
However, in the process of implementing the technical scheme of the embodiments of the application, the inventors found that the above technologies have at least the following technical problems:
The first existing method, which provides a scanning device, a text reading device, a data uploading device, a data storage device and a data correction device, detects error content in the digital content of archives and corrects it, but cannot further classify the digital content. The second existing method archives electronically scanned documents intelligently and accurately based on text semantic understanding, by extracting global high-dimensional semantic features of each word in the text description through a context encoder comprising an embedding layer, but cannot classify the text content more finely and accurately. In summary, prior-art digital management methods for enterprise files have difficulty classifying the enterprise files further.
Disclosure of Invention
The embodiment of the application solves the problem that the enterprise file digital management method in the prior art is difficult to carry out further classification processing on the enterprise file by providing the enterprise file based digital management method, the enterprise file based digital management system and the storage medium, and realizes further classification management on the enterprise file content.
The embodiment of the application provides a digitized management method based on enterprise files, which comprises the following steps: acquiring the enterprise files to be digitally managed, wherein the enterprise files comprise text files, picture files, audio recording files and video recording files; digitizing the enterprise files to obtain digitized files; extracting keywords from the digitized files and classifying them, wherein the keyword extraction comprises performing word segmentation on the text files to obtain a plurality of classification labels; and archiving and sorting the files according to the classification labels.
Further, the word segmentation is performed by a word segmentation model, and the main steps are as follows. Feature selection: converting the Chinese text in a text file into a text sequence, taking each character in the text sequence as a state, and extracting character features, wherein the character features comprise the current character, the previous character and the next character. Model training: training the word segmentation model on a set training corpus to obtain parameters including the transition probabilities between states and the conditional probabilities between states and features. Word segmentation prediction: predicting a new text sequence with the trained word segmentation model to obtain a word segmentation sequence, which serves as the word segmentation labels. The transition probability between states is calculated by a formula in which $P(y_i \mid y_{i-1})$ denotes the conditional probability of the current state $y_i$ given the previous state $y_{i-1}$, $f_k(y_{i-1}, y_i)$ denotes the value of the $k$-th feature function at $y_{i-1}$ and $y_i$, $\lambda_k$ denotes the weight of the $k$-th feature function, $i$ denotes the $i$-th state, $i = 1, 2, 3, \ldots, j$, and $k$ denotes the $k$-th feature function, $k = 1, 2, 3, \ldots, n$. The conditional probability between the state and the feature is calculated by a formula in which $P(y_i \mid y_{i-1}, x)$ denotes the conditional probability between states and features, $x$ denotes the input sequence, $P(y_i \mid y_{i-1}, x, i)$ denotes the conditional probability that the current Chinese character has labeling state $y_i$ given the input sequence $x$ and the labeling state $y_{i-1}$ of the previous Chinese character, and the remaining term is the sum of the probabilities of the current Chinese character states $y_i$ given the state $y_{i-1}$ of the previous Chinese character and the input sequence $x$.
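The feature selection step above can be illustrated with a minimal sketch. It assumes a BMES tag set (begin, middle, end, single-character word), which the text does not name; the function names and the `<BOS>`/`<EOS>` padding symbols are hypothetical and chosen only for illustration.

```python
def char_features(text):
    """Return one feature dict per character: current, previous and next character."""
    feats = []
    for i, ch in enumerate(text):
        feats.append({
            "cur": ch,
            # Padding symbols at the sequence boundaries are an assumption.
            "prev": text[i - 1] if i > 0 else "<BOS>",
            "next": text[i + 1] if i < len(text) - 1 else "<EOS>",
        })
    return feats


def tags_to_words(text, tags):
    """Turn a per-character tag sequence (assumed BMES) into a list of words."""
    words, buf = [], ""
    for ch, tag in zip(text, tags):
        buf += ch
        if tag in ("E", "S"):  # a word closes on an End or Single tag
            words.append(buf)
            buf = ""
    if buf:  # keep any unfinished trailing word
        words.append(buf)
    return words


features = char_features("企业档案")
words = tags_to_words("企业档案管理", ["B", "E", "B", "E", "B", "E"])
```

Here `words` holds the segmented words that could then serve as candidate classification labels.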
Further, digitizing the text file comprises the following steps: acquiring the text file, namely acquiring image data of the text file by photographing or scanning; image preprocessing, namely preprocessing the image data, wherein the preprocessing comprises sharpness adjustment, denoising, adaptive binarization and text direction detection; character segmentation, namely dividing the characters in the preprocessed image into single characters or word blocks for processing; character recognition, namely recognizing the single characters or word blocks by OCR technology and converting them into text form; and text post-processing, namely performing data processing on the recognized text result.
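Of the preprocessing steps listed above, adaptive binarization is easy to show in isolation. The sketch below is one common variant (local mean thresholding) written in plain Python; the text does not say which variant is used, and the `window` and `offset` parameters are assumptions.

```python
def adaptive_binarize(gray, window=3, offset=5):
    """Binarize a grayscale image given as rows of ints in 0..255.

    A pixel becomes black (0) when it is darker than the mean of its
    local window minus `offset`; otherwise it becomes white (255).
    """
    h, w = len(gray), len(gray[0])
    r = window // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Clip the window at the image borders.
            ys = range(max(0, y - r), min(h, y + r + 1))
            xs = range(max(0, x - r), min(w, x + r + 1))
            vals = [gray[j][i] for j in ys for i in xs]
            mean = sum(vals) / len(vals)
            row.append(0 if gray[y][x] < mean - offset else 255)
        out.append(row)
    return out


page = [[200, 200, 200],
        [200, 50, 200],
        [200, 200, 200]]
binary = adaptive_binarize(page)
```

Because the threshold follows the local mean, uneven lighting across a scanned page affects the result less than a single global threshold would.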
Further, extracting keywords from the picture files and classifying them comprises: acquiring a picture file; picture classification, namely classifying the objects in the picture file with a deep learning model, wherein the trained model identifies different objects in the picture, including vehicles, buildings and people, and generates keywords; object detection, namely detecting regions in the picture file with an object detection algorithm, identifying the positions of a plurality of objects and generating keywords; and keyword extraction, namely performing word segmentation, part-of-speech tagging and syntactic analysis on the text derived from the picture classification result and the object detection result, using the text analysis functions of natural language processing technology, and extracting keywords, wherein the keywords are ranked and screened according to their occurrence frequency and weight characteristics.
Further, extracting keywords from the audio recording files and classifying them comprises: acquiring an audio recording file; recording preprocessing, namely preprocessing the audio recording file, wherein the preprocessing comprises audio format conversion, noise reduction and volume normalization; speech recognition, namely recognizing the preprocessed audio recording file with automatic speech recognition technology and converting the audio signal in the recording into text form; and keyword extraction, namely performing word segmentation, part-of-speech tagging and syntactic analysis on the speech recognition result with natural language processing technology, extracting keywords, and ranking and screening the keywords according to their occurrence frequency and weight characteristics.
Further, extracting keywords from the video recording files and classifying them comprises: acquiring a video recording file; preprocessing, namely converting the video recording file into an image sequence and preprocessing the image sequence, wherein the preprocessing comprises cropping, scaling and denoising operations; video analysis, namely analyzing the processed video recording file with a deep learning model to classify and detect the scenes and objects in the video; and keyword extraction, namely performing word segmentation, part-of-speech tagging and syntactic analysis on the text derived from the video analysis result, using the text analysis functions of natural language processing technology, extracting keywords and phrases, and ranking and screening the keywords according to their occurrence frequency and weight characteristics.
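As an illustration of the volume normalization step in the recording preprocessing, the following sketch applies simple peak normalization to a list of floating-point samples. Peak normalization is only one possible reading of "volume normalization"; the `target_peak` parameter is an assumption.

```python
def normalize_volume(samples, target_peak=1.0):
    """Scale samples so that the largest absolute value equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        # Silence (or an empty clip) cannot be scaled; return it unchanged.
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]


quiet_clip = [0.25, -0.5, 0.125]
normalized = normalize_volume(quiet_clip)
```

Bringing recordings to a common level before speech recognition keeps quiet and loud recordings comparable.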
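The ranking and screening by occurrence frequency and weight, which the picture, audio recording and video recording paragraphs above share, could look like the following minimal sketch. The scoring rule (frequency multiplied by a per-term weight), the stop-word set and the `top_k` cut-off are assumptions; the text does not specify how frequency and weight are combined.

```python
from collections import Counter


def rank_keywords(tokens, weights=None, stopwords=frozenset(), top_k=5):
    """Rank candidate keywords by frequency times weight (default weight 1.0)."""
    weights = weights or {}
    counts = Counter(t for t in tokens if t not in stopwords)
    scored = {t: c * weights.get(t, 1.0) for t, c in counts.items()}
    # Sort by descending score, breaking ties alphabetically for stable output.
    return sorted(scored, key=lambda t: (-scored[t], t))[:top_k]


tokens = ["contract", "the", "invoice", "contract", "the", "audit"]
keywords = rank_keywords(tokens, weights={"audit": 3.0},
                         stopwords={"the"}, top_k=2)
```

In this toy input, the hypothetical weight on "audit" lifts it above the more frequent "contract".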
Furthermore, the digitized management method based on enterprise files comprises a file verification method for retrieving the digitized files after archiving and sorting, with the following specific steps. Receiving the digitized file to be stored and acquiring management object information: receiving the digitized file to be stored, and acquiring the management object information, which represents first authority information of the digitized file; the digitized file to be stored further comprises management authority information corresponding to the management object information, wherein the management authority information represents the authority level of the digitized file, corresponds to the user access authority of the objects using the storage environment, and limits which digitized files different using objects can access and acquire. Judging the encryption state and executing the encryption program: judging the encryption state of the digitized file to be stored; if the file is not encrypted, executing the encryption program, which acquires the corresponding object key and device verification key based on the management object information, and encodes and encrypts the digitized file and other related information with the device verification key to obtain an encrypted storage file, wherein the device verification key represents hardware verification information. Responding to a digitized file acquisition request: when a request object accesses the digitized file and sends a digitized file acquisition request, receiving the request and acquiring the hardware verification information and management object information of the request object; comparing the object of the acquisition request with the management object information and generating a judgment result; when the judgment result shows that the request object differs from the management object, acquiring the authority level of the requested digitized file and the management authority information of the request object; if the management authority information corresponds to the authority level, generating a file transmission request, sending it to the management object corresponding to the digitized file and acquiring feedback information; and decrypting the digitized file based on the feedback information and transmitting it to the request object. Decrypting and verifying the digitized file: decrypting the encrypted storage file based on the management object information and the hardware verification information, calculating a comparison group of hardware verification information, comparing it and generating a verification result. Outputting the file or executing request feedback: outputting the decrypted digitized file if the verification passes; if the verification fails, executing a request feedback program and responding to the digitized file acquisition request through the management object.
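The request-handling branch described above can be sketched as a small decision function. This is an illustrative reading only: the `StoredFile` type, the integer authority levels, the interpretation of "corresponds to" as "greater than or equal to", and the explicit `"reject"` outcome are all assumptions that the text does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredFile:
    file_id: str
    owner: str            # the management object of the file
    authority_level: int  # authority level carried by the file


def handle_request(stored, requester, requester_level):
    """Decide how to respond to a digitized file acquisition request."""
    if requester == stored.owner:
        # The requester is the management object: go straight to
        # decryption and hardware verification.
        return "decrypt_and_verify"
    if requester_level >= stored.authority_level:
        # A different requester with sufficient authority: forward a file
        # transmission request to the management object and await feedback.
        return "ask_owner"
    # Assumed outcome for insufficient authority.
    return "reject"


report = StoredFile("F1", owner="alice", authority_level=2)
decisions = [
    handle_request(report, "alice", 0),
    handle_request(report, "bob", 3),
    handle_request(report, "carol", 1),
]
```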
Further, executing the encryption program specifically comprises: generating a key acquisition request according to the management object information, sending it to the management object, and acquiring the biometric identification and device hardware information of the management object, wherein the device hardware information comprises a device mainboard number and a device hardware address; generating the object key from the preset biometric identification information, and encrypting the device hardware information with the object key to generate the device verification key; and combining the data to be stored with the device verification key and encrypting with the device verification key to generate the encrypted storage file. Responding to the digitized file acquisition request specifically comprises: receiving and responding to the request, acquiring the hardware verification information and management object information of the request object, generating the corresponding object key based on the management object information, and encrypting the hardware verification information with the object key to generate a device verification key; decrypting the encrypted storage file with this device verification key, wherein a decryption failure indicates that the request object or the requesting device is wrong, and a successful decryption yields the device verification key and the data to be stored; decrypting the device verification key again with the object key to obtain the hardware verification information, and comparing it; and providing the requested digitized file to the request object if the verification passes, and refusing the request if it does not.
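A rough sketch of the key chain described above, using only the Python standard library, is given below. It deliberately swaps in different primitives: the object key is a SHA-256 digest of a biometric template, and the device verification key is an HMAC of the hardware information under that key, standing in for the "encrypting with the object key" step. Real biometric readings vary between captures and would need fuzzy matching, and file encryption itself is out of scope here; all function names and sample values are hypothetical.

```python
import hashlib
import hmac


def object_key(biometric_template: bytes) -> bytes:
    """Derive an object key from a (hypothetical) stable biometric template."""
    return hashlib.sha256(biometric_template).digest()


def device_verification_key(obj_key: bytes, board_no: str, hw_addr: str) -> bytes:
    """Bind the mainboard number and hardware address to the object key.

    HMAC is a one-way stand-in for the encryption step described in the text.
    """
    message = f"{board_no}|{hw_addr}".encode("utf-8")
    return hmac.new(obj_key, message, hashlib.sha256).digest()


def verify_device(stored_key: bytes, biometric_template: bytes,
                  board_no: str, hw_addr: str) -> bool:
    """Recompute the key for the requester and compare in constant time."""
    candidate = device_verification_key(
        object_key(biometric_template), board_no, hw_addr)
    return hmac.compare_digest(stored_key, candidate)


template = b"example-biometric-template"
stored = device_verification_key(object_key(template), "MB-001", "00:11:22:33:44:55")
same_device = verify_device(stored, template, "MB-001", "00:11:22:33:44:55")
other_device = verify_device(stored, template, "MB-001", "66:77:88:99:AA:BB")
```

A request from a different device, or from a different person on the same device, produces a different key and fails the comparison.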
Furthermore, an embodiment of the application provides a digitized management system based on enterprise files, comprising an acquisition module, a processing module, a classification module and an archiving and sorting module, wherein: the acquisition module is used for acquiring the enterprise files to be digitally managed, the enterprise files comprising text files, picture files, audio recording files and video recording files; the processing module is used for digitizing the enterprise files to obtain digitized files; the classification module is used for extracting keywords from the digitized files and classifying them, including performing word segmentation on the text files to obtain a plurality of classification labels; and the archiving and sorting module is used for archiving and sorting according to the classification labels.
Further, embodiments of the present application provide a computer readable storage medium storing a program that when executed by a processor implements a method for digitally managing enterprise files.
One or more technical solutions provided in the embodiments of the present application at least have the following technical effects or advantages:
1. Existing digital management methods for enterprise archives detect and correct error content in the digital content of archives, but cannot further classify the digital content. The present invention acquires the enterprise files to be digitally managed, digitizes them to obtain digitized files, extracts keywords from the digitized files and classifies them, and archives and sorts them according to the classification labels, which effectively solves the problem that prior-art digital management methods for enterprise files have difficulty classifying the enterprise files further.
2. Word segmentation is performed on the text files to obtain a plurality of classification labels. The segmentation is carried out by a word segmentation model: the Chinese text is converted into a sequence, character features are extracted, and the model is trained on a set training corpus to obtain the transition probabilities between states and the conditional probabilities between states and features. The trained model can then predict a new text sequence to obtain a word segmentation sequence, which serves as the word segmentation labels. The method can thereby identify the core information and important subjects in a document, effectively improving the efficiency and accuracy of document management.
3. According to the method, the text files are subjected to keyword extraction and classification, the picture files are subjected to keyword extraction and classification, the audio recording files are subjected to keyword extraction and classification, and the video recording files are subjected to keyword extraction and classification, so that the digital management of enterprise files is realized, and the problem that the picture, audio recording and video recording files cannot be classified by taking keywords as standards is effectively solved.
Drawings
FIG. 1 is a flowchart of a method for digitally managing enterprise files according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart of digitizing a text file in the enterprise file-based digitized management method according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a digitized management system based on enterprise files according to an embodiment of the present application.
Detailed Description
To further illustrate the embodiments, the present invention provides the accompanying drawings. The drawings form part of the disclosure, mainly serve to illustrate the embodiments, and together with the description explain their operating principles. With reference to them, one skilled in the art will recognize other possible implementations and advantages of the present invention. Elements in the drawings are not drawn to scale, and like reference numerals generally designate like elements.
The technical scheme in the embodiment of the application aims to solve the problem that the enterprise file is difficult to further classify and process by the enterprise file-based digital management method in the prior art, and the general thought is as follows:
Acquiring the enterprise files to be digitally managed, wherein the enterprise files comprise text files, picture files, audio recording files and video recording files; digitizing the enterprise files to obtain digitized files; extracting keywords from the digitized files and classifying them; and archiving and sorting the files according to the classification labels. Word segmentation is performed on the text files to obtain a plurality of classification labels. The word segmentation is performed by a word segmentation model, mainly through the following steps. Feature selection: converting the Chinese text in the text file into a text sequence, taking each character in the text sequence as a state, and extracting character features, wherein the character features comprise the current character, the previous character and the next character. Model training: training the word segmentation model on a set training corpus to obtain parameters including the transition probabilities between states and the conditional probabilities between states and features. Word segmentation prediction: predicting a new text sequence with the trained word segmentation model to obtain a word segmentation sequence, which serves as the word segmentation labels.
Digitizing the text file comprises the following steps: acquiring the text file, namely acquiring image data of the text file by photographing or scanning; image preprocessing, namely preprocessing the image data, wherein the preprocessing comprises sharpness adjustment, denoising, adaptive binarization and text direction detection; character segmentation, namely dividing the characters in the preprocessed image into single characters or word blocks for processing; character recognition, namely recognizing the single characters or word blocks by OCR technology and converting them into text form; and text post-processing, namely post-processing the recognized text result, wherein the post-processing comprises formatting, correction and normalization, so as to improve the accuracy and precision of text recognition.
Extracting keywords from the picture files and classifying them comprises: acquiring a picture file; picture classification, namely classifying the objects in the picture file with a deep learning model, wherein the trained model identifies different objects in the picture, including vehicles, buildings and people, and generates keywords; object detection, namely detecting regions in the picture file with an object detection algorithm, identifying the positions of a plurality of objects and generating keywords; and keyword extraction, namely performing word segmentation, part-of-speech tagging and syntactic analysis on the text derived from the picture classification result and the object detection result, using the text analysis functions of natural language processing technology, extracting keywords, and ranking and screening the keywords according to their occurrence frequency and weight characteristics.
Extracting keywords from the audio recording files and classifying them comprises: acquiring an audio recording file; recording preprocessing, namely preprocessing the audio recording file, wherein the preprocessing comprises audio format conversion, noise reduction and volume normalization; speech recognition, namely recognizing the preprocessed audio recording file with automatic speech recognition technology and converting the audio signal in the recording into text form; and keyword extraction, namely performing word segmentation, part-of-speech tagging and syntactic analysis on the speech recognition result with natural language processing technology, extracting keywords, and ranking and screening the keywords according to their occurrence frequency and weight characteristics.
Extracting keywords from the video recording files and classifying them comprises: acquiring a video recording file; preprocessing, namely converting the video recording file into an image sequence and preprocessing the image sequence, wherein the preprocessing comprises cropping, scaling and denoising operations; video analysis, namely analyzing the processed video recording file with a deep learning model to classify and detect the scenes and objects in the video; and keyword extraction, namely performing word segmentation, part-of-speech tagging and syntactic analysis on the text derived from the video analysis result, using the text analysis functions of natural language processing technology, extracting keywords and phrases, and ranking and screening the keywords according to their occurrence frequency and weight characteristics.
In order to better understand the above technical solutions, the following detailed description will refer to the accompanying drawings and specific embodiments.
FIG. 1 is a flowchart of the enterprise file-based digitized management method provided in an embodiment of the present application. As shown in FIG. 1, the method comprises the following steps: acquiring the enterprise files to be digitally managed, wherein the enterprise files comprise text files, picture files, audio recording files and video recording files; digitizing the enterprise files to obtain digitized files; extracting keywords from the digitized files and classifying them, wherein the keyword extraction comprises performing word segmentation on the text files to obtain a plurality of classification labels; and archiving and sorting the files according to the classification labels.
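The final archiving and sorting step can be pictured as grouping file identifiers under each of their classification labels. The sketch below assumes a file may carry several labels and that files without labels fall into an `"unclassified"` bucket; both choices, and the function name, are hypothetical.

```python
from collections import defaultdict


def archive_by_label(files):
    """Group file ids by classification label.

    `files` is an iterable of (file_id, labels) pairs; the result maps each
    label to the list of file ids carrying it, in input order.
    """
    archive = defaultdict(list)
    for file_id, labels in files:
        # Files with no labels go to an assumed fallback bucket.
        for label in labels or ["unclassified"]:
            archive[label].append(file_id)
    return dict(archive)


archive = archive_by_label([
    ("f1", ["contract", "2023"]),
    ("f2", ["contract"]),
    ("f3", []),
])
```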
Further, the word segmentation processing is carried out by a word segmentation model, with the following main steps: feature selection: the Chinese text in the text file is converted into a text sequence, each character in the text sequence is taken as a state, and character features are extracted, the character features including the current character, the previous character and the next character; model training: the word segmentation model is trained on the set training corpus to obtain parameters including the transition probabilities between states and the conditional probabilities between states and features; word segmentation prediction: the trained word segmentation model is used to predict a new text sequence, and the resulting word segmentation sequence serves as the word segmentation label.
The transition probability between states is calculated by a formula in which $P(y_i \mid y_{i-1})$ denotes the conditional probability of the current state $y_i$ given the previous state $y_{i-1}$, $f_k(y_{i-1}, y_i)$ denotes the value of the $k$-th feature function at $y_{i-1}$ and $y_i$, $\lambda_k$ denotes the weight of the $k$-th feature function, $i$ denotes the $i$-th state, $i = 1, 2, 3, \ldots, j$, and $k$ denotes the $k$-th feature function, $k = 1, 2, 3, \ldots, n$.
The conditional probability between the state and the feature is calculated by a formula in which $P(y_i \mid y_{i-1}, x)$ denotes the conditional probability between states and features, $x$ denotes the input sequence, $P(y_i \mid y_{i-1}, x, i)$ denotes the conditional probability that the current Chinese character is labeled with state $y_i$ given the input sequence $x$ and the labeling state $y_{i-1}$ of the previous Chinese character, and the summation term is the sum of the probabilities of the current Chinese character states $y_i$ given the previous Chinese character state $y_{i-1}$ and the input sequence $x$. Through $P(y_i \mid y_{i-1}, x)$, all labeling states of the current Chinese character and their corresponding probabilities are obtained, and the model selects the state with the maximum probability as the labeling state of the current Chinese character.
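The two probabilities can be illustrated with a standard log-linear form. The sketch below is an assumption that is consistent with the symbol definitions above (weights $\lambda_k$, feature functions $f_k$, and a normalization over candidate states $y'$); it is not a reproduction of the patent's own equations.

```latex
% Assumed softmax-style transition probability over candidate states y'
P(y_i \mid y_{i-1}) =
  \frac{\exp\left(\sum_{k=1}^{n} \lambda_k f_k(y_{i-1}, y_i)\right)}
       {\sum_{y'} \exp\left(\sum_{k=1}^{n} \lambda_k f_k(y_{i-1}, y')\right)}

% Assumed normalization of the position-wise score over all states y'
P(y_i \mid y_{i-1}, x) =
  \frac{P(y_i \mid y_{i-1}, x, i)}
       {\sum_{y'} P(y' \mid y_{i-1}, x, i)}
```

Under this reading, the labeling state chosen for the current character is the $y_i$ that maximizes $P(y_i \mid y_{i-1}, x)$.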
In this embodiment, the features are represented as states in feature selection, for example whether a character is the beginning or the end of a word. Model training requires an optimization algorithm such as gradient descent, so that the model's predictions on the training data are identical or highly similar to the labeled results. CRF word segmentation decodes the whole sequence with the Viterbi algorithm and takes context information into account to output the optimal word segmentation result. In actual training, the CRF word segmentation learns its parameters by methods such as maximum likelihood estimation or regularized maximum likelihood estimation, so that the prediction accuracy of the model on the training data is as high as possible.
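The Viterbi decoding step mentioned above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the B/M/E/S tag set (begin, middle, end, single-character word), the log-score dictionaries and the function names `viterbi` and `tags_to_words` are all assumptions introduced for the example, and the scores would in practice come from a trained CRF.

```python
import math

# Assumed tag set: B = begin of word, M = middle, E = end, S = single-character word
TAGS = ["B", "M", "E", "S"]

def viterbi(emissions, transitions, start):
    """Return the highest-scoring tag sequence for one sentence.

    emissions:   list of dicts, emissions[i][tag] = log-score of tag at position i
    transitions: dict, transitions[(prev, cur)] = log-score of moving prev -> cur
                 (missing pairs are treated as forbidden)
    start:       dict, start[tag] = log-score of starting with tag
    """
    best = [{t: start.get(t, -math.inf) + emissions[0][t] for t in TAGS}]
    back = []
    for i in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in TAGS:
            # Best previous tag for this current tag
            prev = max(TAGS, key=lambda p: best[-1][p] + transitions.get((p, cur), -math.inf))
            scores[cur] = best[-1][prev] + transitions.get((prev, cur), -math.inf) + emissions[i][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final tag
    tag = max(TAGS, key=lambda t: best[-1][t])
    path = [tag]
    for pointers in reversed(back):
        tag = pointers[tag]
        path.append(tag)
    return path[::-1]

def tags_to_words(text, tags):
    """Cut the text after every E or S tag to obtain the word sequence."""
    words, current = [], ""
    for ch, tag in zip(text, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:
        words.append(current)
    return words
```

Because the whole sequence is scored jointly, a locally attractive tag can be rejected when it would force a forbidden or low-scoring transition later in the sentence.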
Further, Fig. 2 shows the process of digitizing a text file in the enterprise-file-based digital management method of the embodiment of the present application, which includes the following steps: acquiring the text file: image data of the text file is obtained by photographing or scanning; image preprocessing: the image data is preprocessed, including sharpness adjustment, denoising, adaptive binarization and text orientation detection; character segmentation: the text in the preprocessed image is divided into single characters or word blocks for processing; character recognition: the single characters or word blocks are recognized using OCR technology and converted into text form; text post-processing: the recognized text is post-processed, including formatting, correction and normalization, to improve the accuracy and precision of text recognition.
In this embodiment, text segmentation is implemented using vertical, horizontal and diagonal scanning techniques, in combination with binarization, and text recognition is implemented using template matching or neural network-based learning algorithms.
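One common way to realize vertical scanning on a binarized image is a projection profile: count the ink pixels in each column and cut wherever the count drops to zero. The sketch below assumes this reading; the function name `column_segments` and the 0/1 list-of-rows image format are hypothetical and chosen only to keep the example dependency-free.

```python
def column_segments(binary):
    """Split a binarized text line into character spans by vertical projection.

    binary: list of rows, each a list of 0/1 ints (1 = ink pixel).
    Returns a list of (start_col, end_col_exclusive) spans that contain ink.
    """
    if not binary:
        return []
    width = len(binary[0])
    # Vertical scan: number of ink pixels in each column
    profile = [sum(row[c] for row in binary) for c in range(width)]
    spans, start = [], None
    for c, ink in enumerate(profile):
        if ink and start is None:
            start = c                    # a character span begins
        elif not ink and start is not None:
            spans.append((start, c))     # blank column closes the span
            start = None
    if start is not None:
        spans.append((start, width))     # span runs to the right edge
    return spans
```

The same function applied to the transposed image gives a horizontal scan that separates text lines before characters are cut.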
Further, keywords are extracted from the picture files and the picture files are classified as follows: acquiring a picture file; picture classification: objects in the picture file are classified using a deep learning model, which is trained to recognize different objects in the picture, including vehicles, buildings and people, and keywords are generated; object detection: regions in the picture file are detected using an object detection algorithm, the positions of multiple objects are identified, and keywords are generated; keyword extraction: based on the picture recognition result from picture classification and the object detection result, and in combination with the text analysis functions of natural language processing, word segmentation, part-of-speech tagging and syntactic analysis are performed on the text and keywords are extracted; the keywords are ranked and filtered according to their frequency of occurrence and weight.
In this embodiment, the deep learning model used for picture classification is a convolutional neural network (CNN), and the object detection algorithm is a deep-learning-based object detection method, which may be built on various models such as YOLO, SSD and Faster R-CNN.
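The ranking and filtering of keywords by frequency and weight can be sketched as below. The scoring rule (sum of model confidences per label, which rewards both repeated detections and high-confidence ones) and the name `rank_keywords` are assumptions for illustration; the source does not specify how frequency and weight are combined.

```python
from collections import defaultdict

def rank_keywords(labels, top_n=5, min_score=0.0):
    """Rank keywords produced by picture classification and object detection.

    labels: iterable of (keyword, confidence) pairs.
    A keyword's score is the sum of its confidences (assumed scoring rule).
    """
    scores = defaultdict(float)
    for word, confidence in labels:
        scores[word] += confidence
    # Highest score first; ties broken alphabetically for a stable order
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, score in ranked if score >= min_score][:top_n]
```

For example, two "vehicle" detections at moderate confidence outrank a single "building" classification at higher confidence, which matches the intent of weighting by both frequency and weight.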
Further, keywords are extracted from the audio recording files and the audio recording files are classified as follows: acquiring an audio recording file; recording preprocessing: the audio recording file is preprocessed, including audio format conversion, noise reduction and volume normalization; speech recognition: automatic speech recognition technology is applied to the preprocessed audio recording file to convert the audio signal in the recording into text form; keyword extraction: based on the speech recognition result, word segmentation, part-of-speech tagging and syntactic analysis are performed on the text using natural language processing technology, keywords are extracted, and the keywords are ranked and filtered according to their frequency of occurrence and weight.
In this embodiment, the automatic speech recognition technology is generally built from deep learning models and speech feature extraction algorithms, such as convolutional neural networks (CNN), long short-term memory networks (LSTM) and Mel-frequency cepstral coefficients (MFCC); by training the model, keywords and voice commands in the speech are recognized.
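Of the preprocessing steps named above, volume normalization is the simplest to illustrate. The sketch below performs peak normalization on floating-point samples; the function name `normalize_peak`, the sample range of [-1, 1] and the default target peak are assumptions, and other schemes (for example loudness-based normalization) would fit the description equally well.

```python
def normalize_peak(samples, target_peak=0.9):
    """Scale float samples (assumed to lie in [-1, 1]) so that the largest
    magnitude equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)      # silence or empty input: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]
```

Normalizing before recognition keeps quiet and loud recordings in a comparable amplitude range, which is one reason the step usually precedes feature extraction.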
Further, keywords are extracted from the video recording files and the video recording files are classified as follows: acquiring a video recording file; preprocessing: the video recording file is converted into an image sequence, and the image sequence is preprocessed, including cropping, scaling and denoising; video analysis: a deep learning model is used to analyze the processed video recording file, classifying and detecting the scenes and objects in the video; keyword extraction: based on the video analysis result, and in combination with the text analysis functions of natural language processing, word segmentation, part-of-speech tagging and syntactic analysis are performed on the text, keywords and phrases are extracted, and the keywords are ranked and filtered according to their frequency of occurrence and weight.
In this embodiment, the video is converted into an image sequence, that is, a continuous video stream is converted into a set of consecutive still images, using the OpenCV library for Python or the FFmpeg command-line tool. Video analysis relies on a video recognition model based on a convolutional neural network (CNN), together with deep-learning-based object detection algorithms such as YOLO, SSD and Mask R-CNN.
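The conversion of a video into an image sequence with FFmpeg can be sketched as follows. The helper only builds the command line and does not run it; the function name `ffmpeg_frame_command`, the output file pattern and the sampling rate of one frame per second are assumptions chosen for the example.

```python
def ffmpeg_frame_command(video_path, out_pattern="frame_%05d.png", fps=1):
    """Build (but do not run) an FFmpeg command that writes `fps` still
    images per second of video, numbered according to out_pattern."""
    return ["ffmpeg", "-i", video_path, "-vf", f"fps={fps}", out_pattern]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where FFmpeg is installed. Sampling at a low rate rather than keeping every frame reduces the number of images the analysis model has to process.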
Furthermore, when a digitized file is retrieved after archiving and organizing, the enterprise-file-based digital management method applies a retrieval verification procedure with the following specific steps: receiving the digitized file to be stored and acquiring management object information: the digitized file to be stored is received, and management object information representing the ownership (attribution) information of the digitized file is acquired; the digitized file to be stored further includes management authority information corresponding to the management object information, where the management authority information represents the authority level of the digitized file, corresponds to the access rights of the users of the storage environment, and restricts which users may access and obtain which digitized files; judging the encryption state and executing the encryption program: the encryption state of the digitized file to be stored is judged, and if the file is not encrypted, an encryption program is executed; the encryption program obtains the corresponding object key and device verification key based on the management object information, and encodes and encrypts the digitized file and other related information with the device verification key to obtain an encrypted storage file, where the device verification key represents hardware verification information; responding to a digitized file acquisition request: when the digitized file is accessed, a digitized file acquisition request is sent; the request is received, and the hardware verification information and management object information of the requesting object are acquired; the requesting object is compared with the management object information to generate a judgment result; when the judgment result shows that the requesting object differs from the management object, the authority level of the requested digitized file and the management authority information of the requesting object are acquired; if the management authority information corresponds to the authority level, a file transmission request is generated and sent to the management object corresponding to the digitized file, feedback information is acquired, and the digitized file is decrypted based on the feedback information and sent to the requesting object; decrypting and verifying the digitized file: the encrypted storage file is decrypted based on the management object information and the hardware verification information, a comparison set of hardware verification information is calculated and compared, and a verification result is generated; outputting the file or executing request feedback: if the verification passes, the decrypted digitized file is output; if it does not pass, a request feedback program is executed and the management object responds to the digitized file acquisition request.
In this embodiment, the management object information refers to key information used to authenticate the validity and authority of the digitized file, including the origin, ownership, file format, creation time, modification time and access rules of the digitized file. The management object information that represents the ownership of the digitized file generally includes identification information such as the name, ID card number or organization code of the person or enterprise to which the digitized file belongs.
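The request-handling branch described above (compare the requester with the management object, then check authority levels) can be sketched as a small routing function. Everything here is a hypothetical illustration: the class and field names, the use of integer levels, and the reading of "corresponds" as "greater than or equal to" are assumptions, as is the behavior when the requester is the management object itself.

```python
from dataclasses import dataclass

@dataclass
class Request:
    requester_id: str
    requester_level: int      # management authority of the requesting object (assumed integer)

@dataclass
class StoredFile:
    owner_id: str             # management object the digitized file belongs to
    required_level: int       # authority level of the digitized file (assumed integer)

def route_request(req: Request, f: StoredFile) -> str:
    """Decide how a digitized file acquisition request is handled."""
    if req.requester_id == f.owner_id:
        # Assumption: the owner proceeds directly to decryption and hardware verification
        return "verify_and_decrypt"
    if req.requester_level >= f.required_level:
        # A different requester with sufficient authority: ask the owner for feedback
        return "ask_owner"
    # Assumption: insufficient authority means the request is refused
    return "deny"
```

Keeping the routing decision separate from the cryptographic checks makes it easy to audit who was allowed to ask for a file independently of whether decryption later succeeded.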
Further, a key acquisition request is generated according to the management object information and sent to the management object, and the biometric identification and device hardware information of the management object are acquired, the device hardware information including the device mainboard number and the device hardware address; an object key is generated from the preset biometric identification information, and the device hardware information is encrypted with this key to generate the device verification key; the data to be stored and the device verification key are combined and encrypted with the device verification key to generate the encrypted storage file. The step of responding to the digitized file acquisition request specifically includes: receiving and responding to the request, acquiring the hardware verification information and management object information of the requesting object, generating the corresponding object key based on the management object information, and encrypting the hardware verification information with the object key to generate a device verification key; decrypting the encrypted storage file with this device verification key, where a decryption failure indicates that the requesting object or the requesting device is incorrect, and a successful decryption yields the stored device verification key and the stored data; the recovered device verification key is then decrypted with the object key to obtain the hardware verification information, which is compared and judged; if the verification passes, the requested digitized file is provided to the requesting object, and if it does not pass, the request is refused.
In this embodiment, by verifying the management object information and encrypting the device hardware information, the system ensures that only authorized users access the digitized file, thereby improving the security and confidentiality of the digitized file, and at the same time, the system also prevents malicious users from tampering or damaging the digitized file, and protects the integrity and reliability of the file.
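The key chain described above (biometric information yields an object key, which binds the device hardware information into a device verification key) can be sketched with the Python standard library. Note the substitution: the source encrypts the hardware information with the object key, whereas this sketch uses HMAC-SHA256 as a one-way keyed derivation, so the hardware information is re-derived and compared rather than decrypted. The function names and the assumption that the biometric template is a stable byte string are hypothetical.

```python
import hashlib
import hmac

def object_key(biometric_template: bytes) -> bytes:
    """Derive an object key from a biometric template (assumed stable bytes)."""
    return hashlib.sha256(biometric_template).digest()

def device_verification_key(obj_key: bytes, board_no: str, hw_addr: str) -> bytes:
    """Bind the mainboard number and hardware address to the object key.
    HMAC-SHA256 stands in for the encryption step described in the text."""
    message = f"{board_no}|{hw_addr}".encode("utf-8")
    return hmac.new(obj_key, message, hashlib.sha256).digest()

def verify_device(stored_key: bytes, biometric_template: bytes,
                  board_no: str, hw_addr: str) -> bool:
    """Recompute the device verification key for the requester and compare
    it with the stored one in constant time."""
    candidate = device_verification_key(object_key(biometric_template), board_no, hw_addr)
    return hmac.compare_digest(stored_key, candidate)
```

A request made with the right biometric information but from a different machine produces a different key and fails the comparison, which is the property the hardware binding is meant to provide.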
Further, Fig. 3 is a schematic structural diagram of the enterprise-file-based digital management system provided by an embodiment of the present application. The system includes an acquisition module, a processing module, a classification module, and an archiving and organizing module; the acquisition module is used for acquiring the enterprise files to be digitally managed, the enterprise files including text files, picture files, audio recording files and video recording files; the processing module is used for digitizing the enterprise files to obtain digitized files of the enterprise files; the classification module is used for extracting keywords from the digitized files and classifying the digitized files, including performing word segmentation on the text files to obtain a plurality of classification labels; and the archiving and organizing module is used for archiving and organizing the files according to the classification labels.
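The way the four modules hand data to one another can be sketched as a single function. This is only a structural illustration: the name `run_pipeline` and the representation of the processing and classification modules as caller-supplied callables are assumptions, and the archive is modeled as a plain dictionary keyed by classification label.

```python
def run_pipeline(raw_files, digitize, classify):
    """Acquire -> digitize -> classify -> archive.

    raw_files: iterable of enterprise files (output of the acquisition module)
    digitize:  callable turning a raw file into its digitized form
    classify:  callable returning the classification labels of a digitized file
    Returns a dict mapping each label to the digitized files filed under it.
    """
    archive = {}
    for raw in raw_files:
        digital = digitize(raw)                 # processing module
        for label in classify(digital):         # classification module
            archive.setdefault(label, []).append(digital)   # archiving module
    return archive
```

A file that receives several labels is filed under each of them, so it can later be found from any of its classification labels.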
Further, an embodiment of the present application provides a computer readable storage medium storing a program, where the program when executed by a processor implements a method for digitally managing enterprise files.
In summary, this embodiment acquires the enterprise files to be digitally managed, digitizes them to obtain digitized files, extracts keywords from the digitized files and classifies them, and archives and organizes them according to the classification labels. This effectively solves the prior-art problem that enterprise files are difficult to further classify and process in digital management methods for enterprise files.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (8)

1. A digital management method based on enterprise files, characterized by comprising the following steps:
acquiring enterprise files needing digital management, wherein the enterprise files comprise text files, picture files, audio recording files and video recording files;
performing digital processing on the enterprise file to obtain a digital file of the enterprise file;
extracting keywords from the digitized file and classifying the digitized file, wherein the keyword extraction comprises word segmentation processing on the text file to obtain a plurality of classification labels;
according to the classification labels, archiving and sorting are carried out;
the word segmentation processing is carried out by a word segmentation model and comprises the following steps:
feature selection: converting the Chinese text in the text file into a text sequence, taking each character in the text sequence as a state, and extracting character features, wherein the character features comprise a current character, a previous character and a next character;
model training: training the word segmentation model according to the set training corpus to obtain parameters including transition probabilities between states and conditional probabilities between states and features;
word segmentation prediction: predicting a new text sequence by using the trained word segmentation model to obtain a word segmentation sequence serving as a word segmentation label;
the transition probability between states is calculated by a formula in which $P(y_i \mid y_{i-1})$ denotes the conditional probability of the current state $y_i$ given the previous state $y_{i-1}$, $f_k(y_{i-1}, y_i)$ denotes the value of the $k$-th feature function at $y_{i-1}$ and $y_i$, $\lambda_k$ denotes the weight of the $k$-th feature function, $i$ denotes the $i$-th state, $i = 1, 2, 3, \ldots, j$, and $k$ denotes the $k$-th feature function, $k = 1, 2, 3, \ldots, n$;
the conditional probability between the state and the feature is calculated by a formula in which $P(y_i \mid y_{i-1}, x)$ denotes the conditional probability between states and features, $x$ denotes the input sequence, $P(y_i \mid y_{i-1}, x, i)$ denotes the conditional probability that the current Chinese character is labeled with state $y_i$ given the input sequence $x$ and the labeling state $y_{i-1}$ of the previous Chinese character, and the summation term is the sum of the probabilities of the current Chinese character states $y_i$ given the previous Chinese character state $y_{i-1}$ and the input sequence $x$;
the digitizing of the text file comprises the following steps:
acquiring the text file, namely acquiring image data of the text file by photographing or scanning;
image preprocessing, namely preprocessing the image data, wherein the preprocessing comprises sharpness adjustment, denoising, adaptive binarization and text orientation detection;
character segmentation, namely dividing the text in the preprocessed image into single characters or word blocks for processing;
character recognition, namely recognizing the single characters or word blocks by using OCR technology and converting them into text form;
text post-processing, namely performing data processing on the text result obtained by recognition.
2. The method for digitally managing an enterprise archive of claim 1, wherein: extracting keywords from the picture files and classifying the picture files;
acquiring a picture file;
classifying pictures, namely classifying objects in the picture files by using a deep learning model, identifying different objects in the pictures, including vehicles, buildings and people by training the model, and generating keywords;
object detection, namely detecting the region in the picture file by using an object detection algorithm, identifying the positions of a plurality of objects and generating keywords;
keyword extraction, namely performing word segmentation, part-of-speech tagging and syntactic analysis on the text according to the picture recognition result from the picture classification and the object detection result, in combination with the text analysis functions of natural language processing technology, and extracting keywords, wherein the keywords are ranked and filtered according to their frequency of occurrence and weight.
3. The method for digitally managing an enterprise archive of claim 1, wherein: extracting keywords from the audio recording files and classifying the audio recording files;
acquiring a recording file;
preprocessing the recording, namely preprocessing the recording file, wherein the preprocessing comprises audio format conversion, noise reduction and volume normalization;
voice recognition, which is to use an automatic voice recognition technology to perform recording recognition on the preprocessed recording file and convert an audio signal in the recording into a text form;
keyword extraction, namely performing word segmentation, part-of-speech tagging and grammar analysis on a text by using a natural language processing technology according to a voice recognition result, extracting keywords, and sequencing and screening the keywords according to the occurrence frequency and weight characteristics of the keywords.
4. The method for digitally managing an enterprise archive of claim 1, wherein: extracting keywords from the video files and classifying the video files;
acquiring a video recording file;
preprocessing, namely converting the video recording file into an image sequence and preprocessing the image sequence, wherein the preprocessing comprises cropping, scaling and denoising operations;
video analysis, which is to use a deep learning model to perform video analysis on the processed video files, and to classify and detect scenes and objects in the video;
keyword extraction, namely performing word segmentation, part-of-speech tagging and grammar analysis on a text according to a video analysis result and a text analysis function in a natural language processing technology, and extracting keywords and phrases, wherein the ranking and screening of the keywords are obtained according to the occurrence frequency and weight characteristics of the keywords.
5. The method for digitally managing an enterprise archive of claim 1, wherein: a retrieval verification procedure is applied when a digitized file is retrieved after archiving and organizing, with the following specific steps:
receiving the digitized file to be stored and acquiring management object information: receiving the digitized file to be stored, and acquiring management object information representing the ownership (attribution) information of the digitized file;
the digital files to be stored also comprise management authority information corresponding to the management object information, wherein the management authority information is used for representing the authority level of the digital files and corresponds to the user access authority of the storage environment using objects and is used for limiting the access and acquisition of different using objects to different digital files;
Judging the encryption state and executing the encryption program: judging the encryption state of the digitized file to be stored, if the file is not encrypted, executing an encryption program, acquiring a corresponding object key and a device verification key based on management object information by using the encryption program, and encoding and encrypting the digitized file and other related information by using the device verification key to acquire an encrypted stored file, wherein the device verification key is used for representing hardware verification information;
responding to the digitized file acquisition request: accessing the digitized file, sending a digitized file acquisition request, receiving the request and acquiring hardware verification information and management object information of a request object;
comparing the requesting object of the digitized file acquisition request with the management object information and generating a judgment result; when the judgment result shows that the requesting object differs from the management object, acquiring the authority level of the digitized file corresponding to the digitized file acquisition request and the management authority information of the requesting object; if the management authority information corresponds to the authority level, generating a file transmission request, sending the file transmission request to the management object corresponding to the digitized file, acquiring feedback information, decrypting the digitized file based on the feedback information and sending the digitized file to the requesting object;
decrypting and verifying the digitized file: decrypting the obtained encrypted storage file based on the management object information and the hardware verification information, calculating a comparison set of hardware verification information, comparing and judging, and generating a verification result;
outputting or executing request feedback: and outputting the decrypted digital file if the verification result is passed, executing a request feedback program if the verification result is not passed, and responding to the digital file acquisition request through the management object.
6. The enterprise archive based digital management method of claim 5, wherein: generating a key acquisition request according to the management object information and sending the key acquisition request to the management object to acquire the biological identification and the equipment hardware information of the management object, wherein the biological identification and the equipment hardware information comprise an equipment mainboard number and an equipment hardware address;
generating an object key by using preset biological identification information, encrypting equipment hardware information by using the key, and generating an equipment verification key;
combining the data to be stored and the equipment verification key, encrypting by using the equipment verification key, and generating an encrypted storage document;
the step of responding to the digitized file acquisition request specifically comprises the following steps: receiving a request and responding, acquiring hardware verification information and management object information of a request object, generating a corresponding object key based on the management object information, and encrypting the hardware verification information through the object key so as to generate an equipment verification key;
and decrypting the encrypted storage file by using the device verification key, wherein a decryption failure indicates that the requesting object or the requesting device is incorrect, and a successful decryption yields the stored device verification key and the stored data; the recovered device verification key is decrypted again by using the object key to obtain the hardware verification information, and the hardware verification information is compared and judged; if the verification passes, the requested digitized file is provided to the requesting object, and if the verification does not pass, the request is refused.
7. A system for applying the enterprise archive based digital management method of any one of claims 1-6, comprising an acquisition module, a processing module, a classification module, and an archive finishing module;
the acquisition module is used for acquiring the enterprise files to be digitally managed, wherein the enterprise files comprise text files, picture files, audio recording files and video recording files;
the processing module is used for carrying out digital processing on the enterprise file to obtain a digital file of the enterprise file;
the classification module is used for extracting keywords from the digitized file and classifying the digitized file, and comprises the steps of performing word segmentation on the text file to obtain a plurality of classification labels;
and the archiving and organizing module is used for archiving and organizing the files according to the classification labels.
8. A computer readable storage medium having stored thereon a program, which when executed by a processor, implements the steps of the enterprise archive data processing method of any one of claims 1 to 6.
CN202310567168.XA 2023-05-18 2023-05-18 Digitized management method, system and storage medium based on enterprise files Active CN116663549B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310567168.XA CN116663549B (en) 2023-05-18 2023-05-18 Digitized management method, system and storage medium based on enterprise files

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310567168.XA CN116663549B (en) 2023-05-18 2023-05-18 Digitized management method, system and storage medium based on enterprise files

Publications (2)

Publication Number Publication Date
CN116663549A CN116663549A (en) 2023-08-29
CN116663549B true CN116663549B (en) 2024-03-19

Family

ID=87725307

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310567168.XA Active CN116663549B (en) 2023-05-18 2023-05-18 Digitized management method, system and storage medium based on enterprise files

Country Status (1)

Country Link
CN (1) CN116663549B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117556112B (en) * 2024-01-11 2024-04-16 中国标准化研究院 Intelligent management system for electronic archive information

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103559214A (en) * 2013-10-11 2014-02-05 中国农业大学 Method and device for automatically generating video
CN110175273A (en) * 2019-05-22 2019-08-27 腾讯科技(深圳)有限公司 Text handling method, device, computer readable storage medium and computer equipment
CN111209749A (en) * 2020-01-02 2020-05-29 湖北大学 Method for applying deep learning to Chinese word segmentation
CN111741356A (en) * 2020-08-25 2020-10-02 腾讯科技(深圳)有限公司 Quality inspection method, device and equipment for double-recording video and readable storage medium
CN112966796A (en) * 2021-03-04 2021-06-15 南通苏博办公服务有限公司 Enterprise information archive storage management method and system based on big data
CN113033204A (en) * 2021-03-24 2021-06-25 广州万孚生物技术股份有限公司 Information entity extraction method and device, electronic equipment and storage medium
CN113254634A (en) * 2021-02-04 2021-08-13 天津德尔塔科技有限公司 File classification method and system based on phase space
CN113536182A (en) * 2021-07-12 2021-10-22 广州万孚生物技术股份有限公司 Method and device for generating long text webpage, electronic equipment and storage medium
CN114492437A (en) * 2022-02-16 2022-05-13 平安科技(深圳)有限公司 Keyword recognition method and device, electronic equipment and storage medium
CN115185888A (en) * 2022-07-27 2022-10-14 海南绿境高科环保有限公司 Enterprise environment-friendly archive management method, device, equipment and storage medium
CN115422347A (en) * 2022-07-25 2022-12-02 海南科技职业大学 Knowledge graph-based Chinese course teaching plan generation method
CN115455969A (en) * 2022-08-16 2022-12-09 华南师范大学 Medical text named entity recognition method, device, equipment and storage medium
CN115543915A (en) * 2022-09-23 2022-12-30 郑州大学 Automatic database building method and system for personnel file directory
CN115878847A (en) * 2023-02-21 2023-03-31 云启智慧科技有限公司 Video guide method, system, equipment and storage medium based on natural language
CN115934926A (en) * 2022-11-10 2023-04-07 上海工物高技术产业发展有限公司 Information extraction method and device, computer equipment and storage medium
CN116089620A (en) * 2023-04-07 2023-05-09 日照蓝鸥信息科技有限公司 Electronic archive data management method and system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040006470A1 (en) * 2002-07-03 2004-01-08 Pioneer Corporation Word-spotting apparatus, word-spotting method, and word-spotting program
US9002838B2 (en) * 2009-12-17 2015-04-07 Wausau Financial Systems, Inc. Distributed capture system for use with a legacy enterprise content management system
US20140032973A1 (en) * 2012-07-26 2014-01-30 James K. Baker Revocable Trust System and method for robust pattern analysis with detection and correction of errors

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
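A Chinese word segment model for energy literature based on Neural Networks with Electricity User Dictionary; Bochuan Song et al.; 2019 International Conference on Asian Language Processing; 20200319; pp. 194-198 *
企业档案工作数字化转型:实践探索与理论框架 [Digital transformation of enterprise archival work: practical exploration and theoretical framework]; 王强; 浙江档案; 20200930 (No. 09); pp. 16-20 *
基于BiLSTM-CRF的中文分词和词性标注联合方法 [A joint method for Chinese word segmentation and part-of-speech tagging based on BiLSTM-CRF]; 袁里驰; 中南大学学报; 20230511; Vol. 54 (No. 8); pp. 3145-3153 *
基于改进BERT的电力领域中文分词方法 [Chinese word segmentation method for the electric power domain based on improved BERT]; 夏飞 et al.; 计算机应用; 20230415; Vol. 43 (No. 12); pp. 3711-3718 *
融合词典修正的Bi-LSTM+CRF中文分词方法研究 [Research on a Bi-LSTM+CRF Chinese word segmentation method incorporating dictionary correction]; 孙艺玮; 中国优秀硕士学位论文全文数据库信息科技辑; 20210715 (No. 7); p. I138-805 *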

Also Published As

Publication number Publication date
CN116663549A (en) 2023-08-29

Similar Documents

Publication Publication Date Title
CN109117777B (en) Method and device for generating information
CN109299273B (en) Multi-source multi-label text classification method and system based on improved seq2seq model
US8014604B2 (en) OCR of books by word recognition
EP2657884B1 (en) Identifying multimedia objects based on multimedia fingerprint
CN110232340B (en) Method and device for establishing video classification model and video classification
WO2020164278A1 (en) Image processing method and device, electronic equipment and readable storage medium
US20120224765A1 (en) Text region detection system and method
CN116663549B (en) Digitized management method, system and storage medium based on enterprise files
CN111428028A (en) Information classification method based on deep learning and related equipment
CN116416480B (en) Visual classification method and device based on multi-template prompt learning
CN114416979A (en) Text query method, text query equipment and storage medium
CN112862024A (en) Text recognition method and system
CN113190502A (en) Archive management method based on deep learning
KR102334018B1 (en) Apparatus and method for validating self-propagated unethical text
CN114218945A (en) Entity identification method, device, server and storage medium
CN116150651A (en) AI-based depth synthesis detection method and system
CN114372532A (en) Method, device, equipment, medium and product for determining label marking quality
KR101800975B1 (en) Sharing method and apparatus of the handwriting recognition is generated electronic documents
CN114283429A (en) Material work order data processing method, device, equipment and storage medium
CN113887191A (en) Method and device for detecting similarity of articles
CN113673322A (en) Character expression posture lie detection method and system based on deep learning
CN111666928A (en) Computer file similarity recognition system and method based on image analysis
CN111813975A (en) Image retrieval method and device and electronic equipment
Alzuru et al. Quality-Aware Human-Machine Text Extraction for Biocollections using Ensembles of OCRs
CN117493645B (en) Big data-based electronic archive recommendation system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant