CN113392075A - Multithreading collaborative file batch naming method - Google Patents

Info

Publication number
CN113392075A
Authority
CN
China
Prior art keywords
matching
keywords
threads
multithreading
file
Prior art date
Legal status
Granted
Application number
CN202110729518.9A
Other languages
Chinese (zh)
Other versions
CN113392075B (en)
Inventor
朱咸超
卢道
王文斌
蔡梦洁
郭琪
李征
Current Assignee
Shenzhen Penglai Industrial Technology Co ltd
Original Assignee
Shenzhen Penglai Industrial Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Penglai Industrial Technology Co ltd
Priority to CN202110729518.9A
Publication of CN113392075A
Application granted
Publication of CN113392075B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/164 File meta data generation
    • G06F16/166 File name conversion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24578 Query processing with adaptation to user needs using ranking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/103 Workflow collaboration or project management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02 Banking, e.g. interest calculation or account maintenance

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Computational Linguistics (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Operations Research (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Development Economics (AREA)
  • Technology Law (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of file processing, and in particular to a multithreading collaborative file batch naming method comprising the following steps. Step S1: extract keywords according to the characteristics of each type of material, use them as an initial keyword library, and store them in a corresponding file tag attribute library, where each keyword record holds the keyword name, the keyword length, and the file type to which the keyword belongs. Step S2: set the number of threads according to the actual hardware configuration of the client, and match keyword tag attributes using multithreading so that computer resources are used to the fullest extent.

Description

Multithreading collaborative file batch naming method
Technical Field
The invention relates to the technical field of file processing, in particular to a multithreading collaborative file batch naming method.
Background
When non-performing assets in the banking industry are disposed of through judicial channels, large volumes of paper materials must be scanned, then manually identified and classified from the scans, and finally renamed so that they can be submitted to the court system. Identifying and classifying these materials takes a great deal of time, and purely manual work is extremely inefficient. In addition, currently available document recognition can only identify ID card materials; it cannot identify images of other materials in a bank's credit card files, such as application forms, contracts, and charters. To address the high labor cost of the material review and naming stages, an intelligent segmented multithreaded recognition method is proposed for the application form, card usage contract, and charter material types.
Disclosure of Invention
The invention aims to provide a multithreading collaborative file batch naming method that solves the problem of naming, identifying, and classifying the large volume of scanned paper materials produced when non-performing assets in the banking industry are disposed of through judicial channels.
To achieve this purpose, the invention, which applies broadly to file naming, identification, and classification, provides the following technical scheme: a multithreading collaborative file batch naming method comprising the following steps:
Step S1: extract keywords according to the characteristics of each type of material, use them as an initial keyword library, and store them in a corresponding file tag attribute library;
where each keyword record holds: the keyword name, the keyword length, and the file type to which the keyword belongs;
Step S2: set the number of threads according to the actual hardware configuration of the client, and match keyword tag attributes using multithreading so that computer resources are used to the fullest extent;
Step S3: parse and sort the recognition result returned by the Baidu OCR interface, then store it and submit it to a matching queue;
where: the Baidu OCR recognition interface returns a data packet in JSON format; Fastjson is an open-source JSON processing library from Alibaba; the result is parsed into a HashMap object with Fastjson, the specific recognized content is read from the HashMap object, and once all recognized content has been extracted it is stored in the matching queue;
the matching queue is a data set backed by a List that stores all content to be matched in order;
after each match completes, the matching queue destroys the matched data; the file type can be determined from the final number of matches and the types of the matched keywords;
Step S4: obtain the total number of keywords from the matching queue of step S3, then divide the keywords to be matched into batches according to the number of available threads, where the number of keywords in each batch is: total number of keywords / number of threads;
Step S5: match the text content assigned to each thread in step S4 against the keyword library of step S1, where the number of matches in each batch depends on the total number of keywords and the number of keywords in the batch; multithreading improves matching efficiency, and the results are finally collected;
Step S6: during the matching of step S5, first match once under the 100% matching rule; if all keywords match, mark the match as successful directly; if not all keywords match, record the keywords that did match and return the matching result;
Step S7: for the items reported as unmatched in the result of step S6, match again under a specific strategy rule; if matching still fails, terminate the matching;
Step S8: after step S7 completes, automatically name the file with the file name corresponding to the tag attribute, according to the file type of the successfully matched keywords, and move the file to the folder to which it belongs.
Preferably, in step S6, all keywords that did not fully match but for which successful match records exist are checked, and if they are confirmed to be related to the current file, a new matching rule and strategy are formulated.
Preferably, when the number of threads is set in step S2, a thread pool is created explicitly through the constructor of ThreadPoolExecutor, and the number of core threads, the maximum number of threads, and the maximum survival time of idle threads beyond corePoolSize are set at creation.
Preferably, the number of core threads is determined as follows: each task takes tasktime seconds to process, so each thread can process 1/tasktime tasks per second; if the system must process tasks tasks per second, the number of threads required is tasks/(1/tasktime), that is, tasks × tasktime threads.
Preferably, the maximum number of threads applies when the system load reaches its maximum and the core threads cannot process all tasks on time, so that more threads must be added.
Preferably, the maximum survival time of idle threads beyond corePoolSize in the thread pool governs how the number of threads grows and shrinks;
specifically, when the load drops, the number of threads can be reduced: if a thread has been idle for keepAliveTime, it exits; by default the thread pool keeps at least corePoolSize threads.
Preferably, after the OCR recognition in step S3, the system splits the result evenly into N segments according to the number of items returned in the OCR result, and assigns the segments to N threads of the system so that they are matched concurrently;
where N is the maximum number of threads supported by the client.
Compared with the prior art, the invention has the beneficial effects that:
With OCR recognition and the intelligent segmented multithreaded batch naming method, the invention can identify and name, image by image, application forms of differing formats, such as the credit card application forms used by different banks, as well as contracts that vary widely in content and exist in multiple versions. The time required is only 1/N of the previous time, and the effective utilization of system resources is 90 percent higher than before. Matching the attributes of different files against a keyword library and a file tag attribute library raises the matching success rate and lowers the file naming error rate. In addition, when not all keywords match, the matching rules and strategies can be changed promptly through manual intervention, which makes matching more flexible.
Drawings
FIG. 1 is a flow chart of the method steps of the present invention;
FIG. 2 is a diagram of the file tag attribute library according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 1 and FIG. 2, an embodiment of the present invention includes:
a multithreading collaborative file batch naming method comprising the following steps:
Step S1: extract keywords according to the characteristics of each type of material, use them as an initial keyword library, and store them in a corresponding file tag attribute library;
where each keyword record holds: the keyword name, the keyword length, and the file type to which the keyword belongs;
Step S2: set the number of threads according to the actual hardware configuration of the client, and match keyword tag attributes using multithreading so that computer resources are used to the fullest extent;
Step S3: parse and sort the recognition result returned by the Baidu OCR interface, then store it and submit it to a matching queue;
where: the Baidu OCR recognition interface returns a data packet in JSON format; Fastjson is an open-source JSON processing library from Alibaba; the result is parsed into a HashMap object with Fastjson, the specific recognized content is read from the HashMap object, and once all recognized content has been extracted it is stored in the matching queue.
The matching queue is a data set backed by a List that stores all content to be matched in order;
after each match, the matching queue destroys the matched data. The file type can be determined from the final number of matches and the types of the matched keywords;
Step S4: extract the keyword list from the matching queue of step S3, then divide the keywords to be matched into batches according to the number of available threads, where the number of keywords in each batch is: total number of keywords / number of threads;
Step S5: match the text content assigned to each thread in step S4 against the keyword library of step S1, where the number of matches in each batch depends on the total number of keywords and the number of keywords in the batch; multithreading improves matching efficiency, and the results are finally collected;
Step S6: during the matching of step S5, first match once under the 100% matching rule; if all keywords match, mark the match as successful directly; if not all keywords match, record the keywords that did match and return the matching result;
Step S7: for the items reported as unmatched in the result of step S6, match again under a specific strategy rule; if matching still fails, terminate the matching;
Step S8: after step S7 completes, automatically name the file with the file name corresponding to the tag attribute, according to the file type of the successfully matched keywords, and move the file to the folder to which it belongs.
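Step S8 can be illustrated with a minimal Java sketch. The text does not specify a naming pattern or folder layout, so the class FileNamer, its methods, and the pattern root/fileType/fileType_index.ext are assumptions made purely for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for step S8: name a scanned file after its matched
// file type and move it into a folder for that type.
final class FileNamer {

    // Builds <root>/<fileType>/<fileType>_<index><ext>, keeping the original
    // extension. The pattern itself is an assumption, not taken from the patent.
    static Path targetPath(Path root, Path source, String fileType, int index) {
        String name = source.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String extension = dot >= 0 ? name.substring(dot) : "";
        return root.resolve(fileType).resolve(fileType + "_" + index + extension);
    }

    // Creates the type folder if needed, then moves the file.
    // Fails if the target already exists, so nothing is overwritten silently.
    static Path renameAndMove(Path root, Path source, String fileType, int index) {
        Path target = targetPath(root, source, fileType, index);
        try {
            Files.createDirectories(target.getParent());
            return Files.move(source, target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path target = targetPath(Paths.get("sorted"),
                Paths.get("scans", "IMG_0001.jpg"), "application_form", 1);
        System.out.println(target);
    }
}
```

Keeping the path computation separate from the move makes the naming rule easy to check without touching the file system.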
In step S6, all keywords that did not fully match but for which successful match records exist are checked, and if they are confirmed to be related to the current file, a new matching rule and strategy are formulated.
When the number of threads is set in step S2, a thread pool is created explicitly through the constructor of ThreadPoolExecutor, and the number of core threads, the maximum number of threads, and the maximum survival time of idle threads beyond corePoolSize are set at creation;
that is, specific values are assigned to these three parameters when the thread pool is created;
they are described as follows:
Taking a 10th-generation Core i3 series processor, common in home computers today, as an example: it has 4 cores and 8 threads and a clock frequency of 3.7 GHz. Under this configuration the adjustable number of threads ranges from 1 to 8, and the adjustable frequency ranges from 1 GHz to 3.7 GHz; the configuration is chosen to increase resource utilization.
The number of core threads is determined as follows: each task takes tasktime seconds to process, so each thread can process 1/tasktime tasks per second; if the system must process tasks tasks per second, the number of threads required is tasks/(1/tasktime), that is, tasks × tasktime threads;
assuming the system handles 100 to 1000 tasks per second and each task takes 0.1 second, 100 × 0.1 to 1000 × 0.1 threads are needed, that is, 10 to 100 threads;
corePoolSize should therefore be set to more than 10, and the specific value is best chosen by the 80/20 principle, that is, from the number of tasks per second the system sees 80% of the time. If the system handles fewer than 200 tasks per second 80% of the time and at most 1000, corePoolSize can be set to 20.
When the system load reaches its maximum, the core threads cannot process all tasks on time, and more threads must be added;
since 200 tasks per second require 20 threads, reaching 1000 tasks per second requires (1000 - queueCapacity) × (20/200) threads, that is, 60 threads, so maxPoolSize can be set to 60. (The result of 60 corresponds to a queueCapacity of 400, a value implied by the arithmetic rather than stated.)
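The sizing arithmetic above can be checked with a short Java sketch. The class PoolSizing, its method names, and the use of milliseconds for the task time are illustrative assumptions; the queue capacity of 400 is simply the value that makes the maxPoolSize formula yield 60 and is not given in the text.

```java
// Hypothetical helper that reproduces the thread-count arithmetic:
// threads = tasks per second x seconds per task.
final class PoolSizing {

    // Threads needed to keep up with the load, rounded up.
    // The task time is given in milliseconds to stay in integer arithmetic.
    static int requiredThreads(int tasksPerSecond, long taskTimeMillis) {
        return (int) ((tasksPerSecond * taskTimeMillis + 999) / 1000);
    }

    // (peak - queueCapacity) x (coreThreads / coreTasksPerSecond), rounded up.
    // Tasks that fit in the queue are assumed not to need an extra thread.
    static int maxPoolSize(int peakTasksPerSecond, int queueCapacity,
                           int coreThreads, int coreTasksPerSecond) {
        long overflow = Math.max(0, peakTasksPerSecond - queueCapacity);
        return (int) ((overflow * coreThreads + coreTasksPerSecond - 1) / coreTasksPerSecond);
    }

    public static void main(String[] args) {
        System.out.println(requiredThreads(100, 100));      // low end of the load range
        System.out.println(requiredThreads(1000, 100));     // high end of the load range
        System.out.println(maxPoolSize(1000, 400, 20, 200)); // assumed queue capacity of 400
    }
}
```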
The maximum survival time of idle threads beyond corePoolSize in the thread pool governs how the number of threads grows and shrinks;
specifically, when the load drops, the number of threads can be reduced: if a thread has been idle for keepAliveTime, it exits; by default the thread pool keeps at least corePoolSize threads. The tasks run by the thread pool extend the Thread class, and in their run method exact matching is performed under the matching rules according to the tag library.
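A thread pool with these three parameters might be created as in the following sketch, which uses the standard java.util.concurrent.ThreadPoolExecutor constructor. The class MatchingPool, the 60-second keep-alive, and the bounded queue of 400 entries are assumptions for illustration; only the values 20 and 60 come from the example above.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical factory for the matching thread pool.
final class MatchingPool {

    // corePoolSize threads are kept alive by default; threads beyond that
    // number exit after being idle for keepAliveSeconds.
    static ThreadPoolExecutor create(int corePoolSize, int maxPoolSize,
                                     long keepAliveSeconds, int queueCapacity) {
        return new ThreadPoolExecutor(
                corePoolSize,
                maxPoolSize,
                keepAliveSeconds, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(queueCapacity));
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = create(20, 60, 60L, 400);
        System.out.println(pool.getCorePoolSize() + " core, "
                + pool.getMaximumPoolSize() + " max");
        pool.shutdown();
    }
}
```

With a bounded queue, ThreadPoolExecutor starts threads beyond corePoolSize only once the queue is full, which is why the queue capacity enters the maxPoolSize estimate.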
After the OCR recognition in step S3, the system splits the OCR result evenly into N segments according to the number of items returned in the OCR result, and assigns the segments to N threads of the system so that they are matched concurrently;
where N is the maximum number of threads supported by the client.
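The even split into N segments and the concurrent matching can be sketched in Java as follows. The class SegmentedMatcher, the substring test used as the match, and the use of a fixed thread pool are assumptions for illustration; on a real client, N could for instance be taken from Runtime.getRuntime().availableProcessors().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: split OCR lines into N segments, match each on its own thread.
final class SegmentedMatcher {

    // Splits items into at most n contiguous segments whose sizes differ by at most one.
    static <T> List<List<T>> split(List<T> items, int n) {
        if (n < 1) throw new IllegalArgumentException("n must be at least 1");
        List<List<T>> segments = new ArrayList<>();
        if (items.isEmpty()) return segments;
        int parts = Math.min(n, items.size());
        int base = items.size() / parts;
        int extra = items.size() % parts;
        int start = 0;
        for (int i = 0; i < parts; i++) {
            int end = start + base + (i < extra ? 1 : 0);
            segments.add(new ArrayList<>(items.subList(start, end)));
            start = end;
        }
        return segments;
    }

    // Returns every keyword that occurs in at least one OCR line.
    static Set<String> matchInParallel(List<String> ocrLines, List<String> keywords, int n) {
        List<List<String>> segments = split(ocrLines, n);
        ExecutorService pool = Executors.newFixedThreadPool(n);
        try {
            List<Future<Set<String>>> futures = new ArrayList<>();
            for (List<String> segment : segments) {
                futures.add(pool.submit(() -> {
                    Set<String> hits = new TreeSet<>();
                    for (String line : segment) {
                        for (String keyword : keywords) {
                            if (line.contains(keyword)) hits.add(keyword);
                        }
                    }
                    return hits;
                }));
            }
            Set<String> all = new TreeSet<>();
            for (Future<Set<String>> future : futures) all.addAll(future.get()); // collect results
            return all;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Credit card application form", "Applicant signature");
        System.out.println(matchInParallel(lines, Arrays.asList("application form", "Party A"), 2));
    }
}
```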
In step S1, the file tag attribute library contains a file name, a matching type, a matching strategy, and the feature tags associated with the file name;
the file names include the application form and the card usage contract;
the matching types include exact matching and specific-strategy matching, where exact matching means that the glyphs, the number of characters, and the order of the feature tag all match completely, and the specific strategy means matching according to conditionally restricted content rules;
the matching strategies include 100% matching and an alternating occurrence count across consecutive lines of at least 2;
the feature tags associated with file names include: XX bank credit card, referrer's credit card number, bank-only column, credit card charter, supplementary card applicant, applicant card type/quantity, bill mailing address, primary card applicant signature, card collection method, conclusion of the contract, "the specific charging items and standards related to this contract are shown in the charging standards", card title () number, Party A + Party B, and "card issuer (hereinafter 'Party A') applicant (hereinafter 'Party B')".
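One way to picture a tag library record and the two matching passes of steps S6 and S7 is the Java sketch below. The class TagLibraryMatcher, the TagEntry shape, and the fallback rule (accept the entry with the most matched tags, provided at least fallbackMinHits matched) are assumptions; the text only says that the second pass follows a specific strategy rule and does not define it precisely.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical two-pass matcher over a file tag attribute library.
final class TagLibraryMatcher {

    // One library row: a file name and its associated feature tags.
    static final class TagEntry {
        final String fileName;
        final List<String> featureTags;

        TagEntry(String fileName, List<String> featureTags) {
            this.fileName = fileName;
            this.featureTags = featureTags;
        }
    }

    // Counts how many of the entry's tags occur in at least one OCR line.
    static int countHits(TagEntry entry, List<String> ocrLines) {
        int hits = 0;
        for (String tag : entry.featureTags) {
            for (String line : ocrLines) {
                if (line.contains(tag)) {
                    hits++;
                    break;
                }
            }
        }
        return hits;
    }

    static Optional<String> classify(List<TagEntry> library, List<String> ocrLines,
                                     int fallbackMinHits) {
        // Pass 1 (100% rule): every feature tag of the entry must be present.
        for (TagEntry entry : library) {
            if (!entry.featureTags.isEmpty()
                    && countHits(entry, ocrLines) == entry.featureTags.size()) {
                return Optional.of(entry.fileName);
            }
        }
        // Pass 2 (assumed fallback strategy): best partial match above a threshold.
        String best = null;
        int bestHits = 0;
        for (TagEntry entry : library) {
            int hits = countHits(entry, ocrLines);
            if (hits >= fallbackMinHits && hits > bestHits) {
                best = entry.fileName;
                bestHits = hits;
            }
        }
        return Optional.ofNullable(best); // empty means matching is terminated
    }

    public static void main(String[] args) {
        List<TagEntry> library = Arrays.asList(
                new TagEntry("card usage contract", Arrays.asList("Party A", "Party B")));
        System.out.println(classify(library, Arrays.asList("Party A and Party B agree"), 2));
    }
}
```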
Examples
With OCR recognition and the intelligent segmented multithreaded batch naming method, the invention can identify and name, image by image, application forms of differing formats, such as the credit card application forms used by different banks, as well as contracts that vary widely in content and exist in multiple versions. The time required is only 1/N of the previous time, and the effective utilization of system resources is 90 percent higher than before. Matching the attributes of different files against a keyword library and a file tag attribute library raises the matching success rate and lowers the file naming error rate. In addition, when not all keywords match, the matching rules and strategies can be changed promptly through manual intervention, which makes matching more flexible.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.

Claims (7)

1. A multithreading collaborative file batch naming method, characterized in that the method comprises the following steps:
Step S1: extract keywords according to the characteristics of each type of material, use them as an initial keyword library, and store them in a corresponding file tag attribute library;
where each keyword record holds: the keyword name, the keyword length, and the file type to which the keyword belongs;
Step S2: set the number of threads according to the actual hardware configuration of the client, and match keyword tag attributes using multithreading so that computer resources are used to the fullest extent;
Step S3: parse and sort the recognition result returned by the Baidu OCR interface, then store it and submit it to a matching queue;
Step S4: obtain the total number of keywords from the matching queue of step S3, then divide the keywords to be matched into batches according to the number of available threads, where the number of keywords in each batch is: total number of keywords / number of threads;
Step S5: match the text content assigned to each thread in step S4 against the keyword library of step S1, where the number of keyword matches in each batch depends on the total number of keywords and the number of keywords in the batch; multithreading improves matching efficiency, and the results are finally collected;
Step S6: during the matching of step S5, first match once under the 100% matching rule; if all keywords match, mark the match as successful directly; if not all keywords match, record the keywords that did match and return the matching result;
Step S7: for the items reported as unmatched in the result of step S6, match again under a specific strategy rule; if matching still fails, terminate the matching;
Step S8: after step S7 completes, automatically name the file with the file name corresponding to the tag attribute, according to the file type of the successfully matched keywords, and move the file to the folder to which it belongs.
2. The multithreading collaborative file batch naming method according to claim 1, wherein: in step S6, all keywords that did not fully match but for which successful match records exist are checked, and if they are confirmed to be related to the current file, a new matching rule and strategy are formulated.
3. The multithreading collaborative file batch naming method according to claim 1, wherein: when the number of threads is set in step S2, a thread pool is created explicitly through the constructor of ThreadPoolExecutor, and the number of core threads, the maximum number of threads, and the maximum survival time of idle threads beyond corePoolSize are set at creation.
4. The multithreading collaborative file batch naming method according to claim 3, wherein: the number of core threads is determined as follows: each task takes tasktime seconds to process, so each thread can process 1/tasktime tasks per second; if the system must process tasks tasks per second, the number of threads required is tasks/(1/tasktime), that is, tasks × tasktime threads.
5. The multithreading collaborative file batch naming method according to claim 4, wherein: the maximum number of threads applies when the system load reaches its maximum and the core threads cannot process all tasks on time, so that more threads must be added.
6. The multithreading collaborative file batch naming method according to claim 5, wherein: the maximum survival time of idle threads beyond corePoolSize in the thread pool governs how the number of threads grows and shrinks;
specifically, when the load drops, the number of threads can be reduced: if a thread has been idle for keepAliveTime, it exits; by default the thread pool keeps at least corePoolSize threads.
7. The multithreading collaborative file batch naming method according to claim 1, wherein: after the OCR recognition in step S3, the system splits the OCR result evenly into N segments according to the number of items returned in the OCR result, and assigns the segments to N threads of the system so that they are matched concurrently;
where N is the maximum number of threads supported by the client.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110729518.9A CN113392075B (en) 2021-06-29 2021-06-29 Multithreading collaborative file batch naming method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110729518.9A CN113392075B (en) 2021-06-29 2021-06-29 Multithreading collaborative file batch naming method

Publications (2)

Publication Number Publication Date
CN113392075A (en) 2021-09-14
CN113392075B (en) 2022-02-11

Family

ID=77624455

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110729518.9A Active CN113392075B (en) 2021-06-29 2021-06-29 Multithreading collaborative file batch naming method

Country Status (1)

Country Link
CN (1) CN113392075B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100299487A1 (en) * 2009-05-20 2010-11-25 Harold Scott Hooper Methods and Systems for Partially-Transacted Data Concurrency
US20150046488A1 (en) * 2013-08-08 2015-02-12 Avision Inc. Method for naming image file
CN109522128A (en) * 2018-11-21 2019-03-26 北京像素软件科技股份有限公司 Segmented multithreading task executing method and device
CN110554997A (en) * 2019-09-12 2019-12-10 广东电网有限责任公司 File name batch modification method and system
CN111813747A (en) * 2020-07-09 2020-10-23 广东一一五科技股份有限公司 File batch renaming method, electronic equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN113392075B (en) 2022-02-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant