CN113254398A - Sample file management method, device, equipment and medium - Google Patents

Sample file management method, device, equipment and medium

Publication number
CN113254398A
Authority
CN
China
Prior art keywords
target
sample file
directory
file
hierarchical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011594962.6A
Other languages
Chinese (zh)
Inventor
杜杨君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Yihua Time Technology Co Ltd
Original Assignee
Shenzhen Yihua Time Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Yihua Time Technology Co Ltd filed Critical Shenzhen Yihua Time Technology Co Ltd
Priority to CN202011594962.6A priority Critical patent/CN113254398A/en
Publication of CN113254398A publication Critical patent/CN113254398A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G06F16/152 File search processing using file content signatures, e.g. hash values
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/156 Query results presentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/178 Techniques for file synchronisation in file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a sample file management method, which comprises the following steps: acquiring at least one preset hierarchical directory and the hierarchical sequence of each hierarchical directory, and constructing a hierarchical structure among the at least one hierarchical directory according to the hierarchical sequence; acquiring target directory features of a target sample file, and storing the target sample file into a bottommost target hierarchical directory corresponding to the target directory features in a hierarchical structure; and acquiring the target data characteristics of each target sample file, and modifying the original file name of the corresponding target sample file according to the target data characteristics to obtain the target sample file with the modified file name. The method can well distinguish and manage the sample files, and is convenient for subsequent retrieval and use. In addition, a sample file management apparatus, a device, and a storage medium are also proposed.

Description

Sample file management method, device, equipment and medium
Technical Field
The invention relates to the technical field of financial equipment, in particular to a sample file management method, a sample file management device, sample file management equipment and sample file management media.
Background
With the advent of the information age, the demand and usage of information data have been growing explosively, and in order to quickly and accurately find desired information data when used, it is necessary to reasonably store and manage files for storing information data.
The conventional file management method classifies files only coarsely. Taking banknote sample files as an example, if training for "whether a corner exists in a banknote image" is to be performed, the collected banknote sample files are divided into sample files with a corner and sample files without a corner, so that they can be used for that training. However, if training for "whether there is contamination in the banknote image" is to be performed later, either the data must be collected again, or the previously collected data must be re-integrated and classified again; either way is laborious and time-consuming.
Disclosure of Invention
In view of the above problems, it is necessary to provide a sample file management method, apparatus, device, and medium that can subdivide sample files accurately.
A method of sample file management, the method comprising:
acquiring at least one preset hierarchical directory and the hierarchical sequence of each hierarchical directory, and constructing a hierarchical structure among the at least one hierarchical directory according to the hierarchical sequence;
acquiring target directory features of a target sample file, and storing the target sample file into a bottommost target hierarchical directory corresponding to the target directory features in the hierarchical structure;
and acquiring the target data characteristics of each target sample file, and modifying the original file name of the corresponding target sample file according to the target data characteristics to obtain the target sample file with the modified file name.
In one embodiment, the target data features include basic data features and data features to be marked, and the basic data features form the original file name according to a preset sequence;
the obtaining of the target data characteristics of each target sample file and the modifying of the original file name of the corresponding target sample file according to the target data characteristics includes:
and acquiring the labeling sequence of the characteristics of the data to be labeled, and adding the label of the characteristics of the data to be labeled in the original file name according to the labeling sequence.
In one embodiment, after obtaining the target sample file with the revised file name, the method further includes:
determining a target retrieval mode, wherein the target retrieval mode comprises a directory retrieval mode and a file name retrieval mode;
when the target retrieval mode is the directory retrieval mode, acquiring input directory features to be retrieved, searching a to-be-output hierarchical directory containing the directory features to be retrieved, and outputting a sample file under the to-be-output hierarchical directory as a retrieval result;
and when the target retrieval mode is the file name retrieval mode, acquiring at least one input data feature to be retrieved, searching a sample file to be output containing all the data features to be retrieved in the corrected file name, and outputting the sample file to be output as a retrieval result.
In one embodiment, after obtaining the target sample file with the revised file name, the method further includes:
when a feature is lost from the corrected file name or the target path of the target sample file, supplementing the lost feature according to the feature of the same attribute that has not been lost; the target path is composed of the target hierarchical directory and the hierarchical directories having a direct hierarchical relation with the target hierarchical directory;
and when the corrected file name of the target sample file is inconsistent with the characteristics of the target path with the same attribute, determining the correct characteristics in the corrected file name, and correcting the error characteristics with the same attribute according to the correct characteristics.
In one embodiment, after obtaining the target sample file with the revised file name, the method further includes:
acquiring a first identity characteristic of each server sample file stored in a server, wherein the first identity characteristic is used for uniquely representing the identity of the corresponding server sample file;
and determining a second identity characteristic of each target sample file, matching the first identity characteristic with the second identity characteristic, and synchronizing the target sample files according to the server sample file when the first identity characteristic is not completely matched with the second identity characteristic.
In one embodiment, after obtaining the target sample file with the revised file name, the method further includes:
obtaining a feature to be trained, and screening a sample file to be trained from the target sample file according to the feature to be trained;
and performing identification training on the sample file to be trained to obtain a feature identification result, and calculating the normal banknote identification rate and/or the abnormal banknote detection rate of the sample file to be trained according to the feature to be trained and the feature identification result.
In one embodiment, the target sample file is a banknote sample file;
the target directory features comprise at least one of equipment type, currency, data type, equipment identification code, acquisition date, issuing organization, currency value, version and orientation;
the target data characteristics include at least one of a banknote number, a hash code, a banknote exception bit, a collection date, a collection number, a currency, an issuing institution, a version, and an orientation.
A sample file management apparatus, the apparatus comprising:
the directory construction module is used for acquiring at least one preset hierarchical directory and the hierarchical sequence of each hierarchical directory, and constructing the hierarchical structure among the hierarchical directories according to the hierarchical sequence;
the paper money deposit module is used for obtaining the target directory characteristics of the target sample file and depositing the target sample file into the lowest target hierarchical directory corresponding to the target directory characteristics in the hierarchical structure;
and the file name correction module is used for acquiring the target data characteristics of each target sample file, and modifying the original file name of the corresponding target sample file according to the target data characteristics to obtain the target sample file with the corrected file name.
A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:
acquiring at least one preset hierarchical directory and the hierarchical sequence of each hierarchical directory, and constructing a hierarchical structure among the at least one hierarchical directory according to the hierarchical sequence;
acquiring target directory features of a target sample file, and storing the target sample file into a bottommost target hierarchical directory corresponding to the target directory features in the hierarchical structure;
and acquiring the target data characteristics of each target sample file, and modifying the original file name of the corresponding target sample file according to the target data characteristics to obtain the target sample file with the modified file name.
A sample file management apparatus comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of:
acquiring at least one preset hierarchical directory and the hierarchical sequence of each hierarchical directory, and constructing a hierarchical structure among the at least one hierarchical directory according to the hierarchical sequence;
acquiring target directory features of a target sample file, and storing the target sample file into a bottommost target hierarchical directory corresponding to the target directory features in the hierarchical structure;
and acquiring the target data characteristics of each target sample file, and modifying the original file name of the corresponding target sample file according to the target data characteristics to obtain the target sample file with the modified file name.
The invention provides a sample file management method. At the directory level, it constructs a plurality of hierarchical directories into a hierarchical structure that can be subdivided step by step according to a preset hierarchical order, realizing hierarchical refinement of directory features. The target sample files are then placed, layer by layer according to their target directory features, into the bottommost target hierarchical directory, so that different sample files are stored in the most finely subdivided hierarchical directory that corresponds to them, and the files are easy to track and unlikely to be misplaced during storage. At the file name level, the original file name is modified using the data features recorded in the sample file, so that sample files can be distinguished and managed by file name alone, which facilitates subsequent retrieval and use.
The invention also provides a sample file management device, which is applied to the sample file management method and comprises a directory construction module used for constructing the hierarchical structure among the hierarchical directories according to the hierarchical sequence. The paper money deposit module is used for depositing the target sample file into the target level catalog at the bottommost layer, so that the catalog type classification management can be realized. The file name correction module is used for modifying the original file name, so that the sample files can be well distinguished and managed only through the file name.
The invention also provides a computer readable storage medium for storing a computer program, wherein when the computer program is executed, the computer program constructs a plurality of hierarchical directories into a hierarchical structure which can be subdivided step by step according to a preset hierarchical sequence, and then puts the target sample file into the target hierarchical directory at the bottommost layer by layer according to the target directory characteristics of the target sample file, so that different sample files are subdivided according to different hierarchical directories, the files are easy to track in the storing process, and errors are not easy to occur. And then the original file name is modified through the data characteristics recorded in the sample file, so that the sample file can be well distinguished and managed only through the file name.
The invention also provides sample file management equipment which comprises a memory and a processor, wherein the memory stores a computer program, the computer program constructs a plurality of hierarchical directories into a hierarchical structure which can be subdivided step by step according to a preset hierarchical sequence when being executed by the processor, and then the target sample files are put into the target hierarchical directory at the bottommost layer by layer according to the target directory characteristics of the target sample files. And modifying the original file name through the data characteristics recorded in the sample file. The device can well distinguish and manage the sample files.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Wherein:
FIG. 1 is a schematic flow chart diagram illustrating a sample file management method in one embodiment;
FIG. 2 is a diagram illustrating the storage of a target sample file into a target-level directory, in one embodiment;
FIG. 3 is a diagram showing the structure of a sample file management apparatus according to an embodiment;
FIG. 4 is a block diagram showing a configuration of a sample file management apparatus according to an embodiment.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1, fig. 1 is a schematic flow chart of a sample file management method in one embodiment. The sample file may be of various types, for example a banknote sample file, a ticket sample file, or a bond sample file; the following embodiment is described in detail with reference to the banknote sample file only.
The sample file management method comprises the following steps:
Step 102, obtaining at least one preset hierarchical directory and a hierarchical sequence of each hierarchical directory, and constructing a hierarchical structure among the at least one hierarchical directory according to the hierarchical sequence.
Each hierarchical directory is assigned a unique corresponding directory feature, and a hierarchical directory stores only sample files that satisfy its directory feature. For example, if the directory feature of hierarchical directory A is 2020, directory A stores only the sample files of 2020. The hierarchical sequence determines the order in which the different hierarchical directories are nested; according to it, the hierarchical directories can be constructed into a tree-shaped hierarchical structure, which realizes the screening and subdivision of the sample files.
Illustratively, in a specific application scenario, the preset hierarchical directories are: (1) the device category directory, (2) the currency type directory, (3) the data category directory, (4) the collection device ID-SN (Identity Document-Serial Number) directory, (5) the collection date directory, (6) the currency-institution directory, (7) the currency value directory, (8) the version directory, and (9) the orientation directory. The numbers in parentheses indicate the hierarchical order. The device category directory represents the device that collects the banknote sample data, and may take the form bv001sd. The currency type directory represents the currency of the banknotes collected by the device; it may take the form 001.CNY, which consists of a 3-digit number and a 3-letter international currency code. The data category directory represents whether the data is abnormal, and may take the form 1.NSD for normal data, 2.ESD for abnormal data, and 3.OT for other data. The collection device ID-SN directory represents the identification number of the collection device, and may take the form ID0000001SN000001. The collection date directory represents the date on which the device collected the sample data; it may take the form 20201124, which consists of a 4-digit year, a 2-digit month, and a 2-digit day. The currency-institution directory represents the currency and the issuing institution of the banknote; it may take the form CNY_PBC, which consists of the 3-letter international currency code and the code of the banknote issuing institution. The currency value directory represents the face value of the banknote, for example 20, 50, or 100. The version directory represents the issue version of the banknote, for example 2015. The orientation directory represents the orientation of the collected banknote image; it may take the form FU, meaning front side up.
Referring to fig. 2, fig. 2 is a schematic diagram of storing a target sample file into a target hierarchical directory in one embodiment. The first-level directory bv001sd is created first, and the next-level directories are then created in sequence under bv001sd according to the preset hierarchical order. A level may contain several subdirectories: for example, when the currency value directories are created and the sample data shows that three currency values exist (5 yuan, 10 yuan, and 20 yuan), three currency value subdirectories are created under the currency-institution directory CNY_PBC. The remaining directories are created in the same way until the lowest hierarchical directory has been created. It is understood that the number and hierarchical order of the hierarchical directories are not limited to those shown in fig. 2.
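The directory construction described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the level keys in `LEVEL_ORDER` and the function name `build_hierarchy` are hypothetical names chosen here, and the sketch assumes each sample's directory features are already available as a dictionary.

```python
from pathlib import Path

# Hierarchical order (1) to (9) from the embodiment; the key names are hypothetical.
LEVEL_ORDER = [
    "device_category", "currency_type", "data_category", "device_id_sn",
    "collection_date", "currency_institution", "currency_value", "version",
    "orientation",
]


def build_hierarchy(root, samples):
    """Create one nested directory chain per distinct combination of features.

    `samples` is an iterable of dicts mapping every key in LEVEL_ORDER to the
    directory name at that level. Returns the set of bottommost directories.
    """
    leaves = set()
    for features in samples:
        path = Path(root)
        # Descend level by level, in the preset hierarchical order.
        for level in LEVEL_ORDER:
            path = path / str(features[level])
        path.mkdir(parents=True, exist_ok=True)
        leaves.add(path)
    return leaves
```

With three samples that differ only in currency value (5, 10, 20), the sketch produces three sibling subdirectories under `CNY_PBC`, mirroring the example in the text.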
Step 104, acquiring target directory features of the target sample file, and storing the target sample file into the bottommost target hierarchical directory corresponding to the target directory features in the hierarchical structure.
The target sample file is a sample file that currently needs to be stored in the hierarchical directories. Target sample files may be stored one at a time, or several target sample files of the current batch may be stored simultaneously; which is used is determined by the performance of the device or set manually. The target directory features are the features recorded in the sample data that correspond to the directories into which the file is stored.
Specifically, when the target sample file is stored, the target directory features are compared with the hierarchical directories in sequence according to the hierarchical order, and the target sample file is stored in the hierarchical directory for which the comparison succeeds. For example, when the collection date feature among the target directory features is "20190904" and the collection date directory includes the subdirectories "20190904" and "20190905", the target sample file is stored in the subdirectory "20190904". Storage then proceeds to the next-level directory, until the lowest target hierarchical directory recorded by the target directory features is reached. Referring to fig. 2, the lowest target hierarchical directory in this embodiment is one of "BD", "BU", "FD", and "FU", where FU denotes front side up, FD denotes front side down, BU denotes back side up, and BD denotes back side down.
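The level-by-level comparison in step 104 might look like the sketch below. It assumes the hierarchy already exists on disk and that the features are supplied as a dictionary; the function name `deposit` and its parameters are hypothetical.

```python
import shutil
from pathlib import Path


def deposit(sample_path, root, features, level_order):
    """Move a sample file into the bottommost directory matching its features.

    At each level the wanted directory feature is compared with the
    subdirectories that exist there; a failed comparison raises LookupError.
    """
    current = Path(root)
    for level in level_order:
        wanted = str(features[level])
        matches = [d for d in current.iterdir() if d.is_dir() and d.name == wanted]
        if not matches:
            raise LookupError(f"no {level} directory named {wanted!r} under {current}")
        current = matches[0]
    target = current / Path(sample_path).name
    shutil.move(str(sample_path), str(target))
    return target
```

Raising on a failed comparison, rather than silently creating the directory, keeps misfiled samples visible; creating missing directories on demand would be an equally reasonable choice.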
Step 106, acquiring target data features of each target sample file, and modifying the original file name of the corresponding target sample file according to the target data features to obtain the target sample file with the modified file name.
The target data characteristics are the main attribute characteristics of the sample files and can be used for distinguishing different sample files.
For example, in a specific application scenario, the target data features acquired by the collection device for each target sample file are: banknote number, hash code, banknote abnormality bits, year-month-day, hour-minute-second, collection number, currency, institution, version, currency value, and orientation. The banknote number is the serial-number string printed on the face of the banknote, with a length of 6 to 20. The hash code is a HexStr (hexadecimal string) 32 bytes long; it identifies the sample uniquely and is used to detect duplicate samples and to synchronize the local samples bidirectionally with the server. The banknote abnormality bits are a HexStr 16 bytes long, divided into common information abnormalities and other information abnormalities, and flag the abnormal information present in the sample. Table 1 below is the common information abnormality table, and Table 2 is the other information abnormality table.
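Table 1: [table rendered as image BDA0002867969360000081 in the original publication; contents not reproduced]
Table 2: [table rendered as images BDA0002867969360000082, BDA0002867969360000091, and BDA0002867969360000101 in the original publication; contents not reproduced]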
The year-month-day represents the date on which the sample data was collected. The hour-minute-second represents the time at which the sample data was collected. The collection number is the number of the sample file; it ranges from 0000 to 9999 and can be extended. The currency represents the currency to which the sample banknote belongs, such as CNY. The institution represents the issuing institution of the sample banknote, such as PBC. The version represents the issue version of the sample banknote, such as the 2015 version. The currency value is expressed in the base-exponent form AABB, where AA is the face value with its trailing zeros removed and BB is the number of trailing zeros. For example, 100 yuan renminbi is written 0102: 01 denotes the number 1, and 02 denotes two 0s. The orientation represents the facing of the sample banknote, and is one of FU, FD, BU, and BD. The target data features comprise basic data features and data features to be labeled. The basic data features are labeled in advance by the collection device; arranged in a preset order, they form the original file name and suffice for a simple distinction between sample files. The data features to be labeled carry further features for distinguishing sample files, so that each target sample file can be subdivided further on top of the basic data features.
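The AABB currency value encoding described above can be illustrated with a short sketch. The function names are hypothetical, and the sketch assumes integer face values whose non-zero part fits in two digits.

```python
def encode_currency_value(value):
    """Encode a positive integer face value as AABB.

    AA is the value with trailing zeros stripped, BB is the count of
    stripped zeros, each zero-padded to two digits.
    """
    if value <= 0:
        raise ValueError("face value must be a positive integer")
    zeros = 0
    while value % 10 == 0:
        value //= 10
        zeros += 1
    if value > 99 or zeros > 99:
        raise ValueError("value does not fit the two-digit AA/BB fields")
    return f"{value:02d}{zeros:02d}"


def decode_currency_value(code):
    """Inverse of encode_currency_value: 'AABB' -> AA * 10**BB."""
    return int(code[:2]) * 10 ** int(code[2:])
```

For instance, `encode_currency_value(100)` gives `"0102"`, matching the 100 yuan example in the text, and `decode_currency_value("0102")` returns 100.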
In the above embodiment, the banknote number, the collection number, the year-month-day, and the hour-minute-second are the basic data features, and they make up the original file name in sequence. The remaining target data features are the data features to be labeled: the labeling order of the hash code is 2, of the banknote abnormality bits 3, of the currency 7, of the institution 8, of the version 9, of the currency value 10, and of the orientation 11. The labels of the data features to be labeled are added to the original file name according to the labeling order, finally giving the target sample files shown in fig. 2. Furthermore, after the file names of all sample files have been modified, the accuracy of the file management can be verified by manually checking, or spot-checking, the labeling information, and sample data found to be inaccurate is manually labeled again.
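The insertion of labels according to the labeling order can be sketched as follows. Several details here are assumptions rather than facts from the text: the underscore separator, the order of the basic features inside the original file name, positions 1, 4, 5, and 6 for the basic features (inferred from the order in which the features are listed), and all identifier names.

```python
# Position of every feature in the revised name. Positions 2, 3 and 7 to 11
# are given in the text; positions 1, 4, 5 and 6 are assumed.
LABEL_ORDER = {
    "banknote_number": 1, "hash_code": 2, "abnormality_bits": 3,
    "date": 4, "time": 5, "collection_number": 6,
    "currency": 7, "institution": 8, "version": 9,
    "currency_value": 10, "orientation": 11,
}
# Assumed order of the basic features inside the original file name.
BASIC_FEATURES = ["banknote_number", "date", "time", "collection_number"]


def revise_file_name(original_name, labels, sep="_"):
    """Insert the labelled features into the original name by labeling order."""
    stem, dot, ext = original_name.rpartition(".")
    if not dot:  # no extension present
        stem, ext = original_name, ""
    basic_values = stem.split(sep)
    if len(basic_values) != len(BASIC_FEATURES):
        raise ValueError(f"unexpected original file name: {original_name!r}")
    fields = dict(zip(BASIC_FEATURES, basic_values))
    fields.update(labels)
    ordered = sorted(fields, key=LABEL_ORDER.__getitem__)
    new_stem = sep.join(str(fields[name]) for name in ordered)
    return new_stem + (dot + ext if dot else "")
```

Keeping the order in a single table means a new feature to be labeled needs only one new entry in `LABEL_ORDER`.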
According to the sample file management method, at the directory level a plurality of hierarchical directories is constructed into a hierarchical structure that can be subdivided step by step according to a preset hierarchical order, realizing hierarchical refinement of directory features. The target sample files are then placed, layer by layer according to their target directory features, into the bottommost target hierarchical directory, so that different sample files are stored in the most finely subdivided hierarchical directory that corresponds to them, and the files are easy to track and unlikely to be misplaced during storage. At the file name level, the original file name is modified using the data features recorded in the sample file, so that sample files can be distinguished and managed by file name alone, which facilitates subsequent retrieval and use.
With a complete detailed hierarchy and sample file names with data features, we can use these sample files to complete specific tasks, which will be described in detail below with reference to different application scenarios.
In one specific application scenario, the desired banknote sample files are retrieved through the search function provided by the operating system, for use in subsequent algorithm training. The selectable target retrieval modes are a directory retrieval mode, which uses directory names as the retrieval path, and a file name retrieval mode, which uses sample file names as the retrieval path. When the user selects the directory retrieval mode, the user is offered the candidate directory features covered by the device category directory, currency type directory, data category directory, collection device ID-SN directory, collection date directory, currency-institution directory, currency value directory, version directory, orientation directory, and so on. The user's input is then taken as the directory features to be retrieved, the to-be-output hierarchical directories containing those features are searched for, and the sample files under them are output as the retrieval result. For example, when the directory feature to be retrieved is a currency value of 20, all sample files under the hierarchical directories whose currency value directory is 20 are output. A plurality of directory features to be retrieved can also be input at the same time, which refines the output further and narrows it to the sample files actually needed.
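A directory retrieval of this kind could be sketched as below. The text relies on the operating system's search function; this sketch instead walks the tree with `pathlib`, which is a substitution made here for illustration. The function name and the level keys are hypothetical.

```python
from pathlib import Path


def search_by_directory(root, level_order, wanted):
    """Return the files whose directory path matches every wanted feature.

    `level_order` names the directory levels from top to bottom and `wanted`
    maps some of those names to the directory name being searched for.
    """
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        parts = path.relative_to(root).parent.parts
        if len(parts) != len(level_order):
            continue  # file is not stored in a bottommost directory
        located = dict(zip(level_order, parts))
        if all(located.get(level) == name for level, name in wanted.items()):
            hits.append(path)
    return sorted(hits)
```

Matching on the level name as well as the directory name avoids confusing, say, a currency value directory `20` with a directory named `20` at some other level.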
Correspondingly, when the target retrieval mode is the file name retrieval mode, the user is offered candidate data features including the banknote number, hash code, banknote abnormality bits, year-month-day, hour-minute-second, collection number, currency, institution, version, currency value, orientation, and so on. The user's input is then taken as the data features to be retrieved, the sample files whose revised file names contain all of those features are searched for, and these files are output as the retrieval result. For example, when the data feature to be retrieved is that the banknote is contaminated, all sample files whose contamination field in the banknote abnormality bits is 1 are output. Likewise, a plurality of data features to be retrieved can be input at the same time. In this way the desired sample data can be retrieved quickly, whether by a single feature or by a combination of features.
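File name retrieval can be sketched in the same spirit. The underscore separator, the field names, and in particular the bit index used for an abnormality flag are assumptions: the actual bit layout is defined in Tables 1 and 2, which are not reproduced here, so `abnormal_bit` is a purely hypothetical position counted from the least significant bit.

```python
# Field order of the revised file name, following the labeling order 1 to 11.
FIELD_ORDER = [
    "banknote_number", "hash_code", "abnormality_bits", "date", "time",
    "collection_number", "currency", "institution", "version",
    "currency_value", "orientation",
]


def parse_revised_name(file_name, sep="_"):
    """Split a revised file name into a feature dictionary."""
    stem = file_name.rsplit(".", 1)[0]
    values = stem.split(sep)
    if len(values) != len(FIELD_ORDER):
        raise ValueError(f"unexpected revised file name: {file_name!r}")
    return dict(zip(FIELD_ORDER, values))


def bit_is_set(hex_str, bit_index):
    """True if the given bit (0 = least significant) of a HexStr is 1."""
    return (int(hex_str, 16) >> bit_index) & 1 == 1


def search_by_file_name(file_names, wanted=None, abnormal_bit=None):
    """Keep the names matching every wanted field and, optionally, one flag bit."""
    hits = []
    for name in file_names:
        fields = parse_revised_name(name)
        if any(fields[key] != value for key, value in (wanted or {}).items()):
            continue
        if abnormal_bit is not None and not bit_is_set(fields["abnormality_bits"], abnormal_bit):
            continue
        hits.append(name)
    return hits
```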
In yet another specific application scenario, the hierarchical structure and the sample file name are used to supplement and correct each other. When a feature is lost from the corrected file name or from the target path of a target sample file, the lost feature is supplemented according to the non-lost feature belonging to the same attribute. For example, when the currency value is lost from the corrected file name and 20 is recorded in the target path, the lost part of the corrected file name is supplemented with 20. Conversely, a feature lost from the target path can be supplemented according to the non-lost feature recorded in the corrected file name.
In another case, the corrected file name of the target sample file is inconsistent with the feature of the same attribute in the target path, for example the currency value recorded in the corrected file name is 20 while the currency value recorded in the target path is 10. In that case the correct feature input by the user is received, and the erroneous feature of the same attribute is corrected according to the correct feature.
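The two cases above, supplementing a lost feature and correcting an inconsistent one, can be sketched as a single reconciliation step. The sketch assumes the features of the file name and of the target path have already been parsed into dictionaries keyed by attribute, with None marking a lost feature; the function name and the dictionary layout are hypothetical.

```python
def reconcile(name_features, path_features, user_correct=None):
    """Supplement lost features and correct conflicting ones.

    name_features / path_features: attribute -> value, None if lost.
    user_correct: attribute -> correct value supplied by the user,
    consulted only when the two sources disagree.
    """
    user_correct = user_correct or {}
    name_out, path_out = dict(name_features), dict(path_features)
    for attr in name_features.keys() & path_features.keys():
        in_name, in_path = name_features[attr], path_features[attr]
        if in_name is None and in_path is not None:
            name_out[attr] = in_path          # file name lost it: copy from path
        elif in_path is None and in_name is not None:
            path_out[attr] = in_name          # path lost it: copy from file name
        elif in_name != in_path:
            if attr not in user_correct:
                raise ValueError(f"conflict on {attr!r}: user input required")
            name_out[attr] = path_out[attr] = user_correct[attr]
    return name_out, path_out

# Lost currency value in the file name, 20 recorded in the target path.
filled = reconcile({"currency_value": None, "currency": "CNY"},
                   {"currency_value": "20", "currency": "CNY"})
# File name says 20, path says 10; the user confirms 20.
fixed = reconcile({"currency_value": "20"}, {"currency_value": "10"},
                  user_correct={"currency_value": "20"})
```

Raising an error on an unresolved conflict reflects the description, in which the correct feature is received from the user rather than guessed.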
In another specific application scenario, the local sample files and the server sample files can be synchronized through a synchronization management tool. Because sample file synchronization must ensure that each sample is unique, the sample files are synchronized without duplication by comparing the identity features, for example hash codes, of the local sample files and the server sample files.
For each server sample file, a plurality of projection directions are determined based on the file data recorded in the sample file, and the file data is projected onto each projection direction to obtain a projection value in that direction. Projection thresholds that divide each projection direction into a plurality of intervals are acquired, each interval having a corresponding sub-code. The sub-code in each projection direction is determined from the projection value and the projection thresholds, and all the sub-codes are concatenated in a fixed order to obtain the first hash code of the server sample file. Similarly, a second hash code is determined for each local target sample file. All the first hash codes are matched against all the second hash codes. When they do not match completely, either a redundant second hash code exists, meaning that a local target sample file is not needed, or a second hash code is missing, meaning that a target sample file has not been stored locally. In either case, the target sample files are synchronously updated based on the server sample files. Because the hash code is unique, it can be reused after being marked once and does not need to be recalculated during synchronization, which improves the synchronization speed. For the same reason, each target sample file can be located quickly and accurately by retrieving its hash code.
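The projection-based hash code and the comparison of first and second hash codes can be sketched as follows. The description does not state how the projection directions or thresholds are chosen, so this sketch assumes seeded pseudo-random directions and three fixed thresholds; those choices, the function names, and the example codes are hypothetical, and a hash built this way is not guaranteed to be unique for arbitrary data.

```python
import random

def projection_hash(data, n_directions=8, thresholds=(-0.5, 0.0, 0.5), seed=0):
    """Project the file bytes onto several directions and quantize each
    projection value into an interval index used as the sub-code."""
    rng = random.Random(seed)                    # same seed -> same directions
    vec = [b / 255.0 - 0.5 for b in data]        # center the byte values
    scale = max(len(vec), 1) ** 0.5
    sub_codes = []
    for _ in range(n_directions):
        direction = [rng.gauss(0.0, 1.0) for _ in vec]
        projection = sum(v * d for v, d in zip(vec, direction)) / scale
        # Number of thresholds exceeded = index of the interval (0..3).
        sub_codes.append(str(sum(projection > t for t in thresholds)))
    return "".join(sub_codes)                    # concatenate in a fixed order

def plan_sync(server_codes, local_codes):
    """Compare first (server) and second (local) hash codes."""
    server, local = set(server_codes), set(local_codes)
    return {
        "download": sorted(server - local),  # missing locally
        "delete": sorted(local - server),    # redundant locally
    }

code = projection_hash(b"example banknote data")
plan = plan_sync(["a1", "b2", "c3"], ["b2", "c3", "d4"])
```

Because the comparison works on stored codes only, a code that is already written into the corrected file name can be reused without rereading the file data.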
In another specific application scenario, the training result for the samples can also be calculated according to the marked features. Referring to Table 1 and Table 2 above, the banknote abnormal bits represent the abnormality information present in a banknote sample. When image training is performed on a certain kind of banknote sample, for example when stained banknotes need to be selected for training a stain detection algorithm, the banknote images meeting the stain requirement are first found through the file name retrieval mode and used as the samples to be trained. Of course, a certain number of unstained banknote images can also be added as a reference control. The sample files to be trained are input into a training model for recognition training to obtain a feature recognition result, which comprises the proportion of images recognized as stained banknotes and the proportion of images recognized as unstained banknotes. The recognition result is compared with the correct marked result to obtain the normal banknote recognition rate and the abnormal banknote detection rate. Similarly, other training results such as the magnetic anomaly detection rate and the thickness anomaly detection rate can be calculated.
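The comparison of the recognition result with the marked result can be sketched as below. The description does not give explicit formulas, so this sketch assumes that the normal banknote recognition rate is the fraction of unstained samples recognized as unstained and that the abnormal banknote detection rate is the fraction of stained samples recognized as stained; the function name and the example data are hypothetical.

```python
def recognition_rates(marked, recognized):
    """marked / recognized: sequences of booleans, True meaning 'stained'.

    Returns (normal_recognition_rate, abnormal_detection_rate); a rate is
    None when there is no sample of that kind.
    """
    pairs = list(zip(marked, recognized))
    normal_total = sum(1 for m, _ in pairs if not m)
    abnormal_total = sum(1 for m, _ in pairs if m)
    normal_ok = sum(1 for m, r in pairs if not m and not r)
    abnormal_ok = sum(1 for m, r in pairs if m and r)
    normal_rate = normal_ok / normal_total if normal_total else None
    abnormal_rate = abnormal_ok / abnormal_total if abnormal_total else None
    return normal_rate, abnormal_rate

# Four stained samples plus two unstained reference samples.
marked = [True, True, True, True, False, False]
recognized = [True, True, True, False, False, True]
normal_rate, abnormal_rate = recognition_rates(marked, recognized)
```

The same function can be applied to any other abnormal bit, such as a magnetic or thickness anomaly, by passing the corresponding marked and recognized flags.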
In one embodiment, as shown in fig. 3, there is provided a sample file management apparatus including:
the directory construction module 302 is configured to obtain at least one preset hierarchical directory and a hierarchical order of each hierarchical directory, and construct a hierarchical structure between the at least one hierarchical directory according to the hierarchical order;
the paper money deposit module 304 is used for obtaining the target directory characteristics of the target sample file and storing the target sample file into the lowest target hierarchical directory corresponding to the target directory characteristics in the hierarchical structure;
and the file name correction module 306 is configured to obtain target data characteristics of each target sample file, and modify the original file name of the corresponding target sample file according to the target data characteristics to obtain the target sample file with the corrected file name.
With the above sample file management apparatus, at the directory level a plurality of hierarchical directories are constructed, according to a preset hierarchical order, into a hierarchical structure that can be subdivided level by level, so that the directory features are refined hierarchically. The target sample files are then placed, level by level according to their target directory features, into the bottommost target hierarchical directory, so that different sample files are stored in the most finely subdivided hierarchical directory corresponding to them, which makes the files easy to track and reduces errors during storage. At the file name level, the original file name is modified using the data features recorded in the sample file, so that sample files can be distinguished and managed by file name alone, which facilitates subsequent retrieval and use.
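The three modules above can be illustrated together with a short sketch that builds the target hierarchical directory, deposits a sample file in it, and appends labels to the original file name. The shortened level order, the underscore separator, the label values, and the function name are assumptions made for illustration only.

```python
import shutil
from pathlib import Path

# Hypothetical, shortened hierarchical order (an assumption).
LEVEL_ORDER = ["device_type", "currency", "currency_value"]

def deposit(root, sample, directory_features, labels):
    """Store a sample file in its bottommost hierarchical directory and
    return the path of the stored file with its corrected name."""
    # Directory level: one sub-directory per level, in hierarchical order.
    target_dir = Path(root).joinpath(*(directory_features[l] for l in LEVEL_ORDER))
    target_dir.mkdir(parents=True, exist_ok=True)
    # File name level: append the labels to the original name, in labeling order.
    sample = Path(sample)
    corrected_name = "_".join([sample.stem, *labels]) + sample.suffix
    return Path(shutil.move(str(sample), str(target_dir / corrected_name)))
```

For example, calling deposit with the features device_type "ATM", currency "CNY", currency_value "20" and the labels ["100", "front"] on a file named AB12345678.bmp would store it as ATM/CNY/20/AB12345678_100_front.bmp under the chosen root.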
In one embodiment, the file name correction module 306 is further specifically configured to: acquire the labeling order of the data features to be labeled, and add the labels of the data features to be labeled to the original file name according to the labeling order.
In one embodiment, the sample file management apparatus further includes: the retrieval module is used for determining a target retrieval mode, and the target retrieval mode comprises a directory retrieval mode and a file name retrieval mode; when the target retrieval mode is a directory retrieval mode, acquiring input directory features to be retrieved, searching a to-be-output hierarchical directory containing the directory features to be retrieved, and outputting a sample file under the to-be-output hierarchical directory as a retrieval result; and when the target retrieval mode is a file name retrieval mode, acquiring at least one input data feature to be retrieved, searching a sample file to be output containing all the data features to be retrieved in the corrected file name, and outputting the sample file to be output as a retrieval result.
In one embodiment, the sample file management apparatus further includes: a supplement and correction module, configured to, when a feature is lost from the corrected file name or the target path of the target sample file, supplement the lost feature according to the non-lost feature belonging to the same attribute, wherein the target path is composed of the target hierarchical directory and the hierarchical directories having a direct hierarchical relationship with the target hierarchical directory; and, when the corrected file name of the target sample file is inconsistent with the feature of the same attribute in the target path, determine the correct feature in the corrected file name and correct the erroneous feature of the same attribute according to the correct feature.
In one embodiment, the sample file management apparatus further includes: the synchronization module is used for acquiring a first identity characteristic of each server sample file stored in the server, wherein the first identity characteristic is used for uniquely representing the identity of the corresponding server sample file; and determining a second identity characteristic of each target sample file, matching the first identity characteristic with the second identity characteristic, and synchronizing the target sample files according to the server sample files when the first identity characteristic is not completely matched with the second identity characteristic.
In one embodiment, the sample file management apparatus further includes: the calculation module is used for obtaining the characteristics to be trained and screening the sample files to be trained from the target sample files according to the characteristics to be trained; and carrying out identification training on the sample file to be trained to obtain a feature identification result, and calculating the normal banknote identification rate and/or the abnormal banknote detection rate of the sample file to be trained according to the feature to be trained and the feature identification result.
FIG. 4 is a diagram showing the internal structure of a sample file management apparatus in one embodiment. As shown in FIG. 4, the sample file management apparatus includes a processor, a memory, and a network interface connected through a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and may further store a computer program that, when executed by the processor, causes the processor to implement the sample file management method. The internal memory may also store a computer program that, when executed by the processor, causes the processor to perform the sample file management method. It will be understood by those skilled in the art that the structure shown in FIG. 4 is a block diagram of only the part of the structure related to the present application and does not limit the sample file management apparatus to which the present application is applied; a particular sample file management apparatus may include more or fewer components than shown in the drawing, may combine some components, or may have a different arrangement of components.
A sample file management apparatus comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the following steps when executing the computer program: acquiring at least one preset hierarchical directory and the hierarchical sequence of each hierarchical directory, and constructing a hierarchical structure among the at least one hierarchical directory according to the hierarchical sequence; acquiring target directory features of a target sample file, and storing the target sample file into a bottommost target hierarchical directory corresponding to the target directory features in a hierarchical structure; and acquiring the target data characteristics of each target sample file, and modifying the original file name of the corresponding target sample file according to the target data characteristics to obtain the target sample file with the modified file name.
In one embodiment, obtaining target data characteristics of each target sample file, and modifying an original file name of the corresponding target sample file according to the target data characteristics includes: and acquiring the labeling sequence of the characteristics of the data to be labeled, and adding the label of the characteristics of the data to be labeled in the original file name according to the labeling sequence.
In one embodiment, after obtaining the target sample file with the revised file name, the method further includes: determining a target retrieval mode, wherein the target retrieval mode comprises a directory retrieval mode and a file name retrieval mode; when the target retrieval mode is a directory retrieval mode, acquiring input directory features to be retrieved, searching a to-be-output hierarchical directory containing the directory features to be retrieved, and outputting a sample file under the to-be-output hierarchical directory as a retrieval result; and when the target retrieval mode is a file name retrieval mode, acquiring at least one input data feature to be retrieved, searching a sample file to be output containing all the data features to be retrieved in the corrected file name, and outputting the sample file to be output as a retrieval result.
In one embodiment, after obtaining the target sample file with the revised file name, the method further includes: when a feature is lost from the corrected file name or the target path of the target sample file, supplementing the lost feature according to the non-lost feature belonging to the same attribute; and when the corrected file name of the target sample file is inconsistent with the feature of the same attribute in the target path, determining the correct feature in the corrected file name, and correcting the erroneous feature of the same attribute according to the correct feature.
In one embodiment, after obtaining the target sample file with the revised file name, the method further includes: acquiring a first identity characteristic of each server sample file stored in a server, wherein the first identity characteristic is used for uniquely representing the identity of the corresponding server sample file; and determining a second identity characteristic of each target sample file, matching the first identity characteristic with the second identity characteristic, and synchronizing the target sample files according to the server sample files when the first identity characteristic is not completely matched with the second identity characteristic.
In one embodiment, after obtaining the target sample file with the revised file name, the method further includes: acquiring a feature to be trained, and screening a sample file to be trained from a target sample file according to the feature to be trained; and carrying out identification training on the sample file to be trained to obtain a feature identification result, and calculating the normal banknote identification rate and/or the abnormal banknote detection rate of the sample file to be trained according to the feature to be trained and the feature identification result.
A computer-readable storage medium storing a computer program which, when executed by a processor, performs the steps of: acquiring at least one preset hierarchical directory and the hierarchical sequence of each hierarchical directory, and constructing a hierarchical structure among the at least one hierarchical directory according to the hierarchical sequence; acquiring target directory features of a target sample file, and storing the target sample file into a bottommost target hierarchical directory corresponding to the target directory features in a hierarchical structure; and acquiring the target data characteristics of each target sample file, and modifying the original file name of the corresponding target sample file according to the target data characteristics to obtain the target sample file with the modified file name.
In one embodiment, obtaining target data characteristics of each target sample file, and modifying an original file name of the corresponding target sample file according to the target data characteristics includes: and acquiring the labeling sequence of the characteristics of the data to be labeled, and adding the label of the characteristics of the data to be labeled in the original file name according to the labeling sequence.
In one embodiment, after obtaining the target sample file with the revised file name, the method further includes: determining a target retrieval mode, wherein the target retrieval mode comprises a directory retrieval mode and a file name retrieval mode; when the target retrieval mode is a directory retrieval mode, acquiring input directory features to be retrieved, searching a to-be-output hierarchical directory containing the directory features to be retrieved, and outputting a sample file under the to-be-output hierarchical directory as a retrieval result; and when the target retrieval mode is a file name retrieval mode, acquiring at least one input data feature to be retrieved, searching a sample file to be output containing all the data features to be retrieved in the corrected file name, and outputting the sample file to be output as a retrieval result.
In one embodiment, after obtaining the target sample file with the revised file name, the method further includes: when a feature is lost from the corrected file name or the target path of the target sample file, supplementing the lost feature according to the non-lost feature belonging to the same attribute; and when the corrected file name of the target sample file is inconsistent with the feature of the same attribute in the target path, determining the correct feature in the corrected file name, and correcting the erroneous feature of the same attribute according to the correct feature.
In one embodiment, after obtaining the target sample file with the revised file name, the method further includes: acquiring a first identity characteristic of each server sample file stored in a server, wherein the first identity characteristic is used for uniquely representing the identity of the corresponding server sample file; and determining a second identity characteristic of each target sample file, matching the first identity characteristic with the second identity characteristic, and synchronizing the target sample files according to the server sample files when the first identity characteristic is not completely matched with the second identity characteristic.
In one embodiment, after obtaining the target sample file with the revised file name, the method further includes: acquiring a feature to be trained, and screening a sample file to be trained from a target sample file according to the feature to be trained; and carrying out identification training on the sample file to be trained to obtain a feature identification result, and calculating the normal banknote identification rate and/or the abnormal banknote detection rate of the sample file to be trained according to the feature to be trained and the feature identification result.
It should be noted that the sample file management method, apparatus, device and computer-readable storage medium described above belong to a general inventive concept, and the contents in the embodiments of the sample file management method, apparatus, device and computer-readable storage medium may be mutually applicable.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and which, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described, but as long as a combination of these technical features involves no contradiction, it should be considered to fall within the scope of the present specification.
The above examples express only several embodiments of the present application, and although their description is relatively specific and detailed, they should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these fall within the scope of protection of the present application. Therefore, the scope of protection of the present patent shall be subject to the appended claims.

Claims (10)

1. A sample file management method, the method comprising:
acquiring at least one preset hierarchical directory and the hierarchical sequence of each hierarchical directory, and constructing a hierarchical structure among the at least one hierarchical directory according to the hierarchical sequence;
acquiring target directory features of a target sample file, and storing the target sample file into a bottommost target hierarchical directory corresponding to the target directory features in the hierarchical structure;
and acquiring the target data characteristics of each target sample file, and modifying the original file name of the corresponding target sample file according to the target data characteristics to obtain the target sample file with the modified file name.
2. The method according to claim 1, wherein the target data characteristics comprise basic data characteristics and data characteristics to be labeled, and the basic data characteristics constitute the original file names in a preset order;
the obtaining of the target data characteristics of each target sample file and the modifying of the original file name of the corresponding target sample file according to the target data characteristics includes:
and acquiring the labeling sequence of the characteristics of the data to be labeled, and adding the label of the characteristics of the data to be labeled in the original file name according to the labeling sequence.
3. The method of claim 2, further comprising, after obtaining the target sample file with the revised file name:
determining a target retrieval mode, wherein the target retrieval mode comprises a directory retrieval mode and a file name retrieval mode;
when the target retrieval mode is the directory retrieval mode, acquiring input directory features to be retrieved, searching a to-be-output hierarchical directory containing the directory features to be retrieved, and outputting a sample file under the to-be-output hierarchical directory as a retrieval result;
and when the target retrieval mode is the file name retrieval mode, acquiring at least one input data feature to be retrieved, searching a sample file to be output containing all the data features to be retrieved in the corrected file name, and outputting the sample file to be output as a retrieval result.
4. The method of claim 1, further comprising, after obtaining the target sample file with the revised file name:
when a feature is lost from the corrected file name or the target path of the target sample file, supplementing the lost feature according to the non-lost feature belonging to the same attribute; the target path is composed of a target hierarchical directory and the hierarchical directories having a direct hierarchical relationship with the target hierarchical directory;
and when the corrected file name of the target sample file is inconsistent with the characteristics of the target path with the same attribute, determining the correct characteristics in the corrected file name, and correcting the error characteristics with the same attribute according to the correct characteristics.
5. The method of claim 1, further comprising, after obtaining the target sample file with the revised file name:
acquiring a first identity characteristic of each server sample file stored in a server, wherein the first identity characteristic is used for uniquely representing the identity of the corresponding server sample file;
and determining a second identity characteristic of each target sample file, matching the first identity characteristic with the second identity characteristic, and synchronizing the target sample files according to the server sample file when the first identity characteristic is not completely matched with the second identity characteristic.
6. The method of claim 1, further comprising, after obtaining the target sample file with the revised file name:
obtaining a feature to be trained, and screening a sample file to be trained from the target sample file according to the feature to be trained;
and performing identification training on the sample file to be trained to obtain a feature identification result, and calculating the normal banknote identification rate and/or the abnormal banknote detection rate of the sample file to be trained according to the feature to be trained and the feature identification result.
7. The method of claim 1, wherein the target sample file is a banknote sample file;
the target directory features comprise at least one of equipment type, currency, data type, equipment identification code, acquisition date, issuing organization, currency value, version and orientation;
the target data characteristics include at least one of a banknote number, a hash code, a banknote exception bit, a collection date, a collection number, a currency, an issuing institution, a version, and an orientation.
8. A sample file management apparatus, characterized in that the apparatus comprises:
a directory construction module, configured to acquire at least one preset hierarchical directory and the hierarchical order of each hierarchical directory, and to construct a hierarchical structure among the at least one hierarchical directory according to the hierarchical order;
the paper money deposit module is used for obtaining the target directory characteristics of the target sample file and depositing the target sample file into the lowest target hierarchical directory corresponding to the target directory characteristics in the hierarchical structure;
and the file name correction module is used for acquiring the target data characteristics of each target sample file, and modifying the original file name of the corresponding target sample file according to the target data characteristics to obtain the target sample file with the corrected file name.
9. A computer-readable storage medium, storing a computer program which, when executed by a processor, causes the processor to carry out the steps of the method according to any one of claims 1 to 7.
10. A sample file management device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the method of any one of claims 1 to 7.
CN202011594962.6A 2020-12-29 2020-12-29 Sample file management method, device, equipment and medium Pending CN113254398A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011594962.6A CN113254398A (en) 2020-12-29 2020-12-29 Sample file management method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011594962.6A CN113254398A (en) 2020-12-29 2020-12-29 Sample file management method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN113254398A true CN113254398A (en) 2021-08-13

Family

ID=77181367

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011594962.6A Pending CN113254398A (en) 2020-12-29 2020-12-29 Sample file management method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN113254398A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115576903A (en) * 2022-11-16 2023-01-06 统信软件技术有限公司 File system construction method, computing device and storage medium
CN116089364A (en) * 2023-04-11 2023-05-09 山东英信计算机技术有限公司 Storage file management method and device, AI platform and storage medium

Similar Documents

Publication Publication Date Title
CN109144968B (en) Data distribution management system
US20050210007A1 (en) Document search methods and systems
US8073874B2 (en) Bit string searching apparatus, searching method, and program
US8190591B2 (en) Bit string searching apparatus, searching method, and program
US9454545B2 (en) Automated field position linking of indexed data to digital images
CN113254398A (en) Sample file management method, device, equipment and medium
JPH04248623A (en) Method and device for conducting single entity version control for source data
US8386526B2 (en) Coupled node tree backup/restore apparatus, backup/restore method, and program
CN110688349B (en) Document sorting method, device, terminal and computer readable storage medium
CN109522290A (en) A kind of HBase data block restores and data record extraction method
CN114461673A (en) Block chain query optimization method based on-chain and off-chain cooperation
CN111813849A (en) Data extraction method, device and equipment and storage medium
Pahade et al. A survey on multimedia file carving
CN111259017B (en) Order retrieval method, computer device, and storage medium
CN110096571B (en) Mechanism name abbreviation generation method and device and computer readable storage medium
CN108062323A (en) A kind of log read method and device
US8166043B2 (en) Bit strings search apparatus, search method, and program
US20140122491A1 (en) Systems and methods for authenticating and aiding in indexing of and searching for electronic files
CN114138403A (en) Mirror image storage and distribution platform
CN110457332B (en) Information processing method and related equipment
CN108415915A (en) A kind of proof of algorithm method and device based on bank note data
JP2011159256A (en) Method and program for reading visiting card
CN114328389B (en) Big data file analysis processing system and method under cloud computing environment
CN115544048B (en) Method and terminal for monitoring data change
US20200201847A1 (en) Semiconductor parts search method using last alphabet deletion algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination