CN112527757B - Rapid retrieval method based on large-scale chip test result - Google Patents


Info

Publication number
CN112527757B
CN112527757B CN201910879802.7A CN201910879802A CN112527757B CN 112527757 B CN112527757 B CN 112527757B CN 201910879802 A CN201910879802 A CN 201910879802A CN 112527757 B CN112527757 B CN 112527757B
Authority
CN
China
Prior art keywords
file
directory
files
checklist
check
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910879802.7A
Other languages
Chinese (zh)
Other versions
CN112527757A (en)
Inventor
谭坚
蒋昊辰
王丽一
吴臻
陈磊
肖旵敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Jiangnan Computing Technology Institute
Original Assignee
Wuxi Jiangnan Computing Technology Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi Jiangnan Computing Technology Institute
Priority to CN201910879802.7A
Publication of CN112527757A
Application granted
Publication of CN112527757B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a rapid retrieval method based on large-scale chip test results, comprising the following steps: checking the directory, sorting the log files in the directory in ascending order by file name, and checking whether a check file exists; if it exists, executing the next step; otherwise, storing all files in the directory in sequence and writing the data list into the check file; writing the data list obtained from the directory into a checksum_new file; reading the check file in the directory; comparing the MD5 values of the two check files in the directory, checksum and checksum_new, and judging whether they are the same; if so, ending the operation. By retrieving the files in the directory hierarchically, the method quickly locates updated directories and updated files and improves detection efficiency.

Description

Rapid retrieval method based on large-scale chip test result
Technical Field
The invention belongs to the technical field of computer processor testing, and particularly relates to a quick retrieval method based on a large-scale chip test result.
Background
During large-scale chip testing and verification, a large number of log files are generated. To keep testing running smoothly, the logs must be retrieved and analyzed quickly and efficiently, and the summarized logs must be put into storage. When processing a large volume of logs, speed depends on quickly locating the names of new logs and processing only those. In existing practice, however, directories are usually created according to resource information or date, and the result logs generated for a given resource class or on a given day are stored in the corresponding directory. Over time this inevitably produces a huge number of test directories, each of which also holds a large number of test result logs.
The collection and scanning of test results is usually deferred and is typically performed by a tester after the current resource test has finished. To avoid duplication, the following must be determined before each storage operation: whether new logs have been generated under each directory, whether a directory is a newly created log directory, and whether the content of any log has been updated. Because the result directories are numerous and large, checking every directory and file one by one spends a great deal of time on checking and comparison, gives the test user the impression of long delays, and seriously affects use. A technical means of quickly checking the result directories and reading only the log files that need processing is therefore urgently needed.
Disclosure of Invention
The invention aims to provide a rapid retrieval method based on large-scale chip test results which, by retrieving the files in a directory level by level and applying MD5 verification to a temporary file holding the file list, quickly locates updated directories and updated files and improves detection efficiency.
To achieve this aim, the invention adopts the following technical solution: a rapid retrieval method based on large-scale chip test results, which applies MD5 verification to a file whose entries have the format <file name, file byte size> in order to determine whether there has been an update, characterized in that the method comprises the following steps:
S1. Check the test directory: check whether a check file exists in the current test directory; if it does, jump to S3; otherwise, execute S2;
S2. Sort the log files in the current test directory in ascending order by file name and obtain the byte count of each log file to form a group of data list information; write the data list information into a temporary file, the .checksum file; then jump to S5;
S3. Sort all log files in the current test directory in ascending order and obtain the byte count of each log file to form a group of data list information; write the data list information into a checksum_new file and place the checksum_new file in the current test directory;
S4. Perform MD5 verification on the checksum file and the checksum_new file in the current test directory and compare the two MD5 values; if they are the same, end the operation; otherwise, continue with the next step;
S5. Enter a subdirectory of the current test directory and check whether a check file exists in the current subdirectory; if it does, jump to S7; otherwise, jump to S6;
S6. Sort the log files in the current subdirectory in ascending order by file name and obtain the byte count of each log file to form a group of data list information; write the data list information into the checklist file; at the same time, cache the file name list in a temporary file as the basis for result processing; then jump to S9;
S7. Sort all log files in the current subdirectory in ascending order and obtain the byte count of each log file to form a group of data list information; write the data list information into a checklist_new file and place the checklist_new file in the current subdirectory;
S8. Compare the checklist file and the checklist_new file in the current subdirectory by checking whether their MD5 values are the same; if they are, continue with the next step; otherwise, obtain the file name lists from the checklist file and the checklist_new file, sort and de-duplicate them, and loop over the file names that differ between the two files: if the byte size recorded for a file is the same in the checklist file and the checklist_new file, the file has not been updated; otherwise the file has been updated, and its name is cached in the temporary file as the basis for result processing;
S9. Determine whether an unprocessed next-level subdirectory exists under the current subdirectory; if so, jump to S5; otherwise, end the operation.
Further improvements to the above technical solution are as follows:
1. In the above solution, the format of the data list information is <file name, byte size of the corresponding log file>.
2. In the above solution, in S8, the checklist file and the checklist_new file in the directory are sorted and de-duplicated, the data list information in the format <file name, byte size of the corresponding log file> in the files is checked, and retrieval proceeds hierarchically, first by directory and then by file; if the first-level check finds that the directory has not been updated, no file-level retrieval is performed.
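The directory-level check in steps S1 to S4 can be sketched as follows. This is only an illustrative reading of the steps, not the patented implementation: the helper names (`list_sizes`, `file_md5`, `directory_changed`) are hypothetical, every regular file other than the two check files is treated as a log file, the check files are assumed to be named `checksum` and `checksum_new`, and replacing `checksum` with the new listing after a mismatch is an assumption, since the text does not say what happens to `checksum_new` after the comparison.

```python
import hashlib
import os

# Assumed file names for the two check files.
CHECK, CHECK_NEW = "checksum", "checksum_new"


def list_sizes(directory: str) -> str:
    """Return '<filename>,<size>' lines, sorted by file name (ascending)."""
    names = sorted(
        n for n in os.listdir(directory)
        if n not in (CHECK, CHECK_NEW) and os.path.isfile(os.path.join(directory, n))
    )
    # Only file metadata is queried; the log contents are never read.
    return "".join(f"{n},{os.path.getsize(os.path.join(directory, n))}\n" for n in names)


def file_md5(path: str) -> str:
    """MD5 digest of a (small) check file."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def directory_changed(directory: str) -> bool:
    """S1-S4: return True if the directory needs further processing."""
    check = os.path.join(directory, CHECK)
    listing = list_sizes(directory)
    if not os.path.exists(check):                  # S1 -> S2: first visit
        with open(check, "w", encoding="utf-8") as f:
            f.write(listing)
        return True
    check_new = os.path.join(directory, CHECK_NEW)
    with open(check_new, "w", encoding="utf-8") as f:   # S3
        f.write(listing)
    if file_md5(check) == file_md5(check_new):     # S4: digests match, nothing to do
        os.remove(check_new)
        return False
    os.replace(check_new, check)                   # assumption: keep the newer listing
    return True
```

Under these assumptions, a first call on a directory returns `True` and creates the check file, a second call with no changes returns `False`, and appending to any log makes the next call return `True` again.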
Owing to the above technical solution, the invention has the following advantages over the prior art:
1. The invention provides a rapid retrieval method based on large-scale chip test results, together with a technique that retrieves hierarchically and applies MD5 verification to a temporary file holding the file size list. The MD5 value is used to check the list information formed from the log file names and file sizes, and update checking is achieved by examining the directory and the sizes of the result logs in it, so that updated directories and updated files are located quickly. This avoids a large amount of file-reading overhead, enables fast collection scanning, promptly locates the list of log files that need to be stored and processed, saves time, and improves management efficiency. It also avoids storing result logs that have already been stored, identifies changes in log file information caused by manual misoperation, and obtains exactly which logs were updated, giving high retrieval accuracy.
2. The rapid retrieval method based on large-scale chip test results has no environment-related dependencies, requires no additional software or framework to be installed, is simple and convenient to use, and is widely applicable.
Drawings
FIG. 1 is a schematic flow chart of basic modules of the present invention
Detailed Description
The invention is further described below with reference to the following examples:
example (b): a fast search method based on large-scale chip test result, adopt MD5 value check < file name, file byte size > file of format, check whether there is renewal, characterized by that: the method comprises the following steps:
s1, checking a test directory, and checking whether a check file exists in the current test directory or not, wherein if yes, the step of S3 is skipped to be executed, and if not, the step of S2 is executed;
s2, sequencing the log files in the current test directory in an ascending order according to file names, simultaneously acquiring the byte number of the log files to form a group of data list information, writing the data list information into a temporary file,. Checksum file, and jumping to S5 for continuation;
s3, sequencing all log files under the current test directory in an ascending order, simultaneously acquiring the byte number of the log files to form a group of data list information, writing the data list information into a checksum _ new file, and placing the checksum _ new file under the current test directory;
s4, performing MD5 value verification on the checksum file and the checksum _ new file under the current test directory, comparing whether the MD5 values of the two files are the same, if so, ending the operation, otherwise, continuing to execute the next step;
s5, entering a subdirectory of the current test directory, checking whether a check file exists under the current subdirectory, checking if the check file exists, and jumping to S7 for execution if the check file exists, or jumping to S6 for continuation;
s6, sorting the log files in the current subdirectory in an ascending order according to file names, simultaneously acquiring the byte number of the log files to form a group of data list information, writing the data list information into the checklist file, simultaneously caching the file name list into a temporary file to be used as a basis for result processing, and then jumping to S9;
s7, sequencing all log files under the current subdirectory in an ascending manner, simultaneously acquiring the byte number of the log files, forming a group of data list information, writing the data list information into a checklist _ new file, and placing the checklist _ new file under the current subdirectory;
s8, comparing the checklist file and the checklist _ new file in the current subdirectory, checking whether the MD5 values of the two files are the same, if so, continuing to execute the next step, otherwise, acquiring a file name list in the checklist file and the checklist _ new file, sequencing and removing duplication, and circularly checking the file name list with difference in the two files: if the size and the byte number content of the file are the same as each other in the checklist file and the checklist _ new file, the file is not updated; otherwise, the file name list is cached in the temporary file to be used as a basis for result processing, if the update exists;
and S9, judging whether the unprocessed next-level subdirectory exists under the current subdirectory, if so, skipping to execute S5, and otherwise, ending the operation.
The format of the data list information is < filename, byte size of the corresponding log file >.
In S8, sorting and de-duplicating the checklist file and the checklist _ new file in the directory, checking the data list information with the format of < file name, byte size of corresponding log file > in the file, and performing hierarchical retrieval according to the directory and the file, if the directory is not updated in the first level, not performing the file retrieval.
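The subdirectory loop in steps S5 to S9 might look like the sketch below. It is a simplified illustration under several assumptions that the text does not confirm: the function names are hypothetical, every regular file other than `checklist` and `checklist_new` counts as a log file, the new listing is kept in memory instead of being written to a `checklist_new` file, the stored `checklist` is refreshed after each comparison, and the "temporary file" of pending names is replaced by a returned dictionary.

```python
import hashlib
import os

CHECKLIST, CHECKLIST_NEW = "checklist", "checklist_new"  # names as given in the steps


def listing(directory: str) -> list[str]:
    """Sorted '<filename>,<size>' lines for the regular files directly in `directory`."""
    lines = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name not in (CHECKLIST, CHECKLIST_NEW) and os.path.isfile(path):
            lines.append(f"{name},{os.path.getsize(path)}")
    return lines


def md5_of(lines: list[str]) -> str:
    """MD5 digest of a listing (not of the log files themselves)."""
    return hashlib.md5("\n".join(lines).encode("utf-8")).hexdigest()


def scan_subdirectories(test_dir: str) -> dict[str, list[str]]:
    """S5-S9: map each subdirectory to the file names that still need processing."""
    pending = {}
    for current, _subdirs, _files in os.walk(test_dir):
        if current == test_dir:
            continue  # the top-level directory is covered by S1-S4
        check = os.path.join(current, CHECKLIST)
        new_lines = listing(current)
        if not os.path.exists(check):                    # S5 -> S6: first visit
            changed = [line.rsplit(",", 1)[0] for line in new_lines]
        else:                                            # S5 -> S7
            with open(check, encoding="utf-8") as f:
                old_lines = f.read().splitlines()
            if md5_of(old_lines) == md5_of(new_lines):   # S8: digests match
                continue                                 # skip the per-file comparison
            old = set(old_lines)
            # A line that is absent from the old listing is a new or resized file.
            changed = [line.rsplit(",", 1)[0] for line in new_lines if line not in old]
        with open(check, "w", encoding="utf-8") as f:    # assumption: refresh the listing
            f.write("\n".join(new_lines))
        if changed:
            pending[current] = changed
    return pending
```

The per-file comparison only runs for subdirectories whose listing digest has changed, which is where the time saving described in the steps comes from.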
The above-mentioned aspects of the invention are further explained as follows:
The invention addresses the practical situation in which a large number of directories exist, each containing a large number of log files; when the number of files is large, verifying every file by its MD5 value takes a great deal of time.
To address this, the invention exploits the properties of the MD5 value while bypassing any check of the log contents: only the byte count of each log file is checked. The byte counts of the log files under the directory and its subdirectories are collected into a temporary file; on each retrieval, this temporary file is compared with the one produced by the previous retrieval, and whether their MD5 values differ determines whether the content has been updated.
To improve retrieval speed while preserving retrieval quality, the log files in the directory are sorted in ascending order by name and their byte counts are obtained in that order. This yields a group of list information in the format <file name, corresponding byte size>, which is written to a file; an MD5 check is performed on this file, and the file is stored under the current directory under a fixed name.
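A minimal sketch of building such a list and hashing it is shown below. The function names and the comma-separated line layout are illustrative assumptions, and every regular file other than the check files is treated as a log file; the text only fixes the <file name, byte size> pairing and the ascending name order.

```python
import hashlib
import os

# Assumed names of the check files, which are left out of the listing.
CHECK_FILES = {"checksum", "checksum_new"}


def build_manifest(directory: str) -> str:
    """Return one '<filename>,<size>' line per regular file, sorted by name."""
    entries = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name in CHECK_FILES or not os.path.isfile(path):
            continue
        # os.path.getsize reads metadata only; the log content is never opened.
        entries.append(f"{name},{os.path.getsize(path)}\n")
    return "".join(entries)


def manifest_md5(manifest: str) -> str:
    """MD5 digest of the listing text."""
    return hashlib.md5(manifest.encode("utf-8")).hexdigest()
```

Sorting before hashing matters: the digest depends on line order, so two scans of an unchanged directory only produce the same MD5 value if the entries are always written in the same order.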
It should be noted that MD5 is applied to a list of file byte sizes for the following reason: logs are normally updated by appending, so when a log changes, the size of the resulting file changes in most cases. When the byte size is unchanged but the content has changed, the file has been modified through a test administrator's misoperation; this does not affect the analysis of the log's results, so the file does not need to be stored again. When an administrator's misoperation modifies more information, the size of the log file inevitably changes and the correctness of the test results in the log may be affected; the method identifies this case and stores the log again.
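The consequence of hashing sizes rather than contents can be illustrated with a small in-memory example. The file name and log lines are made up for illustration, and `manifest_digest` is a hypothetical helper that hashes sorted `<filename>,<size>` lines.

```python
import hashlib


def manifest_digest(sizes: dict[str, int]) -> str:
    """MD5 over sorted '<filename>,<size>' lines; file contents never enter the hash."""
    text = "".join(f"{name},{size}\n" for name, size in sorted(sizes.items()))
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Hypothetical log contents; only their lengths are used.
original = {"core0.log": len(b"cycle=1024\n")}
appended = {"core0.log": len(b"cycle=1024\ncycle=2048\n")}
same_size_edit = {"core0.log": len(b"cycle=1O24\n")}  # one character overwritten

append_detected = manifest_digest(original) != manifest_digest(appended)
overwrite_detected = manifest_digest(original) != manifest_digest(same_size_edit)
print(append_detected, overwrite_detected)
```

An appended log changes the digest, while an overwrite that keeps the byte count does not; this is the trade-off the paragraph above accepts in exchange for not reading the log contents.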
If the directory has not been retrieved before, all files under it are stored directly in sequence, and the list information formed is placed under the current directory and named checksum. If the directory has been retrieved before, the list information is placed in checksum_new and the two files are compared to determine whether the logs in the directory are the same; if they differ, the directory is entered to check which files have been updated; otherwise there is no update and the retrieval ends.
When a directory is found to contain updated files, the files in it are likewise sorted and their sizes obtained, and the byte sizes are compared for consistency: if a file's size is inconsistent, its name is returned and cached in the temporary file; if it is consistent, the file is skipped. Once all files in the directory have been checked, the list of updated files under the directory is obtained and returned to the user for subsequent processing, as the record of what needs to be stored as an update.
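The per-file comparison described above can be sketched on two listings held as text. The function names and the comma-separated layout are assumptions for illustration; files that appear only in the old listing (deleted logs) are ignored here, which is also an assumption, since the text does not discuss deletions.

```python
def parse_manifest(text: str) -> dict[str, int]:
    """Parse '<filename>,<size>' lines into a {filename: size} mapping."""
    sizes = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the last comma so file names containing commas still parse.
        name, size = line.rsplit(",", 1)
        sizes[name] = int(size)
    return sizes


def updated_files(old_text: str, new_text: str) -> list[str]:
    """Return the names that are new or whose recorded size changed, sorted."""
    old, new = parse_manifest(old_text), parse_manifest(new_text)
    # Union of both name lists, sorted and de-duplicated.
    names = sorted(set(old) | set(new))
    return [n for n in names if n in new and old.get(n) != new[n]]


old_listing = "a.log,10\nb.log,20\n"
new_listing = "a.log,10\nb.log,25\nc.log,5\n"
print(updated_files(old_listing, new_listing))  # ['b.log', 'c.log']
```

Here `a.log` is skipped because its size is unchanged, `b.log` is reported because its size grew, and `c.log` is reported because it is new.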
Processing the directories hierarchically reduces retrieval time: a single MD5 verification shows whether the subdirectories of the current directory have been updated, and if they have not, the directory does not need to be entered. Similarly, a single MD5 verification determines whether any subdirectory in the directory, or any next-level directory below it, has been updated; if not, no processing is needed. Hierarchical processing by directory thus reduces unnecessary retrieval and improves retrieval efficiency.
When the above rapid retrieval method based on large-scale chip test results is adopted, hierarchical retrieval is combined with MD5 verification of a temporary file holding the file size list: the MD5 value is used to check the list information formed from the log file names and file sizes, and update checking is achieved by examining the directory and the sizes of the result logs in it, so that updated directories and updated files are located quickly.
This avoids a large amount of file-reading overhead, enables fast collection scanning, promptly locates the list of log files that need to be stored and processed, saves time, and improves management efficiency.
It also avoids storing result logs that have already been stored, identifies changes in log file information caused by manual misoperation, and obtains exactly which logs were updated, giving high retrieval accuracy.
The method has no environment-related dependencies, requires no additional software or framework to be installed, is simple and convenient to use, and is widely applicable.
To facilitate a better understanding of the invention, the terms used herein will be briefly explained as follows:
MD5 value: a message digest computed from the contents of a file. Two files with different contents will in practice almost always have different digest values; collisions are possible but extremely unlikely to arise by accident.
The above embodiments are merely illustrative of the technical ideas and features of the present invention, and the purpose thereof is to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and not to limit the protection scope of the present invention. All equivalent changes and modifications made according to the spirit of the present invention should be covered in the protection scope of the present invention.

Claims (3)

1. A rapid retrieval method based on large-scale chip test results, which applies MD5 verification to a file whose entries have the format <file name, file byte size> in order to determine whether there has been an update, characterized in that the method comprises the following steps:
S1. Check the test directory: check whether a check file exists in the current test directory; if it does, jump to S3; otherwise, execute S2;
S2. Sort the log files in the current test directory in ascending order by file name and obtain the byte count of each log file to form a group of data list information; write the data list information into a temporary file, the .checksum file; then jump to S5;
S3. Sort all log files in the current test directory in ascending order and obtain the byte count of each log file to form a group of data list information; write the data list information into a checksum_new file and place the checksum_new file in the current test directory;
S4. Perform MD5 verification on the checksum file and the checksum_new file in the current test directory and compare the two MD5 values; if they are the same, end the operation; otherwise, continue with the next step;
S5. Enter a subdirectory of the current test directory and check whether a check file exists in the current subdirectory; if it does, jump to S7; otherwise, jump to S6;
S6. Sort the log files in the current subdirectory in ascending order by file name and obtain the byte count of each log file to form a group of data list information; write the data list information into the checklist file; at the same time, cache the file name list in a temporary file as the basis for result processing; then jump to S9;
S7. Sort all log files in the current subdirectory in ascending order and obtain the byte count of each log file to form a group of data list information; write the data list information into a checklist_new file and place the checklist_new file in the current subdirectory;
S8. Compare the checklist file and the checklist_new file in the current subdirectory by checking whether their MD5 values are the same; if they are, continue with the next step; otherwise, obtain the file name lists from the checklist file and the checklist_new file, sort and de-duplicate them, and loop over the file names that differ between the two files: if the byte size recorded for a file is the same in the checklist file and the checklist_new file, the file has not been updated; otherwise the file has been updated, and its name is cached in the temporary file as the basis for result processing;
S9. Determine whether an unprocessed next-level subdirectory exists under the current subdirectory; if so, jump to S5; otherwise, end the operation.
2. The rapid retrieval method based on large-scale chip test results according to claim 1, wherein the format of the data list information is <file name, byte size of the corresponding log file>.
3. The rapid retrieval method based on large-scale chip test results according to claim 1, wherein in S8 the checklist file and the checklist_new file in the directory are sorted and de-duplicated, the data list information in the format <file name, byte size of the corresponding log file> in the files is checked, and retrieval proceeds hierarchically, first by directory and then by file; if the first-level check finds that the directory has not been updated, no file-level retrieval is performed.
CN201910879802.7A 2019-09-18 2019-09-18 Rapid retrieval method based on large-scale chip test result Active CN112527757B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910879802.7A CN112527757B (en) 2019-09-18 2019-09-18 Rapid retrieval method based on large-scale chip test result

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910879802.7A CN112527757B (en) 2019-09-18 2019-09-18 Rapid retrieval method based on large-scale chip test result

Publications (2)

Publication Number Publication Date
CN112527757A (en) 2021-03-19
CN112527757B (en) 2022-11-15

Family

ID=74974973

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910879802.7A Active CN112527757B (en) 2019-09-18 2019-09-18 Rapid retrieval method based on large-scale chip test result

Country Status (1)

Country Link
CN (1) CN112527757B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115695504B (en) * 2023-01-03 2023-04-11 东方合智数据科技(广东)有限责任公司 Internet of things platform communication method, device, equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100174690A1 (en) * 2009-01-08 2010-07-08 International Business Machines Corporation Method, Apparatus and Computer Program Product for Maintaining File System Client Directory Caches with Parallel Directory Writes
CN109933570A (en) * 2019-03-15 2019-06-25 中山大学 A kind of metadata management method, system and medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
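《Integrated test data decompression and core wrapper design for low-cost system-on-a-chip testing》; P.T. Gonciari et al.; 《Proceedings. International Test Conference》; 20021231; full text *
《高性能计算机芯片测试技术概述》 (Overview of High-Performance Computer Chip Testing Technology); 梁斌 (Liang Bin) et al.; 《现代交际》 (Modern Communication); 20161031; full text *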

Also Published As

Publication number Publication date
CN112527757A (en) 2021-03-19

Similar Documents

Publication Publication Date Title
CN110569214B (en) Index construction method and device for log file and electronic equipment
RU2565109C2 (en) Method and apparatus for recovering backup database
US9015214B2 (en) Process of generating a list of files added, changed, or deleted of a file server
CN111400724B (en) Operating system vulnerability detection method, system and medium based on code similarity analysis
CN111459799B (en) Software defect detection model establishing and detecting method and system based on Github
CN110855748B (en) Remote sensing image data automatic standardized processing method, device and medium based on FTP
CN105335246B (en) A kind of program crashing defect self-repairing method based on question and answer web analytics
CN110489701A (en) Extract the method, apparatus and CMS recognition methods of CMS identification feature
CN111881455A (en) Firmware security analysis method and device
CN112328499A (en) Test data generation method, device, equipment and medium
CN110795614A (en) Index automatic optimization method and device
CN112445997A (en) Method and device for extracting CMS multi-version identification feature rule
CN103177022A (en) Method and device of malicious file search
CN112527757B (en) Rapid retrieval method based on large-scale chip test result
CN112068981A (en) Knowledge base-based fault scanning recovery method and system in Linux operating system
CN107590233B (en) File management method and device
CN111460255A (en) Music work information data acquisition and storage method
CN105117462A (en) Sensitive word checking method and device
CN112328379A (en) Application migration method, device, equipment and medium
CN112698861A (en) Source code clone identification method and system
CN117453646A (en) Kernel log combined compression and query method integrating semantics and deep neural network
CN112363904A (en) Log data analysis positioning method and device and computer readable storage medium
KR101688629B1 (en) Method and apparatus for recovery of file system using metadata and data cluster
CN115080114B (en) Application program transplanting processing method, device and medium
CN104714956A (en) Comparison method and device for isomerism record sets

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant