CN112527757B

CN112527757B - Rapid retrieval method based on large-scale chip test result

Info

Publication number: CN112527757B
Application number: CN201910879802.7A
Authority: CN
Inventors: 谭坚; 蒋昊辰; 王丽一; 吴臻; 陈磊; 肖旵敏
Original assignee: Wuxi Jiangnan Computing Technology Institute
Current assignee: Wuxi Jiangnan Computing Technology Institute
Priority date: 2019-09-18
Filing date: 2019-09-18
Publication date: 2022-11-15
Anticipated expiration: 2039-09-18
Also published as: CN112527757A

Abstract

The invention discloses a rapid retrieval method based on large-scale chip test results, comprising the following steps: checking the directory, sorting the log files in the directory in ascending order by file name, and checking whether a check file exists; if it does, executing the next step, and otherwise putting all files in the directory into storage in order and writing the data list into the check file; writing the data list obtained from the directory into the checksum_new file; reading the check file in the directory; comparing the MD5 values of the two check files in the directory, checksum and checksum_new, and judging whether they are the same, and if so, ending the operation. By retrieving directories and files hierarchically, the method quickly locates updated directories or updated files and improves detection efficiency.

Description

Rapid retrieval method based on large-scale chip test result

Technical Field

The invention belongs to the technical field of computer processor testing, and particularly relates to a quick retrieval method based on a large-scale chip test result.

Background

Large-scale chip test and verification generates a large number of log files. To keep testing running smoothly, the logs must be retrieved and analyzed quickly and efficiently, and the summarized logs must be put into storage. When a large volume of logs is processed, speed depends on quickly locating the names of new logs and processing only those. In existing practice, however, test results are usually organized into directories created by resource information or by date, and the result logs generated for a given category or on a given day are stored in the corresponding directory. Over time this inevitably produces a huge number of test directories, each holding a large number of test result logs.

The collection and scanning of test results is usually deferred and is typically carried out by a tester after the current resource test has finished. To avoid duplicates, the following must be determined each time before logs are put into storage: whether new logs have been generated in each directory, whether a directory is itself newly created, and whether the content of existing logs has been updated. Because the result directories are numerous and large, checking every directory and file one by one spends a great deal of time on checking and comparison, test users experience long delays, and usability suffers badly. A technique is therefore urgently needed for quickly checking the result directories and reading only the log files in them that need processing.

Disclosure of Invention

The invention aims to provide a rapid retrieval method based on large-scale chip test results which, by retrieving directories and files hierarchically and checking the MD5 value of a temporary file that lists file sizes, quickly locates updated directories or updated files and improves detection efficiency.

To achieve this purpose, the invention adopts the following technical scheme: a rapid retrieval method based on large-scale chip test results, which uses an MD5 value to check a file in the format <file name, file byte size> and thereby determines whether there has been an update, characterized in that the method comprises the following steps:

S1. Check the test directory and determine whether a check file exists in the current test directory; if it does, jump to S3, and otherwise execute S2.

S2. Sort the log files in the current test directory in ascending order by file name and obtain the byte count of each log file to form a set of data list information; write the data list information into a temporary file, the checksum file, and jump to S5 to continue.

S3. Sort all log files in the current test directory in ascending order and obtain the byte count of each log file to form a set of data list information; write the data list information into a checksum_new file and place the checksum_new file in the current test directory.

S4. Compute the MD5 values of the checksum file and the checksum_new file in the current test directory and compare them; if the two MD5 values are the same, end the operation, and otherwise continue with the next step.

S5. Enter a subdirectory of the current test directory and check whether a check file exists in the current subdirectory; if it does, jump to S7, and otherwise jump to S6.

S6. Sort the log files in the current subdirectory in ascending order by file name and obtain the byte count of each log file to form a set of data list information; write the data list information into the checklist file, cache the file name list in a temporary file as the basis for result processing, and then jump to S9.

S7. Sort all log files in the current subdirectory in ascending order and obtain the byte count of each log file to form a set of data list information; write the data list information into a checklist_new file and place the checklist_new file in the current subdirectory.

S8. Compare the checklist file and the checklist_new file in the current subdirectory and check whether their MD5 values are the same. If they are, continue with the next step. Otherwise, obtain the file name lists from the checklist file and the checklist_new file, sort and de-duplicate them, and loop over the file names that differ between the two files: if a file's name and byte size are the same in the checklist file and the checklist_new file, the file has not been updated; otherwise the file has been updated, and its name is cached in the temporary file as the basis for result processing.

S9. Determine whether an unprocessed next-level subdirectory exists under the current subdirectory; if so, jump to S5, and otherwise end the operation.
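
The directory-level check in steps S1 to S4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the literal file names checksum and checksum_new, the comma-separated line layout, and the helper names build_listing, md5_of_file, and directory_changed are all chosen for the example.

```python
import hashlib
import os

CHECK_FILE = "checksum"          # assumed literal name of the stored list
NEW_CHECK_FILE = "checksum_new"  # assumed literal name of the fresh list


def build_listing(directory):
    """Return '<name>,<size>' lines for the log files in `directory`, sorted by name."""
    names = sorted(
        n for n in os.listdir(directory)
        if n not in (CHECK_FILE, NEW_CHECK_FILE)
        and os.path.isfile(os.path.join(directory, n))
    )
    return "".join(
        f"{n},{os.path.getsize(os.path.join(directory, n))}\n" for n in names
    )


def md5_of_file(path):
    """MD5 hex digest of a (small) list file; log contents are never read."""
    with open(path, "rb") as fh:
        return hashlib.md5(fh.read()).hexdigest()


def directory_changed(directory):
    """Return True if the directory needs further examination (S1 to S4)."""
    check_path = os.path.join(directory, CHECK_FILE)
    if not os.path.exists(check_path):
        # S2: first visit, so record the list and treat everything as new.
        with open(check_path, "w", encoding="utf-8") as fh:
            fh.write(build_listing(directory))
        return True
    # S3: write the fresh list next to the stored one.
    new_path = os.path.join(directory, NEW_CHECK_FILE)
    with open(new_path, "w", encoding="utf-8") as fh:
        fh.write(build_listing(directory))
    # S4: equal digests mean no file was added, removed, or resized.
    return md5_of_file(check_path) != md5_of_file(new_path)
```

The sketch deliberately leaves the stored checksum file untouched after a change is detected, because the steps above do not say when the stored list is replaced by the new one.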

Further improvements to the above technical scheme are as follows:

1. In the above scheme, the format of the data list information is <file name, byte size of the corresponding log file>.

2. In the above scheme, in S8, the checklist file and the checklist_new file in the directory are sorted and de-duplicated, and the data list information in the files, in the format <file name, byte size of the corresponding log file>, is checked. Retrieval is performed hierarchically by directory and then by file: if the first-level check finds that the directory has not been updated, no file-level retrieval is performed.

Owing to the above technical scheme, the invention has the following advantages over the prior art:

1. The invention provides a rapid retrieval method based on large-scale chip test results and proposes a technique that combines hierarchical retrieval with MD5 checking of a temporary file listing file sizes. An MD5 value is used to check the list information formed from log file names and file sizes, and update checking is carried out by retrieving the directory and the sizes of the result logs in it, so that updated directories or updated files are located quickly. A large amount of file-reading overhead is avoided, fast collection scanning is achieved, and the list of log files that need to be put into storage is located promptly, which saves time and improves management efficiency. At the same time, result logs already in storage are not stored again, changes to log file information caused by manual misoperation are identified, the specific updated log information is obtained accurately, and retrieval accuracy is high.

2. The rapid retrieval method based on large-scale chip test results has no environment-specific dependencies and requires no additional software or framework to be installed; it is simple and convenient to use and widely applicable.

Drawings

FIG. 1 is a schematic flow chart of the basic modules of the present invention.

Detailed Description

The invention is further described below with reference to the following examples:

Example: a rapid retrieval method based on large-scale chip test results, which uses an MD5 value to check a file in the format <file name, file byte size> and thereby determines whether there has been an update, characterized in that the method comprises the following steps:

S1. Check the test directory and determine whether a check file exists in the current test directory; if it does, jump to S3, and otherwise execute S2.

S4. Compute the MD5 values of the checksum file and the checksum_new file in the current test directory and compare them; if the two MD5 values are the same, end the operation, and otherwise continue with the next step.

S5. Enter a subdirectory of the current test directory and check whether a check file exists in the current subdirectory; if it does, jump to S7, and otherwise jump to S6.

S8. Compare the checklist file and the checklist_new file in the current subdirectory and check whether their MD5 values are the same. If they are, continue with the next step. Otherwise, obtain the file name lists from the checklist file and the checklist_new file, sort and de-duplicate them, and loop over the file names that differ between the two files: if a file's name and byte size are the same in the checklist file and the checklist_new file, the file has not been updated; otherwise the file has been updated, and its name is cached in the temporary file as the basis for result processing.

The format of the data list information is <file name, byte size of the corresponding log file>.

In S8, the checklist file and the checklist_new file in the directory are sorted and de-duplicated, and the data list information in the files, in the format <file name, byte size of the corresponding log file>, is checked. Retrieval is performed hierarchically by directory and then by file: if the first-level check finds that the directory has not been updated, no file-level retrieval is performed.

The above scheme is further explained as follows:

The invention addresses the practical situation in which there are many directories, each holding many log files. With so many files, verifying every file by its MD5 value takes a great deal of time.

To address this, the invention exploits the properties of the MD5 value while bypassing any check of the log content: only the byte count of each log file is checked. The byte counts of the log files in the directory and its subdirectories are collected into a temporary file; on each retrieval, the newly formed temporary file is compared with the one formed by the previous retrieval, and whether the MD5 values of the two files differ determines whether the content has been updated.

To improve retrieval speed while ensuring retrieval quality, the log files in the directory are sorted in ascending order by name and their byte counts are obtained in that order. This yields a set of list information in the format <file name, corresponding byte size>, which is written to a file; the file is MD5-checked and placed in the current directory under the name checksum.
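
As a small illustration of why the list is sorted before it is hashed, the following Python sketch builds the list text from (file name, byte size) pairs and digests it with MD5. The comma separator, the one-entry-per-line layout, and the function names listing_text and listing_digest are assumptions made for the example; the source only specifies the pair <file name, byte size>.

```python
import hashlib


def listing_text(entries):
    """Serialize (file_name, size_in_bytes) pairs, sorted by name, one per line."""
    return "".join(f"{name},{size}\n" for name, size in sorted(entries))


def listing_digest(entries):
    """MD5 hex digest of the serialized list; no log content is read."""
    return hashlib.md5(listing_text(entries).encode("utf-8")).hexdigest()


# The same files reported in a different order give the same digest,
# while a single changed size gives a different one.
first = [("run_002.log", 2048), ("run_001.log", 512)]
second = [("run_001.log", 512), ("run_002.log", 2048)]
grown = [("run_001.log", 512), ("run_002.log", 4096)]
same_digest = listing_digest(first) == listing_digest(second)
changed_digest = listing_digest(first) != listing_digest(grown)
```

Sorting makes the serialized text independent of the order in which the file system returns names, so two digests differ only when a name or a size differs.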

It should be noted that using MD5 to check the list of file sizes rests on the following consideration: logs are usually updated by appending, so when a log changes, the size of the resulting file changes in most cases. If the size is unchanged but the content has changed, the file was modified by a test administrator's misoperation; in that case the analysis of the file's log results is not affected, and the file does not need to be put into storage again. If an administrator's misoperation modifies more information, the size of the log file inevitably changes and the correctness of the test results in the log may be affected; the method identifies such a log and puts it into storage again.
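
The trade-off described above can be made concrete with a short Python sketch. It is only an illustration: the file name run.log, the log contents, and the helper names size_signature and demo are hypothetical. An append changes the <file name, byte size> pair, whereas an overwrite that keeps the byte count the same does not.

```python
import os
import tempfile


def size_signature(path):
    """The <file name, byte size> pair that the method records for a log."""
    return (os.path.basename(path), os.path.getsize(path))


def demo():
    """Return (append_detected, same_size_overwrite_detected)."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "run.log")  # hypothetical log name
        with open(path, "wb") as fh:
            fh.write(b"PASS\n")
        before = size_signature(path)

        with open(path, "ab") as fh:               # typical append-style update
            fh.write(b"FAIL\n")
        after_append = size_signature(path)

        with open(path, "wb") as fh:               # same length, different content
            fh.write(b"PASS\nPASS\n")
        after_overwrite = size_signature(path)

    return before != after_append, after_append != after_overwrite
```

Calling demo() under these assumptions shows the append being detected and the equal-length overwrite passing unnoticed, which is the case the text above treats as not requiring another storage pass.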

If the directory has not been retrieved before, all files in it are put into storage in order, and the list information formed is placed in the current directory under the name checksum. If the directory has been retrieved before, the list information formed this time is placed in checksum_new and the two are compared to determine whether the logs in the directory are the same. If they differ, the method enters the directory to find which files have been updated; otherwise there is no update and retrieval ends.

When a directory is found to contain updated files, the files in the directory are likewise sorted and their sizes obtained, and each file's size is compared with the recorded size. If the sizes are inconsistent, the file name is cached in the temporary file; if they are consistent, the file is skipped. This continues until every file in the directory has been checked, which yields the list of updated files in the directory. The list is returned to the user for subsequent processing and serves as the record of what needs to be put into storage again.
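
The file-level comparison described above can be sketched as a pure function over the two list texts. The comma-separated line layout and the names parse_listing and updated_files are assumptions for illustration. The sketch also assumes that a file present only in the old list (that is, a deleted log) is not reported, since there is nothing left to process; the source does not address that case.

```python
def parse_listing(text):
    """Parse '<name>,<size>' lines into a {name: size} mapping."""
    entries = {}
    for line in text.splitlines():
        if not line:
            continue
        # Split on the last comma so file names containing commas still parse.
        name, _, size = line.rpartition(",")
        entries[name] = int(size)
    return entries


def updated_files(old_text, new_text):
    """Names that are new, or whose byte size differs from the recorded one."""
    old = parse_listing(old_text)
    new = parse_listing(new_text)
    # Merge, de-duplicate, and sort the names found in either list.
    names = sorted(set(old) | set(new))
    return [n for n in names if n in new and old.get(n) != new[n]]


old_list = "a.log,10\nb.log,5\n"
new_list = "a.log,10\nb.log,9\nc.log,1\n"
result = updated_files(old_list, new_list)
```

With these two example lists, b.log is reported because its size grew and c.log because it is new, while a.log is skipped.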

Processing the directories hierarchically reduces retrieval time: a single MD5 check is enough to find out whether the subdirectories of the current directory have been updated, and if they have not, there is no need to enter them. Likewise, a single MD5 check determines whether all the subdirectories in a directory, and their next-level directories, have been updated; if not, no processing is needed. Hierarchical processing by directory thus eliminates unnecessary retrieval and improves retrieval efficiency.
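
The pruning effect of the hierarchical check can be sketched as a recursive walk that reports only the subdirectories whose list digest has changed, so that per-file comparison is confined to them. This is a simplified illustration under several assumptions: the check file is literally named checklist, the fresh list is compared in memory rather than written to a checklist_new file, the stored list is refreshed whenever a change is found, and the names listing_text, md5_text, and changed_directories are hypothetical.

```python
import hashlib
import os

CHECKLIST = "checklist"  # assumed literal name of the per-subdirectory check file


def listing_text(directory):
    """'<name>,<size>' lines for the log files directly inside `directory`."""
    lines = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name != CHECKLIST and os.path.isfile(path):
            lines.append(f"{name},{os.path.getsize(path)}\n")
    return "".join(lines)


def md5_text(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def changed_directories(directory, found=None):
    """Collect subdirectories (at any depth) that are new or whose list changed."""
    found = [] if found is None else found
    for name in sorted(os.listdir(directory)):
        sub = os.path.join(directory, name)
        if not os.path.isdir(sub):
            continue
        new_text = listing_text(sub)
        check_path = os.path.join(sub, CHECKLIST)
        old_text = None
        if os.path.exists(check_path):
            with open(check_path, encoding="utf-8") as fh:
                old_text = fh.read()
        # One digest comparison decides whether this subdirectory needs file-level work.
        if old_text is None or md5_text(old_text) != md5_text(new_text):
            found.append(sub)
            # Assumption: refresh the stored list so the next scan starts from here.
            with open(check_path, "w", encoding="utf-8") as fh:
                fh.write(new_text)
        changed_directories(sub, found)  # continue with next-level subdirectories
    return found
```

On an unchanged tree a second call returns an empty list, and each subdirectory costs one directory listing plus one digest comparison rather than a read of every log.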

The rapid retrieval method based on large-scale chip test results combines hierarchical retrieval with MD5 checking of a temporary file listing file sizes. An MD5 value is used to check the list information formed from log file names and file sizes, and update checking is carried out by retrieving the directory and the sizes of the result logs in it, so that updated directories or updated files are located quickly.

A large amount of file-reading overhead is avoided, fast collection scanning is achieved, and the list of log files that need to be put into storage is located promptly, which saves time and improves management efficiency.

At the same time, result logs already in storage are not stored again, changes to log file information caused by manual misoperation are identified, the specific updated log information is obtained accurately, and retrieval accuracy is high.

The method has no environment-specific dependencies and requires no additional software or framework to be installed; it is simple and convenient to use and widely applicable.

To facilitate a better understanding of the invention, the terms used herein will be briefly explained as follows:

MD5 value: a message digest (hash) value computed from a file's contents. Two files with different contents will in practice almost always have different digest values, although collisions are possible and MD5 cannot strictly guarantee this.

The above embodiments are merely illustrative of the technical ideas and features of the present invention, and the purpose thereof is to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and not to limit the protection scope of the present invention. All equivalent changes and modifications made according to the spirit of the present invention should be covered in the protection scope of the present invention.

Claims

1. A rapid retrieval method based on large-scale chip test results, which uses an MD5 value to check a file in the format <file name, file byte size> and thereby determines whether there has been an update, characterized in that the method comprises the following steps:

2. The rapid retrieval method based on large-scale chip test results according to claim 1, wherein the format of the data list information is <file name, byte size of the corresponding log file>.

3. The rapid retrieval method based on large-scale chip test results according to claim 1, wherein, in S8, the checklist file and the checklist_new file in the directory are sorted and de-duplicated, the data list information in the files, in the format <file name, byte size of the corresponding log file>, is checked, and retrieval is performed hierarchically by directory and then by file: if the first-level check finds that the directory has not been updated, no file-level retrieval is performed.