CN112416787A - JAVA-based project source code scanning analysis method, system and storage medium - Google Patents

JAVA-based project source code scanning analysis method, system and storage medium

Info

Publication number
CN112416787A
Authority
CN
China
Prior art keywords
file
source code
project source
scanning
analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011362102.XA
Other languages
Chinese (zh)
Inventor
刘鑫宇
刘浩
冯辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Puhui Enterprise Management Co Ltd
Original Assignee
Ping An Puhui Enterprise Management Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Puhui Enterprise Management Co Ltd
Priority to CN202011362102.XA
Publication of CN112416787A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3604 Software analysis for verifying properties of programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Fuzzy Systems (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Stored Programmes (AREA)

Abstract

The invention relates to big data processing technology and discloses a JAVA-based project source code scanning analysis method. The method comprises: scanning the project source code of a project source code file in parallel with multiple threads, namely a front-end thread, a back-end thread and an SQL thread, to acquire the required key information; structuring the acquired key information to generate a data object, and persistently storing the generated data object; and writing corresponding statistical logic for the persistently stored database data and JSON file according to different statistical analysis requirements, and generating an analysis result. The invention also relates to blockchain technology, in which the data is stored in a blockchain. The method associates code analysis with the database, obtains sufficient project information, and displays the analysis result in multiple forms, thereby improving the efficiency of source code analysis.

Description

JAVA-based project source code scanning analysis method, system and storage medium
Technical Field
The invention relates to big data processing, and in particular to a JAVA-based project source code scanning analysis method, system and storage medium.
Background
In a project with a large code base, analyzing the Java project code and querying its call relationships is only possible after loading a development environment (IDE), completing its configuration and compiling the project in the IDE. If quality checking is instead carried out with a third-party tool (such as a firm tool), the check is limited to the coding validity and compliance of individual pieces of code, and does not include dynamic statistics on how the classes, methods and variables of a version are called. This lack of comprehensiveness greatly limits the usefulness of the project code analysis results.
Existing code scanning tools have the following disadvantages:
1) they can only scan byte code or source code, and cannot analyze, record and output the structure of the code;
2) they can only judge the compliance of a single class of the project code; they cannot make linked judgments across associated projects, and cannot generate a relationship map for the project code as a whole.
Therefore, a code analysis method that can perform a comprehensive scan analysis of the code is needed.
Disclosure of Invention
The invention provides a JAVA-based project source code scanning analysis method, a JAVA-based project source code scanning analysis system and a computer-readable storage medium, which mainly solve the problem that code analysis is not associated with a database.
In order to achieve the above object, the present invention provides a JAVA-based project source code scanning analysis method, applied to an electronic device, the method comprising:
acquiring a project source code file;
multithreading parallel scanning the project source code of the project source code file to obtain the required key information; the multithreading comprises a front-end thread, a back-end thread and an SQL thread, and the key information at least comprises a clicking or jumping event component number, event content description information, constant class information in a file, an operation data table, a field and operation type information;
carrying out data structure processing on the acquired key information to generate a data object, and carrying out persistent storage on the generated data object; wherein the persistent storage comprises direct storage as database data and storage as JSON files;
and compiling and formulating corresponding data statistical logic for the database data and the JSON file which are persistently stored according to different statistical analysis requirements, and generating an analysis result.
Further, optionally, after the step of obtaining the project source code file, the method further includes:
judging whether a network or a database exists;
if it is determined that no database or network is available, loading the locally cached JSON files, and then traversing, screening and matching them according to a preset dependency-relationship execution order;
regenerating the data object of the B+ tree structure, and persistently storing the regenerated data object of the B+ tree structure as a JSON file;
and writing corresponding statistical logic for the stored JSON file according to different statistical analysis requirements, and generating an analysis result.
Further, optionally, the preset dependency-relationship execution order is, in sequence: front-end HTML/JS/JSP files, back-end JAVA files, and SQL files.
Further, optionally, the step of multithread parallel scanning the project source codes of the project source code file includes:
performing recursive traversal on the acquired project source code file, and dynamically identifying the type of the project source code file;
and scanning each project source code file according to the rules corresponding to its type.
Further, optionally, a file verification procedure is performed before the step of writing corresponding statistical logic for the persistently stored database data and JSON file according to different statistical analysis requirements and generating the analysis result; the file verification procedure includes:
adding a file identifier to the stored JSON file; the file identifier comprises a start marker, an end marker and a byte count;
checking, according to the file identifier, whether the JSON file corresponds to the current file to be processed;
and if the JSON file passes the check, writing corresponding statistical logic for the JSON file and generating an analysis result.
In order to achieve the above object, the present invention further provides a JAVA-based project source code scanning analysis system, which includes a project source code file obtaining unit, a multithreading scanning unit, a persistent storage unit, and an analysis result generating unit; the project source code file acquisition unit is used for acquiring a project source code file; the multithreading scanning unit is used for multithreading parallel scanning of the project source codes of the project source code file to acquire required key information; the multithreading comprises a front-end thread, a back-end thread and an SQL thread, and the key information at least comprises a clicking or jumping event component number, event content description information, constant class information in a file, an operation data table, a field and operation type information; the persistent storage unit is used for carrying out data structure processing on the acquired key information to generate a data object and carrying out persistent storage on the generated data object; wherein the persistent storage comprises direct storage as database data and storage as JSON files; and the analysis result generation unit is used for compiling and formulating corresponding data statistical logic for the database data and the JSON file which are persistently stored according to different statistical analysis requirements to generate an analysis result.
Further, optionally, the multithreading scanning unit includes a project source code file traversal module, a file type identification module, a multithreading parallel scanning module, and a key information acquisition module; the project source code file traversal module is used for recursively traversing the acquired project source code file; the file type identification module is used for dynamically identifying the type of each traversed project source code file; the multithreading parallel scanning module is used for scanning each file according to the rules corresponding to its type; and the key information acquisition module is used for acquiring key information from the scanning result.
Further, optionally, the analysis result generating unit includes a file verification module, a data statistical logic formulating module, and an analysis result generating module; the file verification module is used for verifying whether the current JSON file corresponds to the file to be processed; the data statistical logic formulating module is used for writing corresponding statistical logic for the JSON file according to different statistical analysis requirements; and the analysis result generating module is used for generating an analysis result.
To achieve the above object, the present invention also provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores a program executable by the at least one processor, the program being executable by the at least one processor to enable the at least one processor to perform the JAVA based project source code scan analysis method as described above.
In addition, to achieve the above object, the present invention further provides a computer-readable storage medium storing a computer program, wherein the computer program is configured to implement the steps of the JAVA-based project source code scan analysis method when being executed by a processor.
According to the JAVA-based project source code scanning analysis method, system, electronic device and computer-readable storage medium provided by the invention, the project source code of a project source code file is scanned in parallel by multiple threads, namely a front-end thread, a back-end thread and an SQL thread, and the required key information is acquired; the acquired key information is structured to generate a data object, and the generated data object is persistently stored; and corresponding statistical logic is written for the persistently stored database data and JSON file according to different statistical analysis requirements, and an analysis result is generated. The beneficial effects are as follows:
1) the project source code scanning analysis method does not require the code to be compiled into class byte-code files before the code structure and code quality are analyzed; the java source files can be parsed and scanned directly, without depending on an IDE (integrated development environment), without compiling byte-code files, and without setting up a separate environment for the project. Once the source code is obtained it can be scanned and analyzed, so the source code parsing and scanning function is not limited to code quality checking and vulnerability scanning;
2) the project source code scanning analysis method can simultaneously analyze the front-end page code, the background JAVA code and the SQL (structured query language) file of the persistent layer, realize the dynamic analysis of the scanning result and improve the code analysis efficiency;
3) the project source code scanning analysis method can scan codes and code hierarchical structures of projects, can effectively analyze code complexity, code calling relations, code operation database table fields, and displays analysis results in visual modes such as a data table;
4) the project source code scanning and analyzing method provided by the invention overcomes the defects that a code scanning tool is not associated with a database and a code analysis result is displayed in a file form in the prior art, not only can be associated with the database, but also enough project information can be obtained, and a multi-form display mode such as a data report is provided.
Drawings
FIG. 1 is a flowchart illustrating a JAVA-based project source scan analysis method according to an embodiment of the present invention;
FIG. 2 is a data summary table diagram of the scanning analysis result of the JAVA-based project source code scanning analysis method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the logical structure of the JAVA-based project source code scanning analysis system of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the invention;
The implementation, functional features and advantages of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In order to improve the coding efficiency of a user, the invention provides a JAVA-based project source code scanning analysis method. Fig. 1 shows a flow of an embodiment of the JAVA-based project source code scan analysis method of the present invention. Referring to FIG. 1:
Scanning links the dependency relationships between the front end and the back end of a project in series, associates all interface information of the Web application front end that interacts with the database with the fields of the actual database, analyzes the usage of the live fields of the system's database tables, and analyzes invalid code in the system, thereby achieving a comprehensive scan analysis of the project files. This removes problems such as the environment dependence of the code being examined and improves the efficiency of related development.
Specifically, the JAVA-based project source code scanning analysis method of the present invention includes the following steps S110 to S140:
S110, acquiring a project source code file.
It should be noted that, in the JAVA-based project source code analysis method, the scan engine is written in the JAVA language and performs static scanning of HTML, JS, JSP, JAVA and MyBatis XML files. Accordingly, the types of project source code in the project source code file include pre-system code, post-system code, back-end code, and persistence layer code.
The specific acquisition process is to obtain the actual path of the current project and then obtain the project configuration file information. The actual path of the current project is acquired first, for example /D:/jboss-4.2.2.GA/server/default/hp.war; the corresponding project configuration file information (namely the project source code file) is then obtained through that path.
S120, multithreading parallel scanning of the project source codes of the project source code file is performed to obtain required key information; the multithreading comprises a front-end thread, a back-end thread and an SQL thread, and the key information at least comprises a clicking or jumping event component number, event content description information, constant class information in a file, an operation data table, a field and operation type information.
The multithreaded parallel scanning of the project source code is implemented with static code analysis technology, and the key information of each file is obtained through the scan. The key information comprises the component numbers of click or jump events, event content description data, all constant classes in a file, and the operated data tables, fields and operation type information.
Static code analysis is a code analysis technique that scans program code by means of lexical analysis, syntax analysis, control flow analysis, data flow analysis and the like, and verifies whether the code meets criteria such as conformance to standards, security, reliability and maintainability.
The static code analysis includes: lexical analysis, syntax analysis, abstract syntax tree analysis, semantic analysis, control flow analysis, data flow analysis, and taint analysis.
Lexical analysis reads each character of the source code from left to right, scans the character stream that makes up the program, and generates the related symbol list by converting the source code into an equivalent symbol (Token) stream using regular expression matching.
Syntax analysis judges whether the structure of the source program is correct, and organizes the related symbols into a syntax tree using a context-free grammar.
Abstract syntax tree analysis organizes the program into a tree structure in which the nodes represent the related code in the program. The JavaParser abstract syntax tree generation tool, which is developed in JAVA, is used here.
Semantic analysis examines the context-dependent properties of a structurally correct source program.
Control flow analysis generates a directed control flow graph in which nodes represent basic code blocks, directed edges between nodes represent control flow paths, and back edges represent possible loops; a function call graph can also be generated to represent the call relationships among functions.
Data flow analysis traverses the control flow graph, records the initialization points and reference points of variables, and stores the related slice data.
Taint analysis determines, based on the data flow graph, which variables in the source code may be attacked; it is key to validating program input and identifying code defects.
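The lexical analysis step described above (turning source text into a token stream by regular expression matching) can be sketched as follows. This is a minimal illustration only: the class name `SimpleLexer` and the three token categories are assumptions made for the example, and it is not the scan engine of the invention, which relies on JavaParser for real Java source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration: a tiny regex-driven lexer, not the patent's scan engine.
final class SimpleLexer {
    // Named groups define the (assumed) token categories; alternatives are tried left to right.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<WS>\\s+)|(?<NUMBER>\\d+)|(?<IDENT>[A-Za-z_][A-Za-z0-9_]*)|(?<SYMBOL>[^\\sA-Za-z0-9_])");

    // Reads the source left to right and returns "CATEGORY:text" entries.
    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            if (m.group("WS") != null) {
                continue; // whitespace separates tokens but is not a token itself
            }
            if (m.group("NUMBER") != null) {
                tokens.add("NUMBER:" + m.group());
            } else if (m.group("IDENT") != null) {
                tokens.add("IDENT:" + m.group());
            } else {
                tokens.add("SYMBOL:" + m.group());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("int x = 42;"));
    }
}
```

A production lexer would also distinguish keywords, string literals and comments; the sketch only shows the regular-expression-to-token-stream idea.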
The step of multithreaded parallel scanning of the project source code of the project source code file comprises: S121, recursively traversing the acquired project source code file and dynamically identifying the type of each project source code file; and S122, scanning each file according to the rules corresponding to its type.
Specifically, the file information of the files to be scanned in the source code path folder is traversed recursively, and all files to be scanned are divided into three types: front-end files, back-end files and SQL files. The front-end files (HTML/JS/JSP files), back-end files (java files) and SQL files (xmlMapper files) are then scanned in parallel without interfering with one another.
In step S121, a recursive search is first performed over the files, and all files to be scanned are retrieved for scanning analysis. The recursion relation is F(n) = F(n-2) + F(n-1), and the recursion exit is F(1) = 1; the retrieved objects are all the files to be scanned in the source code path folder. It should be noted that the recursive search described here refers to querying, by reverse recursive query, the fields of the scanned system database tables or the front-end pages involved in the operations.
In step S122, each file is scanned according to the rules corresponding to its type. The scan proceeds in the order of the project root directory; that is, for each of the three file types scanned in parallel, the files are scanned sequentially in root-directory order. Files with different suffixes receive different scan processing: if the suffix is JS, front-end page processing is performed, and if the suffix is Java, back-end code processing is performed.
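Steps S121 and S122 can be sketched as follows: walk the source folder recursively, classify each file by its suffix, and hand each of the three groups to its own worker thread. This is a minimal sketch under stated assumptions: the names `ParallelScanSketch`, `FileKind`, `classify`, `collect` and `scan` are hypothetical, every `.xml` file is treated as a MyBatis mapper, and the per-group "scan" is a placeholder that only counts files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Stream;

// Hypothetical sketch of S121/S122; names and the suffix table are assumptions.
final class ParallelScanSketch {
    enum FileKind { FRONT_END, BACK_END, SQL, OTHER }

    // Identify the file type from its suffix (simplification: any .xml counts as a mapper).
    static FileKind classify(String fileName) {
        String n = fileName.toLowerCase(Locale.ROOT);
        if (n.endsWith(".html") || n.endsWith(".js") || n.endsWith(".jsp")) return FileKind.FRONT_END;
        if (n.endsWith(".java")) return FileKind.BACK_END;
        if (n.endsWith(".xml")) return FileKind.SQL;
        return FileKind.OTHER;
    }

    // S121: recursively traverse the source root and group regular files by kind.
    static Map<FileKind, List<Path>> collect(Path root) throws IOException {
        Map<FileKind, List<Path>> groups = new EnumMap<>(FileKind.class);
        for (FileKind k : FileKind.values()) groups.put(k, new ArrayList<>());
        try (Stream<Path> paths = Files.walk(root)) {
            paths.filter(Files::isRegularFile).sorted()
                 .forEach(p -> groups.get(classify(p.getFileName().toString())).add(p));
        }
        return groups;
    }

    // S122: one worker per file kind; each worker processes its own files in order.
    static Map<FileKind, Integer> scan(Map<FileKind, List<Path>> groups) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            Map<FileKind, Future<Integer>> pending = new EnumMap<>(FileKind.class);
            for (FileKind k : new FileKind[] {FileKind.FRONT_END, FileKind.BACK_END, FileKind.SQL}) {
                List<Path> files = groups.get(k);
                // Placeholder scan: counts files; a real rule set per kind would go here.
                Callable<Integer> task = () -> files.size();
                pending.put(k, pool.submit(task));
            }
            Map<FileKind, Integer> result = new EnumMap<>(FileKind.class);
            for (Map.Entry<FileKind, Future<Integer>> e : pending.entrySet()) {
                result.put(e.getKey(), e.getValue().get());
            }
            return result;
        } catch (InterruptedException | ExecutionException ex) {
            throw new IllegalStateException("scan failed", ex);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(scan(collect(root)));
    }
}
```

Giving each file kind its own thread keeps the three rule sets independent, which matches the description of the scans running in parallel without interfering with one another.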
In step S122, after each file has been scanned according to the rules corresponding to its type, the key information is acquired. The key information described here includes the relevant files, classes and methods, as well as the data tables and fields operated on by SQL.
Specifically, the scans for the different file types include HTML file scanning, JS/JSP file scanning, JAVA file scanning and MyBatis mapper XML file scanning.
That is, static code analysis is performed with the rules corresponding to the type of the project source code file. Front-end scanning comprises HTML file scanning and JS/JSP file scanning.
HTML file scanning acquires information such as the component numbers and descriptions of all click or jump events. Specifically, all interaction event components (such as click, href and the like) in the HTML file, and all JS file information referenced in the HTML file, are scanned and acquired with the Jsoup tool.
JS/JSP file scanning acquires all click or jump event information (the URL that interacts with the back end, and the input and output parameter information). Specifically, the JS code is scanned with preset regular expressions and Java's ScriptEngine, and the regular expressions are used to extract, in batch, all event information in the JSP that interacts with the back end (method names, URLs of back-end interactions and the like).
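The regular-expression part of the JS/JSP scan can be sketched as follows. The patent does not disclose its preset expressions, so the pattern below is an assumption: it supposes that back-end calls appear in the JS source as a `url: '...'` property (as in a typical AJAX options object). The class name `JsUrlScanSketch` and the sample URL are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: extract back-end URLs from JS source with a preset regex.
final class JsUrlScanSketch {
    // Assumed convention: back-end calls look like  url: "..."  or  url: '...'.
    // Group 1 captures the opening quote, \1 requires the same closing quote.
    private static final Pattern URL_PROP =
        Pattern.compile("\\burl\\s*:\\s*(['\"])(.*?)\\1");

    static List<String> findBackendUrls(String jsSource) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL_PROP.matcher(jsSource);
        while (m.find()) {
            urls.add(m.group(2)); // the text between the quotes
        }
        return urls;
    }

    public static void main(String[] args) {
        String js = "$.ajax({ url: '/loan/query.do', type: 'POST' });";
        System.out.println(findBackendUrls(js));
    }
}
```

URLs assembled by string concatenation or stored in variables would not be found by such a pattern, which is presumably why the description pairs regular expressions with a script engine.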
For JAVA file scanning, the java code is first scanned through the syntax tree; all constant classes and enumeration information in the project are then obtained, together with the methods in every class and the key information within those methods. Specifically, the java source code is statically scanned by the JavaParser scanning engine, which descends into the source code layer by layer (class → method body content); based on lexical and syntactic analysis of the java code combined with regular expressions, the relationship information of the java methods is obtained, such as member variables, static method calls, private method calls and interface calls.
For MyBatis mapper XML file scanning, the XML file is scanned first to acquire the SQLID information and the corresponding SQL statement blocks; the SQL code is then scanned to obtain the operated data table information, field information and operation type information. Specifically, the XML file is scanned with JDOM, an open-source JAVA component; all SQLID information is acquired; all SQL information under the corresponding SQLID is acquired; because reference relations exist among the SQLIDs, all referenced SQL is merged into the SQL block of the same SQLID; the SQL is then scanned with the open-source SQLParser tool to obtain information such as the data tables, data fields and operation type of the current SQLID operation.
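The merging of referenced SQL into the SQL block of one SQLID can be sketched as follows. To keep the example self-contained it uses the JDK's built-in DOM parser rather than JDOM, and it derives the operation type from the statement's tag name instead of calling an SQL parser; both are substitutions made for illustration. The class name `MapperMergeSketch` and the sample mapper content are hypothetical, and the sample omits the DOCTYPE that real mapper files carry.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical sketch: inline <include refid="..."/> fragments into each statement.
final class MapperMergeSketch {
    // Returns SQL ID -> "OPERATION: merged SQL text".
    static Map<String, String> mergedStatements(String mapperXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(mapperXml.getBytes(StandardCharsets.UTF_8)));
            // Pass 1: collect reusable <sql id="..."> fragments.
            Map<String, String> fragments = new LinkedHashMap<>();
            NodeList sqlNodes = doc.getElementsByTagName("sql");
            for (int i = 0; i < sqlNodes.getLength(); i++) {
                Element e = (Element) sqlNodes.item(i);
                fragments.put(e.getAttribute("id"), e.getTextContent());
            }
            // Pass 2: rebuild each statement, replacing includes with the fragment text.
            Map<String, String> result = new LinkedHashMap<>();
            for (String tag : new String[] {"select", "insert", "update", "delete"}) {
                NodeList stmts = doc.getElementsByTagName(tag);
                for (int i = 0; i < stmts.getLength(); i++) {
                    Element stmt = (Element) stmts.item(i);
                    StringBuilder sql = new StringBuilder();
                    NodeList children = stmt.getChildNodes();
                    for (int j = 0; j < children.getLength(); j++) {
                        Node c = children.item(j);
                        if (c.getNodeType() == Node.COMMENT_NODE) {
                            continue; // XML comments are not part of the SQL
                        }
                        if (c instanceof Element && "include".equals(c.getNodeName())) {
                            sql.append(fragments.getOrDefault(((Element) c).getAttribute("refid"), ""));
                        } else {
                            sql.append(c.getTextContent());
                        }
                    }
                    String normalized = sql.toString().trim().replaceAll("\\s+", " ");
                    result.put(stmt.getAttribute("id"), tag.toUpperCase(Locale.ROOT) + ": " + normalized);
                }
            }
            return result;
        } catch (Exception ex) {
            throw new IllegalArgumentException("cannot parse mapper XML", ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<mapper><sql id=\"cols\">id, name</sql>"
            + "<select id=\"findUser\">select <include refid=\"cols\"/> from t_user</select></mapper>";
        System.out.println(mergedStatements(xml));
    }
}
```

The sketch resolves only one level of references; fragments that themselves include other fragments, and includes nested inside dynamic tags, would need a recursive pass.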
The scan is divided into three parallel threads, one each for the front-end (HTML/JS/JSP) files, the back-end java files and the SQL (MyBatis xmlMapper) files, and different rules are used to identify and scan the different types of files. Scanning in this multithreaded parallel manner improves the scanning speed.
In summary, the three types of scan (front-end HTML/JS/JSP file scanning, back-end JAVA file scanning, and SQL MyBatis mapper XML file scanning) run as three parallel threads. Scanning links the dependency relationships between the front end and the back end of the project in series, associates all interface information of the WEB application front end that interacts with the database with the fields of the actual database, analyzes the usage of the live fields of the system's database tables, and analyzes invalid code in the system, thereby achieving a comprehensive scan analysis of the project files.
S130, performing data structure processing on the acquired key information to generate a data object, and performing persistent storage on the generated data object; wherein the persistent storage comprises direct storage as database data and storage as JSON files.
In other words, after the files are scanned, the values of all related constant classes and the information of the enumeration classes are persistently stored as data objects; the specific data object is a B+ tree. That is, the results of retrieving and analyzing the file content are stored in the corresponding storage object as a B+ tree.
The scanned project code is thus processed to obtain the key information, which is then persistently stored in one of the following two ways:
1. direct storage as database data, namely in an existing Oracle database; the advantage of database storage is that reports and analysis data can later be generated dynamically from the data;
2. storage as a JSON file; the advantage is that the method can run independently of a database, so scanning can still be performed in special scenarios where no database is available. JSON file storage supplements the cases where the environment has no database or the network conditions are not met, and enables stand-alone scan analysis.
Specifically, the system scanning engine stores the scan results in the database or in a JSON file, so that in step S140 corresponding statistical logic can be written for the JSON file according to different statistical analysis requirements and an analysis result generated.
In the specific implementation, a feature value is added to each scan result so that front-end and back-end results can be associated. Specifically, the feature value is the file's relative path within the project plus the event number, and it serves as the associating foreign key. This is the data structuring step; the generated data objects must include the feature value so that the operations in the overall scanning process stay consistent with one another. Storing data objects as JSON files is prior art, but the structure of the data within the JSON file is custom-defined according to the system requirements.
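The feature value described above (relative path of the file within the project plus the event number) can be sketched as a small helper. The separator `#`, the class name `FeatureKeySketch` and the sample paths are assumptions for illustration; the patent does not state how the two parts are joined.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch: build the associating foreign key for one scan result.
final class FeatureKeySketch {
    // Key = path relative to the project root + "#" + event number (separator is assumed).
    static String featureKey(Path projectRoot, Path file, String eventNumber) {
        // Normalize separators so the key is the same on Windows and Unix.
        String relative = projectRoot.relativize(file).toString().replace('\\', '/');
        return relative + "#" + eventNumber;
    }

    public static void main(String[] args) {
        Path root = Paths.get("/proj");
        System.out.println(featureKey(root, Paths.get("/proj/web/index.html"), "btn01"));
    }
}
```

Using the relative rather than absolute path keeps the key stable when the same project is scanned from a different checkout location.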
In a specific embodiment, after the step of acquiring the project source code file, the method further comprises judging whether a database or a network is available; if it is determined that no database or network is available, loading the locally cached JSON files, and then traversing, screening and matching them according to a preset dependency-relationship execution order; regenerating the data object of the B+ tree structure, and persistently storing the regenerated data object of the B+ tree structure as a JSON file; and writing corresponding statistical logic for the stored JSON file according to different statistical analysis requirements to generate an analysis result.
That is, when there is no database or the network condition is poor, after the various types of data have been scanned and identified, they are fetched one by one by traversing in the preset dependency-relationship execution order. The execution order is front-end HTML/JS/JSP files, back-end JAVA files and SQL files (HTML/JS/JSP → JAVA class → method → MyBatis mapper XML file → SQL → database tables and fields). All data objects are locally cached JSON files; they are loaded into memory, traversed, screened and matched, and the B+ tree data object for the overall structure is regenerated. The resulting overall relation tree is then persisted to the local environment and stored as a file.
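The traversal and matching along the dependency order (front-end event → back-end method → SQL ID → database table) can be sketched with plain in-memory maps standing in for the loaded JSON data. All names here (`DependencyChainSketch`, `tablesForEvent`, the map parameters and the sample keys) are hypothetical, and a sorted set is used in place of the B+ tree named in the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch: follow one front-end event down to the tables it touches.
final class DependencyChainSketch {
    static List<String> tablesForEvent(String eventKey,
                                       Map<String, String> eventToUrl,
                                       Map<String, String> urlToMethod,
                                       Map<String, List<String>> methodToSqlIds,
                                       Map<String, List<String>> sqlIdToTables) {
        // Step 1: front-end event -> back-end URL -> JAVA method.
        String url = eventToUrl.get(eventKey);
        String method = url == null ? null : urlToMethod.get(url);
        if (method == null) {
            return Collections.emptyList(); // the chain is broken; nothing to associate
        }
        // Step 2: JAVA method -> SQL IDs -> tables (sorted, duplicates removed).
        TreeSet<String> tables = new TreeSet<>();
        for (String sqlId : methodToSqlIds.getOrDefault(method, Collections.emptyList())) {
            tables.addAll(sqlIdToTables.getOrDefault(sqlId, Collections.emptyList()));
        }
        return new ArrayList<>(tables);
    }

    public static void main(String[] args) {
        List<String> tables = tablesForEvent(
            "web/index.html#btn01",
            Collections.singletonMap("web/index.html#btn01", "/loan/query.do"),
            Collections.singletonMap("/loan/query.do", "LoanController.query"),
            Collections.singletonMap("LoanController.query", Arrays.asList("findLoan")),
            Collections.singletonMap("findLoan", Arrays.asList("t_loan", "t_user")));
        System.out.println(tables);
    }
}
```

An event whose chain ends early (no matching URL or method) returns an empty list, which is one way such a traversal could surface front-end code that never reaches the database.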
S140: write the corresponding data statistical logic for the persistently stored database data and JSON files according to the different statistical analysis requirements, and generate an analysis result.
Specifically, after the data has been persisted to the database, the analysis result can be generated from the relational data tables in the database. The analysis result may include visual reports such as an interaction diagram and a reference relationship diagram covering the front end and back end of the whole system.
Alternatively, after the data has been persisted as JSON files, targeted queries can be run according to the different analysis requirements to generate the corresponding report or relationship. Reports are generated through SQL Server.
In a specific embodiment of the present invention, a file verification procedure is performed before the step of writing the corresponding data statistical logic for the persistently stored database data and JSON files according to the different statistical analysis requirements and generating an analysis result.
The file verification procedure includes: adding a file identifier to the stored JSON file, the file identifier comprising a start marker, an end marker, and a byte count; checking, according to the file identifier, whether the JSON file corresponds to the current file to be processed; and, if the JSON file passes the check, writing the corresponding data statistical logic for the JSON file and generating an analysis result.
Specifically, a stored scan file may have been damaged, so the analysis includes a step that checks the JSON file.
The check works as follows: after scanning finishes, start and end markers (for example start/end) are added to the head and tail of the generated file, and the number of bytes is counted and recorded in the file header.
When the scan result needs to be analyzed, the start marker and the byte count are read first and compared with the actual file to be processed; loading and analysis proceed only if the comparison succeeds.
The file format is as follows:
[Figure: file format listing; image not reproduced]
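Since the original format listing is not available in this text, the following is only a hypothetical sketch of the verification idea described above: a start marker and byte count in the header, an end marker at the tail, and a comparison before loading. The layout `start <byteCount>`, newline, payload, newline, `end` and the class name `ScanFileEnvelope` are assumptions, not the format used by the invention.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical envelope around a stored JSON scan result.
final class ScanFileEnvelope {
    private static final String START = "start";
    private static final String END = "end";

    private ScanFileEnvelope() {}

    // Produces "start <byteCount>\n<json>\nend"; the count is the UTF-8 size of the payload.
    static String wrap(String json) {
        int byteCount = json.getBytes(StandardCharsets.UTF_8).length;
        return START + " " + byteCount + "\n" + json + "\n" + END;
    }

    // Returns the JSON payload if both markers are present and the recorded
    // byte count matches the actual payload; otherwise throws.
    static String unwrap(String stored) {
        String suffix = "\n" + END;
        int headerEnd = stored.indexOf('\n');
        if (headerEnd < 0 || !stored.startsWith(START + " ")) {
            throw new IllegalArgumentException("missing start marker");
        }
        if (!stored.endsWith(suffix) || stored.length() < headerEnd + 1 + suffix.length()) {
            throw new IllegalArgumentException("missing end marker");
        }
        int expected;
        try {
            expected = Integer.parseInt(stored.substring(START.length() + 1, headerEnd));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("unreadable byte count", e);
        }
        String json = stored.substring(headerEnd + 1, stored.length() - suffix.length());
        int actual = json.getBytes(StandardCharsets.UTF_8).length;
        if (actual != expected) {
            throw new IllegalArgumentException("byte count mismatch: expected " + expected + ", found " + actual);
        }
        return json;
    }
}
```

A byte count catches truncation and accidental edits cheaply; it does not detect changes that preserve the length, for which a checksum would be needed.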
Taking the AMC system based on the PAFA framework as an example, the JAVA-based project source code scanning analysis method can be applied to analyze that system. Applying the source code scanning analysis system of this scheme yields a data table and a relationship tree diagram.
FIG. 2 is a data table diagram of the scan analysis result of an embodiment of the JAVA-based project source code scanning analysis method of the present invention. As shown in FIG. 2, the partial data summary table of the scan analysis shows clearly how the related information is used, and the front-end and back-end relationship tree diagram of the scan analysis result shows intuitively the front-end and back-end relationships of the AMC system based on the PAFA framework.
Further, a reverse recursive query over the scan analysis result can determine, for a given database table or field of the scanned system, which front-end pages or functions operate on it. Recursive SQL queries over the scan analysis result can also identify all database table fields, source code classes, methods, variables, and so on that are never used, so that the system can be optimized accordingly.
In summary, the project source code scanning analysis method of the present invention combines the scanning system with the database and implements front-end analysis, back-end analysis, and SQL analysis, that is, scanning from the front-end request through to the database layer, thereby greatly improving scanning efficiency.
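The reverse recursive query can be illustrated with an in-memory analogue. The sketch below is not the SQL recursive query the text refers to; it walks a hypothetical "used by" map upward from a database field to everything that reaches it. The class name `ReverseLookup` and the node labels are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory analogue of a reverse recursive query.
final class ReverseLookup {
    private ReverseLookup() {}

    // usedBy maps a node to the nodes that directly reference it.
    // Returns every node that reaches `start` through one or more references.
    static Set<String> ancestors(Map<String, List<String>> usedBy, String start) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(start);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            for (String parent : usedBy.getOrDefault(current, Collections.emptyList())) {
                // The seen set keeps the walk finite even if references form a cycle.
                if (seen.add(parent)) {
                    pending.push(parent);
                }
            }
        }
        return seen;
    }
}
```

A field for which this walk returns an empty set has no recorded user, which corresponds to the "never used" candidates mentioned above.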
FIG. 3 is a schematic diagram of the logical structure of the JAVA-based project source code scanning analysis system of the present invention. As shown in FIG. 3, a JAVA-based project source code scanning analysis system 300 comprises a project source code file acquisition unit 310, a multi-thread scanning unit 320, a persistent storage unit 330, and an analysis result generation unit 340.
The project source code file acquisition unit 310 is configured to acquire a project source code file. The multi-thread scanning unit 320 is configured to scan the project source code of the project source code file in parallel with multiple threads to acquire the required key information; the threads comprise a front-end thread, a back-end thread, and an SQL thread, and the key information comprises at least the component number of a click or jump event, event content description information, constant class information in a file, the operated data table and field, and operation type information. The persistent storage unit 330 is configured to perform data structuring on the acquired key information to generate a data object and to persistently store the generated data object, where the persistent storage comprises direct storage as database data and storage as JSON files. The analysis result generation unit 340 is configured to write the corresponding data statistical logic for the persistently stored database data and JSON files according to the different statistical analysis requirements and to generate an analysis result.
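The three-thread arrangement of the multi-thread scanning unit could be sketched with a fixed thread pool as follows. The class `ParallelScan` and the use of `Callable<List<String>>` as the scanner type are assumptions for illustration; the text does not specify how the threads are created or how their results are represented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical driver that runs front-end, back-end, and SQL scanners concurrently.
final class ParallelScan {
    private ParallelScan() {}

    // Runs the three scanners on three threads and concatenates their results
    // in front-end, back-end, SQL order.
    static List<String> run(Callable<List<String>> frontEnd,
                            Callable<List<String>> backEnd,
                            Callable<List<String>> sql) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            List<Callable<List<String>>> tasks = new ArrayList<>();
            tasks.add(frontEnd);
            tasks.add(backEnd);
            tasks.add(sql);
            List<String> merged = new ArrayList<>();
            // invokeAll blocks until all tasks finish and returns futures in task order.
            for (Future<List<String>> future : pool.invokeAll(tasks)) {
                merged.addAll(future.get());
            }
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("scan interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("scanner failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Each scanner works on its own file type, so the three tasks share no mutable state and need no locking; the merge happens on the calling thread after all three have finished.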
In a specific embodiment, the multi-thread scanning unit 320 includes a project source code file traversal module 321, a file type identification module 322, a multi-thread parallel scanning module 323, and a key information acquisition module 324.
The project source code file traversal module 321 is configured to recursively traverse the acquired project source code file; the file type identification module 322 is configured to dynamically identify the type of each traversed project source code file; the multi-thread parallel scanning module 323 is configured to scan each file according to the rules corresponding to its type; and the key information acquisition module 324 is configured to acquire the key information from the scanning result.
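The traversal and type identification modules can be sketched as a recursive directory walk that groups files by kind. The mapping from file extension to kind below is an assumption (in particular, treating `.xml` as SQL-related on the basis of MyBatis mapper files), and the names `SourceKind` and `SourceClassifier` are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical file kinds matching the three scanning threads.
enum SourceKind { FRONT_END, BACK_END, SQL, OTHER }

final class SourceClassifier {
    private SourceClassifier() {}

    // Identifies the kind of a file from its extension (assumed mapping).
    static SourceKind classify(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".html") || lower.endsWith(".js") || lower.endsWith(".jsp")) {
            return SourceKind.FRONT_END;
        }
        if (lower.endsWith(".java")) {
            return SourceKind.BACK_END;
        }
        if (lower.endsWith(".sql") || lower.endsWith(".xml")) {
            return SourceKind.SQL;
        }
        return SourceKind.OTHER;
    }

    // Recursively walks the project root and groups regular files by kind.
    static Map<SourceKind, List<Path>> groupByKind(Path root) throws IOException {
        Map<SourceKind, List<Path>> groups = new EnumMap<>(SourceKind.class);
        try (Stream<Path> paths = Files.walk(root)) {
            paths.filter(Files::isRegularFile).forEach(p ->
                    groups.computeIfAbsent(classify(p.getFileName().toString()),
                            k -> new ArrayList<>()).add(p));
        }
        return groups;
    }
}
```

Each group can then be handed to the scanner for that kind, so the rule set applied to a file follows directly from its identified type.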
In a specific embodiment, the analysis result generating unit 340 includes a file checking module 341, a data statistics logic formulation module 342, and an analysis result generating module 343.
The file checking module 341 is configured to determine whether the current JSON file corresponds to the file to be processed;
the data statistical logic formulation module 342 is configured to write the corresponding data statistical logic for the JSON file according to the different statistical analysis requirements;
the analysis result generating module 343 is configured to generate an analysis result.
The analysis effect of the JAVA-based project source code scanning analysis system is evaluated in three dimensions:
First, the miss rate (false negatives) and the false-alarm rate (false positives). Parallel scanning analysis covers the database and the code of every layer, which greatly reduces both rates.
Second, ease of extension and customization to the user's specific requirements. The code scanning tool of this scheme does not require the code to be compiled into .class bytecode files before analyzing code structure and code quality. It analyzes and scans java source code files directly, does not depend on an IDE (integrated development environment), does not need compiled bytecode files, and does not need a separate environment to be set up for the project; scanning analysis can start as soon as the source code is acquired, and the multiple storage modes improve the usability and extensibility of the system.
Third, the time and resources required for analysis. If a single scan takes too long or occupies too much memory, it is difficult to integrate with daily development work and processes, and the goal of improving development efficiency cannot be achieved.
The invention provides a JAVA-based project source code scanning analysis method, which is applied to an electronic device 4.
Fig. 4 illustrates an application environment of an embodiment of a JAVA based project source code scan analysis method according to the present invention.
Referring to fig. 4, in the present embodiment, the electronic device 4 may be a terminal device having an arithmetic function, such as a server, a smart phone, a tablet computer, a portable computer, or a desktop computer.
The electronic device 4 includes: a processor 42, a memory 41, a communication bus 43, and a network interface 44.
The memory 41 includes at least one type of readable storage medium. The at least one type of readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, a card-type memory, and the like. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 4, such as a hard disk of the electronic device 4. In other embodiments, the readable storage medium may also be an external memory of the electronic device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card, and the like, provided on the electronic device 4.
In the present embodiment, the readable storage medium of the memory 41 is generally used for storing the JAVA-based project source code scanning analysis program 40 and the like installed in the electronic device 4. The memory 41 may also be used to temporarily store data that has been output or is to be output.
The processor 42, which in some embodiments may be a Central Processing Unit (CPU), microprocessor or other data Processing chip, is configured to run program code stored in the memory 41 or process data, such as executing the JAVA-based project source code scan analysis program 40.
The communication bus 43 is used to realize connection communication between these components.
The network interface 44 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface), and is typically used to establish a communication link between the electronic device 4 and other electronic devices.
Fig. 4 only shows the electronic device 4 with components 41-44, but it is to be understood that not all of the shown components are required to be implemented, and that more or fewer components may alternatively be implemented.
Optionally, the electronic device 4 may further include a user interface, which may include an input unit such as a keyboard, a voice input device such as a microphone or other equipment with a voice recognition function, and a voice output device such as a speaker or a headset; optionally, the user interface may also include a standard wired interface or a wireless interface.
Optionally, the electronic device 4 may further include a display, which may also be referred to as a display screen or a display unit. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an Organic Light-Emitting Diode (OLED) touch device, or the like. The display is used for displaying information processed in the electronic device 4 and for displaying a visualized user interface.
Optionally, the electronic device 4 may further include a Radio Frequency (RF) circuit, a sensor, an audio circuit, and the like, which are not described in detail herein.
In the apparatus embodiment shown in FIG. 4, the memory 41, which is a kind of computer storage medium, may include an operating system and the JAVA-based project source code scanning analysis program 40; the processor 42 executes the JAVA-based project source code scanning analysis program 40 stored in the memory 41 to perform the following steps: acquiring a project source code file; scanning the project source code of the project source code file in parallel with multiple threads and acquiring the required key information, the threads comprising a front-end thread, a back-end thread, and an SQL thread; performing data structuring on the acquired key information to generate a data object, and persistently storing the generated data object, where the persistent storage comprises direct storage as database data and storage as JSON files; and writing the corresponding data statistical logic for the persistently stored database data and JSON files according to the different statistical analysis requirements, and generating an analysis result.
In other embodiments, the JAVA-based project source code scanning analysis program 40 may be further divided into one or more modules, which are stored in the memory 41 and executed by the processor 42 to implement the present invention. The modules referred to herein are series of computer program segments that perform particular functions. The JAVA-based project source code scanning analysis program 40 may be divided into a project source code file acquisition unit 310, a multi-thread scanning unit 320, a persistent storage unit 330, and an analysis result generation unit 340.
In addition, the present invention also provides a computer readable storage medium, which mainly includes a storage data area and a storage program area. The storage data area may store data created according to the use of the blockchain node, and the storage program area may store an operating system and an application program required by at least one function. The computer readable storage medium includes a JAVA-based project source code scanning analysis program which, when executed by a processor, implements the operations of the JAVA-based project source code scanning analysis method.
The specific implementation of the computer readable storage medium of the present invention is substantially the same as the specific implementation of the method, system and electronic device for analyzing source code scanning of a JAVA-based project, and will not be described herein again.
In summary, the JAVA-based project source code scanning analysis method, system, electronic device, and computer readable storage medium of the present invention scan the project source code of the project source code file in parallel through a front-end thread, a back-end thread, and an SQL thread and acquire the required key information; perform data structuring on the acquired key information to generate a data object and persistently store the generated data object; and write the corresponding data statistical logic for the persistently stored database data and JSON files according to the different statistical analysis requirements to generate an analysis result. Scanning links the front-end and back-end dependency relationships of the project in series: every front-end interface of the Web application that interacts with the database is associated with the fields of the actual database, the fields of the system's database tables that are actually in use are analyzed, and invalid code in the system is identified, thereby achieving an all-round scanning analysis of the project files. The problem of the code under analysis depending on a particular environment is thereby solved, and the efficiency of related development is improved.
A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, in which each data block contains information on a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, apparatus, article, or method that includes the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments. Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above, and includes several programs for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A JAVA-based project source code scanning analysis method is applied to an electronic device and is characterized by comprising the following steps:
acquiring a project source code file;
multithreading parallel scanning the project source code of the project source code file to obtain the required key information; the multithreading comprises a front-end thread, a back-end thread and an SQL thread, and the key information at least comprises a clicking or jumping event component number, event content description information, constant class information in a file, an operation data table, a field and operation type information;
carrying out data structure processing on the acquired key information to generate a data object, and carrying out persistent storage on the generated data object; wherein the persistent storage comprises direct storage as database data and storage as JSON files;
and compiling and formulating corresponding data statistical logic for the database data and the JSON file which are persistently stored according to different statistical analysis requirements, and generating an analysis result.
2. The JAVA-based project source code scanning analysis method of claim 1, wherein
after the step of obtaining the project source code file, the method further comprises the following steps:
judging whether a network or a database exists;
if it is determined that there is no database or network,
according to a preset dependency relationship execution sequence, loading the JSON file cached in the local area, and then performing traversal screening and matching;
regenerating the data object of the B + tree structure, and persistently storing the regenerated data object of the B + tree structure as a JSON file;
and compiling and formulating corresponding data statistical logic for the stored JSON file according to different statistical analysis requirements to generate an analysis result.
3. The JAVA-based project source code scanning analysis method of claim 2, wherein the preset dependency relationship execution sequence is a front-end HTML/JS/JSP file, a back-end JAVA file and an SQL file in sequence.
4. The JAVA based project source scan analysis method of claim 1, wherein the step of multithreaded parallel scanning project source codes of the project source code file comprises:
performing recursive traversal on the acquired project source code file, and dynamically identifying the type of the project source code file;
and scanning according to corresponding rules according to the type of the project source code file.
5. The JAVA-based project source code scanning analysis method of claim 1, wherein a file verification procedure is performed before the step of writing the corresponding data statistical logic for the persistently stored database data and JSON files according to different statistical analysis requirements and generating an analysis result; the file verification procedure includes:
adding a file identifier to the stored JSON file; the file identifier comprises a start marker, an end marker and a byte count;
checking whether the JSON file corresponds to the current file to be processed according to the file identifier;
and if the JSON file passes the check, writing the corresponding data statistical logic for the JSON file and generating an analysis result.
6. A JAVA-based project source code scanning analysis system, characterized by comprising a project source code file acquisition unit, a multi-thread scanning unit, a persistent storage unit and an analysis result generation unit; wherein,
the project source code file acquisition unit is used for acquiring a project source code file;
the multithreading scanning unit is used for multithreading to parallelly scan the project source codes of the project source code file so as to acquire required key information; the multithreading comprises a front-end thread, a back-end thread and an SQL thread, and the key information at least comprises a clicking or jumping event component number, event content description information, constant class information in a file, an operation data table, a field and operation type information;
the persistent storage unit is used for carrying out data structure processing on the acquired key information to generate a data object and carrying out persistent storage on the generated data object; wherein the persistent storage comprises direct storage as database data and storage as JSON files;
and the analysis result generation unit is used for compiling and formulating corresponding data statistical logic for the database data and the JSON file which are persistently stored according to different statistical analysis requirements to generate an analysis result.
7. The JAVA-based project source code scanning analysis system of claim 6, wherein the multi-thread scanning unit comprises a project source code file traversal module, a file type identification module, a multi-thread parallel scanning module and a key information acquisition module;
the project source code file traversal module is used for performing recursive traversal on the acquired project source code file;
the file type identification module is used for dynamically identifying the type of the traversed project source code file;
the multi-thread parallel scanning module is used for scanning according to the rules corresponding to the type of the project source code file;
and the key information acquisition module is used for acquiring key information according to the scanning result.
8. The JAVA-based project source code scanning analysis system of claim 6, wherein the analysis result generation unit comprises a file verification module, a data statistics logic formulation module and an analysis result generation module;
the file checking module is used for judging whether the current JSON file corresponds to a file to be processed or not;
the data statistical logic formulating module is used for compiling and formulating corresponding data statistical logic for the JSON file according to different statistical analysis requirements;
and the analysis result generation module is used for generating an analysis result.
9. An electronic device, comprising: at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a program executable by the at least one processor to enable the at least one processor to perform the JAVA based project source scan analysis method of any of claims 1 to 5.
10. A computer readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the JAVA based project source code scan analysis method of any of claims 1 to 5.
CN202011362102.XA 2020-11-27 2020-11-27 JAVA-based project source code scanning analysis method, system and storage medium Pending CN112416787A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011362102.XA CN112416787A (en) 2020-11-27 2020-11-27 JAVA-based project source code scanning analysis method, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011362102.XA CN112416787A (en) 2020-11-27 2020-11-27 JAVA-based project source code scanning analysis method, system and storage medium

Publications (1)

Publication Number Publication Date
CN112416787A true CN112416787A (en) 2021-02-26

Family

ID=74842731

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011362102.XA Pending CN112416787A (en) 2020-11-27 2020-11-27 JAVA-based project source code scanning analysis method, system and storage medium

Country Status (1)

Country Link
CN (1) CN112416787A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112632550A (en) * 2021-03-05 2021-04-09 北京邮电大学 Method for detecting application security of password and secret key and electronic equipment thereof
CN112989348A (en) * 2021-04-15 2021-06-18 中国电子信息产业集团有限公司第六研究所 Attack detection method, model training method, device, server and storage medium
CN113010478A (en) * 2021-03-15 2021-06-22 北京金山云网络技术有限公司 List file generation method and device, electronic equipment and medium
CN113254001A (en) * 2021-07-06 2021-08-13 统信软件技术有限公司 Source code analysis method, computing device and storage medium
CN113486335A (en) * 2021-05-27 2021-10-08 贵州电网有限责任公司 JNI malicious attack detection method and device based on RASP zero rule
CN113704176A (en) * 2021-07-09 2021-11-26 奇安信科技集团股份有限公司 File scanning method, file scanning device, electronic equipment, program product and storage medium
CN114116517A (en) * 2021-12-06 2022-03-01 北京字节跳动网络技术有限公司 Front-end item analysis method, device, medium and electronic equipment

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112632550A (en) * 2021-03-05 2021-04-09 北京邮电大学 Method for detecting application security of password and secret key and electronic equipment thereof
CN112632550B (en) * 2021-03-05 2021-06-29 北京邮电大学 Method for detecting application security of password and secret key and electronic equipment thereof
CN113010478A (en) * 2021-03-15 2021-06-22 北京金山云网络技术有限公司 List file generation method and device, electronic equipment and medium
CN112989348A (en) * 2021-04-15 2021-06-18 中国电子信息产业集团有限公司第六研究所 Attack detection method, model training method, device, server and storage medium
CN113486335A (en) * 2021-05-27 2021-10-08 贵州电网有限责任公司 JNI malicious attack detection method and device based on RASP zero rule
CN113486335B (en) * 2021-05-27 2023-02-03 贵州电网有限责任公司 JNI malicious attack detection method and device based on RASP zero rule
CN113254001A (en) * 2021-07-06 2021-08-13 统信软件技术有限公司 Source code analysis method, computing device and storage medium
CN113704176A (en) * 2021-07-09 2021-11-26 奇安信科技集团股份有限公司 File scanning method, file scanning device, electronic equipment, program product and storage medium
CN113704176B (en) * 2021-07-09 2023-10-31 奇安信科技集团股份有限公司 File scanning method, device, electronic equipment and storage medium
CN114116517A (en) * 2021-12-06 2022-03-01 北京字节跳动网络技术有限公司 Front-end item analysis method, device, medium and electronic equipment

Similar Documents

Publication Publication Date Title
CN112416787A (en) JAVA-based project source code scanning analysis method, system and storage medium
Hegewald et al. XStruct: efficient schema extraction from multiple and large XML documents
CN1297936C (en) Method and system for comparing files of two computers
US8392467B1 (en) Directing searches on tree data structures
US20080263039A1 (en) Pattern-matching system
US20070266378A1 (en) Source code generation method, apparatus, and program
CN113032362A (en) Data blood margin analysis method and device, electronic equipment and storage medium
CN114090671A (en) Data import method and device, electronic equipment and storage medium
CN112860265A (en) Method and device for detecting operation abnormity of source code database
Kuramitsu Nez: practical open grammar language
JP2008299723A (en) Program verification method and device
US6981006B2 (en) Schema-based file conversion
CN113419721B (en) Web-based expression editing method, device, equipment and storage medium
US7882429B2 (en) High-level virtual machine for fast XML parsing and validation
JP2007041683A (en) Device, method and program for extracting sequence pattern
US10719424B1 (en) Compositional string analysis
US20040010780A1 (en) Method and apparatus for approximate generation of source code cross-reference information
CN113687827B (en) Data list generation method, device and equipment based on widget and storage medium
CN112230895B (en) EL expression analysis method, device, equipment and storage medium
CN114611500A (en) Expression processing method and device, electronic equipment and computer readable storage medium
CN114691197A (en) Code analysis method and device, electronic equipment and storage medium
US20060005174A1 (en) Defining hierarchical structures with markup languages and reflection
CN113344023A (en) Code recommendation method, device and system
CN111046636A (en) Method and device for screening PDF file information, computer equipment and storage medium
CN112433943A (en) Method, device, equipment and medium for detecting environment variable based on abstract syntax tree

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination